Home » SQL & PL/SQL » SQL & PL/SQL » SQL Help Parsing Column (Oracle 19.0.0.0.0 Windows 64bit)
SQL Help Parsing Column [message #690271] Wed, 12 February 2025 08:52 Go to next message
Duane
Messages: 585
Registered: December 2002
Senior Member
Hello,

I'm needing some help in parsing a column.  I'm just not sure how to go about it since the format is kind of unique.


Description

I need to parse the Comments field and break apart the different elements into single elements that are returned by a select statement (Desired Output statement).  Not sure this matters but I was told the data in the comments field are 3 sets of data distinguished by [] for each set.



Sample Data


create table sample_comments
(
  id_number  number(8) not null,
  comments   varchar2(4000 char)
);

set define off;
insert into sample_comments
   (id_number, comments)
 values
   (3333333, '[222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]');
insert into sample_comments
   (id_number, comments)
 values
   (3333333, '[333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]');
commit;

Parse Comments Column


[222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]

[333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]

Desired Output



select 222222 as id, 11/21/2024 9:07:17 PM as entered_date, CGAAAAAAA as user, INCOMPLETE as status, null as processor, null as ready_status, null as completed_date
       333333 as id, 11/12/2024 10:50:37 PM as entered_date, CGBBBBBBB as user, SUBMITTED as status, CGBBBBBBB as processor, Y as ready_status, 11/13/2024 as completed_date
  from sample_comments


[Updated on: Wed, 12 February 2025 09:16]

Report message to a moderator

Re: SQL Help Parsing Column [message #690272 is a reply to message #690271] Wed, 12 February 2025 12:27 Go to previous messageGo to next message
Michel Cadot
Messages: 68737
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator

Assuming that "comments" always starts with a [ (or ,) and ends with a ] (or ,) (which is correct if "the comments field are 3 sets of data distinguished by [] for each set" is true):
SQL> col comments format a90
SQL> select * from sample_comments;
 ID_NUMBER COMMENTS
---------- ------------------------------------------------------------------------------------------
   3333333 [222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]
   3333333 [333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]

2 rows selected.

SQL> col id             format a7
SQL> col entered_date   format a22
SQL> col user           format a9
SQL> col status         format a10
SQL> col processor      format a10
SQL> col ready_status   format a12
SQL> col completed_date format a14
SQL> select id_number,
  2         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 1, null, 1) id,
  3         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 2, null, 1) entered_date,
  4         regexp_substr(comments, '[[, ]+([^[,]*)[],]', 1, 3, null, 1) "USER",
  5         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 4, null, 1) status,
  6         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 5, null, 1) processor,
  7         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 6, null, 1) ready_status,
  8         regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 7, null, 1) completed_date
  9  from sample_comments
 10  /
 ID_NUMBER ID      ENTERED_DATE           USER      STATUS     PROCESSOR  READY_STATUS COMPLETED_DATE
---------- ------- ---------------------- --------- ---------- ---------- ------------ --------------
   3333333 222222  11/21/2024 9:07:17 PM  CGAAAAAAA INCOMPLETE
   3333333 333333  11/12/2024 10:50:37 PM CGBBBBBBB SUBMITTED  CGBBBBBBB  Y            11/13/2024

2 rows selected.
Re: SQL Help Parsing Column [message #690273 is a reply to message #690272] Wed, 12 February 2025 12:52 Go to previous messageGo to next message
Duane
Messages: 585
Registered: December 2002
Senior Member
You are correct in that assumption.

Perfect.  Thank you so much.
Re: SQL Help Parsing Column [message #690274 is a reply to message #690273] Wed, 12 February 2025 13:13 Go to previous messageGo to next message
Duane
Messages: 585
Registered: December 2002
Senior Member
Could you explain this:

[[, ]+([^],]*)[],]


I just want to ensure I understand what you are doing.

Something like:

Matching on Brackets with a comma '[, ]'
then matching on Brackets with anything following a comma '[^],]*'....... and so on.
Re: SQL Help Parsing Column [message #690275 is a reply to message #690274] Wed, 12 February 2025 13:59 Go to previous messageGo to next message
Michel Cadot
Messages: 68737
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator

[[, ]+ matches one or more among "[", "," and " " characters
[^],]* matches 0 or more characters that are not "]" or ","
[],] matches one character among "]" and ","

(<somehing>) memorizes what is between () in a variable named 1, 2, 3... depending if this the first (), second ()... in the pattern. Here there is only one such expression, REGEXP_SUBSTR returns this value using the 1 in its last parameter.

This makes me realize that there may be cases that won't work. For instance, if in the last set the first value is not given but the other ones are (ex. "[, , 11/13/2024]".
An additional work has to be done if this can happen. Smile

[Updated on: Thu, 13 February 2025 00:25]

Report message to a moderator

Re: SQL Help Parsing Column [message #690276 is a reply to message #690275] Wed, 12 February 2025 14:15 Go to previous messageGo to next message
Michel Cadot
Messages: 68737
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator

So here's a better one:
SQL> select * from sample_comments;
 ID_NUMBER COMMENTS
---------- ------------------------------------------------------------------------------------------
   3333333 [222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]
   3333333 [333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]
   3333333 [4444444, , CGCCCCCCC] [UNKNOWN] [, , 11/13/2024]

3 rows selected.

SQL> col id             format a7
SQL> col entered_date   format a22
SQL> col user           format a9
SQL> col status         format a10
SQL> col processor      format a10
SQL> col ready_status   format a12
SQL> col completed_date format a14
SQL> select id_number,
  2         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 1, null, 1) id,
  3         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 2, null, 1) entered_date,
  4         regexp_substr(comments, '[[ ]+([^[,]*)[],]', 1, 3, null, 1) "USER",
  5         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 4, null, 1) status,
  6         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 5, null, 1) processor,
  7         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 6, null, 1) ready_status,
  8         regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 7, null, 1) completed_date
  9  from sample_comments
 10  /
 ID_NUMBER ID      ENTERED_DATE           USER      STATUS     PROCESSOR  READY_STATUS COMPLETED_DATE
---------- ------- ---------------------- --------- ---------- ---------- ------------ --------------
   3333333 222222  11/21/2024 9:07:17 PM  CGAAAAAAA INCOMPLETE
   3333333 333333  11/12/2024 10:50:37 PM CGBBBBBBB SUBMITTED  CGBBBBBBB  Y            11/13/2024
   3333333 4444444                        CGCCCCCCC UNKNOWN                            11/13/2024

3 rows selected.
Note that it works only if there is at least one space after a comma.
You can ensure that using "replace(comments, ',', ', ')" instead of "comments" in the REGEXP_SUBSTR function.

[Edit: add last sentence]

[Updated on: Wed, 12 February 2025 15:42]

Report message to a moderator

Re: SQL Help Parsing Column [message #690277 is a reply to message #690276] Wed, 12 February 2025 15:45 Go to previous message
Duane
Messages: 585
Registered: December 2002
Senior Member
Thank you.
Previous Topic: Sorting Versions into rows
Next Topic: sql performance (merged)
Goto Forum:
  


Current Time: Sat Feb 22 17:22:39 CST 2025