Home » SQL & PL/SQL » SQL & PL/SQL » SQL Help Parsing Column (Oracle 19.0.0.0.0 Windows 64bit)
SQL Help Parsing Column [message #690271] |
Wed, 12 February 2025 08:52  |
Duane
Messages: 585 Registered: December 2002
|
Senior Member |
|
|
Hello,
I'm needing some help in parsing a column. I'm just not sure how to go about it since the format is kind of unique.
Description
I need to parse the Comments field and break apart the different elements into single elements that are returned by a select statement (Desired Output statement). Not sure this matters but I was told the data in the comments field are 3 sets of data distinguished by [] for each set.
Sample Data
create table sample_comments
(
id_number number(8) not null,
comments varchar2(4000 char)
);
set define off;
insert into sample_comments
(id_number, comments)
values
(3333333, '[222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]');
insert into sample_comments
(id_number, comments)
values
(3333333, '[333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]');
commit;
Parse Comments Column
[222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]
[333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]
Desired Output
select 222222 as id, 11/21/2024 9:07:17 PM as entered_date, CGAAAAAAA as user, INCOMPLETE as status, null as processor, null as ready_status, null as completed_date
333333 as id, 11/12/2024 10:50:37 PM as entered_date, CGBBBBBBB as user, SUBMITTED as status, CGBBBBBBB as processor, Y as ready_status, 11/13/2024 as completed_date
from sample_comments
[Updated on: Wed, 12 February 2025 09:16] Report message to a moderator
|
|
|
Re: SQL Help Parsing Column [message #690272 is a reply to message #690271] |
Wed, 12 February 2025 12:27   |
 |
Michel Cadot
Messages: 68737 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
Assuming that "comments" always starts with a [ (or ,) and ends with a ] (or ,) (which is correct if "the comments field are 3 sets of data distinguished by [] for each set" is true):
SQL> col comments format a90
SQL> select * from sample_comments;
ID_NUMBER COMMENTS
---------- ------------------------------------------------------------------------------------------
3333333 [222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]
3333333 [333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]
2 rows selected.
SQL> col id format a7
SQL> col entered_date format a22
SQL> col user format a9
SQL> col status format a10
SQL> col processor format a10
SQL> col ready_status format a12
SQL> col completed_date format a14
SQL> select id_number,
2 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 1, null, 1) id,
3 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 2, null, 1) entered_date,
4 regexp_substr(comments, '[[, ]+([^[,]*)[],]', 1, 3, null, 1) "USER",
5 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 4, null, 1) status,
6 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 5, null, 1) processor,
7 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 6, null, 1) ready_status,
8 regexp_substr(comments, '[[, ]+([^],]*)[],]', 1, 7, null, 1) completed_date
9 from sample_comments
10 /
ID_NUMBER ID ENTERED_DATE USER STATUS PROCESSOR READY_STATUS COMPLETED_DATE
---------- ------- ---------------------- --------- ---------- ---------- ------------ --------------
3333333 222222 11/21/2024 9:07:17 PM CGAAAAAAA INCOMPLETE
3333333 333333 11/12/2024 10:50:37 PM CGBBBBBBB SUBMITTED CGBBBBBBB Y 11/13/2024
2 rows selected.
|
|
|
|
Re: SQL Help Parsing Column [message #690274 is a reply to message #690273] |
Wed, 12 February 2025 13:13   |
Duane
Messages: 585 Registered: December 2002
|
Senior Member |
|
|
Could you explain this:
[[, ]+([^],]*)[],]
I just want to ensure I understand what you are doing.
Something like:
Matching on Brackets with a comma '[, ]'
then matching on Brackets with anything following a comma '[^],]*'....... and so on.
|
|
|
Re: SQL Help Parsing Column [message #690275 is a reply to message #690274] |
Wed, 12 February 2025 13:59   |
 |
Michel Cadot
Messages: 68737 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
[[, ]+ matches one or more among "[", "," and " " characters
[^],]* matches 0 or more characters that are not "]" or ","
[],] matches one character among "]" and ","
(<somehing>) memorizes what is between () in a variable named 1, 2, 3... depending if this the first (), second ()... in the pattern. Here there is only one such expression, REGEXP_SUBSTR returns this value using the 1 in its last parameter.
This makes me realize that there may be cases that won't work. For instance, if in the last set the first value is not given but the other ones are (ex. "[, , 11/13/2024]".
An additional work has to be done if this can happen. 
[Updated on: Thu, 13 February 2025 00:25] Report message to a moderator
|
|
|
Re: SQL Help Parsing Column [message #690276 is a reply to message #690275] |
Wed, 12 February 2025 14:15   |
 |
Michel Cadot
Messages: 68737 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
So here's a better one:
SQL> select * from sample_comments;
ID_NUMBER COMMENTS
---------- ------------------------------------------------------------------------------------------
3333333 [222222, 11/21/2024 9:07:17 PM, CGAAAAAAA] [INCOMPLETE] [, , ]
3333333 [333333, 11/12/2024 10:50:37 PM, CGBBBBBBB] [SUBMITTED] [CGBBBBBBB, Y, 11/13/2024]
3333333 [4444444, , CGCCCCCCC] [UNKNOWN] [, , 11/13/2024]
3 rows selected.
SQL> col id format a7
SQL> col entered_date format a22
SQL> col user format a9
SQL> col status format a10
SQL> col processor format a10
SQL> col ready_status format a12
SQL> col completed_date format a14
SQL> select id_number,
2 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 1, null, 1) id,
3 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 2, null, 1) entered_date,
4 regexp_substr(comments, '[[ ]+([^[,]*)[],]', 1, 3, null, 1) "USER",
5 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 4, null, 1) status,
6 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 5, null, 1) processor,
7 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 6, null, 1) ready_status,
8 regexp_substr(comments, '[[ ]+([^],]*)[],]', 1, 7, null, 1) completed_date
9 from sample_comments
10 /
ID_NUMBER ID ENTERED_DATE USER STATUS PROCESSOR READY_STATUS COMPLETED_DATE
---------- ------- ---------------------- --------- ---------- ---------- ------------ --------------
3333333 222222 11/21/2024 9:07:17 PM CGAAAAAAA INCOMPLETE
3333333 333333 11/12/2024 10:50:37 PM CGBBBBBBB SUBMITTED CGBBBBBBB Y 11/13/2024
3333333 4444444 CGCCCCCCC UNKNOWN 11/13/2024
3 rows selected.
Note that it works only if there is at least one space after a comma.
You can ensure that using "replace(comments, ',', ', ')" instead of "comments" in the REGEXP_SUBSTR function.
[Edit: add last sentence]
[Updated on: Wed, 12 February 2025 15:42] Report message to a moderator
|
|
|
|
Goto Forum:
Current Time: Sat Feb 22 17:22:39 CST 2025
|