1. Get rid of all advertisements and get unlimited access to documents by upgrading to Premium Membership. Upgrade to Premium Now and also get a Premium Badge!

Parsing Challenge

Discussion in 'SQL PL/SQL' started by DataOnCrack, Sep 12, 2014.

  1. DataOnCrack

    DataOnCrack Active Member

    Messages:
    2
    Likes Received:
    0
    Trophy Points:
    55
    Hi, I need to parse some text in a large text field and not sure what syntax to use or if one or many user defined functions will be needed.

    The text I need to grab looks like this:

    As of 24-Aug-2014:
    [My Status: Grand Stuff, UML] --I need to grab everything after the colon up the the end bracket (exclude end bracket) and place them in derived field number 1 of the query.


    *This is some text I need to grab and may continue on for 2 or more sentences -- I need to grab sentences or paragraphs that start with a star * and place them in derived field number 2

    *Here is another sentence I need to grab because it starts with a star. Also place in derived field number 2. [some short note | 15-AUG] -- At the end of the last sentense that starts with a star there is some text within a bracket. I need to first grab the text within the bracket up to the pipe and place in derived field number 3. I then need to grab the text after the pipe and up to the closing bracket (the date) and place in derived field number 4.

    There may be text or multiple paragraphs after the last sentence that has a star. We can ignore that.

    Help appreciated. Thanks
     
  2. ocprep

    ocprep Forum Advisor

    Messages:
    277
    Likes Received:
    76
    Trophy Points:
    410
    Location:
    Orlando, Florida
    I've written several parse routines. You need to have a variable that is used to keep track of where in the parsed text the code is currently looking. This becomes one of the parameters for INSTR and SUBSTR in looking for the start and end point of the text to be grabbed and then grabbing it respectively. Once the value is equal to the length of the parsed text -- you're done.

    The logic is something like:

    v_length := LENGTH(v_text);
    v_ndx1 := 1;
    v_ndx2 := INSTR(v_text, 'My Status:', v_ndx1) + 10; --Locate 'My Status:' and add ten to be after colon
    v_ndx1 := v_ndx2; -- move pointer after the colon
    v_ndx2 := INSTR(v_text, ']', v_ndx1); -- starting after the colon, find the end bracket
    v_field1 := SUBSTR(v_text, v_ndx1, v_ndx2 - v_ndx1); -- populate field1
    v_ndx1 := v_ndx2; -- move pointer
    WHILE v_ndx1 < v_length LOOP
    v_ndx2 := INSTR(v_text, '*', v_ndx1) + 1; locate first asterisk.
    ...
     
    DataOnCrack likes this.
  3. DataOnCrack

    DataOnCrack Active Member

    Messages:
    2
    Likes Received:
    0
    Trophy Points:
    55
    Thank you, Matthew. I admit I'm like a fish out of water coming over from SQL Server. Im using Oracle SQL developer, and I'm not even sure how to declare the varaiables in your example. Maybe I can request this: The table I want to query is called tblStatus and the field with all the text is StatusNotes. Can you perhaps give me an example of how to extract the just the first field into its own column in a derived column of a query to get me started? Help much apprecaited. I understand the loop logic you did for the first field, but just hung up on syntax, declaring the variables and incorporating what you gave as an example with my table and field.

    Thank you!
     
  4. ocprep

    ocprep Forum Advisor

    Messages:
    277
    Likes Received:
    76
    Trophy Points:
    410
    Location:
    Orlando, Florida
  5. rajenb

    rajenb Forum Expert

    Messages:
    361
    Likes Received:
    114
    Trophy Points:
    655
    Location:
    Mauritius
    Hi,

    Try this script as a startup and alternative way and build-up on it:

    Code (SQL):
    DECLARE
      l_field1 VARCHAR2(100);
      l_field2 VARCHAR2(100);
    BEGIN
      SELECT regexp_replace (StatusNotes,'(.*):(.*)].*', '\2',1,1,'n'),
        regexp_replace (StatusNotes,'(.*)\*(.*)','\2',1,1,'n')
      INTO l_field1,
        l_field2
      FROM tblstatus;
      dbms_output.put_line(l_field1);
      dbms_output.put_line(l_field2);
    END;
    /