1. Get rid of all advertisements and get unlimited access to documents by upgrading to Premium Membership. Upgrade to Premium Now and also get a Premium Badge!

Identifying duplicate records issue reg.

Discussion in 'SQL PL/SQL' started by laxman, May 7, 2010.

  1. laxman

    laxman Forum Expert

    Messages:
    144
    Likes Received:
    0
    Trophy Points:
    230
    Location:
    MUMBAI
    Hi all,
    i am facing problem while identifying duplicate records in a table.

    i have one table A

    Code (SQL):
    sql> DESC A

    Name                       NULL?    TYPE
     ----------------------------------------- -------- ----------------------------
    custid                                          NOT NULL NUMBER(10)
    Reqid                          NOT NULL NUMBER(10)
     Reqtxt                        varchar2(4000)
    STATUS                                         varchar2(10)
    basically this table contains records of a person(custid) who is sending mail to our application and whose requests(reqid) and contents(reqtxt) of the mail body is being maintained in this table .

    some times it may happen that same customer send the mail multiple time with same text( which is getting stored in reqtxt)

    we need to track this request by identifying this duplicate request and update the status column to close except one request as open.

    Now
    Case 1--> if a same person sending mail multiple time with same text in mail body then problem solved in this way
    Code (SQL):
    BEGIN
    FOR cur_rec IN( SELECT reqid FROM A WHERE reqid NOT IN(SELECT MAX(reqid) FROM a
                                                                    GROUP BY custid,LOWER(ltrim(rtrim(REPLACE(REQTXT,chr(13),''))))))
    loop
              UPDATE A SET STATUS='close' WHERE reqid=cur_rec.reqid;
              commit;

    END loop;
    END;
    case 2--> 1> if a person sending same mail multiple time with some mail contains timestamp with text and some doesn't
    then how to ignore timestamp excluding text in order to identify duplicates
    2> if each person belong to different time zone and sending same mail multiple time with their own timestamp
    in this case also how to avoid timestamp since we concern only with text in mail body.

    need your suggestion for further implementation

    Regards
    Laxman
     
  2. rajavu

    rajavu Forum Guru

    Messages:
    815
    Likes Received:
    52
    Trophy Points:
    610
    Location:
    @ Bangalore , India
    Create a test case with Create and insert table scripts and show us what you have done ?
     
  3. zargon

    zargon Community Moderator Forum Guru

    Messages:
    2,351
    Likes Received:
    348
    Trophy Points:
    1,430
    Location:
    Aurora, CO
    Why would some email contain a timestamp and others not from the same custid? Are custids not linked to specific email accounts? Why is the timestamp not separated from the email text and stored in a separate column?
     
  4. laxman

    laxman Forum Expert

    Messages:
    144
    Likes Received:
    0
    Trophy Points:
    230
    Location:
    MUMBAI
    Dear Sir,
    I extremely sorry the way i explained ,Actually what i mean to say that it may happened that some customer may send their mail with such ids let say official-ids where system generated time stamp is automatically getting attached with the mail.In this case all the mail from one customer with same text couldn't be identified as duplicates so in this case if we have an approach in such a way that we can ignore those timestamp and then it would be quite easy to group the duplicate string for each customer and close all request except one.Here assuming that length of timestamp is always constant since system generated time stamp will never vary.

    Note:Timestamp is always appearing before the text in this format dd/mm/yy Hr:min:ss

    e.g:
    15/04/10 11:22 PM John:
    some text..........
    .........................................
    Regards
    John
    I apologizes if you couldn't able to understood the requirement.

    Regards
    Laxman
     
  5. zargon

    zargon Community Moderator Forum Guru

    Messages:
    2,351
    Likes Received:
    348
    Trophy Points:
    1,430
    Location:
    Aurora, CO
    You are still unclear with your description as you state this:

    "Note:Timestamp is always appearing before the text in this format dd/mm/yy Hr:min:ss "

    yet post this:

    "15/04/10 11:22 PM John:"

    What IS the actual format?? What is a valid example of this?? It appears you're asking us to fish using a hand mixer which rarely, if ever, succeeds. Give us something accurate to work with and you may be surprised that actual answers appear. Given what you've posted thus far your answers won't be posted since your 'information' is faulty.