Automating Duplicate Merges

Hello! Our organization is looking into developing a stored procedure to match and merge some of our duplicates automatically. Have any of you ever done this before, or do you know anyone who has? I'd love to hear what worked and what didn't, and find an existing procedure to base ours off of. 

Parents
  • I wouldn't recommend automatically merging records. There should always be some sort of review process prior to any merge.

    We did something similar to what Carol outlined, however we manually triggered the merges. 

    The process went something like this:

    1. Separate the potential merges in to different, smaller segments
    2. Review the segment's resulting records keep/delete matches identifying matches that we thought shouldn't be matches, or should remain separate records
    3. Import one segment into TEST into a local merge segment table
    4. Manually merge the records
    5. Review the results to ensure we were happy with them and make changes to the files when we found things we didn't like.
    6. Once satisfied, we rant the same process in PROD.
    7. Return to step 2 to review the next segment and begin the process again. 

    Once one segment was run and we proceeded to the next segment, when we got the the manual merge process in #4 we included the current segment's results along with the previous segment(s). 

    This process took several months but resulted in merges that were high quality. It really cleaned up our database. For reference there were 10's of thousands of records that we had to review and merge. 

Reply
  • I wouldn't recommend automatically merging records. There should always be some sort of review process prior to any merge.

    We did something similar to what Carol outlined, however we manually triggered the merges. 

    The process went something like this:

    1. Separate the potential merges in to different, smaller segments
    2. Review the segment's resulting records keep/delete matches identifying matches that we thought shouldn't be matches, or should remain separate records
    3. Import one segment into TEST into a local merge segment table
    4. Manually merge the records
    5. Review the results to ensure we were happy with them and make changes to the files when we found things we didn't like.
    6. Once satisfied, we rant the same process in PROD.
    7. Return to step 2 to review the next segment and begin the process again. 

    Once one segment was run and we proceeded to the next segment, when we got the the manual merge process in #4 we included the current segment's results along with the previous segment(s). 

    This process took several months but resulted in merges that were high quality. It really cleaned up our database. For reference there were 10's of thousands of records that we had to review and merge. 

Children
No Data