Automating Duplicate Merges

Elise Bargman $organization over 3 years ago

Hello! Our organization is looking into developing a stored procedure to match and merge some of our duplicates automatically. Have any of you ever done this before, or do you know anyone who has? I'd love to hear what worked and what didn't, and find an existing procedure to base ours off of.

Grant Offermann $organization over 3 years ago

Hi Elise,

We looked into it years ago but found that a majority of the duplicate merges need a set of eyes to look over both records to determine which one to keep/delete. So the benefit of writing something to automate some of that would be minimal.

We use CSI's to allow duplicate merges to be managed by a variety of staff members in our box office so it ends up not being much work.
Cancel
Vote Up +1 Vote Down

Sign in to reply

Cancel
Sarah Berryman (she/her) $organization over 3 years ago

Hi Elise,

Like Grant, we also looked into this a few years back. We spoke to Tessitura consulting and several organisations that had tried to do something similar... the conclusion was that it was too risky and needed humans to double-check for incorrect matches.

We currently have our box office team do merging during down time which isn't a very exciting task but felt safer than relying on a stored procedure. But I'd be very interested to hear if you do get it working!
Cancel
Vote Up +1 Vote Down

Sign in to reply

Cancel
Carol Keeney $organization over 3 years ago

I have done this for multiple organizations, but it does require a fair bit of analysis into how the kept and delete sides should be scheduled to be merged. Generally, VIPS and large donors are marked with Void Merge so they don't get pulled into the automated merging. For the process I wrote, it schedules the merges, sends out a report to the necessary contacts, and doesn't perform the merge until there is a sufficient time delay for the contact to unschedule any merges they aren't comfortable with.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel
Phillip Parks $organization over 3 years ago
I wouldn't recommend automatically merging records. There should always be some sort of review process prior to any merge.

We did something similar to what Carol outlined, however we manually triggered the merges.

The process went something like this:

Separate the potential merges in to different, smaller segments

Review the segment's resulting records keep/delete matches identifying matches that we thought shouldn't be matches, or should remain separate records

Import one segment into TEST into a local merge segment table

Manually merge the records

Review the results to ensure we were happy with them and make changes to the files when we found things we didn't like.

Once satisfied, we rant the same process in PROD.

Return to step 2 to review the next segment and begin the process again.

Once one segment was run and we proceeded to the next segment, when we got the the manual merge process in #4 we included the current segment's results along with the previous segment(s).

This process took several months but resulted in merges that were high quality. It really cleaned up our database. For reference there were 10's of thousands of records that we had to review and merge.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel
Chris Jensen $organization over 3 years ago

Elise Bargman said:
Our organization is looking into developing a stored procedure to match and merge some of our duplicates automatically.

Years ago, I recall Ken McSwain at the SOH created a process for that org that auto-merged records, saving all of the pre-merge data in a separate database, I think. A part of this suite included utilities that could un-merge records, too, since everything was saved. Very complex, and as far as I know, no-one else has set up anything that elaborate.

At the Guthrie, we did do some bulk merging in the weeks after we went live, at least 10-15K records iirc, identified with very strict matching rules and auto-categorized into Ks and Ds. We've use less strict rules ever since, and therefore checked all possible merges with human eyes before scheduling.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel
Brenda Fehlberg $organization over 3 years ago in reply to Chris Jensen

I recently caught up with staff at SOH and they have rewritten the auto script.
Currently I am working at the Arts Centre Melbourne and we are currently investigating using a script to pick up very obvious merges for an auto-merge. Previously IT have run the script to merge files. This has very strict business rules which we are revisting now. This will of course be tested before utilising in PROD.
Cancel
Vote Up +1 Vote Down

Sign in to reply

Cancel
Trini Ramos $organization over 3 years ago

Hello. We worked with Tessitura to build one for us. It was a long but fruitful process we completed a couple of years ago, shortly after we went live. We had brought in data from three different sources and had thousands of obvious duplicates. We required at least one member from each department to meet weekly with the Tessitura team. We created a hierarchy to determine the keep and delete record, like current membership or VIP constiturency. We also provided criteria where we would absolutely need to review manually and that's in a daily report. The basic duplicate criteria is name and address (after we did a NCOA update) and so far, we haven't had to "pull apart" an incorrect merge. (I'm not saying it's not happening, it's just that no one has squawked in the past two years.) This has helped us tremendously and was worth the effort and resources to make it happen. Hope that helps!
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel
Katie Lachance-Duffy $organization over 3 years ago

Hi Elise,

There was a similar discussion a while back about merging where I mentioned the idea of "curated merging". (Not my term -- I heard it from a fellow Tessitura user and adopted it. :))

The short of it is, automating duplicate identification and/or merge scheduling is certainly possible, but looking over those merges before they happen is really important, even if you have a means of reversing the merge after the fact.

I'm happy to talk more! Good luck in your constituent consolidation journey!
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel
Heath Wilder $organization over 3 years ago

We had a duplicate auto merge at SDC which was great.

I'm quite passionate about this topic. There is a real business case in labour saving and inaccuracy of inactive accounts vs the fear of incorrect merges deleting whole relationships.

For me the root problem is the difficulty in demerging.

I don't understand the binary of the problem. In terms of duplicates there are obvious through the various stages of likely, to unsure. If you can find obvious then merge them. The likelys then curate. The unsure, leave or survey for more data. The trick is testing your theories to prove yourself wrong (like any good scientist)
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel