So we are finally ready to start merging duplicate records! Sadly, we have quite a large number to get through first: about 12,000. Gulp. What I'm thinking of doing is running the identify dupes procedure once and then scheduling the merge procedure to run nightly, so that any dupes that were scheduled that day get merged and come off the list.
Once we've worked through those, our plan is to schedule the procedures to run weekly, but I'm interested to hear how often other organizations schedule them. Do you find it's better to run them daily or weekly?
Any advice on how to get our massive number of duplicates down quickly? I think we're going to divide up the alphabet so we can have several people working on scheduling them at once.
Thanks!
I agree with Dale and Lucie: it's best to look for dupes more than once a week, every day if you can. They multiply out of control very quickly if left unattended.
We run the merge process daily, overnight, and run a variety of ID scripts manually during the day. Don't limit yourself to the standard dupe ID program alone: it will miss some dupes, and a combination of scripts has better odds of catching more of them.
Also, if you are very confident of the quality of your dupes (i.e. the matching rules are very strict) it is possible to automate the scheduling of them via SQL. Early in our Tessitura life we did that with a large number of strictly identified dupes that were left over from our previous system, and were happy with the results.
Hi Chris,
You mentioned it is possible to automate the scheduling of merging duplicates via SQL. How and where is this done?
Thanks,
SS
Add or modify a nightly SQL Server Agent Job to include a step that runs AP_MERGE_CUSTOMER2. Drop me a note offline and I can elaborate further if you like.
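For anyone who wants a starting point, here is a rough sketch of what such a job could look like, using the standard SQL Server Agent procedures in msdb. Only AP_MERGE_CUSTOMER2 comes from this thread. The job name, schedule name, 2 a.m. start time, database name and the dbo schema are all placeholders I made up for illustration, and the thread doesn't say whether the procedure takes parameters, so check that against your own install before running anything.

```sql
USE msdb;
GO

-- Hypothetical job name. If you already have a nightly job,
-- skip sp_add_job and just add the step to that job instead.
EXEC dbo.sp_add_job
    @job_name = N'Nightly duplicate merge';

-- The step that actually runs the merge procedure.
EXEC dbo.sp_add_jobstep
    @job_name      = N'Nightly duplicate merge',
    @step_name     = N'Run AP_MERGE_CUSTOMER2',
    @subsystem     = N'TSQL',
    @database_name = N'YourTessituraDb',               -- placeholder: your database name
    @command       = N'EXEC dbo.AP_MERGE_CUSTOMER2;';  -- schema and parameters (if any) are assumptions

-- Hypothetical schedule: every day at 02:00.
EXEC dbo.sp_add_schedule
    @schedule_name     = N'Nightly at 2am',
    @freq_type         = 4,        -- 4 = daily
    @freq_interval     = 1,        -- every 1 day
    @active_start_time = 020000;   -- HHMMSS

EXEC dbo.sp_attach_schedule
    @job_name      = N'Nightly duplicate merge',
    @schedule_name = N'Nightly at 2am';

-- Target the local server so the Agent will actually run the job.
EXEC dbo.sp_add_jobserver
    @job_name = N'Nightly duplicate merge';
GO
```

It's probably worth trying this on a test system first, since the merge only acts on whatever has already been scheduled and can't easily be undone.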