Identify Duplicates in Constituent Import

Hi everybody,

I have been doing a lot of constituent imports lately and am having a lot of trouble with the Identify Duplicates part of the Utility.  In essence, it doesn’t seem to be catching all of our duplicates.  I checked the Duplicate Matching settings in our T_DEFAULTS table and they seem to be sufficient, but the utility still fails to identify duplicates that match these parameters, even when the imported name and address information is exactly the same as what exists in the db.  For reference, our duplicate matching settings are as follows:

fname=1,lname=20,postal_code=3,street1=3

Has anyone seen this happen before?  Note that we are still on v10.

Thanks!

Frannie

  • That "lname=20" looks awfully long to me. At the moment we are running with these values in T_DEFAULTS:

    fname=2,lname=3,postal_code=4,street1=4

    This string is the result of lots of experimentation. It finds perhaps more false positives than we'd like, but results in the best output for us, for now. (For v11 it finds lots of Households that aren't dupes, and hopefully the standard dupe code is being improved for v12.)

    All that said, I've never trusted the identify dupes portion of the constituent import process, and have preferred to import without it, then look for dupes as a separate process. YMMV, as always...
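
The thread does not spell out what the numbers in the settings string mean. The sketch below assumes each number is how many leading characters of that field must agree, which is one reading of why "lname=20" looks long. The function names and the sample records are hypothetical and are not part of Tessitura.

```python
def parse_match_settings(setting: str) -> dict:
    """Parse 'fname=2,lname=3' into {'fname': 2, 'lname': 3}."""
    result = {}
    for part in setting.split(","):
        field, _, length = part.partition("=")
        result[field.strip()] = int(length)
    return result


def is_potential_duplicate(a: dict, b: dict, settings: dict) -> bool:
    """Assumed rule: two records match when the first N characters of
    every configured field agree, ignoring case and outer whitespace."""
    for field, length in settings.items():
        left = (a.get(field) or "").strip().lower()[:length]
        right = (b.get(field) or "").strip().lower()[:length]
        if left != right:
            return False
    return True


# Hypothetical records: same person, slightly different data entry.
existing = {"fname": "Jane", "lname": "Smith-Jones",
            "postal_code": "10001", "street1": "12 Main Street"}
incoming = {"fname": "Janet", "lname": "Smith",
            "postal_code": "10001-4321", "street1": "12 Main St"}

loose = parse_match_settings("fname=2,lname=3,postal_code=4,street1=4")
strict = parse_match_settings("fname=1,lname=20,postal_code=3,street1=3")

loose_result = is_potential_duplicate(existing, incoming, loose)
strict_result = is_potential_duplicate(existing, incoming, strict)
```

Under this assumed rule, the shorter lname prefix flags the pair while the 20-character prefix does not, because "Smith-Jones" and "Smith" differ within the first 20 characters. Note that it does not explain exact matches being missed, which is the problem reported at the top of the thread.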

  • Thanks Chris. Per your advice I adjusted our Duplicate Matching settings, but the utility is still missing a HUGE percentage of duplicates that match the criteria.

    Is this a known problem for anyone else?

    Or Chris, what do you do if you import constituents without letting the utility check for duplicates?  Do you rely on the Merge Constituents screen?  Or does your org have other custom checks?  

    Any advice welcomed - this is becoming hugely time-consuming and messy.  We at times have imports of 900+ constituents--Tess will detect about 150 dupes when the utility runs in Review, and if I spot check, I can find up to 400-500 more.  Pulling my hair out!

    Thanks all--

    Frannie

  • Frannie,

    I would check the @changed_since_days parameter on your AP_IDENTIFY_DUPLICATES procedure.  Here is an excerpt from the documentation on identifying dups.

    "@changed_since_days – Each time the procedure runs, the list of potential duplicates is cleared. By default, a pair of potential duplicates that is not merged will not be added to the list of potential duplicates again unless one of the records had activity since the last time the procedure ran. The activity date window can be specified in days using the optional @changed_since_days parameter.

    For example, if the @changed_since_days parameter is set to 30, one record in a potential duplicate pair must have had activity within the last 30 days in order for the pair to be included in the results. Setting @changed_since_days to a time period of several years will refresh the list of potential duplicates so that old pairs that were not merged can be reviewed again."

    Dale
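
The window described in the quoted documentation can be sketched as below. This is only an illustration of the documented rule for an explicit number of days; it does not model the NULL default ("activity since the last time the procedure ran"), and the function and argument names are hypothetical rather than taken from AP_IDENTIFY_DUPLICATES.

```python
from datetime import date, timedelta


def pair_is_listed(last_activity_a: date, last_activity_b: date,
                   changed_since_days: int, today: date) -> bool:
    """A pair is listed when at least one of the two records had
    activity on or after the cutoff date."""
    cutoff = today - timedelta(days=changed_since_days)
    return last_activity_a >= cutoff or last_activity_b >= cutoff


# Hypothetical activity dates for one unmerged pair.
today = date(2013, 5, 6)
record_a = date(2012, 1, 15)
record_b = date(2011, 6, 1)

within_30_days = pair_is_listed(record_a, record_b, 30, today)
within_730_days = pair_is_listed(record_a, record_b, 730, today)
```

With a 30-day window neither record is recent enough, so the pair is left out; widening the window to 730 days brings it back for review.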

  • Thanks for the tip Dale.  I checked our @changed_since_days parameter in the AP_IDENTIFY_DUPLICATES procedure, and the variable is set to NULL.  I am thinking that this setting means there is no limit to how recently an account must have had activity for Tess to include it as a potential duplicate.

    If there are any others with similar experiences or a fix I'd love to hear.  

    Thanks,
    Frannie 

  • Frannie,

    If your @changed_since_days is set to NULL, then I would expect the default behavior, which according to the documentation is: "By default, a pair of potential duplicates that is not merged will not be added to the list of potential duplicates again unless one of the records had activity since the last time the procedure ran."

    Try changing your @changed_since_days to a large number like 730 (2 years) and see what happens.

    Dale

  • Hi Frannie,

    I'm working on using Constituent Import to load in some schools and found this thread while troubleshooting duplicate resolution. Essentially, it's not finding any duplicates.

    I audited the procedure and found at least two bugs in the FT_DUPLICATES_FOR_NEW_CONSTITUENT function that are causing the problem. One relates to reading our duplicate check values from T_DEFAULTS, and the other relates to it skipping rows, such as organizations and households, that have a NULL fname. I'm opening a TASK ticket to get this resolved and, in the meantime, implementing some corrections myself so we can get this working properly. 

    Are the records you are having trouble with organizations, individuals, or a combination of the two? 

    Thanks,
    David 
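
The actual code of FT_DUPLICATES_FOR_NEW_CONSTITUENT is not shown in this thread, so the following is only a sketch of the general class of bug described for NULL fname values. In SQL, a comparison involving NULL is never true, so a plain equality test on fname silently drops organizations and households. All names below are hypothetical.

```python
def sql_equal(left, right) -> bool:
    """Mimic SQL '=': a comparison involving NULL (None) is not true."""
    if left is None or right is None:
        return False
    return left == right


def naive_match(a: dict, b: dict) -> bool:
    """Plain equality on both fields; rows with a NULL fname never match."""
    return sql_equal(a["fname"], b["fname"]) and sql_equal(a["lname"], b["lname"])


def null_safe_match(a: dict, b: dict) -> bool:
    """Treat a missing fname as an empty string before comparing,
    similar in spirit to wrapping the column in ISNULL(fname, '')."""
    same_fname = (a["fname"] or "") == (b["fname"] or "")
    return same_fname and sql_equal(a["lname"], b["lname"])


# Hypothetical organization record: no first name, name stored in lname.
school_in_db = {"fname": None, "lname": "Lincoln High School"}
school_in_import = {"fname": None, "lname": "Lincoln High School"}

naive_result = naive_match(school_in_db, school_in_import)
null_safe_result = null_safe_match(school_in_db, school_in_import)
```

Here the naive comparison misses an identical organization, while the NULL-safe version finds it. Individuals with a first name behave the same under both, so this would not by itself account for missed individual records.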

  • Hi David,

    Thanks for chiming in!  All of the records we're importing are individuals.

    For the most recent round, I adjusted the @changed_since_days parameter per Dale's suggestion and I *think* the procedure might have caught more duplicates (?).  It is still not identifying all - it sounds like maybe the first bug you mention, about the proc reading check values, could relate?

    At this point, I'm doing manual checks using homegrown SQL in conjunction with the built-in utility check for almost all imports and sometimes, in the end, going person by person.  I tabled the issue for a few weeks, but am hoping there will be time in the summer to reach out to Tess Consulting to see if they can cook up something more robust for our org.  We do way too many imports to have it work halfway.

    Let me know if you come up with anything else, and good luck!

    Frannie

  • You’ve indeed spotted a few issues with this function.  I’ve followed up with Dave in the support ticket that was opened.  We’ll be posting a v11+ hotfix on TASK for this once we’ve completed internal testing.  This is also fixed for v12.

     

    Ryan Creps
    Tessitura Network

     

    (Posted by email in reply to David Frederick's message of Monday, May 6, 2013 3:46 PM; the thread was started by Frances O'Connell on 4/4/2013 10:06:52 AM.)