Duplicate Email Constituent Record Report?

Chelsea Marti $organization over 2 years ago

I am doing a huge database cleanup. In our old system, it was ok for 10 constituent records to all have the same email address attached to them...in Tessitura, it is not. We found that out the hard way, so now I am trying to manually go through and take all the emails off of the constituent records that they don't belong on.

Our Ongoing Services team sent me a list they pulled of all the constituents with the same emails that kept their logins, which I have already gone through. They also pulled us a list of every account they pulled the temp logins off of, but did not specify the duplicate emails. So I have a list of about 90,000 email addresses I am tediously going through to see if each email has multiple records attached to it. Since many do not, it's wasting my time to check those ones.

I have a ticket open about this but haven't heard back in a few days; I'm wondering if there is a list or an extraction or something I can pull myself to get the data I am looking for.

What I am looking for is a list of all of our constituent records that no longer have logins attached, but do have email addresses and that email address is linked to multiple constituent records. Is it doable myself or do I have to wait and have Ongoing Services do it?

Parents

Gawain Lavers $organization over 2 years ago
Okay, so I came up with a manual List query to at least organize the work a bit. If you create a list with one trivial criteria, save, and then go to "show query" you can delete all of the code in the box and replace it with this:

select distinct de.customer_no from T_CUSTOMER as c inner join ( select e.customer_no, case when e.login_no = -1 and d.login_no = -1 then 'N' else 'Y' end as connected_login, case when h.customer_no is not null then 'Y' else 'N' end as hh_dup, cust_count.customers from ( select e1.*, ISNULL(cl.login_no, -1) as login_no from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' left outer join T_CUST_LOGIN as cl on cl.eaddress_no = e1.eaddress_no ) as e inner join ( select e1.*, ISNULL(cl.login_no, -1) as login_no from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' left outer join T_CUST_LOGIN as cl on cl.eaddress_no = e1.eaddress_no ) as d on d.address = e.address and d.eaddress_no <> e.eaddress_no left outer join V_CUSTOMER_WITH_HOUSEHOLD as h on h.customer_no = e.customer_no and h.expanded_customer_no = d.customer_no and h.customer_no <> h.expanded_customer_no left outer join ( select e1.address, COUNT(distinct e1.customer_no) as customers from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' group by e1.address ) as cust_count on cust_count.address = e.address ) as de on de.customer_no = c.customer_no where c.inactive = 1 --and de.connected_login = 'Y' and de.customers = 1 --connected login, single customer account --and de.connected_login = 'N' and de.customers = 1 --no logins, single customer account --and de.connected_login = 'Y' and de.customers > 1 and de.hh_dup = 'Y' --connected login, dup emails within a household --and de.connected_login = 'N' and de.customers > 1 and de.hh_dup = 'Y' --no logins, dup emails within a household --and de.connected_login = 'Y' and de.customers > 1 and de.hh_dup = 'N' --connected login, dup emails outside a household --and de.connected_login = 'N' and de.customers > 1 and de.hh_dup = 'N' --no logins, dup emails outside a household ;

This will give you every customer who has at least one email address that is duplicated somewhere. Probably a lot! However, at the bottom of that block you see six "commented out" lines (i.e. they start with "--"). Remove the first two dashes on any one of them, and the results will be limited as the second comment says. In this fashion you can prioritize your work. I've identified where at least one of the email addresses found is connected to a login, since those are probably more important to fix first? Then there are three other groupings. First are email addresses where there are duplicates, but all duplicates are on the same constituent record. With these you can just click through to the constituent and know that you just have to fix the duplicates that you can see. The second are duplicates where the other email address is owned by a different member of the same household. These, again, can be more easily corrected, since you can just hop between household members using the affiliations menu. Lastly there are duplicates where the duplicate emails are on "unrelated" accounts. Here you'll have to go to the account, look at its email addresses, an then use those in turn in the constituent search menu.

Since, in the case of household or unrelated customer account, the list will contain both customer records of a duplicated email address, periodically regenerating the list will ideally keep you from clicking into accounts you've already fixed.

Caveat 1: this will not work if you are in a consortium and your consortium members don't share constituent accounts. In that case (we're one!) you'll get a ton of copied emails for each consortium member's record of the same person.

Caveat 2: I am not screening out inactive email addresses, nor inactive logins from consideration.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel

Reply

Gawain Lavers $organization over 2 years ago
Okay, so I came up with a manual List query to at least organize the work a bit. If you create a list with one trivial criteria, save, and then go to "show query" you can delete all of the code in the box and replace it with this:

select distinct de.customer_no from T_CUSTOMER as c inner join ( select e.customer_no, case when e.login_no = -1 and d.login_no = -1 then 'N' else 'Y' end as connected_login, case when h.customer_no is not null then 'Y' else 'N' end as hh_dup, cust_count.customers from ( select e1.*, ISNULL(cl.login_no, -1) as login_no from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' left outer join T_CUST_LOGIN as cl on cl.eaddress_no = e1.eaddress_no ) as e inner join ( select e1.*, ISNULL(cl.login_no, -1) as login_no from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' left outer join T_CUST_LOGIN as cl on cl.eaddress_no = e1.eaddress_no ) as d on d.address = e.address and d.eaddress_no <> e.eaddress_no left outer join V_CUSTOMER_WITH_HOUSEHOLD as h on h.customer_no = e.customer_no and h.expanded_customer_no = d.customer_no and h.customer_no <> h.expanded_customer_no left outer join ( select e1.address, COUNT(distinct e1.customer_no) as customers from T_EADDRESS as e1 inner join TR_EADDRESS_TYPE as et on et.id = e1.eaddress_type and et.email_ind = 'Y' group by e1.address ) as cust_count on cust_count.address = e.address ) as de on de.customer_no = c.customer_no where c.inactive = 1 --and de.connected_login = 'Y' and de.customers = 1 --connected login, single customer account --and de.connected_login = 'N' and de.customers = 1 --no logins, single customer account --and de.connected_login = 'Y' and de.customers > 1 and de.hh_dup = 'Y' --connected login, dup emails within a household --and de.connected_login = 'N' and de.customers > 1 and de.hh_dup = 'Y' --no logins, dup emails within a household --and de.connected_login = 'Y' and de.customers > 1 and de.hh_dup = 'N' --connected login, dup emails outside a household --and de.connected_login = 'N' and de.customers > 1 and de.hh_dup = 'N' --no logins, dup emails outside a household ;

This will give you every customer who has at least one email address that is duplicated somewhere. Probably a lot! However, at the bottom of that block you see six "commented out" lines (i.e. they start with "--"). Remove the first two dashes on any one of them, and the results will be limited as the second comment says. In this fashion you can prioritize your work. I've identified where at least one of the email addresses found is connected to a login, since those are probably more important to fix first? Then there are three other groupings. First are email addresses where there are duplicates, but all duplicates are on the same constituent record. With these you can just click through to the constituent and know that you just have to fix the duplicates that you can see. The second are duplicates where the other email address is owned by a different member of the same household. These, again, can be more easily corrected, since you can just hop between household members using the affiliations menu. Lastly there are duplicates where the duplicate emails are on "unrelated" accounts. Here you'll have to go to the account, look at its email addresses, an then use those in turn in the constituent search menu.

Since, in the case of household or unrelated customer account, the list will contain both customer records of a duplicated email address, periodically regenerating the list will ideally keep you from clicking into accounts you've already fixed.

Caveat 1: this will not work if you are in a consortium and your consortium members don't share constituent accounts. In that case (we're one!) you'll get a ton of copied emails for each consortium member's record of the same person.

Caveat 2: I am not screening out inactive email addresses, nor inactive logins from consideration.
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel

Children

Chelsea Marti $organization over 2 years ago in reply to Gawain Lavers

This is great! I'll try it out and see what happens. Thanks for this
Cancel
Vote Up 0 Vote Down

Sign in to reply

Cancel