Script to replace personal data

I've been asked to 'anonymise' a copy of our database by replacing any data that can be used to identify someone - names, addresses, phone numbers, email addresses, payment details etc. The replacement data can either be random, or perhaps incremental generic (such as Customer0000001, Customer0000002 etc)

Has anyone had to do this before, and if so would they mind sharing their thoughts / hints / code??? I'm not sure how many tables need to be scrubbed at this point (but I'm guessing it's going to be quite a few!) and there's always the possiblity that it will need to be done again, so any info would be gratefully received.

Thanks,
Steve

Parents Reply Children
  • Hi Martin.

    It certainly would be easier to do that, but I've not been given the luxury of that option I'm afraid. The brief is to supply a copy of the database in which all information that could be used to identify an individual has been removed, prior to the copy being given to a 3rd party. I've identified about 40 tables so far - some only having 1 field that needs to be updated, some having 8 - but once I've worked out the method and what I'm going to use as replacement data, it shouldn't be too bad. I was just wondering if anyone had any code already written.

    Steve

  • Hi Steve,

    I would suggest to check out all tables containing column "Lname", see SQL below. (In our db there are 24 such tables excluding custom tables). Also, consider modification of customer names in the Credit Card table T_ACCOUNT_DATA. If the requirements allow it, I would purge both CC tables though: T_ACCOUNT_DATA and T_CC_SERVER_LOG.

    Simon

    use impresario
    go
    select distinct
    sysobjects.name tbl_name
    from syscolumns , sysobjects
    where syscolumns.id=sysobjects.id
    and sysobjects.type='U' 
    and syscolumns.name like '%Lname%'
    order by sysobjects.name