illegal characters in user customer data

Hi,

I was thinking of creating a stored procedure that would detect any records in the typical tables (T_Customer, T_Address, T_Phone, T_Eaddress) with characters symbols where they dont belong, and generally inappropriate characters like tab and return etc.

Has anyone come across a report like this already? I appreciate any advice before I get started inventing the wheel here.

  • Would the stored procedure update these incorrect findings or simply output them (say in a report?). If you're just looking to find these kinds of things, you could just set up some lists to find them. We have a variety of 'data cleanup' lists that find common errors with constituent names and addresses. The lists can be regenerated at any time and the box office staff does the cleanup. Happy to share that if you think it would be helpful.

    If you're wanting a procedure to update them automatically, I don't have something like that in place but I know others do. Hopefully one of them is watching and willing to share. :)

  • Aaron,

    With Tessitura we provide a procedure called UP_FIND_INVALID_CHARACTERS that looks for ascii characters 1-31 or 127.

    The main Constituent tables are covered in this proc but you can always create a local version to add more tables.  

    If @fix_data is set to 'Y' this actually fixes the data.  It should be run repeatadly until no additional errors are returned by the proc.


    Dale

  • Beth,

    We are always looking for ways to improve our clean-up efforts, and, while I shall continue to pay attention to this thread for additional ideas, I would also love to see what you have that you use in terms of 'data cleanup' lists with such common errors.

    Feel free to just either e-mail me or let me know where I should go.

    Thanks!

    John

  • Thank you everybody for you help. I'll be giving the UP a try for now. If thats not sufficient, I'll reach out to you Beth.

  • Dale -- should that procedure get documented in Tessitura Database Utilities? Anything else missing from there?

    On the topic of general data maintenance for customers/addresses, I have the following script set to run nightly to trim whitespace from the beginning or end of these fields:

    UPDATE c
    SET c.fname = LTRIM(RTRIM(fname)),
    	c.mname = LTRIM(RTRIM(mname)),
    	c.lname = LTRIM(RTRIM(lname))
    FROM dbo.T_CUSTOMER c
    WHERE c.fname LIKE ' %' OR c.fname LIKE '% ' OR
    	c.mname LIKE ' %' OR c.mname LIKE '% ' OR
    	c.lname LIKE ' %' OR c.lname LIKE '% ';
    
    UPDATE a
    SET a.street1 = LTRIM(RTRIM(street1)),
    	a.street2 = LTRIM(RTRIM(street2)),
    	a.city = LTRIM(RTRIM(city))
    FROM dbo.T_ADDRESS a
    WHERE a.street1 LIKE ' %' OR a.street1 LIKE '% ' OR
    	a.street2 LIKE ' %' OR a.street2 LIKE '% ' OR
    	a.city LIKE ' %' OR a.city LIKE '% ';

    P.S. Neat new "Code Sample" button in the forum editor! Now if only it supported SQL highlighting...

  • Beth I would love to know how you set up these lists.

    Dee Dee

     

     

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Beth Hawryluk
    Sent: Tuesday, July 25, 2017 5:15 PM
    To: Dee Dee Fite
    Subject: Re: [Tessitura Technical Forum] illegal characters in user customer data

     

    Would the stored procedure update these incorrect findings or simply output them (say in a report?). If you're just looking to find these kinds of things, you could just set up some lists to find them. We have a variety of 'data cleanup' lists that find common errors with constituent names and addresses. The lists can be regenerated at any time and the box office staff does the cleanup. Happy to share that if you think it would be helpful.

    If you're wanting a procedure to update them automatically, I don't have something like that in place but I know others do. Hopefully one of them is watching and willing to share. :)

    From: Aaron Glynn <bounce-aaronglynn8317@tessituranetwork.com>
    Sent: 7/25/2017 4:53:31 PM

    Hi,

    I was thinking of creating a stored procedure that would detect any records in the typical tables (T_Customer, T_Address, T_Phone, T_Eaddress) with characters symbols where they dont belong, and generally inappropriate characters like tab and return etc.

    Has anyone come across a report like this already? I appreciate any advice before I get started inventing the wheel here.




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!

  • Thanks Nick, I'm checking in with our documentation team on this.

    Dale

  • > With Tessitura we provide a procedure called UP_FIND_INVALID_CHARACTERS

    Very helpful; as per your suggestion I made a version that looks at T_ORDER custom fields when we were having an issue with those. 

    On the topic of non-printable characters, when I just want to see what's in a field before I optionally fix it, I've found the function shared here to be very helpful:

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=164618

     



    [edited by: Chris Jensen at 2:54 PM (GMT -6) on 27 Jul 2017]
  • Hello Aaron,

     

    Here is my script:

     

    -----------NONPRINTING CHARACTERS IDENTIFICATION-----------

    /*

    Created by Simon Basyuk, DBA of Carnegie Hall

    --PLEASE NOTE:

    the script will catch all nonprinting characters in a specified Tess table, for instance: T_CUSTOMER , T_ADDRESS

    The nonprinting characters examples: CR, LF (Line Feed), Tab, Space, .. --any one from  ASCII=0 through ASCII=31  

     

    --expected result:

    cust_no=xxxxxxxx, str=Simon Basyuk

     Houshold, ASCII=10, @l_pos=12, l_ch= >

    */

     

    declare

    @l_cntr                               int,

    @l_pos                               int,

    @l_str                                 varchar(256),

    @l_ch                                  char(1),

    @l_cust                              int,

    @l_ASCII_start   int,@l_ASCII_end int

     

    set @l_ASCII_start=0                                                    --HARDCODED!

    set @l_ASCII_end=31--ASCII 32 is space char         --HARDCODED!

     

    declare crs1 cursor local for SELECT

    customer_no, lname  --or: fname, postal_code street1,street2 --NOTE: TABLE NAME TO BE PROVIDED!

    from T_CUSTOMER --or: T_ADDRESS  for street1,street2

     

    open crs1

    fetch next from crs1 into @l_cust, @l_str

    while (@@fetch_status=0)

    BEGIN 

    set @l_cntr=@l_ASCII_start

    while @l_cntr<=@l_ASCII_end

    --checking for the first occurance of the char only!

    begin

      set @l_ch=char(@l_cntr)

      set @l_pos=CHARINDEX(@l_ch,@l_str)

      if @l_pos > 0

        print 'cust_no='+ convert(varchar,@l_cust)+', str='+@l_str+

        ', ASCII='+ convert(varchar,@l_cntr)+', @l_pos='+ convert(varchar,@l_pos)+', l_ch= >'+@l_ch+'<'

      if @l_cntr >=400--safety!

        BREAK

      set @l_cntr=@l_cntr+1

    end --while @l_cntr<=

    fetch next from crs1 into @l_cust, @l_str

    END --while @@fetch_status=0

    close crs1

    deallocate crs1

     

    Regards,

    Simon Basyuk

     


    Carnegie Hall
    Simon Basyuk
    Manager, Database Administration
    sbasyuk@carnegiehall.org
    Carnegie Hall
    881 Seventh Avenue
    New York, NY 10019
    T: 212-903-9691
    M: 646-296-0003
    Please consider the environment before printing this e-mail.
    This message may contain confidential or privileged information. If you are not the intended recipient, please advise the sender immediately by reply email and delete this message and any attachments without retaining a copy.

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Aaron Glynn
    Sent: Tuesday, July 25, 2017 4:59 PM
    To: Basyuk, Simon <sbasyuk@carnegiehall.org>
    Subject: [Tessitura Technical Forum] illegal characters in user customer data

     

    Hi,

    I was thinking of creating a stored procedure that would detect any records in the typical tables (T_Customer, T_Address, T_Phone, T_Eaddress) with characters symbols where they dont belong, and generally inappropriate characters like tab and return etc.

    Has anyone come across a report like this already? I appreciate any advice before I get started inventing the wheel here.




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!

  • Former Member
    Former Member $organization

    Hi,

     

    Not sure if you already know about this -- Tessitura has a stored procedure, UP_FIND_INVALID_CHARACTERS, that can be run in reporting-only mode or fixing mode. It checks these tables: t_customer, t_address, tx_cust_sal and tx_alias_cust. And it checks for various characters.  We have it scheduled to run nightly.

     

    Thanks,
    Debbie

     

     

     

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Simon Basyuk
    Sent: Tuesday, September 12, 2017 4:08 PM
    To: Debbie Jacob <djacob@mos.org>
    Subject: RE: [Tessitura Technical Forum] illegal characters in user customer data

     

    Hello Aaron,

     

    Here is my script:

     

    -----------NONPRINTING CHARACTERS IDENTIFICATION-----------

    /*

    Created by Simon Basyuk, DBA of Carnegie Hall

    --PLEASE NOTE:

    the script will catch all nonprinting characters in a specified Tess table, for instance: T_CUSTOMER , T_ADDRESS

    The nonprinting characters examples: CR, LF (Line Feed), Tab, Space, .. --any one from  ASCII=0 through ASCII=31  

     

    --expected result:

    cust_no=xxxxxxxx, str=Simon Basyuk

     Houshold, ASCII=10, @l_pos=12, l_ch= >

    */

     

    declare

    @l_cntr                               int,

    @l_pos                               int,

    @l_str                                 varchar(256),

    @l_ch                                  char(1),

    @l_cust                              int,

    @l_ASCII_start   int,@l_ASCII_end int

     

    set @l_ASCII_start=0                                                    --HARDCODED!

    set @l_ASCII_end=31--ASCII 32 is space char         --HARDCODED!

     

    declare crs1 cursor local for SELECT

    customer_no, lname  --or: fname, postal_code street1,street2 --NOTE: TABLE NAME TO BE PROVIDED!

    from T_CUSTOMER --or: T_ADDRESS  for street1,street2

     

    open crs1

    fetch next from crs1 into @l_cust, @l_str

    while (@@fetch_status=0)

    BEGIN 

    set @l_cntr=@l_ASCII_start

    while @l_cntr<=@l_ASCII_end

    --checking for the first occurance of the char only!

    begin

      set @l_ch=char(@l_cntr)

      set @l_pos=CHARINDEX(@l_ch,@l_str)

      if @l_pos > 0

        print 'cust_no='+ convert(varchar,@l_cust)+', str='+@l_str+

        ', ASCII='+ convert(varchar,@l_cntr)+', @l_pos='+ convert(varchar,@l_pos)+', l_ch= >'+@l_ch+'<'

      if @l_cntr >=400--safety!

        BREAK

      set @l_cntr=@l_cntr+1

    end --while @l_cntr<=

    fetch next from crs1 into @l_cust, @l_str

    END --while @@fetch_status=0

    close crs1

    deallocate crs1

     

    Regards,

    Simon Basyuk

     

     

    Carnegie Hall

    Simon Basyuk

    Manager, Database Administration

    sbasyuk@carnegiehall.org

    Carnegie Hall

    881 Seventh Avenue
    New York, NY 10019

    T: 

    212-903-9691

    M: 

    646-296-0003

    Please consider the environment before printing this e-mail.

    This message may contain confidential or privileged information. If you are not the intended recipient, please advise the sender immediately by reply email and delete this message and any attachments without retaining a copy.

     

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Aaron Glynn
    Sent: Tuesday, July 25, 2017 4:59 PM
    To: Basyuk, Simon <sbasyuk@carnegiehall.org>
    Subject: [Tessitura Technical Forum] illegal characters in user customer data

     

    Hi,

    I was thinking of creating a stored procedure that would detect any records in the typical tables (T_Customer, T_Address, T_Phone, T_Eaddress) with characters symbols where they dont belong, and generally inappropriate characters like tab and return etc.

    Has anyone come across a report like this already? I appreciate any advice before I get started inventing the wheel here.




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!