R Programming

Hello folks,

     Has anyone experimented with R for analysis?  If you have, I'd love to chat about your experience and use cases.

 

Have a super day,

John

  • I haven't, but it's on my to-do list - so I'd love to hear what folks say as well! 

  • Former Member

    Like Samantha, it's on my list. I've played around with it a little bit but so far only with sample data. I have yet to try to connect it to SQL Server. Definitely interested in hearing more.

  • Do you have any examples of the types of analysis you'd like to do?  Even high-level examples are helpful.

  • Former Member in reply to John Jakovich

    I'm mostly interested in visualizations that go beyond what's currently possible/easy with T-Stats and Excel. When looking at our Major Gifts program in particular, things like average gift size don't tell a useful story, because the tiny handful of very large gifts you get (or don't get) in a year really skews the numbers. I've put a lot of time and effort into creating box plots, for example (something that Excel doesn't do natively, though you can hack a stacked column chart toward the same end). Sometimes I get to a finished result and then we think it might be interesting to look at a slightly different iteration, and that process eats up so much time. I expect that if I invested the time into learning how to do this in R, it could be a simpler process. Maybe.
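To make the skew problem concrete, here is a minimal sketch in Python (not R, and not anyone's production code). The gift amounts are invented for illustration. It compares the average gift with the five-number summary that a box plot draws.

```python
import statistics

# Hypothetical gift amounts for one year: mostly modest gifts plus two very
# large ones. These numbers are made up for illustration.
gifts = [250, 500, 500, 750, 1000, 1000, 1500, 2500, 50000, 250000]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot is built from."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the data
    # as the whole population rather than a sample.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


mean_gift = statistics.mean(gifts)
summary = five_number_summary(gifts)

print(f"mean gift:   {mean_gift:,.0f}")
print(f"median gift: {summary[2]:,.0f}")
print("min, Q1, median, Q3, max:", summary)
```

With these invented numbers the two largest gifts pull the mean far above the median, which is why the quartiles describe the typical gift better than the average does.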

    I'd also like to see dynamically updating charts--campaign revenue over time vs prior years, for example. We currently have fairly intricate Excel charts that are hand-updated weekly, which obviously makes them vulnerable to data entry errors, not to mention the time we could save if they were automated. I explored ways of doing this with Tableau, but the cost is prohibitive for the use cases I've come up with so far. I've played with accomplishing this in SSRS too, but it is currently way down my list of priorities...!
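The "campaign revenue over time vs prior years" chart boils down to a running total per year, lined up by week. Below is a rough sketch of that calculation in Python, assuming the data arrives as (gift date, amount) rows. The rows and the `cumulative_by_week` helper are hypothetical, and a real version would read from the warehouse instead of a hard-coded list.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Hypothetical (gift_date, amount) rows standing in for a reporting query.
rows = [
    (date(2015, 1, 5), 1000), (date(2015, 1, 14), 500), (date(2015, 1, 20), 2000),
    (date(2016, 1, 4), 1500), (date(2016, 1, 12), 250), (date(2016, 1, 19), 3000),
]


def cumulative_by_week(rows):
    """Return {year: [running total through week 1, week 2, ...]} by ISO week."""
    weekly = defaultdict(lambda: defaultdict(int))
    for gift_date, amount in rows:
        # ISO weeks start on Monday; the first days of January can belong to
        # the previous ISO year, so the ISO year is used as the grouping key.
        iso_year, iso_week = gift_date.isocalendar()[:2]
        weekly[iso_year][iso_week] += amount

    result = {}
    for year, weeks in weekly.items():
        # Fill weeks with no gifts with 0 so the years line up week by week.
        per_week = [weeks.get(week, 0) for week in range(1, max(weeks) + 1)]
        result[year] = list(accumulate(per_week))
    return result


result = cumulative_by_week(rows)
for year in sorted(result):
    print(year, result[year])
```

Because the totals are recomputed from the rows on every run, re-running the script after new gifts are entered updates the series without any hand editing. The resulting lists can be handed to whatever charting tool is in use.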

  • John,

     

    I’ve experimented with R and RStudio.

    We have been able to connect R via ODBC to our Tessitura T-Stats data warehouse, and we have experimented with Rattle (“A free graphical interface for data mining with R”), a user interface that is supposed to make things easier. I guess that it does. However, from my point of view, that whole process has been fairly heavy lifting. To be successful, one really needs training to use R well, and I have not yet had an opportunity to do this training.

    I’ve done a little bit of correlation visualization; for example, a sort of silly one on the Memb_Fact table out of the T-Stats warehouse.

    It is clear that Microsoft is making an investment in R as part of their machine learning and data visualization practices, so I see it as worth the time to learn this. Again, it’s a question of time and finding someone to help me on my way.

     

    --Tom

    718.724.8135

    tbrown@BAM.org

     




  • Former Member in reply to Tom Brown (Past Member)

    Thanks, Tom! Lots of good info there. I've had mixed results using the PowerPivot and PowerQuery plugins for Excel (vague error messages even when they're not in use), but I'll give Power BI a look.

  • Matt,

    I’ve run into the same things with Major Gifts, and I just recently learned how to do box-and-whisker graphs. Several tools have them.

    The new Power BI Desktop is able to do several types of box plots, and it can connect directly to the database.

    Check out the Power BI Visualization Gallery for all sorts of cool visuals.

    https://app.powerbi.com/visuals/

    [Image: image001.png]

    The Mekko chart can be helpful as well for data sets with large skew.

    [Image: image002.png]

    The Community version of RapidMiner also does this, as long as you can use an MS Excel file as an input source. However, it cannot connect to the database directly. If you have the paid version ($1500 to $2000 per seat), the database connection is quite nice.

    [Image: image003.png]

    The output looks OK here. (I’m not going to take the time to load data to Excel.)

    KNIME, another freemium tool, is also able to do the same thing. This one will connect directly to a database. In this case I’ve connected to the T-Stats data warehouse.

    [Image: image004.png]

    I don’t like the output from KNIME.

    I cannot figure out how to make Tableau Public do a box-and-whisker chart. There is a menu item. However, in the Public version almost nothing seems to work, and you have to share your data publicly.

     

    --Tom

    718.724.8135

    tbrown@BAM.org

     




  • The main thing with anything Power BI is to have lots of memory and to run on a 64-bit Windows OS.

     

    --Tom

    718.724.8135

    tbrown@BAM.org

     


     



  • I'm becoming interested in machine learning for record linkage / de-duplication.  There are a number of tools in R and Python that seem to be focused on this subject.

    In Tessitura we seem to have some interesting data for this process: the historical merge data, from which these tools might "learn" our business rules.

    This is important, from my point of view, because if 10% - 20% of the records in our system are undiscovered or unmerged duplicates, that can really skew analytical counts and, of course, could cause unneeded expenditure on mailings and the like, with related customer service concerns.
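As a starting point for the record-linkage idea, here is a minimal Python sketch that flags candidate duplicate pairs. It uses plain string similarity from the standard library, not machine learning. The records, field names, and the 0.85 threshold are all assumptions for illustration, not Tessitura's schema or rules. A model trained on historical merge data would, in effect, replace the hand-set threshold and matching rule.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical constituent records; the field names are illustrative only.
records = [
    {"id": 1, "name": "Jonathan Smith", "email": "jsmith@example.com"},
    {"id": 2, "name": "Jon Smith", "email": "jsmith@example.com"},
    {"id": 3, "name": "Maria Garcia", "email": "mgarcia@example.com"},
    {"id": 4, "name": "Jonathan Smyth", "email": "jon.smyth@example.org"},
]


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(records, name_threshold=0.85):
    """Return (id, id, name_score) for pairs that look like the same person.

    A pair is flagged when the emails match exactly or the names are at
    least `name_threshold` similar. The threshold is an arbitrary guess.
    """
    pairs = []
    for left, right in combinations(records, 2):
        same_email = left["email"].lower() == right["email"].lower()
        name_score = similarity(left["name"], right["name"])
        if same_email or name_score >= name_threshold:
            pairs.append((left["id"], right["id"], round(name_score, 2)))
    return pairs


pairs = candidate_duplicates(records)
for left_id, right_id, score in pairs:
    print(f"possible duplicate: {left_id} and {right_id} (name similarity {score})")
```

Comparing every pair is fine for a small sample, but the number of pairs grows with the square of the record count. A real run over a full constituent table would first group records (for example by postal code or last-name initial) and compare only within each group.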

  • This is a really good use case.  I'm looking forward to connecting again at TLCC.

  • Tom, 

    We have kicked off a project to tackle some of the same issues.  We have found anecdotally that many of our customers have 2, 3, sometimes 5 or 6 customer records.  The cost of this can be an issue, as can accuracy in reporting and data analysis.  It's causing us to look at some of the processes by which these multiple accounts are created (Box Office rushing, customer forgets password and creates another account with a new email, etc.).

    I had not considered using ML for this, but my curiosity is piqued.  Let me know what you are thinking.  Maybe we can try a few things in tandem.

    Missed you this year at TLCC! 

    Jamie

  • Hi Tom and Jamie,

    I am most interested as well and will be investigating further.  Maybe we can set up a WebEx to discuss? I am out on vacation all next week but will be back the following week.

    Arthur





  • I’m also very interested in this, as well as in other applications of machine learning. I may not have much to add to the discussion, but I am keen to learn about it and figure out how to apply it to our systems.

     

    Brian Ramos

    Controller

                                                                     

    Opera Philadelphia

    Direct 215.893.5940

    Main 215.893.3600

    Guest Services 215.732.8400

    operaphila.org







  • TLCC has gotten big.  I was there.  Does anyone want to help coordinate setting up a call?

