Hello folks,
Has anyone experimented with R for analysis? If you have, I'd love to chat about your experience and use cases.
Have a super day,
John
I haven't, but it's on my to-do list - so I'd love to hear what folks say as well!
Like Samantha, it's on my list. I've played around with it a little bit but so far only with sample data. I have yet to try to connect it to SQL Server. Definitely interested in hearing more.
Do you have any examples of the types of analysis you'd like to do? Even high-level examples are helpful.
I'm mostly interested in visualizations that go beyond what's currently possible/easy with T-Stats and Excel. When looking at our Major Gifts program in particular, things like average gift size don't tell a useful story because the tiny handful of very large gifts you get (or don't get) in a year really skew the numbers. I've put a lot of time and effort into creating box plots, for example (something that Excel doesn't do natively, though you can hack a stacked column chart toward the same end). Sometimes I get to a finished result and then we think it might be interesting to look at a slightly different iteration, and that process eats up so much time. I expect that if I invested the time into learning how to do this in R, it could be a simpler process. Maybe.
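To make the point about skew concrete, here is a minimal sketch in Python (chosen only for illustration; R's built-in fivenum and boxplot functions cover similar ground). The gift amounts are made up, and five_number_summary is a hypothetical helper name, not something from T-Stats or any library.

```python
import statistics


def five_number_summary(gifts):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    ordered = sorted(gifts)
    # method="inclusive" treats the data as the whole population,
    # so the quartiles never fall outside the observed range.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Made-up gift amounts: several modest gifts plus one very large one.
gifts = [1000, 1500, 2000, 2500, 3000, 5000, 7500, 10000, 250000]

mean_gift = statistics.mean(gifts)
summary = five_number_summary(gifts)

print(f"mean:   {mean_gift:,.0f}")
print(f"median: {summary[2]:,.0f}")
print("five-number summary:", summary)
```

With this sample the single large gift pulls the mean above the third quartile, while the median and quartiles still describe the typical gift. That is the story a box plot tells and an average hides.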
I'd also like to see dynamically updating charts--campaign revenue over time vs prior years, for example. We currently have fairly intricate Excel charts that are hand-updated weekly, which obviously makes them vulnerable to data entry errors, not to mention the time we could save if they were automated. I explored ways of doing this with Tableau, but the cost is prohibitive for the use cases I've come up with so far. I've played with accomplishing this in SSRS too, but it is currently way down my list of priorities...!
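One way to take the hand-updating out of a chart like this is to compute the running totals directly from the gift rows. The following is a rough sketch in Python, not a finished report: the rows are made up (in practice they would come from a database query), cumulative_by_week is a hypothetical name, and grouping by ISO week is just one assumption about how "over time" might be bucketed.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate


def cumulative_by_week(gifts):
    """Map each ISO year to 53 running revenue totals (index = ISO week - 1)."""
    weekly = defaultdict(lambda: [0.0] * 53)
    for gift_date, amount in gifts:
        iso_year, iso_week, _ = gift_date.isocalendar()
        weekly[iso_year][iso_week - 1] += amount
    # Turn per-week totals into a running total for each year.
    return {year: list(accumulate(totals)) for year, totals in weekly.items()}


# Made-up (date, amount) rows standing in for a query result.
gifts = [
    (date(2015, 1, 7), 500.0),
    (date(2015, 1, 21), 250.0),
    (date(2016, 1, 6), 800.0),
    (date(2016, 1, 20), 100.0),
]

cumulative = cumulative_by_week(gifts)
for week in range(4):
    print(week + 1, cumulative[2015][week], cumulative[2016][week])
```

Each year's list can then be plotted as one line, so the current campaign sits on the same axis as prior years and refreshes whenever the query is rerun.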
John,
I’ve experimented with R and RStudio.
We have been able to connect R via ODBC to our Tessitura T-Stats data warehouse and have experimented with Rattle (“A free graphical interface for data mining with R”), a user interface that is supposed to make things easier. I guess that it does. However, from my point of view, that whole process has been fairly heavy lifting. To use R successfully, one really needs training, and I have not yet had an opportunity to do this training.
I’ve done a little bit of correlation visualization. For example, this is sort of a silly one on the Memb_Fact table out of the T-Stats warehouse.
It is clear that Microsoft is making an investment in R as part of their machine learning and data visualization practices, so I see it as worth the time to learn. Again, it’s a question of time and finding someone to help me on my way.
--Tom
…
718.724.8135
tbrown@BAM.org
From: Self-service Business Intelligence [mailto:groups-selfservicebi@tessituranetwork.com] On Behalf Of John Jakovich Sent: Friday, March 25, 2016 12:47 PM To: Thomas Brown <tbrown@bam.org> Subject: [Self-service Business Intelligence] R Programming
Thanks Tom! Lots of good info there. I've had mixed results using the PowerPivot and PowerQuery plugins for Excel (vague error messages, even when they're not in use), but I'll give Power BI a look.
Matt,
I’ve run into the same things with Major Gifts, and I just recently learned how to do box-and-whisker graphs. Several tools have them.
The new Power BI Desktop is able to do several types of box plots, and it can connect directly to the database.
Check out the Power BI Visualization Gallery for all sorts of cool visuals.
https://app.powerbi.com/visuals/
The Mekko Chart can be helpful as well for data sets with large skew.
The Community version of RapidMiner also does this, as long as you can use an MS Excel file as an input source. However, it cannot connect to the database directly. If you have the paid version ($1500 to $2000 per seat), the database connection is quite nice.
The output looks OK here. (I’m not going to take the time to load data into Excel.)
KNIME, another freemium tool, is also able to do the same thing. This one will connect directly to a database. In this case I’ve connected to the T-Stats data warehouse.
I don’t like the output from KNIME.
I cannot figure out how to make Tableau Public do a box-and-whisker chart. There is a menu item. However, in the Public version almost nothing seems to work, and you have to share your data publicly.
From: Self-service Business Intelligence [mailto:groups-selfservicebi@tessituranetwork.com] On Behalf Of Matthew Echert Sent: Friday, March 25, 2016 3:06 PM To: Thomas Brown <tbrown@bam.org> Subject: Re: [Self-service Business Intelligence] R Programming
From: John Jakovich <bounce-johnjakovich8396@tessituranetwork.com> Sent: 3/25/2016 1:14:37 PM
The main thing with anything Power BI is to have lots of memory and run on a 64-bit Windows OS.
From: Self-service Business Intelligence [mailto:groups-selfservicebi@tessituranetwork.com] On Behalf Of Matthew Echert Sent: Friday, March 25, 2016 7:22 PM To: Thomas Brown <tbrown@bam.org> Subject: RE: [Self-service Business Intelligence] R Programming
From: Tom Brown <bounce-tombrown3568@tessituranetwork.com> Sent: 3/25/2016 10:35:58 PM
I'm becoming interested in machine learning for record linkage / de-duplication. There are a number of tools in R and Python that seem to be focused on this subject.
In Tessitura, we seem to have some interesting data for this process: historical merge data from which these tools might "learn" our business rules.
The reason this is important, from my point of view, is that if 10% - 20% of the records in our system are undiscovered or unmerged duplicates, this can really skew analytical counts and, of course, could cause unneeded expenditure on mailings and the like, with related customer service concerns.
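As a starting point for experimenting, here is a minimal sketch in Python that flags likely duplicate pairs. It uses plain string similarity from the standard library's difflib rather than machine learning, so treat it as a baseline only. The field names (id, name, email, postal_code), the sample records, and the 0.85 threshold are all assumptions for illustration and do not reflect the Tessitura schema.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical fields used for comparison; not the Tessitura schema.
COMPARE_FIELDS = ("name", "email", "postal_code")


def normalize(record):
    """Lower-case and join the comparison fields into one string."""
    return " ".join(str(record[field]).strip().lower() for field in COMPARE_FIELDS)


def candidate_duplicates(records, threshold=0.85):
    """Return (id_a, id_b, score) for pairs at or above the similarity threshold."""
    pairs = []
    # Compares every pair, which is fine for a sample but slow on a full table.
    for a, b in combinations(records, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


# Made-up records: 1 and 2 differ by one letter in the name.
records = [
    {"id": 1, "name": "Jane Smith", "email": "jane.smith@example.com", "postal_code": "11217"},
    {"id": 2, "name": "Jane Smyth", "email": "jane.smith@example.com", "postal_code": "11217"},
    {"id": 3, "name": "Robert Jones", "email": "rjones@example.org", "postal_code": "19102"},
]

matches = candidate_duplicates(records)
print(matches)
```

Historical merge decisions could later serve as labeled examples for a learned model; a fixed threshold like this one is only a stand-in for that step, and flagged pairs would still need human review.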
This is a really good use case. I'm looking forward to connecting again at TLCC.
Tom,
We have kicked off a project to tackle some of the same issues. We have found anecdotally that many of our customers have 2, 3, sometimes 5 or 6 customer records. The cost of this can be an issue, as well as just accuracy in reporting and data analysis. It's causing us to look at some of the processes by which these multiple accounts are created (Box Office rushing, customer forgets password and creates another with new email, etc.).
I had not considered using ML for this, but my curiosity is piqued. Let me know what you are thinking. Maybe we can try a few things in tandem.
Missed you this year at TLCC!
Jamie
From: Tom Brown <bounce-tombrown3568@tessituranetwork.com> Sent: 7/10/2016 1:08:08 AM
I’m also very interested in this, as well as other applications of machine learning. I may not have much to add to the discussion, but I am very interested in learning about it and figuring out how to apply it to our systems.
Brian Ramos
Controller
Opera Philadelphia
Direct 215.893.5940
Main 215.893.3600
Guest Services 215.732.8400
operaphila.org
From: Self-service Business Intelligence [mailto:groups-selfservicebi@tessituranetwork.com] On Behalf Of Arthur Curtis Sent: Friday, August 12, 2016 7:58 AM To: Ramos, Brian <ramos@operaphila.org> Subject: Re: [Self-service Business Intelligence] R Programming
Hi Tom and Jamie,
I am most interested as well and will be further investigating. Maybe we can set up a WebEx to further discuss? I am out on vacation all next week but will be back the following week.
Arthur
On Aug 11, 2016, at 5:49 PM, Jamie Shover <bounce-jamieshover9674@tessituranetwork.com> wrote: