Hello hive mind!
I'm trying to pull a list of max donation amounts for people who qualify as a major donor, however, my query is pulling more than just the max value for a given constituent. Essentially I'm getting any contribution amount for the individual that qualifies as a major gift. I just want to return one row per constituent ID that has their highest ever gift amount. What am I missing here? I'm still learning SQL, so any help is greatly appreciated.
Here is the code:
SELECT a.cont_amt, a.customer_no, a.cont_dt, b.fname, b.lname, b.sort_nameFROM impresario.dbo.T_CONTRIBUTION aJOIN impresario.dbo.T_CUSTOMER b ON a.customer_no = b.customer_noWHERE EXISTS (SELECT MAX(cont_amt), customer_noFROM impresario.dbo.T_CONTRIBUTIONGROUP BY customer_noHAVING MAX(cont_amt) > 1500)ORDER BY a.cont_amt DESC;
Jordan,
This is one of those SQL fun things that is always a little quirky. What you are trying to find is the associated A, B, C, D, etc... columns where column X is at its max. Really, the best way to do that is using CTEs (or temporary tables, depending upon your situation) using the ROW_NUMBER() function as a ranking solution.
Here is something that should get you most of the way there:
; --Do not omit thisWITH cte_max_conts (cont_amt,customer_no,ref_no,max_cont_rank)AS ( SELECT ci.cont_amt, ci.customer_no, ci.ref_no, ROW_NUMBER() OVER(PARTITION BY ci.customer_no ORDER BY ci.cont_amt DESC) AS max_cont_rank FROM T_CONTRIBUTION ci WHERE ci.cont_amt > 1500 )SELECT c.cont_amt, c.customer_no, c.cont_dt, cr.fname, cr.lname, cr.sort_nameFROM T_CONTRIBUTION c JOIN cte_max_conts cte ON c.ref_no = cte.ref_no JOIN T_CUSTOMER cr ON c.customer_no = cr.customer_noWHERE cte.max_cont_rank = 1
Let me know if that does not help.
John
P.S. to anyone out there on Tessitura's web team watching, I tried to use the "insert code" function here, but there was something wrong with it as it kept reloading itself driving the "ok" button lower. Windows 10, Firefox, updated to the most recent version of the browser.
Ahh, thank you! That worked! I was trying to use a CTE (without really understanding what it was doing), but couldn't find enough help on how to do it properly using my google skills, haha.
I was also having that issue with the insert code function!
Just think of a CTE as a one-time use temporary table. You create it, and then just use it immediately in your query below.
As for the kicker, though, that would be the ROW_NUMBER() OVER (PARTITION BY...) section. That function is highly useful and worth researching. The details are a little more complex than this, but it works sort of like an internal GROUP BY and ORDER BY together for the SELECT area of your current query without GROUPing or ORDERing the rest of the query.
I paraphrased a bit, but if you use what I sent plus a little documentation, that should pretty much get you there.
Amazing, thank you! I will definitely research that more.
The CTE is not required for the data you were looking for. CTE's add additional overhead, be careful in using them.
The query can be built like this:
Set ArithAbort OnSET TRANSACTION ISOLATION LEVEL READ UNCOMMITTEDSelect CT.customer_no , isnull(C.fname,'') as fname , C.lname , c.sort_name, cont_dt , cont_amt from (Select customer_no, cont_amt , Cont_dt , ROW_NUMBER () over (partition by Customer_No Order by Cont_AMT desc, Cont_Dt desc) as SeqNo from T_CONTRIBUTION where cont_amt > 1500 ) as CT inner join T_CUSTOMER as C on CT.customer_no = C.customer_no where SeqNo = 1
Performance statistics without CTE:
(3903 rows affected)Table 'T_CONTRIBUTION'. Scan count 9, logical reads 40705, physical reads 0, read-ahead reads 40705, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'T_CUSTOMER'. Scan count 0, logical reads 20435, physical reads 1, read-ahead reads 2717, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Performance statistics using Jordan's sample with CTE
(3903 rows affected)Table 'T_CONTRIBUTION'. Scan count 18, logical reads 81410, physical reads 0, read-ahead reads 40705, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'T_CUSTOMER'. Scan count 0, logical reads 20868, physical reads 1, read-ahead reads 2717, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see the CTE version doubled the number of logical reads, scan count, created an extra work table and did more reads on the customer table. The execution plan was also much simpler for the one without the CTE.
You are not incorrect, and it is true, I will often make different decisions when I do one-time queries versus queries that I am putting into a procedure. CTEs can be helpful, but can be dangerous, too. Same thing with temporary table, cursors and many other things.
I certainly was not trying to imply that this was the only way to get it done, and your method indeed simplifies the process one further step by internalizing the initial pull. On the other hand, there are also similar situations where a CTE would be more necessary. But always helpful to make sure both perspectives on anything like CTEs or Cursors are represented to make sure no one goes crazy.
I have seen databases tied up in nasty ways due to my own carelessness, so I am always grateful for reminders.
What would make a CTE dangerous in an instance like this (or not like this for that matter)?
In this example a CTE is not overly dangerous, it is just making SQL Server work a little harder than it needs to for the same results. What you want to avoid is getting into coding habits that could potentially have an adverse impact when coding for something much more impactful due to frequency of use or volume of data.
For example if this was a query to pull available performances to display for a landing page it would be much more important to make sure it was optimally coded due to both the frequency in which it is called and volume of data.
Got it, thank you for helping me learn!