Parallelism on SQL Server

I work at the Museum of Science, Boston as their Senior DBA.  I’d like to know if anyone has adjusted these two parallelism settings in your on-premise SQL Server:

  • Cost Threshold for Parallelism = 5
  • Max Degree of Parallelism = 0

We have slowdowns periodically and when we do, I use Spotlight on SQL Server to make sure the production SQL Server isn’t the problem.  In doing some of that forensic work, I noticed undue pressure on the CPU.  I don't think it's the culprit but it made me review the settings more deeply.  

My research is showing that we should set the cost threshold to something other than 5 (most often quoted at 50) and the MAXDOP should be equal to or less than the CPU (8 or less).  If we set it to 1, we’ll essentially turn parallelism ‘off’ so the range is 2-8.

We opened a ticket with Tessitura to ask their advice but they won't comment on server settings like this.  We run reports directly against production, so I’m wondering if some of the larger/longer ones are eating up the CPU cycles across all CPU’s and not yielding to the more chatty queries that take no time to run.

Has anyone adjusted these settings and had any luck?  Our environment is quite chatty - we have butterfly shows every 15 minutes, planetarium shows every hour, etc.  We don't sell just one or two performances a day - more like hundreds throughout the day (including exhibit hall admissions).  I'm seeing a lot of SOS_SCHEDULER_YIELD wait types showing up which makes me believe we're having this issue.  I've been watching it over the past month or two, so it's a cumulative observation, not one isolated to a particularly bad time.

Thanks for the help.

 

  • How many cores and NUMA nodes (processors) are you running? I'm relatively new at this, but that seems relevant.

    I set cost threshold to 50 based on the recommendation from the folks at brentozar.com -- their story is that the default setting of 5 may have made sense many years ago, but single threaded CPU performance has changed dramatically since then (and SQL Server's scale for query cost hasn't).

    Our production server is 4 cores on one NUMA node, and I set MAXDOP to 2.

    I'm by no means an expert on this, and I've never had to triage performance issues on our server, but my understanding is that you don't want parallelism to span multiple NUMA nodes (if you have them), and I wouldn't want a single query to take over all of my cores if I'm running an OLTP workload. It sounds like the strategy you need to go for is to let the reporting queries take as long as they need to in order to leave enough cores free to handle all of the transactions. So... maybe start with a lower MAXDOP and work your way up?

  • And just a theory -- what could be happening on your system is with your cost threshold so low, lots of queries are going parallel that would be fine single-threaded, and so all the extra parallelism overhead is inflating your CPU usage.

  • Hi Heather,

     

    We did make adjustments to this in our environment early last year. I likely read some of the same material that turned up in your research! I took a fairly conservative approach, and I believe we did see some improvement. (At the very least, it caused no issues or deterioration in performance.)

     

    Here are the settings we currently use:

     

    ·         Cost Threshold for Parallelism = 15

    ·         Max Degree for Parallelism = 4

     

    We are currently running 8 vCPUs. I increased Cost Threshold for Parallelism to ensure queries with lower cost did not use parallelism. This was meant to be an interim step before raising it to something like 50, as you mentioned, but I haven’t had time to circle back to it. I set Max Degree of Parallelism to 4 for similar reasons Nick mentioned: I don’t want any single query to be spread across all of the cores allocated to our database VM. I see this as important as generally speaking, our longest running queries tend to be for reporting. We did change this and use it for at least 2-3 weeks in one of our test environments before making the change in our live database.

     

    We use SolarWinds Database Performance Analyzer, which I believe is a similar product to Spotlight. We see very little SOS_SCHEDULER_YIELD waits at this point – in fact, most days, we have no waits of this type on a typical day. Sadly, I’m not sure what that wait looked like before we made this change. While I doubt changing these parallelism settings would resolve the root issue of that particular wait, it might at least provide some relief if there are some queries that are misbehaving, so to speak, and spread across all CPUs when they run.

     

    I hope this helps – good luck to you!

     

    Thanks,

    David

    _______________________________________________________
    David Frederick
    Database Applications Analyst
    Segerstrom Center for the Arts
    600 Town Center Drive,
    Costa Mesa, CA 92626
    T (714) 556-2122 x 4067 E DFrederick@SCFTA.org

     

  • That was my conclusion too Nick.   I think we have a majority of threads that can run single-threaded but they’re getting pushed aside when a largish report comes along and goes parallel.  I saw the same article from Brent too.  Good advice and well explained.

     

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Nick Reilingh
    Sent: Wednesday, March 01, 2017 2:19 PM
    To: Heather Sheridan <hsheridan@mos.org>
    Subject: Re: [Tessitura Technical Forum] Parallelism on SQL Server

     

    And just a theory -- what could be happening on your system is with your cost threshold so low, lots of queries are going parallel that would be fine single-threaded, and so all the extra parallelism overhead is inflating your CPU usage.

    From: Nick Reilingh <bounce-nicholasreilingh4883@tessituranetwork.com>
    Sent: 3/1/2017 1:46:56 PM

    How many cores and NUMA nodes (processors) are you running? I'm relatively new at this, but that seems relevant.

    I set cost threshold to 50 based on the recommendation from the folks at brentozar.com -- their story is that the default setting of 5 may have made sense many years ago, but single threaded CPU performance has changed dramatically since then (and SQL Server's scale for query cost hasn't).

    Our production server is 4 cores on one NUMA node, and I set MAXDOP to 2.

    I'm by no means an expert on this, and I've never had to triage performance issues on our server, but my understanding is that you don't want parallelism to span multiple NUMA nodes (if you have them), and I wouldn't want a single query to take over all of my cores if I'm running an OLTP workload. It sounds like the strategy you need to go for is to let the reporting queries take as long as they need to in order to leave enough cores free to handle all of the transactions. So... maybe start with a lower MAXDOP and work your way up?




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!

  • To cover the first question, we are running 1 NUMA node.  We have 2 sockets with 4 cores per socket.  We have a total of 8 logical processors.  I think if we tried this, I might set the MAXDOP to 4 to start.

     

    From: Tessitura Technical Forum [mailto:forums-technical@tessituranetwork.com] On Behalf Of Nick Reilingh
    Sent: Wednesday, March 01, 2017 2:00 PM
    To: Heather Sheridan <hsheridan@mos.org>
    Subject: Re: [Tessitura Technical Forum] Parallelism on SQL Server

     

    How many cores and NUMA nodes (processors) are you running? I'm relatively new at this, but that seems relevant.

    I set cost threshold to 50 based on the recommendation from the folks at brentozar.com -- their story is that the default setting of 5 may have made sense many years ago, but single threaded CPU performance has changed dramatically since then (and SQL Server's scale for query cost hasn't).

    Our production server is 4 cores on one NUMA node, and I set MAXDOP to 2.

    I'm by no means an expert on this, and I've never had to triage performance issues on our server, but my understanding is that you don't want parallelism to span multiple NUMA nodes (if you have them), and I wouldn't want a single query to take over all of my cores if I'm running an OLTP workload. It sounds like the strategy you need to go for is to let the reporting queries take as long as they need to in order to leave enough cores free to handle all of the transactions. So... maybe start with a lower MAXDOP and work your way up?

    From: Heather Sheridan <bounce-heathersheridan3553@tessituranetwork.com>
    Sent: 3/1/2017 9:35:29 AM

    I work at the Museum of Science, Boston as their Senior DBA.  I’d like to know if anyone has adjusted these two parallelism settings in your on-premise SQL Server:

    • Cost Threshold for Parallelism = 5
    • Max Degree of Parallelism = 0

    We have slowdowns periodically and when we do, I use Spotlight on SQL Server to make sure the production SQL Server isn’t the problem.  In doing some of that forensic work, I noticed undue pressure on the CPU.  I don't think it's the culprit but it made me review the settings more deeply.  

    My research is showing that we should set the cost threshold to something other than 5 (most often quoted at 50) and the MAXDOP should be equal to or less than the CPU (8 or less).  If we set it to 1, we’ll essentially turn parallelism ‘off’ so the range is 2-8.

    We opened a ticket with Tessitura to ask their advice but they won't comment on server settings like this.  We run reports directly against production, so I’m wondering if some of the larger/longer ones are eating up the CPU cycles across all CPU’s and not yielding to the more chatty queries that take no time to run.

    Has anyone adjusted these settings and had any luck?  Our environment is quite chatty - we have butterfly shows every 15 minutes, planetarium shows every hour, etc.  We don't sell just one or two performances a day - more like hundreds throughout the day (including exhibit hall admissions).  I'm seeing a lot of SOS_SCHEDULER_YIELD wait types showing up which makes me believe we're having this issue.  I've been watching it over the past month or two, so it's a cumulative observation, not one isolated to a particularly bad time.

    Thanks for the help.

     




    This message was sent automatically to you by www.tessituranetwork.com because you subscribed to the Tessitura Technical Forum. You may reply to this message to post to the Technical forum or visit the site to search, read and post to the forums. In the interest of keeping the forum posts from becoming cluttered, we encourage you to delete previous message text from your reply before sending. Thank you!