Guy Harrison

Saturday

Mar012008

Using _spin_count to reduce latch contention in 11g

Saturday, March 1, 2008 at 8:17PM

Back in 2003, I published the results of experimenting with various values of _spin_count to improve throughput on latch congested systems. You can find the original paper here.

The study involved varying spin count on an Oracle 8.1.6 database suffering from heavy cache buffer chains latch contention. The database was running on a Windows 2000 server. The results of this study confirmed that _spin_count can be very effective in improving the throughput of latch contention-bound systems. In the test, throughput almost doubled when spin_count was raised from the default of 2000 to 10000.

I recently repeated these tests for Oracle 11g. This time, the database was experiencing shared pool and library cache latch contention and was running on RHEL 5.0. The results will be published soon in my contention series at ToadWorld. To summarize however, increasing spin_count in 11g was equally effective in reducing latch contention. As _spin_count increased, latch waits reduced and throughput increased, up until CPU saturated, at which point no further improvements were achieved.

This chart summarizes typical results:

Spin count versus latch contention

In this case, the optimal value for spin_count was probably about 8000. I did these tests on a variety of systems and workloads, and almost always the default spin_count of 2000 was lower than optimal

In the 2003 tests, increasing spin count beyond the point at which CPU saturated resulted in latch waits continuing to decrease, but waits for the CPU rendering those reductions useless. In these tests, after CPU saturated latch free waits stopped reducing. I presume that this is because the session was pre-empted by the OS - effectively stopping it's spin - so that when CPU gets stressed excessive _spin_count values are effectively dishonored. In 2003 trials it looked like the sessions kept spinning, creating large run queues. The difference may be due to the scheduling differences between the Windows and Linux kernels, or it might be a difference between 8i and 11g; I'm guessing the former, but I need to do some more research when I get the chance.

I should point out that Spotlight on Oracle, the diagnostic tool i developed at Quest software, has a latch tuning module that will try and discover the most optimal value for spin_count. You might like to try it out.

2 Comments |

Reader Comments (2)

Do you have some numbers concerning the sleep / spin ratio with different values of _spin_count you tried ?

The question behind this is, what is a good value for this ratio ? Is it a good idea to increase _spin_count if 1% / 0.5% / 0.1% of misses induce a sleep ?

March 12, 2009 |

Colin Pitrat

I think every system will be different. The chance that increasing the spin count will increase the hit rate is going to be highly dependent on the type of latch and the workload. My approach when we implemented a solution in Spotlight on Oracle (www.quest.com/spotlight) was to measure wait times and as we adjusted the spin count until we found the spin count that resulted in the least waits and the highest throughput.

March 26, 2009 |

Guy Harrison

Post a New Comment

Enter your information below to add a new comment.

My response is on my own website »

Author:

Author Email (optional):

Author URL (optional):

Post:

↓ | ↑

Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>

Using _spin_count to reduce latch contention in 11g

Reader Comments (2)

Post a New Comment

Link an External Response