Using _spin_count to reduce latch contention in 11g
Back in 2003, I published the results of experimenting with various values of _spin_count to improve throughput on latch congested systems. You can find the original paper here.
The study involved varying spin count on an Oracle 8.1.6 database suffering from heavy cache buffer chains latch contention. The database was running on a Windows 2000 server. The results of this study confirmed that _spin_count can be very effective in improving the throughput of latch contention-bound systems. In the test, throughput almost doubled when spin_count was raised from the default of 2000 to 10000.
I recently repeated these tests for Oracle 11g. This time, the database was experiencing shared pool and library cache latch contention and was running on RHEL 5.0. The results will be published soon in my contention series at ToadWorld. To summarize however, increasing spin_count in 11g was equally effective in reducing latch contention. As _spin_count increased, latch waits reduced and throughput increased, up until CPU saturated, at which point no further improvements were achieved.
This chart summarizes typical results:
In this case, the optimal value for spin_count was probably about 8000. I did these tests on a variety of systems and workloads, and almost always the default spin_count of 2000 was lower than optimal
In the 2003 tests, increasing spin count beyond the point at which CPU saturated resulted in latch waits continuing to decrease, but waits for the CPU rendering those reductions useless. In these tests, after CPU saturated latch free waits stopped reducing. I presume that this is because the session was pre-empted by the OS - effectively stopping it's spin - so that when CPU gets stressed excessive _spin_count values are effectively dishonored. In 2003 trials it looked like the sessions kept spinning, creating large run queues. The difference may be due to the scheduling differences between the Windows and Linux kernels, or it might be a difference between 8i and 11g; I'm guessing the former, but I need to do some more research when I get the chance.
I should point out that Spotlight on Oracle, the diagnostic tool i developed at Quest software, has a latch tuning module that will try and discover the most optimal value for spin_count. You might like to try it out.
Reader Comments (2)
The question behind this is, what is a good value for this ratio ? Is it a good idea to increase _spin_count if 1% / 0.5% / 0.1% of misses induce a sleep ?