Tuesday, December 1, 2009

More on the database flash cache

I'm eagerly awaiting my high-performance flash SSD (an Intel X25-E), but in the meantime I've played a bit more with the database flash cache on the cheap hardware setup described in my last post. Sometimes it can be useful to test new features on slow hardware, since you observe phenomena that don't occur when everything is running at full speed.

I originally, naively, imagined that blocks would be copied into the flash cache by the Oracle server process - e.g., that when I read a block from disk, I would deposit it in both the buffer cache and the flash cache. However, upon investigation it appears that blocks are moved from the buffer cache to the flash cache by the DBWR as they are about to be flushed from the buffer cache.

This is, of course, a much better approach. The DBWR can write to the flash cache asynchronously, so that user sessions get the benefit - less time reading from magnetic disk - without having to wait while blocks are inserted into the flash cache.


So the lifecycle of a block looks something like this:

  1. The Oracle server process reads a block from disk and adds it to the buffer cache
  2. If a session wants that block later and it's still in the buffer cache, it can be read from the buffer cache
  3. Before the block leaves the buffer cache, the DBWR will write it to the flash cache (if the DBWR is not too busy)
  4. If a session wants the block later and it's still in the flash cache, it will be read from the flash cache (and presumably placed back in the buffer cache)
  5. If the block is modified, the DBWR will eventually write it back to the disk.  (Q: What happens to any unmodified copies of that block in the flash cache?)

DBWR and the flash cache

Reading from flash is fast - but writing is much slower.  Therefore,  in order to avoid performance issues, the DBWR should:

  1. Not write to flash unless it has to, and
  2. Not write to flash at all if doing so would get in the way of other, more important activities.

To the first point, the DBWR does not appear to populate the flash cache until blocks are about to be flushed. In other words, the DBWR doesn't write a buffer to the flash cache just in case it is going to be flushed, but only if it is actually (or probably) about to be flushed. We do not observe writes to the flash cache when blocks are brought into the buffer cache unless other blocks are also being flushed out.

Secondly, the DBWR won't write to the flash cache - and won't delay blocks from being flushed - if it is busy writing dirty blocks to disk. The DBWR must clear dirty blocks from the cache as quickly as possible; otherwise, "free buffer waits" will prevent new blocks from being introduced to the buffer cache. If blocks are about to be flushed from the buffer cache while the DBWR is busy, those blocks will not get written to the flash cache, and the statistic 'flash cache insert skip: DBWR overloaded' will be incremented.

Measuring db flash cache effectiveness

The following query reports how often the DBWR is forced to skip flash cache inserts because it is overloaded, or for other reasons.
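It boils down to something like this (a sketch; the statistic names are as V$SYSSTAT reports them in 11.2):

    SELECT name, value
      FROM v$sysstat
     WHERE name LIKE 'flash cache insert%'
     ORDER BY name;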

The "DBWR overloaded" statistic I think we understand - the DBWR is busy writing dirty blocks to disk and hasn't the bandwidth to write clean blocks to the flash cache.

"flash cache insert skip: exists" is also easy to understand.  If we read a block back from the flash cache into the buffer cache, then it remains in the flash cache.  There's no need to write it to the flash cache again should it again age out of the buffer cache.

"not current" probably means that the DBWR is aware of a more recent copy of the block on disk or in the buffer cache and for that reason is going to decline to write a version to flash.

"not useful" I do not understand....

The biggest takeaway here is - I think:

On a system with a very busy DBWR,  the db flash cache may be less effective. 

To measure the benefit of the flash cache, we can take the number of flash cache reads that avoided a physical disk read and multiply it by the difference between the average times for the two types of read. This query attempts to do that, though only for single block reads.
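Here's a sketch of the calculation; it assumes the 11.2 wait event names 'db file sequential read' (single block reads from disk) and 'db flash cache single block physical read' (single block reads from flash):

    SELECT f.total_waits flash_cache_reads,
           d.time_waited_micro / d.total_waits avg_disk_read_us,
           f.time_waited_micro / f.total_waits avg_flash_read_us,
           ROUND(f.total_waits *
                 (  d.time_waited_micro / d.total_waits
                  - f.time_waited_micro / f.total_waits) / 1000000, 2)
                estimated_seconds_saved
      FROM v$system_event d, v$system_event f
     WHERE d.event = 'db file sequential read'
       AND f.event = 'db flash cache single block physical read';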

Conclusion

The architecture of the DB flash cache allows us to relax a bit about the relatively poor write performance associated with flash SSD. If we place a datafile directly on a flash SSD, we risk creating a free buffer wait bottleneck: the DBWR will stall trying to write blocks to the flash drive - slow in terms of write latency - and so we might actually get worse performance than from magnetic disk, at least if we have a high write rate. But with the flash cache, the DBWR only writes when it has free time. Consequently, there should be little real downside - the worst that can happen is that the DBWR is too busy to populate the flash cache, and so the cache becomes less useful.

The efficiency of the db flash cache is going to depend on two main factors:

  1. Does the DBWR have enough free time to write to the flash cache?
  2. How long does it take to write to the flash device?

The answer to the first question depends on the current rate of DBWR activity, which in turn probably depends on how much IO write bandwidth exists on your disk array. But if you have the sort of database that suffers from, or is on the verge of suffering from, free buffer waits, then the db flash cache probably won't work too well, because the DBWR will never be able to write to it.

The answer to the second question depends on the quality of the flash SSD. You generally want to use flash disks with the lowest possible write latency. That usually means support for the TRIM API call, and flash that uses Single Level Cells (SLC).


Tuesday, November 24, 2009

Using the Oracle 11gR2 database flash cache

Oracle just released a patch which allows you to use the database flash cache on Oracle Enterprise Linux even if you don't have Exadata storage. The patch is obscurely named:

  •    8974084:META BUG FOR FLASH CACHE 11.2PL BUGS TO BACKPORT TO 11.2.0.1 OEL

Once you install the patch, you can use any old flash device as a database flash cache. Below I've documented some initial dabbling on a very old server with a cheap USB flash device. The results are not representative of the performance you'd get on quality hardware, but they are still interesting, I think.

Setup and configuration

If, like me, you just want to experiment using a USB flash device, then you first need to get that device mounted. On my test machine I created a directory "/mnt/usbflash", then created an /etc/fstab entry like this:

   /dev/sda1               /mnt/usbflash           vfat    noauto,users,rw,umask=0 0 0

On your system you might need to change "/dev/sda1" to another device, depending on how your fixed disks are configured. You should then be able to mount the flash drive by typing "mount /dev/sda1". Make sure that the mount point is writable by oracle (chmod 777 /mnt/usbflash).

Once mounted, you configure the flash cache by setting the parameters DB_FLASH_CACHE_FILE and DB_FLASH_CACHE_SIZE.
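For example (a sketch - the file name is arbitrary, and both parameters need a restart to take effect):

    ALTER SYSTEM SET db_flash_cache_file = '/mnt/usbflash/oracle_flash_cache.dat'
        SCOPE = SPFILE;
    ALTER SYSTEM SET db_flash_cache_size = 1G SCOPE = SPFILE;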

Note that the value of DB_FLASH_CACHE_FILE needs to be a file on the flash drive, not the flash drive mount point itself.

Once these parameters are set, the flash cache will be enabled and will act as a secondary cache to the buffer cache.  When a block is removed from the primary cache, it will still exist in the flash cache, and can be read back without a physical read to the spinning disk.

Monitoring

There are a few ways to examine how the flash cache is being used. Firstly, V$SYSSTAT contains some new statistics showing the number of blocks added to the cache and the number of "hits" that were satisfied from the cache (script here).
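In outline, the script does something like this (a sketch; statistic names as they appear in 11.2):

    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('physical reads',
                    'physical read flash cache hits',
                    'flash cache inserts');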

Some new wait events are now visible, showing the waits incurred when adding to or reading from the flash cache. On this system, total time spent on 'db flash' waits exceeded 'db file sequential read' time, even though each read from the flash cache is much quicker than a read from disk - there were simply a lot more of them.
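A query along these lines against V$SYSTEM_EVENT (a sketch) puts the two side by side:

    SELECT event, total_waits,
           ROUND(time_waited_micro / 1000000, 2) time_waited_secs,
           ROUND(time_waited_micro / GREATEST(total_waits, 1) / 1000, 3) avg_wait_ms
      FROM v$system_event
     WHERE event LIKE 'db flash%'
        OR event = 'db file sequential read'
     ORDER BY time_waited_micro DESC;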

Now, remember that I could not have picked worse hardware for this test - an old two-CPU Intel box and a cheap thumb drive. Even so, it's remarkable how high the write overhead is in comparison to the total time. Although the flash reads save time when compared to db file sequential reads, the overhead of maintaining the cache can be high, because flash-based SSD has a relatively severe write penalty.

All flash-based Solid State Disks have issues with write performance. However, cheap Multi Level Cell (MLC) flash takes about three times as long to write as the more expensive Single Level Cell (SLC) variety. When a flash drive is new, the empty space can be written to in single-page increments (usually 4KB). However, as the flash drive ages, writes typically require erasing a complete 128-page block, which is very much slower. My cheap USB drive was old and MLC, so it had very poor write performance. But even the best flash-based SSD is going to be much slower for writes than for reads, and in some cases using a flash cache might slow a database down as a result. So monitoring is important.

There are a couple of other V$SYSSTAT statistics of interest:

To examine the contents of the cache, we can query the V$BH view. Buffers in the flash cache have STATUS values such as 'flashcur', allowing us to count the buffers from each object in the main buffer cache and in the flash cache (script is here).
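In outline (a sketch, joining V$BH to DBA_OBJECTS):

    SELECT o.object_name, bh.status, COUNT(*) buffers
      FROM v$bh bh
      JOIN dba_objects o ON (o.data_object_id = bh.objd)
     WHERE bh.status <> 'free'
     GROUP BY o.object_name, bh.status
     ORDER BY o.object_name, bh.status;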

In this case, the TXN_DATA table has 85,833 blocks in the flash cache and 28,753 blocks in the main buffer cache. 

Conclusion

I was happy to get the flash cache working, even with this crappy hardware. I'm really happy that Oracle opened this feature up to non-Exadata hardware.

I'll be getting a better setup soon so that I can see how this works with a decent commercial SSD flash drive.

I'm a big believer in the need for a hierarchy of storage, ideally including at least two layers - one for mass storage, the other for fast retrieval. We should be cautious, however, because the flash write penalty may result in performance problems similar to those we've seen with RAID5.

Wednesday, October 28, 2009

Performant Oracle programming:  perl

In the last installment, we looked at the fundamentals of efficient Oracle programming with C#. In particular, we saw how to use bind variables and how to perform array processing. Now let's take a look at perl.

I have a strong affection for perl: I started using perl somewhere in the early nineties, and did some simple hacking to allow the Oracle interface of the day - oraperl - to use stored procedures. My very first true performance monitoring tools were all written in perl. Perl has always had good support for Oracle, and it's possible to write very efficient and powerful Oracle database utilities in perl.

You can get the perl program that implements the examples in this posting here.

Thanks for all the perl!

Anyone who uses perl with Oracle today owes a debt of thanks to the many people who have maintained the Oracle perl interface (DBD::Oracle) over the years.

Kevin Stock created the original perl 4 "oraperl" modules, which Tim Bunce used to create the modern DBD::Oracle.  Tim maintained it for many years, but handed it over to John Scoles of the Pythian group in 2006.  Pythian are now sponsoring the ongoing maintenance of DBD::Oracle and they do this without any particular expectation of financial reward. It  demonstrates what great guys and outstanding members of the Oracle community they are, and those of us using DBD::Oracle owe them a debt of gratitude. The Pythian hosted home page for DBD::Oracle is here.

Bind variables

As with most languages, it's easier in perl to use literals than to use bind variables. So for instance, in the following snippet we issue the same SQL statement in a loop, but because we interpolate the literal value of the variable "$x_value" into the SQL text, Oracle will consider every SQL statement to be unique.
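Here's a sketch of the pattern ($dbh is a connected DBI handle; the table and column names are illustrative only):

    # BAD: a brand new SQL statement for every value of $x_value
    foreach my $x_value (@x_values) {
        my $sth = $dbh->prepare(
            "SELECT data FROM perldemo WHERE x_value = $x_value" );
        $sth->execute();
        while ( my @row = $sth->fetchrow_array() ) {
            # process the row
        }
    }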

To use bind variables, you simply insert a placeholder into your SQL text - either a "?" or a numbered parameter like ":1" - and then assign it a value with the bind_param call. Note that we prepare the statement only once, even though we bind and execute it many times.
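A sketch of the bind variable version (same illustrative names as above):

    # GOOD: prepare once; bind and execute many times
    my $sth = $dbh->prepare(
        "SELECT data FROM perldemo WHERE x_value = :1" );
    foreach my $x_value (@x_values) {
        $sth->bind_param( 1, $x_value );
        $sth->execute();
        while ( my @row = $sth->fetchrow_array() ) {
            # process the row
        }
    }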

Using bind variables results in significant performance gains, both because of the reduction in parse-time CPU and because of the reduced contention for library cache mutexes. When I ran the script I was the only user on the database, so mutex contention was not an issue. Even so, the reduction in parse time resulted in almost a 75% reduction in execution time:

Array Fetch

Array fetch is handled automatically by DBD::Oracle. The RowCacheSize property of the database handle sets the size of the fetch array, which by default is 500.

Perl supports a range of fetch methods:  fetchrow_array, fetchrow_arrayref, fetchrow_hashref and  fetchall_arrayref.  DBD::Oracle uses array processing regardless of the DBI method you use to fetch your rows.  

In the following example, we set the array fetch size, then process each row one at a time.  DBD::Oracle fetches rows in batches behind the scenes. 
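A sketch of that pattern:

    $dbh->{RowCacheSize} = 100;   # batches of roughly 100 rows
    my $sth = $dbh->prepare("SELECT * FROM perldemo");
    $sth->execute();
    while ( my @row = $sth->fetchrow_array() ) {
        # we see one row at a time here, but DBD::Oracle is
        # fetching batches of rows from Oracle underneath
    }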

We set the size of the array via the RowCacheSize property. You don't really need to do this unless you think that 500 is an inappropriate array size - perhaps because the row size is very small or very large.

A casual look at the DBI APIs might suggest that you need to use the fetchall_arrayref method to exploit array processing. For instance, in the perl code below, the contents of the PERLDEMO table are loaded into an array of rows in a single call.
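A sketch:

    my $sth = $dbh->prepare("SELECT * FROM perldemo");
    $sth->execute();
    my $rows = $sth->fetchall_arrayref();   # every row now in local memory
    foreach my $row (@$rows) {
        # process the row
    }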

Fetching the data in this way doesn't actually change the size of the array requests to Oracle: all it achieves is to load all the rows into local memory. This is usually not a good idea for massive tables, since you may get a memory allocation error or starve other programs - including Oracle - of memory. Even if you don't cause a memory shortage, the overhead of allocating the memory will usually result in slower processing than if you process the rows using fetchrow_array or one of the other row-at-a-time fetch calls.

Array Insert

As with a lot of languages, it's easy to fetch in arrays in perl, but slightly harder to insert with arrays. The "natural" way of inserting rows is to bind each row into the INSERT statement and execute. Here's that simple technique, which does not exploit the array interface.
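A sketch of that technique (column names illustrative; AutoCommit is assumed off):

    my $sth = $dbh->prepare(
        "INSERT INTO perldemo (pk, data) VALUES (:1, :2)" );
    foreach my $i ( 0 .. $#pk_values ) {
        $sth->bind_param( 1, $pk_values[$i] );
        $sth->bind_param( 2, $data_values[$i] );
        $sth->execute();    # one round trip per row
    }
    $dbh->commit();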

The values for each row are bound, and then the row is inserted with a separate execute call. Every row therefore requires a network round trip to the database and a unique interaction with the Oracle server code.

It's only a little bit more complex to bind an array.
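A sketch of the array version:

    $dbh->{ora_array_chunk_size} = 1000;   # rows sent in each batch
    my $sth = $dbh->prepare(
        "INSERT INTO perldemo (pk, data) VALUES (:1, :2)" );
    $sth->bind_param_array( 1, \@pk_values );
    $sth->bind_param_array( 2, \@data_values );
    $sth->execute_array( { ArrayTupleStatus => \my @tuple_status } );
    $dbh->commit();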

We bind the arrays using the bind_param_array method, and the execute_array method activates the array insert. The size of each batch can be adjusted by setting the ora_array_chunk_size property.

As in all languages, using array insert is a very significant improvement. We see below that the array methods reduced elapsed time by over 80%:

Conclusion

 Perl supports a very rich and mature interface to Oracle and it's certainly possible to write high-performance perl programs that interact with Oracle. 

Although it's usually not necessary to explicitly code array fetch,  you do need to explicitly code bind variables and array inserts and you should generally do so if you wish your perl script to interact efficiently with Oracle. 

There are lots of other performance features of the DBD::Oracle driver that I haven't covered in this basic introduction. You can read up on all the capabilities of the driver on its documentation page on CPAN.

Monday, October 19, 2009

Oracle performance programming: .NET

In Oracle SQL High Performance Tuning, I included an appendix in which I outlined how to achieve good cursor management, bind variables and array processing in the major languages of the day. I had intended to do the same in Oracle Performance Survival Guide, but I ran out of time and space in the book, so the examples in the book are generally in Java or PL/SQL only.

I wanted to get up to date on the various languages, some of which (like Python) I haven't used for a while and others (Ruby for instance) I've never used with Oracle.  So I thought I'd kill two birds with one stone by writing a series of blog posts on how to program efficiently in the various languages.

There are lots of best practices in each language, but I think most of us would agree that you at least need to know how to do the following:

  1. Use bind variables appropriately.
  2. Ensure you are performing array fetch when retrieving more than one row.
  3. Ensure that you use array insert when doing bulk inserts.

The ways of doing this are different in each language.  In Chapter 6 of Oracle Performance Survival Guide,  I describe these techniques and their performance implications, using Java and sometimes PL/SQL examples.  In this series I'll go through the techniques for other languages. 

Let's start with ODP.NET, which is Oracle's ADO.NET driver for .NET languages such as C#, VB.NET and PowerShell (see here for a posting on using Oracle with PowerShell).

You can get the C# program which contains the code snippets below here.

Bind variables

As in most languages, it's all too easy to omit bind variables in C#. Here's an example where the variable value is simply concatenated into the SQL string. The values to be fetched are obtained from an array, "xvalueList".
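A sketch of the problem code ("conn" is assumed to be an open OracleConnection, and the table and column names are illustrative only):

    // BAD: a unique SQL text for every element of xvalueList
    foreach (int xvalue in xvalueList)
    {
        OracleCommand cmd = new OracleCommand(
            "SELECT data FROM demotab WHERE xvalue = " + xvalue, conn);
        using (OracleDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* process the row */ }
        }
    }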

Every value of xvalue will generate a unique SQL statement, which will have to be parsed (unless the database parameter CURSOR_SHARING is set to FORCE or SIMILAR).

To use bind variables, we need to make three changes to the code.
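Here's a sketch of the revised code (same illustrative names as before), with the three changes marked:

    // GOOD: one SQL text, parsed once, executed many times
    OracleCommand cmd = new OracleCommand(
        "SELECT data FROM demotab WHERE xvalue = :xVal", conn);  // change 1
    OracleParameter xVal =
        cmd.Parameters.Add("xVal", OracleDbType.Int32);          // change 2
    foreach (int xvalue in xvalueList)
    {
        xVal.Value = xvalue;                                     // change 3
        using (OracleDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* process the row */ }
        }
    }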

  • Firstly, we define the bind variable ":xVal" in the text string that represents the SQL statement.
  • Next, we create an OracleParameter object representing the bind variable and assign it a datatype.  We need to do this for every bind variable we use.
  • Finally, we assign the parameter's value from the program variable that we want to use.

As with any programming language, using bind variables improves performance by reducing parse overhead and library cache latch or mutex contention. The above SQL is about as simple as you can get, so parse overhead reductions should be modest. As the only active session on my (local laptop) database, I wouldn't expect library cache contention either. Nevertheless, elapsed time even in this simple example was reduced by about two-thirds when bind variables were used.

Array Select

Fetching rows in batches generally reduces elapsed time for bulk selects by up to a factor of 10.  Array fetch works to improve performance by reducing network round trips and by reducing logical reads. 

Array fetch is turned on by default in ODP.NET. The size of the array is determined by the FetchSize property of the OracleCommand object; it can also be set on the OracleDataReader object. The value is specified as the number of bytes allocated to the array, not as a number of rows. This makes it tricky to set an exact array fetch size unless you know the row length. However, you can get the row length once the statement is executed by examining the RowSize property.

The default array size is 64K, which will generally be adequate for good performance. Below we explicitly set the fetch size to the value of the variable myFetchSize.
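A sketch ("conn" again being an open OracleConnection):

    int myFetchSize = 65536;   // bytes, not rows
    OracleCommand cmd = new OracleCommand("SELECT * FROM demotab", conn);
    cmd.FetchSize = myFetchSize;
    using (OracleDataReader reader = cmd.ExecuteReader())
    {
        // once the statement has executed, cmd.RowSize is known, so
        // reader.FetchSize = cmd.RowSize * 100 would hold about 100 rows
        while (reader.Read()) { /* process the row */ }
    }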

You won't normally need to adjust FetchSize.  Below you can see the effect of adjusting the FetchSize to 64 - the size of a single row for this table - and to 6400 - the size of 100 rows: 

Array insert

Array insert has similar performance implications to array fetch, but it is harder to program and does not happen automatically. Firstly, the rows to be inserted must exist in .NET array variables.
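For instance, this sketch declares and populates two small integer arrays holding the rows we're going to insert:

    // the rows to be inserted: two columns by two rows
    int[] pkValues   = { 1, 2 };
    int[] dataValues = { 100, 200 };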

The arrays are bound to bind variables as usual: we create an OracleParameter object for each bind variable, then assign the arrays as the parameter values. We then set ArrayBindCount - the number of rows to be processed by the array insert - to the number of values in each array. Finally, we need to perform just one ExecuteNonQuery call to insert every row.
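Here's a sketch of those steps, continuing the names from above:

    OracleCommand cmd = new OracleCommand(
        "INSERT INTO demotab (pk, data) VALUES (:pk, :data)", conn);

    // one OracleParameter per bind variable...
    OracleParameter pkParm   = new OracleParameter("pk",   OracleDbType.Int32);
    OracleParameter dataParm = new OracleParameter("data", OracleDbType.Int32);
    cmd.Parameters.Add(pkParm);
    cmd.Parameters.Add(dataParm);

    // ...assign the arrays to the bind variables...
    pkParm.Value   = pkValues;
    dataParm.Value = dataValues;

    // ...and insert every row with a single execute
    cmd.ArrayBindCount = pkValues.Length;
    cmd.ExecuteNonQuery();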

Below we see the performance difference between inserting in arrays and inserting row by row. In this case, the database was on my local laptop, so network latency was truly minimal. Even bigger performance improvements can be observed when the database is across a slow network link:

Conclusion

There are lots of advanced techniques for working with ADO.NET, but at the very least you should be familiar with bind variables and array processing. Of the three techniques covered here, only array fetch is implemented by default, and the performance issues that can result from failing to use all three can be truly remarkable.


Thursday, October 8, 2009

Oracle OpenWorld again!

Along with tens of thousands of other Oracle types, I'll shortly be in San Francisco for Oracle OpenWorld 2009. This will be my 12th OOW!

We'll be giving away signed copies of Oracle Performance Survival Guide at the Quest booth (#335) on Wednesday, October 14th, at 1:30pm. I'll also be giving a few copies away at the two Oracle Develop presentations I'm giving:

- Session ID S308361: High Performance PL/SQL  Oct. 12, 5:30 p.m., Hilton Hotel, Golden Gate 6/7

- Session ID S308362: Oracle Performance by Design  Oct. 13, 4 p.m., Hilton Hotel, Franciscan A/B

The rest of the time I'll be attending sessions and catching up with folk. If you'd like to try to catch me, either direct message me at @guyharrison or email me at guy.a.harrison@gmail.com.