
Tuesday, Nov 24, 2009

Using the Oracle 11gR2 database flash cache

Oracle has just released a patch which allows you to use the database flash cache on Oracle Enterprise Linux even if you don't have Exadata storage.  The patch is the obscurely named:

  •    8974084:META BUG FOR FLASH CACHE 11.2PL BUGS TO BACKPORT TO 11.2.0.1 OEL

Once you install the patch you can use any old flash device as a database flash cache.  Below I've documented some initial dabbling on a very old server with a cheap USB flash device.  The results are not representative of the performance you'd get on quality hardware, but they are still interesting, I think.

Setup and configuration

 

If, like me, you just want to experiment using a USB flash device, then you first need to get that device mounted. On my test machine I created a directory "/mnt/usbflash", then created an /etc/fstab entry like this:

   /dev/sda1               /mnt/usbflash           vfat    noauto,users,rw,umask=0 0 0

On your system you might need to change "/dev/sda1" to another device, depending on how your fixed disks are configured.  You should then be able to mount the flash drive by typing "mount /dev/sda1".  Make sure that the mount point is writable by oracle (chmod 777 /mnt/usbflash).

Once mounted, you configure the flash cache by setting the parameters DB_FLASH_CACHE_FILE and DB_FLASH_CACHE_SIZE.
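For example, something along these lines (the file name and size here are illustrative, not my exact settings):

ALTER SYSTEM SET db_flash_cache_file = '/mnt/usbflash/oracle/flash_cache.dbf' SCOPE=SPFILE;
ALTER SYSTEM SET db_flash_cache_size = 1G SCOPE=SPFILE;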


Note that the value of DB_FLASH_CACHE_FILE needs to be a file on the flash drive, not the flash drive mount point itself.

Once these parameters are set, the flash cache will be enabled and will act as a secondary cache to the buffer cache.  When a block is removed from the primary cache, it will still exist in the flash cache, and can be read back without a physical read to the spinning disk.

Monitoring

There are a few ways to examine how the flash cache is being used.  Firstly, V$SYSSTAT contains some new statistics showing the number of blocks added to the cache and the number of "hits" that were satisfied from cache (script here):
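A sketch of the sort of query involved (the statistic names are as I recall them in 11.2 and may vary by release):

SELECT name, value
  FROM v$sysstat
 WHERE name IN ('flash cache inserts', 'physical read flash cache hits');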


Some new wait events are now visible showing the waits incurred when adding to or reading from the flash cache.  In my tests, the flash cache wait events accounted for more time than 'db file sequential read', though individual reads from the flash cache were much quicker than reads from disk (but there were a lot more of them).
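You can pull these from V$SYSTEM_EVENT; a sketch (the exact event names, such as 'db flash cache single block physical read', are from memory and may differ by release):

SELECT event, total_waits, time_waited_micro / 1000000 AS time_secs
  FROM v$system_event
 WHERE event LIKE 'db flash cache%'
    OR event = 'db file sequential read'
 ORDER BY time_secs DESC;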


Now, remember that I could not have picked worse hardware for this test - an old two-CPU Intel box and a cheap thumb drive.  Even so, it's remarkable how high the write overhead is in comparison to the total time.  Although the flash reads save time when compared to db file sequential reads, the overhead of maintaining the cache can be high, because flash-based SSD has a relatively severe write penalty.

All flash-based Solid State Disks have issues with write performance.  However, cheap Multi Level Cell (MLC) flash takes about three times as long to write as the more expensive Single Level Cell (SLC) variety.  When a flash drive is new, the empty space can be written to in single-page increments (usually 4KB).  However, when the flash drive is older, writes typically require erasing a complete 128-page block, which is very much slower.  My cheap USB drive was old and MLC, so it had very poor write performance.  But even the best flash-based SSD is going to be much slower for writes than for reads, and in some cases using a flash cache might slow a database down as a result.  So monitoring is important.

There are a couple of other V$SYSSTAT statistics of interest:
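An easy way to find them all is to match on the statistic name (a sketch):

SELECT name, value
  FROM v$sysstat
 WHERE name LIKE 'flash cache%'
 ORDER BY name;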


To examine the contents of the cache, we can query the V$BH view.  Buffers in the flash cache have STATUS values such as 'flashcur', allowing us to count the buffers from each object in the main buffer cache and in the flash cache (script is here):
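A sketch of such a query (the join on DATA_OBJECT_ID is illustrative; see the linked script for the real thing):

SELECT o.object_name,
       SUM(CASE WHEN bh.status LIKE 'flash%' THEN 1 ELSE 0 END) AS flash_blocks,
       SUM(CASE WHEN bh.status NOT LIKE 'flash%' THEN 1 ELSE 0 END) AS buffer_blocks
  FROM v$bh bh
  JOIN dba_objects o ON o.data_object_id = bh.objd
 WHERE bh.status <> 'free'
 GROUP BY o.object_name
 ORDER BY flash_blocks DESC;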


In this case, the TXN_DATA table has 85,833 blocks in the flash cache and 28,753 blocks in the main buffer cache. 

Conclusion

I was happy to get the flash cache working, even with this crappy hardware.  I'm really happy that Oracle opened this up to non-Exadata hardware.

I'll be getting a better setup soon so that I can see how this works with a decent commercial SSD flash drive.

I'm a big believer in the need for a hierarchy of storage, ideally including at least two layers - one for mass storage, the other for fast retrieval.  We should be cautious, however, because the flash write penalty may result in performance problems similar to those we've seen with RAID5.

 

Wednesday, Oct 28, 2009

Performant Oracle programming: perl

In the last installment, we looked at the fundamentals of efficient Oracle programming with C#.  In particular, we saw how to use bind variables and how to perform array processing.  Now let's take a look at perl.

I have a strong affection for perl:  I started using perl somewhere in the early nineties and did some simple hacking to allow the Oracle interface of the day - oraperl - to use stored procedures.  My very first true performance monitoring tools were all written using perl.  Perl has always had good support for Oracle and it's possible to write very efficient and powerful Oracle database utilities in perl.

You can get the perl program that implements the examples in this posting here.

Thanks for all the perl!

Anyone who uses perl with Oracle today owes a debt of thanks to the many people who have maintained the Oracle perl interface (DBD::Oracle) over the years.

Kevin Stock created the original perl 4 "oraperl" modules, which Tim Bunce used to create the modern DBD::Oracle.  Tim maintained it for many years, but handed it over to John Scoles of the Pythian group in 2006.  Pythian are now sponsoring the ongoing maintenance of DBD::Oracle, and they do this without any particular expectation of financial reward.  It demonstrates what great guys and outstanding members of the Oracle community they are, and those of us using DBD::Oracle owe them a debt of gratitude.  The Pythian-hosted home page for DBD::Oracle is here.

Bind variables

As with most languages, it's easier in perl to use literals than bind variables.  So, for instance, in the following snippet we issue the same SQL statement in a loop, but because we interpolate the literal value of the variable "$x_value" into the SQL text, Oracle will consider every SQL to be a unique statement.
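A minimal reconstruction of the pattern (the connection details, table and column names are assumed, not the original listing):

use DBI;

my $dbh = DBI->connect( 'dbi:Oracle:orcl', 'scott', 'tiger',
                        { RaiseError => 1 } );

foreach my $x_value (@x_values) {
    # The value is interpolated into the SQL text, so every statement
    # is textually unique and must be re-parsed by Oracle
    my $sth = $dbh->prepare(
        "SELECT * FROM perldemo WHERE xvalue = $x_value" );
    $sth->execute();
    while ( my @row = $sth->fetchrow_array ) {
        # process the row
    }
}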



To use bind variables, it's simply a matter of inserting a placeholder into your SQL text - either a "?" or a numbered parameter like ":1" - and then assigning it a value with the bind_param call, as in the sketch below.  Note that we prepare the statement only once, even though we bind and execute it many times.
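A sketch of the bind variable version (same assumed table and handle as above):

# Prepare once; only the bound value changes between executions
my $sth = $dbh->prepare( 'SELECT * FROM perldemo WHERE xvalue = :1' );

foreach my $x_value (@x_values) {
    $sth->bind_param( 1, $x_value );    # assign a value to the placeholder
    $sth->execute();
    while ( my @row = $sth->fetchrow_array ) {
        # process the row
    }
}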


Using bind variables results in significant performance gains, both because of the reduction in parse-time CPU and because of the reduced contention for library cache mutexes.  When I ran the script I was the only user on the database, so mutex contention was not an issue.  Even so, the reduction in parse time resulted in almost a 75% reduction in execution time.


Array Fetch

Array fetch is handled automatically by DBD::Oracle.   The RowCacheSize property of the database handle sets the size of the fetch array, which by default is set to 500.  

Perl supports a range of fetch methods:  fetchrow_array, fetchrow_arrayref, fetchrow_hashref and  fetchall_arrayref.  DBD::Oracle uses array processing regardless of the DBI method you use to fetch your rows.  

In the following example, we set the array fetch size, then process each row one at a time.  DBD::Oracle fetches rows in batches behind the scenes. 
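Roughly like this (the array size of 1,000 is an arbitrary choice for the sketch):

$dbh->{RowCacheSize} = 1000;    # rows fetched from Oracle per round trip

my $sth = $dbh->prepare( 'SELECT * FROM perldemo' );
$sth->execute();
while ( my @row = $sth->fetchrow_array ) {
    # We process one row at a time, but behind the scenes DBD::Oracle
    # is fetching the rows in batches of RowCacheSize
}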


We set the size of the array through RowCacheSize before executing the query.  You don't really need to do this unless you think that 500 is an inappropriate array size - perhaps if the row size is very small or very large.

A casual look at the DBI APIs might suggest that you need to use the fetchall_arrayref method to exploit array processing.  For instance, in the following perl code, the contents of the PERLDEMO table are loaded into an array of rows in a single call.
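A sketch of that approach:

my $sth = $dbh->prepare( 'SELECT * FROM perldemo' );
$sth->execute();
my $rows = $sth->fetchall_arrayref();    # every row now sits in local memory
foreach my $row (@$rows) {
    # process each row (an array reference)
}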

Fetching the data in this way actually doesn't change the size of the array requests to Oracle: all it achieves is to load all the rows into local memory.  This is usually not a good idea for massive tables, since you may get a memory allocation error or starve other programs - including Oracle - of memory.  Even if you don't cause a memory shortage, the overhead of allocating the memory will usually result in slower processing than if you process the rows using fetchrow_array or one of the other row-at-a-time fetch calls.

Array Insert

As with a lot of languages,  it's easy to fetch in arrays in perl, but slightly harder to insert with arrays.  The "natural" way of inserting rows is to bind each row into the INSERT statement and execute.   Here's that simple technique, which does not exploit the array interface:
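A sketch of that technique (two assumed columns):

my $sth = $dbh->prepare( 'INSERT INTO perldemo VALUES (:1, :2)' );
for my $i ( 0 .. $#keys ) {
    $sth->bind_param( 1, $keys[$i] );
    $sth->bind_param( 2, $values[$i] );
    $sth->execute();    # a separate round trip for every row
}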


The values for each row are bound and then inserted one row at a time.  Each row is inserted in a separate call, and so every row requires a network round trip to the database and a unique interaction with the Oracle server code.

 It's only a little bit more complex to bind an array:
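A sketch (the chunk size of 1,000 is arbitrary):

my $sth = $dbh->prepare( 'INSERT INTO perldemo VALUES (:1, :2)' );
$sth->{ora_array_chunk_size} = 1000;    # rows sent to Oracle per round trip
$sth->bind_param_array( 1, \@keys );    # bind entire arrays, not single values
$sth->bind_param_array( 2, \@values );
my @status;
$sth->execute_array( { ArrayTupleStatus => \@status } );    # one array insert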

We bind the arrays using the bind_param_array method, and the execute_array method activates the array insert.  The size of each batch can be adjusted by setting the ora_array_chunk_size property.

As in all languages, using array insert is a very significant improvement: the array methods reduced elapsed time by over 80%.


Conclusion

 Perl supports a very rich and mature interface to Oracle and it's certainly possible to write high-performance perl programs that interact with Oracle. 

Although it's usually not necessary to explicitly code array fetch, you do need to explicitly code bind variables and array inserts, and you should generally do so if you want your perl scripts to interact efficiently with Oracle.

There are lots of other performance features of the DBD::Oracle driver that I haven't covered in this basic introduction.  You can read up on all the capabilities of the driver on its CPAN documentation page.

 

 

Monday, Oct 19, 2009

Oracle performance programming: .NET

In Oracle SQL High Performance Tuning, I included an appendix in which I outlined how to achieve good cursor management, bind variables and array processing in the major languages of the day.  I had intended to do the same in Oracle Performance Survival Guide, but I ran out of time and space in the book, so the examples there are generally in Java or PL/SQL only.

I wanted to get up to date on the various languages, some of which (like Python) I haven't used for a while and others (Ruby for instance) I've never used with Oracle.  So I thought I'd kill two birds with one stone by writing a series of blog posts on how to program efficiently in the various languages.

There are lots of best practices in each language, but I think most of us would agree that you at least need to know how to do the following:

  1. Use bind variables appropriately.
  2. Ensure you are performing array fetch when retrieving more than one row.
  3. Ensure that you use array insert when doing bulk inserts.

The ways of doing this are different in each language.  In Chapter 6 of Oracle Performance Survival Guide,  I describe these techniques and their performance implications, using Java and sometimes PL/SQL examples.  In this series I'll go through the techniques for other languages. 

Let's start with ODP.NET, which is Oracle's ADO.NET driver for .NET languages such as C#, VB.NET and PowerShell (see here for a posting on using Oracle with PowerShell).

You can get the C# program which contains the code snippets below here.

Bind variables

As in most languages, it's all too easy to omit bind variables in C#.  Here's an example where the variable value is simply concatenated into a SQL string.  The values to be fetched are obtained from an array "xvalueList":
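A reconstruction of the pattern (this is a sketch, not the original listing; it assumes "using Oracle.DataAccess.Client;", an open OracleConnection called "connection", and an assumed table name):

foreach (int xvalue in xvalueList)
{
    // The value is concatenated into the SQL text, so every statement
    // is textually unique and must be parsed afresh
    OracleCommand cmd = new OracleCommand(
        "SELECT * FROM demotable WHERE xvalue = " + xvalue, connection);
    using (OracleDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // process the row
        }
    }
}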

Every value of xvalue will generate a unique SQL which will have to be parsed (unless the database parameter CURSOR_SHARING is set to FORCE or SIMILAR).

To use bind variables, we need to make three changes to the code, as shown in the sketch below:
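(Again a sketch, with the same assumed names as above:)

OracleCommand cmd = new OracleCommand(
    "SELECT * FROM demotable WHERE xvalue = :xVal", connection);   // 1: placeholder
OracleParameter xValParm =
    new OracleParameter("xVal", OracleDbType.Int32);               // 2: parameter object
cmd.Parameters.Add(xValParm);

foreach (int xvalue in xvalueList)
{
    xValParm.Value = xvalue;                                       // 3: assign the value
    using (OracleDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // process the row
        }
    }
}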


  • Firstly, we define the bind variable ":xVal" in the text string that represents the SQL statement.
  • Next, we create an OracleParameter object representing the bind variable and assign it a datatype.  We need to do this for every bind variable we use.
  • Finally, we assign the parameter's Value property from the program variable that we want to use.

 

As with any programming language, using bind variables improves performance by reducing parse overhead and library cache latch or mutex contention.  The above SQL is about as simple as you can get, so parse overhead reductions should be modest.  And as the only active session on my (local laptop) database, I wouldn't expect library cache contention either.  Nevertheless, elapsed time even in this simple example reduced by about two-thirds when bind variables were used.

Array Select

 

Fetching rows in batches generally reduces elapsed time for bulk selects by up to a factor of 10.  Array fetch works to improve performance by reducing network round trips and by reducing logical reads. 

Array fetch is turned on by default in ODP.NET.  The size of the array is determined by the FetchSize property of the OracleCommand object; it can also be set on the OracleDataReader object.  The value is specified as the number of bytes allocated to the array, not the number of rows.  This makes it tricky to set an exact array fetch size unless you know the row length.  However, you can get the row length once the statement is executed by examining the RowSize property.

The default array size is 64K, which will generally be adequate for good performance.  Below we explicitly set the fetch size to the size specified by the variable myFetchSize:
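A sketch (myFetchSize and the table name are assumed):

OracleCommand cmd = new OracleCommand("SELECT * FROM demotable", connection);
cmd.FetchSize = myFetchSize;    // specified in bytes, not rows
using (OracleDataReader reader = cmd.ExecuteReader())
{
    // Once the statement has executed, RowSize is known, so we could
    // instead size the array to hold (say) 100 rows:
    //     reader.FetchSize = cmd.RowSize * 100;
    while (reader.Read())
    {
        // process the row
    }
}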


You won't normally need to adjust FetchSize.  In my tests I compared a FetchSize of 64 - the size of a single row for this table - with a FetchSize of 6400 - the size of 100 rows.


Array insert

 

Array insert has similar performance implications to array fetch, but it is harder to program and does not happen automatically.  Firstly, the rows to be inserted must exist within .NET array variables.  The following declares and populates the integer arrays that we are going to insert:
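For example (the names and row count are assumed):

const int rowCount = 10000;
int[] keyArray = new int[rowCount];
int[] valueArray = new int[rowCount];
for (int i = 0; i < rowCount; i++)
{
    keyArray[i] = i;           // values to be inserted
    valueArray[i] = i * 10;
}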


The arrays are bound to bind variables as usual.  First we create the OracleParameter objects for the bind variables, then we assign the arrays to the parameter values.  We also set the number of rows to be processed in the array insert; we've set it to be equal to the number of values in the array.

We then need to perform just one insert call, as shown in the sketch below.
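A sketch of the whole sequence (parameter names are assumed; ArrayBindCount is the ODP.NET property that sets the number of rows):

OracleCommand insCmd = new OracleCommand(
    "INSERT INTO demotable VALUES (:key, :val)", connection);
OracleParameter keyParm = new OracleParameter("key", OracleDbType.Int32);
OracleParameter valParm = new OracleParameter("val", OracleDbType.Int32);
insCmd.Parameters.Add(keyParm);
insCmd.Parameters.Add(valParm);

keyParm.Value = keyArray;                  // assign whole arrays to the parameters
valParm.Value = valueArray;
insCmd.ArrayBindCount = keyArray.Length;   // rows to insert in one call
insCmd.ExecuteNonQuery();                  // the single insert call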


Array insert showed big performance improvements over row-by-row inserts.  In this case, the database was on my local laptop, so network latency was truly minimal; even bigger performance improvements can be observed when the database is across a slow network link.


Conclusion

There are lots of advanced techniques for working with ADO.NET, but at the very least you should be familiar with bind variables and array processing.  Of the three key practices, only array fetch is implemented by default, and the performance issues that can result from a failure to use all three can be truly remarkable.

 

 

 

 

Wednesday, Sep 27, 2006

10g tracing quick start

Oracle has released a few new facilities to help with tracing in 10g; here's a real quick wrap-up of the most significant.

Using the new client identifier

You can tag database sessions with a session identifier that can later be used to identify sessions to trace.  You can set the identifier like this:

begin
    dbms_session.set_identifier('GUY1');
end;


You can set this from a logon trigger if you don't have access to the source code.  To turn tracing on for a matching client id, you use DBMS_MONITOR.CLIENT_ID_TRACE_ENABLE:

BEGIN

   DBMS_MONITOR.client_id_trace_enable (client_id      => 'GUY1',
                                        waits          => TRUE,
                                        binds          => FALSE
                                       );
END;

You can add wait and/or bind variable information to the trace file using the flags shown.
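As mentioned, the identifier itself can be set from a logon trigger when you can't modify the application; a sketch (the trigger name and logic are assumed):

CREATE OR REPLACE TRIGGER set_client_id_trg
   AFTER LOGON ON DATABASE
BEGIN
   IF USER = 'APP_USER' THEN
      DBMS_SESSION.set_identifier ('GUY1');
   END IF;
END;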

Tracing by Module and/or action

Many Oracle-aware applications set Module and Action properties, and you can use these to enable tracing as well.  The SERV_MOD_ACT_TRACE_ENABLE procedure allows you to turn tracing on for sessions matching particular service, module, action and (for clusters) instance identifiers.  You can see the current values for these using the following query:

SELECT DISTINCT instance_name, service_name, module, action
           FROM gv$session JOIN gv$instance USING (inst_id);



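Armed with those values, you can then enable tracing for matching sessions; for example (the service and module names here are illustrative):

BEGIN
   DBMS_MONITOR.serv_mod_act_trace_enable (service_name => 'SYS$USERS',
                                           module_name  => 'SQL*Plus',
                                           action_name  => DBMS_MONITOR.all_actions,
                                           waits        => TRUE,
                                           binds        => FALSE);
END;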

Wednesday, Sep 27, 2006

10g time model query

Joining the 10g time model to the traditional wait interface views and taking advantage of the wait_class data is something most of us have probably done.  Here's my standard query that does just that:

COLUMN wait_class format a20
COLUMN name format a30
COLUMN time_secs format 999,999,999,999.99
COLUMN pct format 99.99

SELECT   wait_class, NAME, ROUND (time_secs, 2) time_secs,
         ROUND (time_secs * 100 / SUM (time_secs) OVER (), 2) pct
    FROM (SELECT n.wait_class, e.event NAME, e.time_waited / 100 time_secs
            FROM v$system_event e, v$event_name n
           WHERE n.NAME = e.event AND n.wait_class <> 'Idle'
                 AND time_waited > 0
          UNION
          SELECT 'CPU', 'server CPU', SUM (VALUE / 1000000) time_secs
            FROM v$sys_time_model
           WHERE stat_name IN ('background cpu time', 'DB CPU'))
ORDER BY time_secs DESC;

This generates CPU and wait event times broken down by event name.

