Guy Harrison

Search

Latest Postings:

 

 

Oracle Performance Survival Guide

Buy It
Scripts and Examples

Sample Chapter

                                    

 MySQL Stored procedure programming

Buy It
Scripts and Examples 

                                                

Powered by Squarespace
Wednesday
28Oct2009

Performant Oracle programming:  perl

In the last installment, we looked at the fundamentals of efficient Oracle programming with C#.  In particular, we saw how to use bind variables and how to perform array processing.  Now lets take a look at perl.

I have a strong affection for perl:  I started using perl somewhere in the early nineties and did some simple hacking to allow the Oracle interface of the day - oraperl - to use stored procedures.  My very first true performance monitoring tools were all written using perl.  Perl has always had good support for Oracle and it's possible to write very efficient and powerful Oracle database utilities in perl.

You can get the perl program that implements the examples in this posting here

Thanks for all the perl!

Anyone who uses perl with Oracle today owes a debt of thanks to the many people who have maintained the Oracle perl interface (DBD::Oracle) over the years.

Kevin Stock created the original perl 4 "oraperl" modules, which Tim Bunce used to create the modern DBD::Oracle.  Tim maintained it for many years, but handed it over to John Scoles of the Pythian group in 2006.  Pythian are now sponsoring the ongoing maintenance of DBD::Oracle and they do this without any particular expectation of financial reward. It  demonstrates what great guys and outstanding members of the Oracle community they are, and those of us using DBD::Oracle owe them a debt of gratitude. The Pythian hosted home page for DBD::Oracle is here.

Bind variables

As with most languages, it's easier in perl to use literals than to use bind variables.  So for instance, in the following snippet we issue the same SQL statement in a loop, but because we are inserting the literal value of the variable "$x_value" into the SQL on line 236, Oracle will consider every SQL to be a unique statement. 

 


To use bind variables, it's simply required that you insert a placeholder in your SQL text - either a "?" or numbered parameters like ":1", and then assign them values with the bind_param call (see line 184 below).   Note that we prepare the statement only once (on line 179) even though we bind and execute it many times. 

 

  

Using bind variables results in significant performance gains,  both because of the reduction in parse time CPU and because of the reduced contention for library cache mutexes.  When I ran the script I was the only user on the database so mutex contention was not an issue.  Even so, the reduction in parse time resulted in an almost a 75% reduction in execution time:

 

Array Fetch

Array fetch is handled automatically by DBD::Oracle.   The RowCacheSize property of the database handle sets the size of the fetch array, which by default is set to 500.  

Perl supports a range of fetch methods:  fetchrow_array, fetchrow_arrayref, fetchrow_hashref and  fetchall_arrayref.  DBD::Oracle uses array processing regardless of the DBI method you use to fetch your rows.  

In the following example, we set the array fetch size, then process each row one at a time.  DBD::Oracle fetches rows in batches behind the scenes. 

 

We set the size of the array on line 61.  You don't really need to do this unless you think that 500 is an inappropriate array size - perhaps if the row size is very small or very large.  

A casual look at the DBI APIs might suggest that you need to use the fetchall_arrayref method to exploit array processing.  For instance, in the following perl code, the contents of the PERLDEMO table are loaded into a array of rows in a single call. 

Fetching the data in this way actually doesn't change the size of the array requests to Oracle:  all it achieves is to load all the rows into local memory.  This is usually not a good idea for massive tables, since you may get a memory allocation error or starve other programs - including Oracle - for memory.   Even if you don't cause a memory shortage, the overhead of allocating the memory will usually result in slower processing than if you process the rows using fetchrow_array or one of the other row-at-a-time fetch calls. 

Array Insert

As with a lot of languages,  it's easy to fetch in arrays in perl, but slightly harder to insert with arrays.  The "natural" way of inserting rows is to bind each row into the INSERT statement and execute.   Here's that simple technique, which does not exploit the array interface:

 

The value for the row to be inserted is bound in lines 148-149, then inserted in line 150.  Each row is inserted in a separate call, and so every row requires a network round trip to the database and a unique interaction with the oracle server code.

 It's only a little bit more complex to bind an array:

We bind the arrays using the bind_param_array method on lines 133 and 134.  The execute_array method (line 136) activates the array insert.  The size of the array can be adjusted by setting the ora_array_chunk_size property (line 132). 

As in all languages,  it's a very significant improvement to use array insert.  We see below that using the array methods reduced elapsed time by over 80%:

 

Conclusion

 Perl supports a very rich and mature interface to Oracle and it's certainly possible to write high-performance perl programs that interact with Oracle. 

Although it's usually not necessary to explicitly code array fetch,  you do need to explicitly code bind variables and array inserts and you should generally do so if you wish your perl script to interact efficiently with Oracle. 

There's lots of other performance features of the DBD::Oracle driver that I haven't covered in this basic introduction.  You can read up on all the capabilities of the driver in its documentation page on CPAN. 

 

 

Monday
19Oct2009

Oracle performance programming: .NET

In Oracle SQL High Performance tuning, I included an appendix in which I outlined how to achieve good cursor management, bind variables and array processing in the major languages of the day.   I had intended to do the same in Oracle Performance Survival Guide, but I ran out of time and space in the book.  So the examples in the book are generally in Java or PL/SQL only.

I wanted to get up to date on the various languages, some of which (like Python) I haven't used for a while and others (Ruby for instance) I've never used with Oracle.  So I thought I'd kill two birds with one stone by writing a series of blog posts on how to program efficiently in the various languages.

There's lots of best practices in each language, but I think most of us would agree that you at least need to know how to do the following:

  1. Use bind variables appropriately.
  2. Ensure you are performing array fetch when retrieving more than one row
  3. Ensure that you use array insert when doing bulk inserts 

The ways of doing this are different in each language.  In Chapter 6 of Oracle Performance Survival Guide,  I describe these techniques and their performance implications, using Java and sometimes PL/SQL examples.  In this series I'll go through the techniques for other languages. 

Let's start with ODP.NET, which is Oracle's ADO.NET driver for .NET languages such as C#, VB.NET and Powershell (see here for a posting on using Oracle with powershell). 

You can get the C# program which contains the code snippets below here

Bind variables

As in most languages, it's all to easy to omit bind variables in C#.  Here's an example where the variable value is simply concatenated into a SQL string.  The values to be fetched are obtained from an array "xvalueList":

Every value of xvalue will generate a unique SQL which will have to be parsed (unless the database parameter CURSOR_SHARING is set to FORCE or SIMILAR).

To use bind variables, we need to make 3 changes to the code.  The changes are shown in the code below:

 

 

  • Firstly, we define the bind variable "xVal" in the text string that represents the SQL statement (line 231).
  • Next, we create an OracleParameter object representing the bind variable and assign it a datatype (line 233).  We need to do this for every bind variable we use.
  • Finally,  we assign the parameter value to the value of the program variable that we want to use (line 237). 

 

As with any programming language, using bind variables improves performance by reducing parse overhead and library cache latch or mutex contention.  The above SQL is about as simple as you can get, so parse overhead reductions should be modest.   As the only active session on my (local laptop) database,  library cache contention would not be expected either.  Nevertheless,    elapsed time even in this simple example reduced by about 2/3rds when bind variables were used. 

Array Select

 

Fetching rows in batches generally reduces elapsed time for bulk selects by up to a factor of 10.  Array fetch works to improve performance by reducing network round trips and by reducing logical reads. 

Array fetch is turned on by default in ODP.NET.  The size of the array is determined by the FetchSize property of the OracleCommand object.  It can also be set in the OracleDataReader object.   The value is specified as the number of bytes allocated to the array, not the size of the array.  This makes it tricky to set an exact array fetch size unless you know the row length.   However, you can get the row length once the statement is executed by examining the RowSize parameter.

The default array size is 64K which will generally be adequate for good performance.  Below we explicity set the fetch size to the size specified by the variable myFetchSize: 

 

You won't normally need to adjust FetchSize.  Below you can see the effect of adjusting the FetchSize to 64 - the size of a single row for this table - and to 6400 - the size of 100 rows: 

 

 

 Array insert

 

Array insert has similar performance implications as array fetch, but it harder to program and does not happen automatically.   Firstly, the rows to be inserted must exist within a .NET array variables.  The following declares and populates an array of two integer variables that we are going to insert:

 

 

The array is bound to bind variables as usual.  On lines 167-170 below we create the Oracle Parameter objects for the bind variables. and on lines 172-173 we assign the arrays to the bind variables.  On line 176 we set the number of rows to be inserted in the array insert;  we've set it to be equal to the number of values in the array.  

We just need to perform one insert - on line 177. 

 

 

Below we see the performance improvements between inserting in arrays and inserting row by row.  In this case, the database was my local laptop so the network latency was truly minimal.  Even bigger performance improvements can be observed when the database is across a slow network link:

 

 

Conclusion

There's lot's of advanced techniques for working with ADO.NET, but at the very least you should be familiar with bind variables and array processing.  Of the three, only array fetch is implemented by default and the performance issues that can result from a failure to use all three can be truly remarkable. 

 

 

 

 

Thursday
08Oct2009

Oracle OpenWorld again!

Along with tens of thousands of other Oracle-types,  I'll shortly be in San Francisco for Oracle Open World 2009.  This will be my 12th OOW!    

We'll be giving away signed copies of Oracle Performance Survival Guide at the Quest booth (#335) on Wednesday October 14th at 1:30pm.   I'll also be giving a few copies away at the two Oracle Develop presentations I'm giving:

- Session ID S308361: High Performance PL/SQL  Oct. 12, 5:30 p.m., Hilton Hotel, Golden Gate 6/7

- Session ID S308362: Oracle Performance by Design  Oct. 13, 4 p.m., Hilton Hotel, Franciscan A/B

The rest of the time I'll be attending sessions and catching up with folk.  If you'd like to try and catch me, either direct message me at @guyharrison or email me at guy.a.harrison@gmail.com

 

Monday
05Oct2009

Oracle Performance Survival Guide available as PDF

My new book Oracle Performance Survival Guide just became available for PDF download at Informit.com!

The print version is available for pre-order at Informit or Amazon and elsewhere, and is due in book stores on October 14th.  There will be copies available at the Oracle Open World book store next week however and we'll be having a book signing with giveaways at the Quest Software booth, probably on Wednesday.   I haven't seen a print copy myself yet, but I was probably the first to buy a digital copy :-).   It will also soon be available on Kindle and on the Safari on-line bookshelf. 

This is the most challenging book I've written, and the most extensive in terms of scope.  I tried to write a book that would be accessible across a broad range of expertise, and which would systematically address most major aspects of Oracle RDBMS performance tuning.   Overall, I'm satisfied that it's a worthwhile contribution to Oracle performance literature.   

You can see the full table of contents at the books Informit webpage, but here's the short chapter list to give you an idea:  

 

PART I: METHODS, CONCEPTS, AND TOOLS
1 Oracle Performance Tuning: A Methodical Approach
2 Oracle Architecture and Concepts  
3 Tools of the Trade

PART II: APPLICATION AND DATABASE DESIGN
4 Logical and Physical Database Design
5 Indexing and Clustering
6 Application Design and Implementation

PART III: SQL AND PL/SQL TUNING
7 Optimizing the Optimizer
8 Execution Plan Management
9 Tuning Table Access
10 Joins and Subqueries
11 Sorting, Grouping, and Set Operations
12 Using and Tuning PL/SQL
13 Parallel SQL
14 DML Tuning

PART IV: MINIMIZING CONTENTION
15 Lock Contention  
16 Latch and Mutex Contention  
17 Shared Memory Contention

PART V: OPTIMIZING MEMORY
18 Buffer Cache Tuning  
19 Optimizing PGA Memory
20 Other Memory Management Topics

PART VI: IO TUNING AND CLUSTERING
21 Disk IO Tuning Fundamentals
22 Advanced IO Techniques
23 Optimizing RAC

Scripts, examples and packages are available for download here

Friday
18Sep2009

MTS + AMM + BULK COLLECT = Trouble!

Most databases don't run with shared servers - A.K.A. Multi-Threaded Servers or MTS - nowadays.   While reducing the number of server processes can reduce overall memory demand and sometimes certain forms of contention, the drawbacks - particularly delays caused when all serves are busy - are generally regarded as greater than the advantages. 

MTS becomes downright dangerous when Automatic Shared Memory Management (ASMM) or Automatic Memory Management (AMM) is in place.  When you use MTS and AMM (or ASMM) together, PL/SQL programs that try to create large collections can effectively consume all available server memory with disastrous consequences. 

When you allocate PL/SQL program memory with dedicated servers, the memory comes out of the PGA.  If you exceed your fair share of PGA allocation, you'll receive an ORA-4030 error:

 

BULK COLLECT for large tables is generally not a good idea because of the memory required;  it's far better to bulk fetch in increments using the LIMIT keyword, as in this snippet:

In fact, it's not necessary to explicitly code the LOOP with LIMIT from 10g onwards providing that PLSQL_OPTIMIZE_LEVEL is set to the default of 2 or higher.  With PLSQL_OPTIMIZE_LEVEL>=2, the PL/SQL compiler will transparently rewrite simple loops to use bulk collect with a LIMIT clause of 100.  For instance the statement below will have exactly the same performance profile as the statement above:

 

As counter-productive as a BULK COLLECT without LIMIT is in any circumstances, it becomes actively dangerous when using MTS and either Automatic Shared Memory Management (ASMM) or 11g Automatic Memory Management (AMM). 

When using dedicated servers, memory for PL/SQL variables is allocated from the Program Global Area (PGA).  However, with MTS the memory is allocated from the large pool.  If you try to allocate a very large collection when connected via MTS - and AMM or ASMM is enabled - then Oracle will expand the size of the large pool to make room for your collection.  

Here's what happens when we run the first BULK COLLECT example when using MTS and AMM (image courtesy of Spotlight on Oracle):

 That's right: AMM allocates virtually all memory on the system to the large pool in order to accommodate the PL/SQL memory request.  First it consumes the buffer cache, then it reduces the PGA_AGGREGATE_TARGET - all the way to zero!  Sessions that may have been doing buffered IO or large sorts may experience severe degradation as in-memory sorts go to disk, and buffered IOs fail to find anything in the buffer cache.  

If you have a diagnostic pack license, you can see the same thing in OEM in the Memory advisors page:

Conclusion

  1. Don't use MTS unless you have a very good reason.  Not only does it have it's own issues, but it works poorly with Automatic memory management (AMM or ASMM).
  2. Don't use SELECT ... BULK COLLECT INTO unless the table is very small.  An explicit cursor with a LIMIT clause is preferable, and if PLSQL_OPTIMIZE_LEVEL is set to defaults, you often don't need to worry too much about bulk collect at all, since it is automatically implemented (with an array size of 100) by the PLSQL compiler
  3. When using AMM or ASMM beware of one memory region "starving" another.  The best way to avoid this is to set minimum values for specific memory areas.  For instance, you could avoid the phenomenon above if you set a value for PGA_AGGREGATE_TARGET and DB_CACHE_SIZE.  When AMM is in effect, these values will serve as minimum values which will avoid either area shrinking to absurdly small values.  We have a wizard in Spotlight on Oracle that can help you set these minimum values.