dbKoda for MongoDB

dbKoda for MongoDB - a modern open source database IDE, now available for MongoDB. Download it here!

Next Generation
Databases: NoSQL,
NewSQL and Big Data

Buy at Amazon

Read sample at Amazon

Buy at Apress

Latest Postings:

Search

Oracle Performance Survival Guide

Buy It
Read it on Safari
Scripts and Examples
Sample Chapter

                                    

Powered by Squarespace

 MySQL Stored procedure programming

Buy It
Read it on Safari
Scripts and Examples 

                                                

Entries in cellcli (3)

Thursday
Aug092012

Exadata smart flash logging

Exadata storage software 11.2.2.4 introduced the Smart flash logging feature.  The intent of this is to reduce overall redo log sync times - especially outliers - by allowing the exadata flash storage to serve as a secondary destination for redo log writes.  During a redo log sync, Oracle will write to the disk and flash simultaneously and allow the redo log sync operation to complete when the first device completes. 

Jason Arneil reports some initial observations here, and Luis Moreno Campos summarized it here.

I’ve reported in the past on using SSD for redo including on Exadata and generally I’ve found that SSD is a poor fit for redo log style sequential write IO.  But this architecture should at least do now harm and on the assumption that the SSD will at least occasionally complete faster than a spinning disk I tried it out. 

My approach involved the same workload I’ve used in similar tests.  I ran 20 concurrent processes each of which performed 200,000 updates and commits – a total of 4,000,000 redo log sync operations.  I captured every redo log sync wait from 10046 traces and loaded them in R for analysis.

I turned flash logging on or off by using an ALTER IORMPLAN command like this (my DB is called SPOT):

ALTER IORMPLAN dbplan=((name='SPOT', flashLog=$1),(name=other,flashlog=on))'

And I ran “list metriccurrent where objectType='FLASHLOG'” before and after each run so I could be sure that flash logging was on or off.

When flash logging was on, I saw data like this:

Before:

     FL_DISK_FIRST                     FLASHLOG     32,669,310 IO requests
     FL_FLASH_FIRST                    FLASHLOG     7,318,741 IO requests
     FL_PREVENTED_OUTLIERS             FLASHLOG     774,146 IO requests

After:

      FL_DISK_FIRST                     FLASHLOG     33,201,462 IO requests
     FL_FLASH_FIRST                    FLASHLOG     7,337,931 IO requests
     FL_PREVENTED_OUTLIERS             FLASHLOG     774,146 IO requests

 

So for this particular cell the flash disk “won” only 3.8% of times (7,337,931-7,318,741)*100/(7,337,931-7,318,741+33,201,462-32,669,310) and prevented no “outliers”.  Outliers are defined as being redo log syncs that would have taken longer than 500 ms to complete. 

Looking at my 4 million redo log sync times,  I saw that the average and median times where statistically significantly higher when the smart flash logging was involved:

> summary(flashon.data$synctime_us) #Smart flash logging ON
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0   452.0   500.0   542.4   567.0  3999.0
> summary(flashoff.data$synctime_us) #Smart flash logging OFF
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   29.0   435.0   481.0   508.7   535.0  3998.0
> t.test(flashon.data$synctime_us,flashoff.data$synctime_us,paired=FALSE)

    Welch Two Sample t-test

data:  flashon.data$synctime_us and flashoff.data$synctime_us
t = 263.2139, df = 7977922, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
33.43124 33.93285
sample estimates:
mean of x mean of y
542.3583  508.6763

Plotting the distribution of redo log sync times we can pretty easily see that there’s actually a small “hump” in times when flash logging is on (note logarithmic scale):

image

This is of course the exact opposite of what we expect, and I checked my data very carefully to make sure that I had not somehow switched samples.  And I repeated the test many times and always saw the same pattern.  

It may be that there is a slight overhead to running the race between disk and flash, and that that overhead makes redo log sync times slightly higher.  That overhead may become more negligible on a busy system.  But for now I personally can’t confirm that smart flash logging provides the intended optimization and in fact I observed a small but statistically significant and noticeable degradation in redo log sync times when it is enabled.

Tuesday
Sep272011

Clearing the Exadata smart flash cache using dcli

I’ve been doing some performance benchmarks on our exadata box specifically focusing on the performance of the smart flash cache.  I found that even if I switched the CELL_FLASH_CACHE storage setting to NONE,  the flash cache will still keep cached blocks in flash and would therefore give me artificially high values for “cell flash cache read hits” statistic when I set CELL_FLASH_CACHE back to DEFAULT or KEEP.  What I needed was a way to flush the Exadata flash cache.

Unfortunately there doesn’t seem to be a good way to flush the flash cache – no obvious CELLCLI command.   Maybe I’ve missed something, but for now I’m dropping and recreating the flash cache before each run.

Luckily the dcli command lets me drop and recreate on each cell directly from the database node and even sets up passwordless connections.  Here’s how to do it.

Firstly, create a script that will drop and recreate the flash cache for a single cell:

$ cat flushcache.sh
cellcli <<!
drop flashcache;
create flashcache all;

Now, use ccli to execute that on each cell node (I have three named exa1cel01,exa1cel02,exa1cel03:

$ dcli -c exa1cel01,exa1cel02,exa1cel03 --serial -k -l userid -x flushcache.sh

The “-k” option copies the ssh key to the cell nodes which means that after the first execution you’ll be able to do this without typing in the password for each cell node.   The “—serial” option makes each command happen one after another rather than all at once – you probably don’t need this…

Anyone know a better way to flush the Exadata flash cache?

Sunday
Jul312011

A perl utility to improve exadata cellcli statistics

I’ve only recently had the opportunity to start experimenting with our relatively new Exadata 1/4 rack.  Like most Exadata newbs the most immediately useful new tool is the cellcli utility which lets us get as storage cell utilities.  Kerry Osborne’s blog post on top 10 commands is required reading when getting started.

Amongst Cellcli annoyances are:

  1. You can’t join between METRICDEFINITION and METRICCURRENT or other metric tables.  So you can’t easily see the descriptions of the metric values you are viewing.
  2. CELLCLI only reports the values for the current storage cell, so you can’t get a global view across the cells.

I decided a good first project would be to write a utility that could display the metric definitions and the values together, and which would merge values from all of the storage cells.   Since my old friend perl is present on the storage cells, it wasn’t too hard to put together a tool to do this.   The script is here

Setup

 

You’ll want to have password-less SSH connectivity between cells for this to work.   Many of us will be familiar with doing this for RAC, but if you don’t, a quick primer is available at  http://blogs.translucentcode.org/mick/archives/000230.html.   I wrote this script to partially automate the procedure:

   1: # Helper script to setup passwordless SSH
   2: if [ $# -eq  0 ]
   3: then
   4:     echo "Usage $0 remoteHost"
   5:     exit 1
   6: fi
   7:  
   8: ssh-keygen -t dsa
   9: while [ $# -gt 0 ]; do
  10:     export remoteHost=$1
  11:     scp ~/.ssh/id_dsa.pub ${remoteHost}:
  12:     ssh $remoteHost 'mkdir .ssh; cat ~/id_dsa.pub >>~/.ssh/authorized_keys;chmod 644 ~/.ssh/authorized_keys '
  13:     shift
  14: done

 

From a database node, run the script with the names of all the storage cells on the command line.  I have three cells so I issued the command like this:

[melbourne@exa1db02 ~]$ bash setup_ssh.sh exa1cel01 exa1cel02 exa1cel03

Then you will be prompted with various SSH messages about passphrases, and so on.  Just hit return or “y” if prompted, except when asked for passwords.  You’ll have to enter the password for each storage cell twice to get everything setup. After that you should be able to ssh to the storage cells without being prompted for a password.

Usage

 

Running the command with no arguments will generate the following usage message:

 

[melbourne@exa1db02 ~]$ perl cellcli.pl
Usage perl cellcli.pl [options]
Options:
        --hosts -H      comma seperated lists of cell nodes
        --mask -m       print metrics starting with this string
        --desc -d       print descriptions of metrics
        --cells -c      print individual cell metrics
        --sci -s        Use scientific notation
        --stats -a      Print min/max/avgs
        --help -h       print this

Example:
        perl cellcli -m FC_IO_RQ_R -h exa1cel01,exa1cel02,exa1cel03

NB: utility assumes passwordless SSH from this host to the cell nodes
see
http://guyharrison.squarespace.com/blog/2011/7/31/aggregating-exadata-cellcli-statistics.html

 

The only compulsory argument is “—hosts”, which requires a comma-seperated list of the cell nodes.    Most of the time you will want to use the “—mask” argument as well, which applies a filter to the metric names.   With no other arguments you’ll then get a list of each matching statistic and the sum of values across all the cells:

[melbourne@exa1db02 ~]$ perl cellcli.pl --hosts exa1cel01,exa1cel02,exa1cel03 --mask FC_IO_RQ_R.*

Name                    Sum
----------------------- ------------
FC_IO_RQ_R               8.55164e+06
FC_IO_RQ_R_MISS               675223
FC_IO_RQ_R_MISS_SEC                0
FC_IO_RQ_R_SEC                   0.1
FC_IO_RQ_R_SKIP            1.742e+08
FC_IO_RQ_R_SKIP_SEC             33.6

Note that mask matches Unix regular expressions, not Oracle style “%” wildcards - '.*' matches anything. 

The --desc argument adds the description (somewhat truncated) for each metric:

image

The --cells argument prints the values from each individual cell:

image

 

The --stats argument causes min,max and average values to be printed.  The average is the average of sums across cells, not the average of all values.  SOme statistics will have more than one value in each cell (GD_IO_RQ_W_SM for example) and the average is the average SUM of values across cells.  

image

You can use all the arguments together, but you’re output might be very wide!

Restrictions and caveats

 

When a metric has multiple object values they are simply summed.  That probably works OK for some statistics but maybe not for others.  For instance the disk grid IO counters are broken down by the name of the grid disk within the cell:

image

My utility simply sums all of those values:

image

 

 

Hope you find this of use.  Please let me know if you’ve got any suggestions…..