<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Squarespace V5 Site Server v5.13.166 (http://www.squarespace.com) on Wed, 19 Jun 2013 07:59:59 GMT--><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><title>Yet Another Database Blog</title><link>http://guyharrison.squarespace.com/blog/</link><description></description><lastBuildDate>Mon, 17 Sep 2012 02:06:37 +0000</lastBuildDate><copyright></copyright><language>en-US</language><generator>Squarespace V5 Site Server v5.13.166 (http://www.squarespace.com)</generator><item><title>Exadata Smart Flash Logging–Outliers</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Mon, 17 Sep 2012 00:37:03 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2012/9/17/exadata-smart-flash-loggingoutliers.html</link><guid isPermaLink="false">359481:3851163:29006867</guid><description><![CDATA[<p>In <a href="http://guyharrison.squarespace.com/blog/2012/8/9/exadata-smart-flash-logging.html">my last post</a>, I looked at the effect of the Exadata smart flash logging.&nbsp; Overall,&nbsp; there seemed to be a slight negative effect on median redo log sync times.&nbsp; This chart (slightly different from the last post because of different load and configuration of the system), shows how there&rsquo;s a &ldquo;hump&rdquo; of redo log syncs that take slightly longer when the flash logging is enabled:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-Smart-Flash-LoggingOutliers_9198-?fileId=20295822"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-Smart-Flash-LoggingOutliers_9198-?fileId=20295823" border="0" alt="image" width="809" height="656" /></a></p>
<p>But of course, the flash logging feature was designed to improve performance not of the &ldquo;average&rdquo; redo log sync, but of the &ldquo;outliers&rdquo;.&nbsp;</p>
<p>In my tests, I had 40 concurrent processes writing redo as fast as they could.&nbsp; Occasionally this would result in some really long wait times.&nbsp; For instance, in this trace you see an outlier of 291,780 microseconds (the biggest outlier in my tests BTW) within an otherwise unremarkable set of waits:</p>
<p><span style="font-family: 'Courier New'; font-size: x-small;">WAIT #47124064145648: nam='log file sync' ela= 1043 buffer#=101808 sync scn=1266588527 p3=0 obj#=-1 tim=1347583167588250     <br /></span><span style="font-family: 'Courier New'; font-size: x-small;">WAIT #47124064145648: nam='log file sync' ela= 2394 buffer#=130714 sync scn=1266588560 p3=0 obj#=-1 tim=1347583167590888     <br /></span><span style="font-family: 'Courier New'; font-size: x-small;">WAIT #47124064145648: nam='log file sync' ela= 932 buffer#=101989 sync scn=1266588598 p3=0 obj#=-1 tim=1347583167592057     <br /></span><span style="font-family: 'Courier New'; font-size: x-small;"><span style="background-color: #ffff00; color: #ff0000;">WAIT #47124064145648: nam='log file sync' ela= <span style="text-decoration: underline;"><strong>291780</strong></span> buffer#=102074 sync scn=1266588637 p3=0 obj#=-1 tim=1347583167884090</span> <br /></span><span style="font-family: 'Courier New'; font-size: x-small;">WAIT #47124064145648: nam='log file sync' ela= 671 buffer#=102196 sync scn=1266588697 p3=0 obj#=-1 tim=1347583167885294     <br /></span><span style="font-family: 'Courier New'; font-size: x-small;">WAIT #47124064145648: nam='log file sync' ela= 957 buffer#=102294 sync scn=1266588730 p3=0 obj#=-1 tim=1347583167886575</span></p>
<p>To see if the flash logging feature was successful in removing these outliers, I extracted the top 10,000 waits from the roughly 8,000,000 waits I recorded in each category.&nbsp; Here&rsquo;s a plot (non-logarithmic) of those waits:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-Smart-Flash-LoggingOutliers_9198-?fileId=20295825"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-Smart-Flash-LoggingOutliers_9198-?fileId=20295826" border="0" alt="image" width="809" height="662" /></a></p>
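<p>As a rough illustration of the extraction step, here is a minimal Python sketch that pulls the <em>ela</em> value out of 'log file sync' WAIT lines and keeps the N longest. It is not the tooling used for the plot above (that analysis was done in R); the function names and the regular expression are assumptions made for this example, and the sample lines are taken from the trace shown earlier.</p>

```python
import heapq
import re

# Captures the elapsed time (microseconds) from a 10046 trace WAIT line
# for the 'log file sync' event. The pattern is an assumption based on
# the trace excerpt shown in this post.
WAIT_RE = re.compile(r"^WAIT #\d+: nam='log file sync' ela= *(\d+) ")


def log_sync_times(lines):
    """Yield the ela value of every 'log file sync' wait found in lines."""
    for line in lines:
        match = WAIT_RE.match(line)
        if match:
            yield int(match.group(1))


def top_waits(lines, n):
    """Return the n longest log file sync waits, longest first."""
    return heapq.nlargest(n, log_sync_times(lines))


sample = [
    "WAIT #47124064145648: nam='log file sync' ela= 1043 buffer#=101808 sync scn=1266588527 p3=0 obj#=-1 tim=1347583167588250",
    "WAIT #47124064145648: nam='log file sync' ela= 932 buffer#=101989 sync scn=1266588598 p3=0 obj#=-1 tim=1347583167592057",
    "WAIT #47124064145648: nam='log file sync' ela= 291780 buffer#=102074 sync scn=1266588637 p3=0 obj#=-1 tim=1347583167884090",
    "WAIT #47124064145648: nam='log file sync' ela= 671 buffer#=102196 sync scn=1266588697 p3=0 obj#=-1 tim=1347583167885294",
]

print(top_waits(sample, 2))  # [291780, 1043]
```

<p>Using <em>heapq.nlargest</em> keeps only N values in memory at a time, which matters when the input is millions of trace lines read from a file rather than a short list.</p>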
<p>So &ndash; the flash log feature was effective in eliminating or at least reducing very extreme outlying redo log sync times.&nbsp;&nbsp;&nbsp; Most redo log sync operations will experience no improvement or maybe even a slight degradation. But for the small number of log syncs that would have experienced a really excessive delay, the feature works as advertised &ndash; it reduces the chance of really excessive log file syncs.&nbsp;</p>
<p>In my opinion, this effect doesn't imply that the flash can process a redo log write faster than the magnetic disks - in fact probably the opposite is true. &nbsp;But given two destinations to choose from, we avoid really long delays that occur when only one of the destinations is overloaded.&nbsp;</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-29006867.xml</wfw:commentRss></item><item><title>Exadata smart flash logging</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>R</category><category>cellcli</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Thu, 09 Aug 2012 02:30:41 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2012/8/9/exadata-smart-flash-logging.html</link><guid isPermaLink="false">359481:3851163:22199597</guid><description><![CDATA[<p>Exadata storage software 11.2.2.4 introduced the Smart flash logging feature.&nbsp; The intent of this is to reduce overall redo log sync times - especially outliers - by allowing the Exadata flash storage to serve as a secondary destination for redo log writes.&nbsp; During a redo log sync, Oracle will write to the disk and flash simultaneously and allow the redo log sync operation to complete when the first device completes.&nbsp;</p>
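<p>The "first device to complete wins" behaviour described above can be sketched in a few lines of Python. This is only a toy model, not how the storage cell software is implemented: the function name is made up for the example and the latency values are hypothetical, chosen to show one slow write on each device.</p>

```python
def first_to_complete(disk_us, flash_us):
    """Elapsed time of a redo write that is acknowledged as soon as
    either destination has finished."""
    return min(disk_us, flash_us)


# Hypothetical per-write latencies in microseconds (illustrative values,
# not measurements). Each device has one very slow write.
disk_latencies = [480, 510, 250000, 495]
flash_latencies = [620, 600, 640, 180000]

raced = [first_to_complete(d, f) for d, f in zip(disk_latencies, flash_latencies)]

print(raced)       # [480, 510, 640, 495]
print(max(raced))  # 640
```

<p>In this toy data the typical write is no faster than disk alone, but the worst case drops from 250,000 to 640 microseconds, because a write is only slow when both devices are slow at the same moment.</p>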
<p>Jason Arneil reports some initial observations <a href="http://jarneil.wordpress.com/2012/03/20/exadata-smart-flash-logging">here</a>, and Luis Moreno Campos summarized it <a href="http://ocpdba.wordpress.com/2011/10/21/smart-flash-redo-logging-a-small-step-from-development-a-giant-leap-in-database-performance/?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed%3A+orana+%28OraNA%29">here</a>.</p>
<p>I&rsquo;ve reported in the past on using SSD for redo, including on Exadata, and generally I&rsquo;ve found that <a href="http://guyharrison.squarespace.com/blog/2011/12/6/using-ssd-for-redo-on-exadata-pt-2.html">SSD is a poor fit for redo log style sequential write IO</a>.&nbsp; But this architecture should at least do no harm, and on the assumption that the SSD will at least occasionally complete faster than a spinning disk, I tried it out.&nbsp;</p>
<p>My approach involved the same workload I&rsquo;ve used in similar tests.&nbsp; I ran 20 concurrent processes each of which performed 200,000 updates and commits &ndash; a total of 4,000,000 redo log sync operations.&nbsp; I captured every redo log sync wait from 10046 traces and loaded them in R for analysis.</p>
<p>I turned flash logging on or off by using an ALTER IORMPLAN command like this (my DB is called SPOT):</p>
<blockquote>
<p>ALTER IORMPLAN dbplan=((name='SPOT', flashLog=$1),(name=other,flashlog=on))</p>
</blockquote>
<p>And I ran &ldquo;list metriccurrent where objectType='FLASHLOG'&rdquo; before and after each run so I could be sure that flash logging was on or off.</p>
<p>When flash logging was on, I saw data like this:</p>
<p>Before:</p>
<p><span style="font-family: 'Courier New';">&nbsp;&nbsp;&nbsp;&nbsp; FL_DISK_FIRST&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 32,669,310 IO requests     <br />&nbsp;&nbsp;&nbsp;&nbsp; FL_FLASH_FIRST&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 7,318,741 IO requests      <br />&nbsp;&nbsp;&nbsp;&nbsp; FL_PREVENTED_OUTLIERS&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 774,146 IO requests</span></p>
<p>After:</p>
<p>&nbsp; <span style="font-family: 'Courier New';">&nbsp;&nbsp;&nbsp; FL_DISK_FIRST&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 33,201,462 IO requests     <br />&nbsp;&nbsp;&nbsp;&nbsp; FL_FLASH_FIRST&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 7,337,931 IO requests      <br />&nbsp;&nbsp;&nbsp;&nbsp; FL_PREVENTED_OUTLIERS&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; FLASHLOG&nbsp;&nbsp;&nbsp;&nbsp; 774,146 IO requests</span></p>
<p>&nbsp;</p>
<p>So for this particular cell the flash disk &ldquo;won&rdquo; only about 3.5% of the time, (7,337,931-7,318,741)*100/(7,337,931-7,318,741+33,201,462-32,669,310), and prevented no &ldquo;outliers&rdquo; (the FL_PREVENTED_OUTLIERS counter did not change).&nbsp; Outliers are defined as being redo log syncs that would have taken longer than 500 ms to complete.&nbsp;</p>
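<p>For anyone who wants to check the arithmetic, this short Python sketch works through the expression above using the before and after counter values listed earlier. The dictionary layout and variable names are just for this example; only the numbers come from the cellcli output.</p>

```python
# Counter values copied from the "Before" and "After" listings above.
before = {
    "FL_DISK_FIRST": 32669310,
    "FL_FLASH_FIRST": 7318741,
    "FL_PREVENTED_OUTLIERS": 774146,
}
after = {
    "FL_DISK_FIRST": 33201462,
    "FL_FLASH_FIRST": 7337931,
    "FL_PREVENTED_OUTLIERS": 774146,
}

# The counters are cumulative, so the activity during the run is the difference.
delta = {name: after[name] - before[name] for name in before}

flash_pct = 100 * delta["FL_FLASH_FIRST"] / (
    delta["FL_FLASH_FIRST"] + delta["FL_DISK_FIRST"]
)

print(delta)
print(round(flash_pct, 1))  # 3.5
```

<p>The deltas come out as 532,152 disk-first writes, 19,190 flash-first writes and zero prevented outliers for this cell during the run.</p>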
<p>Looking at my 4 million redo log sync times,&nbsp; I saw that the average and median times were statistically significantly <strong>higher</strong> when the smart flash logging was involved:</p>
<blockquote>
<p><span style="font-family: 'Courier New';"><span style="color: #9b00d3;">&gt; summary(flashon.data$synctime_us)</span> #<strong><span style="color: #0000ff;">Smart flash logging ON</span></strong> <br />&nbsp;&nbsp; Min. 1st Qu.&nbsp; Median&nbsp;&nbsp;&nbsp; Mean 3rd Qu.&nbsp;&nbsp;&nbsp; Max.         <br />&nbsp;&nbsp;&nbsp; 1.0&nbsp;&nbsp; 452.0&nbsp;&nbsp; <span style="color: #0000ff;">500.0&nbsp;&nbsp; 542.4</span>&nbsp;&nbsp; 567.0&nbsp; 3999.0         <br /><span style="color: #9b00d3;">&gt; summary(flashoff.data$synctime_us)</span> #<strong><span style="color: #0000ff;">Smart flash logging OFF</span></strong> <br />&nbsp;&nbsp; Min. 1st Qu.&nbsp; Median&nbsp;&nbsp;&nbsp; Mean 3rd Qu.&nbsp;&nbsp;&nbsp; Max.         <br />&nbsp;&nbsp; 29.0&nbsp;&nbsp; 435.0&nbsp;&nbsp; <strong><span style="color: #0000ff;">481.0&nbsp;&nbsp; 508.7</span></strong>&nbsp;&nbsp; 535.0&nbsp; 3998.0         <br /><span style="color: #9b00d3;">&gt; t.test(flashon.data$synctime_us,flashoff.data$synctime_us,paired=FALSE)</span></span></p>
<p><span style="font-family: 'Courier New';">&nbsp;&nbsp;&nbsp; Welch Two Sample t-test</span></p>
<p><span style="font-family: 'Courier New';">data:&nbsp; flashon.data$synctime_us and flashoff.data$synctime_us        <br />t = 263.2139, df = 7977922, p-value &lt; 2.2e-16        <br />alternative hypothesis: true <span style="color: #0000ff;"><strong>difference in means is not equal to 0</strong></span> <br />95 percent confidence interval:        <br /> 33.43124 33.93285         <br />sample estimates:        <br />mean of x mean of y         <br /> 542.3583&nbsp; 508.6763</span></p>
</blockquote>
<p>Plotting the distribution of redo log sync times we can pretty easily see that there&rsquo;s actually a small &ldquo;hump&rdquo; in times when flash logging is on (note logarithmic scale):</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-smart-flash-logging_A77F-?fileId=19774883"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Exadata-smart-flash-logging_A77F-?fileId=19774884" border="0" alt="image" width="733" height="650" /></a></p>
<p>This is of course the exact opposite of what we expect, and I checked my data very carefully to make sure that I had not somehow switched samples.&nbsp; And I repeated the test many times and always saw the same pattern.&nbsp;&nbsp;</p>
<p>It may be that there is a slight overhead to running the race between disk and flash, and that that overhead makes redo log sync times slightly higher.&nbsp; That overhead may become more negligible on a busy system.&nbsp; But for now I personally can&rsquo;t confirm that smart flash logging provides the intended optimization and in fact I observed a small but statistically significant and noticeable degradation in redo log sync times when it is enabled.</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-22199597.xml</wfw:commentRss></item><item><title>Getting started with Apache Pig</title><category>Hadoop</category><category>Hive</category><category>Pig</category><category>TCD blog post</category><dc:creator>Guy Harrison</dc:creator><pubDate>Fri, 06 Jan 2012 10:42:35 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2012/1/6/getting-started-with-apache-pig.html</link><guid isPermaLink="false">359481:3851163:14463811</guid><description><![CDATA[<p>If, like me, you want to play around with data in a Hadoop cluster without having to write hundreds or thousands of lines of Java <a href="http://hadoop.apache.org/mapreduce/">MapReduce</a> code, you most likely will use either <a href="http://hive.apache.org/">Hive</a> (using the&nbsp; Hive Query Language HQL) or <a href="http://pig.apache.org/">Pig</a>.</p>
<p>Hive is a SQL-like language which compiles to Java map-reduce code, while Pig is a <em>data flow language </em>which allows you to specify your map-reduce data pipelines using high level abstractions.&nbsp;</p>
<p>The way I like to think of it is that writing Java MapReduce is like programming in assembler:&nbsp; you need to manually construct every low level operation you want to perform.&nbsp; Hive allows people familiar with SQL to extract data from Hadoop with ease and &ndash; like SQL &ndash; you specify the data you want without having to worry too much about the way in which it is retrieved.&nbsp; Writing a Pig script is like writing a SQL execution plan:&nbsp; you specify the exact sequence of operations you want to undertake when retrieving the data.&nbsp; Pig also allows you to specify more complex data flows than is possible using HQL alone.</p>
<p>As a crusty old RDBMS guy, I at first thought that Hive and HQL were the most attractive solution and I still think Hive is critical to enterprise adoption of Hadoop since it opens up Hadoop to the world of enterprise Business Intelligence.&nbsp; But Pig really appeals to me as someone who has spent so much time tuning SQL.&nbsp; The Hive optimizer is currently at the level of early rule-based RDBMS optimizers from the early 90s.&nbsp; It will get better and get better quickly, but given the massive size of most Hadoop clusters, the cost of a poorly optimized HQL statement is really high.&nbsp; Explicitly specifying the execution plan in Pig arguably gives the programmer more control and lessens the likelihood of the &ldquo;HQL statement from Hell&rdquo; bringing a cluster to its knees.</p>
<p>So I&rsquo;ve started learning Pig, using the familiar (to me) Oracle sample schema which I downloaded using <a href="http://incubator.apache.org/projects/sqoop.html">SQOOP</a>.&nbsp;&nbsp; (Hint:&nbsp; Pig likes tab separated&nbsp; files, so use the <em>--fields-terminated-by '\t'</em> flag in your SQOOP job).&nbsp;</p>
<p>Here&rsquo;s a diagram I created showing how some of the more familiar HQL idioms are implemented in Pig:</p>
<p><span class="full-image-block ssNonEditable"><span><img style="width: 900px;" src="http://guyharrison.squarespace.com/storage/6-01-2012%209-21-39%20PM%20pig%20vs%20hive.png?__SQUARESPACE_CACHEVERSION=1325912009711" alt="" /></span></span></p>
<p>Note how using Pig we explicitly control the execution plan:&nbsp; In HQL it&rsquo;s up to the optimizer whether tables are joined before or after the &ldquo;country_region=&rsquo;Asia&rsquo;&rdquo; filter is applied.&nbsp; In Pig I explicitly execute the filter before the join. &nbsp; &nbsp;It turns out that the Hive optimizer does the same thing, but for complex data flows being able to explicitly control the sequence of events can be an advantage.&nbsp;</p>
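<p>To make the "filter before join" point concrete, here is a toy Python sketch. It is neither Pig nor HQL, and the rows, column names and naive nested-loop join are all hypothetical, loosely modelled on the Oracle sample schema. It only shows that both orderings return the same rows while the filter-first ordering feeds fewer rows into the join.</p>

```python
# Hypothetical rows, for illustration only.
countries = [
    {"country_id": 1, "country_region": "Asia"},
    {"country_id": 2, "country_region": "Europe"},
    {"country_id": 3, "country_region": "Asia"},
]
customers = [
    {"cust_id": 10, "country_id": 1},
    {"cust_id": 11, "country_id": 2},
    {"cust_id": 12, "country_id": 3},
    {"cust_id": 13, "country_id": 2},
]


def join(left, right, key):
    """Naive nested-loop join that also counts the row pairs it compares."""
    compared = 0
    rows = []
    for left_row in left:
        for right_row in right:
            compared += 1
            if left_row[key] == right_row[key]:
                rows.append({**left_row, **right_row})
    return rows, compared


# Filter first, then join (the order spelled out explicitly in a Pig script).
asia = [c for c in countries if c["country_region"] == "Asia"]
filter_first, compared_filter_first = join(customers, asia, "country_id")

# Join first, then filter.
joined, compared_join_first = join(customers, countries, "country_id")
join_first = [row for row in joined if row["country_region"] == "Asia"]

print(filter_first == join_first)                  # True
print(compared_filter_first, compared_join_first)  # 8 12
```

<p>A real MapReduce join does not work as a nested loop, so the comparison counts are only a stand-in for "less data flowing through the join step".</p>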
<p>Pig is only a little more wordy than HQL and while I definitely like the familiar syntax of HQL I really like the additional control of Pig.</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-14463811.xml</wfw:commentRss></item><item><title>Using SSD for redo on Exadata - pt 2</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Tue, 06 Dec 2011 04:31:40 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/12/6/using-ssd-for-redo-on-exadata-pt-2.html</link><guid isPermaLink="false">359481:3851163:13994115</guid><description><![CDATA[<p>In my <a href="http://guyharrison.squarespace.com/blog/2011/10/27/using-flash-disk-for-redo-on-exadata.html">previous post</a> on this topic, I presented data showing that redo logs placed on an ASM diskgroup created from exadata griddisks created from flash performed far worse than redo logs placed on ASM created from spinning SAS disks.</p>
<p>Of course, theory predicts that flash will not outperform spinning magnetic disk for the sequential write IOs experienced by redo logs, but on Exadata, flash disk performed much worse than seemed reasonable and worse than experience on regular Oracle with FusionIO SSD would predict (see <a href="http://guyharrison.squarespace.com/ssdguide/04-evaluating-the-options-for-exploiting-ssd.html">this post</a>).</p>
<p><a href="http://structureddata.org/">Greg Rahn</a> and <a href="http://kevinclosson.wordpress.com/">Kevin Closson</a> were both kind enough to help explain this phenomenon.&nbsp; In particular, they pointed out that the flash cards might be performing poorly because of the default 512 byte redo block size and that I should try a 4K blocksize.&nbsp;&nbsp; Unfortunately, at least on my patch level (11.2.2.3.2), there appears to be a problem with setting a 4K blocksize:</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">ALTER DATABASE add logfile thread 1 group 9 ('+DATA_SSD') size 4096M blocksize 4096       <br />*        <br />ERROR at line 1:        <br />ORA-01378: The logical block size (4096) of file +DATA_SSD is not compatible with the disk sector size (media sector size is 512 and host sector size is 512)</span></p>
<p>According to Greg, the F20 SSD cards are incorrectly reporting their physical characteristics and this is fixed in the current patch level.&nbsp;&nbsp; Luckily, you can override the check by setting:</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">ALTER SYSTEM SET "_disk_sector_size_override"=TRUE SCOPE=BOTH;</span></p>
<p>Greg and Kevin really know their stuff:&nbsp; setting a 4k redo log block size resulted in dramatic improvements to redo log throughput &ndash; elapsed time reduced by 70%:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Using-SSD-as-redo-on-Exadata-pt-2_A64F-?fileId=15467153"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Using-SSD-as-redo-on-Exadata-pt-2_A64F-?fileId=15467155" border="0" alt="image" width="778" height="412" /></a></p>
<p>As expected,&nbsp; redo log performance for SSD still slightly lags that of SAS spinning disks.&nbsp;&nbsp;&nbsp;&nbsp; It&rsquo;s clear that you can&rsquo;t expect a performance improvement by placing redo on SSD, but at least the 4K blocksize fix makes the response time comparable.&nbsp; Of course, with the price of SSD being what it is, and the far higher benefits provided for other workloads &ndash; especially random reads &ndash; it&rsquo;s hard to see an economic rationale for SSD-based redo.&nbsp;&nbsp;&nbsp; But at least with a 4K blocksize it&rsquo;s tolerable.</p>
<p>When our Exadata system is updated to the latest storage cell software, I&rsquo;ll try comparing workloads with the Exadata smart flash logging feature.</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-13994115.xml</wfw:commentRss></item><item><title>Amazon Elastic Map Reduce (EMR), Hive, and TOAD</title><category>Hadoop</category><category>Hive</category><category>TCD blog post</category><category>amazon</category><dc:creator>Guy Harrison</dc:creator><pubDate>Mon, 05 Dec 2011 01:14:55 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/12/5/amazon-elastic-map-reduce-emr-hive-and-toad.html</link><guid isPermaLink="false">359481:3851163:13973869</guid><description><![CDATA[<p>Since my <a href="http://guyharrison.squarespace.com/blog/2011/2/1/using-toad-with-hive-in-amazon-elastic-map-reduce.html">first post</a> on connecting to Amazon Elastic Map Reduce with TOAD, we&rsquo;ve added quite a few features to our Hadoop support in general and our EMR support specifically, so I thought I&rsquo;d summarize those features in this blog post.</p>
<p>Amazon Elastic Map Reduce is a cloud-based version of Hadoop hosted on Amazon Elastic Compute Cloud (EC2) instances.&nbsp; Using EMR, you can quickly establish a cloud based Hadoop cluster to perform map reduce work flows.&nbsp;</p>
<p>EMR supports Hive of course, and <a href="http://www.toadforcloud.com/index.jspa">Toad for Cloud Databases</a> (TCD)&nbsp; includes Hive support, so let&rsquo;s look at using that to query EMR data.</p>
<h2>Using the Toad direct Hive client</h2>
<p>&nbsp;</p>
<p>TCD direct Hive connection support is the quickest way to establish a connection to Hive.&nbsp; It uses a bundled JDBC driver to establish the connection.</p>
<p>Below we create a new connection to a Hive server running on EMR:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15445994"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15445995" border="0" alt="image" width="837" height="713" /></a></p>
<ol>
<li>Right click on Hive connections and choose &ldquo;Connect to Hive&rdquo; to create a new Hive connection.</li>
<li>The host address is the &ldquo;Master&rdquo; EC2 instance for your EMR cluster.&nbsp; You&rsquo;ll find that on the EMR Job flow management page within your Amazon AWS console.&nbsp; The Hive 0.5 server is running on port 10000 by default. </li>
<li>Specifying a job tracker port allows us to track the execution of our Hive jobs in EMR.&nbsp; The standard Hadoop jobtracker port is 50030, but in EMR it&rsquo;s 9600. </li>
<li>It&rsquo;s possible to open up port 10000 so you can directly connect with Hive clients, but it&rsquo;s a bad idea usually.&nbsp; Hive has negligible built-in security, so you&rsquo;d be exposing your Hive data.&nbsp;&nbsp; For that reason we support a SSH mode in which you can tunnel through to your hadoop server using the keypair file that you used to start the EMR job flow.&nbsp; The key name is also shown in the EMR console page, though obviously you&rsquo;ll need to have an actual keypair file. </li>
</ol>
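<p>If you want to see what an SSH tunnel of the kind described in step 4 looks like outside of Toad, here is a minimal Python sketch that builds an equivalent OpenSSH command line. It is not how Toad implements its SSH mode. The function name, the placeholder host and key file, and the assumption that the login user on the EMR master node is <em>hadoop</em> are all illustrative; only the <em>-i</em>, <em>-N</em> and <em>-L</em> options are standard ssh flags.</p>

```python
def tunnel_command(master_host, keypair_file, local_port=10000,
                   remote_port=10000, user="hadoop"):
    """Build an ssh argument list that forwards a local port to the Hive
    port on the EMR master node, so port 10000 never has to be opened
    to the outside world."""
    return [
        "ssh",
        "-i", keypair_file,   # the keypair used to start the job flow
        "-N",                 # no remote command, forwarding only
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{master_host}",
    ]


# Placeholder values: substitute your own master node address and key file.
command = tunnel_command("emr-master.example.com", "my-keypair.pem")
print(" ".join(command))
# ssh -i my-keypair.pem -N -L 10000:localhost:10000 hadoop@emr-master.example.com
```

<p>With a tunnel like this running, a Hive client would connect to localhost on the local port rather than to the master node directly.</p>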
<p>The direct Hive client allows you to execute any legal Hive QL commands.&nbsp; In the example below, we create a new Hive table based on data held in an S3 bucket (The data is some UN data on homicide rates I uploaded).</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15445996"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="SNAGHTML9c66e8d" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15445997" border="0" alt="SNAGHTML9c66e8d" width="929" height="469" /></a></p>
<h2>Connecting Hive to the Toad data hub</h2>
<p>&nbsp;</p>
<p>It&rsquo;s great to be able to use Hive to exploit Map Reduce using familiar (to me) SQL-like syntax.&nbsp; But the real advantage of TCD for Hive is that we link to data that might be held in other sources &ndash; like Oracle, Cassandra, SQL Server, MongoDB, etc.</p>
<p>Setting up a hub connection to EMR hive is very similar to setting up a direct connection.&nbsp; Of course you need a data hub installed (see <a href="http://toadforcloud.com/entry!default.jspa?categoryID=677&amp;externalID=4372">here</a> for instructions), then right click on the hub node and select &ldquo;map data source&rdquo;:</p>
<p><span class="full-image-block ssNonEditable"><span><a href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15445999"><img style="width: 1024px;" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446000&amp;__SQUARESPACE_CACHEVERSION=1323048317504" alt="" /></a></span></span></p>
<p>Now that the hub knows about the EMR hive connection, we can issue queries that access Hive and &ndash; in the same SQL &ndash; other datasources. For instance, here&rsquo;s a query that joins homicide data in Hive Elastic Map Reduce with population data stored in an Oracle database (running on Amazon RDS: &nbsp;Relational Database Service).&nbsp; We can do these cross platform joins across a lot of different types of database sources, including any ODBC compliant databases, any Apache HBase or Hive connections, Cassandra, MongoDB, SimpleDB, and Azure table services:</p>
<p><span class="full-image-block ssNonEditable"><span><a href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446002"><img style="width: 1024px;" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446003&amp;__SQUARESPACE_CACHEVERSION=1323048266371" alt="" /></a></span></span></p>
<p>In the version that we are just about to release, queries can be saved as views or snapshots, allowing easier access from external tools or for users who aren&rsquo;t familiar with SQL.&nbsp;&nbsp; In the example above, I&rsquo;m saving my query as a view.</p>
<h2>Using other hub-enabled clients</h2>
<p>&nbsp;</p>
<p>TCD isn&rsquo;t the only product that can issue hub queries.&nbsp; In beta today, the <a href="http://questbi.com/">Quest Business Intelligence Studio</a> can attach to the data hub, and allows you to graphically explore your data using drag and drop, click and drilldown paradigms:</p>
<p><span class="full-image-block ssNonEditable"><span><a href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446005"><img style="width: 1024px;" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446007&amp;__SQUARESPACE_CACHEVERSION=1323048337207" alt="" /></a></span></span></p>
<p>It&rsquo;s great to be living in Australia &ndash; one of the lowest homicide rates!</p>
<p>If you&rsquo;re a hard core data scientist, you can even attach R through to the hub via the RODBC interface.&nbsp; So for instance, in the screen shot below, I&rsquo;m using R to investigate the correlation&nbsp; between population density and homicide rate.&nbsp; The data comes from Hive (EMR) and Oracle (RDS),&nbsp; is joined in the hub, saved as a snapshot and then fed into R for analysis.&nbsp; Pretty cool for a crusty old stats guy like me (My very first computer program was written in 1979 on SPSS).</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446009"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Amazon-Elastic-Map-Reduce-EMR-and-TOAD-P_9A89-?fileId=15446010" border="0" alt="image" width="940" height="605" /></a></p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-13973869.xml</wfw:commentRss></item><item><title>Using flash disk for Redo on Exadata</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Thu, 27 Oct 2011 03:18:12 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/10/27/using-flash-disk-for-redo-on-exadata.html</link><guid isPermaLink="false">359481:3851163:13480845</guid><description><![CDATA[<p>In this <a href="http://www.quest.com/Quest_Site_Assets/WhitePapers/Best_Practices_for_Optimizing_Oracle_RDBMS_with_Solid_State_Disk-final.pdf">Quest white paper</a> and on my <a href="http://guyharrison.squarespace.com/ssdguide/">SSD blog</a> here,&nbsp; I report on how using a FusionIO flash SSD compares with SAS disk for various configurations &ndash; datafile, flash cache, temp tablespace and redo log.&nbsp; Of all the options I found that using flash for redo was the least suitable, with virtually no performance benefit:</p>
<p><img src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-aaaf694c14e4_EFD6-?fileId=13374872" alt="image" /></p>
<p>That being the case,&nbsp; I was surprised to see that Oracle had decided to place Redo logs on flash disk within the <a href="http://www.oracle.com/us/products/database/database-appliance/index.html">database appliance</a>, and also that the latest release of the exadata storage cell software used flash disk to cache redo log writes (Greg Rahn explains it <a href="http://structureddata.org/2011/10/12/exadata-smart-flash-logging-explained/">here</a>).&nbsp;&nbsp; I asked around at OOW hoping someone could explain the thinking behind this, but generally I got very little insight.</p>
<p>I thought I better repeat my comparisons between spinning and solid state disk on our Exadata system here at Quest.&nbsp; Maybe the &ldquo;super capacitor&rdquo; backed 64M DRAM on each flash chip would provide enough buffering to improve performance.&nbsp; Or maybe I was just completely wrong in my previous tests (though I REALLY don&rsquo;t think so :-/).</p>
<p>Our Exadata 1/4 rack has a 237GB disk group constructed on top of storage cell flash disk.&nbsp; I described how that is created <a href="http://guyharrison.squarespace.com/blog/2011/9/27/configuring-exadata-flash-as-grid-disk.html">in this post</a>.&nbsp;&nbsp; I chose 96GB per storage cell in order to allow the software to isolate the grid disks created on flash to 4 24GB FMODs (each cell has 16 FMODs).&nbsp;&nbsp;&nbsp; Our Exadata system has fast SAS spinning disks &ndash; 12 per storage cell for a total of 36 disks.&nbsp; Both the SAS and SSD disk groups had normal redundancy.</p>
<p>I ran an identical redo-intensive workload on the system using SAS or SSD diskgroups for the redo logs.&nbsp; Redo logs were 3 groups of 4GB per instance.&nbsp;&nbsp; I ran the workload on its own, and as 10 separate concurrent sessions.&nbsp;&nbsp;</p>
<p>The results shocked me:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Using-flash-disk-for-Redo-on-Exadata_BEFC-?fileId=14837473"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Using-flash-disk-for-Redo-on-Exadata_BEFC-?fileId=14837474" border="0" alt="image" width="745" height="515" /></a></p>
<p>When running at a single level of concurrency,&nbsp; the SSD based ASM redo seemed to be around 4 times slower than the default SAS-based ASM redo.&nbsp; Things got substantially worse as I upped the level of concurrency with SSD being almost 20 times slower.&nbsp; Wow.</p>
<p>I had expected the SAS based redo to win &ndash; the SAS ASM disk group has 36 spindles to write to, while the SSD group is (probably) only writing to 12 FMODs.&nbsp; And we don&rsquo;t expect flash to be as good as SAS disk for sequential writes.&nbsp; But still, the performance delta is remarkable.&nbsp;</p>
<h2>Conclusion</h2>
<p>I&rsquo;m yet to see any evidence that putting redo logs on SSD is a good idea, and I keep observing data from my own tests indicating that it is neutral at best and A Very Bad Idea at worst.&nbsp; Is anybody seeing anything similar?&nbsp; Does anybody think there&rsquo;s a valid scenario for flash-based redo?</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-13480845.xml</wfw:commentRss></item><item><title>Comparing Hadoop Oracle loaders</title><category>Hadoop</category><category>Oracle</category><category>Oracle</category><category>SQOOP</category><category>TCD blog post</category><category>toad</category><dc:creator>Guy Harrison</dc:creator><pubDate>Wed, 05 Oct 2011 13:55:03 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/10/6/comparing-hadoop-oracle-loaders.html</link><guid isPermaLink="false">359481:3851163:13086581</guid><description><![CDATA[<p>Oracle put a lot of effort into highlighting the upcoming Oracle Hadoop Loader (OHL) at OOW 2011 &ndash; it was even highlighted in Andy Mendelsohn's keynote.&nbsp; It&rsquo;s great to see Oracle recognizing Hadoop as a top tier technology!</p>
<p>However, there were a few comments made about the &ldquo;other loaders&rdquo; that I wanted to clarify.&nbsp; At Quest, I lead the team that writes the <a href="http://toadforcloud.com/entry.jspa?categoryID=677&amp;externalID=4298">Quest Data Connector for Oracle and Hadoop</a> (let&rsquo;s call it the &ldquo;Quest Connector&rdquo;) which is a plug-in to the Apache Hadoop SQOOP framework and which provides optimized bidirectional data loads between Oracle and Hadoop.&nbsp; Below I&rsquo;ve outlined some of the high level features of the Quest Connector in the context of the&nbsp; Oracle-Hadoop loaders.&nbsp; Of course, I got my information on the Oracle loader from technical sessions at OOW so I may have misunderstood and/or the facts may change between now and the eventual release of that loader.&nbsp; But I wanted to go on the record with the following:</p>
<ul>
<li>All parties (Quest, Cloudera, Oracle) agree that native SQOOP (e.g., without the Quest plug-in) will be sub-optimal: it will not exploit Oracle direct path reads or writes, will not use partitioning, nologging, etc.&nbsp;&nbsp; Both Cloudera and Quest recommend that if you are doing transfers between Oracle and Hadoop you use SQOOP <strong>with</strong> the Quest connector. </li>
<li>The Quest connector is a free, open source plug in to SQOOP, which is itself a free, open source software product.&nbsp; Both are licensed under the Apache 2.0 open source license.&nbsp; Licensing for the Oracle Loader has not been announced, but Oracle has said it will be a commercial product and therefore presumably not free under all circumstances.&nbsp;&nbsp; It&rsquo;s definitely not open source.</li>
<li>The Quest loader is available now (version 1.4); the Oracle loader is in beta and will be released commercially in 2012.</li>
<li>The Oracle loader moves data from Hadoop to Oracle only.&nbsp; The Quest loader can also move data from Oracle to Hadoop.&nbsp;&nbsp; We import data into Hadoop from an Oracle database usually 5+ times faster than SQOOP alone. </li>
<li>Both Quest and the Oracle loader use direct path writes when loading from Hadoop to Oracle.&nbsp; Oracle do say they use OCI calls which may be faster than the direct path SQL calls used by Quest in some circumstances.&nbsp;&nbsp; But I&rsquo;d suggest that the main optimization in each case is direct path. </li>
<li>Both Quest and the Oracle loader can do parallel direct path writes to a partitioned Oracle table.&nbsp; In the case of the Quest loader, we create partitions based on the job and mapper ids.&nbsp; Oracle can use logical keys and write into existing partitioned tables.&nbsp; My understanding is that they will shuffle and sort the data in the mappers to direct the output to the appropriate partition in bulk.&nbsp; They also do statistical sampling which may improve the load balancing when you are inserting into an existing table.&nbsp; </li>
<li>The Quest loader can update existing tables, and can do Merge operations that insert or update rows depending on the existence of a matching key value.&nbsp; My understanding is that the Oracle loader will do inserts only - at least initially.</li>
<li>Both the Quest connector and the Oracle loader have some form of GUI.&nbsp; The Oracle GUI I believe is in the commercial ODI product.&nbsp; The Quest GUI is in the free Toad for Cloud Databases Eclipse plug-in.&nbsp; I&rsquo;ve put a screenshot of that at the end of the post. </li>
<li>The Quest connector uses the SQOOP framework which is an Apache Hadoop sub-project maintained by multiple companies, most notably Cloudera.&nbsp; This means that the Hadoop side of the product was written by people with a lot of experience in Hadoop.&nbsp;&nbsp; Cloudera and Quest jointly support SQOOP when used with the Quest connector so you get the benefit of having very experienced Hadoop people involved as well as Quest people who know Oracle very well.&nbsp;&nbsp; Obviously Oracle knows Oracle better than anyone, but people like me have been working with Oracle for decades and have credibility I think when it comes to Oracle performance optimization. </li>
</ul>
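<p>To make the terms in the list above concrete, here is a rough sketch of what a direct path insert and a Merge look like in plain Oracle SQL.&nbsp; This is not the SQL that either loader actually generates; the table and column names (sales_target, sales_staging, sale_id, amount) are hypothetical and only illustrate the two techniques.</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">-- Direct path insert: the APPEND hint asks Oracle to write<br />-- formatted blocks above the high water mark instead of<br />-- going through the buffer cache.<br />insert /*+ append */ into sales_target<br />select * from sales_staging;<br />commit;<br /><br />-- Merge: update rows whose key already exists, insert the rest.<br />merge into sales_target t<br />using sales_staging s<br />&nbsp; on (t.sale_id = s.sale_id)<br />when matched then<br />&nbsp; update set t.amount = s.amount<br />when not matched then<br />&nbsp; insert (sale_id, amount) values (s.sale_id, s.amount);</span></p>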
<p>Again,&nbsp; I&rsquo;m happy to see Oracle embracing Hadoop;&nbsp; I just wanted to set the record straight with regard to our technology which exists today as a free tool for optimized bi-directional data transfer between Oracle and Hadoop.&nbsp;</p>
<p>You can download the Quest Connector at <a title="http://bit.ly/questHadoopConnector" href="http://bit.ly/questHadoopConnector">http://bit.ly/questHadoopConnector</a>.&nbsp; The documentation is at&nbsp; <a title="http://bit.ly/QuestHadoopDoc" href="http://bit.ly/QuestHadoopDoc">http://bit.ly/QuestHadoopDoc</a>.</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Comparing-Hadoop-Oracle-loaders_E7D7-?fileId=14487918"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="15-09-2011 3-01-01 PM import" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Comparing-Hadoop-Oracle-loaders_E7D7-?fileId=14487919" border="0" alt="15-09-2011 3-01-01 PM import" width="607" height="484" /></a></p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Comparing-Hadoop-Oracle-loaders_E7D7-?fileId=14487920"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="21-09-2011 9-21-41 AM Hadoop solutions" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Comparing-Hadoop-Oracle-loaders_E7D7-?fileId=14487921" border="0" alt="21-09-2011 9-21-41 AM Hadoop solutions" width="643" height="484" /></a></p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-13086581.xml</wfw:commentRss></item><item><title>Configuring Exadata flash as grid disk</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Tue, 27 Sep 2011 06:09:17 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/9/27/configuring-exadata-flash-as-grid-disk.html</link><guid isPermaLink="false">359481:3851163:12995447</guid><description><![CDATA[<p>The default &ndash; or at least a very common &ndash; configuration for Exadata is to configure all the flash as Exadata Smart Flash Cache (ESFC).&nbsp;&nbsp; This is a simple and generally performant configuration, but won&rsquo;t be the best choice for all cases.&nbsp; In particular, if you have a table that is performance critical and small enough to fit in the flash storage you have available, you might be better off configuring some of your flash as grid disk, creating an ASM disk group from that, and putting the table there.</p>
<p>Here&rsquo;s the procedure:</p>
<p>1. Drop the flash cache, create a new flashcache of a smaller size, then create the griddisks from the unallocated space.&nbsp; These CELLCLI commands do that:</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">CellCLI&gt; drop flashcache     <br />Flash cache exa1cel01_FLASHCACHE successfully dropped      <br />CellCLI&gt; create flashcache all size=288g      <br />Flash cache exa1cel01_FLASHCACHE successfully created      <br />CellCLI&gt; create griddisk all flashdisk prefix=ssddisk</span></p>
<p>There&rsquo;s 384G of flash on each storage cell, so the above commands leave 384G &minus; 288G = 96G per cell for SSD grid disk.&nbsp;&nbsp; Run those commands on each cell node, perhaps by using the dcli command (see <a href="http://guyharrison.squarespace.com/blog/2011/9/27/clearing-the-exadata-smart-flash-cache-using-dcli.html">this post</a> for an example).</p>
<p>2. The above procedure will create disks in the format o/cellIpAddress/ssddisk_FD_*_cellnode.&nbsp; Log into an ASM instance, and issue the following command to create a diskgroup from those disks:</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">SQL&gt; create diskgroup DATA_SSD normal redundancy disk 'o/*/ssddisk*'<br />&nbsp; attribute 'compatible.rdbms'='11.2.0.0.0',<br />&nbsp; 'compatible.asm'='11.2.0.0.0',<br />&nbsp; 'cell.smart_scan_capable'='TRUE',<br />&nbsp; 'au_size'='4M';</span></p>
<p>Alternatively you can use the database control for the ASM instance to create the new diskgroup.&nbsp; Your new flash disks should show up as candidate disks.</p>
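<p>Once the disk group exists, putting a table on it might look something like the sketch below.&nbsp; The tablespace name (ssd_ts), its size and the table name (my_critical_table) are hypothetical; the first query can be run in the ASM instance, the remaining statements in the database instance.</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">-- Check that the new disk group is mounted and how much space it has<br />select name, type, total_mb, free_mb<br />&nbsp; from v$asm_diskgroup<br />&nbsp;where name = 'DATA_SSD';<br /><br />-- Hypothetical tablespace on the flash disk group<br />create tablespace ssd_ts datafile '+DATA_SSD' size 10g;<br /><br />-- Relocate the performance critical table (hypothetical name).<br />-- Its indexes become UNUSABLE after the move and need a rebuild.<br />alter table my_critical_table move tablespace ssd_ts;</span></p>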
<p>The relative performance of flash disk versus flash cache in Exadata is similar to what I&rsquo;ve seen using the Database flash cache.&nbsp; Placing an object directly on flash is faster than using the cache, although the cache is very effective.&nbsp; Here are the results for 200,000 primary key lookups across&nbsp; 1,000,000 possible primary keys:</p>
<p><a rel="lightbox" href="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Configuring-Exadata-flash-as-grid-disk_DD47-?fileId=14354325"><img style="background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;" title="image" src="http://guyharrison.squarespace.com/resource/Windows-Live-Writer-Configuring-Exadata-flash-as-grid-disk_DD47-?fileId=14354326" border="0" alt="image" width="644" height="385" /></a></p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-12995447.xml</wfw:commentRss></item><item><title>Clearing the Exadata smart flash cache using dcli</title><category>Exadata</category><category>Oracle</category><category>Oracle</category><category>cellcli</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Tue, 27 Sep 2011 02:01:45 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/9/27/clearing-the-exadata-smart-flash-cache-using-dcli.html</link><guid isPermaLink="false">359481:3851163:12993544</guid><description><![CDATA[<p>I&rsquo;ve been doing some performance benchmarks on our exadata box specifically focusing on the performance of the smart flash cache.&nbsp; I found that even if I switched the CELL_FLASH_CACHE storage setting to NONE,&nbsp; the flash cache will still keep cached blocks in flash and would therefore give me artificially high values for &ldquo;cell flash cache read hits&rdquo; statistic when I set CELL_FLASH_CACHE back to DEFAULT or KEEP.&nbsp; What I needed was a way to flush the Exadata flash cache.</p>
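<p>For reference, switching the setting and reading the statistic looks roughly like this.&nbsp; The table name (my_test_table) is hypothetical, and the statistic is cumulative since instance startup, so it is the change between two samples that matters.</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">-- Stop caching this segment in the smart flash cache<br />alter table my_test_table storage (cell_flash_cache none);<br /><br />-- ... and later turn caching back on (DEFAULT or KEEP)<br />alter table my_test_table storage (cell_flash_cache keep);<br /><br />-- Sample before and after each run and compare the two values<br />select name, value<br />&nbsp; from v$sysstat<br />&nbsp;where name = 'cell flash cache read hits';</span></p>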
<p>Unfortunately there doesn&rsquo;t seem to be a good way to flush the flash cache &ndash; no obvious CELLCLI command.&nbsp;&nbsp; Maybe I&rsquo;ve missed something, but for now I&rsquo;m dropping and recreating the flash cache before each run.</p>
<p>Luckily the dcli command lets me drop and recreate on each cell directly from the database node and even sets up passwordless connections.&nbsp; Here&rsquo;s how to do it.</p>
<p>Firstly, create a script that will drop and recreate the flash cache for a single cell:</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">$ cat flushcache.sh<br />cellcli &lt;&lt;!<br />drop flashcache;<br />create flashcache all;<br />!</span></p>
<p>Now, use dcli to execute that on each cell node (I have three, named exa1cel01, exa1cel02 and exa1cel03):</p>
<p style="padding-left: 30px;"><span style="font-family: 'Courier New';">$ dcli -c exa1cel01,exa1cel02,exa1cel03 --serial -k -l <em>userid </em>-x flushcache.sh</span></p>
<p>The &ldquo;-k&rdquo; option copies the ssh key to the cell nodes which means that after the first execution you&rsquo;ll be able to do this without typing in the password for each cell node.&nbsp;&nbsp; The &ldquo;--serial&rdquo; option makes each command happen one after another rather than all at once &ndash; you probably don&rsquo;t need this&hellip;</p>
<p>Anyone know a better way to flush the Exadata flash cache?</p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-12993544.xml</wfw:commentRss></item><item><title>Best Practices for Optimizing Oracle RDBMS with Solid State Disk</title><category>Oracle</category><category>ssd</category><dc:creator>Guy Harrison</dc:creator><pubDate>Sun, 18 Sep 2011 06:27:55 +0000</pubDate><link>http://guyharrison.squarespace.com/blog/2011/9/18/best-practices-for-optimizing-oracle-rdbms-with-solid-state.html</link><guid isPermaLink="false">359481:3851163:12900861</guid><description><![CDATA[<p>I've been doing research into the best use of flash SSD with the Oracle RDBMS over the past year and Quest has produced a whitepaper summarizing the findings. &nbsp;You can get it <a href="http://www.quest.com/documents/landing.aspx?id=15423">here</a>.&nbsp;</p>
<p>Understanding the performance dynamics of SSD is critical in modern Oracle performance management. &nbsp;I'm organizing&nbsp;my work on SSD Oracle performance into an <a href="http://guyharrison.net/ssd">online reference</a> that I hope to continually update as I learn more.&nbsp;</p>
<p>Finally, I'll be speaking on this topic at Oracle Open World in just a few weeks - here are the &nbsp;session details:</p>
<p style="padding-left: 30px;">Session: 3841<br />Title: Making the Most of Solid-State Disk in Oracle Database 11g<br />Time: Tuesday, 05:00 PM, InterContinental - InterContinental Ballroom A-</p>
<p><span style="font-size: small;">&nbsp;<br /></span></p>]]></description><wfw:commentRss>http://guyharrison.squarespace.com/blog/rss-comments-entry-12900861.xml</wfw:commentRss></item></channel></rss>