Guy Harrison - Yet Another Database Blog

Entries in mysql (7)

Sakila sample schema in MongoDB

Monday, March 23, 2015 at 2:29PM

2018 Update: You can download this and other sample schemas we use in dbKoda from https://medium.com/dbkoda/mongodb-sample-collections-52d6a7745908.

I wanted to do some experimenting with MongoDB, but I wasn’t really happy with any of the sample data I could find on the web. So I decided that I would translate the MySQL “Sakila” schema into MongoDB collections as part of the learning process.

For those who don’t know, Sakila is a MySQL sample schema that was published about 8 years ago. It’s based on a DVD rental system. OK, not the most modern data ever, but DVDs are still a thing, aren’t they?

You can get the MongoDB version of Sakila here. To load it, unpack the archive using tar zxvf sakilia.tgz, then use mongoimport to load the resulting JSON documents. On Windows you should be able to double-click on the file to get to the JSON.

The Sakila database schema is shown below. There are 16 tables representing a fairly easy to understand inventory of films, staff, customers and stores.

When modelling MongoDB schemas, we partially ignore our relational modelling experience – “normalization” is not the desired end state. Instead of basing our decisions on the nature of the data, we base them on the nature of the operations. The biggest decision is which “entities” get embedded within documents, and which get linked. I’m not the best person to articulate these principles – the O’Reilly book “MongoDB Applied Design Patterns” does a pretty good job and this presentation is also useful.

My first shot at mapping the data – which may prove to be flawed as I play with MongoDB queries – collapsed the 16 tables into just 3 collections: FILMS, STORES and CUSTOMERS. ACTORS became a nested document in FILMS, STAFF and INVENTORY were nested into STORES, while RENTALS and PAYMENTS were nested into CUSTOMERS. Whether these nestings turn out to be good design decisions will depend somewhat on the application. Some operations are going to be awkward while others will be expedited.
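
To make that trade-off concrete, here is a minimal sketch in plain Java, with no MongoDB driver involved. The Film class, the two helper methods and the sample titles and actor names are all hypothetical stand-ins for documents in FILMS; the only point is that embedding makes one access path trivial and the reverse path a walk over every film.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: plain Java objects stand in for documents in FILMS.
final class EmbeddingTradeoffSketch {

    // A film with its actors embedded, mirroring ACTORS nested inside FILMS.
    static final class Film {
        final String title;
        final List<String> actors;

        Film(String title, List<String> actors) {
            this.title = title;
            this.actors = actors;
        }
    }

    // Expedited: the cast travels with the film, so no second lookup is needed.
    static List<String> castOf(Film film) {
        return film.actors;
    }

    // Awkward: there is no separate actor document, so this in-memory version
    // has to inspect every film to find the ones featuring a given actor.
    static List<String> filmsFeaturing(List<Film> films, String actor) {
        List<String> titles = new ArrayList<>();
        for (Film film : films) {
            if (film.actors.contains(actor)) {
                titles.add(film.title);
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from Sakila.
        List<Film> films = List.of(
            new Film("FILM A", List.of("ACTOR ONE", "ACTOR TWO")),
            new Film("FILM B", List.of("ACTOR TWO")));
        System.out.println(castOf(films.get(0)));
        System.out.println(filmsFeaturing(films, "ACTOR TWO"));
    }
}
```

In a real database an index on the embedded field could soften the second case, so treat this as an illustration of the shape of the problem rather than a performance claim.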

Here’s a look at the FILMS collection:

Here is STORES:

And here is CUSTOMERS:

Looks like I have to fix some float rounding issues on customers.rentals.payments.amount :)
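
One possible fix for this kind of rounding noise is to round each amount to two decimal places with BigDecimal before building the document. This is a sketch under an assumption: that the amounts pass through a binary float somewhere between MySQL and MongoDB, which is a guess about the cause. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch only: assumes the payment amount arrives as a binary float.
final class PaymentAmountSketch {

    // Round an amount to two decimal places (cents), rounding half up.
    static BigDecimal toCents(float rawAmount) {
        // Float.toString yields the shortest decimal text that identifies the
        // float, so the BigDecimal starts from "2.99" rather than from the
        // long binary expansion that new BigDecimal(double) would produce.
        return new BigDecimal(Float.toString(rawAmount)).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        float raw = 2.99f;
        // Widening to double exposes the binary approximation of 2.99.
        System.out.println((double) raw);
        // The rounded value carries exactly two decimal places.
        System.out.println(toCents(raw));
    }
}
```

If the amounts can be read from JDBC as BigDecimal in the first place, the float never appears and this step becomes unnecessary.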

The code that generates the schema is here. It’s pretty slow, mainly because of the very high number of lookups on rentals and payments. It would be better to bulk collect everything and scan through it, but that would make the code pretty ugly. If this were Oracle I’m pretty sure I could make it run faster, but with MySQL, SQL tuning is much harder.
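
The bulk approach could look something like the following sketch: read the payments in one pass and group them in memory by rental, instead of issuing one lookup per rental. The Payment class, its fields and groupByRental are hypothetical names, and a plain Java list stands in for the rows of a JDBC result set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a List stands in for the rows of one full scan of the payments.
final class BulkGroupingSketch {

    // Hypothetical row type holding just the columns the sketch needs.
    static final class Payment {
        final int rentalId;
        final String amount;

        Payment(int rentalId, String amount) {
            this.rentalId = rentalId;
            this.amount = amount;
        }
    }

    // Group every payment under its rental in a single pass over the rows.
    static Map<Integer, List<Payment>> groupByRental(List<Payment> payments) {
        Map<Integer, List<Payment>> byRental = new HashMap<>();
        for (Payment payment : payments) {
            byRental.computeIfAbsent(payment.rentalId, id -> new ArrayList<>()).add(payment);
        }
        return byRental;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows.
        List<Payment> rows = List.of(
            new Payment(1, "2.99"),
            new Payment(2, "0.99"),
            new Payment(1, "4.99"));
        Map<Integer, List<Payment>> byRental = groupByRental(rows);
        // Each rental's payments are now a map lookup rather than a query.
        System.out.println(byRental.get(1).size());
    }
}
```

The cost is memory: the whole payment set has to fit in the map, which is the kind of compromise that makes the bulk version less tidy than the row-by-row one.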

The code is pretty straightforward. To insert a MongoDB document we get the DBCollection, then create BasicDBObjects, which we insert into the DBCollection. To nest a document we create a BasicDBList and insert BasicDBObjects into it. Then we add the BasicDBList to the parent BasicDBObject. The following snippet illustrates that sequence. It's mostly boilerplate code, with the only human decision being the nesting structure.

DBCollection filmCollection = mongoDb.getCollection(mongoCollection);

while (fileRs.next()) { // For each film
    // Create the film document
    BasicDBObject filmDoc = new BasicDBObject();
    Integer filmId = fileRs.getInt("FILM_ID");
    filmDoc.put("_id", filmId);
    filmDoc.put("Title", fileRs.getString("TITLE"));
    // Other attributes

    // Fetch the film's actors as a list of nested documents
    BasicDBList actorList = getActors(mysqlConn, filmId);
    // Put the actor list into the film document
    filmDoc.put("Actors", actorList);
    filmCollection.insert(filmDoc); // Insert the film
}

Anyway, hopefully this might be of some use to those moving from MySQL to MongoDB. Comments welcome!

Guy Harrison | tagged MongoDB, mysql, nosql in mysql

Using GET DIAGNOSTICS in MySQL 5.6

Friday, July 5, 2013 at 4:15PM

When Steven and I wrote MySQL Stored Procedure Programming, our biggest reservation about the new stored procedure language was the lack of support for proper error handling. The lack of the SIGNAL and RESIGNAL clauses prevented a programmer from raising an error that could be propagated properly through a call stack, and the lack of a general purpose exception handler that could examine error codes at run time led to awkward exception handling code at best, and poorly implemented error handling at worst.

In 5.4 (the development series that was eventually released as 5.5) MySQL implemented the SIGNAL and RESIGNAL clauses (see http://guyharrison.squarespace.com/blog/2009/7/13/signal-and-resignal-in-mysql-54-and-60.html), which corrected half of the problem. Now, finally, MySQL 5.6 implements the ANSI GET DIAGNOSTICS clause and we can write a general catch-all exception handler.

Here’s an example:

The exception handler is on lines 10-27. It catches any SQL exception, then uses the GET DIAGNOSTICS clause to fetch the SQLSTATE, MySQL error code and message into local variables. We then decide what to do for anticipated errors (duplicate or badly formed product codes) and SIGNAL a more meaningful application error. Unexpected errors are RESIGNALed on line 24.

This is a great step forward for MySQL stored procedures: without a means to programmatically examine error codes, proper error handling was difficult or impossible. It marks a real gain in the maturity of the language.

Thanks to Ernst Bonat of www.evisualwww.com for helping me work through the usage of GET DIAGNOSTICS.

Guy Harrison | tagged mysql

SIGNAL and RESIGNAL in MySQL 5.4 and 6.0

Monday, July 13, 2009 at 2:34PM

One of the most glaring omissions in the MySQL stored procedure implementation was the lack of the ANSI standard SIGNAL and RESIGNAL clauses. These allow a stored procedure to conditionally return an error to the calling program.

When Steven and I wrote MySQL Stored Procedure Programming we lamented this omission, and proposed an admittedly clumsy workaround. Our workaround involved creating a procedure that executed dynamic SQL in which the error message was embedded in the name of a non-existent table. When the procedure was executed, the non-existent table name at least allowed the user to see the error. So for instance, here is the my_signal procedure:

When you want to raise an error, you call MY_SIGNAL and pass the text of the error you want to throw:

This causes a 'Table doesn't exist' error, but the quoted table name in the message contains the error text:

Clumsy in the extreme, but this was about the best one could do in MySQL 5.0 and 5.1. At last, relief is in sight: MySQL 5.4 and 6.0 have implemented the SIGNAL and RESIGNAL clauses. SIGNAL and RESIGNAL are available in the latest drop of 6.0 (6.0.11) and - I believe - in 5.4.4. The SIGNAL clause is documented here; the RESIGNAL clause is documented in the same chapter. The examples below are from 6.0.11, but I believe the behavior in 5.4.4 is the same.

With SIGNAL, we can get rid of the clumsy MY_SIGNAL procedure, and simply code the error we want to send directly:

The error message is cleaner, and we don't get the misleading MySQL error code:

You can use RESIGNAL in a similar way to send on an error message that you have already caught. So let's say that in the event of a duplicate key on index error you want to insult the user before returning the MySQL error. Here's how you'd do it:

When the duplicate insert is attempted the EXIT handler from line 8 fires and executes the SELECT and ROLLBACK statements (lines 10-12). Finally the RESIGNAL clause on line 13 causes the original error to be raised:

SIGNAL and RESIGNAL make a huge difference to the usability and reliability of stored procedures. In fact, I believe that MySQL should never have released a stored procedure implementation without them. Oh well, better late than never.

However, stored procedures are still missing a mechanism for catching the SQLCODE or SQLSTATE within the stored procedure code. This means that in a general purpose error handler, you can't know what the error was. Instead, you have to code an error handler for every possible condition. As Roland Bouman has shown, this has a measurable performance impact and also leads to verbose and messy code. The correct solution is to implement something along the lines of the ANSI GET DIAGNOSTICS command, but this is a low (!) priority enhancement that will probably not be seen before MySQL 7.0. That's a drag.

Guy Harrison | tagged mysql

Using EXPLAIN EXTENDED to see view query rewrites

Thursday, January 18, 2007 at 9:22AM

At the MySQL Mini Conference in Sydney this week we discussed how to use EXPLAIN EXTENDED to view the rewrites undertaken by the MySQL optimizer. In particular, we can use it to see whether MySQL merges the query into the view definition or creates a temporary table.

It can be tricky to optimize queries using views, since it's often hard to know exactly how the query will be resolved: will MySQL merge the text of the query and the view, or will it use a temporary table containing the view's result set and then apply the query clauses to that?

In general, MySQL merges query text except when the view definition includes a GROUP BY or UNION. But to be sure, we can use EXPLAIN EXTENDED. This also helps when the EXPLAIN output itself is confusing.

For instance if we have a view definition like this:

CREATE VIEW user_table_v AS
   SELECT *
   FROM information_schema.tables ist
      WHERE table_type='BASE TABLE';

and try to explain a query like this:

explain select * from user_table_v WHERE table_schema='mysql'\G

We get output like this, which might be difficult to interpret unless we know the view definition:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ist
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using where

Note the table "ist": only by looking at the view definition can we interpret this. But if we do an EXPLAIN EXTENDED followed by a SHOW WARNINGS we see the exact text:

*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `ist`.`TABLE_NAME` AS `TABLE_NAME` from `information_schema`.`tables` `ist` where ((`ist`.`TABLE_SCHEMA` = _utf8'mysql') and (`ist`.`TABLE_TYPE` = _utf8'BASE TABLE'))
1 row in set (0.00 sec)

And from this we can see that MySQL did indeed merge the WHERE clauses of both the query and the view definition.

If we look at the output for a view like this:

CREATE VIEW table_types_v AS
   SELECT table_type,count(*)
      FROM information_schema.tables ist
      GROUP BY table_type;

Then we see the following output, in which we can see that MySQL created a temporary table and then applied the WHERE clause from the query:

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Impossible WHERE noticed after reading const tables
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: ist
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using temporary; Using filesort
2 rows in set, 1 warning (0.00 sec)

*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `table_types_v`.`table_type` AS `table_type`,`table_types_v`.`count(*)` AS `count(*)` from `mysql`.`table_types_v` where (`table_types_v`.`table_type` = _utf8'BASE TABLE')
1 row in set (0.00 sec)

EXPLAIN EXTENDED is an invaluable tool for tuning SQL statements, and even more so when working with views.

Guy Harrison | tagged mysql in mysql

MySQL stored procedures with Ruby

Monday, April 24, 2006 at 9:10PM

Ruby's getting an incredible amount of attention recently, largely as the result of Ruby on Rails. I've played a little with Ruby on Rails and it certainly is the easiest way I've seen so far to develop web interfaces to a back-end database.

At the same time, I've been shifting from Perl to Java as my language of choice for any serious database utility development. But I still feel the need for something dynamic and hyper-productive when I'm writing something one-off or for my own use. I've been playing with Python, but if Ruby has the upper hand as a web platform then maybe I should try Ruby.

So seeing as how I've just finished the MySQL stored procedure book, first thing is to see if I can use Ruby for MySQL stored procedures.

Database - and MySQL - support for Ruby is kind of all over the place. There's a DBI option (similar to Perl's) which provides a consistent interface, and there are also native drivers. For MySQL there are pure-Ruby native drivers and drivers written in C. Since the DBI is based on the native driver, I thought I'd try the native driver first. The pure-Ruby driver gave me some problems so I started with the C driver on Linux (RHAS4).

Retrieving multiple result sets

The main trick with stored procedures is that they might return multiple result sets. OUT or INOUT parameters can be an issue too, but you can always work around that using session variables.

If you try to call a stored procedure that returns a result set, you'll at first get a "procedure foo() can't return a result set in the given context" error. This is because the CLIENT_MULTI_RESULTS flag is not set by default when the connection is created. Luckily we can set that in our own code:

dbh=Mysql.init
dbh.real_connect("127.0.0.1", "root", "secret", "prod",3306,nil,Mysql::CLIENT_MULTI_RESULTS)

The "query" method returns a result set as soon as it is called, but I found it easier to retrieve each result set manually, so I set the query_with_result attribute to false:

dbh.query_with_result=false

The next_result and more_results methods are implemented in the Ruby MySQL driver, but there are some weird things about the more_results C API call that cause problems in Python and PHP. In Ruby, the more_results call returns true whether or not there is an additional result. The only reliable way I found to determine if there is another result set is to try to grab the results and bail out if an exception fires (the exception doesn't generate an error code, btw):

dbh.query("CALL foo()")
begin
  rs=dbh.use_result
rescue Mysql::Error => e
  no_more_results=true
end
We can then call next_result at the end of each rowset loop. So here's a method that dumps all the result sets from a stored procedure call as XML using this approach (I know the Ruby is probably crap, it's like my 3rd Ruby program):

def procXML(dbh,sql)
  connect(dbh)
  no_more_results=false
  dbh.query(sql)
  printf("<?xml version='1.0'?>\n");
  printf("<proc sql=\"%s\">\n",sql)
  result_no=0
  until no_more_results
    begin
      rs=dbh.use_result
    rescue Mysql::Error => e
      no_more_results=true
    end
    if no_more_results==false
      result_no+=1
      colcount=rs.fetch_fields.size
      rowno=0
      printf("\t<resultset id=%d columns=%s>\n",result_no,colcount)
      rs.each do |row|
        rowno+=1
        printf "\t\t<row no=%d>\n",rowno
        rs.fetch_fields.each_with_index do |col,i|
          printf("\t\t\t<colvalue column=\"%s\">%s</colvalue>\n",col.name,row[i])
        end
        printf("\t\t</row>\n")
      end
      printf("\t</resultset>\n");
      rs.free
      dbh.next_result
    end
  end
  printf("</proc>\n")
end

No C programming required!

Whew! No need to hack into the C code. So you can use MySQL stored procedures in Ruby with the existing native C driver. The problem is that the C driver is not yet available as a binary on Windows, and trying to compile it turns out to be beyond my old brain (and yes, I used MinGW and all the other "right" things). Hopefully a binary copy of the MySQL driver will be available in the one-click Ruby installer eventually.

The above code doesn't work using the pure-Ruby driver on Windows, by the way - there's an "out of sequence" error when trying to execute the stored proc. I might hack around on that later (at the moment I'm at 35,000 ft with 15 minutes of battery left on the way to the MySQL UC). For now, if you want to use MySQL stored procedures in a Ruby program on Windows, I can't help.

Note that Ruby seems to hit a bug that causes MySQL to go away if there are two calls to the same stored proc in the same session and the stored proc is created using server-side prepared statements. Fixed soon hopefully, but for now if you get a "MySQL server has gone away" error you might be hitting the same problem. Wez posted on this problem here.

I suppose the end of this investigation will probably be to see if there's any way to use stored procedure calls to maintain a Rails ActiveRecord object. Not that I think you'd necessarily want to, but it would probably be a good learning exercise.

Guy Harrison |