Using Toad with Hive in Amazon Elastic MapReduce
The Toad for Cloud Databases Eclipse client supports Hive queries, which makes it really easy for me to run queries against our test Hadoop clusters. It also supports Hive running on top of Amazon Elastic MapReduce (EMR), but you do need to be aware that in EMR the default ports are different from what we have come to expect.
Firstly, if you have started an EMR cluster with Hive 0.5 support, then the Hive server will be running on port 10001, not port 10000. The second difference is that the JobTracker is running on port 9100, rather than 50030. So when attaching to EMR, you would set the Hive port to 10001 and the JobTracker port to 9100 in your Hive connection.
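As a rough illustration of those port differences, here is a minimal Python sketch that builds the two endpoints for a given master node and checks whether a port is reachable. The host name `emr-master.example.com`, the function names, and the `jdbc:hive://` URL form (the style used by the original Hive server's JDBC driver) are assumptions for illustration only. Toad itself just needs the host and the two port numbers.

```python
import socket

# Port numbers as described in the post: usual defaults versus EMR with Hive 0.5.
PORTS = {
    "default": {"hive": 10000, "jobtracker": 50030},
    "emr": {"hive": 10001, "jobtracker": 9100},
}


def connection_settings(host, flavor="emr", database="default"):
    """Build the Hive and JobTracker endpoints for a master node.

    The jdbc:hive:// URL shape is an assumption, shown only to make
    the port usage concrete.
    """
    ports = PORTS[flavor]
    return {
        "hive_jdbc_url": f"jdbc:hive://{host}:{ports['hive']}/{database}",
        "jobtracker_url": f"http://{host}:{ports['jobtracker']}/",
    }


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical host name; substitute the public DNS name of your EMR master node.
    host = "emr-master.example.com"
    for name, url in connection_settings(host).items():
        print(name, url)
    for service, port in PORTS["emr"].items():
        print(service, port, "reachable" if port_is_open(host, port) else "not reachable")
```

If a port shows as not reachable, one possible cause is that the cluster's firewall rules do not allow inbound connections on that port from your machine.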
Once you’ve done that, the Hive connection will show all the Hive tables and you can enter HQL queries in the SQL editor. You can drag table and column names into the editor as well.
One of the simple but really useful things about the Hive client is that you can jump to the JobTracker web page while the HQL is running to see how it is going.
In the resulting JobTracker console, we can see the job running and, if we scroll to the right or maximize the window, we can see how the map and reduce phases of the Hive job are progressing.
Reader Comments (3)
Nice post,
I am trying to use the Toad for Cloud Eclipse plugin in my own environment,
but I am facing a problem: once I get results for my query, I can't do anything with them.
I couldn't figure out how to copy and paste them into Excel or export them to Excel, or even to an ordinary text file.
Maybe you know if I can?
Hi, Sorry for not noticing this comment immediately.
We are putting this feature into the product at the moment. You might have been in contact with the development team? We should have a patch for you in a few days.
Regards,
Guy
This really is very valuable information on Amazon Elastic MapReduce (EMR), Hive, and Toad, and it pays to keep an eye on your blog!