Yet Another Database Blog

Moving to Medium

Guy Harrison — Sun, 04 Feb 2018 22:22:40 +0000

Squarespace has been a great platform, but it seems to me that medium.com has created a network effect for blogging that outweighs the downsides of writing on someone else's platform. So going forward I'll be blogging on Medium at https://medium.com/@guy.harrison.

I've been running this blog on Squarespace for about 7 years, and for a few years before that on TypePad. If you check out http://guyharrison.typepad.com/ you can see how young and beardless I was in those days! :-)

All my older material will remain on Squarespace, though where there is an updated Medium article I will link to it. Mostly my new blog content is around blockchain and MongoDB technologies (at least for now). Oracle and MySQL stuff will remain here. It was next to impossible to migrate Squarespace in bulk to Medium.

Please check out my Medium page if you've enjoyed the content here. My little startup dbKoda also has its own Medium account at https://medium.com/dbkoda. You'll particularly like that if you are interested in MongoDB.

Thanks to all my Squarespace readers and see you on Medium!

A few new MongoDB performance blogs

Guy Harrison — Fri, 15 Dec 2017 02:19:17 +0000

Most of my recent blogging has been done on the dbKoda website where I've been writing mostly about MongoDB performance tuning. If you are interested, here are links to a few of the more interesting posts:

Introducing dbKoda 0.8

Guy Harrison — Tue, 07 Nov 2017 06:08:05 +0000

The dbKoda team just released our third major version of dbKoda - the open source, fat free development and administration tool for MongoDB.

This release contains lots of bug fixes, the ability to view data as a chart or in tabular form, and the ability to convert MongoDB shell commands into Node.js syntax. Check out the blog post outlining all the new features at https://www.dbkoda.com/blog/2017/11/06/New-Features-in-dbKoda0.8

Effective MongoDB indexing (pt 1)

Guy Harrison — Tue, 10 Oct 2017 04:26:09 +0000

An index is an object with its own unique storage that provides a fast access path into a collection. Indexes exist primarily to enhance performance, so understanding and using them effectively is of paramount importance when optimizing MongoDB performance.
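To make the "fast access path" idea concrete, here is a minimal sketch in plain JavaScript. It is not how MongoDB stores indexes: the toy collection, the `buildIndex`, `indexLookup` and `scanLookup` helpers, and the use of a sorted array with binary search in place of a real B-tree are all assumptions made for illustration.

```javascript
// Hypothetical in-memory "collection" of 1024 documents in no useful order.
// 37 is coprime with 1024, so (i * 37) % 1024 visits every number exactly once.
const collection = Array.from({ length: 1024 }, (_, i) => ({
  _id: i,
  name: "user" + String((i * 37) % 1024).padStart(4, "0"),
}));

// A toy index: sorted [key, position] pairs kept apart from the documents.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search over the sorted keys, counting how many entries we touch.
function indexLookup(index, docs, key) {
  let lo = 0;
  let hi = index.length - 1;
  let probes = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    probes++;
    const candidate = index[mid][0];
    if (candidate === key) return { doc: docs[index[mid][1]], probes };
    if (candidate < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, probes };
}

// Without an index, every document may have to be examined in turn.
function scanLookup(docs, field, key) {
  let probes = 0;
  for (const doc of docs) {
    probes++;
    if (doc[field] === key) return { doc, probes };
  }
  return { doc: null, probes };
}

const nameIndex = buildIndex(collection, "name");
const indexed = indexLookup(nameIndex, collection, "user1023");
const scanned = scanLookup(collection, "name", "user1023");
console.log(`index probes: ${indexed.probes}, scan probes: ${scanned.probes}`);
```

Both lookups return the same document, but the indexed lookup touches only a handful of entries because the keys are kept in sorted order.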

B-tree Indexes

The B-tree (“Balanced Tree”) index is MongoDB’s default index structure. Below is a high-level overview of B-tree index structure.

Read the rest of this post at https://medium.com/dbkoda/getting-started-with-mongodb-indexes-part-1-793667b252e8

Sealing MongoDB documents on the blockchain

Guy Harrison — Mon, 25 Sep 2017 04:18:24 +0000

As human beings, we get used to the limitations of the technologies we use and over time forget how fundamental some of these limitations are.

As a database administrator in the early 1990s, I remember the shock I felt when I realized that the contents of the database files were plain text; I’d just assumed they were encrypted and could only be modified by the database engine acting on behalf of a validated user. But I got used to it.

I also got used to the idea that the contents of a database were pretty much what I, the DBA, said they were. Rudimentary audit logs could be put in place to track activity, but as DBA I could easily disable the audit logs and tamper with any database if I so desired.

I think it’s obvious to all of us that this is not the way it should be: the contents of production databases should be trustworthy. We should know that a DBA, hacker or privileged user has not tampered with the contents of the database. However, until recently we lacked the technology to ensure this.

The emergence of a tamper-proof distributed ledger in the form of the Blockchain now promises to give us a mechanism to at least “seal” database records. We can’t necessarily stop a hacker or malicious insider from breaking the seal, but we can at least know if the seal has been broken.

In this post, I’ll show how to implement a simple Blockchain seal for MongoDB. We’ll record a hash value corresponding to a set of documents in a database. As long as the hash value has not changed, we can be confident that the database records have not been tampered with. The hash value is stored on the Blockchain so that we can know with certainty that a particular hash value was in effect at a specific point in time.
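The hashing half of that idea can be sketched in a few lines of Node.js. This is an illustration only, not the implementation from the full post: the choice of SHA-256, the `canonicalize`, `sealDocuments` and `sealIsIntact` helpers, and the sample documents are all assumptions, and the step that records the hash on the Blockchain is left out entirely.

```javascript
const crypto = require("node:crypto");

// Serialize with sorted keys so that field order alone cannot change the hash.
// Assumption: documents contain only plain JSON values (no dates, no binary).
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const fields = Object.keys(value)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalize(value[key]));
    return "{" + fields.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Compute one hash over a set of documents, taken in the order supplied.
function sealDocuments(docs) {
  const hash = crypto.createHash("sha256");
  for (const doc of docs) {
    hash.update(canonicalize(doc) + "\n");
  }
  return hash.digest("hex");
}

// The seal is intact only if the documents still produce the recorded hash.
function sealIsIntact(docs, recordedHash) {
  return sealDocuments(docs) === recordedHash;
}

// Hypothetical documents to seal.
const invoices = [
  { _id: 1, customer: "acme", amount: 100 },
  { _id: 2, customer: "globex", amount: 250 },
];
const seal = sealDocuments(invoices);

// Tampering with any value changes the hash and so breaks the seal.
const tampered = [invoices[0], { ...invoices[1], amount: 2500 }];
console.log(sealIsIntact(invoices, seal), sealIsIntact(tampered, seal));
```

The sketch can only detect tampering relative to a hash you already trust, which is why the recorded value needs to live somewhere the DBA cannot rewrite it.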

Read the rest of this post at https://medium.com/dbkoda/sealing-mongodb-documents-on-the-blockchain-f60b0213c5f4

Announcing dbKoda 0.7

Guy Harrison — Mon, 04 Sep 2017 23:25:12 +0000

0.7.0 is the second public release of dbKoda and our first post-MVP release. With the MVP (Minimum Viable Product) we definitely nailed the "M" criterion, and in this release we've been pushing harder on the "V" side of the equation.

As with 0.6, dbKoda is a free, open source, Vegan product made by groovy people in Melbourne Australia. It's licensed under the AGPL 3.0.

As well as many bug fixes, brand new bugs and performance improvements, we added the following features:

Aggregation Builder

At this year's MongoDB conference, we spotted one of the MongoDB engineers wearing an "Aggregate() is the new Find()" T-shirt. It's funny, because it's true: almost every non-trivial MongoDB data retrieval operation requires an aggregation pipeline. Features such as joins and graph lookups can only be done through the aggregation framework, and under-the-hood features such as the BI connector depend on aggregation as well.

As anybody who has ever written an aggregation pipeline knows, the process is tedious and error prone: matching braces and getting the syntax exactly right is difficult. So in dbKoda 0.7 our aggregation builder allows you to drag and drop pipeline elements and use fill-in-the-blank forms to construct complex pipelines. It's amazing how quickly you can build up a complex pipeline using the builder. A video on our YouTube channel shows me building a non-trivial pipeline in under 60 seconds, so try it out!


Storage Drilldown

MongoDB can tell you how much space is used by databases, collections and indexes, but it is not so good at breaking down space within a document. Because MongoDB's document model supports nested arrays of documents, it’s often the space used within a collection that is the most important thing to identify. For instance, typical reasons for space blowouts in MongoDB are unbounded arrays of nested collections.

dbKoda's storage drilldown breaks down the space used within databases, collections and indexes, and shows you how storage is used within a collection. It does this in an intuitive graphical presentation that allows you to drill in and out of nested documents.
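The kind of per-field breakdown described here can be approximated in a few lines of JavaScript. This is a rough sketch and not dbKoda's implementation: the `fieldSizes` and `collectionBreakdown` helpers and the sample documents are hypothetical, and the length of the JSON text stands in for the real BSON storage size.

```javascript
// Approximate the space each top-level field contributes to a document.
// Assumption: JSON text length in UTF-8 bytes stands in for BSON size.
function fieldSizes(doc) {
  const sizes = {};
  for (const [field, value] of Object.entries(doc)) {
    sizes[field] = Buffer.byteLength(JSON.stringify(value) ?? "", "utf8");
  }
  return sizes;
}

// Sum the per-field sizes across a set of documents, largest consumer first.
function collectionBreakdown(docs) {
  const totals = {};
  for (const doc of docs) {
    for (const [field, bytes] of Object.entries(fieldSizes(doc))) {
      totals[field] = (totals[field] ?? 0) + bytes;
    }
  }
  return Object.entries(totals).sort((a, b) => b[1] - a[1]);
}

// Hypothetical collection in which one document has a large nested array.
const posts = [
  {
    _id: 1,
    title: "Hello",
    comments: Array.from({ length: 200 }, (_, i) => ({ n: i, text: "nice post" })),
  },
  { _id: 2, title: "World", comments: [] },
];

const breakdown = collectionBreakdown(posts);
for (const [field, bytes] of breakdown) {
  console.log(`${field}: ~${bytes} bytes`);
}
```

Calling `fieldSizes` again on an individual nested document would give the next level of the drilldown.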

SSH tunneling connections

We were all horrified at the explosion of ransomware attacks on MongoDB databases early in 2017. The root cause of the security vulnerabilities in these databases was the failure to correctly create authenticated users, but it is also true that you take your life in your hands whenever you expose a database port to the public Internet. For this reason it’s often best practice to leave database ports open only within a walled garden. If you want to perform day-to-day administration using tools such as dbKoda, you can then use SSH tunneling to establish a connection.

This FAQ entry describes SSH tunnelling and how it is used in dbKoda. Put simply, you can now specify an intermediate host which offers you SSH connectivity and use that host to forward database requests to the secured MongoDB server.

Enhanced JSON viewer

Complex JSON documents can be difficult to read. By default, dbKoda displays JSON output as it would appear in the MongoDB shell. We aspire to complete shell compatibility, after all! In 0.7, we offer an enhanced JSON viewer that allows you to examine JSON documents at multiple levels of detail, expanding and collapsing subdocuments and long strings. This facility is available wherever JSON output is displayed in the product, by right-clicking and choosing “enhanced JSON output”.

Export/Import

0.7 allows you to load or unload data to or from your MongoDB server. This facility provides GUI access to the `mongodump`, `mongorestore`, `mongoexport` and `mongoimport` commands.

Enhanced performance on Windows

In 0.6, there were some performance issues when very large amounts of data were displayed in our output panel. We worked hard to resolve these and now believe that performance on Windows will match performance on Linux and Mac.

Summing up

We added a lot of cool functionality in this release and we hope you'll try it out. Download dbKoda from www.dbkoda.com and let us know what you think at our support site.

Announcing dbKoda!

Guy Harrison — Sun, 16 Jul 2017 23:34:06 +0000

I'm very excited to announce the release of dbKoda - a next generation database development and administration tool now available for MongoDB.

Those who've been following me know that I've been working with databases since the early Mesozoic period and I've worked in database tooling for almost two decades.

Working with next generation databases like MongoDB has been a lot of fun, but it did make me realise how much need there is for a strong tooling ecosystem around these new databases. I like to think that I made significant contributions to tooling for relational databases, and I had a strong desire to build something for post-relational systems.

Consequently, late last year I founded the company Southbank Software and this week we launched our first product - dbKoda (www.dbKoda.com).

dbKoda is a modern, open source database development tool. The first release targets MongoDB. It is a 100% JavaScript application which runs on Linux, Mac or Windows. It features a rich editing environment with syntax highlighting, code completion and formatting. It also offers easy graphical access to common MongoDB administration and configuration tasks.

I'm really excited about dbKoda - I hope that it will become the foundation for a product family that will support modern database development across a wide range of environments. And working closely with the small team of brilliant dbKoda developers has been an absolute privilege.

Check out the dbKoda website and download dbKoda here. You can also check out an introductory video on dbKoda. Please also follow dbKoda on https://twitter.com/db_Koda.

Optimizing the order of MongoDB aggregation steps

Guy Harrison — Wed, 30 Nov 2016 06:44:09 +0000

An updated version of this blog post can be found at https://www.dbkoda.com/blog/2017/10/14/Optimizing-the-order-of-aggregation-pipelines

MongoDB does have a query optimizer, and in most cases it's effective at picking the best of multiple possible plans. However, it's worth remembering that in the case of the aggregate function the sequence in which the various steps are executed is completely under your control. The optimizer won't reorder steps into the optimal sequence to get you out of trouble.

Optimizing the order of steps comes down mainly to reducing the amount of data in the pipeline as early as possible, which reduces the amount of work that has to be done by each successive step. The corollary is that steps that perform a lot of work on data should be placed after any filter steps.
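The effect is easy to see in a toy model. The sketch below does not touch MongoDB at all: it runs two hypothetical "pipelines" over an in-memory array and counts how many documents each step has to handle. The `stages` helpers, `runPipeline` and the sample data are all invented for illustration; in a real pipeline the equivalent steps would be `$match` and `$sort`.

```javascript
// Hypothetical orders: one in ten is "open", and every total is distinct.
const orders = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  status: i % 10 === 0 ? "open" : "closed",
  total: (i * 7919) % 1000,
}));

// Toy stand-ins for a filter step and a sort step.
const stages = {
  match: (predicate) => (docs) => docs.filter(predicate),
  sort: (field) => (docs) => [...docs].sort((a, b) => a[field] - b[field]),
};

// Run the steps in order, adding up how many documents each step receives.
function runPipeline(docs, pipeline) {
  let processed = 0;
  let current = docs;
  for (const stage of pipeline) {
    processed += current.length;
    current = stage(current);
  }
  return { result: current, processed };
}

const isOpen = (doc) => doc.status === "open";
const sortThenFilter = runPipeline(orders, [stages.sort("total"), stages.match(isOpen)]);
const filterThenSort = runPipeline(orders, [stages.match(isOpen), stages.sort("total")]);

console.log(sortThenFilter.processed, filterThenSort.processed);
```

Both orderings return the same documents, but filtering first means the expensive sort only ever sees the 100 open orders instead of all 1000.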

Please go to https://medium.com/dbkoda/optimizing-the-order-of-aggregation-pipelines-44c7e3f4d5dd to read the rest of this post

Bulk inserts in MongoDB

Guy Harrison — Mon, 07 Nov 2016 11:31:22 +0000

Like most database systems, MongoDB provides API calls that allow multiple documents to be inserted or retrieved in a single operation.

These “Array” or “Bulk” interfaces improve database performance markedly by dramatically reducing the number of round trips between the client and the database. To realise how fundamental an optimisation this is, imagine that you have a bunch of people to take across a river. You have a boat that can carry 100 people at a time, but for some reason you are taking only one person across on each trip. Not smart, right? Failing to take advantage of array inserts is very similar: you are essentially sending network packets that could carry hundreds of documents with only a single document in each packet.
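The round-trip arithmetic can be sketched without a database. In the example below, `FakeCollection` is a hypothetical stand-in that only counts calls; its `insertOne` and `insertMany` methods are named after the MongoDB methods they imitate, but nothing here talks to a server, and the batch size of 100 is just the boat from the analogy.

```javascript
// Split an array into batches of at most batchSize items.
function chunk(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical stand-in for a collection: each call costs one round trip.
class FakeCollection {
  constructor() {
    this.roundTrips = 0;
    this.stored = 0;
  }
  insertOne(doc) {
    this.roundTrips += 1;
    this.stored += 1;
  }
  insertMany(docs) {
    this.roundTrips += 1;
    this.stored += docs.length;
  }
}

const passengers = Array.from({ length: 1000 }, (_, i) => ({ _id: i }));

// One passenger per trip.
const oneAtATime = new FakeCollection();
for (const doc of passengers) {
  oneAtATime.insertOne(doc);
}

// A full boat of 100 on every trip.
const batched = new FakeCollection();
for (const batch of chunk(passengers, 100)) {
  batched.insertMany(batch);
}

console.log(oneAtATime.roundTrips, batched.roundTrips);
```

The same 1000 documents arrive either way; the batched version simply makes 10 crossings instead of 1000.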

Optimizing bulk reads using .batchSize()

Read the rest of this post at https://medium.com/dbkoda/bulk-operations-in-mongodb-ed49c109d280

Graph Lookup in MongoDB 3.3

Guy Harrison — Wed, 24 Aug 2016 05:22:54 +0000

Graph databases such as Neo4j specialize in traversing graphs of relationships, such as those you might find in a social network. Many non-graph databases have been incorporating graph compute engines to perform similar tasks. In the MongoDB 3.3 release, we now have the ability to perform simple graph traversal using the $graphLookup aggregation framework function. This will become a production feature in the 3.4 release.

The new feature is documented in MongoDB Jira SERVER-23725. The basic syntax is shown here:

{$graphLookup: {
        from: <collection>,
        startWith: <expression>,
        connectFromField: <field in document from “from”>,
        connectToField: <field in document from “from”>,
        as: <field in output document>,
        maxDepth: <optional non-negative integer>,
        depthField: <optional field in output documents>
    }
}
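To show what these parameters mean, here is a simplified, in-memory imitation of the traversal in plain JavaScript. It is a sketch of the semantics only, not MongoDB's implementation: the `graphLookup` function and the sample `people` array are hypothetical, `connectToField` is assumed to hold a single value per document, and `startWith` is passed as literal values rather than as an expression.

```javascript
// Breadth-first traversal that mimics the main $graphLookup parameters.
function graphLookup(from, options) {
  const {
    startWith,
    connectFromField,
    connectToField,
    maxDepth = Infinity,
    depthField = "depth",
  } = options;
  const visited = new Set();
  const output = [];
  let frontier = [].concat(startWith);
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of from) {
      // Skip documents already returned or not reachable at this depth.
      if (visited.has(doc) || !frontier.includes(doc[connectToField])) continue;
      visited.add(doc);
      output.push({ ...doc, [depthField]: depth });
      // Values in connectFromField become the search keys for the next level.
      next.push(...[].concat(doc[connectFromField] ?? []));
    }
    frontier = next;
  }
  return output;
}

// Hypothetical social network: each person lists the names of their friends.
const people = [
  { name: "Ann", friends: ["Bob"] },
  { name: "Bob", friends: ["Cat"] },
  { name: "Cat", friends: ["Ann", "Dan"] },
  { name: "Dan", friends: [] },
];

// Ann's friends and friends of friends, i.e. at most one recursive step.
const network = graphLookup(people, {
  startWith: ["Bob"],
  connectFromField: "friends",
  connectToField: "name",
  maxDepth: 1,
});
console.log(network.map((person) => `${person.name} (depth ${person.depth})`));
```

With `maxDepth: 1` the traversal stops after Bob and Cat; leaving `maxDepth` out lets it continue until no new documents are reachable.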

Please go to https://medium.com/dbkoda/optimising-graph-lookups-in-mongodb-49483afb55c8 to read the rest of this article (updated for MongoDB 3.6)