Join performance in MongoDB 3.2 using $lookup
One of the key tenants of MongoDB schema design is to account for the absence of server-side joins. Data is joined all the time inside of application code of course, but traditionally there’s been no way to perform joins within the server itself.
This changed in 3.2 with the introduction of the $lookup operator within the aggregation framework. $lookup performs the equivalent of a left outer join – eg: it retrieves matching data from another document and returns null data if no match is found.
Here’s an example using the MongoDB version of the Sakila dataset that I converted from MySQL back in this post:
Please go to https://medium.com/dbkoda/coding-efficient-mongodb-joins-97fe0627751a to read the rest of this post (it has been migrated to a medium.com article and updated).
Reader Comments (5)
Very difficult to understand the context here because no one ever "hires a film." In the real-world, what does that mean? In order to understand mongo's attempt to improve relational aggregation in this example, we first need to understand the real-world relationship between a film and hiring... so what is that relationship? Because it's not obvious, because "hiring a film" is not a phrase people ever use.
Well, back in the "old days" there were stored that contained DVD and VCR movies inside little plastic packages. People would "hire" or "rent" these movies for a few days and watch them at home. Sort of like netflix but with physical media. The Sakila schema was created for MySQL based on the idea of a video rental store like this.
So admittedly the sample data is a bit archaic, though the concept might easily be applied to car rentals or any other business that rents things out.
However, the key point here is not the business use case, but rather the peformance of the $lookup operator. Regardless of WHY you use the operator, the performance is going to be relatively poor if the lookup is unindexed.
Nice blog post guy.It has helped me lot,indexing on foreignfield increased the performance by oreder of magnitude
Thanks a lot of this post. I was struggling with optimising my query with lookup. After adding index, it was a breeze, the query completed in less than a second .
Cheers.
Thanks for this post. Saved me a lot of time!