What's the Difference Between INNER JOIN, the OUTER JOINs, and CROSS JOIN?
Over the course of my career, when interviewing people, I tend to ask questions that locate the edge of a candidate’s knowledge. One of the questions I often ask is about the different types of JOIN, and it’s still relatively rare that someone can completely explain all the join types—even though they haven’t changed for a very long time. So, here’s a refresher on how the different types of JOIN are related.
One thing I want to make clear about this post is that we’re just talking about logical JOINs—i.e. we are trying to clearly define what result you would expect when you issue a query that contains a JOIN. We are not covering how joins are navigated physically—i.e. how the database engine actually fulfils the request and produces the correct result. More on that at the end!
Isn’t the Cartesian Product a Type of Hair Gel?
A long time ago, I was interviewing someone and asked them about what the phrase “cartesian product” meant. The reply was, “This is getting silly now. I’m interviewing for a database developer position and you’re asking me about hair gel?” I wasn’t asking about hair gel, and that particular interviewee did not get the position.
The cartesian product is a mathematical term in which two sets of items are combined such that every item in the first set is combined with every item in the second set. Let’s demonstrate that with a simple example. Consider set L that has the items [1, 2, 3] and set R that has the items [A, B, C]. The cartesian product would be 9 pairs:
(1, A)
(1, B)
(1, C)
(2, A)
(2, B)
(2, C)
(3, A)
(3, B)
(3, C)
You can see that for each item in the set L we are paired up with each of the items from set R. In databases, the cartesian product is exactly the same thing—except for the fact that the sets are tables and contain rows rather than items.
So, let’s consider a set which contains some basic numbers in SQL: