Self-Joins

Sometimes you want to join a table to itself. You can do this either by putting the same table name in the FROM clause twice or by using that table after a JOIN keyword.

The following example uses a self-join to find people in the customer_contacts table who belong to the same company as a named person. You achieved the same result in Lesson 10, "Using Subqueries," using a subquerythis is a different approach to the same problem.

MariaDB> SELECT c2.first_name, c2.last_name
 -> FROM customer_contacts c1
 -> JOIN customer_contacts c2
 -> ON c1.customer_code = c2.customer_code
 -> WHERE c1.first_name = 'Benjamin'
 -> AND c1.last_name = 'Franklin';
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Albert | Einstein |
| Charles | Darwin |
| Marie | Curie |
| Benjamin | Franklin |
+------------+-----------+
4 rows in set (0.00 sec)

In this example, the customer_contacts table is used twice, as an alias to c1 and c2. By joining the table to itself in the customer_code field, the result is a data set that includes a pair of records for every person alongside every other person in the company. The filter in the WHERE clause restricts the output of the query to only find those contacts who share a company with Charles Darwin.

Of course, if you already knew the customer code that corresponded to the person you were searching for, the query would need to look at customer_contacts only once and filter using the appropriate condition on customer_code. The self-join is necessary only because the known valuea contact person's namecannot be used to perform the required filter directly.

Aliases with Self-Joins

When you join a table to itself, it is vital that you give each table an alias. The tables will, of course, contain exactly the same columns, so you must use aliases to qualify the columns you want to reference.


Note that the columns selected in the previous example are from the table aliased to c2, not c1. The filter is performed on values in c1 and restricts the records returned to a particular name. Therefore, if you had selected the name values from c1, they would all be the same.

Self-Joins vs. Subqueries

You have now seen two different ways to find the same information: using a self-join and a subquery. The "best" method often depends on the nature of the tables and the data involved.

In the very small sample database, you will not notice any difference in performance between the two methods. In a real-world system, however, if you find that one method is particularly slow, see if another approach to your query speeds things up.


Previous Page Next Page