Understanding Subqueries

A subquery is a query that is embedded within another query. Subqueries are sometimes also referred to as subselects because one SELECT statement appears within another. Let's look at how subqueries can be useful.

Filtering by Subquery

You can use the result of a query like you use a list of values with the IN operator to filter a query based on the result of another query. The subquery appears in parentheses after the IN keyword.

The following query fetches all columns from the products table, but only for those product codes that were part of order number 1:

MariaDB> SELECT *
 -> FROM products
 -> WHERE product_code IN (
 -> SELECT product_code
 -> FROM order_lines
 -> WHERE order_id = 1
 -> );
+--------------+---------------+--------+--------+
| product_code | name | weight | price |
+--------------+---------------+--------+--------+
| MINI | Small product | 1.50 | 5.99 |
| MAXI | Large product | 8.00 | 15.99 |
+--------------+---------------+--------+--------+
2 rows in set (0.00 sec)

Formatting Subqueries

In the previous example, the subquery was indented from the main query for clarity. Remember, you can use as many spaces and carriage returns as you like to keep things readable.

Let's break down how MariaDB evaluates this query. First, the subquery is executed to find a list of product codes. If you run the subquery on its own, you will see the following result:

MariaDB> SELECT product_code
 -> FROM order_lines
 -> WHERE order_id = 1;
+--------------+
| product_code |
+--------------+
| MINI |
| MAXI |
+--------------+
2 rows in set (0.00 sec)

So using this result, after evaluating the subquery, MariaDB runs a query that is the same as the following:

MariaDB> SELECT *
 -> FROM products
 -> WHERE product_code IN ('MINI", 'MAXI")
+--------------+---------------+--------+-------+
| product_code | name | weight | price |
+------------- +---------------+--------+-------+
| MINI | Small product | 1.50 | 5.99 |
| MAXI | Large product | 8.00 | 15.99 |
+--------------+---------------+--------+-------+
2 rows in set (0.01 sec)

As you can see, using a subquery in this type of situation saves you from having to perform two queries separately and insert the result of the first into the secondeither yourself by hand or using a data structure in a programming language.

Query Performance

The syntax of a subquery makes your SQL statement very easy to read, but often there are more efficient ways of performing the same task. You will learn how to perform similar operations using table joins in the next lesson.

The Correlated Subquery

If a subquery involves a table that also appears in the outer SELECT statement, it is known as a correlated subquery.

MariaDB cannot evaluate this type of subquery by feeding the result of the inner SELECT statement into the outer query because each SELECT statement relies on values from the other.

To reference the same table more than once in the same query, you must use table aliases. These work just the same as the column aliases you have already come across, but they enable you to qualify columns from a table using an abbreviated alias.

To alias a table, give it an alias name immediately after the table name in the FROM clause. The AS keyword is optional for table aliases.

The following example uses a correlated subquery on the order_lines table, comparing the average quantity of each product ordered at a time to every order that included that product in turn. The result is a list of orders in which a higher than average quantity of each product was shipped.

MariaDB> SELECT order_id, product_code, quantity
 -> FROM order_lines ol1
 -> WHERE quantity > (
 -> SELECT AVG(quantity)
 -> FROM order_lines ol2
 -> WHERE ol2.product_code = ol1.product_code
 -> );
+----------+--------------+----------+
| order_id | product_code | quantity |
+--------- +--------------+----------+
| 2 | MAXI | 12 | 
| 3 | MINI | 16 |
| 4 | MINI | 16 |
| 4 | MAXI | 10 |
| 5 | MAXI | 10 |
+----------+--------------+----------+
5 rows in set (0.00 sec)

The query fetches data from the order_lines table twice, aliased to ol1 and ol2. The tablename.column notation is used to qualify a column name in SQL; in this case, however, the table name would be the same, so two different aliases are required.

This example is a correlated subquery because the inner subquery cannot be executed outside this complete SQL statement. It references ol1.product_code, a column from the outer query.

Correlated Subquery Performance

Although it is a powerful SQL construct, the correlated subquery can be slow to execute if the result sets from each SELECT statement are large. If there is another way to find the same information in SQL, it will probably run faster than a correlated subquery.

Subquerying the Same Table

The following example uses a subquery on the same table as the outer query. Its purpose is to find records from the customer_contacts table that share attributes with a known record, but one in which the key value you actually want to search on is not known.

MariaDB> SELECT first_name, last_name
 -> FROM customer_contacts
 -> WHERE customer_code = (
 -> SELECT customer_code
 -> FROM customer_contacts
 -> WHERE first_name ='Benjamin'
 -> AND last_name = 'Franklin'
 -> ); 
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Albert | Einstein |
| Charles | Darwin |
| Marie | Curie |
| Benjamin | Franklin |
+------------+-----------+
4 rows in set (0.00 sec)

Note that this is not a correlated subquery. The subquery can be evaluated independently of the outer query and the result used as the argument to the = operator. The result is all the contacts who are related to the same customer record as the given contact name.

Using the equals operator with a subquery means you must be sure that the subquery returns only one row; otherwise, you must use IN. With the sample database, you know there is only one Benjamin Franklin, so there is no problem. However, be aware that MariaDB generates an error as shown here if the subquery returns more than one row for an equals operator:

MariaDB> SELECT first_name, last_name
 -> FROM customer_contacts
 -> WHERE customer_code = (
 -> SELECT customer_code
 -> FROM customer_contacts
 -> WHERE first_name ='Benjamin'
 -> );
ERROR 1242 (21000): Subquery returns more than 1 row

Using Subqueries as Calculated Fields

You can actually nest a query in the column list section of a SELECT statement. The query executes the subquery and returns its value as a column in the result set for each row in the main query.

The following example retrieves each product from the database along with a subquery column that finds the total quantity ordered for each product in turn:

MariaDB> SELECT p.name,
 -> (SELECT SUM(ol.quantity)
 -> FROM order_lines ol
 -> WHERE ol.product_code = p.product_code) AS subquery
 -> FROM products p;
+----------------+----------+
| name | subquery |
+----------------+----------+
| Small product | 36 |
| Medium product | 10 |
| Large product | 48 |
+----------------+----------+
3 rows in set (0.00 sec)

Table Joins

This subquery example involves a type of table join in which the filter on the query involves a column value from another table. Joins are covered in more detail in the next lesson.

Using Subqueries as Tables

In the same way that the result of a subquery can be used with an operator in the WHERE clause, you can use a subquery as a table in the FROM clause.

This feature will become more useful when you have learned how to perform advanced table joins in the next few chapters. For now, you'll see how it works with a trivial example.

The following query uses a subquery in place of a table in the FROM clause. The subquery returns a subset of the data from the customer_contacts table.

Performing a SELECT * on that virtual table can return only the columns included in the subquery. The WHERE clause in the main query takes effect after the data has already been filtered in the subquery.

MariaDB> SELECT *
 -> FROM (
 -> SELECT first_name, last_name
 -> FROM customer_contacts
 -> WHERE customer_code = 'SCICORP'
 -> ) scicorp_customers
 -> WHERE first_name = 'Benjamin';
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Benjamin | Franklin |
+------------+-----------+
1 row in set (0.00 sec)

When using a subquery as a table, you must give the subquery a table alias or MariaDB will return an error. In the previous example, the virtual table was aliased to scicorp_customers.

Previous Page Next Page