What Are Subqueries?
As its name implies, SQL, the Structured Query Language, is a systematic approach to access and manipulate data stored in relational databases. The most common and essential set of tools for the purpose are DML statements: SELECT, INSERT, UPDATE, MERGE, and DELETE. Collectively, these statements perform CRUD operations, that is Create, Read, Update, and Delete data.
In SQL, a subquery is a SELECT statement that is nested within another SQL statement. For example, consider a query shown in the figure below. The query calculates average combat power for the entire Pokémon.
Now, suppose that you want to retrieve every Pokémon whose combat power is greater than average. Instead of hard coding with the calculated average, you can employ a subquery as follows:
In the provided SQL code, the inner query (lines 14-18) calculates the average combat power for all Pokémon in the dataset. Then, the outer query retrieves the name and combat power of Pokémon whose individual combat power is greater than the average calculated from the inner query.
You might think we could have simply have a conditional expression like WHERE GREATEST(10, FLOOR(SQRT(hp) * SQRT(defense) * attack / 10)) > 555.926. However, this approach is less dynamic and adaptable. Hard-coding a specific value may not be optimal, as it requires manual adjustment whenever the dataset changes. On the other hand, using a subquery to dynamically compute the average combat power ensures a more flexible and maintenance-friendly solution.
Subqueries are not limited to DML statements; In addition to SELECT, INSERT, UPDATE, MERGE, and DELETE statement, they can also be used in CREATE TABLE or CREATE VIEW statement. For example, let's consider a CREATE TABLE statement as follows:
The query shown above creates a new table my_table by selecting rows from pokemon table based on a condition involving the combat power calculation. The condition filters out Pokémon whose combat power is greater than the average combat power calculated in a subquery.
Depending on the context, subqueries can be categorized as follows:
- Based on the number of rows returned:
- Single-row subqueries: These return a single value (or possibly no value at all) and are commonly used for comparisons in the WHERE clause. The subqueries calculating the average combat power mentioned earlier is a good example of this.
- Multiple-row subqueries: These return more than one rows and are used with operators like IN, ANY, or ALL to apply conditions based on a set of values.
- Based on their relation to the outer query:
- Non-correlated subqueries: These are self-contained and don't reference columns from the outer query.
- Correlated subqueries: These access and utilize values from the outer query within their execution, allowing for more dynamic filtering or calculations.
Subquery Quantifiers
Oracle SQL provides some special keywords to help you express complex filtering and checking conditions within your main query, named subquery quantifiers. They work alongside subqueries and evaluate if specific conditions hold true for some, all, or none of the rows returned by the subquery.
- ANY: This checks if at least one row from the subquery meets a specific condition. Think of it as "Does any row qualify?"
- ALL: This checks if every single row from the subquery meets a specific condition. In other words, "Do all rows qualify?"
- EXISTS: This simply checks if the subquery returns at least one row, regardless of the actual data in those rows. It asks, "Does the subquery return any rows at all?"
- NOT EXISTS: This is the opposite of EXISTS. It checks if the subquery returns no rows at all, asking, "Is the subquery empty?"
To illustrate how subquery quantifiers work, let's consider the following two tables:
In the query below, the subquery calculates the average combat power for each combination of type1 and type2 in the legendary_pokemon table. This query returns more than one rows. Thus, it is a multiple-row subquery. Then, the main query selects all rows from my_pokemon table. In the query, the WHERE clause filters the results based on the subquery. Here, the ANY operator checks if the compat_power of a Pokémon in my_pokemon is at least to any of the averages from the subquery. This means that, for each row in the my_pokemon table, the main query checks if there is at least one row from the subquery that is not greater than the current main query row's combat power.
On the other hand, ALL checks if every row from the subquery satisfies the condition for each row from the main query. For example, let's consider another query shown below. Just as in the example with ANY, the subquery returns multiple rows of average combat power from the legendary_pokemon table. However, this time, the main query checks if the current row of the my_pokemon is greater than or equals to every average combat power from the subquery:
The EXISTS and NOT EXISTS checks for the subquery's existence. If there is any row returned from the subquery, EXISTS evaluated to be true. Otherwise, false. Conversely, if there is no row returned from the subquery, NOT EXISTS evaluated to be true. For example:
In the query above, the EXISTS checks each row from the main query that if there is any matching row retrieved from the subquery. On the other hand, the NOT EXIST negates the results, meaning that it checks each row from the main query that if it is not contained in the subquery results.
While ANY/ALL applies conditions within the subquery itself, EXISTS/NOT EXISTS checks existence of the subquery's result in relation to the main query. In terms of filtering logic, ANY/ALL determines if a row from the main query qualifies for inclusion based on the applied condition (ANY for at least one match, ALL for every match). On the other hand, EXISTS/NOT EXISTS determines if a row from the main query qualifies for inclusion only.
Additional Problems Subqueries Can Solve
In a WHERE clause, a subquery provides some criteria for filtering rows. However, subqueries are not limited to the WHERE clause for DML statements. In fact, for a SELECT statement, you can employ subqueries for any of SELECT, FROM, GROUP BY, HAVING, or ORDER BY. Also, any valid statement can be used as a subquery of another statement. So that even a SELECT query nesting a subquery can be nested into another.
In a SELECT statement, subqueries are powerful tools for tackling many data challenges, often being the only solution. They are ideal for finding data extremes, such as top performers, averages for comparisons, or identifying related records based on certain criteria. Subqueries go beyond using aggregate functions; they can effectively join tables with complex conditions, filter data based on intricate logic, and even manipulate data within the SELECT clause itself.
Here are some common tasks that you can employ subqueries:
- Creating complex multistage queries: Subqueries enable the formulation of intricate multistage queries by finding answers based on specific criteria within a single SELECT statement's WHERE clause. This iterative process allows the subquery to identify secondary answers and pass them to the parent query, achieving multiple stages of retrieval in a single SQL statement.
- Creating populated tables: Incorporating a subquery into a CREATE TABLE statement facilitates the creation and population of a table from an existing data source simultaneously with its creation.
- Manipulating large data sets: Subqueries can be integrated into INSERT or UPDATE statement to efficiently move substantial data volumes, facilitating the insertion or modification of multiple rows of information in a single SQL statement.
- Defining dynamic expressions with single-row subqueries: Single-row subqueries is an ideal solution for defining dynamic expressions in almost any part of a SQL statement.
0 Comments