Friday, January 31, 2014

Did you know that EXISTS... exists?

One of the most useful and powerful functions that Microsoft SQL Server provides, is the EXISTS function that checks for the existence of records in a specific query. Unfortunately it is not used as often as it should be, mainly because a lot of SQL developers come from structural and/or object oriented programming, having difficulty to grasp SQL concepts.

The most common bad practice is the use of COUNT for checking the existence of records. For example:

IF (SELECT COUNT(*) FROM Persons WHERE LastName LIKE 'Smith') > 0
   SELECT 1
ELSE
   SELECT 0

The above query checks the existence of persons whose last name is "Smith". In order to do so, it first gets the count of all the persons named "Smith", and then check is the count is greater than zero. Though it's logically correct, it's pretty far from optimal. The equivalent query with EXISTS is:

IF EXISTS(SELECT * FROM Persons WHERE LastName LIKE 'Smith')
   SELECT 1
ELSE
   SELECT 0

This query checks for the first occurrence of a person named "Smith", and if it finds one, it returns TRUE. You can understand that it's potentially a lot faster than the previous query, since it only checks for one record, and doesn't try to count all the records with the specified criteria.

Another great use of EXISTS, is for eliminating records instead of using INNER JOIN. It's quite a common practice to use INNER JOIN to filter out records, while not using any of the joined table's fields in the SELECT list. All those queries can benefit from EXISTS. For example:

SELECT DISTINCT
   p.ID,
   p.LastName,
   p.FirstName
FROM
   Persons p
INNER JOIN Contracts c
   ON c.PersonID = p.ID

The above query fetches all the persons that have at least on contract, and we can see that there is no field from Contracts table in the SELECT list. It also uses DISTINCT, which is quite heavy on the server. The equivalent query with EXISTS is:

SELECT
   p.ID,
   p.LastName,
   p.FirstName 
FROM 
   Persons p
WHERE
   EXISTS(SELECT * FROM Contracts c WHERE c.PersonID = p.ID)

With this query, we don't have to use DISTINCT, and its performance is much much greater than the previous query, since it only checks for one record in the Contracts table instead of finding all of them.

Though these cases may sound simple and unrealistic, they appear a lot in SQL queries, whether they are used frequently or on demand. So, next time you need to write a simple query for checking the existence of a record, or a complex one that comprises many criteria in different tables, be sure to remember to use EXISTS, and I'm sure that you will be pleasantly surprised by the results!

No comments:

Post a Comment