Learn How To Write SQL Queries With Examples: #6

analytics viz

Data are becoming the new raw material of business.

Craig Mundie (President, Mundie & Associates | Former Senior Advisor to the CEO, Microsoft)
Question Source: LeetCode
Solution Language: MySQL

This Q&A series will cover data questions from LeetCode and present my solutions to them. Please feel free to comment with your suggestions if you feel that these problems may be solved in a more optimized manner.

Question (LeetCode Question #1270, Level: Medium)

Table: Employees

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| employee_id   | int     |
| employee_name | varchar |
| manager_id    | int     |
+---------------+---------+
employee_id is the primary key for this table.
Each row of this table indicates that the employee with ID employee_id and name employee_name reports his work to his/her direct manager with manager_id
The head of the company is the employee with employee_id = 1.

Write an SQL query to find employee_id of all employees that directly or indirectly report their work to the head of the company.

The indirect relation between managers will not exceed 3 managers as the company is small.

Return the result table in any order without duplicates.

The query result format is in the following example:

Employees table:
+-------------+---------------+------------+
| employee_id | employee_name | manager_id |
+-------------+---------------+------------+
| 1           | Boss          | 1          |
| 3           | Alice         | 3          |
| 2           | Bob           | 1          |
| 4           | Daniel        | 2          |
| 7           | Luis          | 4          |
| 8           | Jhon          | 3          |
| 9           | Angela        | 8          |
| 77          | Robert        | 1          |
+-------------+---------------+------------+

Result table:
+-------------+
| employee_id |
+-------------+
| 2           |
| 77          |
| 4           |
| 7           |
+-------------+

The head of the company is the employee with employee_id 1.
The employees with employee_id 2 and 77 report their work directly to the head of the company.
The employee with employee_id 4 report his work indirectly to the head of the company 4 --> 2 --> 1. 
The employee with employee_id 7 report his work indirectly to the head of the company 7 --> 4 --> 2 --> 1.
The employees with employee_id 3, 8 and 9 don't report their work to head of company directly or indirectly.

Solution

This solution uses the Set operation “UNION ALL” instead of “UNION” because we do not anticipate any duplicates in the final result set. Here is my reasoning:

  • The only way that there could be duplicates in the final result set is if an employee reports to herself directly (employee_id = manager_id) as well as to employee_id 1 indirectly.
  • This can happen only if there is a data quality issue
  • To deal with this situation, I have added a condition in WHERE clause of every CTE subqueries to verify that for ny record, manager_id is not equal to employee_id

This approach is good for performance as well because we are filtering out any instances that could have led to duplicates, and then using “UNION ALL”, which gives us a higher-performing query as compared to one using the “UNION” set operation.

-- Direct Reports of Head of the company

WITH dr AS (select employee_id
 FROM Employees
 WHERE manager_id = 1 AND employee_id <>1),
 
 -- Indirect Reports of Head of the company (Levels 1 to 3)
 
 ir1 AS (SELECT e.employee_id 
         FROM Employees e JOIN dr
         ON e.manager_id = dr.employee_id
        AND e.manager_id <> e.employee_id),
         
 ir2 AS (SELECT e.employee_id 
         FROM Employees e JOIN ir1
         ON e.manager_id = ir1.employee_id
        AND e.manager_id <> e.employee_id),
                 
 ir3 AS (SELECT e.employee_id
         FROM Employees e JOIN ir2
         ON e.manager_id = ir2.employee_id
        AND e.manager_id <> e.employee_id)
         
SELECT employee_id FROM dr
UNION ALL
SELECT employee_id FROM ir1
UNION ALL
SELECT employee_id FROM ir2
UNION ALL
SELECT employee_id FROM ir3