Sometimes it’s simple. Just playing around with SQL, accompanied with a cup of coffee, why not right?
This time I wanted to play a little with hierarchical SQL queries and of course explain it. Hierarchical queries come in handy whenever you have a table with self referencing columns. Within the HR sample schema Oracle provides (download HR and other sample schemas on GitHub) we have for example the employees table.
In this example the manager_id joins to the employee_id to retrieve the manager of each and every employee.
With hierarchical queries two parts always come back:
- START WITH
Defines the top node of the hierarchical tree. - CONNECT BY PRIOR
Defines the relationship between the parent and the child, so the manager with the employee.
With some example queries I’ll try to explain these type of queries further.
START WITH and CONNECT BY PRIOR
We need to start our tree with a top node. I would like to start it with the chief of all managers, so a manager without a manager. I’ll add a START WITH clause with the condition manager_id is null. Furthermore, we’ll get the parent/manager for every child/employee so we need to connect the employee_id of the parent record with the column manager_id of the child record. We’ll add a CONNECT BY PRIOR clause with the join employee_id = manager_id.
select
employee_id
, first_name
, last_name
, manager_id
from
employees
start with
manager_id is null
connect by prior
employee_id = manager_id;
The query will list the hierarchy from top to bottom starting with our top node being manager Steven King.
Level
To more illustrate the structure of the tree we can use a so called pseude column named LEVEL. This LEVEL column will return the depth of the row within the tree as a number. We can use this number together with the lpad sql function to fill out spaces before the employee full name to give the tree a little bit more body.
select
level
, employee_id
, lpad ( ' ', level, ' ' ) || first_name || ' ' || last_name name
, manager_id
from
employees
start with
manager_id is null
connect by prior
employee_id = manager_id;
As you can see the name column is now idented based on the depth or position within the hierarchy.
SIBLINGS
If you want to sort the childs of a parent based on a column you need to add the SIBLINGS keyword to the ORDER BY statement. For example, I need to sort the childs of a parent by last_name. For this I’ll add the statement ORDER SIBLINGS BY last_name.
select
level
, employee_id
, lpad ( ' ', level, ' ' ) || first_name || ' ' || last_name name
, manager_id
from
employees
start with
manager_id is null
connect by prior
employee_id = manager_id
order siblings by
last_name;
As you can see all childs are sorted on last_name within the parent structure.
In future articles I’ll explain some more features of hierarchical queries like:
- NOCYCLE: To detect and ignore a loop within your hierarchy
- CONNECT_BY_ROOT: Provides the value of a column from the root node row
- SYS_CONNECT_BY_PATH: This can generate for example a directory structure like format. For example: /King/Kochhar/Greenberg/Faviet. So you get a concatenation of column values with a seperator.
- CONNECT_BY_ISLEAF: This statement return a 1 for the lowest level of a hierarchy – a leaf. Every parent will return a 0 as there are childs involved.
Enjoy yourself while playing with SQL to make yourself known with all the advanced possibilities you have at hand.