How to write outer join more than 3 tables

eye-catch Other techs

I recently started developing an Android application and I needed to create more complicated SQL. The database for Android app is SQLite and I use Room library for the database access but I used SQL server to learn complicated SQL quickly.

You can clone my repository or just copy the docker-compose.yml file in order to run SQL server in your machine.

https://github.com/yuto-yuto/BlogPost/tree/master/src/Docker/sql-server

This article might be helpful to establish the environment.

Establish SQL Server in Docker container
I tried to establish SQL Server in Docker because I wanted to learn SQL again. Create a Dockerfile and run It's possible...
Sponsored links

What I want to achieve

What I want to do is to show a list like following.

item_namevaluestart_timeend_timetypeunitsortcategory_name
weight502021-07-19 10:00:00.0000000NULL3kg2category-3
pushup302021-07-19 10:01:00.00000002021-07-19 10:05:00.00000002reps3category-2
water3502021-07-19 10:00:00.0000000NULL1ml1category-1
salt52021-07-19 10:10:00.00000002021-07-19 10:15:00.00000004g4category-4
squat502021-07-19 10:20:00.00000002021-07-19 10:30:01.00000002reps5category-2
sugar102021-07-19 10:30:00.00000002021-07-19 10:35:01.00000004g6category-4
beer7002021-07-19 10:40:00.00000002021-07-19 11:00:01.00000001ml7category-1
water2002021-07-19 11:05:00.00000002021-07-19 11:10:01.00000001ml1category-1

The application records everything what I need and show them. I need to create some tables to do this work. Following ER diagram is what I created for the app.

er-diagram-for-outer-join

I separated the table into two main tables, Entry and Item. There are also Unit and Category tables because I want to enable the app to create new unit and category when I need it. That’s why there are 4 tables here. If its feature is not necessary Category and Unit tables are not necessary.

Sponsored links

Table creation and preparation of data

I created the tables and inserted the data with this SQL queries below.

Unit table

CREATE TABLE Unit(
    uid int NOT NULL,
    name nvarchar(50) NOT NULL UNIQUE,
    PRIMARY KEY(uid)
);

INSERT INTO Unit VALUES
(1, 'ml'),
(2, 'g'),
(3, 'kg'),
(4, 'reps');

Category table

CREATE TABLE Category(
    uid int NOT NULL,
    name nvarchar(50) NOT NULL UNIQUE,
    parent int,
    PRIMARY KEY(uid)
);

INSERT INTO Category VALUES
(1, 'category-1', NULL),
(2, 'category-2', NULL),
(3, 'category-3', 1),
(4, 'category-4', 3);

Item table

CREATE TABLE Item(
    uid int NOT NULL,
    name nvarchar(50) NOT NULL UNIQUE,
    type int,
    unit int,
    sort int UNIQUE,
    category_id int,
    PRIMARY KEY(uid),
    FOREIGN KEY (unit) REFERENCES Unit(uid),
    FOREIGN KEY (category_id) REFERENCES Category(uid),
);

INSERT INTO Item VALUES
(1, 'water', 1, 1, 1, 1),
(2, 'pushup', 2, 4, 3, 2),
(3, 'weight', 3, 3, 2, 3),
(4, 'salt', 4, 2, 4, 4),
(5, 'squat', 2, 4, 5, 2),
(6, 'sugar', 4, 2, 6, 4),
(7, 'beer', 1, 1, 7, 1);

Entry table

CREATE TABLE Entry(
    uid int NOT NULL,
    item int,
    start_time datetime2,
    end_time datetime2,
    value int,
    PRIMARY KEY(uid),
    FOREIGN KEY (item) REFERENCES Item(uid),
);

INSERT INTO Entry VALUES
(1, 3, '2021-07-19 10:00:00.000', NULL, 50),
(2, 2, '2021-07-19 10:01:00.000', '2021-07-19 10:05:00.000', 30),
(3, 1, '2021-07-19 10:00:00.000', NULL, 350),
(4, 4, '2021-07-19 10:10:00.000', '2021-07-19 10:15:00.000', 5),
(5, 5, '2021-07-19 10:20:00.000', '2021-07-19 10:30:01.000', 50),
(6, 6, '2021-07-19 10:30:00.000', '2021-07-19 10:35:01.000', 10),
(7, 7, '2021-07-19 10:40:00.000', '2021-07-19 11:00:01.000', 700),
(8, 1, '2021-07-19 11:05:00.000', '2021-07-19 11:10:01.000', 200);

The necessary data are prepared correctly. Next step is to extract the data from the tables.

How to merge two tables by outer join

Let’s take a look how to write the query to merge two tables. Entry table is the main table and additional information needs to be added there. There are two ways to do this.

  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN

The difference is which table is treated as main/base table. LEFT OUTER JOIN treats the first table as main/base table. Following is an example for it.

SELECT * 
FROM Entry LEFT OUTER JOIN ON Item Entry.item = Item.uid

In this case, Entry table is treated as main/base table and Item info is added to the Entry table. We need to set on which condition the data are merged because there are many rows in Item table. We need to pick one of data to merge it to one of rows in the main table.
If the table name is a bit long we can specify alias by AS keyword.

SELECT * 
FROM Entry AS e LEFT OUTER JOIN ON Item AS i e.item = i.uid

Following is the result of the left outer join.

It looks good. Item name, unit and other items are added to the Entry table.

Multiple LEFT OUTER JOIN on multiple tables

Let’s check how to do left outer join on multiple tables before merging 4 tables in one query. The target tables are following.

The query is simple. I simply wrote two LEFT OUTER JOIN.

SELECT *
FROM Item
LEFT OUTER JOIN Unit ON Item.unit = Unit.uid
LEFT OUTER JOIN Category ON Item.category_id = Category.uid

The point is to write the merge condition right after the LEFT OUTER JOIN keyword. Otherwise, it shows an error. In this example, Unit table info is added to the Item table’s rows that has the same unit id. Category table is merged in the same way. Following is the result.

It works as expected but there are unnecessary or unreadable columns. Let’s make it clean.

SELECT Item.name, Item.type, Item.sort, Unit.name as unit, Category.name as category
FROM Item
LEFT OUTER JOIN Unit ON Item.unit = Unit.uid
LEFT OUTER JOIN Category ON Item.category_id = Category.uid
ORDER BY Item.sort

LEFT OUTER JOIN with subquery

I explained how to use LEFT OUTER JOIN to merge two tables and three tables in the examples above. However, the goal is to merge them all in one query. How can we merge them all? LEFT OUTER JOIN requires only one table. If we need to merge multiple tables on different level we need to consider the order of the merge. The order is either top to down (blue square first) or down to top (red square first).

Which is better? Blue square first, red square first? We need subquery to create a merged table first that is used as a base table in the subsequent step.

Firstly, I wrote a query to create a merged table surrounded by a blue square first. And then, merge other two tables. Following is the query for it but it looks long and it has two subqueries.

SELECT tmp2.item_name, tmp2.unit_name, tmp2.value, tmp2.start_time, tmp2.end_time, tmp2.type, tmp2.unit, tmp2.sort, Category.name
FROM Category
RIGHT OUTER JOIN
(
    SELECT tmp.*, Unit.name as unit_name
    FROM Unit
    RIGHT OUTER JOIN 
    (
        SELECT Entry.*, Item.type, Item.name as item_name, Item.unit, Item.sort, Item.category_id 
        FROM Entry
        LEFT OUTER JOIN Item ON Entry.item = Item.uid
    ) as tmp
    ON Unit.uid = tmp.unit
) as tmp2
ON Category.uid = tmp2.category_id
ORDER BY tmp2.sort

It includes both LEFT OUTER JOIN and RIGHT OUTER JOIN in the query. When both of them are included in one query it is not readable because it is not consistent. The result looks following.

How does it look like when we create red squared table first?

SELECT * 
FROM Entry 
LEFT OUTER JOIN (
    SELECT Item.uid, Item.name as item_name, Unit.name as unit, Category.name as category
    FROM Item
    LEFT OUTER JOIN Unit ON Item.unit = Unit.uid
    LEFT OUTER JOIN Category ON Item.category_id = Category.uid
) as Item2
ON Entry.item = Item2.uid

It has only one subquery and it includes only LEFT OUTER JOIN. The query is simpler than the previous one. It is better to merge deep level tables first to make the code simple. The result is of course the same.

End

Database normalization is one of important techniques to make the data maintainable. However, it increases SQL complexity and execution time of the SQL due to merging tables. It is hard to change database scheme, so I recommend designing table well at start. We can do performance tuning afterwards if necessary. We can also merge those info in application side. If the SQL you require gets big you can consider splitting the SQL into multiple SQLs and merge them again in application side. It depends on how many entries there are in the tables.

Comments

Copied title and URL