A Beginner's Guide To SQL ORDER BY and GROUP BY.
Table of contents
Learning SQL is like taking a road trip in an unknown desert. Finding a cool oasis with a shade will describe the moments when you are finally able to write queries with simple SELECT statements. The sandstorms will perfectly describe the moments when you hit clauses like GROUP BY and ORDER BY.
I mean, aren’t they the same thing? Don’t they have the same function? Can’t they be used interchangeably? Before you know it, you’re stuck in an unforgiving dry quicksand, at a loss on how to proceed.
Before we begin, I want you to understand that this is one of the fundamentals of SQL and if you can write a simple query, this would be a piece of cake for you too! Now let’s begin…
For starters, I like to call these two clauses “THE SORTERS”. This is because their basic function is just that - to arrange and sort data into sections based on their values. But that is just where the similarity ends for these two because they have different use cases and once you get this, you’re on your way to being a pro!
To explain further, ORDER BY helps you to organize the results of your query alphabetically or numerically and in descending or ascending order. On the other hand, the GROUP BY clause is used with aggregate functions and helps you sort data by grouping it based on specified columns.
Get it? If it’s a bit confusing, don’t worry. I’m going to explain them further one at a time with use cases in the next few paragraphs.
ORDER BY.
Sometimes, in order to answer some questions with data, it is vital to arrange the query results in some sort of order. With the ORDER BY clause, you can specify which column(s) you want to arrange alphabetically or numerically. For instance, if you have a column of names that begin with letters A - G in no particular order, the ORDER BY clause will help to arrange that column in ascending(A - G) or descending(G -A) order.
By default, when you use the ORDER BY clause your data is automatically sorted in ascending order so there is really no need to use the “ascending” word or ASC. However, many still use it as a form of clarity. If you want to do the opposite, the word “descending” or DESC should come after the ORDER BY clause and specified column(s).
Following SQL syntax, ORDER BY comes after FROM and the WHERE clause(if there’s a WHERE clause).
Let’s try a simple query with the Chess table below:
Players | Apples | Oranges | Pears |
Pamela | 1 | 1 | 1 |
Andrew | 1 | 1 | 0 |
Linda | 0 | 0 | 1 |
Trish | 1 | 1 | 1 |
Amby | 1 | 1 | 1 |
Bucknor | 1 | 1 | 1 |
Beatrice | 1 | 0 | 1 |
Peter | 1 | 0 | 1 |
Dunni | 0 | 1 | 1 |
|
Fig 1
To arrange the Players column in ascending alphabetical order, the following code block is all you need;
SELECT *
FROM Chess
ORDER BY Players;
The result will be:
Players | Apples | Oranges | Pears |
Amby | 1 | 1 | 1 |
Andrew | 1 | 1 | 0 |
Beatrice | 1 | 0 | 1 |
Bucknor | 1 | 1 | 1 |
Dunni | 0 | 1 | 1 |
Linda | 0 | 0 | 1 |
Pamela | 1 | 1 | 1 |
Peter | 1 | 0 | 1 |
Trish | 1 | 1 | 1 |
Fig 2
The Players column has been successfully arranged in alphabetical order from A - T as shown above.
A note of warning: Don’t be too sure about the arranged order of a set no matter how accurate it looks. If you want your sets in order, always use an ORDER BY clause to be doubly sure.
In the example above, all the columns were selected in the SELECT statement but that doesn’t always have to be the case. All you need to do is specify the column you want to be arranged after the ORDER BY clause even if it does not appear in the SELECT statement.
The ORDER BY clause can also be used on more than one column. Each column specified in the ORDER BY clause can also be arranged in whatever direction. For this instance, let’s use the new table “Customers” shown below:
customer_id | first_name | last_name | age | country |
1 | John | Doe | 31 | USA |
2 | Robert | Luna | 22 | USA |
3 | David | Robinson | 22 | UK |
4 | John | Reinhardt | 25 | UK |
5 | Betty | Doe | 28 | UAE |
Fig 3
Let’s see what happens when we ORDER BY last_name ASCENDING and age DESCENDING. The SELECT statement should have columns customer_id, first_name, last_name and country.
customer_id | first_name | last_name | country |
1 | John | Doe | USA |
5 | Betty | Doe | UAE |
2 | Robert | Luna | USA |
4 | John | Reinhardt | UK |
3 | David | Robinson | UK |
Fig 4.
The query result above shows all the columns in the SELECT statement and if you look closely, it also shows how the ORDER BY clause affects these columns.
The query result in Fig 4 shows the last name being arranged alphabetically. If you compare the table with the query result, you’ll find out that John Doe is 31 and Betty Doe is 28, confirming that the age column is arranged in descending order as requested.
The first column in the ORDER BY clause will be given the most priority in the sense that it will be the first to be ordered. For clarification, you can order by last_name first and then age using the following query:
SELECT *
FROM Customers
ORDER BY country, age;
The query result will be:-
customer_id | first_name | last_name | age | country |
5 | Betty | Doe | 28 | UAE |
3 | David | Robinson | 22 | UK |
4 | John | Reinhardt | 25 | UK |
2 | Robert | Luna | 22 | USA |
1 | John | Doe | 31 | USA |
Fig 5
You can see that the country column was first attended to and arranged in ascending order before the age column was attended to and though at first glance the age column looks well organized, the last row betrays the fact that it is actually not. This is why you should be cautious about the columns you choose to appear in your ORDER BY clause.
You can also use the column position number to indicate the column to be ordered instead of the column name. In the Customers table above, customer_id is column 1, first_name is column 2 and so on. In a query, it will appear as follows:
SELECT *
FROM Customers
WHERE last_name = “Doe”
ORDER BY 3 DESC;
The result will be:
customer_id | first_name | last_name | age | country |
1 | John | Doe | 31 | USA |
5 | Betty | Doe | 28 | UAE |
Fig 6
Column 3 is the last_name column. The appearance of the WHERE clause indicates that only rows that have “Doe” as the last name are to be compiled.
However, this method might get confusing in the long run so it is best to stick with simple column names already provided in the table.
Now let’s move on to the GROUP BY clause.
GROUP BY.
The GROUP BY clause is always used when there is an aggregate function in a query. It is used to group rows that have shared values. Aggregate functions summarize queries so with a GROUP BY clause, you can use an aggregate function to get single results for each group or criteria.
To illustrate this, I’ll be using the new Shippings table below;
shipping_id | status | customer |
1 | Pending | 2 |
2 | Pending | 4 |
3 | Delivered | 3 |
4 | Pending | 5 |
5 | Delivered | 1 |
Fig 7
Now let’s find out what happens when the query below is inputted;
SELECT COUNT(status), status, customer
FROM Shippings
GROUP BY status;
COUNT(status) | status | customer |
2 | Delivered | 3 |
3 | Pending | 2 |
Fig 8
The result above shows that the status column has been grouped according to type. The COUNT(status) column shows the number of rows that are Delivered or Pending. The customer row shows the first customer for each group. Delivered is number 3 while Pending is number 2.
Using a WHERE clause also changes things. For example,
SELECT COUNT(status), status
FROM Shippings
WHERE status = “Pending”
GROUP BY status;
COUNT(status) | status |
3 | Pending |
Fig 9
Right now, the result shows what happens when the condition is to only show rows with “Pending” rows.
USING ORDER BY AND GROUP BY.
ORDER BY and GROUP BY can be used together in a query. The correct SQL syntax is to place ORDER BY after GROUP BY. Let’s illustrate this with the Shippings table and the query below:-
SELECT *, COUNT(status) AS total_stats,
FROM Shippings
GROUP BY status
ORDER BY shipping_id;
total_stats | shipping_id | status | customer |
3 | 1 | Pending | 2 |
2 | 3 | Delivered | 3 |
Fig 10
The result above shows all the columns in the table plus the new total_stats column which shows the number of rows that are “Pending” or “Delivered”. The shipping_id column is also arranged in ascending numerical order to match the customer column.
CONCLUSION.
This is a basic explanation of what the ORDER BY and GROUP BY clauses do separately and together. To get used to these clauses, remember to practice constantly to further familiarize yourself with them. Now that you have “THE SORTERS” in your arsenal, your SQL journey is definitely on the up and up.