How to use SQL series – How to use mathematical expressions and aggregate functions in SQL

Introduction

Structured Query Language (SQL) is used to store, manage, and organize information in a relational database management system (RDBMS). SQL can also calculate and manipulate data through Expressions. Expressions combine various SQL operators, functions, and values to calculate a value. Mathematical expressions are often used to add, subtract, divide, and multiply numerical values. Additionally, Aggregation functions are used to calculate and group values to produce summaries, such as the average or sum of the values in a given column. Mathematical and aggregate expressions can provide valuable insights through data analysis that can inform future decisions.

In this tutorial, you will practice using mathematical expressions. First, you will use numeric operations on the calculator, then use these operators on sample data to perform queries using aggregate functions, and finally use business scenarios to query the sample data for more complex information and analysis.

Preliminary preparation

In order to follow this guide, you will need a computer running some kind of relational database management system (RDBMS) using SQL.

Note: Please note that many RDBMS use their own unique implementation of SQL. While the commands outlined in this tutorial will work with most RDBMS, the exact syntax or output may differ if you test them on a system other than MySQL.

You will also need a database and tables loaded with some sample data where you can practice using the relevant commands.

Connect to MySQL and set up a sample database

If the SQL database system is running on a remote server, SSH to the server from your local device:

ssh sammy@your_server_ip

Then open the MySQL server prompt and replace ==sammy== with the name of your MySQL user account:

mysql -u sammy -p

Create a database named mathDB:

CREATE DATABASE mathDB;

If the database is created successfully, you will receive output like this:

OutputQuery OK, 1 row affected (0.01 sec)

To select the mathDB database, run the following USE statement:

USE mathDB;
OutputDatabase changed

After selecting a database, use the CREATE TABLE command to create a table in it. In the example of this tutorial, we will create a table named product_information to store inventory and sales information for a small tea shop. This table contains the following 8 columns:

  • product_id: represents the value of int data type, which will be used as the primary key of the table. This means that each value in this column will serve as a unique identifier for its respective row.
  • product_name: Use the varchar data type to specify the name of the product, no more than 30 characters.
  • product_type: Stores the type of product, as shown in the varchar data type of no more than 30 characters.
  • total_inventory: Indicates the inventory quantity of each product, using the int data type, with a maximum value of 200.
  • product_cost: Use the decimal data type to display the price of each product purchased at cost, with up to 3 values on the left and up to 2 values after the decimal point.
  • product_retail: Stores the price of each product in retail form, displayed by the decimal data type, with up to 3 values on the left and up to 2 values after the decimal point.
  • store_units: Uses a value of int data type to show how many units of a specific product are available for in-store sales inventory.
  • online_units: Indicates how many units of a specific product are available for online sales inventory using the int data type value

Run the following command to create this sample table:

CREATE TABLE product_information (
product_id int,
product_name varchar(30),
product_type varchar(30),
total_inventory int(200),
product_cost decimal(3, 2),
product_retail decimal(3, 2),
store_units int(100),
online_units int(100),
PRIMARY KEY (product_id)
);
OutputQuery OK, 0 rows affected, 0 warnings (0.01 sec)

Now insert some sample data into an empty table:

INSERT INTO product_information
(product_id, product_name, product_type, total_inventory, product_cost, product_retail, store_units, online_units)
VALUES
(1, 'chamomile', 'tea', 200, 5.12, 7.50, 38, 52),
(2, 'chai', 'tea', 100, 7.40, 9.00, 17, 27),
(3, 'lavender', 'tea', 200, 5.12, 7.50, 50, 112),
(4, 'english_breakfast', 'tea', 150, 5.12, 7.50, 22, 74),
(5, 'jasmine', 'tea', 150, 6.17, 7.50, 33, 92),
(6, 'matcha', 'tea', 100, 6.17, 7.50, 12, 41),
(7, 'oolong', 'tea', 75, 7.40, 9.00, 10, 29),
(8, 'tea sampler', 'tea', 50, 6.00, 8.50, 18, 25),
(9, 'ceramic teapot', 'tea item', 30, 7.00, 9.75, 8, 15),
(10, 'golden teaspoon', 'tea item', 100, 2.00, 5.00, 18, 67);
OutputQuery OK, 10 rows affected (0.01 sec)
Records: 10 Duplicates: 0 Warnings: 0

Once you’ve inserted your data, you can start using mathematical expressions.

Calculate using mathematical expressions

In SQL, you typically use SELECT to query the database and get the desired result set. However, you can also use the SELECT keyword to perform various mathematical operations.

Remember that in real life, SQL is mostly used for querying and calculating values in real databases. But in this section, you will only use SELECT with numeric values to become familiar with the syntax of mathematical expressions and operators.

Before we begin, let’s give an overview of the operators used to perform six arithmetic operations in SQL. Please note that this list is not comprehensive and many RDBMS have a unique set of mathematical operators:

  • Addition uses + notation
  • Subtraction uses the - symbol
  • Multiplication uses the * notation
  • Division uses / symbols
  • Modulo operations use the % symbol
  • To find exponentiation, use POW(x,y)

You can practice running combinations of different types of calculations with your own values. We’ll demonstrate using the following example, starting with an addition equation:

SELECT 893 + 579;
Output + ----------- +
| 893 + 579 |
 + ----------- +
| 1472 |
 + ----------- +
1 row in set (0.00 sec)

Note that because you do not need to retrieve any data from the database, but only calculate raw numbers, you do not need to include a FROM clause in this or other example queries in this section.

Now perform the calculation using the subtraction operator. Alternatively, you can calculate decimals like this:

SELECT 437.82 - 66.34;
Output +----------------+
| 437.82 - 66.34 |
 +----------------+
| 371.48 |
 +----------------+
1 row in set (0.00 sec)

In SQL, you can include multiple values and operators in a single calculation. The following example uses 3 multiplication operators to calculate the product of 4 numbers:

SELECT 60 * 1234 * 2 * 117;
Output + -------------------------- +
| 60 * 1234 * 2 * 117 |
 +---------------------+
| 17325360 |
 +---------------------+
1 row in set (0.00 sec)

Next, calculate a division problem combining decimals and integers as follows:

SELECT 2604.56 / 41;
Output + -------------- +
| 2604.56 / 41 |
 + -------------- +
| 63.525854 |
 + -------------- +
1 row in set (0.00 sec)

Another division operator is %, which is the modulo operator. It calculates the remainder after dividing the dividend by the divisor:

SELECT 38 % 5;
Output + -------- +
| 38% 5 |
 +--------+
| 3 |
 +--------+
1 row in set (0.00 sec)

Another useful operator is POW(x,y), which raises the specified base value (x) to the power of the exponent (y) :

SELECT POW(99,9);
Output + -------------------------- +
| POW(99,9) |
 +---------------------+
| 9.13517247483641e17 |
 +---------------------+
1 row in set (0.01 sec)

Now that you have practiced calculating using each operation individually, you can try combining different mathematical operators to practice more complex mathematical equations.

Understand the order of SQL operations

You may be familiar with the termPEMDAS, which stands for Parentheses, Exponents, Multiplication, Division, Addition, and Subtraction. This term can serve as a guide to the sequence of operations required to solve more complex equations. PEMDAS is the term used by the United States, while other countries may use different abbreviations to represent their operating rules.

When it is necessary to combine different mathematical operations nested within parentheses, SQL reads from left to right and then reads the values from inside to outside. Therefore, make sure the value within the brackets accurately captures the problem you are trying to solve.

Try calculations using parentheses and a few different operators:

SELECT (2 + 4 ) * 8;
Output + ----------- +
| (2 + 4) * 8 |
 + ----------- +
| 48 |
 + ----------- +
1 row in set (0.00 sec)

Keep in mind that the placement of the brackets matters, and if you’re not careful, the entire result may change. For example, the following code uses the same three values and operators, but uses different bracket positions, and the results will be different:

SELECT 2 + (4 * 8);
Output + ------------- +
| 2 + (4 * 8) |
 +-------------+
| 34 |
 +-------------+
1 row in set (0.00 sec)

If you prefer to do calculations without parentheses, you can do that too. Remember, there are still order of operations rules here; so, similar to bracket placement, confirm that this is the equation you want based on the order of operations used to compute it. In the following example, you’ll notice that the division operator has higher precedence than the subtraction operator, and the result is a negative number:

SELECT 100 / 5 - 300;
Output + --------------- +
| 100 / 5 - 300 |
 + --------------- +
| -280.0000 |
 + --------------- +
1 row in set (0.00 sec)

By combining various operators, you successfully perform numerical calculations and complex calculations using mathematical expressions. Next, we’ll use sample data to perform calculations using aggregate functions and generate new information about the data.

Use aggregate functions to analyze data

Suppose you are the owner of a small tea shop and you want to perform calculations on information stored in a database. SQL can query and manipulate data using mathematical expressions to retrieve data from database tables and different columns. This helps generate new information about the data you are interested in analyzing. In this section, you practice using aggregate functions to query and manipulate sample data to find information about a teahouse business.

The main aggregate functions in SQL include SUM, MAX, MIN, AVG and COUNT. The SUM function adds all the values in a column. For example, use SUM to sum the amounts in the total_inventory column in the sample data set:

SELECT SUM(total_inventory) FROM product_information;
Output + ----------------------- +
| SUM(total_inventory) |
 + --------------------------- +
| 1155 |
 + --------------------------- +
1 row in set (0.00 sec)

The MAX function finds the maximum value in a selected column. For this example, use MAX to query the maximum cost of the product listed in the product_cost column, and use the AS statement to re-mark the title, To make it read clearer:

SELECT MAX(product_cost) AS cost_max
FROM product_information;
Output + ---------- +
| cost_max |
 + ---------- +
| 7.40 |
 + ---------- +
1 row in set (0.00 sec)

The MIN function is the opposite of the MAX function in that it calculates the minimum value. Use MIN to query the minimum spending amount of product_retail:

SELECT MIN(product_retail) AS retail_min
FROM product_information;
Output + ------------ +
| retail_min |
 + ------------ +
| 5.00 |
 + ------------ +
1 row in set (0.00 sec)

The AVG function calculates the average of all values in a specified column in a table. Also, note that you can run multiple aggregate functions in the same query. Try combining a query to find the average cost of a product sold at retail price and a product purchased at cost price:

SELECT AVG(product_retail) AS retail_average,
AVG(product_cost) AS cost_average
FROM product_information;
Output + ---------------- + ---------------- +
| retail_average | cost_average |
 +----------------+----------------+
| 7.875000 | 5.750000 |
 +----------------+----------------+
1 row in set (0.00 sec)

The COUNT function operates differently from other functions because it calculates values from the table itself by counting the number of rows returned by the query. Use the COUNT function in the WHERE statement to query the number of products with a retail value greater than 8.00:

SELECT COUNT(product_retail)
FROM product_information
WHERE product_retail > 8.00;
Output + ----------------------- +
| COUNT(product_retail) |
 + ----------------------- +
| 4 |
 + ----------------------- +
1 row in set (0.00 sec)

Now query the quantity of products purchased in the store that exceeds 8.00 in product_cost:

SELECT COUNT(product_cost)
FROM product_information
WHERE product_cost > 8.00;
Output + -------------------------- +
| COUNT(product_cost) |
 +---------------------+
| 0 |
 +---------------------+
1 row in set (0.00 sec)

You have successfully used aggregate functions to provide a summary of values such as maximum, minimum, average, and count. You retrieved this information from sample data to simulate real-world scenarios. In the final section, you will apply all the mathematical expressions and aggregate functions you have learned to perform more detailed queries and analysis on the sample data at Xiaocha.

Apply mathematical expressions in business scenarios

This section provides some example scenarios involving performing data analysis to help tea shop owners with decisions related to their business.

As a first scenario, calculate the total quantity currently available in inventory to understand how many products are left available for sale in-store and online. This query will also include DESC statements to organize the data from largest to smallest. Normally RDBMS will default to ascending order, but this example includes the “DESC” option, allowing you to view the data in descending order:

SELECT product_name,
total_inventory - (store_units + online_units)
AS remaining_inventory
FROM product_information
ORDER BY(remaining_inventory) DESC;
Output + ------------------- + --------------------- +
| product_name | remaining_inventory |
 + ------------------- + ---------------------------- +
| chamomile | 110 |
| chai | 56 |
| english_breakfast | 54 |
| matcha | 47 |
| lavender | 38 |
| oolong | 36 |
| jasmine | 25 |
| golden teaspoon | 15 |
| tea sampler | 7 |
| ceramic teapot | 7 |
 + ------------------- + ---------------------------- +
10 rows in set (0.00 sec)

This query is useful because it calculates remaining inventory, which can help tea shop owners plan to purchase more orders when the product is running out.

In the next scenario, you will analyze and compare the amount of revenue from physical store and online sales:

SELECT product_name,
(online_units * product_retail) AS o,
(store_units * product_retail) AS s
FROM product_information;
Output
 + ------------------- + -------- + -------- +
| product_name | o | s |
 + ------------------- + -------- + -------- +
| chamomile | 390.00 | 285.00 |
| chai | 243.00 | 153.00 |
| lavender | 840.00 | 375.00 |
| english_breakfast | 555.00 | 165.00 |
| jasmine | 690.00 | 247.50 |
| matcha | 307.50 | 90.00 |
| oolong | 261.00 | 90.00 |
| tea sampler | 212.50 | 153.00 |
| ceramic teapot | 146.25 | 78.00 |
| golden teaspoon | 335.00 | 90.00 |
 + ------------------- + -------- + -------- +
10 rows in set (0.00 sec)

Next, calculate the total revenue from in-store and online sales using the SUM function and a few mathematical operators:

SELECT SUM(online_units * product_retail) +
SUM(store_units * product_retail)
AS total_sales
FROM product_information;
Output + ------------- +
| total_sales |
 +-------------+
| 5706.75 |
 +-------------+
1 row in set (0.00 sec)

Executing these queries is important for two reasons. The first reason is that tea store owners can evaluate which products are best-selling and prioritize these products when purchasing more products in the future. Secondly, they can analyze the tea store’s overall product sales in-store and online execution.

Next, you’ll find the profit margin for each product. TheProfit Margin of a given product is the revenue a business earns on each unit sold of that product. To find out how much revenue you make, you can multiply sales by profit margin.

To calculate the profit margin for each product, subtract product_cost from product_retail for each row. This value is then divided by the total retail sales of the product to calculate the profit margin:

SELECT product_name,
(product_retail - product_cost) / product_retail
AS profit_margin
FROM product_information;
Output + ------------------ + ------------- +
| product_name | profit_margin |
 + ------------------ + ------------- +
| chamomile | 0.317333 |
| chai | 0.177778 |
| lavender | 0.317333 |
| english_breakfast | 0.317333 |
| jasmine | 0.177333 |
| matcha | 0.177333 |
| oolong | 0.177778 |
| tea sampler | 0.294118 |
| ceramic teapot | 0.282051 |
| golden teaspoon | 0.600000 |
 + ------------------ + ------------- +
10 rows in set (0.00 sec)

According to the output, Golden Tea Spoon has the highest profit margin at 60%, while Indian tea, jasmine tea, matcha, and oolong tea have the lowest profit margin at 18%. For Golden Teaspoon, this means that with a retail value of $5 and a profit margin of 60%, you generated $3 in revenue.

You can also use the aggregate function AVG to calculate the average profit margin of all products in the teahouse. This average serves as a baseline for tea shop owners, who then determine which products fall below this number and strategize how to improve:

SELECT AVG((product_retail - product_cost) / product_retail)
AS avg_profit_margin
FROM product_information;
Output + ------------------- +
| avg_profit_margin |
 + ------------------- +
| 0.2838391151 |
 + ------------------- +
1 row in set (0.00 sec)

From this calculation, it can be concluded that the average profit margin for products in this tea shop is 28%.

With this new information, let’s say the tea shop owner wants to increase the profit margin to 31% next quarter on a product that currently has a profit margin of less than 27%. To do this, you need to subtract your target profit margin from 1 (1 - 0.31) and then divide the cost of each returned product by this value. The result will be the new price at which the product must be sold at retail to achieve a 31% profit margin:

SELECT product_name, product_cost / (1 - 0.31)
AS new_retail
FROM product_information
WHERE (product_retail - product_cost) / product_retail < 0.27;
Output + -------------- + ------------ +
| product_name | new_retail |
 + -------------- + ------------ +
| chai | 10.724638 |
| jasmine | 8.942029 |
| matcha | 8.942029 |
| oolong | 10.724638 |
 + -------------- + ------------ +
4 rows in set (0.00 sec)

These results show that new retail prices are necessary to achieve a 31% profit margin on underperforming products. Such data analysis equips tea shop owners with the ability to make decisive business decisions on how to improve their next quarter’s revenue and understand purpose.

Summary

Use mathematical expressions in SQL to solve arithmetic problems like you would a calculator, or to perform complex analyzes on real-world data that can impact business decisions. If you can identify the major mathematical operators and order of operations rules, the computational possibilities are endless. When you want to analyze the data further, using these operators with aggregate functions allows you to calculate answers to “what if” questions, which may aid in strategic planning. Learn more about how to use SQL to work with databases with our series on How to use SQL.