Fundamental SQL Using FUNCTION and GROUP BY

Cindy Wanady
11 min readNov 3, 2020

--

SCALAR VS AGGREGATE FUNCTION

Functions are methods used to perform data operations on the database. These operations can be numerical calculations such as sum, count, avg, etc; or non-numeric operations such as string concatenations and sub-strings. SQL functions can be divided into 2 categories, namely scalar functions and aggregate functions.

Scalar functions in SQL are used to return a single value from a given input value, while aggregate functions in SQL are used to perform calculations on a group of values ​​and then return a single value. So, to make it easier to understand, let’s discuss and practice the functions of these two categories.

Scalar Functions (Numeric Value)

This function is generally used when you want to perform mathematical operations in SQL quickly and effectively. In SQL itself there are many mathematical functions.

To check what functions can be done in SQL, we can open this documentation:

https://www.postgresql.org/docs/9.5/functions-math.html -> postgresql

https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html -> mysql database

Here is some functions that are generally used:

To facilitate understanding, see this table dummy containing the grades of students in semesters 1 and 2 at a school. Here’s an example of using scalar functions using a dummy table:

- Numerical Scalar Function — ABS()

Syntax:

Try this query:

Output:

You see that all the negative values of Markgrowth now become positive value because the ABS() function.

- Numerical Scalar Function — CEILING()

Syntax:

Try this query:

Output:

You see that all the values of Semester1 and Semester2 now are rounded up because the CEILING() function.

- Numerical Scalar Function — FLOOR()

Syntax:

Try this query:

Output:

You see that all the values of Semester1 and Semester2 now are rounded down because the CEILING() function.

- Numerical Scalar Function — ROUND()

Syntax:

Try this query:

Output:

You see that all the values of Semester1 and Semester2 now are rounded based on the query. Value of Semester1 is rounded to 1 decimal, while value of Semester2 is rounded to no decimals.

- Numerical Scalar Function — SQRT()

Syntax:

Try this query:

Output:

You see that the new values of Semester1 is the square root of the real value of Semester1.

Mini Project

Use the MOD () function to calculate the residual value if the Semester1 value is divided by 2 and the EXP () function to calculate the exponential value of the MarkGrowth value. Use these two functions in one SELECT-Statement.

Syntax:

So, we select StudentID, FirstName, LastName. Then to get the residual value of Semester 1 divided by 2, we type MOD(Semester1,2) and use Semester1 alias. Then, we type Semester2, and after that, we calculate exponential value of MarkGrowth with EXP()

Output:

TEXT Functions

This function is used if we want to perform operations on text or characters in SQL, for example, changing lowercase to uppercase, counting the number of characters from text, etc.

We can check the function here:

postgrestql →https://www.postgresql.org/docs/9.1/functions-string.html

mysql → https://dev.mysql.com/doc/refman/8.0/en/string-functions.html

Here is the function that is often used:

First, let’s display all columns from students:

SELECT * FROM students;

Result:

- TEXT Functions — CONCAT()

Syntax:

Query:

Result:

It can be seen that the first name and last name are combined to a Name column.

- TEXT Functions — SUBSTRING_INDEX()

Syntax:

Information:

  • column → column name that the text will be split
  • delimiter → characters or a combination of characters to break the text in the column,
  • index_to_return → the index of the fractional text to be retrieved.

Query:

Result:

Here we see how to get the first text fraction (1) or sub-text before the delimiter ‘@’.

- TEXT Functions — SUBSTR()

Syntax:

Information:

  • columnName -> column name in which we want find the substring
  • Start Index -> index of the text (starting from 1)
  • Number of string to be extract -> the number of characters to be extract.

Query:

Result:

In this example you can see that we take the substring starting at the second character to the fourth character.

- TEXT Functions — LENGTH()

Syntax:

Query:

Result:

This LENGTH() function is used to count the number of characters from a text.

- TEXT Functions — REPLACE()

Syntax:

Information:

  • ColumnName → column name that each record / row will be replaced based on a certain string / character
  • Character/String to be change → the string / character to be replaced
  • New String/Character → new string / character in place of the previous string / character

Query:

Result:

We see that ‘yahoo’ has been replaced by ‘gmail’ in the New_Email column.

Mini Task

Use the UPPER () function to turn the FirstName column all caps and use LOWER () to make the LastName column all non-capitals. Use these two functions in one SELECT-Statement.

Write the query so that we will get this result:

Query:

This query make the FirstName in uppercase and LastName in lowercase.

Aggregate Functions

This aggregate function is used to perform calculations on a group of values.

Here are some functions which are usually used:

- Aggregate Functions— SUM()

Syntax:

Query:

Result:

We see that Total_1 cosists of values of SUM from column Semester1, while Total_2 cosists of values of SUM from column Semester2.

- Aggregate Functions — COUNT()

Syntax:

Query:

Result:

There are 5 students from students Table.

- Aggregate Functions — AVERAGE()

Syntax:

Query:

Result:

We see two values from the average of values from Semester1 column and Semester2 column.

Mini Task

Try this MIN() and MAX() to calculate value from Semester1 and Semester2 column.

Expected Output:

Query:

Output:

Here are the result of minimum and maximum value from Semester1 and Semester2 column.

GROUP BY

To group data in SQL we use GROUP BY Statement. The GROUP BY statement will group data of the same value into one group, and by using an aggregate function such as (COUNT, MAX, MIN, SUM, AVG) we can aggregate for each group or category that is formed.

Syntax:

Important things to note are:

  1. GROUP BY is used with SELECT, meaning that the column used in the GROUP BY statement also needs to be placed in SELECT.
  2. GROUP BY is placed after the WHERE, but if you don’t use WHERE, it is immediately placed after FROM.
  3. If using ORDER BY, GROUP BY is placed before ORDER BY.

Can we use GROUP BY to group data using 2 criteria or 2 columns ?

Of course yes.

  1. Group by Single Column, data is grouped using criteria from one column only, for example grouping data by province only.
  2. Group by Multiple Columns, data is grouped using criteria from two or more columns, for example grouping data by province and brand.

Group by Single Column

Group by Single Column function ensures data can be grouped using criteria from only one column, for example grouping data by province only.

Example:

Result:

Group by Multiple Column

With the Group by Multiple Column function, data can be grouped using criteria from two or more columns, for example grouping data by province and brand.

Example:

Result:

some output

Aggregate Function with Grouping

Try to use the aggregation function and GROUP BY to calculate the total sales from each province in 2019

Syntax:

Result:

We see that the table show the number of unique order and total revenue of each province.

CASE … WHEN …

If we are asked to add a recommendation column or remark on the results of data aggregation, for example the sales for March 2019 is more than 30M, then a recommendation / remark is given “Target Achieved”, if it is not given the remark “Follow Up”. How to do it in SQL ?

You can do that with CASE — statements in SQL. CASE — this statement is the same as IF — THEN — ELSE in most programming.

Syntax:

The CASE-statement will evaluate the specified conditions, starting at condition1, and will return the result (result1), if condition1 is met (TRUE). If not, then condition2 will evaluate, and will return result2 if condition2 is met, etc. If no conditions are met, the result on the ELSE section will be returned.

Mini Task

By using sales_retail_2019 data, create a query syntax that uses the scalar function MONTH () to change order_date from date to month, aggregate SUM () function to sum column item_price.

Add ‘remark’ Column using CASE…WHEN… statement.

If sum (item_price)> = 30,000,000,000, then the remark is “Target Achieved”;

If sum (item_price) <= 25,000,000,000, the remark is “Less performed”;

Apart from that, give the remark “Follow Up”.

Query:

Result:

MINI PROJECT

Do a sales analysis of a store. The requested reports are as follows:

  1. Total amount of all sales (total / revenue).
  2. Total quantity of all products sold.
  3. Total quantity and total revenue for each product code.
  4. Average total spend of each kode_pelanggan
  5. Also, don’t forget to add a new column called ‘category’ which categorizes total / revenue into 3 categories: High:> 300K; Medium: 100K — 300K; Low: <100K.

tr_penjualan table

Query 1–3:

Result:

We get the total revenue (top), total quantity sold (center), total revenue and quantity sold of each kode_produk (down)

Query 4–6:

Result:

Review

From this Fundamental SQL Using FUNCTION and GROUP BY modul,we’ve understood and been able to practice:

  • Use of scalar and aggregate functions in string and numeric operations in SQL databases
  • Use GROUP BY to group data and understand the concept of combining GROUP BY with an aggregate function
  • Use of the CASE Statement for decision making structure.
  • Working on a mini project which is an integration of the whole material and of course the materials in the previous modules to solve business problems.

Source: dqlab.id

--

--

No responses yet