Skip to content

SQL Query over File Examples

esProcSPL edited this page Jul 1, 2024 · 1 revision

【Abstract】 This article illustrates scenarios of querying files directly in SQL and provides sample programs written with esProc SPL.

It’s convenient and efficient to manipulate database data in SQL. It’s natural to load data into the database and handle it in SQL when we trying to compute files. The problem is that data loading itself is not simple. It would be extremely convenient if there was a way to perform SQL query directly on files. Here I’ll bring in such a tool, esProc SPL, list various SQL file query scenarios and offers esProc example programs. esProc is a specialized data computing engine, and SPL, abbreviation of Structured Process Language used by esProc, boasts a complete set of SQL file query methods.

This article uses text files in all examples. In fact the methods also apply in Excel files.

1.Filtering

Select records from a text file on a specified condition in SQL.

Example: Select scores of students in class 10 from students score table Students_scores.txt. In the file, the first row contains column names and detailed data begins from the second row:

A
1 $select * from E:/txt/Students_scores.txt where CLASS=10

2. Aggregation

Summarize data in a text file in SQL.

Example: Calculate the average Chinese score, the highest math score, and the total of English scores based on the student scores table.

A
1 $select avg(Chinese),max(Math),sum(English) from E:/txt/Students_scores.txt

3. Inter-column calculations

Perform inter-column calculations in a text file in SQL.

Example: Calculate the total score of each student in the student score table.

A
1 $select *,English+Chinese+Math as total_score from E:/txt/students_scores.txt

Below is A1’s result, where a computed column total_score is added.

4.CASE statement

We can perform complicated conditional queries in SQL using CASE statement.

Example: Find whether or not the English result of each student in the student scores table is a Pass.

A
1 $select *, case when English>=60 then 'Pass' else 'Fail' end as English_evaluation from E:/txt/students_scores.txt

Below is A1’s result, where a computed column English_evaluation is added.

5.Sorting

Sort data in a text file in ascending (or descending) order in SQL.

Example: Sort the student scores table by class in ascending order and by total score in descending order.

A
1 $select * from E:/txt/students_scores.txt order by CLASS,English+Chinese+Math desc

6.TOP-N

Get Top-N over a text file in SQL.

Example: Get the records of the three students who have the highest English scores.

A
1 $select top 3 * from E:/txt/students_scores.txt order by English desc

7. Grouping & aggregation

Perform grouping & aggregation on data in a text file in SQL.

Example: Find the lowest English score, the highest Chinese score and the total math score in each class.

A
1 $select CLASS,min(English),max(Chinese),sum(Math) from E:/txt/students_scores.txt group by CLASS

8. Post-grouping filtering

Group and summarize data in a text file and then perform filtering in SQL.

Example: Find the class where the average English score is below 70.

A
1 $select CLASS,avg(English) as avg_En from E:/txt/students_scores.txt group by CLASS having avg(English)<70

A1’s result.

9. Distinct

Perform distinct operation over data in a text file in SQL.

Example: Get the IDs of all classes.

A
1 $select distinct CLASS from E:/txt/students_scores.txt

10. Distinct Count

Perform count after distinct operation on data in a text file in SQL.

Example: Based on the product information file, count the number of different kinds of products. Below is part of the file:

A
1 $select count(distinct PID) from E:/txt/PRODUCT_SALE.txt

11. Grouping & Count Distinct

Group data in text file and perform count distinct on data in text file in SQL.

Example: Based on sales table, count the days when there are sales records for each kind of product.

A
1 $select PID,count(distinct DATE) as no_sdate from E:/txt/PRODUCT_SALE.txt group by PID

12. Two-table join query

Perform a join query over two text files in SQL.

Example: The product information and the sales information are stored in two text files respectively. We want to calculate the total sales amount of the products whose quantity per order is less than 10. Below are the files’ data structures and their relationship:

A
1 $select sum(S.quantity*P.Price) as total
from E:/txt/Sales.txt as S join E:/txt/Products.txt as P on S.productid=P.ID
where S.quantity<=10

13. Multi-table join query

Perform a join query over more than two text files in SQL.

Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of California in HR department.

A
1 $select e.NAME as NAME
from E:/txt/EMPLOYEE_J.txt as e
join E:/txt/DEPARTMENT.txt as d on e.DEPTID=d.DEPTID
join E:/txt/STATE.txt as s on e.STATEID=s.STATEID
where
d.NAME='HR' and s.NAME='California'

14. Multi-table, multilevel join query

Perform a multilevel join query over more than two text files in SQL.

Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of New York state whose managers come from California.

A
1 $select e.NAME as ENAME
from E:/txt/EMPLOYEE.txt as e
join E:/txt/DEPARTMENT.txt as d on e.DEPT=d.NAME
join E:/txt/EMPLOYEE.txt as emp on d.MANAGER=emp.EID
where e.STATE='New York' and emp.STATE='California'

15. Nested subquery

Use very complicated SQL in a nested subquery.

Example: The employee information and the department information are stored in two text files. We want to find the department that has the youngest manager. Below are parts of the two files:

A
1 $select emp.BIRTHDAY as BIRTHDAY,emp.DEPT as DEPT
from E:/txt/DEPARTMENT.txt as dept
join E:/txt/EMPLOYEE.txt emp
on dept.MANAGER=emp.EID
where
emp.BIRTHDAY=(select max(BIRTHDAY)
from ( select emp1.BIRTHDAY as BIRTHDAY
from E:/txt/DEPARTMENT.txt as dept1
join E:/txt/EMPLOYEE.txt as emp1
on dept1.MANAGER=emp1.EID
)
)

16. Common table expression

Compute data in a text file using SQL WITH clause.

Example: Based on the same text files in the previous example, find the specified departments (HR, R&D, Sales) from the department table, and count the female employees and calculate the average salary in each of these departments.

A
1 $with A as
(select NAME as DEPT from E:/txt/DEPARTMENT.txt
where NAME='HR' or NAME='R&D' or NAME='Sales')

select A.DEPT DEPT,count(*) NUM,avg(B.SALARY) AVG_SALARY from
A left join E:/txt/EMPLOYEE.txt B
on A.DEPT=B.DEPT
where B.GENDER='F' group by A.DEPT

Find more examples of agile computations in SPL CookBook.

Clone this wiki locally