iT邦幫忙

2019 iT 邦幫忙鐵人賽

DAY 8
1

7: SQL Parser

2018/10/22

It's recommended to read on Gitbook

Until now, StellarSQL has been able to receive message from and send answer back to clients. That's good. It is a server, but it still cannot do works of database management.

Further, only with SQL parser, storage engine, and data engine, StellarSQL could be called a real DBMS.

These modules should be implemented in order, so let's talk about the SQL parser first.

For any message sent from client, it must be a query in SQL format. The query could be "create table", "insert data", "delete date", etc. In order to understand the query, we need a SQL parser.

General speaking, DBMS has a parser includes a lexical scanner and a grammar rule module. The lexical scanner splits the entire query into tokens (keywords or domain name), and the grammar rule module finds a combination of SQL grammar rules that produce this sequence, and process the code associated with those rules.

For example:

SELECT * FROM Customers WHERE Country = 'Mexico';

The lexical scanner breaks a SQL query above into tokens as:

  • SELECT
  • *
  • FROM
  • Customers
  • WHERE
  • Country
  • =
  • Mexico

A semicolon means the end of a query, so not it is not counted here.

Furthermore, the grammar rule module applies rules for these tokens, such as:

  • SELECT keyword is before columns
  • From keyword is before tables
  • Where keyword stands for conditions

A mature DBMS also does lots of optimizations. As you can imagine, the complexity of a SQL query requires an equally complex structure that efficiently stores the information needed for executing every possible SQL statement.

For example, according to the book, "Understanding MySQL Internals", MySQL optimizer does some important tasks (I just list a few):

  • Determine which keys can be used to retrieve the records from tables, and choose the best one for each table.

  • Determine the order in which tables should be joined when more than one table is present in the query.

  • Eliminate unused tables from the join.

  • Determine whether keys can be used for ORDER BY and GROUP BY.

In the next days, I am going to implement the SQL parser. However, I am afraid that I could only implement it with basic algorithm. The optimizer is a huge engineering, not to mention my lack of time due to preparing midterm exams.

The good news is that StellarSQL can parse SQL soon. :)

Reference


Quick Link
1.StellarSQL Repository
2.Gitbook of this series

Author
Liu, An-Chi (劉安齊). A software engineer, who loves writing code and promoting CS to people. Welcome to follow me at Facebook Page. More information on Personal Site and Github.


上一篇
StellarSQL 6: Client/Server Communication Implementation(2)
下一篇
StellarSQL 8: SQL Parser - Lexical Scanner
系列文
Let's build a DBMS: StellarSQL -- a minimal SQL DBMS written in Rust30

尚未有邦友留言

立即登入留言