Parsing SQLs

xfactrs

Varsha Gupta

xfactrs Parsing SQLs

In Data Analysis, SQLs are inevitable. Right business rules, the right business transformation logic, and, thereby, the right business decision depends on efficient SQLs. Validating and processing large complex SQLs with the aid of SQL parsing tools or libraries must go hand in hand with SQLs and insight automation.

For software to parse SQLs, what would be the most efficient and cost-effective approach? Writing custom-built parsers using regex?  Using commercial SQL parsers? Using open-source SQL parsers?

  • Custom-built parsers take a higher amount of time to roll out. It has a higher maintenance cost. And it will not cover most of the SQL scenarios. Every time a new scenario comes up, the developer will end up coding again.
  • Commercial parsers are usually expensive. Most of the commercial library-software are not available for any modifications based on custom requirements. That impacts the flexibility of the product.

What is left are open-source SQL parsers viz. Apache Calcite, ANTLR, JSqlParser.

  • Apache Calcite is used in popular databases and query engines viz. Apache Hive, Drill, Flink, Phoenix. Calcite requires more computing resources and coding effort to incorporate into software and run in production.
  • ANTLR is widely used to build tools and frameworks. However, ANTLR also requires higher coding effort. The computing resource requirement is less compared to that of Calcite.
  • JSqlParser is an SQL statement parser, it can parse multiple SQL dialects viz. MySQL, Postgres, Oracle. For E.g. it supports Oracle’s JOIN syntax using (+), PostgreSQLs cast syntax using ::, relational operators like != and so on. The results can be accessed in a structured way.

The grammar can be modified to support other SQL dialects such as that of hive. JSqlParser is lightweight and can

  • Validate SQL Queries
  • Check Semantics and Syntax
  • Auto-map columns to respective entity/table in all SQL statements.
  • Extract SQL details from DDL and DML statements.

JSqlParser is compatible with java and allows one to extend/customize the library to handle complex SQLs. It parses an SQL statement and translates it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Pattern. JSqlParser is built using JavaCC. The core JavaCC grammar for SQL has been taken from Guido Draheim’s site and has been changed to produce a hierarchy of Java classes.

JSqlParser completely supports the following features

  • Query parser, validator, and optimizer.
  • Support of multiple standard SQL and aggregate functions.
    • SELECT, FROM (including JOIN syntax), WHERE, GROUP BY (including GROUPING SETS), aggregate functions (including COUNT(DISTINCT …) and FILTER), HAVING, ORDER BY (including NULLS FIRST/LAST), set operations (INTERSECT, MINUS), sub-queries (including correlated sub-queries), windowed aggregates, LIMIT (syntax as Postgres);
    • DELETE, INSERT and INSERT-INTO statements.
    • CREATE Table statement, add Primary key, SET the index.

JSqlParser can be extended to support some specific scenarios.

Example1:

In a SELECT query which has UNION operator, combine same fields and show that field in one column. By default, JSqlParser will show it as two or more columns – depending on the number of UNION operator.

SELECT AccountNumber FROM Customer_Table

UNION

SELECT AccountNumber FROM Billing_Table

ORDER BY AccountNumber;

Expected output field: AccountNumber

JSQL parser default output: AccountNumber, AccountNumber

Example2:

In the Input SELECT statement, the new column can be added using JSqlParser for a single select expression. Multi-SELECT expression is not supported viz. query with UNION is not supported

Select select = (Select) CCJSqlParserUtil.parse(” SELECT City FROM Customers”);

SelectUtils.addExpression(select, new Column(” State “));

In the above, a New column State is added. But same cannot achieve for query that has UNION.

SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City;

In both the above examples, extension of JSqlParser is required.

To summarize, JSqlParser is an effective lightweight SQL parsing that covers a wide range of SQL cases. It supports extension and can be useful in ones’ development effort.

Relevant Blogs

Mind the Gap – The Leakage Happens in the Gap!

Mind the Gap – The Leakage Happens in the Gap!

Prasanna Deshmukh, Chief Product OfficerAs I start my journey as a Chief Product Officer with xfactrs, I am starting to talk with our customers about the value of xfactrs products and how it can identify the revenue leakage in the quote to cash processes. Analysts...

Analytics Tools Vs Revenue Assurance Platform

Analytics Tools Vs Revenue Assurance Platform

Kiran MohanWe have an analytics tool, why do I need a revenue assurance platform? More or less, that’s what we hear from many conversations we have. I don’t blame them as we are comparing apples vs oranges. The concept of revenue assurance is new to the subscription...

Quote Configuration

Is your quote configured as per requirement?

Order Assurance

Are your orders recorded accurately?

Consumption Assurance

Are you capturing the consumption accurately?

Billing Assurance

Are you sure that you are not under or overbilling?

Invoice Assurance

Are there any discrepancies in your invoice?

Renewal Assurance

Still servicing the customers who haven’t renewed?

Payment Assurance

Are you realizing the entire payments accurately?

Get a Demo

Write to us and see how xfactrs can help you protect your hard-earned revenues from leaking.

About Us

See what drives our passion for controlling revenue leakage

Why xfactrs

We believe xfactrs is unique. See why

Resources

Our thoughts and perspective on revenue assurance

Media

Already buzzing in the media world

Careers

Exciting careers are waiting for you to take on the subscription world

Get a Demo

Write to us and see how xfactrs can help you protect your hard-earned revenues from leaking.