Sunday, 7 January 2018

Java - SQL Parsing

Recently I had a use case where I have to validate a SQL query. Basically I had to look at the JOINS and WHERE clauses in the query.

I thought that why to parse a SQL on my own and extract table names and where clauses from the query. Why to reinvent the wheel and write your own logic when you can re-use existing open source libraries. Keeping that in mind I started searching for a simple open source library to serve the purpose.

Surprisingly I found only a few and those were not serving the purpose then I found Apache-Calcite(https://calcite.apache.org/). Apache-Calcite is a huge project and SQL parsing is just one part of it.

I didn't find a good example though using its SQL parser and had to dig in a bit to get a working example. Following is a working example:

public class BaseQueryValidator {
private static List<String> extractTableAliases(SqlNode node) {
final List<String> tables = new ArrayList<>();
// If order by comes in the query.
if (node.getKind().equals(SqlKind.ORDER_BY)) {
// Retrieve exact select.
node = ((SqlSelect) ((SqlOrderBy) node).query).getFrom();
} else {
node = ((SqlSelect) node).getFrom();
}
if (node == null) {
return tables;
}
// Case when only 1 data set in the query.
if (node.getKind().equals(SqlKind.AS)) {
tables.add(((SqlBasicCall) node).operand(1).toString());
return tables;
}
// Case when there are more than 1 data sets in the query.
if (node.getKind().equals(SqlKind.JOIN)) {
final SqlJoin from = (SqlJoin) node;
// Case when only 2 data sets are in the query.
if (from.getLeft().getKind().equals(SqlKind.AS)) {
tables.add(((SqlBasicCall) from.getLeft()).operand(1).toString());
} else {
// Case when more than 2 data sets are in the query.
SqlJoin left = (SqlJoin) from.getLeft();
// Traverse until we get a AS.
while (!left.getLeft().getKind().equals(SqlKind.AS)) {
tables.add(((SqlBasicCall) left.getRight()).operand(1).toString());
left = (SqlJoin) left.getLeft();
}
tables.add(((SqlBasicCall) left.getLeft()).operand(1).toString());
tables.add(((SqlBasicCall) left.getRight()).operand(1).toString());
}
tables.add(((SqlBasicCall) from.getRight()).operand(1).toString());
return tables;
}
return tables;
}
private static Map<String, String> extractWhereClauses(SqlNode node) {
final Map<String, String> tableToPlaceHolder = new HashMap<>();
// If order by comes in the query.
if (node.getKind().equals(SqlKind.ORDER_BY)) {
// Retrieve exact select.
node = ((SqlOrderBy) node).query;
}
if (node == null) {
return tableToPlaceHolder;
}
final SqlBasicCall where = (SqlBasicCall) ((SqlSelect) node).getWhere();
if (where != null) {
// Case when there is only 1 where clause
if (where.operand(0).getKind().equals(SqlKind.IDENTIFIER)
&& where.operand(1).getKind().equals(SqlKind.LITERAL)) {
tableToPlaceHolder.put(where.operand(0).toString(),
where.operand(1).toString());
return tableToPlaceHolder;
}
final SqlBasicCall sqlBasicCallRight = where.operand(1);
SqlBasicCall sqlBasicCallLeft = where.operand(0);
// Iterate over left until we get a pair of identifier and literal.
while (!sqlBasicCallLeft.operand(0).getKind().equals(SqlKind.IDENTIFIER)
&& !sqlBasicCallLeft.operand(1).getKind().equals(SqlKind.LITERAL)) {
tableToPlaceHolder.put(((SqlBasicCall) sqlBasicCallLeft.operand(1)).operand(0).toString(),
((SqlBasicCall) sqlBasicCallLeft.operand(1)).operand(1).toString());
sqlBasicCallLeft = sqlBasicCallLeft.operand(0); // Move to next where condition.
}
tableToPlaceHolder.put(sqlBasicCallLeft.operand(0).toString(),
sqlBasicCallLeft.operand(1).toString());
tableToPlaceHolder.put(sqlBasicCallRight.operand(0).toString(),
sqlBasicCallRight.operand(1).toString());
return tableToPlaceHolder;
}
return tableToPlaceHolder;
}
public static void main(String[] args) throws SqlParseException {
final String query = "SELECT e.first_name AS FirstName, s.salary AS Salary from employee AS e join salary AS s on e.emp_id=s.emp_id where e.organization = 'Tesla' and s.organization = 'Tesla'";
final SqlParser parser = SqlParser.create(baseQuery);
final SqlNode sqlNode = parser.parseQuery();
final SqlSelect sqlSelect = (SqlSelect) sqlNode;
final SqlJoin from = (SqlJoin) sqlSelect.getFrom();
// Extract table names/data sets, For above SQL query : [e, s]
final List<String> tables = extractTableNames(from);
// Extract where clauses, For above SQL query : [e.organization -> 'Tesla', s.organization -> 'Tesla']
final Map<String, String> whereClauses = extractWhereClauses(sqlSelect);
}
}
So this is just about extracting table names and where clauses. This SQL parser can do definitely more and can be used to extract other parts of the query as well. Of course you can dig in the java docs and API to cater your requirements. Happy Learning !!!

No comments:

Post a Comment