7.3. DAG Execution / Translation

Once the relational algebra tree provided by Calcite has been interpreted (see Calcite Parser) and optimized (see DAG Builder / Optimizer) the tree is ordered and executed. Each remaining node in the tree (except for RelTableScan and RelJoin) forms a query step. Each step is executed in order, with the result from all previous steps available for subsequent steps. The relational algebra tree is a directed, acyclic graph dictating the execution order of the query steps.

7.3.1. Determining Execution Order

The DAG is built from the relational algebra tree by adding edges between nodes in the tree according to the inputs to each node, starting with the last node in the tree (i.e. the last query step to be executed).

Once the DAG has been built, an execution ordering is determined using a topological sort over the DAG. The sort returns an ordered list of vertices according to the dependencies described by the DAG. For query execution, we walk the DAG according to the ordering.

Consider the following example:

SELECT t1.x FROM dead_cols_test t1 JOIN (SELECT * FROM dead_cols_test) t2 ON t1.x = t2.x;

with the following Calcite plan:

1
2
3
4
5
LogicalProject(x=[$0])
  LogicalJoin(condition=[=($0, $3)], joinType=[inner])
    EnumerableTableScan(table=[[mapd, dead_cols_test]])
    LogicalProject(x=[$0], y=[$1], rowid=[$2])
      EnumerableTableScan(table=[[mapd, dead_cols_test]])

The DAG for this query is as follows:

digraph {
   "Project [1]" -> "Join [2]";
   "Join [2]" -> "Scan [3]";
   "Join [2]" -> "Project [4]";
   "Project [4]" -> "Scan [5]";
}

(with the number in paranthesis corresponding to the line number in the calcite plan)

The topological for the above graph produces the following ordering:

digraph {
   "Project [1]" -> "Join [2]"[label="4"];
   "Join [2]" -> "Scan [3]"[label="3"];
   "Join [2]" -> "Project [4]"[label="2"];
   "Project [4]" -> "Scan [5]"[label="1"];
}

Note

The ordering will be applied per vertex, but for illustration purposes we have placed the ordering on the edges. The vertex ordering can be deduced from the above figure by moving the edge order up to the higher vertex in the graph (e.g. the first step to be executed will be Project [4]).

Finally, note that Scan and Join nodes are not executed, but are automatically rolled into the next node during work unit generation.

7.3.2. RaExecutionDescriptor

After the ordering has been determined, query steps are wrapped in RaExecutionDescriptor objects. The RaExecutionDescriptor stores the relational algebra node for the query step, along with the query step result and any related output metadata. It is important that these objects do not go out of scope until the final ResultSet is returned to the client, as intermediate results may be required by subsequent query steps.

7.3.3. Query Step Translation

Each query step is packaged into a work unit for code generation and execution. The act of packaging a query step into code generation is called translation and is managed by the RelAlgTranslator. The translator converts a set of Rex expressions into an abstract syntax tree (AST) representation, which maps directly to the generated code for the kernel.

The translated AST is stored in multiple vectors which logically separate the projected SQL expressions from group by targets, filters, etc. The RelAlgExecutionUnit stores analyzer expressions in the following members:

  • target_exprs: Projected output expressions for the query step.

  • groupby_exprs: Columns being grouped. Note that all projection queries are considered group by queries with the group key being the identity function.

  • quals: Filter expressions.

  • simple_quals: Filter expressions involving a literal value (e.g. WHERE x = 10). These are separated for purposes of fragment skipping.

  • join_quals: Join expressions.

  • sort_info: Columns used for sorting, along with related sort info (limit, offset, etc).

Note

The quals, simple_quals, and join_quals vectors together make up the set of all filter expressions. That is, a filter expression comparing with a literal will be in simple_quals only, and will not be duplicated in the quals vector.

The RelAlgExecutionUnit is the primary member of the WorkUnit and contains all the information required to generate code for the query.

7.3.4. Query Step Execution

After translation, the work unit is passed to the Executor for native code generation and kernel execution. The Executor returns a ResultSet pointer. The ResultSet pointer is stored in the ExecutionDescriptor for the current step, and is also stored in the global temporary tables map. Intermediate results are referenced by negating the node ID of their parent query step.

7.3.4.1. Scalar Subqueries

Scalar subqueries are subqueries which return a single literal value, e.g.:

SELECT x FROM test WHERE x = (SELECT y FROM test2);

Scalar subqueries are identified during interpretation and split out prior to execution of the first query step. The subqueries are then executed as individual queries. The ResultSet for scalar subquery execution is expected to be a single row with a single column. During translation, a RexSubQuery expression is replaced with the result from the subquery, represented by a literal analyzer expression. The subqueries_ member of the RelAlgExecutor manages scalar subquery results for use in future steps.