.. OmniSciDB System Overview

#########################
OmniSciDB at 30,000 Feet
#########################

OmniSciDB is made up of several high level components. A diagram illustrating most of the components of the system is displayed below, with a short description of each major section of the documentation following.

High Level Diagram
==================

.. image:: ../img/platform_overview.png

The major components in the above diagram and their respective reference pages are listed in the table below.


.. list-table:: Component Reference
    :header-rows: 1

    * - Component
      - Reference Page
    * - Thrift Interface  
      - :doc:`../data_model/api`
    * - Calcite Server
      - :doc:`../calcite/calcite_parser`
    * - Catalog
      - :doc:`../catalog/index`
    * - Executor
      - :doc:`../execution/overview`
    * - LLVM JIT
      - :doc:`../execution/codegen`
    * - CPU / GPU Kernels 
      - :doc:`../execution/kernels`
    * - Database Files, Metadata Files, Dictionary Files
      - :doc:`../data_model/physical_layout`

Data Model
===========

The :doc:`../data_model/index` section provides an overview of the data formats and data types supported by OmniSciDB. A brief overview of various storage layer components is also included.

Data Flow
=========

The :doc:`../flow/data` section ties together the :doc:`../data_model/index` and :doc:`../execution/index` sections, providing information about the complete flow of data from the input columns for a query to its projected outputs.

Query Execution
==========================
The :doc:`../execution/index` section provides an overview
of how a query is executed inside OmniSciDB.

At a high-level, all SQL queries made to the server pass through the
Thrift_ `sql_execute` endpoint. The query string is passed to Apache Calcite_ 
for parsing and cost-based optimization, yielding an optimized relational 
algebra tree. This relational algebra tree is then passed through OmniSci-specific 
optimization passes and translated into an OmniSCi-specific abstract syntax tree (AST). 
The AST provides all the information necessary to generate native machine code for 
query execution on the target device. Execution then occurs in parallel on the target 
device, with device results being aggregated and reduced into a final `ResultSet`
for each query step.

The sections following provide in-depth details on each of the
stages involved in executing a query.

.. _Thrift: https://thrift.apache.org/
.. _Calcite: https://calcite.apache.org/
.. _Bison: https://www.gnu.org/software/bison/

Simplified Execution Model
==========================

.. uml::
   :align: center

    @startuml
   
    start
   
    :Parse and Validate SQL;
   
    :Generate Optimized 
     Relational Algebra Sequence;
   
    :Prepare Execution Environment;
    
    repeat
        fork
            :Data Ownership, 
             Identification, 
             Load (as required);
            :Execute Query Kernel 
             on Target Device;
        fork again
            :Data Ownership, 
             Identification, 
             Load (as required);
            :Execute Query Kernel 
             on Target Device;
        fork again
            :Data Ownership, 
             Identification, 
             Load (as required);
            :Execute Query Kernel 
             on Target Device;
        end fork      
        :Reduce Result;

    while (Query Completed?)

    :Return Result;
    
    stop

    @enduml