======
Logger
======

``#include "Logger/Logger.h"``

The OmniSci Logger is based on `Boost.Log`_ with a design goal of being largely, though not completely, backward
compatible with `glog`_ in usage, but with additional control over the logging format and other features.

.. _Boost.Log: https://www.boost.org/libs/log/doc/html/index.html
.. _glog: https://github.com/google/glog

For the database user, log settings are set in the same manner as other program options,
which can be specified via the command line or config file. These include settings such as the directory
where log files are written to, log file naming format, log severity level, etc. See `Program Options`_.

For the developer, log entries are made in a syntax that is similar to ``std::ostream``. Example::

    LOG(INFO) << "x = " << x;

where ``INFO`` is one common example of the log "severity" level. Other severities are ``WARNING``, ``FATAL``,
``DEBUG1``, etc. See `Severity`_.

In addition, there are a number of ``CHECK`` macros which act like ``assert()`` but will report via the
logging system upon failure, with an optional message, and subsequently call ``abort()``. Examples::

    CHECK(ptr);
    CHECK_LT(1u, list.size()) << "list must contain more than 1 element.";

If ``ptr==nullptr`` or ``1u >= list.size()`` then the program will abort, and log a corresponding ``FATAL`` message.
See `CHECK`_.

Program Options
---------------

``omnisci_server --help`` includes the logging options::

    Logging:
    --log-directory arg (="mapd_log")     Logging directory. May be relative to
                                          data directory, or absolute.
    --log-file-name arg (=omnisci_server.{SEVERITY}.%Y%m%d-%H%M%S.log)
                                          Log file name relative to
                                          log-directory.
    --log-symlink arg (=omnisci_server.{SEVERITY})
                                          Symlink to active log.
    --log-severity arg (=INFO)            Log to file severity level: INFO
                                          WARNING ERROR FATAL
    --log-severity-clog arg (=ERROR)      Log to console severity level: INFO
                                          WARNING ERROR FATAL
    --log-channels arg                    Log channel debug info: IR PTX ASM
    --log-auto-flush arg (=1)              Flush logging buffer to file after each
                                          message.
    --log-max-files arg (=100)             Maximum number of log files to keep.
    --log-min-free-space arg (=20971520)  Minimum number of bytes left on device
                                          before oldest log files are deleted.
    --log-rotate-daily arg (=1)           Start new log files at midnight.
    --log-rotation-size arg (=10485760)   Maximum file size in bytes before new
                                          log files are started.

The class ``logger::LogOptions`` manages the above values, and integrates into the `Boost.Program_options`_
classes used by the OmniSciDB to process program options, from both the command line and/or a config file.

.. _Boost.Program_options: https://www.boost.org/doc/html/program_options.html

Initialization and Global Instances
-----------------------------------

To initialize the logging system, the function::

    namespace logger {

    void init(LogOptions const&);

    }

must be invoked with the ``logger::LogOptions`` object to be applied. It is recommended to run this from
``main()`` as early as possible. This initializes two global logging objects, ``gChannelLogger`` and
``gSeverityLogger``, that exist in the object file ``Logger.o``, which should never be referenced
directly. Instead, use the ``LOG``/``CHECK`` macros for all normal logging. This is mentioned here so that
issues such as global variable construction/destruction can be considered. More specifically, due to the
unpredictability of `static initialization order fiasco`_, logging should never be done from other global
constructors or destructors, directly or indirectly.

.. _`static initialization order fiasco`: https://cryptopp.com/wiki/Static_Initialization_Order_Fiasco


Usage
-----

Severity
^^^^^^^^

There are currently 8 severity levels that can be used with the ``LOG()`` macro, in decreasing order of
severity from most severe to least:

============ ============================================================================
**Severity** **When to Use**
``FATAL``    An unrecoverable error has occurred, and must be fixed in the software.
             This is the only severity which will result in a call to ``abort()``.
``ERROR``    A recoverable error has occurred, and must be fixed in the software.
``WARNING``  Something that "should not have" happened happened, but is not as demanding
             of an immediate fix as an ``ERROR``. Example: A deprecated feature is still
             being used, even though the user was informed of its deprecation.
``INFO``     Significant and informative milestones in the execution of the program.
             One must balance logging useful and informative information, against
             logging too much useless and redundant information that drowns the signal
             out with noise.
``DEBUG1``   More detailed information about the execution of the program than ``INFO``
             that would be useful for debugging, but less detailed than the below debug
             levels. E.g. don't log every row of a million-row table.
``DEBUG2``   More detailed information than ``DEBUG1``.
``DEBUG3``   More detailed information than ``DEBUG2``.
``DEBUG4``   More detailed information than ``DEBUG3``.
============ ============================================================================

The program option ``--log-severity`` is set to exactly one of these, defaulting to ``INFO``. All ``LOG()``
calls with a lesser severity are ignored and not logged. For example if ``--log-severity=DEBUG1`` then log
calls ``LOG(DEBUG1)``, ``LOG(INFO)``, ``LOG(WARNING)``, ``LOG(ERROR)``, and ``LOG(FATAL)`` are active and
will produce log entries when executed, and log calls ``LOG(DEBUG4)``, ``LOG(DEBUG3)``, and ``LOG(DEBUG2)``
are ignored. If a function is called in the input stream, e.g. ``LOG(DEBUG2) << f(x)``, then ``f(x)`` will
be invoked if any only if ``DEBUG2`` is an active log severity.

One guiding principle for deciding which of the debug levels to use is to consider it from the perspective of
someone who is trying to diagnose/debug a problem. Running the database with ``--log-severity=DEBUG4`` will
show ALL debug log lines, but more information must be searched through in order to find what is useful. Having
a few different debug levels allows for some control over how verbose the logging is.

Errors that are logged with ``ERROR`` and ``FATAL`` should be considered software errors, and not user errors.
For example, if the user inputs a malformed SQL query, e.g. ``SELEKT`` instead of ``SELECT``, then this
should NOT be logged as an ``ERROR``, but instead logged as an ``INFO`` with an appropriate response to
the user. This would be considered correct behavior of the software. In contrast, anytime an ``ERROR`` or
``FATAL`` is logged, then it means there is a bug in the software that must be fixed.

For backward compatibility, the flag ``--verbose=on`` is equivalent to ``--log-severity=DEBUG1``. If both
are provided, then the lesser severity is active.

Log Files
^^^^^^^^^

A separate log file is produced for ``INFO``, ``WARNING``, ``ERROR``, and ``FATAL`` if the log severity is
active.  If any of the ``DEBUG`` severities are active, they are included into the ``INFO`` log file.  Each log
file redundantly includes all entries that are more severe than itself. For example if ``--log-severity=DEBUG1``
then the ``INFO`` log file will include all log calls ``LOG(DEBUG1)``, ``LOG(INFO)``, ``LOG(WARNING)``,
``LOG(ERROR)``, and ``LOG(FATAL)``; the ``WARNING`` log file will include all log calls ``LOG(WARNING)``,
``LOG(ERROR)``, and ``LOG(FATAL)``.

The name of the log file by default includes the program name, severity, and timestamp of when its first entry
was made. Example::

    omnisci_server.INFO.20190918-162525.log

This can be customized with the ``--log-file-name`` program option.

New log files are started on each (re)start. A symbolic link without the timestamp and file extension
suffix points to the latest version. Example::

    omnisci_server.INFO -> omnisci_server.INFO.20190918-162525.log

Format
""""""

The general format of a log entry is::

    (timestamp) (severity) (process_id) (query_id) (thread_id) (filename:line_number) (message)

Example::

    2019-09-18T16:25:25.659248 I 26481 5 DBHandler.cpp:181 OmniSci Server 4.9.0dev-20190918-bd97353685

Field descriptions:

| 1. Timestamp in local timezone with microsecond resolution.
| 2. Single-character severity level. In same order as above severity levels:
|    ``F`` ``E`` ``W`` ``I`` ``1`` ``2`` ``3`` ``4``
|    For instance the ``I`` implies that the above log entry is of ``INFO`` severity.
| 3. The `process_id` assigned by the operating system.
| 4. The `query_id` is a unique 64-bit positive integer incrementally assigned to each new SQL query. A value of `0` indicates that the log line is outside of the context of any particular query, or that the `query_id` is not available.
| 5. The `thread_id` is a unique 64-bit positive integer incrementally assigned to each new thread. `thread_id=1` is assigned to the first thread each time the program starts.
| 6. Source filename:Line number.
| 7. Custom message sent to ``LOG()`` via the insertion ``<<`` operator.

Note that log entries can contain line breaks, thus not all log lines will begin with these fields if
the message itself contains multiple lines.

Channel
^^^^^^^

Channels are similar to severities, but exist outside of the severity hierarchy, have no ordering of their own,
and can only be activated by explicitly including them in the ``--log-channels`` program option.

Currently there are 3 channels: ``IR`` ``PTX`` ``ASM``

which log intermediate representation, and parallel thread execution code, respectively. Scripts may be
used for other purposes that parse and analyze these logs, therefore using channels outside of the severity
hierarchy is convenient so that the output is not interleaved with unrelated ``INFO`` or ``DEBUG`` log lines.

For example, ``LOG(IR) << "Foo = " << foo.getIr();`` will be activated if any only if ``IR`` is included
in the ``--log-channels`` program option, which can accept multiple comma-delimited channel names. If activated,
the output will be logged to a file named after the channel, e.g. ``omnisci_server.IR.20190918-162525.log``.

Macros
^^^^^^

LOG
"""

In addition to the ``LOG()`` macro, there are:

 * ``LOG_IF(severity, condition)`` - Same as ``LOG(severity)`` but first checks a boolean ``condition`` and logs
   only if evaluated to ``true``.
 * ``VLOG(n)`` - Same as ``LOG(DEBUGn)`` for ``n = 1, 2, 3, 4``.

CHECK
"""""

The ``CHECK(condition)`` macro evaluates ``condition`` as a boolean value. If true, then execution continues
with nothing logged. Otherwise both the ``condition`` source code string is logged at ``FATAL`` severity,
along with any optional ``<< message``, before calling ``abort()``. The program may then either exit, or
optionally catch the ``SIGABRT`` signal.

| Similarly there are 6 binary ``CHECK`` comparison macros:
| ``CHECK_EQ``, ``CHECK_NE``, ``CHECK_LT``, ``CHECK_LE``, ``CHECK_GT``, ``CHECK_GE``
| which accept two parameters, and apply the comparison operators ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``, respectively. For example, ``CHECK_LT(1u, list.size())`` will evaluate ``1u < list.size()``, and log and ``abort()`` if not true. The advantage of calling ``CHECK_LT(1u, list.size())`` over ``CHECK(1u < list.size())`` is that the value of both operands will be logged if the test fails, which is not reported with ``CHECK()``.

STDLOG
""""""

``DBHandler`` uses a logging helper class ``StdLog`` for logging query-specific information in
a standard format::

 (timestamp) (severity) (process_id) (query_id) (thread_id) (filename:line_number) stdlog (function_name) (match_id)
 (time_ms) (username) (dbname) (public_session_id) (array of names) (array of values)

Since this contains timing information, it is logged at the end of query execution.  If the ``DEBUG1`` severity is
active, then a corresponding ``stdlog_begin`` line is also logged at the start of the query, with the same format.

Example usage is given in the `QueryState`_ documentation.

.. _QueryState: query_state.html

Example entries::

 2019-09-20T17:15:28.215590 1 13080 DBHandler.cpp:846 stdlog_begin sql_execute 2 0 omnisci testuser 528-dyM2 {"query_str"} {"SELECT * FROM omnisci_counties LIMIT 1;"}
 2019-09-20T17:15:28.924512 I 13080 DBHandler.cpp:846 stdlog sql_execute 2 709 omnisci testuser 528-dyM2 {"query_str","execution_time_ms","total_time_ms"} {"SELECT * FROM omnisci_counties LIMIT 1;","708","709"}

The first 4 fields are same as in the above `Format`_ section.  Additional field descriptions:

#. ``stdlog_begin``/``stdlog`` -  logged when ``StdLog`` object is constructed/destructed.
#. ``function_name`` - Name of function the ``StdLog`` object was constructed/destructed in.
#. ``match_id`` - Integer to uniquely match ``stdlog_begin`` and ``stdlog`` lines together.
#. ``time_ms`` - Time in milliseconds since constructor. For ``stdlog_begin`` it is always 0. For ``stdlog``
   it is the time between the constructor and destructor, which is generally the duration of the function call.
#. ``username`` - Session username. Blank if not available.
#. ``dbname`` - Session database. Blank if not available.
#. ``public_session_id`` - Identifier for session. May be publicly accessible without compromising security.
   Blank if not available.
#. ``names`` - List of optional value names in SQL-array format.
#. ``values`` - List of optional values in SQL-array format, in same order as ``names``. Names and values are
   logged in this way so as to be readily imported into a SQL table for analysis.

DEBUG_TIMER
"""""""""""

``DebugTimer`` objects can be instantiated in the code that measure and log the duration of their own lifetimes,
and include the following features:

* Globally accessible via a macro. E.g. ``auto timer = DEBUG_TIMER(__func__)``.
* Single multi-line log entry is reported for nested timers.
* Enabled with the ``--enable-debug-timer`` program option. Without it, the ``timer`` objects have no effect.
* Include timers from spawned threads. Requires a call on the child thread informing the parent thread id:
  ``DEBUG_TIMER_NEW_THREAD(parent_thread_id);``

Example::

    void foo() {
      auto timer = DEBUG_TIMER(__func__);
      ...
      bar();
      ...
    }

    void bar() {
      auto timer = DEBUG_TIMER(__func__);
      ...
      bar2();
      ...
      timer.stop();  // Manually stop timer for bar().
      ...
    }

    void bar2() {
      auto timer = DEBUG_TIMER(__func__);
      ...
    }

Upon the destruction of the ``timer`` object within ``foo()``, a log entry similar to the following will be made::

    2019-10-17T15:22:53.981002 I 8980 foobar.cpp:70 DEBUG_TIMER thread_id(140719710320384)
    19ms total duration for foo
      17ms start(10ms) bar foobar.cpp:100
        13ms start(10ms) bar2 foobar.cpp:130

Fields for the ``Duration`` lines (last two line above) are:

#. Lifetime of ``timer`` object.
#. Time after start of current thread. (This can be used to find gaps in timing coverage.)
#. String parameter to ``DEBUG_TIMER`` (``__func__`` in above examples.)
#. File\:Line where ``DEBUG_TIMER`` was called from.

The first root ``DEBUG_TIMER`` instance is in ``foo()``, and the two others in ``bar()`` and ``bar2()`` are initiated
upon subsequent calls into the call stack, represented by the indentations.  Once the first root ``timer`` object
destructs, the entire ``DurationTree`` of recorded times are logged together into a single multi-line log entry,
one line per ``timer`` instance.

There is a ``DebugTimer::stop()`` method that manually stops the timer, serving the same function
as the destructor. The destructor then will have no further effect.

To embed timers in a spawned child thread, call ``DEBUG_TIMER_NEW_THREAD(parent_thread_id);`` from the child
thread. The ``parent_thread_id`` must get its value from ``logger::thread_id()`` before the new thread is spawned.
This will not start a timer, but will record the child-parent relationship so that subsequent ``DEBUG_TIMER``
calls are stored in the correct node of the parent tree. An example of a resulting report::

    2020-01-30T16:58:19.926148 I 33266 DBHandler.cpp:956 DEBUG_TIMER thread_id(4)
    591ms total duration for sql_execute
      511ms start(41ms) executeRelAlgQuery RelAlgExecutor.cpp:71
        6ms start(41ms) executeWorkUnit RelAlgExecutor.cpp:1858
          4ms start(41ms) compileWorkUnit NativeCodegen.cpp:1571
            New thread(5)
              0ms start(0ms) fetchChunks Execute.cpp:2024
              0ms start(0ms) getQueryExecutionContext QueryMemoryDescriptor.cpp:711
              0ms start(0ms) executePlanWithoutGroupBy Execute.cpp:2276
                0ms start(0ms) launchGpuCode QueryExecutionContext.cpp:195
            End thread(5)

.. note::

    Any timer that is created in a thread when no other timers are active in the same or parent thread is
    called a *root timer*. The timer stack is logged when the root timer destructs, or ``stop()`` is called,
    after which memory used for tracking the timer trees are freed.  The performance cost of this should be
    kept in mind when placing timers within the code.

.. warning::

    Non-root timers that end *after* their root timer ends will result in a **segmentation fault** (but only
    when the ``--enable-debug-timer`` option is active). This is easily avoided by not interleaving timer
    lifetimes with one another in the same block of code, and making sure that all child threads end prior
    to the ending of any corresponding root timer.

The high-level class relationships are:

.. uml::
    :align: center

    @startuml
    object thread_id
    class DurationTree
    note right: Each node of DurationTree is of type\n**boost::variant<Duration, DurationTree&>**\nto hold both Durations and\nDurationTrees of child threads.
    class DurationTreeMap
    note right: Global singleton:\nlogger::g_duration_tree_map
    class Duration {
      int depth_
      Clock::time_point start_
      Clock::time_point stop_
    }
    thread_id - DurationTree
    DurationTreeMap -- (thread_id, DurationTree)
    DurationTree o- DurationTree
    DurationTree *- Duration
    class DebugTimer {
      ---
      void stop()
    }
    note left: Instantiate with macro:\nauto timer = DEBUG_TIMER(name);
    Duration <.. DebugTimer
    @enduml

There is a single global instance of ``DurationTreeMap`` that tracks a separate ``DurationTree`` for each thread.