Class ResultSet¶

class ResultSet¶

Public Types

enum GeoReturnType¶

Geo return type options when accessing geo columns from a result set.

Values:

GeoTargetValue¶: Copies the geo data into a struct of vectors - coords are uncompressed

WktString¶: Returns the geo data as a WKT string

GeoTargetValuePtr¶: Returns only the pointers of the underlying buffers for the geo data.

GeoTargetValueGpuPtr¶: If geo data is currently on a device, keep the data on the device and return the device ptrs

Public Functions

ResultSet(const std::vector<TargetInfo> &targets, const ExecutorDeviceType device_type, const QueryMemoryDescriptor &query_mem_desc, const std::shared_ptr<RowSetMemoryOwner> row_set_mem_owner, const Catalog_Namespace::Catalog *catalog, const unsigned block_size, const unsigned grid_size)¶

ResultSet(const std::vector<TargetInfo> &targets, const std::vector<ColumnLazyFetchInfo> &lazy_fetch_info, const std::vector<std::vector<const int8_t *>> &col_buffers, const std::vector<std::vector<int64_t>> &frag_offsets, const std::vector<int64_t> &consistent_frag_sizes, const ExecutorDeviceType device_type, const int device_id, const QueryMemoryDescriptor &query_mem_desc, const std::shared_ptr<RowSetMemoryOwner> row_set_mem_owner, const Catalog_Namespace::Catalog *catalog, const unsigned block_size, const unsigned grid_size)¶

ResultSet(const std::shared_ptr<const Analyzer::Estimator>, const ExecutorDeviceType device_type, const int device_id, Data_Namespace::DataMgr *data_mgr)¶

ResultSet(const std::string &explanation)¶

ResultSet(int64_t queue_time_ms, int64_t render_time_ms, const std::shared_ptr<RowSetMemoryOwner> row_set_mem_owner)¶

~ResultSet()¶

std::string toString() const¶

std::string summaryToString() const¶

ResultSetRowIterator rowIterator(size_t from_logical_index, bool translate_strings, bool decimal_to_double) const¶

ResultSetRowIterator rowIterator(bool translate_strings, bool decimal_to_double) const¶

ExecutorDeviceType getDeviceType() const¶

const ResultSetStorage *allocateStorage() const¶

const ResultSetStorage *allocateStorage(int8_t *, const std::vector<int64_t>&, std::shared_ptr<VarlenOutputInfo> = nullptr) const¶

const ResultSetStorage *allocateStorage(const std::vector<int64_t>&) const¶

void updateStorageEntryCount(const size_t new_entry_count)¶

std::vector<TargetValue> getNextRow(const bool translate_strings, const bool decimal_to_double) const¶

size_t getCurrentRowBufferIndex() const¶

std::vector<TargetValue> getRowAt(const size_t index) const¶

TargetValue getRowAt(const size_t row_idx, const size_t col_idx, const bool translate_strings, const bool decimal_to_double = true) const¶

OneIntegerColumnRow getOneColRow(const size_t index) const¶

std::vector<TargetValue> getRowAtNoTranslations(const size_t index, const std::vector<bool> &targets_to_skip = {}) const¶

bool isRowAtEmpty(const size_t index) const¶

void sort(const std::list<Analyzer::OrderEntry> &order_entries, size_t top_n, const Executor *executor)¶

void keepFirstN(const size_t n)¶

void dropFirstN(const size_t n)¶

void append(ResultSet &that)¶

const ResultSetStorage *getStorage() const¶

size_t colCount() const¶

SQLTypeInfo getColType(const size_t col_idx) const¶

size_t rowCount(const bool force_parallel = false) const¶

Returns the number of valid entries in the result set (i.e that will be returned from the SQL query or inputted into the next query step)

Note that this can be less than or equal to the value returned by ResultSet::getEntries(), whether due to a SQL LIMIT/OFFSET applied or because the result set representation is inherently sparse (i.e. baseline hash group by).

Internally this function references/sets a cached value (cached_row_count_) so that the cost of computing the result is only paid once per result set.

If the actual row count is not cached and needs to be computed, in some cases that can be O(1) (i.e. if limits and offsets are present, or for the output of a table function). For projections, we use a binary search, so it is O(log n), otherwise it is O(n) (with n being ResultSet::entryCount()), which will be run in parallel if the entry count >= the default of 20000 or if force_parallel is set to true

Note that we currently do not invalidate the cache if the result set is changed (i.e appended to), so this function should only be called after the result set is finalized.

Parameters

force_parallel: Forces the row count to be computed in parallel if the row count cannot be otherwise be computed from metadata or via a binary search (otherwise parallel search is automatically used for result sets with entryCount() >= 20000)

void invalidateCachedRowCount() const¶

void setCachedRowCount(const size_t row_count) const¶

bool isEmpty() const¶

Returns a boolean signifying whether there are valid entries in the result set.

Note a result set can be logically empty even if the value returned by ResultSet::entryCount() is > 0, whether due to a SQL LIMIT/OFFSET applied or because the result set representation is inherently sparse (i.e. baseline hash group by).

Internally this function is just implemented as ResultSet::rowCount() == 0, which caches it’s value so the row count will only be computed once per finalized result set.

size_t entryCount() const¶

Returns the number of entries the result set is allocated to hold.

Note that this can be greater than or equal to the actual number of valid rows in the result set, whether due to a SQL LIMIT/OFFSET applied or because the result set representation is inherently sparse (i.e. baseline hash group by)

For getting the number of valid rows in the result set (inclusive of any applied LIMIT and/or OFFSET), use ResultSet::rowCount(). Or to just test if there are any valid rows, use ResultSet::entryCount(), as a return value from entryCount() greater than 0 does not neccesarily mean the result set is empty.

size_t getBufferSizeBytes(const ExecutorDeviceType device_type) const¶

bool definitelyHasNoRows() const¶

const QueryMemoryDescriptor &getQueryMemDesc() const¶

const std::vector<TargetInfo> &getTargetInfos() const¶

const std::vector<int64_t> &getTargetInitVals() const¶

int8_t *getDeviceEstimatorBuffer() const¶

int8_t *getHostEstimatorBuffer() const¶

void syncEstimatorBuffer() const¶

size_t getNDVEstimator() const¶

void setQueueTime(const int64_t queue_time)¶

void setKernelQueueTime(const int64_t kernel_queue_time)¶

void addCompilationQueueTime(const int64_t compilation_queue_time)¶

int64_t getQueueTime() const¶

int64_t getRenderTime() const¶

void moveToBegin() const¶

bool isTruncated() const¶

bool isExplain() const¶

void setValidationOnlyRes()¶

bool isValidationOnlyRes() const¶

std::string getExplanation() const¶

bool isGeoColOnGpu(const size_t col_idx) const¶

int getDeviceId() const¶

void fillOneEntry(const std::vector<int64_t> &entry)¶

void initializeStorage() const¶

void holdChunks(const std::list<std::shared_ptr<Chunk_NS::Chunk>> &chunks)¶

void holdChunkIterators(const std::shared_ptr<std::list<ChunkIter>> chunk_iters)¶

void holdLiterals(std::vector<int8_t> &literal_buff)¶

std::shared_ptr<RowSetMemoryOwner> getRowSetMemOwner() const¶

const Permutation &getPermutationBuffer() const¶

const bool isPermutationBufferEmpty() const¶

void serialize(TSerializedRows &serialized_rows) const¶

size_t getLimit() const¶

ResultSetPtr copy()¶

void clearPermutation()¶

void initStatus()¶

void invalidateResultSetChunks()¶

const bool isEstimator() const¶

void setCached(bool val)¶

const bool isCached() const¶

void setExecTime(const long exec_time)¶

const long getExecTime() const¶

void setQueryPlanHash(const QueryPlanHash query_plan)¶

const QueryPlanHash getQueryPlanHash()¶

std::unordered_set<size_t> getInputTableKeys() const¶

void setInputTableKeys(std::unordered_set<size_t> &&intput_table_keys)¶

void setTargetMetaInfo(const std::vector<TargetMetaInfo> &target_meta_info)¶

std::vector<TargetMetaInfo> getTargetMetaInfo()¶

std::optional<bool> canUseSpeculativeTopNSort() const¶

void setUseSpeculativeTopNSort(bool value)¶

const bool hasValidBuffer() const¶

GeoReturnType getGeoReturnType() const¶

void setGeoReturnType(const GeoReturnType val)¶

void copyColumnIntoBuffer(const size_t column_idx, int8_t *output_buffer, const size_t output_buffer_size) const¶: For each specified column, this function goes through all available storages and copies its content into a contiguous output_buffer

bool isDirectColumnarConversionPossible() const¶

Determines if it is possible to directly form a ColumnarResults class from this result set, bypassing the default columnarization.

NOTE: If there exists a permutation vector (i.e., in some ORDER BY queries), it becomes equivalent to the row-wise columnarization.

bool didOutputColumnar() const¶

bool isZeroCopyColumnarConversionPossible(size_t column_idx) const¶

const int8_t *getColumnarBuffer(size_t column_idx) const¶

QueryDescriptionType getQueryDescriptionType() const¶

const int8_t getPaddedSlotWidthBytes(const size_t slot_idx) const¶

std::tuple<std::vector<bool>, size_t> getSingleSlotTargetBitmap() const¶

std::tuple<std::vector<bool>, size_t> getSupportedSingleSlotTargetBitmap() const¶

This function returns a bitmap and population count of it, where it denotes all supported single-column targets suitable for direct columnarization.

The final goal is to remove the need for such selection, but at the moment for any target that doesn’t qualify for direct columnarization, we use the traditional result set’s iteration to handle it (e.g., count distinct, approximate count distinct)

std::vector<size_t> getSlotIndicesForTargetIndices() const¶

const std::vector<ColumnLazyFetchInfo> &getLazyFetchInfo() const¶

bool areAnyColumnsLazyFetched() const¶

size_t getNumColumnsLazyFetched() const¶

void setSeparateVarlenStorageValid(const bool val)¶

const std::vector<std::string> getStringDictionaryPayloadCopy(const int dict_id) const¶

const std::pair<std::vector<int32_t>, std::vector<std::string>> getUniqueStringsForDictEncodedTargetCol(const size_t col_idx) const¶

StringDictionaryProxy *getStringDictionaryProxy(int const dict_id) const¶

template<typename ENTRY_TYPE, QueryDescriptionType QUERY_TYPE, bool COLUMNAR_FORMAT> ENTRY_TYPE getEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const¶

ChunkStats getTableFunctionChunkStats(const size_t target_idx) const¶

void translateDictEncodedColumns(std::vector<TargetInfo> const &targets, size_t const start_idx)¶

void eachCellInColumn(RowIterationState &state, CellCallback const &func)¶

const Executor *getExecutor() const¶

template<typename ENTRY_TYPE, QueryDescriptionType QUERY_TYPE, bool COLUMNAR_FORMAT> ENTRY_TYPE getEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const

template<typename ENTRY_TYPE> ENTRY_TYPE getColumnarPerfectHashEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const¶

Directly accesses the result set’s storage buffer for a particular data type (columnar output, perfect hash group by)

NOTE: Currently, only used in direct columnarization

template<typename ENTRY_TYPE> ENTRY_TYPE getRowWisePerfectHashEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const¶

Directly accesses the result set’s storage buffer for a particular data type (row-wise output, perfect hash group by)

NOTE: Currently, only used in direct columnarization

template<typename ENTRY_TYPE> ENTRY_TYPE getRowWiseBaselineEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const¶

Directly accesses the result set’s storage buffer for a particular data type (columnar output, baseline hash group by)

NOTE: Currently, only used in direct columnarization

template<typename ENTRY_TYPE> ENTRY_TYPE getColumnarBaselineEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const¶

Directly accesses the result set’s storage buffer for a particular data type (row-wise output, baseline hash group by)

NOTE: Currently, only used in direct columnarization

Public Members

friend ResultSet::ResultSetBuilder

Public Static Functions

QueryMemoryDescriptor fixupQueryMemoryDescriptor(const QueryMemoryDescriptor &query_mem_desc)¶

static std::unique_ptr<ResultSet> unserialize(const TSerializedRows &serialized_rows, const Executor *)¶

double calculateQuantile(quantile::TDigest *const t_digest)¶

Private Types

using ApproxQuantileBuffers = std::vector<std::vector<double>>¶

using SerializedVarlenBufferStorage = std::vector<std::string>¶

Private Functions

void advanceCursorToNextEntry(ResultSetRowIterator &iter) const¶

std::vector<TargetValue> getNextRowImpl(const bool translate_strings, const bool decimal_to_double) const¶

std::vector<TargetValue> getNextRowUnlocked(const bool translate_strings, const bool decimal_to_double) const¶

std::vector<TargetValue> getRowAt(const size_t index, const bool translate_strings, const bool decimal_to_double, const bool fixup_count_distinct_pointers, const std::vector<bool> &targets_to_skip = {}) const¶

template<typename ENTRY_TYPE> ENTRY_TYPE getColumnarPerfectHashEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const

template<typename ENTRY_TYPE> ENTRY_TYPE getRowWisePerfectHashEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const

template<typename ENTRY_TYPE> ENTRY_TYPE getRowWiseBaselineEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const

template<typename ENTRY_TYPE> ENTRY_TYPE getColumnarBaselineEntryAt(const size_t row_idx, const size_t target_idx, const size_t slot_idx) const

size_t binSearchRowCount() const¶

size_t parallelRowCount() const¶

size_t advanceCursorToNextEntry() const¶

void radixSortOnGpu(const std::list<Analyzer::OrderEntry> &order_entries) const¶

void radixSortOnCpu(const std::list<Analyzer::OrderEntry> &order_entries) const¶

TargetValue getTargetValueFromBufferRowwise(int8_t *rowwise_target_ptr, int8_t *keys_ptr, const size_t entry_buff_idx, const TargetInfo &target_info, const size_t target_logical_idx, const size_t slot_idx, const bool translate_strings, const bool decimal_to_double, const bool fixup_count_distinct_pointers) const¶

TargetValue getTargetValueFromBufferColwise(const int8_t *col_ptr, const int8_t *keys_ptr, const QueryMemoryDescriptor &query_mem_desc, const size_t local_entry_idx, const size_t global_entry_idx, const TargetInfo &target_info, const size_t target_logical_idx, const size_t slot_idx, const bool translate_strings, const bool decimal_to_double) const¶

TargetValue makeTargetValue(const int8_t *ptr, const int8_t compact_sz, const TargetInfo &target_info, const size_t target_logical_idx, const bool translate_strings, const bool decimal_to_double, const size_t entry_buff_idx) const¶

TargetValue makeVarlenTargetValue(const int8_t *ptr1, const int8_t compact_sz1, const int8_t *ptr2, const int8_t compact_sz2, const TargetInfo &target_info, const size_t target_logical_idx, const bool translate_strings, const size_t entry_buff_idx) const¶

TargetValue makeGeoTargetValue(const int8_t *geo_target_ptr, const size_t slot_idx, const TargetInfo &target_info, const size_t target_logical_idx, const size_t entry_buff_idx) const¶

InternalTargetValue getVarlenOrderEntry(const int64_t str_ptr, const size_t str_len) const¶

int64_t lazyReadInt(const int64_t ival, const size_t target_logical_idx, const StorageLookupResult &storage_lookup_result) const¶

std::pair<size_t, size_t> getStorageIndex(const size_t entry_idx) const¶: Returns (storageIdx, entryIdx) pair, where: storageIdx : 0 is storage_, storageIdx-1 is index into appended_storage_. entryIdx : local index into the storage object.

const std::vector<const int8_t *> &getColumnFrag(const size_t storge_idx, const size_t col_logical_idx, int64_t &global_idx) const¶

const VarlenOutputInfo *getVarlenOutputInfo(const size_t entry_idx) const¶

ResultSet::StorageLookupResult findStorage(const size_t entry_idx) const¶

Comparator createComparator(const std::list<Analyzer::OrderEntry> &order_entries, const PermutationView permutation, const Executor *executor, const bool single_threaded)¶

PermutationView initPermutationBuffer(PermutationView permutation, PermutationIdx const begin, PermutationIdx const end) const¶

void parallelTop(const std::list<Analyzer::OrderEntry> &order_entries, const size_t top_n, const Executor *executor)¶

void baselineSort(const std::list<Analyzer::OrderEntry> &order_entries, const size_t top_n, const Executor *executor)¶

void doBaselineSort(const ExecutorDeviceType device_type, const std::list<Analyzer::OrderEntry> &order_entries, const size_t top_n, const Executor *executor)¶

bool canUseFastBaselineSort(const std::list<Analyzer::OrderEntry> &order_entries, const size_t top_n)¶

size_t rowCountImpl(const bool force_parallel) const¶

Data_Namespace::DataMgr *getDataManager() const¶

int getGpuCount() const¶

void serializeProjection(TSerializedRows &serialized_rows) const¶

void serializeVarlenAggColumn(int8_t *buf, std::vector<std::string> &varlen_bufer) const¶

void serializeCountDistinctColumns(TSerializedRows&) const¶

void unserializeCountDistinctColumns(const TSerializedRows&)¶

void fixupCountDistinctPointers()¶

void create_active_buffer_set(CountDistinctSet &count_distinct_active_buffer_set) const¶

int64_t getDistinctBufferRefFromBufferRowwise(int8_t *rowwise_target_ptr, const TargetInfo &target_info) const¶

Private Members

const std::vector<TargetInfo> targets_¶

const ExecutorDeviceType device_type_¶

const int device_id_¶

QueryMemoryDescriptor query_mem_desc_¶

std::unique_ptr<ResultSetStorage> storage_¶

AppendedStorage appended_storage_¶

size_t crt_row_buff_idx_¶

size_t fetched_so_far_¶

size_t drop_first_¶

size_t keep_first_¶

std::shared_ptr<RowSetMemoryOwner> row_set_mem_owner_¶

Permutation permutation_¶

const Catalog_Namespace::Catalog *catalog_¶

unsigned block_size_ = {0}¶

unsigned grid_size_ = {0}¶

QueryExecutionTimings timings_¶

std::list<std::shared_ptr<Chunk_NS::Chunk>> chunks_¶

std::vector<std::shared_ptr<std::list<ChunkIter>>> chunk_iters_¶

std::vector<std::vector<int8_t>> literal_buffers_¶

std::vector<ColumnLazyFetchInfo> lazy_fetch_info_¶

std::vector<std::vector<std::vector<const int8_t *>>> col_buffers_¶

std::vector<std::vector<std::vector<int64_t>>> frag_offsets_¶

std::vector<std::vector<int64_t>> consistent_frag_sizes_¶

const std::shared_ptr<const Analyzer::Estimator> estimator_¶

Data_Namespace::AbstractBuffer *device_estimator_buffer_ = {nullptr}¶

int8_t *host_estimator_buffer_ = {nullptr}¶

Data_Namespace::DataMgr *data_mgr_¶

std::vector<SerializedVarlenBufferStorage> serialized_varlen_buffer_¶

bool separate_varlen_storage_valid_¶

std::string explanation_¶

const bool just_explain_¶

bool for_validation_only_¶

std::atomic<int64_t> cached_row_count_¶

std::mutex row_iteration_mutex_¶

GeoReturnType geo_return_type_¶

bool cached_¶

size_t query_exec_time_¶

QueryPlanHash query_plan_¶

std::unordered_set<size_t> input_table_keys_¶

std::vector<TargetMetaInfo> target_meta_info_¶

std::optional<bool> can_use_speculative_top_n_sort¶

Private Static Functions

bool isNull(const SQLTypeInfo &ti, const InternalTargetValue &val, const bool float_argument_input)¶

PermutationView topPermutation(PermutationView permutation, const size_t n, const Comparator &compare)¶

Friends

friend ResultSet::ResultSetManager

friend ResultSet::ResultSetRowIterator

friend ResultSet::ColumnarResults

struct ColumnWiseTargetAccessor¶

Public Functions

ColumnWiseTargetAccessor(const ResultSet *result_set)¶

void initializeOffsetsForStorage()¶

InternalTargetValue getColumnInternal(const int8_t *buff, const size_t entry_idx, const size_t target_logical_idx, const StorageLookupResult &storage_lookup_result) const¶

Public Members

std::vector<std::vector<TargetOffsets>> offsets_for_storage_¶

const ResultSet *result_set_¶

struct QueryExecutionTimings¶

Public Members

int64_t executor_queue_time = {0}¶

int64_t render_time = {0}¶

int64_t compilation_queue_time = {0}¶

int64_t kernel_queue_time = {0}¶

template<typename BUFFER_ITERATOR_TYPE> struct ResultSetComparator¶

Public Types

template<> using BufferIteratorType = BUFFER_ITERATOR_TYPE¶

Public Functions

ResultSetComparator(const std::list<Analyzer::OrderEntry> &order_entries, const ResultSet *result_set, const PermutationView permutation, const Executor *executor, const bool single_threaded)¶

void materializeCountDistinctColumns()¶

ResultSet::ApproxQuantileBuffers materializeApproxQuantileColumns() const¶

std::vector<int64_t> materializeCountDistinctColumn(const Analyzer::OrderEntry &order_entry) const¶

ResultSet::ApproxQuantileBuffers::value_type materializeApproxQuantileColumn(const Analyzer::OrderEntry &order_entry) const¶

bool operator()(const PermutationIdx lhs, const PermutationIdx rhs) const¶

Public Members

const std::list<Analyzer::OrderEntry> &order_entries_¶

const ResultSet *result_set_¶

const PermutationView permutation_¶

const BufferIteratorType buffer_itr_¶

const Executor *executor_¶

const bool single_threaded_¶

std::vector<std::vector<int64_t>> count_distinct_materialized_buffers_¶

const ApproxQuantileBuffers approx_quantile_materialized_buffers_¶

struct RowIterationState¶

Public Members

size_t prev_target_idx_ = {0}¶

size_t cur_target_idx_¶

size_t agg_idx_ = {0}¶

int8_t const *buf_ptr_ = {nullptr}¶

int8_t compact_sz1_¶

struct RowWiseTargetAccessor¶

Public Functions

RowWiseTargetAccessor(const ResultSet *result_set)¶

InternalTargetValue getColumnInternal(const int8_t *buff, const size_t entry_idx, const size_t target_logical_idx, const StorageLookupResult &storage_lookup_result) const¶

void initializeOffsetsForStorage()¶

const int8_t *get_rowwise_ptr(const int8_t *buff, const size_t entry_idx) const¶

Public Members

std::vector<std::vector<TargetOffsets>> offsets_for_storage_¶

const ResultSet *result_set_¶

const size_t row_bytes_¶

const size_t key_width_¶

const size_t key_bytes_with_padding_¶

struct StorageLookupResult¶

Public Members

const ResultSetStorage *storage_ptr¶

const size_t fixedup_entry_idx¶

const size_t storage_idx¶

struct TargetOffsets¶

Public Members

const int8_t *ptr1¶

const size_t compact_sz1¶

const int8_t *ptr2¶

const size_t compact_sz2¶

struct VarlenTargetPtrPair¶

Public Functions

VarlenTargetPtrPair()¶

Public Members

int8_t *ptr1¶

int8_t compact_sz1¶

int8_t *ptr2¶

int8_t compact_sz2¶