Class foreign_storage::ParquetStringEncoder
-
template<typename V>
class ParquetStringEncoder : public foreign_storage::TypedParquetInPlaceEncoder<V, V>
Public Functions
-
ParquetStringEncoder(Data_Namespace::AbstractBuffer *buffer, StringDictionary *string_dictionary, ChunkMetadata *chunk_metadata)
-
void validateAndAppendData(const int16_t *def_levels, const int16_t *rep_levels, const int64_t values_read, const int64_t levels_read, int8_t *values, const SQLTypeInfo &column_type, InvalidRowGroupIndices &invalid_indices)
-
void appendDataTrackErrors(const int16_t *def_levels, const int16_t *rep_levels, const int64_t values_read, const int64_t levels_read, int8_t *values)
-
void appendData(const int16_t *def_levels, const int16_t *rep_levels, const int64_t values_read, const int64_t levels_read, int8_t *values)
Appends Parquet data to the buffer using an in-place algorithm. Any necessary transformation or validation of the data, and the decoding of nulls, is part of appending the data. Each class inheriting from this abstract class must implement the functionality to copy, nullify and encode the data.
Note that the Parquet format encodes nulls using Dremel encoding.
- Parameters
def_levels: an array containing the Dremel encoding definition levels
rep_levels: an array containing the Dremel encoding repetition levels
values_read: the number of non-null values read
levels_read: the total number of values (non-null and null) that are read
values: the values that are read
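The relationship between values_read, levels_read and def_levels can be illustrated with a minimal sketch. It is not the encoder's actual implementation: the function name expandNullsInPlace and the constant kNullSentinel are hypothetical, the element type is fixed to int32_t for simplicity, and the column is assumed to be flat and optional, so that a definition level of 1 means "present" and 0 means "null". Under those assumptions, the packed non-null values can be spread over all slots in place by walking backwards:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical null sentinel for this sketch; a real encoder would derive
// its null representation from the column type.
constexpr int32_t kNullSentinel = std::numeric_limits<int32_t>::min();

// Expands `values_read` densely packed non-null values at the front of
// `values` into `levels_read` slots, writing kNullSentinel wherever the
// definition level marks a null. Assumes a flat, optional column, so a
// definition level of 1 means "present" and 0 means "null".
// `values` must have room for `levels_read` elements.
void expandNullsInPlace(const int16_t* def_levels,
                        int64_t values_read,
                        int64_t levels_read,
                        int32_t* values) {
  assert(values_read <= levels_read);
  int64_t src = values_read - 1;  // index of the last packed non-null value
  // Walk backwards so that no packed value is overwritten before it is moved.
  for (int64_t dst = levels_read - 1; dst >= 0; --dst) {
    if (def_levels[dst] > 0) {
      assert(src >= 0);
      values[dst] = values[src--];
    } else {
      values[dst] = kNullSentinel;
    }
  }
}
```

For example, with definition levels {1, 0, 1, 0, 1}, values_read = 3 and levels_read = 5, a buffer that starts with the packed values 10, 20, 30 ends up holding 10, null, 20, null, 30. Iterating from the back is what makes the expansion safe without a second buffer: the destination index is never smaller than the source index.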
-
void encodeAndCopyContiguous(const int8_t *parquet_data_bytes, int8_t *omnisci_data_bytes, const size_t num_elements)
-
void encodeAndCopy(const int8_t *parquet_data_bytes, int8_t *omnisci_data_bytes)
-
std::shared_ptr<ChunkMetadata> getRowGroupMetadata(const parquet::RowGroupMetaData *group_metadata, const int parquet_column_index, const SQLTypeInfo &column_type)
Protected Functions
-
bool encodingIsIdentityForSameTypes() const
Private Functions
-
void updateMetadataStats(int64_t values_read, int8_t *values)
Private Members
-
StringDictionary *string_dictionary_
-
ChunkMetadata *chunk_metadata_
-
std::vector<int8_t> encode_buffer_
-
V min_
-
V max_
-
int64_t current_batch_offset_
-
InvalidRowGroupIndices *invalid_indices_
-