Class File_Namespace::CachingFileMgr¶
-
class
CachingFileMgr
: public File_Namespace::FileMgr¶ A FileMgr capable of limiting its size and storing data from multiple tables in a shared directory. For any table that supports DiskCaching, the CachingFileMgr must contain metadata either for all of the table's chunks or for none of them (the cache either has no knowledge of that table or has complete knowledge of it). Any data chunk within a table may or may not be contained within the cache.
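The all-or-none metadata rule described above can be stated as a small check. This is a minimal sketch, not the class's own code: it assumes a chunk key is a vector of ints, and `metadataIsAllOrNone` is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Assumption: a chunk key is a vector of ints such as {db_id, tb_id, ...}.
using ChunkKey = std::vector<int>;

// Hypothetical helper: returns true when the cache holds metadata for every
// chunk of the table, or for none of them.
bool metadataIsAllOrNone(const std::set<ChunkKey>& table_chunks,
                         const std::set<ChunkKey>& cached_metadata) {
  std::size_t present = 0;
  for (const auto& key : table_chunks) {
    present += cached_metadata.count(key);  // count() is 0 or 1 for a set
  }
  return present == 0 || present == table_chunks.size();
}
```

A cache holding metadata for only some of a table's chunks would fail this check, while the presence or absence of the chunk data itself plays no part in it.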
Public Functions
-
CachingFileMgr
(const DiskCacheConfig &config)¶
-
~CachingFileMgr
()¶
-
MgrType
getMgrType
()¶
-
std::string
getStringMgrType
()¶
-
size_t
getDefaultPageSize
()¶
-
size_t
getMaxSize
()¶
-
size_t
getMaxDataFiles
() const¶
-
size_t
getMaxMetaFiles
() const¶
-
size_t
getMaxWrapperSize
() const¶
-
size_t
getDataFileSize
() const¶
-
size_t
getMetadataFileSize
() const¶
-
size_t
getNumDataFiles
() const¶
-
size_t
getNumMetaFiles
() const¶
-
size_t
getAvailableSpace
()¶
-
size_t
getAvailableWrapperSpace
()¶
-
size_t
getAllocated
()¶
-
size_t
getMaxDataFilesSize
() const¶
-
void
removeChunkKeepMetadata
(const ChunkKey &key)¶ Free pages for chunk and remove it from the chunk eviction algorithm.
-
void
clearForTable
(int32_t db_id, int32_t tb_id)¶ Removes all data related to the given table (pages and subdirectories).
-
bool
hasFileMgrKey
() const¶ Query to determine if the contained pages will have their database and table ids overridden by the filemgr key (FileMgr does this).
-
void
closeRemovePhysical
()¶ Closes files and removes the caching directory.
-
size_t
getChunkSpaceReservedByTable
(int32_t db_id, int32_t tb_id) const¶ Set of functions to determine how much space is reserved in a table by type.
-
size_t
getMetadataSpaceReservedByTable
(int32_t db_id, int32_t tb_id) const¶
-
size_t
getTableFileMgrSpaceReserved
(int32_t db_id, int32_t tb_id) const¶
-
size_t
getSpaceReservedByTable
(int32_t db_id, int32_t tb_id) const¶
-
void
checkpoint
(const int32_t db_id, const int32_t tb_id)¶ Writes buffers for the given table, synchronizes files to disk, updates the file epoch, and commits free pages.
-
int32_t
epoch
(int32_t db_id, int32_t tb_id) const¶ Obtains the epoch version for the given table.
-
FileBuffer *
putBuffer
(const ChunkKey &key, AbstractBuffer *srcBuffer, const size_t numBytes = 0)¶ Deletes any existing buffer for the given key, then copies in a new one.
putBuffer() needs to behave differently than it does in FileMgr. Specifically, it needs to delete the buffer beforehand and then append, rather than overwrite the existing buffer. This way we only store a single version of the buffer rather than accumulating versions that need to be rolled off.
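The difference between the two strategies can be illustrated with a toy store. This is only a sketch under stated assumptions: `ToyStore`, `putKeepingVersions`, and `putReplacing` are hypothetical names, a chunk key is assumed to be a vector of ints, and strings stand in for buffer contents.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumption: a chunk key is a vector of ints.
using ChunkKey = std::vector<int>;

// Toy store: each key maps to the list of versions currently kept.
struct ToyStore {
  std::map<ChunkKey, std::vector<std::string>> versions;

  // Versioned overwrite: the old copy stays until it is rolled off later.
  void putKeepingVersions(const ChunkKey& key, const std::string& data) {
    versions[key].push_back(data);
  }

  // Delete-then-append: drop whatever exists, then write the new copy.
  void putReplacing(const ChunkKey& key, const std::string& data) {
    versions.erase(key);
    versions[key].push_back(data);
  }

  // Number of versions currently stored for a key (0 if the key is absent).
  std::size_t versionCount(const ChunkKey& key) const {
    auto it = versions.find(key);
    return it == versions.end() ? 0 : it->second.size();
  }
};
```

Writing the same key twice with `putReplacing` leaves one stored version, whereas `putKeepingVersions` leaves two that something else would have to clean up.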
-
CachingFileBuffer *
allocateBuffer
(const size_t page_size, const ChunkKey &key, const size_t num_bytes = 0)¶ Allocates a new CachingFileBuffer and tracks its use in the eviction algorithms.
-
CachingFileBuffer *
allocateBuffer
(const ChunkKey &key, const std::vector<HeaderInfo>::const_iterator &headerStartIt, const std::vector<HeaderInfo>::const_iterator &headerEndIt)¶
-
bool
updatePageIfDeleted
(FileInfo *file_info, ChunkKey &chunk_key, int32_t contingent, int32_t page_epoch, int32_t page_num)¶ Checks whether a page should be deleted.
-
bool
failOnReadError
() const¶ True if a read error should cause a fatal error.
-
void
deleteBufferIfExists
(const ChunkKey &key)¶ Deletes a buffer if it exists in the mgr. Otherwise does nothing.
-
size_t
getNumChunksWithMetadata
() const¶ Returns the number of buffers with metadata in the CFM. Any buffer with an encoder counts.
-
size_t
getNumDataChunks
() const¶ Returns the number of buffers with chunk data in the CFM.
-
std::vector<ChunkKey>
getChunkKeysForPrefix
(const ChunkKey &prefix) const¶ Returns the keys for chunks with chunk data that match the given prefix.
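Prefix matching on chunk keys might look like the following. This is a minimal sketch that assumes a chunk key is a vector of ints; `hasPrefix` and `keysForPrefix` are hypothetical free functions, and a real lookup over an ordered index would not need a full scan.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Assumption: a chunk key is a vector of ints such as {db_id, tb_id, ...}.
using ChunkKey = std::vector<int>;

// True when every element of prefix matches the start of key.
bool hasPrefix(const ChunkKey& key, const ChunkKey& prefix) {
  return prefix.size() <= key.size() &&
         std::equal(prefix.begin(), prefix.end(), key.begin());
}

// Linear scan returning the keys that start with the given prefix.
std::vector<ChunkKey> keysForPrefix(const std::vector<ChunkKey>& keys,
                                    const ChunkKey& prefix) {
  std::vector<ChunkKey> matches;
  std::copy_if(keys.begin(), keys.end(), std::back_inserter(matches),
               [&prefix](const ChunkKey& key) { return hasPrefix(key, prefix); });
  return matches;
}
```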
-
std::unique_ptr<CachingFileMgr>
reconstruct
() const¶ Initializes a new CFM using the initialization values in the current CFM.
-
void
deleteWrapperFile
(int32_t db, int32_t tb)¶ Deletes the wrapper file from a table subdir.
-
void
writeWrapperFile
(const std::string &doc, int32_t db, int32_t tb)¶ Writes a wrapper file to a table subdir.
-
std::string
getTableFileMgrPath
(int32_t db, int32_t tb) const¶
-
size_t
getFilesSize
() const¶ Get the total size of page files (data and metadata files). This includes allocated, but unused space.
-
size_t
getTableFileMgrsSize
() const¶ Returns the total size of all subdirectory files. Each table represented in the CFM has a subdirectory for serialized data wrappers and epoch files.
-
std::optional<FileBuffer *>
getBufferIfExists
(const ChunkKey &key)¶ An optional-returning version of getBuffer, for use when we are not sure a chunk exists.
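The optional-returning lookup pattern can be sketched with a toy manager. `ToyMgr` and `ToyBuffer` are hypothetical stand-ins rather than the real classes, and a chunk key is assumed to be a vector of ints; only the `std::optional` return shape mirrors the signature above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

// Assumption: a chunk key is a vector of ints.
using ChunkKey = std::vector<int>;

// Hypothetical stand-in for a file buffer.
struct ToyBuffer {
  std::size_t size = 0;
};

class ToyMgr {
 public:
  void add(const ChunkKey& key, std::size_t size) { buffers_[key] = ToyBuffer{size}; }

  // Returns an empty optional instead of failing when the key is unknown.
  std::optional<ToyBuffer*> getBufferIfExists(const ChunkKey& key) {
    auto it = buffers_.find(key);
    if (it == buffers_.end()) {
      return std::nullopt;
    }
    return &it->second;
  }

 private:
  std::map<ChunkKey, ToyBuffer> buffers_;
};
```

A caller tests the result with `has_value()` before dereferencing, so a missing chunk becomes an ordinary branch rather than an error path.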
-
void
free_page
(std::pair<FileInfo *, int32_t> &&page)¶ Unlike the FileMgr, the CFM frees pages immediately instead of holding them until the next checkpoint.
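The contrast between immediate and checkpoint-deferred freeing can be modeled with a toy page pool. This is a sketch only: `ToyPagePool` and its members are hypothetical, and pages are reduced to integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical page pool that supports both freeing strategies.
struct ToyPagePool {
  std::size_t free_pages = 0;  // pages available for reuse right now
  std::vector<int> pending;    // pages waiting for the next checkpoint

  // Deferred style: the page is only reusable after a checkpoint.
  void freeDeferred(int page) { pending.push_back(page); }

  // Immediate style: the page is reusable as soon as it is freed.
  void freeImmediate(int page) {
    (void)page;  // the id is not needed in this toy model
    ++free_pages;
  }

  // A checkpoint releases everything that was deferred.
  void checkpoint() {
    free_pages += pending.size();
    pending.clear();
  }
};
```

With immediate freeing, space released by an eviction can be handed out again in the same operation, which matters for a cache that evicts in order to make room.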
-
void
getChunkMetadataVecForKeyPrefix
(ChunkMetadataVector &chunkMetadataVec, const ChunkKey &keyPrefix)¶
-
std::string
dumpKeysWithMetadata
() const¶
-
std::string
dumpKeysWithChunkData
() const¶
-
std::string
dumpTableQueue
() const¶
-
std::string
dumpEvictionQueue
() const¶
-
std::string
dump
() const¶
-
void
setMaxNumDataFiles
(size_t max)¶
-
void
setMaxNumMetadataFiles
(size_t max)¶
-
void
setMaxWrapperSpace
(size_t max)¶
-
std::set<ChunkKey>
getKeysWithMetadata
() const¶
-
void
setDataSizeLimit
(size_t max)¶
Public Static Functions
-
static size_t
getMinimumSize
()¶
Public Static Attributes
-
constexpr char
WRAPPER_FILE_NAME
[] = "wrapper_metadata.json"¶
-
constexpr float
METADATA_SPACE_PERCENTAGE
= {0.1}¶
-
constexpr float
METADATA_FILE_SPACE_PERCENTAGE
= {0.01}¶
Private Functions
-
void
incrementEpoch
(int32_t db_id, int32_t tb_id)¶ Increments epoch for the given table.
-
void
init
(const size_t num_reader_threads)¶ Initializes a CFM, parsing any existing files and initializing data structures appropriately (currently not thread-safe).
-
void
writeAndSyncEpochToDisk
(int32_t db_id, int32_t tb_id)¶ Flushes epoch value to disk for a table.
-
void
readTableFileMgrs
()¶ Checks for any sub-directories containing table-specific data and creates epochs from found files.
-
FileBuffer *
createBufferFromHeaders
(const ChunkKey &key, const std::vector<HeaderInfo>::const_iterator &startIt, const std::vector<HeaderInfo>::const_iterator &endIt)¶ Creates a buffer and initializes it with info read from files on disk.
-
FileBuffer *
createBufferUnlocked
(const ChunkKey &key, size_t pageSize = 0, const size_t numBytes = 0)¶ Creates a buffer.
-
void
createTableFileMgrIfNoneExists
(const int32_t db_id, const int32_t tb_id)¶ Create and initialize a subdirectory for a table if none exists.
-
void
incrementAllEpochs
()¶ Increment epochs for each table in the CFM.
-
void
removeTableFileMgr
(int32_t db_id, int32_t tb_id)¶ Removes the subdirectory content for a table.
-
void
removeTableBuffers
(int32_t db_id, int32_t tb_id)¶ Erases and cleans up all buffers for a table.
-
void
writeDirtyBuffers
(int32_t db_id, int32_t tb_id)¶ Helper function to flush all dirty buffers to disk.
-
Page
requestFreePage
(size_t pagesize, const bool isMetadata)¶ Requests a free page as FileMgr does, but this override also evicts existing pages to make space if none are available.
-
void
touchKey
(const ChunkKey &key) const¶ Used to track which tables/chunks were least recently used.
-
void
removeKey
(const ChunkKey &key) const¶
-
std::vector<ChunkKey>
getKeysForTable
(int32_t db_id, int32_t tb_id) const¶ Returns the keys contained in chunkIndex_ that match the given table prefix.
-
FileInfo *
evictMetadataPages
()¶ Evicts all metadata pages for the least recently used table. Returns the first FileInfo that a page was evicted from (guaranteed to now have at least one free page in it).
-
FileInfo *
evictPages
()¶ Evicts all data pages for the least recently used chunk (metadata pages persist). Returns the first FileInfo that a page was evicted from (guaranteed to now have at least one free page in it).
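Least-recently-used tracking of the kind that touchKey, removeKey, and evictPages rely on can be sketched as follows. `ToyLru` is a hypothetical class, not the actual LRUEvictionAlgorithm, and a chunk key is assumed to be a vector of ints.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <map>
#include <optional>
#include <vector>

// Assumption: a chunk key is a vector of ints.
using ChunkKey = std::vector<int>;

class ToyLru {
 public:
  // Marks a key as most recently used (moves it to the back of the queue).
  void touch(const ChunkKey& key) {
    remove(key);
    order_.push_back(key);
    pos_[key] = std::prev(order_.end());
  }

  // Stops tracking a key; does nothing if the key is unknown.
  void remove(const ChunkKey& key) {
    auto it = pos_.find(key);
    if (it == pos_.end()) {
      return;
    }
    order_.erase(it->second);
    pos_.erase(it);
  }

  // Removes and returns the least recently used key, if any.
  std::optional<ChunkKey> evictNext() {
    if (order_.empty()) {
      return std::nullopt;
    }
    ChunkKey key = order_.front();  // copy before the list node is erased
    remove(key);
    return key;
  }

 private:
  std::list<ChunkKey> order_;  // front is least recently used
  std::map<ChunkKey, std::list<ChunkKey>::iterator> pos_;
};
```

In this model a chunk that is touched again moves to the back of the queue, so the chunk evicted first is the one that has gone longest without use.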
-
void
deleteCacheIfTooLarge
()¶ When the cache is read from disk, we don’t know which chunks were least recently used. Rather than try to evict random pages to get down to size, we just reset the cache to make sure we have space.
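The reset-on-load policy can be sketched in a few lines. `ToyCacheState` and `resetIfTooLarge` are hypothetical, and the strict greater-than comparison is an assumption made for this sketch rather than the documented threshold.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of a cache that was just read back from disk.
struct ToyCacheState {
  std::size_t bytes_on_disk = 0;
  std::size_t max_bytes = 0;
};

// With no usage history to guide eviction, drop everything when over the
// limit. Returns true if the cache was reset.
bool resetIfTooLarge(ToyCacheState& cache) {
  if (cache.bytes_on_disk > cache.max_bytes) {  // assumed comparison
    cache.bytes_on_disk = 0;
    return true;
  }
  return false;
}
```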
-
void
setMaxSizes
()¶ Sets the maximum number of files/space for each type of storage based on the maximum size.
-
FileBuffer *
getBufferUnlocked
(const ChunkKeyToChunkMap::iterator chunk_it, const size_t numBytes = 0)¶
-
ChunkKeyToChunkMap::iterator
deleteBufferUnlocked
(const ChunkKeyToChunkMap::iterator chunk_it, const bool purge = true)¶
Private Members
-
mapd_shared_mutex
table_dirs_mutex_
¶
-
std::map<TablePair, std::unique_ptr<TableFileMgr>>
table_dirs_
¶
-
size_t
max_num_data_files_
¶
-
size_t
max_num_meta_files_
¶
-
size_t
max_wrapper_space_
¶
-
size_t
max_size_
¶
-
std::optional<size_t>
limit_data_size_
= {}¶
-
LRUEvictionAlgorithm
chunk_evict_alg_
¶
-
LRUEvictionAlgorithm
table_evict_alg_
¶
-