Class File_Namespace::CachingFileMgr¶
- 
class 
CachingFileMgr: public File_Namespace::FileMgr¶ A FileMgr capable of limiting its size and storing data from multiple tables in a shared directory. For any table that supports DiskCaching, the CachingFileMgr must contain either metadata for all table chunks, or for none (the cache either has no knowledge of that table, or has complete knowledge of that table). Any data chunk within a table may or may not be contained within the cache.
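The all-or-nothing metadata rule above can be written as a small predicate. This is only an illustrative sketch: `TableCacheView` and `satisfiesMetadataInvariant` are hypothetical names, not part of CachingFileMgr, and the struct simply summarizes what a cache might know about one table.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of one table as seen by a cache (not a real CFM type).
struct TableCacheView {
  std::size_t chunks_in_table;       // number of chunks the table has
  std::size_t chunks_with_metadata;  // chunks whose metadata is cached
};

// True when the cache knows either nothing or everything about the table's
// chunk metadata. Cached chunk data is deliberately not constrained here,
// since any data chunk may or may not be present.
bool satisfiesMetadataInvariant(const TableCacheView& view) {
  assert(view.chunks_with_metadata <= view.chunks_in_table);
  const bool unknown_table = view.chunks_with_metadata == 0;
  const bool complete_table = view.chunks_with_metadata == view.chunks_in_table;
  return unknown_table || complete_table;
}
```

Under this sketch, a table with 10 chunks is acceptable with metadata for 0 or 10 of them, but not for 4.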
Public Functions
- 
CachingFileMgr(const DiskCacheConfig &config)¶ 
- 
~CachingFileMgr()¶ 
- 
MgrType 
getMgrType()¶ 
- 
std::string 
getStringMgrType()¶ 
- 
size_t 
getDefaultPageSize()¶ 
- 
size_t 
getMaxSize()¶ 
- 
size_t 
getMaxDataFiles() const¶ 
- 
size_t 
getMaxMetaFiles() const¶ 
- 
size_t 
getMaxWrapperSize() const¶ 
- 
size_t 
getDataFileSize() const¶ 
- 
size_t 
getMetadataFileSize() const¶ 
- 
size_t 
getNumDataFiles() const¶ 
- 
size_t 
getNumMetaFiles() const¶ 
- 
size_t 
getAvailableSpace()¶ 
- 
size_t 
getAvailableWrapperSpace()¶ 
- 
size_t 
getAllocated()¶ 
- 
size_t 
getMaxDataFilesSize() const¶ 
- 
void 
removeChunkKeepMetadata(const ChunkKey &key)¶ Frees the pages for the chunk and removes it from the chunk eviction algorithm.
- 
void 
clearForTable(int32_t db_id, int32_t tb_id)¶ Removes all data related to the given table (pages and subdirectories).
- 
bool 
hasFileMgrKey() const¶ Query to determine if the contained pages will have their database and table ids overridden by the filemgr key (FileMgr does this).
- 
void 
closeRemovePhysical()¶ Closes files and removes the caching directory.
- 
size_t 
getChunkSpaceReservedByTable(int32_t db_id, int32_t tb_id) const¶ One of a set of functions that determine how much space is reserved for a table, by type.
- 
size_t 
getMetadataSpaceReservedByTable(int32_t db_id, int32_t tb_id) const¶ 
- 
size_t 
getTableFileMgrSpaceReserved(int32_t db_id, int32_t tb_id) const¶ 
- 
size_t 
getSpaceReservedByTable(int32_t db_id, int32_t tb_id) const¶ 
- 
void 
checkpoint(const int32_t db_id, const int32_t tb_id)¶ Writes buffers for the given table, synchronizes files to disk, updates the file epoch, and commits free pages.
- 
int32_t 
epoch(int32_t db_id, int32_t tb_id) const¶ Obtains the epoch version for the given table.
- 
FileBuffer *
putBuffer(const ChunkKey &key, AbstractBuffer *srcBuffer, const size_t numBytes = 0)¶ Deletes any existing buffer for the given key, then copies in a new one.
putBuffer() needs to behave differently than it does in FileMgr. Specifically, it needs to delete the buffer beforehand and then append, rather than overwrite the existing buffer. This way we only store a single version of the buffer rather than accumulating versions that need to be rolled off.
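The difference between the two put styles can be sketched with a toy store. Everything here is an assumption made for illustration: `ToyStore`, `putVersioned`, and `putReplacing` are hypothetical names, `Key` stands in for ChunkKey, and a string stands in for buffer contents. It is not how FileMgr or CachingFileMgr store pages.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Key = std::vector<int>;  // hypothetical stand-in for ChunkKey

// Toy store that records every version written for a key.
class ToyStore {
 public:
  // Versioning put: each write adds a version; old ones linger until rolled off.
  void putVersioned(const Key& key, const std::string& bytes) {
    versions_[key].push_back(bytes);
  }

  // Cache-style put: delete whatever exists for the key, then append.
  void putReplacing(const Key& key, const std::string& bytes) {
    versions_.erase(key);
    versions_[key].push_back(bytes);
  }

  std::size_t versionCount(const Key& key) const {
    const auto it = versions_.find(key);
    return it == versions_.end() ? 0 : it->second.size();
  }

  // Most recent contents for a key that must already exist.
  const std::string& latest(const Key& key) const {
    const auto it = versions_.find(key);
    assert(it != versions_.end() && !it->second.empty());
    return it->second.back();
  }

 private:
  std::map<Key, std::vector<std::string>> versions_;
};
```

With the replacing style, `versionCount` for a key never exceeds one, so there is nothing to roll off later.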
- 
CachingFileBuffer *
allocateBuffer(const size_t page_size, const ChunkKey &key, const size_t num_bytes = 0)¶ Allocates a new CachingFileBuffer and tracks its use in the eviction algorithms.
- 
CachingFileBuffer *
allocateBuffer(const ChunkKey &key, const std::vector<HeaderInfo>::const_iterator &headerStartIt, const std::vector<HeaderInfo>::const_iterator &headerEndIt)¶ 
- 
bool 
updatePageIfDeleted(FileInfo *file_info, ChunkKey &chunk_key, int32_t contingent, int32_t page_epoch, int32_t page_num)¶ Checks whether a page should be deleted.
- 
bool 
failOnReadError() const¶ True if a read error should cause a fatal error.
- 
void 
deleteBufferIfExists(const ChunkKey &key)¶ Deletes a buffer if it exists in the mgr; otherwise does nothing.
- 
size_t 
getNumChunksWithMetadata() const¶ Returns the number of buffers with metadata in the CFM. Any buffer with an encoder counts.
- 
size_t 
getNumDataChunks() const¶ Returns the number of buffers with chunk data in the CFM.
- 
std::vector<ChunkKey> 
getChunkKeysForPrefix(const ChunkKey &prefix) const¶ Returns the keys for chunks with chunk data that match the given prefix.
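Prefix matching on chunk keys can be sketched as follows, assuming a chunk key is an ordered sequence of integers that begins with the database and table ids. `Key`, `hasPrefix`, and `keysForPrefix` are hypothetical helpers for illustration, not the CachingFileMgr implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

using Key = std::vector<int>;  // hypothetical stand-in for ChunkKey

// True when `key` starts with every element of `prefix`, in order.
bool hasPrefix(const Key& key, const Key& prefix) {
  return prefix.size() <= key.size() &&
         std::equal(prefix.begin(), prefix.end(), key.begin());
}

// Collects the keys that match the prefix, in the set's sorted order.
std::vector<Key> keysForPrefix(const std::set<Key>& keys, const Key& prefix) {
  std::vector<Key> matches;
  for (const auto& key : keys) {
    if (hasPrefix(key, prefix)) {
      matches.push_back(key);
    }
  }
  return matches;
}
```

A linear scan keeps the sketch short; an ordered container would also allow a range lookup starting at the prefix.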
- 
std::unique_ptr<CachingFileMgr> 
reconstruct() const¶ Initializes a new CFM using the initialization values in the current CFM.
- 
void 
deleteWrapperFile(int32_t db, int32_t tb)¶ Deletes the wrapper file from a table subdir.
- 
void 
writeWrapperFile(const std::string &doc, int32_t db, int32_t tb)¶ Writes a wrapper file to a table subdir.
- 
std::string 
getTableFileMgrPath(int32_t db, int32_t tb) const¶ 
- 
size_t 
getFilesSize() const¶ Get the total size of page files (data and metadata files). This includes allocated, but unused space.
- 
size_t 
getTableFileMgrsSize() const¶ Returns the total size of all subdirectory files. Each table represented in the CFM has a subdirectory for serialized data wrappers and epoch files.
- 
std::optional<FileBuffer *> 
getBufferIfExists(const ChunkKey &key)¶ An optional-returning version of get buffer, for callers that are not sure a chunk exists.
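The optional-returning lookup pattern looks roughly like this. `ToyMgr` and `ToyBuffer` are hypothetical stand-ins; only the return shape, `std::optional` wrapping a pointer, mirrors the signature above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

using Key = std::vector<int>;  // hypothetical stand-in for ChunkKey

struct ToyBuffer {
  std::size_t size = 0;
};

class ToyMgr {
 public:
  void add(const Key& key, std::size_t size) { buffers_[key] = ToyBuffer{size}; }

  // Returns the buffer when the key is known, or an empty optional otherwise,
  // so callers can probe without triggering an error path.
  std::optional<ToyBuffer*> getBufferIfExists(const Key& key) {
    const auto it = buffers_.find(key);
    if (it == buffers_.end()) {
      return std::nullopt;
    }
    return &it->second;
  }

 private:
  std::map<Key, ToyBuffer> buffers_;
};
```

A caller would typically write `if (auto buf = mgr.getBufferIfExists(key)) { use(*buf); }`, where `use` is whatever the caller does with the pointer.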
- 
void 
free_page(std::pair<FileInfo *, int32_t> &&page)¶ Unlike the FileMgr, the CFM frees pages immediately instead of holding them until the next checkpoint.
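The two freeing policies can be contrasted with a toy model. `ToyFile`, `DeferredFreer`, and `ImmediateFreer` are hypothetical names used only for this sketch; the real classes manage page headers and files on disk.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a page file: it only tracks free page numbers.
struct ToyFile {
  std::set<int> free_pages;
};

// Deferred policy: freed pages wait in a pending list until checkpoint().
class DeferredFreer {
 public:
  void freePage(std::pair<ToyFile*, int> page) { pending_.push_back(page); }

  void checkpoint() {
    for (const auto& page : pending_) {
      page.first->free_pages.insert(page.second);
    }
    pending_.clear();
  }

 private:
  std::vector<std::pair<ToyFile*, int>> pending_;
};

// Immediate policy: the page becomes reusable as soon as it is freed.
class ImmediateFreer {
 public:
  void freePage(std::pair<ToyFile*, int> page) {
    assert(page.first != nullptr);
    page.first->free_pages.insert(page.second);
  }
};
```

In this model, the immediate policy makes space available to the next page request without waiting for a checkpoint.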
- 
void 
getChunkMetadataVecForKeyPrefix(ChunkMetadataVector &chunkMetadataVec, const ChunkKey &keyPrefix)¶ 
- 
std::string 
dumpKeysWithMetadata() const¶ 
- 
std::string 
dumpKeysWithChunkData() const¶ 
- 
std::string 
dumpTableQueue() const¶ 
- 
std::string 
dumpEvictionQueue() const¶ 
- 
std::string 
dump() const¶ 
- 
void 
setMaxNumDataFiles(size_t max)¶ 
- 
void 
setMaxNumMetadataFiles(size_t max)¶ 
- 
void 
setMaxWrapperSpace(size_t max)¶ 
- 
std::set<ChunkKey> 
getKeysWithMetadata() const¶ 
- 
void 
setDataSizeLimit(size_t max)¶ 
Public Static Functions
- 
static size_t 
getMinimumSize()¶ 
Public Static Attributes
- 
constexpr char 
WRAPPER_FILE_NAME[] = "wrapper_metadata.json"¶ 
- 
constexpr float 
METADATA_SPACE_PERCENTAGE= {0.1}¶ 
- 
constexpr float 
METADATA_FILE_SPACE_PERCENTAGE= {0.01}¶ 
Private Functions
- 
void 
incrementEpoch(int32_t db_id, int32_t tb_id)¶ Increments epoch for the given table.
- 
void 
init(const size_t num_reader_threads)¶ Initializes a CFM, parsing any existing files and initializing data structures appropriately (currently not thread-safe).
- 
void 
writeAndSyncEpochToDisk(int32_t db_id, int32_t tb_id)¶ Flushes epoch value to disk for a table.
- 
void 
readTableFileMgrs()¶ Checks for any sub-directories containing table-specific data and creates epochs from found files.
- 
FileBuffer *
createBufferFromHeaders(const ChunkKey &key, const std::vector<HeaderInfo>::const_iterator &startIt, const std::vector<HeaderInfo>::const_iterator &endIt)¶ Creates a buffer and initializes it with info read from files on disk.
- 
FileBuffer *
createBufferUnlocked(const ChunkKey &key, size_t pageSize = 0, const size_t numBytes = 0)¶ Creates a buffer.
- 
void 
createTableFileMgrIfNoneExists(const int32_t db_id, const int32_t tb_id)¶ Create and initialize a subdirectory for a table if none exists.
- 
void 
incrementAllEpochs()¶ Increment epochs for each table in the CFM.
- 
void 
removeTableFileMgr(int32_t db_id, int32_t tb_id)¶ Removes the subdirectory content for a table.
- 
void 
removeTableBuffers(int32_t db_id, int32_t tb_id)¶ Erases and cleans up all buffers for a table.
- 
void 
writeDirtyBuffers(int32_t db_id, int32_t tb_id)¶ Helper function to flush all dirty buffers to disk.
- 
Page 
requestFreePage(size_t pagesize, const bool isMetadata)¶ Requests a free page, similar to FileMgr, but this override will also evict existing pages to make space if none are available.
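The evict-on-demand idea can be sketched with a fixed pool of page numbers. This is a simplified model under stated assumptions: `ToyPagePool` is hypothetical, it assumes at least one page, and it evicts the chunk that was inserted first rather than the least recently used one.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <vector>

using Key = std::vector<int>;  // hypothetical stand-in for ChunkKey

class ToyPagePool {
 public:
  // Assumes num_pages > 0.
  explicit ToyPagePool(int num_pages) {
    for (int page = 0; page < num_pages; ++page) {
      free_.push_back(page);
    }
  }

  // Hands out a page for `key`; if none are free, evicts the oldest chunk first.
  int requestPage(const Key& key) {
    if (free_.empty()) {
      evictOldest();
    }
    assert(!free_.empty());
    const int page = free_.front();
    free_.pop_front();
    if (owned_.find(key) == owned_.end()) {
      order_.push_back(key);  // first page for this chunk: start tracking it
    }
    owned_[key].push_back(page);
    return page;
  }

  bool holds(const Key& key) const { return owned_.count(key) > 0; }

 private:
  // Returns every page of the oldest tracked chunk to the free list.
  void evictOldest() {
    assert(!order_.empty());
    const Key victim = order_.front();
    order_.pop_front();
    for (int page : owned_[victim]) {
      free_.push_back(page);
    }
    owned_.erase(victim);
  }

  std::deque<int> free_;
  std::deque<Key> order_;
  std::map<Key, std::vector<int>> owned_;
};
```

Evicting a whole chunk at once guarantees that at least one page is free afterwards, which is the property the request path relies on.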
- 
void 
touchKey(const ChunkKey &key) const¶ Used to track which tables/chunks were least recently used.
- 
void 
removeKey(const ChunkKey &key) const¶ 
- 
std::vector<ChunkKey> 
getKeysForTable(int32_t db_id, int32_t tb_id) const¶ Returns the set of keys contained in chunkIndex_ that match the given table prefix.
- 
FileInfo *
evictMetadataPages()¶ Evicts all metadata pages for the least recently used table. Returns the first FileInfo that a page was evicted from (guaranteed to now have at least one free page in it).
- 
FileInfo *
evictPages()¶ Evicts all data pages for the least recently used Chunk (metadata pages persist). Returns the first FileInfo that a page was evicted from (guaranteed to now have at least one free page in it).
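Least-recently-used tracking of the kind touchKey(), removeKey(), and the evict functions rely on can be sketched with a list plus an index. `ToyLru` is a hypothetical illustration and is not the LRUEvictionAlgorithm class. One instance keyed by full chunk keys and another keyed by table prefixes would give the chunk-level and table-level queues.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <map>
#include <vector>

using Key = std::vector<int>;  // hypothetical stand-in for ChunkKey

class ToyLru {
 public:
  // Marks a key as most recently used, inserting it if it is new.
  void touch(const Key& key) {
    remove(key);
    order_.push_back(key);
    pos_[key] = std::prev(order_.end());
  }

  // Stops tracking a key; does nothing if the key is unknown.
  void remove(const Key& key) {
    const auto it = pos_.find(key);
    if (it == pos_.end()) {
      return;
    }
    order_.erase(it->second);
    pos_.erase(it);
  }

  // Removes and returns the least recently used key.
  Key evictNext() {
    assert(!order_.empty());
    const Key victim = order_.front();
    remove(victim);
    return victim;
  }

  bool empty() const { return order_.empty(); }

 private:
  std::list<Key> order_;  // front = least recently used
  std::map<Key, std::list<Key>::iterator> pos_;
};
```

The list keeps recency order while the map makes `touch` and `remove` a lookup plus a constant-time list erase, since list iterators stay valid when other elements are erased.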
- 
void 
deleteCacheIfTooLarge()¶ When the cache is read from disk, we don’t know which chunks were least recently used. Rather than try to evict random pages to get down to size, we just reset the cache to make sure we have space.
- 
void 
setMaxSizes()¶ Sets the maximum number of files/space for each type of storage based on the maximum size.
- 
FileBuffer *
getBufferUnlocked(const ChunkKeyToChunkMap::iterator chunk_it, const size_t numBytes = 0)¶ 
- 
ChunkKeyToChunkMap::iterator 
deleteBufferUnlocked(const ChunkKeyToChunkMap::iterator chunk_it, const bool purge = true)¶ 
Private Members
- 
mapd_shared_mutex 
table_dirs_mutex_¶ 
- 
std::map<TablePair, std::unique_ptr<TableFileMgr>> 
table_dirs_¶ 
- 
size_t 
max_num_data_files_¶ 
- 
size_t 
max_num_meta_files_¶ 
- 
size_t 
max_wrapper_space_¶ 
- 
size_t 
max_size_¶ 
- 
std::optional<size_t> 
limit_data_size_= {}¶ 
- 
LRUEvictionAlgorithm 
chunk_evict_alg_¶ 
- 
LRUEvictionAlgorithm 
table_evict_alg_¶ 