+1

-Murtadha
________________________________
From: Glenn Justo Galvizo <[email protected]>
Sent: Friday, April 19, 2024 8:19:14 PM
To: [email protected] <[email protected]>
Subject: Re: [APE] Unlimited Storage: Local disk caching in cloud deployment

+1 from me, sounds interesting!

> On Apr 3, 2024, at 11:42, Ian Maxon <[email protected]> wrote:
>
> +1, this will be a great addition.
>
>> On Apr 3, 2024 at 09:44:01, Wail Alkowaileet <[email protected]> wrote:
>>
>> In the current cloud deployment, users are limited by the disk space of the
>> cluster's nodes. However, the blob storage services provided by cloud
>> providers (e.g., S3) can store a virtually "unlimited" amount of data.
>> Thus, AsterixDB can provide the means to store more than the cluster's
>> local drives can hold.
>>
>> In this proposal, we want to extend AsterixDB's capability to allow the
>> local drives to act as a cache, instead of a mirror image of what's stored
>> in the cloud. By "as a cache" we mean that files and pages can be
>> retrieved/persisted and removed (evicted) from the local drives, according
>> to some policy.
>>
>> The aim of this proposal is to describe and implement a mechanism called
>> "*Weep and Sweep*". Those are the names of the two phases that run when
>> the amount of data in the cloud exceeds the space of the cluster's local
>> disks.
>>
>> Weep
>>
>> When the disk is pressured (the pressure size can be configured), the
>> system will start to "weep" and devise a plan for what should be "evicted"
>> according to some statistics and policies, *which are not solidified yet
>> and still a work in progress.*
>>
>> Sweep
>>
>> After "weeping", a sweep operation will take place and start evicting what
>> the weep's plan considers evictable. Depending on the index type
>> (primary/secondary) and the storage format (row/column), the smallest
>> evictable unit can differ. The following table shows the smallest
>> evictable unit for each index type:
>>
>> *Index Type*                              *Evictable*
>> Metadata Indexes (e.g., Dataset, etc.)    Not evictable
>> Secondary indexes                         Evicted as a whole
>> Primary Indexes (Row)                     Evicted as a whole
>> Primary Indexes (Columnar)                Columns (or columns’ pages)
>>
>> Featured Considerations
>>
>> - Columnar primary indexes will never be downloaded as a whole
>>   - Instead, columns will be streamed from the cloud (if accessed for
>>     the first time) and persisted to local disk if necessary
>> - We are considering providing a mechanism to prefetch the next columns
>>   of the next mega-leaf node
>>   <https://www.vldb.org/pvldb/vol15/p2085-alkowaileet.pdf>. The hope here
>>   is to mask any latencies when reading columns from the cloud
>> - Depending on the disk pressure and the operation, the system can
>>   determine if the columns streamed from the cloud are "worthy" of being
>>   cached locally. For example, if columns are read in a merge operation,
>>   it might not be "wise" to persist these columns, as their on-disk
>>   component is going to be deleted at the end of the merge operation.
>>   Thus, it might be "better" to dedicate the free space on disk to the
>>   newly created/merged component.
>>
>> Multiple aspects (such as the evictable units and policies) of this APE are
>> not solidified yet, but the core concepts are in place and are ready for
>> the community's vote :)
>>
>> EPIC: ASTERIXDB-3373
>> <https://issues.apache.org/jira/browse/ASTERIXDB-3373>
>> --
>>
>> *Regards,*
>> Wail Alkowaileet
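
For readers skimming the thread, here is a minimal sketch of how the two phases in the quoted proposal could fit together. It is not AsterixDB code: every name (IndexKind, CachedUnit, weep, sweep, should_persist) is hypothetical, the least-recently-used ordering only stands in for the eviction policy that the proposal says is still undecided, and the merge rule is one possible reading of the "worthy to be cached" consideration.

```python
from dataclasses import dataclass
from enum import Enum


class IndexKind(Enum):
    METADATA = "metadata"                # not evictable
    SECONDARY = "secondary"              # evicted as a whole
    PRIMARY_ROW = "primary_row"          # evicted as a whole
    PRIMARY_COLUMNAR = "primary_column"  # evicted per column (or column pages)


@dataclass(frozen=True)
class CachedUnit:
    """One locally cached evictable unit (hypothetical bookkeeping record)."""
    name: str
    kind: IndexKind
    size_bytes: int
    last_access: int  # logical clock; larger means more recently used


def weep(units, disk_capacity, pressure_ratio):
    """Plan phase: list the units to evict, or [] if the disk is not pressured.

    Assumption: least-recently-used order replaces the undecided policy.
    """
    limit = int(disk_capacity * pressure_ratio)
    used = sum(u.size_bytes for u in units)
    candidates = sorted(
        (u for u in units if u.kind is not IndexKind.METADATA),
        key=lambda u: u.last_access,
    )
    plan = []
    for unit in candidates:
        if used <= limit:
            break
        plan.append(unit)
        used -= unit.size_bytes
    return plan


def sweep(units, plan):
    """Evict phase: return the cache listing without the planned units."""
    evicted = {u.name for u in plan}
    return [u for u in units if u.name not in evicted]


def should_persist(operation, free_bytes, column_bytes):
    """Assumed heuristic: skip caching columns streamed for a merge, since
    their component is deleted when the merge ends; otherwise cache if it fits.
    """
    if operation == "merge":
        return False
    return column_bytes <= free_bytes


# Example: a 100-byte disk that counts as pressured above 80% usage.
units = [
    CachedUnit("Metadata.Dataset", IndexKind.METADATA, 10, last_access=1),
    CachedUnit("orders_idx", IndexKind.SECONDARY, 40, last_access=2),
    CachedUnit("orders.col_price", IndexKind.PRIMARY_COLUMNAR, 30, last_access=5),
    CachedUnit("customers", IndexKind.PRIMARY_ROW, 30, last_access=3),
]
plan = weep(units, disk_capacity=100, pressure_ratio=0.8)
remaining = sweep(units, plan)
```

Keeping the plan (weep) separate from the eviction (sweep) mirrors the two phases named in the proposal and lets the policy change without touching the code that removes files.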
