+1

> On Dec 2, 2023, at 11:23, Glenn Justo Galvizo <[email protected]> wrote:
> 
> +1 from me as well.
> 
>> On Dec 2, 2023, at 10:27, Ian Maxon <[email protected]> wrote:
>> 
>> +1
>> 
>>> On Dec 1, 2023, at 12:28:23, Murtadha Al-Hubail <[email protected]> wrote:
>>> 
>>> Each AsterixDB cluster today consists of one or more Node Controllers (NC)
>>> where the data is stored and processed. Each NC has a predefined set of
>>> storage partitions (iodevices). When data is ingested into the system, the
>>> data is hash-partitioned across the total number of storage partitions in
>>> the cluster. Similarly, when the data is queried, each NC will start as
>>> many threads as the number of storage partitions it has to read and process
>>> the data in parallel. While this shared-nothing architecture has its
>>> advantages, it has its drawbacks too. One major drawback is the time needed
>>> to scale the cluster. Adding a new NC to an existing cluster of (n) nodes
>>> means writing a completely new copy of the data, which must now be
>>> hash-partitioned across the new total number of storage partitions of the
>>> (n + 1) nodes. This operation could potentially take several hours or even
>>> days, which is unacceptable in the cloud age.
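>>> 
>>> As a rough illustration (this is not the actual AsterixDB code; the class and
>>> method names below are made up), the partition a record lands in is a function
>>> of the total partition count, so growing the cluster changes where every
>>> record belongs and forces a full rewrite:
>>> 
>>>     // Illustrative sketch only. With dynamic partitioning, the target
>>>     // storage partition depends on the total partition count, which grows
>>>     // from n * iodevices to (n + 1) * iodevices when an NC is added.
>>>     final class DynamicPartitioner {
>>>         static int storagePartitionOf(Object primaryKey, int totalStoragePartitions) {
>>>             return Math.floorMod(primaryKey.hashCode(), totalStoragePartitions);
>>>         }
>>>     }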
>>> 
>>> This APE is about adding a new deployment (cloud) mode to AsterixDB by
>>> implementing compute-storage separation to take advantage of the elasticity
>>> of the cloud. This will require the following:
>>> 
>>> 1. Moving from the dynamic data partitioning described earlier to static
>>> data partitioning based on a number of storage partitions that is
>>> configurable but fixed for the cluster's lifetime.
>>> 2. Introducing the concept of a "compute partition" where each NC will have
>>> a fixed number of compute partitions. This number could potentially be
>>> based on the number of CPU cores it has.
>>> 
>>> This will decouple the number of storage partitions being processed on an
>>> NC from the number of its compute partitions.
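>>> 
>>> A minimal sketch of that split (the class and field names here are
>>> assumptions, not from the tickets): the storage partition count becomes a
>>> cluster-lifetime constant, while each NC's compute partition count can track
>>> its own core count and differ across NCs.
>>> 
>>>     // Illustrative only; the real configuration keys/classes may differ.
>>>     final class PartitioningConfig {
>>>         final int storagePartitionCount;   // fixed at cluster creation, e.g. 128
>>>         final int computePartitionsPerNc;  // e.g. availableProcessors()
>>> 
>>>         PartitioningConfig(int storagePartitionCount, int computePartitionsPerNc) {
>>>             this.storagePartitionCount = storagePartitionCount;
>>>             this.computePartitionsPerNc = computePartitionsPerNc;
>>>         }
>>>     }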
>>> 
>>> When an AsterixDB cluster is deployed using the cloud mode, we will do the
>>> following:
>>> 
>>> - The Cluster Controller will maintain a map containing the assignment of
>>> storage partitions to compute partitions.
>>> - New writes will be written to the NC's local storage and uploaded to an
>>> object store (e.g., AWS S3), which will be used as a highly available shared
>>> filesystem between NCs.
>>> - On queries, each NC will start as many threads as its compute partitions
>>> to process its currently assigned storage partitions.
>>> - On scaling operations, we will simply update the assignment map and NCs
>>> will lazily cache any data of newly assigned storage partitions from the
>>> object store.
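>>> 
>>> A hypothetical sketch of the CC-side assignment map described above (names
>>> are illustrative, not the actual classes in ASTERIXDB-3196): a scaling
>>> operation only rewrites the map, and NCs pull newly assigned partitions from
>>> the object store on demand.
>>> 
>>>     import java.util.HashMap;
>>>     import java.util.Map;
>>> 
>>>     // Illustrative only: maps each storage partition to the compute
>>>     // partition (on some NC) currently responsible for it.
>>>     final class PartitionAssignment {
>>>         private final Map<Integer, Integer> storageToCompute = new HashMap<>();
>>> 
>>>         // Scaling just updates this map; no data is rewritten. The NC that
>>>         // owns the target compute partition lazily caches the storage
>>>         // partition's files from the object store (e.g. S3) on first access.
>>>         void assign(int storagePartition, int computePartition) {
>>>             storageToCompute.put(storagePartition, computePartition);
>>>         }
>>> 
>>>         int computePartitionOf(int storagePartition) {
>>>             return storageToCompute.get(storagePartition);
>>>         }
>>>     }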
>>> 
>>> Improvement tickets:
>>> Static data partitioning:
>>> https://issues.apache.org/jira/browse/ASTERIXDB-3144
>>> Compute-Storage Separation:
>>> https://issues.apache.org/jira/browse/ASTERIXDB-3196
>>> 
>>> Please vote on this APE. We'll keep the vote open for 72 hours; it will pass
>>> with either 3 positive votes or a majority of positive votes.
>>> 
