Re: [DISCUSS] CEP-7 Storage Attached Index

Oleksandr Petrov Wed, 23 Sep 2020 08:35:15 -0700

Short question: looking forward, how are we going to maintain three secondary
index implementations: SASI, SAI, and the original 2i?


Another thing I think this CEP is missing is the rationale and motivation
for why trie-based indexes were chosen over, say, B-Trees. We did have a
short discussion about this on Slack, but both arguments that I've heard
(space savings and keeping only a small subset of nodes in memory) hold only
against the most primitive implementation of a B-Tree. A fully-occupied
prefix B-Tree can have similar properties. There's been a lot of research on
B-Trees and optimisations for them. Unfortunately, I do not have an
implementation sitting around for a direct comparison, but I can imagine
situations where B-Trees may perform better because of simpler construction.
Maybe we should even consider prototyping a prefix B-Tree to have a fairer
comparison.
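
For illustration only, here is a minimal sketch of the separator truncation
that prefix B-Trees rely on: inner nodes store the shortest string that
separates two adjacent leaves rather than a full key, which is where much of
the space saving comes from. The class and method names below are
hypothetical and are not part of the CEP, SAI, or SASI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class PrefixSeparators {

    // Returns the shortest string s with left < s <= right.
    // Assumes left < right in String.compareTo order.
    static String shortestSeparator(String left, String right) {
        if (left.compareTo(right) >= 0) {
            throw new IllegalArgumentException("keys must be strictly increasing");
        }
        int i = 0;
        int n = Math.min(left.length(), right.length());
        // Skip the common prefix of the two keys.
        while (i < n && left.charAt(i) == right.charAt(i)) {
            i++;
        }
        // Common prefix plus the first character of 'right' past it.
        return right.substring(0, i + 1);
    }

    // Splits sorted keys into leaves of 'leafCapacity' keys and returns the
    // truncated separator stored between each pair of neighbouring leaves.
    static List<String> separators(List<String> sortedKeys, int leafCapacity) {
        List<String> result = new ArrayList<>();
        for (int i = leafCapacity; i < sortedKeys.size(); i += leafCapacity) {
            result.add(shortestSeparator(sortedKeys.get(i - 1), sortedKeys.get(i)));
        }
        return result;
    }
}
```

With the sorted keys "cassandra", "cassette", "index", "indexing", "storage",
"trie" and two keys per leaf, the inner node only needs the separators "i"
and "s" instead of the full keys "index" and "storage".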

Thank you,
-- Alex



On Thu, Sep 10, 2020 at 9:12 AM Jasonstack Zhao Yang <
jasonstack.z...@gmail.com> wrote:

> Thank you Patrick for hosting Cassandra Contributor Meeting for CEP-7 SAI.
>
> The recorded video is available here:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/2020-09-01+Apache+Cassandra+Contributor+Meeting
>
> On Tue, 1 Sep 2020 at 14:34, Jasonstack Zhao Yang <
> jasonstack.z...@gmail.com>
> wrote:
>
> > Thank you, Charles and Patrick
> >
> > On Tue, 1 Sep 2020 at 04:56, Charles Cao <caohair...@gmail.com> wrote:
> >
> >> Thank you, Patrick!
> >>
> >> On Mon, Aug 31, 2020 at 12:59 PM Patrick McFadin <pmcfa...@gmail.com>
> >> wrote:
> >> >
> >> > I just moved it to 8AM for this meeting to better accommodate APAC.
> >> > Please see the update here:
> >> >
> >> > https://cwiki.apache.org/confluence/display/CASSANDRA/2020-08-01+Apache+Cassandra+Contributor+Meeting
> >> >
> >> > Patrick
> >> >
> >> > On Mon, Aug 31, 2020 at 10:04 AM Charles Cao <caohair...@gmail.com>
> >> > wrote:
> >> >
> >> > > Patrick,
> >> > >
> >> > > 11AM PST is a bad time for the people in the APAC timezone. Can we
> >> > > move it to 7 or 8AM PST in the morning to accommodate their needs ?
> >> > >
> >> > > ~Charles
> >> > >
> >> > > On Fri, Aug 28, 2020 at 4:37 PM Patrick McFadin <pmcfa...@gmail.com>
> >> > > wrote:
> >> > > >
> >> > > > Meeting scheduled.
> >> > > >
> >> > > > https://cwiki.apache.org/confluence/display/CASSANDRA/2020-08-01+Apache+Cassandra+Contributor+Meeting
> >> > > >
> >> > > > Tuesday September 1st, 11AM PST. I added a basic bullet for the
> >> > > > agenda but if there is more, edit away.
> >> > > >
> >> > > > Patrick
> >> > > >
> >> > > > On Thu, Aug 27, 2020 at 11:31 AM Jasonstack Zhao Yang <
> >> > > > jasonstack.z...@gmail.com> wrote:
> >> > > >
> >> > > > > +1
> >> > > > >
> >> > > > > On Thu, 27 Aug 2020 at 04:52, Ekaterina Dimitrova
> >> > > > > <e.dimitr...@gmail.com> wrote:
> >> > > > >
> >> > > > > > +1
> >> > > > > >
> >> > > > > > On Wed, 26 Aug 2020 at 16:48, Caleb Rackliffe
> >> > > > > > <calebrackli...@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > +1
> >> > > > > > >
> >> > > > > > > On Wed, Aug 26, 2020, 3:45 PM Patrick McFadin <pmcfa...@gmail.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > This is related to the discussion Jordan and I had about the
> >> > > > > > > > contributor Zoom call. Instead of open mic for any issue, call
> >> > > > > > > > it based on a discussion thread or threads for higher bandwidth
> >> > > > > > > > discussion.
> >> > > > > > > >
> >> > > > > > > > I would be happy to schedule one for next week to specifically
> >> > > > > > > > discuss CEP-7. I can attach the recorded call to the CEP after.
> >> > > > > > > >
> >> > > > > > > > +1 or -1?
> >> > > > > > > >
> >> > > > > > > > Patrick
> >> > > > > > > >
> >> > > > > > > > On Tue, Aug 25, 2020 at 7:03 AM Joshua McKenzie
> >> > > > > > > > <jmcken...@apache.org> wrote:
> >> > > > > > > >
> >> > > > > > > > > > Does community plan to open another discussion or CEP on
> >> > > > > > > > > > modularization?
> >> > > > > > > > >
> >> > > > > > > > > We probably should have a discussion on the ML or monthly
> >> > > > > > > > > contrib call about it first to see how aligned the interested
> >> > > > > > > > > contributors are. Could do that through CEP as well but CEPs
> >> > > > > > > > > (at least thus far sans k8s operator) tend to start with a
> >> > > > > > > > > strong, deeply thought out point of view being expressed.
> >> > > > > > > > >
> >> > > > > > > > > On Tue, Aug 25, 2020 at 3:26 AM Jasonstack Zhao Yang
> >> > > > > > > > > <jasonstack.z...@gmail.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > >>> SASI's performance, specifically the search in the B+ tree
> >> > > > > > > > > > >>> component, depends a lot on the component file's header
> >> > > > > > > > > > >>> being available in the pagecache. SASI benefits from
> >> > > > > > > > > > >>> (needs) nodes with lots of RAM. Is SAI bound to this same
> >> > > > > > > > > > >>> or similar limitation?
> >> > > > > > > > > >
> >> > > > > > > > > > SAI also benefits from larger memory because SAI puts block
> >> > > > > > > > > > info on heap for searching on-disk components and having
> >> > > > > > > > > > cross-index files on page cache improves read performance of
> >> > > > > > > > > > different indexes on the same table.
> >> > > > > > > > > >
> >> > > > > > > > > > >>> Flushing of SASI can be CPU+IO intensive, to the point of
> >> > > > > > > > > > >>> saturation, pauses, and crashes on the node. SSDs are a
> >> > > > > > > > > > >>> must, along with a bit of tuning, just to avoid bringing
> >> > > > > > > > > > >>> down your cluster. Beyond reducing space requirements, does
> >> > > > > > > > > > >>> SAI improve on these things? Like SASI how does SAI, in its
> >> > > > > > > > > > >>> own way, change/narrow the recommendations on node hardware
> >> > > > > > > > > > >>> specs?
> >> > > > > > > > > >
> >> > > > > > > > > > SAI won't crash the node during compaction and requires less
> >> > > > > > > > > > CPU/IO.
> >> > > > > > > > > >
> >> > > > > > > > > > * SAI defines a global memory limit for compaction instead of
> >> > > > > > > > > >   the per-index memory limit used by SASI.
> >> > > > > > > > > >   For example, compactions are running on 10 tables and each
> >> > > > > > > > > >   has 10 indexes. SAI will cap the memory usage with the
> >> > > > > > > > > >   global limit while SASI may use up to 100 * per-index limit.
> >> > > > > > > > > >
> >> > > > > > > > > > * After flushing in-memory segments to disk, SAI won't merge
> >> > > > > > > > > >   on-disk segments while SASI attempts to merge them at the
> >> > > > > > > > > >   end.
> >> > > > > > > > > >
> >> > > > > > > > > >   There are pros and cons of not merging segments:
> >> > > > > > > > > >     ** Pros: compaction runs faster and requires fewer
> >> > > > > > > > > >        resources.
> >> > > > > > > > > >     ** Cons: small segments reduce compression ratio.
> >> > > > > > > > > >
> >> > > > > > > > > > * SAI on-disk format with row ids compresses better.
> >> > > > > > > > > >
> >> > > > > > > > > > >>> I understand the desire in keeping out of scope the longer
> >> > > > > > > > > > >>> term deprecation and migration plan, but… if SASI provides
> >> > > > > > > > > > >>> functionality that SAI doesn't, like tokenisation and
> >> > > > > > > > > > >>> DelimiterAnalyzer, yet introduces a body of code ~somewhat
> >> > > > > > > > > > >>> similar, shouldn't we be roughly sketching out how to
> >> > > > > > > > > > >>> reduce the maintenance surface area?
> >> > > > > > > > > >
> >> > > > > > > > > > Agreed that we should reduce maintenance area if possible, but
> >> > > > > > > > > > only a very limited part of the code base (e.g. RangeIterator,
> >> > > > > > > > > > QueryPlan) can be shared. The rest of the code base is quite
> >> > > > > > > > > > different because of on-disk format and cross-index files.
> >> > > > > > > > > >
> >> > > > > > > > > > The goal of this CEP is to get community buy-in on SAI's
> >> > > > > > > > > > design. Tokenization, DelimiterAnalyzer should be
> >> > > > > > > > > > straightforward to implement on top of SAI.
> >> > > > > > > > > >
> >> > > > > > > > > > >>> Can we list what configurations of SASI will become
> >> > > > > > > > > > >>> deprecated once SAI becomes non-experimental?
> >> > > > > > > > > >
> >> > > > > > > > > > Except for "Like", "Tokenisation", "DelimiterAnalyzer", the
> >> > > > > > > > > > rest of SASI can be replaced by SAI.
> >> > > > > > > > > >
> >> > > > > > > > > > >>> Given a few bugs are open against 2i and SASI, can we
> >> > > > > > > > > > >>> provide some overview, or rough indication, of how many of
> >> > > > > > > > > > >>> them we could "triage away"?
> >> > > > > > > > > >
> >> > > > > > > > > > I believe most of the known bugs in 2i/SASI either have been
> >> > > > > > > > > > addressed in SAI or don't apply to SAI.
> >> > > > > > > > > >
> >> > > > > > > > > > >>> And, is it time for the project to start introducing new
> >> > > > > > > > > > >>> SPI implementations as separate sub-modules and jar files
> >> > > > > > > > > > >>> that are only loaded at runtime based on configuration
> >> > > > > > > > > > >>> settings? (sorry for the conflation on this one, but maybe
> >> > > > > > > > > > >>> it's the right time to raise it :shrug:)
> >> > > > > > > > > >
> >> > > > > > > > > > Agreed that modularization is the way to go and will speed up
> >> > > > > > > > > > module development.
> >> > > > > > > > > >
> >> > > > > > > > > > Does community plan to open another discussion or CEP on
> >> > > > > > > > > > modularization?
> >> > > > > > > > > >
> >> > > > > > > > > > On Mon, 24 Aug 2020 at 16:43, Mick Semb Wever <m...@apache.org>
> >> > > > > > > > > > wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Adding to Duy's questions…
> >> > > > > > > > > > >
> >> > > > > > > > > > > * Hardware specs
> >> > > > > > > > > > >
> >> > > > > > > > > > > SASI's performance, specifically the search in the B+ tree
> >> > > > > > > > > > > component, depends a lot on the component file's header
> >> > > > > > > > > > > being available in the pagecache. SASI benefits from (needs)
> >> > > > > > > > > > > nodes with lots of RAM. Is SAI bound to this same or similar
> >> > > > > > > > > > > limitation?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Flushing of SASI can be CPU+IO intensive, to the point of
> >> > > > > > > > > > > saturation, pauses, and crashes on the node. SSDs are a
> >> > > > > > > > > > > must, along with a bit of tuning, just to avoid bringing
> >> > > > > > > > > > > down your cluster. Beyond reducing space requirements, does
> >> > > > > > > > > > > SAI improve on these things? Like SASI how does SAI, in its
> >> > > > > > > > > > > own way, change/narrow the recommendations on node hardware
> >> > > > > > > > > > > specs?
> >> > > > > > > > > > >
> >> > > > > > > > > > > * Code Maintenance
> >> > > > > > > > > > >
> >> > > > > > > > > > > I understand the desire in keeping out of scope the longer
> >> > > > > > > > > > > term deprecation and migration plan, but… if SASI provides
> >> > > > > > > > > > > functionality that SAI doesn't, like tokenisation and
> >> > > > > > > > > > > DelimiterAnalyzer, yet introduces a body of code ~somewhat
> >> > > > > > > > > > > similar, shouldn't we be roughly sketching out how to reduce
> >> > > > > > > > > > > the maintenance surface area?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Can we list what configurations of SASI will become
> >> > > > > > > > > > > deprecated once SAI becomes non-experimental?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Given a few bugs are open against 2i and SASI, can we
> >> > > > > > > > > > > provide some overview, or rough indication, of how many of
> >> > > > > > > > > > > them we could "triage away"?
> >> > > > > > > > > > >
> >> > > > > > > > > > > And, is it time for the project to start introducing new SPI
> >> > > > > > > > > > > implementations as separate sub-modules and jar files that
> >> > > > > > > > > > > are only loaded at runtime based on configuration settings?
> >> > > > > > > > > > > (sorry for the conflation on this one, but maybe it's the
> >> > > > > > > > > > > right time to raise it :shrug:)
> >> > > > > > > > > > >
> >> > > > > > > > > > > regards,
> >> > > > > > > > > > > Mick
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, 18 Aug 2020 at 13:05, DuyHai Doan
> >> > > > > > > > > > > <doanduy...@gmail.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Thank you Zhao Yang for starting this topic
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > After reading the short design doc, I have a few questions
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > 1) SASI was pretty inefficient indexing wide partitions
> >> > > > > > > > > > > > because the index structure only retains the partition
> >> > > > > > > > > > > > token, not the clustering columns. As per design doc SAI
> >> > > > > > > > > > > > has row id mapping to partition offset, can we hope that
> >> > > > > > > > > > > > indexing wide partition will be more efficient with SAI?
> >> > > > > > > > > > > > One detail that worries me is that in the beginning of the
> >> > > > > > > > > > > > design doc, it is said that the matching rows are post
> >> > > > > > > > > > > > filtered while scanning the partition. Can you confirm or
> >> > > > > > > > > > > > refute that SAI is efficient with wide partitions and
> >> > > > > > > > > > > > provides the partition offsets to the matching rows?
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > 2) About space efficiency, one of the biggest drawbacks of
> >> > > > > > > > > > > > SASI was the huge space required for index structure when
> >> > > > > > > > > > > > using CONTAINS logic because of the decomposition of text
> >> > > > > > > > > > > > columns into n-grams. Will SAI suffer from the same issue
> >> > > > > > > > > > > > in future iterations? I'm anticipating a bit
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > 3) If I'm querying using SAI and providing complete
> >> > > > > > > > > > > > partition key, will it be more efficient than querying
> >> > > > > > > > > > > > without partition key? In other words, does SAI provide
> >> > > > > > > > > > > > any optimisation when partition key is specified?
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Regards
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Duy Hai DOAN
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Tue, 18 Aug 2020 at 11:39, Mick Semb Wever
> >> > > > > > > > > > > > <m...@apache.org> wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > > We are looking forward to the community's feedback and
> >> > > > > > > > > > > > > > suggestions.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > What comes immediately to mind is testing requirements.
> >> > > > > > > > > > > > > It has been mentioned already that the project's
> >> > > > > > > > > > > > > testability and QA guidelines are inadequate to
> >> > > > > > > > > > > > > successfully introduce new features and refactorings to
> >> > > > > > > > > > > > > the codebase. During the 4.0 beta phase this was
> >> > > > > > > > > > > > > intended to be addressed, i.e. defining more specific QA
> >> > > > > > > > > > > > > guidelines for 4.0-rc. This would be an important step
> >> > > > > > > > > > > > > towards QA guidelines for all changes and CEPs post-4.0.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Questions from me
> >> > > > > > > > > > > > >  - How will this be tested, how will its QA status and
> >> > > > > > > > > > > > >    lifecycle be defined? (per above)
> >> > > > > > > > > > > > >  - With existing C* code needing to be changed, what is
> >> > > > > > > > > > > > >    the proposed plan for making those changes ensuring
> >> > > > > > > > > > > > >    maintained QA, e.g. are there separate QA cycles
> >> > > > > > > > > > > > >    planned for altering the SPI before adding a new SPI
> >> > > > > > > > > > > > >    implementation?
> >> > > > > > > > > > > > >  - Despite being out of scope, it would be nice to have
> >> > > > > > > > > > > > >    some idea from the CEP author of when users might
> >> > > > > > > > > > > > >    still choose afresh 2i or SASI over SAI.
> >> > > > > > > > > > > > >  - Who fills the roles involved? Who are the
> >> > > > > > > > > > > > >    contributors in this DataStax team? Who is the
> >> > > > > > > > > > > > >    shepherd? Are there other stakeholders willing to be
> >> > > > > > > > > > > > >    involved?
> >> > > > > > > > > > > > >  - Is there a preference to use gdoc instead of the
> >> > > > > > > > > > > > >    project's wiki, and why? (the CEP process suggests a
> >> > > > > > > > > > > > >    wiki page, and feedback on why another approach is
> >> > > > > > > > > > > > >    considered better helps evolve the CEP process
> >> > > > > > > > > > > > >    itself)
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > cheers,
> >> > > > > > > > > > > > > Mick
> >> > > > > > > > > > > > >
> >> > > ---------------------------------------------------------------------
> >> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >> > >
> >> > >
>


-- 
alex p
