Re: [DISCUSSION] Should we mark DROP COMPACT STORAGE as experimental

Ekaterina Dimitrova Fri, 11 Jun 2021 06:18:36 -0700

Sounds reasonable to me, thank you Benjamin!

On Fri, 11 Jun 2021 at 7:36, Benjamin Lerer <b.le...@gmail.com> wrote:


> Thanks everybody for the feedback.
>
> With the protocol flag mechanism and the documentation. I agree that the
> users should not be surprised by the schema change.
> Nevertheless, there are still several issues remaining and I am under the
> impression that our test coverage is not at the level it should be for that
> feature.
>
> By consequence, I would like to mark the feature as experimental until the
> following points are solved:
> # Have a similar set of unit tests to the one we have for COMPACT storage
> for after DROP COMPACT and another set where we mix writes performed before
> and after DROP COMPACT.
> # Solve the issues linked to the lack of primary key liveness
> (CASSANDRA-16675)
> # Have a way to prevent users to scew up their tables by using ALTER DROP
> statements
> # Find some solution for CASSANDRA-1606
>
> My plan is to create an umbrella ticket to have DROP COMPACT STORAGE out of
> experimental as soon as possible and allow users to be able to use it
> without taking any risk.
>
> Le lun. 7 juin 2021 à 20:36, Jeremiah D Jordan <jeremiah.jor...@gmail.com>
> a écrit :
>
> > We had many discussions around this back when this was added.  There is a
> > transition ability in place.  Users can set a native protocol flag to
> have
> > the server return results as if DROP COMPACT STORAGE was already run.  In
> > this way you can update your applications to support the new way results
> > are returned before the change has been made server side.  In the face of
> > multiple applications you can update them one at a time, switch each over
> > with the protocol flag.  Once all of your applications are updated and
> > running with the protocol flag set, so they now deal with how data is
> > returned when DROP COMPACT STORAGE has been run, you can then finally run
> > DROP COMPACT STORAGE on the server itself to update the schema.
> >
> > This is not the case where someone needs to run DROP COMPACT STORAGE and
> > then deal with the fallout.
> >
> > -Jeremiah
> >
> > > On Jun 7, 2021, at 3:06 AM, Oleksandr Petrov <
> oleksandr.pet...@gmail.com>
> > wrote:
> > >
> > > Thank you for bringing this subject up.
> > >
> > >> not ready for production use unless users fully understand what they
> are
> > > doing.
> > >
> > > Thing is, there's no easy way around dropping compact storage.  At the
> > > moment of writing of 10857, we have collectively decided that we'll
> > > document that the new columns are going to be shown, and have added a
> > > client protocol option that would hide/show columns depending on the
> mode
> > > we're running it in for anyone who upgrades. This makes it harder to
> make
> > > a transition for anyone who controls only the server side, since you
> have
> > > to account for how clients would behave whenever they see a new column.
> > We
> > > did try to patch around the shown columns, but because of ColumnFilter
> > this
> > > also turned out to be non-trivial, or at least not worth the effort for
> > the
> > > moment.
> > >
> > > One of the things mentioned in this list (primary key liveness) is also
> > > existing as a difference between UPDATE and INSERT, but I'm not sure if
> > > it's properly documented. Similar to some other nuances, such as nulls
> in
> > > clustering keys on partitions that only have a static row. We did
> > recently
> > > discuss some of these not-commonly-known cases with Benjamin and some
> > other
> > > folks. So it might be worth documenting those, too.
> > >
> > > Problem with compact storage is that very few people want to touch it,
> > and
> > > it requires a non-trivial amount of "institutional" knowledge and
> > > remembering things about Thrift. I think it's OK to mark the feature as
> > > experimental, but remembering how we haven't made significant
> > improvements
> > > to things we have previously marked as experimental, this one may not
> > > materialise into something final, too.
> > >
> > > What would a complete, non-foot-gun solution for dropping compact
> storage
> > > entail? If we're talking about avoiding showing columns to users, there
> > are
> > > ways to achieve this without rewriting sstables, for example, by
> > > introducing "hidden" columns in table metadata. However, if we want to
> > > preserve deletion semantics, I'm not sure if we're doing it right at
> all:
> > > we'll just trade one source of difference for partition liveness for
> > insert
> > > queries for the other, so I'd say that, by executing ALTER TABLE
> > statement,
> > > you're accepting that after it propagates, there will be at least some
> > > difference in behaviour and semantics. We did discuss this in C-16069,
> > and
> > > my thesis back then was that replacing special-casing for compact
> tables
> > > with special casing for tables that "used to be compact" isn't
> bringing
> > us
> > > closer to the final solution.
> > >
> > > To summarise, I don't mind if we mark this feature experimental, but if
> > we
> > > want to ever make it complete, we have to discuss what we do with each
> of
> > > the special cases. And it may very well be that we just need to add
> > > explicit hidden columns to metadata, and allow nulls for clusterings,
> > maybe
> > > several more small changes. Unless we define what it would take to get
> > this
> > > feature out of experimental state, and actually make an effort to
> resolve
> > > these issues, I'd just put a huge warning and call it a power-user
> > feature.
> > >
> > >
> > > On Fri, Jun 4, 2021 at 5:01 PM Joshua McKenzie <jmcken...@apache.org>
> > wrote:
> > >
> > >>>
> > >>> not ready for production use unless users fully understand what they
> > are
> > >>> doing.
> > >>
> > >> This statement stood out to me - in my opinion we should think
> carefully
> > >> about the surface area of the user interfaces on new features before
> we
> > add
> > >> more cognitive burden to our users. We already have plenty of
> > "foot-guns"
> > >> in the project and should only add more if absolutely necessary.
> > >>
> > >> Further, marking this as experimental would be another feature we've
> > >> released and then retroactively marked as experimental; that's a habit
> > we
> > >> should not get into.
> > >>
> > >> On balance, my .02 is the benefits to our end users and operators of
> > >> getting 4.0 to GA outweigh the costs of flagging this as experimental
> > now
> > >> so I'm a +1 to the flagging idea, but I think there's some valuable
> > lessons
> > >> for us to learn in retrospect from not just this feature but others
> > like it
> > >> in the past.
> > >>
> > >> Curious to hear Alex' thoughts about this situation in particular as
> > author
> > >> of C-10857. I recall that being a pretty painful slog so apologies in
> > >> advance for picking at this scab. :)
> > >>
> > >>
> > >>
> > >> On Fri, Jun 4, 2021 at 9:44 AM Brandon Williams <dri...@gmail.com>
> > wrote:
> > >>
> > >>> +1
> > >>>
> > >>> On Fri, Jun 4, 2021, 3:53 AM Benjamin Lerer <ble...@apache.org>
> wrote:
> > >>>
> > >>>> Hi everybody,
> > >>>>
> > >>>> There are a significant amount of issues with DROP COMPACT STORAGE
> > that
> > >>> can
> > >>>> be pretty surprising for users.
> > >>>> To name a few:
> > >>>> * Some hidden columns will show up changing the resultset returned
> for
> > >>>> wildcard queries
> > >>>> * As COMPACT tables did not have primary key liveness there empty
> rows
> > >>>> inserted AFTER the ALTER will be returned whereas the one inserted
> > >> before
> > >>>> the ALTER will not.
> > >>>> * Also due to the lack of primary key liveness the amount of
> SSTables
> > >>> being
> > >>>> read will increase resulting in slower queries
> > >>>> * After DROP COMPACT it becomes possible to ALTER the table in a way
> > >> that
> > >>>> makes all the row disappears
> > >>>> * There is a loss of functionality around null clustering when
> > dropping
> > >>>> compact storage (CASSANDRA-16069)
> > >>>>
> > >>>> In my opinion DROP COMPACT STORAGE is not ready for production use
> > >> unless
> > >>>> users fully understand what they are doing.
> > >>>> By consequence, I am wondering if we should not mark it as
> > experimental
> > >>> as
> > >>>> we did for the Materialized Views (CASSANDRA-13959).
> > >>>>
> > >>>> What is your opinion?
> > >>>>
> > >>>
> > >>
> > >
> > >
> > > --
> > > alex p
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>

Re: [DISCUSSION] Should we mark DROP COMPACT STORAGE as experimental

Reply via email to