Sounds reasonable to me, thank you Benjamin! On Fri, 11 Jun 2021 at 7:36, Benjamin Lerer <b.le...@gmail.com> wrote:
> Thanks everybody for the feedback. > > With the protocol flag mechanism and the documentation. I agree that the > users should not be surprised by the schema change. > Nevertheless, there are still several issues remaining and I am under the > impression that our test coverage is not at the level it should be for that > feature. > > By consequence, I would like to mark the feature as experimental until the > following points are solved: > # Have a similar set of unit tests to the one we have for COMPACT storage > for after DROP COMPACT and another set where we mix writes performed before > and after DROP COMPACT. > # Solve the issues linked to the lack of primary key liveness > (CASSANDRA-16675) > # Have a way to prevent users to scew up their tables by using ALTER DROP > statements > # Find some solution for CASSANDRA-1606 > > My plan is to create an umbrella ticket to have DROP COMPACT STORAGE out of > experimental as soon as possible and allow users to be able to use it > without taking any risk. > > Le lun. 7 juin 2021 à 20:36, Jeremiah D Jordan <jeremiah.jor...@gmail.com> > a écrit : > > > We had many discussions around this back when this was added. There is a > > transition ability in place. Users can set a native protocol flag to > have > > the server return results as if DROP COMPACT STORAGE was already run. In > > this way you can update your applications to support the new way results > > are returned before the change has been made server side. In the face of > > multiple applications you can update them one at a time, switch each over > > with the protocol flag. Once all of your applications are updated and > > running with the protocol flag set, so they now deal with how data is > > returned when DROP COMPACT STORAGE has been run, you can then finally run > > DROP COMPACT STORAGE on the server itself to update the schema. > > > > This is not the case where someone needs to run DROP COMPACT STORAGE and > > then deal with the fallout. > > > > -Jeremiah > > > > > On Jun 7, 2021, at 3:06 AM, Oleksandr Petrov < > oleksandr.pet...@gmail.com> > > wrote: > > > > > > Thank you for bringing this subject up. > > > > > >> not ready for production use unless users fully understand what they > are > > > doing. > > > > > > Thing is, there's no easy way around dropping compact storage. At the > > > moment of writing of 10857, we have collectively decided that we'll > > > document that the new columns are going to be shown, and have added a > > > client protocol option that would hide/show columns depending on the > mode > > > we're running it in for anyone who upgrades. This makes it harder to > make > > > a transition for anyone who controls only the server side, since you > have > > > to account for how clients would behave whenever they see a new column. > > We > > > did try to patch around the shown columns, but because of ColumnFilter > > this > > > also turned out to be non-trivial, or at least not worth the effort for > > the > > > moment. > > > > > > One of the things mentioned in this list (primary key liveness) is also > > > existing as a difference between UPDATE and INSERT, but I'm not sure if > > > it's properly documented. Similar to some other nuances, such as nulls > in > > > clustering keys on partitions that only have a static row. We did > > recently > > > discuss some of these not-commonly-known cases with Benjamin and some > > other > > > folks. So it might be worth documenting those, too. > > > > > > Problem with compact storage is that very few people want to touch it, > > and > > > it requires a non-trivial amount of "institutional" knowledge and > > > remembering things about Thrift. I think it's OK to mark the feature as > > > experimental, but remembering how we haven't made significant > > improvements > > > to things we have previously marked as experimental, this one may not > > > materialise into something final, too. > > > > > > What would a complete, non-foot-gun solution for dropping compact > storage > > > entail? If we're talking about avoiding showing columns to users, there > > are > > > ways to achieve this without rewriting sstables, for example, by > > > introducing "hidden" columns in table metadata. However, if we want to > > > preserve deletion semantics, I'm not sure if we're doing it right at > all: > > > we'll just trade one source of difference for partition liveness for > > insert > > > queries for the other, so I'd say that, by executing ALTER TABLE > > statement, > > > you're accepting that after it propagates, there will be at least some > > > difference in behaviour and semantics. We did discuss this in C-16069, > > and > > > my thesis back then was that replacing special-casing for compact > tables > > > with special casing for tables that "used to be compact" isn't > bringing > > us > > > closer to the final solution. > > > > > > To summarise, I don't mind if we mark this feature experimental, but if > > we > > > want to ever make it complete, we have to discuss what we do with each > of > > > the special cases. And it may very well be that we just need to add > > > explicit hidden columns to metadata, and allow nulls for clusterings, > > maybe > > > several more small changes. Unless we define what it would take to get > > this > > > feature out of experimental state, and actually make an effort to > resolve > > > these issues, I'd just put a huge warning and call it a power-user > > feature. > > > > > > > > > On Fri, Jun 4, 2021 at 5:01 PM Joshua McKenzie <jmcken...@apache.org> > > wrote: > > > > > >>> > > >>> not ready for production use unless users fully understand what they > > are > > >>> doing. > > >> > > >> This statement stood out to me - in my opinion we should think > carefully > > >> about the surface area of the user interfaces on new features before > we > > add > > >> more cognitive burden to our users. We already have plenty of > > "foot-guns" > > >> in the project and should only add more if absolutely necessary. > > >> > > >> Further, marking this as experimental would be another feature we've > > >> released and then retroactively marked as experimental; that's a habit > > we > > >> should not get into. > > >> > > >> On balance, my .02 is the benefits to our end users and operators of > > >> getting 4.0 to GA outweigh the costs of flagging this as experimental > > now > > >> so I'm a +1 to the flagging idea, but I think there's some valuable > > lessons > > >> for us to learn in retrospect from not just this feature but others > > like it > > >> in the past. > > >> > > >> Curious to hear Alex' thoughts about this situation in particular as > > author > > >> of C-10857. I recall that being a pretty painful slog so apologies in > > >> advance for picking at this scab. :) > > >> > > >> > > >> > > >> On Fri, Jun 4, 2021 at 9:44 AM Brandon Williams <dri...@gmail.com> > > wrote: > > >> > > >>> +1 > > >>> > > >>> On Fri, Jun 4, 2021, 3:53 AM Benjamin Lerer <ble...@apache.org> > wrote: > > >>> > > >>>> Hi everybody, > > >>>> > > >>>> There are a significant amount of issues with DROP COMPACT STORAGE > > that > > >>> can > > >>>> be pretty surprising for users. > > >>>> To name a few: > > >>>> * Some hidden columns will show up changing the resultset returned > for > > >>>> wildcard queries > > >>>> * As COMPACT tables did not have primary key liveness there empty > rows > > >>>> inserted AFTER the ALTER will be returned whereas the one inserted > > >> before > > >>>> the ALTER will not. > > >>>> * Also due to the lack of primary key liveness the amount of > SSTables > > >>> being > > >>>> read will increase resulting in slower queries > > >>>> * After DROP COMPACT it becomes possible to ALTER the table in a way > > >> that > > >>>> makes all the row disappears > > >>>> * There is a loss of functionality around null clustering when > > dropping > > >>>> compact storage (CASSANDRA-16069) > > >>>> > > >>>> In my opinion DROP COMPACT STORAGE is not ready for production use > > >> unless > > >>>> users fully understand what they are doing. > > >>>> By consequence, I am wondering if we should not mark it as > > experimental > > >>> as > > >>>> we did for the Materialized Views (CASSANDRA-13959). > > >>>> > > >>>> What is your opinion? > > >>>> > > >>> > > >> > > > > > > > > > -- > > > alex p > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > > For additional commands, e-mail: dev-h...@cassandra.apache.org > > > > >