How are we gauging what our python dtest coverage is vs. in-jvm dtest coverage?
On Wed, Mar 30, 2022, at 4:51 AM, Benjamin Lerer wrote:
>>
>> I think we can get rid of this by extending CassandraDaemon, just need to
>> add a few hooks to mock out gossip/internode/client (for cases where the
>> mocks are desired), and when mocks are not desired just run the real logic.
>>
>> Too many times I have had to make the 2 more in-line, and this is hard to
>> maintain… we should fix this, and I feel this is 100% fixable
>
> Thanks for the explanation David. Outside of this area, is there some other
> difference in the coverage of the tests? Is serialization fully covered?
> I would like to be sure that we will not miss anything by using in-jvm dtests
> instead of python dtests.
>
> On Wed, 30 Mar 2022 at 02:15, bened...@apache.org <bened...@apache.org> wrote:
>> > a well-defined path to reduce/eliminate code duplication and basic
>> > documentation for newcomers to get up to speed with writing in-jvm dtests
>> > and extending the framework
>>
>> Are python tests much better here? If not, I do not see why these should be
>> blockers for their deprecation.
>>
>> Perfect feature parity also seems unnecessary - unless a missing feature is
>> an active impediment. But as far as I know every missing feature is actively
>> under development and can be expected very soon.
>>
>> Let’s get this decision over and done with.
>>
>> From: Paulo Motta <pauloricard...@gmail.com>
>> Date: Wednesday, 30 March 2022 at 00:46
>> To: Cassandra DEV <dev@cassandra.apache.org>
>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>
>> I support deprecating python dtests, as long as in-jvm dtests have feature
>> parity with python dtests, a well-defined path to reduce/eliminate code
>> duplication and basic documentation for newcomers to get up to speed with
>> writing in-jvm dtests and extending the framework.
>>
>> On Tue, 29 Mar 2022 at 20:09, bened...@apache.org <bened...@apache.org> wrote:
>>> It often does not work. I can attest to many wasted weeks, on some
>>> environments never getting them to work.
>>>
>>> They happen to work right now for me, though.
>>>
>>> I think the learning curve thing is a bit of a distraction, personally. I
>>> have always found python dtests hard to work with, both developing against
>>> and running, so their learning curve for me is going on 10 years. Some folk
>>> may be more comfortable with python dtests due to their familiarity with
>>> python, ccm or other tooling, but that is a different matter.
>>>
>>> Looking at git, most contributors to python dtests are contributors to
>>> in-jvm dtests, and the latter have received 20x as many net code
>>> contributions over the past year.
>>>
>>> I think it’s quite justified to just say in-jvm dtests are simply better to
>>> work with, and already better and more widely used despite their youth,
>>> whatever their remaining teething problems.
>>>
>>> I vote we immediately discontinue python dtest development, and discontinue
>>> running python dtests pre-commit, retaining them for releases only. This
>>> will provide the necessary impetus to polish off any last remaining gaps,
>>> without reducing coverage.
>>>
>>> From: Brandon Williams <dri...@gmail.com>
>>> Date: Tuesday, 29 March 2022 at 23:42
>>> To: dev <dev@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>
>>> > In fact there is a high learning curve to set up the cassandra-dtest
>>> > environment
>>>
>>> I think this is fairly well documented:
>>> https://github.com/apache/cassandra-dtest/blob/trunk/README.md
>>>
>>> On Tue, Mar 29, 2022 at 5:27 PM Paulo Motta <pauloricard...@gmail.com>
>>> wrote:
>>> >
>>> > > I am curious about this comment.
>>> > > When I first joined I learned
>>> > > jvm-dtest within an hour and started walking Repair code in a debugger
>>> > > (and this was way before the improvements that let us do things like
>>> > > nodetool)… python dtest took weeks to get working correctly (still
>>> > > having issues with the MBean library we use… so have to comment out
>>> > > error handling to get some tests to pass)….
>>> >
>>> > Thanks for sharing your perspective. In fact there is a high learning
>>> > curve to set up the cassandra-dtest environment, but once it's working
>>> > it's pretty straightforward to test any existing or new functionality.
>>> >
>>> > I think with in-jvm dtests you don't have the hassle of setting up a
>>> > different environment and this is a great motivator to standardize on
>>> > this solution. The main difficulty I had was testing features not
>>> > supported by the framework, which require you to extend the framework. I
>>> > don't recall having to extend ccm/cassandra-dtest many times when working
>>> > on new features.
>>> >
>>> > Perhaps this has improved recently and we no longer need to worry about
>>> > extending the framework or duplicating code when testing new
>>> > functionality.
>>> >
>>> > On Tue, 29 Mar 2022 at 15:12, Ekaterina Dimitrova
>>> > <e.dimitr...@gmail.com> wrote:
>>> >>
>>> >> One thing that we can add to the docs is how people can update the
>>> >> in-jvm framework and test their patches before asking for an in-jvm API
>>> >> release. The assumption is that there won’t be many updates needed, I
>>> >> think, but it is good to have it documented.
>>> >>
>>> >> On Tue, 29 Mar 2022 at 13:51, David Capwell <dcapw...@apple.com> wrote:
>>> >>>
>>> >>> They use a separate implementation of instance initialization and thus
>>> >>> they test the test server rather than the real node.
>>> >>>
>>> >>> I think we can get rid of this by extending CassandraDaemon, just need
>>> >>> to add a few hooks to mock out gossip/internode/client (for cases where
>>> >>> the mocks are desired), and when mocks are not desired just run the
>>> >>> real logic.
>>> >>>
>>> >>> Too many times I have had to make the 2 more in-line, and this is hard
>>> >>> to maintain… we should fix this, and I feel this is 100% fixable
>>> >>>
>>> >>> we shouldn't neglect that there is a significant learning curve
>>> >>> associated with it for new contributors which IMO is much lower for
>>> >>> python dtests
>>> >>>
>>> >>> I am curious about this comment. When I first joined I learned
>>> >>> jvm-dtest within an hour and started walking Repair code in a debugger
>>> >>> (and this was way before the improvements that let us do things like
>>> >>> nodetool)… python dtest took weeks to get working correctly (still
>>> >>> having issues with the MBean library we use… so have to comment out
>>> >>> error handling to get some tests to pass)….
>>> >>>
>>> >>> Maybe we could have some example docs showing how to do the same in
>>> >>> both tools? Honestly,
>>> >>> Cluster.build(3).withConfig(c -> c.with(Feature.values())).start()
>>> >>> matches 95% of python dtest tests (the withConfig logic is a bit
>>> >>> cryptic), so I don’t think the docs would be too much work
>>> >>>
>>> >>> On Mar 29, 2022, at 5:48 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>> >>>
>>> >>> we should at least write extensive documentation on how to use/modify
>>> >>> in-jvm dtest framework before deprecating python dtests.
>>> >>>
>>> >>> We should have this for all our testing frameworks period, in-jvm
>>> >>> dtest, python dtest, and ccm. They're woefully under-documented IMO.
>>> >>>
>>> >>> On Tue, Mar 29, 2022, at 6:11 AM, Paulo Motta wrote:
>>> >>>
>>> >>> To elaborate a bit on the steep learning curve point, when mentoring
>>> >>> new contributors on a couple of occasions I told them to "just write a
>>> >>> python dtest" because we had no idea how to test that functionality
>>> >>> on in-jvm tests while the python dtest was fairly straightforward to
>>> >>> implement (I can't recall exactly what feature it was but I can dig if
>>> >>> necessary).
>>> >>>
>>> >>> While we might be already familiar with the in-jvm dtest framework due
>>> >>> to our exposure to it, we shouldn't neglect that there is a significant
>>> >>> learning curve associated with it for new contributors which IMO is
>>> >>> much lower for python dtests. So we should at least write extensive
>>> >>> documentation on how to use/modify in-jvm dtest framework before
>>> >>> deprecating python dtests.
>>> >>>
>>> >>> On Tue, 29 Mar 2022 at 06:58, Paulo Motta
>>> >>> <pauloricard...@gmail.com> wrote:
>>> >>>
>>> >>> > They use a separate implementation of instance initialization and
>>> >>> > thus they test the test server rather than the real node.
>>> >>>
>>> >>> I also have this concern. When adding a new service on CASSANDRA-16789
>>> >>> we had to explicitly modify the in-jvm dtest server to match the
>>> >>> behavior from the actual server [1] (this is just a minor example but I
>>> >>> remember having to do something similar on other tickets).
>>> >>>
>>> >>> Besides having a steep learning curve since users need to be familiar
>>> >>> with the in-jvm dtest framework in order to add new functionality not
>>> >>> supported by it, this is potentially unsafe, since the implementations
>>> >>> can diverge without being caught by tests.
>>> >>>
>>> >>> Is there any way we could avoid duplicating functionality on the test
>>> >>> server and use the same initialization code on in-jvm dtests?
>>> >>>
>>> >>> [1] -
>>> >>> https://github.com/apache/cassandra/commit/ad249424814836bd00f47931258ad58bfefb24fd#diff-321b52220c5bd0aaadf275a845143eb208c889c2696ba0d48a5fc880551131d8R735
>>> >>>
>>> >>> On Tue, 29 Mar 2022 at 04:22, Benjamin Lerer
>>> >>> <ble...@apache.org> wrote:
>>> >>>
>>> >>> They use a separate implementation of instance initialization and thus
>>> >>> they test the test server rather than the real node.
>>> >>>
>>> >>> This is actually my main concern. What is the real gap between the
>>> >>> in-JVM tests server instance and a server as run by python DTests?
>>> >>>
>>> >>> On Tue, 29 Mar 2022 at 00:08, bened...@apache.org <bened...@apache.org>
>>> >>> wrote:
>>> >>>
>>> >>> > Other than that, it can be problematic to test upgrades when the
>>> >>> > starting version must run with a different Java version than the end
>>> >>> > release
>>> >>>
>>> >>> python upgrade tests seem to be particularly limited (from a quick
>>> >>> skim, primarily testing major upgrade points that are now long in the
>>> >>> past), so I’m not sure how much of a penalty this is today in
>>> >>> practice - but it might well become a problem.
>>> >>>
>>> >>> There are several questions to answer, namely how many versions we
>>> >>> want to:
>>> >>>
>>> >>> - test upgrades across
>>> >>> - maintain backwards compatibility of the in-jvm dtest api across
>>> >>> - support a given JVM for
>>> >>>
>>> >>> However, if we need to, we can probably use RMI to transparently
>>> >>> support multiple JVMs for tests that require it. Since we already use
>>> >>> serialization to cross the ClassLoader boundary it might not even be
>>> >>> very difficult.
>>> >>>
>>> >>> From: Jacek Lewandowski <lewandowski.ja...@gmail.com>
>>> >>> Date: Monday, 28 March 2022 at 22:30
>>> >>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> >>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>> >>>
>>> >>> Although I like in-jvm DTests for many scenarios, I can see that they
>>> >>> do not test the production code as it is. They use a separate
>>> >>> implementation of instance initialization and thus they test the test
>>> >>> server rather than the real node. Other than that, it can be
>>> >>> problematic to test upgrades when the starting version must run with a
>>> >>> different Java version than the end release. One more thing I've been
>>> >>> observing sometimes is high consumption of metaspace, which does not
>>> >>> seem to be cleaned after individual test cases. Given each started
>>> >>> instance uses a dedicated class loader there is some amount of trash
>>> >>> left and when there are a couple of multi-node test cases in a single
>>> >>> test class, the tests sometimes fail with an out-of-memory error in
>>> >>> metaspace.
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Jacek
>>> >>>
>>> >>> On Mon, Mar 28, 2022 at 10:06 PM David Capwell <dcapw...@apple.com>
>>> >>> wrote:
>>> >>>
>>> >>> I am back and the work for trunk to support vnode is at the last stage
>>> >>> of review; I had not planned to backport the changes to other branches
>>> >>> (aka, older branches would only support single token), so if someone
>>> >>> would like to pick up this work it is rather LHF after 17332 goes in
>>> >>> (see trunk patch GH PR: trunk).
>>> >>>
>>> >>> I am in favor of deprecating python dtests, and agree we should figure
>>> >>> out what the gaps are (once vnode support is merged) so we can either
>>> >>> shrink them or special case to unfreeze (such as startup changes being
>>> >>> allowed).
>>> >>>
>>> >>> On Mar 14, 2022, at 6:13 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>> >>>
>>> >>> vnode support for in-jvm dtests is in flight and fairly straightforward:
>>> >>>
>>> >>> https://issues.apache.org/jira/browse/CASSANDRA-17332
>>> >>>
>>> >>> David's OOO right now but I suspect we can get this in in April some
>>> >>> time.
>>> >>>
>>> >>> On Mon, Mar 14, 2022, at 8:36 AM, bened...@apache.org wrote:
>>> >>>
>>> >>> This is the limitation I mentioned. I think this is solely a question
>>> >>> of supplying an initial config that uses vnodes, i.e. that specifies
>>> >>> multiple tokens for each node. It is not really a limitation – I
>>> >>> believe a dtest could be written today using vnodes, by overriding the
>>> >>> config’s tokens. It does look like the token handling has been
>>> >>> refactored since the initial implementation to make this a little
>>> >>> uglier than should be necessary.
>>> >>>
>>> >>> We should make this trivial, anyway, and perhaps offer a way to run all
>>> >>> of the dtests with vnodes (and suitably annotating those that cannot be
>>> >>> run with vnodes). This should be quite easy.
>>> >>>
>>> >>> From: Andrés de la Peña <adelap...@apache.org>
>>> >>> Date: Monday, 14 March 2022 at 12:28
>>> >>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> >>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>> >>>
>>> >>> Last time I checked there wasn't support for vnodes on in-jvm dtests,
>>> >>> which seems an important limitation.
>>> >>>
>>> >>> On Mon, 14 Mar 2022 at 12:24, bened...@apache.org <bened...@apache.org>
>>> >>> wrote:
>>> >>>
>>> >>> I am strongly in favour of deprecating python dtests in all cases where
>>> >>> they are currently superseded by in-jvm dtests.
>>> >>> They are environmentally more challenging to work with, causing many
>>> >>> problems on local and remote machines. They are harder to debug,
>>> >>> slower, flakier, and mostly less sophisticated.
>>> >>>
>>> >>> > all focus on getting the in-jvm framework robust enough to cover
>>> >>> > edge-cases
>>> >>>
>>> >>> Would be great to collect gaps. I think it’s just vnodes, which is by
>>> >>> no means a fundamental limitation? There may also be some stuff to do
>>> >>> startup/shutdown and environmental scripts, that may be a niche we
>>> >>> retain something like python dtests for.
>>> >>>
>>> >>> > people aren’t familiar
>>> >>>
>>> >>> I would be interested to hear from these folk to understand their
>>> >>> concerns or problems using in-jvm dtests, if there is a cohort holding
>>> >>> off for this reason
>>> >>>
>>> >>> > This is going to require documentation work from some of the original
>>> >>> > authors
>>> >>>
>>> >>> I think a collection of template-like tests we can point people to
>>> >>> would be a cheap initial effort. Cutting and pasting an existing test
>>> >>> with the required functionality, then editing to suit, should get most
>>> >>> people off to a quick start who aren’t familiar.
>>> >>>
>>> >>> > Labor and process around revving new releases of the in-jvm dtest API
>>> >>>
>>> >>> I think we need to revisit how we do this, as it is currently broken.
>>> >>> We should consider either using ASF snapshots until we cut new releases
>>> >>> of C* itself, or else using git subprojects. This will also become a
>>> >>> problem for Accord’s integration over time, and perhaps other
>>> >>> subprojects in future, so it is worth better solving this.
>>> >>>
>>> >>> I think this has been made worse than necessary by moving too many
>>> >>> implementation details to the shared API project – some should be
>>> >>> retained within the C* tree, with the API primarily serving as the
>>> >>> shared API itself to ensure cross-version compatibility. However, this
>>> >>> is far from a complete explanation of (or solution to) the problem.
>>> >>>
>>> >>> From: Josh McKenzie <jmcken...@apache.org>
>>> >>> Date: Monday, 14 March 2022 at 12:11
>>> >>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> >>> Subject: [DISCUSS] Should we deprecate / freeze python dtests
>>> >>>
>>> >>> I've been wrestling with the python dtests recently and that led to
>>> >>> some discussions with other contributors about whether we as a project
>>> >>> should be writing new tests in the python dtest framework or the in-jvm
>>> >>> framework. This discussion has come up tangentially on some other
>>> >>> topics, including the lack of documentation / expertise on the in-jvm
>>> >>> framework dis-incentivizing some folks from authoring new tests there
>>> >>> vs. the difficulty debugging and maintaining timer-based, sleep-based
>>> >>> non-deterministic python dtests, etc.
>>> >>>
>>> >>> I don't know of a place where we've formally discussed this and made a
>>> >>> project-wide call on where we expect new distributed tests to be
>>> >>> written; if I've missed an email about this someone please link on the
>>> >>> thread here (and stop reading! ;))
>>> >>>
>>> >>> At this time we don't specify a preference for where you write new
>>> >>> multi-node distributed tests on our "development/testing" portion of
>>> >>> the site and documentation:
>>> >>> https://cassandra.apache.org/_/development/testing.html
>>> >>>
>>> >>> The primary tradeoffs as I understand them for moving from python-based
>>> >>> multi-node testing to jdk-based are:
>>> >>>
>>> >>> Pros:
>>> >>>
>>> >>> - Better debugging functionality (breakpoints, IDE integration, etc)
>>> >>> - Integration with simulator
>>> >>> - More deterministic runtime (anecdotally; python dtests _should_ be
>>> >>>   deterministic but in practice they prove to be very prone to
>>> >>>   environmental disruption)
>>> >>> - Test time visibility to internals of cassandra
>>> >>>
>>> >>> Cons:
>>> >>>
>>> >>> - The framework is not as mature as the python dtest framework (some
>>> >>>   functionality missing)
>>> >>> - Labor and process around revving new releases of the in-jvm dtest API
>>> >>> - People aren't familiar with it yet and there's a learning curve
>>> >>>
>>> >>> So my bid here: I personally think we as a project should freeze
>>> >>> writing new tests in the python dtest framework and all focus on
>>> >>> getting the in-jvm framework robust enough to cover edge-cases that
>>> >>> might still be causing new tests to be written in the python framework.
>>> >>> This is going to require documentation work from some of the original
>>> >>> authors of the in-jvm framework as well as folks currently familiar
>>> >>> with it and effort from those of us not yet intimately familiar with
>>> >>> the API to get to know it, however I believe the long-term benefits to
>>> >>> the project will be well worth it.
>>> >>>
>>> >>> We could institute a pre-commit check that warns on a commit increasing
>>> >>> our raw count of python dtests to help provide process-based visibility
>>> >>> to this change in direction for the project's testing.
>>> >>>
>>> >>> So: what do we think?