> In fact there is a high learning curve to set up the cassandra-dtest environment
I think this is fairly well documented: https://github.com/apache/cassandra-dtest/blob/trunk/README.md

On Tue, Mar 29, 2022 at 5:27 PM Paulo Motta <pauloricard...@gmail.com> wrote:
>
> > I am curious about this comment. When I first joined I learned jvm-dtest within an hour and started walking Repair code in a debugger (and this was way before the improvements that let us do things like nodetool)… python dtest took weeks to get working correctly (still having issues with the MBean library we use… so have to comment out error handling to get some tests to pass)….
>
> Thanks for sharing your perspective. In fact there is a high learning curve to set up the cassandra-dtest environment, but once it's working it's pretty straightforward to test any existing or new functionality.
>
> I think with in-jvm dtests you don't have the hassle of setting up a different environment, and this is a great motivator to standardize on this solution. The main difficulty I had was testing features not supported by the framework, which require you to extend the framework. I don't recall having to extend ccm/cassandra-dtest many times when working on new features.
>
> Perhaps this has improved recently and we no longer need to worry about extending the framework or duplicating code when testing new functionality.
>
> On Tue, Mar 29, 2022 at 15:12, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>>
>> One thing that we can add to the docs is how people can update the in-jvm framework and test their patches before asking for an in-jvm API release. The assumption is that there won't be many updates needed, I think, but it is good to have this documented.
>>
>> On Tue, 29 Mar 2022 at 13:51, David Capwell <dcapw...@apple.com> wrote:
>>>
>>> They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>
>>>
>>> I think we can get rid of this by extending CassandraDaemon; we just need to add a few hooks to mock out gossip/internode/client (for cases where the mocks are desired), and when mocks are not desired just run the real logic.
>>>
>>> Too many times I have had to bring the two more in line, and this is hard to maintain… we should fix this, and I feel it is 100% fixable.
>>>
>>> we shouldn't neglect that there is a significant learning curve associated with it for new contributors which IMO is much lower for python dtests
>>>
>>>
>>> I am curious about this comment. When I first joined I learned jvm-dtest within an hour and started walking Repair code in a debugger (and this was way before the improvements that let us do things like nodetool)… python dtest took weeks to get working correctly (still having issues with the MBean library we use… so have to comment out error handling to get some tests to pass)….
>>>
>>> Maybe we could have some example docs showing how to do the same thing in both tools? Honestly, Cluster.build(3).withConfig(c -> c.with(Feature.values())).start() matches 95% of python dtest tests (the withConfig logic is a bit cryptic), so I don't think the docs would be too much work.
>>>
>>> On Mar 29, 2022, at 5:48 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>> we should at least write extensive documentation on how to use/modify the in-jvm dtest framework before deprecating python dtests.
>>>
>>> We should have this for all our testing frameworks, period: in-jvm dtest, python dtest, and ccm. They're woefully under-documented IMO.
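
For reference, a minimal in-jvm dtest built around the Cluster.build(3).withConfig(c -> c.with(Feature.values())).start() pattern David mentions above might look roughly like the sketch below. It is a sketch, not canonical usage: the class and package names follow the org.apache.cassandra.distributed API as I understand it, and the test class, keyspace, and table names are made up for illustration.

    // Sketch of a minimal in-jvm dtest: a 3-node cluster with gossip, internode
    // messaging and the native protocol enabled via Feature flags.
    import org.apache.cassandra.distributed.Cluster;
    import org.apache.cassandra.distributed.api.ConsistencyLevel;
    import org.apache.cassandra.distributed.api.Feature;
    import org.junit.Assert;
    import org.junit.Test;

    public class ExampleClusterTest
    {
        @Test
        public void readWriteAcrossNodes() throws Exception
        {
            try (Cluster cluster = Cluster.build(3)
                                          .withConfig(c -> c.with(Feature.values()))
                                          .start())
            {
                cluster.schemaChange("CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}");
                cluster.schemaChange("CREATE TABLE ks.tbl (pk int PRIMARY KEY, v int)");

                // Write through node 1, read through node 2 to exercise internode messaging.
                cluster.coordinator(1).execute("INSERT INTO ks.tbl (pk, v) VALUES (?, ?)", ConsistencyLevel.ALL, 1, 10);
                Object[][] rows = cluster.coordinator(2).execute("SELECT v FROM ks.tbl WHERE pk = ?", ConsistencyLevel.ALL, 1);
                Assert.assertEquals(1, rows.length);
            }
        }
    }

Cutting and pasting a template like this and editing it to suit is essentially the quick-start path suggested further down-thread.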
>>>
>>> On Tue, Mar 29, 2022, at 6:11 AM, Paulo Motta wrote:
>>>
>>> To elaborate a bit on the steep learning curve point: when mentoring new contributors, on a couple of occasions I told them to "just write a python dtest" because we had no idea how to test that functionality in in-jvm tests, while the python dtest was fairly straightforward to implement (I can't recall exactly what feature it was, but I can dig it up if necessary).
>>>
>>> While we might already be familiar with the in-jvm dtest framework due to our exposure to it, we shouldn't neglect that there is a significant learning curve associated with it for new contributors, which IMO is much lower for python dtests. So we should at least write extensive documentation on how to use/modify the in-jvm dtest framework before deprecating python dtests.
>>>
>>> On Tue, Mar 29, 2022 at 06:58, Paulo Motta <pauloricard...@gmail.com> wrote:
>>>
>>> > They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>
>>> I also have this concern. When adding a new service in CASSANDRA-16789 we had to explicitly modify the in-jvm dtest server to match the behavior of the actual server [1] (this is just a minor example, but I remember having to do something similar on other tickets).
>>>
>>> Besides the steep learning curve (users need to be familiar with the in-jvm dtest framework's internals in order to add new functionality not supported by it), this is potentially unsafe, since the implementations can diverge without being caught by tests.
>>>
>>> Is there any way we could avoid duplicating functionality in the test server and use the same initialization code in in-jvm dtests?
>>>
>>> [1] - https://github.com/apache/cassandra/commit/ad249424814836bd00f47931258ad58bfefb24fd#diff-321b52220c5bd0aaadf275a845143eb208c889c2696ba0d48a5fc880551131d8R735
>>>
>>> On Tue, Mar 29, 2022 at 04:22, Benjamin Lerer <ble...@apache.org> wrote:
>>>
>>> They use a separate implementation of instance initialization and thus they test the test server rather than the real node.
>>>
>>>
>>> This is actually my main concern. What is the real gap between the in-JVM dtest server instance and a server as run by the python dtests?
>>>
>>> On Tue, Mar 29, 2022 at 00:08, bened...@apache.org <bened...@apache.org> wrote:
>>>
>>> > Other than that, it can be problematic to test upgrades when the starting version must run with a different Java version than the end release
>>>
>>> The python upgrade tests seem to be particularly limited (from a quick skim, primarily testing major upgrade points that are now long in the past), so I'm not sure how much of a penalty this is today in practice - but it might well become a problem.
>>>
>>> There are several questions to answer, namely how many versions we want to:
>>>
>>> - test upgrades across
>>> - maintain backwards compatibility of the in-jvm dtest API across
>>> - support a given JVM for
>>>
>>> However, if we need to, we can probably use RMI to transparently support multiple JVMs for tests that require it. Since we already use serialization to cross the ClassLoader boundary, it might not even be very difficult.
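
As a concrete illustration of the last point, the general pattern for moving an object across a ClassLoader boundary via serialization looks roughly like the sketch below. This is a generic example, not the actual in-jvm dtest code: the class and method names are invented for illustration.

    // Generic sketch: serialize an object in one ClassLoader, then deserialize it
    // while resolving classes against a target ClassLoader, so the resulting object
    // "belongs" to the other side. The in-jvm dtest framework relies on this kind of
    // serialization step when crossing instance class loaders.
    import java.io.*;

    final class ClassLoaderTransfer
    {
        static Object transfer(Serializable value, ClassLoader target) throws IOException, ClassNotFoundException
        {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer))
            {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))
            {
                @Override
                protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException
                {
                    // Resolve every class against the target ClassLoader.
                    return Class.forName(desc.getName(), false, target);
                }
            })
            {
                return in.readObject();
            }
        }
    }

The same serialized form could just as easily travel over a socket or RMI to another JVM, which is why extending the approach across processes seems plausible.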
>>>
>>> From: Jacek Lewandowski <lewandowski.ja...@gmail.com>
>>> Date: Monday, 28 March 2022 at 22:30
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>
>>> Although I like in-jvm DTests for many scenarios, I can see that they do not test the production code as it is. They use a separate implementation of instance initialization and thus they test the test server rather than the real node. Other than that, it can be problematic to test upgrades when the starting version must run with a different Java version than the end release. One more thing I've been observing sometimes is high consumption of metaspace, which does not seem to be cleaned up after individual test cases. Given that each started instance uses a dedicated class loader, there is some amount of garbage left behind, and when there are a couple of multi-node test cases in a single test class, it sometimes happens that the tests fail with an out-of-memory-in-metaspace error.
>>>
>>> Thanks,
>>> Jacek
>>>
>>> On Mon, Mar 28, 2022 at 10:06 PM David Capwell <dcapw...@apple.com> wrote:
>>>
>>> I am back, and the work for trunk to support vnodes is at the last stage of review; I had not planned to backport the changes to other branches (i.e., older branches would only support a single token), so if someone would like to pick up this work it is rather LHF after CASSANDRA-17332 goes in (see trunk patch GH PR: trunk).
>>>
>>> I am in favor of deprecating python dtests, and agree we should figure out what the gaps are (once vnode support is merged) so we can either shrink them or special-case them to unfreeze (such as allowing startup changes).
>>>
>>> On Mar 14, 2022, at 6:13 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>> vnode support for in-jvm dtests is in flight and fairly straightforward:
>>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-17332
>>>
>>> David's OOO right now but I suspect we can get this in sometime in April.
>>>
>>> On Mon, Mar 14, 2022, at 8:36 AM, bened...@apache.org wrote:
>>>
>>> This is the limitation I mentioned. I think this is solely a question of supplying an initial config that uses vnodes, i.e. one that specifies multiple tokens for each node. It is not really a limitation – I believe a dtest could be written today using vnodes, by overriding the config's tokens. It does look like the token handling has been refactored since the initial implementation to make this a little uglier than should be necessary.
>>>
>>> We should make this trivial, anyway, and perhaps offer a way to run all of the dtests with vnodes (and suitably annotate those that cannot be run with vnodes). This should be quite easy.
>>>
>>> From: Andrés de la Peña <adelap...@apache.org>
>>> Date: Monday, 14 March 2022 at 12:28
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] Should we deprecate / freeze python dtests
>>>
>>> Last time I checked there wasn't support for vnodes in in-jvm dtests, which seems like an important limitation.
>>>
>>> On Mon, 14 Mar 2022 at 12:24, bened...@apache.org <bened...@apache.org> wrote:
>>>
>>> I am strongly in favour of deprecating python dtests in all cases where they are currently superseded by in-jvm dtests.
>>> They are environmentally more challenging to work with, causing many problems on local and remote machines. They are harder to debug, slower, flakier, and mostly less sophisticated.
>>>
>>> > all focus on getting the in-jvm framework robust enough to cover edge-cases
>>>
>>> It would be great to collect the gaps. I think it's just vnodes, which is by no means a fundamental limitation? There may also be some stuff to do with startup/shutdown and environmental scripts; that may be a niche we retain something like python dtests for.
>>>
>>> > people aren't familiar
>>>
>>> I would be interested to hear from these folk to understand their concerns or problems using in-jvm dtests, if there is a cohort holding off for this reason.
>>>
>>> > This is going to require documentation work from some of the original authors
>>>
>>> I think a collection of template-like tests we can point people to would be a cheap initial effort. Cutting and pasting an existing test with the required functionality, then editing it to suit, should get most people who aren't familiar off to a quick start.
>>>
>>> > Labor and process around revving new releases of the in-jvm dtest API
>>>
>>> I think we need to revisit how we do this, as it is currently broken. We should consider either using ASF snapshots until we cut new releases of C* itself, or else using git subprojects. This will also become a problem for Accord's integration over time, and perhaps for other subprojects in the future, so it is worth solving this properly.
>>>
>>> I think this has been made worse than necessary by moving too many implementation details to the shared API project – some should be retained within the C* tree, with the API primarily serving as the shared API itself to ensure cross-version compatibility. However, this is far from a complete explanation of (or solution to) the problem.
>>>
>>> From: Josh McKenzie <jmcken...@apache.org>
>>> Date: Monday, 14 March 2022 at 12:11
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Subject: [DISCUSS] Should we deprecate / freeze python dtests
>>>
>>> I've been wrestling with the python dtests recently, and that led to some discussions with other contributors about whether we as a project should be writing new tests in the python dtest framework or the in-jvm framework. This discussion has come up tangentially on some other topics, including the lack of documentation / expertise on the in-jvm framework disincentivizing some folks from authoring new tests there vs. the difficulty of debugging and maintaining timer-based, sleep-based, non-deterministic python dtests, etc.
>>>
>>> I don't know of a place where we've formally discussed this and made a project-wide call on where we expect new distributed tests to be written; if I've missed an email about this, someone please link it on the thread here (and stop reading! ;))
>>>
>>> At this time we don't specify a preference for where to write new multi-node distributed tests in the "development/testing" portion of our site and documentation: https://cassandra.apache.org/_/development/testing.html
>>>
>>> The primary tradeoffs, as I understand them, of moving from python-based multi-node testing to jdk-based testing are:
>>>
>>> Pros:
>>> - Better debugging functionality (breakpoints, IDE integration, etc.)
>>> - Integration with the simulator
>>> - More deterministic runtime (anecdotally; python dtests _should_ be deterministic, but in practice they prove to be very prone to environmental disruption)
>>> - Test-time visibility into the internals of Cassandra
>>>
>>> Cons:
>>> - The framework is not as mature as the python dtest framework (some functionality missing)
>>> - Labor and process around revving new releases of the in-jvm dtest API
>>> - People aren't familiar with it yet and there's a learning curve
>>>
>>> So my bid here: I personally think we as a project should freeze writing new tests in the python dtest framework and all focus on getting the in-jvm framework robust enough to cover the edge-cases that might still be causing new tests to be written in the python framework. This is going to require documentation work from some of the original authors of the in-jvm framework as well as folks currently familiar with it, and effort from those of us not yet intimately familiar with the API to get to know it; however, I believe the long-term benefits to the project will be well worth it.
>>>
>>> We could institute a pre-commit check that warns when a commit increases our raw count of python dtests, to help provide process-based visibility into this change in direction for the project's testing.
>>>
>>> So: what do we think?
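
On the "some functionality missing" con, the main gap called out upthread is vnode support (CASSANDRA-17332). Following benedict's suggestion that this is mostly a matter of supplying an initial config with multiple tokens per node, the shape of such a test might look roughly like the sketch below. This is only a sketch: num_tokens is the cassandra.yaml key, and whether setting it through the instance config is sufficient on a given branch, or whether the builder's token handling also needs overriding, is exactly what the vnode-support work addresses.

    // Sketch only: request vnodes by overriding the instance config, per the
    // "supply an initial config that uses vnodes" idea upthread. Depending on the
    // branch, the builder's token handling (initial_token / token supplier) may
    // also need to be overridden for this to actually work.
    import org.apache.cassandra.distributed.Cluster;
    import org.apache.cassandra.distributed.api.Feature;
    import org.junit.Test;

    public class VnodeClusterSketchTest
    {
        @Test
        public void startsWithVnodes() throws Exception
        {
            try (Cluster cluster = Cluster.build(2)
                                          .withConfig(c -> c.with(Feature.GOSSIP, Feature.NETWORK)
                                                            .set("num_tokens", 16))
                                          .start())
            {
                cluster.schemaChange("CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}");
            }
        }
    }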