Nikolay,

> Can you, please, write down your version of requirements so we can reach a consensus on that and therefore move to the discussion of the implementation?
I guess Max's requirements are quite similar to yours:

1. The framework should support deployment on hardware/docker/AWS ...
2. Integration with TeamCity/Jenkins
3. Client Java applications contain the basic test logic, Python is used for deployment/log analysis
4. Tests can be executed against the dev branch/release build
5. The framework should allow us to create stable performance tests
6. Clear reporting

Max, please correct me if I missed something.

On Mon, Jul 6, 2020 at 14:27, Anton Vinogradov <a...@apache.org> wrote:
> Max,
>
> Thanks for the check!
>
> > Is it OK for those tests to fail?
> No.
> I see really strange things in the logs.
> It looks like a concurrent ducktests run started unexpected services, and this broke the tests.
> Could you please clean up the docker environment (use the clean-up script [1]), compile the sources (use the build script [2]), and rerun the tests.
>
> [1] https://github.com/anton-vinogradov/ignite/blob/dc98ee9df90b25eb5d928090b0e78b48cae2392e/modules/ducktests/tests/docker/clean_up.sh
> [2] https://github.com/anton-vinogradov/ignite/blob/3c39983005bd9eaf8cb458950d942fb592fff85c/scripts/build.sh
>
> On Mon, Jul 6, 2020 at 12:03 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > Hello, Maxim.
> >
> > Thanks for writing down the minutes.
> >
> > There is no such thing as a «Nikolay team» on the dev list.
> > I propose to focus on product requirements and what we want to gain from the framework instead of taking into account the needs of some particular team.
> >
> > Can you, please, write down your version of requirements so we can reach a consensus on that and therefore move to the discussion of the implementation?
> >
> > > On Jul 6, 2020, at 11:18, Max Shonichev <mshon...@yandex.ru> wrote:
> > >
> > > Yes, Denis,
> > >
> > > the common ground seems to be as follows:
> > > Anton Vinogradov and Nikolay Izhikov will try to prepare and run the PoC over physical hosts and share benchmark results. In the meantime, while I strongly believe that a dockerized approach to benchmarking is a road to misleading results and false positives, I'll prepare a PoC of Tiden in a dockerized environment to support the 'fast development prototyping' use case Nikolay's team insists on. It should be a matter of a few days.
> > >
> > > As a side note, I've run Anton's PoC locally and would like to have some comments about the results:
> > >
> > > Test system: Ubuntu 18.04, docker 19.03.6
> > > Test commands:
> > >
> > > git clone -b ignite-ducktape g...@github.com:anton-vinogradov/ignite.git
> > > cd ignite
> > > mvn clean install -DskipTests -Dmaven.javadoc.skip=true -Pall-java,licenses,lgpl,examples,!spark-2.4,!spark,!scala
> > > cd modules/ducktests/tests/docker
> > > ./run_tests.sh
> > >
> > > Test results:
> > >
> > > ====================================================================================================
> > > SESSION REPORT (ALL TESTS)
> > > ducktape version: 0.7.7
> > > session_id:       2020-07-05--004
> > > run time:         7 minutes 36.360 seconds
> > > tests run:        5
> > > passed:           3
> > > failed:           2
> > > ignored:          0
> > > ====================================================================================================
> > > test_id:  ignitetest.tests.benchmarks.add_node_rebalance_test.AddNodeRebalanceTest.test_add_node.version=2.8.1
> > > status:   FAIL
> > > run time: 3 minutes 12.232 seconds
> > > ----------------------------------------------------------------------------------------------------
> > > test_id:  ignitetest.tests.benchmarks.pme_free_switch_test.PmeFreeSwitchTest.test.version=2.7.6
> > > status:   FAIL
> > > run time: 1 minute 33.076 seconds
> > >
> > > Is it OK for those tests to fail? Attached is the full test report.
> > >
> > > On 02.07.2020 17:46, Denis Magda wrote:
> > >> Folks,
> > >> Please share the summary of that Slack conversation here for the records once you find common ground.
> > >> -
> > >> Denis
> > >> On Thu, Jul 2, 2020 at 3:22 AM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >>> Igniters.
> > >>>
> > >>> All who are interested in the integration testing framework discussion are welcome in the Slack channel -
> > >>> https://join.slack.com/share/zt-fk2ovehf-TcomEAwiXaPzLyNKZbmfzw?cdn_fallback=2
> > >>>
> > >>>> On Jul 2, 2020, at 13:06, Anton Vinogradov <a...@apache.org> wrote:
> > >>>>
> > >>>> Max,
> > >>>> Thanks for joining us.
> > >>>>
> > >>>>> 1. tiden can deploy artifacts by itself, while ducktape relies on dependencies being deployed by external scripts.
> > >>>> No. It is important to distinguish development, deployment, and orchestration.
> > >>>> All-in-one solutions have extremely limited usability.
> > >>>> As to ducktests:
> > >>>> Docker is responsible for deployments during development.
> > >>>> CI/CD is responsible for deployments during release and nightly checks. It's up to the team to choose AWS, VMs, bare metal, and even the OS.
> > >>>> Ducktape is responsible for orchestration.
> > >>>>
> > >>>>> 2. tiden can execute actions over remote nodes in real parallel fashion, while ducktape internally does all actions sequentially.
> > >>>> No. Ducktape may start any service in parallel. See the PME-free benchmark [1] for details.
> > >>>>
> > >>>>> if we used ducktape solution we would have to instead prepare some deployment scripts to pre-initialize Sberbank hosts, for example, with Ansible or Chef.
> > >>>> Sure, because the way of deployment depends on the infrastructure.
> > >>>> How can we be sure that the OS we use and the restrictions we have will be compatible with Tiden?
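For illustration, the parallel node start Anton refers to boils down to something like the following sketch (plain Python; the `service` object and its per-node methods are hypothetical placeholders, not the actual ducktests API):

    from concurrent.futures import ThreadPoolExecutor

    def start_cluster_in_parallel(service, nodes, timeout_sec=60):
        # Submit all node starts at once instead of booting nodes one-by-one.
        with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
            futures = [pool.submit(service.start_node, node) for node in nodes]
            for future in futures:
                future.result(timeout=timeout_sec)  # surface any start failure early
        # Only after every node is up, wait for the expected topology to form.
        for node in nodes:
            service.await_started(node, timeout_sec)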
> > >>>>
> > >>>>> You have solved this deficiency with docker by putting all dependencies into one uber-image ...
> > >>>> and
> > >>>>> I guess we all know about docker hyped ability to run over distributed virtual networks.
> > >>>> It is very important not to confuse the test's development (the docker image you're talking about) and the real deployment.
> > >>>>
> > >>>>> If we had stopped and started 5 nodes one-by-one, as ducktape does
> > >>>> All actions can be performed in parallel.
> > >>>> See how ducktests [2] starts the cluster in parallel, for example.
> > >>>>
> > >>>> [1] https://github.com/apache/ignite/pull/7967/files#diff-59adde2a2ab7dc17aea6c65153dfcda7R84
> > >>>> [2] https://github.com/apache/ignite/pull/7967/files#diff-d6a7b19f30f349d426b8894a40389cf5R79
> > >>>>
> > >>>> On Thu, Jul 2, 2020 at 1:00 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >>>> Hello, Maxim.
> > >>>>
> > >>>>> 1. tiden can deploy artifacts by itself, while ducktape relies on dependencies being deployed by external scripts
> > >>>> Why do you think that maintaining deployment scripts coupled with the testing framework is an advantage?
> > >>>> I thought we wanted to keep the deployment scripts separate from the testing framework.
> > >>>>
> > >>>>> 2. tiden can execute actions over remote nodes in real parallel fashion, while ducktape internally does all actions sequentially.
> > >>>> Can you, please, clarify what actions you have in mind?
> > >>>> And why would we want to execute them concurrently?
> > >>>> Ignite node start and client application execution can be done concurrently with the ducktape approach.
> > >>>>
> > >>>>> If we used ducktape solution we would have to instead prepare some deployment scripts to pre-initialize Sberbank hosts, for example, with Ansible or Chef
> > >>>> We shouldn't take some particular user's approach as an argument in this discussion. Let's discuss a general approach for all users of Ignite. Anyway, what is wrong with the external deployment script approach?
> > >>>>
> > >>>> We, as a community, should provide several ways to run integration tests out-of-the-box AND the ability to customize deployment for the user's landscape.
> > >>>>
> > >>>>> You have solved this deficiency with docker by putting all dependencies into one uber-image and that looks like simple and elegant solution however, that effectively limits you to single-host testing.
> > >>>> The docker image should be used only by Ignite developers to test something locally.
> > >>>> It's not intended for real-world testing.
> > >>>>
> > >>>> The main issue I see with Tiden is that it was tested and maintained as a closed-source solution.
> > >>>> This can lead to hard-to-solve problems when we start using and maintaining it as an open-source solution.
> > >>>> For example, how many developers have used Tiden? And how many of those developers were not the authors of Tiden itself?
> > >>>>
> > >>>>> On Jul 2, 2020, at 12:30, Max Shonichev <mshon...@yandex.ru> wrote:
> > >>>>>
> > >>>>> Anton, Nikolay,
> > >>>>>
> > >>>>> Let's agree on what we are arguing about: whether it is about "like or don't like" or about the technical properties of the suggested solutions.
> > >>>>>
> > >>>>> If it is about likes and dislikes, then the whole discussion is meaningless. However, I hope together we can analyse the pros and cons carefully.
> > >>>>>
> > >>>>> As far as I can understand now, the two main differences between ducktape and tiden are that:
> > >>>>>
> > >>>>> 1. tiden can deploy artifacts by itself, while ducktape relies on dependencies being deployed by external scripts.
> > >>>>>
> > >>>>> 2. tiden can execute actions over remote nodes in a real parallel fashion, while ducktape internally does all actions sequentially.
> > >>>>>
> > >>>>> As for me, these are very important properties for a distributed testing framework.
> > >>>>>
> > >>>>> The first property lets us easily reuse tiden in existing infrastructures. For example, during the Zookeeper IEP testing at the Sberbank site we used the same tiden scripts that we use in our lab; the only change was putting a list of hosts into the config.
> > >>>>>
> > >>>>> If we used the ducktape solution we would instead have to prepare some deployment scripts to pre-initialize the Sberbank hosts, for example, with Ansible or Chef.
> > >>>>>
> > >>>>> You have solved this deficiency with docker by putting all dependencies into one uber-image, and that looks like a simple and elegant solution; however, it effectively limits you to single-host testing.
> > >>>>>
> > >>>>> I guess we all know about docker's hyped ability to run over distributed virtual networks. We used to go that way, but quickly found that it is more hype than real work. In real environments there are problems with routing, DNS, multicast and broadcast traffic, and many others, that turn a docker-based distributed solution into a fragile, hard-to-maintain monster.
> > >>>>>
> > >>>>> Please, if you believe otherwise, perform a run of your PoC over at least two physical hosts and share the results with us.
> > >>>>>
> > >>>>> If you consider one physical docker host to be enough, please don't overlook that we want to run real-scale scenarios, with 50-100 cache groups, persistence enabled, and millions of keys loaded.
> > >>>>>
> > >>>>> The practical limit for such configurations is 4-6 nodes per single physical host. Otherwise, tests become flaky due to resource starvation.
> > >>>>>
> > >>>>> Please, if you believe otherwise, perform at least 10 runs of your PoC with other tests running on TC (we're targeting TeamCity, right?) and share the results so we can check whether the numbers are reproducible.
> > >>>>>
> > >>>>> I stress this once more: functional integration tests are OK to run in Docker and CI, but running benchmarks in Docker is a big NO GO.
> > >>>>>
> > >>>>> The second property lets us write tests that require real-parallel actions over hosts.
> > >>>>>
> > >>>>> For example, the agreed scenario for the PME benchmark during the "PME optimization stream" was as follows:
> > >>>>>
> > >>>>> - 10 server nodes, preloaded with 1M keys
> > >>>>> - 4 client nodes perform transactional load (client nodes physically separated from server nodes)
> > >>>>> - during load:
> > >>>>> -- 5 server nodes are stopped in parallel
> > >>>>> -- after 1 minute, all 5 nodes are started in parallel
> > >>>>> - load is stopped, logs are analysed for exchange times.
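Expressed as code, that scenario amounts to roughly the following (a sketch only; `cluster` and `clients` with these methods are hypothetical helpers, not tiden or ducktape API):

    import time
    from concurrent.futures import ThreadPoolExecutor

    def pme_benchmark(cluster, servers, clients):
        clients.start_transactional_load()   # 4 client nodes, transactional load
        victims = servers[:5]                # 5 of the 10 server nodes
        with ThreadPoolExecutor(max_workers=len(victims)) as pool:
            # Stop 5 nodes at (almost) the same moment so their partition map
            # exchanges can merge; a one-by-one stop never triggers a merge.
            list(pool.map(cluster.stop_node, victims))
            time.sleep(60)
            list(pool.map(cluster.start_node, victims))
        clients.stop_load()
        # Afterwards, grep the server logs for exchange durations.
        return [cluster.exchange_times(node) for node in servers]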
> > >>>>>
> > >>>>> If we had stopped and started the 5 nodes one-by-one, as ducktape does, then the partition map exchange merge would not happen and we could not have measured the PME optimizations for that case.
> > >>>>>
> > >>>>> These are limitations of ducktape that we believe are a more important argument "against" than the ones you provide "for".
> > >>>>>
> > >>>>> On 30.06.2020 14:58, Anton Vinogradov wrote:
> > >>>>>> Folks,
> > >>>>>> First, I've created PR [1] with ducktests improvements.
> > >>>>>> The PR contains the following changes:
> > >>>>>> - PME-free switch proof-benchmark (2.7.6 vs master)
> > >>>>>> - Ability to check against (compare with) previous releases (e.g. 2.7.6 & 2.8)
> > >>>>>> - Global refactoring:
> > >>>>>> -- benchmark Java code simplification
> > >>>>>> -- services' Python and Java classes code deduplication
> > >>>>>> -- fail-fast checks for Java and Python (e.g. an application should explicitly report that it finished with success)
> > >>>>>> -- simple results extraction from tests and benchmarks
> > >>>>>> -- Java code is now configurable from tests/benchmarks
> > >>>>>> -- proper SIGTERM handling in the Java code (e.g. it may finish the last operation and log results)
> > >>>>>> -- the docker volume is now marked as delegated to increase execution speed for Mac & Windows users
> > >>>>>> -- the Ignite cluster now starts in parallel (start speed-up)
> > >>>>>> -- Ignite can be configured per test/benchmark
> > >>>>>> - full and module assembly scripts added
> > >>>>> Great job done! But let me remind you of one of the Apache Ignite principles: a week of thinking saves months of development.
> > >>>>>
> > >>>>>> Second, I'd like to propose to accept ducktests [2] (the ducktape integration) as the target "PoC check & real topology benchmarking tool".
> > >>>>>> Ducktape pros:
> > >>>>>> - Developed for distributed systems by distributed system developers.
> > >>>>> So is Tiden.
> > >>>>>
> > >>>>>> - Developed since 2014, stable.
> > >>>>> Tiden is also pretty stable, and the development start date is not a good argument; for example, pytest has existed since 2004 and pytest-xdist (a plugin for distributed testing) since 2010, but we don't see them as an alternative at all.
> > >>>>>
> > >>>>>> - Proven usability by usage at Kafka.
> > >>>>> Tiden is proven usable by usage in GridGain and Sberbank deployments.
> > >>>>> The core, storage, SQL and tx teams use benchmark results provided by Tiden on a daily basis.
> > >>>>>
> > >>>>>> - Dozens and dozens of tests and benchmarks at Kafka as a great example pack.
> > >>>>> We'll donate some of our suites to Ignite, as I mentioned in a previous letter.
> > >>>>>
> > >>>>>> - Built-in Docker support for rapid development and checks.
> > >>>>> False, there's no specific 'docker support' in ducktape itself; you just wrap it in docker by yourself, because ducktape lacks deployment abilities.
> > >>>>>
> > >>>>>> - Great for CI automation.
> > >>>>> False, there are no specific CI-enabled features in ducktape. Tiden, on the other hand, provides the generic xUnit reporting format, which is supported by both TeamCity and Jenkins. Also, instead of using private keys, Tiden can use the SSH agent, which is also great for CI, because both TeamCity and Jenkins store keys in a secret storage available only to the ssh-agent and only for the time of the test.
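For reference, the xUnit (JUnit-style) XML that both TeamCity and Jenkins consume is simple to produce; a minimal sketch in Python, using the standard testsuite/testcase/failure element layout (the exact report Tiden emits may differ, and the test names here are samples taken from the session report above):

    import xml.etree.ElementTree as ET

    # One suite with two cases, one of which failed.
    suite = ET.Element("testsuite", name="ignite.integration", tests="2", failures="1")
    ET.SubElement(suite, "testcase", classname="AddNodeRebalanceTest", name="test_add_node", time="192.2")
    failed = ET.SubElement(suite, "testcase", classname="PmeFreeSwitchTest", name="test", time="93.1")
    ET.SubElement(failed, "failure", message="exchange took too long")  # sample failure entry
    ET.ElementTree(suite).write("report.xml", encoding="utf-8", xml_declaration=True)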
> > >>>>>
> > >>>>>>> As an additional motivation, at least 3 teams
> > >>>>>> - the IEP-45 team (to check the crash-recovery speed-up (discovery and Zabbix speed-up))
> > >>>>>> - the Ignite SE Plugins team (to check that the plugin's features do not slow down or break AI features)
> > >>>>>> - the Ignite SE QA team (to add the already developed smoke/load/failover tests to the AI codebase)
> > >>>>> Please, before recommending your tests to other teams, provide proof that your tests are reproducible in a real environment.
> > >>>>>
> > >>>>>> now wait for the ducktests merge to start checking the cases they are working on in the AI way.
> > >>>>>> Thoughts?
> > >>>>> Let us review both solutions together: we'll try to run your tests in our lab, and you'll try to at least check out tiden and see if the same tests can be implemented with it?
> > >>>>>
> > >>>>>> [1] https://github.com/apache/ignite/pull/7967
> > >>>>>> [2] https://github.com/apache/ignite/tree/ignite-ducktape
> > >>>>>> On Tue, Jun 16, 2020 at 12:22 PM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >>>>>> Hello, Maxim.
> > >>>>>> Thank you for such a detailed explanation.
> > >>>>>> Can we put the content of this discussion somewhere on the wiki?
> > >>>>>> So it doesn't get lost.
> > >>>>>> I've divided the answer into several parts, from the requirements to the implementation.
> > >>>>>> So, if we agree on the requirements, we can proceed with the discussion of the implementation.
> > >>>>>> 1. Requirements:
> > >>>>>> The main goal I want to achieve is *reproducibility* of the tests.
> > >>>>>> I'm sick and tired of the zillions of flaky, rarely failed, and almost-never failed tests in the Ignite codebase.
> > >>>>>> We should start with the simplest scenarios that will be as reliable as steel :)
> > >>>>>> I want to know for sure:
> > >>>>>> - Does this PR make rebalance quicker or not?
> > >>>>>> - Does this PR make PME quicker or not?
> > >>>>>> So, your description of the complex test scenario looks like a next step to me.
> > >>>>>> Anyway, it's cool we already have one.
> > >>>>>> The second goal is to have a strict test lifecycle as we have in JUnit and similar frameworks.
> > >>>>>> > It covers production-like deployment and running scenarios over a single database instance.
> > >>>>>> Do you mean «single cluster» or «single host»?
> > >>>>>> 2. Existing tests:
> > >>>>>> > A Combinator suite allows to run set of operations concurrently over given database instance.
> > >>>>>> > A Consumption suite allows to run a set production-like actions over given set of Ignite/GridGain versions and compare test metrics across versions
> > >>>>>> > A Yardstick suite
> > >>>>>> > A Stress suite that simulates hardware environment degradation
> > >>>>>> > An Ultimate, DR and Compatibility suites that performs functional regression testing
> > >>>>>> > Regression
> > >>>>>> Great news that we already have so many choices for testing!
> > >>>>>> A mature test base is a big +1 for Tiden.
> > >>>>>> 3. Comparison:
> > >>>>>> > Criteria: Test configuration
> > >>>>>> > Ducktape: single JSON string for all tests
> > >>>>>> > Tiden: any number of YaML config files, command line option for fine-grained test configuration, ability to select/modify tests behavior based on Ignite version.
> > >>>>>> 1. Many YAML files can be hard to maintain.
> > >>>>>> 2. In ducktape, you can set parameters via the «--parameters» option (a sketch follows below).
> > >>>>>> Please, take a look at the doc [1]
> > >>>>>> > Criteria: Cluster control
> > >>>>>> > Tiden: additionally can address cluster as a whole and execute remote commands in parallel.
> > >>>>>> It seems we have already implemented this ability in the PoC.
> > >>>>>> > Criteria: Test assertions
> > >>>>>> > Tiden: simple asserts, also few customized assertion helpers.
> > >>>>>> > Ducktape: simple asserts.
> > >>>>>> Can you, please, be more specific?
> > >>>>>> What helpers do you have in mind?
> > >>>>>> Ducktape has asserts that wait for log-file messages or for some process to finish.
> > >>>>>> > Criteria: Test reporting
> > >>>>>> > Ducktape: limited to its own text/HTML format
> > >>>>>> Ducktape has:
> > >>>>>> 1. A text reporter
> > >>>>>> 2. A customizable HTML reporter
> > >>>>>> 3. A JSON reporter.
> > >>>>>> We can render the JSON with any template or tool.
> > >>>>>> > Criteria: Provisioning and deployment
> > >>>>>> > Ducktape: can provision subset of hosts from cluster for test needs. However, that means, that test can't be scaled without test code changes. Does not do any deploy, relies on external means, e.g. pre-packaged in docker image, as in PoC.
> > >>>>>> This is not true.
> > >>>>>> 1. We can set explicit test parameters (node number) via parameters. We can increase the client count or cluster size without test code changes.
> > >>>>>> 2. We have many choices for the test environment. These choices are tested and used in other projects:
> > >>>>>> * docker
> > >>>>>> * vagrant
> > >>>>>> * private cloud (ssh access)
> > >>>>>> * ec2
> > >>>>>> Please, take a look at the Kafka documentation [2]
> > >>>>>> > I can continue more on this, but it should be enough for now:
> > >>>>>> We need to go deeper! :)
> > >>>>>> [1] https://ducktape-docs.readthedocs.io/en/latest/run_tests.html#options
> > >>>>>> [2] https://github.com/apache/kafka/tree/trunk/tests#ec2-quickstart
> > >>>>>> > On Jun 9, 2020, at 17:25, Max A. Shonichev <mshon...@yandex.ru> wrote:
> > >>>>>> >
> > >>>>>> > Greetings, Nikolay,
> > >>>>>> >
> > >>>>>> > First of all, thank you for your great effort preparing a PoC of integration testing for the Ignite community.
> > >>>>>> >
> > >>>>>> > It's a shame Ignite did not have at least some such tests yet; however, GridGain, as a major contributor to Apache Ignite, has had a profound collection of in-house tools for integration and performance testing for years already, and while we slowly consider sharing our expertise with the community, your initiative makes us drive that process a bit faster, thanks a lot!
> > >>>>>> >
> > >>>>>> > I reviewed your PoC and want to share a little about what we do on our part, why and how; I hope it will help the community take the proper course.
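The parametrization Nikolay mentions looks roughly like this hedged sketch (the decorator imports follow ducktape's documented API; the Ignite-specific test body is stubbed out, as the real ducktests code lives in the PR):

    from ducktape.mark import parametrize
    from ducktape.mark.resource import cluster
    from ducktape.tests.test import Test

    class RebalanceSketchTest(Test):
        @cluster(num_nodes=4)                         # nodes reserved from the pool
        @parametrize(version="2.8.1", num_servers=3)  # default injected arguments
        def test_add_node(self, version, num_servers):
            # A real test would start `num_servers` nodes of `version`, preload
            # data, add one node and wait for rebalance; stubbed out here.
            self.logger.info("version=%s num_servers=%d", version, num_servers)

The same test can then be rerun at a different scale from the command line, without code changes, with something like: ducktape rebalance_sketch_test.py --parameters '{"version": "2.8.1", "num_servers": 5}'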
> > >>>>>> >
> > >>>>>> > First I'll do a brief overview of what decisions we made and what we have in our private code base, next I'll describe what we have already donated to the public and what we plan to publish next, then I'll compare both approaches, highlighting deficiencies, in order to spur public discussion on the matter.
> > >>>>>> >
> > >>>>>> > It might seem strange to use Python to run Bash to run Java applications, because that introduces the IT industry's 'best of breed' – the Python dependency hell – to the Java application code base. The only stranger decision one could make is to use Maven to run Docker to run Bash to run Python to run Bash to run Java, but desperate times call for desperate measures, I guess.
> > >>>>>> >
> > >>>>>> > There are Java-based solutions for integration testing, e.g. Testcontainers [1], Arquillian [2], etc., and they might go well for Ignite community CI pipelines by themselves. But we also wanted to run performance tests and benchmarks, like the dreaded PME benchmark, and that is solved by a totally different set of tools in the Java world, e.g. JMeter [3], OpenJMH [4], Gatling [5], etc.
> > >>>>>> >
> > >>>>>> > Speaking specifically about benchmarking, the Apache Ignite community already has Yardstick [6], and there's nothing wrong with writing a PME benchmark using Yardstick, but we also wanted to be able to run scenarios like this:
> > >>>>>> > - put an X load onto an Ignite database;
> > >>>>>> > - perform a Y set of operations to check how Ignite copes with operations under load.
> > >>>>>> >
> > >>>>>> > And yes, we also wanted the applications under test to be deployed 'like in production', i.e. distributed over a set of hosts. This raises questions about provisioning and node affinity which I'll cover in detail later.
> > >>>>>> >
> > >>>>>> > So we decided to put in a little effort to build a simple tool to cover different integration and performance scenarios, and our QA lab's first attempt was PoC-Tester [7], currently open source for everything but the reporting web UI. It's a quite simple to use, 95% Java-based tool targeted at the pre-release QA stage.
> > >>>>>> >
> > >>>>>> > It covers production-like deployment and running scenarios over a single database instance. PoC-Tester scenarios consist of a sequence of tasks running sequentially or in parallel. After all tasks complete, or at any time during the test, the user can run a logs collection task; the logs are checked against exceptions, and a summary of found issues and task ops/latency statistics is generated at the end of the scenario. One of the main PoC-Tester features is its fire-and-forget approach to task management. That is, you can deploy a grid and leave it running for weeks, periodically firing some tasks onto it.
> > >>>>>> >
> > >>>>>> > During the earliest stages of PoC-Tester development it became quite clear that Java application development is a tedious process, and the architecture decisions you take during development are slow and hard to change.
> > >>>>>> > For example, scenarios like this:
> > >>>>>> > - deploy two instances of GridGain with master-slave data replication configured;
> > >>>>>> > - put a load on the master;
> > >>>>>> > - perform checks on the slave,
> > >>>>>> > or like this:
> > >>>>>> > - preload 1Tb of data, using your favorite tool of choice, into an Apache Ignite of version X;
> > >>>>>> > - run a set of functional tests running Apache Ignite version Y over the preloaded data,
> > >>>>>> > do not fit well into the PoC-Tester workflow.
> > >>>>>> >
> > >>>>>> > So, this is why we decided to use Python as the generic scripting language of choice.
> > >>>>>> >
> > >>>>>> > Pros:
> > >>>>>> > - quicker prototyping and development cycles
> > >>>>>> > - easier to find a DevOps/QA engineer with Python skills than one with Java skills
> > >>>>>> > - used extensively all over the world for DevOps/CI pipelines and thus has a rich set of libraries for all possible integration use cases.
> > >>>>>> >
> > >>>>>> > Cons:
> > >>>>>> > - a nightmare with dependencies. Better to stick to specific language/library versions.
> > >>>>>> >
> > >>>>>> > Comparing alternatives for a Python-based testing framework, we considered the following requirements, somewhat similar to what you mentioned for Confluent [8] previously:
> > >>>>>> > - should be able to run locally or distributed (bare metal or in the cloud)
> > >>>>>> > - should have built-in deployment facilities for applications under test
> > >>>>>> > - should separate test configuration and test code
> > >>>>>> > -- be able to easily reconfigure tests by simple configuration changes
> > >>>>>> > -- be able to easily scale the test environment by simple configuration changes
> > >>>>>> > -- be able to perform regression testing by simply switching the artifacts under test via configuration
> > >>>>>> > -- be able to run tests with a different JDK version by simple configuration changes
> > >>>>>> > - should have human-readable reports and/or reporting tools integration
> > >>>>>> > - should allow simple test progress monitoring; one does not want to run a 6-hour test to find out that the application actually crashed during the first hour
> > >>>>>> > - should allow parallel execution of test actions
> > >>>>>> > - should have a clean API for test writers
> > >>>>>> > -- a clean API for distributed remote command execution
> > >>>>>> > -- a clean API for deployed applications start/stop and other operations
> > >>>>>> > -- a clean API for performing checks on results
> > >>>>>> > - should be open source, or at least the source code should allow easy change or extension
> > >>>>>> >
> > >>>>>> > Back at that time we found no better alternative than to write our own framework, and here goes Tiden [9], the GridGain framework of choice for functional integration and performance testing.
> > >>>>>> >
> > >>>>>> > Pros:
> > >>>>>> > - solves all the requirements above
> > >>>>>> > Cons (for Ignite):
> > >>>>>> > - (currently) closed GridGain source
> > >>>>>> >
> > >>>>>> > On top of Tiden we've built a set of test suites, some of which you might have heard of already.
> > >>>>>> >
> > >>>>>> > A Combinator suite allows running a set of operations concurrently over a given database instance. Proven to find at least 30+ race conditions and NPE issues.
> > >>>>>> >
> > >>>>>> > A Consumption suite allows running a set of production-like actions over a given set of Ignite/GridGain versions and comparing test metrics across versions, like heap/disk/CPU consumption and the time to perform actions, like client PME, server PME, rebalancing time, data replication time, etc.
> > >>>>>> >
> > >>>>>> > A Yardstick suite is a thin layer of Python glue code to run the Apache Ignite pre-release benchmark set. Yardstick itself has mediocre deployment capabilities; Tiden solves this easily.
> > >>>>>> >
> > >>>>>> > A Stress suite that simulates hardware environment degradation during testing.
> > >>>>>> >
> > >>>>>> > Ultimate, DR and Compatibility suites that perform functional regression testing of GridGain Ultimate Edition features like snapshots, security, data replication, rolling upgrades, etc.
> > >>>>>> >
> > >>>>>> > A Regression suite and some IEP testing suites, like IEP-14, IEP-15, etc., etc., etc.
> > >>>>>> >
> > >>>>>> > Most of the suites above use another in-house developed Java tool – PiClient – to perform the actual loading and miscellaneous operations with the Ignite under test. We use the py4j Python-Java gateway library to control PiClient instances from the tests.
> > >>>>>> >
> > >>>>>> > When we considered CI, we put TeamCity out of scope, because distributed integration and performance tests tend to run for hours, and TeamCity agents are a scarce and costly resource. So, bundled with Tiden there are jenkins-job-builder [10] based CI pipelines and Jenkins xUnit reporting. Also, a rich web UI tool, Ward, aggregates test run reports across versions and has built-in visualization support for the Combinator suite.
> > >>>>>> >
> > >>>>>> > All of the above is currently closed source, but we plan to make it public for the community, and publishing the Tiden core [9] is the first step on that way. You can review some examples of using Tiden for tests at my repository [11], for a start.
> > >>>>>> >
> > >>>>>> > Now, let's compare the Ducktape PoC and Tiden.
> > >>>>>> >
> > >>>>>> > Criteria: Language
> > >>>>>> > Tiden: Python 3.7
> > >>>>>> > Ducktape: Python; proposes itself as Python 2.7, 3.6, 3.7 compatible, but actually can't work with Python 3.7 due to a broken Zmq dependency.
> > >>>>>> > Comment: Python 3.7 has much better support for async-style code, which might be crucial for distributed application testing.
> > >>>>>> > Score: Tiden: 1, Ducktape: 0
> > >>>>>> >
> > >>>>>> > Criteria: Test writers API
> > >>>>>> > The supported integration test framework concepts are basically the same:
> > >>>>>> > - a test controller (test runner)
> > >>>>>> > - a cluster
> > >>>>>> > - a node
> > >>>>>> > - an application (a service in Ducktape terms)
> > >>>>>> > - a test
> > >>>>>> > Score: Tiden: 5, Ducktape: 5
> > >>>>>> >
> > >>>>>> > Criteria: Tests selection and run
> > >>>>>> > Ducktape: suite-package-class-method level selection; an internal scheduler allows running the tests in a suite in parallel.
> > >>>>>> > Tiden: also suite-package-class-method level selection; additionally allows selecting a subset of tests by attribute; parallel runs are not built in, but it allows merging test reports from different runs.
> > >>>>>> > Score: Tiden: 2, Ducktape: 2
> > >>>>>> >
> > >>>>>> > Criteria: Test configuration
> > >>>>>> > Ducktape: single JSON string for all tests
> > >>>>>> > Tiden: any number of YAML config files, command line options for fine-grained test configuration, ability to select/modify test behavior based on the Ignite version.
> > >>>>>> > Score: Tiden: 3, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Cluster control
> > >>>>>> > Ducktape: allows executing remote commands at node granularity
> > >>>>>> > Tiden: additionally can address the cluster as a whole and execute remote commands in parallel.
> > >>>>>> > Score: Tiden: 2, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Logs control
> > >>>>>> > Both frameworks have similar built-in support for remote log collection and grepping. Tiden has a built-in plugin that can zip and collect arbitrary log files from arbitrary locations at test/module/suite granularity and unzip them if needed, plus an application API to search / wait for messages in logs. Ducktape allows each service to declare its log file locations (seemingly does not support log rotation), and a single entry point to collect service logs.
> > >>>>>> > Score: Tiden: 1, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Test assertions
> > >>>>>> > Tiden: simple asserts, also a few customized assertion helpers.
> > >>>>>> > Ducktape: simple asserts.
> > >>>>>> > Score: Tiden: 2, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Test reporting
> > >>>>>> > Ducktape: limited to its own text/html format
> > >>>>>> > Tiden: provides a text report, a YAML report for reporting tools integration, and an XML xUnit report for integration with Jenkins/TeamCity.
> > >>>>>> > Score: Tiden: 3, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Provisioning and deployment
> > >>>>>> > Ducktape: can provision a subset of hosts from the cluster for test needs. However, that means that a test can't be scaled without test code changes. Does not do any deploy; relies on external means, e.g. pre-packaged in a docker image, as in the PoC.
> > >>>>>> > Tiden: given a set of hosts, Tiden uses all of them for the test. Provisioning should be done by external means. However, it provides conventional automated deployment routines.
> > >>>>>> > Score: Tiden: 1, Ducktape: 1
> > >>>>>> >
> > >>>>>> > Criteria: Documentation and Extensibility
> > >>>>>> > Tiden: the current API documentation is limited; this should change as we go open source. Tiden is easily extensible via hooks and plugins; see the example Maven plugin and Gatling application at [11].
> > >>>>>> > Ducktape: basic documentation at readthedocs.io. The codebase is rigid, the framework core is tightly coupled and hard to change. The only possible extension mechanism is fork-and-rewrite.
> > >>>>>> > Score: Tiden: 2, Ducktape: 1
> > >>>>>> >
> > >>>>>> > I can continue more on this, but it should be enough for now:
> > >>>>>> > Overall score: Tiden: 22, Ducktape: 14.
> > >>>>>> >
> > >>>>>> > Time for discussion!
> > >>>>>> >
> > >>>>>> > ---
> > >>>>>> > [1] - https://www.testcontainers.org/
> > >>>>>> > [2] - http://arquillian.org/guides/getting_started/
> > >>>>>> > [3] - https://jmeter.apache.org/index.html
> > >>>>>> > [4] - https://openjdk.java.net/projects/code-tools/jmh/
> > >>>>>> > [5] - https://gatling.io/docs/current/
> > >>>>>> > [6] - https://github.com/gridgain/yardstick
> > >>>>>> > [7] - https://github.com/gridgain/poc-tester
> > >>>>>> > [8] - https://cwiki.apache.org/confluence/display/KAFKA/System+Test+Improvements
> > >>>>>> > [9] - https://github.com/gridgain/tiden
> > >>>>>> > [10] - https://pypi.org/project/jenkins-job-builder/
> > >>>>>> > [11] - https://github.com/mshonichev/tiden_examples
> > >>>>>> >
> > >>>>>> > On 25.05.2020 11:09, Nikolay Izhikov wrote:
> > >>>>>> >> Hello,
> > >>>>>> >>
> > >>>>>> >> A branch with ducktape has been created - https://github.com/apache/ignite/tree/ignite-ducktape
> > >>>>>> >>
> > >>>>>> >> Anyone who is willing to contribute to the PoC is welcome.
> > >>>>>> >>
> > >>>>>> >>> On May 21, 2020, at 22:33, Nikolay Izhikov <nizhikov....@gmail.com> wrote:
> > >>>>>> >>>
> > >>>>>> >>> Hello, Denis.
> > >>>>>> >>>
> > >>>>>> >>> There is no rush with these improvements.
> > >>>>>> >>> We can wait for Maxim's proposal and compare the two solutions :)
> > >>>>>> >>>
> > >>>>>> >>>> On May 21, 2020, at 22:24, Denis Magda <dma...@apache.org> wrote:
> > >>>>>> >>>>
> > >>>>>> >>>> Hi Nikolay,
> > >>>>>> >>>>
> > >>>>>> >>>> Thanks for kicking off this conversation and sharing your findings with the results. That's the right initiative. I do agree that Ignite needs to have an integration testing framework with the capabilities listed by you.
> > >>>>>> >>>>
> > >>>>>> >>>> As we discussed privately, I would only check if, instead of Confluent's Ducktape library, we can use an integration testing framework developed by GridGain for testing of Ignite/GridGain clusters. That framework has been battle-tested and might be more convenient for Ignite-specific workloads. Let's wait for @Maksim Shonichev <mshonic...@gridgain.com>, who promised to join this thread once he finishes preparing the usage examples of the framework. To my knowledge, Max has already been working on that for several days.
> > >>>>>> >>>>
> > >>>>>> >>>> -
> > >>>>>> >>>> Denis
> > >>>>>> >>>>
> > >>>>>> >>>> On Thu, May 21, 2020 at 12:27 AM Nikolay Izhikov <nizhi...@apache.org> wrote:
> > >>>>>> >>>>>
> > >>>>>> >>>>> Hello, Igniters.
> > >>>>>> >>>>>
> > >>>>>> >>>>> I created a PoC [1] for the integration tests of Ignite.
> > >>>>>> >>>>>
> > >>>>>> >>>>> Let me briefly explain the gap I want to cover:
> > >>>>>> >>>>>
> > >>>>>> >>>>> 1. For now, we don't have a solution for automated testing of Ignite on a «real cluster».
> > >>>>>> >>>>> By «real cluster» I mean a cluster «like production»:
> > >>>>>> >>>>> * client and server nodes deployed on different hosts.
> > >>>>>> >>>>> * thin clients perform queries from some other hosts
> > >>>>>> >>>>> * etc.
> > >>>>>> >>>>>
> > >>>>>> >>>>> 2. We don't have a solution for automated benchmarks of some internal Ignite processes:
> > >>>>>> >>>>> * PME
> > >>>>>> >>>>> * rebalance.
> > >>>>>> >>>>> This means we don't know: do we perform rebalance (or PME) in 2.7.0 faster or slower than in 2.8.0 for the same cluster?
> > >>>>>> >>>>>
> > >>>>>> >>>>> 3. We don't have a solution for automated testing of Ignite integrations in a real-world environment:
> > >>>>>> >>>>> The Ignite-Spark integration can be taken as an example.
> > >>>>>> >>>>> I think some ML solutions also should be tested in real-world deployments.
> > >>>>>> >>>>>
> > >>>>>> >>>>> Solution:
> > >>>>>> >>>>>
> > >>>>>> >>>>> I propose to use the ducktape library from Confluent (Apache 2.0 license).
> > >>>>>> >>>>> I tested it both on a real cluster (Yandex Cloud) and in a local environment (docker), and it works just fine.
> > >>>>>> >>>>>
> > >>>>>> >>>>> The PoC contains the following services:
> > >>>>>> >>>>>
> > >>>>>> >>>>> * A simple rebalance test:
> > >>>>>> >>>>>   Start 2 server nodes,
> > >>>>>> >>>>>   Create some data with an Ignite client,
> > >>>>>> >>>>>   Start one more server node,
> > >>>>>> >>>>>   Wait for rebalance to finish.
> > >>>>>> >>>>> * A simple Ignite-Spark integration test:
> > >>>>>> >>>>>   Start 1 Spark master, start 1 Spark worker,
> > >>>>>> >>>>>   Start 1 Ignite server node,
> > >>>>>> >>>>>   Create some data with an Ignite client,
> > >>>>>> >>>>>   Check the data in an application that queries it from Spark.
> > >>>>>> >>>>>
> > >>>>>> >>>>> All tests are fully automated.
> > >>>>>> >>>>> Logs collection works just fine.
> > >>>>>> >>>>> You can see an example of the test report - [4].
> > >>>>>> >>>>>
> > >>>>>> >>>>> Pros:
> > >>>>>> >>>>>
> > >>>>>> >>>>> * Ability to test local changes (no need to publish changes to some remote repository or similar).
> > >>>>>> >>>>> * Ability to parametrize the test environment (run the same tests on different JDKs, JVM params, configs, etc.)
> > >>>>>> >>>>> * Isolation by default, so system tests are as reliable as possible.
> > >>>>>> >>>>> * Utilities for pulling up and tearing down services easily in clusters in different environments (e.g. local, custom cluster, Vagrant, K8s, Mesos, Docker, cloud providers, etc.)
> > >>>>>> >>>>> * Easy to write unit tests for distributed systems.
> > >>>>>> >>>>> * Adopted and successfully used by another distributed open source project - Apache Kafka.
> > >>>>>> >>>>> * Collects results (e.g. logs, console output).
> > >>>>>> >>>>> * Reports results (e.g. expected conditions met, performance results, etc.)
> > >>>>>> >>>>>
> > >>>>>> >>>>> WDYT?
> > >>>>>> >>>>>
> > >>>>>> >>>>> [1] https://github.com/nizhikov/ignite/pull/15
> > >>>>>> >>>>> [2] https://github.com/confluentinc/ducktape
> > >>>>>> >>>>> [3] https://ducktape-docs.readthedocs.io/en/latest/run_tests.html
> > >>>>>> >>>>> [4] https://yadi.sk/d/JC8ciJZjrkdndg
> > >
> > > <2020-07-05--004.tar.gz>

--
Best Regards, Ilya Suntsov
email: isunt...@gridgain.com
*GridGain Systems*
www.gridgain.com