On Fri, Mar 17, 2017 at 9:46 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > > On Fri, Mar 17, 2017 at 6:41 AM, Ryan Svihla <r...@foundev.pro> wrote: > >> Different DI frameworks have different initialization costs, even inside >> of >> spring even depending on how you wire up dependencies (did it use autowire >> with reflection, parse a giant XML of explicit dependencies, etc). >> >> To back this assertion up for awhile in that community benching different >> DI frameworks perf was a thing and you can find benchmarks galore with a >> quick Google. >> >> The practical cost is also dependent on the lifecycles used (transient >> versus Singleton style for example) and features used (Interceptors >> depending on implementation can get expensive). >> >> So I think there should be some quantification of cost before a framework >> is considered, something like dagger2 which uses codegen I wager is only a >> cost at compile time (have not benched it, but looking at it's feature >> set, >> that's my guess) , Spring I know from experience even with the most >> optimal >> settings is slower on initialization time than doing by DI "by hand" at >> minimum, and that can sometimes be substantial. >> >> >> On Mar 17, 2017 12:29 AM, "Edward Capriolo" <edlinuxg...@gmail.com> >> wrote: >> >> On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <jasedbr...@gmail.com> >> wrote: >> >> > >> do we have plan to integrate with a dependency injection framework? >> > >> > No, we (the maintainers) have been pretty much against more frameworks >> due >> > to performance reasons, overhead, and dependency management problems. >> > >> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zhouqing...@gmail.com> >> > wrote: >> > >> > > Since we're here, do we have plan to integrate with a dependency >> > injection >> > > framework like Dagger2? Otherwise it'll be difficult to write unit >> test >> > > cases. >> > > >> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo < >> edlinuxg...@gmail.com> >> > > wrote: >> > > >> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jji...@apache.org> >> wrote: >> > > > >> > > > > >> > > > > >> > > > > On 2017-03-16 10:32 (-0700), François Deliège < >> > franc...@instagram.com> >> > > > > wrote: >> > > > > > >> > > > > > To get this started, here is an initial proposal: >> > > > > > >> > > > > > Principles: >> > > > > > >> > > > > > 1. Tests always pass. This is the starting point. If we don't >> care >> > > > > about test failures, then we should stop writing tests. A >> recurring >> > > > failing >> > > > > test carries no signal and is better deleted. >> > > > > > 2. The code is tested. >> > > > > > >> > > > > > Assuming we can align on these principles, here is a proposal >> for >> > > their >> > > > > implementation. >> > > > > > >> > > > > > Rules: >> > > > > > >> > > > > > 1. Each new release passes all tests (no flakinesss). >> > > > > > 2. If a patch has a failing test (test touching the same code >> > path), >> > > > the >> > > > > code or test should be fixed prior to being accepted. >> > > > > > 3. Bugs fixes should have one test that fails prior to the fix >> and >> > > > > passes after fix. >> > > > > > 4. New code should have at least 90% test coverage. >> > > > > > >> > > > > First I was >> > > > > I agree with all of these and hope they become codified and >> > followed. I >> > > > > don't know anyone who believes we should be committing code that >> > breaks >> > > > > tests - but we should be more strict with requiring green test >> runs, >> > > and >> > > > > perhaps more strict with reverting patches that break tests (or >> cause >> > > > them >> > > > > to be flakey). >> > > > > >> > > > > Ed also noted on the user list [0] that certain sections of the >> code >> > > > > itself are difficult to test because of singletons - I agree with >> the >> > > > > suggestion that it's time to revisit CASSANDRA-7837 and >> > CASSANDRA-10283 >> > > > > >> > > > > Finally, we should also recall Jason's previous notes [1] that the >> > > actual >> > > > > test infrastructure available is limited - the system provided by >> > > > Datastax >> > > > > is not generally open to everyone (and not guaranteed to be >> > permanent), >> > > > and >> > > > > the infrastructure currently available to the ASF is somewhat >> limited >> > > > (much >> > > > > slower, at the very least). If we require tests passing (and I >> agree >> > > that >> > > > > we should), we need to define how we're going to be testing (or >> how >> > > we're >> > > > > going to be sharing test results), because the ASF hardware isn't >> > going >> > > > to >> > > > > be able to do dozens of dev branch dtest runs per day in its >> current >> > > > form. >> > > > > >> > > > > 0: https://lists.apache.org/thread.html/ >> > f6f3fc6d0ad1bd54a6185ce7bd7a2f >> > > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E >> > > > > 1: https://lists.apache.org/thread.html/ >> > 5fb8f0446ab97644100e4ef987f36e >> > > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E >> > > > > >> > > > > >> > > > > >> > > > Ed also noted on the user list [0] that certain sections of the code >> > > itself >> > > > are difficult to test because of singletons - I agree with the >> > suggestion >> > > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283 >> > > > >> > > > Thanks for the shout out! >> > > > >> > > > I was just looking at a patch about compaction. The patch was to >> > > calculate >> > > > free space correctly in case X. Compaction is not something that >> > requires >> > > > multiple nodes to test. The logic on the surface seems simple: find >> > > tables >> > > > of similar size and select them and merge them. The reality is it >> turns >> > > out >> > > > now to be that way. The coverage itself both branch and line may be >> > very >> > > > high, but what the code does not do is directly account for a wide >> > > variety >> > > > of scenarios. Without direct tests you end up with a mental >> > approximation >> > > > of what it does and that varies from person to person and accounts >> for >> > > the >> > > > cases that fit in your mind. For example, you personally are only >> > running >> > > > LevelDB inspired compaction. >> > > > >> > > > Being that this this is not a multi-node problem you should be able >> to >> > re >> > > > factor this heavily. Pulling everything out to a static method where >> > all >> > > > the parameters are arguments, or inject a lot of mocks in the >> current >> > > code, >> > > > and develop some scenario based coverage. >> > > > >> > > > That is how i typically "rescue" code I take over. I look at the >> > > nightmare >> > > > and say, "damn i am really afraid to touch this". I construct 8 >> > scenarios >> > > > that test green. Then I force some testing into it through careful >> re >> > > > factoring. Now, I probably know -something- about it. Now, you are >> > fairly >> > > > free to do a wide ranging refactor, because you at least counted >> for 8 >> > > > scenarios and you put unit test traps so that some rules are >> enforced. >> > > (Or >> > > > the person changing the code has to actively REMOVE your tests >> > asserting >> > > it >> > > > was not or no longer is valid). Later on you (or someone else) >> > __STILL__ >> > > > might screw the entire thing up, but at least you can now build >> > forward. >> > > > >> > > > Anyway that patch on compaction was great and I am sure it improved >> > > things. >> > > > That being said it did not add any tests :). So it can easily be >> undone >> > > by >> > > > the next person who does not understand the specific issue trying to >> be >> > > > addressed. Inline comments almost scream to me 'we need a test' not >> > > > everyone believes that. >> > > > >> > > >> > > >> > > >> > > -- >> > > Thank you & Best Regards, >> > > --Simon (Qingcun) Zhou >> > > >> > >> >> I don't believe dependency injection frameworks cause "overhead". For >> example, if you are using spring >> >> @Bean(initMethod=init, destroyMethod=destroy) >> public Server getMeAServer( Component component) {} >> >> @Bean(initMethod=init, destroyMethod=destroy) >> public Component getMeAComponent(){} >> >> What spring actually does is something like thisthis: >> >> startup code >> Component c = new Component(); >> c.init() >> Server s = new Server(c); >> s.init() >> The default spring prototype is "singleton" aka create one of these per >> bean context. >> >> Spring also does the reverse on shutdown. >> >> The static singleton initialization thing creates a host of problems, with >> things not closing cleanly. >> >> https://issues.apache.org/jira/browse/CASSANDRA-12844 >> https://issues.apache.org/jira/browse/CASSANDRA-12728 >> >> One of them I even fixed: >> https://github.com/stef1927/cassandra/commits/11984-2.2 >> >> So beyond it not creating any performance problems it probably would help >> in practice by forcing code into a pattern which is not singleton >> initiation driven by classloader which is something very opaque and hard >> to >> troubleshoot. Try putting a break point for example on something loaded by >> a static singleton and try to reason about the order of initialization. >> > > Different DI frameworks have different initialization costs, even inside of > spring even depending on how you wire up dependencies (did it use autowire > with reflection, parse a giant XML of explicit dependencies, etc). > > Spring I know from experience even with the most optimal > settings is slower on initialization time than doing by DI "by hand" at > minimum, and that can sometimes be substantial. > > You are considering the cost of reflection (nano seconds) vs the cost of > starting a database that has to read files from disk. (seconds to minutes) > > "parse a giant XML of explicit dependencies" > > Modern (spring) projects rarely do what you are describing. The big xml > files are gone: https://projects.spring.io/spring-boot/. If you were not > going to use spring (cause you are afraid of big xml files) there is > https://github.com/google/guice. > > It is basically a straw man argument to say "we don't like DI because of > big XML files" while papering over the fact that drain really does not shot > down cassandra properly and it has been project for a long while. > > To put this into prospective I worked with a group that ran a larger number of C* servers. They had tried and failed to develop a clean shutdown procedure that would not leave corrupt commit logs. The scripts were like this: nodetool disable-gossip sleep 5 nodetool disable-thrift sleep 5 nodetool drain sleep 30 kill We tried all permutations of shutdown techniques and could not build a sequence that could cleanly shutdown and restart cassandra 100% of the time. So there is room for improvement here. DI would help IMHO.