To have something to focus on, I have opened https://github.com/apache/beam/pull/2359 to remove all of either tags from anything outside of the core libraries.
I believe there is rough consensus around this and I just wanted to make it extremely concrete. Comments appreciated, especially on those places where we think we might be losing some coverage. On Tue, Mar 28, 2017 at 11:54 PM, Stas Levin <stasle...@apache.org> wrote: > Stephen, the enforcement only applies to (test) pipelines created using > "@Rule TestPipeline pipeline = ...". Naming could, and should, definitely > be improved :) > I think this aligns nicely with > https://issues.apache.org/jira/browse/BEAM-1557 where we want to separate > the TestPipeline rule into a hardcore and a user-friendly versions. > Citing the ticket: "One that is strict, and aimed to be used by sdk/runner > developers, and one that is more lenient, and aimed to be used by casual > (i.e., non sdk/runner) users.". > > Jason, just to make sure I get it: > (1) "@NeedsRunner" will only be legit in modules which the DirectRunner > depends on (directly or transitively). > (2) "@NeedsRunner" will be removed from modules that do not satisfy (1) > > If that is the case, I believe it also implies that every module that has a > "@NeedsRunner" annotated test (and is not part of the DirectRunner itself), > will also need to be configured in the DirectRunner's "dependenciesToScan" > surefire section, right? > Otherwise, these "@NeedsRunner" annotated tests will not be run as part of > the build, since they are in the "excludedGroups" of their respective > modules. > This is basically a generalisation of how things work in the > "beam-sdks-java-core" and "and beam-runners-core-java" modules. > > > On Tue, Mar 28, 2017 at 11:38 PM Stephen Sisk <s...@google.com.invalid> > wrote: > > > Jason - is the implication of "turn on abandoned node detection by > default" > > that if a dev writes a regular unit test that does not invoke a pipeline > > that they'd need to call p.enableAbandonedNodeEnforcement(false) ? (Or > > would this only be for particular test methods on files that have the > @Rule > > TestPipeline... in them?) > > > > *IF* so, we might want to make the naming of that method a bit more > > self-explanatory if we expect it to be used in a lot of places. > > p.expectPipelineRunForThisTest(false) ?? > > > > S > > > > On Tue, Mar 28, 2017 at 12:39 PM Jason Kuster > > <jasonkus...@google.com.invalid> wrote: > > > > > Going off of what Robert and others have said -- I think the sanest way > > to > > > use NeedsRunner is as an annotation on specific tests in the set of > > modules > > > which the DirectRunner depends on (directly or transitively). It should > > be > > > used only as an indication that they should be tested with the > > DirectRunner > > > once the DirectRunner is available. In this light, I'd suggest removing > > > NeedsRunner annotations from everywhere below the DirectRunner in the > > build > > > order since it will already be available. > > > > > > The one wrinkle here is that NeedsRunner is used in Stas's abandoned > node > > > detection in TestPipeline, a very useful piece of functionality. My > > > suggestion here would be to just turn on abandoned node detection by > > > default; tests which do not need it can call > > > p.enableAbandonedNodeEnforcement(false). Thoughts? > > > > > > On Tue, Mar 28, 2017 at 1:55 AM, Stas Levin <stasle...@apache.org> > > wrote: > > > > > > > Jason, I can add the write-up in > > > > https://github.com/apache/beam/pull/2077#issuecomment-282273273 to > the > > > > testing > > > > section <https://beam.apache.org/contribute/testing/> as part of the > > > > upcoming doc updates in light of "RunnableOnService" becoming > > > > "NeedsRunner". > > > > > > > > -Stas > > > > > > > > > > > > On Tue, Mar 28, 2017 at 12:38 AM Robert Bradshaw > > > > <rober...@google.com.invalid> wrote: > > > > > > > > > On Mon, Mar 27, 2017 at 1:16 PM, Eugene Kirpichov < > > > > > kirpic...@google.com.invalid> wrote: > > > > > > > > > > > Thanks Kenn, this breakdown makes a lot of sense. We should > > probably > > > > > > clarify this in the documentation of both of these annotations. > > > Though > > > > > now > > > > > > that I look at the current docs, they seem clear enough, but > > perhaps > > > > they > > > > > > can be phrased stronger. > > > > > > > > > > > > To summarize: > > > > > > - ValidatesRunner tests are for testing a runner. They should be > > > > created > > > > > > mostly by core Beam SDK authors, when introducing a new Beam > > feature > > > > etc. > > > > > > It may even make sense to make this annotation private to the > beam > > > sdk > > > > > core > > > > > > module. > > > > > > > > > > > > > > > > +1, but see below. > > > > > > > > > > > > > > > > - NeedsRunner tests are for testing a transform. That's what all > > > > external > > > > > > users should be using, authors of IOs, etc. It's unspecified > which > > > > runner > > > > > > will be used, but in practice usually it'll be direct runner. > > > > > > > > > > > > > > > > My one issue with NeedsRunner is that as one moves further from > core, > > > the > > > > > public API of Beam libraries tend to be suites of PTransforms > > > (including > > > > > IOs) and almost every strong test of the public API (which is where > > > most > > > > > tests should live) has at least one test that becomes NeedsRunner. > It > > > > seems > > > > > annotations should be used to mark the exception, not the norm. > > > > > > > > > > Perhaps this also stems from my point of view that the Direct > Runner > > is > > > > in > > > > > many ways part of the SDK as the reference implementation and test > > > runner > > > > > (the SDK minus any runner isn't very useful) and so there's little > or > > > no > > > > > need to distinguish tests that require a runner from those that > don't > > > > > (especially the further you get from core, where you should be able > > to > > > > > assume you have a runner if you have the ability to construct > > pipelines > > > > and > > > > > run PAssert). > > > > > > > > > > - The current situation (use of both of these annotations) does not > > > quite > > > > > > reflect that, and should be fixed. > > > > > > - There might be a use case for testing a transform against all > > > > runners, > > > > > > and we don't have an agreed-upon solution about how to do that: > > > > > > ValidatesRunner technically accomplishes this, but it's logically > > > wrong > > > > > to > > > > > > use it in this capacity. > > > > > > > > > > > > > > > > I think such tests do validate runners insofar as the > > @ValidatesRunner > > > > > suites are not comprehensive. > > > > > > > > > > In the ideal world, one could simply assume the primitive > transforms > > > are > > > > > perfectly tested with 100% coverage of all relevant permutations of > > > > > features, and tests of composite transforms can assume correct(*) > > > > > implementation of primitives and only test their assembly on a > single > > > > > runner. In practice, our tests are not comprehensive, and these > > > > > higher-level tests form a much larger suite of additional tests > > biased > > > > > towards how the primitive are actually composed and used. Somewhat > > > like a > > > > > second line of defense. As such, its useful to be able to run this > > > larger > > > > > suite on a variety of runners, even if it's theoretically > redundant. > > > > > > > > > > (There's also the sticky issue that if a runner overrides a > composite > > > > > transform there's suddenly value in running that transform's tests > > > > against > > > > > that runner. It's unclear where that information should live (seems > > to > > > > > belong to the runner, but it should just reference (and possibly > > > augment) > > > > > the existing suite).) > > > > > > > > > > (*) Even the notion of correctness is difficult here, as one might > be > > > > > relying of hidden assumptions that don't hold on all runners > because > > > they > > > > > are not part of the spec. This is the case for the wild-and-crazy > > test > > > > > runner that goes out of its way to violate assumptions. > > > > > > > > > > On Mon, Mar 27, 2017 at 12:50 PM Kenneth Knowles > > > <k...@google.com.invalid > > > > > > > > > > > wrote: > > > > > > > > > > > > > Without claiming that this is the final set of categories or > that > > > > they > > > > > > are > > > > > > > used correctly right now, here is what I think they mean: > > > > > > > > > > > > > > - ValidatesRunner tests should be tests of the runner itself, > > > > > generally > > > > > > > that it implements a primitive correctly > > > > > > > - NeedsRunner tests should be tests of the > PTransform/pipeline, > > > > > assuming > > > > > > > the runner is correct > > > > > > > > > > > > > > Notably, to the extent the assumption of runner correctness > > holds, > > > > this > > > > > > > implies that it is OK to run NeedsRunner tests with just the > > direct > > > > > > runner. > > > > > > > > > > > > > > Pragmatically, in the Java SDK & IOs, this is not the current > > > > > breakdown. > > > > > > > > > > > > > > - In the Java SDK the NeedsRunner category is probably used > more > > > to > > > > > flag > > > > > > > "run this with just the direct runner" than to express the > > semantic > > > > > > intent. > > > > > > > That isn't so bad; it is very close to the right usage. > > > > > > > > > > > > > > - There are IOs that had RunnableOnService tests which are now > > > > > > > ValidatesRunner tests. While the ability to run an IO does > > > validate a > > > > > > > runner, this is really an integration test of the IO. If they > are > > > to > > > > be > > > > > > run > > > > > > > with just the direct runner they don't need any annotation, > > because > > > > the > > > > > > IO > > > > > > > can take a test-scoped dependency on the direct runner. So it > > > mostly > > > > > > makes > > > > > > > sense to tag those tests for which it is profitable to run > > against > > > > all > > > > > > > runners. > > > > > > > > > > > > > > I think the question of IO ITs with the intent to run across > > > runners > > > > is > > > > > > > currently under design discussion and I would defer to other > > people > > > > on > > > > > > the > > > > > > > best way to do that. It could be a new category, or it could > be a > > > > > > different > > > > > > > design pattern entirely. > > > > > > > > > > > > > > Kenn > > > > > > > > > > > > > > On Mon, Mar 27, 2017 at 11:55 AM, Eugene Kirpichov < > > > > > > > kirpic...@google.com.invalid> wrote: > > > > > > > > > > > > > > > Kenn - can you also remind for everybody, what is the > > difference > > > > > > between > > > > > > > > @NeedsRunner and @ValidatesRunner, and when should one use > one > > or > > > > the > > > > > > > > other? I always find myself confused about this especially in > > > code > > > > > > > reviews. > > > > > > > > > > > > > > > > On Mon, Mar 27, 2017 at 11:32 AM Kenneth Knowles > > > > > > <k...@google.com.invalid > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > > I just merged the rename from RunnableOnService to > > > > ValidatesRunner > > > > > in > > > > > > > the > > > > > > > > > Java codebase (Python was already there) > > > > > > > > > https://github.com/apache/beam/pull/2157. > > > > > > > > > > > > > > > > > > I'm sure there will be stragglers throughout our docs, etc, > > so > > > > > please > > > > > > > do > > > > > > > > > help me catch them and fix them. And start learning to say > > > > > > > > > "ValidatesRunner" in conversation :-) > > > > > > > > > > > > > > > > > > Kenn > > > > > > > > > > > > > > > > > > On Thu, Nov 10, 2016 at 1:01 PM, Lukasz Cwik > > > > > > <lc...@google.com.invalid > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > The default is a crashing runner which throws an > exception > > if > > > > its > > > > > > > > > executed. > > > > > > > > > > This makes SDK core/examples/... not depend on any > > > implemented > > > > > > > runners. > > > > > > > > > > > > > > > > > > > > On Thu, Nov 10, 2016 at 12:37 PM, Robert Bradshaw < > > > > > > > > > > rober...@google.com.invalid> wrote: > > > > > > > > > > > > > > > > > > > > > +1 to ValidatesRunner. I'd be nice if it were > > (optionally?) > > > > > > > > > > > parameterized by which feature it validates. > > > > > > > > > > > > > > > > > > > > > > @NeedsRunner is odd, as using a runner is the most > > natural > > > > way > > > > > to > > > > > > > > > > > write many (most) tests, but an annotation should be > used > > > to > > > > > mark > > > > > > > the > > > > > > > > > > > exception, not the norm. (I'd just assume a runner is > > > > available > > > > > > for > > > > > > > > > > > all tests, e.g. CoreTests depends on DirectRunner > depends > > > on > > > > > > Core). > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 10, 2016 at 10:14 AM, Mark Liu > > > > > > > > <mark...@google.com.invalid > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > +1 ValidatesRunner > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 10, 2016 at 8:40 AM, Kenneth Knowles > > > > > > > > > > <k...@google.com.invalid > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > >> Nice. I like ValidatesRunner. > > > > > > > > > > > >> > > > > > > > > > > > >> On Nov 10, 2016 03:39, "Amit Sela" < > > > amitsel...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > >> > > > > > > > > > > > >> > How about @ValidatesRunner ? > > > > > > > > > > > >> > Seems to complement @NeedsRunner as well. > > > > > > > > > > > >> > > > > > > > > > > > > >> > On Thu, Nov 10, 2016 at 9:47 AM Aljoscha Krettek < > > > > > > > > > > aljos...@apache.org > > > > > > > > > > > > > > > > > > > > > > > >> > wrote: > > > > > > > > > > > >> > > > > > > > > > > > > >> > > +1 > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > What I would really like to see is automatic > > > > derivation > > > > > of > > > > > > > the > > > > > > > > > > > >> capability > > > > > > > > > > > >> > > matrix from an extended Runner Test Suite. (As > > > > outlined > > > > > in > > > > > > > > > Thomas' > > > > > > > > > > > >> doc). > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > On Wed, 9 Nov 2016 at 21:42 Kenneth Knowles > > > > > > > > > > <k...@google.com.invalid > > > > > > > > > > > > > > > > > > > > > > > >> > > wrote: > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > Huge +1 to this. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > The two categories I care most about are: > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > 1. Tests that need a runner, but are testing > the > > > > other > > > > > > > > "thing > > > > > > > > > > > under > > > > > > > > > > > >> > > test"; > > > > > > > > > > > >> > > > today this is NeedsRunner. > > > > > > > > > > > >> > > > 2. Tests that are intended to test a runner; > > today > > > > > this > > > > > > is > > > > > > > > > > > >> > > > RunnableOnService. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > Actually the lines are not necessary clear > > between > > > > > them, > > > > > > > > but I > > > > > > > > > > > think > > > > > > > > > > > >> we > > > > > > > > > > > >> > > can > > > > > > > > > > > >> > > > make good choices, like we already do. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > The idea of two categories with a common > > > superclass > > > > > > > actually > > > > > > > > > > has a > > > > > > > > > > > >> > > pitfall: > > > > > > > > > > > >> > > > what if a test is put in the superclass > > category, > > > > when > > > > > > it > > > > > > > > does > > > > > > > > > > not > > > > > > > > > > > >> > have a > > > > > > > > > > > >> > > > clear meaning? And also, I don't have any good > > > ideas > > > > > for > > > > > > > > > names. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > So I think just replacing RunnableOnService > with > > > > > > > RunnerTest > > > > > > > > to > > > > > > > > > > > make > > > > > > > > > > > >> > clear > > > > > > > > > > > >> > > > that it is there just to test the runner is > > good. > > > We > > > > > > might > > > > > > > > > also > > > > > > > > > > > want > > > > > > > > > > > >> > > > RunnerIntegrationTest extends NeedsRunner to > use > > > in > > > > > the > > > > > > IO > > > > > > > > > > > modules. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > See also Thomas's doc on capability matrix > > > testing* > > > > > > which > > > > > > > is > > > > > > > > > > > aimed at > > > > > > > > > > > >> > > case > > > > > > > > > > > >> > > > 2. Those tests should all have a category from > > the > > > > > doc, > > > > > > > or a > > > > > > > > > new > > > > > > > > > > > one > > > > > > > > > > > >> > > added. > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > * > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > https://docs.google.com/document/d/ > > > > 1fICxq32t9yWn9qXhmT07xpclHeHX2 > > > > > > > > > > > >> > VlUyVtpi2WzzGM/edit > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > Kenn > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > On Wed, Nov 9, 2016 at 12:20 PM, Jean-Baptiste > > > > Onofré > > > > > < > > > > > > > > > > > >> j...@nanthrax.net > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > wrote: > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > Hi Mark, > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > Generally speaking, I agree. > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > As RunnableOnService extends NeedsRunner, > > > > > > > @TestsWithRunner > > > > > > > > > or > > > > > > > > > > > >> > > > @RunOnRunner > > > > > > > > > > > >> > > > > sound clearer. > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > Regards > > > > > > > > > > > >> > > > > JB > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > On 11/09/2016 09:00 PM, Mark Liu wrote: > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > >> Hi all, > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> I'm working on building RunnableOnService > in > > > > Python > > > > > > > SDK. > > > > > > > > > > After > > > > > > > > > > > >> > having > > > > > > > > > > > >> > > > >> discussions with folks, "RunnableOnService" > > > looks > > > > > > like > > > > > > > > not > > > > > > > > > a > > > > > > > > > > > very > > > > > > > > > > > >> > > > >> intuitive > > > > > > > > > > > >> > > > >> name for those unit tests that require > > runners > > > > and > > > > > > > build > > > > > > > > > > > >> lightweight > > > > > > > > > > > >> > > > >> pipelines to test specific components. > > > > Especially, > > > > > > they > > > > > > > > > don't > > > > > > > > > > > have > > > > > > > > > > > >> > to > > > > > > > > > > > >> > > > run > > > > > > > > > > > >> > > > >> on a service. > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> So I want to raise this idea to the > community > > > and > > > > > see > > > > > > > if > > > > > > > > > > anyone > > > > > > > > > > > >> have > > > > > > > > > > > >> > > > >> similar thoughts. Maybe we can come up > with a > > > > name > > > > > > this > > > > > > > > is > > > > > > > > > > > tight > > > > > > > > > > > >> to > > > > > > > > > > > >> > > > >> runner. > > > > > > > > > > > >> > > > >> Currently, I have two names in my head: > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> - TestsWithRunners > > > > > > > > > > > >> > > > >> - RunnerExecutable > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> Any thoughts? > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> Thanks, > > > > > > > > > > > >> > > > >> Mark > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > >> > > > > > > > > > > > >> > > > > -- > > > > > > > > > > > >> > > > > Jean-Baptiste Onofré > > > > > > > > > > > >> > > > > jbono...@apache.org > > > > > > > > > > > >> > > > > http://blog.nanthrax.net > > > > > > > > > > > >> > > > > Talend - http://www.talend.com > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > ------- > > > Jason Kuster > > > Apache Beam / Google Cloud Dataflow > > > > > >