Thanks, JING ZHANG. The first one is definitely doable, I will add it to my list. The second one I'll have to take a look at.
On Fri, Sep 10, 2021 at 2:27 PM JING ZHANG <beyond1...@gmail.com> wrote: > Hi Ingo, > Thanks for driving this discussion. > Some use cases come to my mind, maybe those rules could be checked by the > same way. > 1. new introduced `StreamExecNode` is implemented json > serialization/deserialization. Currently it is checked in > `JsonSerdeCoverageTest`. > 2. new introduced `RelNode` could be covered by all `MetadataHandler`s. > Currently this rule not exists yet. > > Best regards, > JING ZHANG > > > Ingo Bürk <i...@ververica.com> 于2021年9月9日周四 下午6:49写道: > > > Great! I'll work on getting the PR into an actual, proper shape now, > > including looking at found violations more carefully and eventually > > freezing current violations (maybe removing some quick-wins). > > > > One more thing I just ran into is that ArchUnit doesn't explicitly > support > > Scala; while many things just work (since it's still byte code), > > Scala-specific concepts like traits seem to cause issues. I'll have to > > exclude Scala code from the checks for now, I think. > > > > > > Ingo > > > > On Tue, Sep 7, 2021 at 5:03 PM Chesnay Schepler <ches...@apache.org> > > wrote: > > > > > I would say that's fine time-wise. > > > > > > On 07/09/2021 15:29, Ingo Bürk wrote: > > > > Thanks, Chesnay. I updated the PR to use a separate module now, and > ran > > > it > > > > on a few modules (some Table API modules and a couple connectors). > The > > CI > > > > seemed to take ~2.5min for executing the tests; that's certainly not > > > > negligible. On the other hand, even the few tests implemented already > > > found > > > > several violations ("several" is an understatement, but I manually > > > verified > > > > some of them, not all of them). > > > > > > > > On Mon, Sep 6, 2021 at 3:44 PM Chesnay Schepler <ches...@apache.org> > > > wrote: > > > > > > > >> While flink-tests is currently the best choice in that it has the > > > >> biggest classpath, it is also the module already requiring the most > > time > > > >> on CI. > > > >> > > > >> Furthermore, given that we ideally cover all APIs (including > > connectors > > > >> & formats), having that mess of dependencies in flink-tests may > > > >> interfere with existing / future tests. > > > >> > > > >> As such I would prefer a separate module, as annoying as that may > be. > > > >> > > > >> On 06/09/2021 15:26, Ingo Bürk wrote: > > > >>> I just quickly chatted with the author/maintainer of ArchUnit, and > a > > > >> module > > > >>> which depends on every module that should be tested seems to be the > > > best > > > >>> solution. How do you feel about using flink-tests for this vs. > > having a > > > >>> separate module for this purpose? > > > >>> > > > >>> > > > >>> Ingo > > > >>> > > > >>> On Mon, Sep 6, 2021 at 3:04 PM Ingo Bürk <i...@ververica.com> > wrote: > > > >>> > > > >>>> Hi Chesnay, > > > >>>> > > > >>>> Those are all great questions, and I want to tackle those as well. > > For > > > >> the > > > >>>> moment I went per-module, but runtime-wise that isn't ideal the > more > > > >>>> modules we'd activate this in. ArchUnit does cache classes between > > > >> tests, > > > >>>> but if we run them individually per module, we'd still add up > quite > > a > > > >> bit > > > >>>> of execution time (a single module in my IDE is around 10s with > the > > > >> tests I > > > >>>> currently have implemented, but I suspect the bottleneck here is > the > > > >>>> importing of classes, not the number of tests). Ideally we'd just > > run > > > >> them > > > >>>> once in a module with a big enough classpath to cover everything. > If > > > we > > > >>>> have such a place, that would probably be our best shot. I'll also > > > keep > > > >>>> investigating here, of course. > > > >>>> > > > >>>> For now I just pushed a solution to avoid the overlap when > executing > > > it > > > >>>> per-module by matching on the URI. It's not the prettiest > solution, > > > but > > > >>>> does work; but that's more to not fail the tests in unrelated > > modules > > > >> and > > > >>>> doesn't help much with execution time. > > > >>>> > > > >>>> > > > >>>> Ingo > > > >>>> > > > >>>> On Mon, Sep 6, 2021 at 1:57 PM Chesnay Schepler < > ches...@apache.org > > > > > > >>>> wrote: > > > >>>> > > > >>>>> Do you have an estimate for long these tests would run for? > > > >>>>> > > > >>>>> For project-wide tests, what are the options for setting that up? > > > >>>>> If we let the tests run per-module then I guess they'd overlap > > > >>>>> considerably (because other Flink modules are being put on the > > > >>>>> classpath), which isn't ideal. > > > >>>>> > > > >>>>> On 06/09/2021 13:51, David Morávek wrote: > > > >>>>>> Hi Ingo, > > > >>>>>> > > > >>>>>> +1 for this effort. This could automate lot of "written rules" > > that > > > >> are > > > >>>>>> easy to forget about / not to be aware of (such as that each > test > > > >> should > > > >>>>>> extend the TestLogger as Till has already mentioned). > > > >>>>>> > > > >>>>>> I went trough your examples and ArchUnit looks really powerful > and > > > >>>>>> expressive while still being easy to read. > > > >>>>>> > > > >>>>>> Best, > > > >>>>>> D. > > > >>>>>> > > > >>>>>> On Mon, Sep 6, 2021 at 1:00 PM Ingo Bürk <i...@ververica.com> > > > wrote: > > > >>>>>> > > > >>>>>>> Thanks for your input Chesnay! > > > >>>>>>> > > > >>>>>>> The limitations of ArchUnit probably mostly stem from the fact > > that > > > >> it > > > >>>>>>> operates on byte code and thus can't access anything not > > accessible > > > >>>>> from > > > >>>>>>> byte code, i.e. JavaDocs. But I think Checkstyle and ArchUnit > are > > > >>>>>>> complementing each other quite well here. The main reason > against > > > >>>>>>> Checkstyle for these tests is its limitation to single files > > only, > > > >>>>>>> rendering many tests (including the one you mentioned) > > impossible. > > > >> The > > > >>>>>>> secondary reason is that ArchUnit has more declarative APIs and > > the > > > >>>>> tests > > > >>>>>>> become quite easy to write and maintain (some groundwork effort > > is > > > >>>>> needed, > > > >>>>>>> of course). Over time we could probably expand quite a bit more > > on > > > >>>>> what is > > > >>>>>>> tested with ArchUnit as it can test entire architectures > (package > > > >>>>> accesses > > > >>>>>>> etc.), and it has support for "freezing" known violations to > > > prevent > > > >>>>> new > > > >>>>>>> violations and removing existing ones over time. > > > >>>>>>> > > > >>>>>>> The @VisibleForTesting use case you mentioned is possible; I've > > > >> pushed > > > >>>>> a > > > >>>>>>> version of that rule to the draft PR now as well. > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> Best > > > >>>>>>> Ingo > > > >>>>>>> > > > >>>>>>> On Mon, Sep 6, 2021 at 12:11 PM Chesnay Schepler < > > > ches...@apache.org > > > >>>>>>> wrote: > > > >>>>>>> > > > >>>>>>>> This sounds like an interesting effort. > > > >>>>>>>> > > > >>>>>>>> The draft you have opened uses ArchUnit; can you explain a bit > > > what > > > >>>>> the > > > >>>>>>>> capabilities/limitations of said tool are? > > > >>>>>>>> > > > >>>>>>>> One thing we wanted to have for a long time is that > > > methods/classes > > > >>>>>>>> annotated with @VisibleForTesting are not called from > production > > > >> code; > > > >>>>>>>> would that be something that could be implemented? > > > >>>>>>>> > > > >>>>>>>> It's not a problem imo that tests need to run in order to > catch > > > >> stuff; > > > >>>>>>>> so long as it is noticed on CI. > > > >>>>>>>> > > > >>>>>>>> On 03/09/2021 08:48, Ingo Bürk wrote: > > > >>>>>>>>> Hi Timo, Till, > > > >>>>>>>>> > > > >>>>>>>>> thanks for your input already. I'm glad to hear that the idea > > > >>>>>>> resonates, > > > >>>>>>>>> also thanks for the additional ideas! > > > >>>>>>>>> > > > >>>>>>>>> I've created a JIRA issue[1] for now just to track this idea. > > I'm > > > >>>>> also > > > >>>>>>>>> working on a bit of a proof of concept and opened it as a > draft > > > >>>>> PR[2]. > > > >>>>>>>> I'm > > > >>>>>>>>> happy for anyone to join that PR to look and discuss. The PR > > > >> doesn't > > > >>>>>>>>> necessarily intend to be merged in its current state, but is > > > rather > > > >>>>> for > > > >>>>>>>>> evaluation. > > > >>>>>>>>> > > > >>>>>>>>> Meanwhile I'm also collecting ideas in a Google Doc, so if > > anyone > > > >>>>> wants > > > >>>>>>>> to > > > >>>>>>>>> suggest more use cases to explore or implement, please let me > > > know. > > > >>>>>>>>> > > > >>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-24138 > > > >>>>>>>>> [2] https://github.com/apache/flink/pull/17133 > > > >>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>>> Best > > > >>>>>>>>> Ingo > > > >>>>>>>>> > > > >>>>>>>>> On Thu, Sep 2, 2021 at 12:31 PM Till Rohrmann < > > > >> trohrm...@apache.org> > > > >>>>>>>> wrote: > > > >>>>>>>>>> If it is possible to automate these kinds of checks, then I > am > > > all > > > >>>>> for > > > >>>>>>>> it > > > >>>>>>>>>> because everything else breaks eventually. So +1 for this > > > >> proposal. > > > >>>>>>>>>> I don't have experience with what tools are available, > though. > > > >>>>>>>>>> > > > >>>>>>>>>> I would like to add a rule that every test class extends > > > directly > > > >> or > > > >>>>>>>>>> indirectly TestLogger because otherwise it is super hard to > > make > > > >>>>> sense > > > >>>>>>>> of > > > >>>>>>>>>> the test logs (Arvid will probably chime in stating that > this > > > will > > > >>>>> be > > > >>>>>>>>>> solved with Junit5 eventually). > > > >>>>>>>>>> > > > >>>>>>>>>> Not sure whether this is possible or not but if we can check > > > that > > > >>>>> all > > > >>>>>>>>>> interfaces have properly defined JavaDocs on each method, > then > > > >> this > > > >>>>>>>> could > > > >>>>>>>>>> also be helpful in my opinion. > > > >>>>>>>>>> > > > >>>>>>>>>> Cheers, > > > >>>>>>>>>> Till > > > >>>>>>>>>> > > > >>>>>>>>>> On Thu, Sep 2, 2021 at 11:16 AM Timo Walther < > > > twal...@apache.org> > > > >>>>>>>> wrote: > > > >>>>>>>>>>> Hi Ingo, > > > >>>>>>>>>>> > > > >>>>>>>>>>> thanks for starting this discussion. Having more automation > > is > > > >>>>>>>>>>> definitely desirable. Esp. in the API / SDK areas where we > > > >>>>> frequently > > > >>>>>>>>>>> have to add similar comments to PRs. The more checks the > > > better. > > > >> We > > > >>>>>>>>>>> definitely also need more guidelines (e.g. how to develop a > > > Flink > > > >>>>>>>>>>> connector) but automation is safer then long checklists > that > > > >> might > > > >>>>> be > > > >>>>>>>>>>> out of date quickly. > > > >>>>>>>>>>> > > > >>>>>>>>>>> +1 to the proposal. I don't have an opinion on the tool > > though. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Regards, > > > >>>>>>>>>>> Timo > > > >>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>>>>> On 01.09.21 11:03, Ingo Bürk wrote: > > > >>>>>>>>>>>> Hello everyone, > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> I would like to start a discussion on introducing > automated > > > >> tests > > > >>>>>>> for > > > >>>>>>>>>>> more > > > >>>>>>>>>>>> architectural rather than stilistic topics. For example, > > here > > > >> are > > > >>>>> a > > > >>>>>>>> few > > > >>>>>>>>>>>> things that seem worth checking to me (this is > > > Table-API-focused > > > >>>>>>> since > > > >>>>>>>>>> it > > > >>>>>>>>>>>> is the subsystem I'm involved in): > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> (a) All classes in o.a.f.table.api should be annotated > with > > > one > > > >>>>>>>>>>>> of @Internal, @PublicEvolving, or @Public. > > > >>>>>>>>>>>> (b) Classes whose name ends in *ConnectorOptions should be > > > >> located > > > >>>>>>> in > > > >>>>>>>>>>>> o.a.f.connector.*.table > > > >>>>>>>>>>>> (c) Classes implementing DynamicSourceFactory / > > > >> DynamicSinkFactory > > > >>>>>>>>>> should > > > >>>>>>>>>>>> have no static members of type ConfigOption > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> There are probably significantly more cases worth > checking, > > > and > > > >>>>> also > > > >>>>>>>>>> more > > > >>>>>>>>>>>> involved ones (these are rather simple examples), like > > > >> disallowing > > > >>>>>>>>>> access > > > >>>>>>>>>>>> between certain packages etc. There are two questions I > > would > > > >> like > > > >>>>>>> to > > > >>>>>>>>>> ask > > > >>>>>>>>>>>> to the community: > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> (1) Do you think such tests are useful in general? > > > >>>>>>>>>>>> (2) What use cases come to mind for you? > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> If the idea finds consensus, I would like to use (2) to > > > >>>>> investigate > > > >>>>>>>>>> which > > > >>>>>>>>>>>> tooling to use. An obvious candidate is Checkstyle, as > this > > is > > > >>>>>>> already > > > >>>>>>>>>>>> used. It also has the advantage of being well integrated > in > > > the > > > >>>>> IDE. > > > >>>>>>>>>>>> However, it is limited to looking at single files only, > and > > > >> custom > > > >>>>>>>>>> checks > > > >>>>>>>>>>>> are pretty complicated and involved to implement[1]. > Another > > > >>>>>>> possible > > > >>>>>>>>>>> tool > > > >>>>>>>>>>>> is ArchUnit[2], which would be significantly easier to > > > maintain > > > >>>>> and > > > >>>>>>> is > > > >>>>>>>>>>> more > > > >>>>>>>>>>>> powerful, but in turn requires tests to be executed. If > you > > > have > > > >>>>>>>>>> further > > > >>>>>>>>>>>> suggestions (or thoughts) they would of course also be > quite > > > >>>>>>> welcome, > > > >>>>>>>>>>>> though for now I would focus on (1) and (2) and go from > > there > > > to > > > >>>>>>>>>>> evaluate. > > > >>>>>>>>>>>> [1] https://checkstyle.sourceforge.io/writingchecks.html > > > >>>>>>>>>>>> [2] https://www.archunit.org/ > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Best > > > >>>>>>>>>>>> Ingo > > > >>>>>>>>>>>> > > > >> > > > > > > > > >