[ https://issues.apache.org/jira/browse/FLINK-17723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Lonergan closed FLINK-17723. --------------------------------- Resolution: Abandoned > Written design for flink threading model and guarantees made to the various > structual components > ------------------------------------------------------------------------------------------------ > > Key: FLINK-17723 > URL: https://issues.apache.org/jira/browse/FLINK-17723 > Project: Flink > Issue Type: Improvement > Components: Documentation > Reporter: John Lonergan > Priority: Minor > Labels: auto-deprioritized-major, stale-minor > > I enjoy using Flink but ... > Do we have a written design for the threading model including the guarantees > made by the core framework in terms of threading and concurrency. > Looking at various existing components such as JDBC and file sinks and other > non-core facilities. > Having some difficulty understanding the intended design. > Want to understand the assumptions I can make about when certain functions > will be called (for example JDBCOutputFormat open vs flush vs writeRecord vs > close) and whether this will always be from the same thread or some other > thread, or whether they might be called concurrently, in order to verify the > correctness of the code. > What guarantees are there? > Does a certain reference need a volatile or even a synchronisation or not. > What's the design for threading? > If the intended design is not written down then we have to infer it from the > code and we will definitiely come to different conclusions and thus bugs and > leaks. and other avoidable horrors. > It's really hard writing good MT code and a strong design is necessary to > provide a framework for the code. > Some info here > https://flink.apache.org/contributing/code-style-and-quality-common.html#concurrency-and-threading > , but this isn't a design and doesn't say how it's meant to work. However > that page does agree that writing MT code is very hard and this just > underlines the need for a strong and detailed design for this aspect. > == > Another supporting example. > When I see code like this ... > FileOutputFormat > > {code:java} > public void close() throws IOException { > final FSDataOutputStream s = this.stream; > if (s != null) { > this.stream = null; > s.close(); > } > } > {code} > My feeling is that someone else wasn't sure what the right approach was. > I can only guess that the author was concerned that someone else was going to > call the function concurrently, or mess with the class state by some other > means. And, if that were true then would this code even be MT safe - who > knows? Ought there be a volatile in there or a plain old sync? Or perhaps > none of the caution is needed at all (framework guarantees preventing the > need)? > Or take a look at the extensive sychronisation efforts in > https://github.com/apache/flink/blob/master/flink-connectors/flink-hadoop-compatibility/src/main/java/org/apache/flink/api/java/hadoop/mapred/HadoopOutputFormatBase.java > is this code correct? Not to mention that fact that this close() method > might throw an NPE if there is any possiblity that 'this.outputCommitter' > might not have been initialised in open OR is the framework can ever call > close() without open() having completed. > I find if worrying that I see a lot of code in the project that is similarly > uncertain and inconsistent syncronisation and resource management. > I would have hoped that the underlying core framework provided guarantees > that avoided the need to have extensive synchronisation effort in derived or > auxiliary classes. > What's the design. -- This message was sent by Atlassian Jira (v8.20.1#820001)