Till, Stephan, and others,

I created a [DISCUSS] thread a few days ago; the link is below. I would appreciate it if you could take a look.

https://lists.apache.org/thread.html/rf885987160bede5911a7f61923307a6d5ae07f850da0a90555728e5f%40%3Cdev.flink.apache.org%3E
Please let me know if you want me to improve or edit the content.

Thanks,
Sivaprasanna

On Tue, Mar 17, 2020 at 8:22 PM Sivaprasanna <sivaprasanna...@gmail.com> wrote:

> Hi Till,
>
> Sure. I'll take a look and start a discuss thread soon.
>
> Thanks,
> Sivaprasanna
>
> On Mon, Mar 16, 2020 at 4:01 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
>> Hi Sivaprasanna,
>>
>> do you want to collect the set of Hadoop utility classes which could be
>> moved to a flink-hadoop-utils module and start a discuss thread about it?
>> I think this could be a good first step into cleaning up the module
>> structure a bit.
>>
>> Cheers,
>> Till
>>
>> On Fri, Mar 6, 2020 at 7:27 AM Sivaprasanna <sivaprasanna...@gmail.com> wrote:
>>
>> > That also makes sense but that, I believe, would be a breaking/major
>> > change. If we are okay with merging them together, we can name it something
>> > like "flink-hadoop-compress" since SequenceFile is also a Hadoop format and
>> > the existing "flink-compress" module, as of now, deals with Hadoop based
>> > compression.
>> >
>> > On Fri, Mar 6, 2020 at 1:33 AM João Boto <eskabe...@apache.org> wrote:
>> >
>> > > We could merge the two modules into one?
>> > > Sequence files are another way of compressing files.
>> > >
>> > > On 2020/03/05 13:02:46, Sivaprasanna <sivaprasanna...@gmail.com> wrote:
>> > > > Hi Stephan,
>> > > >
>> > > > I guess it is a valid point to have something like 'flink-hadoop-utils'.
>> > > > Maybe a [DISCUSS] thread can be started to understand what the community
>> > > > thinks?
>> > > >
>> > > > On Thu, Mar 5, 2020 at 4:22 PM Stephan Ewen <se...@apache.org> wrote:
>> > > >
>> > > > > Do we have more cases of "common Hadoop Utils"?
>> > > > >
>> > > > > If yes, does it make sense to create a "flink-hadoop-utils" module with
>> > > > > exactly such classes? It would have an optional dependency on
>> > > > > "flink-shaded-hadoop".
>> > > > >
>> > > > > On Wed, Mar 4, 2020 at 9:12 AM Till Rohrmann <trohrm...@apache.org> wrote:
>> > > > >
>> > > > > > Hi Sivaprasanna,
>> > > > > >
>> > > > > > we don't upload the source jars for the flink-shaded modules. However you
>> > > > > > can build them yourself and install by cloning the flink-shaded repository
>> > > > > > [1] and then call `mvn package -Dshade-sources`.
>> > > > > >
>> > > > > > [1] https://github.com/apache/flink-shaded
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Till
>> > > > > >
>> > > > > > On Tue, Mar 3, 2020 at 6:29 PM Sivaprasanna <sivaprasanna...@gmail.com> wrote:
>> > > > > >
>> > > > > > > BTW, can we leverage flink-shaded-hadoop-2? Reason why I ask, if any Flink
>> > > > > > > module is going to use Hadoop in any way, it will most probably include
>> > > > > > > flink-shaded-hadoop-2 as a dependency.
>> > > > > > > However, flink-shaded modules don't have any source files. Is that a strict
>> > > > > > > convention that the community follows?
>> > > > > > >
>> > > > > > > -
>> > > > > > > Sivaprasanna
>> > > > > > >
>> > > > > > > On Tue, Mar 3, 2020 at 10:48 PM Sivaprasanna <sivaprasanna...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Hi Arvid,
>> > > > > > > >
>> > > > > > > > Thanks for the quick reply. Yes, it actually makes sense to avoid Hadoop
>> > > > > > > > dependencies from getting into Flink's core modules but I also wonder if it
>> > > > > > > > will be an overkill to add flink-hadoop-fs as a dependency just because we
>> > > > > > > > want to use a utility class from that module.
>> > > > > > > >
>> > > > > > > > -
>> > > > > > > > Sivaprasanna
>> > > > > > > >
>> > > > > > > > On Tue, Mar 3, 2020 at 4:17 PM Arvid Heise <ar...@ververica.com> wrote:
>> > > > > > > >
>> > > > > > > >> Hi Sivaprasanna,
>> > > > > > > >>
>> > > > > > > >> we actually want to remove Hadoop from all core modules, so we could not
>> > > > > > > >> place it in some very common place like flink-core.
>> > > > > > > >>
>> > > > > > > >> But I think the module flink-hadoop-fs could be a fitting place.
>> > > > > > > >>
>> > > > > > > >> On Tue, Mar 3, 2020 at 11:25 AM Sivaprasanna <sivaprasanna...@gmail.com> wrote:
>> > > > > > > >>
>> > > > > > > >> > Hi
>> > > > > > > >> >
>> > > > > > > >> > The flink-sequence-file module has a class named
>> > > > > > > >> > SerializableHadoopConfiguration[1] which is nothing but a wrapper class for
>> > > > > > > >> > Hadoop Configuration. I believe this class can be moved to a common module
>> > > > > > > >> > since this is not necessarily tightly coupled with sequence-file module,
>> > > > > > > >> > and also because it can be used by many other modules, for ex.
>> > > > > > > >> > flink-compress. Thoughts?
>> > > > > > > >> >
>> > > > > > > >> > -
>> > > > > > > >> > Sivaprasanna
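
For context on the wrapper class discussed in the original message, here is a minimal sketch of the general pattern: a Serializable wrapper around a configuration object that is not itself Serializable. This is an illustration under stated assumptions, not Flink's actual SerializableHadoopConfiguration. `PlainConfig` is a hypothetical stand-in for Hadoop's Configuration so that the example runs without Hadoop on the classpath, and every class and method name below is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.TreeMap;

public class SerializableConfigSketch {

    /** Hypothetical stand-in for a configuration class that is NOT Serializable. */
    static final class PlainConfig {
        private final Map<String, String> entries = new TreeMap<>();

        void set(String key, String value) { entries.put(key, value); }

        String get(String key) { return entries.get(key); }

        /** Writes all entries in a simple length-prefixed form. */
        void write(DataOutput out) throws IOException {
            out.writeInt(entries.size());
            for (Map.Entry<String, String> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }

        /** Replaces the current entries with those read from the stream. */
        void readFields(DataInput in) throws IOException {
            entries.clear();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                entries.put(key, in.readUTF());
            }
        }
    }

    /** Serializable wrapper: the config is transient and serialized by hand. */
    static final class SerializableConfigWrapper implements Serializable {
        private static final long serialVersionUID = 1L;

        private transient PlainConfig config;

        SerializableConfigWrapper(PlainConfig config) { this.config = config; }

        PlainConfig get() { return config; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            config.write(out); // ObjectOutputStream is also a DataOutput
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            config = new PlainConfig();
            config.readFields(in); // ObjectInputStream is also a DataInput
        }
    }

    /** Serializes and deserializes the wrapper in memory. */
    static SerializableConfigWrapper roundTrip(SerializableConfigWrapper wrapper) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(wrapper);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableConfigWrapper) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        PlainConfig config = new PlainConfig();
        config.set("example.key", "example-value");
        SerializableConfigWrapper copy = roundTrip(new SerializableConfigWrapper(config));
        System.out.println(copy.get().get("example.key"));
    }
}
```

Nothing in the sketch depends on a particular file format, which is the property that makes such a wrapper a candidate for a shared utility module.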