I have created a ticket to update the ORC version. https://issues.apache.org/jira/browse/FLINK-17142
On Tue, Apr 14, 2020 at 8:18 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Hi, yes, we can bump the orc-core version to a newer one.
>
> Best,
> Jingsong Lee
>
> On Tue, Apr 14, 2020 at 8:16 PM Sivaprasanna <sivaprasanna...@gmail.com>
> wrote:
>
> > On a similar note, I just checked and Flink currently uses orc 1.4.3
> > in the dependencies. IMO, it is a little outdated. Can we bump the ORC
> > version to a slightly newer version - maybe 1.5.x or even 1.6.0?
> >
> > -
> > Sivaprasanna
> >
> > On Tue, Apr 14, 2020 at 1:42 PM Jingsong Li <jingsongl...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Maybe you should use flink-orc, and use orc-core instead of orc-core
> > > with the nohive classifier. We can provide a nohive version in the
> > > future.
> > >
> > > Because orc and hive are so close, orc currently still relies on some
> > > hive classes.
> > > Apache orc with the nohive classifier is for creating a variant of the
> > > core and mapreduce jars that don't conflict with hive 1.x [1]
> > >
> > > So orc and orc-nohive have the same class names, but orc-nohive
> > > shades/relocates lots of classes, like "ColumnVector" and
> > > "VectorizedRowBatch".
> > > Now flink-orc-nohive depends on flink-orc, and they share a lot of code.
> > > They cannot be unified into a separate module; there would be a lot of
> > > conflicts.
> > >
> > > [1] https://issues.apache.org/jira/browse/ORC-174
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Tue, Apr 14, 2020 at 3:36 PM Sivaprasanna <sivaprasanna...@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I'm working on an implementation of ORC BulkWriter [1]. As of now, I
> > > > have the entire implementation in a separate module called
> > > > "flink-orc-compress" under "flink-formats", since I'm not entirely
> > > > sure whether it should go into the existing ORC modules, i.e.
> > > > flink-orc & flink-orc-nohive.
> > > >
> > > > So my questions are:
> > > > 1. What's the difference between these two ORC modules?
> > > > 2. Should the ORC BulkWriter implementation go into one of these
> > > > existing modules? If yes, which one? Or can we keep it in a separate
> > > > module to avoid duplicating or causing any conflicts?
> > > >
> > > > Note: My current implementation of ORC BulkWriter uses orc-core with
> > > > the nohive classifier as the dependency.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-10114
> > >
> > > --
> > > Best, Jingsong Lee
>
> --
> Best, Jingsong Lee