Reading Parquet/Hive

2015-12-18 Thread Gwenhael Pasquiers
Hi, I'm trying to read Parquet/Hive data using Parquet's ParquetInputFormat and Hive's DataWritableReadSupport. I get an error when the TupleSerializer tries to create an instance of ArrayWritable using reflection, because ArrayWritable has no no-args constructor. I've been able to make it w

Re: Hive bug? about no such table

2015-12-18 Thread Philip Lee
Oops, sorry, I was supposed to email this one to the Hive mailing list. On Fri, Dec 18, 2015 at 2:19 AM, Philip Lee wrote: > I think it is from a Hive bug, something related to the metastore. > > Here is the thing. > > After I generated scale factor 300 named bigbench300 and bigbench100, > which al

Re: Hive bug? about no such table

2015-12-18 Thread Ufuk Celebi
> On 18 Dec 2015, at 11:07, Philip Lee wrote: > > Oops, sorry > > I was supposed to email this one to the Hive mailing list. No problem. Can happen easily with auto completion ;) – Ufuk

RE: Reading Parquet/Hive

2015-12-18 Thread Gwenhael Pasquiers
I'll answer myself :) I think I've managed to make it work by creating my "WrappingReadSupport" that wraps the DataWritableReadSupport, and I also insert my "WrappingMaterializer" that converts the ArrayWritable produced by the original Materializer to String[]. Then later on, the String[] po

Re: Size of a window without explicit trigger/evictor

2015-12-18 Thread Fabian Hueske
Hi Nirmalya, sorry for the delayed answer. First of all, Flink does not ensure that windows fit into memory. The default trigger depends on the way in which you define a window. Given a KeyedStream you can define a window in the following ways: KeyedStream s = ... s.timeWindow() // this w

Re: Usecase for Flink

2015-12-18 Thread Stephan Ewen
If I understand you correctly, you want to write something like: -- [cassandra] ^ | V (even

Re: Problem to show logs in task managers

2015-12-18 Thread Ana M. Martinez
Hi Till, Many thanks for your quick response. I have modified the WordCountExample to reproduce my problem in a simple example. I run the code below with the following command: ./bin/flink run -m yarn-cluster -yn 1 -ys 4 -yjm 1024 -ytm 1024 -c mypackage.WordCountExample ../flinklink.jar An

Re: Problem to show logs in task managers

2015-12-18 Thread Till Rohrmann
In which log file exactly are you looking for the logging statements? And on what machine? You have to look on the machines on which the YARN containers were started. Alternatively, if you have log aggregation activated, you can simply retrieve the log files via yarn logs. Cheers, Till On Fri,

Re: Streaming to db question

2015-12-18 Thread Flavio Pompermaier
I was thinking of something more like http://www.infoq.com/articles/key-lessons-learned-from-transition-to-nosql that basically implements what you call Out-of-core state at https://cwiki.apache.org/confluence/display/FLINK/Stateful+Stream+Processing. Riak provides some features to handle the eventual