RE: GROUP BY TUMBLE on ROW range

2017-10-18 Thread Stefano Bortoli
(partitioned on some key) sorted on time with a ROW_NUMBER function that assigns increasing numbers to rows. - do a group by on the row number modulo the window size. Btw. count windows are supported by the Table API. Best, Fabian 2017-10-17 17:16 GMT+02:00 Stefano Bortoli <stefano.bort...@huawei.com>
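
A minimal sketch of the count-window alternative Fabian mentions, using the string-expression Table API of that era (1.3-style package and expression syntax; the table/field names and the "2.rows" size are hypothetical):

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.Tumble;

    // Row-count tumbling windows are defined over processing time; "user"
    // and "amount" are placeholder fields of the incoming stream.
    Table result = tableEnv
        .fromDataStream(stream, "user, amount, proctime.proctime")
        .window(Tumble.over("2.rows").on("proctime").as("w"))
        .groupBy("w, user")
        .select("user, amount.count");

This sidesteps SQL entirely: GROUP BY TUMBLE in Flink SQL takes a time interval, while the Table API accepts a row interval for its group windows.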

GROUP BY TUMBLE on ROW range

2017-10-17 Thread Stefano Bortoli
Hi all, Is there a way to use a tumble window group by with a row range in stream SQL? I mean, something like this: // "SELECT COUNT(*) " + // "FROM T1 " + // "GROUP BY TUMBLE(rowtime, INTERVAL '2' ROWS PRECEDING )" However, even looking at the tests and at the "row int

RE: UnilateralSortMerger error (again)

2017-04-21 Thread Stefano Bortoli
In fact, the old problem was the KryoSerializer missing initialization on the exception that would trigger the spilling to disk. This would leave a dirty serialization buffer that would eventually break the program. Till worked on it, debugging the source code generating the error. Perhaps som

RE: Flink memory usage

2017-04-20 Thread Stefano Bortoli
, Billy [mailto:billy.newp...@gs.com] Sent: Thursday, April 20, 2017 4:46 PM To: Stefano Bortoli ; 'Fabian Hueske' Cc: 'user@flink.apache.org' Subject: RE: Flink memory usage Your reuse idea kind of implies that it’s a GC generation rate issue, i.e. it’s not collecting fast en

RE: Flink memory usage

2017-04-20 Thread Stefano Bortoli
Hi Billy, The only suggestion I can give is to check your code very carefully for useless allocations, and foster reuse as much as possible. Don't create a new collection on every map execution; rather, clear and reuse the collected output of the flatMap, and so on. In the past we ran lon
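
A minimal sketch of the reuse pattern described above (the function and field names are hypothetical; with object reuse left disabled, which is the default, the runtime copies or serializes the record in collect(), so mutating it on the next call is safe):

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    public class TokenCounter extends RichFlatMapFunction<String, Tuple2<String, Integer>> {
        // allocated once per task and mutated per record instead of re-created
        private transient Tuple2<String, Integer> reuse;

        @Override
        public void open(Configuration parameters) {
            reuse = new Tuple2<>();
        }

        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String token : line.split(" ")) {
                reuse.f0 = token;
                reuse.f1 = 1;
                out.collect(reuse);
            }
        }
    }

The point is to keep the per-record allocation rate, and therefore the GC generation rate discussed in the reply above, as low as possible.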

Re: JDBC sink in flink

2016-07-05 Thread Stefano Bortoli
As Chesnay said, it is not necessary to use a pool, as the connection is reused across splits. However, if you have to customize it for some reason, you can do it starting from the JDBC Input and Output format. cheers! 2016-07-05 13:27 GMT+02:00 Harikrishnan S : > Awesome ! Thanks a lot ! I should pr
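
For reference, a minimal sketch of the flink-jdbc output format of that era (driver, URL, and table are placeholders); each parallel task opens one connection and reuses it for all the records it writes:

    import org.apache.flink.api.java.io.jdbc.JDBCOutputFormat;

    JDBCOutputFormat sink = JDBCOutputFormat.buildJDBCOutputFormat()
        .setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
        .setDBUrl("jdbc:derby:memory:flinkdb")
        .setQuery("INSERT INTO books (id, title) VALUES (?, ?)")
        .finish();

    // the tuple arity and field types must match the query's parameters
    dataSet.output(sink);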

Re: JDBC sink in flink

2016-07-05 Thread Stefano Bortoli
The connection will be managed by the splitManager, no need to use a pool. However, if you had to, you should probably look into the establishConnection() method of the JDBCInputFormat. 2016-07-05 10:52 GMT+02:00 Flavio Pompermaier : > why do you need a connection pool? > On 5 Jul 2016 11:41, "Ha

Re: Weird Kryo exception (Unable to find class: java.ttil.HashSet)

2016-05-24 Thread Stefano Bortoli
Till mentioned that 'spilling on disk' was managed through an exception catch. The last serialization error was related to bad management of the Kryo buffer, which was not cleaned after the spilling triggered by the exception handling. Is it possible we are dealing with an issue similar to this but caused by anoth

Re: Weird Kryo exception (Unable to find class: java.ttil.HashSet)

2016-05-16 Thread Stefano Bortoli
Hi Flavio, Till, do you think this could possibly be related to the serialization problem caused by 'the management' of the Kryo serializer buffer when spilling to disk? We are definitely going beyond what is managed in memory with this task. saluti, Stefano 2016-05-16 9:44 GMT+02:00 Flavio Pompermaie

Re: Requesting the next InputSplit failed

2016-04-29 Thread Stefano Bortoli
fers:16384 The job just reads a window of max 100k elements and then writes a Tuple5 into a CSV on the jobmanager fs with parallelism 1 (in order to produce a single file). The job dies after 40 mi

Re: Requesting the next InputSplit failed

2016-04-28 Thread Stefano Bortoli
It is not clear why it should refuse a connection to itself after 40 minutes of running. We'll try to figure out possible environment issues. It's a fresh installation, therefore we may have left out some configuration. saluti, Stefano 2016-04-28 9:22 GMT+02:00 Stefano Bortoli : > I had this type of e

Re: Requesting the next InputSplit failed

2016-04-28 Thread Stefano Bortoli
I had this type of exception when trying to build and test Flink on a "small machine". I worked around the test failure by increasing the timeout for Akka. https://github.com/stefanobortoli/flink/blob/FLINK-1827/flink-tests/src/test/java/org/apache/flink/test/checkpointing/EventTimeAllWindowCheckpointingITCa
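
A minimal sketch of the knob involved, assuming the cluster is brought up from a Configuration as in tests (the key is Flink's akka.ask.timeout; the value is illustrative):

    import org.apache.flink.configuration.Configuration;

    Configuration config = new Configuration();
    // the default ask timeout is 10 s; a loaded or small machine may need
    // far more headroom before remote calls start timing out
    config.setString("akka.ask.timeout", "100 s");

The same key can be set in flink-conf.yaml for a standalone cluster.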

Re: threads, parallelism and task managers

2016-04-13 Thread Stefano Bortoli
to 16 I expect to see 16 calls to the connection pool (i.e. 'CREATING NEW CONNECTION!') while I see only 8 (up to my maximum number of cores). The number of created tasks instead is correct (16).

Joda DateTimeSerializer

2016-04-08 Thread Stefano Bortoli
Hi to all, we've just upgraded to Flink 1.0.0 and we had some problems with Joda DateTime serialization. The problem was caused by FLINK-3305, which removed the JavaKaffee dependency. We had to re-add that dependency in our application and then register the DateTime serializer in the environment:
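
The registration that follows the colon is cut off in the archive; it was presumably along these lines (a sketch using the JodaDateTimeSerializer from the de.javakaffee:kryo-serializers artifact):

    import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.joda.time.DateTime;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // FLINK-3305 dropped the bundled dependency, so the serializer has to be
    // on the application classpath and registered explicitly with Kryo
    env.registerTypeWithKryoSerializer(DateTime.class, JodaDateTimeSerializer.class);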

Re: threads, parallelism and task managers

2016-03-30 Thread Stefano Bortoli
Ufuk Celebi : > Do you have the code somewhere online? Maybe someone can have a quick > look over it later. I'm pretty sure that is indeed a problem with the > custom input format. > > – Ufuk > > On Tue, Mar 29, 2016 at 3:50 PM, Stefano Bortoli > wrote: > > Per

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
e, because there would be millions of queries before the statistics are collected. Perhaps we are doing something wrong; we are still trying to figure out what. :-/ thanks a lot for your help. saluti, Stefano 2016-03-29 13:30 GMT+02:00 Stefano Bortoli : > That is exactly my point. I should have 32 threads

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
nning. > On Tue, Mar 29, 2016 at 12:29 PM, Stefano Bortoli wrote: >> In fact, I don't use it. I just had to crawl back the runtime implementation to get to the point where parallelism was switching from 32 to 8. >> saluti,

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
; something which you shouldn’t be concerned with since it is only used > internally by the runtime. > > Cheers, > Till > ​ > > On Tue, Mar 29, 2016 at 12:09 PM, Stefano Bortoli > wrote: > >> Well, in theory yes. Each task has a thread, but only a numb

Re: threads, parallelism and task managers

2016-03-29 Thread Stefano Bortoli
nContext = ExecutionContext.fromExecutor(new ForkJoinPool()) and thus the number of concurrently running threads is limited to the number of cores (using the default constructor of the ForkJoinPool). What do you think?

threads, parallelism and task managers

2016-03-23 Thread Stefano Bortoli
Hi guys, I am trying to test a job that should run a number of tasks to read from an RDBMS using an improved JDBC connector. The connection and the reading run smoothly, but I cannot seem to move above the limit of 8 concurrent threads. 8 is of course the number of cores of my ma
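
A sketch of the knobs involved, assuming a local environment (a local setup sizes itself to the number of cores by default, which matches the cap of 8 observed here; the values are illustrative):

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;

    Configuration conf = new Configuration();
    // each slot runs one parallel subtask of the pipeline; raising it above
    // the core count allows more concurrent threads on a single TaskManager
    conf.setInteger("taskmanager.numberOfTaskSlots", 16);

    ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);
    env.setParallelism(16);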

Re: Oracle 11g number serialization: classcast problem

2016-03-23 Thread Stefano Bortoli
ripts using Flink. We are testing it on an Oracle table of 11 billion records, but we have not yet got through a complete run. We are just at the first prototype level, so there is surely some work to do. :-) saluti, Stefano 2016-03-23 10:38 GMT+01:00 Chesnay Schepler : > On 23.03.2016 10:04, Stefan

Re: Oracle 11g number serialization: classcast problem

2016-03-23 Thread Stefano Bortoli
ormats. > > Is the INT returned as a double as well? > > Note: The (runtime) output type is in no way connected to the TypeInfo you > pass when constructing the format. > > > On 21.03.2016 14:16, Stefano Bortoli wrote: > >> Hi squirrels, >> >> I working o

Oracle 11g number serialization: classcast problem

2016-03-21 Thread Stefano Bortoli
Hi squirrels, I am working on a Flink job connecting to an Oracle DB. I started from the JDBC example for Derby, and used the TupleTypeInfo to configure the fields of the tuple as it is read. The record of the example has 2 INT, 1 FLOAT and 2 VARCHAR. Apparently, using Oracle, all the numbers are rea
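
A sketch of the pattern in question, with the type info widened to what the Oracle JDBC driver typically hands back (connection details are placeholders; note Chesnay's point in the reply above that the runtime output type is not connected to the TypeInfo you pass):

    import java.math.BigDecimal;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.api.java.typeutils.TupleTypeInfo;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    JDBCInputFormat input = JDBCInputFormat.buildJDBCInputFormat()
        .setDrivername("oracle.jdbc.OracleDriver")
        .setDBUrl("jdbc:oracle:thin:@//host:1521/service") // placeholder
        .setQuery("SELECT id, price, title FROM books")    // placeholder
        .finish();

    // Oracle NUMBER columns usually surface as BigDecimal regardless of the
    // declared TypeInfo, so declare BigDecimal here and convert downstream
    DataSet<Tuple3<BigDecimal, BigDecimal, String>> rows = env.createInput(
        input,
        new TupleTypeInfo<>(BasicTypeInfo.BIG_DEC_TYPE_INFO,
                            BasicTypeInfo.BIG_DEC_TYPE_INFO,
                            BasicTypeInfo.STRING_TYPE_INFO));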

Re: Error running an hadoop job from web interface

2015-10-23 Thread Stefano Bortoli
What I normally do is run java -cp MYUBERJAR.jar my.package.mainclass. Does it make sense? 2015-10-23 17:22 GMT+02:00 Flavio Pompermaier : > could you write me the command please? I'm not in the office right now.. > On 23 Oct 2015 17:10, "Maximilian Michels" wrote: >> Could you try submitting t

Re: kryo exception due to race condition

2015-10-07 Thread Stefano Bortoli
efinitely look into it once Flink Forward is over and we've > finished the current release work. Thanks for reporting the issue. > > Cheers, > Till > > On Tue, Oct 6, 2015 at 9:21 AM, Stefano Bortoli wrote: > >> Hi guys, I could manage to complete the process crossing

Re: kryo exception due to race condition

2015-10-06 Thread Stefano Bortoli
Hi guys, I managed to complete the process by crossing byte arrays that I deserialize within the group function. However, I think this workaround is feasible only for relatively simple processes. Any idea/plan about fixing the serialization problem? saluti, Stefano Stefano Bortoli, PhD *ENS
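
A rough sketch of the byte-array workaround described above (MyPojo is a placeholder and must implement Serializable; SerializationUtils is from Apache Commons Lang). The idea is that the shared Kryo instance only ever moves opaque byte arrays, and the POJOs are materialized inside the user function:

    import org.apache.commons.lang3.SerializationUtils;
    import org.apache.flink.api.common.functions.CrossFunction;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;

    // serialize up front, before the data enters the cross
    MapFunction<MyPojo, byte[]> toBytes = new MapFunction<MyPojo, byte[]>() {
        @Override
        public byte[] map(MyPojo value) {
            return SerializationUtils.serialize(value);
        }
    };
    DataSet<byte[]> leftBytes = left.map(toBytes);
    DataSet<byte[]> rightBytes = right.map(toBytes);

    // deserialize only inside the user function, after shipping plain bytes
    DataSet<Boolean> matches = leftBytes.cross(rightBytes)
        .with(new CrossFunction<byte[], byte[], Boolean>() {
            @Override
            public Boolean cross(byte[] a, byte[] b) {
                MyPojo p1 = SerializationUtils.deserialize(a);
                MyPojo p2 = SerializationUtils.deserialize(b);
                return p1.getId().equals(p2.getId()); // placeholder comparison
            }
        });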

Re: data flow example on cluster

2015-10-02 Thread Stefano Bortoli
I had problems running a Flink job with Maven; probably there is some classloading issue. What worked for me was to run a simple java command with the uberjar. So I build the jar using Maven, and then run it this way: java -Xmx2g -cp target/youruberjar.jar yourclass arg1 arg2. Hope it helps, Stefano 201

Re: kryo exception due to race condition

2015-10-02 Thread Stefano Bortoli
02:00 Stefano Bortoli : > here it is: https://issues.apache.org/jira/browse/FLINK-2800 > > saluti, > Stefano > > 2015-10-01 18:50 GMT+02:00 Stephan Ewen : > >> This looks to me like a bug where type registrations are not properly >> forwarded to all Serializers. >

Re: kryo exception due to race condition

2015-10-02 Thread Stefano Bortoli
ct 1, 2015 at 6:46 PM, Stefano Bortoli > wrote: > >> Hi guys, >> >> I hit a Kryo exception while running a process 'crossing' POJOs datasets. >> I am using the 0.10-milestone-1. >> Checking the serializer: >> org.apache.flink.api.java.typeutils.ru

kryo exception due to race condition

2015-10-01 Thread Stefano Bortoli
Hi guys, I hit a Kryo exception while running a process 'crossing' POJO datasets. I am using 0.10-milestone-1. Checking the serializer: org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:210) I have noticed that the Kryo instance is reused along s

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
problems of classpath are eased, so I can live with it. thanks a lot for your support. saluti, Stefano 2015-07-24 11:17 GMT+02:00 Stefano Bortoli : > HI Stephan, > > I think I may have found a possible root of the problem. I do not build > the fat jar, I simply execute the main with mav

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
> Configuration cfg = new Configuration(); > TaskConfig taskConfig = new TaskConfig(cfg); > taskConfig.setStubWrapper(userCode); > taskConfig.getStubWrapper(ClassLoader.getSystemClassLoader()); > > On Fri, Jul 24, 2015 at 10:44 AM, Stefano Bortoli > wrote: >>

Re: starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
> MongoHadoopOutputFormat? > > Can you try and call "SerializationUtils.clone(new MongoHadoopOutputFormat )", for example at the beginning of your main method? The > SerializationUtils are part of Apache Commons and are probably in your > class path anyways. > > S

starting flink job from bash script with maven

2015-07-24 Thread Stefano Bortoli
k trace. I thought it was a matter of serializable classes, so I made all my classes serializable, but I still get the same error. Perhaps it is not possible to do these things with Flink. Any intuition? Is it doable? Thanks a lot for your support. :-) saluti, Stefano Bortoli, PhD *ENS Technica

Re: MongoOutputFormat does not write back to collection

2015-07-23 Thread Stefano Bortoli
Yes it does. :-) I have implemented it with Hadoop1 and Hadoop2. Essentially I extended the HadoopOutputFormat, reusing part of the code of the HadoopOutputFormatBase, and set the MongoOutputCommitter to replace the FileOutputCommitter. saluti, Stefano Stefano Bortoli, PhD *ENS Technical

Re: MongoOutputFormat does not write back to collection

2015-07-23 Thread Stefano Bortoli
Seems like the HadoopOutputFormat wrapper is pretty much specialized on > File Output Formats. > > Can you open an issue for that? Someone will need to look into this... > > On Wed, Jul 22, 2015 at 4:48 PM, Stefano Bortoli > wrote: > >> In fact, on close() of the HadoopOutputFormat

Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
finalizeGlobal to the outputCommitter instead of the FileOutputCommitter(), or keep the latter as a default in case of no specific assignment. saluti, Stefano 2015-07-22 16:48 GMT+02:00 Stefano Bortoli : > In fact, on close() of the HadoopOutputFormat the fileOutputCommitter > returns false

Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
-07-22 15:53 GMT+02:00 Stefano Bortoli : > Debugging, it seem the commitTask method of the MongoOutputCommitter is > never called. Is it possible that this 'bulk' approach of mongo-hadoop 1.4 > does not fit the task execution method of Flink? > > any idea? thanks a l

Re: MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
Debugging, it seems the commitTask method of the MongoOutputCommitter is never called. Is it possible that this 'bulk' approach of mongo-hadoop 1.4 does not fit the task execution model of Flink? Any idea? Thanks a lot in advance. saluti, Stefano Stefano Bortoli, PhD *ENS Technica

MongoOutputFormat does not write back to collection

2015-07-22 Thread Stefano Bortoli
Hi, I am trying to analyze and update a MongoDB collection with Apache Flink 0.9.0, Mongo Hadoop 1.4.0, and Hadoop 2.6.0. The process is fairly simple, and the MongoInputFormat works smoothly; however, it does not write back to the collection. The process itself works, because writeAsText works as expe

Re: Cannot instantiate Mysql connection

2015-06-05 Thread Stefano Bortoli
Hi Robert, I answer on behalf of Flavio. He told me the driver jar was included. Smells like a class-loading issue due to 'conflicting' dependencies. Is it possible? Saluti, Stefano 2015-06-05 16:24 GMT+02:00 Robert Metzger : > Hi, > > is the MySQL driver part of the Jar file that you've built? >

Re: How to use Tuple in ListValue?

2015-01-23 Thread Stefano Bortoli
What I did was to implement ListValue in a MyListValue object; then you can do pretty much whatever you want. :-) saluti, Stefano 2015-01-23 11:29 GMT+01:00 Robert Metzger : > Hi, > > I think you can just use a java collection for the Tuple2's. (Starting > from Flink 0.8.0) > > Robert. > > On Fri, J