Re: Question about Window

2017-07-12 Thread Rui Tang
results should be emitted based on Calendar time. > > Please do let us know if you had further questions. > > Best, > Jagadish > > > > On Wed, Jul 12, 2017 at 6:57 AM, Rui Tang wrote: > > > According to the windowing ( > > > http://samza.apache.org/learn/docu

Question about Window

2017-07-12 Thread Rui Tang
According to the windowing documentation (http://samza.apache.org/learn/documentation/0.13/container/windowing.html), I know that I can specify a fixed-duration window, like 86400000 ms (one day). That means if the window starts at 9:00 AM, then it will end at 9:00 AM the next day. But I need a kin
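
The question above is cut off, so the exact requirement is unknown. The reply quoted earlier in this listing mentions emitting results "based on Calendar time", so the following is only a sketch under that assumption: results should roll over at local midnight rather than a fixed 24 hours after the job started. It uses only `java.time` from the JDK. The class name `CalendarDayWindow` and both method names are hypothetical and are not part of Samza.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper (not a Samza API): tracks the current calendar day
// in a given time zone and reports when a new day has begun.
public class CalendarDayWindow {
    private final ZoneId zone;
    private LocalDate currentDay;

    public CalendarDayWindow(ZoneId zone, Instant start) {
        this.zone = zone;
        this.currentDay = start.atZone(zone).toLocalDate();
    }

    // Returns true exactly once per day change: the first time 'now'
    // falls on a later calendar day than the last day observed.
    public boolean rollsOver(Instant now) {
        LocalDate today = now.atZone(zone).toLocalDate();
        if (today.isAfter(currentDay)) {
            currentDay = today;
            return true;
        }
        return false;
    }

    // Time remaining from 'now' until the start of the next calendar day
    // in the given zone.
    public static Duration untilNextMidnight(Instant now, ZoneId zone) {
        ZonedDateTime zoned = now.atZone(zone);
        ZonedDateTime nextMidnight = zoned.toLocalDate().plusDays(1).atStartOfDay(zone);
        return Duration.between(zoned, nextMidnight);
    }
}
```

One possible way to use this would be to configure a short periodic window (for example, one minute) and call `rollsOver(Instant.now())` on each tick, emitting and resetting the daily aggregate only when it returns true. Whether that fits the original use case cannot be confirmed from the truncated message.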

Re: How to Use samza-hdfs

2016-12-20 Thread Rui Tang
hy if you change the size > per file and make them smaller, you will probably see the previous results > earlier. > > It's all just my theory, though. But it's worth a shot:) > > And yeah, the stream name doesn't really make a difference as far as I > know. >

Re: How to Use samza-hdfs

2016-12-20 Thread Rui Tang
By the way, what do you mean by "close"? Close what? And what should the stream parameter be, for example the "default" one below? It seems to have nothing to do with the result. private final SystemStream OUTPUT_STREAM = new SystemStream("hdfs", "default");

Re: How to Use samza-hdfs

2016-12-20 Thread Rui Tang
r they get closed. > > You can probably play with the "producer.hdfs.write.batch.size.bytes" > config to force rolling over to new files so you can see the results of the > previous one. > > Thanks, > Hai > > On Mon, Dec 19, 2016 at 11:29 PM, Rui Tang wrote: >

How to Use samza-hdfs

2016-12-19 Thread Rui Tang
I'm using samza-hdfs to write Kafka streams to HDFS, but I can't make it work. Here is my samza job's properties file: # Job job.factory.class=org.apache.samza.job.yarn.YarnJobFactory job.name=kafka2hdfs # YARN yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.