Re: RocksDB vs in-memory store when not writing changelog to Kafka

2016-08-02 Thread Jack Huang
d container may get allocated on the same host. In this > case, it will re-use the already persisted state on disk, after making sure > that it is caught up with the changelog stream. > > HTH, > Navina > > On Tue, Aug 2, 2016 at 11:51 AM, Jack Huang wrote: > > > Hi all,

RocksDB vs in-memory store when not writing changelog to Kafka

2016-08-02 Thread Jack Huang
Hi all, Is there any reason to use RocksDB without associating it with a changelog in Kafka? My understanding is that even though RocksDB persists data to disk, when a container fails the partition might be restarted on a different machine, where there is no persisted data on disk. In that case, wouldn't

How to make the window() of different partitions to start at different time?

2016-06-21 Thread Jack Huang
Hi all, I have a window() that performs external IO to a database (Aerospike). It seems that window() gets called at pretty much the same time for every partition. I want the containers to execute window() at different times so that the IO load is spread evenly over time. Is that possible?
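One way to picture the staggering asked about above is to give each partition its own offset inside the window period and only do the external IO once that offset has passed. The sketch below is plain Java with no Samza dependency. The class WindowStagger and its methods are hypothetical names introduced for illustration only, and the approach is an assumption, not something confirmed in the thread.

```java
/**
 * Hypothetical helper (not part of Samza): spreads per-partition work
 * evenly across one period so that partitions do not all do IO at once.
 */
final class WindowStagger {
    private WindowStagger() {}

    /** Offset within the period at which the given partition should run. */
    static long offsetMillis(int partitionId, int partitionCount, long periodMillis) {
        if (partitionCount <= 0 || periodMillis <= 0) {
            throw new IllegalArgumentException("partitionCount and periodMillis must be positive");
        }
        // Map the partition onto one of partitionCount evenly spaced slots.
        int slot = Math.floorMod(partitionId, partitionCount);
        return (periodMillis * slot) / partitionCount;
    }

    /**
     * Returns true if this partition has not yet run since the start of its
     * most recent slot. Intended to be checked on a tick that fires more
     * often than periodMillis.
     */
    static boolean isDue(long nowMillis, long lastRunMillis,
                         int partitionId, int partitionCount, long periodMillis) {
        long offset = offsetMillis(partitionId, partitionCount, periodMillis);
        // Start time of the most recent slot for this partition.
        long slotStart = Math.floorDiv(nowMillis - offset, periodMillis) * periodMillis + offset;
        return lastRunMillis < slotStart;
    }
}
```

Under these assumptions a task would call isDue(...) from a frequently firing window() and perform the database write only when it returns true, recording the current time as lastRunMillis afterwards. Whether this suits a given job depends on how short the window interval can reasonably be made.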

Re: Location of the RocksDB Key-Value store

2016-06-21 Thread Jack Huang
/data/b/yarn/nm/usercache/david/appcache/application_1464853403568_0010/container_e04_1464853403568_0010_01_02/state/session-store/Partition_14 > > On Fri, Jun 10, 2016 at 7:01 PM, Jack Huang wrote: > > > Hi all, > > > > I couldn't find in the documentation ( >

Re: java.rmi.server.ExportException: Port already in use

2016-06-10 Thread Jack Huang
allocation failed. That's the most common cause of GC in any Java > application. > > -Jake > > On Fri, Jun 10, 2016 at 10:55 AM, Jack Huang wrote: > > > Hi all, > > > > I am having difficulty determining the reason my Samza task is failing. > It >

Location of the RocksDB Key-Value store

2016-06-10 Thread Jack Huang
Hi all, I couldn't find in the documentation ( https://samza.apache.org/learn/documentation/latest/jobs/configuration-table.html) where on the disk Samza stores the local RocksDB key-value store. Can anyone tell me where it is and if it can be configured? Thanks, Jack

java.rmi.server.ExportException: Port already in use

2016-06-10 Thread Jack Huang
Hi all, I am having difficulty determining the reason my Samza task is failing. It generally fails within 10 minutes of starting. When I examine the YARN log I see the following exception on some but not all containers: java.rmi.server.ExportException: Port already in use: 40029; nested exception i

Recover from SamzaException thrown by KeyValueIterator.all()

2016-05-04 Thread Jack Huang
The following code for(KeyValueIterator itor = myStore.all(); itor.hasNext(); ) { ... } throws the exception org.apache.samza.SamzaException: Unable to send message from TaskName-Partition 8 to system kafka. at org.apache.samza.system.kafka.KafkaSystemProducer$$anonfun$flush$1.

Reporting deserialization error in StreamTask

2016-03-11 Thread Jack Huang
dropped silently and the task goes on to the next message. Is there a way for the task to report the deserialization error but not fail? Thanks, Jack Huang

Re: Error when fetching configuration from coordinator

2016-02-29 Thread Jack Huang
Jack Huang On Mon, Feb 29, 2016 at 12:52 PM, Jack Huang wrote: > Thanks a lot Jadadish! I changed jackson-jaxrs:1.8.5 to > jackson-jaxrs:1.9.13 and added jackson-mapper-asl:1.9.13. Now > hello-samza finally runs :) > > > > org.codehaus.jackson

Re: Error when fetching configuration from coordinator

2016-02-29 Thread Jack Huang
ption: org.codehaus.jackson.map.deser.std.StdDeserializer exception though... Thanks, Jack Jack Huang On Fri, Feb 26, 2016 at 6:21 PM, Jack Huang wrote: > Hi all, > > Still trying to get hello-samza running on our cluster. After I > successfully launched the hello-samza job with

Error when fetching configuration from coordinator

2016-02-26 Thread Jack Huang
ntainer.main(SamzaContainer.scala) End of LogType:samza-container-0.log Can anyone help me figure out why it can't read the configuration file from the coordinator URL? Thanks, Jack Huang

Re: Port issue with running hello-samza on HDP managed by Ambari

2016-02-26 Thread Jack Huang
Turns out that I need to set the environment variable export HADOOP_YARN_HOME=/etc/hadoop so that the run-class.sh script will pick up the right config file. Jack Huang On Wed, Feb 24, 2016 at 5:56 PM, Jack Huang wrote: > Hi all, > > I am having trouble running the hello-samz

Port issue with running hello-samza on HDP managed by Ambari

2016-02-24 Thread Jack Huang
I can't find any other place that specifies ResourceManager address. Can anyone help? Thanks, Jack Huang