Hey Jason,

Can you share the entire container log? It will be useful for finding out what went wrong. If this is a non-YARN deployment, it would also be useful if you shared the JobRunner logs.
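
In the meantime, going by the MessageSizeTooLargeException in your kafka-console-consumer output, one thing that may be worth checking is the consumer fetch size. The fragment below is only a sketch. It assumes that Samza hands `systems.kafka.consumer.*` properties through to the Kafka consumer, and that the old consumer's `fetch.message.max.bytes` setting applies to your Kafka version. The value simply mirrors the producer's `max.request.size` from your config and is not a tested recommendation.

```properties
# Assumption: systems.kafka.consumer.* is passed through to the Kafka consumer unchanged.
# Assumption: fetch.message.max.bytes is the fetch-size setting of the old (ZooKeeper-based) consumer.
# 52428800 mirrors systems.kafka.producer.max.request.size from the quoted job config.
systems.kafka.consumer.fetch.message.max.bytes=52428800
```

The broker would presumably also need to accept messages of that size (`message.max.bytes`, plus `replica.fetch.max.bytes` when the topic is replicated), so the container log would still help to confirm where the limit is actually being hit.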

Thanks,
Jagadish

On Fri, Feb 5, 2016 at 4:33 PM, Jason Erickson <ja...@stormpath.com> wrote:

> Is the Kafka producer configuration different than the Samza configuration
> of the Samza task that references the store? If not, here is are those
> configuration values. The changelog in question is
> resourceStore-changelog.
>
> # Job
> job.factory.class=org.apache.samza.job.local.ThreadJobFactory
> job.name=resource_normalizer
>
> job.coordinator.system=kafka
>
> # Task
> task.class=com.foo.blazer.resource.normalizer.samza.ResourceNormalizerSamzaTask
> task.inputs=kafka.com.foo.iam.indexing.resource.mutation
> task.window.ms=10000
>
> task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
> task.checkpoint.system=kafka
> # Normally, this would be 3, but we have only one broker.
> task.checkpoint.replication.factor=3
> task.checkpoint.skip-migration=true
>
> # Serializers
> serializers.registry.byte.class=org.apache.samza.serializers.ByteSerdeFactory
> serializers.registry.entity.class=com.foo.blazer.resource.normalizer.serde.ResourceEventEntitySerdeFactory
> serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
> serializers.registry.int.class=org.apache.samza.serializers.IntegerSerdeFactory
> serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
>
> # Systems
> systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
> systems.kafka.samza.msg.serde=byte
> systems.kafka.samza.offset.default=oldest
> systems.kafka.consumer.zookeeper.connect=${ZK_NODES}/${ZK_ROOT}
> systems.kafka.consumer.auto.offset.reset=smallest
> systems.kafka.producer.bootstrap.servers=${KAFKA_NODES}
> systems.kafka.producer.max.request.size=52428800
> systems.kafka.streams.metrics.samza.msg.serde=metrics
>
> # Metrics
> metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactor
> metrics.reporter.snapshot.stream=kafka.metrics
>
> metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
> metrics.reporters=snapshot,jmx
>
> # Stores
> stores.resourceStore.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
> stores.resourceStore.changelog=kafka.resourceStore-changelog
> stores.resourceStore.changelog.replication.factor=3
>
> stores.resourceStore.key.serde=string
> stores.resourceStore.msg.serde=entity
>
> stores.deletedFlagStore.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
> stores.deletedFlagStore.changelog=kafka.deletedFlagStore-changelog
> stores.deletedFlagStore.changelog.replication.factor=3
>
> stores.deletedFlagStore.key.serde=string
> stores.deletedFlagStore.msg.serde=int
>
>
> We do not get a stack trace from the Samza task itself, it just never seems
> to fully start. However if we use kafka-console-consumer to try to examine
> the changelog we get this:
>
> 2016-02-02 22:10:11,252] ERROR Error processing message, terminating
> consumer process: (kafka.tools.ConsoleConsumer$)
> kafka.common.MessageSizeTooLargeException: Found a message larger than the
> maximum fetch size of this consumer on topic resourceStore-changelog
> partition 7 at fetch offset 0. Increase the fetch size, or decrease the
> maximum message size the broker will allow.
>     at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:90)
>     at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
>     at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
>     at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
>     at kafka.consumer.OldConsumer.receive(BaseConsumer.scala:79)
>     at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:110)
>     at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
>     at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:47)
>     at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
>
>
> On Fri, Feb 5, 2016 at 3:42 PM, Ramesh Bhojan <rbhojan.soc...@gmail.com>
> wrote:
>
> > Jason,
> > Can we please share more information about the exact stack trace and the
> > job configuration, especially the Kafka producer configuration for the
> > changelog system, as requested by Yi Pan?
> >
> > Regards,
> > Ramesh
> >
> > On Thu, Feb 4, 2016 at 11:56 AM, Ramesh Bhojan <rbhojan.soc...@gmail.com>
> > wrote:
> >
> >> Dear team @ Samza,
> >> I would really appreciate some help with the following question posted in
> >> Stack Overflow :
> >>
> >> http://stackoverflow.com/questions/35168641/is-there-a-configuration-setting-to-allow-large-values-in-my-samza-store
> >>
> >> Thanks,
> >> Ramesh
> >>
> >
>
> --
> Thanks,
>
> Jason Erickson
>

--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University