Thanks for the input, Jagadish. This makes sense now. I see that one of our
input topic partitions is empty. One of our upstream jobs is likely configured
incorrectly.

Jeremiah Adams
Software Engineer
www.helixeducation.com

________________________________________
From: Jagadish Venkatraman <jagadish1...@gmail.com>
Sent: Monday, March 25, 2019 10:49 AM
To: dev@samza.apache.org
Subject: Re: Empty Kafka topic partition Warning

Hi Jeremiah,

>> It looks like my configuration for changelog.replication.factor is not
>> being applied. Instead the default seems to be applied. I have messages on
>> 2/3 partitions. I am not seeing what I have incorrectly configured.

The "replication.factor" config you mention determines how many brokers
each change-log partition is replicated to. This is unrelated to the
behavior you are observing, where a single partition is empty.

For context, each Samza task has its own store and writes to a single
change-log partition. An empty change-log partition means its corresponding
task did not perform writes to its store.
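
As a rough illustration of that relationship (a toy model in Python, not
Samza's API), assume task i consumes input partition i and logs every store
write to change-log partition i. The function name and the message tuples
below are made up for the example:

```python
def changelog_sizes(input_messages, num_partitions):
    """Toy model: one task, one store, and one change-log partition per input partition.

    input_messages is an iterable of (partition, key, value) tuples.
    Returns {partition: number of change-log entries written}.
    """
    stores = {p: {} for p in range(num_partitions)}
    changelog = {p: [] for p in range(num_partitions)}
    for partition, key, value in input_messages:
        # The task for this partition writes to its own store...
        stores[partition][key] = value
        # ...and every store write is mirrored to its change-log partition.
        changelog[partition].append((key, value))
    return {p: len(entries) for p, entries in changelog.items()}


# Messages arrive only on partitions 1 and 2 of a 3-partition input topic.
print(changelog_sizes([(1, "a", 1), (2, "b", 2), (1, "c", 3)], 3))
# prints {0: 0, 1: 2, 2: 1}
```

With no input on partition 0, its task never writes to its store, so
change-log partition 0 stays empty.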

Can you check if there are *input* topic-partitions that don't receive
messages? If so, the tasks consuming from them have no data to process,
which leaves their stores and the corresponding change-log partitions
empty. Also, can you check for any exceptions in your logs?
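
One way to run that check, sketched in Python: compare each partition's
beginning and end offsets, and treat a partition as empty when the two are
equal. `empty_partitions` is a hypothetical helper; `fetch_offsets` assumes
the kafka-python client is installed, and the topic name and broker address
in the usage comment are placeholders, not values from this thread.

```python
def empty_partitions(beginning_offsets, end_offsets):
    """Return the sorted partition ids that currently hold no messages.

    Both arguments map partition id -> offset. A partition is empty when its
    end offset equals its beginning offset (an end offset of 0 means it has
    never received a message).
    """
    return sorted(
        p for p, end in end_offsets.items()
        if end == beginning_offsets.get(p, 0)
    )


def fetch_offsets(topic, bootstrap_servers):
    """Fetch (beginning, end) offsets per partition using kafka-python."""
    # Imported here so empty_partitions() works without the client installed.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers)
    try:
        partition_ids = consumer.partitions_for_topic(topic) or set()
        tps = [TopicPartition(topic, p) for p in sorted(partition_ids)]
        beginning = {tp.partition: off
                     for tp, off in consumer.beginning_offsets(tps).items()}
        end = {tp.partition: off
               for tp, off in consumer.end_offsets(tps).items()}
        return beginning, end
    finally:
        consumer.close()


# Usage against a live cluster (placeholder topic and broker):
#   beginning, end = fetch_offsets("my-input-topic", "localhost:9092")
#   print(empty_partitions(beginning, end))
```

Running the same comparison against the change-log topic should show
whether the empty change-log partition lines up with an empty input
partition.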

For details on Samza's state management, I'd recommend the below resources:
Videos: Samza Architecture Part-1
<https://url.emailprotection.link/?bH1bmLuRP-9egfczNVLB6hd1gawa23HTbKoVFrWAVOlRfLffKffb7k6Xz4SIKeQI_-CACqwBAAIv3sh6ggDJS0pwrRpygaBi5RQfIazuEl6b3tQARLvOEhyiW0TJtkUGo8iFsUnmdOZ1JP5t8htxpLELL8Fc_bTyPPOf7HGCzqgafJ0dg6ZlqUd_hiohVUE_GRFkJ5akgiTrEVn6Atuz7ww~~>
 Samza Architecture Part-2
<https://url.emailprotection.link/?bH1bmLuRP-9egfczNVLB6hd1gawa23HTbKoVFrWAVOlSf_VraXufE_XpIbZBkBEqImcqPh9L5FuLuDNcZ62q9Rxq9ovOrvm2AnsNc3Jar1dCQ3YPrRkbddUeagVD98AamxScT9ytgfFGR_xlLudhgIvywi062aHaNyy8ATooSdwA~>
Docs: Architecture overview
<https://url.emailprotection.link/?b4GATHeTL1aGKLTs1Na3CfFbXFP0dMSRpyk1-M3oM2l0apvilQMvwdv4vkmcIDJo4yw6in8o7yvSedpOKyEvHxyiCQYkqUDP8o6FRVBFhQ7ehTfrVW4nBa8MBvfGU7FW8DkuH-VIG6Id5RgOWCS-Dryx2x3Pfc55FSH8rwNWDQ8Y~>


Best,
Jagadish

On Mon, Mar 25, 2019 at 8:30 AM Jeremiah Adams <jad...@helixeducation.com>
wrote:

> It looks like my configuration for changelog.replication.factor is not
> being applied. Instead the default seems to be applied. I have messages on
> 2/3 partitions. I am not seeing what I have incorrectly configured.
>
>
> stores.redelivery-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
> stores.redelivery-store.changelog=kafka.delivery-changelog
> stores.default.changelog.replication.factor=3
>
>
> ________________________________________
> From: Jagadish Venkatraman <jagadish1...@gmail.com>
> Sent: Friday, March 22, 2019 3:01 PM
> To: dev@samza.apache.org
> Subject: Re: Empty Kafka topic partition Warning
>
> Hi Jeremiah,
>
> >> why is the offset 0?
>
> This likely means that the change-log is empty and does not have any
> messages.
>
> Can you try consuming from partition-number: 0 using a KafkaConsumer?
>
> Best,
> Jagadish
>
> On Fri, Mar 22, 2019 at 11:45 AM Jeremiah Adams <jad...@helixeducation.com>
> wrote:
>
> > I'm seeing these in our log periodically and haven't seen them before.
> > Does this imply that the topic associated with the change log is being
> > replayed from the beginning?
> >
> >
> > Also, why is the offset 0? It definitely should not be. We have messages
> > across all partitions.
> >
> >
> > 2019-03-22 18:30:24 KafkaSystemAdmin [INFO] Fetching SSP metadata for:
> > [SystemStreamPartition [kafka, delivery-changelog, 0]]
> > 2019-03-22 18:30:24 KafkaSystemAdmin [WARN] Empty Kafka topic partition
> > delivery-changelog-0 with upcoming offset 0. Skipping newest offset and
> > setting oldest offset to 0 to consume from beginning
> > 2019-03-22 18:30:52 KafkaSystemAdmin [INFO] Fetching SSP metadata for:
> > [SystemStreamPartition [kafka, delivery-changelog, 0]]
> > 2019-03-22 18:30:52 KafkaSystemAdmin [WARN] Empty Kafka topic partition
> > delivery-changelog-0 with upcoming offset 0. Skipping newest offset and
> > setting oldest offset to 0 to consume from beginning
> >
>
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
>


--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
