Hi,
> Can you describe a bit more on your ingestion rate ?
> what exactly were the read limits?
Streaming job ingestion is capped at a maximum of 1M records per batch. The
trigger interval is every 1 minute, which seems fine for regular stream
processing. Our average per-minute record count is well below that cap.
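For reference, the read side looks roughly like the following sketch. The table name and exact values are illustrative, not our production job; the per-batch row cap assumes Iceberg's `streaming-max-rows-per-micro-batch` Spark read option.

```scala
// Sketch of the streaming read described above (illustrative values).
import org.apache.spark.sql.streaming.Trigger

val stream = spark.readStream
  .format("iceberg")
  // Cap each micro-batch at ~1M records (Iceberg Spark read option).
  .option("streaming-max-rows-per-micro-batch", "1000000")
  .load("db.events") // hypothetical table name

val query = stream.writeStream
  .format("console")
  // Fire a micro-batch every minute, matching the trigger interval above.
  .trigger(Trigger.ProcessingTime("1 minute"))
  .start()
```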
Hi Nirav,
> in our case streaming job was stuck for over 3 days
3 days seems far too long. What exactly were the read limits, and how many
files were there before and after compaction? Can you describe your
ingestion rate a bit more?
> Also why not add `nextValidSnapshot(Snapshot curSnapshot)` check at the
Hi Prashant,
Thanks for responding and sharing the related issue. The issue that has been
fixed does seem very much related. However, in our case the streaming job was
stuck for over 3 days. You are saying that, because it is scanning the entire
manifest list, `latestOffset` may not return for that long!
Also why
Hi Nirav,
Thanks for reporting the issue, let me try answering your question below :)
> We are encountering the following issue where a Spark streaming read job
from an Iceberg table stays stuck after some maintenance jobs
(rewrite_data_files and rewrite_manifests) have been run in parallel on the
same table