But you can't delete them from the local store like this... you need to
process tombstones to get them deleted from there. The idea behind the
design is to compute those tombstones and inject them into the source topics.
-Matthias
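A tombstone here is just a record with an existing key and a null value. The computation Matthias suggests could be sketched roughly like this, with plain Java sets standing in for the real snapshots (the class and helper names are illustrative, not from the thread):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TombstoneSketch {
    // Given yesterday's and today's snapshot keys for one source, compute
    // the tombstones: a null value for every key that disappeared.
    static Map<String, String> tombstones(Set<String> previousKeys,
                                          Set<String> currentKeys) {
        Map<String, String> out = new HashMap<>();
        Set<String> removed = new HashSet<>(previousKeys);
        removed.removeAll(currentKeys);         // keys no longer delivered
        for (String key : removed) {
            out.put(key, null);                 // null value == tombstone
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> yesterday = Set.of("D", "E", "F");
        Set<String> today = Set.of("D", "E", "Z");
        System.out.println(tombstones(yesterday, today)); // prints {F=null}
    }
}
```

Writing these key/null pairs back into the source topic lets the downstream KTable drop the vanished items through its normal update path.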
On 2/1/17 3:34 PM, Gwen Shapira wrote:
I'm wondering why it has to be so complex... Kafka can be configured
to delete items older than 24h in a topic. So if you want to get rid
of records that did not arrive in the last 24h, just configure the
topic accordingly?
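For reference, that retention setting is per-topic; with the tooling of that Kafka generation it would look something along these lines (the topic name "inventory" is a placeholder):

```shell
# 24 hours = 86400000 ms; "inventory" is an illustrative topic name
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type topics --entity-name inventory \
  --add-config retention.ms=86400000
```

Note this only expires records from the topic itself, which is Matthias's point above: the local KTable store still needs tombstones to forget them.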
On Wed, Feb 1, 2017 at 2:37 PM, Matthias J. Sax wrote:
Understood now.
It's a tricky problem you have, and the only solution I can come up with
is quite complex -- maybe anybody else has a better idea?
Honestly, I am not sure if this will work:
For my proposal, the source ID must be part of the key of your records
to distinguish records from different sources.
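One way to encode that (a sketch only; the separator and class name are illustrative, nothing Matthias specified) is a composite key that prefixes the item key with its source ID:

```java
public class CompositeKey {
    // Build a key like "S1:A" so the same item key from different sources
    // never collides in the KTable. The separator is arbitrary; pick one
    // that cannot occur inside a source ID.
    static String key(String sourceId, String itemKey) {
        return sourceId + ":" + itemKey;
    }

    public static void main(String[] args) {
        System.out.println(key("S1", "A")); // prints S1:A
        System.out.println(key("S2", "A")); // prints S2:A
    }
}
```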
Sorry for the confusion, I stopped the example before processing the file
from S2.
So on day 2, if we get
S2=[D, E, Z], we will have to remove F and add Z; K = [A,B,D,E,Z]
To elaborate more, A, B and C belong to S1 (items have a field that states
their source). Processing files from S1 should never delete items that belong to S2.
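The per-source semantics described here can be sketched with plain collections (a deliberate simplification: the real store is a KTable, values are omitted, and the method name is made up):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerSourceUpdate {
    // table maps itemKey -> sourceId (the value payload is omitted here)
    static void applySnapshot(Map<String, String> table,
                              String sourceId, List<String> snapshot) {
        // Drop only this source's items that are missing from the new
        // snapshot; items from other sources are never touched.
        table.entrySet().removeIf(e -> e.getValue().equals(sourceId)
                && !snapshot.contains(e.getKey()));
        // Upsert everything the snapshot delivered.
        for (String item : snapshot) table.put(item, sourceId);
    }

    public static void main(String[] args) {
        Map<String, String> k = new LinkedHashMap<>();
        applySnapshot(k, "S1", List.of("A", "B", "C"));
        applySnapshot(k, "S2", List.of("D", "E", "F"));
        System.out.println(k.keySet()); // prints [A, B, C, D, E, F]
        // day 2
        applySnapshot(k, "S1", List.of("A", "B"));      // removes C
        applySnapshot(k, "S2", List.of("D", "E", "Z")); // removes F, adds Z
        System.out.println(k.keySet()); // prints [A, B, D, E, Z]
    }
}
```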
Thanks for the update.
What is not clear to me: why do you only need to remove C, but not
D,E,F, too, as source2 does not deliver any data on day 2?
Furthermore, IQ is designed to be used outside of your Streams code, and
thus, you should not use it in a SourceTask (not sure if this would even be
possible).
Sorry for not being clear. Let me explain by example. Let's say I have two
sources S1 and S2. The application that I need to write will load the files
from these sources every 24 hours. The result will be a KTable K.
For day 1:
S1=[A, B, C] => the result K = [A,B,C]
S2=[D,E,F] => K will be [A,B,C,D,E,F]
I am not sure if I understand the complete scenario yet.
> I need to delete all items from that source that
> don't exist in the latest CSV file.
Cannot follow here. I thought your CSV files provide the data you want
to process. But it seems you also have a second source?
How does your Streams
Thanks Matthias for your reply.
I'm not trying to stop the application. I'm importing inventory from CSV
files coming from 3rd-party sources. The CSVs are snapshots of each
source's inventory. I need to delete all items from that source that
don't exist in the latest CSV file.
I was thinking o
Hi,
currently, a Kafka Streams application is designed to "run forever" and
there is no notion of "End of Batch" -- we have plans to add this
though... (cf.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-95%3A+Incremental+Batch+Processing+for+Kafka+Streams)
Thus, right now you need to stop the application manually.