Hi Florian,
Just curious, what 'shared storage' do you guys use to keep the files before
they are ingested into Kafka?
In our case, we could not find a nice distributed, shared file
system that is NOT HDFS-like and runs before Kafka. So we use individual
hard disks on connector machines and keep of
Hey Florian,
It seems reasonable to me to let the connector track task progress through
offsets. I recall there have been other use cases for communication between
tasks and connectors (perhaps Ewen or someone else will jump in here and
mention them), so I'm not sure if this could fall un
Hi Jason,
Yes, this is the idea. The connector assigns a subset of files to each
task.
A task stores the size of the file, the byte offset, and the byte size of the
last sent record as source offsets.
A file is finished when recordBytesOffsets + recordBytesSize =
fileBytesSize.
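
The completion check above can be sketched in plain Java. This is only an
illustration: it assumes the source offset is kept as a simple map keyed by
the three names used in this message, and the class and method names
(FileProgress, sourceOffset, isFinished) are hypothetical, not part of the
connector or of the Kafka Connect API.

```java
import java.util.HashMap;
import java.util.Map;

public class FileProgress {

    // Key names mirror the identifiers used in the message; the map layout is an assumption.
    static final String FILE_BYTES_SIZE = "fileBytesSize";
    static final String RECORD_BYTES_OFFSETS = "recordBytesOffsets";
    static final String RECORD_BYTES_SIZE = "recordBytesSize";

    /** Builds the offset map a task would store after sending a record (hypothetical helper). */
    static Map<String, Long> sourceOffset(long fileSize, long recordOffset, long recordSize) {
        Map<String, Long> offset = new HashMap<>();
        offset.put(FILE_BYTES_SIZE, fileSize);
        offset.put(RECORD_BYTES_OFFSETS, recordOffset);
        offset.put(RECORD_BYTES_SIZE, recordSize);
        return offset;
    }

    /** A file is finished when the last sent record ends exactly at the end of the file. */
    static boolean isFinished(Map<String, Long> offset) {
        if (offset == null) {
            return false; // no offset stored yet, so nothing has been sent
        }
        long endOfLastRecord = offset.get(RECORD_BYTES_OFFSETS) + offset.get(RECORD_BYTES_SIZE);
        return endOfLastRecord == offset.get(FILE_BYTES_SIZE);
    }
}
```

For a 1000-byte file, a last record at byte offset 900 with size 100 marks the
file as finished, while a last record at offset 400 with size 100 does not.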
The connector shou
Hey Florian,
Can you explain a bit more how having access to the offset storage from the
connector helps in your use case? I guess you are planning to use offsets
to be able to tell when a task has finished a file?
Thanks,
Jason
On Fri, Feb 17, 2017 at 4:45 AM, Florian Hussonnois
wrote:
> Hi K
Hi Kafka Team,
I'm developing a connector that needs to monitor the progress of its tasks
in order to be able to request a task reconfiguration in some situations.
Our connector is pretty simple. It's used to stream thousands of files
into Kafka. The connector scans directories then schedules