[ 
https://issues.apache.org/jira/browse/IGNITE-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071613#comment-15071613
 ] 

Roman Shtykh commented on IGNITE-2016:
--------------------------------------

Denis,

Yes, it makes sense. Thank you.

The only remaining issue is autoflushing. We already invoke 
_IgniteDataStreamer.flush()_ on _SinkTask.put(...)_, whose interval can be 
configured by the user.
Therefore I think we don't need to expose _IgniteDataStreamer_'s autoflushing. 
Do you agree?

As to buffering on _SinkTask.put(...)_ and then flushing, it is a pretty 
common way to increase throughput, but it is needed only if we use 
_cache.putAll(...)_, which was my first solution.
_In fact, in many cases internal buffering will be useful so an entire batch of 
records can be sent at once, reducing the overhead of inserting events into the 
downstream data store._ http://kafka.apache.org/documentation.html#connect
As I understand, the same thing is achieved by _IgniteDataStreamer_ and, since 
we go with it, explicit buffering is not needed anymore ;)
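
To illustrate the point, here is a minimal sketch of a sink that keeps no 
buffer of its own: each put(...) hands the records to the streamer and then 
flushes once. Everything in it is an assumption for illustration only. 
_Streamer_ is a hypothetical stand-in that mirrors just the _addData(...)_ and 
_flush()_ methods of _IgniteDataStreamer_, _RecordingStreamer_ is an in-memory 
fake, and _StreamerSink.put(...)_ takes a plain _Map_ rather than the 
collection of sink records that a real _SinkTask.put(...)_ receives.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two IgniteDataStreamer methods discussed
// here (addData and flush); the real streamer has a much larger API.
interface Streamer<K, V> {
    void addData(K key, V val);
    void flush();
}

// In-memory fake (assumption): collects entries and records them as one
// batch on every flush(), so the batching behaviour can be observed.
class RecordingStreamer<K, V> implements Streamer<K, V> {
    private final Map<K, V> pending = new LinkedHashMap<>();
    final List<Map<K, V>> batches = new ArrayList<>();

    @Override
    public void addData(K key, V val) {
        pending.put(key, val);
    }

    @Override
    public void flush() {
        if (pending.isEmpty())
            return; // nothing to send
        batches.add(new LinkedHashMap<>(pending));
        pending.clear();
    }
}

// Sketch of a sink without explicit buffering: records go straight to the
// streamer, which is flushed once per put(...) call.
class StreamerSink<K, V> {
    private final Streamer<K, V> streamer;

    StreamerSink(Streamer<K, V> streamer) {
        this.streamer = streamer;
    }

    // Simplified signature (assumption): a Map instead of sink records.
    void put(Map<K, V> records) {
        for (Map.Entry<K, V> e : records.entrySet())
            streamer.addData(e.getKey(), e.getValue());

        streamer.flush();
    }
}
```

With this shape, how often data is flushed follows from how often put(...) is 
called, so there is nothing left for a separate autoflush setting to do in 
the sketch.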


> Update KafkaStreamer to fit new features introduced in Kafka 0.9
> ----------------------------------------------------------------
>
>                 Key: IGNITE-2016
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2016
>             Project: Ignite
>          Issue Type: New Feature
>          Components: streaming
>            Reporter: Roman Shtykh
>            Assignee: Roman Shtykh
>
> Particularly,
> - new consumer
> - Kafka Connect (Copycat)
> http://www.confluent.io/blog/apache-kafka-0.9-is-released
> This can be a different integration task or a complete re-write of the 
> current implementation, considering the fact that Kafka Connect is a new 
> standard way for "large-scale, real-time data import and export for Kafka."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
