Hi Yi,

It could be merged into the Samza project if there's enough interest but
may need some re-working depending on which dependencies are ok to bring
in.  I did it outside of the Samza project first because I had to get it
done quickly, so it relies on Java 8 features, Dropwizard Metrics for
histogram metrics, and Jest (https://github.com/searchbox-io/Jest), which
itself drags in more dependencies (Guava, Gson, commons http).

There are a few issues with the existing ElasticsearchSystemProducer:

   1. The plugin API (IndexRequestFactory) is tied to the Elasticsearch
   Java API (a bulky dependency)
   2. It only supports index requests.  I needed to also support updates
   and deletes.
   3. There is currently no plugin mechanism to register a flush listener.
   The reason I needed that was to be able to report end to end latency stats
   (total pipeline latency = commit time - event time).

#3 is easily solvable with an additional plugin option. #1 and #2 require
changing the system producer API.
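
To make #3 a bit more concrete, here is a rough sketch of what a flush
listener for latency reporting could look like. FlushListener and
LatencyRecorder are made-up names for illustration, not part of the existing
Samza API, and I'm assuming the producer can hand the listener the event
timestamps of everything in the batch it just flushed. A plain list stands in
for the histogram to keep the example dependency-free.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; nothing like this exists in Samza today.
interface FlushListener {
    // Called once per successful flush with the event times of the flushed
    // messages and the time the flush was committed (both in epoch millis).
    void onFlush(List<Long> eventTimesMs, long commitTimeMs);
}

// Records total pipeline latency = commit time - event time for each event.
class LatencyRecorder implements FlushListener {
    private final List<Long> latenciesMs = new ArrayList<>();

    @Override
    public void onFlush(List<Long> eventTimesMs, long commitTimeMs) {
        for (long eventTimeMs : eventTimesMs) {
            latenciesMs.add(commitTimeMs - eventTimeMs);
        }
    }

    List<Long> latenciesMs() {
        return latenciesMs;
    }
}
```

In a real job the recorded values would go into a histogram metric rather
than a list, and the listener would be wired in through a config option.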

Roger

On Tue, Feb 9, 2016 at 10:56 AM, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Roger,
>
> That's awesome! Are you planning to submit the HTTP-based system producer
> to the open-source samza-elasticsearch module? If the Elasticsearch community
> suggests that HTTP-based clients are the recommended way, we should use one in
> samza-elasticsearch as well. And what's your opinion on the existing
> ElasticsearchSystemProducer? If the SystemProducer APIs and configuration
> options do not change, I would vote to replace the implementation w/
> an HTTP-based ElasticsearchSystemProducer.
>
> Thanks for putting these new additions up!
>
> -Yi
>
> On Tue, Feb 9, 2016 at 10:39 AM, Roger Hoover <roger.hoo...@gmail.com>
> wrote:
>
> > Hi Samza folks,
> >
> > For people who want to use HTTP to integrate with Elasticsearch, I wrote an
> > HTTP-based system producer and a reusable task, including latency stats
> > from event origin time, task processing time, and time spent talking to
> > Elasticsearch API.
> >
> >
> > https://github.com/quantiply/rico/blob/master/docs/common_tasks/es-push.md
> >
> > Cheers,
> >
> > Roger
> >
>
