hi Etienne Chauchot:
you can read here https://www.jianshu.com/p/d32e17dab90c, which is in
Chinese. But you can see from it that the slice API has poor performance in
the es-hadoop project.

And I found that es-hadoop has deprecated this and disables sliced scrolls
by default. You can see below, which I found in the latest es-hadoop
release notes:
==== Configuration Changes
`es.input.use.sliced.partitions` is deprecated in 6.5.0, and will be removed
in 7.0.0. The default value for `es.input.max.docs.per.partition` (100000)
will also be removed in 7.0.0, thus disabling sliced scrolls by default, and
switching them to be an explicitly opt-in feature.
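For context, scroll slicing is the {es} feature this setting gates: one scroll is split into `max` independent slices, each consumed by a separate reader. A minimal sketch of the request bodies involved (the query is a placeholder; each body would be sent to `POST /<index>/_search?scroll=1m` by its own reader):

```python
def sliced_scroll_bodies(num_slices, query):
    """Build one search body per slice. Elasticsearch routes each
    (id, max) pair to a disjoint subset of the documents, so the
    slices can be scrolled in parallel without overlap."""
    return [
        {"slice": {"id": slice_id, "max": num_slices}, "query": query}
        for slice_id in range(num_slices)
    ]

bodies = sliced_scroll_bodies(2, {"match_all": {}})
```

The poor performance complaint above stems from how {es} computes these slices server-side, which is costly on indices with many documents per shard.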

added[5.0.0]
`es.input.max.docs.per.partition` ::
When reading from an {es} cluster that supports scroll slicing ({es} v5.0.0
and above), this parameter advises the
connector on what the maximum number of documents per input partition should
be. The connector will sample and estimate
the number of documents on each shard to be read and divides each shard into
input slices using the value supplied by
this property. This property is a suggestion, not a guarantee. The final
number of documents per partition is not
guaranteed to be below this number, but rather, they will be close to this
number. This property is ignored if you are
reading from an {es} cluster that does not support scroll slicing ({es} any
version below v5.0.0). By default, this
value is unset, and the input partitions are calculated based on the number
of shards in the indices being read.
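The per-shard split described above can be sketched as follows. This is a plain reading of the documentation, not es-hadoop's actual code; `estimated_docs` stands in for the connector's sampled per-shard document count:

```python
import math

def slices_for_shard(estimated_docs, max_docs_per_partition):
    """Divide one shard into input slices so each slice holds roughly
    max_docs_per_partition documents (a suggestion, not a guarantee)."""
    if max_docs_per_partition is None:
        # Property unset: one input partition per shard, no slicing.
        return 1
    return max(1, math.ceil(estimated_docs / max_docs_per_partition))

# A shard with ~250,000 docs and the old default of 100,000 docs per
# partition yields 3 slices; with the property unset, 1 partition.
print(slices_for_shard(250_000, 100_000))  # 3
print(slices_for_shard(250_000, None))     # 1
```

This also shows why dropping the 100,000 default in 7.0.0 disables slicing: with the property unset, every shard maps to exactly one partition.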



Jacky Lau wrote
> hi Etienne Chauchot:
> thanks for your discussion.
> for 1) we do not support an es unbounded source currently
> 
> for 2) RichParallelSourceFunction is used for streaming; InputFormat is
> for batch
> 
> for 3) I downloaded beam just now, and the beam es connector is also
> using es-hadoop. I have read the code of es-hadoop (an InputSplit
> contains a shard and a slice, which I think is better when different
> shards have different numbers of docs); you can see it here:
> https://github.com/elastic/elasticsearch-hadoop. But the code is not
> good, so we do not want to reference it. And you can see that presto
> also just uses an InputSplit with a shard, not containing a slice
> 
> for 4) because the flink es connector already uses different clients
> (es 5 uses the transport client; es 6 and 7 use the high-level REST
> client), we just reuse them, which will not change too much code
> 
> 
> 
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/




