Hi Georg,

Purely from the perspective of a Flink source that listens to a REST endpoint
for input data, there are quite a few options. The Akka streaming source from
Bahir should serve this purpose well, and it would also be quite straightforward
to implement one yourself.
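For example, a minimal polling source could look roughly like this (just an
untested sketch; the class name, the use of scala.io.Source for the HTTP GET,
and the poll interval are placeholders you would swap for your own choices):

import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext

// Polls a REST endpoint at a fixed interval and emits the raw response body.
class RestPollingSource(url: String, pollIntervalMs: Long) extends SourceFunction[String] {

  @volatile private var running = true

  override def run(ctx: SourceContext[String]): Unit = {
    while (running) {
      // Simple blocking HTTP GET; swap in whatever HTTP client you prefer.
      val src = scala.io.Source.fromURL(url)
      try ctx.collect(src.mkString) finally src.close()
      Thread.sleep(pollIntervalMs)
    }
  }

  override def cancel(): Unit = {
    running = false
  }
}

You would then use it like any other source, e.g.
env.addSource(new RestPollingSource("https://some/endpoint", 1000)).print().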

On the other hand, what Jörn was suggesting is that you first persist the
incoming data from the REST endpoint to a replayable storage / queue, and have
your Flink job read from that replayable storage / queue.
The reason for this is that Flink's checkpointing mechanism for exactly-once
guarantees relies on a replayable source (see [1]); since a REST endpoint is not
replayable, you won't be able to benefit from the fault-tolerance guarantees
provided by Flink. The most popular source currently used with Flink for
exactly-once is Kafka [2]. The only extra latency in this setup, compared to
fetching from the REST endpoint directly, is the write to the intermediate
Kafka topic.
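
To make that concrete, the Flink side of such a setup is essentially just the
Kafka consumer as the job's source. A rough sketch, assuming Kafka 0.9+, the
flink-connector-kafka-0.9 dependency, and an example topic name like "tickers"
that your REST-fetching process writes to:

import java.util.Properties

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

val env = StreamExecutionEnvironment.getExecutionEnvironment
// Checkpointing plus a replayable source (Kafka) is what gives you exactly-once.
env.enableCheckpointing(5000)

val props = new Properties()
props.setProperty("bootstrap.servers", "localhost:9092")
props.setProperty("group.id", "ticker-job")

// Read whatever the REST-fetching process wrote into the "tickers" topic.
val tickers: DataStream[String] =
  env.addSource(new FlinkKafkaConsumer09[String]("tickers", new SimpleStringSchema(), props))

tickers.print()
env.execute("kafka-backed ticker job")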

Of course, if you're just experimenting and getting to know Flink, this setup
isn't necessary. You can start off with a source such as the Flink Akka
connector in Bahir and write your first Flink job right away :)
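
If it helps, here is a very rough sketch of how the XChange snippet you posted
could be wrapped in a SourceFunction (untested; the class name and poll interval
are made up for illustration, the imports depend on your XChange version, and
whether the Ticker type serializes cleanly through Flink's type system is
something you'd want to verify):

import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext
import org.knowm.xchange.ExchangeFactory
import org.knowm.xchange.currency.{Currency, CurrencyPair}
import org.knowm.xchange.dto.marketdata.Ticker
import org.knowm.xchange.poloniex.PoloniexExchange

// Periodically asks Poloniex for the XMR/BTC ticker and emits it downstream.
class PoloniexTickerSource(pollIntervalMs: Long) extends SourceFunction[Ticker] {

  @volatile private var running = true

  override def run(ctx: SourceContext[Ticker]): Unit = {
    // Create the exchange client inside run() so the source instance itself
    // stays serializable when the job is shipped to the cluster.
    val poloniex = ExchangeFactory.INSTANCE.createExchange(classOf[PoloniexExchange].getName)
    val dataService = poloniex.getMarketDataService
    val pair = new CurrencyPair(Currency.XMR, Currency.BTC)

    while (running) {
      ctx.collect(dataService.getTicker(pair))
      Thread.sleep(pollIntervalMs)
    }
  }

  override def cancel(): Unit = {
    running = false
  }
}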

Cheers,
Gordon

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html
[2] 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html

On 24 April 2017 at 4:02:14 PM, Georg Heiler (georg.kf.hei...@gmail.com) wrote:

Wouldn't adding Flume -> Kafka -> Flink also introduce additional latency?

Georg Heiler <georg.kf.hei...@gmail.com> wrote on Sun., 23 Apr 2017 at 20:23:
So you would suggest Flume over a custom Akka source from Bahir?

Jörn Franke <jornfra...@gmail.com> wrote on Sun., 23 Apr 2017 at 18:59:
I would use Flume to import these sources into HDFS and then use Flink or Hadoop
or whatever to process them. While it is possible to do this in Flink, you do
not want your processing to fail because the web service is unavailable, etc.
Going via Flume, which is well suited to this kind of task, is more controlled
and reliable.

On 23. Apr 2017, at 18:02, Georg Heiler <georg.kf.hei...@gmail.com> wrote:

Being new to Flink, I would like to do a small project to get a better feel for
it. I am thinking of fetching some stats from several REST APIs (e.g. Bitcoin
prices from different exchanges) and comparing the prices across exchanges in
real time.

Are there already some REST API sources for Flink that could serve as a sample
for implementing a custom REST source?

I was thinking about using https://github.com/timmolter/XChange to connect to
several exchanges. E.g. a single API call made by hand would look similar to this:

// These imports assume the XChange core and xchange-poloniex modules; exact
// package names may differ between XChange versions.
import org.knowm.xchange.ExchangeFactory
import org.knowm.xchange.currency.{Currency, CurrencyPair}
import org.knowm.xchange.poloniex.PoloniexExchange

val currencyPair = new CurrencyPair(Currency.XMR, Currency.BTC)
CertHelper.trustAllCerts() // helper from the XChange examples that trusts all SSL certs
val poloniex = ExchangeFactory.INSTANCE.createExchange(classOf[PoloniexExchange].getName)
val dataService = poloniex.getMarketDataService

// generic(...) and raw(...) are helper methods from the XChange Poloniex demo code
generic(dataService)
raw(dataService.asInstanceOf[PoloniexMarketDataServiceRaw])
println(dataService.getTicker(currencyPair))

What would be a proper way to make this available as a Flink source? I have seen
https://github.com/apache/flink/blob/master/flink-contrib/flink-connector-wikiedits/src/main/java/org/apache/flink/streaming/connectors/wikiedits/WikipediaEditsSource.java
but, being new to Flink, I am a bit unsure how to proceed.

Regards,
Georg
