Flink application with HBase

2015-12-22 Thread Thomas Lamirault
Hello everybody,

I am using Flink (0.10.1) with a streaming source (Kafka), and I write the results 
of flatMap/keyBy/timeWindow/reduce to an HBase table.
I have tried with a class (Sinkclass) that implements SinkFunction, and 
a class (HBaseOutputFormat) that implements OutputFormat. In your opinion, 
is it better to use the Sinkclass or the HBaseOutputFormat, for better performance 
and cleaner code? (Or are they equivalent?)

Thanks,

Best regards,

Thomas Lamirault


Re: Flink application with HBase

2015-12-22 Thread Márton Balassi
Hi Thomas,

You can use both of the suggested solutions.

The benefit you might get from HBaseOutputFormat is that it is already
tested and integrated with Flink, as opposed to you having to connect to
HBase yourself in a general SinkFunction.

Best,

Marton

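For reference, a minimal sketch of the SinkFunction approach Thomas mentions, written against Flink 0.10's RichSinkFunction and the HBase 1.x client API; the table name, column family, and record type are hypothetical:

```scala
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Put, Table}
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical sink for (key, count) pairs coming out of the reduce step.
class HBaseSink extends RichSinkFunction[(String, Long)] {
  @transient private var connection: Connection = _
  @transient private var table: Table = _

  override def open(parameters: Configuration): Unit = {
    // One HBase connection per parallel sink instance, reused across records.
    connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    table = connection.getTable(TableName.valueOf("results"))
  }

  override def invoke(value: (String, Long)): Unit = {
    val put = new Put(Bytes.toBytes(value._1))
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("count"), Bytes.toBytes(value._2))
    table.put(put)
  }

  override def close(): Unit = {
    if (table != null) table.close()
    if (connection != null) connection.close()
  }
}
```

Whichever variant you pick, opening the connection in open() rather than per record is the part that matters for performance.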

Sink - Cassandra

2015-12-22 Thread syepes
Hello,

I have just started testing out Flink and would like to migrate one of my
Spark Streaming jobs (Kafka->Spark->C*) to Flink.

- Is there anyone using Flink with Cassandra?
- Does a working Cassandra sink exist? The only thing I have found is:
https://github.com/rzvoncek/flink/tree/2b281120f4206c4fd66bec22090e0b6d62ebb8ad/flink-staging/flink-cassandra

Regards,
Sebastian
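In the meantime, a plain SinkFunction is enough to write to Cassandra. A minimal sketch using Flink 0.10's RichSinkFunction and the DataStax Java driver; the contact point, keyspace, and table schema are assumptions for illustration:

```scala
import com.datastax.driver.core.{Cluster, Session}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction

// Hypothetical sink writing (sensor, value) pairs to a Cassandra table.
class CassandraSink extends RichSinkFunction[(String, Double)] {
  @transient private var cluster: Cluster = _
  @transient private var session: Session = _

  override def open(parameters: Configuration): Unit = {
    // One session per parallel sink instance, reused across records.
    cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
    session = cluster.connect("demo")
  }

  override def invoke(value: (String, Double)): Unit = {
    session.execute(
      "INSERT INTO readings (sensor, value) VALUES (?, ?)",
      value._1, Double.box(value._2))
  }

  override def close(): Unit = {
    if (session != null) session.close()
    if (cluster != null) cluster.close()
  }
}
```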






Scala API and sources with timestamp

2015-12-22 Thread Don Frascuchon
Hello,

Is there a way to define an EventTimeSourceFunction with an anonymous
function from the Scala API? Like this:

 env.addSource[Int] { ctx =>

   ...

   ctx.collectWithTimestamp(i, System.currentTimeMillis())
   ...

 }

Thanks in advance!
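If I remember the 0.10 API correctly, the anonymous-function overload of addSource builds a plain SourceFunction, so a close alternative is an anonymous class implementing the EventTimeSourceFunction interface (a sketch under that assumption; the element range is arbitrary):

```scala
import org.apache.flink.streaming.api.functions.source.EventTimeSourceFunction
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Anonymous event-time source: attaches timestamps explicitly
// instead of relying on ingestion time.
val stream = env.addSource(new EventTimeSourceFunction[Int] {
  @volatile private var running = true

  override def run(ctx: SourceContext[Int]): Unit = {
    var i = 0
    while (running && i < 100) {
      ctx.collectWithTimestamp(i, System.currentTimeMillis())
      i += 1
    }
  }

  override def cancel(): Unit = { running = false }
})
```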


Re: How do failovers work on yarn?

2015-12-22 Thread Maximilian Michels
Hi Niels,

Very good question! The config file which is written serves as a hint
for the client. When the YARN session is started without high
availability mode, i.e. no high availability settings have been found
in the client's config, the client will try to look up the job manager
using the hostname and port found in the file. In the high
availability case, the client will perform a lookup with Zookeeper to
find the current active job manager.

Thus, submitting a new job after a job manager failover should work fine.

Cheers,
Max
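For context, the high availability settings the client looks for resembled the following in a 0.10-era flink-conf.yaml (key names and values are assumptions from that release line; check your version's documentation):

```yaml
# Enable job manager high availability via ZooKeeper
recovery.mode: zookeeper
recovery.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
recovery.zookeeper.path.root: /flink
```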

On Mon, Dec 21, 2015 at 5:01 PM, Niels Basjes  wrote:
> Hi,
>
> When I start a yarn-session, a config file containing a hostname and
> port number is written on my machine.
> Apparently this is where the job manager can be found.
>
> Question: What happens if that node on the cluster goes down?
> I expect that Yarn will reallocate the job manager to a different node.
> How will I be able to submit a job after that happened?
>
> --
> Best regards,
>
> Niels Basjes