If your data comes from HBase maybe it would also good to implement a HBase source. A current HBase sink is in the making: https://github.com/apache/flink/pull/2332

Maybe it would be better to save your data in an HDFS (e.g. CSV file) and use the built-in "readFile()". This does the parallelism automatically.



Am 06/09/16 um 14:56 schrieb rimin...@sina.cn:
my data from a Hbase table ,it is like a List[rowkey,Map[String,String]],
class MySplittableIterator extends SplittableIterator[String]{

// Members declared in java.util.Iterator
def hasNext(): Boolean = {

}
def next(): Nothing = {

}

// Members declared in org.apache.flink.util.SplittableIterator
def getMaximumNumberOfSplits(): Int = {

}
def split(num: Int): Array[Iterator[String]] = {

}
}
i do not know the methods to write,can you give me a example.
----- 原始邮件 -----
发件人:Timo Walther <twal...@apache.org>
收件人:user@flink.apache.org
主题:Re: 回复:Re: fromParallelCollection
日期:2016年09月06日 17点03分

Hi,

you have to implement a class that extends "org.apache.flink.util.SplittableIterator". The runtime will ask this class for multiple "java.util.Iterator"s over your split data. How you split your data and how an iterator looks like depends on your data and implementation.

If you need more help, you should show us some examples of your data.

Timo

Am 06/09/16 um 09:46 schrieb rimin...@sina.cn <mailto:rimin...@sina.cn>:
fromCollection is not parallelization,the data is huge,so i want to use env.fromParallelCollection(data),but the data i do not know how to initialize,
----- 原始邮件 -----
发件人:Maximilian Michels <m...@apache.org> <mailto:m...@apache.org>
收件人:"user@flink.apache.org" <mailto:user@flink.apache.org> <user@flink.apache.org> <mailto:user@flink.apache.org>, rimin...@sina.cn <mailto:rimin...@sina.cn>
主题:Re: fromParallelCollection
日期:2016年09月05日 16点58分


Please give us a bit more insight on what you're trying to do.
On Sat, Sep 3, 2016 at 5:01 AM, <rimin...@sina.cn> <mailto:rimin...@sina.cn> wrote:
> Hi,
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tr = env.fromParallelCollection(data)
>
> the data i do not know initialize,some one can tell me..
> --------------------------------
>
>
>


--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr


--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr

Reply via email to