when I was write the PR.
-Gna
On Thu, Apr 7, 2016 at 11:08 AM, Sourigna Phetsarath <
gna.phetsar...@teamaol.com> wrote:
> Tranadeep,
>
> Thanks for pasting your code!
>
> I have a PR ready that extends AvroInputFormat and will submit it soon.
>
> Still waiting fo
arser().parse(new
> File("/path/to/schemafile.avsc"));
> DataSet dataSet = env.createInput(new
> GenericAvroInputFormat(inPath, schema));
> dataSet.map(new MapFunction>() {
> @Override
> public Tuple2 map(GenericRecord record) {
> Long id = (Long
There is a way yet, but I am proposing to do one:
https://issues.apache.org/jira/browse/FLINK-3691
On Fri, Apr 1, 2016 at 4:04 AM, Tarandeep Singh wrote:
> Hi,
>
> Can someone please point me to an example of creating DataSet using Avro
> Generic Records?
>
> I tried this code -
>
> final Ex
Tarandeep,
There isn't a way yet, but I am proposing to do one:
https://issues.apache.org/jira/browse/FLINK-3691
-Gna
On Fri, Apr 1, 2016 at 4:04 AM, Tarandeep Singh wrote:
> Hi,
>
> Can someone please point me to an example of creating DataSet using Avro
> Generic Records?
>
> I tried this co
>
>
> On Tue, Mar 29, 2016 at 10:46 AM, Simone Robutti <
> simone.robu...@radicalbit.io> wrote:
>
>> To my knowledge there is nothing like that. PMML is not supported in any
>> form and there's no custom saving format yet. If you really need a quick
>
Flinksters,
Is there an example of saving a Trained Model, loading a Trained Model and
then scoring one or more feature vectors using Flink ML?
All of the examples I've seen have shown only sequential fit and predict.
Thank you.
-Gna
--
*Gna Phetsarath*System Architect // AOL Platforms // Da
t part of Flink yet.
>
> Did you have a look at the sampling methods in DataSetUtils? Maybe they
> can be helpful for what you are trying to achieve.
>
> – Ufuk
>
> On Wed, Mar 23, 2016 at 5:19 PM, Sourigna Phetsarath <
> gna.phetsar...@teamaol.com> wrote:
>
>>
All:
Does Flink DataSet have a randomSplit(weights:Array[Double], seed: Long):
Array[DataSet[T]] function?
There is this pull request: https://github.com/apache/flink/pull/921
Does anyone have an update of the progress of this?
Thank you.
--
*Gna Phetsarath*System Architect // AOL Platforms
g/contribute-code.html
> http://flink.apache.org/how-to-contribute.html
>
> On Wed, Mar 23, 2016 at 12:00 AM, Fabian Hueske wrote:
>
>> Hi Gna,
>>
>> thanks for sharing the good news and opening the JIRA!
>>
>> Cheers, Fabian
>>
>> 2016-03-22 2
On Mon, Mar 21, 2016 at 10:04 AM, Sourigna Phetsarath <
gna.phetsar...@teamaol.com> wrote:
> Fabian,
>
> I'll try extending InputFormat as you suggested and will create a JIRA
> issue as well.
>
> I also have an AvroGenericRecordInput format class that I would like to
tom
> InputFormat based on FileInputFormat by overriding the createInputSplits()
> method.
>
> Best, Fabian
>
> 2016-03-21 0:11 GMT+01:00 Sourigna Phetsarath
> :
>
>> All,
>>
>> Do any of the Flink Data Sources support comma separated directories with
>> wildcar
A issue with your suggestions?
>>
>> Until this is added to Flink, it can be implemented as a custom
>> InputFormat based on FileInputFormat by overriding the createInputSplits()
>> method.
>>
>> Best, Fabian
>>
>> 2016-03-21 0:11 GMT+01:00 Sou
All,
Do any of the Flink Data Sources support comma separated directories with
wildcards?
For example:
env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*
")
Thanks in advance for any help that you can provide.
--
*Gna Phetsarath*System Architect // AOL Platforms //
All:
Flink Version 0.10.2
The number that I set for *taskmanager.network.numberOfBuffers* doesn't
seem to have any affect, even if I set it to a very high number. There
might be a race condition here where the upper bound is not enforced or
computer correctly.
java.io.IOException: Insufficient n
gt; fs.s3a.connection.timeout
> 5
> Socket connection timeout in milliseconds.
>
>
>
> -- Ken
>
>
> --
>
> *From:* Sourigna Phetsarath
>
> *Sent:* March 17, 2016 2:05:39pm PDT
>
> *To:* user@flink.apache.org
>
> *Su
All:
I'm trying to read lots of files from S3 and I am getting timeouts from S3:
java.io.IOException: Error opening the Input Split [0,558574890]:
Input opening request timed out. Opener was alive. Stack of split open
thread:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.
//ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers
>
> On Fri, Mar 18, 2016 at 9:38 PM, Sourigna Phetsarath <
> gna.phetsar...@teamaol.com> wrote:
>
>> It's batch.
>>
>> Reading a few thousand Avro files from
All:
I'm running a Flink 0.10.2 App by submitting to YARN as an application.
I'm using an AWS EMR cluster of 1 Master and 10 d2.8xlarge. When I submit
the job using:
bin/flink run \
-m yarn-cluster \
-yjm 20480 \
-yn 10 \
-ytm 80960 \
-ys 36 \
-yD taskmanager.network.num
All:
I'm trying to use SparseVectors with FlinkML 0.10.1. It does not seem to
be working. Here is a UnitTest that I created to recreate the problem:
*package* com.aol.ds.arc.ml.poc.flink
> *import* org.junit.After
> *import* org.junit.Before
> *import* org.slf4j.LoggerFactory
> *import* org.
19 matches
Mail list logo