> …that in.
>
> On 10.07.2018 17:02, Garrett Barton wrote:
Greetings all,
The docs say that I can skip creating a cluster and let the jobs create
their own clusters on yarn. The example given is:
./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
What I cannot figure out is what the -m option is meant for. In my opinion
there is no…
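(A hedged reading of the -m flag from the 1.5-era CLI docs: -m normally
points the client at an existing JobManager address, while the special value
"yarn-cluster" tells it to spin up a per-job YARN cluster instead. Host/port
and paths below are placeholders:

  # submit to an already-running cluster's JobManager
  ./bin/flink run -m jobmanager-host:6123 ./examples/batch/WordCount.jar

  # special value: start a dedicated per-job cluster on YARN
  ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
)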
> …s,
> Till
>
> On Thu, Jun 21, 2018 at 7:43 PM Garrett Barton wrote:
>
>> Actually, random thought: could YARN preemption be causing this? What is
>> the failure scenario if a TaskManager that is doing real work goes down in
>> YARN? The docs make it sound…
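On the failure-scenario question, a hedged sketch: when a TaskManager
container is lost (preemption included), the job fails and is restarted
according to the configured restart strategy. A minimal example against the
1.5-era DataSet API; the attempt count and delay are illustrative:

  import java.util.concurrent.TimeUnit;

  import org.apache.flink.api.common.restartstrategy.RestartStrategies;
  import org.apache.flink.api.common.time.Time;
  import org.apache.flink.api.java.ExecutionEnvironment;

  public class RestartStrategyExample {
      public static void main(String[] args) throws Exception {
          ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
          // Retry the whole job up to 3 times, waiting 10 seconds between
          // attempts, e.g. to ride out a preempted TaskManager container.
          env.setRestartStrategy(
                  RestartStrategies.fixedDelayRestart(3, Time.of(10, TimeUnit.SECONDS)));
          // ... job definition and env.execute() go here ...
      }
  }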
On Thu, Jun 21, 2018 at 1:20 PM Garrett Barton wrote:
> Thank you all for the reply!
>
> I am running batch jobs: I read in a handful of files from HDFS and output
> to HBase, HDFS, and Kafka. I run into this when I have partial usage of
> the cluster as the job runs. So right now I…
>> …hers?
>>
>> I'm adding Till to this thread who's very familiar with scheduling and
>> process communication.
>>
>> Best, Fabian
>>
>> 2018-06-19 0:03 GMT+02:00 Garrett Barton:
>>
>>> Hey all,
>>>
>>> My jobs that I am…
Hey all,
My jobs that I am trying to write in Flink 1.5 are failing after a few
minutes. I think it's because the idle task managers are shutting down,
which seems to kill the client and the running job. The running job itself
was still going on one of the other task managers. I get:
org.apache…
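If the idle-TaskManager guess is right, the knobs to look at in 1.5 are the
idle-resource timeouts in flink-conf.yaml. The key names and defaults below
are from memory and should be verified against the 1.5 docs:

  # ms an idle TaskManager is kept before being released back to YARN
  resourcemanager.taskmanager-timeout: 30000
  # ms an idle slot is held in the job's slot pool
  slot.idle.timeout: 50000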
…the data is shipped to the receiver tasks
>>> (the sorter). This mode decouples tasks and also reduces the number of
>>> network buffers because fewer connections must be active at the same time.
>>> Here's a link to an internal design document (not sure how up to date it is):
>
> Best, Fabian
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
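A minimal, untested sketch of asking the DataSet runtime for this decoupled,
batched exchange mode via the execution config:

  import org.apache.flink.api.common.ExecutionMode;
  import org.apache.flink.api.java.ExecutionEnvironment;

  ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
  // BATCH materializes intermediate results instead of pipelining them,
  // so sender and receiver tasks need not run at the same time and fewer
  // network connections are open concurrently.
  env.getConfig().setExecutionMode(ExecutionMode.BATCH);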
>
> 2017-12-07 16:30 GMT+01:00 Garrett Barton :
>
>> Thanks for the reply again,
>>
>> I'm currently doing runs with:
>> yarn-session.sh …
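The actual flags above are cut off; purely as an illustration (these are not
the poster's real settings), a typical session invocation from the 1.3 docs
looks like this: 4 TaskManagers, 8 slots each, 1 GB JobManager and 4 GB
TaskManager memory:

  ./bin/yarn-session.sh -n 4 -s 8 -jm 1024 -tm 4096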
…GroupReduce.
>
> Best, Fabian
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#yarn
>
> 2017-12-06 23:32 GMT+01:00 Garrett Barton :
>
>> Wow, thank you for the reply; you gave me a lot to look into and mess
>> with. I…
…the current state.
>
> Best, Fabian
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#managed-memory
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/execution_configuration.html
> [3] http://flink.apache.org/visualiz…
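For reference on [1], the managed-memory keys in a 1.3-era flink-conf.yaml
look roughly like this; values are illustrative, so check the linked page:

  # fraction of free heap handed to Flink's managed memory (sorting, hashing)
  taskmanager.memory.fraction: 0.7
  # absolute managed-memory size in MB; if set, it overrides the fraction
  # taskmanager.memory.size: 8192
  # whether to allocate managed memory eagerly at TaskManager start
  taskmanager.memory.preallocate: false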
…ng.
> The plan can be obtained from the ExecutionEnvironment by calling
> env.getExecutionPlan() instead of env.execute().
>
> I would also like to know how you track the progress of the program.
> Are you looking at the record counts displayed in the WebUI?
>
> Best,
> Fabian
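A minimal, untested sketch of dumping the plan this way; the sample data and
sink path are placeholders, and the printed JSON can then be inspected or
pasted into the plan visualizer:

  import org.apache.flink.api.java.DataSet;
  import org.apache.flink.api.java.ExecutionEnvironment;

  public class PlanDump {
      public static void main(String[] args) throws Exception {
          ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
          DataSet<String> data = env.fromElements("a", "b", "c");
          data.writeAsText("/tmp/plan-dump-out");  // placeholder sink so the plan is non-trivial
          // Prints the JSON execution plan instead of running the job.
          System.out.println(env.getExecutionPlan());
      }
  }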
I have been moving some old MR and Hive workflows into Flink because I'm
enjoying the APIs, and the ease of development is wonderful. Things have
largely worked great until I tried to really scale some of the jobs
recently.
I have, for example, one ETL job that reads in about 12B records at a time…
Fabian,
Just to follow up on this: I took the patch, compiled that class, and stuck
it into the existing 1.3.2 jar, and all is well. (I couldn't get all of
Flink to build correctly.)
Thank you!
On Wed, Sep 20, 2017 at 3:53 PM, Garrett Barton wrote:
> Fabian,
> Awesome! After y…
> Thanks, Fabian
>
> 2017-10-03 21:57 GMT+02:00 Garrett Barton:
>
>> Gábor,
>> Thank you for the reply. I gave that a go and the flow still showed
>> parallelism 90 for each step. Is the UI not 100% accurate, perhaps?
>>
>> To get around it for…
…Gábor Gévay wrote:
> Hi Garrett,
>
> You can call .setParallelism(1) on just this operator:
>
> ds.reduceGroup(new GroupReduceFunction...).setParallelism(1)
>
> Best,
> Gabor
>
>
>
> On Mon, Oct 2, 2017 at 3:46 PM, Garrett Barton wrote:
> > I have a comp…
I have a complex algorithm implemented using the DataSet API, and by default
it runs with parallelism 90 for good performance. At the end I want to
perform a clustering of the resulting data, and to do that correctly I need
to pass all the data through a single thread/process.
I read in the docs that as long…
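Filling in Gábor's one-liner as a self-contained sketch; the sequence source
and the summing reducer are placeholders for the real data and clustering
logic:

  import org.apache.flink.api.common.functions.GroupReduceFunction;
  import org.apache.flink.api.java.DataSet;
  import org.apache.flink.api.java.ExecutionEnvironment;
  import org.apache.flink.util.Collector;

  public class SingleThreadedReduce {
      public static void main(String[] args) throws Exception {
          ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
          env.setParallelism(90);  // default for every other operator

          DataSet<Long> ds = env.generateSequence(1, 1_000_000);

          // Only this operator is forced to parallelism 1, so all records
          // funnel through a single thread; upstream stays at 90.
          ds.reduceGroup(new GroupReduceFunction<Long, Long>() {
                  @Override
                  public void reduce(Iterable<Long> values, Collector<Long> out) {
                      long sum = 0;  // stand-in for the clustering step
                      for (Long v : values) {
                          sum += v;
                      }
                      out.collect(sum);
                  }
              })
              .setParallelism(1)
              .print();
      }
  }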
…context
>> classloader to be the user classloader before calling user code. However,
>> this has not been done here.
>> So, this is in fact a bug.
>>
>> I created this JIRA issue:
>> https://issues.apache.org/jira/browse/FLINK-7656 and will open a PR for that.
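For readers following along, the invariant being described has this generic
shape (a plain-Java sketch, not the actual code from the Flink PR;
userCodeClassLoader and runUserCode() are hypothetical stand-ins):

  ClassLoader previous = Thread.currentThread().getContextClassLoader();
  try {
      // Make user classes resolvable while user code runs.
      Thread.currentThread().setContextClassLoader(userCodeClassLoader);
      runUserCode();
  } finally {
      // Always restore the previous context classloader.
      Thread.currentThread().setContextClassLoader(previous);
  }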
> Can you post more of the stacktrace? This would help to identify the spot
> in the Flink code where the exception is thrown.
>
> Thanks, Fabian
>
> 2017-09-18 18:42 GMT+02:00 Garrett Barton:
>
>> Hey all,
>>
>> I am trying out a POC with Flink on YARN. My s…
Hey all,
I am trying out a POC with Flink on YARN. My simple goal is to read from
a Hive ORC table, process some data, and write to a new Hive ORC table.
Currently I can get Flink to read the source table fine, both by using
the HCatalog input format directly and by using the flink-hcatalog w…
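A sketch of the direct HCatalog route (untested; the database/table names are
placeholders, and the HCatInputFormat constructor signature should be checked
against the flink-hcatalog version in use):

  import org.apache.flink.api.java.DataSet;
  import org.apache.flink.api.java.ExecutionEnvironment;
  import org.apache.flink.hcatalog.java.HCatInputFormat;
  import org.apache.hive.hcatalog.data.HCatRecord;

  public class HiveOrcRead {
      public static void main(String[] args) throws Exception {
          ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
          // Read the source Hive table through HCatalog as generic HCatRecords.
          DataSet<HCatRecord> source = env.createInput(
                  new HCatInputFormat<HCatRecord>("mydb", "source_orc_table"));
          source.first(10).print();
      }
  }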