Hi Fabian,
Thank you for your explanation. Also, could you give an example of how the
optimizer behaves under the assumption that the outputs of a function are
replicated?
Thank you...
On 3 March 2017 at 13:52, Fabian Hueske wrote:
> Hi CPC,
>
> we had several requests in the past to
Hi all,
We will try to implement select/split functionality for the batch API. We
looked at the streaming side and understand how it works, but it was easier
there since the streaming side does not include an optimizer. Since adding such
a runtime operator will affect the optimizer layer as well, is there a part that
yo
Hi Till,
Thank you.
On 2 March 2017 at 18:13, Till Rohrmann wrote:
> Hi CPC,
>
> I think that the optimizer does not take the scheduling mode into account
> when optimizing a Flink job.
>
> Cheers,
> Till
>
> On Thu, Mar 2, 2017 at 11:52 AM, CPC wrote:
>
>
Hi all,
Currently our team is trying to implement a runtime operator and also playing
with the scheduler. We are trying to understand the batch optimizer, but it
will take some time. What we want to know is whether changing the batch
scheduling mode from LAZY_FROM_SOURCES to EAGER could affect the optimizer? I
mean whether
Is it just related to the stream API? This feature could be really useful for
ETL scenarios with the DataSet API as well.
On Oct 26, 2016 22:29, "Fabian Hueske" wrote:
> Hi Chen,
>
> thanks for this interesting proposal. I think side output would be a very
> valuable feature to have!
>
> I went over the F
Hi,
On the Flink downloads page it says that the latest version is 1.1 and there
are links for 1.1 binaries, but wasn't RC2 voted on recently? Are those
binaries 1.1 or RC2?
izer, but I would expect everything that goes back to
> the DataExchangeMode to be correct. The rest should be an artifact of
> the old pipeline breaker logic and not be used any more. Does this
> help in any way?
>
> On Thu, Jun 16, 2016 at 4:46 PM, CPC wrote:
> > Hi everybody,
>
e runtime performance at the
> cost of a slightly longer start-up time for your TaskManagers.
>
> Cheers,
> Till
>
>
> On Sun, Jun 19, 2016 at 6:16 PM, CPC wrote:
>
> > Hi,
> >
> > I think I found some information regarding this behavior. In the JVM it is
&
ct-when-JNA-not-available-td6977711.html
).
I think currently the only way to limit memory usage in Flink, if you want to
use the same TaskManager across jobs, is via "taskmanager.memory.preallocate:
true". Since it allocates the memory at the beginning and never frees it, the
memory usage stays constant.
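A hedged sketch of the flink-conf.yaml fragment this refers to (the key name is
the one quoted above, as used in the Flink 1.x versions discussed in this
thread; check your version's configuration docs before relying on it):

```yaml
# Allocate managed memory once at TaskManager start-up instead of on demand,
# so the memory footprint stays constant across jobs.
taskmanager.memory.preallocate: true
```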
PS:
skManager --configDir
> /home/capacman/Data/programlama/flink-1.0.3/conf
>
but memory usage reaches up to 13 GB. Could somebody explain to me why memory
usage is so high? I expected it to be at most 8 GB with some JVM internal
overhead.
On 17 Jun
Hi,
I am making some tests about off-heap memory usage and encountered an odd
behavior. My TaskManager heap limit is 12288 MB, and when I set
"taskmanager.memory.off-heap: true", for every job it allocates at most 11673
MB of off-heap area, which is heapsize * 0.95 (the value of
taskmanager.memory.fraction). But whe
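The 11673 MB figure is consistent with the heap limit times the configured
memory fraction. A minimal sketch of that arithmetic in plain Java (the method
name is made up for illustration; this is not Flink code):

```java
public class OffHeapEstimate {
    // Off-heap allocation observed in the mail is roughly
    // heap limit (MB) * taskmanager.memory.fraction, truncated to whole MB.
    static int offHeapMb(int heapLimitMb, double memoryFraction) {
        return (int) (heapLimitMb * memoryFraction);
    }

    public static void main(String[] args) {
        // 12288 MB heap * 0.95 fraction -> 11673 MB, matching the reported value
        System.out.println(offHeapMb(12288, 0.95));
    }
}
```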
Hi everybody,
I am trying to understand the Flink optimizer in the DataSet API, but I don't
understand the tempMode and breakPipeline properties. The TempMode enum also
has a breakPipeline property, but they are both accessed and set in different
places. DagConnection.breakPipeline is used to select the network I/O DataExch
A Cassandra backend would be interesting, especially if Flink could benefit
from Cassandra data locality. The Cassandra/Spark integration uses this
locality information to schedule Spark tasks.
On 9 June 2016 at 19:55, Nick Dimiduk wrote:
> You might also consider support for a Bigtable
> backend: HBas
Data/wiki/14533")
env.execute()
}
}
On 7 June 2016 at 19:26, CPC wrote:
> Hello everyone,
>
> When I use DataStream split/select, it always sends all selected records to
> the same TaskManager. Is there any reason for this behaviour? Also, is it
> possible to implement the same spli
Hello everyone,
When I use DataStream split/select, it always sends all selected records to
the same TaskManager. Is there any reason for this behaviour? Also, is it
possible to implement the same split/select behaviour for the DataSet API
(without using a different filter for every output)? I found this
https:
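To make the question concrete, here is a minimal plain-Java sketch of the
split/select semantics being asked about: each record is tagged with one or
more output names, the way DataStream's OutputSelector labels records. The
helper names are made up for illustration; this is not Flink API:

```java
import java.util.*;
import java.util.function.Function;

public class SplitSelectSketch {
    // Mimics split(OutputSelector): the selector returns the output names
    // each element belongs to; select(name) then reads one of the outputs.
    static <T> Map<String, List<T>> split(List<T> input,
                                          Function<T, List<String>> selector) {
        Map<String, List<T>> outputs = new HashMap<>();
        for (T element : input) {
            for (String name : selector.apply(element)) {
                outputs.computeIfAbsent(name, k -> new ArrayList<>()).add(element);
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> outs = split(
            List.of(1, 2, 3, 4),
            x -> List.of(x % 2 == 0 ? "even" : "odd"));
        System.out.println(outs.get("even")); // [2, 4]
        System.out.println(outs.get("odd"));  // [1, 3]
    }
}
```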
only allow
> one output from an operation right now. Maybe we can extends this somehow
> in the future.
>
> Cheers,
> Aljoscha
>
> On Thu, 12 May 2016 at 17:27 CPC wrote:
>
> > Hi Gabor,
> >
> > Yes, functionally this helps. But in this case I am proc
t[A] = xs.filter(f1)
> val split2: DataSet[A] = xs.filter(f2)
>
> where f1 and f2 are true for those elements that should go into the
> first and second DataSets respectively. So far, the splits will just
> contain elements from the input DataSet, but you can of course apply
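The two-filter workaround quoted above can be sketched end-to-end in plain Java
(a List stands in for a DataSet here; f1 and f2 follow the names in the mail,
and the split logic is illustrative, not Flink API):

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FilterSplitSketch {
    // One pass per output: filter the same input with each predicate.
    static List<Integer> select(List<Integer> xs, Predicate<Integer> f) {
        return xs.stream().filter(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3, 4, 5, 6);
        Predicate<Integer> f1 = x -> x <= 3; // goes into the first "DataSet"
        Predicate<Integer> f2 = f1.negate(); // goes into the second

        // This is the workaround the thread wants to avoid: the input is
        // traversed once per output instead of being split in a single pass.
        System.out.println(select(xs, f1)); // [1, 2, 3]
        System.out.println(select(xs, f2)); // [4, 5, 6]
    }
}
```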
Hi folks,
Is there any way in the DataSet API to split a DataSet[A] into a DataSet[A]
and a DataSet[B]? The use case is a custom filter component that we want to
implement. We want to redirect the input elements whose result is false after
we apply the predicate. Actually we want to direct input elements
> Best, Fabian
>
>
> 2016-04-26 16:59 GMT+02:00 CPC :
>
> > Hi,
> >
> > I looked at some scheduler documentation but could not find an answer to my
> > question. My question is: suppose that I have a big file on a 40-node
> > Hadoop cluster and since it is a
Hi,
I looked at some scheduler documentation but could not find an answer to my
question. My question is: suppose that I have a big file on a 40-node Hadoop
cluster, and since it is a big file, every node has at least one chunk of the
file. If I write a Flink job and want to filter the file, and if the job has
par
Hmmm, I understand now. Thank you guys :)
On Apr 16, 2016 2:21 PM, "Matthias J. Sax" wrote:
> Sure. WITHOUT.
>
> Thanks. Good catch :)
>
> On 04/16/2016 01:18 PM, Ufuk Celebi wrote:
> > On Sat, Apr 16, 2016 at 1:05 PM, Matthias J. Sax
> wrote:
> >> (with the need to sort the data, because both
>
Hi
When I looked at what kind of optimizations Flink does, I found
https://cwiki.apache.org/confluence/display/FLINK/Optimizer+Internals. Is
it up to date? Also, I couldn't understand:
"Reusing of partitionings and sort orders across operators. If one operator
leaves the data in partitioned fashion