Re: Gelly Scatter-Gather Iteration, In a single superstep, GatherFunction.updateVertex invoked more then once

2019-02-04 Thread Greg Hogan
Would you perchance have an example program to demonstrate the unexpected behavior? Does this issue always manifest or are you only seeing duplicate calls under specific circumstances? On Mon, Oct 22, 2018 at 8:33 AM 曹建华 wrote: > Hi: > According to the code comment, in Scatter-Gather Iteration,

Re: Plain text SSL passwords in Log file

2018-03-28 Thread Greg Hogan
With the current method you always have the risk, no matter which keywords you filter on ("secret", "password", etc.), that the key name is mistyped and inadvertently logged. Perhaps we could implement something like TravisCI's encryption keys [ https://docs.travis-ci.com/user/encryption-keys/] at

Re: Unable to see more than 5 jobs on Flink Dashboard

2018-03-28 Thread Greg Hogan
What version of Flink are you running? Deployment method? Referenced section of flink-conf.yaml? On Wed, Mar 28, 2018 at 4:34 PM, Vinay Patil wrote: > Hi, > > I am not able to see more than 5 jobs on Flink Dashboard. > I have set web.history to 50 in flink-conf.yaml file. > > Is there any other

Re: TaskManager crashes with PageRank algorithm in Gelly

2018-03-15 Thread Greg Hogan
Termination of the TaskManager by the Linux OOM killer indicates an overallocation of memory and you have set "taskmanager.heap.mb: 139264” on machines with 136 GB. Even though you were able to (temporarily?) resolve the issue by enabling preallocation, you may see degraded performance if syste

Re: Gelly: akka.ask.timeout

2018-01-09 Thread Greg Hogan
Hi Alieh, Are you able to run the example WordCount application without losing TaskManagers? Greg > On Jan 8, 2018, at 7:48 AM, Alieh Saeedi wrote: > > Hey all, > I have an iterative algorithm implemented in Gelly. As long as I upgraded > everything to flink-1.3.1 from 1.1.2, the runtime ha

Re: Flink Batch Performance degradation at scale

2017-12-07 Thread Greg Hogan
Hi Garrett, In the Web UI, when viewing a job under overview / subtasks, selecting the checkbox "Aggregate task statistics by TaskManager” will reduce the number of displayed rows (though in your case only by half). The following documents profiling a Flink job with Java Flight Recorder: http

Re: Flink memory usage

2017-11-07 Thread Greg Hogan
I’ve used the following simple script to capture Flink metrics by running: python -u ./statsd_server.py 9020 > statsd_server.log >>> flink-conf.yaml metrics.reporters: statsd_reporter metrics.reporter.statsd_reporter.class: org.apache.flink.metrics.statsd.StatsDReporter metrics.reporter.

Re: Task Manager was lost/killed due to full GC

2017-09-15 Thread Greg Hogan
Late response, but a common reason for disappearing TaskManagers is termination by the Linux out-of-memory killer, with the recommendation to decrease the allotted memory. > On Sep 5, 2017, at 9:09 AM, ShB wrote: > > Hi, > > I'm running a Flink batch job that reads almost 1 TB of data from

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread Greg Hogan
You should be able to implement this using a TypeHint (see the Creating a TypeInformation or TypeSerializer section from the linked page): return TypeInformation.of(new TypeHint>(){}); > On Aug 13, 2017, at 10:31 AM, AndreaKinn wrote: > > Hi, > I'm trying to implement a custom deseria

Re: PageRank iteration

2017-08-13 Thread Greg Hogan
PageRank is using a bulk iteration via DataSet#iterate whereas a delta iteration would start with DataSet#iterateDelta. > On Aug 13, 2017, at 10:30 AM, Kaepke, Marc wrote: > > Hi everyone, > > does PageRank use bulk or delta iteration? > > I mean the implementation of PageRank of the package

Re: Standalone cluster - taskmanager settings ignored

2017-08-11 Thread Greg Hogan
Hi Marc, By chance did you edit the slaves file before shutting down the cluster? If so, then the removed worker would not be stopped and would reconnect to the restarted JobManager. Greg > On Aug 11, 2017, at 11:25 AM, Kaepke, Marc wrote: > > Hi, > > I have a cluster of 4 dedicated machin

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-26 Thread Greg Hogan
; The vertex-centric, sg and gsa PageRank need a Double as vertex value. A >> VertexDegree function generate a vertex with a LongValue as value. >> Maybe I can iterate over the graph and remove all edges with a degree of >> zero?! >> >>> Am 24.07.2017 um 16:36 schrie

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-24 Thread Greg Hogan
1 with 1.103 > id 2 with 0.815 > id 3 with 0 > sg and gsa > id 4 with 2.208 > id 1 with 2.114 > id 2 with 1.546 > id 3 with 0 > new PageRank in Gelly 1.3.X > id 1 with 0.392 > id 2 with 0.313 > id 4 with 0.294 > > Do you know why? > > > Best > Marc >

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-22 Thread Greg Hogan
Hi Marc, PageRank and GSAPageRank were moved to the flink-gelly-examples jar in the org.apache.flink.graph.examples package. A library algorithm was added that supports both source and sink vertices. This limitation of the old algorithms was noted in the class documentation and I understand to

Re: delta iteration

2017-07-12 Thread Greg Hogan
Hi Alieh, From a rich function call getIterationRuntimeContext().getSuperstepNumber() Greg > On Jul 12, 2017, at 9:56 AM, Alieh wrote: > > Hello all, > > I need iteration number in delta iteration (or any kind of counter). Is there > anyway to implement or extract it? > > Cheers, > > Alie

Re: Problem with Summerization

2017-06-28 Thread Greg Hogan
0008,(8,1)) > (0004,(9,7)) > (000a,(10,1)) > > g.getEdges.print(); //error!! > > As I said, I also tested when edges have values other than null > and the same problem appears. > > Regards, > Ali > > > Quoting Greg Hogan : &

Re: Problem with Summerization

2017-06-27 Thread Greg Hogan
Hi Ali, Could you print and include a gellyGraph which results in this error. Greg > On Jun 27, 2017, at 2:48 PM, rost...@informatik.uni-leipzig.de wrote: > > Dear All, > > I do not understand what the error in the following code can be? > > Graph gellyGraph = ... > > Graph, > Summarizatio

Re: Performance Improvement on Flink 1.2.0

2017-06-22 Thread Greg Hogan
Some documentation on application profiling with Flink 1.3 (can be manually inserted into the scripts for Flink 1.2): https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/application_profiling.html > On Jun 22, 2017, at 9:24 AM, Stefan Richter > wrote: > > Hi, > > the an

Re: Kafka and Flink integration

2017-06-21 Thread Greg Hogan
TypeInformation was specified. Greg > On Jun 21, 2017, at 6:21 AM, Ted Yu wrote: > > Greg: > Can you clarify he last part? > Should it be: the concrete type cannot be known ? > > Original message ---- > From: Greg Hogan > Date: 6/21/17 3:10 AM (GMT-08:0

Re: Kafka and Flink integration

2017-06-21 Thread Greg Hogan
The recommendation has been to avoid Kryo where possible. General data exchange: avro or thrift. Flink internal data exchange: POJO (or Tuple, which are slightly faster though less readable, and there is an outstanding PR to narrow or close the performance gap). Kryo is useful for types which

Re: coGroup exception or something else in Gelly job

2017-06-12 Thread Greg Hogan
the issue. > Maybe Flink provides an influence by the programmer? > > > Best and thanks > Marc > > > Am 10.06.2017 um 00:49 schrieb Greg Hogan : > > Have you looked at org.apache.flink.gelly.GraphExtension. > CustomVertexValue.createInitSemiCluster(CustomVertexValue.jav

Re: coGroup exception or something else in Gelly job

2017-06-09 Thread Greg Hogan
Have you looked at org.apache.flink.gelly.GraphExtension.CustomVertexValue.createInitSemiCluster(CustomVertexValue.java:51)? > On Jun 9, 2017, at 4:53 PM, Kaepke, Marc wrote: > > Hi everyone, > > I don’t have any exceptions if I execute my Gelly job in my IDE (local) > directly. > The next s

Re: Flink and swapping question

2017-05-24 Thread Greg Hogan
Hi Flavio, Flink handles interrupts so the only silent killer I am aware of is Linux's OOM killer. Are you seeing such a message in dmesg? Greg On Wed, May 24, 2017 at 3:18 AM, Flavio Pompermaier wrote: > Hi to all, > I'd like to know whether memory swapping could cause a taskmanager crash. >

Re: Production Deployments

2017-05-13 Thread Greg Hogan
We recently added documentation (and official builds) for Docker and K8s. Feedback is appreciated! https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/kubernetes.html I’m not aware of a preference

Re: Gelly - which partitioning

2017-03-29 Thread Greg Hogan
Hi Marc, I’ll defer to Vasia’s comment below from FLINK-1536 as she has much more knowledge and experience with graph partitioning. This is certainly an area of interest so please let us know if you would like to contribute! The referenced list of papers is at: http://www.citeulike.org/user/v

Re: [E] Re: unable to add more servers in zookeeper quorum peers in flink 1.2

2017-03-23 Thread Greg Hogan
-3A__twitter.com_verizon&d=DwMD-g&c=udBTRvFvXC5Dhqg7UHpJlPps3mZ3LRxpb6__0PomBTQ&r=vykaCRzOoltktnhRTDFL-6nA55ddeOEzGwp9gbKhlfRrUojHU0nRm-mt29N20zWv&m=8K9Z7ArBnvYZL6fAHHu5u1r_-ezNyqdayG6uUQ8Wpqc&s=tqZxl_Osm7Nl_TMH5fn6uLqMmYwzWMRmIY0a6Cv94XE&e=> > > <https://urldefense.proofpoint.

Re: [POLL] Who still uses Java 7 with Flink ?

2017-03-23 Thread Greg Hogan
Robert, Thanks for the report. Shouldn’t we be revisiting this decision at the beginning of the new release cycle rather than near the end? There is currently little cost to staying with Java 7 since no Flink code or pull requests have been written for Java 8. Greg > On Mar 23, 2017, at 6:3

Re: unable to add more servers in zookeeper quorum peers in flink 1.2

2017-03-22 Thread Greg Hogan
Kanagaraj, None of the server lines are matching since the regex in start-zookeeper-quorum.sh does not allow for spaces after the equals sign. ^server\.([0-9])+[[:space:]]*\=([^: \#]+) Greg > On Mar 22, 2017, at 8:49 AM, kanagaraj.vengidas...@verizon.com wrote: > > Hi All, > > We are using

Re: Variable Tuple Type

2017-03-15 Thread Greg Hogan
Hi Max, Belated response but this looks to be the same problem I am working to solve in Gelly with graph data in FLINK-3695 [0]. These arrays allow for object reuse. Interface is here [1]. Additional Value types are easy to add but Long, Int, and String are most common to Gelly. Suggestions are w

Re: Apache Flink 1.1.4 - Gelly - LocalClusteringCoefficient - Returning values above 1?

2017-02-23 Thread Greg Hogan
,5113,1.0) >> (5113,6064,1.0) >> (6065,5113,1.0) >> (5113,6065,1.0) >> (6279,5113,1.0) >> (5113,6279,1.0) >> (4907,5113,1.0) >> (5113,4907,1.0) >> (6465,5113,1.0) >> (5113,6465,1.0) >> (6707,5113,1.0) >> (5113,6707,1.0) >> (70

Re: parallelism and slots allocated

2017-02-11 Thread Greg Hogan
Hi Bernard and Kurt, Chaining affects how subtasks operate within slots. Resource groups segregate subtasks into different slots. https://ci.apache.org/projects/flink/flink-docs-release-1.2/concepts/runtime.html#task-slots-and-resources https://ci.apache.org/projects/flink/flink-docs-release-1.2

Re: Questions about the V-C Iteration in Gelly

2017-02-10 Thread Greg Hogan
Hi Xingcan, FLINK-1885 looked into adding a bulk mode to Gelly's iterative models. As an alternative you could implement your algorithm with Flink operators and a bulk iteration. Most of the Gelly library is written with native operators. Greg On Fri, Feb 10, 2017 at 5:02 AM, Xingcan Cui wrote

Re: Remove old <1.0 documentation from google

2017-02-09 Thread Greg Hogan
See FLINK-5575. https://issues.apache.org/jira/browse/FLINK-5575 Looks like release-0.8 and older are not automatically rebuilt. https://ci.apache.org/builders/ On Thu, Feb 9, 2017 at 7:17 AM, Jonas wrote: > Maybe add "This documentation is outdated. Please switch to a newer version > by cl

Re: start-cluster.sh issue

2017-01-27 Thread Greg Hogan
Hi Lior, Try adding this to your flink-conf.yaml: env.ssh.opts: FLINK_CONF_DIR=/tmp/parallelmachine/lior/flink/conf I think this is expected and not a bug (setting FLINK_CONF_DIR from the environment is supported for YARN). Please do file a JIRA for this feature as I think it would be a nice im

Re: Flink configuration

2017-01-25 Thread Greg Hogan
Has anyone reported decreased performance with hyper-threading? On Tue, Jan 24, 2017 at 11:18 AM, Aljoscha Krettek wrote: > Hi, > that wording is from a time where no-one though about VMs with virtual > cores. IMHO this maps directly to virtual cores so you should set it > according to the numbe

Re: Improving Flink performance

2017-01-23 Thread Greg Hogan
Hi Jonas, It looks like the mailing list has removed your formatting and/or attachments. Greg On Mon, Jan 23, 2017 at 6:08 AM, Jonas wrote: > Hello! > > I'm having performance problems with a Flink job. If there is anything > valuable missing, please ask and I will try to answer ASAP. My job l

Re: Release 1.2?

2017-01-23 Thread Greg Hogan
Support for Hadoop 1 was dropped in FLINK-4895 [1]. [1] https://issues.apache.org/jira/browse/FLINK-4895 On Mon, Jan 23, 2017 at 11:09 AM, wrote: > I notice that for Flink 1.2.0 there is no x.y.z-hadoop1 folder (Cf. Apache > staging > repo >

Re: Apache Flink 1.1.4 - Gelly - LocalClusteringCoefficient - Returning values above 1?

2017-01-20 Thread Greg Hogan
Hi Miguel, The '--output print' option describes the values and also displays the local clustering coefficient value. You're running the undirected algorithm on a directed graph. In 1.2 there is an option '--simplify true' that will add reverse edges and remove duplicate edges and self-loops. Alt

Re: Flink rolling upgrade support

2016-12-22 Thread Greg Hogan
Aljoscha, For the second, possible solution is there also a requirement that the data sinks handle out-of-order writes? If the new job outpaces the old job which is then terminated, the final write from the old job could have overwritten "newer" writes from the new job. Greg On Tue, Dec 20, 2016

Re: Calculating stateful counts per key

2016-12-20 Thread Greg Hogan
Hi Mäki, This is the expected output. Your RichFlatMapFunction is opened once per task and you are sharing counterValue for all keys processed by that task. Greg On Mon, Dec 19, 2016 at 11:38 AM, Mäki Hanna wrote: > Hi, > > > > I'm trying to calculate stateful counts per key with checkpoints

Re: [DISCUSS] "Who's hiring on Flink" monthly thread in the mailing lists?

2016-12-09 Thread Greg Hogan
Google indexes the mailing list. Anyone can filter the messages to trash in a few clicks. This will also be a means for the community to better understand which and how companies are using Flink. On Fri, Dec 9, 2016 at 8:27 AM, Felix Neutatz wrote: > Hi, > > I wonder whether a mailing list is a

Re: How to analyze space usage of Flink algorithms

2016-12-09 Thread Greg Hogan
This does sound like a nice feature, both per-job and per-taskmanager bytes written to and read from disk. On Fri, Dec 9, 2016 at 8:51 AM, Chesnay Schepler wrote: > We do not measure how much data we are spilling to disk. > > > On 09.12.2016 14:43, Fabian Hueske wrote: > > Hi, > > the heap mem u

Re: S3 checkpointing in AWS in Frankfurt

2016-11-23 Thread Greg Hogan
hat's something I can configure somewhere. > > Regards, > Jonathan > > > On 23 November 2016 at 16:29, Greg Hogan wrote: > >> Hi Jonathan, >> >> Which S3 storage class are you using? Do you have a breakdown of the S3 >> costs as storage / API calls

Re: S3 checkpointing in AWS in Frankfurt

2016-11-23 Thread Greg Hogan
Hi Jonathan, Which S3 storage class are you using? Do you have a breakdown of the S3 costs as storage / API calls / early deletes / data transfer? Greg On Wed, Nov 23, 2016 at 2:52 AM, Jonathan Share wrote: > Hi, > > I'm interested in hearing if anyone else has experience with using Amazon > S

Re: spark vs flink batch performance

2016-11-18 Thread Greg Hogan
"For csv reading, i deliberately did not use csv reader since i want to run same code across spark and flink." If your objective deviates from writing and running the fastest Spark and fastest Flink programs, then your comparison is worthless. On Fri, Nov 18, 2016 at 5:37 AM, CPC wrote: > Hi G

Re: Retrieving a single element from a DataSet

2016-11-04 Thread Greg Hogan
The tickets are in Flink's Jira: https://issues.apache.org/jira/browse/FLINK-4965 https://issues.apache.org/jira/browse/FLINK-4966 Are you looking to process temporal graphs with the DataStream API? On Fri, Nov 4, 2016 at 5:52 AM, otherwise777 wrote: > Cool, thnx for that, > > I tried searc

Re: NotSerializableException: jdk.nashorn.api.scripting.NashornScriptEngine

2016-11-03 Thread Greg Hogan
Hi Pedro, Which problem are you having, the NotSerializableException or not seeing open() called on a RichFunction? Greg On Wed, Nov 2, 2016 at 10:47 AM, PedroMrChaves wrote: > Hello, > > I'm having the exact same problem. > I'm using a filter function on a datastream. > My flink version is 1.

Re: Looping over a DataSet and accesing another DataSet

2016-11-01 Thread Greg Hogan
By 'loop' do you refer to an iteration? The output of a bulk iteration is processed as the input of the following iteration. Values updated in an iteration are available in the next iteration just as values updated by an operator are available to the following operator. Your chosen algorithm may n

Re: Looping over a DataSet and accesing another DataSet

2016-10-31 Thread Greg Hogan
The DataSet API only supports binary joins but one can simulate an n-ary join by chaining successive join operations. Your algorithm requires a global ordering on edges, requiring a parallelism of 1, and will not scale in a distributed processing system. Flink excels at processing bulk (larger tha

Re: Retrieving a single element from a DataSet

2016-10-30 Thread Greg Hogan
I created FLINK-4965 "AllPairsShortestPaths" and FLINK-4966 "BetweennessCentrality". On Wed, Oct 26, 2016 at 4:39 PM, Greg Hogan wrote: > It sounds like you want to use an all-pairs shortest paths algorithm. This > would be a great contribution to Gelly! >

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Greg Hogan
Could be. I had thought TypeInfoParser was closely related to TypeExtractor. On Thu, Oct 27, 2016 at 10:20 AM, Fabian Hueske wrote: > Wouldn't that be orthogonal to adding it to the TypeInfoParser? > > 2016-10-27 15:22 GMT+02:00 Greg Hogan : > >> Fabian, >> >&

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Greg Hogan
Fabian, Should we instead add this as a registered TypeInfoFactory? Greg On Thu, Oct 27, 2016 at 3:55 AM, Fabian Hueske wrote: > Yes, I think you are right. > TypeInfoParser needs to be extended to parse the java.sql.* types into the > corresponding TypeInfos. > > Can you open a JIRA for that?

Re: Retrieving a single element from a DataSet

2016-10-26 Thread Greg Hogan
It sounds like you want to use an all-pairs shortest paths algorithm. This would be a great contribution to Gelly! https://en.wikipedia.org/wiki/Shortest_path_problem#All-pairs_shortest_paths On Wed, Oct 26, 2016 at 9:29 AM, otherwise777 wrote: > That is indeed not the nice way to do it because

Re: Flink error: Too few memory segments provided

2016-10-20 Thread Greg Hogan
By default Flink only allocates 2048 network buffers (64 MiB at 32 KiB/buffer). Have you increased the value for taskmanager.network.numberOfBuffers in flink-conf.yaml? On Thu, Oct 20, 2016 at 11:24 AM, otherwise777 wrote: > I got this error in Gelly, which is a result of flink (i believe) > > E

Re: question about making a temporal Graph with Gelly

2016-10-12 Thread Greg Hogan
Hi Wouter, Packing two or more values into the edge value using a Tuple is a common practice. Does this work well for the algorithms you are writing? Greg On Wed, Oct 12, 2016 at 4:29 PM, Wouter Ligtenberg wrote: > ​​Hi there, > > I'm currently working on making a Temporal Graph with Gelly, a

Re: DataStream csv reading

2016-10-06 Thread Greg Hogan
The program executes when you call print (same for collect), which is why you are seeing an error when calling execute (since there is no new job to execute). As Fabian noted, you'll need to look in the TaskManager log files for the printed output if running on a cluster. On Thu, Oct 6, 2016 at 4:

Re: Parallelism vs task manager allocation

2016-09-21 Thread Greg Hogan
Is the query stream also a Flink job? Is this use case not supported by keeping state within a single Flink job? https://ci.apache.org/projects/flink/flink-docs-master/dev/state.html FLINK-3779 recently added "queryable state" to allow external processes access to operator state. https://issue

Re: Parallelism vs task manager allocation

2016-09-20 Thread Greg Hogan
Hi Pushpendra, This is the expected system behavior. Slots local to the same TaskManager can transfer buffers in memory. Are you able to also run the Sink with a parallelism of 4? Greg On Tue, Sep 20, 2016 at 6:16 AM, Pushpendra Jaiswal < pushpendra.jaiswa...@gmail.com> wrote: > Hi > I have lau

Re: SQL for Flink

2016-09-14 Thread Greg Hogan
Hi Deepak, There are many open tickets for Flink's SQL API. Documentation is at https://ci.apache.org/projects/flink/flink-docs-master/dev/table_api.html. https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20%22Table%20AP

Re: Flink Iterations vs. While loop

2016-09-06 Thread Greg Hogan
ink Iterations the data is repeatedly > distributed? Or the other way around: Might it be that flink "remembers" > somehow that the data is already distributed even for the while loop? > > -Dan > > > > Am 02.09.2016 um 16:39 schrieb Greg Hogan: > > Hi Dan, &g

Re: Flink Iterations vs. While loop

2016-09-02 Thread Greg Hogan
Hi Dan, Where are you reading the 200 GB "data" from? How much memory per node? If the DataSet is read from a distributed filesystem and if with iterations Flink must spill to disk then I wouldn't expect much difference. About how many iterations are run in the 30 minutes? I don't know that this i

Re: Metrics not reported to graphite

2016-09-01 Thread Greg Hogan
Have you copied the required jar files into your lib/ directory? Only JMX support is provided in the distribution. On Thu, Sep 1, 2016 at 5:07 PM, Jack Huang wrote: > Hi all, > > I followed the instruction for reporting metrics to a Graphite server on > the official document (https://ci.apache.o

Re: Setting number of TaskManagers

2016-08-24 Thread Greg Hogan
The number of TaskManagers will be equal to the number of entries in the conf/slaves file. On Wed, Aug 24, 2016 at 3:04 PM, Foster, Craig wrote: > Is there a way to set the number of TaskManagers using a configuration > file or environment variable? I'm looking at the docs for it and it says > y

Re: Performance issues with GroupBy?

2016-07-26 Thread Greg Hogan
Hi Robert, Are you able to simplify the your function input / output types? Flink aggressively serializes the data stream and complex types such as ArrayList and BitSet will be much slower to process. Are you able to reconstruct the lists to be groupings on elements? Greg On Mon, Jul 25, 2016 at

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-11 Thread Greg Hogan
, Saliya Ekanayake > wrote: > >> I meant, I'll check when current jobs are done and will let you know. >> >> On Mon, Jul 11, 2016 at 12:19 PM, Saliya Ekanayake >> wrote: >> >>> I am running some jobs now. I'll stop and restart using pdsh to see

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-11 Thread Greg Hogan
as well) bind > processes and threads. For Flink, I've manually done this using shell > script that scans TMs in a node and pin them appropriately. This approach > is OK, but it's better if the support is integrated to Flink. > > On Sun, Jul 10, 2016 at 8:33 PM, Greg H

Re: Parameters to Control Intra-node Parallelism

2016-07-11 Thread Greg Hogan
2016 at 11:55 PM, Saliya Ekanayake wrote: > Greg, > > where did you see the OOM log as shown in this mail thread? In my case > none of the TaskManagers nor JobManger reports an error like this. > > On Sun, Jul 10, 2016 at 8:45 PM, Greg Hogan wrote: > >> These

Re: Parameters to Control Intra-node Parallelism

2016-07-10 Thread Greg Hogan
These symptoms sounds similar to what I was experiencing in the following thread. Flink can have some unexpected memory usage which can result in an OOM kill by the kernel, and this becomes more pronounced as the cluster size grows. https://www.mail-archive.com/dev@flink.apache.org/msg06346.html

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-10 Thread Greg Hogan
Hi Saliya, Would you happen to have pdsh (parallel distributed shell) installed? If so the TaskManager startup in start-cluster.sh will run in parallel. As to running 24 TaskManagers together, are these running across multiple NUMA nodes? I had filed FLINK-3163 ( https://issues.apache.org/jira/br

Re: sampling function

2016-07-09 Thread Greg Hogan
Hi Do, DataSet provides a stable @Public interface. DataSetUtils is marked @PublicEvolving which is intended for public use, has stable behavior, but method signatures may change. It's also good to limit DataSet to common methods whereas the utility methods tend to be used for specific application

Re: Does Flink allows for encapsulation of transformations?

2016-06-07 Thread Greg Hogan
"The question is how to encapsulate numerous transformations into one object or may be a function in Apache Flink Java setting." Implement CustomUnaryOperation. This can then be applied to a DataSet by calling `DataSet result = DataSet.runOperation(new MyOperation<>(...));`. On Mon, Jun 6, 2016 a

Re: BloomFilter Exception

2015-09-05 Thread Greg Hogan
Flavio, It looks like your build is older than some recent fixes in that code. https://github.com/apache/flink/commits/2e6e4de5d1d2b5123f4311493763fd84f52779ab/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash/MutableHashTable.java Greg On Fri, Sep 4, 2015 at 10:34 AM, Flavio

max-fan

2015-09-02 Thread Greg Hogan
number of file handles per operator, but may cause intermediate merging/partitioning, if set too small (DEFAULT: 128). Greg Hogan