Oh yes, on that I agree. I'm just saying that the checkpoint setting should maybe be a central setting.
On Mon, 15 Jun 2015 at 15:38 Matthias J. Sax <mj...@informatik.hu-berlin.de> wrote:

> Hi,
>
> IMHO, it is very common that Workers have their own config files (e.g.,
> Storm works the same way). And I think it makes a lot of sense. You
> might run Flink in a heterogeneous cluster and want to assign
> different memory and slots for different hardware. This would not be
> possible using a single config file (specified at the master and
> distributed).
>
> -Matthias
>
> On 06/15/2015 03:30 PM, Aljoscha Krettek wrote:
> > Regarding 1), that's why I said "bugs and features". :D But I think of it
> > as a bug, since people will normally set it in the flink-conf.yaml on the
> > master and assume that it works. That's what I assumed and it took me a
> > while to figure out that the task managers don't respect this setting.
> >
> > Regarding 3), if you think about it, this could never work. The state
> > handle cleanup logic happens purely on the JobManager. So what happens is
> > that the TaskManagers create state in some directory, let's say
> > /tmp/checkpoints, on the TaskManager. For cleanup, the JobManager gets
> > the state handle and calls discard (on the JobManager); this tries to
> > clean up the state in /tmp/checkpoints, but of course, there is nothing
> > there since we are still on the JobManager.
> >
> > On Mon, 15 Jun 2015 at 15:23 Márton Balassi <balassi.mar...@gmail.com>
> > wrote:
> >
> >> @Aljoscha:
> >> 1) I think this just means that you can set the state backend on a
> >> taskmanager basis.
> >> 3) This is a serious issue then. Does it work when you set it in the
> >> flink-conf.yaml?
> >>
> >> On Mon, Jun 15, 2015 at 3:17 PM, Aljoscha Krettek <aljos...@apache.org>
> >> wrote:
> >>
> >>> So, during my testing of the state checkpointing on a cluster I
> >>> discovered several things (bugs and features):
> >>>
> >>> - If you have a setup where the configuration is not synced to the
> >>> workers, they do not pick up the state back-end configuration. The
> >>> workers do not respect the setting in the flink-conf.yaml on the master.
> >>> - HDFS checkpointing works fine if you manually set it as the per-job
> >>> state-backend using setStateHandleProvider()
> >>> - If you manually set the stateHandleProvider to a "file://" backend,
> >>> old checkpoints will not be cleaned up; they will also not be cleaned
> >>> up when a job is finished.
> >>>
> >>> On Sun, 14 Jun 2015 at 23:22 Maximilian Michels <m...@apache.org> wrote:
> >>>
> >>>> Hi Henry,
> >>>>
> >>>> This is just a dry run. The goal is to get everything in shape for a
> >>>> proper vote.
> >>>>
> >>>> Kind regards,
> >>>> Max
> >>>>
> >>>>
> >>>> On Sun, Jun 14, 2015 at 7:58 PM, Henry Saputra <henry.sapu...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Max,
> >>>>>
> >>>>> Are you doing an official VOTE on the RC for the 0.9 release, or is
> >>>>> this just a dry run?
> >>>>>
> >>>>>
> >>>>> - Henry
> >>>>>
> >>>>> On Sun, Jun 14, 2015 at 9:11 AM, Maximilian Michels <m...@apache.org>
> >>>>> wrote:
> >>>>>> Dear Flink community,
> >>>>>>
> >>>>>> Here's the second release candidate for the 0.9.0 release. We haven't
> >>>>>> had a formal vote on the previous release candidate but it received an
> >>>>>> implicit -1 because of a couple of issues.
> >>>>>>
> >>>>>> Thanks to the hard-working Flink devs these issues should be solved
> >>>>>> now.
> >>>>>> The following commits have been added to the second release
> >>>>>> candidate:
> >>>>>>
> >>>>>> f5f0709 [FLINK-2194] [type extractor] Excludes Writable type from WritableTypeInformation to be treated as an interface
> >>>>>> 40e2df5 [FLINK-2072] [ml] Adds quickstart guide
> >>>>>> af0fee5 [FLINK-2207] Fix TableAPI conversion documenation and further renamings for consistency.
> >>>>>> e513be7 [FLINK-2206] Fix incorrect counts of finished, canceled, and failed jobs in webinterface
> >>>>>> ecfde6d [docs][release] update stable version to 0.9.0
> >>>>>> 4d8ae1c [docs] remove obsolete YARN link and cleanup download links
> >>>>>> f27fc81 [FLINK-2195] Configure Configurable Hadoop InputFormats
> >>>>>> ce3bc9c [streaming] [api-breaking] Minor DataStream cleanups
> >>>>>> 0edc0c8 [build] [streaming] Streaming parents dependencies pushed to children
> >>>>>> 6380b95 [streaming] Logging update for checkpointed streaming topologies
> >>>>>> 5993e28 [FLINK-2199] Escape UTF characters in Scala Shell welcome squirrel.
> >>>>>> 80dd72d [FLINK-2196] [javaAPI] Moved misplaced SortPartitionOperator class
> >>>>>> c8c2e2c [hotfix] Bring KMeansDataGenerator and KMeans quickstart in sync
> >>>>>> 77def9f [FLINK-2183][runtime] fix deadlock for concurrent slot release
> >>>>>> 87988ae [scripts] remove quickstart scripts
> >>>>>> f3a96de [streaming] Fixed streaming example jars packaging and termination
> >>>>>> 255c554 [FLINK-2191] Fix inconsistent use of closure cleaner in Scala Streaming
> >>>>>> 1343f26 [streaming] Allow force-enabling checkpoints for iterative jobs
> >>>>>> c59d291 Fixed a few trivial issues:
> >>>>>> e0e6f59 [streaming] Optional iteration feedback partitioning added
> >>>>>> 348ac86 [hotfix] Fix YARNSessionFIFOITCase
> >>>>>> 80cf2c5 [ml] Makes StandardScalers state package private and reduce redundant code. Adjusts flink-ml readme.
> >>>>>> c83ee8a [FLINK-1844] [ml] Add MinMaxScaler implementation in the proprocessing package, test for the for the corresponding functionality and documentation.
> >>>>>> ee7c417 [docs] [streaming] Added states and fold to the streaming docs
> >>>>>> fcca75c [docs] Fix some typos and grammar in the Streaming Programming Guide.
> >>>>>>
> >>>>>>
> >>>>>> Again, we need to test the new release candidate. Therefore, I've
> >>>>>> created a new document where we keep track of our testing criteria for
> >>>>>> releases:
> >>>>>>
> >>>>>> https://docs.google.com/document/d/162AZEX8lo0Njal10mmt9wzM5GYVL5WME-VfwGmwpBoA/edit
> >>>>>>
> >>>>>> Everyone who tested previously, could take a different task this
> >>>>>> time. For some components we probably don't have to test again but, if
> >>>>>> in doubt, testing twice doesn't hurt.
> >>>>>>
> >>>>>> Happy testing :)
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Max
> >>>>>>
> >>>>>> Git branch: release-0.9.0-rc2
> >>>>>> Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc2/
> >>>>>> Maven artifacts:
> >>>>>> https://repository.apache.org/content/repositories/orgapacheflink-1040/
> >>>>>> PGP public key for verifying the signatures:
> >>>>>> http://pgp.mit.edu/pks/lookup?op=vindex&search=0xDE976D18C2909CBF