[ https://issues.apache.org/jira/browse/BEAM-14449?focusedWorklogId=775264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775264 ]
ASF GitHub Bot logged work on BEAM-14449: ----------------------------------------- Author: ASF GitHub Bot Created on: 26/May/22 22:18 Start Date: 26/May/22 22:18 Worklog Time Spent: 10m Work Description: KevinGG commented on code in PR #17736: URL: https://github.com/apache/beam/pull/17736#discussion_r883122855 ########## sdks/python/apache_beam/runners/interactive/interactive_beam.py: ########## @@ -475,7 +485,9 @@ def cleanup( 'options is deprecated since First stable release. References to ' '<pipeline>.options will not be supported', category=DeprecationWarning) - p.options.view_as(FlinkRunnerOptions).flink_master = '[auto]' + p_flink_options = p.options.view_as(FlinkRunnerOptions) + p_flink_options.flink_master = '[auto]' + p_flink_options.flink_version = None Review Comment: Because Dataproc only supports a constant Flink version and we always override to that version. We set it to None in case the user wants to use a different cluster (as the behavior indicates: clean up the cluster status for the given pipeline) but forgets to set the Flink version they want to use. Beam's default value is the latest hard coded published Flink version. Issue Time Tracking ------------------- Worklog Id: (was: 775264) Time Spent: 1h 20m (was: 1h 10m) > Support cluster provisioning when using Flink on Dataproc > --------------------------------------------------------- > > Key: BEAM-14449 > URL: https://issues.apache.org/jira/browse/BEAM-14449 > Project: Beam > Issue Type: Improvement > Components: runner-py-interactive > Reporter: Ning > Assignee: Ning > Priority: P2 > Attachments: image-2022-05-16-11-25-32-904.png, > image-2022-05-16-11-28-12-702.png > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Provide the capability for the user to explicitly provision a cluster. > Current implementation provisions each cluster at the location specified by > GoogleCloudOptions using 3 worker nodes. There is no explicit API to > configure the number or shape of workers. > We could use the WorkerOptions to allow customers to explicitly provision a > cluster and expose an explicit API (with UX in notebook extension) for > customers to change the size of a cluster connected with their notebook > (until we have an auto scaling solution with Dataproc for Flink). > The API looks like this when configuring the workers for a dataproc cluster > when creating it: > !image-2022-05-16-11-25-32-904.png! > An example request setting the masterConfig and workerConfig: > !image-2022-05-16-11-28-12-702.png! -- This message was sent by Atlassian Jira (v8.20.7#820007)