[GitHub] flink issue #3232: [FLINK-5183] [py] Support mulitple jobs per plan file

GEOFBOT Thu, 09 Feb 2017 05:14:44 -0800

Github user GEOFBOT commented on the issue:

    https://github.com/apache/flink/pull/3232
  
    While getting a more complex job to run, I tested a basic multi-job file
    (shown below) on an Amazon EMR YARN cluster, and it ran successfully.
    
    ```
    from flink.plan.Environment import get_environment
    from flink.plan.Constants import INT, STRING, WriteMode
    from flink.functions.GroupReduceFunction \
        import GroupReduceFunction
    from flink.functions.Aggregation import Sum
    
    import sys
    
    if __name__ == "__main__":
        output_file = 'hdfs:/tmp/out.txt'
        output_file2 = 'hdfs:/tmp/out2.txt'
    
        # Job 1: sum field 1 per distinct value of field 0, write as CSV.
        env = get_environment()
        data = env.from_elements((0,1),(1,2),(2,3),(2,10))
    
        data \
            .group_by(0) \
            .aggregate(Sum, 1) \
            .write_csv(output_file, write_mode=WriteMode.OVERWRITE)
        env.execute()
    
        # Job 2: a second environment in the same plan file reads the
        # output of job 1 and writes its first two records as text.
        env2 = get_environment()
        data2 = env2.read_csv(output_file, (INT, INT))
    
        data2 \
            .first(2) \
            .write_text(output_file2, write_mode=WriteMode.OVERWRITE)
    
        env2.execute()
    ```
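
For reference, here is a plain-Python sketch (no Flink involved) of what the two jobs above should compute on the four input tuples. The helper names `sum_by_key` and `first_n` are hypothetical and exist only for this illustration. The sketch sorts by key to get a deterministic result, whereas the order of the cluster output and the records picked by `first(2)` are not guaranteed to match.

```python
from collections import defaultdict
from itertools import islice


def sum_by_key(records, key_index=0, value_index=1):
    """Emulate group_by(key_index).aggregate(Sum, value_index) locally."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[value_index]
    # Sorted only to make this local emulation deterministic.
    return sorted(totals.items())


def first_n(records, n):
    """Emulate first(n) by taking the first n records of a local list."""
    return list(islice(records, n))


# Same elements as passed to env.from_elements(...) in the plan file.
records = [(0, 1), (1, 2), (2, 3), (2, 10)]

job1_output = sum_by_key(records)      # what job 1 writes as CSV
job2_output = first_n(job1_output, 2)  # what job 2 writes as text

print(job1_output)
print(job2_output)
```

The point of the comparison is that the second job only sees data that the first job has already written, so the two `execute()` calls have to run in sequence.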



