Lukasz

Thank you for the reply.

> * Using a "remote" filesystem such as HDFS/S3/GCS/...
> * Mounting an external directory into the container so that any "local"
writes appear outside the container
> * Using a non-docker environment such as external or process.

  Understood.

Thanks,
Yu Watanabe

On Fri, Sep 13, 2019 at 2:34 AM Lukasz Cwik <lc...@google.com> wrote:

> When you use a local filesystem path and a docker environment, "/tmp" is
> written inside the container. You can solve this issue by:
> * Using a "remote" filesystem such as HDFS/S3/GCS/...
> * Mounting an external directory into the container so that any "local"
> writes appear outside the container
> * Using a non-docker environment such as external or process.
>
> On Thu, Sep 12, 2019 at 4:51 AM Yu Watanabe <yu.w.ten...@gmail.com> wrote:
>
>> Hello.
>>
>> I would like to ask for help with my sample code using portable runner
>> using apache flink.
>> I was able to work out the wordcount.py using this page.
>>
>> https://beam.apache.org/roadmap/portability/
>>
>> I got below two files under /tmp.
>>
>> -rw-r--r-- 1 ywatanabe ywatanabe    185 Sep 12 19:56
>> py-wordcount-direct-00001-of-00002
>> -rw-r--r-- 1 ywatanabe ywatanabe    190 Sep 12 19:56
>> py-wordcount-direct-00000-of-00002
>>
>> Then I wrote sample code with below steps.
>>
>> 1.Install apache_beam using pip3 separate from source code directory.
>> 2. Wrote sample code as below and named it "test-protable-runner.py".
>> Placed it separate directory from source code.
>>
>> -----------------------------------------------------------------------------------
>> (python) ywatanabe@debian-09-00:~$ ls -ltr
>> total 16
>> drwxr-xr-x 18 ywatanabe ywatanabe 4096 Sep 12 19:06 beam (<- source code
>> directory)
>> -rw-r--r--  1 ywatanabe ywatanabe  634 Sep 12 20:25
>> test-portable-runner.py
>>
>> -----------------------------------------------------------------------------------
>> 3. Executed the code with "python3 test-protable-ruuner.py"
>>
>>
>> ==========================================================================================
>> #!/usr/bin/env
>>
>> import apache_beam as beam
>> from apache_beam.options.pipeline_options import PipelineOptions
>> from apache_beam.io import WriteToText
>>
>>
>> def printMsg(line):
>>
>>     print("OUTPUT: {0}".format(line))
>>
>>     return line
>>
>> options = PipelineOptions(["--runner=PortableRunner",
>> "--job_endpoint=localhost:8099", "--shutdown_sources_on_final_watermark"])
>>
>> p = beam.Pipeline(options=options)
>>
>> output = ( p | 'create' >> beam.Create(["a", "b", "c"])
>>              | beam.Map(printMsg)
>>          )
>>
>> output | 'write' >> WriteToText('/tmp/sample.txt')
>>
>> =======================================================================================
>>
>> Job seemed to went all the way to "FINISHED" state.
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------
>> [DataSource (Impulse) (1/1)] INFO
>> org.apache.flink.runtime.taskmanager.Task - Registering task at network:
>> DataSource (Impulse) (1/1) (9df528f4d493d8c3b62c8be23dc889a8) [DEPLOYING].
>> [DataSource (Impulse) (1/1)] INFO
>> org.apache.flink.runtime.taskmanager.Task - DataSource (Impulse) (1/1)
>> (9df528f4d493d8c3b62c8be23dc889a8) switched from DEPLOYING to RUNNING.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - DataSource
>> (Impulse) (1/1) (9df528f4d493d8c3b62c8be23dc889a8) switched from DEPLOYING
>> to RUNNING.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN MapPartition
>> (MapPartition at [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at
>> core.py:2415>), Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0])
>> (1/1) (0b60f1a9e620b061f85e97998aa4b181) switched from CREATED to SCHEDULED.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN MapPartition
>> (MapPartition at [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at
>> core.py:2415>), Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0])
>> (1/1) (0b60f1a9e620b061f85e97998aa4b181) switched from SCHEDULED to
>> DEPLOYING.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying CHAIN
>> MapPartition (MapPartition at
>> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2415>),
>> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) (attempt #0)
>> to aad5f142-dbb7-4900-a5ae-af8a6fdcc787 @ localhost (dataPort=-1)
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Received task CHAIN
>> MapPartition (MapPartition at
>> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2415>),
>> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1).
>> [DataSource (Impulse) (1/1)] INFO
>> org.apache.flink.runtime.taskmanager.Task - Loading JAR files for task
>> DataSource (Impulse) (1/1) (d51e4171c0af881342eaa8a7ab06def8) [DEPLOYING].
>> [DataSource (Impulse) (1/1)] INFO
>> org.apache.flink.runtime.taskmanager.Task - Registering task at network:
>> DataSource (Impulse) (1/1) (d51e4171c0af881342eaa8a7ab06def8) [DEPLOYING].
>> [DataSource (Impulse) (1/1)] INFO
>> org.apache.flink.runtime.taskmanager.Task - DataSource (Impulse) (1/1)
>> (9df528f4d493d8c3b62c8be23dc889a8) switched from RUNNING to FINISHED.
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> But I ended up with docker error on client side.
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------
>> (python) ywatanabe@debian-09-00:~$ python3 test-portable-runner.py
>> /home/ywatanabe/python/lib/python3.5/site-packages/apache_beam/__init__.py:84:
>> UserWarning: Some syntactic constructs of Python 3 are not yet fully
>> supported by Apache Beam.
>>   'Some syntactic constructs of Python 3 are not yet fully supported by '
>> ERROR:root:java.io.IOException: Received exit code 125 for command
>> 'docker run -d --network=host --env=DOCKER_MAC_CONTAINER=null
>> ywatanabe-docker-apache.bintray.io/beam/python3:latest --id=13-1
>> --logging_endpoint=localhost:39049 --artifact_endpoint=localhost:33779
>> --provision_endpoint=localhost:34827 --control_endpoint=localhost:36079'.
>> stderr: Unable to find image '
>> ywatanabe-docker-apache.bintray.io/beam/python3:latest' locallydocker:
>> Error response from daemon: unknown: Subject ywatanabe was not found.See
>> 'docker run --help'.
>> Traceback (most recent call last):
>>   File "test-portable-runner.py", line 27, in <module>
>>     result.wait_until_finish()
>>   File
>> "/home/ywatanabe/python/lib/python3.5/site-packages/apache_beam/runners/portability/portable_runner.py",
>> line 446, in wait_until_finish
>>     self._job_id, self._state, self._last_error_message()))
>> RuntimeError: Pipeline
>> BeamApp-ywatanabe-0912112516-22d84d63_d3b5e778-d771-4118-9ba7-c9dde85938e9
>> failed in state FAILED: java.io.IOException: Received exit code 125 for
>> command 'docker run -d --network=host --env=DOCKER_MAC_CONTAINER=null
>> ywatanabe-docker-apache.bintray.io/beam/python3:latest --id=13-1
>> --logging_endpoint=localhost:39049 --artifact_endpoint=localhost:33779
>> --provision_endpoint=localhost:34827 --control_endpoint=localhost:36079'.
>> stderr: Unable to find image '
>> ywatanabe-docker-apache.bintray.io/beam/python3:latest' locallydocker:
>> Error response from daemon: unknown: Subject ywatanabe was not found.See
>> 'docker run --help'.
>>
>> -------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> As a result , I got nothing under /tmp . Code works when using
>> DirectRunner.
>> May I ask , where should I look for in order to get the pipeline to write
>> results to text files under /tmp ?
>>
>> Best Regards,
>> Yu Watanabe
>>
>> --
>> Yu Watanabe
>> Weekend Freelancer who loves to challenge building data platform
>> yu.w.ten...@gmail.com
>> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
>> Twitter icon] <https://twitter.com/yuwtennis>
>>
>

-- 
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
yu.w.ten...@gmail.com
[image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
Twitter icon] <https://twitter.com/yuwtennis>

Reply via email to