Thanks all! Sent https://github.com/apache/beam/pull/12619 to cherry-pick into 2.24.
On Mon, Aug 17, 2020 at 3:37 PM Robert Bradshaw <rober...@google.com> wrote:

I checked Java; it looks like the way things are structured, we do not have that bug there.

On Mon, Aug 17, 2020 at 3:31 PM Robert Bradshaw <rober...@google.com> wrote:

+1

Thanks, Eugene, for finding and fixing this!

FWIW, most use of Python from the Python Portable Runner used the embedded environment (this is the default direct runner), so dependencies are already present.

On Mon, Aug 17, 2020 at 3:19 PM Daniel Oliveira <danolive...@google.com> wrote:

Normally I'd say not to cherry-pick this, since the issue only affects one runner and isn't really a regression, but given that it's the last Py2 release and there won't be a follow-up release that will be able to include this fix, I think it's worth making an exception this time. There should be at least one release with a working portable runner for Py2 users, given that the portable runner is included in examples on the website. Plus, I'm already waiting for some other cherry-picks, so it won't even delay anything.

So yes, Eugene, if you could create a cherry-pick of this change into the release-2.24.0 branch, I'll review and merge it.

On Mon, Aug 17, 2020 at 11:27 AM Valentyn Tymofieiev <valen...@google.com> wrote:

Will defer to the release manager; one reason to cherry-pick is that 2.24.0 will be the last release with Python 2 support, so Py2 users of the Portable Python Local Runner might appreciate the fix, since they won't be able to use the next release.

On Thu, Aug 13, 2020 at 6:28 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

+Daniel, as in charge of 2.24 per the dev@ thread.

On Thu, Aug 13, 2020 at 6:24 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

The PR is merged.

Do folks think this warrants being cherry-picked into v2.24? My hunch is yes, because basically one of the runners (the local portable Python runner) is broken for any production workload (it works only if your pipeline has no dependencies).

On Thu, Aug 13, 2020 at 12:56 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

FWIW, I sent a PR to fix this: https://github.com/apache/beam/pull/12571

However, I'm not up to date on the portable test infrastructure and would appreciate guidance on what tests I can add for this.

On Tue, Aug 11, 2020 at 5:28 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

(FYI Sam +sbrot...@gmail.com)

On Tue, Aug 11, 2020 at 5:00 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

OK, I found the bug, and now I don't understand how it could have possibly ever worked. And if this was never tested, then I don't understand why it works after fixing this one bug :)

Basically, the Python ArtifactStaging/RetrievalService uses FileSystems.open() to read the artifacts to be staged, and FileSystems.open() by default decompresses compressed files based on their extension.

I found two such services - one in Python and one in Java. Is the Python one used with an embedded job endpoint and the Java one otherwise? I haven't inspected the Java one, but fixing Python does the trick.
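For context, a minimal sketch of the behavior (the cache path below is hypothetical): with the default compression_type (AUTO), FileSystems.open() infers compression from the .gz extension and transparently gunzips the artifact, whereas passing CompressionTypes.UNCOMPRESSED returns the raw gzipped bytes that pip expects.

from apache_beam.io import filesystems
from apache_beam.io.filesystems import CompressionTypes

# Hypothetical path to a cached requirements artifact.
path = '/tmp/dataflow-requirements-cache/alembic-1.4.2.tar.gz'

# Default (CompressionTypes.AUTO): compression is inferred from the .gz
# extension, so the stream yields the decompressed tar bytes.
auto_bytes = filesystems.FileSystems.open(path).read()

# Explicitly UNCOMPRESSED: the raw gzipped bytes pass through unchanged,
# so the staged file remains a valid .tar.gz for pip.
raw_bytes = filesystems.FileSystems.open(
    path, compression_type=CompressionTypes.UNCOMPRESSED).read()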
The fix is this patch:

diff --git a/sdks/python/apache_beam/runners/portability/artifact_service.py b/sdks/python/apache_beam/runners/portability/artifact_service.py
index f2bbf534c3..1f3ec1c0b0 100644
--- a/sdks/python/apache_beam/runners/portability/artifact_service.py
+++ b/sdks/python/apache_beam/runners/portability/artifact_service.py
@@ -41,6 +41,7 @@ import grpc
 from future.moves.urllib.request import urlopen
 
 from apache_beam.io import filesystems
+from apache_beam.io.filesystems import CompressionTypes
 from apache_beam.portability import common_urns
 from apache_beam.portability.api import beam_artifact_api_pb2
 from apache_beam.portability.api import beam_artifact_api_pb2_grpc
@@ -263,7 +264,8 @@ class BeamFilesystemHandler(object):
     self._root = root
 
   def file_reader(self, path):
-    return filesystems.FileSystems.open(path)
+    return filesystems.FileSystems.open(
+        path, compression_type=CompressionTypes.UNCOMPRESSED)
 
   def file_writer(self, name=None):
     full_path = filesystems.FileSystems.join(self._root, name)
diff --git a/sdks/python/apache_beam/runners/portability/fn_api_runner/worker_handlers.py b/sdks/python/apache_beam/runners/portability/fn_api_runner/worker_handlers.py
index 5bf3282250..2684235be0 100644
--- a/sdks/python/apache_beam/runners/portability/fn_api_runner/worker_handlers.py
+++ b/sdks/python/apache_beam/runners/portability/fn_api_runner/worker_handlers.py
@@ -45,6 +45,7 @@ from typing import overload
 import grpc
 
 from apache_beam.io import filesystems
+from apache_beam.io.filesystems import CompressionTypes
 from apache_beam.portability import common_urns
 from apache_beam.portability import python_urns
 from apache_beam.portability.api import beam_artifact_api_pb2_grpc
@@ -464,9 +465,13 @@ class GrpcServer(object):
             self.provision_info.provision_info, worker_manager),
         self.control_server)
 
+    def open_uncompressed(f):
+      return filesystems.FileSystems.open(
+          f, compression_type=CompressionTypes.UNCOMPRESSED)
+
     beam_artifact_api_pb2_grpc.add_ArtifactRetrievalServiceServicer_to_server(
         artifact_service.ArtifactRetrievalService(
-            file_reader=filesystems.FileSystems.open),
+            file_reader=open_uncompressed),
         self.control_server)
 
     self.data_plane_handler = data_plane.BeamFnDataServicer(

On Tue, Aug 11, 2020 at 1:43 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

Hi Maximilian,

Thank you - it works fine with the embedded Flink runner (per below, it seems like it's not using Docker for running Python code? What is it using then?).

However, the original bug appears to be wider than I thought - it is also present if I run --runner=FlinkRunner --environment_type=DOCKER.
> Seems like something is very broken in local Docker execution in general - > I haven't yet verified whether the same error will happen when running on a > remote Flink cluster. > > >>>>>>>> > > >>>>>>>> Trying to build my own SDK containers with some more debugging > so I can figure out what's going on... > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> On Tue, Aug 11, 2020 at 7:53 AM Maximilian Michels < > m...@apache.org> wrote: > > >>>>>>>>> > > >>>>>>>>> Looks like you ran into a bug. > > >>>>>>>>> > > >>>>>>>>> You could just run your program without specifying any > arguments, since > > >>>>>>>>> running with Python's FnApiRunner should be enough. > > >>>>>>>>> > > >>>>>>>>> Alternatively, how about trying to run the same pipeline with > the > > >>>>>>>>> FlinkRunner? Use: --runner=FlinkRunner and do not specify an > endpoint. > > >>>>>>>>> It will run the Python code embedded (loopback environment) > without > > >>>>>>>>> additional containers. > > >>>>>>>>> > > >>>>>>>>> Cheers, > > >>>>>>>>> Max > > >>>>>>>>> > > >>>>>>>>> On 10.08.20 21:59, Eugene Kirpichov wrote: > > >>>>>>>>> > Thanks Valentyn! > > >>>>>>>>> > > > >>>>>>>>> > Good to know that this is a bug (I'll file a bug), and that > Dataflow has > > >>>>>>>>> > an experimental way to use custom containers. I'll try that. > > >>>>>>>>> > > > >>>>>>>>> > On Mon, Aug 10, 2020 at 12:51 PM Valentyn Tymofieiev > > >>>>>>>>> > <valen...@google.com <mailto:valen...@google.com>> wrote: > > >>>>>>>>> > > > >>>>>>>>> > Hi Eugene, > > >>>>>>>>> > > > >>>>>>>>> > Good to hear from you. The experience you are describing > on Portable > > >>>>>>>>> > Runner + Docker container in local execution mode is > most certainly > > >>>>>>>>> > a bug, if you have not opened an issue on it, please do > so and feel > > >>>>>>>>> > free to cc me. > > >>>>>>>>> > > > >>>>>>>>> > I can also reproduce the bug and likewise didn't see > anything > > >>>>>>>>> > obvious immediately, this needs some debugging. > > >>>>>>>>> > > > >>>>>>>>> > cc: +Ankur Goenka <mailto:goe...@google.com> +Kyle > Weaver > > >>>>>>>>> > <mailto:kcwea...@google.com> who recently worked on > Portable Runner > > >>>>>>>>> > and may be interested. > > >>>>>>>>> > > > >>>>>>>>> > By the way, you should be able to use custom containers > with > > >>>>>>>>> > Dataflow, if you set --experiments=use_runner_v2. > > >>>>>>>>> > > > >>>>>>>>> > On Mon, Aug 10, 2020 at 9:06 AM Eugene Kirpichov > > >>>>>>>>> > <ekirpic...@gmail.com <mailto:ekirpic...@gmail.com>> > wrote: > > >>>>>>>>> > > > >>>>>>>>> > (cc'ing Sam with whom I'm working on this atm) > > >>>>>>>>> > > > >>>>>>>>> > FWIW I'm still stumped. I've looked through Python, > Go and Java > > >>>>>>>>> > code in the Beam repo having anything to do with > > >>>>>>>>> > gzipping/unzipping, and none of it appears to be > used in the > > >>>>>>>>> > artifact staging/retrieval codepaths. I also can't > find any > > >>>>>>>>> > mention of compression/decompression in the > container boot code. > > >>>>>>>>> > My next step will be to add a bunch of debugging, > rebuild the > > >>>>>>>>> > containers, and see what the artifact services think > they're > > >>>>>>>>> > serving. > > >>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>>> > On Fri, Aug 7, 2020 at 6:47 PM Eugene Kirpichov > > >>>>>>>>> > <ekirpic...@gmail.com <mailto:ekirpic...@gmail.com>> > wrote: > > >>>>>>>>> > > > >>>>>>>>> > Thanks Austin! 
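One quick way to check what the staging codepath actually reads for a cached artifact is a small sketch like this (the cache path is hypothetical): read the first two bytes the same way the artifact service does and compare them with the gzip magic bytes 0x1f 0x8b; if they are missing, the file has already been decompressed somewhere along the way.

from apache_beam.io import filesystems

# Hypothetical path to one of the cached requirement tarballs.
path = '/tmp/dataflow-requirements-cache/alembic-1.4.2.tar.gz'

# Open with the defaults, as the artifact retrieval code does, and peek
# at the first two bytes.
f = filesystems.FileSystems.open(path)
magic = f.read(2)
f.close()

# A gzip stream always starts with 0x1f 0x8b.
print('still gzipped' if magic == b'\x1f\x8b' else 'already decompressed')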
On Fri, Aug 7, 2020 at 6:47 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

Thanks Austin! Good stuff - though note that I am /not/ using custom containers; I'm just trying to get the basic stuff to work, a Python pipeline with a simple requirements.txt file. Feels like this should work out of the box, so I must be doing something wrong.

On Fri, Aug 7, 2020 at 6:38 PM Austin Bennett <whatwouldausti...@gmail.com> wrote:

I believe only @OrielResearch Eila Arich-Landkof <e...@orielresearch.org> is potentially doing applied work with custom containers (there must be others)!

As a plug for her and @BeamSummit -- I think enough related material will be talked about (with Conda specifics) in https://2020.beamsummit.org/sessions/workshop-using-conda-on-beam/

I'm sure others will have more things to say that are actually helpful, on-list, before that occurs (~3 weeks).

On Fri, Aug 7, 2020 at 6:32 PM Eugene Kirpichov <ekirpic...@gmail.com> wrote:

Hi old Beam friends,

I left Google to work on climate change <https://www.linkedin.com/posts/eugenekirpichov_i-am-leaving-google-heres-a-snipped-to-activity-6683408492444962816-Mw5U> and am now doing a short engagement with Pachama <https://pachama.com/>. Right now I'm trying to get a Beam Python pipeline to work; the pipeline will use fancy requirements and native dependencies, and we plan to run it on Cloud Dataflow (so custom containers are not yet an option), so I'm going straight for the direct PortableRunner as per https://beam.apache.org/documentation/runtime/environments/.

Basically, I can't get a minimal Beam program with a minimal requirements.txt file to work - the .tar.gz of the dependency mysteriously ends up being ungzipped and non-installable inside the Docker container running the worker. Details below.
=== main.py ===

import argparse
import logging

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions


def run(argv=None):
  parser = argparse.ArgumentParser()
  known_args, pipeline_args = parser.parse_known_args(argv)

  pipeline_options = PipelineOptions(pipeline_args)
  pipeline_options.view_as(SetupOptions).save_main_session = True

  with beam.Pipeline(options=pipeline_options) as p:
    (p | 'Create' >> beam.Create(['Hello'])
       | 'Write' >> beam.io.WriteToText('/tmp'))


if __name__ == '__main__':
  logging.getLogger().setLevel(logging.INFO)
  run()

=== requirements.txt ===

alembic

When I run the program:

$ python3 main.py --runner=PortableRunner --job_endpoint=embed --requirements_file=requirements.txt

I get some normal output and then:

INFO:apache_beam.runners.portability.fn_api_runner.worker_handlers:b'
  File "/usr/local/lib/python3.7/site-packages/pip/_internal/utils/unpacking.py", line 261, in unpack_file
    untar_file(filename, location)
  File "/usr/local/lib/python3.7/site-packages/pip/_internal/utils/unpacking.py", line 177, in untar_file
    tar = tarfile.open(filename, mode)
  File "/usr/local/lib/python3.7/tarfile.py", line 1591, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/local/lib/python3.7/tarfile.py", line 1648, in gzopen
    raise ReadError("not a gzip file")
tarfile.ReadError: not a gzip file
2020/08/08 01:17:07 Failed to install required packages: failed to install requirements: exit status 2'

This greatly puzzled me and, after some looking, I found something really surprising. Here is the package in the /directory to be staged/:

$ file /var/folders/07/j09mnhmd2q9_kw40xrbfvcg80000gn/T/dataflow-requirements-cache/alembic-1.4.2.tar.gz
...: gzip compressed data, was "dist/alembic-1.4.2.tar", last modified: Thu Mar 19 21:48:31 2020, max compression, original size modulo 2^32 4730880
$ ls -l /var/folders/07/j09mnhmd2q9_kw40xrbfvcg80000gn/T/dataflow-requirements-cache/alembic-1.4.2.tar.gz
-rw-r--r--  1 jkff  staff  1092045 Aug  7 16:56 ...

So far so good.
But here is the same file inside the Docker container (I ssh'd into the dead container <https://thorsten-hans.com/how-to-run-commands-in-stopped-docker-containers>):

# file /tmp/staged/alembic-1.4.2.tar.gz
/tmp/staged/alembic-1.4.2.tar.gz: POSIX tar archive (GNU)
# ls -l /tmp/staged/alembic-1.4.2.tar.gz
-rwxr-xr-x 1 root root 4730880 Aug  8 01:17 /tmp/staged/alembic-1.4.2.tar.gz

The file has clearly been unzipped, and now of course pip can't install it! What's going on here? Am I using the direct/portable runner combination wrong?

Thanks!

--
Eugene Kirpichov
http://www.linkedin.com/in/eugenekirpichov