Hi Kyle,

I noticed the one you linked didn't work when I copied and pasted it, so I
chose the only one in the directory, and that's when I got the error
message.

Thanks!
Joey

On Fri, Jul 9, 2021 at 4:50 PM Kyle Weaver <kcwea...@google.com> wrote:

> Hi Joey, my mistake, I picked the wrong jar. The correct jar should be
> "runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.31.0-SNAPSHOT.jar"
> (or similar depending on your Beam/Flink version choices).
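>
> For what it's worth, here is roughly how those options look if you build
> them in Python instead of on the command line. This is only a minimal
> sketch: the flags mirror the invocation from your last message, the jar
> path is the one above, and it assumes FLINK_MASTER_URL is exported in your
> shell, as in your command.
>
> import os
>
> import apache_beam as beam
> from apache_beam.options.pipeline_options import PipelineOptions
>
> # Assumes FLINK_MASTER_URL is exported and that the job-server jar path
> # below exists relative to your Beam checkout.
> options = PipelineOptions(flags=[
>     "--runner=FlinkRunner",
>     "--flink_master=" + os.environ["FLINK_MASTER_URL"],
>     "--environment_type=DOCKER",
>     "--flink_job_server_jar=runners/flink/1.13/job-server/build/libs/"
>     "beam-runners-flink-1.13-job-server-2.31.0-SNAPSHOT.jar",
> ])
>
> # Trivial pipeline just to exercise job submission end to end.
> with beam.Pipeline(options=options) as p:
>     p | beam.Create(["smoke", "test"]) | beam.Map(print)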
>
> On Fri, Jul 9, 2021 at 1:43 PM Joey Tran <joey.t...@schrodinger.com>
> wrote:
>
>> Hi all,
>>
>> Thank you very much for the responses!
>>
>> I feel a bit better about using Dataproc since it's not in beta like
>> Flink on GKE.
>>
>> I rebuilt the Flink runner as you specified, but I still get an error. I've
>> attached the stdout from trying to run with the patched Flink runner.
>>
>> Here are the instructions to get a cluster started and brought to the
>> state right before I run your patched Flink runner instructions:
>>
>>
>> gcloud dataproc clusters create my-flink-cluster \
>>     --optional-components=FLINK,DOCKER --region=us-central1 \
>>     --image-version=2.0 --enable-component-gateway
>> gcloud compute ssh my-flink-cluster-m
>> curl https://raw.githubusercontent.com/cs109/2015/master/Lectures/Lecture15b/sparklect/shakes/kinglear.txt > kinglear.txt
>> curl https://raw.githubusercontent.com/apache/beam/master/sdks/python/apache_beam/examples/wordcount.py > wordcount.py
>> pip install apache_beam apache_beam[gcp]
>> . /usr/bin/flink-yarn-daemon
>> python wordcount.py --input kinglear.txt --output my_counts \
>>     --runner FlinkRunner --flink_master $FLINK_MASTER_URL \
>>     --environment_type DOCKER \
>>     --flink_job_server_jar beam/runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.32.0-SNAPSHOT.jar
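>>
>> (For context, the pipeline that last command runs boils down to roughly
>> the following. This is only a stripped-down sketch of the stock wordcount
>> example, not the exact wordcount.py I downloaded; the input and output
>> names are hardcoded to the files above, and the runner/Flink flags are
>> taken straight from the command line, as wordcount.py does.)
>>
>> import re
>> import sys
>>
>> import apache_beam as beam
>> from apache_beam.options.pipeline_options import PipelineOptions
>>
>> # Runner/Flink flags (e.g. --runner, --flink_master, --environment_type,
>> # --flink_job_server_jar) come straight from sys.argv.
>> options = PipelineOptions(sys.argv[1:])
>> with beam.Pipeline(options=options) as p:
>>     (p
>>      | beam.io.ReadFromText("kinglear.txt")
>>      | beam.FlatMap(lambda line: re.findall(r"[A-Za-z']+", line))
>>      | beam.combiners.Count.PerElement()
>>      | beam.MapTuple(lambda word, count: "%s: %d" % (word, count))
>>      | beam.io.WriteToText("my_counts"))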
>>
>>
>> Let me know if anything stands out to you. Thanks again for the support!
>> Sorry if I'm missing something silly.
>>
>> On Fri, Jul 9, 2021 at 3:20 PM Kyle Weaver <kcwea...@google.com> wrote:
>>
>>> That's for Java only. Joey was asking about the portable (Python)
>>> example.
>>>
>>> On Fri, Jul 9, 2021 at 12:18 PM Tianzi Cai <tia...@google.com> wrote:
>>>
>>>> Thanks Kyle so much for forwarding.
>>>>
>>>> I was literally just trying this myself and got stuck too
>>>> (b/193180649 <http://b/193180649>). I finally got it all to work.
>>>> Please feel free to share with the customer. I can give them
>>>> repo.reader permission if needed.
>>>>
>>>>    1. Run this command to generate the canonical word count example.
>>>>    mvn archetype:generate \
>>>>        -DarchetypeGroupId=org.apache.beam \
>>>>        -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
>>>>        -DarchetypeVersion=2.30.0 \
>>>>        -DgroupId=org.example \
>>>>        -DartifactId=word-count-beam \
>>>>        -Dversion="0.1" \
>>>>        -Dpackage=org.apache.beam.examples \
>>>>        -DinteractiveMode=false
>>>>    2. Make a few code changes (here
>>>>    <https://source.cloud.google.com/tz-playground-bigdata/word-count-example/+/gcp:>
>>>>    are mine), then make sure that the code works with mvn compile exec:java
>>>>    -Dexec.mainClass=org.apache.beam.examples.WordCount and you can see
>>>>    the aggregated results printed out.
>>>>    3. Run mvn package -Pflink-runner to get the packaged JARs.
>>>>    4. Upload the uber jar word-count-beam-bundled-0.1.jar to a Cloud
>>>>    Storage bucket. SSH into my Dataproc master node. Download the uber jar.
>>>>    5. flink run -c org.apache.beam.examples.WordCount
>>>>    word-count-beam-bundled-0.1.jar --runner=FlinkRunner
>>>>
>>>>
>>>> On Fri, Jul 9, 2021 at 12:11 PM Kyle Weaver <kcwea...@google.com>
>>>> wrote:
>>>>
>>>>> If you're not committed to Dataproc, you may also want to try running
>>>>> it on GKE, which AFAIK doesn't have these issues.
>>>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/beam_guide.md
>>>>>
>>>>> On Fri, Jul 9, 2021 at 12:08 PM Kyle Weaver <kcwea...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Joey,
>>>>>>
>>>>>> Jackson dependency issues are likely
>>>>>> https://issues.apache.org/jira/browse/BEAM-10430. You will have to
>>>>>> manually patch it until a fix is available in an upcoming Beam release.
>>>>>>
>>>>>> 1. Download the Beam source from GitHub
>>>>>> 2. Check out a patch for the issue, such as
>>>>>> https://github.com/apache/beam/pull/14953
>>>>>> 3. Build the Flink runner using the command "./gradlew
>>>>>> :runners:flink:1.13:job-server:shadowJar"
>>>>>> 4. Use the resulting Flink runner jar in your Python pipeline options:
>>>>>> "--flink_job_server_jar=runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.31.0-SNAPSHOT.jar"
>>>>>>
>>>>>> For the "No container id" issue, can you share the full logs?
>>>>>>
>>>>>> +Tianzi Cai <tia...@google.com> +Anthony Mancuso
>>>>>> <amanc...@google.com>
>>>>>>
>>>>>> Thanks,
>>>>>> Kyle
>>>>>>
>>>>>> On Fri, Jul 9, 2021 at 8:47 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> /cc @Kyle Weaver <kcwea...@google.com>
>>>>>>>
>>>>>>> On Fri, Jul 9, 2021 at 5:24 AM Joey Tran <joey.t...@schrodinger.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> I'm just trying to demo Beam/Flink. I followed the instructions for
>>>>>>>> Google's Dataproc, but I get a bunch of errors, ranging from Jackson
>>>>>>>> dependency issues to some issue about "No container id".
>>>>>>>>
>>>>>>>> Does anyone know if these Dataproc instructions[1] are complete? I
>>>>>>>> ran through them pretty much word for word and can't get a simple
>>>>>>>> wordcount going. I'm not sure if I'm somehow messing something up or
>>>>>>>> whether more is needed than what this doc instructs. FWIW, I've been
>>>>>>>> able to run the Java wordcount example fine; it seems like I only run
>>>>>>>> into issues when trying to follow the portable runner instructions.
>>>>>>>>
>>>>>>>> Thanks so much in advance for your help! I'm not very experienced
>>>>>>>> with deploying these kinds of things, but I wanted to do a demo to
>>>>>>>> show that Beam+Flink is a better solution than writing a framework
>>>>>>>> myself.
>>>>>>>> [1]
>>>>>>>> https://cloud.google.com/dataproc/docs/concepts/components/flink#portable_beam_jobs
>>>>>>>>
>>>>>>>
