[ 
https://issues.apache.org/jira/browse/BEAM-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222976#comment-17222976
 ] 

Xinbin Huang edited comment on BEAM-7763 at 10/29/20, 3:50 PM:
---------------------------------------------------------------

[~udim] I tried to reproduce this, but it doesn't seem to be reproducible on my 
end. Do you think this may already be solved in other parts of the codebase?

Test script that I used:
{code:bash}
#! /usr/bin/sh
set -a
LOCAL_TEXT_FILE=README.md

INPUT_TOPIC=<input-topic>
OUTPUT_TOPIC=<output-topic>
set +a


function create_topics {
    gcloud pubsub topics create $INPUT_TOPIC
    gcloud pubsub topics create $OUTPUT_TOPIC
}


function publish_words {
    echo "Publishing words to topic ${INPUT_TOPIC}"
    cat $LOCAL_TEXT_FILE | while read line; do gcloud pubsub topics publish 
$INPUT_TOPIC --message "$line"; done
}

function run_wc_beam_streaming {
    python -m apache_beam.examples.streaming_wordcount \
    --input_topic $INPUT_TOPIC \
    --output_topic $OUTPUT_TOPIC \
    --streaming
}


############################


create_topics
run_wc_beam_streaming &
publish_words
{code}



was (Author: xbhuang):
[~udim] I tried to reproduce this, but it doesn't seem to be reproducible on my 
end. Do you think this may already be solved in other parts of the codebase?


{code:bash}
#! /usr/bin/sh
set -a
LOCAL_TEXT_FILE=README.md

INPUT_TOPIC=<input-topic>
OUTPUT_TOPIC=<output-topic>
set +a


function create_topics {
    gcloud pubsub topics create $INPUT_TOPIC
    gcloud pubsub topics create $OUTPUT_TOPIC
}


function publish_words {
    echo "Publishing words to topic ${INPUT_TOPIC}"
    cat $LOCAL_TEXT_FILE | while read line; do gcloud pubsub topics publish 
$INPUT_TOPIC --message "$line"; done
}

function run_wc_beam_streaming {
    python -m apache_beam.examples.streaming_wordcount \
    --input_topic $INPUT_TOPIC \
    --output_topic $OUTPUT_TOPIC \
    --streaming
}


############################


create_topics
run_wc_beam_streaming &
publish_words
{code}


> Python DirectRunner _PubSubReadEvaluator creates new client per bundle
> ----------------------------------------------------------------------
>
>                 Key: BEAM-7763
>                 URL: https://issues.apache.org/jira/browse/BEAM-7763
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Udi Meiri
>            Priority: P3
>              Labels: easy
>
> Lots of credential fetches.
> Similar to https://issues.apache.org/jira/browse/BEAM-2264
> but in this case the DirectRunner implementation seems to be creating a new 
> client for each bundle:
> https://github.com/apache/beam/blob/d5d7a7b7d0408d8435031e7bfce1abe2227115f5/sdks/python/apache_beam/runners/direct/transform_evaluator.py#L474
> From: 
> https://stackoverflow.com/questions/57010426/dataflow-access-to-pubsub-access-tokens



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to