Juan,

FYI, I am using this, HTH:

    cutoff_date_str = (date.today() - BDay(60)).date().strftime('%Y-%m-%d')
    logging.info('Cutoff is:{}'.format(cutoff_date_str))
    bq_sql = """SELECT TICKER, LABEL, COUNT(*) as COUNTER
                FROM `datascience-projects.gcp_shareloader.stock_selection`
                WHERE AS_OF_DATE > PARSE_DATE("%F", "{}") AND LABEL <> 'STOCK_UNIVERSE'
                GROUP BY TICKER,LABEL """.format(cutoff_date_str)
    logging.info('executing SQL :{}'.format(bq_sql))
    return (p | 'Reading-{}'.format(cutoff_date_str) >> beam.io.Read(
        beam.io.BigQuerySource(query=bq_sql, use_standard_sql=True))
    )

On Sun, Jun 12, 2022 at 5:17 PM Chamikara Jayalath <chamik...@google.com> wrote:

> Please see here for an example pipeline:
> https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py#L90
>
> On Sun, Jun 12, 2022 at 8:54 AM Reuven Lax <re...@google.com> wrote:
>
>> Did you create a pipeline object?
>>
>> On Sun, Jun 12, 2022 at 8:36 AM Vega, Juan <juan.v...@schwab.com> wrote:
>>
>>> I’m trying to read data from a simple query in BigQuery and cannot get
>>> it to work.
>>>
>>> I’m following the steps from this URL:
>>>
>>> https://beam.apache.org/documentation/io/built-in/google-bigquery/
>>>
>>> The process is trying to query a very small table with one column and
>>> five records.
>>>
>>> I have this code from the URL:
>>>
>>> from apache_beam import pipeline
>>> import apache_beam as beam
>>>
>>> customer_id = (
>>>     pipeline
>>>     | 'QueryTable' >> beam.io.ReadFromBigQuery(
>>>         query='SELECT customer_id FROM [cs-clientu-ad00007609-sbx5615:cuwi_acq_int.jv_test_data]')
>>>     # Each row is a dictionary where the keys are the BigQuery columns
>>>     | beam.Map(lambda elem: elem['customer_id']))
>>>
>>> Below is the error output:
>>>
>>> C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\python.exe C:/Users/juan.vega/PycharmProjects/us_1715_BQ_Stg_to_Intg/us_1715_main.py
>>> Traceback (most recent call last):
>>>   File "C:/Users/juan.vega/PycharmProjects/us_1715_BQ_Stg_to_Intg/us_1715_main.py", line 18, in <module>
>>>     | beam.Map(lambda elem: elem['customer_id']))
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 1092, in __ror__
>>>     return self.transform.__ror__(pvalueish, self.label)
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 609, in __ror__
>>>     for (ix, v) in enumerate(pvalues)
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 610, in <dictcomp>
>>>     if not isinstance(v, pvalue.PValue) and v is not None
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\core.py", line 3179, in __init__
>>>     self.values = tuple(values)
>>> TypeError: 'module' object is not iterable
>>>
>>> Process finished with exit code 1
>>>
>>> I’ve tried many things and I really need help.
>>>
>>> Do you have sample code to simply query some data from BigQuery using
>>> Dataflow?
>>>
>>> Thanks.
>>>
>>> Classification: Schwab Internal
>>
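For reference, the traceback in the original question is consistent with `pipeline` being the imported `apache_beam.pipeline` module rather than a `Pipeline` instance, which is what Reuven's question points at. Below is a minimal sketch of the same read with an explicit pipeline object. It is not taken from the thread: the helper names `build_query` and `run`, the standard SQL table format, and the `project` and `temp_location` values are all assumptions for illustration, and a query-based read will likely need a GCS temp location that the job can write to.

```python
def build_query(table: str, column: str) -> str:
    """Build a standard SQL query.

    `table` is assumed to be in `project.dataset.table` form (hypothetical
    helper, not part of Beam).
    """
    return f"SELECT {column} FROM `{table}`"


def run(table: str, column: str, project: str, temp_location: str) -> None:
    """Read one column from BigQuery and print each value.

    Hypothetical wrapper; all four arguments are placeholders to be filled in.
    """
    # Imported inside the function so build_query can be used without Beam.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Assumption: project and temp_location (a gs:// path) are passed as
    # pipeline options; adjust to however your job is configured.
    options = PipelineOptions(project=project, temp_location=temp_location)

    # beam.Pipeline(...) creates the pipeline object that the left-hand side
    # of `|` must be; the `with` block runs the pipeline on exit.
    with beam.Pipeline(options=options) as p:
        (
            p
            | 'QueryTable' >> beam.io.ReadFromBigQuery(
                query=build_query(table, column),
                use_standard_sql=True)
            # Each row is a dictionary keyed by BigQuery column name.
            | 'ExtractColumn' >> beam.Map(lambda row: row[column])
            | 'Print' >> beam.Map(print)
        )
```

A call would look something like `run('my-project.my_dataset.my_table', 'customer_id', 'my-project', 'gs://my-bucket/tmp')`, where every value is a placeholder. The legacy `[project:dataset.table]` form from the original question would need `use_standard_sql=False` instead.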