Re: Not Able to Get Code to Work for BigQuery using DataFlow

Sofia’s World Tue, 14 Jun 2022 10:07:23 -0700

Juan, FYI, I am using this. Hope it helps.

from datetime import date
import logging

import apache_beam as beam
from pandas.tseries.offsets import BDay


def read_stock_selection(p):  # wrapper name assumed; the def line was not in the original post
  # Only rows newer than 60 business days ago are read.
  cutoff_date_str = (date.today() - BDay(60)).date().strftime('%Y-%m-%d')
  logging.info('Cutoff is:{}'.format(cutoff_date_str))
  bq_sql = """SELECT TICKER, LABEL, COUNT(*) as COUNTER
      FROM `datascience-projects.gcp_shareloader.stock_selection`
      WHERE AS_OF_DATE > PARSE_DATE("%F", "{}") AND LABEL <> 'STOCK_UNIVERSE'
      GROUP BY TICKER, LABEL
  """.format(cutoff_date_str)
  logging.info('executing SQL :{}'.format(bq_sql))
  return (p
          | 'Reading-{}'.format(cutoff_date_str) >> beam.io.Read(
              beam.io.BigQuerySource(query=bq_sql, use_standard_sql=True)))
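
In case pandas is not available on the workers, here is a rough stdlib-only sketch of the cutoff calculation. The helper name `business_days_ago` is hypothetical, and it only skips Saturdays and Sundays (no holiday calendar), so treat it as an approximation of the `BDay(60)` step rather than a drop-in replacement.

```python
from datetime import date, timedelta


def business_days_ago(n: int, start: date) -> date:
    """Step back `n` weekdays (Mon-Fri) from `start`, ignoring holidays."""
    current = start
    remaining = n
    while remaining > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def cutoff_string(n: int, start: date) -> str:
    """Format the cutoff as YYYY-MM-DD, the shape PARSE_DATE("%F", ...) expects."""
    return business_days_ago(n, start).strftime('%Y-%m-%d')
```

Calling `cutoff_string(60, date.today())` would give the string that gets formatted into the query.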




On Sun, Jun 12, 2022 at 5:17 PM Chamikara Jayalath <chamik...@google.com>
wrote:

> Please see here for an example pipeline:
> https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py#L90
>
> On Sun, Jun 12, 2022 at 8:54 AM Reuven Lax <re...@google.com> wrote:
>
>> Did you create a pipeline object?
>>
>> On Sun, Jun 12, 2022 at 8:36 AM Vega, Juan <juan.v...@schwab.com> wrote:
>>
>>> I’m trying to read data from a simple query in BigQuery and cannot get
>>> it to work.
>>>
>>> I’m following the steps from this URL:
>>> https://beam.apache.org/documentation/io/built-in/google-bigquery/
>>>
>>> The process is trying to query a very small table with one column and
>>> five records.
>>>
>>> I have this code from the URL:
>>>
>>> from apache_beam import pipeline
>>> import apache_beam as beam
>>>
>>> customer_id = (
>>>     pipeline
>>>     | 'QueryTable' >> beam.io.ReadFromBigQuery(
>>>         query='SELECT customer_id FROM [cs-clientu-ad00007609-sbx5615:cuwi_acq_int.jv_test_data]')
>>>     # Each row is a dictionary where the keys are the BigQuery columns
>>>     | beam.Map(lambda elem: elem['customer_id']))
>>>
>>> Below is the error output:
>>>
>>> C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\python.exe C:/Users/juan.vega/PycharmProjects/us_1715_BQ_Stg_to_Intg/us_1715_main.py
>>> Traceback (most recent call last):
>>>   File "C:/Users/juan.vega/PycharmProjects/us_1715_BQ_Stg_to_Intg/us_1715_main.py", line 18, in <module>
>>>     | beam.Map(lambda elem: elem['customer_id']))
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 1092, in __ror__
>>>     return self.transform.__ror__(pvalueish, self.label)
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 609, in __ror__
>>>     for (ix, v) in enumerate(pvalues)
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\ptransform.py", line 610, in <dictcomp>
>>>     if not isinstance(v, pvalue.PValue) and v is not None
>>>   File "C:\Users\juan.vega\AppData\Local\Continuum\anaconda3\lib\site-packages\apache_beam\transforms\core.py", line 3179, in __init__
>>>     self.values = tuple(values)
>>> TypeError: 'module' object is not iterable
>>>
>>> Process finished with exit code 1
>>>
>>> I’ve tried many things and I really need help.
>>>
>>> Do you have sample code to simply query some data from BigQuery using
>>> Dataflow?
>>>
>>> Thanks.
>>>
>>> Classification: Schwab Internal
>>>
>>
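
On the TypeError in the original message quoted above: `from apache_beam import pipeline` binds the name `pipeline` to a module, so `pipeline | ...` hands Beam a module rather than a Pipeline object, which is what Reuven's question points at. Here is a minimal sketch of the same read with an explicit pipeline object. The helper names `build_query` and `run` are hypothetical, and the table and column are passed in by the caller. It assumes the pipeline arguments supply a GCP project and a `--temp_location`, which the query-based read typically needs for its export files.

```python
def build_query(table: str, column: str) -> str:
    """Return a standard-SQL query selecting one column from `table`.

    `table` is expected in project.dataset.table form.
    """
    return 'SELECT {} FROM `{}`'.format(column, table)


def run(table: str, column: str, pipeline_args=None):
    # Imported lazily so build_query stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    # The key step: create a Pipeline object instead of piping from the
    # apache_beam.pipeline module.
    with beam.Pipeline(options=options) as p:
        (p
         | 'QueryTable' >> beam.io.ReadFromBigQuery(
             query=build_query(table, column), use_standard_sql=True)
         # Each row is a dictionary keyed by BigQuery column name.
         | 'ExtractColumn' >> beam.Map(lambda row: row[column])
         | 'Print' >> beam.Map(print))
```

Something like `run('my-project.my_dataset.my_table', 'customer_id', ['--project=my-project', '--temp_location=gs://my-bucket/tmp'])` would start it, with placeholder project, table, and bucket names swapped for real ones.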
