Getting null pointer exception in a basic setup, don't know why

2021-05-06 Thread Teodor Spæren
Hey! I'm having problems with a program that I used to be able to run just fine with flink, but now I'm getting a null pointer exception. The beam program in question looks like this: package no.spaeren.thesis.benchmarks.beam; import no.spaeren.thesis.benchmarks.beam.helpers.CountSource; im

Re: Potential bug with Kafka (or other external) IO in Python Portable Runner

2021-05-06 Thread Nir Gazit
Hey, Not sure I follow - from the code you sent me, it seems that the environment is chosen according to the pipeline options (JAVA_SDK_HARNESS_ENVIRONMENT is used by createOrGetDefaultEnvironment only AFAIU). So if I'm passing `--environment_type=EXTERNAL` why wouldn't I get to using an external e

Re: Getting null pointer exception in a basic setup, don't know why

2021-05-06 Thread Jan Lukavský
Hi Teodor, can you share (maybe github link, if you have it in public repo) the implementation of CountSource and Printer? What changed in Beam 2.25.0 (if I recall correctly) is how Read transform is translated. It uses SDF now, so there might be something that was broken before, but the chang

RE: Query regarding support for ROLLUP

2021-05-06 Thread D, Anup (Nokia - IN/Bangalore)
Thank you Brian, Andrew for your response. Do you see any alternatives currently in Beam SQL that could be used to achieve this ? From: Andrew Pilloud Sent: Wednesday, May 5, 2021 10:39 PM To: Brian Hulette Cc: user Subject: Re: Query regarding support for ROLLUP I can confirm we don't have a

IllegalStateException with simple Kafka Pipeline

2021-05-06 Thread Nir Gazit
Hey, I'm trying to run a pipeline with the Python SDK that reads from Kafka. I've started with a simple one that just reads messages and prints them to the console. When running on Flink, I get the following error: File "kafka_print.py", line 36, in run_kafka_pipeline | 'Print to console' >> be

Re: IllegalStateException with simple Kafka Pipeline

2021-05-06 Thread Chamikara Jayalath
On Thu, May 6, 2021 at 9:53 AM Nir Gazit wrote: > Hey, > I'm trying to run a pipeline with the Python SDK that reads from Kafka. > I've started with a simple one that just reads messages and prints them to > the console. When running on Flink, I get the following error: > File "kafka_print.py", l

Re: Potential bug with Kafka (or other external) IO in Python Portable Runner

2021-05-06 Thread Chamikara Jayalath
On Thu, May 6, 2021 at 4:58 AM Nir Gazit wrote: > Hey, > Not sure I follow - from the code you sent me, it seems that the > environment is chosen according to the pipeline options > (JAVA_SDK_HARNESS_ENVIRONMENT is used by createOrGetDefaultEnvironment only > AFAIU). So if I'm passing `--environm

Does SnowflakeIO support spark runner

2021-05-06 Thread Tao Li
Hi Beam community, Does SnowflakeIO support the Spark runner? It seems like only the direct runner and the Dataflow runner are supported. Thanks!

Re: Does SnowflakeIO support spark runner

2021-05-06 Thread Kyle Weaver
As far as I know, it should be supported (Beam's abstract model means IOs usually "just work" on all runners). What makes you think it isn't supported? On Thu, May 6, 2021 at 11:52 AM Tao Li wrote: > Hi Beam community, > > > > Does SnowflakeIO support spark runner? Seems like only direct runner

Re: Does SnowflakeIO support spark runner

2021-05-06 Thread Tao Li
Hi @Kyle Weaver According to this doc: --runner= From: Kyle Weaver Reply-To: "user@beam.apache.org" Date: Thursday, May 6, 2021 at 12:01 PM To: "user@beam.apache.org" Cc: Anuj Gandhi Subject: Re: Does

Re: Does SnowflakeIO support spark runner

2021-05-06 Thread Kyle Weaver
Yeah, I'm pretty sure that documentation is just misleading. All of the options from --runner onward are runner-specific and don't have anything to do with Snowflake, so they should probably be removed from the doc. On Thu, May 6, 2021 at 12:06 PM Tao Li wrote: > Hi @Kyle Weaver > > > > Accordi

Re: Does SnowflakeIO support spark runner

2021-05-06 Thread Tao Li
Thanks Kyle! From: Kyle Weaver Date: Thursday, May 6, 2021 at 12:19 PM To: Tao Li Cc: "user@beam.apache.org" , Anuj Gandhi Subject: Re: Does SnowflakeIO support spark runner Yeah, I'm pretty sure that documentation is just misleading. All of the options from --runner onward are runner-specif

Re: Query regarding support for ROLLUP

2021-05-06 Thread Andrew Pilloud
I'm not familiar with the semantics of ROLLUP but the results look like this query, which might work? select warehouse, SUM(quantity) as quantity from PCOLLECTION group by warehouse UNION select "Warehouse_Total", SUM(quantity) as quantity from PCOLLECTION On Thu, May 6, 2021 at 7:07 AM D, Anup (N
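A minimal sketch of the UNION approach suggested in the message above, run against an in-memory SQLite database rather than Beam SQL. The sample rows are hypothetical, and the table is named PCOLLECTION only to mirror the quoted query. Two details differ from the quoted text and are assumptions: the label is written as a single-quoted string literal, and UNION ALL is used so the total row is kept even if it happens to equal a per-warehouse row. Whether the same statement behaves identically in Beam SQL is not confirmed here.

```python
import sqlite3

# Hypothetical sample data; not taken from the thread.
rows = [("San Jose", 10), ("San Jose", 5), ("Oslo", 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PCOLLECTION (warehouse TEXT, quantity INTEGER)")
conn.executemany("INSERT INTO PCOLLECTION VALUES (?, ?)", rows)

# Per-warehouse subtotals plus one grand-total row, emulating a
# single-column ROLLUP with two aggregations joined by UNION ALL.
query = """
SELECT warehouse, SUM(quantity) AS quantity
FROM PCOLLECTION
GROUP BY warehouse
UNION ALL
SELECT 'Warehouse_Total', SUM(quantity) AS quantity
FROM PCOLLECTION
"""

result = conn.execute(query).fetchall()
totals = dict(result)  # maps warehouse label to summed quantity

for label, quantity in sorted(totals.items()):
    print(label, quantity)
```

This covers only the one-column case; a ROLLUP over several columns would need one additional SELECT per grouping level.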

unsubscribe

2021-05-06 Thread Simon Gauld

Batch load with BigQueryIO fails because of a few bad records.

2021-05-06 Thread Matthew Ouyang
I am loading a batch of records with BigQueryIO.Write, but because some records don't match the target table schema, the entire write step fails and nothing gets written to the table. Is there a way for records that do match the target table schema to be inserted, and the records that

Re: unsubscribe

2021-05-06 Thread Brian Hulette
Hey Simon, you can unsubscribe by writing to user-unsubscr...@beam.apache.org [1] [1] https://apache.org/foundation/mailinglists.html#request-addresses-for-unsubscribing On Thu, May 6, 2021 at 2:24 PM Simon Gauld wrote: > >

Re: Batch load with BigQueryIO fails because of a few bad records.

2021-05-06 Thread Evan Galpin
Hey Matthew, I believe you might also need to use the “ignoreUnknownValues”[1] or skipInvalidRows[2] options depending on your use case if your goal is to allow valid entities to succeed even if invalid entities exist and separately process failed entities via “getFailedResults”. You could also co