Re: Problem with gzip

2019-05-14 Thread Niels Basjes
Hi This project is a completely different solution towards this problem, but in the hadoop mapreduce context. https://github.com/nielsbasjes/splittablegzip I have used this a lot in the past. Perhaps porting this project to beam is an option? Niels Basjes On Tue, May 14, 2019, 20:45 Lukasz

Beam-SQL: How to create a UDF that returns a Map ?

2020-02-07 Thread Niels Basjes
wPrinter extends DoFn { @ProcessElement public void processElement(ProcessContext c) { final Row row = c.element(); LOG.info("ROW: {} --> {}", row, row.getSchema()); } } } -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Beam-SQL: How to create a UDF that returns a Map ?

2020-02-07 Thread Niels Basjes
been released ... I created https://issues.apache.org/jira/browse/BEAM-9267 to track this. Niels Basjes On Fri, Feb 7, 2020 at 11:26 AM Niels Basjes wrote: > Hi, > > My context: Java 8 , Beam 2.19.0 > > *TLDR*: How do I create a Beam-SQL UDF that returns a Map > ? >

Terminating a streaming integration test

2020-07-15 Thread Niels Basjes
X_VALUE but that yielded exceptions. So far I have not been able to figure out how to tell the TestPipeline: Finish what you have and shutdown. How do I do that? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Terminating a streaming integration test

2020-07-16 Thread Niels Basjes
age in my stream and let the pipeline shutdown once it sees that? Can you please indicate if this is a valid way of doing this? Thanks. On Wed, Jul 15, 2020 at 8:48 PM Niels Basjes wrote: > Hi, > > I'm testing to see how I can build an integration test for a Beam > a

Re: Terminating a streaming integration test

2020-07-16 Thread Niels Basjes
testing >> reasons). >> >> [1] >> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOIT.java >> [2] >> https://beam.apache.org/documentation/pipelines/create-your-pipeline/#running-your-pipeline >&

Beam SQL UDF with variable arguments list?

2021-01-26 Thread Niels Basjes
10", optional = true) String field10 ) { My question: Is there a better way to do this? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Beam SQL UDF with variable arguments list?

2021-01-26 Thread Niels Basjes
On Tue, Jan 26, 2021 at 2:04 AM Niels Basjes wrote: > >> Hi, >> >> I want to define a Beam SQL user defined function that accepts a variable >> list of arguments (which may be empty). >> >> What I essentially would like to have is >> >> public cla

Problems trying to run Beam on Flink on Yarn

2017-07-19 Thread Niels Basjes
-- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Problems trying to run Beam on Flink on Yarn

2017-07-19 Thread Niels Basjes
a loss where it goes wrong. Niels On Wed, Jul 19, 2017 at 12:34 PM, Niels Basjes wrote: > Hi, > > We have a (Kerberos secured) Yarn cluster on which we run (among lots of > others) our Apache Flink applications. > I'm having trouble getting even the simplest Beam application

Re: Problems trying to run Beam on Flink on Yarn

2017-07-20 Thread Niels Basjes
-inputFile=hdfs:///user/nbasjes/helloworld.txt \ --output=hdfs:///user/nbasjes/hello-counts hdfs dfs -cat hello-counts* And this works too! Thanks for the suggestion! Next step: getting both Kafka and HBase connections running ... :) -- Best regards / Met vriendelijke groeten, Niels Basjes

Writing Parquet / Orc files

2017-09-05 Thread Niels Basjes
back to Flink as there support for these files formats is readily available and works as expected. -- Best regards, Niels Basjes

Routing events by key

2018-07-06 Thread Niels Basjes
ke in the Beam Java API? -- Best regards / Met vriendelijke groeten, Niels Basjes

Re: Routing events by key

2018-07-08 Thread Niels Basjes
nd then see if the runners can support it? Niels Basjes On Sun, Jul 8, 2018 at 1:43 AM, Robert Bradshaw wrote: > Reshuffle for the purpose of ensuring stable inputs is deprecated, but > this seems a valid(ish) usecase. > > Currently, all runners implement GroupByKey by sending everythi