Hi
This project is a completely different solution towards this problem, but
in the hadoop mapreduce context.
https://github.com/nielsbasjes/splittablegzip
I have used this a lot in the past.
Perhaps porting this project to beam is an option?
Niels Basjes
On Tue, May 14, 2019, 20:45 Lukasz
wPrinter extends DoFn {
@ProcessElement
public void processElement(ProcessContext c) {
final Row row = c.element();
LOG.info("ROW: {} --> {}", row, row.getSchema());
}
}
}
--
Best regards / Met vriendelijke groeten,
Niels Basjes
been released ...
I created https://issues.apache.org/jira/browse/BEAM-9267 to track this.
Niels Basjes
On Fri, Feb 7, 2020 at 11:26 AM Niels Basjes wrote:
> Hi,
>
> My context: Java 8 , Beam 2.19.0
>
> *TLDR*: How do I create a Beam-SQL UDF that returns a Map
> ?
>
X_VALUE but that yielded exceptions.
So far I have not been able to figure out how to tell the TestPipeline:
Finish what you have and shutdown.
How do I do that?
--
Best regards / Met vriendelijke groeten,
Niels Basjes
age in my stream
and let the pipeline shutdown once it sees that?
Can you please indicate if this is a valid way of doing this?
Thanks.
On Wed, Jul 15, 2020 at 8:48 PM Niels Basjes wrote:
> Hi,
>
> I'm testing to see how I can build an integration test for a Beam
> a
testing
>> reasons).
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOIT.java
>> [2]
>> https://beam.apache.org/documentation/pipelines/create-your-pipeline/#running-your-pipeline
>&
10", optional = true) String field10
) {
My question: Is there a better way to do this?
--
Best regards / Met vriendelijke groeten,
Niels Basjes
On Tue, Jan 26, 2021 at 2:04 AM Niels Basjes wrote:
>
>> Hi,
>>
>> I want to define a Beam SQL user defined function that accepts a variable
>> list of arguments (which may be empty).
>>
>> What I essentially would like to have is
>>
>> public cla
--
Best regards / Met vriendelijke groeten,
Niels Basjes
a loss where it goes wrong.
Niels
On Wed, Jul 19, 2017 at 12:34 PM, Niels Basjes wrote:
> Hi,
>
> We have a (Kerberos secured) Yarn cluster on which we run (among lots of
> others) our Apache Flink applications.
> I'm having trouble getting even the simplest Beam application
-inputFile=hdfs:///user/nbasjes/helloworld.txt \
--output=hdfs:///user/nbasjes/hello-counts
hdfs dfs -cat hello-counts*
And this works too!
Thanks for the suggestion!
Next step: getting both Kafka and HBase connections running ... :)
--
Best regards / Met vriendelijke groeten,
Niels Basjes
back to Flink as there support
for these files formats is readily available and works as expected.
--
Best regards,
Niels Basjes
ke in the Beam Java API?
--
Best regards / Met vriendelijke groeten,
Niels Basjes
nd then see if the runners can support it?
Niels Basjes
On Sun, Jul 8, 2018 at 1:43 AM, Robert Bradshaw wrote:
> Reshuffle for the purpose of ensuring stable inputs is deprecated, but
> this seems a valid(ish) usecase.
>
> Currently, all runners implement GroupByKey by sending everythi
14 matches
Mail list logo