I am feeling a bit stupid, but I haven't had time to try out the different
possibilities to model the Kafka -> Partitioned-Table-in-BQ pipeline in
Beam until now.
I am using the 2.10 snapshot version at the moment and your comment was on
point: after rewriting the pipeline (which limits it to dea
Thanks Tobi, very interesting! Sorry for the long gaps in email, I am based
out of Singapore :-)
I wonder if this is coming from the use of .to(
partitionedTableDynamicDestinations), which I am guessing accesses the
IntervalWindow. With the use of the time-column-based partitioning in my
pipeline
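For reference, a minimal sketch of the kind of window-based table function such a dynamic-destinations setup might use (all project, table, and partition names here are made up; the cast in the middle is the likely failure point once windowing is removed):

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
import org.apache.beam.sdk.values.ValueInSingleWindow;
import org.joda.time.format.DateTimeFormat;

// Hypothetical stand-in for partitionedTableDynamicDestinations: route each
// element to a daily partition derived from its window.
SerializableFunction<ValueInSingleWindow<TableRow>, TableDestination> toPartition =
    value -> {
      // Only valid for windowed input; under the GlobalWindow (i.e. with the
      // windowing commented out) this cast throws a ClassCastException.
      IntervalWindow window = (IntervalWindow) value.getWindow();
      String day = DateTimeFormat.forPattern("yyyyMMdd").print(window.start());
      // "$" addresses a single partition of an ingestion-time-partitioned table.
      return new TableDestination("my-project:my_dataset.events$" + day, null);
    };
```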
Rick,
I think “spark.streaming.kafka.maxRatePerPartition” won’t work for you since,
afaik, it’s a configuration option of the Spark Kafka reader and Beam KafkaIO
doesn’t use it (since it has its own consumer implementation).
At the same time, if you want to set an option for the Beam KafkaIO consumer confi
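For reference, a minimal sketch of passing options through to KafkaIO's own consumer (broker, topic, and the chosen option are assumptions; updateConsumerProperties is the Beam 2.10-era API for this):

```java
import com.google.common.collect.ImmutableMap;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.kafka.common.serialization.StringDeserializer;

// Settings in this map are handed verbatim to KafkaIO's internal Kafka
// consumer, e.g. capping how many records a single poll may return.
KafkaIO.Read<String, String> reader =
    KafkaIO.<String, String>read()
        .withBootstrapServers("broker1:9092")
        .withTopic("events")
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .updateConsumerProperties(
            ImmutableMap.<String, Object>of("max.poll.records", 500));
```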
That was it Juan Carlos. Can't thank you enough.
I now have generic Kettle transformations running on the Direct Runner,
Dataflow and Spark.
Cheers,
Matt
---
Matt Casters <mattcast...@gmail.com>
Senior Solution Architect, Kettle Project Founder
On Tue, 29 Jan 2019 at 18:19, Matt Casters wrote:
No you're right. I got so focused on getting
org.apache.hadoop.fs.FileSystem in order that I forgot about the other
files. Doh!
Thanks for the tip!
---
Matt Casters <mattcast...@gmail.com>
Senior Solution Architect, Kettle Project Founder
On Tue, 29 Jan 2019 at 16:41, Juan Carlos Garcia wrote:
Hi Matt,
Are the META-INF/services files merged correctly in the fat jar?
On Tue, Jan 29, 2019 at 2:33 PM Matt Casters wrote:
> Hi Beam!
>
> After I have my pipeline created and the execution started on the Spark
> master I get this strange nugget:
>
> java.lang.IllegalStateException: No Tran
Hi Tobias,
It is normal to see "No restore state for UnbounedSourceWrapper" when not
restoring from a checkpoint/savepoint.
Just checking: you mentioned you set the checkpoint interval via:
--checkpointingInterval=300000
That means you have to wait 5 minutes until the first checkpoint will
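For reference, the same interval can also be set programmatically on the Flink runner's options (a sketch; the 5-minute value follows from the thread):

```java
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

FlinkPipelineOptions options =
    PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
// Milliseconds between Flink checkpoints; equivalent to passing
// --checkpointingInterval=300000 on the command line.
options.setCheckpointingInterval(300_000L);
```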
Hi Beam!
After I have my pipeline created and the execution started on the Spark
master I get this strange nugget:
java.lang.IllegalStateException: No TransformEvaluator registered for
BOUNDED transform Read(CompressedSource)
I'm reading from an HDFS file (hdfs:///input/customers-noheader-1M.txt)
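For context, a read along those lines would look roughly like this (a sketch; it assumes the beam-sdks-java-io-hadoop-file-system module is on the classpath so the hdfs:// scheme resolves):

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// HadoopFileSystemOptions registers the hdfs:// filesystem with the pipeline.
HadoopFileSystemOptions options =
    PipelineOptionsFactory.fromArgs(args).as(HadoopFileSystemOptions.class);
Pipeline p = Pipeline.create(options);
p.apply("ReadCustomers",
    TextIO.read().from("hdfs:///input/customers-noheader-1M.txt"));
```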
I am using FILE_LOADS and no timestamp column, but partitioned tables. In
this case I have seen the following error when I comment the windowing out
(direct runner):
Exception in thread "main"
org.apache.beam.sdk.Pipeline$PipelineExecutionException:
java.lang.ClassCastException:
org.apache.beam.s
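Since the table in question is partitioned on a TIMESTAMP column (see Reza's note further down), here is a sketch of what a column-partitioned FILE_LOADS write can look like, with no window-derived "$" decorator involved (the table spec, column name, and the preexisting `rows` PCollection<TableRow> and `schema` are assumptions):

```java
import com.google.api.services.bigquery.model.TimePartitioning;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.joda.time.Duration;

rows.apply(
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.events")  // plain table spec, no "$" decorator
        .withSchema(schema)
        .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
        // Required when FILE_LOADS is used on an unbounded input.
        .withTriggeringFrequency(Duration.standardMinutes(5))
        .withNumFileShards(10)
        // BigQuery routes each row by its TIMESTAMP column, not by the window.
        .withTimePartitioning(
            new TimePartitioning().setType("DAY").setField("event_ts"))
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
```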
OK folks, I figured it out.
For the other people desperately clinging to years-old Google results in
the hope of finding any hint...
The Spark requirement to work with a fat jar caused a collision in the
packaging on the file:
META-INF/services/org.apache.hadoop.fs.FileSystem
This in turn erased cert
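If the fat jar is built with the Maven Shade plugin (an assumption; Gradle's Shadow plugin has an equivalent mergeServiceFiles()), the usual fix is to merge rather than overwrite the service files:

```xml
<!-- Merge META-INF/services entries from all dependencies instead of
     letting the last jar on the classpath win. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
  </configuration>
</plugin>
```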
If I have a pipeline running and I restart the taskmanager on which it's
executing, the log shows the following. I find the "No restore state for
UnbounedSourceWrapper." message interesting, as it seems to indicate that
the pipeline never stored a state in the first place?
Starting taskexecutor as a console application
Also my BQ table was partitioned based on a TIMESTAMP column, rather than
using ingestion-time-based partitioning.
Cheers
Reza
On Tue, 29 Jan 2019 at 15:44, Reza Rokni wrote:
> Hya Tobi,
>
> When you mention failed, do you mean you get an error on running the
> pipeline, or is there an incorrect d