…AWS does not give the owner of the database
sufficient permissions to execute this.
From: John Tipper
Sent: 24 July 2022 22:01
To: user@flink.apache.org
Subject: Write JSON string to JDBC as JSONB?
Hi all,
I am using PyFlink and SQL and have a JSON string that I wish to write to a
JDBC sink (it's a Postgresql DB). I'd like to write that string to a column
that is of type JSONB (or even JSON).
I'm getting exceptions when Flink tries to write the column:
Batch entry 0 INSERT INTO my_table(
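One possible workaround, sketched under the assumption that the failure is the Postgres driver rejecting a VARCHAR parameter for a JSONB column: the Postgres JDBC driver accepts a `stringtype=unspecified` URL parameter, which sends strings as untyped values so the server casts them to JSONB itself. Table and column names below are placeholders.

```python
# Hedged sketch: append `stringtype=unspecified` to the JDBC URL so Postgres
# casts the STRING column to JSONB server-side. Names are placeholders.
jdbc_url = "jdbc:postgresql://db-host:5432/mydb?stringtype=unspecified"

sink_ddl = f"""
CREATE TABLE my_table_sink (
    id BIGINT,
    payload STRING
) WITH (
    'connector' = 'jdbc',
    'url' = '{jdbc_url}',
    'table-name' = 'my_table'
)
"""
# A TableEnvironment would register this with: table_env.execute_sql(sink_ddl)
```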
Sorry, pressed send too early.
What is the unit of measure for "Count", and does this tell me I have too
little Direct Memory? If so, what do I do to specifically increase this number?
Many thanks,
John
____
From: John Tipper
Sent: 20 July 2022 17:5
Hi all,
I can't find any mention of what the columns mean in the "Outside JVM Memory"
section for the Task Manager in the Flink console.
I have:
        Count  Used    Capacity
Direct  4,203  227 MB  227 MB
Mapped  0      0 B     0 B
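For what it's worth, that view typically mirrors the JVM's buffer-pool metrics: "Count" is the number of allocated direct (or mapped) ByteBuffers, while Used and Capacity are their total bytes. If direct memory is genuinely short, the usual levers are the off-heap sizes in flink-conf.yaml. A hedged sketch with placeholder sizes (these are real Flink option names, but the values are illustrative only):

```yaml
# Assumed starting points, not recommendations; tune to your workload.
taskmanager.memory.task.off-heap.size: 256m
taskmanager.memory.framework.off-heap.size: 128m
```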
copy all events to all slots, in which case I
don't understand how parallelism assists?
Many thanks,
John
From: Dian Fu
Sent: 20 July 2022 05:19
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: PyFlink SQL: force maximum use of slots
Hi John
Hi all,
Is there a way of forcing a pipeline to use as many slots as possible? I have
a program in PyFlink using SQL and the Table API and currently all of my
pipeline is using just a single slot. I've tried this:
StreamExecutionEnvironment.get_execution_environment().disable_operator_
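A hedged sketch of the configuration angle on this (the keys below are real Flink options, but the values are placeholders): disabling operator chaining keeps operators as separate tasks, and raising parallelism is usually the main lever, since operators in the same slot-sharing group still share a slot.

```python
# Assumed configuration sketch; values are illustrative only.
flink_conf = {
    "pipeline.operator-chaining": "false",  # keep operators as separate tasks
    "parallelism.default": "4",             # hypothetical parallelism
}

# With PyFlink's Table API this would typically be applied as, e.g.:
# config = table_env.get_config().get_configuration()
# for key, value in flink_conf.items():
#     config.set_string(key, value)
```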
…is broken if the table/stream
contains a timestamp, which I reported a little while ago and Dian filed as
FLINK-28253.
Kind regards,
John
From: Juntao Hu
Sent: 18 July 2022 04:13
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: PyFlink and parallelism
t in Java and not the int
primitive. I actually see this if I just call set_parallelism(1) without the
call to get_config(). Is this a bug or is there a workaround?
________
From: John Tipper
Sent: 15 July 2022 16:44
To: user@flink.apache.org
Subject: PyFlink and parallelism
Hi all,
I have a processing topology using PyFlink and SQL where there is data skew:
I'm splitting a stream of heterogeneous data into separate streams based on the
type of data that's in it and some of these substreams have very many more
events than others and this is causing issues when checkpointing.
Many thanks,
John
From: Dian Fu
Sent: 08 July 2022 02:27
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: PyFlink: restoring from savepoint
Hi John,
Could you provide more information, e.g. the exact command submitting the job,
the log file, the PyFlink version…
operators that did not have any state in
the savepoint:
INFO o.a.f.r.c.CheckpointCoordinator [] - Skipping empty savepoint state for
operator a0f11f7a2c416beb6b7aed14be0d63ca.
Best,
Alexander Fedulov
On Wed, Jul 6, 2022 at 9:50 PM John Tipper
<john_tip...@hotmail.com> wrote:
Hi all,
I have a PyFlink job running in Kubernetes. Savepoints and checkpoints are
being successfully saved to S3. However, I am unable to get the job to start
from a save point.
The container is started with these args:
"standalone-job", "-pym", "foo.main", "-s", "s3://", "-n"
In the JM logs
Hi all,
The docs on restoring a job from a savepoint
(https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/savepoints/#resuming-from-savepoints)
state that the syntax is:
$ bin/flink run -s :savepointPath [:runArgs]
and where "you may give a path to either the savepoint’s directory or the
_metadata file".
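For reference, a minimal sketch of the documented restore form; the bucket, savepoint path, and module name below are placeholders, not the values from the job above, and `-s` must point at a concrete savepoint directory (or its _metadata file), not the bucket root.

```shell
# Hypothetical values for illustration only.
SAVEPOINT="s3://my-bucket/savepoints/savepoint-abc123"
CMD="flink run -s ${SAVEPOINT} -pym foo.main"
echo "${CMD}"
```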
with `to_append_stream` to see
if it works.
Regards,
Dian
On Sat, Jun 25, 2022 at 4:07 AM John Tipper
<john_tip...@hotmail.com> wrote:
Hi,
I have a source table using a Kinesis connector reading events from AWS
EventBridge using PyFlink 1.15.0. An example of the sorts of data that are in
this stream is here:
https://docs.aws.amazon.com/codebuild/latest/userguide/sample-build-notifications.html#sample-build-notifications-ref.
Hi all,
There are a number of connectors which do not appear to be in the Python API
v1.15.0, e.g. Kinesis. I can see that it's possible to use these connectors by
using the Table API:
CREATE TABLE my_table (...)
WITH ('connector' = 'kinesis' ...)
I guess if you wanted the stream as a DataStream…
Sorry for the noise, I completely missed this part of the documentation
describing exactly how to do this:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/data_stream_api/
From: John Tipper
Sent: 23 June 2022 21:35
To: user
Hi all,
In PyFlink, is it possible to use the DataStream API to create a DataStream by
means of StreamExecutionEnvironment's addSource(...), then perform
transformations on this data stream using the DataStream API and then convert
that stream into a form where SQL statements can be executed on it?
Hi all,
I'm wanting to run quite a number of PyFlink jobs on Kubernetes, where the
amount of state and number of events being processed is small and therefore I'd
like to use as little memory in my clusters as possible so I can bin pack most
efficiently. I'm running a Flink cluster per job. I'm
naged.consumer-weights`, I really can't
think of any situation where this problem will occur.
Best,
Xingbo
John Tipper <john_tip...@hotmail.com>
wrote on Thu, Jun 16, 2022 at 19:41:
Hi Xingbo,
Yes, there are a number of temporary views being created, where each is being
created using
From: Xingbo Huang
Sent: 16 June 2022 12:34
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: The configured managed memory fraction for Python worker process
must be within (0, 1], was: %s
Hi John,
Does your job logic include conversion between Table and
version of pyflink you used?
Best,
Xingbo
John Tipper <john_tip...@hotmail.com>
wrote on Thu, Jun 16, 2022 at 17:05:
Hi all,
I'm trying to run a PyFlink unit test to test some PyFlink SQL and where my
code uses a Python UDF. I can't share my code but the test case is similar to
the code here:
https://github.com/apache/flink/blob/f8172cdbbc27344896d961be4b0b9cdbf000b5cd/flink-python/pyflink/testing/test_case_
At 2022-06-09 08:53:36, "Dian Fu" wrote:
Hi John,
If you are using Table API & SQL, the framework is handling the RowKind and
it's transparent for you. So usually you don't need to handle RowKind in Table
API & SQL.
Regards,
Dian
On Thu, Jun 9, 2022 at 6:56 AM John Tipper wrote:
you?
Best regards,
Martijn
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/sql/queries/joins/#array-expansion
On Mon, Jun 13, 2022 at 13:55, John Tipper
<john_tip...@hotmail.com> wrote:
Hi all,
Flink doesn’t support the unnest() function, which takes an array and creates a
row for each element in the array. I have column containing an array of
map and I’d like to implement my own unnest.
I try this as an initial do-nothing implementation:
@udtf(result_types=DataTypes.MAP(
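The unnest logic itself can be sketched in pure Python; in PyFlink the function body below would sit inside a `@udtf(result_types=DataTypes.MAP(...))` decorator, which is omitted here so the logic can be exercised without a Flink runtime.

```python
# Hedged sketch of a do-it-yourself unnest: emit one row per array element.
def my_unnest(arr):
    """Yield each element of `arr` as its own row; None yields nothing."""
    if arr is None:
        return
    for element in arr:
        yield element
```

For example, `list(my_unnest([{"a": "1"}, {"b": "2"}]))` produces two single-map rows.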
ble-functions
[2]
https://github.com/apache/flink/blob/44f73c496ed1514ea453615b77bee0486b8998db/flink-core/src/main/java/org/apache/flink/types/RowKind.java#L52
--
Best!
Xuyang
At 2022-06-08 20:06:17, "John Tipper" wrote:
Hi all,
I have some reference data that is periodically emitted by a crawler mechanism
into an upstream Kinesis data stream, where those rows are used to populate a
sink table (and where I am using Flink 1.13 PyFlink SQL within AWS Kinesis Data
Analytics). What is the best pattern to handle de
-python-api-with-amazon-kinesis-data-analytics/
Sorry for the noise.
From: John Tipper
Sent: 29 April 2022 14:07
To: user@flink.apache.org
Subject: Multiple INSERT INTO within single PyFlink job?
Hi all,
Is it possible to have more than one `INSERT INTO ... SELECT
Hi all,
Is it possible to have more than one `INSERT INTO ... SELECT ...` statement
within a single PyFlink job (on Flink 1.13.6)?
I have a number of output tables that I create and I am trying to write to
these within a single job, where the example SQL looks like (assume
there is an
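One way to do this, sketched under the assumption that PyFlink's statement-set API is available in 1.13: group the INSERTs into a statement set so they are submitted as one job. The table names below are placeholders.

```python
# Hypothetical INSERT statements; sink/source names are placeholders.
inserts = [
    "INSERT INTO sink_a SELECT id, payload FROM source_t WHERE kind = 'a'",
    "INSERT INTO sink_b SELECT id, payload FROM source_t WHERE kind = 'b'",
]

# With a real TableEnvironment this would be:
# stmt_set = table_env.create_statement_set()
# for sql in inserts:
#     stmt_set.add_insert_sql(sql)
# stmt_set.execute()  # submits one job containing both INSERTs
```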
Hi Dian,
I've tried this and it works nicely, on both MacOS and Windows, thank you very
much indeed for your help.
Kind regards,
John
From: Dian Fu
Sent: 25 April 2022 02:42
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: Unit testing PyFlink SQL project
via command line argument '--jarfile' or the config option 'pipeline.jars'
--
Ran 1 test in 0.401s
FAILED (errors=1)
sys:1: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
From: John Tipper
Sent: 24 April 2022 20:48
To: Dian Fu
sys:1: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
(.venv) Johns-MacBook-Pro:testing john$ Unable to get the Python watchdog
object, now exit.
From: John Tipper
Sent: 24 April 2022 20:30
To: Dian Fu
Cc: user@flink.apache.org
Subject: Re: Unit testing PyFlink SQL project
Hi Dian,
Thank you very much, th
code needs all of the transitive dependencies on the
classpath?
Have you managed to get your example tests to run in a completely clean virtual
environment? It looks like if it's working on your computer that your computer
perhaps has Java and Python dependencies already downloaded into partic
Hi all,
Is there an example of a self-contained repository showing how to perform SQL
unit testing of PyFlink (specifically 1.13.x if possible)? I have cross-posted
the question to Stack Overflow here:
https://stackoverflow.com/questions/71983434/is-there-an-example-of-pyflink-sql-unit-testing
Hi all,
I'm having some issues with getting a Flink SQL application to work, where I
get an exception and I'm not sure why it's occurring.
I have a source table, reading from Kinesis, where the incoming data in JSON
format has a "source" and "detail-type" field.
CREATE TABLE `input` (
`so
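One common cause of exceptions with this kind of input, offered as a guess rather than a diagnosis: `detail-type` contains a hyphen, so it must be quoted with backticks in the DDL for the SQL parser and JSON format to map it correctly. A sketch mirroring the field names above (everything else is a placeholder):

```python
# Hedged sketch of the source DDL; connector options are illustrative only.
source_ddl = """
CREATE TABLE `input` (
    `source` STRING,
    `detail-type` STRING
) WITH (
    'connector' = 'kinesis',
    'format' = 'json'
)
"""
```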
Hi Dian,
Thank you very much, that worked very nicely.
Kind regards,
John
From: Dian Fu
Sent: 11 April 2022 06:32
To: John Tipper
Cc: user@flink.apache.org
Subject: Re: How to process events with different JSON schemas in single
Kinesis stream using PyFlink
TLDR; I want to know how best to process a stream of events using PyFlink,
where the events in the stream have a number of different schemas.
Details:
I want to process a stream of events coming from a Kinesis data stream which
originate from an AWS EventBridge bus. The events in this stream ar
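One pattern for this, sketched as plain Python: read each record as a raw JSON string and dispatch on the EventBridge `detail-type` field. In PyFlink this function could back a UDF or run inside a DataStream map/process function; the handler names are hypothetical.

```python
import json

def route(raw: str) -> str:
    """Classify a raw EventBridge JSON record by its detail-type field."""
    event = json.loads(raw)
    detail_type = event.get("detail-type", "unknown")
    if detail_type == "CodeBuild Build State Change":
        return "codebuild"  # hypothetical handler bucket
    return "other"
```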
Chesnay Schepler
<ches...@apache.org> wrote:
Have you looked at org.apache.flink.types.Either? If you'd wrap all elements in
both streams before the union you should be able to join them properly.
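The Either-style suggestion above can be sketched in Python terms (names are illustrative, not a Flink API): tag each element with its origin before the union, so a single downstream function can dispatch on the tag and cast safely.

```python
# Hedged sketch of tagging two streams before a union.
def tag_left(x):
    return ("left", x)

def tag_right(y):
    return ("right", y)

def process(tagged):
    """Dispatch on the tag added before the union."""
    side, value = tagged
    if side == "left":
        return f"left-handler saw {value!r}"
    return f"right-handler saw {value!r}"
```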
On 17/07/2019 14:18, John Tipper wrote:
Hi All,
Can I union/join 2 streams containing generic classes, where each stream has a
different parameterised type? I'd like to process the combined stream of values
as a single raw type, casting to a specific type for detailed processing, based
on some information in the type that will allow
For the case of a single iteration of an iterative loop where the feedback type
is different to the input stream type, what order are events processed in the
forward flow? So for example, if we have:
* the input stream contains input1 followed by input2
* a ConnectedIterativeStream at th
Hi All,
I have 2 streams of events that relate to a common base event, where one stream
is the result of a flatmap. I want to join all events that share a common
identifier.
Thus I have something that looks like:
DataStream streamA = ...
DataStream streamB = someDataStream.flatMap(...) // pro
Hi All,
How are timestamps treated within an iterative DataStream loop within Flink?
For example, here is an example of a simple iterative loop within Flink where
the feedback loop is of a different type to the input stream:
DataStream inputStream = env.addSource(new MyInputSourceFunction());
I
Flink performs significant scanning during the pre-flight phase of a Flink
application
(https://ci.apache.org/projects/flink/flink-docs-stable/dev/types_serialization.html).
The act of creating sources, operators and sinks causes Flink to scan the data
types of the objects that are used within
> …roadmap
> page can be expected in the next 2-3 weeks.
>
> I hope this information helps.
>
> Regards,
> Timo
>
> [1]
> https://flink.apache.org/news/2019/02/13/unified-batch-streaming-blink.html
>
>> On 19.02.19 at 12:47, John Tipper wrote:
>> Hi All,
Hi All,
Does anyone know what the current status is for FLIP-16 (loop fault tolerance)
and FLIP-15 (redesign iterations) please? I can see lots of work back in 2016,
but it all seemed to stop and go quiet since about March 2017. I see iterations
as offering very interesting capabilities for Fli