Hi Pierre,
The serialization/deserialization of sparse Rows in Flink is specially
optimized. The principle is that each Row carries a leading mask when it is
serialized, identifying whether the field at a given position is NULL, with
one field corresponding to one bit. For example, if you have 10k
f
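As a rough illustration of such a leading null mask (plain Scala, purely illustrative and not Flink's actual RowSerializer code; the object and method names are made up), the idea is one bit per field, so even a very wide but sparse row only pays about numFields / 8 bytes of mask plus the non-null payloads:

import java.io.DataOutputStream

object NullMaskSketch {
  // Illustrative only: write a leading NULL mask, one bit per field, before the
  // field payloads. A 10k-field row then pays roughly 10000 / 8 = 1250 bytes of
  // mask plus the serialized non-null values.
  def writeNullMask(fields: Array[AnyRef], out: DataOutputStream): Unit = {
    var currentByte = 0
    var bitPos = 0
    fields.foreach { field =>
      if (field == null) currentByte |= (1 << bitPos) // set the bit for a NULL field
      bitPos += 1
      if (bitPos == 8) { out.writeByte(currentByte); currentByte = 0; bitPos = 0 }
    }
    if (bitPos > 0) out.writeByte(currentByte) // flush the last, partially filled byte
  }
}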
It appears that even when I pass id through the map function and join back
with the original table, Flink still does not treat the id passed through
map as a unique key. Is there any way to solve this while still preserving
the primary key?
On Wed, Dec 2, 2020 at 5:27 PM Rex Fenley wrote:
>
Hi Xingbo, Community,
Thanks a lot for your support.
May I ask, to conclude this thread and for the wider audience:
- Are serious performance issues to be expected with 100k fields per ROW
(i.e. due solely to metadata overhead and independently of query logic)?
- In sparse population (say
Hi Dhurandar,
I'm afraid that Flink's REST API cannot satisfy your request, as it does not
act as a source. One possible example could be SocketWindowWordCount [1],
which listens for data on a port from within its sources on the taskmanagers.
[1]
https://github.com/apache/flink/blob/master/flink-examples/
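A minimal sketch in the spirit of SocketWindowWordCount (host, port, and the 5-second window are placeholder assumptions, not taken from the linked example): the job's own source listens on a TCP port, which is the closest Flink comes to having data pushed to it directly.

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

object SocketWindowSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // The source itself listens on a port; clients push lines of text to it.
    env.socketTextStream("localhost", 9999)
      .flatMap(_.toLowerCase.split("\\s+"))
      .map(word => (word, 1))
      .keyBy(_._1)
      .timeWindow(Time.seconds(5)) // tumbling 5-second processing-time window
      .sum(1)
      .print()

    env.execute("Socket window word count sketch")
  }
}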
Hi Kevin,
If you pass the savepoint path when resuming the application [1], the application will
resume from the last savepoint.
If you change the logic of your DDL, then since no uid can be set by users, I
am afraid not all state can be restored as you expect.
[1]
https://ci.apache.org/projects/fli
Hi Pierre,
This example is written against the syntax of release-1.12, which is about
to be released, and the test passed. In release-1.12, input_type can be
omitted and the expression can be used directly. If you are using release-1.11,
you only need to slightly modify the syntax of the udf used accordin
Even odder, if I pull the constructor of the function into its own variable
it "works" (though it appears that map only passes through the fields
mapped over, which means I'll need an additional join; still, now I think
I'm on the right path).
I.e.
def splatFruits(table: Table, columnPrefix: String
Looks like `as` needed to move outside of where it was before to fix that
error. Though now I'm receiving:
> org.apache.flink.client.program.ProgramInvocationException: The main
method caused an error: Aliasing more fields than we actually have.
Example code now:
// table will always have pk id
def
So I just instead tried changing SplatFruitsFunc to a ScalarFunction and
leftOuterJoinLateral to a map, and I'm receiving:
> org.apache.flink.client.program.ProgramInvocationException: The main
method caused an error: Only a scalar function can be used in the map
operator.
which seems odd because doc
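For context, a hedged sketch of what a scalar function usable in map might look like on Flink 1.12 (the class name, columns, types, and the counting logic are assumptions, not the original code): a ScalarFunction whose eval returns a ROW, applied with Table.map and aliased afterwards.

import org.apache.flink.table.annotation.{DataTypeHint, FunctionHint}
import org.apache.flink.table.api._
import org.apache.flink.table.functions.ScalarFunction
import org.apache.flink.types.Row

// Hypothetical stand-in for SplatFruitsFunc: a ScalarFunction that returns a
// ROW, which is the kind of function Table.map accepts.
@FunctionHint(output = new DataTypeHint("ROW<id BIGINT, apples INT, oranges INT>"))
class SplatFruitsFunc extends ScalarFunction {
  def eval(id: java.lang.Long, fruits: String): Row =
    Row.of(id, Int.box(fruits.count(_ == 'a')), Int.box(fruits.count(_ == 'o')))
}

object SplatFruitsSketch {
  // Usage sketch: map only emits the function's output columns, so id is kept
  // inside the returned ROW and the result is aliased afterwards.
  def splatFruits(table: Table, columnPrefix: String): Table =
    table
      .map(call(classOf[SplatFruitsFunc], $"id", $"fruits"))
      .as("id", s"${columnPrefix}_apples", s"${columnPrefix}_oranges")
}

Keeping id inside the returned ROW means the mapped result can still be joined back to the original table if further columns are needed.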
I have a question regarding DDLs: are they considered operators, and can
they be savepointed?
For example
CREATE TABLE mytable (
  id BIGINT,
  data STRING,
  WATERMARK FOR ... AS ...
) WITH (
  'connector' = 'kafka'
)
If I create the table like above, save & exit, and resume the application, will
the application start
Hello,
I have a TableFunction and wherever it is applied with a
leftOuterJoinLateral, my table loses any inference of there being a primary
key. I see this because all subsequent joins end up with "NoUniqueKey" when
I know a primary key of id should exist.
I'm wondering if this is expected behavi
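For reference, a hedged sketch of this kind of setup (the table function, columns, and types are illustrative assumptions, not the original code): a TableFunction applied via leftOuterJoinLateral. Whether the planner can carry the left side's primary key through such a join is exactly the question here.

import org.apache.flink.table.annotation.{DataTypeHint, FunctionHint}
import org.apache.flink.table.api._
import org.apache.flink.table.functions.TableFunction
import org.apache.flink.types.Row

// Illustrative TableFunction: splits a comma-separated column into rows.
@FunctionHint(output = new DataTypeHint("ROW<fruit STRING>"))
class SplitFruits extends TableFunction[Row] {
  def eval(fruits: String): Unit =
    if (fruits != null) fruits.split(",").foreach(f => collect(Row.of(f)))
}

object SplitFruitsSketch {
  // Every row of the left table is kept and joined with the rows produced by
  // the table function (or padded with NULLs if it produces none).
  def withFruits(table: Table): Table =
    table
      .leftOuterJoinLateral(call(classOf[SplitFruits], $"fruits"))
      .select($"id", $"fruit")
}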
Can a Flink job run as a REST server, where the Apache Flink job is
listening on a port (443)? When a user calls this URL with a payload, the
data goes directly to the Apache Flink windowing function.
Right now Flink can ingest data from Kafka or Kinesis, but we have a use
case where we would like to pus
Hello!
I have a datastream like this:
env.readTextFile("events.log")
  .map(event => StopFactory(event)) // I have defined a Stop class and this creates an instance from the file line
  .assignTimestampsAndWatermarks(stopEventTimeExtractor) // extracts the timestamp from a field of each instance
  .
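Since the snippet references a stopEventTimeExtractor, here is a hedged sketch of what that timestamp/watermark assignment could look like with the Flink 1.12 WatermarkStrategy API (the Stop fields, the CSV layout, and the 10-second out-of-orderness bound are assumptions, not taken from the original code):

import java.time.Duration

import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.streaming.api.scala._

// Hypothetical event type standing in for the Stop class from the thread.
case class Stop(stopId: String, eventTimeMillis: Long)

object StopsSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val stops: DataStream[Stop] = env
      .readTextFile("events.log")
      .map(line => Stop(line.split(",")(0), line.split(",")(1).toLong)) // assumed line layout
      .assignTimestampsAndWatermarks(
        WatermarkStrategy
          .forBoundedOutOfOrderness[Stop](Duration.ofSeconds(10)) // assumed out-of-orderness bound
          .withTimestampAssigner(new SerializableTimestampAssigner[Stop] {
            // The event timestamp is taken from a field of each instance.
            override def extractTimestamp(stop: Stop, recordTs: Long): Long =
              stop.eventTimeMillis
          }))

    stops.print()
    env.execute("Stops sketch")
  }
}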
Hi Xingbo,
Nice! This looks a bit hacky, but shows that it can be done ;)
I just got an exception preventing me from running your code, apparently from
udf.py:
TypeError: Invalid input_type: input_type should be DataType but contains
None
Can you please check again?
If the schema is defined in a .avs
Hi Till
Super, understood! I will also read the website with the link that you provided
me.
Thanks and have a nice eve.
best
s
From: Till Rohrmann
Sent: 02 December 2020 17:44
To: Simone Cavallarin
Cc: user
Subject: Re: Process windows not firing - > Can i
Hi Prasanna,
I believe that what Aljoscha suggested in the linked discussion is still the
best way to go forward. Given your description of the problem this should
actually be pretty straightforward as you can deduce the topic from the
message. Hence, you just need to create the ProducerRecord with
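As a hedged sketch of that idea (the event type and the topic-selection rule are assumptions, not from the original thread): a KafkaSerializationSchema that builds the ProducerRecord with a topic deduced from the message itself.

import java.nio.charset.StandardCharsets

import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema
import org.apache.kafka.clients.producer.ProducerRecord

// Hypothetical event carrying the information that decides its target topic.
case class Event(tenant: String, payload: String)

// The topic is deduced from the message itself, so a single producer can fan
// out to many topics without a fixed target topic.
class DynamicTopicSchema extends KafkaSerializationSchema[Event] {
  override def serialize(event: Event, timestamp: java.lang.Long): ProducerRecord[Array[Byte], Array[Byte]] = {
    val topic = s"events-${event.tenant}" // assumed routing rule
    new ProducerRecord[Array[Byte], Array[Byte]](topic, event.payload.getBytes(StandardCharsets.UTF_8))
  }
}

Such a schema can then be handed to a FlinkKafkaProducer together with a default topic and the producer properties.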
Hi Simone,
You need to set this option because otherwise Flink will not generate the
Watermarks. If you don't need watermarks (e.g. when using ProcessingTime),
then the system needs to send fewer records over the wire. That's basically
how Flink has been developed [1].
With Flink 1.12 this option
Hi Till and David,
First of all, thanks for the quick reply, really appreciated!
Drilling down to the issue:
1. No session is closing because there isn't a sufficiently long gap in the
test data -> It was the first thing I thought of; before asking, I ran a test
checking the gap in my data
Hi Till,
Thanks for the clarification and suggestions
Regards
Sidhant Gupta
On Wed, Dec 2, 2020, 10:10 PM Till Rohrmann wrote:
> Hi Sidhant,
>
> Have you seen this discussion [1]? If you want to use S3, then you need to
> make sure that you start your Flink processes with the appropriate
> Fil
Hi Maciek,
I am pulling in Timo who might help you with this problem.
Cheers,
Till
On Tue, Dec 1, 2020 at 6:51 PM Maciek Próchniak wrote:
> Hello,
>
> I try to configure SQL Client to query partitioned ORC data on local
> filesystem. I have directory structure like that:
>
> /tmp/table1/startd
Hi Anil,
Flink does not maintain the MDC context between threads. Hence, I don't
think that it is possible w/o changes to Flink.
One note, if operators are chained then they are run by the same thread.
Cheers,
Till
On Wed, Dec 2, 2020 at 7:22 AM Anil K wrote:
> Hi All,
>
> Is it possible to h
Hi Sidhant,
Have you seen this discussion [1]? If you want to use S3, then you need to
make sure that you start your Flink processes with the appropriate
FileSystemProvider for S3 [2]. So the problem you are seeing is most likely
caused by the JVM not knowing an S3 file system implementation.
Be a
Hi,
Events need to be routed to different kafka topics dynamically based upon
some info in the message.
We have implemented using KeyedSerializationSchema similar to
https://stackoverflow.com/questions/49508508/apache-flink-how-to-sink-events-to-different-kafka-topics-depending-on-the-even.
But i
Hi Martin,
In general, Flink's MiniCluster should be able to run every complete Flink
JobGraph. However, from what I read you are looking for a test harness for
a processWindowFunction so that you can test this function in a more unit
test style, right? What you can do is to use the
OneInputStream
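To make the end-to-end option concrete, here is a hedged sketch of a MiniCluster-backed test following the pattern in Flink's testing documentation (the ScalaTest 3.0-style FlatSpec, the flink-test-utils dependency, and the trivial pipeline and sink are assumptions; a real test would plug in the windowed job, and the unit-test-style harness Till mentions would be set up separately):

import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration
import org.apache.flink.streaming.api.functions.sink.SinkFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.test.util.MiniClusterWithClientResource
import org.scalatest.{BeforeAndAfter, FlatSpec}

import scala.collection.mutable

// Collects results in a static variable so the test can assert on them.
class CollectSink extends SinkFunction[Long] {
  override def invoke(value: Long): Unit =
    CollectSink.values.synchronized { CollectSink.values += value }
}
object CollectSink {
  val values: mutable.ArrayBuffer[Long] = mutable.ArrayBuffer()
}

class PipelineITCase extends FlatSpec with BeforeAndAfter {

  // A local cluster able to run complete JobGraphs, including window operators.
  val flinkCluster = new MiniClusterWithClientResource(
    new MiniClusterResourceConfiguration.Builder()
      .setNumberSlotsPerTaskManager(2)
      .setNumberTaskManagers(1)
      .build())

  before { flinkCluster.before() }
  after { flinkCluster.after() }

  "the pipeline" should "double the input values" in {
    CollectSink.values.clear()
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(2)

    // Placeholder pipeline; a real test would plug the windowed job in here.
    env.fromElements(1L, 2L, 3L)
      .map(_ * 2)
      .addSink(new CollectSink())

    env.execute()
    assert(CollectSink.values.sorted == Seq(2L, 4L, 6L))
  }
}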
Hi Narasimha,
thanks for reaching out to the community. I am not entirely sure whether
VVP 2.3 supports the application mode. Since this is a rather new feature,
it could be that it has not been integrated yet. I am pulling in Ufuk and
Fabian who should be able to definitely answer your question.
Hi,
We are using the Ververica platform to deploy Flink jobs and found that it
does not support the application deployment mode.
Just want to check whether this is expected.
Below is a brief outline of how the main method is composed.
class Job1 {
  public void execute() {
    StreamExecutionEnvironment env = ...
Hi,
I am trying to build a test suite for our Flink cluster using the test
harness and MiniCluster. As we are using processWindowFunctions in our
pipeline, we need good ways of validating these functions. To my surprise,
processWindowFunctions seem to be supported by neither the test-harness nor
the MiniCluster setups,
Hi All,
I have 2 job managers in a Flink HA cluster setup. I have a load
balancer forwarding requests to both job managers (leader and standby)
in the default round-robin fashion. While uploading the job jar, the Web UI
fluctuates between the leader and standby pages. It's difficult to upload
Yes, exactly. Thanks!
On Tue, Dec 1, 2020 at 6:11 PM Danny Chan wrote:
> Hi, Rex ~
>
> For "leftOuterJoinLateral" do you mean join a table function through
> lateral table ?
> If it is, yes, the complexity is O(1) for each probe key of LHS. The table
> function evaluate the extra columns and app