Hi Fanbin,
TableEnvironment is the unified entry point for batch and streaming in the Blink planner.
Use: TableEnvironment.create(fsSettings)
We are continuing to improve TableEnvironment to support more features.
[1]
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/common.html#create-a-tableenvironment
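A minimal sketch of that entry point (Blink planner, batch mode here; inStreamingMode() is the streaming counterpart):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class UnifiedTableEnvExample {
    public static void main(String[] args) {
        // Blink planner settings; use inStreamingMode() instead for streaming jobs.
        EnvironmentSettings fsSettings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inBatchMode()
                .build();

        // One entry point for both modes, as described above.
        TableEnvironment tEnv = TableEnvironment.create(fsSettings);
    }
}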
Be
Hi,
My app creates the source from a JDBC InputFormat, runs some SQL, and
prints the output. But the source terminates itself after the query is done. Is
there any way to keep the source running?
sample code:
val env = StreamExecutionEnvironment.getExecutionEnvironment
val settings = EnvironmentSettings.
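(Not the original code, just a minimal sketch of such a setup for context; the driver, URL, query and row schema below are placeholders.)

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Row;

public class JdbcSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // JDBCInputFormat is a bounded input: once the query result has been
        // fully read, the source finishes, which is why the job terminates.
        JDBCInputFormat jdbcInput = JDBCInputFormat.buildJDBCInputFormat()
                .setDrivername("org.postgresql.Driver")            // placeholder
                .setDBUrl("jdbc:postgresql://localhost:5432/db")   // placeholder
                .setQuery("SELECT id, name FROM my_table")         // placeholder
                .setRowTypeInfo(new RowTypeInfo(Types.INT, Types.STRING))
                .finish();

        DataStream<Row> rows = env.createInput(jdbcInput);
        rows.print();
        env.execute("jdbc-source-sketch");
    }
}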
Hi,
Currently, "StreamTableEnvironment can not run in batch mode for now,
please use TableEnvironment."
Are there any plans on the batch/streaming unification roadmap to let
StreamTableEnvironment be used for both streamingMode and batchMode?
Thanks,
Fanbin
Hi,
I use a FlinkKinesisConsumer in a Flink job to deserialize Kinesis events.
Flink: 1.9.2
Avro: 1.9.2
The serDe class is like:
public class ManagedSchemaKinesisPayloadSerDe
implements KinesisSerializationSchema,
KinesisDeserializationSchema {
private static final String REGISTRY_END
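(The class above is truncated; as a point of reference only, here is a minimal, hypothetical skeleton of those two interfaces working on plain Strings, with all of the Avro / schema-registry handling left out.)

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchema;
import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisSerializationSchema;

public class StringKinesisSerDe
        implements KinesisSerializationSchema<String>, KinesisDeserializationSchema<String> {

    @Override
    public String deserialize(byte[] recordValue, String partitionKey, String seqNum,
                              long approxArrivalTimestamp, String stream, String shardId)
            throws IOException {
        // A real implementation would resolve the writer schema here.
        return new String(recordValue, StandardCharsets.UTF_8);
    }

    @Override
    public TypeInformation<String> getProducedType() {
        return Types.STRING;
    }

    @Override
    public ByteBuffer serialize(String element) {
        return ByteBuffer.wrap(element.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String getTargetStream(String element) {
        // null falls back to the producer's default stream.
        return null;
    }
}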
I think so too. But I was wondering whether this was the consumer or the actual Kafka
broker. This error was displayed on the Flink task node where the task was
running; the brokers looked fine at the time.
I have about a dozen topics, all single-partition except one which
has 18. So I really doubt the
Hi,
any thoughts about this one?
Regards,
Krzysztof
Hi all,
I was trying to build Flink on my local machine and these two unit tests
are failing.
[ERROR] Errors:
[ERROR]   FileUtilsTest.testCompressionOnRelativePath:261->verifyDirectoryCompression:440 » NoSuchFile
[ERROR]   FileUtilsTest.testDeleteDirectoryConcurrently » FileSystem /var/folders/x9/
Cool; @aniket and @dagang,
As someone who hasn't dug into the code of either (will go through your
recording) -- might you share any thoughts on differences between:
https://github.com/googlecloudplatform/flink-on-k8s-operator
and
https://github.com/lyft/flinkk8soperator
??
Also, for those in Ba
Hi Arvid,
Thanks for your response. I think I did not word my question properly.
I wanted to confirm that if the data is distributed to more than one
partition then the ordering cannot be maintained (which is documented).
According to your response, I understand that if I set the parallelism to the number
o
Thanks so much Timo, got it working now. All down to my lack of Java skill.
Many thanks,
Chris Stevens
Head of Research & Development
+44 7565 034 595
On Wed, 19 Feb 2020 at 15:12, Timo Walther wrote:
> Hi Chris,
>
> your observation is right. By `new Sensor() {}` instead of just `new
> Sensor
Hey Timo,
Thanks for the assignment link! Looks like most of my issues can be solved
by getting better acquainted with Java file APIs and not in Flink-land.
Best,
Austin
On Wed, Feb 19, 2020 at 6:48 AM Timo Walther wrote:
> Hi Austin,
>
> the StreamingFileSink allows bucketing the output data
Yes, I create it the way you mentioned.
From: Yun Gao [mailto:yungao...@aliyun.com]
Sent: Tuesday, 18 February 2020 10:12
To: Gobbi, Jacopo-XT; user
Subject: [External] Re: Flink's Either type information
Hi Jacopo,
Could you also provide how the KeyedBroadcastProcessFunction is
Thanks, Timo. I have not used or explored the Table API until now. I have used
the DataSet and DataStream APIs only.
I will read about the Table API.
On Wed, Feb 19, 2020 at 4:33 PM Timo Walther wrote:
> Hi Anuj,
>
> another option would be to use the new Hive connectors. Have you looked
> into those? Th
Thanks, Rafi. I will try this, but yes, if partitioning is not possible
then I will also have to look for some other solution.
On Wed, Feb 19, 2020 at 3:44 PM Rafi Aroch wrote:
> Hi Anuj,
>
> It's been a while since I wrote this (Flink 1.5.2). There could be a
> better/newer way, but this is how I read
Thanks for the update! Since we are still in the planning stage, I will try to
find another way to achieve what we are trying to do in the meantime, and I'll
keep an eye on that Jira. Two workarounds I thought of are to either match
the parallelism of the source to the partition count, or, since
Hi Chris,
your observation is right. By `new Sensor() {}` instead of just `new
Sensor()` you are creating an anonymous non-static class that references
the outer method and class.
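(A minimal illustration with a stand-in Sensor POJO, not the actual class:)

public class SensorExample {

    // Stand-in POJO: public no-arg constructor and public fields.
    public static class Sensor {
        public String id;
        public double value;
        public Sensor() {}
    }

    void build() {
        // Anonymous subclass: its runtime class is SensorExample$1, an inner
        // class tied to the enclosing SensorExample, so Flink's type extractor
        // no longer sees a plain Sensor POJO and falls back to generic serialization.
        Sensor anonymous = new Sensor() {};

        // Plain instantiation: the runtime class is exactly Sensor,
        // which can be analyzed as a POJO.
        Sensor pojo = new Sensor();

        System.out.println(anonymous.getClass().getName()); // SensorExample$1
        System.out.println(pojo.getClass().getName());      // SensorExample$Sensor
    }

    public static void main(String[] args) {
        new SensorExample().build();
    }
}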
If you check your logs, there might also be a reason why your POJO is
used as a generic type. I assume because y
Hi guys,
Thanks for your answers and sorry for the late reply.
My use case is:
I receive some events on one stream; each event can contain:
- 1 field category
- 1 field subcategory
- 1 field category AND 1 field subcategory
Events are matched against rules which can contain:
Hi,
I now realize that you are using the batch API, and I gave you an answer
for the streaming API :(
The mapPartition function also has a close() method, which you can use to
implement the same pattern.
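(A minimal sketch of that pattern with a RichMapPartitionFunction from the DataSet API; the resource being opened and closed here is just a placeholder.)

import org.apache.flink.api.common.functions.RichMapPartitionFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class ResourceMapPartition extends RichMapPartitionFunction<String, String> {

    private transient AutoCloseable resource;

    @Override
    public void open(Configuration parameters) throws Exception {
        // Placeholder for whatever needs to be set up per task.
        resource = () -> { };
    }

    @Override
    public void mapPartition(Iterable<String> values, Collector<String> out) {
        for (String value : values) {
            out.collect(value.toUpperCase());
        }
    }

    @Override
    public void close() throws Exception {
        // Called once the task is done with its partition.
        if (resource != null) {
            resource.close();
        }
    }
}

It is attached as usual with dataSet.mapPartition(new ResourceMapPartition()).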
With a JVM Shutdown hook, you are assuming that the TaskManager is shutting
down at the end of
Thanks again Timo, I hope I replied correctly this time.
As per my previous message, the Sensor class is a very simple POJO type (I
think).
When the serialization trace talks about PGSql stuff, it makes me think that
something from my operator is being included in serialization, not just the
Sensor
I have to correct myself. DataStream respects
ExecutionConfig.enableObjectReuse, which takes effect in the form of creating
different Outputs in the OperatorChain. This is also in line with the
different behaviour you are observing.
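(For reference, enabling the flag is just:)

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ObjectReuseSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // With reuse enabled, chained operators may hand the same record instance
        // downstream instead of a copy, so user functions must not cache or
        // mutate inputs after emitting them.
        env.getConfig().enableObjectReuse();
    }
}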
Concerning your initial question Theo, you could do the following i
Hi Hemant,
Flink passes your configurations to the Kafka consumer, so you could check
if you can subscribe to only one partition there.
However, I would discourage that approach. I don't see the benefit compared to just
subscribing to the topic entirely and having dedicated processing for the
different devi
Hi Chris,
[forwarding the private discussion to the mailing list again]
first of all, are you sure that your Sensor class is either a top-level
class or a static inner class? It seems there is way more stuff
in it (maybe included by accident transitively?), such as:
org.apache.loggin
Hi Andreas,
you are right, currently the Row type only supports accessing fields by
index. Usually, we recommend working fully in the Table API. There you can
access structured type fields by name (`SELECT row.field.field` or
`'row.get("field").get("field")`) and additional utilities such as
`fla
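(A small, made-up illustration of the index-based access outside the Table API:)

import org.apache.flink.types.Row;

public class RowIndexAccess {
    public static void main(String[] args) {
        // Hypothetical nested row: field 0 is itself a Row, field 1 a timestamp.
        Row inner = Row.of("sensor-1", 42.0);
        Row outer = Row.of(inner, 1582113600000L);

        // Outside the Table API, nested fields can only be reached by position.
        Object id = ((Row) outer.getField(0)).getField(0);
        System.out.println(id); // sensor-1
    }
}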
Hi,
would Apache Avro be an option for you? It is currently still
the best-supported format when it comes to schema upgrades, as far as I
know. Maybe Gordon (in CC) can give you some additional hints.
Regards,
Timo
On 18.02.20 10:38, ApoorvK wrote:
I have some case class which have
Hi Chris,
it seems there are fields serialized into state that actually don't
belong there. You should aim to treat Sensor as a POJO instead of a Kryo
generic serialized black-box type.
Furthermore, it seems that fields such as
"org.apache.logging.log4j.core.layout.AbstractCsvLayout" should no
Hi Austin,
the StreamingFileSink allows bucketing the output data.
This should help for your use case:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
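(A minimal sketch of such a bucketed row-format sink; the output path and the hourly bucket format are placeholders:)

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;

public class BucketedSinkSketch {

    // Writes each record as a line and puts files into hourly buckets
    // derived from processing time; attach it with stream.addSink(buildSink()).
    static StreamingFileSink<String> buildSink() {
        return StreamingFileSink
                .forRowFormat(new Path("/tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
                .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd--HH"))
                .build();
    }
}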
Regards,
Timo
On 19.02.20 01:00, Austin Cawley-Edwards wrote:
Following up on th
Hi Theo,
there are a lot of performance improvements that Flink could do, but they
would further complicate the interfaces and API. They would require deep
knowledge from users about when it is safe for the runtime to reuse objects and
when not.
The Table/SQL API of Flink uses a lot of these optimizat
Hi Anuj,
another option would be to use the new Hive connectors. Have you looked
into those? They might work on SQL-internal data types, which is why you
would need to use the Table API then.
Maybe Bowen in CC can help you here.
Regards,
Timo
On 19.02.20 11:14, Rafi Aroch wrote:
Hi Anuj,
I
Hi Anuj,
It's been a while since I wrote this (Flink 1.5.2). There could be a better/newer
way, but this is how I read & write Parquet with hadoop-compatibility:
// imports
> import org.apache.avro.generic.GenericRecord;
> import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
>
impo
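(The snippet above is truncated; for context, here is a minimal sketch of that hadoop-compatibility pattern, not the original code — the input path is a placeholder and all Avro schema handling is omitted.)

import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.parquet.avro.AvroParquetInputFormat;

public class ParquetReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        Job job = Job.getInstance();
        FileInputFormat.addInputPath(job, new Path("/tmp/input-parquet")); // placeholder

        HadoopInputFormat<Void, GenericRecord> inputFormat = new HadoopInputFormat<>(
                new AvroParquetInputFormat<GenericRecord>(), Void.class, GenericRecord.class, job);

        // Records arrive as (Void, GenericRecord) tuples; the value holds the row.
        DataSet<Tuple2<Void, GenericRecord>> records = env.createInput(inputFormat);
        records.first(10).print();
    }
}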
Thanks a lot for reporting this issue.
Which version of Flink are you using?
I checked the code of the Kinesis ShardConsumer (the current version
though), and I found that exceptions from the ShardConsumer are properly
forwarded to the lower level runtime.
Did you check the *.out files of the Tas
I have the same experience as Eleanore.
When enabling object reuse, I saw a significant performance improvement, and
in my profiling session I saw that a lot of serialization/deserialization
was no longer performed.
That’s why my question originally aimed a bit further: It’s good th