Hi Fanbin,
The host should be the external IP of your master node. However, port 5005
is usually not open in EMR (you could use any other open, unused port).
Alternatively, you could use SSH port forwarding [1]:
ssh -L <local port>:<master host>:5005 <user>@<master host>
And then connect to localhost:<local port> in your IDE.
[1] https://www.ssh.
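For completeness, a sketch of how the debug port might be opened on the
Flink side; this assumes the standard JDWP agent flags and Flink's
env.java.opts.taskmanager config key (adjust to whichever process you
want to debug):

    # flink-conf.yaml: let each TaskManager JVM listen for a debugger on port 5005
    env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"

With the SSH tunnel above in place, the IDE's remote-debug configuration
then points at localhost.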
Hi Alexey
If possible, I think you could move some RDBMS maintenance operations into
the #open method of RichFunction, to reduce the chance of blocking record
processing (a sketch follows below).
Best
Yun Tang
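For illustration, a minimal sketch of that suggestion with a JDBC sink; the
connection URL, table names and maintenance SQL are placeholders, not from
the original thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Statement;

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    // Heavy RDBMS maintenance runs once per subtask in open(), keeping the
    // per-record path (invoke) fast.
    public class MaintainedJdbcSink extends RichSinkFunction<String> {

        private transient Connection connection;
        private transient PreparedStatement insert;

        @Override
        public void open(Configuration parameters) throws Exception {
            connection = DriverManager.getConnection("jdbc:postgresql://db:5432/app"); // placeholder URL
            try (Statement maintenance = connection.createStatement()) {
                // illustrative maintenance: create the next partition, purge old data
                maintenance.execute("CREATE TABLE IF NOT EXISTS events_2020_02 PARTITION OF events"
                        + " FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')");
                maintenance.execute("DROP TABLE IF EXISTS events_2019_01");
            }
            insert = connection.prepareStatement("INSERT INTO events(payload) VALUES (?)");
        }

        @Override
        public void invoke(String value, Context context) throws Exception {
            insert.setString(1, value);
            insert.executeUpdate(); // only fast per-record work here
        }

        @Override
        public void close() throws Exception {
            if (insert != null) {
                insert.close();
            }
            if (connection != null) {
                connection.close();
            }
        }
    }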
From: Alexey Trenikhun
Sent: Tuesday, January 28, 2020 15:15
To: Yun Tang
Hi Ahmad
Apart from setting the logger level of RocksDB, I also wonder why the
RocksDB checkpoint IO logs are filling up your disk space so quickly. How
large is the local checkpoint state, and how long is the checkpoint
interval? I think you might have set too short a checkpoint…
Thank you Yun Tang.
My implementation could potentially block for a significant amount of time,
because I wanted to do RDBMS maintenance (create partitions for new data,
purge old data, etc.) in line with writing stream data to a database.
From: Yun Tang
Sent: Sunday
Hi Andrew,
as far as I know, there is nothing particularly special about the sink in
terms of how it handles state or time. You cannot leave the pipeline
"unfinished"; only sinks trigger the execution of the whole pipeline.
Cheers,
Konstantin
On Fri, Jan 24, 2020 at 5:59 PM Andrew Roberts wrote:
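A minimal sketch of that point (names and data are illustrative):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // a transformation alone only builds the graph; nothing runs yet
    DataStream<String> mapped = env.fromElements("a", "b").map(String::toUpperCase);

    // only once a sink terminates the pipeline does execute() have anything to run
    mapped.print();
    env.execute("sink-triggers-execution");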
Hi Flavio,
Is your requirement to use blink batch to read the tables in Presto?
I'm not familiar with Presto's catalog. Is it like the Hive Metastore?
If so, what needs to be done is similar to the hive connector:
you would need to implement a catalog for Presto, which translates the
Presto table into…
Hi Dominique,
FLIP-17 (Side Inputs) is not yet implemented, AFAIK.
One possible way to overcome this right now, if your reference data is
static and not continuously changing, is to use the State Processor API to
bootstrap a savepoint with the reference data.
Have you looked into that to see if it…
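For illustration, a minimal bootstrap sketch against the State Processor
API (Flink 1.9+); the ReferenceData type, operator uid, state backend and
savepoint path are all placeholders:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.runtime.state.memory.MemoryStateBackend;
    import org.apache.flink.state.api.BootstrapTransformation;
    import org.apache.flink.state.api.OperatorTransformation;
    import org.apache.flink.state.api.Savepoint;
    import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;

    public class ReferenceDataBootstrap {

        // hypothetical reference-data record
        public static class ReferenceData {
            public String key;
            public String value;
            public ReferenceData() {}
            public ReferenceData(String key, String value) { this.key = key; this.value = value; }
        }

        // writes each record's value into keyed state under its key
        public static class Bootstrapper extends KeyedStateBootstrapFunction<String, ReferenceData> {
            private transient ValueState<String> state;

            @Override
            public void open(Configuration parameters) {
                state = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("reference", String.class));
            }

            @Override
            public void processElement(ReferenceData value, Context ctx) throws Exception {
                state.update(value.value);
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            DataSet<ReferenceData> referenceData = env.fromElements(
                    new ReferenceData("key-1", "value-1"),
                    new ReferenceData("key-2", "value-2"));

            BootstrapTransformation<ReferenceData> transformation = OperatorTransformation
                    .bootstrapWith(referenceData)
                    .keyBy(new KeySelector<ReferenceData, String>() {
                        @Override
                        public String getKey(ReferenceData r) {
                            return r.key;
                        }
                    })
                    .transform(new Bootstrapper());

            // the uid must match the operator uid in the streaming job that restores this savepoint
            Savepoint.create(new MemoryStateBackend(), 128)
                    .withOperator("reference-data", transformation)
                    .write("file:///tmp/reference-bootstrap");

            env.execute("bootstrap reference data");
        }
    }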
I can build Flink 1.10 and install it onto EMR
(flink-dist_2.11-1.10.0.jar), but what about the other dependencies in my
project's build.gradle, i.e. flink-scala_2.11, flink-json, flink-jdbc... do
I continue to use 1.9.0, since there is no 1.10 available?
Thanks,
Fanbin
On Fri, Jan 24, 2020 at 11:39 PM
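As a side note: Flink artifacts are generally expected to share one
version, so once 1.10 artifacts are published the build would typically
move together. A build.gradle sketch (version is illustrative; artifact
names are as listed in the question):

    ext.flinkVersion = '1.10.0'  // illustrative; use whatever release is actually published

    dependencies {
        implementation "org.apache.flink:flink-scala_2.11:${flinkVersion}"
        implementation "org.apache.flink:flink-json:${flinkVersion}"
        implementation "org.apache.flink:flink-jdbc:${flinkVersion}"
    }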
Hi Jingsong,
Thanks for the clarification!
The limitation description was a bit confusing to me, but it became clear
after seeing the example you posted.
Regards,
RK.
On Mon, Jan 27, 2020 at 6:25 AM Jingsong Li wrote:
> Hi RKandoji,
>
> You've misunderstood this bug; your code will not go wrong…
Hi Yun,
Thanks for the suggestion!
Best
Eleanore
On Mon, Jan 27, 2020 at 1:54 AM Yun Tang wrote:
> Hi Yi
>
> Glad to know you have already resolved it. The State Processor API will
> use the DataStream API instead of the DataSet API in the future [1].
>
> Besides, you could also follow the guide in "the broadcast state pattern"…
Dang, what a massive PR: 2,118 files changed, +104,104 −29,161 lines.
Thanks for the details, Jark!
On Mon, Jan 27, 2020 at 4:07 PM Jark Wu wrote:
> Hi Kant,
> Having a custom state backend is very difficult and is not recommended.
>
> Hi Benoît,
> Yes, the "Query on the intermediate state…
Hi,
I think reducing the frequency of the checkpoints, and decreasing the
parallelism of the things using the S3AOutputStream class, would help to
mitigate the issue.
I don't know about other solutions. I would suggest asking this question
directly to Steve L. in the bug ticket [1], as he is the one…
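For instance, a sketch of lengthening the checkpoint interval (the numbers
are illustrative only):

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(10 * 60 * 1000L);                                // checkpoint every 10 minutes
    env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5 * 60 * 1000L); // breathing room between checkpoints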
Hi Kant,
Having a custom state backend is very difficult and is not recommended.
Hi Benoît,
Yes, the "Query on the intermediate state is on the roadmap" I mentioned is
referring to integrating Table API & SQL with Queryable State.
We also have an early issue, FLINK-6968, to track this.
Best,
Jark
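For context, a sketch of what querying that state looks like with today's
low-level client; the proxy host, port (9069 is the default), job id and
state name are placeholders:

    import java.util.concurrent.CompletableFuture;

    import org.apache.flink.api.common.JobID;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.queryablestate.client.QueryableStateClient;

    public class StateQuery {
        public static void main(String[] args) throws Exception {
            QueryableStateClient client = new QueryableStateClient("proxy-host", 9069);

            CompletableFuture<ValueState<Long>> future = client.getKvState(
                    JobID.fromHexString("00000000000000000000000000000000"),
                    "my-queryable-state",            // name passed to .asQueryableState(...)
                    "key-1",
                    BasicTypeInfo.STRING_TYPE_INFO,
                    new ValueStateDescriptor<>("state", Long.class));

            System.out.println(future.get().value());
            client.shutdownAndWait();
        }
    }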
I know from experience that Flink's shaded S3A FileSystem does not
reference core-site.xml, though I don't remember offhand what file(s) it
does reference. However, since it's shaded, maybe this could be fixed by
building a Flink FS referencing 3.3.0? Last I checked, I think it
referenced 3.1.0.
On
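One related note, assuming Flink's documented mirroring of flink-conf.yaml
keys onto the shaded client: the S3A options can be set there instead of
core-site.xml (endpoint and value below are illustrative):

    # flink-conf.yaml: s3.* keys are forwarded to the shaded S3A filesystem as fs.s3a.*
    s3.endpoint: s3.eu-west-1.amazonaws.com
    s3.fast.upload.buffer: bytebuffer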
Yes, Flink does batch processing by "reevaluating a stream," so to speak.
Presto doesn't have sources and sinks, only catalogs (which always allow
reads, and sometimes also writes).
Presto catalogs are configuration - they are managed on the node
filesystem as configuration files and nowhere else…
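To make that concrete, a sketch of such a catalog file (path, connector and
host are illustrative):

    # etc/catalog/hive.properties on each Presto node
    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://metastore-host:9083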
Both Presto and Flink make use of a Catalog in order to be able to
read/write data from a source/sink.
I don't agree that "Flink is about processing data streams," because Flink
is also competitive for batch workloads (and this will be further improved
in the next releases).
I'd like to register…
Does StreamingFileSink use core-site.xml? When I was using it, it didn't
load any configurations from core-site.xml.
On Mon, Jan 27, 2020 at 12:08 PM Mark Harris wrote:
> Hi Piotr,
>
> Thanks for the link to the issue.
>
> Do you know if there's a workaround? I've tried setting the following in
> my core-site.xml…
Hi Flavio,
Presto contributor and Starburst Partners here.
Presto and Flink are solving completely different challenges. Flink is
about processing data streams as they come in; Presto is about ad-hoc /
periodic querying of data sources.
A typical architecture would use Flink to process data streams…
Thanks Chesnay!
On Mon, 27 Jan 2020 at 11:29, Chesnay Schepler wrote:
> Please see https://issues.apache.org/jira/browse/FLINK-15068
>
> On 27/01/2020 12:22, Ahmad Hassan wrote:
>
> Hello,
>
> In our production systems, we see that the Flink RocksDB checkpoint IO
> logs are filling up disk space very quickly, in the order of GBs…
Hi Piotr,
Thanks for the link to the issue.
Do you know if there's a workaround? I've tried setting the following in my
core-site.xml:
fs.s3a.fast.upload.buffer=true
to try to avoid writing the buffer files, but the taskmanager breaks with
the same problem.
Best regards,
Mark
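A side note on that property: per Hadoop's S3A documentation,
fs.s3a.fast.upload.buffer takes disk, array or bytebuffer rather than a
boolean, so avoiding the on-disk buffer files would look like:

    fs.s3a.fast.upload.buffer=bytebuffer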
Hi all,
is there any integration between Presto and Flink? I'd like to use Presto
for the UI part (preview and so on) while using Flink for the batch
processing. Or would you suggest something else?
Best,
Flavio
Please see https://issues.apache.org/jira/browse/FLINK-15068
On 27/01/2020 12:22, Ahmad Hassan wrote:
Hello,
In our production systems, we see that the Flink RocksDB checkpoint IO
logs are filling up disk space very quickly, in the order of GBs,
as the logging is very verbose. How do we disable or suppress these logs…
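One possible workaround, sketched against Flink 1.9's OptionsFactory
interface (reworked in later releases), is to raise RocksDB's own log level
and disable the periodic stats dump; whether this covers every message
FLINK-15068 describes is an assumption:

    import org.apache.flink.contrib.streaming.state.OptionsFactory;
    import org.rocksdb.ColumnFamilyOptions;
    import org.rocksdb.DBOptions;
    import org.rocksdb.InfoLogLevel;

    public class QuietRocksDbOptionsFactory implements OptionsFactory {

        @Override
        public DBOptions createDBOptions(DBOptions currentOptions) {
            return currentOptions
                    .setInfoLogLevel(InfoLogLevel.HEADER_LEVEL) // keep only the header, drop checkpoint chatter
                    .setStatsDumpPeriodSec(0);                  // disable periodic stats dumps
        }

        @Override
        public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
            return currentOptions;
        }
    }

    // usage: rocksDbStateBackend.setOptions(new QuietRocksDbOptionsFactory());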
Hi RKandoji,
You've misunderstood this bug; your code will not go wrong.
The bug is:
TableEnv tEnv = TableEnv.create(...);
Table t1 = tEnv.sqlQuery(...);
tEnv.insertInto("sink1", t1);
tEnv.execute("job1");  // first execute on this TableEnvironment
Table t2 = tEnv.sqlQuery(...);
tEnv.insertInto("sink2", t2);
tEnv.execute("job2");  // second execute on the same TableEnvironment
This w…
Hello,
In our production systems, we see that the Flink RocksDB checkpoint IO logs
are filling up disk space very quickly, in the order of GBs, as the logging
is very verbose. How do we disable or suppress these logs, please? The
RocksDB file checkpoint.cc is dumping a huge amount of checkpoint logs…
Hi Yi
Glad to know you have already resolved it. The State Processor API will use
the DataStream API instead of the DataSet API in the future [1].
Besides, you could also follow the guide in "the broadcast state pattern" [2]:
// a map descriptor to store the name of the rule (string) and the rule itself.
MapStateDescriptor<String, Rule> ruleStateDescriptor = new MapStateDescriptor<>(
        "RulesBroadcastState",
        BasicTypeInfo.STRING_TYPE_INFO,
        TypeInformation.of(new TypeHint<Rule>() {}));
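Continuing that pattern from the docs, a sketch of how the descriptor is
then used; Event, Rule and their fields are placeholders for the
application's own types:

    // broadcast the rule stream to all parallel instances
    BroadcastStream<Rule> ruleBroadcastStream = ruleStream.broadcast(ruleStateDescriptor);

    DataStream<String> matches = eventStream
            .connect(ruleBroadcastStream)
            .process(new BroadcastProcessFunction<Event, Rule, String>() {

                @Override
                public void processElement(Event event, ReadOnlyContext ctx,
                        Collector<String> out) throws Exception {
                    // read-only access to the broadcast state on the element path
                    Rule rule = ctx.getBroadcastState(ruleStateDescriptor).get(event.ruleName);
                    if (rule != null) {
                        out.collect("event " + event + " matched rule " + rule);
                    }
                }

                @Override
                public void processBroadcastElement(Rule rule, Context ctx,
                        Collector<String> out) throws Exception {
                    // every parallel instance stores the rule under its name
                    ctx.getBroadcastState(ruleStateDescriptor).put(rule.name, rule);
                }
            });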