Hello, I'm new to Flink. Thank you for your help.
My application scenario is to process logs with a Flink program and
finally store them in HBase.
My Flink application receives log information from other systems
through Kafka. This information cannot be immediately sent to HBase
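To make this concrete, here is a minimal sketch of the pipeline I have in mind (the topic name "logs", the bootstrap address, the HBase table "log_table", and the row-key scheme are all placeholders I made up, and the sink is deliberately simplified, not production code):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

import java.util.Properties;

public class LogsToHBaseJob {

    /** Simplified sink: one Put per log line, keyed by arrival timestamp. */
    public static class HBaseLogSink extends RichSinkFunction<String> {
        private transient Connection connection;
        private transient Table table;

        @Override
        public void open(Configuration parameters) throws Exception {
            connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
            table = connection.getTable(TableName.valueOf("log_table")); // placeholder table
        }

        @Override
        public void invoke(String value, Context context) throws Exception {
            // Row key from the wall clock is just for illustration; real keys
            // should come from the log record itself to avoid collisions.
            Put put = new Put(Bytes.toBytes(String.valueOf(System.currentTimeMillis())));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("raw"), Bytes.toBytes(value));
            table.put(put);
        }

        @Override
        public void close() throws Exception {
            if (table != null) table.close();
            if (connection != null) connection.close();
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder address
        props.setProperty("group.id", "log-consumer");

        env.addSource(new FlinkKafkaConsumer<>("logs", new SimpleStringSchema(), props))
           .addSink(new HBaseLogSink());

        env.execute("logs-to-hbase");
    }
}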
How can Flink SQL access tables on Alibaba Cloud MaxCompute?
Or the Kafka topics with schemas registered in Confluent Schema Registry?
Do I need to define my own Catalog? Are there any tutorials or materials on this? Thanks!
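What I imagine is something like the snippet below, where MaxComputeCatalog is a hypothetical class I would have to implement myself against org.apache.flink.table.catalog.Catalog (the catalog and project names are placeholders):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CatalogExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // MaxComputeCatalog is hypothetical: it would implement
        // org.apache.flink.table.catalog.Catalog and map MaxCompute
        // projects/tables onto Flink databases/tables.
        tEnv.registerCatalog("odps", new MaxComputeCatalog("odps", "my_project"));
        tEnv.useCatalog("odps");

        tEnv.executeSql("SELECT * FROM my_table").print();
    }
}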
Our job just crashed while running a savepoint; it ran out of disk space. I
inspected the disk and found the following:
-rw--- 1 yarn yarn 10139680768 Dec 12 22:14 presto-s3-10125099138119182412.tmp
-rw--- 1 yarn yarn 10071916544 Dec 12 22:14 presto-s3-10363672991943897408.tmp
-rw
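If it helps, my understanding is that these presto-s3-*.tmp files are local staging files that flink-s3-fs-presto writes before uploading to S3. Assuming the plugin forwards Presto's hive.s3.staging-directory option under the s3. prefix (I have not verified this against our Flink version), something like this in flink-conf.yaml might move the staging onto a larger volume:

s3.staging-directory: /mnt/large-volume/s3-staging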
Thanks! That makes sense.
On Sat, Dec 12, 2020 at 11:13 AM Steven Wu wrote:
> This is a performance optimization in JVM when the same exception is
> thrown too frequently. You can set `-XX:-OmitStackTraceInFastThrow` to
> disable the feature. You can typically find the full stack trace in the log
> before the optimization kicks in.
> things are actually moving pretty smoothly
Do you mean the job is otherwise healthy, i.e. there is no lag, etc.?
Do you see any bottlenecks at the system level, like CPU, network, or disk I/O?
On Sat, Dec 12, 2020 at 10:54 AM Rex Fenley wrote:
> Hi,
>
> We're running a job with over 100 GiB of state.
This is a performance optimization in JVM when the same exception is
thrown too frequently. You can set `-XX:-OmitStackTraceInFastThrow` to
disable the feature. You can typically find the full stack trace in the log
before the optimization kicks in.
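For reference, in Flink you can pass this flag to the JobManager and TaskManager JVMs via flink-conf.yaml:

env.java.opts: -XX:-OmitStackTraceInFastThrow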
On Sat, Dec 12, 2020 at 2:05 AM Till Rohrmann wrote:
Hi,
We're running a job with over 100 GiB of state. For our initial
run we wanted to keep things simple, so we allocated a single core node
with 1 TaskManager, parallelism 1, and 1 TiB of storage (split across 4
disks on that machine). Overall, things are actually moving pretty
smoothly
Also, a small correction from earlier: there are 4 volumes of 256 GiB, so
that's 1 TiB total.
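For reference, the four volumes are handed to RocksDB roughly like this in flink-conf.yaml, so it round-robins its working directories across them (the mount paths below are placeholders, not our exact paths):

state.backend: rocksdb
state.backend.rocksdb.localdir: /mnt/vol1/rocksdb,/mnt/vol2/rocksdb,/mnt/vol3/rocksdb,/mnt/vol4/rocksdb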
On Sat, Dec 12, 2020 at 10:08 AM Rex Fenley wrote:
> For our first big test run, we wanted to eliminate as many variables as
> possible, so this is on 1 machine with 1 task manager and parallelism 1.
> The machine has 4 disks though, and as you can see, they mostly all use
> around the same space for storage until a savepoint is triggered.
For our first big test run, we wanted to eliminate as many variables as
possible, so this is on 1 machine with 1 task manager and parallelism 1.
The machine has 4 disks though, and as you can see, they mostly all use
around the same space for storage until a savepoint is triggered.
Could it be that giv
Noted, thanks!
On Sat, Dec 12, 2020 at 2:28 AM David Anderson wrote:
> RocksDB cannot be configured to spill to another filesystem or object
> store. It is designed as an embedded database, and each task manager needs
> to have sufficient disk space for its state on the host disk. You might be
I'm trying flatAggregate; the complete code is bug-free and available here:
https://paste.ubuntu.com/p/TM6n2jdZfr/
The result I get is:
8> (true,1,+1705471-09-26T16:50,+1705471-09-26T16:55,+1705471-09-26T16:54:59.999,4,1)
4> (true,3,+1705471-09-26T16:50,+1705471-09-26T16:55,+1705471-09-26T16:54:5
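For anyone who can't open the paste: the code follows the standard Top2 table-aggregate pattern from the Flink docs. A condensed, self-contained sketch of that pattern (illustrative, not my exact pasted code) looks like this:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.TableAggregateFunction;
import org.apache.flink.util.Collector;

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.call;
import static org.apache.flink.table.api.Expressions.row;

public class FlatAggregateExample {

    /** Mutable accumulator holding the two largest values seen so far. */
    public static class Top2Acc {
        public Integer first = Integer.MIN_VALUE;
        public Integer second = Integer.MIN_VALUE;
    }

    /** Table aggregate that emits (value, rank) rows for the top two values. */
    public static class Top2 extends TableAggregateFunction<Tuple2<Integer, Integer>, Top2Acc> {
        @Override
        public Top2Acc createAccumulator() {
            return new Top2Acc();
        }

        public void accumulate(Top2Acc acc, Integer value) {
            if (value > acc.first) {
                acc.second = acc.first;
                acc.first = value;
            } else if (value > acc.second) {
                acc.second = value;
            }
        }

        public void emitValue(Top2Acc acc, Collector<Tuple2<Integer, Integer>> out) {
            if (acc.first != Integer.MIN_VALUE) out.collect(Tuple2.of(acc.first, 1));
            if (acc.second != Integer.MIN_VALUE) out.collect(Tuple2.of(acc.second, 2));
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());
        tEnv.createTemporarySystemFunction("top2", Top2.class);

        Table input = tEnv.fromValues(row("a", 1), row("a", 3), row("b", 2))
                          .as($("key"), $("val"));

        input.groupBy($("key"))
             .flatAggregate(call("top2", $("val")).as("v", "rank"))
             .select($("key"), $("v"), $("rank"))
             .execute()
             .print();
    }
}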
RocksDB does do compaction in the background, and incremental checkpoints
simply mirror to S3 the set of RocksDB SST files needed by the current set
of checkpoints.
However, unlike checkpoints, which can be incremental, savepoints are
always full snapshots. As for why one host would have much more
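To be clear, incremental checkpointing is opt-in; with the RocksDB backend it is enabled in flink-conf.yaml like so:

state.backend: rocksdb
state.backend.incremental: true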
RocksDB cannot be configured to spill to another filesystem or object
store. It is designed as an embedded database, and each task manager needs
to have sufficient disk space for its state on the host disk. You might be
tempted to use network attached storage for the working state, but that's
usua
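The durable snapshots, on the other hand, can and should live in the object store; only the working state has to be local. For example, in flink-conf.yaml (the bucket name is a placeholder):

# working state stays on a fast local disk
state.backend.rocksdb.localdir: /local/disk/rocksdb
# durable checkpoints and savepoints go to the object store
state.checkpoints.dir: s3://my-bucket/checkpoints
state.savepoints.dir: s3://my-bucket/savepoints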
OK, then let's see whether it reoccurs. What you could do is revert the
fix and check the stack trace again.
Cheers,
Till
On Sat, Dec 12, 2020, 02:16 Dan Hill wrote:
> Hmm, I don't have a good job I can separate for reproduction. I was using
> Table SQL and inserting a long field (which was