tor?
Best,
Stephan
On Tue, Jun 12, 2018 at 4:35 PM, sihua zhou wrote:
Hi,
I would like to add more information concerning the Linked Filter
Nodes on each key group. The reason we need to maintain Linked Filter
Nodes is that we need to handle data skew; data skew is al
Sihua Zhou created FLINK-9661:
-
Summary: TTL state should support to do time shift after restoring
from checkpoint( savepoint).
Key: FLINK-9661
URL: https://issues.apache.org/jira/browse/FLINK-9661
Hi Amol,
I think if you set the parallelism of the source node equal to the number of
partitions of the Kafka topic, you could have one Kafka consumer per
partition in your job. But if the number of partitions of the Kafka topic is
dynamic, the 1:1 relationship might break. I think maybe @Gor
I am delighted to announce Sihua Zhou as a new Flink
committer!
Sihua has been an active member of our community for several months. Among
other things, he helped develop Flip-6, improved Flink's state backends
and fixed a lot of major and minor issues. Moreover, he is helping the
Sihua Zhou created FLINK-9633:
-
Summary: Flink doesn't use the Savepoint path's filesystem to
create the OuptutStream on Task.
Key: FLINK-9633
URL: https://issues.apache.org/jira/browse/
n the order but only for the out-of-orderness
period of time, which also increases latency.
Cheers,
Andrey
On 19 Jun 2018, at 14:12, sihua zhou wrote:
Hi Amol,
I'm not sure whether this is impossible, especially when you need to
process the records with multiple parallel instances.
IMO, in theory, we can
Sihua Zhou created FLINK-9622:
-
Summary: DistributedCacheDfsTest failed on travis
Key: FLINK-9622
URL: https://issues.apache.org/jira/browse/FLINK-9622
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-9619:
-
Summary: Always close the task manager connection when the
container is completed in YarnResourceManager
Key: FLINK-9619
URL: https://issues.apache.org/jira/browse/FLINK-9619
Hi Amol,
I'm not sure whether this is impossible, especially when you need to process
the records with multiple parallel instances.
IMO, in theory, we can only get an ordered stream when there is a single
Kafka partition and it is processed with a parallelism of one in Flink. Even in
this case, if you on
Sihua Zhou created FLINK-9613:
-
Summary: YARNSessionCapacitySchedulerITCase failed because
YarnTestBase.checkClusterEmpty()
Key: FLINK-9613
URL: https://issues.apache.org/jira/browse/FLINK-9613
Project
Sihua Zhou created FLINK-9601:
-
Summary: Snapshot of CopyOnWriteStateTable will failed, when the
amount of record is more than MAXIMUM_CAPACITY
Key: FLINK-9601
URL: https://issues.apache.org/jira/browse/FLINK-9601
e
for the future incoming records, so we get a Linked Filter Node on each key
group and only the head Node is writable, the rest are immutable.
Best, Sihua
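A rough sketch of the structure described above, for readers following the thread: a chain of filter nodes per key group where only the head node accepts writes and older nodes are read-only. This is an illustration, not the code from the proposal. The class names (FilterNode, LinkedFilter), the capacity-based rule for creating a new head, and the bit-array sizes are all assumptions made for the example.

```python
import hashlib


class FilterNode:
    """One fixed-capacity Bloom filter in the chain (hypothetical layout)."""

    def __init__(self, capacity, num_bits=1024, num_hashes=3):
        self.capacity = capacity          # inserts allowed before the node is sealed
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for readability
        self.count = 0
        self.next = None                  # older node, treated as immutable

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def is_full(self):
        return self.count >= self.capacity

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1
        self.count += 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class LinkedFilter:
    """Chain of filter nodes for one key group; only the head is written to."""

    def __init__(self, node_capacity=100):
        self.node_capacity = node_capacity
        self.head = FilterNode(node_capacity)

    def add(self, key):
        # Assumed growth rule: when the head is full, seal it and link a new head.
        if self.head.is_full():
            new_head = FilterNode(self.node_capacity)
            new_head.next = self.head
            self.head = new_head
        self.head.add(key)

    def might_contain(self, key):
        # A key may live in any node, so the lookup walks the whole chain.
        node = self.head
        while node is not None:
            if node.might_contain(key):
                return True
            node = node.next
        return False

    def num_nodes(self):
        n, node = 0, self.head
        while node is not None:
            n += 1
            node = node.next
        return n
```

With this layout a skewed key group simply grows a longer chain, while a quiet key group keeps a single small node. Lookups cost more as the chain grows, which is one reason a TTL on old nodes matters.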
On 06/12/2018 16:22,sihua zhou wrote:
Hi Fabian,
Thanks a lot for your reply, you are right that users would need to configure a
TTL for t
the current node
which would not be required if we don't remove nodes, right?
From the small summary of approximated filters, cuckoo filters seem to be most
appropriate as they also support deletes.
Are you aware of any downsides compared to bloom filters (besides potentially
slower inser
e time to do it currently), even though I
think the design can definitely be improved, but maybe we could discuss the
improvements better based on the code, and I believe most of the code could be
cherry-picked for the "final implementation". Does anyone object to this?
Best, Sihua
On 06/6/20
Hi Stephan,
Thanks very much for your response! That gave me the confidence to continue to
work on the Elastic Filter. But even though we have implemented it (based on
1.3.2) and used it in production for several months, if there is a committer
willing to guide me (since it's not a very tri
Sihua Zhou created FLINK-9546:
-
Summary: The heartbeatTimeoutIntervalMs of HeartbeatMonitor should
be larger than 0
Key: FLINK-9546
URL: https://issues.apache.org/jira/browse/FLINK-9546
Project: Flink
Hi,
Sorry, but pinging for more feedback on this proposal...
Even negative feedback is highly appreciated!
Best, Sihua
On 05/30/2018 13:19,sihua zhou wrote:
Hi,
I did a survey of the variants of Bloom Filter and the Cuckoo filter these
days. Finally, I found 3 of them maybe
Hi Stephan,
could you please also consider the "Elastic Filter" feature discussed in the
thread
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/PROPOSAL-Introduce-Elastic-Bloom-Filter-For-Flink-td22430.html
?
Best, Sihua
On 06/4/2018 17:21,Stephan Ewen wrote:
Hi Flink Comm
l free to give any feedback and comments.
Thanks,
Andrey
On 27 May 2018, at 09:46, sihua zhou wrote:
Hi Bowen,
Thanks for your clarification, I agree that we should wait for the timer on
RocksDB to be finished; after that we could even do some micro-benchmarks before
starting to implement.
Sihua Zhou created FLINK-9480:
-
Summary: Let local recovery support rescaling
Key: FLINK-9480
URL: https://issues.apache.org/jira/browse/FLINK-9480
Project: Flink
Issue Type: Improvement
Sihua Zhou created FLINK-9479:
-
Summary: Let the rescale API to use local recovery
Key: FLINK-9479
URL: https://issues.apache.org/jira/browse/FLINK-9479
Project: Flink
Issue Type: Improvement
Sihua Zhou created FLINK-9475:
-
Summary: introduce an approximate version of "select distinct"
Key: FLINK-9475
URL: https://issues.apache.org/jira/browse/FLINK-9475
Project: Flink
cs/article/viewFile/826/814)"(compare
to the paper, the approach I outlined could have a better query performance and
also supports the RELAXED TTL); maybe it can help in understanding the design doc.
Looking forward to any feedback!
Best, Sihua
On 05/24/2018 10:36,sihua zhou wrote:
Hi,
Thanks f
Sihua Zhou created FLINK-9474:
-
Summary: Introduce an approximate version of "count distinct"
Key: FLINK-9474
URL: https://issues.apache.org/jira/browse/FLINK-9474
Project: Flink
Issue
Sihua Zhou created FLINK-9468:
-
Summary: get outputLimit of LimitedConnectionsFileSystem
incorrectly
Key: FLINK-9468
URL: https://issues.apache.org/jira/browse/FLINK-9468
Project: Flink
Issue
I also +1 for this very good proposal!
In general, the design is good, especially the part related to the timer on
Heap. Regarding the timer on RocksDB, I think there is still room for some
improvement; I just left comments on the doc.
Best, Sihua
On 0
FLIP. I
suggest we wait and keep a close eye on those efforts, and as they mature,
we'll have a much better idea of the whole picture.
Thanks, Bowen
On Sat, May 26, 2018 at 7:52 AM, sihua zhou wrote:
Hi,
thanks for your reply Fabian, about the overhead of storing the key bytes
twi
That would only work efficiently if we relax the clean-up logic which could be
a valid design decision.
Best, Fabian
2018-05-14 9:33 GMT+02:00 sihua zhou :
Hi Fabian,
thank you very much for the reply. Just an alternative: can we implement the
TTL logic in `AbstractStateBackend` and `Ab
Hi,
Thanks for your suggestions @Elias! I had a brief look at "Cuckoo Filter" and
"Golomb Compressed Sequence"; my first impression is that "Golomb
Compressed Sequence" may not be a good choice, because it seems to require
non-constant lookup time, but Cuckoo Filter may be a good choice, I shou
Sihua Zhou created FLINK-9426:
-
Summary: Harden RocksDBWriteBatchPerformanceTest.benchMark()
Key: FLINK-9426
URL: https://issues.apache.org/jira/browse/FLINK-9426
Project: Flink
Issue Type: Bug
1]), you mentioned that
the Bloom Filters would be growing.
If we keep them in memory, how can we prevent them from exceeding memory
boundaries over time?
Best,
Fabian
[1] https://issues.apache.org/jira/browse/FLINK-8918
2018-05-23 9:56 GMT+
Hi Devs!
I propose to introduce "Elastic Bloom Filter" for Flink. The reason I am making
this proposal is that it helped us a lot in production: it lets us improve
performance while reducing resource consumption. Here is a brief description
of the motivation and why it's so powerful, more deta
Hi,
just one minor thing, I found the JIRA release notes seem a bit inconsistent
with this RC. For example, https://issues.apache.org/jira/browse/FLINK-9058
hasn't been merged yet but included in the release notes, and
https://issues.apache.org/jira/browse/FLINK-9070 has been merged but no
Sihua Zhou created FLINK-9401:
-
Summary: Data lost when rescaling the job from incremental
checkpoint
Key: FLINK-9401
URL: https://issues.apache.org/jira/browse/FLINK-9401
Project: Flink
Issue
I will update the PR immediately once I finish dinner.
Best, Sihua
sihua zhou
Email: summerle...@163.com
On 05/17/2018 18:29, Till Rohrmann wrote:
Hi Sihua,
thanks for making me aware. This sounds indeed like a problem which might
cause the data loss. I think it's
Hi,
And we found this one[1]. It is an issue that could lead to data
loss (checkpoint & restore) when people use the RocksDBBackend, caused by
the not-so-nice APIs of RocksIterator... I had a hard time believing it, but this
seems to affect all the released versions, so I'm not sure whether i
Sihua Zhou created FLINK-9373:
-
Summary: Always call RocksIterator.status() to check the internal
error of RocksDB
Key: FLINK-9373
URL: https://issues.apache.org/jira/browse/FLINK-9373
Project: Flink
Sihua Zhou created FLINK-9364:
-
Summary: Add doc for the memory usage in flink
Key: FLINK-9364
URL: https://issues.apache.org/jira/browse/FLINK-9364
Project: Flink
Issue Type: Improvement
Moreover, using the same timer service and using the public state APIs helps to
have a consistent TTL behavior across different state backends.
Best, Fabian
2018-05-14 8:51 GMT+02:00 sihua zhou :
Hi Bowen,
thanks for your doc! I left some comments on the doc; my main concern is
that it mak
Hi Bowen,
thanks for your doc! I left some comments on the doc; my main concern is
that it feels like unnecessary coupling for the TTL to depend on `timer`.
I think the TTL is a property of the state, so it should be backed by
the state backend. If we implement the TTL based on th
Sihua Zhou created FLINK-9351:
-
Summary: RM stop assigning slot to Job because the TM killed
before connecting to JM successfully
Key: FLINK-9351
URL: https://issues.apache.org/jira/browse/FLINK-9351
Sihua Zhou created FLINK-9325:
-
Summary: generate the _meta file only when the writing is totally
successful
Key: FLINK-9325
URL: https://issues.apache.org/jira/browse/FLINK-9325
Project: Flink
Sihua Zhou created FLINK-9269:
-
Summary: Concurrency problem in HeapKeyedStateBackend when
performing checkpoint async
Key: FLINK-9269
URL: https://issues.apache.org/jira/browse/FLINK-9269
Project: Flink
Sihua Zhou created FLINK-9263:
-
Summary: Kafka010ITCase failed on travis flaky
Key: FLINK-9263
URL: https://issues.apache.org/jira/browse/FLINK-9263
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-9260:
-
Summary: Introduce a friendly way to resume the job from
externalized checkpoints automatically
Key: FLINK-9260
URL: https://issues.apache.org/jira/browse/FLINK-9260
Sihua Zhou created FLINK-9251:
-
Summary: Move MemoryStateBackend to flink-state-backends
Key: FLINK-9251
URL: https://issues.apache.org/jira/browse/FLINK-9251
Project: Flink
Issue Type
Sihua Zhou created FLINK-9243:
-
Summary:
SuccessAfterNetworkBuffersFailureITCase#testSuccessfulProgramAfterFailure is
unstable
Key: FLINK-9243
URL: https://issues.apache.org/jira/browse/FLINK-9243
Sihua Zhou created FLINK-9174:
-
Summary: The type of state created in
ProccessWindowFunction.proccess() is inconsistency
Key: FLINK-9174
URL: https://issues.apache.org/jira/browse/FLINK-9174
Project
Sihua Zhou created FLINK-9116:
-
Summary: Introduce getAll and removeAll for MapState
Key: FLINK-9116
URL: https://issues.apache.org/jira/browse/FLINK-9116
Project: Flink
Issue Type: New Feature
Sihua Zhou created FLINK-9102:
-
Summary: Make the JobGraph disable queued scheduling for
Flip6LocalStreamEnvironment
Key: FLINK-9102
URL: https://issues.apache.org/jira/browse/FLINK-9102
Project: Flink
Sihua Zhou created FLINK-9028:
-
Summary: flip6 should check config before starting cluster
Key: FLINK-9028
URL: https://issues.apache.org/jira/browse/FLINK-9028
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-9022:
-
Summary: fix resource close in
`StreamTaskStateInitializerImpl.streamOperatorStateContext()`
Key: FLINK-9022
URL: https://issues.apache.org/jira/browse/FLINK-9022
Project
Sihua Zhou created FLINK-8968:
-
Summary: Fix native resource leak caused by ReadOptions
Key: FLINK-8968
URL: https://issues.apache.org/jira/browse/FLINK-8968
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-8927:
-
Summary: Eagerly release the checkpoint object created from RocksDB
Key: FLINK-8927
URL: https://issues.apache.org/jira/browse/FLINK-8927
Project: Flink
Issue
Sihua Zhou created FLINK-8918:
-
Summary: Introduce Runtime Filter Join
Key: FLINK-8918
URL: https://issues.apache.org/jira/browse/FLINK-8918
Project: Flink
Issue Type: Bug
Components
Sihua Zhou created FLINK-8859:
-
Summary: RocksDB backend should pass WriteOption to Rocks.put()
when restoring
Key: FLINK-8859
URL: https://issues.apache.org/jira/browse/FLINK-8859
Project: Flink
Sihua Zhou created FLINK-8846:
-
Summary: Introducing `parallel recovery` mode for incremental
checkpoint
Key: FLINK-8846
URL: https://issues.apache.org/jira/browse/FLINK-8846
Project: Flink
Sihua Zhou created FLINK-8845:
-
Summary: Introducing `parallel recovery` mode for fully
checkpoint (savepoint)
Key: FLINK-8845
URL: https://issues.apache.org/jira/browse/FLINK-8845
Project: Flink
Sihua Zhou created FLINK-8817:
-
Summary: Decrement numPendingContainerRequests only when request
container successfully
Key: FLINK-8817
URL: https://issues.apache.org/jira/browse/FLINK-8817
Project
Sihua Zhou created FLINK-8816:
-
Summary: Remove the oldWorker only after starting newWorker
successfully in registerTaskExecutorInternal()
Key: FLINK-8816
URL: https://issues.apache.org/jira/browse/FLINK-8816
Sihua Zhou created FLINK-8790:
-
Summary: Improve performance for recovery from incremental
checkpoint
Key: FLINK-8790
URL: https://issues.apache.org/jira/browse/FLINK-8790
Project: Flink
Issue
Sihua Zhou created FLINK-8777:
-
Summary: improve resource release when recovery from failover
Key: FLINK-8777
URL: https://issues.apache.org/jira/browse/FLINK-8777
Project: Flink
Issue Type
Sihua Zhou created FLINK-8753:
-
Summary: Introduce Incremental savepoint
Key: FLINK-8753
URL: https://issues.apache.org/jira/browse/FLINK-8753
Project: Flink
Issue Type: New Feature
Sihua Zhou created FLINK-8699:
-
Summary: Fix concurrency problem in rocksdb full checkpoint
Key: FLINK-8699
URL: https://issues.apache.org/jira/browse/FLINK-8699
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-8679:
-
Summary: RocksDBKeyedBackend.getKeys(stateName, namespace) doesn't
filter data with namespace
Key: FLINK-8679
URL: https://issues.apache.org/jira/browse/FLINK-8679
Pr
Sihua Zhou created FLINK-8676:
-
Summary: Memory Leak in AbstractKeyedStateBackend.applyToAllKeys()
when backend is base on RocksDB
Key: FLINK-8676
URL: https://issues.apache.org/jira/browse/FLINK-8676
Sihua Zhou created FLINK-8657:
-
Summary: Fix incorrect description for external checkpoint vs
savepoint
Key: FLINK-8657
URL: https://issues.apache.org/jira/browse/FLINK-8657
Project: Flink
Sihua Zhou created FLINK-8639:
-
Summary: Fix always need to seek multiple times when iterator
RocksDBMapState
Key: FLINK-8639
URL: https://issues.apache.org/jira/browse/FLINK-8639
Project: Flink
Sihua Zhou created FLINK-8602:
-
Summary: Accelerate recover from failover when use incremental
checkpoint
Key: FLINK-8602
URL: https://issues.apache.org/jira/browse/FLINK-8602
Project: Flink
Sihua Zhou created FLINK-8601:
-
Summary: Introduce LinkedBloomFilterState for Approximate
calculation and other situations of performance optimization
Key: FLINK-8601
URL: https://issues.apache.org/jira/browse/FLINK
Sihua Zhou created FLINK-8044:
-
Summary: Introduce scheduling mechanism to satisfy both state
locality and input
Key: FLINK-8044
URL: https://issues.apache.org/jira/browse/FLINK-8044
Project: Flink
Sihua Zhou created FLINK-8018:
-
Summary: RMQ does not support disabling queueDeclare, when the
user has no declaration permissions, it cannot connect
Key: FLINK-8018
URL: https://issues.apache.org/jira/browse/FLINK
Sihua Zhou created FLINK-7873:
-
Summary: Introduce HybridStreamStateHandle for quick recovery from
checkpoint.
Key: FLINK-7873
URL: https://issues.apache.org/jira/browse/FLINK-7873
Project: Flink
Sihua Zhou created FLINK-7219:
-
Summary: Current allocate strategy can't achieve the optimal
effect with input's location
Key: FLINK-7219
URL: https://issues.apache.org/jira/browse/FLINK-7219
Pr
Sihua Zhou created FLINK-7218:
-
Summary: ExecutionVertex.getPreferredLocationsBasedOnInputs() will
always return empty
Key: FLINK-7218
URL: https://issues.apache.org/jira/browse/FLINK-7218
Project: Flink
Sihua Zhou created FLINK-7180:
-
Summary: CoGroupStream perform checkpoint failed
Key: FLINK-7180
URL: https://issues.apache.org/jira/browse/FLINK-7180
Project: Flink
Issue Type: Bug
Sihua Zhou created FLINK-7160:
-
Summary: Support hive like udtf
Key: FLINK-7160
URL: https://issues.apache.org/jira/browse/FLINK-7160
Project: Flink
Issue Type: New Feature
Components
Sihua Zhou created FLINK-7153:
-
Summary: JM can't allocate source for ExecutionGraph correctly
Key: FLINK-7153
URL: https://issues.apache.org/jira/browse/FLINK-7153
Project: Flink
Issue
Sihua Zhou created FLINK-6980:
-
Summary: TypeExtractor.getForObject can't get typeinfo correctly.
Key: FLINK-6980
URL: https://issues.apache.org/jira/browse/FLINK-6980
Project: Flink
Issue