Thanks for driving the discussion, Shuiqiang, and sorry for chiming in late.
> However, all the state access will be synchronized in the Java operator and so there will be no concurrent access to the state backend.
Could you add a section to explicitly mention this in the FLIP document? I think
wym_maozi created FLINK-20839:
Summary: Some mistakes in state.md and state_zh.md
Key: FLINK-20839
URL: https://issues.apache.org/jira/browse/FLINK-20839
Project: Flink
Issue Type: Bug
Leonard Xu created FLINK-20840:
Summary: Projection pushdown doesn't work in temporal (lookup) join
Key: FLINK-20840
URL: https://issues.apache.org/jira/browse/FLINK-20840
Project: Flink
Issue
Hi Deep,
Flink has dropped support for specifying the number of TMs via -n since the
introduction of FLIP-6. Since then, Flink will automatically start TMs
depending on the required resources. Hence, there is no need to specify the
-n parameter anymore. Instead, you should specify the parallelism
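For reference, a minimal sketch of the post-FLIP-6 invocation (the jar path and parallelism are illustrative):

    # Pre-FLIP-6 flags such as -n are gone; set the parallelism instead
    # and TaskManagers are started on demand:
    ./bin/flink run -p 8 ./examples/streaming/WordCount.jar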
Matthias created FLINK-20841:
Summary: Fix compile error due to duplicated generated files
Key: FLINK-20841
URL: https://issues.apache.org/jira/browse/FLINK-20841
Project: Flink
Issue Type: Bug
zhouenning created FLINK-20842:
Summary: Data row is smaller than a column index, internal schema
representation is probably out of sync with real database schema
Key: FLINK-20842
URL: https://issues.apache.org/jira
Hi Yu,
Thanks a lot for your suggestions.
I have addressed your inlined comments in the FLIP and also added a new
section "State backed access synchronization" that explains the way to make
sure there is no concurrent access to the state backend. Please have a look.
Best,
Shuiqiang
Yu Li wrote on 202
Aljoscha Krettek created FLINK-20843:
Summary: UnalignedCheckpointITCase is unstable
Key: FLINK-20843
URL: https://issues.apache.org/jira/browse/FLINK-20843
Project: Flink
Issue Type: Bug
Hi all,
I'd like to start a discussion about introducing a few convenient operations in
Table API from the perspective of ease of use.
Currently some tasks are not easy to express in the Table API, e.g.
deduplication, top-n, etc., or are not easy to express when there are hundreds
of columns in a table,
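To make the pain point concrete, here is a sketch of how deduplication has to be written today, falling back from the Table API to SQL (the table and column names are made up):

    // Today: deduplication needs a SQL detour via ROW_NUMBER();
    // the proposal would offer a first-class Table API call instead.
    Table deduped = tEnv.sqlQuery(
        "SELECT id, name, proc FROM ("
            + " SELECT *, ROW_NUMBER() OVER ("
            + "   PARTITION BY id ORDER BY proc DESC) AS rn"
            + " FROM Orders)"
            + " WHERE rn = 1");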
Hi Dian,
Big +1 for making the Table API easier to use. Java users and Python users can
both benefit from it. I think it would be better if we add some Python API
examples.
Best,
Wei
> On Jan 4, 2021, at 20:03, Dian Fu wrote:
>
> Hi all,
>
> I'd like to start a discussion about introducing a few con
I agree, we should allow streaming operators to use managed memory for
other use cases.
Do you think we need an additional "consumer" setting or that they would
just use `DATAPROC` and decide by themselves what to use the memory for?
Best,
Aljoscha
On 2020/12/22 17:14, Jark Wu wrote:
Hi all
Rui Li created FLINK-20844:
Summary: Support add jar in SQL client
Key: FLINK-20844
URL: https://issues.apache.org/jira/browse/FLINK-20844
Project: Flink
Issue Type: New Feature
Component
Nick Burkard created FLINK-20845:
Summary: Drop support for Scala 2.11
Key: FLINK-20845
URL: https://issues.apache.org/jira/browse/FLINK-20845
Project: Flink
Issue Type: Sub-task
Co
This makes sense. I have some questions about the method names.
What do you think about renaming `dropDuplicates` to `deduplicate`? I don't
think that drop is the right word to use for this operation; it implies
records are filtered, whereas this operator actually issues updates and
retractions. Also, de
Thanks Till for the detailed explanation. I tried it and it is working fine.
Once again thanks for your quick response.
Regards,
-Deep
On Mon, 4 Jan 2021 at 2:20 PM, Till Rohrmann wrote:
> Hi Deep,
>
> Flink has dropped support for specifying the number of TMs via -n since the
> introduction of F
Hello,
in FLINK-20790 we have changed where generated Avro files are
placed. Until then they were put directly under the src/ tree, with some
committed to git and others being ignored via .gitignore.
This has caused issues when switching branches (specifically, not being
able to compile 1.11 afte
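Until the layout question is settled, one workaround when a branch switch leaves stale generated files behind is to wipe untracked build outputs and rebuild (the module path here is illustrative, not necessarily the affected one):

    # Careful: git clean -dfx deletes ALL untracked files under the path.
    git clean -dfx flink-formats/flink-avro
    mvn clean install -DskipTests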
Hi Dian,
thanks for the proposed FLIP. I haven't taken a deep look at the
proposal yet but will do so shortly. In general, we should aim to make
the Table API as concise and self-explaining as possible. E.g. `dropna`
does not sound obvious to me.
Regarding `myTable.coalesce($("a"), 1).as("a"
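For comparison, a sketch of what that expression has to look like with today's API, spelled out via `ifThenElse` (the column name and default value are taken from the example above):

    import static org.apache.flink.table.api.Expressions.*;

    // Manual equivalent of the proposed coalesce on column "a":
    Table result = myTable.select(
        ifThenElse($("a").isNull(), lit(1), $("a")).as("a"));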
Till Rohrmann created FLINK-20846:
Summary: Decouple checkpoint services from CheckpointCoordinator
Key: FLINK-20846
URL: https://issues.apache.org/jira/browse/FLINK-20846
Project: Flink
Iss
Roman Khachatryan created FLINK-20847:
Summary: Update CompletedCheckpointStore.shutdown() signature
Key: FLINK-20847
URL: https://issues.apache.org/jira/browse/FLINK-20847
Project: Flink
Local aggregation is more of a physical operator than a logical operator. I
would suggest going with idea #1.
Best,
Kurt
On Wed, Dec 30, 2020 at 8:31 PM Sebastian Liu wrote:
> Hi Jark, Thx a lot for your quick reply and valuable suggestions.
> For (1): Agree: Since we are in the period
Hi Kurt,
Thx a lot for your feedback. If local aggregation is more of a physical
operator than a logical operator, I think your suggestion should be idea #2,
which handles everything in the physical optimization phase?
Looking forward to further discussion.
Kurt Young wrote on Tue, Jan 5, 2021 at 9:52 AM:
Sorry for the typo -_-!
I meant idea #2.
Best,
Kurt
On Tue, Jan 5, 2021 at 10:59 AM Sebastian Liu wrote:
> Hi Kurt,
>
> Thx a lot for your feedback. If local aggregation is more like a physical
> operator rather than logical
> operator, I think your suggestion should be idea #2 which handle a
Hi Aljoscha,
I think we may need to divide `DATAPROC` into `OPERATOR` and
`STATE_BACKEND`, because they have different scopes (slot vs. operator).
But @Xintong Song may have more insights on it.
Best,
Jark
On Mon, 4 Jan 2021 at 20:44, Aljoscha Krettek wrote:
> I agree, we should allow streami
Thanks Dian,
+1 to `deduplicate`.
Regarding `myTable.coalesce($("a"), 1).as("a")`, I'm afraid it may conflict
with or be confused with the built-in expression `coalesce(f0, 0)` (which we
may introduce in the future).
Besides that, could we also align other features of Flink SQL, e.g.
event-time/processing-time temp
Thanks a lot for your comments!
Regarding the Python Table API examples: I thought it would be straightforward
how to use these operations in the Python Table API and so had not added
them. However, the suggestions make sense to me and I have added some examples
about how to use them in Pyth
I'm also +1 for idea #2.
Regarding the updated interface,

    Result applyAggregates(List aggregateExpressions,
            int[] groupSet, DataType aggOutputDataType);

    final class Result {
        private final List acceptedAggregates;
        private final List remainingAggregates;
    }

I have the following co
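To illustrate how the two lists would be used, here is a sketch of a source implementing the proposed method. The generic parameters were lost in the mail, so `CallExpression`, the helper `canPushDown`, and the two-argument `Result` constructor are assumptions, not part of the proposal:

    // Illustrative only: partition the pushed-down aggregates into those
    // the external system can compute and those Flink keeps running itself.
    public Result applyAggregates(
            List<CallExpression> aggregateExpressions,
            int[] groupSet,
            DataType aggOutputDataType) {
        List<CallExpression> accepted = new ArrayList<>();
        List<CallExpression> remaining = new ArrayList<>();
        for (CallExpression agg : aggregateExpressions) {
            (canPushDown(agg) ? accepted : remaining).add(agg);
        }
        return new Result(accepted, remaining);
    }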
Thanks for the clarification. I have resolved all of the comments and added
a conclusion section.
Looking forward to further feedback from the community. If we reach
consensus on the design doc, I can push forward the implementation-related work.
Kurt Young wrote on Tue, Jan 5, 2021 at 11:04 AM:
> Sorry for the typ
+1 for allowing streaming operators to use managed memory.
As for the consumer names, I'm afraid using `DATAPROC` for both streaming
ops and state backends will not work. Currently, RocksDB state backend uses
a shared piece of memory for all the states within that slot. It's not the
operator's dec
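For context, this is how the consumer weights are configured today; a sketch of the flink-conf.yaml entry as of 1.12 (the weights shown are the defaults), which a `DATAPROC` split would presumably extend with new consumer names:

    # Share managed memory between consumers by weight:
    taskmanager.memory.managed.consumer-weights: DATAPROC:70,PYTHON:30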
+1 to Xintong's proposal!
Best,
Jark
On Tue, 5 Jan 2021 at 12:13, Xintong Song wrote:
> +1 for allowing streaming operators to use managed memory.
>
> As for the consumer names, I'm afraid using `DATAPROC` for both streaming
> ops and state backends will not work. Currently, RocksDB state back
Thanks Shuiqiang for driving this.
The design looks good to me. +1 to start the vote if there are no more comments.
Regards,
Dian
> On Jan 4, 2021, at 7:40 PM, Shuiqiang Chen wrote:
>
> Hi Yu,
>
> Thanks a lot for your suggestions.
>
> I have addressed your inlined comments in the FLIP and also added a
+1 for allowing streaming operators to use managed memory.
The memory use of streams requires some hierarchy, and the bottom layer is
undoubtedly the current StateBackend.
Letting the stream operators freely use the managed memory will make the
memory management model unified and give the
Hello dev:
Now I encounter a problem when using the method
"Catalog#listPartitions(ObjectPath, CatalogPartitionSpec)".
I found that the partitionSpec type in CatalogPartitionSpec is
Map<String, String>.
This is no problem for HiveCatalog, but my subclass of Catalog needs
precise types. For example,
Hi Jun,
I'm curious why it doesn't work when the value is represented as a string?
You can get the field type from CatalogTable#getSchema(),
then parse/cast the partition value to the type you want.
Best,
Jark
On Tue, 5 Jan 2021 at 13:43, Jun Zhang wrote:
> Hello dev:
> Now I encounter a problem w
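A minimal sketch of the suggested workaround, assuming a partition column `dt` of type INT (the names `catalog`, `tablePath`, and `spec` are surrounding context; error handling is elided):

    // The spec values are always Strings; recover the typed value
    // by looking the column's DataType up in the table schema.
    CatalogTable table = (CatalogTable) catalog.getTable(tablePath);
    Optional<DataType> dtType = table.getSchema().getFieldDataType("dt");
    String raw = spec.getPartitionSpec().get("dt");
    int typed = Integer.parseInt(raw); // cast according to dtType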
Hi Dev,
I'd like to start a discussion about whether we can let catalog handle
temporary catalog objects. Currently temporary table/view is managed by
CatalogManager, and temporary function is managed by FunctionCatalog.
This causes problems when I try to support temporary Hive tables/functions.
Fo
Hi Jark,
Thx for your further feedback and help. The interface of
SupportsAggregatePushDown may indeed need some adjustments.
For (1) Agree: Yeah, the upstream only needs to know if the TableSource can
handle all of the aggregates.
It's better to just return a boolean type to indicate whether all
Hi Jark,
If the partition type is int and we pass in a string type, the system will
throw an exception that the types do not match. We can indeed cast by getting
the schema, but I think if CatalogPartitionSpec#partitionSpec were of type
Map<String, Object>, there would be no need to do the cast operation, and the
universal and
Thanks for your proposal, Sebastian!
+1 for SupportsAggregatePushDown. The above wonderful discussion has solved
many of my concerns.
## Semantic problems
We may need to add some mechanisms or comments, because as far as I know,
the semantics of each database are actually different, which may nee
Qingsheng Ren created FLINK-20848:
Summary: Kafka consumer ID is not specified correctly in new
KafkaSource
Key: FLINK-20848
URL: https://issues.apache.org/jira/browse/FLINK-20848
Project: Flink
Hi Jun,
AFAIK, the main reason to use Map<String, String> is that it's easy for
serialization and deserialization.
For example, if we used Java `LocalDateTime` instead of String to represent
a TIMESTAMP partition value,
then users might deserialize it into a Java `Timestamp` for the Flink
framework, which may cause problems.
Hi Jark,
Thanks for your explanation. I am working on the integration of Flink and
Iceberg. The Iceberg partition needs to be of an accurate type, and I cannot
modify it.
I will follow your suggestion: get the column type from the schema, and then
do the cast.
Jark Wu wrote on Tue, Jan 5, 2021 at 3:05 PM:
> Hi Jun,
>
Qingsheng Ren created FLINK-20849:
Summary: Improve JavaDoc and logging of new KafkaSource
Key: FLINK-20849
URL: https://issues.apache.org/jira/browse/FLINK-20849
Project: Flink
Issue Type: