Qingsheng Ren created FLINK-20849:
-
Summary: Improve JavaDoc and logging of new KafkaSource
Key: FLINK-20849
URL: https://issues.apache.org/jira/browse/FLINK-20849
Project: Flink
Issue Type:
Hi, Jark:
Thanks for your explanation. I am doing the integration of Flink and
Iceberg. The Iceberg partition needs to be of the exact type, and I cannot
modify it.
I will follow your suggestion: get the column type from the schema, and then
do the cast.
Jark Wu wrote on Tuesday, January 5, 2021 at 3:05 PM:
> Hi Jun,
>
Hi Jun,
AFAIK, the main reason to use Map is because it's easy for
serialization and deserialization.
For example, if we use Java `LocalDateTime` instead of String to represent
a TIMESTAMP partition value, then users may deserialize it into a Java
`Timestamp` and pass that to the Flink framework, which may cause problems.
Qingsheng Ren created FLINK-20848:
-
Summary: Kafka consumer ID is not specified correctly in new
KafkaSource
Key: FLINK-20848
URL: https://issues.apache.org/jira/browse/FLINK-20848
Project: Flink
Thanks for your proposal, Sebastian!
+1 for SupportsAggregatePushDown. The above wonderful discussion has solved
many of my concerns.
## Semantic problems
We may need to add some mechanisms or comments, because as far as I know,
the semantics of each database is actually different, which may nee
Hi, Jark:
If the partition type is int and we pass in a string type, the system will
throw an exception that the type does not match. We can indeed cast by getting
the schema, but I think if CatalogPartitionSpec#partitionSpec is of type
Map, there is no need to do a cast operation, and the
universal and
Hi Jark,
Thanks for your further feedback and help. The interface of
SupportsAggregatePushDown may indeed need some adjustments.
For (1), agreed: the upstream only needs to know whether the TableSource can
handle all of the aggregates.
It's better to just return a boolean type to indicate whether all
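The "all or nothing" boolean variant described above can be sketched
standalone in Python (the function name and inputs are illustrative
assumptions, not the actual FLIP interface):

```python
# Sketch: the source reports whether it can handle ALL of the pushed-down
# aggregates; otherwise the planner keeps the original plan untouched.
def apply_aggregates(aggregate_expressions, supported):
    """Return True only if every aggregate can be pushed into the source."""
    return all(a in supported for a in aggregate_expressions)

can_push = apply_aggregates(["SUM(a)", "COUNT(*)"], {"SUM(a)", "COUNT(*)", "AVG(b)"})
cannot_push = apply_aggregates(["SUM(a)", "MAX(c)"], {"SUM(a)"})
```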
Hi Dev,
I'd like to start a discussion about whether we can let catalog handle
temporary catalog objects. Currently temporary table/view is managed by
CatalogManager, and temporary function is managed by FunctionCatalog.
This causes problems when I try to support temporary hive table/function.
Fo
Hi Jun,
I'm curious why it doesn't work when represented as a string?
You can get the field type from CatalogTable#getSchema(),
then parse/cast the partition value to the type you want.
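The suggested parse/cast step can be sketched standalone in Python, outside
any Flink API (the type names and parser table are illustrative assumptions):

```python
from datetime import datetime

# Hypothetical mapping from logical type name to a parser for the
# string-encoded partition value (illustrative, not Flink's actual API).
PARSERS = {
    "INT": int,
    "BIGINT": int,
    "STRING": str,
    "TIMESTAMP": datetime.fromisoformat,
}

def cast_partition_value(type_name: str, raw: str):
    """Parse a string partition value into the type declared by the schema."""
    try:
        return PARSERS[type_name](raw)
    except KeyError:
        raise ValueError(f"Unsupported partition type: {type_name}")

# e.g. a partition spec kept as string-to-string on the catalog side:
spec = {"dt": "2021-01-05T15:05:00", "bucket": "7"}
types = {"dt": "TIMESTAMP", "bucket": "INT"}
typed_spec = {k: cast_partition_value(types[k], v) for k, v in spec.items()}
```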
Best,
Jark
On Tue, 5 Jan 2021 at 13:43, Jun Zhang wrote:
> Hello dev:
> Now I encounter a problem w
Hello dev:
Now I encounter a problem when using the method
"Catalog#listPartitions(ObjectPath, CatalogPartitionSpec)".
I found that the partitionSpec type in CatalogPartitionSpec is
Map,
This is no problem for hivecatalog, but my subclass of Catalog needs
precise types. For example,
+1 for allowing streaming operators to use managed memory.
The memory use of streams requires some hierarchy, and the bottom layer is
undoubtedly the current StateBackend.
Letting the stream operators freely use the managed memory will make the
memory management model unified and give the
Thanks Shuiqiang for driving this.
The design looks good to me. +1 to start the vote if there are no more comments.
Regards,
Dian
> On Jan 4, 2021, at 7:40 PM, Shuiqiang Chen wrote:
>
> Hi Yu,
>
> Thanks a lot for your suggestions.
>
> I have addressed your inlined comments in the FLIP and also added a
+1 to Xingtong's proposal!
Best,
Jark
On Tue, 5 Jan 2021 at 12:13, Xintong Song wrote:
> +1 for allowing streaming operators to use managed memory.
>
> As for the consumer names, I'm afraid using `DATAPROC` for both streaming
> ops and state backends will not work. Currently, RocksDB state back
+1 for allowing streaming operators to use managed memory.
As for the consumer names, I'm afraid using `DATAPROC` for both streaming
ops and state backends will not work. Currently, RocksDB state backend uses
a shared piece of memory for all the states within that slot. It's not the
operator's dec
Thanks for the clarification. I have resolved all of the comments and added
a conclusion section.
Looking forward to further feedback from the community. If we reach
consensus on the design doc, I can push forward the implementation work.
Kurt Young wrote on Tuesday, January 5, 2021 at 11:04 AM:
> Sorry for the typ
I'm also +1 for idea#2.
Regarding the updated interface,

Result applyAggregates(List aggregateExpressions,
                       int[] groupSet, DataType aggOutputDataType);

final class Result {
    private final List acceptedAggregates;
    private final List remainingAggregates;
}
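A standalone Python sketch of how such a Result could be used to split
pushed-down aggregates from those the runtime must still apply (names follow
the snippet above; the matching logic is an illustrative assumption):

```python
from dataclasses import dataclass, field

# Sketch of the proposed Result holder: the source accepts the aggregates
# it can push down and returns the rest for the runtime to evaluate.
@dataclass
class Result:
    accepted_aggregates: list = field(default_factory=list)
    remaining_aggregates: list = field(default_factory=list)

def apply_aggregates(aggregate_expressions, supported):
    """Split aggregates into those the source handles and those it does not."""
    accepted = [a for a in aggregate_expressions if a in supported]
    remaining = [a for a in aggregate_expressions if a not in supported]
    return Result(accepted, remaining)

res = apply_aggregates(["SUM(a)", "COUNT(*)", "AVG(b)"], {"SUM(a)", "COUNT(*)"})
```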
I have following co
Thanks a lot for your comments!
Regarding the Python Table API examples: I thought it would be straightforward
to use these operations in the Python Table API and so had not added
them. However, the suggestions make sense to me and I have added some examples
about how to use them in Pyth
Thanks Dian,
+1 to `deduplicate`.
Regarding `myTable.coalesce($("a"), 1).as("a")`, I'm afraid it may conflict
with, or be confused with, the built-in expression `coalesce(f0, 0)` (we may
introduce it in the future).
Besides that, could we also align other features of Flink SQL, e.g.
event-time/processing-time temp
Hi Aljoscha,
I think we may need to divide `DATAPROC` into `OPERATOR` and
`STATE_BACKEND`, because they have different scope (slot vs. operator).
But @Xintong Song may have more insights on it.
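If `DATAPROC` were split as suggested, the consumer-weights configuration
might look along these lines (a hypothetical sketch: the keys `OPERATOR` and
`STATE_BACKEND` are assumptions; in Flink 1.12 the option
`taskmanager.memory.managed.consumer-weights` recognizes `DATAPROC` and
`PYTHON`):

```
taskmanager.memory.managed.consumer-weights: OPERATOR:70,STATE_BACKEND:70,PYTHON:30
```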
Best,
Jark
On Mon, 4 Jan 2021 at 20:44, Aljoscha Krettek wrote:
> I agree, we should allow streami
Sorry for the typo -_-!
I meant idea #2.
Best,
Kurt
On Tue, Jan 5, 2021 at 10:59 AM Sebastian Liu wrote:
> Hi Kurt,
>
> Thx a lot for your feedback. If local aggregation is more like a physical
> operator rather than logical
> operator, I think your suggestion should be idea #2 which handle a
Hi Kurt,
Thanks a lot for your feedback. If local aggregation is more like a physical
operator than a logical operator, I think your suggestion would be idea #2,
which handles everything in the physical optimization phase?
Looking forward to the further discussion.
Kurt Young wrote on Tuesday, January 5, 2021 at 9:52 AM:
Local aggregation is more like a physical operator than a logical
operator. I would suggest going with idea #1.
Best,
Kurt
On Wed, Dec 30, 2020 at 8:31 PM Sebastian Liu wrote:
> Hi Jark, Thx a lot for your quick reply and valuable suggestions.
> For (1): Agree: Since we are in the period
Roman Khachatryan created FLINK-20847:
-
Summary: Update CompletedCheckpointStore.shutdown() signature
Key: FLINK-20847
URL: https://issues.apache.org/jira/browse/FLINK-20847
Project: Flink
Till Rohrmann created FLINK-20846:
-
Summary: Decouple checkpoint services from CheckpointCoordinator
Key: FLINK-20846
URL: https://issues.apache.org/jira/browse/FLINK-20846
Project: Flink
Iss
Hi Dian,
thanks for the proposed FLIP. I haven't taken a deep look at the
proposal yet but will do so shortly. In general, we should aim to make
the Table API as concise and self-explaining as possible. E.g. `dropna`
does not sound obvious to me.
Regarding `myTable.coalesce($("a"), 1).as("a"
Hello,
in FLINK-20790 we have changed where generated Avro files are
placed. Until then they were put directly under the src/ tree, with some
committed to git and others ignored via .gitignore.
This has caused issues when switching branches (specifically, not being
able to compile 1.11 afte
Thanks, Till, for the detailed explanation. I tried it and it is working fine.
Once again, thanks for your quick response.
Regards,
-Deep
On Mon, 4 Jan, 2021, 2:20 PM Till Rohrmann, wrote:
> Hi Deep,
>
> Flink has dropped support for specifying the number of TMs via -n since the
> introduction of F
This makes sense; I have some questions about method names.
What do you think about renaming `dropDuplicates` to `deduplicate`? I don't
think that "drop" is the right word for this operation: it implies
records are filtered, whereas this operator actually issues updates and
retractions. Also, de
Nick Burkard created FLINK-20845:
Summary: Drop support for Scala 2.11
Key: FLINK-20845
URL: https://issues.apache.org/jira/browse/FLINK-20845
Project: Flink
Issue Type: Sub-task
Co
Rui Li created FLINK-20844:
--
Summary: Support add jar in SQL client
Key: FLINK-20844
URL: https://issues.apache.org/jira/browse/FLINK-20844
Project: Flink
Issue Type: New Feature
Component
I agree, we should allow streaming operators to use managed memory for
other use cases.
Do you think we need an additional "consumer" setting or that they would
just use `DATAPROC` and decide by themselves what to use the memory for?
Best,
Aljoscha
On 2020/12/22 17:14, Jark Wu wrote:
Hi all
Hi Dian,
Big +1 for making the Table API easier to use. Java users and Python users can
both benefit from it. I think it would be better if we added some Python API
examples.
Best,
Wei
> On Jan 4, 2021, at 20:03, Dian Fu wrote:
>
> Hi all,
>
> I'd like to start a discussion about introducing a few con
Hi all,
I'd like to start a discussion about introducing a few convenient operations
in the Table API from the perspective of ease of use.
Currently some tasks are not easy to express in the Table API, e.g.
deduplication, topn, etc., or not easy to express when there are hundreds of
columns in a table,
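As an illustration of the deduplication semantics under discussion, here is a
batch-flavored Python sketch that keeps the first row per key (a streaming
version would issue updates/retractions instead; all names are illustrative):

```python
def deduplicate(rows, key):
    """Keep only the first row seen for each key, preserving input order."""
    seen = set()
    out = []
    for row in rows:
        k = row[key]
        if k not in seen:
            seen.add(k)
            out.append(row)
    return out

rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
deduped = deduplicate(rows, "id")
```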
Aljoscha Krettek created FLINK-20843:
Summary: UnalignedCheckpointITCase is unstable
Key: FLINK-20843
URL: https://issues.apache.org/jira/browse/FLINK-20843
Project: Flink
Issue Type: Bug
Hi Yu,
Thanks a lot for your suggestions.
I have addressed your inlined comments in the FLIP and also added a new
section "State backend access synchronization" that explains how we make
sure there is no concurrent access to the state backend. Please have a look.
Best,
Shuiqiang
Yu Li 于202
zhouenning created FLINK-20842:
--
Summary: Data row is smaller than a column index, internal schema
representation is probably out of sync with real database schema
Key: FLINK-20842
URL: https://issues.apache.org/jira
Matthias created FLINK-20841:
Summary: Fix compile error due to duplicated generated files
Key: FLINK-20841
URL: https://issues.apache.org/jira/browse/FLINK-20841
Project: Flink
Issue Type: Bug
Hi Deep,
Flink has dropped support for specifying the number of TMs via -n since the
introduction of Flip-6. Since then, Flink will automatically start TMs
depending on the required resources. Hence, there is no need to specify the
-n parameter anymore. Instead, you should specify the parallelism
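Since FLIP-6, the way to size a job is to set the parallelism and let Flink
start TaskManagers on demand. A minimal sketch of the invocation (`-p` is the
standard parallelism flag; the jar path is illustrative):

```
# No -n flag anymore: TaskManagers are started on demand.
./bin/flink run -p 8 examples/streaming/WordCount.jar
```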
Leonard Xu created FLINK-20840:
--
Summary: Projection pushdown doesn't work in temporal(lookup) join
Key: FLINK-20840
URL: https://issues.apache.org/jira/browse/FLINK-20840
Project: Flink
Issue
wym_maozi created FLINK-20839:
-
Summary: some mistakes in state.md and state_zh.md
Key: FLINK-20839
URL: https://issues.apache.org/jira/browse/FLINK-20839
Project: Flink
Issue Type: Bug
Thanks for driving the discussion Shuiqiang, and sorry for chiming in late.
*bq. However, all the state access will be synchronized in the Java
operator and so there will be no concurrent access to the state backend.*
Could you add a section to explicitly mention this in the FLIP document? I
think