[jira] [Created] (FLINK-22582) Scrollbar "jumps" when clicking on tabs for Scala and Python

2021-05-06 Thread Matthias (Jira)
Matthias created FLINK-22582:


 Summary: Scrollbar "jumps" when clicking on tabs for Scala and 
Python 
 Key: FLINK-22582
 URL: https://issues.apache.org/jira/browse/FLINK-22582
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.13.0
Reporter: Matthias


For example, on 
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/overview/#rescaling,
clicking between the Java/Scala/Python code example tabs makes the page scroll 
up or down to a completely different section. This makes the documentation very 
hard to read.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22583) Flink UI is not showing completed jobs once JM Leader is deleted

2021-05-06 Thread Bhagi (Jira)
Bhagi created FLINK-22583:
-

 Summary: Flink UI is not showing completed jobs once JM Leader is 
deleted  
 Key: FLINK-22583
 URL: https://issues.apache.org/jira/browse/FLINK-22583
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes
Affects Versions: 1.12.2
Reporter: Bhagi


Hi Team,

While testing Flink HA with ZooKeeper, I executed a few jobs from the Web UI, 
and the completed jobs were showing up under the completed jobs section.

As part of leader election testing, I killed the current leader JobManager, and 
the completed jobs were also lost from the Web UI.

I have already enabled the following configuration:
jobstore.expiration-time=86400
jobstore.max-capacity=200
jobmanager.archive.fs.dir: file:///persistent/flinkData2/completed-jobs
historyserver.archive.fs.refresh-interval: 1
historyserver.archive.fs.dir: file:///persistent/flinkData2/completed-jobs

Is any other configuration needed to keep the completed jobs available 
irrespective of which JobManager is the leader?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Backport FLIP-27 Kafka source connector fixes with API change to release-1.12.

2021-05-06 Thread Arvid Heise
Just to double-check: are the changes in Thomas's PR even touching the
API, or would that be another PR?
Or are the API changes contained in FLINK-20379?

On Wed, May 5, 2021 at 8:29 PM Arvid Heise  wrote:

> After looking a bit more into it, I'm also +1.
>
> Let's merge it soonish.
>
> On Wed, May 5, 2021 at 7:20 PM Thomas Weise  wrote:
>
>> I opened the PR to backport the changes:
>> https://github.com/apache/flink/pull/15840
>>
>> Without these fixes the new KafkaSource in 1.12 is near unusable. The most
>> obvious problem I ran into during testing was that checkpoints fail when
>> consumption has not started for a split (easily reproduced with a topic
>> partition that does not have data).
>>
>> Thanks,
>> Thomas
>>
>>
>> On Tue, Apr 13, 2021 at 11:33 AM Stephan Ewen  wrote:
>>
>> > Hi all!
>> >
>> > Generally, avoiding API changes in Bug fix versions is the right thing,
>> in
>> > my opinion.
>> >
>> > But this case is a bit special, because we are changing something that
>> > never worked properly in the first place.
>> > So we are not breaking a "running thing" here, but making it usable.
>> >
>> > So +1 from my side to backport these changes, I think we make more users
>> > happy than angry with this.
>> >
>> > Best,
>> > Stephan
>> >
>> >
>> > On Thu, Apr 8, 2021 at 11:35 AM Becket Qin 
>> wrote:
>> >
>> > > Hi Arvid,
>> > >
>> > > There are interface changes to the Kafka source, and there is a
>> backwards
>> > > compatible change in the base source implementation. Therefore
>> > technically
>> > > speaking, users might be able to run the Kafka source in 1.13 with a
>> 1.12
>> > > Flink job. However, it could be tricky because there might be some
>> > > dependent jar conflicts between 1.12 and 1.13. So this solution seems
>> a
>> > > little fragile.
>> > >
>> > > I'd second Till's question if there is an issue for users that start
>> with
>> > > > the current Kafka source (+bugfixes) to later upgrade to 1.13 Kafka
>> > > source
>> > > > with API changes.
>> > >
>> > >
>> > > Just to clarify, the bug fixes themselves include API changes, they
>> are
>> > not
>> > > separable. So we basically have three options here:
>> > >
>> > > 1. Do not backport fixes in 1.12. So users have to upgrade to 1.13 in
>> > order
>> > > to use the new Kafka source.
>> > > 2. Write some completely different fixes for release 1.12 and ask
>> users
>> > to
>> > > migrate to the new API when they upgrade to 1.13
>> > > 3. Backport the fix with API changes to 1.12. So users don't need to
>> > handle
>> > > interface change when they upgrade to 1.13+.
>> > >
>> > > Personally I think option 3 here is better because it does not really
>> > > introduce any trouble to the users. The downside is that we do need to
>> > > change the API of Kafka source in 1.12. Given that the changed API
>> won't
>> > be
>> > > useful without these bug fixes, changing the API seems to be doing
>> more
>> > > good than bad here.
>> > >
>> > > Thanks,
>> > >
>> > > Jiangjie (Becket) Qin
>> > >
>> > >
>> > >
>> > > On Thu, Apr 8, 2021 at 2:39 PM Arvid Heise  wrote:
>> > >
>> > > > Hi Becket,
>> > > >
>> > > > did you need to change anything to the source interface itself?
>> > Wouldn't
>> > > it
>> > > > be possible for users to simply use the 1.13 connector with their
>> Flink
>> > > > 1.12 deployment?
>> > > >
>> > > > I think the late-upgrade argument can be made for any feature, but I
>> > also
>> > > > see that the Kafka connector is of high interest.
>> > > >
>> > > > I'd second Till's question if there is an issue for users that start
>> > with
>> > > > the current Kafka source (+bugfixes) to later upgrade to 1.13 Kafka
>> > > source
>> > > > with API changes.
>> > > >
>> > > > Best,
>> > > >
>> > > > Arvid
>> > > >
>> > > > On Thu, Apr 8, 2021 at 2:54 AM Becket Qin 
>> > wrote:
>> > > >
>> > > > > Thanks for the comment, Till and Thomas.
>> > > > >
>> > > > > As far as I know there are some users who have just upgraded their
>> > > Flink
>> > > > > version from 1.8 / 1.9 to Flink 1.12 and might not upgrade the
>> > version
>> > > > in 6
>> > > > > months or more. There are also some organizations that have the
>> > > strategy
>> > > > of
>> > > > > not running the latest version of a project, but only the second
>> > latest
>> > > > > version with bug fixes. So those users may still benefit from the
>> > > > backport.
>> > > > > However, arguably the old Kafka source is there anyways in 1.12,
>> so
>> > > they
>> > > > > should not be blocked on having the new source.
>> > > > >
>> > > > > I am leaning towards backporting the fixes mainly because this
>> way we
>> > > > might
>> > > > > have more users migrating to the new Source and provide feedback.
>> It
>> > > will
>> > > > > take some time for the users to pick up 1.13, especially for the
>> > users
>> > > > > running Flink at large scale. So backporting the fixes to 1.12
>> would
>> > > help
>> > > > > get the new source to be used sooner.
>> > > > >
>> > > > > Thanks,
>> > > > >
>

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-06 Thread Zhu Zhu
Thanks Dawid and Guowei for being the release managers! And thanks everyone
who has made this release possible!

Thanks,
Zhu

Yun Tang  于2021年5月6日周四 下午2:30写道:

> Thanks for Dawid and Guowei's great work, and thanks for everyone involved
> for this release.
>
> Best
> Yun Tang
> --
> *From:* Xintong Song 
> *Sent:* Thursday, May 6, 2021 12:08
> *To:* user ; dev 
> *Subject:* Re: [ANNOUNCE] Apache Flink 1.13.0 released
>
> Thanks Dawid & Guowei as the release managers, and everyone who has
> contributed to this release.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Thu, May 6, 2021 at 9:51 AM Leonard Xu  wrote:
>
> > Thanks Dawid & Guowei for the great work, thanks everyone involved.
> >
> > Best,
> > Leonard
> >
> > 在 2021年5月5日,17:12,Theo Diefenthal 
> 写道:
> >
> > Thanks for managing the release. +1. I like the focus on improving
> > operations with this version.
> >
> > --
> > *Von: *"Matthias Pohl" 
> > *An: *"Etienne Chauchot" 
> > *CC: *"dev" , "Dawid Wysakowicz" <
> > dwysakow...@apache.org>, "user" ,
> > annou...@apache.org
> > *Gesendet: *Dienstag, 4. Mai 2021 21:53:31
> > *Betreff: *Re: [ANNOUNCE] Apache Flink 1.13.0 released
> >
> > Yes, thanks for managing the release, Dawid & Guowei! +1
> >
> > On Tue, May 4, 2021 at 4:20 PM Etienne Chauchot 
> > wrote:
> >
> >> Congrats to everyone involved !
> >>
> >> Best
> >>
> >> Etienne
> >> On 03/05/2021 15:38, Dawid Wysakowicz wrote:
> >>
> >> The Apache Flink community is very happy to announce the release of
> >> Apache Flink 1.13.0.
> >>
> >> Apache Flink® is an open-source stream processing framework for
> >> distributed, high-performing, always-available, and accurate data
> streaming
> >> applications.
> >>
> >> The release is available for download at:
> >> https://flink.apache.org/downloads.html
> >>
> >> Please check out the release blog post for an overview of the
> >> improvements for this bugfix release:
> >> https://flink.apache.org/news/2021/05/03/release-1.13.0.html
> >>
> >> The full release notes are available in Jira:
> >>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
> >>
> >> We would like to thank all contributors of the Apache Flink community
> who
> >> made this release possible!
> >>
> >> Regards,
> >> Guowei & Dawid
> >>
> >>
> >
> >
>


Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-06 Thread Rui Li
Thanks to Dawid and Guowei for the great work!

On Thu, May 6, 2021 at 4:48 PM Zhu Zhu  wrote:

> Thanks Dawid and Guowei for being the release managers! And thanks
> everyone who has made this release possible!
>
> Thanks,
> Zhu
>
> Yun Tang  于2021年5月6日周四 下午2:30写道:
>
>> Thanks for Dawid and Guowei's great work, and thanks for everyone
>> involved for this release.
>>
>> Best
>> Yun Tang
>> --
>> *From:* Xintong Song 
>> *Sent:* Thursday, May 6, 2021 12:08
>> *To:* user ; dev 
>> *Subject:* Re: [ANNOUNCE] Apache Flink 1.13.0 released
>>
>> Thanks Dawid & Guowei as the release managers, and everyone who has
>> contributed to this release.
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>> On Thu, May 6, 2021 at 9:51 AM Leonard Xu  wrote:
>>
>> > Thanks Dawid & Guowei for the great work, thanks everyone involved.
>> >
>> > Best,
>> > Leonard
>> >
>> > 在 2021年5月5日,17:12,Theo Diefenthal 
>> 写道:
>> >
>> > Thanks for managing the release. +1. I like the focus on improving
>> > operations with this version.
>> >
>> > --
>> > *Von: *"Matthias Pohl" 
>> > *An: *"Etienne Chauchot" 
>> > *CC: *"dev" , "Dawid Wysakowicz" <
>> > dwysakow...@apache.org>, "user" ,
>> > annou...@apache.org
>> > *Gesendet: *Dienstag, 4. Mai 2021 21:53:31
>> > *Betreff: *Re: [ANNOUNCE] Apache Flink 1.13.0 released
>> >
>> > Yes, thanks for managing the release, Dawid & Guowei! +1
>> >
>> > On Tue, May 4, 2021 at 4:20 PM Etienne Chauchot 
>> > wrote:
>> >
>> >> Congrats to everyone involved !
>> >>
>> >> Best
>> >>
>> >> Etienne
>> >> On 03/05/2021 15:38, Dawid Wysakowicz wrote:
>> >>
>> >> The Apache Flink community is very happy to announce the release of
>> >> Apache Flink 1.13.0.
>> >>
>> >> Apache Flink® is an open-source stream processing framework for
>> >> distributed, high-performing, always-available, and accurate data
>> streaming
>> >> applications.
>> >>
>> >> The release is available for download at:
>> >> https://flink.apache.org/downloads.html
>> >>
>> >> Please check out the release blog post for an overview of the
>> >> improvements for this bugfix release:
>> >> https://flink.apache.org/news/2021/05/03/release-1.13.0.html
>> >>
>> >> The full release notes are available in Jira:
>> >>
>> >>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287
>> >>
>> >> We would like to thank all contributors of the Apache Flink community
>> who
>> >> made this release possible!
>> >>
>> >> Regards,
>> >> Guowei & Dawid
>> >>
>> >>
>> >
>> >
>>
>

-- 
Best regards!
Rui Li


[jira] [Created] (FLINK-22584) Use protobuf-shaded in StateFun core.

2021-05-06 Thread Igal Shilman (Jira)
Igal Shilman created FLINK-22584:


 Summary: Use protobuf-shaded in StateFun core.
 Key: FLINK-22584
 URL: https://issues.apache.org/jira/browse/FLINK-22584
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Igal Shilman


We have a *statefun-protobuf-shaded* module that is used by the remote Java SDK;

we can use it to shade Protobuf internally as well.

The major hurdle we need to overcome is that, in embedded functions, we have to 
be able to accept instances of Protobuf-generated messages provided by the user.

For example:
{code:java}
UserProfile userProfile = UserProfile.newBuilder().build();
context.send(..., userProfile);
{code}
If we simply use the shaded Protobuf version, we will immediately get a 
ClassCastException.

One way to overcome this is to use reflection: find the well-known methods on 
the generated classes and call writeTo() / parseFrom() reflectively.

This, however, causes a significant slowdown, even when using MethodHandles.
A small experiment that I've previously done with ByteBuddy mitigates this by 
generating accessors ahead of time:

{code:java}
package org.apache.flink.statefun.flink.common.protobuf.serde;

import static net.bytebuddy.matcher.ElementMatchers.named;

import java.io.InputStream;
import java.io.OutputStream;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import net.bytebuddy.ByteBuddy;
import net.bytebuddy.dynamic.DynamicType;
import net.bytebuddy.implementation.FixedValue;
import net.bytebuddy.implementation.MethodCall;
import net.bytebuddy.implementation.bytecode.assign.Assigner;

final class ReflectiveProtobufSerde {

  @SuppressWarnings({"unchecked", "rawtypes"})
  static <M> ProtobufSerde<M> ofProtobufGeneratedType(Class<M> type) {
    try {
      DynamicType.Unloaded<ProtobufSerde> unloaded = configureByteBuddy(type);
      Class<? extends ProtobufSerde> writer = unloaded.load(type.getClassLoader()).getLoaded();
      return (ProtobufSerde<M>) writer.getDeclaredConstructor().newInstance();
    } catch (Throwable e) {
      throw new IllegalArgumentException(e);
    }
  }

  @SuppressWarnings("rawtypes")
  private static DynamicType.Unloaded<ProtobufSerde> configureByteBuddy(Class<?> type)
      throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
    Method writeToMethod = type.getMethod("writeTo", OutputStream.class);
    Method parseFromMethod = type.getMethod("parseFrom", InputStream.class);
    Method getSerializedSizeMethod = type.getMethod("getSerializedSize");

    // get the message full name
    Method getDescriptorMethod = type.getMethod("getDescriptor");
    Object descriptor = getDescriptorMethod.invoke(null);
    Method getFullNameMethod = descriptor.getClass().getMethod("getFullName");
    String messageFullName = (String) getFullNameMethod.invoke(descriptor);

    return new ByteBuddy()
        .subclass(ProtobufSerde.class)
        .typeVariable("M", type)
        .method(named("writeTo"))
        .intercept(
            MethodCall.invoke(writeToMethod)
                .onArgument(0)
                .withArgument(1)
                .withAssigner(Assigner.DEFAULT, Assigner.Typing.DYNAMIC))
        .method(named("parseFrom"))
        .intercept(MethodCall.invoke(parseFromMethod).withArgument(0))
        .method(named("getSerializedSize"))
        .intercept(
            MethodCall.invoke(getSerializedSizeMethod)
                .onArgument(0)
                .withAssigner(Assigner.DEFAULT, Assigner.Typing.DYNAMIC))
        .method(named("getMessageFullName"))
        .intercept(FixedValue.value(messageFullName))
        .make();
  }
}
{code}
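
For reference, the snippet above assumes a {{ProtobufSerde}} contract roughly like the following. This is only a sketch to make the example self-contained; the actual class in StateFun may look different.
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Sketch of the contract assumed by the ByteBuddy example above;
// hypothetical, not the real StateFun class.
public abstract class ProtobufSerde<M> {

  // Writes the given Protobuf-generated message to the output stream
  // (the generated subclass delegates to message.writeTo(out)).
  public abstract void writeTo(M message, OutputStream out) throws IOException;

  // Parses a message of type M from the input stream
  // (the generated subclass delegates to the static M.parseFrom(in)).
  public abstract M parseFrom(InputStream in) throws IOException;

  // Returns the serialized size of the given message in bytes.
  public abstract int getSerializedSize(M message);

  // Returns the Protobuf full name of the message type, e.g. "my.pkg.UserProfile".
  public abstract String getMessageFullName();
}
{code}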
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22585) Add deprecated message when "-m yarn-cluster" is used

2021-05-06 Thread Yang Wang (Jira)
Yang Wang created FLINK-22585:
-

 Summary: Add deprecated message when "-m yarn-cluster" is used
 Key: FLINK-22585
 URL: https://issues.apache.org/jira/browse/FLINK-22585
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Reporter: Yang Wang
 Fix For: 1.14.0


The unified executor interface was introduced a long time ago and can be 
activated via "--target yarn-per-job/yarn-session/kubernetes-application". It is 
more descriptive and clearer. We should deprecate "-m yarn-cluster" and suggest 
that our users use the new CLI commands.

 

However, AFAIK, many companies use these CLI commands to integrate with their 
deployers, so we cannot remove "-m yarn-cluster" very soon. Maybe we can do it 
in release 2.0, where breaking changes are possible.

 

For now, I suggest adding the {{@Deprecated}} annotation and printing a WARN 
log message when "-m yarn-cluster" is used. This helps users learn about the 
long-term goal and migrate as soon as possible.
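
As a rough illustration (class and method names here are hypothetical; the real change would go into the CLI code that handles "-m"), the warning could look like this:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper, only for illustration of the suggested WARN message.
public final class YarnClusterOptionDeprecation {

  private static final Logger LOG = LoggerFactory.getLogger(YarnClusterOptionDeprecation.class);

  /** Logs a deprecation warning if the legacy "-m yarn-cluster" shortcut was used. */
  public static void warnIfUsed(String jobManagerAddressOption) {
    if ("yarn-cluster".equals(jobManagerAddressOption)) {
      LOG.warn(
          "The \"-m yarn-cluster\" option is deprecated and will be removed in a future release. "
              + "Please use \"--target yarn-per-job\" or \"--target yarn-session\" instead.");
    }
  }
}
{code}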



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22586) Improve precision derivation for decimal arithmetics

2021-05-06 Thread Shuo Cheng (Jira)
Shuo Cheng created FLINK-22586:
--

 Summary: Improve precision derivation for decimal arithmetics
 Key: FLINK-22586
 URL: https://issues.apache.org/jira/browse/FLINK-22586
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Affects Versions: 1.13.0
Reporter: Shuo Cheng
 Fix For: 1.14.0


Currently, the precision and scale derivation is not done properly for decimal 
arithmetic.

Consider the following example:
{code:java}
select cast('10.1' as decimal(38, 19)) * cast('10.2' as decimal(38, 19)) from T
{code}
The result is `null`, which may confuse users a lot, because the actual result 
(103.02) is far from overflowing.

The root cause is that the precision derivation for the above multiplication is:

(38, 19) * (38, 19) -> (38, 38)

So there is no room left for integral digits, which leads to the null result.
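
To illustrate with plain Java (a standalone sketch, not Flink code): the exact product needs 3 integral digits and 2 fractional digits, while DECIMAL(38, 38) leaves 38 - 38 = 0 integral digits, so the value cannot be represented.
{code:java}
import java.math.BigDecimal;

public class DecimalPrecisionDemo {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("10.1");
    BigDecimal b = new BigDecimal("10.2");

    // Exact product: 103.02 -> precision 5, scale 2, i.e. 3 integral digits are required.
    BigDecimal product = a.multiply(b);
    System.out.println(product + " precision=" + product.precision() + " scale=" + product.scale());

    // A DECIMAL(38, 38) type has zero integral digits available (precision - scale = 0),
    // so 103.02 overflows that type, and Flink currently returns NULL instead of the value.
  }
}
{code}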



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Flink SQL Cdc with schema changing

2021-05-06 Thread Taher Koitawala
Sure, here's the use case I want to solve. I want to stream CDC
records that are inserted into Kafka via Debezium. We want to capture all
Debezium events, including ALTER TABLE ADD COLUMN, MODIFY COLUMN, inserts,
updates and deletes, over an Avro-based file format which can then be
queried. For now we are also evaluating whether we can write the records
directly in Iceberg format.

Further leads on Iceberg would be appreciated.

It is very common practice these days to bring over change records with an
evolving schema. We have Hudi as an option, however that is a Spark-based
approach. Personal opinion, I don't want to offend anyone, but I think
Flink is way better than Spark when it comes to streaming. Another
problem with Hudi is the compaction process that it needs.

The major goal is to support CDC natively via streams. When a user queries the
data, the goal is to get a kind of lock over the dataset where the user can
see committed data only, while changes can still be streamed as the user
queries that data.

In a nutshell, I am trying to design a CDC system with Flink as the
major stream processing engine.









On Wed, May 5, 2021 at 5:40 PM Jark Wu  wrote:

> Hi Taher,
>
> Could you explain a bit more your use case and what do you expect Flink SQL
> to support?
> That could help us to better understand and plan the future roadmap.
>
> Best,
> Jark
>
> On Wed, 5 May 2021 at 19:42, Taher Koitawala  wrote:
>
> > Thank you for the reply Jack Wu, however that does not satisfy my
> > requirements, my use case is to have something that supports a schema
> drift
> > over avro format. Column addition and column datatype change both types
> of
> > variations is what I am trying to solve for. Either way thanks for
> > the help, much appreciated.
> >
> > Regards,
> > Taher Koitawala
> >
> > On Wed, May 5, 2021 at 3:53 PM Jark Wu  wrote:
> >
> > > Hi Taher,
> > >
> > > Currently, Flink (SQL) CDC doesn't support automatic schema change
> > > and doesn't support to consume schema change events in source.
> > > But you can upgrade schema manually, for example, if you have a table
> > > with columns [a, b, c], you can define a flink table t1 with these 3
> > > columns.
> > > When you add new column in source RDBMS, the Flink SQL job on t1
> > > should work fine if you are using format 'debezium-json' or
> > > 'debezium-avro-confluent',
> > > because they supports schema compatibility.
> > > When you are notified there is a schema change in the source RDBMS,
> > > then you can upgrade your Flink SQL DDL and job to include the added
> > > column,
> > > and consume from the previous savepoint [1].
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> > >
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/#start-a-sql-job-from-a-savepoint
> > >
> > >
> > >
> > > On Wed, 5 May 2021 at 13:34, Taher Koitawala 
> wrote:
> > >
> > > > Hi All,
> > > >  I have a CDC use case where I want to capture and process
> > > debezium
> > > > logs that are streamed to Kafka via Debezium. As per all the flink
> > > examples
> > > > we have to pre create the schema of the tables where I want to
> perform
> > a
> > > > write.
> > > >
> > > > However my question is what if there is an alter table modify column
> > data
> > > > type query that hits the source RDBMS, how does flink handle that
> > schema
> > > > change and what changes are supported. If someone can give a full
> > example
> > > > it will be very very helpful.
> > > >
> > > >
> > > > Regards,
> > > > Taher Koitawala
> > > >
> > >
> >
>


[jira] [Created] (FLINK-22587) Support aggregations in batch mode with DataStream API

2021-05-06 Thread Etienne Chauchot (Jira)
Etienne Chauchot created FLINK-22587:


 Summary: Support aggregations in batch mode with DataStream API
 Key: FLINK-22587
 URL: https://issues.apache.org/jira/browse/FLINK-22587
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Etienne Chauchot


A pipeline like this *in batch mode* would output no data
{code:java}
stream.join(otherStream)
.where()
.equalTo()
.window(GlobalWindows.create())
.apply()
{code}
Indeed, the default trigger for GlobalWindows is NeverTrigger, which never fires. 
If we set an _EventTimeTrigger_, it will fire with every element, as the watermark 
will be set to +INF (batch mode) and will pass the end of the global window 
with each new element. A _ProcessingTimeTrigger_ never fires either, and all 
elapsed-time or delta-based triggers are not suited for batch.

Same goes for _reduce()_ instead of join().

So I guess we are missing something for batch support with the DataStream API.
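
For reference, here is a small self-contained sketch of the _reduce()_ variant in batch mode (class name and input data are made up for illustration); with the default NeverTrigger of GlobalWindows it produces no output:
{code:java}
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;

public class GlobalWindowBatchRepro {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setRuntimeMode(RuntimeExecutionMode.BATCH); // bounded input, batch execution

    env.fromElements(1, 2, 3, 4, 5)
        .keyBy(i -> i % 2)              // two key groups: odd and even numbers
        .window(GlobalWindows.create()) // default trigger is NeverTrigger
        .reduce(Integer::sum)           // the window never fires, so nothing is emitted
        .print();

    env.execute("global-window-batch-repro");
  }
}
{code}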



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [Parquet support]

2021-05-06 Thread Etienne Chauchot

Hi,

@Jingsong, I agree that adding a new feature (ParquetAvroInputFormat) to 
an old source API is a maintenance burden. But IMHO, while the new converged 
DataStream batch/streaming API is not yet 100% functional, we still need to 
maintain the older sources and add missing features to them.


Indeed, I realized that the DataStream API in batch mode [1] does not yet 
support aggregations [2], so in such a case a user would stick to the 
DataSet API. And the new FileSource API with 
ParquetColumnarRowInputFormat is not available in the DataSet API [3].


So, long story short, in some cases a user will have no other choice 
than using ParquetInputFormat and the legacy source.


WDYT?

[1] https://issues.apache.org/jira/browse/FLINK-19316

[2] https://issues.apache.org/jira/browse/FLINK-22587

[3] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:RefactorSourceInterface-Compatibility,Deprecation,andMigrationPlan


Best,

Etienne

On 24/02/2021 03:35, Jingsong Li wrote:

Hi Etienne,

Thanks for your reporting.

There are indeed many problems. There is no doubt that we need to improve
our current format implementation.

But ParquetTableSource and ParquetInputFormat are legacy implementations
with legacy interfaces. We have introduced new interfaces for execution and
SQL. You can see:
- ParquetColumnarRowInputFormat with the BulkFormat interface. It is just for
columnar row reading and does not support complex types; we need to
migrate ParquetInputFormat to the new BulkFormat interface.
- FileSystemTableSource with the DynamicTableSource interface. It is a generic
filesystem source for all formats; we can just use it for Parquet too.

Considering that ParquetTableSource and ParquetInputFormat are legacy
interfaces, I think we can finish the migration work first. What do you think?

Best,
Jingsong

On Wed, Feb 24, 2021 at 12:46 AM Etienne Chauchot 
wrote:


Hi all,

I've been playing with Parquet with SQL and Avro lately. I've found some
bugs:

1. https://issues.apache.org/jira/browse/FLINK-21388 : I already
submitted a PR on this one (https://github.com/apache/flink/pull/14961)

2. https://issues.apache.org/jira/browse/FLINK-21389

3. https://issues.apache.org/jira/browse/FLINK-21468

I've already started to work on this ticket:
https://issues.apache.org/jira/browse/FLINK-21393


I'd be happy to receive your comments on these tickets


Best

Etienne Chauchot





Re: Configmaps not deleted Kubernetes HA

2021-05-06 Thread Till Rohrmann
Hi Enrique,

I think you are running into FLINK-20695 [1]. In a nutshell, at the moment
Flink only deletes the ConfigMaps when it shuts down. We want to change this
with the next release.

[1]  https://issues.apache.org/jira/browse/FLINK-20695

Cheers,
Till

On Wed, May 5, 2021 at 8:36 PM Enrique  wrote:

> Hi all,
>
> I am deploying a Flink Cluster in session mode using Kubernetes HA and have
> seen it working with the different config maps for the dispatcher,
> restserver and resourcemanager. I also have configured storage for
> checkpointing and HA metadata.
>
> When I submit a job, I can see that a ConfigMap is created for it
> containing checkpoint information, which is updated correctly. Yet, when I
> cancel a job, I assumed the ConfigMap would be deleted, but it seems that it
> isn't. Is this the intended behaviour? I am worried that as many jobs are
> submitted and cancelled from the Flink cluster, a large number of ConfigMaps
> would remain in the cluster.
>
> Thanks in advance,
>
> Enrique
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>


[jira] [Created] (FLINK-22588) flink

2021-05-06 Thread forideal (Jira)
forideal created FLINK-22588:


 Summary: flink
 Key: FLINK-22588
 URL: https://issues.apache.org/jira/browse/FLINK-22588
 Project: Flink
  Issue Type: Bug
Reporter: forideal






--
This message was sent by Atlassian Jira
(v8.3.4#803005)