[jira] [Created] (FLINK-26745) Unable to create new env

2022-03-20 Thread Rahul Desai (Jira)
Rahul Desai created FLINK-26745:
---

 Summary: Unable to create new env
 Key: FLINK-26745
 URL: https://issues.apache.org/jira/browse/FLINK-26745
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.4, 1.14.3
 Environment: Below is the Error log

{code}
FileNotFoundError                         Traceback (most recent call last)
Input In [10], in <cell line: 1>()
----> 1 env = EnvironmentSettings.new_instance().build()
      2 table_env = TableEnvironment.create(env)

File D:\Anaconda3\envs\pyflink_env\lib\site-packages\pyflink\table\environment_settings.py:241, in EnvironmentSettings.new_instance()
    231 @staticmethod
    232 def new_instance() -> 'EnvironmentSettings.Builder':
    233     """
    234     Creates a builder for creating an instance of EnvironmentSettings.
    235     (...)
    239     :return: A builder of EnvironmentSettings.
    240     """
--> 241 return EnvironmentSettings.Builder()

File D:\Anaconda3\envs\pyflink_env\lib\site-packages\pyflink\table\environment_settings.py:51, in EnvironmentSettings.Builder.__init__(self)
     50 def __init__(self):
---> 51 gateway = get_gateway()
     52 self._j_builder = gateway.jvm.EnvironmentSettings.Builder()

File D:\Anaconda3\envs\pyflink_env\lib\site-packages\pyflink\java_gateway.py:62, in get_gateway()
     57     _gateway = JavaGateway(
     58         gateway_parameters=gateway_param,
     59         callback_server_parameters=CallbackServerParameters(
     60             port=0, daemonize=True, daemonize_connections=True))
     61 else:
---> 62     _gateway = launch_gateway()
     64 callback_server = _gateway.get_callback_server()
     65 callback_server_listening_address = callback_server.get_listening_address()

File D:\Anaconda3\envs\pyflink_env\lib\site-packages\pyflink\java_gateway.py:106, in launch_gateway()
    103 env = dict(os.environ)
    104 env["_PYFLINK_CONN_INFO_PATH"] = conn_info_file
--> 106 p = launch_gateway_server_process(env, args)
    108 while not p.poll() and not os.path.isfile(conn_info_file):
    109     time.sleep(0.1)

File D:\Anaconda3\envs\pyflink_env\lib\site-packages\pyflink\pyflink_gateway_server.py:304, in launch_gateway_server_process(env, args)
    302     signal.signal(signal.SIGINT, signal.SIG_IGN)
    303     preexec_fn = preexec_func
--> 304 return Popen(list(filter(lambda c: len(c) != 0, command)),
    305              stdin=PIPE, preexec_fn=preexec_fn, env=env)

File D:\Anaconda3\envs\pyflink_env\lib\subprocess.py:858, in Popen.__init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    854 if self.text_mode:
    855     self.stderr = io.TextIOWrapper(self.stderr,
    856                                    encoding=encoding, errors=errors)
--> 858 self._execute_child(args, executable, preexec_fn, close_fds,
    859                     pass_fds, cwd, env,
    860                     startupinfo, creationflags, shell,
    861                     p2cread, p2cwrite,
    862                     c2pread, c2pwrite,
    863                     errread, errwrite,
    864                     restore_signals, start_new_session)
    865 except:
    866     # Cleanup if the child failed starting.
    867     for f in filter(None, (self.stdin, self.stdout, self.stderr)):

File D:\Anaconda3\envs\pyflink_env\lib\subprocess.py:1311, in Popen._execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1309 # Start the process
   1310 try:
-> 1311     hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
   1312         # no special security
   1313         None, None,
   1314         int(not close_fds),
   1315         creationflags,
   1316         env,
   1317         cwd,
   1318         startupinfo)
   1319 finally:
   1320     # Child is launched. Close the parent's copy of those pipe
   1321     # handles that only the child should have open. You need
        (...)
   1324     # pipe will not close when the child process exits and the
   1325     # ReadFile will hang.
   1326     self._close_pipe_fds(p2cread, p2cwrite,
   1327                          c2pread, c2pwrite,
   1328                          errread, errwrite)

FileNotFoundError: [WinError 2] The system cannot find the file specified
{code}
Reporter: Rahul Desai


I'm getting a "file not found" error while creating the environment/table environment
in PyFlink. It was working fine a few days ago, but since today I'm facing this
issue. Below are the steps I have already tried to fix it:
 # Updated/downgraded the packages
 # Created a new environment
 # Reinstalled the package



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26746) Update the documentation of Timer's Fault Tolerance section

2022-03-20 Thread Liwei Lin (Jira)
Liwei Lin created FLINK-26746:
-

 Summary: Update the documentation of Timer's Fault Tolerance 
section
 Key: FLINK-26746
 URL: https://issues.apache.org/jira/browse/FLINK-26746
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Liwei Lin


Currently, the Timer's Fault Tolerance section of the documentation
( [https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/operators/process_function/#fault-tolerance]
 ) says:

Timers are always asynchronously checkpointed, except for the combination of
RocksDB backend / with incremental snapshots / with heap-based timers (will be
resolved with {{FLINK-10026}}).

However, FLINK-10026 has actually been resolved as 'Won't Do', so the documentation
should be updated accordingly.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26747) [JUnit5 Migration] Module: flink-external-resources

2022-03-20 Thread RocMarshal (Jira)
RocMarshal created FLINK-26747:
--

 Summary: [JUnit5 Migration] Module: flink-external-resources
 Key: FLINK-26747
 URL: https://issues.apache.org/jira/browse/FLINK-26747
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.16.0
Reporter: RocMarshal






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26748) flink jobmanager metaspace oom

2022-03-20 Thread leishuiyu (Jira)
leishuiyu created FLINK-26748:
-

 Summary: flink jobmanager metaspace oom
 Key: FLINK-26748
 URL: https://issues.apache.org/jira/browse/FLINK-26748
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.14.3
 Environment: Flink 1.14.3, 1.12.0, 1.13.0
Reporter: leishuiyu
 Fix For: 1.14.3


After submitting jobs to Flink, the JobManager metaspace keeps increasing until it runs out of memory (OOM).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26749) Publish the Flink Operator image to public registry

2022-03-20 Thread Xin Hao (Jira)
Xin Hao created FLINK-26749:
---

 Summary: Publish the Flink Operator image to public registry
 Key: FLINK-26749
 URL: https://issues.apache.org/jira/browse/FLINK-26749
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Xin Hao


I think we should build and publish the Flink operator image to a public registry
before introducing it to users.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26750) HA cluster cleanup not complete

2022-03-20 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-26750:
-

 Summary: HA cluster cleanup not complete
 Key: FLINK-26750
 URL: https://issues.apache.org/jira/browse/FLINK-26750
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.4, 1.15.0
Reporter: Matthias Pohl


When starting a Flink cluster with HA (tested with ZooKeeper) in standalone 
mode and stopping it right away (i.e. {{./bin/start-cluster.sh; 
./bin/stop-cluster.sh}}), the paths are not properly cleaned:
{code}
$ ls -R /
/
/flink
/zookeeper
/flink/default
/flink/default/jobgraphs
/flink/default/leader
/zookeeper/config
/zookeeper/quota
{code}

The {{/flink}} path should not be present when stopping the cluster.

This is most likely also a problem in k8s HA. That should be checked as part of 
this ticket.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26751) [FLIP-171] Kafka implementation of Async Sink

2022-03-20 Thread Almog Tavor (Jira)
Almog Tavor created FLINK-26751:
---

 Summary: [FLIP-171] Kafka implementation of Async Sink
 Key: FLINK-26751
 URL: https://issues.apache.org/jira/browse/FLINK-26751
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Kafka
Reporter: Almog Tavor


*User stories:*
As a Flink user, I’d like to use Kafka as a sink for my data pipeline.

*Scope:*
 * Implement an asynchronous sink for Kafka by inheriting the AsyncSinkBase 
class. The implementation can reside in the Kafka module in flink-connectors.

h2. References

More details can be found at
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26752) Online Machine Learning Training

2022-03-20 Thread TheIoTAcademy (Jira)
TheIoTAcademy created FLINK-26752:
-

 Summary: Online Machine Learning Training
 Key: FLINK-26752
 URL: https://issues.apache.org/jira/browse/FLINK-26752
 Project: Flink
  Issue Type: Bug
Reporter: TheIoTAcademy
 Attachments: Data science, Machine Learning and IoT.jpg

Getting involved experience assumes a significant part in molding the vocation 
of future information researchers and ML engineers. Enlist yourself in the 
Advanced Certification in Applied Data Science, Machine Learning, and IoT By 
E&ICT Academy, IIT Guwahati. This web-based AI preparation is well-arranged for 
understudies and working experts who need to seek a truly amazing job in 
Machine Learning and Data Science.

To know more: [Online Machine Learning 
Training|https://www.theiotacademy.co/advanced-certification-in-data-science-machine-learning-and-iot-by-eict-iitg]

!Data science, Machine Learning and IoT.jpg!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [VOTE]FLIP-215: Introduce FlinkSessionJob CRD in the kubernetes operator

2022-03-20 Thread Xintong Song
Thanks Aitozi, the FLIP LGTM.

+1 (binding)

Thank you~

Xintong Song



On Fri, Mar 18, 2022 at 10:36 PM Őrhidi Mátyás 
wrote:

> +1 (non-binding)
> nice addition to the operator
>
> Cheers,
> Matyas
>
> On Fri, Mar 18, 2022 at 12:10 PM Biao Geng  wrote:
>
> > +1(non-binding)
> > Thanks for the work!
> >
> > Best,
> > Biao Geng
> >
> > Yang Wang  于2022年3月18日周五 19:01写道:
> >
> > > +1(binding)
> > >
> > > Thanks for your contribution.
> > >
> > > Best,
> > > Yang
> > >
> > > Gyula Fóra  于2022年3月18日周五 18:44写道:
> > >
> > > > I think this is a simple and valuable addition that will also be a
> > > building
> > > > block for other important future features.
> > > >
> > > > +1
> > > >
> > > > Gyula
> > > >
> > > > On Fri, Mar 18, 2022 at 10:30 AM Aitozi 
> wrote:
> > > >
> > > > > Hi community:
> > > > > I'd like to start a vote on FLIP-215: Introduce FlinkSessionJob
> > CRD
> > > > in
> > > > > the kubernetes operator [1] which has been discussed in the thread
> > [2].
> > > > >
> > > > > The vote will be open for at least 72 hours unless there is an
> > > objection
> > > > or
> > > > > not enough votes.
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-215%3A+Introduce+FlinkSessionJob+CRD+in+the+kubernetes+operator
> > > > > [2]:
> > https://lists.apache.org/thread/fpp5m9jkr0wnjryd07xtpj13t80z99yt
> > > > >
> > > > > Best,
> > > > > Aitozi.
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-26753) PK constraint should include partition key

2022-03-20 Thread Jane Chan (Jira)
Jane Chan created FLINK-26753:
-

 Summary: PK constraint should include partition key
 Key: FLINK-26753
 URL: https://issues.apache.org/jira/browse/FLINK-26753
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: 0.1.0
Reporter: Jane Chan


We should add a check that the primary key includes the partition key(s) if the table
is partitioned.
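
As an illustration, a minimal sketch of a partitioned table whose primary key
includes the partition key {{dt}} (table and column names are hypothetical, and the
connector/Table Store options are omitted for brevity):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PartitionedPrimaryKeyExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // The primary key (order_id, dt) contains the partition key dt, which is
        // exactly the constraint this ticket proposes to validate.
        tEnv.executeSql(
                "CREATE TABLE orders ("
                        + "  order_id BIGINT,"
                        + "  dt STRING,"
                        + "  amount DOUBLE,"
                        + "  PRIMARY KEY (order_id, dt) NOT ENFORCED"
                        + ") PARTITIONED BY (dt)");
    }
}
{code}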



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26754) KafkaCommitterTest#testRetryCommittableOnRetriableError stuck

2022-03-20 Thread Aitozi (Jira)
Aitozi created FLINK-26754:
--

 Summary: KafkaCommitterTest#testRetryCommittableOnRetriableError 
stuck
 Key: FLINK-26754
 URL: https://issues.apache.org/jira/browse/FLINK-26754
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.14.4
Reporter: Aitozi


The test was stuck for about one hour.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26755) Cleanup inprogressfileRecoverable for FileSink on restoring

2022-03-20 Thread Yun Gao (Jira)
Yun Gao created FLINK-26755:
---

 Summary: Cleanup inprogressfileRecoverable for FileSink on 
restoring
 Key: FLINK-26755
 URL: https://issues.apache.org/jira/browse/FLINK-26755
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Affects Versions: 1.15.0, 1.16.0
Reporter: Yun Gao


The FileSink has a similar issue to FLINK-26151
(https://issues.apache.org/jira/browse/FLINK-26151): the in-progress file
recoverables should be cleaned up on restore.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26756) Failed to deserialize for match recognize

2022-03-20 Thread godfrey he (Jira)
godfrey he created FLINK-26756:
--

 Summary: Failed to deserialize for match recognize
 Key: FLINK-26756
 URL: https://issues.apache.org/jira/browse/FLINK-26756
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: godfrey he


Currently, the JSON deserialization logic is not tested; there is a bug in the
{{JsonPlanTestBase}}#{{compileSqlAndExecutePlan}} method. The correct logic is that
the {{CompiledPlan}} should be serialized to a JSON string, and the JSON string
should then be deserialized back into a {{CompiledPlan}} object.
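
A minimal sketch of that round trip, assuming the 1.15 {{CompiledPlan}} API
({{compilePlanSql}}, {{asJsonString}}, {{loadPlan}}); the statement and table
definitions here are hypothetical:

{code:java}
import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class CompiledPlanRoundTrip {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE src (a INT) WITH ("
                        + "'connector' = 'datagen', 'number-of-rows' = '10')");
        tEnv.executeSql(
                "CREATE TABLE snk (a INT) WITH ('connector' = 'blackhole')");

        CompiledPlan plan = tEnv.compilePlanSql("INSERT INTO snk SELECT a FROM src");

        // Serialize the plan to a JSON string and deserialize it again before
        // executing, so that the JSON deserialization path is actually exercised.
        String json = plan.asJsonString();
        CompiledPlan restored = tEnv.loadPlan(PlanReference.fromJsonString(json));
        restored.execute();
    }
}
{code}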

After correcting the logic, {{MatchRecognizeJsonPlanITCase}} fails with the
following exception:


{code:java}
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.loadPlan(TableEnvironmentImpl.java:714)
at 
org.apache.flink.table.planner.utils.JsonPlanTestBase.compileSqlAndExecutePlan(JsonPlanTestBase.java:77)
at 
org.apache.flink.table.planner.runtime.stream.jsonplan.MatchRecognizeJsonPlanITCase.testSimpleMatch(MatchRecognizeJsonPlanITCase.java:66)
{code}

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26757) change the default value of state.backend.rocksdb.restore-overlap-fraction-threshold

2022-03-20 Thread Yanfei Lei (Jira)
Yanfei Lei created FLINK-26757:
--

 Summary: change the default value of 
state.backend.rocksdb.restore-overlap-fraction-threshold
 Key: FLINK-26757
 URL: https://issues.apache.org/jira/browse/FLINK-26757
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Reporter: Yanfei Lei
 Attachments: 截屏2022-03-21 上午11.50.28.png

`state.backend.rocksdb.restore-overlap-fraction-threshold` controls how a state
handle is restored, and different thresholds can affect restore performance. The
deletion behavior during restore has changed after FLINK-21321.

In theory, a default value of 0 is the most suitable choice, since `deleteRange()`
takes less time than creating a new RocksDB instance and then scan-and-putting the
records. We also have experimental data showing that a default of 0 is more
suitable. Below is a comparison of initialization times for different thresholds;
a threshold of 0 takes less time.

!截屏2022-03-21 上午11.50.28.png!
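
For reference, a minimal sketch of overriding the threshold explicitly. The option
key is the one named in this ticket; passing it through the per-job
{{Configuration}} is an assumption here, and in practice it may need to go into
flink-conf.yaml instead:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestoreOverlapThresholdExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Set the threshold to 0 so that restore favors reusing the initial DB
        // instance plus deleteRange() over rebuilding via scan-and-put.
        conf.setString("state.backend.rocksdb.restore-overlap-fraction-threshold", "0");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... build and execute the job as usual ...
    }
}
{code}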

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26758) [JUnit5 Migration] Module: flink-container

2022-03-20 Thread RocMarshal (Jira)
RocMarshal created FLINK-26758:
--

 Summary: [JUnit5 Migration] Module: flink-container
 Key: FLINK-26758
 URL: https://issues.apache.org/jira/browse/FLINK-26758
 Project: Flink
  Issue Type: Sub-task
Reporter: RocMarshal






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-26759) Legacy source support waiting for recordWriter to be available

2022-03-20 Thread fanrui (Jira)
fanrui created FLINK-26759:
--

 Summary: Legacy source support waiting for recordWriter to be 
available
 Key: FLINK-26759
 URL: https://issues.apache.org/jira/browse/FLINK-26759
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common, Runtime / Checkpointing
Affects Versions: 1.14.0, 1.13.0, 1.15.0
Reporter: fanrui
 Fix For: 1.16.0


In order for unaligned checkpoints not to be blocked, StreamTask#processInput
checks recordWriter.isAvailable(). If it is not available, no data is processed
until the recordWriter becomes available again.

The new Source API is compatible with this logic, but the legacy Source is not.
When using unaligned checkpoints, if the backpressure on a legacy source is high,
the checkpoint duration of that legacy source will be very long.

Since legacy sources are still widely used in production, can we add logic that
waits for the recordWriter to become available for legacy sources as well?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [ANNOUNCE] New PMC member: Yuan Mei

2022-03-20 Thread Jingsong Li
Congrats Yuan! Well deserved!

Best,
Jingsong

On Thu, Mar 17, 2022 at 8:37 PM Terry Wang  wrote:
>
> Congratulations Yuan!
>
> On Tue, Mar 15, 2022 at 5:55 PM Jark Wu  wrote:
>
> > Congrats Yuan! Well deserved!
> >
> > Best,
> > Jark
> >
> > On Tue, 15 Mar 2022 at 17:42, Qingsheng Ren  wrote:
> >
> > > Congratulations Yuan!
> > >
> > > Best regards,
> > >
> > > Qingsheng Ren
> > >
> > > > On Mar 15, 2022, at 15:09, Yuxin Tan  wrote:
> > > >
> > > > Congratulations, Yuan!
> > > >
> > > > Best,
> > > > Yuxin
> > > >
> > > > Yang Wang  于2022年3月15日周二 15:00写道:
> > > >
> > > >> Congratulations, Yuan!
> > > >>
> > > >> Best,
> > > >> Yang
> > > >>
> > > >> Lincoln Lee  于2022年3月15日周二 13:32写道:
> > > >>
> > > >>> Congratulations, Yuan!
> > > >>>
> > > >>> Best,
> > > >>> Lincoln Lee
> > > >>>
> > > >>>
> > > >>> godfrey he  于2022年3月15日周二 09:53写道:
> > > >>>
> > >  Congratulations, Yuan!
> > > 
> > >  Best,
> > >  Godfrey
> > > 
> > >  Lijie Wang  于2022年3月15日周二 09:18写道:
> > > >
> > > > Congratulations, Yuan!
> > > >
> > > > Best,
> > > > Lijie
> > > >
> > > > Benchao Li  于2022年3月15日周二 08:18写道:
> > > >
> > > >> Congratulations, Yuan!
> > > >>
> > > >> Yun Gao  于2022年3月15日周二 01:37写道:
> > > >>
> > > >>> Congratulations, Yuan!
> > > >>>
> > > >>> Best,
> > > >>> Yun Gao
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >> --
> > > >>> From:Francesco Guardiani 
> > > >>> Send Time:2022 Mar. 15 (Tue.) 00:21
> > > >>> To:dev 
> > > >>> Subject:Re: [ANNOUNCE] New PMC member: Yuan Mei
> > > >>>
> > > >>> Congratulations, Yuan!
> > > >>>
> > > >>> On Mon, Mar 14, 2022 at 3:51 PM yanfei lei 
> > >  wrote:
> > > >>>
> > >  Congratulations, Yuan!
> > > 
> > > 
> > > 
> > >  Zhilong Hong  于2022年3月14日周一 19:31写道:
> > > 
> > > > Congratulations, Yuan!
> > > >
> > > > Best,
> > > > Zhilong
> > > >
> > > > On Mon, Mar 14, 2022 at 7:22 PM Konstantin Knauf <
> > >  kna...@apache.org>
> > > > wrote:
> > > >
> > > >> Congratulations, Yuan!
> > > >>
> > > >> On Mon, Mar 14, 2022 at 11:29 AM Jing Zhang <
> > >  beyond1...@gmail.com>
> > > > wrote:
> > > >>
> > > >>> Congratulations, Yuan!
> > > >>>
> > > >>> Best,
> > > >>> Jing Zhang
> > > >>>
> > > >>> Jing Ge  于2022年3月14日周一 18:15写道:
> > > >>>
> > >  Congrats! Very well deserved!
> > > 
> > >  Best,
> > >  Jing
> > > 
> > >  On Mon, Mar 14, 2022 at 10:34 AM Piotr Nowojski <
> > > > pnowoj...@apache.org>
> > >  wrote:
> > > 
> > > > Congratulations :)
> > > >
> > > > pon., 14 mar 2022 o 09:59 Yun Tang  > > >>>
> > >  napisał(a):
> > > >
> > > >> Congratulations, Yuan!
> > > >>
> > > >> Best,
> > > >> Yun Tang
> > > >> 
> > > >> From: Zakelly Lan 
> > > >> Sent: Monday, March 14, 2022 16:55
> > > >> To: dev@flink.apache.org 
> > > >> Subject: Re: [ANNOUNCE] New PMC member: Yuan Mei
> > > >>
> > > >> Congratulations, Yuan!
> > > >>
> > > >> Best,
> > > >> Zakelly
> > > >>
> > > >> On Mon, Mar 14, 2022 at 4:49 PM Johannes Moser <
> > > > j...@ververica.com>
> > > > wrote:
> > > >>
> > > >>> Congrats Yuan.
> > > >>>
> > >  On 14.03.2022, at 09:45, Arvid Heise <
> > >  ar...@apache.org
> > > >>>
> > > > wrote:
> > > 
> > >  Congratulations and well deserved!
> > > 
> > >  On Mon, Mar 14, 2022 at 9:30 AM Matthias Pohl <
> > > >> map...@apache.org
> > > 
> > > >> wrote:
> > > 
> > > > Congratulations, Yuan.
> > > >
> > > > On Mon, Mar 14, 2022 at 9:25 AM Shuo Cheng <
> > > >> njucs...@gmail.com>
> > > >> wrote:
> > > >
> > > >> Congratulations, Yuan!
> > > >>
> > > >> On Mon, Mar 14, 2022 at 4:22 PM Anton
> > >  Kalashnikov <
> > > >> kaa@yandex.com>
> > > >> wrote:
> > > >>
> > > >>> Congratulations, Yuan!
> > > >>>
> > > >>> --
> > > >>>
> > > >>> Best regards,
> > > >>> Anton Kalashnikov
> > > >>>
> > > 

[jira] [Created] (FLINK-26760) The new CSV source (file system source + CSV format) does not support reading files whose file encoding is not UTF-8

2022-03-20 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-26760:
--

 Summary: The new CSV source (file system source + CSV format) does 
not support reading files whose file encoding is not UTF-8
 Key: FLINK-26760
 URL: https://issues.apache.org/jira/browse/FLINK-26760
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem
Reporter: Lijie Wang
 Attachments: example.csv

The new CSV source (file system source + CSV format) does not support reading
files whose encoding is not UTF-8, but the legacy {{CsvTableSource}} supports it.

We provide an {{example.csv}} whose file encoding is ISO-8859-1.

When reading it with the legacy {{CsvTableSource}}, it executes correctly:
{code:java}
@Test
public void testLegacyCsvSource() {
EnvironmentSettings environmentSettings = 
EnvironmentSettings.inBatchMode();
TableEnvironment tEnv = TableEnvironment.create(environmentSettings);

CsvTableSource.Builder builder = CsvTableSource.builder();

CsvTableSource source =
builder.path("example.csv")
.emptyColumnAsNull()
.lineDelimiter("\n")
.fieldDelimiter("|")
.field("name", DataTypes.STRING())
.build();
ConnectorCatalogTable catalogTable = 
ConnectorCatalogTable.source(source, true);
tEnv.getCatalog(tEnv.getCurrentCatalog())
.ifPresent(
catalog -> {
try {
catalog.createTable(
new 
ObjectPath(tEnv.getCurrentDatabase(), "example"),
catalogTable,
false);
} catch (Exception e) {
throw new RuntimeException(e);
}
});

tEnv.executeSql("select count(name) from example").print();
}
{code}
When reading it with the new CSV source (file system source + CSV format), as in
the test below, the following error is thrown:
{code:java}
@Test
public void testNewCsvSource() {
EnvironmentSettings environmentSettings = 
EnvironmentSettings.inBatchMode();
TableEnvironment tEnv = TableEnvironment.create(environmentSettings);

String ddl =
"create table example ("
+ "name string"
+ ") with ("
+ "'connector' = 'filesystem',"
+ "'path' = 'example.csv',"
+ "'format' = 'csv',"
+ "'csv.array-element-delimiter' = '\n',"
+ "'csv.field-delimiter' = '|',"
+ "'csv.null-literal' = ''"
+ ")";

tEnv.executeSql(ddl);
tEnv.executeSql("select count(name) from example").print();
}
{code}
{code:java}
Caused by: java.lang.RuntimeException: One or more fetchers have encountered 
exception
at 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:225)
at 
org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:169)
at 
org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:130)
at 
org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:385)
at 
org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: SplitFetcher thread 0 received 
unexpected exception while polling the records
at 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:150)
at 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetche

[jira] [Created] (FLINK-26761) Fix the cast exception thrown by PreValidateReWriter when insert into/overwrite a partitioned table.

2022-03-20 Thread zoucao (Jira)
zoucao created FLINK-26761:
--

 Summary: Fix the cast exception thrown by PreValidateReWriter when 
insert into/overwrite a partitioned table.
 Key: FLINK-26761
 URL: https://issues.apache.org/jira/browse/FLINK-26761
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: zoucao


In `PreValidateReWriter#appendPartitionAndNullsProjects`, we should use
{code:java}
val names = sqlInsert.getTargetTableID.asInstanceOf[SqlIdentifier].names
{code}
to get the table name, instead of
{code:java}
val names = sqlInsert.getTargetTable.asInstanceOf[SqlIdentifier].names
{code}
when we execute the following sql:
{code:java}
insert into/overwrite table_name /*+ options(xxx) */ partition(xxx) select  
{code}
Invoking `sqlInsert.getTargetTable` returns a SqlTableRef, which cannot be cast
to a SqlIdentifier.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[DISCUSS] FLIP-214 Support Advanced Function DDL

2022-03-20 Thread 刘大龙
Hi, everyone




I would like to open a discussion on supporting advanced function DDL. This
proposal is a continuation of FLIP-79, in which the Flink function DDL was defined.
Until now it is only partially released, because function DDL with user-defined
resources has not been fully discussed and implemented. It is an important feature:
with support for registering UDFs with custom jar resources, users can use UDFs
much more easily, without having to put the jars on the classpath in advance.
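
To make the goal concrete, here is a rough sketch of the kind of statement this
FLIP targets. The exact DDL grammar is what the FLIP will define, the function
name, class, and jar path below are hypothetical, and this does not work on
currently released Flink versions:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class FunctionDdlWithJarExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Register a UDF whose implementation ships in a remote jar, without
        // putting the jar on the classpath in advance (proposed syntax).
        tEnv.executeSql(
                "CREATE TEMPORARY FUNCTION my_upper "
                        + "AS 'com.example.udf.MyUpperFunction' LANGUAGE JAVA "
                        + "USING JAR 'hdfs:///path/to/my-udf.jar'");
    }
}
{code}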

Looking forward to your feedback.




[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL




Best,

Ron




Re: [VOTE]FLIP-215: Introduce FlinkSessionJob CRD in the kubernetes operator

2022-03-20 Thread Shuiqiang Chen
+1 (non-binding)
It will make job management more convenient in session mode.

Best,
shuiqiang

Xintong Song  于2022年3月21日周一 09:57写道:

> Thanks Aitozi, the FLIP LGTM.
>
> +1 (binding)
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Mar 18, 2022 at 10:36 PM Őrhidi Mátyás 
> wrote:
>
> > +1 (non-binding)
> > nice addition to the operator
> >
> > Cheers,
> > Matyas
> >
> > On Fri, Mar 18, 2022 at 12:10 PM Biao Geng  wrote:
> >
> > > +1(non-binding)
> > > Thanks for the work!
> > >
> > > Best,
> > > Biao Geng
> > >
> > > Yang Wang  于2022年3月18日周五 19:01写道:
> > >
> > > > +1(binding)
> > > >
> > > > Thanks for your contribution.
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Gyula Fóra  于2022年3月18日周五 18:44写道:
> > > >
> > > > > I think this is a simple and valuable addition that will also be a
> > > > building
> > > > > block for other important future features.
> > > > >
> > > > > +1
> > > > >
> > > > > Gyula
> > > > >
> > > > > On Fri, Mar 18, 2022 at 10:30 AM Aitozi 
> > wrote:
> > > > >
> > > > > > Hi community:
> > > > > > I'd like to start a vote on FLIP-215: Introduce
> FlinkSessionJob
> > > CRD
> > > > > in
> > > > > > the kubernetes operator [1] which has been discussed in the
> thread
> > > [2].
> > > > > >
> > > > > > The vote will be open for at least 72 hours unless there is an
> > > > objection
> > > > > or
> > > > > > not enough votes.
> > > > > >
> > > > > > [1]:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-215%3A+Introduce+FlinkSessionJob+CRD+in+the+kubernetes+operator
> > > > > > [2]:
> > > https://lists.apache.org/thread/fpp5m9jkr0wnjryd07xtpj13t80z99yt
> > > > > >
> > > > > > Best,
> > > > > > Aitozi.
> > > > > >
> > > > >
> > > >
> > >
> >
>