Hi Rodrigo,
Flink doesn't provide an Avro uber jar, so in release-1.11 you need to add all
the dependency jars manually, such as avro, jackson-core-asl,
jackson-mapper-asl and joda-time.
However, I found a JIRA[1] opened a few days ago that provides a default
Avro uber jar.
[1] ht
Hi Rodrigo,
For the connectors, Pyflink just wraps the java implementation.
I am not an expert on Avro and the corresponding connectors, but as far as
I know, DataTypes really cannot declare the union type you mentioned.
Regarding the bytes encoding you mentioned, I actually have no good
suggestion
Hi Lei,
If you want to write your custom partitioner, I think you can refer to the
built-in FlinkFixedPartitioner[1]
[1]
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/partitioner/FlinkFixedPartitio
Thanks Zhu for the great work and everyone who contributed to this release!
Best,
Xingbo
Guowei Ma 于2020年8月26日周三 下午12:43写道:
> Hi,
>
> Thanks a lot for being the release manager Zhu Zhu!
> Thanks to everyone who contributed to this!
>
> Best,
> Guowei
>
>
> On Wed, Aug 26, 2020 at 11:18 AM Yun Tang wr
Congratulations Dian!
Best,
Xingbo
jincheng sun 于2020年8月27日周四 下午5:24写道:
> Hi all,
>
> On behalf of the Flink PMC, I'm happy to announce that Dian Fu is now part
> of the Apache Flink Project Management Committee (PMC).
>
> Dian Fu has been very active on the PyFlink component, working on various
>
Hi Manas,
I think you forgot to add the kafka jar[1] dependency. You can use the -j
argument of the command line[2] or the Python Table API to specify the jar.
For details about the APIs for adding Java dependencies, you can refer to
the relevant documentation[3]
[1]
https://ci.apache.org/projects/flink
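For illustration, a rough sketch of both ways (the jar path and file name
below are placeholders, not necessarily the exact jar you need):

# 1) on the command line when submitting the job
#    flink run -py my_job.py -j /path/to/flink-sql-connector-kafka_2.11-1.11.0.jar

# 2) through the Python Table API configuration
t_env.get_config().get_configuration().set_string(
    "pipeline.jars",
    "file:///path/to/flink-sql-connector-kafka_2.11-1.11.0.jar")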
a
> pyFlink job is through the command line right? Can I do that through the
> GUI?
>
> On Fri, Aug 28, 2020 at 8:17 PM Xingbo Huang wrote:
>
>> Hi Manas,
>>
>> I think you forgot to add kafka jar[1] dependency. You can use the
>> argument -j of the comm
Hi Manas,
When running locally, you need
`ten_sec_summaries.get_job_client().get_job_execution_result().result()` to
wait for the job to finish. However, when you submit to a cluster, you need
to delete this code. In my opinion, the current feasible solution is for you
to prepare two sets of code for this
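For reference, a minimal sketch of what I mean (assuming `ten_sec_summaries`
is the TableResult of your insert statement; `insert_sql` is a placeholder):

# local run (python my_job.py): block until the job finishes
ten_sec_summaries = t_env.execute_sql(insert_sql)
ten_sec_summaries.get_job_client().get_job_execution_result().result()

# cluster run (flink run -py my_job.py): remove the blocking call above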
ossible to detect the environment programmatically.
>>
>> Regards,
>> Manas
>>
>> On Wed, Sep 2, 2020 at 7:32 AM Xingbo Huang wrote:
>>
>>> Hi Manas,
>>>
>>> When running locally, you need
>>> `ten_sec_summaries.get_job_clien
Hi,
I will do my best to provide PyFlink-related answers; I hope they help you.
>>> each udf function is a separate process, that is managed by Beam (but
I'm not sure I got it right).
Strictly speaking, it is not true that every UDF runs in a separate Python
process. For example, the two Python fu
Hi,
You can use the API to set the configuration:
table_env.get_config().get_configuration().set_string("taskmanager.memory.task.off-heap.size",
'80m')
The flink-conf.yaml way only takes effect when the job is submitted through
flink run; it does not take effect for the mini cluster way (python xxx.py).
Best,
Xingb
ou think that it looks like
> an issue ?
>
> Thx !
>
> вт, 13 окт. 2020 г. в 05:35, Xingbo Huang :
>
>> Hi,
>>
>> You can use api to set configuration:
>> table_env.get_config().get_configuration().set_string("taskmanager.memory.task.off-heap.size",
>
Hi,
Which version of PyFlink are you using? I think the API you are using is
not the PyFlink API available since Flink 1.9. For detailed usage of
PyFlink, you can refer to the doc[1]
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/python/table_api_tutorial.html
Best,
Xingbo
大森林 于2020年10月13日周
Hi,
I think you can directly declare `def accumulate(acc: MembershipsIDsAcc,
value1: Long, value2: Boolean)`
Best,
Xingbo
Rex Fenley 于2020年10月26日周一 上午9:28写道:
> If I switch accumulate to the following:
> def accumulate(acc: MembershipsIDsAcc, value:
> org.apache.flink.api.java.tuple.Tuple2[java.
Hi,
You can use the following API to add all the dependent jar packages you
need:
table_env.get_config().get_configuration().set_string("pipeline.jars",
"file:///my/jar/path/connector.jar;file:///my/jar/path/json.jar")
For more related content, you can refer to the pyflink doc[1]
[1]
https://ci.
Hi George,
Have you referred to the official document[1]?
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/kubernetes.html
Best,
Xingbo
在 2020年11月21日星期六,George Costea 写道:
> Hi there,
>
> Is there an example of how to deploy a flink cluster on Kubernetes?
> I'd lik
Hi Kai,
I took a look at the implementation of the filesystem connector. It decides
which files to read at startup, and that set does not change while the job
is running. If you need this functionality, you may need to implement a
custom connector.
Best,
Xingbo
eef hhj 于2020年11月21日周六 下午2:38写道:
> Hi,
>
> I'm faci
Hi Niklas,
Regarding `Exception in thread "grpc-nio-worker-ELG-3-2"
java.lang.NoClassDefFoundError:
org/apache/beam/vendor/grpc/v1p26p0/io/netty/buffer/PoolArena$1`,
it does not affect the correctness of the result. The reason is that some
resources are released asynchronously when Grpc Server is
>
> On 26. Nov 2020, at 02:59, Xingbo H
Hi Pierre,
Sorry for the late reply.
Your requirement is that your `Table` has a `field` in `Json` format whose
keys number up to 100k, and you want to use such a `field` as the
input/output of a `udf`, right? As to whether there is a limit on the number
of nested keys, I am not quite sure.
ch-schema-to-a-flink-datastream-on-the-fly
> [2]
> https://docs.cloudera.com/csa/1.2.0/datastream-connectors/topics/csa-schema-registry.html
>
> Le sam. 28 nov. 2020 à 04:04, Xingbo Huang a écrit :
>
>> Hi Pierre,
>>
>> Sorry for the late reply.
>> Your req
.f2), source.f3)
result.execute_insert("print_table")
if __name__ == '__main__':
test()
Best,
Xingbo
Pierre Oberholzer 于2020年12月1日周二 下午6:10写道:
> Hi Xingbo,
>
> That would mean giving up on using Flink (table) features on the content
> of the parsed J
> that could be used ?
>
> Thanks a lot !
>
> Best,
>
>
> Le mer. 2 déc. 2020 à 04:50, Xingbo Huang a écrit :
>
>> Hi Pierre,
>>
>> I wrote a PyFlink implementation, you can see if it meets your needs:
>>
>>
>> from pyflink.datastream impo
ding benchmarks
> vs. other frameworks) are also highly welcome.
>
> Thanks !
>
> Best
>
>
> Le jeu. 3 déc. 2020 à 02:53, Xingbo Huang a écrit :
>
>> Hi Pierre,
>>
>> This example is written based on the syntax of release-1.12 that is about
>> to be re
Hi,
This problem has been fixed[1] in 1.12.0,1.10.3,1.11.3, but release-1.11.3
and release-1.12.0 have not been released yet (VOTE has been passed). I run
your job in release-1.12, and the plan is correct.
[1] https://issues.apache.org/jira/browse/FLINK-19675
Best,
Xingbo
László Ciople 于2020年
Hi,
As far as I know, TableAggregateFunction is not supported yet in batch
mode[1]. You can try to use it in stream mode.
[1] https://issues.apache.org/jira/browse/FLINK-10978
Best,
Xingbo
Leonard Xu 于2020年12月8日周二 下午6:05写道:
> Hi, appleyuchi
>
> Sorry for the late reply,
> but could you descr
Hi,
Your code does not show how the `Orders` table is created. For how to
specify the time attribute in DDL, you can refer to the official
document[1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/sql/create.html
Best,
Xingbo
Appleyuchi 于2020年12月9日周三
Hi,
As Chesnay said, PyFlink already supports the stateless Python DataStream
APIs, so users are able to perform some basic data transformations, but it
doesn't provide state access support yet in release-1.12. The proposal[1]
for enhancing the API with state access has been made and the related
disc
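For reference, a rough sketch of the kind of stateless transformation that
is already possible (written from memory against the 1.12 API, so please
double-check the exact signatures in the docs of your version):

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
ds = env.from_collection([1, 2, 3], type_info=Types.INT())
# stateless map/filter-style operations work; keyed state access does not yet
ds.map(lambda i: i * 2, output_type=Types.INT()).print()
env.execute("stateless_demo")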
Hi Torben,
It is indeed a bug, and I have created a JIRA[1]. The workaround is to use
the index instead (written against release-1.12):
env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(1)
t_env = StreamTableEnvironment.create(env,
environment_set
Hi Andrew,
According to the error, you can try to check the file permission of
"/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh"
Normally, the permission of this script would be
-rwxr-xr-x
Best,
Xingbo
Andrew Kram
a python process that executes python udf code. If you don’t use
python udf, the contents of the execution are all running on the JVM.
Best,
Xingbo
Andrew Kramer 于2020年12月28日周一 下午8:29写道:
> Hi Xingbo,
>
> That file does not exist on the file system.
>
> Thanks,
> Andrew
>
> On
@Gordon Thanks a lot for the release and for being the release manager.
And thanks to everyone who made this release possible!
Best,
Xingbo
Till Rohrmann 于2021年1月3日周日 下午8:31写道:
> Great to hear! Thanks a lot to everyone who helped make this release
> possible.
>
> Cheers,
> Till
>
> On Sat, Jan
Hi meneldor,
I guess Shuiqiang is not using pyflink 1.12.0 to develop the example.
The signature of the `process_element` method has changed in the new
version[1]. In pyflink 1.12.0, you can use `collector.collect` to send out
your results.
[1] https://issues.apache.org/jira/browse/FLINK
Hi Shuiqiang, meneldor,
1. In fact, there is a problem with using a Python `Named Row` as the return
value of a user-defined function in PyFlink.
When serializing Row data, the serializer of each field follows
the order of the Row fields. But the field order of a Python `Named Ro
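As a workaround sketch, you can declare the result type explicitly and
return the fields positionally so the order always matches the declared
type (the field names below are made up):

from pyflink.common import Row
from pyflink.table import DataTypes
from pyflink.table.udf import udf

result_type = DataTypes.ROW([
    DataTypes.FIELD("id", DataTypes.BIGINT()),
    DataTypes.FIELD("name", DataTypes.STRING())])

@udf(result_type=result_type)
def to_row(i, s):
    # positional Row: field order matches the declared result type
    return Row(i, s)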
Context')? Can i access the value as in
> process_element() in the ctx for example?
>
> Thank you!
>
> On Mon, Jan 18, 2021 at 4:18 PM Xingbo Huang wrote:
>
>> Hi Shuiqiang, meneldor,
>>
>> 1. In fact, there is a problem with using Python `Named Row` as the
>> re
Thanks Xintong for the great work!
Best,
Xingbo
Peter Huang 于2021年1月19日周二 下午12:51写道:
> Thanks for the great effort to make this happen. It paves the way for us
> to use 1.12 soon.
>
> Best Regards
> Peter Huang
>
> On Mon, Jan 18, 2021 at 8:16 PM Yang Wang wrote:
>
> > Thanks Xintong for the great wor
:
> Thank you Xingbo!
>
> Do you plan to implement CoProcess functions too? Right now i cant find a
> convenient method to connect and merge two streams?
>
> Regards
>
> On Tue, Jan 19, 2021 at 4:16 AM Xingbo Huang wrote:
>
>> Hi meneldor,
>>
>> 1. Yes. Al
Hi,
Actually this is the effect of the CSV connector's optional quote_character
parameter. Its default value is ", so every field gets wrapped like that;
you can set this parameter to \0 to get the output you want.
st_env.connect(
Kafka()
.version("0.11")
.topic("logSink")
.start_from_earliest()
.property("zookeeper.connect", "localhost:2181")
You're welcome, happy to learn from each other 😀
Best,
Xingbo
jack wrote on Mon, Jun 1, 2020 at 9:07 PM:
> Thank you very much for the explanation. I have just started using pyflink
> and am not very familiar with it yet; any guidance is much appreciated.
>
>
>
>
>
>
> On 2020-06-01 20:50:53, "Xingbo Huang" wrote:
>
> Hi,
> Actually this is the effect of the CSV connector's optional
> quote_character parameter. Its default value is ", so every field gets wrapped like that; you can set this parameter to \0
Hi Manas,
Since Flink 1.9, the entire architecture of PyFlink has been redesigned, so
the method described in the link won't work.
Instead, you can use the more convenient DDL[1] or descriptors[2] to read
Kafka data. Besides, you can refer to the common questions about PyFlink[3]
[1]
https://ci.apache.org/
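For illustration, a rough sketch of the DDL approach (topic, fields and
addresses are placeholders):

source_ddl = """
    CREATE TABLE kafka_source (
        id BIGINT,
        data STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'my_topic',
        'properties.bootstrap.servers' = 'localhost:9092',
        'properties.group.id' = 'my_group',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
"""
t_env.execute_sql(source_ddl)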
5, in register_table_source
> self._j_connect_table_descriptor.registerTableSource(name)
> File
> "/home/manas/anaconda3/envs/flink_env/lib/python3.7/site-packages/py4j/java_gateway.py",
> line 1286, in __call__
> answer, self.gateway_client, self.target_id, self.na
Hi Manas,
Maybe it is a bug of the Java Descriptor. You can try the DDL[1] way, which
is the more recommended approach
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/
Best,
Xingbo
Khachatryan Roman 于2020年7月13日周一 下午7:23写道:
> Hi Manas,
>
> Do you have the same erro
do that.
> @Xingbo Huang - okay, I didn't know DDL was the more
> recommended way.
> Please let me know if you confirm that this is a bug.
> Thanks!
>
> On Mon, Jul 13, 2020 at 5:07 PM Xingbo Huang wrote:
>
>> Hi Manas,
>> Maybe it is the bug of Java Descript
Hi Manas,
I tested your code and there are no errors. Because execute_sql is an
asynchronous method, you need to wait on the result through the TableResult;
you can try the following code:
from pyflink.datastream import StreamExecutionEnvironment,
TimeCharacteristic
from pyflink.table import StreamTableEnviron
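(The snippet above is cut off in the archive; the essential part is blocking
on the TableResult, roughly like this, with mySource/mySink as placeholders:)

table_result = t_env.execute_sql("INSERT INTO mySink SELECT * FROM mySource")
# block until the job finishes (needed when running locally with python xxx.py)
table_result.get_job_client().get_job_execution_result().result()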
Hi Jesse,
For how to add jar packages, you can refer to the Common Questions doc[1]
of PyFlink. PyFlink 1.10 and 1.11 differ somewhat in how jar packages are
added, and the document explains this in detail
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/p
he latest pyFlink 1.11 API? I couldn't find anything related to
> awaiting asynchronous results.
>
> Thanks,
> Manas
>
> On Tue, Jul 14, 2020 at 1:29 PM Xingbo Huang wrote:
>
>> Hi Manas,
>>
>>
>> I tested your code, but there are no errors. Because
uot;,
>> "properties": {
>> "monitorId": {
>> "type": "string"
>> },
>> "deviceId": {
>> "type": "string"
>> },
>>
Hi Manas,
You need to join with the python udtf function. You can try the following
sql:
ddl_populate_temporary_table = f"""
INSERT INTO {TEMPORARY_TABLE}
SELECT * FROM (
SELECT monitorId, featureName, featureData, time_st
FROM {INPUT_TABLE}, LATERAL TABLE(split(data)) as T(feature
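For completeness, a rough sketch of what the `split` udtf might look like
(the field names and types are made up; adjust them to your actual schema):

from pyflink.table import DataTypes
from pyflink.table.udf import udtf

@udtf(input_types=DataTypes.STRING(),
      result_types=[DataTypes.STRING(), DataTypes.STRING(), DataTypes.STRING()])
def split(data):
    import json
    payload = json.loads(data)
    for feature_name, feature_data in payload.items():
        # one output row per feature in the JSON payload
        yield feature_name, str(feature_data), data

t_env.register_function("split", split)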
Hi Jesse,
I think the rowtime type you declared on the source schema is
DataTypes.TIMESTAMP(); you should also use DataTypes.TIMESTAMP() on the sink
schema
Xingbo
Jesse Lord 于2020年7月15日周三 下午11:41写道:
> I am trying to sink the rowtime field in pyflink 1.10. I get the following
> error
>
>
Hi Wojtek,
you need to use the fat jar 'flink-sql-connector-kafka_2.11-1.11.0.jar'
which you can download in the doc[1]
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html
Best,
Xingbo
Wojciech Korczyński 于2020年7月23日周四
下午4:57写道:
> Hello,
>
> I am tr
ain(CliFrontend.java:992)
>
> Maybe the way the python program is written is incorrect. Can it be
> deprecated taking into account that the installed flink version is 1.11?
>
> Best regards,
> Wojtek
>
> czw., 23 lip 2020 o 12:01 Xingbo Huang napisał(a):
>
>> Hi Wojt
.create_temporary_table('mySink')
>
>
> t_env.scan('mySource') \
> .select('"flink_job_" + value') \
> .insert_into('mySink')
>
> t_env.execute("tutorial_job")
>
> I have installed PyFlink 1.11 s
l.invoke0(Native Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.r
Hi rookieCOder,
You need to make sure that your files can be read by each slaves, so an
alternative solution is to put your files on hdfs
Best,
Xingbo
rookieCOder 于2020年7月27日周一 下午5:49写道:
> 'm coding with pyflink 1.10.0 and building cluster with flink 1.10.0
> I define the source and the sink as
Yes. You are right.
Best,
Xingbo
rookieCOder 于2020年7月27日周一 下午6:30写道:
> Hi, Xingbo
> Thanks for your reply.
> So the point is that simply link the source or the sink to the master's
> local file system will cause the error that the slaves cannot read the
> source/sink files? Thus the simplest so
Hi,
You can use the `set_python_requirements` method to specify your
requirements.txt; you can refer to the documentation[1] for details
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/python/dependency_management.html#python-dependency
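For example (the paths are placeholders):

# requirements.txt lists the third-party packages your Python UDFs need
t_env.set_python_requirements(
    requirements_file_path="/path/to/requirements.txt",
    requirements_cache_dir="/path/to/cached_dir")  # the cache dir is optional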
Best,
Xingbo
rookieCOder 于2
Hi Jincheng,
Thanks a lot for bringing up this discussion and the proposal.
Big +1 for improving the structure of PyFlink doc.
It will be very helpful to give PyFlink users a unified entry point for
learning the PyFlink documentation.
Best,
Xingbo
Dian Fu 于2020年7月31日周五 上午11:00写道:
> Hi Jincheng,
>
> Thanks
ganized documentation will greatly improve the efficiency and
>>>> > experience for developers.
>>>> >
>>>> > Best,
>>>> > Shuiqiang
>>>> >
>>>> > Hequn Cheng 于2020年8月1日周六 上午8:42写道:
>&g
Hi,
>From the error message, I think the problem is no python interpreter on
your TaskManager machine. You need to install a python 3.5+ interpreter on
the TM machine, and this python environment needs to install pyflink (pip
install apache-flink). For details, you can refer to the document[1].
[
Hi,
The problem is that the legacy DataSet planner you are using does not
support the FileSystem connector you declared. You can use the Blink planner
to achieve what you need.
>>>
t_env = BatchTableEnvironment.create(
environment_settings=EnvironmentSettings.new_instance()
.in_batch_mode().use_blink_planner().build())
Hi Yik San,
Thanks for investigating PyFlink together with all these ML libs.
IMO, you could refer to the flink-ai-extended project that supports
TensorFlow on Flink, PyTorch on Flink, etc.; its repository URL is
https://github.com/alibaba/flink-ai-extended. Flink AI Extended is a
proje
Yes, you need to ensure that the key and value types of the Map match the
declared result type
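For example, a minimal sketch (the parser body is just a dummy):

from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(result_type=DataTypes.MAP(DataTypes.STRING(), DataTypes.STRING()))
def parse(s):
    # both keys and values must be strings to match the declared result type
    return {"raw": s}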
Best,
Xingbo
Yik San Chan 于2021年3月19日周五 下午3:41写道:
> I got why regarding the simplified question - the dummy parser should
> return key(string)-value(string), otherwise it fails the result_type spec
>
> On Fri, Mar 19
Hi Dawid,
Thanks a lot for the great work! Regarding the flink-python issue, I have
prepared a quick fix and will try to get it in ASAP.
Best,
Xingbo
Dawid Wysakowicz 于2021年4月2日周五 上午4:04写道:
> Hi everyone,
> As promised I created a release candidate #0 for the version 1.13.0. I am
> not star
Hi Sumeet,
Python row-based operations will be supported in release-1.13. I guess
you are looking at the code on the master branch. Since you are using the
Python Table API, you can use a Python UDF to parse your data. For the
details of Python UDFs, you can refer to the doc[1].
[1]
https://ci.a
Hi Yik San,
You can check whether the execution environment in use is a
`LocalStreamEnvironment`; in PyFlink you can get the class of the underlying
Java object through py4j. You can take a look at the example I wrote below;
I hope it will help you
```
from pyflink.table impo
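# (the original example is cut off in the archive; a rough sketch of the
#  idea follows, relying on py4j and PyFlink internals, so it may differ
#  across versions)
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
# inspect the class name of the underlying Java environment object via py4j
j_env_class = env._j_stream_execution_environment.getClass().getName()
is_local = "LocalStreamEnvironment" in j_env_class
print(is_local)
```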
Hi Sumeet,
Due to a limitation of the original PyFlink serializer design, there is
no way to pass attribute names to Row in row-based operations. In
release-1.14, I am reworking the serializer implementations[1].
Once that is complete, accessing attribute names of `Row` in row-based
operati
+1 (non-binding)
- verified checksums and signatures
- built from source code
- check apache-flink source/wheel package content
- run python udf job
Best,
Xingbo
Dawid Wysakowicz 于2021年5月27日周四 下午9:45写道:
> +1 (binding)
>
>- verified signatures and checksums
>- built from sources and run
Congratulations, Dian.
Well deserved!
Best,
Xingbo
Wei Zhong 于2020年1月16日周四 下午6:13写道:
> Congrats Dian Fu! Well deserved!
>
> Best,
> Wei
>
> 在 2020年1月16日,18:10,Hequn Cheng 写道:
>
> Congratulations, Dian.
> Well deserved!
>
> Best, Hequn
>
> On Thu, Jan 16, 2020 at 6:08 PM Leonard Xu wrote:
>
>
Hi Jincheng,
Thanks for driving this.
+1 for this proposal.
Compared to building from source, downloading directly from PyPI will
greatly reduce the setup cost for Python users.
Best,
Xingbo
Wei Zhong 于2020年2月4日周二 下午12:43写道:
> Hi Jincheng,
>
> Thanks for bring up this discussion!
>
> +1
+1 (non-binding)
- Install the PyFlink by `pip install` [SUCCESS]
- Run word_count.py [SUCCESS]
Thanks,
Xingbo
Becket Qin 于2020年2月12日周三 下午2:28写道:
> +1 (binding)
>
> - verified signature
> - Ran word count example successfully.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Feb 12, 2020 at 1
Thanks a lot for the release.
Great work, Jincheng!
Also thanks to everyone who contributed to this release.
Best,
Xingbo
Till Rohrmann 于2020年2月18日周二 下午11:40写道:
> Thanks for updating the 1.9.2 release wrt Flink's Python API Jincheng!
>
> Cheers,
> Till
>
> On Thu, Feb 13, 2020 at 12:25 PM H
ight documentation?
>
> Thanks!
> -chad
>
>
> On Thu, Feb 20, 2020 at 5:39 AM Xingbo Huang wrote:
>
>> Thanks a lot for the release.
>> Great Work, Jincheng!
>> Also thanks to participants who contribute to this release.
>>
>> Best,
>> Xingbo
Congratulations Jingsong! Well deserved.
Best,
Xingbo
wenlong.lwl 于2020年2月21日周五 上午11:43写道:
> Congrats Jingsong!
>
> On Fri, 21 Feb 2020 at 11:41, Dian Fu wrote:
>
> > Congrats Jingsong!
> >
> > > 在 2020年2月21日,上午11:39,Jark Wu 写道:
> > >
> > > Congratulations Jingsong! Well deserved.
> > >
> > >
Hi Wouter,
Sorry for the late reply. I will try to answer your questions in detail.
1. >>> Performance problem.
When running a udf job locally, Beam will use a loopback mode to connect back
to the Python process that compiled the job, so the job will start up
faster than pyflin
Hi Wouter,
In fact, our users have encountered the same problem. Whenever the `bundle
size` or `bundle time` threshold is reached, the data in the buffer has to be
sent from the JVM to the Python VM, and the JVM then waits for the Python VM
to process it and send the results back before forwarding them to the
downstream op
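The two knobs can be tuned like this (the values are only examples; you will
need to benchmark what works for your workload):

config = table_env.get_config().get_configuration()
# number of elements buffered before a bundle is sent to the Python VM
config.set_string("python.fn-execution.bundle.size", "10000")
# max time in milliseconds before a bundle is flushed
config.set_string("python.fn-execution.bundle.time", "1000")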
ommendations with
> regard to these two configuration values, to get somewhat reasonable
> performance?
>
> Thanks a lot!
> Wouter
>
> On Thu, 8 Jul 2021 at 10:26, Xingbo Huang wrote:
>
>> Hi Wouter,
>>
>> In fact, our users have encountered the same problem
Hi,
I think your understanding is correct. The results seem a little wired. I'm
looking into this and will let you know when there are any findings.
Best,
Xingbo
赵飞 于2021年7月12日周一 下午4:48写道:
> Hi all,
> I'm using pyflink to develop a module, whose main functionality is
> processing user data bas
is
not used in `process_element2`
[1] https://issues.apache.org/jira/browse/FLINK-23368
Best,
Xingbo
赵飞 于2021年7月12日周一 下午10:00写道:
> Thanks. In addition, I run the program in a local mini cluster mode, not
> sure if it would affect the results.
>
> Xingbo Huang 于2021年7月12日周一 下午9:0
Hi Zhongle Wang,
Your understanding is correct. Firstly, you need to provide an
implementation of a java connector, then add this jar to the dependency[1],
and finally add a python connector wrapper.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/python/faq/#adding-jar-
Hi sonia,
As far as I know, pyflink users prefer to use python udf[1][2] for model
prediction. Load the model when the udf is initialized, and then predict
each new piece of data
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/table/udfs/overview/
[2]
https://nightl
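A rough sketch of that pattern (the model path and the pickle-based loading
are just placeholders for whatever framework you use):

from pyflink.table import DataTypes
from pyflink.table.udf import ScalarFunction, udf

class Predict(ScalarFunction):
    def open(self, function_context):
        # load the model once per Python worker, not once per record
        import pickle
        with open("/path/to/model.pkl", "rb") as f:
            self.model = pickle.load(f)

    def eval(self, features):
        return float(self.model.predict([features])[0])

predict = udf(Predict(), result_type=DataTypes.DOUBLE())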
Hi,
With py4j, you can call any Java method. On how to create a Java Row, you
can call the `createRowWithNamedPositions` method of `RowUtils`[1].
[1]
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/types/RowUtils.java#
Best,
Xingbo
Francis Conroy 于2022年2月25
Thanks a lot for being our release manager Konstantin and everyone who
contributed. I have a question about pyflink. I see that there are no
corresponding wheel packages uploaded on pypi, only the source package is
uploaded. Is there something wrong with building the wheel packages?
Best,
Xingbo
affected Flink
> 1.13.6, the other release I was recently managing. I simply skipped a step
> in the release guide.
>
> It should be fixed now. Could you double-check?
>
> Cheers,
>
> Konstantin
>
> On Wed, Mar 16, 2022 at 4:07 AM Xingbo Huang wrote:
>
> > Thank
Hi,
Are you using the DataStream API or the Table API? If you are using the
Table API, you can use the connector by executing SQL[1]. If you are using
the DataStream API, there is no ES connector API provided; you need to
write Python wrapper code, but the wrapper code is very simple. The
underlying
Hi Kenneth,
In Flink 1.15, PyFlink only guarantees support for Python 3.6, 3.7 and
3.8[1]. In release-1.16, PyFlink will provide support for Python 3.9[2].
Coming back to your installation error: in Flink 1.15, the numpy version
range that PyFlink depends on is numpy>=1.14.3,<1.20. So when you exec
Hi John,
Could you provide the code snippet and the version of pyflink you used?
Best,
Xingbo
John Tipper 于2022年6月16日周四 17:05写道:
> Hi all,
>
> I'm trying to run a PyFlink unit test to test some PyFlink SQL and where
> my code uses a Python UDF. I can't share my code but the test case is
> si
my code but Flink is 1.13. The main Flink code is
> running inside Kinesis on AWS so I cannot change the version.
>
> Many thanks,
>
> John
>
> Sent from my iPhone
>
> On 16 Jun 2022, at 10:37, Xingbo Huang wrote:
>
>
> Hi John,
>
> Could you provid
her than explicit
> calls to the Table and DataStream APIs.
>
> Is this a good pattern or are there caveats I should be aware of please?
>
> Many thanks,
>
> John
>
>
> --
> *From:* Xingbo Huang
> *Sent:* 16 June 2022 12:34
>
The Apache Flink community is very happy to announce the release of Apache
Flink 1.14.5, which is the fourth bugfix release for the Apache Flink 1.14
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
Thanks a lot for being our release manager David and everyone who
contributed.
Best,
Xingbo
David Anderson 于2022年7月8日周五 06:18写道:
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.15.1, which is the first bugfix release for the Apache Flink 1.15
> series.
>
>
Hi Gyula,
According to the log, we can see that you downloaded the source package of
pemja, not the wheel package of pemja[1]. I guess you are using the m1
machine. If you install pemja from the source package, you need to have
JDK, gcc tools and CPython with Numpy in the environment. I believe th
Thanks Danny for driving this release
Best,
Xingbo
Jing Ge 于2022年8月25日周四 05:50写道:
> Thanks Danny for your effort!
>
> Best regards,
> Jing
>
> On Wed, Aug 24, 2022 at 11:43 PM Danny Cranmer
> wrote:
>
>> The Apache Flink community is very happy to announce the release of
>> Apache Flink 1.15.2
Hi Raman,
This problem comes from a mismatch between your Flink version and your
PyFlink version
Best,
Xingbo
Ramana 于2022年9月6日周二 15:08写道:
> Hello there,
>
> I have a pyflink setup of 1 : JobManager - 1 : Task Manager.
>
> Trying to run a pyflink job and no matter what i do, i get the follow
The Apache Flink community is very happy to announce the release of Apache
Flink 1.14.6, which is the fifth bugfix release for the Apache Flink 1.14
series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
a
+1 for reverting these changes in Flink 1.16, so I will cancel 1.16.0-rc1.
+1 for `numXXXSend` as the alias of `numXXXOut` in 1.15.3.
Best,
Xingbo
Chesnay Schepler 于2022年10月10日周一 19:13写道:
> > I’m with Xintong’s idea to treat numXXXSend as an alias of numXXXOut
>
> But that's not possible. If it
The Apache Flink community is very happy to announce the release of Apache
Flink 1.16.0, which is the first release for the Apache Flink 1.16 series.
Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applicat
Hi Yogi,
I think the problem comes from poetry depending on the metadata in PyPI.
This problem has been reported in
https://issues.apache.org/jira/browse/FLINK-29817 and I will fix it in
1.16.1.
Best,
Xingbo
Yogi Devendra 于2022年11月17日周四 06:21写道:
> Dear community/maintainers,
>
> Thanks for the
Hi Harshit,
According to the stack trace you provided, I guess you define your Python
function in the main file, and the Python function imports xgboost
globally. The reason for the error is that the xgboost library is hard for
cloudpickle to serialize. There are two ways to solve this:
1. Move `impo
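A sketch of that first option (moving the import inside the function body so
that cloudpickle does not try to serialize the xgboost module with the
closure; the body is only illustrative):

from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(result_type=DataTypes.FLOAT())
def predict(value):
    # import here instead of at module level
    import xgboost as xgb
    dmatrix = xgb.DMatrix([[value]])
    # ... load your model and call model.predict(dmatrix) here ...
    return float(value)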
Hi Janus,
Which Python version are you using? Flink 1.19 removed support for
Python 3.7.
Best,
Xingbo
janusgraph 于2025年5月12日周一 15:03写道:
> Sorry about that the snapshot is broken.
>
> The errors are as follows.
> ```
> # pip3 install apache-flink==1.20.1
> Collecting apache-flink==1.20.1
>