Re: [DISCUSS] PIP 193 : Sink preprocessing Function

2022-07-25 Thread Jerry Peng
My feedback is to make this change as self contained as possible. Can we just have a special implementation of a sink that will run the logic of the "preprocess" function? There are many places in the code where we figure out if it is a source, sink or a function based on the fields in the Functi

Re: Architecture of function authorization for process mode

2022-01-25 Thread Jerry Peng
Devin, You can customize how each function authenticates with brokers by creating an implementation of this interface: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/auth/FunctionAuthProvider.java and setting the class name of your

Re: [VOTE] PIP-86: Pulsar Functions: Preload and release external resources

2022-01-21 Thread Jerry Peng
+1 On Fri, Jan 21, 2022 at 12:07 PM Neng Lu wrote: > Hi All, > > I would like to start a VOTE on the PIP 86. (If it's already been voted, > please let me know.) > > The issue for PIP 86 is here: > > https://github.com/apache/pulsar/wiki/PIP-86%3A-Pulsar-Functions%3A-Preload-and-release-external-

Re: [DISCUSSION] PIP-133 Pulsar Functions Add API For Accessing Other Function States

2022-01-18 Thread Jerry Peng
I have concerns about security in this case and potential consistency issues. We will need to define and implement some sort of ACL system first on top of state for this to make sense. On Mon, Jan 17, 2022 at 5:52 PM Ethan Merrill wrote: > Thanks for the feedback. I see your concerns. > > I've

Re: [Vote] PIP 104: Add new consumer type: TableView

2021-12-06 Thread Jerry Peng
+1 On Mon, Dec 6, 2021 at 1:01 PM Enrico Olivelli wrote: > +1 (binding) > > > Enrico > > Il Lun 6 Dic 2021, 19:27 Matteo Merli ha scritto: > > > +1 > > > > > > -- > > Matteo Merli > > > > > > On Wed, Dec 1, 2021 at 12:22 PM Neng Lu wrote: > > > > > > Hi Pulsar Community, > > > > > > I would l

Re: The ability to drain Pulsar Function workers

2021-09-02 Thread Jerry Peng
Ivan, I'm not super familiar with functions, couldn't a new field be added > to Function.Assignment, so that workerId X could be marked as > unschedulable. > Currently, the assignment topic only holds data that provide a mapping between function instances and workers i.e. assignments. The topic

Re: The ability to drain Pulsar Function workers

2021-09-01 Thread Jerry Peng
Ivan, thanks for reviewing my proposal. I will answer your questions inline. When a leader fails, does the new leader automatically create a new > assignment, or does it continue with the assignment from the previous > leader? > The new leader will resume scheduling duties with the current set of

The ability to drain Pulsar Function workers

2021-08-31 Thread Jerry Peng
It is useful to be able to autoscale the number of Pulsar Function workers when configuring the cluster to run instances as threads (ThreadRuntime) in an environment like K8s. When scaling out the number of workers, the "rebalance" endpoint can be invoked after the scale out so that the new worke

Re: Lack of retries on TooManyRequests

2021-08-06 Thread Jerry Peng
Currently, there are two ways to get the TooManyRequests errors in the client. 1) The client enforces a maximum number of pending lookups: https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L733 The max number can be set when cre
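
A minimal sketch of the first case, a client-side cap on pending lookups. This is not the code in ClientCnx.java; the names `LookupLimiter`, `TooManyRequestsError`, and `max_pending` are hypothetical and exist only for this illustration, and the real client may queue or handle excess lookups differently.

```python
import threading


class TooManyRequestsError(Exception):
    """Hypothetical stand-in for the client's TooManyRequests error."""


class LookupLimiter:
    """Rejects a new lookup once `max_pending` lookups are in flight (assumed behavior)."""

    def __init__(self, max_pending):
        self._max_pending = max_pending
        self._pending = 0
        self._lock = threading.Lock()

    def begin(self):
        # Called before sending a lookup; fails fast when the cap is reached.
        with self._lock:
            if self._pending >= self._max_pending:
                raise TooManyRequestsError("too many pending lookups")
            self._pending += 1

    def end(self):
        # Called when a lookup completes, freeing one slot.
        with self._lock:
            self._pending -= 1
```

A caller would wrap each lookup in `begin()` and `end()`, and could choose to retry after a delay when `TooManyRequestsError` is raised.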

Re: PIP-89: Structured documented logging for Pulsar

2021-08-03 Thread Jerry Peng
Ivan, Awesome proposal! Structured logging like this will definitely help Pulsar be easier to debug. Best, Jerry On Tue, Aug 3, 2021 at 4:33 AM Ivan Kelly wrote: > Hi folks, > > I've just created a PIP in the wiki to add structured documented > logging for Pulsar. > More often than not, when

Re: Problems with Functions/IO in Upgrading Pulsar from 2.7 to 2.8

2021-07-19 Thread Jerry Peng
I agree that the best we can do right now is to just clearly document this as a potential problem when updating 2.7 to 2.8. We should definitely make every attempt to not make BC breaking changes. However, there are times when we have to make these tough decisions for one reason or another. The bi

Re: Re: Re: Re: Discussion about https://github.com/apache/pulsar/pull/11112

2021-07-09 Thread Jerry Peng
Hi Everyone, Some of my thoughts: 1. If we are going to introduce new methods or stages in the lifecycle of a function such as "open" and "close", I would recommend we follow what we already have for Pulsar IO Sources and Sinks so that we are consistent. 2. I am also not a particular fan of the

Re: [VOTE] Pulsar Release 2.8.0 Candidate 3

2021-06-14 Thread Jerry Peng
+1 binding Checked (MacOs) * Building from src * Start standalone service * Basic produce and consume * Pulsar Functions - Exclamation Function * Pulsar IO - org.apache.pulsar.tests.integration.io.GenericRecordSource Best, Jerry On Mon, Jun 14, 2021 at 9:23 AM Rui Fu wrote: > +1 (non-bi

Re: [VOTE] Pulsar Release 2.8.0 Candidate 2

2021-06-09 Thread Jerry Peng
Hi all, I noticed another issue with the current release candidate. The java-instance.jar that is used as the root classloader for Pulsar Functions still contains too many dependencies. It should only contain the following deps: 1. pulsar-io-core 2. pulsar-functions-api 3. pulsar-cli

Re: [DISCUSS] Apache Pulsar 2.8.0 Release

2021-06-05 Thread Jerry Peng
Are the following issues fixed? 1. Pulsar-client-admin-api changes to not have dependencies (in progress) 2. Reverting external cluster URL External cluster URL

Re: Standardize authentication and authorization terms

2021-06-03 Thread Jerry Peng
Hi Chris, This is a good idea! We are intermixing a lot of terms in the code which might cause confusion and bugs in the future. Please formalize what you are proposing in a PIP. Thank you! Best, Jerry On Tue, Jun 1, 2021 at 11:45 PM r...@apache.org wrote: > Hello Chris: > > This is a good

Re: [VOTE] Pulsar Release 2.6.4 Candidate 1

2021-06-02 Thread Jerry Peng
+1 binding Checked (on MacOS Catalina) : * Signatures * Bin distribution: - NOTICE, README, LICENSE - Start standalone - Validate Pub/Sub and Java Functions - Validate Stateful Functions * Src distribution: - NOTICE, README, LICENSE - Compile - Start standalo

Re: Connectors package registry

2021-05-25 Thread Jerry Peng
Hello Andrey, Thank you for bringing this up! This is definitely an important issue! All of the connector binaries are already hosted on Maven central thus I don't think hosting the binaries is an issue. Perhaps the key problem here is about discovery. My thoughts: 1. We should document clea

Re: Updates on Presto connector for PIP-62

2021-04-26 Thread Jerry Peng
Sijie, Sounds good! On Mon, Apr 26, 2021 at 11:48 AM Sijie Guo wrote: > Hi all, > > I want to share an update on the presto connector for PIP-62. > > We have talked to the Trino community about contributing the Presto/Trino > connector to the Trino project. The Trino community is happy to accep

[ANNOUNCE] New committer: Chris Kellog

2021-04-20 Thread Jerry Peng
Hi everyone, The Project Management Committee (PMC) for Apache Pulsar has invited Chris Kellog to become a committer and we are pleased to announce that he has accepted. Congratulations and welcome onboard Chris Kellog! Please join us to welcome Chris Kellog. Thanks

Re: [DISCUSS] [Proposal] PIP 84: Cluster-Wide and Function-Specific Producer Defaults

2021-04-05 Thread Jerry Peng
mins and users may have the maximum flexibility to customize their > >>> producer configs. It may also reduce the parameters we put into the > configs > >>> and pulsar-admin. Also, we have some interceptors with Pulsar Functions > >>> already, like the `RuntimeC

Re: [DISCUSS] [Proposal] PIP 84: Cluster-Wide and Function-Specific Producer Defaults

2021-04-01 Thread Jerry Peng
y it. (Is this description > any > > clearer? If not, it might be clearer if I diagram it.) > > > > Devin G. Bost > > > > > > On Thu, Apr 1, 2021 at 4:39 PM Jerry Peng > > wrote: > > > >> Hi Devin, > >> > >> I understand th

Re: [DISCUSS] [Proposal] PIP 84: Cluster-Wide and Function-Specific Producer Defaults

2021-04-01 Thread Jerry Peng
ion config however you want. Best, Jerry On Thu, Apr 1, 2021 at 2:08 PM Devin Bost wrote: > *Cluster-Wide and Function-Specific Producer Defaults* > > > > > * - Status: Proposal- Author: Devin Bost (with guidance from Jerry Peng)- > Pull Request: https://github.com/a

Re: Enhancements to Pulsar IO

2021-03-19 Thread Jerry Peng
what the actual schema the message was produced with. The schema of message is overridden when byte[] is specified as the schema for the consumer. This seems like a bug to me. Perhaps Sijie can also chime in on why the behavior is such. > On Fri, Mar 19, 2021 at 12:46 AM Enrico Olivelli wrote:

Re: Enhancements to Pulsar IO

2021-03-19 Thread Jerry Peng
ent it there and not try to push it down to the framework layer. Best, Jerry On Thu, Mar 18, 2021 at 1:38 AM Enrico Olivelli wrote: > Jerry > > Il giorno gio 18 mar 2021 alle ore 03:07 Jerry Peng > ha scritto: > > > > Hi Enrico, > > > > Thanks for taking the

Re: Enhancements to Pulsar IO

2021-03-17 Thread Jerry Peng
Hi Enrico, Thanks for taking the initiative for improvements on Pulsar IO! I have questions in regards to the following statements *Problem 1: Pulsar Sinks must declare at compile time the data type they support* > So I would like to see Sink and let the implementation deal with Record that wil

Re: [Discuss] Separate Function API from Pub/Sub API for python client

2021-02-22 Thread Jerry Peng
means the python client pulls in a lot of unnecessary > dependencies for people who only use pub/sub API. > > As discussed in the community meeting last week, I am starting an email > thread for discussing it. I would like to learn what @Sanjeev Kulkarni > and @Jerry Peng think about it. > > - Sijie >

Re: pulsar SQL is not able to read the latest msg

2020-08-13 Thread Jerry Peng
Hello, Please reference this issue: https://github.com/apache/pulsar/issues/6884 Best, Jerry On Thu, Aug 13, 2020 at 6:44 AM Moshe Baruch wrote: > Hi Pulsar Team, > > > > We use pulsar SQL and we found that the Pulsar SQL is lacking the last > message when querying pulsar topics. > > > > How

Re: [DISCUSS] PIP-65: Adapting Pulsar IO Sources to support Batch Sources

2020-05-20 Thread Jerry Peng
Hi Sijie, We have considered a two-stage function as a way to implement a "batch" source, however because there are two independent functions, it adds complexity to management especially when there are failures. The two functions will need to be submitted and registered in an atomic fashion which can

Re: [ANNOUNCE] Apache Pulsar 2.5.0 released

2020-01-20 Thread Jerry Peng
🎉🎉🎉 On Mon, Jan 20, 2020 at 10:33 AM Addison Higham wrote: > Congrats on the release and many thanks to all the hard work put in by > maintainers and contributors. Lots of great features and progress and > excited to see Pulsar continue to grow :) > > On Mon, Jan 20, 2020 at 9:01 AM Sijie G

Re: Question/issue wrt running flink window transformations with event time, with pulsar source and sink

2019-12-13 Thread Jerry Peng
A more full featured Pulsar Flink Connector can be found here: https://github.com/streamnative/pulsar-flink On Fri, Dec 13, 2019 at 1:43 PM Jerry Peng wrote: > > Hello Subbu, > > Responding to your comments in Line: > > > When I set the stream characteristic to event time,

Re: Question/issue wrt running flink window transformations with event time, with pulsar source and sink

2019-12-13 Thread Jerry Peng
Hello Subbu, Responding to your comments inline: > When I set the stream characteristic to event time, I do not observe any data > in the destination topic. Event time is currently not supported in this version of the source > I would like to understand the reason for this above behavior; is

Re: Pulsar Summit is coming 2020

2019-12-04 Thread Jerry Peng
Awesome looking forward to it! On Fri, Nov 29, 2019 at 6:39 AM Jinfeng Huang wrote: > > That's really great news! > If you'd like to contribute a talk, feel free to let us know. > > Best Regards, > Jennifer > > > On Fri, Nov 29, 2019 at 7:32 PM Jia Zhai wrote: >> >> con~ >> Looking forward to it

Integration tests that are skipped

2019-10-21 Thread Jerry Peng
Hello all, I recently noticed that we are not running quite a few of our integration tests. Some integration tests are skipped because they are not added to this TestNG test suite file: https://github.com/apache/pulsar/blob/master/tests/integration/src/test/resources/pulsar.xml Here is an incomp

Re: ApacheCon NA

2019-09-03 Thread Jerry Peng
Sounds good! On Tue, Sep 3, 2019 at 5:32 PM Sijie Guo wrote: > Hi, > > ApacheCon NA is coming next week. There are about 5 pulsar talks this year. > I guess quite a lot of Pulsar folks and users will be attending it. For the > people who is attending ApacheCon, maybe we should gather together fo

Re: PIP 38: Batch Receiving Messages

2019-07-17 Thread Jerry Peng
llector, still need to > consider maxNumMessages maxNumBytes > and timeout of the message collector. > 2. Another benefit as mentioned in the last part of the proposal, this can > allow lazy deserialization and object > creation in the future. > > Thanks for your replay > >

Re: PIP 38: Batch Receiving Messages

2019-07-15 Thread Jerry Peng
Hi Penghui, So what is the major benefit of using the proposed batch receive API versus just buffering messages in my application code? In terms of performance, consumers already receive messages as batches from the broker. Though the current API only allows the user to retrieve a message one at
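
For comparison, buffering messages in application code can be sketched roughly as follows. This is an illustration only, not the PIP 38 API: `collect_batch` and `receive_one` are hypothetical names, and `receive_one(remaining_s)` is assumed to return the next message or None if nothing arrives within `remaining_s` seconds.

```python
import time


def collect_batch(receive_one, max_num_messages, timeout_s):
    """Buffer messages until the batch is full or the overall timeout expires."""
    batch = []
    deadline = time.monotonic() + timeout_s
    while len(batch) < max_num_messages:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall timeout reached
        message = receive_one(remaining)
        if message is None:
            break  # nothing arrived within the remaining time
        batch.append(message)
    return batch
```

The application ends up owning the size and timeout bookkeeping itself, which is the work a built-in batch receive call would take over.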

[ANNOUNCE] Apache Pulsar 2.3.2 released

2019-05-31 Thread Jerry Peng
The Apache Pulsar team is proud to announce Apache Pulsar version 2.3.2. Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subsc

Re: [VOTE] Pulsar Release 2.3.2 Candidate 1

2019-05-30 Thread Jerry Peng
The vote is now closed for Pulsar 2.3.2 Release Candidate 1 with 6 +1s (4 binding) and no -1s Binding +1 Boyang Jerry Peng Matteo Merli Ivan Kelly Jia Zhai Non-binding +1 Yong Zhang Eren Avsarogullari No -1 Thanks to everyone for validating the release Best, Jerry On Wed, May 29, 2019 at

Re: [VOTE] Pulsar Release 2.3.2 Candidate 1

2019-05-21 Thread Jerry Peng
Sorry the github link to the 2.3.2 release candidate is not correct: https://github.com/apache/pulsar/releases/tag/v2.3.1-candidate-1 -> https://github.com/apache/pulsar/releases/tag/v2.3.2-candidate-1 On Tue, May 21, 2019 at 10:25 PM Jerry Peng wrote: > > Hi all, > > This is th

[VOTE] Pulsar Release 2.3.2 Candidate 1

2019-05-21 Thread Jerry Peng
Hi all, This is the first release candidate for Apache Pulsar, version 2.3.2. It fixes the following issues: https://github.com/apache/pulsar/milestone/23?closed=1 *** Please download, test and vote on this release. This vote will stay open for at least 72 hours *** Note that we are voting upon

Re: Pulsar 2.3.2 Release Discussion

2019-05-21 Thread Jerry Peng
Hello all, These are the additional commits going into 2.3.2 https://github.com/apache/pulsar/compare/v2.3.1...branch-2.3 Best, Jerry On Sun, May 19, 2019 at 11:10 PM Jerry Peng wrote: > Shivji, > > We can probably do that given that there aren't any major merge conf

Re: Pulsar 2.3.2 Release Discussion

2019-05-19 Thread Jerry Peng
> +91 8884075512 > > > > > > On Sat, May 18, 2019 at 5:07 PM Sijie Guo wrote: > > > > > +1 > > > > > > On Sat, May 18, 2019 at 2:43 AM Jerry Peng < > jerry.boyang.p...@gmail.com> > > > wrote: > > > > > > > Hel

Pulsar 2.3.2 Release Discussion

2019-05-17 Thread Jerry Peng
Hello all, I think it might be time for a 2.3.2 release. There are already quite a few PRs that have gone in for 2.3.2: https://github.com/apache/pulsar/milestone/23 What does everyone else think? Best, Jerry

Re: application for becoming the contributor

2019-05-14 Thread Jerry Peng
Hi Yang Lei, Thanks for your interest in contributing to Apache Pulsar! You don't need any special permissions to contribute and open pull request. Looking forward to seeing your contributions! Best, Jerry On Tue, May 14, 2019 at 6:25 PM yang lei wrote: > Hi, > I want to contribute to Apac

Re: Release 2.4.0 - Feature Freeze

2019-04-22 Thread Jerry Peng
hread. Let's focus on giving him the list of features > to be included for 2.4.0. So he can know what to wait for. > > - Sijie > > On Tue, Apr 23, 2019 at 11:34 AM Jerry Peng > wrote: > > > Yup the PMC usually need to vote on something like this and be in > agreeme

Re: Release 2.4.0 - Feature Freeze

2019-04-22 Thread Jerry Peng
Yup, the PMC usually needs to vote on something like this and be in agreement before such a freeze can take effect. -Jerry On Mon, Apr 22, 2019 at 8:24 PM Matteo Merli wrote: > Negative acks also would be a new feature of 2.4, not just its exposure in > interceptors. > > On Mon, Apr 22, 2019

Re: [VOTE] Pulsar Release 2.3.1 Candidate 2

2019-04-09 Thread Jerry Peng
+1 on release candidate Followed the validation guide: https://github.com/apache/pulsar/wiki/Release-Candidate-Validation Verified: 1. Pulsar standalone mode 2. basic publish and consume 3. Pulsar Functions - Pulsar function state 4. Pulsar Sinks and Sources 5. Pulsar SQL Thanks for making

Re: [VOTE] Pulsar Release 2.2.0 Candidate 2

2018-10-18 Thread Jerry Peng
+1 Environment: MacOS 10.13.6 Went through in full the guide for validating a release candidate Checked: * Signatures * Bin distribution: - NOTICE, README, LICENSE - Start standalone service and producer/consumer test - Pulsar Functions/Pulsar Functions worker - Pulsar Func