Re: NiFi SplitAvro processor giving extra records

2018-03-12 Thread Mark Payne
Hi Rohit, The SplitAvro processor has 3 relationships: failure, split, and original. Did you by chance connect both the original and split relationships to the AvroToJson processor? If you did, then that would explain the behavior. You're routing both the individually split Avro records as well

Re: Issue with nifi startup performance

2018-03-17 Thread Mark Payne
Hello, I would recommend grabbing a thread dump while NiFi is starting up, in the part that takes the longest. Typically, the startup time is related more to restoring the repositories than it is to restoring the flow. Also, are you running on physical hardware or in a VM? How many cpus ar

Re: onPropertyModified doesn't seem to work

2018-03-19 Thread Mark Payne
Sivaprasanna, What package is your processor class in? By default, the logging for the org.apache.nifi.processors package is set to WARN. I am wondering if you are simply not seeing the logging because of your logback configuration. I would recommend you instead add a breakpoint and attach to you

Re: Master failing in travis

2018-03-23 Thread Mark Payne
Otto, Thanks. Looks like a threading bug in a unit test. Will address. Thanks -Mark > On Mar 23, 2018, at 2:11 PM, Otto Fowler wrote: > > Anyone have an idea why Master is failing in travis?

Re: [VOTE] Release Apache NiFi 1.6.0

2018-03-23 Thread Mark Payne
+1 (binding) Was able to verify hashes, build with contrib-check, and start up application. Performed some basic functionality tests and all worked as expected. Thanks! -Mark > On Mar 23, 2018, at 6:02 AM, Takanobu Asanuma wrote: > > Thanks for all your efforts, Joe. > > I have one questio

Re: published by PublishKafkaRecord_0_10 doesn't embed schema.

2018-03-26 Thread Mark Payne
Milan, What version of NiFi are you running? This sounds like NIFI-4639 [1], which was resolved in version 1.5.0. Thanks -Mark [1] https://issues.apache.org/jira/browse/NIFI-4639 On Mar 25, 2018, at 6:45 PM, Milan Das wrote: Hello Nifi Users, Apparently, it seem

Re: A bug in expression substitution

2018-03-28 Thread Mark Payne
Sergei, Thanks for reporting this! I have created a JIRA [1] to track this. -Mark [1] https://issues.apache.org/jira/browse/NIFI-5026 On Mar 28, 2018, at 7:36 AM, Sergei Zhirikov wrote: Hi, It looks like I have stumbled upon a bug in substitution of evaluated

Re: [VOTE] Release Apache NiFi 1.6.0 (RC2)

2018-03-28 Thread Mark Payne
+1 (binding) Was able to build successfully and start application. Created some simple flows to verify expected behavior. Verified hashes and signing. Thanks -Mark > On Mar 26, 2018, at 11:34 PM, Joe Witt wrote: > > Hello, > > I am pleased to be calling this vote for the source release of

Re: Error with getSolr

2018-04-15 Thread Mark Payne
Pierre-Louis, It looks like you're using GetSolr with the Return Type set to Record. But the Record Writer that you chose is configured with a "Schema Access Strategy" of "Inherit Schema." This Processor doesn't actually know the schema of the Records before-hand, so it needs to be provided. If

Re: Clustering not happening in Kubernetes cluster

2018-04-16 Thread Mark Payne
Jonathan, I've spent some time in the last few weeks playing around with getting some clusters running on Kubernetes as well. I know from the mailing lists that some others have been venturing into this also. I'm by no means a Kubernetes expert myself, but I've gotten it up & running on a few-no

Re: Is there a configuration to limit the size of nifi's flowfile repository

2018-04-26 Thread Mark Payne
Ben, There are three things that I've seen cause really massive FlowFile Repositories: 1) OutOfMemoryError occurs that causes NiFi to stop working properly. 2) The "nifi.flowfile.repository.checkpoint.interval" property is set really long (2 mins is the default). 3) By far, the most common, is

Re: NiFi code re-use

2018-05-13 Thread Mark Payne
So I think we have a lot of different concepts going on here. I’ll try to provide my thoughts on each one as I’ve spent a good bit of time thinking about each of them over the last year or two :) Wormhole connections: these would be very nice to have because it would allow us to avoid having

Re: Data flow rate exceeding provenance recording rate

2018-05-14 Thread Mark Payne
Phil, This is just a side effect of how the old provenance repository was designed. There is a new implementation that is far faster and seems to be more stable. However, in order to use it, you have to "opt in" simply because we wanted to make sure that it was stable enough to set it as the de

Re: Data flow rate exceeding provenance recording rate

2018-05-15 Thread Mark Payne
ystem seems to be performing better >> than I was used to, even before I started receiving those errors. >> >> Cheers, >> Phil >> >> On Tue, 15 May 2018 at 08:54, Mark Payne wrote: >> >>> Phil, >>> >>> This is just a side eff

Re: [VOTE] Change the name Apache NiFi Fluid Design System to Apache NiFi Flow Design System

2018-05-18 Thread Mark Payne
+1 (binding) Sent from my iPhone > On May 18, 2018, at 10:33 AM, Scott Aslan wrote: > > +1 (binding) > >> On Fri, May 18, 2018 at 10:18 AM, James Wing wrote: >> >> +1 (binding) >> >>> On May 18, 2018, at 6:30 AM, Rob Moran wrote: >>> >>> Following positive response discussing the name cha

Re: Accumulo processors

2018-05-26 Thread Mark Payne
d like. Thanks -Mark On May 26, 2018, at 12:05 PM, davidrsmith wrote: Hi A team at work has a need to interface with accumulo, has anyone tried this, I know a while ago Mark Payne raised nifi jira ticket 818 but as far as I am aware this w

Re: ConvertAvroToJSON: RE: NIFI-5093

2018-06-02 Thread Mark Payne
Matthew, Have you tried looking at the ConvertRecord processor? It is quite a bit newer and much preferred, IMO, over ConvertAvroToJSON. I believe it would give you the output that you’re expecting. Thanks -Mark Sent from my iPhone > On Jun 2, 2018, at 7:56 AM, Matthew Forrester > wrote: >

Re: URL configuration for the remote process group in Nifi 1.3

2018-06-03 Thread Mark Payne
Paresh, When NiFi establishes a connection to the remote instance, it will request information from the remote instance about all nodes in the cluster. It then persists this information in case nifi is restarted. So whichever node you use in your URL is only important for the initial connection

Re: KerberosProperties.validatePrincipalAndKeytab Error ?

2018-06-04 Thread Mark Payne
Jorge, These properties do support Expression Language. However, they do not support evaluating FlowFile Attributes, only values from the Variable Registry. So this is going to be invalid unless you define a variable in the variable registry for both the Principal and the Keytab. Unfortunately,

[DISCUSS] Change of Cluster Flow Inheritance

2018-06-07 Thread Mark Payne
Hi all, Over the past couple of months, I have been doing a lot of testing with large scale flows and talking to others who are using large scale flows in production. ("Large scale" flows in this case means several thousand to tens of thousands of Processors). While NiFi does a really good job

Re: [DISCUSS] Change of Cluster Flow Inheritance

2018-06-07 Thread Mark Payne
different strategies within that would be more maintainable as well. >>> >>> Are you proposing that there is a unified flow comparison >>> capability/implementation/service that is shared between >>> clustering and versioned use cases? >>> >>>

Re: [DISCUSS] Change of Cluster Flow Inheritance

2018-06-07 Thread Mark Payne
with different strategies within that would be more maintainable as well. Are you proposing that there is a unified flow comparison capability/implementation/service that is shared between clustering and versioned use cases? On June 7, 2018 at 09:46:27, Mark Payne (marka...@hotmail.com<mailto

Re: Should the elastic search client service impl get renamed before 1.7?

2018-06-10 Thread Mark Payne
I think it’s a good idea to go ahead and rename it. Even if 6.x doesn’t make breaking changes, 7.x may. And I’d rather not have an ElasticSearchClient that works for 5.x and 6.x and then an ElasticSearchClient_7 that deals with 7. It makes it look like the simply named ElasticSearchClient is gen

Re: NIFI regex expression on RouteOnAttribute

2018-06-13 Thread Mark Payne
Milan, Your regex, I think, is fine, but you're using the "matches" function, which requires that the entire value match the regex exactly. It appears that you're trying to check if the value contains the regex anywhere within it, so you'd want to use the "find" function instead: ${setabsolutep
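
The difference between the two functions looks much like the one between `Matcher.matches()` and `Matcher.find()` in `java.util.regex`. A minimal sketch in plain Java follows; the pattern and the sample value are made up for illustration and are not taken from the thread.

```java
import java.util.regex.Pattern;

final class MatchesVersusFind {
    // Hypothetical pattern, chosen only to illustrate the difference.
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // True only when the ENTIRE value matches the pattern.
    static boolean matchesWhole(String value) {
        return DIGITS.matcher(value).matches();
    }

    // True when the pattern occurs ANYWHERE inside the value.
    static boolean foundAnywhere(String value) {
        return DIGITS.matcher(value).find();
    }

    public static void main(String[] args) {
        String value = "/data/2018/file.csv"; // hypothetical attribute value
        System.out.println("matches: " + matchesWhole(value));
        System.out.println("find:    " + foundAnywhere(value));
    }
}
```

With a value that merely contains digits, only the `find`-style check succeeds; the `matches`-style check succeeds only when the whole value is digits.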

Re: [VOTE] Release Apache NiFi 1.7.0

2018-06-21 Thread Mark Payne
+1 (binding). Thanks for volunteering to handle the RM duties this time around, Andy! Was able to verify the checksums, build with contrib-check, verify signature. Started app up and performed some simple verifications that the app behaved as expected. Verified README is present and in good shape

Re: validation changes in 1.7.0

2018-07-02 Thread Mark Payne
Hey Mark, The validation logic was changed so that instead of performing validation on demand each time that the user refreshes stats, navigates to a process group, etc., it all is done asynchronously. We then periodically perform validation in the background. So the idea is that we changed whe

Re: validation changes in 1.7.0

2018-07-03 Thread Mark Payne
e stuff, I believe that can cause unit test failures even if it still works (albeit deprecated) on a live NiFi. Regards, Matt On Jul 2, 2018, at 4:52 PM, Mark Payne wrote: Hey Mark, The validation logic was changed so that instead of performing valida

Re: [VOTE] Release Apache NiFi 1.7.1

2018-07-13 Thread Mark Payne
+1 (binding). Verified checksums and hashes. Verified full build with contrib-check. Verified that the app starts and provides the basic functionalities. Verified that wildcard certificates now work in clustered mode. Verified that the recursively referenced Controller Services now behave as expe

Re: Custom Listener is very slow

2018-07-30 Thread Mark Payne
Hi Bobby, Your understanding is correct that @OnScheduled will be called when the user clicks Start (or upon restart of nifi if it was running when shutdown). The onTrigger method will be called continually. It's hard to say why it performs poorly with just a quick description of the Processor.

Re: Status history graphs over 24hrs (v1.3.0)

2018-08-03 Thread Mark Payne
Hey Phil, There are two properties that control this: nifi.components.status.repository.buffer.size=1440 nifi.components.status.snapshot.frequency=1 min The snapshot.frequency tells NiFi how often to take a 'snapshot' of the stats on your canvas and add to the 'Status History'. The buffer.size
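
The excerpt is cut off before it explains buffer.size. Assuming that it is the number of snapshots kept in memory, the retained window would be the product of the two settings. A rough sketch of that arithmetic in plain Java (the class and method names are hypothetical, not part of NiFi):

```java
import java.time.Duration;

final class StatusHistoryWindow {
    // Assumption: retained history = buffered snapshot count x time between snapshots.
    static Duration retained(int bufferSize, Duration snapshotFrequency) {
        return snapshotFrequency.multipliedBy(bufferSize);
    }

    public static void main(String[] args) {
        // Values quoted in the message: 1440 snapshots, one per minute.
        Duration window = retained(1440, Duration.ofMinutes(1));
        System.out.println(window.toHours() + " hours");
    }
}
```

Under that assumption the quoted values cover 24 hours, so keeping more history would mean raising the buffer size, lengthening the snapshot interval, or both.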

Re: PutSql Loading is slow with Support Fragmented Transactions as True...

2018-08-17 Thread Mark Payne
Hello, That property will basically determine whether pushing the data is transactional or not. For example, if you want to ensure that all data from your CSV is pushed or none of it is, then you'll want that to be true. If you're okay with pushing the data in a bit at a time, then you can set

Re: Reading all flowfiles queued for a processor (>20000 flowfiles)

2018-08-22 Thread Mark Payne
Hi Sam, There are a couple of ways to tackle this problem. My recommendation would be to look at extending the BinFiles processor. This is an abstract class, which MergeContent extends (and I think 1 or 2 other processors?). Its job is to bin 'like flowfiles' together, and it can take care of pu

Re: Error handling in @OnScheduled

2018-08-23 Thread Mark Payne
James, If you are expecting the method to throw an Exception and want to verify that, you should just call the method directly from your unit test and catch the Exception there. The TestRunner expects to run the full lifecycle of the Processor. Thanks -Mark > On Aug 23, 2018, at 10:49 AM, Jam
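
The advice above, calling the method directly and catching the exception in the test, might look roughly like the sketch below. It uses a hypothetical stand-in class rather than a real NiFi processor or the TestRunner, so every name in it is an assumption made for illustration.

```java
final class DirectInvocationSketch {
    // Hypothetical stand-in for a processor; NOT a NiFi class.
    static final class HypotheticalProcessor {
        // Plays the role of a method that would carry @OnScheduled.
        void onScheduled(String endpoint) {
            if (endpoint == null || endpoint.isEmpty()) {
                throw new IllegalStateException("endpoint must be set");
            }
        }
    }

    // Calls the method directly and reports whether the expected exception was thrown.
    static boolean rejects(String endpoint) {
        try {
            new HypotheticalProcessor().onScheduled(endpoint);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejects empty endpoint: " + rejects(""));
    }
}
```

The point of the design is that the exception surfaces in the test itself instead of being handled somewhere inside a framework that runs the full lifecycle.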

Re: Reading all flowfiles queued for a processor (>20000 flowfiles)

2018-08-23 Thread Mark Payne
nt works; by having a session for each Bin. > So instead of having N bins/sessions, I've just got +1 Session that holds > onto the FlowFiles that I've keyed/seen? > > Thanks Mark! > > Cheers, > Sam > > On 2018/08/22 13:54:59, Mark Payne wrote: >> Hi S

Re: Error handling in @OnScheduled

2018-08-24 Thread Mark Payne
en using the TestRunner framework, or should I just replicate the test setup (e.g. create my own MockProcessContext etc.) Thanks, James On Thu, 23 Aug 2018 at 21:03, Mike Thomsen wrote: James try it with a throwable like in my example On Thu, Aug 23, 2018 at 10:5

Re: Error handling in @OnScheduled

2018-08-24 Thread Mark Payne
right - does that makes sense? > > James > On Fri, 24 Aug 2018 at 16:01, Mark Payne wrote: >> >> James, >> >> You can certainly catch Throwable there, or AssertionError, more >> specifically, but I'd be very wary >> of doing that, because at tha

Re: Ideal hardware for NiFi

2018-09-11 Thread Mark Payne
Phil, As Sivaprasanna mentioned, your bottleneck will certainly depend on your flow. There's nothing inherent about NiFi or the JVM, AFAIK, that would limit you. I've seen NiFi run on VMs containing 4-8 cores, and I've seen it run on bare metal on servers containing 96+ cores. Most often, I see pe

Re: Ideal hardware for NiFi

2018-09-11 Thread Mark Payne
certainly > shoot for NVMe disks in the build. How does NiFi get configured to span > its repositories across multiple physical disks? > > Thanks, > Phil > > On Wed, 12 Sep 2018 at 01:32, Mark Payne wrote: > >> Phil, >> >> As Sivaprasanna mentioned

Re: Status of "event-driven" scheduling

2018-09-13 Thread Mark Payne
Joe, Mike is right in that it was intended to be a more efficient scheduling strategy. With Timer-Driven, the processors used to constantly be checking if they had work to do and if not would switch contexts and check again. And again. This was pretty expensive, so we added the Event-Driven str

Re: Ideal hardware for NiFi

2018-09-14 Thread Mark Payne
t start using them. if you want to no >>> longer >>>>> use the current dir it might be more involved. >>>>> >>>>> does that help? >>>>> >>>>> thanks >>>>> >>>>> On Thu, Sep 13, 2018,

Re: Improve S2S to ease "star deployments"

2018-09-15 Thread Mark Payne
Hey Pierre, I'm not sure that this is the best route to go down. There are a couple of problems that I think you will run into. The most important will be what happens when the data going to that Output Port queues up into a large queue? If a NiFi instance then requests data, I presume that the

Re: [DISCUSS] Stale PRs

2018-09-15 Thread Mark Payne
I'm 100% on-board here. I brought up this same topic a couple of months ago, but the thread kind of digressed (as these things tend to do on large mailing lists). I am in favor of a 30 day period with a reminder that gives the contributor an extra week before closing the PR. If the contributor is

Re: How to optimise use of MergeContent for large number of bins.

2018-09-21 Thread Mark Payne
Ashwin, You should have no problem here in terms of MergeContent scaling. But there are a few things that you'll want to consider: 1. If you have a cluster of these, you're going to be merging FlowFiles per node. So you'll need to ensure that you have enough data on each node to reach your max o

Re: [VOTE] Release Apache NiFi Registry 0.3.0

2018-09-24 Thread Mark Payne
+1 (binding) Validated hashes, build with contrib-check. Started and ensured that registry is able to store and retrieve flow with load balancing information. Thanks for handling the RM duties this time around, Kevin! -Mark > On Sep 22, 2018, at 9:54 AM, Kevin Doran wrote: > > Hello, > > I

Re: Unable to operate nifi ui due to "java.net.SocketTimeoutException: Read timed out"

2018-09-25 Thread Mark Payne
Hi Ashwin, The embedded ZooKeeper is provided as a convenience so that you can easily test running things in a cluster on a laptop, etc. However, it struggles when your nodes are handling any kind of significant data rate. It is always recommended that an external ZooKeeper be used for any sort

Re: Slow flowfile transfer from process group port to output port.

2018-09-25 Thread Mark Payne
Ashwin, You'll want to Right-Click on the RPG and then choose to configure Remote Ports. From there, you can configure how many threads should be used to pull data from each port. So you've updated the Output Port to use up to 3 threads per node to provide the data, but each node is still only

Re: Any down sides to putting a controller service in the same package as a processor?

2018-09-25 Thread Mark Payne
Mike, The Mongo Controller Service already exists though, right? You cannot move it into the Processors NAR after it's already been released. That would change the 'bundle coordinates' for the controller service and would mean that any flow that uses it is no longer valid and would have to be r

Re: [DISCUSS] Closing in on a release of NiFi 1.8.0?

2018-10-16 Thread Mark Payne
Jeff / all, I ran into an issue with NIFI-375 and re-opened the ticket. If a processor is stopped or started in a cluster, the stats that come back in the response are incorrect because the response is not being properly merged from all nodes in the cluster. I also have run into a couple of oth

Re: [VOTE] Release Apache NiFi 1.8.0 (RC2)

2018-10-22 Thread Mark Payne
-1 (binding) Unfortunately I ran into a bug that I think warrants sinking this vote. I created a JIRA [1] for it. It looks like a bug was introduced that can result in a processor/port/funnel deadlocking when attempting to pull from a Connection or put to a Connection. Thanks -Mark [1] https:/

Re: New Standard Pattern - Put Exception that caused failure in an attribute

2018-10-25 Thread Mark Payne
I agree - the notion of adding a "failure.reason" attribute is, in my opinion, an anti-pattern that should be avoided. Relationships are not a workaround but rather the preferred approach in this scenario - an attribute I would consider a workaround. This is due to the fact that not only is it b

Re: NiFi 1.8.0 LoadBalance Strategy Issue for Connection between Funnel and FetchSFTP

2018-11-08 Thread Mark Payne
Hi Josef, The prioritizers provide a weak ordering to the data, not an absolute sorting. What I mean by that is that if you are prioritizing a FlowFile with attribute A = 123 over a FlowFile with attribute A = 125, then the first one will likely go first but it's not guaranteed. For example, whe

Re: [DISCUSS] Extension Registry

2018-11-16 Thread Mark Payne
Mike, Thanks for bringing that up. It's a good point that we need to keep in mind. As mentioned, the NAR Maven Plugin will be generating the info necessary and putting it into the NAR. One piece of information that it will gather is any Controller Service API's that are provided by a given impl

Re: Processor forwards/backwards compatibility

2018-11-16 Thread Mark Payne
Hi James, The processor may or may not work with older versions of NiFi (1.x.0). But should work with newer versions of NiFi (1.z.0). That's because there may be a feature of the nifi-api that you use in 1.y.0 and we can guarantee that will not be removed in 1.z.0 but it did not exist in 1.x.0.

Re: Using map cache clients to detect already processed files

2018-12-15 Thread Mark Payne
Mike, There is a DetectDuplicate processor. It gives you the ability to provide an attribute to use for identification (for example, using a SHA256 hash or looking at an identifier in the data or a filename, etc). It uses a DistributedMapCacheClient to track this so it could be backed by Redis

Re: NiFi provenance events written to external systems?

2018-12-19 Thread Mark Payne
Hi Erik, The Site To Site Provenance Reporting Task is the recommended approach. There's a post on how to get started with that [1]. I don't know of any other 3rd party additions to provide anything beyond this. Is there an issue that you're running into with this approach? Thanks -Mark [1] h

Re: ISPEnrichIP test data?

2018-12-19 Thread Mark Payne
Mike, MaxMind provides 2 datasets. The GeoIP2 database is the commercially licensed dataset. There also is a GeoLite2 database [1], which is free and available under the Creative Commons Attribution-ShareAlike 4.0 International License. It is not as accurate as the GeoIP2 database but works wel

Re: ISPEnrichIP test data?

2018-12-19 Thread Mark Payne
opening the database. I was using the City one, which I already had from > working on the record-aware version of GeoEnrichIP. What database should I > be using for testing out ISPEnrichIP? > > Thanks, > > Mike > > On Wed, Dec 19, 2018 at 10:57 AM Mark Payne wrote: >

Re: Checking for existence of mandatory attributes

2018-12-20 Thread Mark Payne
Hey Mike, I would recommend RouteOnAttribute for that. Thanks -Mark Sent from my iPhone > On Dec 20, 2018, at 11:45 AM, Mike Thomsen wrote: > > Do we have any processors that would be particularly good at checking a > flowfile for the existence of certain mandatory attributes and then routin

Re: Checking for existence of mandatory attributes

2018-12-20 Thread Mark Payne
do a bunch of 'and must >> exist' statements." >> >> Thanks, >> >> Mike >> >> On Thu, Dec 20, 2018 at 12:01 PM Mark Payne wrote: >> >>> Hey Mike, >>> >>> I would recommend RouteOnAttribute for that. >>>

Re: Proposing NiFi-Fn

2019-01-04 Thread Mark Payne
Sam, I love this idea, and I am all for it. I can definitely see how this could be useful both within the context of NiFi itself and outside of NiFi as well. There has been quite a bit of talk of late, in both e-mail and the Slack channel about users needing more ability to perform integration t

Re: [VOTE] Release Apache NiFi 1.9.0 (rc2)

2019-02-19 Thread Mark Payne
+1 (binding) Verified signature Upgraded cluster that ran 1.9.0-RC1. Started up without issue and has been running for about 48 hours without a problem. All issues raised appear to have been addressed in RC2. Flows consist of Site-to-Site, Load-Balanced Connections, Record-oriented processors, h

Re: [VOTE] Release Apache NiFi 1.9.1 (rc1)

2019-03-13 Thread Mark Payne
+1 (binding) Verified signature, hashes. Build succeeded with contrib-check and grpc profiles. Started app and performed some basic functionality testing and all seemed correct. Thanks -Mark > On Mar 13, 2019, at 1:49 AM, Joe Witt wrote: > > Hello, > > I am pleased to be calling this vote f

Re: How to convert unix timestamp to datetime in Apache NiFi

2019-04-17 Thread Mark Payne
Puspak, For that timestamp given, 01/01/1970 05:30:00.000 is the correct time. I suspect, though, that what you have there is not Unix milliseconds but rather Unix seconds. Try multiplying that by 1000: ${actualarrivaltime:multiply(1000):format("MM/dd/yyyy HH:mm:ss.SSS")} If we use this simple
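
To see why the unit matters, here is a small plain-Java sketch that formats the same number twice, once as milliseconds and once as seconds. The sample value is hypothetical (it is not the one from the thread), and UTC is used to keep the output independent of the local time zone.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class EpochUnits {
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss.SSS").withZone(ZoneOffset.UTC);

    // Interprets the raw number as milliseconds since the epoch.
    static String asMillis(long raw) {
        return FORMAT.format(Instant.ofEpochMilli(raw));
    }

    // Interprets the raw number as seconds: multiply by 1000 before formatting.
    static String asSeconds(long raw) {
        return FORMAT.format(Instant.ofEpochMilli(raw * 1000L));
    }

    public static void main(String[] args) {
        long raw = 1555500000L; // hypothetical value, not taken from the thread
        System.out.println(asMillis(raw));  // lands in January 1970
        System.out.println(asSeconds(raw)); // lands in April 2019
    }
}
```

A seconds value read as milliseconds always ends up within a few weeks of the epoch, which is the symptom described in the question.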

Re: NiFi Apache CSV Question

2019-05-30 Thread Mark Payne
Hello, This definitely sounds like the type of thing that NiFi should be pretty good at. Can you provide more details? What kind of problems are you running into when changing the date format? Can you provide sample input and desired output? Thanks -Mark On May 29, 2019, at 12:11 PM, Maiuri Ran

Re: Any helper functions for checking RecordFieldType compatibility?

2019-06-04 Thread Mark Payne
Mike, You may want to look at DataTypeUtils.mergeDataTypes( final DataType thisDataType, final DataType otherDataType ) I don't believe this is exactly what you are looking for, but will likely give you a good starting point, if you were to implement such a helper method. Thanks -Mark On Jun

Re: Ingestion for hive to mongodb

2019-06-11 Thread Mark Payne
Santosh, The flow that you've outlined there seems reasonable, but it is certainly better if you don't have to split the data up, both in terms of performance as well as in terms of making the flow easier to design. I would imagine that PutMongoRecord missing the Upsert mode is simply an oversi

Re: Thoughts on an internal "Terminate" handler for special processors

2019-06-19 Thread Mark Payne
Peter, Without thinking through this too much, I am not opposed to the idea, but there are certainly a lot of things that would have to be carefully thought through around the lifecycle: - When a processor is terminated, its threads are interrupted. They may or may not ignore the interrupt. Wh

Re: getStatusHistory question

2019-06-26 Thread Mark Payne
Mark, I don't think that it was intentional. The interface was created with the idea that we will have a persistent implementation, as well, and that will hold a lot more data, so the start/end will be very important. For the Volatile case it was probably just overlooked and never noticed becaus

Re: Unable to modify flow when one of the nodes in a cluster is disconnected

2019-06-27 Thread Mark Payne
Purushotham, If the node is disconnected and then attempts to reconnect, flow election does not occur. Rather, the node obtains a copy of the flow from the cluster, determines whether or not it matches, and if so rejoins. If the flow does not match, it disconnects and stops trying to reconnect.

Re: Unable to modify flow when one of the nodes in a cluster is disconnected

2019-07-01 Thread Mark Payne
il, could you please throw some light on which release this is supported from? Regards, Purushotham Pushpavanth On Thu, 27 Jun 2019 at 19:34, Mark Payne wrote: Purushotham, If the node is disconnected and then attempts to reconnect, flow election does not o

Re: Record path API chooses CHOICE[STRING, RECORD] when the field is missing

2019-07-29 Thread Mark Payne
Mike, What Record Reader is being used here? The problem appears to be due to the Record Reader itself assigning that as the field type. I created a dummy unit test to verify the RecordPath stuff is correct: @Test public void testFromEmail() { final List fields = new ArrayList<>(); fie

Re: Implementation of FilesystemComponentStatusRepository

2019-08-04 Thread Mark Payne
Purushotham, I'm not aware of anyone working on a file-based implementation, but it would certainly be a welcome feature if anyone does take the initiative to develop it. In addition to keeping stats for much longer, it also means that we could avoid holding all of these stats in memory, as the

Re: StateManager race condition potential

2019-08-09 Thread Mark Payne
Russell, The StateManager provides a "setState" method and a "replaceState" method. The former will update the state to whatever you pass it. The latter allows you to pass in the expected state, so that you can atomically replace the value, similar to how ConcurrentMap works. > On Aug 9, 2019,
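
Since the message compares the second method to ConcurrentMap, the sketch below shows that analogy with `java.util.concurrent` only. It does not call the StateManager API; the helper names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class CompareAndReplace {
    // Unconditional write: overwrites whatever value is currently stored.
    static void set(ConcurrentMap<String, String> state, String key, String value) {
        state.put(key, value);
    }

    // Conditional write: succeeds only if the stored value still equals `expected`.
    static boolean replaceIfUnchanged(ConcurrentMap<String, String> state,
                                      String key, String expected, String updated) {
        return state.replace(key, expected, updated);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> state = new ConcurrentHashMap<>();
        set(state, "offset", "10");
        // A second writer changes the value between our read and our write.
        set(state, "offset", "11");
        // Our update was computed from the stale value "10", so it is refused.
        System.out.println(replaceIfUnchanged(state, "offset", "10", "20"));
        // Retrying with the current value succeeds.
        System.out.println(replaceIfUnchanged(state, "offset", "11", "20"));
    }
}
```

A caller that gets `false` back would typically re-read the current value, recompute, and try again, which is how a race between two writers is avoided.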

Re: [EXT] Re: OnPrimaryNodeStateChange vs Primary Only configuration

2019-08-16 Thread Mark Payne
It seems reasonable to me to add a `ExecutionNode getExecutionNode()` method to ProcessContext. This enum already exists in nifi-api, but I don't believe that it's exposed anywhere to the Processor itself. > On Aug 16, 2019, at 11:32 AM, Peter Wicks (pwicks) wrote: > > Bryan, > > I'm familiar

Re: Azure Event Hub Processors Upgrade Problem

2019-08-20 Thread Mark Payne
Hey Sunny, Very happy to have your help! Thanks! Sorry it wasn't more obvious how to connect remotely. What you'll want to do is to update the "bootstrap.conf" file in the conf/ directory. On or around line 39, you'll see the following line: #java.arg.debug=-agentlib:jdwp=transport=dt_socket,se

Re: Azure Event Hub Processors Upgrade Problem

2019-08-22 Thread Mark Payne
Hi Sunny, I replied to the first message, but I suppose you didn't receive it :) I have pasted the response below. When a user replies to the mailing list, it will go directly to dev@nifi.apache.org, rather than including your email address, so it's usually best to s

Re: [EXT] Re: [VOTE] Create NiFi Standard Libraries sub-project

2019-09-04 Thread Mark Payne
I'm a +1 as well. Thanks -Mark > On Sep 4, 2019, at 6:02 AM, Pierre Villard > wrote: > > +1 (binding) > > As a minor comment - but definitely not the place to discuss it - I wonder > if having a dedicated JIRA for that is really required. We could have the > same approach as we do with nifi-

Re: Executing a function over sub-elements of an array with RecordPath

2019-09-19 Thread Mark Payne
Hey Mike, You *SHOULD* be able to do this with UpdateRecord by adding a property with the name /addresses[*]/full and then a value of `concat(../street, ' ', ../city, ' ', ../state)` and that should generate an output like: "addresses": [ { "street": "12345 Main St", "city": "Springfield": "s

Re: high task count

2019-10-16 Thread Mark Payne
Mark, I think you’ll see this behavior when the flowfile is penalized. It’s something that we should really clean up at some point, but it’s mostly harmless so it’s just something that we’ve not gotten around to addressing. Thanks -Mark Sent from my iPhone > On Oct 16, 2019, at 4:27 PM, Mar

Re: [VOTE] Release Apache NiFi 1.10.0 (RC1)

2019-10-24 Thread Mark Payne
-1 (binding) Was able to verify commit hashes. Build completed but did have some problems with the unit tests. I had failures in the KnoxServiceTest (error output below [4]). I've never seen these errors before, but this is the first time in quite a while that I deleted my entire .m2 repository

Re: [VOTE] Release Apache NiFi 1.10.0 (rc3)

2019-10-30 Thread Mark Payne
+1 (binding) Was able to build and verify that all of the Jiras that I raised last time around have been addressed in this RC. Left a cluster of 10 nodes running for a day or two, pretty heavily taxed, and ran into no issues. Thanks -Mark > On Oct 30, 2019, at 3:20 PM, Matt Gilman wrote: >

Re: Forcing cluster traffic to a single node

2019-12-05 Thread Mark Payne
Phil, You can configure the connection between GetTCP and the next processor to have a Load Balancing Strategy of “Single Node.” Then run the next processor on all nodes, not primary node, because it may not be the primary node that receives the data. Thanks -Mark Sent from my iPhone > On D

Re: [VOTE] Release Apache NiFi 1.11.1 (rc1)

2020-02-03 Thread Mark Payne
+1 (binding) Verified signatures Built and ran a dummy flow to ensure basic capabilities Verified that NIFI-7076 and NIFI-7059 were both addressed. Thanks for handling the RC Joe! -Mark > On Jan 31, 2020, at 3:11 PM, Joe Witt wrote: > > Hello, > > I am pleased to be calling this vote for the

Re: Potential 1.11.X showstopper

2020-02-06 Thread Mark Payne
As Joe mentioned earlier in the thread, the way to track down a "too many open files" problem is to run "lsof -p <pid>" That will show all open files, and it often makes it pretty obvious what it is that's holding the file handles open. > On Feb 6, 2020, at 2:12 PM, Mike Thomsen wrote: > > Can you

Re: Response time from servername was slow for each of the last 3 requests made.

2020-02-10 Thread Mark Payne
Guy, When you make a request to the NiFi UI, that request must be replicated to all nodes in the cluster. If a particular node in the cluster is overloaded, it can start to take a while to respond to requests. That, in turn, results in the entire UI feeling sluggish. So the idea behind that war

Re: How to preclude user-defined properties...

2020-02-25 Thread Mark Payne
The UI always allows users to enter user-defined properties. It's certainly something that could be improved, I believe. Thanks -Mark > On Feb 25, 2020, at 3:18 PM, Russell Bateman wrote: > > ...in a custom processor. > > I have a custom processor (that I wrote) and, in on-canvas configuratio

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Mark Payne
Russ, As far as I can tell, this is working exactly as expected. To verify, I created a simple Integration test, as well, which I attached below. Let me outline what I *think* you’re trying to do here and please correct me if I’m wrong: 1. Read the content of the FlowFile. (Via session.read) 2

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Mark Payne
ll the rest), then > stitch the two back together in a later processor. I see having to coordinate > the two halves of what used to be one file fraught with precarity and > confusion, but I guess that's the solution I'm left with? > > Thanks, > Russ > > >

Re: Reading the incoming flowfile "twice"

2020-03-31 Thread Mark Payne
ond half." Then I'm done. I was thinking of the input >>>> stream as from the in-coming flowfile and a separate thing from the output >>>> stream which I see as being offered to me for my use in creating a new >>>> flowfile to transfer to. I guess this

Re: ...not the most recent version of this FlowFile within this session...

2024-05-07 Thread Mark Payne
Russell, May not be too bad to fix :) Any time you use a method on ProcessSession to modify a FlowFile in some way (such as putAttribute, putAllAttributes, write, removeAttribute, removeAllAttributes, etc.) the ProcessSession returns to you a new FlowFile. So you *should* write code such as:
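The real ProcessSession API is Java, and the snippet above is cut off before its example. The following is only a toy Python model of the pattern described, assuming nothing about NiFi's internals: FlowFile and ToySession are hypothetical stand-ins, and the error text simply echoes the thread subject.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class FlowFile:
    """Hypothetical immutable stand-in for a FlowFile handle."""
    uuid: str
    version: int = 0
    attributes: dict = field(default_factory=dict)


class ToySession:
    """Hypothetical session that tracks the latest handle per FlowFile."""

    def __init__(self):
        self._latest = {}

    def create(self, uuid):
        flowfile = FlowFile(uuid)
        self._latest[uuid] = flowfile
        return flowfile

    def put_attribute(self, flowfile, key, value):
        # Reject handles that an earlier modification has already superseded.
        if self._latest[flowfile.uuid] is not flowfile:
            raise ValueError(
                "not the most recent version of this FlowFile within this session")
        updated = replace(
            flowfile,
            version=flowfile.version + 1,
            attributes={**flowfile.attributes, key: value},
        )
        self._latest[flowfile.uuid] = updated
        return updated


session = ToySession()
flowfile = session.create("abc")
# Always reassign: each call returns a new handle and retires the old one.
flowfile = session.put_attribute(flowfile, "a", "1")
flowfile = session.put_attribute(flowfile, "b", "2")
```

In this model, calling put_attribute without keeping the return value leaves the caller holding a stale handle, and the next call with it raises the error.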

Re: ...not the most recent version of this FlowFile within this session...

2024-05-08 Thread Mark Payne
sions made me a spoiled, entitled child and I will >> repent immediately. >> >> Thanks, guys! DevOps are happy they don't have to upgrade the customers >> to NiFi 1.13.2. (In a way, I'm unhappy about that, but...). >> >> Best regards, >> >&

Re: ...not the most recent version of this FlowFile within this session...

2024-05-23 Thread Mark Payne
le times for each iteration of the loop? > > On Wed, May 8, 2024 at 5:27 PM Michael Moser wrote: > >> Oh yeah, I do love this behavior of ProcessSession. And thanks for the >> tip, it's easy to forget that there are efficiencies to be gained by using >> different p

Re: Issue with NiFiApi in Python

2024-07-09 Thread Mark Payne
Cyrine, The “nifi-api” dependency is not the official dependency, unfortunately. That is an unrelated project that is not associated with Apache NiFi. We have not yet published the api to PyPI because there hasn't been an official release of 2.0 yet - we’re still on Milestone releases, and unfo

Re: Flow comparison

2024-07-23 Thread Mark Payne
Matt, Yes, this is normal. On startup, the flow is empty. NiFi then compares the empty flow to the flow.json.gz, and these are the differences it found. It is a bit confusing, admittedly, but it is functioning as intended. Thanks -Mark > On Jul 23, 2024, at 2:56 PM, Matt Burgess wrote: > > I

Re: Handling of the contents of the failure result in the python FlowFileTransform API

2024-07-31 Thread Mark Payne
Hi Ferenc, I’m not overly familiar with MiNiFi C++ but this is how it works in traditional NiFi as well. This is intended. While Processors are generally free to treat relationships as they wish, the convention is always to route the original incoming FlowFile to ‘failure’, never a modified ver

Re: Handling of the contents of the failure result in the python FlowFileTransform API

2024-07-31 Thread Mark Payne
HON.md#using-nifi-python-processors > > On Wed, 31 Jul 2024 at 18:24, Mark Payne wrote: >> >> Hi Ferenc, >> >> I’m not overly familiar with MiNiFi C++ but this is how it works in >> traditional NiFi as well. >> This is intended. While Processors a

Re: Handling of the contents of the failure result in the python FlowFileTransform API

2024-08-05 Thread Mark Payne
not possible or feasible, an error log works, too. But we > should make it as loud as feasible, so the user notices the programming error > early. > > Thanks, > Marton > > On 7/31/24 19:52, Mark Payne wrote: >> Agreed, it should absolutely be documented. I don’t have

Re: Python API change to support multiple or no output flow files

2024-08-06 Thread Mark Payne
Gabor, I do agree that it’s reasonable to allow the source processor to return no output. In fact, that’s probably what it should return the vast majority of the time. As for 1 FlowFile in, many out, I think we should hold off. This will be an important strategy to support, for sure. It was in

Re: [DISCUSS] Introducing NiFi Improvement Proposals

2024-08-20 Thread Mark Payne
David, +1 to all of this. Especially for guaranteeing compatibility and maintainability, I think instrumenting a more formal approach for updating the API is a step in the right direction. The huge amount of purging, cleanup, and refactoring that has gone into 2.0 helps to highlight the importan

Re: [ANNOUNCE] New Apache NiFi PMC Member - Andre Fucs de Miranda

2016-11-04 Thread Mark Payne
Welcome, Andre! -Mark > On Nov 3, 2016, at 10:37 PM, Joe Witt wrote: > > Team, > > On behalf of the Apache NiFi PMC, I am pleased to announce that Andre > Fucs de Miranda has accepted the PMC's invitation to join the Apache > NiFi PMC. Andre's excellent contributions to the project reflect th
