I have heard about NiFi-Fn (and the B23 Kubernetes Operator for NiFi-Fn).

Has anyone built a NiFi kubectl processor, and possibly a nice NiFi "remote 
jobs" base Docker container that can be used to control a remote NiFi 
processor/job that conforms to Apache NiFi's input and output mechanisms (the 
flowfile format)?

I know we would need a way to marshal the NiFi flowfile format in and out of a 
container, but if we had that, we could launch remote Python processes that 
scale well using cloud-native mechanisms (DevOps).
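
To make the marshalling idea concrete, here is a rough sketch of what the 
container side could look like. Note that the envelope below is a made-up 
stand-in (a length-prefixed JSON attribute header followed by the raw content), 
NOT NiFi's own flowfile packaging format, and the names pack_flowfile and 
unpack_flowfile are hypothetical.

```python
import io
import json
import struct


def pack_flowfile(attributes, content, out):
    """Write attributes and content to a binary stream.

    Hypothetical envelope (not NiFi's native format):
    4-byte header length, JSON attributes, 8-byte content length, content.
    """
    header = json.dumps(attributes, sort_keys=True).encode("utf-8")
    out.write(struct.pack(">I", len(header)))
    out.write(header)
    out.write(struct.pack(">Q", len(content)))
    out.write(content)


def _read_exact(stream, n):
    """Read exactly n bytes or fail loudly on a truncated stream."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("expected %d bytes, got %d" % (n, len(data)))
    return data


def unpack_flowfile(stream):
    """Read one envelope back and return (attributes, content)."""
    (header_len,) = struct.unpack(">I", _read_exact(stream, 4))
    attributes = json.loads(_read_exact(stream, header_len).decode("utf-8"))
    (content_len,) = struct.unpack(">Q", _read_exact(stream, 8))
    content = _read_exact(stream, content_len)
    return attributes, content


if __name__ == "__main__":
    # Round trip through an in-memory buffer; a container would use
    # sys.stdin.buffer and sys.stdout.buffer instead.
    buf = io.BytesIO()
    pack_flowfile({"filename": "parcels.geojson"}, b'{"type": "Point"}', buf)
    buf.seek(0)
    print(unpack_flowfile(buf))
```

A remote Python job would read one envelope from stdin, process the content, 
and write a new envelope to stdout, so the NiFi side only has to translate 
between its flowfiles and this stream.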

We built a native Python 2.7/3.7 NiFi processor that allows you to quickly 
chain together Java and Python flows. This is powerful because most data 
infrastructure is in Python, not Java, especially for geospatial data. Of 
course this won't scale, because only so many Python processors can run on a 
single NiFi node, but it lets you get things working quickly. In two days you 
can do some amazing things.

If I can now offload that Python processing to Kubernetes via kubectl, we can 
use automated DevOps scaling for some really large jobs. One option is a NiFi 
processor that wraps the Kubernetes Java client 
(https://github.com/kubernetes-client/java).
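
As a rough illustration of the offloading step, the sketch below builds a 
Kubernetes batch/v1 Job manifest as a plain Python dict and prints it as JSON. 
It does not use the Java client mentioned above; it only shows the shape of 
the Job a processor would have to submit. The function name 
build_job_manifest, the job name, the image name and the arguments are all 
hypothetical.

```python
import json


def build_job_manifest(name, image, args, parallelism=1):
    """Return a batch/v1 Job manifest that runs `image` in `parallelism` pods."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Run this many pods at once, and require the same number
            # of successful completions before the Job is done.
            "parallelism": parallelism,
            "completions": parallelism,
            # Retry failed pods a couple of times before giving up.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    # Job pods must not be restarted in place.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "worker", "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        name="geo-job",                                  # hypothetical name
        image="example.invalid/nifi-remote-python:dev",  # hypothetical image
        args=["python", "process_geojson.py"],           # hypothetical script
        parallelism=4,
    )
    print(json.dumps(manifest, indent=2))
```

Saved as, say, make_job.py, the output can be piped into 
"kubectl apply -f -", since kubectl accepts JSON manifests on stdin.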

Why all this jazz?
Real use case: geospatial data (GeoJSON, ESRI Shapefile, etc.). Processing it 
requires standard Python "pip install blah-blah" packages.

Thoughts? Please throw tomatoes at the idea. I welcome constructive and 
destructive criticism because that means people care.

Erik Anderson
Bloomberg
