Hi Marc,

Thanks for the additional info.  Just so you know you’re not the only
one, I’ve also had to re-implement a ListenTCP alternative to get
around the byte delimiter issue for binary and multiline text data.

Phil


On Tue, Aug 3, 2021 at 6:59 AM Marc <[email protected]> wrote:
>
> Hi Adam,
>
> more or less it is a 'merge', PutTCP, ListenTCP and unpack. I hope that I am
> not wrong, but the NiFi ListenTCP processor uses a delimiter (\n as default?).
> If you are transferring binary data, the processor splits the flow into
> 'pieces'. And the attributes are not transferred to the destination.
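
To illustrate the splitting problem described above, here is a minimal Python
sketch. It does not use NiFi at all; it only assumes a reader that cuts the
incoming byte stream at every newline byte. The function name and the sample
bytes are made up for illustration.

```python
from typing import List

DELIMITER = b"\n"  # assumption: the listener cuts messages at newline bytes


def split_on_delimiter(stream: bytes, delimiter: bytes = DELIMITER) -> List[bytes]:
    """Cut a received byte stream into messages at each delimiter byte."""
    return [piece for piece in stream.split(delimiter) if piece]


# Binary data that happens to contain the byte 0x0A (newline) twice.
binary_payload = bytes([0x89, 0x50, 0x0A, 0x1A, 0x0A, 0x00, 0xFF])

# The single payload arrives as three separate pieces, and the
# delimiter bytes themselves are gone, so it cannot be rebuilt reliably.
pieces = split_on_delimiter(binary_payload)
```

Any payload that may contain the delimiter byte needs a different way to mark
where a message ends, for example a length header.
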
>
> But your idea describes what the processor is doing.
>
> 1. It converts the attributes to a JSON string
> 2. It transfers the JSON string and the payload (there is a header that tells
> the destination how long the JSON header and the payload are)
> 3. The listener gets the flow and decodes the header (to get the size of the
> JSON header and the payload)
> 4. It writes the payload to a flow
> 5. It parses the JSON string and sets the attributes on the flow
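
The five steps above can be sketched in Python as follows. This is only an
illustration of length-prefixed framing, not the wire format of the diode
processors, which is not spelled out in this thread. The header layout (two
unsigned 32-bit big-endian lengths) and the function names encode_flow and
decode_flow are assumptions.

```python
import json
import struct

# Assumed header layout: JSON length, then payload length,
# each an unsigned 32-bit big-endian integer.
HEADER = struct.Struct(">II")


def encode_flow(attributes: dict, payload: bytes) -> bytes:
    """Steps 1 and 2: serialize attributes to JSON and prepend the length header."""
    attr_json = json.dumps(attributes).encode("utf-8")
    return HEADER.pack(len(attr_json), len(payload)) + attr_json + payload


def decode_flow(frame: bytes):
    """Steps 3 to 5: read the header, then split the frame into attributes and payload."""
    json_len, payload_len = HEADER.unpack_from(frame, 0)
    start = HEADER.size
    attr_json = frame[start:start + json_len]
    payload = frame[start + json_len:start + json_len + payload_len]
    if len(attr_json) != json_len or len(payload) != payload_len:
        raise ValueError("truncated frame")
    return json.loads(attr_json.decode("utf-8")), payload


# Round trip with a payload that contains newline bytes.
attributes = {"filename": "sample.bin", "source": "low"}
payload = b"\x00\n\xff\nbinary"
decoded_attributes, decoded_payload = decode_flow(encode_flow(attributes, payload))
```

Because the receiver knows both lengths in advance, the payload is never
scanned for a delimiter, so newline bytes in binary data are harmless.
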
>
> If you do not want to transfer attributes you can configure a different 
> decoder. In this case you can just 'netcat' a binary file to NiFi.
>
> The UDP version is far more complex. There must be a counter to tell the 
> destination what part of the flow file was received (even in a diode 
> environment packets are not received in the right order!). And you must be 
> fast, very fast. It is a multithreaded architecture because one thread cannot 
> receive, decode, and write a gigabit per second. I used the disruptor 
> library. Receive a packet in one thread, decode it in another thread. A third 
> thread gets the packet and writes the content in the right order to a flow.
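
The counter idea described above can be sketched as a small reorder buffer.
This is a single-threaded Python illustration only: it leaves out the
multithreaded pipeline and the disruptor library, and it does not handle lost
packets. The class name ReorderBuffer and the assumption that counters start at
0 are made up for the example.

```python
from typing import Dict, List


class ReorderBuffer:
    """Release chunks in counter order even if packets arrive out of order."""

    def __init__(self) -> None:
        self.next_seq = 0                    # assumption: counters start at 0
        self.pending: Dict[int, bytes] = {}  # chunks that arrived too early

    def accept(self, seq: int, chunk: bytes) -> List[bytes]:
        """Store one packet and return every chunk that can now be written in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate packet, ignore it
        self.pending[seq] = chunk
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


# Packets 1 and 0 arrive swapped, packet 2 arrives in order.
buffer = ReorderBuffer()
written: List[bytes] = []
for seq, chunk in [(1, b"B"), (0, b"A"), (2, b"C")]:
    written.extend(buffer.accept(seq, chunk))
content = b"".join(written)
```

A real receiver behind a diode would also need a policy for gaps that never
fill, since the sender cannot be asked to retransmit.
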
>
> I am still learning (and I am not a professional software developer). If I 
> did something wrong or overlooked something please tell me.
>
> Marc
>
> > On 02.08.2021 at 22:01, Adam Taft <[email protected]> wrote:
> >
> > Marc,
> >
> > How would this differ from a more generic use of the existing processors,
> > PutTCP/ListenTCP and PutUDP/ListenUDP?  I'm not sure what value is being
> > added above these existing processors, but I'm sure I'm missing something.
> >
> > There's already an ability to serialize flowfiles via MergeContent. And
> > there's the deserialize side in UnpackContent. So a dataflow that looks
> > like the following would seem a reasonable approach to the problem:
> >
> > MergeContent -> PutTCP -> {diode} -> ListenTCP -> UnpackContent
> >
> > I'm actually very interested in this topic, having a project that has a use
> > case for a "diode". So I'm legitimately asking here, not trying to derail
> > your work.
> >
> > Thanks in advance,
> >
> > Adam
> >
> > On Sun, Aug 1, 2021 at 12:26 PM Marc <[email protected]> wrote:
> >
> >> Greetings,
> >>
> >> there are companies and organizations that strictly separate their
> >> networks for security reasons. Such companies often use diodes to achieve
> >> this. But of course they still have to exchange data between the networks
> >> (e.g. transfer data from 'low' to 'high'). There are at least two kinds of
> >> diodes. Some hardware-based ones only use one fiber optic to send data (UDP
> >> based). Others use TCP, but prevent sending in the reverse direction.
> >>
> >> NiFi is an amazing tool that allows data to be transferred between two
> >> separate networks in a very flexible but also secure way. I have
> >> implemented two processors. The first one 'merges' the attributes and the
> >> content of a flowfile and sends it to the destination. The second one
> >> listens on a TCP port, splits attributes and content and creates a new
> >> flowfile containing all attributes of the origin flow. You can send the
> >> flow without attributes as well. In this case you can easily netcat a
> >> binary file to NiFi.
> >>
> >> These two processors are useful if you do NOT have bidirectional
> >> communication between two NiFi instances and therefore the site-to-site
> >> mechanism or HTTP(S) cannot be used.
> >>
> >> We have been using these processors for quite some time (specifically,
> >> the version for NiFi 1.13.2) and would like to share these processors with
> >> others. So the question to you all is: Is someone interested in these
> >> processors or is this use case too special?
> >>
> >> The current source code can be found on GitHub:
> >> https://github.com/nerdfunk-net/diode/
> >>
> >> I have also implemented a UDP based version of the processor. Due to the
> >> nature of UDP, this is more complex and these processors are now being
> >> tested.
> >>
> >> Best regards
> >> Marc
>
