Re: Dataflow and mounting large data sets

2023-01-31 Thread Chad Dombrova
Thanks for the info. We are going to test this further and we'll let you know how it goes.
-chad
On Mon, Jan 30, 2023 at 2:14 PM Valentyn Tymofieiev wrote:
> It applies to custom containers as well. You can find the container
> manifest in the GCE VM metadata, and it should have an entry for

Re: Dataflow and mounting large data sets

2023-01-31 Thread Luke Cwik via user
I would also suggest looking at NFS client implementations in Java that would allow you to talk to the NFS server without needing to mount it within the OS. A quick search yielded https://github.com/raisercostin/yanfs or https://github.com/EMCECS/nfs-client-java
On Tue, Jan 31, 2023 at 3:31 PM Cha

Re: Dataflow and mounting large data sets

2023-01-31 Thread Andrew Pilloud via user
I would guess that you have existing code that expects random IO access to the files via the Java IO or NIO interfaces (the common pattern of blocking IO inside a DoFn), so switching to a Beam IO, which is what we recommend and are discussing here, would be a significant rewrite? I worked on Isilon from 6.5 -
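
For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of random-access blocking IO through Java NIO, the kind of call that existing code might make from inside a DoFn against a mounted path. The class name `RandomAccessRead` and the helper `readAt` are hypothetical, chosen only for illustration; nothing here comes from the original poster's code, and the Beam wiring is left out so the snippet needs only the JDK.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration: positional (random-access) blocking read via NIO.
final class RandomAccessRead {

    // Reads up to `length` bytes starting at byte `offset` of `file`.
    // Returns fewer bytes if the end of the file is reached first.
    static byte[] readAt(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            // A single read may return fewer bytes than requested, so loop
            // until the buffer is full or the channel reports end of file.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // end of file
                }
                position += n;
            }
            buffer.flip();
            byte[] result = new byte[buffer.remaining()];
            buffer.get(result);
            return result;
        }
    }
}
```

Code shaped like this assumes the data is reachable as a local file path, which is why it depends on the share being mounted in the worker and why moving to a Beam IO would likely mean restructuring it.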