On Thu, Jul 20, 2017 at 3:40 PM, Farid Zakaria <[email protected]> wrote:
> Is MMAP the only way to randomly seek to an offset in the file?
>
> I can't seem to find a way to do that with kj::FdInputStream ?
>
>
> I'm trying to create an index of the elements in the file.
>

kj::InputStream doesn't assume the stream is seekable and doesn't track the current location. You could create a custom wrapper around InputStream or around BufferedInputStream that remembers how many bytes have been read.

You can also lseek() the underlying fd directly, though of course you'll have to discard any buffers after that.

But indeed, if you use mmap() this will all be a lot easier, and faster. I highly recommend using mmap() here.

On Thu, Jul 20, 2017 at 4:14 PM, Farid Zakaria <[email protected]> wrote:

> One more question =)
>
> I need to copy the root from a FdStream to a vector
> Do I need to copy it into a MallocMessageBuilder ?
>

With InputStreamMessageReader, yes. You have to destroy the InputStreamMessageReader before you can read the next message, and that invalidates the root Reader and all other Readers pointing into it.

However, with the mmap strategy, you don't need to delete the FlatArrayMessageReader before reading the next message. So, you can allocate them on the heap and put them into your vector, and then all the Readers pointing into them remain valid, as long as the FlatArrayMessageReaders exist and the memory is still mapped. (In this case you should remove the madvise() line since you plan to go back and randomly access the data later.)

Again, I *highly* recommend this strategy instead of using a stream. With the mmap strategy, not only do you avoid copying into a builder, but you avoid copying the underlying data when you read it. The operating system causes the memory addresses to point directly at its in-memory cache of the file data. If multiple programs mmap() the same file, they share the memory, rather than creating their own copies.
Moreover, the operating system is free to evict the data from memory and then load it again later on-demand. There are tons of advantages to this approach and it is exactly what Cap'n Proto is designed to enable.

-Kenton
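
For the indexing use case discussed above, here is a minimal sketch of how the mmap approach might be used to record the byte offset of every message in a file. It is not code from this thread. It uses only POSIX and the C++ standard library, and it assumes the file was written with Cap'n Proto's standard unpacked stream framing (a little-endian segment table, padded to 8 bytes, followed by the segments). The names indexMessages and indexFile are hypothetical helpers invented for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Decode a little-endian 32-bit integer from raw bytes.
static uint32_t readLE32(const unsigned char* p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
         (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Hypothetical helper: walk a buffer of back-to-back messages and return the
// byte offset at which each one starts. Assumes unpacked stream framing:
// (segment count - 1), then one 32-bit size in words per segment, padded to
// an 8-byte boundary, then the segment data.
std::vector<size_t> indexMessages(const unsigned char* data, size_t size) {
  std::vector<size_t> offsets;
  size_t pos = 0;
  while (pos < size) {
    size_t remaining = size - pos;
    if (remaining < 8) throw std::runtime_error("truncated segment table");
    uint64_t segCount = uint64_t(readLE32(data + pos)) + 1;
    // 4 bytes for the count plus 4 per segment, rounded up to 8 bytes.
    uint64_t tableBytes = (4 + 4 * segCount + 7) / 8 * 8;
    if (tableBytes > remaining) throw std::runtime_error("truncated segment table");
    uint64_t bodyWords = 0;
    for (uint64_t i = 0; i < segCount; ++i) {
      bodyWords += readLE32(data + pos + 4 + 4 * i);
    }
    // Compare in words so the multiplication by 8 cannot overflow.
    if (bodyWords > (remaining - tableBytes) / 8) {
      throw std::runtime_error("truncated message body");
    }
    offsets.push_back(pos);
    pos += static_cast<size_t>(tableBytes + bodyWords * 8);
  }
  return offsets;
}

// Hypothetical helper: map a file read-only and index the messages in it.
std::vector<size_t> indexFile(const char* path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) throw std::runtime_error("open failed");
  struct stat st;
  if (fstat(fd, &st) != 0) {
    close(fd);
    throw std::runtime_error("fstat failed");
  }
  size_t size = static_cast<size_t>(st.st_size);
  if (size == 0) {  // mmap() rejects zero-length mappings
    close(fd);
    return {};
  }
  void* ptr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // the mapping remains valid after the fd is closed
  if (ptr == MAP_FAILED) throw std::runtime_error("mmap failed");
  std::vector<size_t> offsets;
  try {
    offsets = indexMessages(static_cast<const unsigned char*>(ptr), size);
  } catch (...) {
    munmap(ptr, size);
    throw;
  }
  munmap(ptr, size);
  return offsets;
}
```

In a real program you would presumably keep the mapping alive instead of unmapping it, and construct a capnp::FlatArrayMessageReader over the words starting at a recorded offset whenever you need random access to that message. Because every message in this framing is a whole number of 8-byte words, each recorded offset is word-aligned relative to the start of the mapping.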
