Re: [capnproto] How to store messages for later use without leaking memory?

FANG Colin Thu, 18 Apr 2019 12:31:11 -0700

Wonderful exactly what I need. Thanks you!

On Thursday, April 18, 2019 at 2:01:15 AM UTC+1, Kenton Varda wrote:
>
> Hi Colin,
>
> The reason your first two classes use more memory is because 
> MallocMessageBuilder pre-allocates space, and your messages are probably 
> much smaller than what it pre-allocates. You can pass values to 
> MallocMessageBuilder's constructor to tell it to allocate less space.
>
> That said, you should check out the OwnCapnp class in Sandstorm's code, 
> which I think does exactly what you're looking for here:
>
>
> https://github.com/sandstorm-io/sandstorm/blob/4d86a8144cdb43120ea12845738d0fe4a6ffcda1/src/sandstorm/util.h#L501-L525
>
> This class allocates exactly as much space as is needed, does only one 
> copy, and guarantees that the resulting copy has no out-of-bounds pointers, 
> which allows it to construct a Reader that skips bounds checking for better 
> performance. (This is still safe to use with untrusted input, as all 
> pointers are validated during the copy step.)
>
> I really should add this utility type to Cap'n Proto itself...
>
> As for your memory leaks, I don't think your code is leaking memory. It 
> looks like it is handling ownership correctly and, as you said, valgrind 
> claims no leaks. You say that memory usage stays high even after deleting 
> the objects, but this is typical: many memory allocators do not return 
> memory to the system, or only do so after some time has passed or under 
> other specific conditions. Fragmentation can indeed make it harder to 
> return memory to the system. So, it's pretty common that a process's memory 
> usage tells you the maximum memory it ever allocated, rather than the 
> amount currently allocated.
>
> If you absolutely must return memory to the system promptly, you may need 
> to customize your memory allocation. If you allocate memory using mmap() 
> instead of malloc(), then the corresponding munmap() will definitely return 
> the memory. But, of course, mmap() can only allocate in multiples of the 
> system page size, so then it's up to you how to pack smaller objects into 
> that space. Cap'n Proto gives you a lot of control over memory allocation 
> by writing custom subclasses of MessageReader and MessageBuilder, or by 
> passing in your own scratch space, which might make the job easier.
>
> -Kenton
>
> On Wed, Apr 17, 2019 at 2:23 PM <[email protected] <javascript:>> wrote:
>
>> Hi in my use case I would like to read messages from a stream, 
>> selectively store  certain structs from some messages in a dictionary for 
>> later use. Unfortunately I cannot get it to work without leaking memory.  
>> The total number of messages in the dictionary is bounded, as new messages 
>> with the same key would replace the old message. (Actually valgrind doesn't 
>> think it is leaking, but I can see my memory usage is increasing 
>> indefinitely).
>>
>> Test code
>>
>> messages are stored in a vector `data`
>>
>>
>> struct ProcessStream {
>>     using CapnpMsg = capnp_utils::CapnpMessage<Bar>;
>>     std::vector<std::shared_ptr<CapnpMsg>> data;
>>
>>
>>     bool process_message(
>>         kj::BufferedInputStreamWrapper& buffered_stream, kj::Array<capnp
>> ::word>& scratch
>>     ) {
>>         capnp::InputStreamMessageReader reader(buffered_stream, capnp::
>> ReaderOptions(), scratch.asPtr());
>>
>>
>>         const auto ts_msg = reader.getRoot<Foo>();
>>         auto mir_msg = ts_msg.getBar();
>>         auto capnp_msg = std::make_shared<CapnpMsg>(mir_msg);
>>         data.emplace_back(capnp_msg);
>>         if (data.size() > 1000000) {
>>             return false;
>>         }
>>
>>
>>         return true;
>>     }
>>
>>
>>     void start() {
>>         // 1 MB of scratch.
>>         kj::Array<capnp::word> scratch = kj::heapArray<capnp::word>(1024 
>> * 1024 / sizeof(capnp::word));
>>         kj::FdInputStream fd_stream(fileno(stdin));
>>         kj::BufferedInputStreamWrapper buffered_stream(fd_stream);
>>
>>
>>         while (buffered_stream.tryGetReadBuffer() != nullptr) {
>>             if (!process_message(buffered_stream, scratch)) {
>>                 break;
>>             }
>>         }
>>     }
>> };
>>
>>
>>
>> void foo() {
>>     ProcessStream stream;
>>     stream.start();
>>     std::cout << "A" << std::endl;
>>     usleep(1000 * 1000 * 5);
>>     stream.data.clear();
>> }
>>
>>
>> int main(int argc, char* argv[]) {
>>
>>
>>     foo();
>>     std::cout << "B" << std::endl;
>>     usleep(1000 * 1000 * 10);
>> }
>>
>>
>> The following are the 3 ways to use a wrapper class to store the messages
>>
>>  
>>
>>
>>
>> namespace capnp_utils {
>>
>>     /// 1 & 2 takes 10x more memory.
>>
>>     template<typename R>
>>     class CapnpMessage2 {
>>         // TODO: Should be private, but leave it public for debugging.
>>         public:
>>             typename R::Reader msg_reader;
>>             std::unique_ptr<capnp::MallocMessageBuilder> msg_builder;
>>
>>         public:
>>             CapnpMessage2(const typename R::Reader& reader) {
>>                 msg_builder = std::make_unique<capnp::
>> MallocMessageBuilder>();
>>                 msg_builder->setRoot(reader);
>>                 msg_reader = msg_builder->getRoot<R>().asReader();
>>             }
>>
>>
>>             const typename R::Reader& get() const {
>>                 return msg_reader;
>>             }
>>     };
>>
>>
>>     template<typename R>
>>     class CapnpMessage1 {
>>
>>         public:
>>             typename R::Reader struct_reader;
>>             std::unique_ptr<capnp::MallocMessageBuilder> msg_builder;
>>         public:
>>
>>             CapnpMessage1(const typename R::Reader& reader) {
>>                 capnp::MallocMessageBuilder tmp;
>>                 tmp.setRoot(reader);
>>                 const auto raw = capnp::messageToFlatArray(tmp);
>>
>>
>>                 msg_builder = std::make_unique<capnp::
>> MallocMessageBuilder>();
>>                 capnp::initMessageBuilderFromFlatArrayCopy(raw, *
>> msg_builder);
>>                 struct_reader = msg_builder->getRoot<R>().asReader();
>>             }
>>
>>
>>             const typename R::Reader& get() const {
>>                 return struct_reader;
>>             }
>>     };
>>
>>
>>
>>
>>     template<typename R>
>>     class CapnpMessage {
>>
>>         public:
>>             typename R::Reader struct_reader;
>>             // `struct_reader` is only valid when `msg_reader` is alive.
>>             std::unique_ptr<capnp::FlatArrayMessageReader> msg_reader;
>>             kj::Array<capnp::word> raw;
>>         public:
>>
>>
>>             CapnpMessage(const typename R::Reader& reader) {
>>                 capnp::MallocMessageBuilder msg_builder;
>>                 msg_builder.setRoot(reader);
>>                 raw = capnp::messageToFlatArray(msg_builder);
>>                 msg_reader = std::make_unique<capnp::
>> FlatArrayMessageReader>(raw);
>>                 struct_reader = msg_reader->getRoot<R>();
>>             }
>>
>>             const typename R::Reader& get() const {
>>                 return struct_reader;
>>             }
>>     };
>>
>> }
>>  
>> `CapnpMessage2` copies the message once, while `CapnpMessage1` and 
>> `CapnpMessage` both copy the message twice.
>>
>> In terms of memory usage:
>>                                    A                    B
>> CapnpMessage2      8152MB      8136MB
>> CapnpMessage1      8166MB      8151MB
>> CapnpMessage        640MB        625MB
>>
>>
>> It looks that `CapnpMessage1` & `CapnpMessage2` use far more memory than 
>> `CapnpMessage`. Is it because the builder has reserved too much space even 
>> though I only need it to be as big as the reader?
>> The important part is that in all 3 cases, memory are still used at time 
>> B although  `data` is cleared. (Even if I reset the `unique_ptr` inside the 
>> wrapper class).
>>
>> Is it somehow related to memory fragmentation?
>>
>> What is the best approach in this case? I would very much appreciate any 
>> help to reduce the steady memory increase.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Cap'n Proto" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> Visit this group at https://groups.google.com/group/capnproto.
>>
>


-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
Visit this group at https://groups.google.com/group/capnproto.

Re: [capnproto] How to store messages for later use without leaking memory?

Reply via email to