No problem, feel free to write again if you encounter any issues.

Best Regards,
Igor
On Tue, Sep 29, 2020 at 8:02 PM Brett Elliott <belli...@icr-team.com> wrote:

> Hi Igor,
>
> Your request to see my Read method made me examine it a little closer.
> The particular vector resize that's the problem is my own fault. Thanks
> for making me have a look at that again.
>
> Thanks,
> Brett
>
> -----Original Message-----
> From: Igor Sapego [mailto:isap...@apache.org]
> Sent: Tuesday, September 29, 2020 2:22 AM
> To: dev <dev@ignite.apache.org>
> Subject: [EXTERNAL] Re: cpp thin client vector resize
>
> Hi,
>
> Can you share your ignite::binary::BinaryType::Read method where the
> reading of the std::vector is going on?
>
> Also, are your strings large too, or only the vectors?
>
> Best Regards,
> Igor
>
> On Mon, Sep 28, 2020 at 8:29 PM Brett Elliott <belli...@icr-team.com> wrote:
>
> > Hello,
> >
> > TL;DR: I'm doing some profiling, and the C++ thin client is spending a
> > lot of time in memset, almost as much time as in the socket recv call.
> >
> > Longer version:
> >
> > I was profiling a test harness that does a Get from our Ignite grid.
> > The test harness is written in C++ using the thin client. I profiled
> > the code using gperftools and found that memset (__memset_sse2 in my
> > case) was taking a large portion of the execution time. I'm Get-ting a
> > BinaryType which contains a std::string, an int32_t, and an array of
> > int8_t. The size of the int8_t array varies in my test case, but I got
> > the best throughput on my machine with an array of about 100 MB.
> >
> > I profiled the test code doing a single Get, and doing 8 Gets. In the
> > 8-Get case, the number of memset calls increased, but the percentage
> > of overall time spent in memset was reduced, though not as much as I'd
> > hoped. I was hoping that the first Get call would do one large memset
> > and the rest of the Get calls would skip it, but that doesn't seem to
> > be the case.
> >
> > I'm seeing almost as much time spent in memset as in recv (__libc_recv
> > in this case). That seems like a lot of time spent initializing. I
> > suspect it's std::vector initialization caused by resize.
> >
> > I believe memset is being invoked by a std::vector::resize operation
> > inside of ignite::binary::BinaryType::Read. I believe the source file
> > is modules/platforms/cpp/binary/src/impl/binary/binary_reader_impl.cpp.
> > In the code I'm looking at it's line 905. There are only two calls to
> > resize in this source file, and I think the one in ReadTopObject0 is
> > the culprit. I didn't compile with debug symbols to confirm the
> > particular resize call, but my profiler's call stack shows that resize
> > is to blame for all the memset calls.
> >
> > Is there any way we can avoid std::vector::resize? I suspect the
> > underlying problem is that the buffer eventually gets passed to a
> > socket recv call, which takes a pointer and a length. In that case,
> > there's no way I know of to use a std::vector for the buffer and still
> > avoid the unnecessary initialization/memset in the resize call.
> >
> > Could another container be used instead of a vector?
> > Could the vector be reused, so subsequent calls don't need to resize
> > it again?
> > Could something like uvector (which skips initialization) be used
> > instead?
> >
> > Thanks,
> > Brett
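
The "uvector" mentioned in the last question is typically built as a
std::vector with a custom allocator whose value-initializing construct()
is a no-op, so resize() allocates without zeroing. The sketch below is
illustrative rather than Ignite code: the allocator name and the 100 MB
size are made up, and it assumes the buffer is fully overwritten (e.g.
by recv) before it is read.

#include <cstdint>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Allocator that skips value-initialization of new elements.
// vector::resize() normally value-initializes the elements it adds,
// which for int8_t compiles down to the memset seen in the profile.
// Routing zero-argument construction through a no-op avoids that.
template <typename T>
struct UninitializedAllocator : std::allocator<T>
{
    template <typename U>
    struct rebind { typedef UninitializedAllocator<U> other; };

    // Value-initialization (what resize() performs) becomes a no-op;
    // the new elements are left uninitialized.
    template <typename U>
    void construct(U*) noexcept {}

    // Construction with arguments (copies, fills) still works normally.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args)
    {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

int main()
{
    // resize() allocates ~100 MB here but performs no memset; the
    // contents are indeterminate until something like recv() fills them.
    std::vector<int8_t, UninitializedAllocator<int8_t> > buf;
    buf.resize(100 * 1024 * 1024);
    return 0;
}

Brett's other suggestion, reusing the buffer across calls, also helps
with a plain std::vector: resize() only value-initializes elements
beyond the current size, so a buffer kept at its high-water mark is not
re-zeroed on subsequent Gets.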