No problem, feel free to write again if you encounter any issues.

Best Regards,
Igor


On Tue, Sep 29, 2020 at 8:02 PM Brett Elliott <belli...@icr-team.com> wrote:

> Hi Igor,
>
> Your request to see my Read method made me examine it a little more
> closely. The particular vector resize that's causing the problem is in
> my own code, so it's my own fault. Thanks for making me take another
> look at it.
>
> Thanks,
> Brett
>
> -----Original Message-----
> From: Igor Sapego [mailto:isap...@apache.org]
> Sent: Tuesday, September 29, 2020 2:22 AM
> To: dev <dev@ignite.apache.org>
> Subject: [EXTERNAL] Re: cpp thin client vector resize
>
>
> Hi,
>
> Can you share your ignite::binary::BinaryType::Read method where reading
> of the std::vector is going on?
>
> Also, are your strings large too or only vectors?
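>
> For reference, a user-side Read for a type like yours usually looks
> something like the sketch below. The type and field names here are made
> up, and the exact reader signatures vary a bit between Ignite releases,
> so treat it as illustrative; the interesting part is how the byte array
> gets sized before it is filled:
>
>     #include <cstdint>
>     #include <string>
>     #include <vector>
>
>     // Hypothetical value type standing in for the real one.
>     struct Payload
>     {
>         std::string name;
>         int32_t id;
>         std::vector<int8_t> blob;
>     };
>
>     namespace ignite { namespace binary {
>
>     template<>
>     struct BinaryType<Payload>
>     {
>         static void Read(BinaryReader& reader, Payload& dst)
>         {
>             dst.name = reader.ReadString("name");
>             dst.id = reader.ReadInt32("id");
>
>             // Query the array length first, then size the vector and
>             // read into it. The resize here zero-fills the whole
>             // buffer before the payload overwrites it.
>             int32_t len = reader.ReadInt8Array("blob", 0, 0);
>             dst.blob.resize(len);
>             reader.ReadInt8Array("blob", &dst.blob[0], len);
>         }
>
>         // GetTypeId, GetTypeName, Write, etc. omitted for brevity.
>     };
>
>     }} // namespace ignite::binary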
>
> Best Regards,
> Igor
>
>
> On Mon, Sep 28, 2020 at 8:29 PM Brett Elliott <belli...@icr-team.com>
> wrote:
>
> > Hello,
> >
> > TL;DR: I'm doing some profiling, and the C++ thin client is spending a
> > lot of time in memset, almost as much as it spends in the socket recv
> > call.
> >
> > Longer version:
> >
> > I was profiling a test harness that does a Get from our Ignite grid.
> > The test harness is written in C++ using the thin client. I profiled
> > the code with gperftools and found that memset (__memset_sse2 in my
> > case) was taking a large portion of the execution time. I'm Get-ting a
> > BinaryType which contains a std::string, an int32_t, and an array of
> > int8_t. For my test case the size of the int8_t array can vary, but I
> > got the best throughput on my machine with about 100MB for that array.
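> >
> > In case it helps, the shape of the harness is roughly this (all names
> > here are made up, and the BinaryType specialization for the value type
> > is omitted; it's a sketch, not the exact code):
> >
> >     #include <cstdint>
> >     #include <string>
> >     #include <vector>
> >
> >     #include <ignite/thin/ignite_client.h>
> >
> >     // Hypothetical value type; the real one is similar in shape.
> >     struct Payload
> >     {
> >         std::string name;
> >         int32_t id;
> >         std::vector<int8_t> blob; // ~100MB in the best-throughput case
> >     };
> >
> >     int main()
> >     {
> >         using namespace ignite::thin;
> >
> >         IgniteClientConfiguration cfg;
> >         cfg.SetEndPoints("127.0.0.1:10800");
> >
> >         IgniteClient client = IgniteClient::Start(cfg);
> >
> >         cache::CacheClient<int32_t, Payload> cache =
> >             client.GetCache<int32_t, Payload>("payloads");
> >
> >         Payload value = cache.Get(42); // the Get being profiled
> >
> >         return 0;
> >     }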
> >
> > I profiled the test code doing a single Get and doing 8 Gets. In the
> > 8-Get case the number of memset calls increased, but the percentage of
> > overall time spent in memset was reduced. However, it was not reduced
> > as much as I'd hoped. I was hoping that the first Get call would incur
> > one large memset and the rest of the Get calls would skip it, but that
> > doesn't seem to be the case.
> >
> > I'm seeing almost as much time spent in memset as is being spent in
> > recv (__libc_recv in this case). That seems like a lot of time spent
> > initializing. I suspect that it's std::vector initialization caused by
> > resize.
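> >
> > A trivial standalone case shows the same behavior. For a trivially
> > constructible element type like int8_t, resize() value-initializes
> > (zero-fills) the new elements, and the standard library typically
> > lowers that to a memset:
> >
> >     #include <cstdint>
> >     #include <vector>
> >
> >     int main()
> >     {
> >         std::vector<int8_t> buf;
> >
> >         // Zero-fills ~100MB (usually a memset under the hood), even
> >         // though the bytes are about to be overwritten by real data.
> >         buf.resize(100 * 1024 * 1024);
> >
> >         return 0;
> >     }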
> >
> > I believe memset is being invoked by a std::vector::resize operation
> > inside ignite::binary::BinaryType::Read. The source file appears to be
> > modules/platforms/cpp/binary/src/impl/binary/binary_reader_impl.cpp;
> > in the code I'm looking at it's line 905. There are only two calls to
> > resize in that source file, and I think the one in ReadTopObject0 is
> > the culprit. I didn't compile with debug symbols to confirm the
> > particular resize call, but my profiler's call stack shows that resize
> > is to blame for all the memset calls.
> >
> > Is there any way we can avoid std::vector::resize? I suspect that
> > ultimately the problem is that the buffer somewhere gets passed to a
> > socket recv call, and the recv call takes a pointer and a length. In
> > that case, there's no way that I know of to use a std::vector for the
> > buffer while avoiding the unnecessary initialization/memset in the
> > resize call.
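> >
> > In other words, I suspect the pattern is something like the following.
> > This is purely illustrative, not the actual Ignite internals:
> >
> >     #include <sys/socket.h>
> >
> >     #include <cstdint>
> >     #include <vector>
> >
> >     // Every byte is written twice: once by resize(), once by recv()
> >     // (partial reads ignored for brevity).
> >     std::vector<int8_t> ReadMessage(int sock, size_t len)
> >     {
> >         std::vector<int8_t> buf;
> >
> >         buf.resize(len);  // zero-fills len bytes: the memset
> >
> >         ::recv(sock, buf.data(), buf.size(), 0);  // overwrites them
> >
> >         return buf;
> >     }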
> >
> > Could another container be used instead of a vector?
> > Could the vector be reused, so that subsequent calls don't need to
> > resize it again?
> > Could something like uvector (which skips initialization) be used
> > instead? A sketch of that last idea follows below.
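> >
> > For the last idea, one portable way to get uvector-like behavior while
> > keeping std::vector is an allocator whose no-argument construct() does
> > default-initialization, so resize() leaves trivially constructible
> > elements uninitialized. A sketch of what I mean:
> >
> >     #include <cstdint>
> >     #include <memory>
> >     #include <utility>
> >     #include <vector>
> >
> >     // Wraps another allocator; construct() with no arguments
> >     // default-initializes, so resize() skips the zero-fill for
> >     // trivially constructible element types like int8_t.
> >     template<typename T, typename A = std::allocator<T> >
> >     struct DefaultInitAllocator : public A
> >     {
> >         typedef std::allocator_traits<A> Traits;
> >
> >         template<typename U>
> >         struct rebind
> >         {
> >             typedef DefaultInitAllocator<
> >                 U, typename Traits::template rebind_alloc<U> > other;
> >         };
> >
> >         using A::A;
> >
> >         template<typename U>
> >         void construct(U* p)
> >         {
> >             ::new (static_cast<void*>(p)) U;  // no zero-fill
> >         }
> >
> >         template<typename U, typename... Args>
> >         void construct(U* p, Args&&... args)
> >         {
> >             Traits::construct(static_cast<A&>(*this), p,
> >                               std::forward<Args>(args)...);
> >         }
> >     };
> >
> >     // resize() on this vector no longer memsets the new bytes.
> >     typedef std::vector<int8_t, DefaultInitAllocator<int8_t> > RawBuffer;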
> >
> > Thanks,
> > Brett
> >
> >
> >
>
