Re: Segfault when running datasketches

2020-06-29 Thread Andy Dang
Will try to get the example data and code out - there's a lot of internal logic in it at the moment. The commit we're using is e402f61aceb64f659845bdc5f03cf4f29797277b. - Andy On Mon, Jun 29, 2020 at 9:29 AM Jon Malkin wrote: > You mean you were calling the java library from python? Our testing > gen

Re: Segfault when running datasketches

2020-06-29 Thread Jon Malkin
You mean you were calling the java library from python? Our testing has generally shown C++ to be faster. This is still too vague for me to be able to say much. There's no specific git version (tag or hash), no code, and no data. jon On Mon, Jun 29, 2020 at 9:08 AM Andy Dang wrote:

Re: Segfault when running datasketches

2020-06-29 Thread Andy Dang
I was using the Git version and was running with various sketches. I thought the slowness came from Python, but I was able to scan through the same data, calculating the same statistics, with the Java library in roughly 3 minutes. Any idea why there's such a big difference between the two languages?

Re: Segfault when running datasketches

2020-06-26 Thread Jon Malkin
I haven't done long-running python tests recently, but I haven't seen that. Are you using a release version of the library or did you check out from git? And which sketch or sketches are you using? I've compiled the library in debug mode (gotta modify setup.py to force that) and run python via g