Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-31 Thread a11e99z via Digitalmars-d-learn
On Sunday, 31 January 2021 at 16:50:15 UTC, methonash wrote: What confuses me, at this point, is this: I originally wrote the D code using foreach in this style: foreach( i, ref parentString; strings ) foreach( j, ref childString; strings[ i + 1 .. $ ] ) Essentially, the value of j printed

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-31 Thread methonash via Digitalmars-d-learn
On Sunday, 31 January 2021 at 00:53:05 UTC, Steven Schveighoffer wrote: I'd suggest trying it in reverse. If you have the sequence "cba", "ba", "a", then determining "a" is in "ba" is probably cheaper than determining "a" is in "cba". I have user requirements that this application track string

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-30 Thread CraigDillabaugh via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 23:57:43 UTC, methonash wrote: clip That nested loop is an O(n^2) algorithm. Meaning it will slow down *very* quickly as the size of the array n increases. You might want to think about how to improve this algorithm. Nice observation, and yes, this would typical

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-30 Thread Steven Schveighoffer via Digitalmars-d-learn
On 1/30/21 6:13 PM, methonash wrote: Greetings all, Many thanks for sharing your collective perspective and advice thus far! It has been very helpful and instructive. I return bearing live data and a minimally complete, compilable, and executable program to experiment with and potentially opt

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-30 Thread methonash via Digitalmars-d-learn
Greetings all, Many thanks for sharing your collective perspective and advice thus far! It has been very helpful and instructive. I return bearing live data and a minimally complete, compilable, and executable program to experiment with and potentially optimize. The dataset can be pulled from

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-27 Thread Steven Schveighoffer via Digitalmars-d-learn
On 1/27/21 10:14 AM, Paul Backus wrote: On Wednesday, 27 January 2021 at 15:12:32 UTC, Paul Backus wrote: Maybe it's to avoid invalidating the result of `key in aa` when adding or removing entries? The spec doesn't say anything about it either way [1], but allowing invalidation would make AAs

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-27 Thread Paul Backus via Digitalmars-d-learn
On Wednesday, 27 January 2021 at 14:15:26 UTC, FeepingCreature wrote: Associative arrays allocate per entry added?! https://github.com/dlang/druntime/blob/master/src/rt/aaA.d#L205 Oh God, associative arrays allocate per entry added! Maybe it's to avoid invalidating the result of `key in aa`

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-27 Thread Paul Backus via Digitalmars-d-learn
On Wednesday, 27 January 2021 at 15:12:32 UTC, Paul Backus wrote: Maybe it's to avoid invalidating the result of `key in aa` when adding or removing entries? The spec doesn't say anything about it either way [1], but allowing invalidation would make AAs much more difficult to use in @safe cod

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-27 Thread FeepingCreature via Digitalmars-d-learn
On Wednesday, 27 January 2021 at 02:14:39 UTC, H. S. Teoh wrote: Yes, definitely try this. This will completely eliminate the overhead of using an AA, which has to allocate memory (at least) once per entry added. Especially since the data has to be sorted eventually anyway, you might as well

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jan 27, 2021 at 01:28:33AM +, Paul Backus via Digitalmars-d-learn wrote: > On Tuesday, 26 January 2021 at 23:57:43 UTC, methonash wrote: > > > Using AA's may not necessarily improve performance. It depends on > > > what your code does with it. Because AA's require random access > > >

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread Paul Backus via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 23:57:43 UTC, methonash wrote: Using AA's may not necessarily improve performance. It depends on what your code does with it. Because AA's require random access to memory, it's not friendly to the CPU cache hierarchy, whereas traversing linear arrays is more ca

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread methonash via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 18:17:31 UTC, H. S. Teoh wrote: Do not do this. Every time you call .array it allocates a new array and copies all its contents over. If this code runs frequently, it will cause a big performance hit, not to mention high GC load. The function you're looking for

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread mw via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 21:55:47 UTC, mw wrote: On Tuesday, 26 January 2021 at 17:40:36 UTC, methonash wrote: foreach( i, ref pStr; sortedArr ) { foreach( j, ref cStr; sortedArr[ i + 1 .. $ ] ) { if( indexOf( pStr, cStr ) > -1 ) { // ... yourInnerOp

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread mw via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 17:40:36 UTC, methonash wrote: foreach( i, ref pStr; sortedArr ) { foreach( j, ref cStr; sortedArr[ i + 1 .. $ ] ) { if( indexOf( pStr, cStr ) > -1 ) { // ... } } } Before adding the code excerpt above, the Dlang prog

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread Steven Schveighoffer via Digitalmars-d-learn
On 1/26/21 12:40 PM, methonash wrote: My first attempt to solve this problem space used a small Perl program to perform steps 1 through 3, which would then pipe intermediate output to a small Dlang program handling only step #4 using dynamic arrays (no use of AAs) of ubyte[][] with use of count

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jan 26, 2021 at 06:13:54PM +, methonash via Digitalmars-d-learn wrote: [...] > I cannot post the full source code. Then we are limited in how much we can help you. > Regarding a reduced version reproducing the issue: well, that's > exactly what the nested foreach loop does. Without

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Jan 26, 2021 at 05:40:36PM +, methonash via Digitalmars-d-learn wrote: [...] > 1) Read a list of strings from a file > 2) De-duplicate all strings into the subset of unique strings > 3) Sort the subset of unique strings by descending length and then by > ascending lexicographic identit

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread Imperatorn via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 17:40:36 UTC, methonash wrote: Greetings Dlang wizards, I seek knowledge/understanding of a very frustrating phenomenon I've experienced over the past several days. [...] Source please 👍

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread methonash via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 17:56:22 UTC, Paul Backus wrote: It would be much easier for us to help you with this if you could post the full program, or at the very least a reduced version that reproduces the same issue. [1] Since your attempts so far have failed to fix the problem, it is

Re: 200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread Paul Backus via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 17:40:36 UTC, methonash wrote: foreach( i, ref pStr; sortedArr ) { foreach( j, ref cStr; sortedArr[ i + 1 .. $ ] ) { if( indexOf( pStr, cStr ) > -1 ) { // ... } } } Before adding the code excerpt above, the Dlang prog

200-600x slower Dlang performance with nested foreach loop

2021-01-26 Thread methonash via Digitalmars-d-learn
Greetings Dlang wizards, I seek knowledge/understanding of a very frustrating phenomenon I've experienced over the past several days. The problem space: 1) Read a list of strings from a file 2) De-duplicate all strings into the subset of unique strings 3) Sort the subset of unique strings by