On Mon, 1 Mar 2004, David Emerson wrote:

> Hey all,
>
> I've been poking around the sources and documentation for some insight into the 
> details of how ansistrings are implemented, and I am left with some questions.
>
>
> It would be nice if, when comparing two ansistrings, fpc would first check to see if 
> these two pointers are pointing to the same spot in memory, i.e. the same TAnsiRec. 
> If they happen to be pointing to the same, a potentially long operation is reduced 
> to a simple comparison of two memory addresses which probably only takes one 
> processor cycle.
>
> Looking at fpc_ansistr_compare in astrings.inc, and at cgadd.addstring (the only 
> function that seems to call fpc_ansistr_compare), it appears not to do this. Perhaps 
> I'm wrong? I don't _really_ understand what the code is doing. I believe the sources 
> I'm looking at are 1.0.10.

This is already implemented in version 1.9.2.

>
> If this quick comparison is in fact not implemented, I'd like to do it myself. 
> (There are a number of places where I am checking long ansistrings for equality, and 
> there is a reasonable chance that both pointers are pointing to the same address.)
>
> ( @s1[1] = @s2[1] ) seems to give the right result. is this the best way?
> or is it quicker/slower to use ( pointer(s1) = pointer(s2) )
> (no doubt more elegant)

The code uses ( pointer(s1) = pointer(s2) )
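
For compiler versions without this shortcut, the same check can be done by hand. A minimal sketch, assuming a reasonably current FPC; the helper name SameString is made up for illustration and is not part of the RTL:

```pascal
program FastCompare;
{$mode objfpc}{$H+}

// Hypothetical helper: compare the addresses first, then the contents.
function SameString(const s1, s2: AnsiString): Boolean;
begin
  if Pointer(s1) = Pointer(s2) then
    Result := True        // same string data (or both empty): skip the content compare
  else
    Result := (s1 = s2);  // different addresses: fall back to the normal comparison
end;

var
  a, b, c: AnsiString;
begin
  a := StringOfChar('x', 1000);  // built at run time, so it is heap-allocated
  b := a;                        // plain assignment: b shares a's data
  c := a;
  UniqueString(c);               // forces c to hold its own private copy

  WriteLn(Pointer(a) = Pointer(b));  // address comparison for the shared pair
  WriteLn(Pointer(a) = Pointer(c));  // address comparison for the copied pair
  WriteLn(SameString(a, c));         // contents are equal even if the addresses differ
end.
```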

>
>
>
> Now on to the second question... getting those ansistrings pointed to the same 
> address!
> (Some of them already are, but I'd like to get more...)
>
> I was kind of surprised to find that
>
>   s1 := 'hello';
>   s2 := 'hello';
>   writeln ( pointer(s1) = pointer(s2) );     ...writes FALSE

This is normal. But if you do it 'correctly', i.e.:

Const
  MyHello = 'Hello';

S1 := MyHello;
S2 := MyHello;

then they will point at the same address.
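
As a complete program you can run against your own compiler version, this might look like the sketch below. The program name is arbitrary, and the result may differ between compiler versions:

```pascal
program ConstShare;
{$mode objfpc}{$H+}

const
  MyHello = 'Hello';

var
  S1, S2: AnsiString;
begin
  S1 := MyHello;
  S2 := MyHello;
  // Prints TRUE when both variables refer to the same string data.
  WriteLn(Pointer(S1) = Pointer(S2));
end.
```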


>
> Thus I assume that
>   readln (s1);
>   readln (s2);    ... would NEVER point them at the same address
>
> Of course, checking every string against every other string would comprise an absurd 
> performance hit in most cases. What I'd really like is to have a relatively small 
> number of constant strings that could be compared against, and only when reading in 
> a data file, or perhaps certain fields in a data file. (yeah, reading will take 
> longer)... Then if my data file (and thus my filled pascal array) has 1,000 
> instances of "some_complex_but_often_identically_repeated_data_value", I get the 
> following:
>   - lots of memory savings
>   - operator "=" gives a very fast TRUE result when they are pointing to the same address
>
> In fact, all the string comparison operators return a particular value if the two 
> are equal, so they could all give a fast result in this case. This could happen both 
> when comparing datum values against each other, and when comparing them to a 
> constant string in my code.
>
> Do resource strings offer this kind of intelligence, or are they designed for a 
> completely different purpose, and offer no performance improvement for this special 
> application?

They are normal ansistrings, just stored in a special table.
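
For strings read at run time, the sharing has to be arranged by hand, for instance by passing each value through a small pool as it is read. The following is only a sketch of that idea and not something the compiler or RTL does for you: the names Pool and Intern are hypothetical, and using a sorted TStringList from the Classes unit as the pool is an assumption made for brevity.

```pascal
program InternDemo;
{$mode objfpc}{$H+}

uses
  Classes;

var
  Pool: TStringList;  // hypothetical pool of the distinct strings seen so far

// Returns the pooled instance of s, adding s to the pool the first time it is seen.
function Intern(const s: AnsiString): AnsiString;
var
  idx: Integer;
begin
  if Pool.Find(s, idx) then
    Result := Pool[idx]  // reuse the stored string; only its reference count changes
  else
  begin
    Pool.Add(s);
    Result := s;         // first occurrence: the pool now shares this string's data
  end;
end;

var
  a, b: AnsiString;
begin
  Pool := TStringList.Create;
  try
    Pool.CaseSensitive := True;  // compare exactly, not case-insensitively
    Pool.Sorted := True;         // Find requires a sorted list
    a := Intern(StringOfChar('x', 50));  // two separately built strings
    b := Intern(StringOfChar('x', 50));  // with identical contents
    WriteLn(Pointer(a) = Pointer(b));    // shows whether both now share one instance
  finally
    Pool.Free;
  end;
end.
```

Each lookup still costs a binary search with full string comparisons, so this only pays off when the pooled values are compared many times after loading.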

>
> Is there some special mode or compiler directive that does more string uniqueness 
> checking, including at compile time (e.g. to find my two identical 'hello' 
> assignments)?

No. Please use constants, that's what they're for.

Michael.
