On Tue, 1 Jun 2010, spir ☣ wrote:

Hello,


The documentation in the ref manual about PChar could use a bit more detail: 
http://www.freepascal.org/docs-html/ref/refsu13.html#x36-390003.2.7

Do the following statements hold true?
* This type is mainly intended for interfacing with C code (or for low-level 
needs?). Otherwise AnsiString should be preferred (even for low-level work, since 
AnsiString is also referenced via a pointer?).

PChar is for C code.

* Like C strings, and unlike AnsiStrings (even if the latter are also 
"pointed"), PChar strings cannot hold NUL characters (#0). I just checked this 
point.

Correct.


Also:
* How is length computed (traversal?)?

StrLen traverses the string until the first null character (#0).
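
To make the difference concrete, here is a minimal sketch, assuming Free Pascal with the SysUtils unit for StrLen; the program name and variable are made up for illustration. Length reads the length stored with the AnsiString, while StrLen scans the characters and stops at the first #0.

```pascal
program StrLenDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;  // provides StrLen

var
  s: AnsiString;  // illustrative name, not from the original mail
begin
  s := 'abc'#0'def';           // 7 characters, the 4th one is #0
  WriteLn(Length(s));          // 7: taken from the stored length
  WriteLn(StrLen(PChar(s)));   // 3: the scan stops at the first #0
end.
```

So a C function receiving PChar(s) here would only ever see 'abc'.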





Can AnsiStrings be safely used as dynamic byte arrays? For instance to benefit 
from ref counting and copy-on-write (if there is any benefit). Or is it 
recommended to use Array of Byte?

You can use them.


What is the actual benefit of copy-on-write? I ask because of the following 
reasoning:

Copy on write is needed to preserve the Pascal nature of strings while
keeping the benefits of reference counted strings.

After

A:='some string'; // Ref count is 1
B:=A;  // Ref count is 2
B[1]:='S'; // Copy, and ref count of B is 1.

the A[1]='s' should still hold true.
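
A runnable version of the above, as a small sketch only: the program name is made up, and comparing Pointer(A) with Pointer(B) is just one way to observe whether the two variables share the same character data.

```pascal
program CowDemo;
{$mode objfpc}{$H+}

var
  A, B: AnsiString;
begin
  A := 'some string';
  B := A;                            // no characters copied, data is shared
  WriteLn(Pointer(A) = Pointer(B));  // TRUE: both refer to the same data
  B[1] := 'S';                       // the write gives B its own copy first
  WriteLn(Pointer(A) = Pointer(B));  // FALSE: B now has separate data
  WriteLn(A);                        // some string
  WriteLn(B);                        // Some string
end.
```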



* If a string is just used in several places, for example in output or inside 
bigger strings, then there is no reason to copy it into a new variable.



* If a programmer explicitly assigns an existing string to a new variable, the 
intent is precisely copy semantics, to make them independent for further 
changes. If there is no change, there is also no reason for such an assignment.

This is not correct. Many strings are simply referenced several times.

As a consequence, s2:=s1 will nearly always be followed by modification of 
either string, which will result in a copy anyway, according to copy-on-write 
semantics. So, the initial gain at assignment time is soon lost, while the cost 
I imagine in terms of type complexity remains (every builtin modification 
method must ensure copying; no user-defined modification method should be 
possible without using builtin ones -- else copy-on-write is lost and 
consequences undefined).

What happens if a programmer indirectly modifies (via a pointer) an AnsiString 
whose ref count is > 1?

Var
   s1,s2 : AnsiString;
   pc    : PChar;
begin
   s1 := 'abcde' ; s2 := s1;
   pc := PChar(s1);
   pc[2] := 'X';
   writeln(pc,' ',s1,' ',s2);   // abXde abXde abXde
end.

There must be an error in my reasoning, else language designers would not 
bother with such complication. What do you think?

If the programmer does this, it is his own fault. A PChar typecast should be
considered read-only and valid only for the duration of the expression. The
docs state as much.
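
If writing through a pointer really is needed, one possible approach (a sketch, not something stated in this thread) is to call UniqueString from the System unit first, so that the string no longer shares its data with any other variable. The program name is made up for illustration.

```pascal
program UniqueDemo;
{$mode objfpc}{$H+}

var
  s1, s2: AnsiString;
  pc: PChar;
begin
  s1 := 'abcde';
  s2 := s1;              // s1 and s2 share the same data
  UniqueString(s1);      // s1 gets a private copy, s2 is left alone
  pc := PChar(s1);
  pc[2] := 'X';          // PChar indexing is zero-based: third character
  WriteLn(s1, ' ', s2);  // abXde abcde
end.
```

The pointer should still be treated as short-lived: any later assignment to s1 or change of its length can leave pc dangling.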

Michael.
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal