On Tue, Nov 28, 2000 at 04:47:42PM -0500, Dan Sugalski wrote:
> At 09:05 PM 11/28/00 +0000, Nicholas Clark wrote:
>>On Tue, Nov 28, 2000 at 03:35:37PM -0500, Dan Sugalski wrote:
Not sure:
Dan:
>>>>> is treated as if it points to a stream of bytes, where the first four are
>>>>> the length of the source to be read followed by the source. If set to
>>>>
>>>>Since you have a fourth argument couldn't that be used for the length
>>>>of the byte stream rather than embedding that length into the byte stream
>>>>itself? Makes more sense to me to separate the bytes from the length.
>>>
>>> I'd rather the stream be self-contained, rather than needing an extra
>>> argument for the length. Counted strings aren't uncommon outside of C, and
>>> there's no reason a Fortran or COBOL (or Java, or...) program can't
>>> embed perl.
>>
>>Why four? Surely that's imposing an arbitrary binary structure. If it's a
>>parameter then it's (probably) a machine register and certainly a "natural"
>>quantity for whatever's running the code (and automatically the correct
>>endian-ness just in case perl is running in some (oddball partial)
>>binary emulation environment). Erm. Or something like that.
>
> It's not necessarily in a register. In at least some of the languages I
> named (and you can add BASIC and Pascal to the list as well), a string
> consists of a length and data pointer pair, usually together. What's handy
> is a pointer to the data structure, not the length and a pointer to the buffer.
>
> Counted strings should probably just have either a platform-native int in
> front, or a 32-bit int in network format, both of which should be doable on
> any platform that perl deals with.

I agree it's do-able.
But what seems to me a good reason not to do it has entered my head.
We're trying to make this an easy embedding API.

For the counted length version:

If I'm being passed a (large) block of data and a length for that block from
an upstream source which I can't change, then I have to malloc a new block
of length+4, copy the length into the first 4 bytes, and all that data into
the rest.

If I'm being passed a counted length block, easy, I pass it onwards.
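
As a rough sketch of that first case in C (make_counted_block is a made-up
name, not part of any proposed API, and the 4-byte native-order length
prefix is an assumption):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a counted block (4-byte native-order length
 * followed by the data) from a separate buffer and length.
 * The caller must free() the result. Returns NULL if malloc fails. */
unsigned char *make_counted_block(const void *data, uint32_t len)
{
    unsigned char *block = malloc(sizeof len + (size_t)len);
    if (block == NULL)
        return NULL;
    memcpy(block, &len, sizeof len);        /* length into the first 4 bytes */
    memcpy(block + sizeof len, data, len);  /* the copy we are forced to make */
    return block;
}
```

The malloc and the second memcpy are the cost being objected to: for a large
block they are paid on every call, purely to satisfy the calling convention.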

For the 2 parameter version:

If I'm being passed 2 parameters, a block and its length, I pass them on.

If I'm being passed a counted block, I read the count into one variable,
and use the address (counted block + 4) as the address of a vanilla block.
No malloc or copy. (And I'm assuming that in C one would actually take
the address of the structure member, no hacky pointer arithmetic.)
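
A minimal sketch of that second case, assuming a C99 compiler (for the
flexible array member) and a 32-bit native-order count. The names
counted_block, buf_view and view_of_counted are invented for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of a counted block: the length, then the bytes. */
struct counted_block {
    uint32_t len;
    unsigned char data[];   /* C99 flexible array member */
};

/* Hypothetical (pointer, length) pair, as a 2 parameter API would take. */
struct buf_view {
    const unsigned char *ptr;
    size_t len;
};

/* Split a counted block into pointer and length. No malloc, no copy. */
struct buf_view view_of_counted(const struct counted_block *cb)
{
    struct buf_view v;
    v.ptr = cb->data;   /* address of the member, not (char *)cb + 4 */
    v.len = cb->len;
    return v;
}
```

The view just points into the caller's existing block, so it is only valid
for as long as that block is.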

This only applies to external APIs. Internal APIs may well benefit from
counted length systems (or from doing things analogous to allowing the
buffer to follow directly after the struct STRUCT_SV in a single malloc()ed
block in perl5).

Nicholas Clark