>You raised this topic from string line reading to more general case. 😄

Yes.
>If so, the best way could be providing a general function to wrap rdelim for 
>for-each-seg-delim, users may pass a delimiter to decide how to delim (even 
>for bytevectors), and implement for-line-a-file base on it with unicode 
>encoding.
No, this is worse. This limits the procedure to delimiting.
>However, personally I dare to doubt if we really need such general function, 
>since general parsing may require looking backwards. This implies the char 
>based checking delimiter will be more general.
>If we don't consider this, it's better to just consider strings with proper
>encoding to avoid over engineering.
‘General parsing may require looking backwards’ does not imply ‘don’t need such
general function’, and ‘don’t need such general function’ doesn’t imply ‘such a
general function wouldn’t be useful’. Many parsers don’t need to look
backwards. For example, see all the examples I mentioned in my mail.
Also, your ‘general’ isn’t my ‘general’. Nowhere did I limit the procedure to
delimiters and character-based things.
For an example of how this could be used: in Scheme-GNUnet, a fiber is waiting
for messages on some (stream) socket (SOCK_STREAM, not SOCK_DGRAM). Each message
consists of ‘message type field + packet size field + information’. So, the
fiber effectively iterates over all messages on the stream. The parser first
reads the type and size (no looking backwards needed), then reads the remaining
information and passes it to the type-specific parser; the parsed message can
then be acted upon.
(IIRC, the control flow is technically a bit different (let loop with 
arguments, to avoid mutation, not that it really matters since there is state 
anyway in the ports), but it could have been implemented like this instead.)
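As a rough sketch of the message loop described above (not the actual
Scheme-GNUnet code): the names ‘read-message’ and ‘for-each-message’ are
hypothetical, and the header layout (16-bit big-endian type, then 16-bit
big-endian payload size) is assumed purely for illustration; it need not match
the real wire format.

```scheme
(use-modules (ice-9 binary-ports)    ; get-bytevector-n
             (rnrs bytevectors))     ; bytevector-u16-ref, endianness

;; Hypothetical reader: return (TYPE . PAYLOAD) for the next message on
;; PORT, or the end-of-file object if the stream ends between messages.
;; Assumed layout (illustration only): 16-bit big-endian type,
;; 16-bit big-endian payload size, then the payload itself.
(define (read-message port)
  (let ((header (get-bytevector-n port 4)))
    (cond ((eof-object? header) header)
          ((< (bytevector-length header) 4)
           (error "truncated message header"))
          (else
           (let* ((type (bytevector-u16-ref header 0 (endianness big)))
                  (size (bytevector-u16-ref header 2 (endianness big)))
                  (payload (if (zero? size)
                               (make-bytevector 0)
                               (get-bytevector-n port size))))
             (if (or (eof-object? payload)
                     (< (bytevector-length payload) size))
                 (error "truncated message payload")
                 (cons type payload)))))))

;; Call (PROC TYPE PAYLOAD) for every message on PORT.  Only forward
;; reads are needed; the loop never looks backwards.
(define (for-each-message proc port)
  (let loop ()
    (let ((message (read-message port)))
      (unless (eof-object? message)
        (proc (car message) (cdr message))
        (loop)))))
```

Here PROC plays the role of the dispatch to the type-specific parser.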
Copy of the list of examples that don’t need to look backwards:
This is overly specific to reading lines, and to reading lines with rdelim. If
you replace ‘read-line’ by an argument, the procedure becomes more general. For
example, by passing ‘get-char’ you can act on each character; with ‘get-line’
you act on each line (I’m not sure what the difference would be, but apparently
it’s not ‘read-line’ (?)); if you give it a JSON reading+parsing procedure you
iterate over all JSON objects; with ‘get-u8’ you iterate over bytes; etc.
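A minimal sketch of such a generalised procedure, assuming Guile. The name
‘for-each-datum’ is hypothetical (it is not an existing Guile procedure); the
reader is simply passed as an argument and is expected to return the
end-of-file object when the port is exhausted.

```scheme
(use-modules (ice-9 textual-ports)   ; get-char, get-line
             (ice-9 binary-ports))   ; get-u8, open-bytevector-input-port

;; Hypothetical helper: call PROC on every object that READER returns
;; from PORT, stopping when READER returns the end-of-file object.
(define (for-each-datum proc reader port)
  (let loop ()
    (let ((datum (reader port)))
      (unless (eof-object? datum)
        (proc datum)
        (loop)))))

;; Iterate over lines.
(call-with-input-string "first\nsecond\n"
  (lambda (port)
    (for-each-datum (lambda (line) (display line) (newline))
                    get-line port)))

;; Iterate over bytes.
(for-each-datum (lambda (byte) (display byte) (newline))
                get-u8
                (open-bytevector-input-port #vu8(1 2 3)))
```

The same loop works with ‘get-char’, with ‘read’ for S-expressions, or with any
other procedure that takes a port and returns either an object or end-of-file.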
Best regards,
Maxime Devos
