Hi :)

On Thu 14 Apr 2016 16:03, l...@gnu.org (Ludovic Courtès) writes:

> Andy Wingo <wi...@pobox.com> skribis:
>
>> I have been working on a refactor to ports.  The goal is to have a
>> better concurrency story.  Let me tell that story then get down to the
>> details.
>
> In addition to concurrency and thread-safety, I’m very much interested
> in the impact of this change on the API (I’ve always found the port API
> in C to be really bad), on the flexibility it would provide, and on
> performance—‘read-char’ and ‘get-u8’ are currently prohibitively slow!

Yeah.  Of course improving the port internals is technically a breaking
change, but I think probably the set of people that have implemented
ports using the C API can be counted on two hands, and I hope to find
everyone and help them adapt :)

From the speed side, I think that considering read-char to be
prohibitively slow is an incorrect diagnosis.  First let's define a
helper:

  (define-syntax-rule (do-times n exp)
    (let lp ((i 0))
      (let ((res exp))
        (if (< i n)
            (lp (1+ i))
            res))))

I want to test four things.

  ;; 1. How long a loop up to 10 million takes (baseline measurement).
  (let ((port (open-input-string "s")))
    (do-times #e1e7 1))

  ;; 2. A call to a simple Scheme function.
  (define (foo port) 42)
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (foo port)))

  ;; 3. A call to a port subr.
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (port-line port)))

  ;; 4. A call to a port subr that touches the buffer.
  (let ((port (open-input-string "s")))
    (do-times #e1e7 (peek-char port)))

The results:

                    | baseline | foo    | port-line | peek-char
  ------------------+----------+--------+-----------+----------
  guile 2.0         | 0.269s   | 0.845s | 1.067s    | 1.280s
  guile master      | 0.058s   | 0.224s | 0.225s    | 0.433s
  wip-port-refactor | 0.058s   | 0.220s | 0.226s    | 0.375s

These were single measurements at the REPL on my i7-5600U, run with
--no-debug.  The results were fairly consistent.
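
For anyone who wants to repeat a measurement like this outside the REPL, here is a minimal sketch.  The message does not say how the timings were taken (the REPL's `,time` meta-command is one option), so `time-thunk` and `bench-peek-char` below are hypothetical helpers built on Guile's `get-internal-real-time`; only `do-times` comes from the message.

```scheme
;; Hypothetical helper (not from the original message): run THUNK once
;; and return the elapsed wall-clock time in seconds.
(define (time-thunk thunk)
  (let ((start (get-internal-real-time)))
    (thunk)
    (exact->inexact
     (/ (- (get-internal-real-time) start)
        internal-time-units-per-second))))

;; The loop helper, as defined in the message.
(define-syntax-rule (do-times n exp)
  (let lp ((i 0))
    (let ((res exp))
      (if (< i n)
          (lp (1+ i))
          res))))

;; Hypothetical wrapper around test 4: time peek-char calls on a
;; one-character string port.
(define (bench-peek-char n)
  (let ((port (open-input-string "s")))
    (time-thunk (lambda () (do-times n (peek-char port))))))

(display (bench-peek-char #e1e7))
(newline)
```

The absolute numbers will differ by machine and Guile version; only the ratios between the four tests are comparable.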

Note that because this is a loop, Guile 2.2's compiler gets some
"unfair" advantages related to loop-invariant code motion and peeling;
but real parsers etc. written on top of read-char will also have loops,
so to a degree it's OK.

Conclusions:

  1. Guile 2.2 makes calling a subr just as cheap as calling a Scheme
     function.

  2. The overhead of using Guile 2.0 is much greater than the overhead
     of calling peek-char.

  3. peek-char is slower than other leaf functions in Guile 2.2 but
     only by 2x or so; I am sure it can be faster but I don't know by
     how much.

Consider that peek-char has to:

  1. type-check the argument

  2. get the port buffer and cursors

  3. if there is enough data in the buffer to decode a char, do it;
     otherwise, take the slow path.

If we consider implementing this in Scheme, it might get slower than it
currently is in 2.2, because of the switch from C->C calls (internal to
ports.c and other C files) to Scheme->Scheme calls, probably with some
additional subr calls to get state from the port.  We might gain some of
that back by removing the lock; dunno.

It would be nice to be able to decode chars from UTF-8 or ISO-8859-1
ports from Scheme.  But we always have to be able to call out to iconv
too.  Mark has mused on making the port buffer always UTF-8, but I don't
quite see how this could work.  I guess you could have a second port
buffer for decoded UTF-8 chars, but that starts to look quite
complicated to me.

Anyway.  I think that given the huge performance window opened up to us
by the 2.0->2.2 switch, we should treat speed as important but not
primary -- when given a choice between speed and maintainability, or
speed and the ability to suspend a port, we shouldn't choose speed.

That said, the real way to make port operations fast is (1) to buffer
the port, and (2) to operate on the buffer directly instead of fetching
data octet-by-octet.  Exposing the port buffer to Scheme allows this
kind of punch-through optimization to be implemented where needed.

Cheers,

Andy
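
P.S. A rough illustration of point (2), as a sketch only.  It does not use the exposed port buffer discussed above; as a stand-in it uses the existing `get-bytevector-n` from `(ice-9 binary-ports)` to pull a chunk per port call and then works on the bytevector directly.  The procedure names and the 4096-byte chunk size are made up for the example.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

;; Octet-by-octet: one port operation per byte.
(define (sum-bytes/get-u8 port)
  (let lp ((sum 0))
    (let ((b (get-u8 port)))
      (if (eof-object? b)
          sum
          (lp (+ sum b))))))

;; Chunked: one port operation per chunk, then plain bytevector
;; accesses that never touch the port.
(define (sum-bytes/chunked port chunk-size)
  (let lp ((sum 0))
    (let ((bv (get-bytevector-n port chunk-size)))
      (if (eof-object? bv)
          sum
          (let inner ((i 0) (sum sum))
            (if (< i (bytevector-length bv))
                (inner (1+ i) (+ sum (bytevector-u8-ref bv i)))
                (lp sum)))))))

;; Example input: 100000 bytes, each with the value 7.
(define data (make-bytevector 100000 7))

(display (sum-bytes/get-u8 (open-bytevector-input-port data)))
(newline)
(display (sum-bytes/chunked (open-bytevector-input-port data) 4096))
(newline)
```

Both procedures compute the same sum; the chunked one simply makes far fewer calls into the port machinery, which is the effect that direct buffer access would take further.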