On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <d...@gnu.org> wrote:

> Han-Wen Nienhuys <hanw...@gmail.com> writes:
>
> > I looked a bit through the GUILE source code to see what is going on.
> >
> > I believe our current hypothesis (LilyPond's slowdown is caused by
> > expensive unicode transcoding into 32-bit strings) is incorrect.
> >
> > If you look into the source code, you can see that the UTF-8 -> SCM
> > conversion checks if there are any code points over 255
> >
> > https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
> >
> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
> > string as a normal byte array. This code walks the string twice, but that
> > is very cheap due to CPU cache locality, so it should be
> > essentially equivalent to whatever GUILE 1.8 was doing.
>
> GUILE 1.8 did not walk the string even once
>
GUILE 1.8 walks it once when you do memcpy.

> > Even so, if the input file does use UTF-8, there should be little
> > overhead, because the number of texts that we process is always
> > small. LilyPond is not a text processor.
> >
> > So, what hard data do we have on GUILE 2/3 slowness, and what does
> > that data say?
>
> That data says "humongous slowdown". There is not much more than
> speculation what this is caused by as far as I know.
>

Do we have a standardized test file for benchmarking performance?

-- 
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
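
For readers following the thread, here is a minimal sketch of the narrow/wide decision described in the quoted message: scan the decoded text for any code point above 255, then store it either as one byte per character (Latin-1) or as 32-bit units. This is an illustration in Python, not GUILE's C code; the function name and the returned tags are hypothetical.

```python
def decode_utf8_narrow_or_wide(data: bytes):
    """Illustrative only: mimic a 'narrow if every code point <= 255' check.

    Returns ("narrow", latin1_bytes) or ("wide", utf32_bytes).
    """
    # Decode the UTF-8 input into code points.
    text = data.decode("utf-8")

    # First walk: is there any code point that does not fit in one byte?
    narrow = all(ord(ch) <= 0xFF for ch in text)

    # Second walk: write the characters out in the chosen width.
    if narrow:
        return "narrow", text.encode("latin-1")   # 1 byte per character
    return "wide", text.encode("utf-32-le")       # 4 bytes per character


# Typical Western European text stays narrow; a music symbol forces wide.
kind_a, buf_a = decode_utf8_narrow_or_wide("café".encode("utf-8"))
kind_b, buf_b = decode_utf8_narrow_or_wide("♪ la".encode("utf-8"))
```

Both walks are linear in the length of the string, which is the basis for the argument above that the check is cheap for the short texts LilyPond handles.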