Han-Wen Nienhuys <hanw...@gmail.com> writes:

> On Wed, Jan 22, 2020 at 12:01 PM David Kastrup <d...@gnu.org> wrote:
>
>> Han-Wen Nienhuys <hanw...@gmail.com> writes:
>>
>> > I looked a bit through the GUILE source code to see what is going on.
>> >
>> > I believe our current hypothesis (LilyPond's slowdown is caused by
>> > expensive unicode transcoding into 32-bit strings) is incorrect.
>> >
>> > If you look into the source code, you can see that the UTF-8 -> SCM
>> > conversion checks if there are any code points over 255
>> >
>> > https://git.savannah.nongnu.org/cgit/guile.git//tree/libguile/strings.c/?id=1b8e9ca0e37fab366435436995248abdfc780a10#n1620
>> >
>> > if there aren't, it uses Latin1 encoding ("narrow == 1") to encode the
>> > string as a normal byte array. This code walks the string twice, but that
>> > is very cheap due to CPU cache locality, so it should be
>> > essentially equivalent to whatever GUILE 1.8 was doing.
>>
>> GUILE 1.8 did not walk the string even once
>
> GUILE 1.8 walks it once when you do memcpy.

Ok, but that's sort of a cheap walk.

>> > Even so, if the input file does use UTF-8, there should be little
>> > overhead, because the number of texts that we process is always
>> > small. LilyPond is not a text processor.
>> >
>> > So, what hard data do we have on GUILE 2/3 slowness, and what does
>> > that data say?
>>
>> That data says "humongous slowdown".  There is not much more than
>> speculation what this is caused by as far as I know.
>
> Do we have a standardized test file for benchmarking performance?

input/regression/mozart-hrn-3.ly possibly, but it's not particularly
large.

-- 
David Kastrup
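
For reference, the narrow/wide decision described in the quoted text can
be sketched roughly as follows. This is only an illustration of the idea
(check for code points over 255, then store one byte per character if
there are none), not a transcription of what strings.c does. The function
name to_guile_like_storage and the ("narrow"/"wide", payload) return shape
are made up for this sketch, and Python is used only for brevity.

```python
from typing import Tuple


def to_guile_like_storage(utf8_bytes: bytes) -> Tuple[str, bytes]:
    """Hypothetical helper: pick a narrow or wide layout for UTF-8 input.

    Assumption: "narrow" means one Latin-1 byte per character and "wide"
    means one 32-bit unit per character, as described in the thread.
    """
    # Walk 1: decode (and validate) the UTF-8 input.
    text = utf8_bytes.decode("utf-8")

    # Check whether any code point is above 255.
    if all(ord(ch) <= 0xFF for ch in text):
        # Walk 2: store as a plain byte array, one byte per character.
        return "narrow", text.encode("latin-1")

    # Otherwise fall back to 32-bit code units (little-endian, no BOM).
    return "wide", text.encode("utf-32-le")


if __name__ == "__main__":
    print(to_guile_like_storage("Menuetto".encode("utf-8")))
    print(to_guile_like_storage("Dvořák".encode("utf-8"))[0])
```

Under these assumptions, plain ASCII or Latin-1 text stays a byte array,
and only input containing something like "ř" (U+0159) ends up in the
32-bit representation.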