Hi Wayne,

I added some new profiles for the INPUTSTREAM_LINE_READER.
The results are very surprising to me. In both debug and release mode,
using INPUTSTREAM_LINE_READER with a wxFileInputStream is around 200
times (:-O) slower than a straight std::ifstream, taking over two
seconds to read a 6.5MB short-lined file that std::ifstream can read in
<10ms. I wonder if there's something I've missed here, as I can't
believe it's truly that slow.

I've pushed the benchmark to Launchpad for those who are interested:

https://code.launchpad.net/~john-j-beard/kicad/+git/kicad/+ref/io_benchmark

As for your note about having a generic stream version: yes, that's more
flexible, and we should aim for that if we were to provide a
std::istream LINE_READER. I just used ifstream as a test to keep things
clear and ensure a sensible comparison.

As I said, the current performance is "OK", and if we want to limit line
lengths, we probably can't get that for free anyway. I understand the
desire not to read infinite lines, but at least in my tests, the
std::ifstream method, which has no such limit, can deal with a 1GB file
consisting of a single line in about 300ms. Obviously the file is all in
disk cache, and you have to pay for the allocation when reading into the
buffer. All the existing LINE_READERs fail with IO_ERROR on that file,
since the line is too long for them.

Cheers,

John

On Fri, Feb 17, 2017 at 5:56 AM, Wayne Stambaugh <stambau...@gmail.com> wrote:
> John,
>
> It would have been nice if you had benchmarked wxFileInputStream
> as well. There already is an INPUTSTREAM_LINE_READER object which takes
> a pointer to a wxInputStream object. I'm curious how it stacks up
> against the std::ifstream. There are some interesting wxInputStream
> objects that could prove useful.
>
> I think ifstream wasn't used in case there are really long lines, which
> there can be if you have text objects with lots of long multiple line
> strings in your files. I'm ok with adding a LINE_READER that wraps
> istream objects. It's fairly trivial to change LINE_READER types.
> It might be a bit more flexible if you just provided an
> ISTREAM_LINE_READER that takes any istream-derived object rather than
> writing a separate LINE_READER for each istream derivative.
>
> Cheers,
>
> Wayne
>
> On 2/16/2017 8:43 AM, John Beard wrote:
>> Hi,
>>
>> I was trying to profile the eeschema slow library loads, and I got a
>> bit distracted by RICHIO's FILE_LINE_READER.
>>
>> Internally, it uses a very tight loop of reading single chars at a
>> time from a file descriptor, which looks inefficient. I wrote a
>> benchmarker to compare RICHIO against std::ifstream and a new
>> LINE_READER implementation, backed by std::ifstream. operf confirms
>> that most of the time in RICHIO is burned in the ReadLine() function
>> itself.
>>
>> The results were that RICHIO (in debug mode) is consistently 4-7 times
>> slower than using std::ifstream, when reading eeschema library text
>> files (so relatively short lines). Compiling the release version
>> improved RICHIO's speed more than std::ifstream's, but it is still
>> around 3 times slower than std::ifstream.
>>
>> For files with 1k lines, the slowdown is about 30 times (!) in debug
>> and 14 times in release mode, so significantly worse. Few files read
>> line-wise by KiCad look like that, however.
>>
>> Avoiding reconstructing the stream/LINE_READER each time doesn't have
>> much of an effect in any case.
>>
>> Is there a particular reason why STL streams are not used in RICHIO?
>> The only thing I think the example ifstream implementation can't do is
>> catch overlong lines, but that's only used in one place: the VRML
>> parser, which hardcodes an 8MB limit. ifstream could do this, but not
>> with the simple getline function.
>>
>> This performance doesn't appear to be a major bottleneck for me, but
>> it does seem a shame to throw away (charitably) two thirds of file
>> read speeds (and uncharitably, up to 97% in odd cases) if there is no
>> particular reason to do so.
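
To illustrate the point above about catching overlong lines without the simple getline function, here is a minimal sketch of a length-limited line read that works on any std::istream. The name readLineBounded and the choice of std::length_error are assumptions made for illustration only; this is not RICHIO code, and a real LINE_READER would presumably throw IO_ERROR instead.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>

// Hypothetical helper (not part of RICHIO): read one line from any
// std::istream into 'line', without the trailing '\n'.
// Throws std::length_error if the line is longer than maxLen characters.
// Returns false only when the input ended before any character was read.
bool readLineBounded( std::istream& in, std::string& line, std::size_t maxLen )
{
    using traits = std::char_traits<char>;

    line.clear();

    // Read from the stream buffer directly, so the characters come out of
    // the stream's internal buffer rather than one system call per char.
    std::streambuf* buf = in.rdbuf();

    for( ;; )
    {
        const traits::int_type c = buf->sbumpc();

        if( traits::eq_int_type( c, traits::eof() ) )
        {
            in.setstate( std::ios_base::eofbit );
            return !line.empty();   // a final line without '\n' still counts
        }

        if( traits::to_char_type( c ) == '\n' )
            return true;

        if( line.size() >= maxLen )
            throw std::length_error( "line exceeds maximum length" );

        line.push_back( traits::to_char_type( c ) );
    }
}
```

The limit check costs one comparison per character, which is the "not for free" part; whether that is measurable next to the I/O would need to be benchmarked rather than assumed.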
>>
>> As an aside, RICHIO appears to allocate twice as many times as
>> std::ifstream when reading the same data, for roughly the same
>> amount of memory in total.
>>
>> Anyway, I thought I'd share this finding! Please find attached the
>> benchmark program, such as it is.
>>
>> Cheers,
>>
>> John
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~kicad-developers
>> Post to     : kicad-developers@lists.launchpad.net
>> Unsubscribe : https://launchpad.net/~kicad-developers
>> More help   : https://help.launchpad.net/ListHelp
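
For the generic reader discussed in this thread (one that takes any istream-derived object rather than one reader per stream type), a rough sketch could look like the following. The class name IstreamLineReader and its methods are hypothetical and do not implement KiCad's actual LINE_READER interface; the sketch only shows that holding a std::istream reference is enough to cover ifstream, istringstream and other derivatives.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical sketch, NOT KiCad's LINE_READER interface: wraps any
// std::istream-derived object and hands out one line at a time.
class IstreamLineReader
{
public:
    explicit IstreamLineReader( std::istream& aStream ) : m_stream( aStream ) {}

    // Read the next line, without the trailing '\n'.
    // Returns false once the input is exhausted.
    bool ReadLine()
    {
        if( !std::getline( m_stream, m_line ) )
            return false;

        ++m_lineNumber;
        return true;
    }

    const std::string& Line() const { return m_line; }
    std::size_t        LineNumber() const { return m_lineNumber; }

private:
    std::istream& m_stream;         // not owned; must outlive the reader
    std::string   m_line;           // buffer reused between reads
    std::size_t   m_lineNumber = 0; // number of lines read so far
};

// Example use: count the lines available from any input stream.
std::size_t countLines( std::istream& aStream )
{
    IstreamLineReader reader( aStream );
    std::size_t       count = 0;

    while( reader.ReadLine() )
        ++count;

    return count;
}
```

Passing a std::ifstream opened on a file works the same way as the std::istringstream used for testing, since both derive from std::istream. Note that this version has no line-length limit, as it relies on plain std::getline.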