On 2/16/2017 8:59 PM, John Beard wrote:
> Hi Wayne,
>
> I added some new profiles for the INPUTSTREAM_LINE_READER.
>
> The results are very surprising to me. In debug and release mode,
> using INPUTSTREAM_LINE_READER with a wxFileInputStream is around 200
> times (:-O) slower than a straight std::ifstream, taking over two
> seconds to read a 6.5MB short-lined file that std::ifstream can do in
> <10ms.
>
> I wonder if there's something I've missed here, as I can't believe
> it's truly that slow.

Ouch! I wonder what wxFileInputStream is doing to cause this much of a
performance hit.

>
> I've pushed the benchmark to Launchpad for those who are interested:
>
> https://code.launchpad.net/~john-j-beard/kicad/+git/kicad/+ref/io_benchmark

Thanks for testing this. Now we know to use wxInputStream objects
cautiously.

>
> As for your note about having a generic stream version, yes, that's
> more flexible and we should aim for that, if we were to provide a
> std::istream LINE_READER. I just did ifstream as a test to keep things
> clear and ensure a sensible comparison.

If you would make this change, I would commit your patch. I think it's
a good thing to have for performance testing purposes and for using the
file stream reader where it makes sense.

>
> As I said, the current performance is "OK", and if we want to limit
> line lengths, we probably can't get that for free, anyway.
>
> I understand the desire to not read infinite lines, but at least in my
> tests, the std::ifstream method, which has no limit for that, can deal
> with a 1GB file of a single line in about 300ms. Obviously it's all in
> disk cache, and you have to pay the allocation for it when reading
> into the buffer.
>
> All the existing LINE_READERs explode with IO_ERROR on that file since
> it's too long for them.
>
> Cheers,
>
> John
>
> On Fri, Feb 17, 2017 at 5:56 AM, Wayne Stambaugh <stambau...@gmail.com> wrote:
>> John,
>>
>> It would have been nice if you had benchmarked wxFileInputStream
>> as well. There already is an INPUTSTREAM_LINE_READER object which takes
>> a pointer to a wxInputStream object. I'm curious how it stacks up against
>> the std::ifstream. There are some interesting wxInputStream objects
>> that could prove useful.
>>
>> I think ifstream wasn't used in case there are really long lines, which
>> there can be if you have text objects with lots of long multiple line
>> strings in your files. I'm ok with adding a LINE_READER that wraps
>> istream objects.
>> It's fairly trivial to change LINE_READER types. It
>> might be a bit more flexible if you just provided an ISTREAM_LINE_READER
>> that takes any istream derived object rather than writing a separate
>> LINE_READER for each istream derivative.
>>
>> Cheers,
>>
>> Wayne
>>
>> On 2/16/2017 8:43 AM, John Beard wrote:
>>> Hi,
>>>
>>> I was trying to profile the eeschema slow library loads, and I got a
>>> bit distracted by RICHIO's FILE_LINE_READER.
>>>
>>> Internally, it uses a very tight loop of reading single chars at a
>>> time from a file descriptor, which looks inefficient. I wrote a
>>> benchmarker to compare RICHIO against std::ifstream and a new
>>> LINE_READER implementation, backed by std::ifstream. operf confirms
>>> that most of the time in RICHIO is burned in the ReadLine() function
>>> itself.
>>>
>>> The results were that RICHIO (in debug mode) is consistently 4-7 times
>>> slower than using std::ifstream, when reading eeschema library text
>>> files (so relatively short lines). Compiling the release version
>>> improved RICHIO's speed more than std::ifstream's, but it is still
>>> around 3 times slower than std::ifstream.
>>>
>>> For files with 1k lines, the slowdown is about 30 times (!) in debug
>>> and 14 times in release mode, so significantly worse. Few files read
>>> line-wise by KiCad look like that, however.
>>>
>>> Avoiding reconstructing the stream/LINE_READER each time doesn't have
>>> much of an effect in any case.
>>>
>>> Is there a particular reason why STL streams are not used in RICHIO?
>>> The only thing I think the example ifstream implementation can't do is
>>> catching overlong lines, but that's only used in one place: the VRML
>>> parser, which hardcodes an 8MB limit. ifstream could do this, but not
>>> with the simple getline function.
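
To make the line-limit point concrete, here is a minimal sketch of a reader that wraps any std::istream and rejects overlong lines. It is an illustration only: the class name ISTREAM_LINE_READER_SKETCH, its members, and the use of std::length_error are assumptions made for this example. It does not derive from KiCad's LINE_READER and does not throw IO_ERROR. It pulls characters out of the stream's own buffer instead of calling std::getline, so the limit can be checked while the line is still being read.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>

// Hypothetical stand-alone sketch; not KiCad's LINE_READER interface.
class ISTREAM_LINE_READER_SKETCH
{
public:
    ISTREAM_LINE_READER_SKETCH( std::istream& aStream, std::size_t aMaxLineLength ) :
            m_stream( aStream ), m_maxLineLength( aMaxLineLength )
    {
        assert( aMaxLineLength > 0 );
    }

    // Reads the next line into the internal buffer, without the trailing '\n'.
    // Returns false once the input is exhausted.
    // Throws std::length_error if a line is longer than the configured limit.
    bool ReadLine()
    {
        m_line.clear();

        std::streambuf* buf = m_stream.rdbuf();
        const int       eof = std::char_traits<char>::eof();
        bool            gotAny = false;

        // sbumpc() hands out characters from the stream's buffer one by one.
        for( int c = buf->sbumpc(); c != eof; c = buf->sbumpc() )
        {
            gotAny = true;

            if( c == '\n' )
                return true;

            if( m_line.size() >= m_maxLineLength )
                throw std::length_error( "line exceeds maximum length" );

            m_line.push_back( static_cast<char>( c ) );
        }

        // A final line without a trailing newline still counts as a line.
        return gotAny;
    }

    const std::string& Line() const { return m_line; }

private:
    std::istream& m_stream;
    std::size_t   m_maxLineLength;
    std::string   m_line;
};
```

Because the constructor takes a std::istream reference, the same class works with a std::ifstream, a std::istringstream, or any other istream-derived object. Whether the limit check costs anything measurable would need to be benchmarked.
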
>>>
>>> This performance doesn't appear to be a major bottleneck for me, but
>>> it does seem a shame to throw away (charitably) two thirds of file
>>> read speeds (and uncharitably, up to 97% in odd cases) if there is no
>>> particular reason to do so.
>>>
>>> As an aside, RICHIO appears to allocate twice as many times as
>>> std::ifstream when reading the same data, for roughly the same
>>> amount of memory in total.
>>>
>>> Anyway, I thought I'd share this finding! Please find attached the
>>> benchmark program, such as it is.
>>>
>>> Cheers,
>>>
>>> John
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~kicad-developers
>>> Post to     : kicad-developers@lists.launchpad.net
>>> Unsubscribe : https://launchpad.net/~kicad-developers
>>> More help   : https://help.launchpad.net/ListHelp
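
For anyone who wants to repeat this kind of comparison without the attached program, the following is a rough sketch of a timing harness. It is not the benchmark from the Launchpad branch, and it does not reproduce RICHIO's code. The function names (CountLinesCharByChar, CountLinesGetline, CountsAgree, TimeMs) are made up for this example, and the character-at-a-time loop goes through std::istream::get() only as a stand-in for a single-character read loop.

```cpp
#include <chrono>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts lines by extracting one character at a time from the stream.
std::size_t CountLinesCharByChar( std::istream& aIn )
{
    std::size_t lines = 0;
    bool        pending = false;   // true while inside a line not yet terminated
    char        ch;

    while( aIn.get( ch ) )
    {
        if( ch == '\n' )
        {
            ++lines;
            pending = false;
        }
        else
        {
            pending = true;
        }
    }

    // Count a last line that has no trailing newline.
    return lines + ( pending ? 1 : 0 );
}

// Counts lines by letting std::getline fill a reusable std::string.
std::size_t CountLinesGetline( std::istream& aIn )
{
    std::size_t lines = 0;
    std::string line;

    while( std::getline( aIn, line ) )
        ++lines;

    return lines;
}

// Sanity check on in-memory text: both strategies should agree before timing them.
bool CountsAgree( const std::string& aText )
{
    std::istringstream a( aText );
    std::istringstream b( aText );
    return CountLinesCharByChar( a ) == CountLinesGetline( b );
}

// Runs aFn once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double TimeMs( Fn&& aFn )
{
    const auto start = std::chrono::steady_clock::now();
    aFn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>( stop - start ).count();
}
```

To time a real file, open a fresh std::ifstream for each strategy and wrap the call in TimeMs, for example TimeMs( [&] { CountLinesGetline( in ); } ). Results depend heavily on build type and on whether the file is already in the disk cache, so several runs per case are advisable.
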