Hi, I was trying to profile the eeschema slow library loads, and I got a bit distracted by RICHIO's FILE_LINE_READER.
Internally, it uses a very tight loop of reading single chars at a time from a file descriptor, which looks inefficient. I wrote a benchmarker to compare RICHIO against std::ifstream and a new LINE_READER implementation, backed by std::ifstream. operf confirms that most of the time in RICHIO burned in the ReadLine() function itself. The results were that RICHIO (in debug mode) is consistently 4-7 times slower than using std::ifstream, when reading eeschema library text files (so relatively short lines). Compiling the release version improved RICHIOs speed more than std::ifstream's, but it is still around 3 times slower than std::ifstream. For files with 1k lines, the slowdown is about 30 times (!) in debug and 14 times in release mode, so significantly worse. Few files read line-wise by Kicad look like that, however. Avoiding reconstructing the stream/LINE_READER each time doesn't have much of an effect in any case. Is there a particular reason why STL streams are not used in RICHIO? The only thing I think the example ifstream implementation can't do is catching over long lines, but that's only used in one place: the VRML parser, which hardcodes an 8MB limit. ifstream could do this, but not with the simple getline function. This performance doesn't appear to be a major bottleneck for me, but it does seem a shame to throw away (charitably) two thirds of file read speeds (and uncharitably, up to 97% in odd cases) if there is no particular reason to do so. As an aside, RICHIO appears to allocate twice as many times as std::ifstream when reading the same data, for the roughly the same amount of memory in total. Anyway, I thought I'd share this finding! Please find attached the benchmark program, such as it is. Cheers, John
From 9030890b9ad86ff2601a0488ef6e0003dfbb2e66 Mon Sep 17 00:00:00 2001 From: John Beard <john.j.be...@gmail.com> Date: Wed, 15 Feb 2017 16:47:41 +0800 Subject: [PATCH] IO benchmark program: RICHIO vs std::fstream Also vs fstream-backed LINE_READER. --- qa/CMakeLists.txt | 2 + qa/io_benchmark/CMakeLists.txt | 17 ++++ qa/io_benchmark/fstream_io.cpp | 77 +++++++++++++++ qa/io_benchmark/fstream_io.h | 55 +++++++++++ qa/io_benchmark/io_benchmark.cpp | 199 +++++++++++++++++++++++++++++++++++++++ 5 files changed, 350 insertions(+) create mode 100644 qa/io_benchmark/CMakeLists.txt create mode 100644 qa/io_benchmark/fstream_io.cpp create mode 100644 qa/io_benchmark/fstream_io.h create mode 100644 qa/io_benchmark/io_benchmark.cpp diff --git a/qa/CMakeLists.txt b/qa/CMakeLists.txt index bd579a597..a56f15f51 100644 --- a/qa/CMakeLists.txt +++ b/qa/CMakeLists.txt @@ -9,3 +9,5 @@ if( KICAD_SCRIPTING_MODULES ) ) endif() + +add_subdirectory( io_benchmark ) diff --git a/qa/io_benchmark/CMakeLists.txt b/qa/io_benchmark/CMakeLists.txt new file mode 100644 index 000000000..fde89c9d2 --- /dev/null +++ b/qa/io_benchmark/CMakeLists.txt @@ -0,0 +1,17 @@ + +include_directories( BEFORE ${INC_BEFORE} ) + +set( IOBENCHMARK_SRCS + io_benchmark.cpp + fstream_io.cpp +) + +add_executable( io_benchmark + ${IOBENCHMARK_SRCS} +) + +target_link_libraries( io_benchmark + common + ${wxWidgets_LIBRARIES} +) + diff --git a/qa/io_benchmark/fstream_io.cpp b/qa/io_benchmark/fstream_io.cpp new file mode 100644 index 000000000..420f57348 --- /dev/null +++ b/qa/io_benchmark/fstream_io.cpp @@ -0,0 +1,77 @@ +/* + * This program source code file is part of KiCad, a free EDA CAD application. + * + * Copyright (C) 1992-2017 KiCad Developers, see AUTHORS.txt for contributors. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, you may find one here: + * http://www.gnu.org/licenses/old-licenses/gpl-2.0.html + * or you may search the http://www.gnu.org website for the version 2 license, + * or you may write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA + */ + +#include "fstream_io.h" + +#include <ios> + +FSTREAM_LINE_READER::FSTREAM_LINE_READER( const wxString& aFileName, + unsigned aStartingLineNumber, + unsigned aMaxLineLength ) throw( IO_ERROR ) : + LINE_READER( aMaxLineLength ), + m_stream( aFileName ) +{ + line = &m_buffer[0]; + + if( !m_stream.is_open() ) + { + wxString msg = wxString::Format( + _( "Unable to open filename '%s' for reading" ), aFileName.GetData() ); + THROW_IO_ERROR( msg ); + } + + source = aFileName; + lineNum = aStartingLineNumber; +} + +FSTREAM_LINE_READER::~FSTREAM_LINE_READER() +{ + // this is only a view into a a string, it cant be deleted by the base + line = nullptr; +} + + +char* FSTREAM_LINE_READER::ReadLine() throw( IO_ERROR ) +{ + if( getline( m_stream, m_buffer ) ) + { + length = m_buffer.size(); + line = (char*) m_buffer.data(); //ew why no const?? + } + else + { + length = 0; + } + + // lineNum is incremented even if there was no line read, because this + // leads to better error reporting when we hit an end of file. + ++lineNum; + + return length ? line : nullptr; +} + +void FSTREAM_LINE_READER::Rewind() +{ + m_stream.clear() ; + m_stream.seekg(0, std::ios::beg ); +} diff --git a/qa/io_benchmark/fstream_io.h b/qa/io_benchmark/fstream_io.h new file mode 100644 index 000000000..dbb5d7cab --- /dev/null +++ b/qa/io_benchmark/fstream_io.h @@ -0,0 +1,55 @@ +/* + * This program source code file is part of KiCad, a free EDA CAD application. + * + * Copyright (C) 2017 KiCad Developers, see AUTHORS.txt for contributors. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, you may find one here: + * http://www.gnu.org/licenses/old-licenses/gpl-2.0.html + * or you may search the http://www.gnu.org website for the version 2 license, + * or you may write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA + */ + +#ifndef FSTREAM_LINE_READER_H +#define FSTREAM_LINE_READER_H + +#include <richio.h> + +#include <fstream> + +class FSTREAM_LINE_READER : public LINE_READER +{ +public: + + FSTREAM_LINE_READER( const wxString& aFileName, + unsigned aStartingLineNumber = 0, + unsigned aMaxLineLength = LINE_READER_LINE_DEFAULT_MAX ) throw( IO_ERROR ); + + ~FSTREAM_LINE_READER(); + + char* ReadLine() throw( IO_ERROR ) override; + + /** + * Function Rewind + * rewinds the file and resets the line number back to zero. Line number + * will go to 1 on first ReadLine(). + */ + void Rewind(); + +private: + std::string m_buffer; + std::ifstream m_stream; +}; + +#endif // FSTREAM_LINE_READER_H diff --git a/qa/io_benchmark/io_benchmark.cpp b/qa/io_benchmark/io_benchmark.cpp new file mode 100644 index 000000000..a0035ee92 --- /dev/null +++ b/qa/io_benchmark/io_benchmark.cpp @@ -0,0 +1,199 @@ + +#include <wx/wx.h> +#include <richio.h> + +#include <chrono> +#include <ios> +#include <functional> + +#include <fstream> + +#include "fstream_io.h" + +enum RET_CODES +{ + BAD_ARGS = 1, +}; + +using CLOCK = std::chrono::steady_clock; +using TIME_PT = std::chrono::time_point<CLOCK>; + +static int getMs( const TIME_PT& start, const TIME_PT& end ) +{ + using std::chrono::milliseconds; + using std::chrono::duration_cast; + + milliseconds diff = duration_cast<milliseconds>( end - start ); + + return diff.count(); +} + + +struct BENCH_REPORT +{ + int linesRead; + int charAcc; +}; + +using BENCH_FUNC = std::function<void(const wxString&, int, BENCH_REPORT&)>; + +struct BENCHMARK +{ + BENCH_FUNC func; + wxString name; +}; + + +static void bench_fstream( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + std::string line; + + for( int i = 0; i < aReps; ++i) + { + std::ifstream fstr( aFile ); + + while( getline( fstr, line ) ) + { + report.linesRead++; + report.charAcc += line[0]; + } + + fstr.close(); + } +} + +static void bench_fstream_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + std::string line; + std::ifstream fstr( aFile ); + + for( int i = 0; i < aReps; ++i) + { + while( getline( fstr, line ) ) + { + report.linesRead++; + report.charAcc += line[0]; + } + fstr.clear() ; + fstr.seekg(0, std::ios::beg) ; + } + + fstr.close(); +} + +static void bench_richio( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + for( int i = 0; i < aReps; ++i) + { + FILE_LINE_READER flr( aFile ); + while( flr.ReadLine() ) + { + report.linesRead++; + report.charAcc += flr.Line()[0]; + } + } +} + +static void bench_richio_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + FILE_LINE_READER flr( aFile ); + for( int i = 0; i < aReps; ++i) + { + while( flr.ReadLine() ) + { + report.linesRead++; + report.charAcc += flr.Line()[0]; + } + + flr.Rewind(); + } +} + +static void bench_fstream_richio( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + for( int i = 0; i < aReps; ++i) + { + FSTREAM_LINE_READER fstr( aFile ); + while( fstr.ReadLine() ) + { + report.linesRead++; + report.charAcc += fstr.Line()[0]; + } + } +} + +static void bench_fstream_richio_reuse( const wxString& aFile, int aReps, BENCH_REPORT& report ) +{ + FSTREAM_LINE_READER fstr( aFile ); + for( int i = 0; i < aReps; ++i) + { + + while( fstr.ReadLine() ) + { + report.linesRead++; + report.charAcc += fstr.Line()[0]; + } + + fstr.Rewind(); + } +} + +int main( int argc, char* argv[] ) +{ + if (argc < 3) + { + wxLogDebug("Usage: %s <FILE> <REPS> [r/f/R/F]", argv[0] ); + return BAD_ARGS; + } + + wxString inFile( argv[1] ); + + long reps = 0; + wxString( argv[2] ).ToLong( &reps ); + + // get the benchmark to do, or 0 for all of them + char bench = 0; + if ( argc == 4 ) + bench = argv[3][0]; + + wxLogDebug("IO Bench Mark Util"); + + wxLogDebug( " Benchmark file: %s", inFile ); + wxLogDebug( " Repetitions: %d", (int) reps ); + + std::vector<BENCHMARK> benches; + + + if( !bench || bench == 'f' ) + benches.push_back( { bench_fstream, "std::fstream" } ); + + if( !bench || bench == 'F' ) + benches.push_back( { bench_fstream_reuse, "std::fstream, reused" } ); + + if ( !bench || bench == 'r' ) + benches.push_back( { bench_richio, "RICHIO" } ); + + if ( !bench || bench == 'R' ) + benches.push_back( { bench_richio_reuse, "RICHIO, reused" } ); + + if ( !bench || bench == 'n' ) + benches.push_back( { bench_fstream_richio, "New fstream IO" } ); + + if ( !bench || bench == 'N' ) + benches.push_back( { bench_fstream_richio_reuse, "New fstream IO, reused" } ); + + + for( auto& bmark : benches ) + { + BENCH_REPORT report = {}; + + TIME_PT start = CLOCK::now(); + bmark.func( inFile, reps, report ); + TIME_PT end = CLOCK::now(); + + wxLogDebug( "%25s %d lines, acc: %d in %d ms", + bmark.name, report.linesRead, report.charAcc, getMs( start, end ) ); + } + + return 0; +} -- 2.11.0
_______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp