Georg Baum wrote:
> Up to now there have been two proposals how to fix the plain text output:
>
> 1) from Abdel: Still use narrow streams
>
> virtual int InsetBase::plaintext(Buffer const &, std::ostream & os,
> OutputParams const &) const;
>
> and implement operators for docstring and char_type output that convert to
> utf8:
>
> std::ostream & operator<<(std::ostream & os, lyx::char_type const &);
> std::ostream & operator<<(std::ostream & os, lyx::docstring const &);
>
>
> 2) from me: Change the stream type to use lyx::char_type as character type
> and do the conversion to utf8 in a special file stream:
>
> virtual int InsetBase::plaintext(Buffer const &,
> std::basic_ostream<lyx::char_type> & os, OutputParams const &) const;
>
> 1) is easy to implement, but it would either require ucs4 -> utf8 -> ucs4
> conversions or some code duplication/refactoring since plain text output
> is also used internally. 2) has some problems: gcc does not have useful
> std::locale::facet specializations for anything else than char and wchar_t
> character types. AFAICS the ctype (for all streams) and codecvt (for file
> streams) facets are the most important ones. BTW I wrote earlier that
> char_traits<lyx::char_type> were a problem, but that is only true for
> older gcc versions, and I already put the relevant parts from gcc 4.2 in
> docstring.h, so this problem is solved.
> I tried to pull the wchar_t specialization out of the relevant portions of
> libstdc++, but failed to create a working version for lyx::char_type. They
> are not only scattered over many files, they also appear in some internal
> init function that we can't modify of course, so I did not succeed.
>
> Unfortunately we need to solve these problems even if we are going to use
> solution 1), since otherwise we are not able to use stringstreams for
> docstring, and I don't think that we can live without them.
>
> At least on linux (and other OSes where sizeof(wchar_t) == 4) there is a
> very easy solution:
>
> typedef wchar_t lyx::char_type;
>
> Then we can easily use wide string streams, and also the conversion to utf8
> can easily be done transparently with something like the
> utf8_codecvt_facet in the attached file (which uses btw iconv with no
> copying of data). Fortunately Peter confirmed that the existing
> lyx::char_type works at least for stringstreams on windows.
>
> Peter and Abdel, can you please test whether the attached test program
> works (or could be made to work) on windows with lyx::char_type ==
> boost::uint32_t?
>
> If yes, then I'd like to put the attached patch in and proceed with
> solution 2) above.
>
>
> Georg
>
It compiles fine on msvc8 with the boost type. Here is the output:

test: abcd 228246252 20 e4 f6 fc 196214220 20 c4 d6 dc abcd

Peter

P.S.: It is really incredible that we have to implement an STL-like unicode
string class ourselves! There are billions of dollars in the software
industry, there is a worldwide open source community, there are billions of
people who really need 21 bits for their script - and LyX has to save the
world.