Re: [patch] fix plain text output

2006-08-22 Thread Andre Poenitz
On Sun, Aug 20, 2006 at 11:34:43AM +0200, Abdelrazak Younes wrote: > >>There's an added benefit if we go the basic_string way: I think most > >>compilers (gcc, msvc) now do implicit sharing on strings so passing > >>parameters won't be as costly as with std::vector(). > > > >I doubt any recent co

Re: [patch] fix plain text output

2006-08-22 Thread Andre Poenitz
On Sun, Aug 20, 2006 at 03:06:25PM +0200, Lars Gullik Bjønnes wrote: > Andre Poenitz <[EMAIL PROTECTED]> writes: > > | On Wed, Aug 16, 2006 at 10:34:52PM +0200, Lars Gullik Bjønnes wrote: > | > | I know it's a bit late to voice my opinion but I think it should have > been: > | > > | > Yes. I hav

Re: [patch] fix plain text output

2006-08-21 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Andre Poenitz wrote: | > On Wed, Aug 16, 2006 at 05:05:53PM +0200, Abdelrazak Younes wrote: | >> Abdelrazak Younes wrote: | Here comes the next bit: I discovered that the result of | | std::vector ucs4_to_u

Re: [patch] fix plain text output

2006-08-20 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Andre Poenitz wrote: | > On Wed, Aug 16, 2006 at 05:05:53PM +0200, Abdelrazak Younes wrote: | >> Abdelrazak Younes wrote: | Here comes the next bit: I discovered that the result of | | std::vector ucs4_to_utf8(boost::uint32_t c) |

Re: [patch] fix plain text output

2006-08-20 Thread Martin Vermeer
On Fri, Aug 18, 2006 at 01:45:23AM +0200, Andre Poenitz wrote: > On Wed, Aug 16, 2006 at 10:34:52PM +0200, Lars Gullik Bjønnes wrote: > > | I know it's a bit late to voice my opinion but I think it should have > > been: > > > > Yes. I have been calling on help on the unicode branch for months...

Re: [patch] fix plain text output

2006-08-20 Thread Lars Gullik Bjønnes
Andre Poenitz <[EMAIL PROTECTED]> writes: | On Wed, Aug 16, 2006 at 10:34:52PM +0200, Lars Gullik Bjønnes wrote: | > | I know it's a bit late to voice my opinion but I think it should have been: | > | > Yes. I have been calling on help on the unicode branch for months... | | Could we take a not

Re: [patch] fix plain text output

2006-08-20 Thread Abdelrazak Younes
Andre Poenitz wrote: On Wed, Aug 16, 2006 at 05:05:53PM +0200, Abdelrazak Younes wrote: Abdelrazak Younes wrote: Here comes the next bit: I discovered that the result of std::vector ucs4_to_utf8(boost::uint32_t c) was never used as a vector. I changed it to std::string, and that simplifies

Re: [patch] fix plain text output

2006-08-20 Thread Andre Poenitz
On Wed, Aug 16, 2006 at 05:05:53PM +0200, Abdelrazak Younes wrote: > Abdelrazak Younes wrote: > >>Here comes the next bit: I discovered that the result of > >> > >>std::vector ucs4_to_utf8(boost::uint32_t c) > >> > >>was never used as a vector. I changed it to std::string, and that > >>simplifies

Re: [patch] fix plain text output

2006-08-19 Thread Andre Poenitz
On Wed, Aug 16, 2006 at 10:34:52PM +0200, Lars Gullik Bjønnes wrote: > | I know it's a bit late to voice my opinion but I think it should have been: > > Yes. I have been calling on help on the unicode branch for months... Could we take a note for the future that working in branches does not reall

Re: [patch] fix plain text output

2006-08-18 Thread Helge Hafting
Lars Gullik Bjønnes wrote: Helge Hafting <[EMAIL PROTECTED]> writes: | Angus Leeming wrote: | > UTF-8 is a multi-byte encoding. It's useful for output to file | > because the data are stored as characters (bytes). So, much of a | > UTF-8 encoded file will be human readable; only the multi-byte |

Re: [patch] fix plain text output

2006-08-17 Thread Lars Gullik Bjønnes
Helge Hafting <[EMAIL PROTECTED]> writes: | Angus Leeming wrote: | > UTF-8 is a multi-byte encoding. It's useful for output to file | > because the data are stored as characters (bytes). So, much of a | > UTF-8 encoded file will be human readable; only the multi-byte | > sequences will not. | > |

Re: [patch] fix plain text output

2006-08-17 Thread Helge Hafting
Angus Leeming wrote: UTF-8 is a multi-byte encoding. It's useful for output to file because the data are stored as characters (bytes). So, much of a UTF-8 encoded file will be human readable; only the multi-byte sequences will not. Actually, the multibyte sequences are human readable too, if

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> If you give a nice name to ascii_guill I'd think that the code Lars> could be even clearer than it is now. Lars> I am not sure that you can have you cake and eat it at all Lars> times... unicode is more cumbersome to work with.

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
On Wednesday, 16 August 2006 23:40, Lars Gullik Bjønnes wrote: > Georg Baum <[EMAIL PROTECTED]> writes: > > | Does this mean we have now agreed on using docstring always when dealing > | with multibyte strings? > > when not close to the converter itself, imho yes. OK. I am currently working o

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Georg Baum <[EMAIL PROTECTED]> writes: | On Wednesday, 16 August 2006 23:12, Lars Gullik Bjønnes wrote: | > Change the code so that those conversions are not needed, don't change | > the conversions. | | Does this mean we have now agreed on using docstring always when dealing | with multibyte st

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Lars> if (disp == ascii_guill) ... | | This I do not like much. This hides the meaning of the code, | especially since I have a | else if (disp == ">>") | two lines after. A piece of trivial code is going to be changed to | something not so nic

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
On Wednesday, 16 August 2006 23:12, Lars Gullik Bjønnes wrote: > Change the code so that those conversions are not needed, don't change > the conversions. Does this mean we have now agreed on using docstring always when dealing with multibyte strings? I have had a closer look, and noticed that we

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> So imho if docstring should change to anything as of now it is a | Lars> std::vector | | Is it a threat? ;) Yes. Stop bickering about the b

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Jean-Marc Lasgouttes wrote: | >> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | > Lars> So imho if docstring should change to anything as of now it is | > a | > Lars> std::vector | > Is it a threat? ;) | | No, just Lars reinventing

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Here is a Lars> different (related) question. In the insetquote code there | is Lars> this | if (disp == "<<") | code. How should I change it if disp Lars> is a docstring so tha

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | > ucs-2 with qt. | > | | so no utf8 here. | > | | > utf-8/ucs-2 with pango. | > | | So utf8 is not necessary there also as pango deals per

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> So imho if docstring should change to anything as of now it is a | Lars> std::vector | | Is it a threat? ;) Yes. Stop bickering about the basic_string already! --

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | > ucs-2 with qt. | > | | so no utf8 here. | > | | > utf-8/ucs-2 with pango. | > | | So utf8 is not necessary there also as pango deals perfectly | > with ucs2 | > | (a

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Here is a different (related) question. In the insetquote code there | is this | if (disp == "<<") | code. How should I change it if disp is a docstring so that it still | fits on one line. How do I change a C string to something that | compares
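
One way to keep the insetquote comparison on a single line once disp is a wide string can be sketched as follows. This is only an illustration: std::u32string stands in for docstring, and ascii_to_docstring and is_left_guillemet are hypothetical names, not functions taken from the LyX sources.

```cpp
#include <string>

// Stand-in for docstring: a string of 32-bit (ucs-4) code units.
// Assumption: the real docstring typedef may differ.
typedef std::u32string docstring_sketch;

// Hypothetical helper: widen a plain ASCII C string to the wide type.
// Only valid for 7-bit ASCII input; no encoding conversion is done.
docstring_sketch ascii_to_docstring(char const * s)
{
	docstring_sketch out;
	for (; *s; ++s)
		out += static_cast<char32_t>(static_cast<unsigned char>(*s));
	return out;
}

// The comparison from the thread, still fitting on one line.
bool is_left_guillemet(docstring_sketch const & disp)
{
	return disp == ascii_to_docstring("<<");
}
```

The helper costs one temporary string per comparison; a named constant (in the spirit of the ascii_guill suggestion elsewhere in the thread) would avoid that.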

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > ucs-2 with qt. | | so no utf8 here. | | > utf-8/ucs-2 with pango. | | So utf8 is not necessary there also as pango deals perfectly with ucs2 | (and 4?) What is your point really? My point is that you don't really

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | I'd like to find a syntax such that the translation of our code to | unicode does not transform every line into 3 lines (like the examples | where we push \0 explicitly). That is easy: just make everything use docstring and char_type. Why do yo

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | That said, I'd appreciate to work on strings instead of vectors for | utf-8... Lars> Huh? Where do you plan to work on utf-8 at all? OK, I got it wrong. Here is a different (related) question. In the insetquote code there is thi

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Jean-Marc Lasgouttes wrote: "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> So imho if docstring should change to anything as of now it is a Lars> std::vector Is it a threat? ;) No, just Lars reinventing basic_string with vector ;-) If you look at the STL code basic_string i

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> However our internal interface is in message.C and from | Lars> Message::get we can perfectly well output a docstring instead of | Lars> a string. (and thus ucs-4) | | Yes

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > ucs-2 with qt. | | so no utf8 here. | | > utf-8/ucs-2 with pango. | | So utf8 is not necessary there also as pango deals perfectly with ucs2 | (and 4?) What is your point really? | > What aspell can use I have no | > idea about. (it can use uc

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> So imho if docstring should change to anything as of now it is a Lars> std::vector Is it a threat? ;) JMarc

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> | why do we use unsigned short instead of boost::uint16_t here. Lars> I know | they are the same, but wouldn't it be clearer? Lars> Perhaps. (but of course we don't have a basic_string short> anyway...) I thought about the vec

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | That's exactly what this means but, sure, that is just my opinion. | > You | > | obviously are in love with your vector solution ;-) | > I

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> However our internal interface is in message.C and from Lars> Message::get we can perfectly well output a docstring instead of Lars> a string. (and thus ucs-4) Yes, and if it is costly, we'll change it later on. But if our po

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Abdelrazak" == Abdelrazak Younes <[EMAIL PROTECTED]> writes: | | Abdelrazak> I know it's a bit late to voice my opinion but I think it | Abdelrazak> should have been: | | Off-topic questions: | | Abdelrazak> typedef std::basic_string<unsigned short> ucs

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | > Both ucs2 and ucs4 use a fixed number of bytes for one character | > (2 | > | > and 4, respectively, surprise, surprise!). The problem i

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | That's exactly what this means but, sure, that is just my opinion. | > You | > | obviously are in love with your vector solution ;-) | > If the only semantics are "bun

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Abdelrazak" == Abdelrazak Younes <[EMAIL PROTECTED]> writes: Abdelrazak> I know it's a bit late to voice my opinion but I think it Abdelrazak> should have been: Off-topic questions: Abdelrazak> typedef std::basic_string ucs2_string; why do we use unsigned short instead of boost::uint16_t

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Abdelrazak" == Abdelrazak Younes <[EMAIL PROTECTED]> writes: | | >> Or communicating with other libs' APIs. | | Abdelrazak> Which one? | | gettext at least. Gettext msgids should be ASCII. (And we require them to be ASCII..) Gettext o

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | That's exactly what this means but, sure, that is just my opinion. You | obviously are in love with your vector solution ;-) If the only semantics are "bunch of bytes" etc. then a vector is correct. Sure but in this ca

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Abdelrazak" == Abdelrazak Younes <[EMAIL PROTECTED]> writes: >> Or communicationg with other libs api's. Abdelrazak> Which one? gettext at least. JMarc

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > | > Both ucs2 and ucs4 use a fixed number of bytes for one character | > (2 | > | > and 4, respectively, surprise, surprise!). The problem is a | > | > variable-byte enc

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > Both ucs2 and ucs4 use a fixed number of bytes for one character (2 | > and 4, respectively, surprise, surprise!). The problem is a | > variable-byte encoding such as utf8. | | Yes I understood that far, sorry for "qui

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | That's exactly what this means but, sure, that is just my opinion. You | obviously are in love with your vector solution ;-) If the only semantics are "bunch of bytes" etc. then a vector is correct. For passing ucs-4 strings around we already have

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > UTF-8 is a multi-byte encoding. It's useful for output to file | > because the data are stored as characters (bytes). So, much of a | > UTF-8 encoded file will be human readable; only the multi-byte | > sequences will n

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > UTF-8 is a multi-byte encoding. It's useful for output to file | > because the data are stored as characters (bytes). So, much of a | > UTF-8 encoded file will be human readable; only the multi-byte | > sequences will not. | > Storing UTF-8 encoded

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | > Both ucs2 and ucs4 use a fixed number of bytes for one character (2 | > and 4, respectively, surprise, surprise!). The problem is a | > variable-byte encoding such as utf8. | | Yes I understood that far, sorry for "quiproquo". IMHO, the only code

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Angus Leeming wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: Hum... I am not sure I follow everything but let me summarize what I understand from current code. The std::vectors I am talking about are: * vector: could be replaced by std::basic_string * vector: that is ucs2 right? That could b

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Georg Baum wrote: On Wednesday, 16 August 2006 18:41, Abdelrazak Younes wrote: Hum... I am not sure I follow everything but let me summarize what I understand from current code. The std::vectors I am talking about are: * vector: could be replaced by std::basic_string * vector: that is ucs2 right?

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Georg Baum wrote: | > On Wednesday, 16 August 2006 18:12, Abdelrazak Younes wrote: | >> Lars Gullik Bjønnes wrote: | >> | >>> string.length() will be lying to you when you store utf-8 in it. | >> Why is that? Because of some trailing \0? | > No. ut

Re: [patch] fix plain text output

2006-08-16 Thread Angus Leeming
Abdelrazak Younes <[EMAIL PROTECTED]> writes: > Hum... I am not sure I follow everything but let me summarize what I > understand from current code. The std::vectors I am talking about are: > > * vector: could be replaced by std::basic_string > * vector: that is ucs2 right? That could be replaced by

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
On Wednesday, 16 August 2006 18:41, Abdelrazak Younes wrote: > Hum... I am not sure I follow everything but let me summarize what I > understand from current code. The std::vectors I am talking about are: > > * vector: could be replaced by std::basic_string > * vector: that is ucs2 right? That could

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Georg Baum wrote: On Wednesday, 16 August 2006 18:12, Abdelrazak Younes wrote: Lars Gullik Bjønnes wrote: string.length() will be lying to you when you store utf-8 in it. Why is that? Because of some trailing \0? No. utf8 is a multibyte encoding: Some characters use one byte, some two and

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
On Wednesday, 16 August 2006 18:01, Lars Gullik Bjønnes wrote: > Georg Baum <[EMAIL PROTECTED]> writes: > | Here comes the next bit: I discovered that the result of > | > | std::vector ucs4_to_utf8(boost::uint32_t c) > | > | was never used as a vector. I changed it to std::string, and that simp

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
On Wednesday, 16 August 2006 18:12, Abdelrazak Younes wrote: > Lars Gullik Bjønnes wrote: > > > string.length() will be lying to you when you store utf-8 in it. > > Why is that? Because of some trailing \0? No. utf8 is a multibyte encoding: Some characters use one byte, some two and some even m
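
The point about string.length() can be made concrete with a small sketch. It assumes the input is valid UTF-8; utf8_codepoints is a hypothetical helper written for this illustration, not a function from unicode.h.

```cpp
#include <cstddef>
#include <string>

// Count the characters (code points) in a UTF-8 encoded std::string.
// Continuation bytes have the bit pattern 10xxxxxx and are skipped, so
// only the first byte of each sequence is counted.
// Assumption: the input is valid UTF-8; no validation is done.
std::size_t utf8_codepoints(std::string const & s)
{
	std::size_t n = 0;
	for (std::string::size_type i = 0; i < s.size(); ++i) {
		unsigned char const c = static_cast<unsigned char>(s[i]);
		if ((c & 0xC0) != 0x80)
			++n;
	}
	return n;
}
```

For the UTF-8 bytes of "Bjønnes" (the ø is the two bytes C3 B8), length() reports 8 while the helper reports 7, which is the mismatch described above.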

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Abdelrazak Younes wrote: | >> Here comes the next bit: I discovered that the result of | >> | >> std::vector ucs4_to_utf8(boost::uint32_t c) | >> | >> was never used as a vector. I changed it to std::string, and that | >> simplifies | >> the code. In

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Lars Gullik Bjønnes wrote: Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Georg Baum wrote: | > Lars Gullik Bjønnes wrote: | > | >> Conversion between the different unicode encodings are pretty cheap. | > Yes, but what I am more concerned about are lots of ucs4_to_utf8 or | > vice | > versa in

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Georg Baum <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | | > Conversion between the different unicode encodings are pretty cheap. | | Yes, but what I am more concerned about are lots of ucs4_to_utf8 or vice | versa in the code. That just makes it a bit less readable. | | > | Since

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Abdelrazak Younes <[EMAIL PROTECTED]> writes: | Georg Baum wrote: | > Lars Gullik Bjønnes wrote: | > | >> Conversion between the different unicode encodings are pretty cheap. | > Yes, but what I am more concerned about are lots of ucs4_to_utf8 or | > vice | > versa in the code. That just makes it

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Abdelrazak Younes wrote: Here comes the next bit: I discovered that the result of std::vector ucs4_to_utf8(boost::uint32_t c) was never used as a vector. I changed it to std::string, and that simplifies the code. In particular it removes manual fiddling with the terminating '\0', which we sho
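
What a single-character conversion returning std::string might look like is sketched below. It is not the code from the patch: ucs4_to_utf8_sketch is a hypothetical name, std::uint32_t replaces boost::uint32_t to avoid the Boost dependency, and the input is assumed to be a valid code point (at most 0x10FFFF, no surrogate check).

```cpp
#include <cstdint>
#include <string>

// Encode one ucs-4 code point as UTF-8 and return it as a std::string.
// std::string tracks its own length, so no manual '\0' handling is needed.
// Assumption: c <= 0x10FFFF; surrogates are not rejected.
std::string ucs4_to_utf8_sketch(std::uint32_t c)
{
	std::string out;
	if (c < 0x80) {
		// 1 byte: 0xxxxxxx
		out += static_cast<char>(c);
	} else if (c < 0x800) {
		// 2 bytes: 110xxxxx 10xxxxxx
		out += static_cast<char>(0xC0 | (c >> 6));
		out += static_cast<char>(0x80 | (c & 0x3F));
	} else if (c < 0x10000) {
		// 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
		out += static_cast<char>(0xE0 | (c >> 12));
		out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
		out += static_cast<char>(0x80 | (c & 0x3F));
	} else {
		// 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
		out += static_cast<char>(0xF0 | (c >> 18));
		out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
		out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
		out += static_cast<char>(0x80 | (c & 0x3F));
	}
	return out;
}
```

Callers can append the result directly to an output string or stream, which is the simplification the message refers to.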

Re: [patch] fix plain text output

2006-08-16 Thread Abdelrazak Younes
Georg Baum wrote: Lars Gullik Bjønnes wrote: Conversion between the different unicode encodings are pretty cheap. Yes, but what I am more concerned about are lots of ucs4_to_utf8 or vice versa in the code. That just makes it a bit less readable. | Since the po | files will eventually be in

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
Lars Gullik Bjønnes wrote: > Conversion between the different unicode encodings are pretty cheap. Yes, but what I am more concerned about are lots of ucs4_to_utf8 or vice versa in the code. That just makes it a bit less readable. > | Since the po > | files will eventually be in utf8 it seems nat

Re: [patch] fix plain text output

2006-08-16 Thread Angus Leeming
Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: > Actually I guess that using a mix of utf-8 and ucs-4 is the cop-out to > have everything work as soon as possible. > A full change to ucs-4 will require more code changes than a mix. Ok, you're being practical. That I understand even if I'm uncomf

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | > Georg Baum <[EMAIL PROTECTED]> writes: | | > | > Should a call to gettext (_()) give us utf8 or ucs4?, so far I am | > | > inclined to go for utf8. | > | If we only knew which variant results in less

Re: [patch] fix plain text output

2006-08-16 Thread Angus Leeming
Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: > Georg Baum <[EMAIL PROTECTED]> writes: > | > Should a call to gettext (_()) give us utf8 or ucs4?, so far I am > | > inclined to go for utf8. > | If we only knew which variant results in less conversions. > Conversion between the different unicode

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Georg Baum <[EMAIL PROTECTED]> writes: | > Should a call to gettext (_()) give us utf8 or ucs4?, so far I am | > inclined to go for utf8. | | If we only knew which variant results in less conversions. Conversion between the different unicode encodings are pretty cheap. | Since the po | files wi

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Georg" == Georg Baum <[EMAIL PROTECTED]> writes: | | | Or should we not change the type, but use utf8 as encoding instead? | | I believe the former is safer. | | >> This is one of the things I am thinking about... esp. in rel. to | >> get

Re: [patch] fix plain text output

2006-08-16 Thread Jean-Marc Lasgouttes
> "Georg" == Georg Baum <[EMAIL PROTECTED]> writes: | Or should we not change the type, but use utf8 as encoding instead? | I believe the former is safer. >> This is one of the things I am thinking about... esp. in rel. to >> gettext and l10n. In general, we should declare what code should d

Re: [patch] fix plain text output

2006-08-16 Thread Georg Baum
Lars Gullik Bjønnes wrote: > Georg Baum <[EMAIL PROTECTED]> > writes: > > So far I have only created what I needed. But even if we add more > convenience functions we should be careful when adding them, we do not > want too many imho. Yes. We'll see what is useful as the conversion goes on. > | O

Re: [patch] fix plain text output

2006-08-16 Thread Lars Gullik Bjønnes
Georg Baum <[EMAIL PROTECTED]> writes: | This small patch makes most of plain text readable again (in utf8). | | Questions: | | 1) Is it on purpose that the functions in unicode.h convert only between | std::vectors of characters and C strings, but not std::string/docstring? I | think we should