Re: ICU - uneasy feeling

2005-10-14 Thread Angus Leeming
Jean-Marc Lasgouttes wrote: > Lars> Yes. But this is more on the input side and display side of > Lars> things. For storage we will have to support combining chars > even Lars> for european languages, but we don't have to do that as > step Lars> one. > > By combining chars you mean letter+accent,

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
John Levon <[EMAIL PROTECTED]> writes: | Baby steps is definitely fine, but we should at least /aim/ to get this | stuff done in the first attempt... agree. | > Finishing this step and removing all code now rendered cruft, might | > also give us a better position to move forward with combining |

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> Yes. But this is more on the input side and display side of | Lars> things. For storage we will have to support combining chars even | Lars> for european languages, but we

Re: ICU - uneasy feeling

2005-10-14 Thread John Levon
On Fri, Oct 14, 2005 at 02:58:01AM +0200, Lars Gullik Bjønnes wrote: > | This seems a horribly euro-centric point of view. (Says the guy who can > | only speak one language...) > > I agree with that, but I also agree with Asger, that this won't be > much worse than what we have right now. And we

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> Yes. But this is more on the input side and display side of Lars> things. For storage we will have to support combining chars even Lars> for european languages, but we don't have to do that as step Lars> one. By combining chars

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> I'll use it for inspiration... we cannot use it as a starting Lars> point it seems. No, but it points to many of the parts of the code that need attention. What might be a goal for 1.5.0 is European languages + Hebrew + Arabic +

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > | Very good. How does this compare to the CJK LyX patch? Did you look at | > | it? | > Where can I find the most up to date CJK LyX patch? | | ftp://ftp.u-aizu.ac.jp/pub/tex/cjk-lyx/qt/CJK-LyX-qt-1.3.6-1.patch I'll use it

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Angus" == Angus Leeming <[EMAIL PROTECTED]> writes: Angus> Lars Gullik Bjønnes wrote: >> | Very good. How does this compare to the CJK LyX patch? Did you >> look at | it? Where can I find the most up to date CJK LyX patch? Angus> ftp://ftp.u-aizu.ac.jp/pub/tex/cjk-lyx/qt/CJK-LyX-qt-1.3.6-1

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Jose' Matos <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | | > | #LyX 1.3 created this file. For more info see http://www.lyx.org/ | > | \lyxformat 221 | > | | > | is unchanged in UTF-8. We'll just need to read the \lyxformat to | > | ascertain whether the rest of the file is encode

Re: ICU - uneasy feeling

2005-10-14 Thread Angus Leeming
Lars Gullik Bjønnes wrote: > | Very good. How does this compare to the CJK LyX patch? Did you look at > | it? > Where can I find the most up to date CJK LyX patch? ftp://ftp.u-aizu.ac.jp/pub/tex/cjk-lyx/qt/CJK-LyX-qt-1.3.6-1.patch -- Angus

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Lars> Very good. How does this compare to the CJK LyX patch? Did you Lars> look at | it? Lars> Where can I find the most up to date CJK LyX patch? Look there maybe: http://www

Re: ICU - uneasy feeling

2005-10-14 Thread Jose' Matos
Lars Gullik Bjønnes wrote: > | #LyX 1.3 created this file. For more info see http://www.lyx.org/ > | \lyxformat 221 > | > | is unchanged in UTF-8. We'll just need to read the \lyxformat to > | ascertain whether the rest of the file is encoded in UTF-8 and, if not, > | use Python's unicode stuff t
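The check described in this message can be sketched in Python, the language lyx2lyx is written in. The header lines are plain ASCII, so they read the same whether the file is in a legacy 8-bit encoding or in UTF-8. The format number at which files become UTF-8 (`FIRST_UTF8_FORMAT`) and the fallback encoding (`LEGACY_ENCODING`) are hypothetical placeholders; the thread names neither.

```python
import re

# Hypothetical values: the thread names neither the first UTF-8 file
# format number nor the legacy encoding to fall back to.
FIRST_UTF8_FORMAT = 1000
LEGACY_ENCODING = "latin-1"

# The \lyxformat line is pure ASCII, so it can be matched on raw bytes
# before the encoding of the rest of the file is known.
FORMAT_LINE = re.compile(rb"^\\lyxformat\s+(\d+)\s*$", re.MULTILINE)


def read_format(raw: bytes) -> int:
    """Return the number on the \\lyxformat line of a raw LyX file."""
    match = FORMAT_LINE.search(raw)
    if match is None:
        raise ValueError("no \\lyxformat line found")
    return int(match.group(1))


def decode_lyx(raw: bytes) -> str:
    """Decode a LyX file, choosing the codec from its format number."""
    if read_format(raw) >= FIRST_UTF8_FORMAT:
        return raw.decode("utf-8")
    return raw.decode(LEGACY_ENCODING)


old_file = (
    b"#LyX 1.3 created this file. For more info see http://www.lyx.org/\n"
    b"\\lyxformat 221\n"
    b"caf\xe9\n"
)
text = decode_lyx(old_file)
```

A real converter would presumably pick the legacy encoding from the document's language settings rather than from a constant; that part is left out here.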

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Very good. How does this compare to the CJK LyX patch? Did you look at | it? Where can I find the most up to date CJK LyX patch? -- Lgb

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > btw... lyxlex (and the filereading) must be adapted to read utf8, and | > lyx2lyx must do its best to translate the old formats (to utf8)... | | Well, that's trivial because the header Well, that part yes. | | #LyX 1.3

Re: ICU - uneasy feeling

2005-10-14 Thread Angus Leeming
Lars Gullik Bjønnes wrote: > btw... lyxlex (and the filereading) must be adapted to read utf8, and > lyx2lyx must do its best to translate the old formats (to utf8)... Well, that's trivial because the header #LyX 1.3 created this file. For more info see http://www.lyx.org/ \lyxformat 221 is unch

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Jean-Marc" == Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | Lars> True. And as I said it should give a much better base to work | Lars> from. | | Jean-Marc> And I fe

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Jean-Marc" == Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> True. And as I said it should give a much better base to work Lars> from. Jean-Marc> And I fear that if we look for too much generality at once, Jean-Marc> w

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> True. And as I said it should give a much better base to work Lars> from. And I fear that if we look for too much generality at once, we are going to lose track. JMarc

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | > "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | | Lars> I just did some tests (using libidn and the nice stringprep | Lars> utility functions therein). Just by changing | Lars> Paragraph::value_type to uint32_t and adding |

Re: ICU - uneasy feeling

2005-10-14 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> I just did some tests (using libidn and the nice stringprep Lars> utility functions therein). Just by changing Lars> Paragraph::value_type to uint32_t and adding Lars> stringprep_ucs4_to_utf8 on output, and some Lars> strigpre
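The idea in this message, storing each character as a 32-bit code point and converting to UTF-8 only on output, can be illustrated with a small Python analogue. This is not libidn's API: `ucs4_to_utf8` merely borrows the name of the function mentioned above, and the reverse helper `utf8_to_ucs4` is an assumption added to show the round trip.

```python
def ucs4_to_utf8(codepoints):
    """Encode a sequence of integer code points as UTF-8 bytes."""
    return "".join(chr(cp) for cp in codepoints).encode("utf-8")


def utf8_to_ucs4(data):
    """Decode UTF-8 bytes into a list of integer code points."""
    return [ord(ch) for ch in data.decode("utf-8")]


# One integer per character, as with a uint32_t value_type.
paragraph = [0x42, 0x6A, 0xF8, 0x6E, 0x6E, 0x65, 0x73]  # "Bjønnes"
encoded = ucs4_to_utf8(paragraph)
```

In memory every character costs the same fixed width, while on disk only the non-ASCII `ø` takes more than one byte.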

Re: ICU - uneasy feeling

2005-10-14 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | > Also to make some of this nicer I think we need my "any-patch", I'll | > dig that out of the closet (right when 1.4.0 is released...) | > (we pass a keysym from the frontend... this is turned into a | > std::string and sent to dispatch()... we lose in

Re: ICU - uneasy feeling

2005-10-14 Thread Angus Leeming
Lars Gullik Bjønnes wrote: > I fear that XForms might need an upgrade to use either XwcLookup or > XmbLookup to give us what we require in the keyhandler (we might be > able to do it in the event handler as well, unless xforms already ate > some of our multibyte chars). Or some IM thingie (More work f

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
John Levon <[EMAIL PROTECTED]> writes: | On Fri, Oct 14, 2005 at 12:41:32AM +0100, Angus Leeming wrote: | | > John Levon wrote: | > > This seems a horribly euro-centric point of view. (Says the guy who can | > > only speak one language...) | > | > Really? Which one? | | Northern. At least you

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
John Levon <[EMAIL PROTECTED]> writes: | > languages if volunteers come and help out. Don't worry about composed | > Unicode glyphs for now - it's a corner case that can be handled once | > someone feels the heat (which will probably be when hell freezes over AFAICT). | | This seems a horribly eur

Re: ICU - uneasy feeling

2005-10-13 Thread John Levon
On Fri, Oct 14, 2005 at 12:41:32AM +0100, Angus Leeming wrote: > John Levon wrote: > > This seems a horribly euro-centric point of view. (Says the guy who can > > only speak one language...) > > Really? Which one? Northern. john

Re: ICU - uneasy feeling

2005-10-13 Thread Angus Leeming
John Levon wrote: > This seems a horribly euro-centric point of view. (Says the guy who can > only speak one language...) Really? Which one?

Re: ICU - uneasy feeling

2005-10-13 Thread John Levon
On Thu, Oct 13, 2005 at 10:29:30PM +0200, Asger Ottar Alstrup wrote: > The reason I suggest a unicode inset is that we already have it: the > latex accent inset. Our inset infrastructure is not in a position to accommodate something like this. > languages if volunteers come and help out. Don't w

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Asger Ottar Alstrup <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > No. I am not sure... but it depends... a combining character can be | > used to produce accents as well... why not an umlaut on top of a | > grave on top of an 'e'. | | The reason I suggest a unicode inset is that w

Re: ICU - uneasy feeling

2005-10-13 Thread Asger Ottar Alstrup
Lars Gullik Bjønnes wrote: No. I am not sure... but it depends... a combining character can be used to produce accents as well... why not an umlaut on top of a grave on top of an 'e'. The reason I suggest a unicode inset is that we already have it: the latex accent inset. Of course you can
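The "umlaut on top of a grave on top of an 'e'" case can be inspected with Python's standard `unicodedata` module. This is only an illustrative sketch of why such a stack cannot be stored as one precomposed character; it does not come from the thread.

```python
import unicodedata

# 'e' followed by COMBINING GRAVE ACCENT and COMBINING DIAERESIS.
stacked = "e\u0300\u0308"

for ch in stacked:
    # A combining class of 0 marks a base character; marks are nonzero.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} "
          f"class={unicodedata.combining(ch)}")

# Normalization can fold some base+mark pairs into precomposed characters,
# but there is no single code point for this whole stack.
composed = unicodedata.normalize("NFC", stacked)
decomposed = unicodedata.normalize("NFD", stacked)
print(len(stacked), len(composed), len(decomposed))
```

Even in its most composed form the sequence stays longer than one code point, so storage that assumes one value per visible character cannot represent it.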

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | Lars> Ad. Asger's idea of a class UnicodeGlyph... (I'd prefer it to | Lars> not be an inset), we could have all chars in a Paragraph have | Lars> that type. Internally we could use some tricks to not use too | Lars> much memory for glyphs that doe

Re: ICU - uneasy feeling

2005-10-13 Thread Jean-Marc Lasgouttes
> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> I am not saying that we must support everything Unicode can in Lars> 1.5, but we must at least think about this. Lars> We might decide that we don't have to worry about combining Lars> chars at all (but I fear that we have to).

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | >>>>> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> fonts deal with glyphs (or rather the display engine), we must | Lars> deal with codepoints all the grit (which surely a lib li

Re: ICU - uneasy feeling

2005-10-13 Thread Jean-Marc Lasgouttes
>>>>> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> fonts deal with glyphs (or rather the display engine), we must Lars> deal with codepoints all the grit (which surely a lib like ICU Lars> can help us with) Could you tell me suc

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
ake... fonts deal with glyphs (or rather the display engine), we must deal with codepoints all the grit (which surely a lib like ICU can help us with) -- Lgb

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Asger Alstrup <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > Angus Leeming <[EMAIL PROTECTED]> writes: | > | Lars Gullik Bjønnes wrote: | > | Sure. But that's not information needed by the CORE, is it? The core does | > | act on (strings of) single codepoints. All paragraph breaking

Re: ICU - uneasy feeling

2005-10-13 Thread Martin Vermeer
On Thu, 2005-10-13 at 01:52 +0200, Lars Gullik Bjønnes wrote: > I have been trying to look at the ICU api, but I find the > documentation utterly confusing and hard to get a clear understanding > on how it works. (Probably caused by me not finding a "Hello World" > code snip

Re: ICU - uneasy feeling

2005-10-13 Thread Angus Leeming
Asger Alstrup wrote: >> | Lars Gullik Bjønnes wrote: >> | Sure. But that's not information needed by the CORE, is it? The core >> | does act on (strings of) single codepoints. All paragraph breaking >> | etc, acts on single code points. >> >> How can it? When perhaps three codepoints ends up beei

Re: ICU - uneasy feeling

2005-10-13 Thread Asger Alstrup
Lars Gullik Bjønnes wrote: Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | Sure. But that's not information needed by the CORE, is it? The core does | act on (strings of) single codepoints. All paragraph breaking etc, acts on | single code points. How can it? When perha

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | Thanks for supplying the bigger picture. I've only one point to make: | | > (Even UCS-4 is not "one-codepoint" "one-glyph", combining chars are | > required for proper display) | | Sure. But that's not information needed by

Re: ICU - uneasy feeling

2005-10-13 Thread Angus Leeming
it? The core does act on (strings of) single codepoints. All paragraph breaking etc, acts on single code points. In other words, if ICU can iterate over single codepoints in the unicoded string, then the core algorithms won't need to change at all. Right? -- Angus
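The disagreement in this part of the thread is whether iterating over single code points is enough for the core. A small Python sketch shows the difference between code points and the units a user sees as one character. The grouping rule used here (attach every character with a nonzero combining class to the preceding base) is a deliberate simplification and an assumption of this example; full grapheme cluster segmentation is defined by Unicode UAX #29 and is what ICU's break iterators provide.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            # Combining mark: attach it to the cluster in progress.
            groups[-1] += ch
        else:
            # Base character (or a leading mark): start a new cluster.
            groups.append(ch)
    return groups


sample = "e\u0300\u0308tude"  # 'e' + grave + diaeresis, then "tude"
units = clusters(sample)
print(len(sample), "code points,", len(units), "user-visible units")
```

Cursor movement, deletion, and line breaking that step by code point could split the first unit between the `e` and its accents, which is the concern raised in reply to this message.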

Re: ICU - uneasy feeling

2005-10-13 Thread Asger Alstrup
Angus Leeming wrote: Lars Gullik Bjønnes wrote: Would be nice if some of you could have a look at this lib as well, and see what you think of it. I know it is _The_ Unicode lib to use, but still... I agree that ICU is bloated, complicated and antiquated. I'm not sure there is anything b

Re: ICU - uneasy feeling

2005-10-13 Thread Lars Gullik Bjønnes
Angus Leeming <[EMAIL PROTECTED]> writes: | Lars Gullik Bjønnes wrote: | > I have been trying to look at the ICU api, but I find the | > documentation utterly confusing and hard to get a clear understanding | > on how it works. (Probably caused by me not finding a "Hello Wor

Re: ICU - uneasy feeling

2005-10-12 Thread Angus Leeming
Lars Gullik Bjønnes wrote: > I have been trying to look at the ICU api, but I find the > documentation utterly confusing and hard to get a clear understanding > on how it works. (Probably caused by me not finding a "Hello World" > code snippet) > > Also, I must s

ICU - uneasy feeling

2005-10-12 Thread Lars Gullik Bjønnes
I have been trying to look at the ICU api, but I find the documentation utterly confusing and hard to get a clear understanding on how it works. (Probably caused by me not finding a "Hello World" code snippet) Also, I must say, some of this is based on really old (before 2000) ideas

Re: ICU

2000-07-17 Thread Lars Gullik Bjønnes
Jean-Marc Lasgouttes <[EMAIL PROTECTED]> writes: | >>>>> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: | | Lars> http://oss.software.ibm.com/icu/ Would perhaps be usable. | | And could pango (www.pango.org) be of some interest for us? Perh

Re: ICU

2000-07-17 Thread Jean-Marc Lasgouttes
>>>>> "Lars" == Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes: Lars> http://oss.software.ibm.com/icu/ Would perhaps be usable. And could pango (www.pango.org) be of some interest for us? JMarc

Re: ICU

2000-07-13 Thread Lars Gullik Bjønnes
Jose Abilio Oliveira Matos <[EMAIL PROTECTED]> writes: | On Thu, Jul 13, 2000 at 12:36:41PM +0200, Lars Gullik Bjønnes wrote: | > | > http://oss.software.ibm.com/icu/ | > | > Would perhaps be usable. | | And the footprint is reasonable? No. This is a full-fledged unicode/l

Re: ICU

2000-07-13 Thread Jose Abilio Oliveira Matos
On Thu, Jul 13, 2000 at 12:36:41PM +0200, Lars Gullik Bjønnes wrote: > > http://oss.software.ibm.com/icu/ > > Would perhaps be usable. And the footprint is reasonable? > Lgb -- José

ICU

2000-07-13 Thread Lars Gullik Bjønnes
http://oss.software.ibm.com/icu/ Would perhaps be usable. Lgb