Bug#513735: [PATCH] patch from bug #511290 broke dictionary building on non koi8r computers

Alexander GQ Gerasiov Sat, 31 Jan 2009 16:10:00 -0800

On Sun, 1 Feb 2009 00:36:45 +0600
"Fedor P. Goncharov" <fe...@gorodok.net> wrote:


> But if uniq works in a UTF-8 environment with a KOI8-R
> dictionary, it just omits most of the lines as repeats. You can verify
> this by comparing the dictionary in the repository before and after
> the patch (1.8M vs 74K for /usr/share/myspell/dicts/ru_RU.dic).
Sure, Debian build scripts are intended to run in a clean environment
and in the C locale. I don't know why Martin-Eric doesn't use pbuilder or
something similar.
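
For illustration, a minimal sketch of forcing the C locale for a
sort/uniq step, so that KOI8-R data is compared as raw bytes whatever
locale the build host happens to use. The file names sample.koi and
sample.sorted and the sample words are made up for this example; this is
not the actual content of debian/rules.

```shell
# Hypothetical sample: three KOI8-R encoded lines, two of them identical.
# In KOI8-R, \304\317\315 is "дом" and \313\317\324 is "кот".
printf '\304\317\315\n\313\317\324\n\304\317\315\n' > sample.koi

# LC_ALL=C makes sort and uniq compare raw bytes, so only genuinely
# identical lines are collapsed.
LC_ALL=C sort sample.koi | LC_ALL=C uniq > sample.sorted

# Count the surviving lines (tr strips padding some wc versions add).
unique_lines=$(wc -l < sample.sorted | tr -d ' ')
echo "$unique_lines"
```

Building inside pbuilder or a similar clean chroot should have the same
effect without touching each command.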

> Second:
> Bug #511290 is already fixed in this line:
> grep -h '[ёЁ]' $(DICTIONARIES) | tr '\243\263' '\305\345' > yo_subst.koi
>                                                             ^^^^^^^^^^^^
>                                                             yo==IO==ё
> but it is broken because grep and tr don't know that they are working
> with the koi8r codepage and because somebody mistaken
Nope, tr uses symbol codes and works fine. The problem itself is in
grep.

So the correct lines are:

build-arch:
        # Generate ispell dictionary
        LC_CTYPE=C grep -h '[ёЁ]' $(DICTIONARIES) | tr '\243\263' '\305\345' > yo_subst.koi
        cat $(DICTIONARIES) yo_subst.koi > $(ILANGUAGE).dict
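
To check the grep/tr step outside the package build, something like the
following stand-alone sketch could be used. The input file words.koi and
the variable yo_class are invented for the example, and the bracket
expression is built from octal byte codes so that the script itself
stays pure ASCII; in KOI8-R, \243 and \263 are ё and Ё, and \305 and
\345 are е and Е.

```shell
# Hypothetical KOI8-R input: "ёж" (\243\326), "Ёж" (\263\326), "дом" (\304\317\315).
printf '\243\326\n\263\326\n\304\317\315\n' > words.koi

# Bracket expression made of the two raw KOI8-R bytes for ё and Ё.
yo_class=$(printf '[\243\263]')

# In the C locale grep matches single bytes; tr then maps ё/Ё to е/Е.
LC_ALL=C grep -h "$yo_class" words.koi | LC_ALL=C tr '\243\263' '\305\345' > yo_subst.koi

# Number of lines that contained ё or Ё.
matched=$(wc -l < yo_subst.koi | tr -d ' ')
echo "$matched"

# Dump the result as octal bytes for inspection.
od -An -b yo_subst.koi
```

This is plain shell; inside a makefile recipe each `$` of a shell
expansion would have to be written as `$$`.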


There may also need to be a locale specification on line 23, before the
i2myspell call; I am not sure.



On Sat, 31 Jan 2009 22:48:13 +0200
Martin-Éric Racine <q-f...@iki.fi> wrote:

> On Sat, Jan 31, 2009 at 8:36 PM, Fedor P. Goncharov  
> <fe...@gorodok.net> wrote:
> > Package: aspell-ru
> > Version: 0.99g5-6
> > Severity: critical
> >
> > First:
> > the patch from bug #511290 does nothing instead of "sorting the content
> > and filtering it". But if uniq works in a UTF-8 environment with a
> > KOI8-R dictionary, it just omits most of the lines as repeats.
> 
> Too many tools in free software went from assuming locale C to
> assuming UTF-8, which is why we have so many problems building this
> dictionary, then converting it from ispell format to myspell and
> aspell formats. This regularly breaks and I'm not sure how to solve
> this in a permanent and predictable way, because forcing each tool
> used in the scripts to use KOI8-R is not always possible.  
It works fine in the C locale.
>   
> > Second:
> > Bug #511290 is already fixed in this line:
> > grep -h '[ёЁ]' $(DICTIONARIES) | tr '\243\263' '\305\345' > yo_subst.koi
> >                                                             ^^^^^^^^^^^^
> >                                                             yo==IO==ё
> > but it is broken because grep and tr don't know that they are working
> > with the koi8r codepage and because somebody mistaken
> 
> Indeed, that's correct.
>   
> > Third:
> > Comrade Alexander, you are either a little stupid or a tremendous
> > 'oddball'. In the first case I don't understand at all how you ended up
> > with this patch, though I do understand why you still haven't switched
> > to UTF; in the second, you have probably guessed what I think of you.
> 
> (English summary: the bug reporter has an extremely negative view of
> upstream's work, among other things about his reluctance to get
> around to transitioning the source files to UTF-8).
Nope, as far as I can see this not very smart message was addressed to me.
There is nothing in it except a personal insult.

> I've always wondered about that too.  Transitioning the source files to
> UTF-8 would make it much easier to generate wordlists for a variety of
> dictionary formats and to maintain the build scripts without
> constantly having to edit them separately in a KOI8-R-enabled editor.
> 
> However, repeated attempts by maintainers of dictionary packages at
> various distributions (Fedora, Debian, etc.) to contact upstream all
> resulted in upstream ignoring them.  Some even tried phoning him
> directly at Moscow State University (which is where the MSU domain in
> his address comes from) and he hung up the phone on them.
As I said, I can try to contact him offline. Is this the only thing you
want to tell him, or is there something else? Please mail me the
details.

-- 
Best regards,
 Alexander GQ Gerasiov

 Contacts:
 e-mail:    g...@cs.msu.su             Jabber:  g...@jabber.ru
 Homepage:  http://gq.net.ru         ICQ:     7272757
 PGP fingerprint: 0628 ACC7 291A D4AA 6D7D  79B8 0641 D82A E3E3 CE1D




