I'm working on this but, curiously, I don't get a segfault on my system. Instead, I get an endless stream of copies of the error message.

On Tue, Jul 12, 2011 at 6:31 PM, <jida...@jidanni.org> wrote:

> X-Debbugs-Cc: billpo...@alum.mit.edu
> Package: uni2ascii
> Version: 4.18-1
> Severity: important
> File: /usr/bin/ascii2uni
>
> There is a horrible Segmentation fault.
>
> You won't notice it if you use a pipe to wc(1).
>
> Capturing into files produce differing results, but all are obviously
> different buffer sizes when the fault hits.
> $ factor 8192 24576 16384
> 8192: 2 2 2 2 2 2 2 2 2 2 2 2 2
> 24576: 2 2 2 2 2 2 2 2 2 2 2 2 2 3
> 16384: 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>
> Here we go,
>
> $ wget http://www.flickr.com/help/appgarden/
> $ ascii2uni -a Y index.html|wc
> ascii2uni: unknown HTML/HDML character entity "&self;" at line 28
> 0 0 0
> $ ascii2uni -a Y index.html
> ...<link rel="shortcut icon" type="image/ico" href="
> http://l.yimg.com/g/favicon.ico">
>
> ascii2uni: unknown HTML/HDML character entity "&self;" at line 28
> Segmentation fault
>
> $ echo "&self;"|ascii2uni -a Y
> ascii2uni: unknown HTML/HDML character entity "&self;" at line 1
> �
> 0 tokens converted
> 1 token replaced with Unicode Replacement Character
> $ echo "&self_"|ascii2uni -a Y
> ascii2uni: unknown HTML/HDML character entity "&self;" at line 1
> Segmentation fault
> $ echo "&selfzzz_"|ascii2uni -a Y
> ascii2uni: unknown HTML/HDML character entity "&selfzzz;" at line 1
> Segmentation fault
>
> Yes you could say I shouldn't be feeding the program URIs which look
> like they contain entities.
>
> But still it is no fair to Segmentation fault.
>
> Yes I wish there was a program that could tell it was inside a URI, but
> that is a different topic. (I'm converting webpages for offline reading
> on my ASCII (actually Big5) PDA.)