Sorry, the thread got broken and I only saw your reply just now.

On Thu, Feb 9, 2012 at 16:23, Jan Hauke Rahm <j...@debian.org> wrote:
> On Thu, Feb 09, 2012 at 01:58:28AM +0800, Aron Xu wrote:
>>
>> This is valid for most-used applications/formats like gettext, images
>> that are designed to behave in this way, but on the contrary there are
>> upstream that don't like to see such impact, especially due to the
>> complexity and performance impact.
>>
>> Currently I am using arch:any for data files which aren't be affected
>> with multiarch, i.e. not "same" or "foreign". For endianness-critical
>> data that is required to make a library working, I have to force them
>> to be installed into /usr/lib/<triplet>/$package/data/ and mark them
>> as "Multiarch: same", this is sufficient to avoid breakage, but again
>> it consumes a lot of space on mirror.
>
> Actually, what is "a lot" here? I mean, how many libraries are there
> containing endianness-critical data and how big are the actual files?
> Not that I'm any kind of expert, but this solution sounds reasonable to
> me.
>
> Hauke
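
(To make the quoted workaround concrete, it amounts to roughly the
following in the packaging. "libfoo" and the --with-datadir switch are
placeholders for illustration, not the real libpinyin packaging:)

    debian/control:

        Package: libfoo-data
        Architecture: any
        Multi-Arch: same
        Description: architecture-dependent data files for libfoo

    debian/rules (with debhelper):

        # placeholder package/flag names, not the real libpinyin packaging
        DEB_HOST_MULTIARCH ?= $(shell dpkg-architecture -qDEB_HOST_MULTIARCH)

        override_dh_auto_configure:
                dh_auto_configure -- \
                    --with-datadir=/usr/lib/$(DEB_HOST_MULTIARCH)/libfoo/data

Because the triplet is part of the path, the copies for different
architectures never collide and "Multi-Arch: same" is honoured, but each
architecture still ships its own copy of the same data.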
As far as I know, there aren't many libraries known to ship
endianness-critical data, but there may be landmines simply because the
maintainers aren't aware of the issue. I happened to notice the problem
because my team maintains several input method stacks, which usually have
to deal with linguistic data. [1]

In my case there is a library named libpinyin at hand to package, whose
data files are about 7.5 MiB after gzip -9 (the whole library is no more
than 9 MiB after gzip -9). With 14 architectures on ftp-master, the data
alone eats up 105 MiB, while if we found a way to ship only one copy per
endianness (big/little), it would take just 15 MiB. And once a release
becomes stable, a fresh copy of that data starts making its way into the
archive again with every new upload to unstable. The same concern applies
to other endianness-critical data that isn't touched by Multi-Arch at
present: we have to make it arch:any, and in the end it eats more and
more space.

[1] Performance is critical for these applications. That doesn't mean
they consume a lot of CPU, but they must respond very quickly to the
user's input - splitting a sentence into words and finding a list of the
most relevant suggestions involves fairly complex calculations, and a
single such action queries 10^5 ~ 10^6 lines of data several times. One
project tried to use something like SQLite3, but the performance was
rather frustrating, so they have decided not to bother with that and
instead design a data format that fits their own requirements.

--
Regards,
Aron Xu
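
P.S. In case it helps to illustrate why such data ends up tied to the
byte order of the machine that built it: a common trick for this kind of
lookup-heavy data is to mmap() a pre-built index and use the integers in
it directly, with no decoding pass at load time. A rough sketch in C
(the record layout and field names are invented for illustration, not
libpinyin's actual format):

    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* One fixed-size index record exactly as it sits in the data file.
     * The integers are stored in the byte order of the machine that
     * generated the file.  (Invented layout, for illustration only.) */
    struct record {
        uint32_t key;     /* hash of a phrase */
        uint32_t offset;  /* where its candidate list starts */
        uint32_t count;   /* how many candidates follow */
    };

    int main(int argc, char **argv)
    {
        if (argc < 2)
            return 1;

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0)
            return 1;

        struct stat st;
        if (fstat(fd, &st) < 0)
            return 1;

        /* Map the file read-only; lookups touch only the pages they
         * need and there is no parsing pass at startup. */
        struct record *idx = mmap(NULL, st.st_size, PROT_READ,
                                  MAP_PRIVATE, fd, 0);
        if (idx == MAP_FAILED)
            return 1;

        /* The fields are used as-is; on a machine of the other
         * endianness every value comes out byte-swapped. */
        size_t n = st.st_size / sizeof(struct record);
        for (size_t i = 0; i < n && i < 3; i++)
            printf("key=%" PRIu32 " offset=%" PRIu32 " count=%" PRIu32 "\n",
                   idx[i].key, idx[i].offset, idx[i].count);

        munmap(idx, st.st_size);
        close(fd);
        return 0;
    }

A byte-swap pass at load time (or a conversion on every access) would
make the file portable again, but it costs exactly the startup and
lookup latency this kind of design exists to avoid - hence one copy per
endianness.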