Hi DM,

On Tue, May 1, 2012 at 12:00 AM, DM Smith <dmsm...@crosswire.org> wrote:
> On 04/30/2012 09:37 AM, Daniel Owens wrote:
>
>> On 04/30/2012 06:54 AM, Chris Little wrote:
>>
>>> On 4/30/2012 4:39 AM, David Troidl wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> I'm certainly no expert on your TEI dictionaries, but wouldn't it make
>>>> sense to have the first key be one that would sort properly, and present
>>>> the dictionary in true alphabetical order? I'm thinking of Middle
>>>> Liddell, as well as the Hebrew. This key wouldn't even necessarily have
>>>> to be shown to the user. The second key, the title, could then maintain
>>>> the proper accents for display, without hindering sorting, searching or
>>>> navigation.
>>>
>>> I confess, I don't understand what you're proposing this as an
>>> alternative to.
>>>
>>> In the example Karl cites, there's just one actual key per entry. It is
>>> an uppercased version of the entryFree's n attribute. This is the key that
>>> is sorted.
>>>
>>> The un-uppercased version from the n attribute is being rendered as part
>>> of the entry text via the TEI filters. This is the part I'm proposing we
>>> retain, but render somewhere else, e.g. right-justified at the bottom of
>>> the entry.
>>>
>>> We also render all the text of the entry, which in these cases includes
>>> the text from a title element.
>>>
>>> I don't know what 'true alphabetical order' means, but if you mean
>>> localized sort order, it's not possible with the current implementation of
>>> this module type.
>>>
>>> --Chris
>>
>> I think David's concern is something that needs to be dealt with. A
>> number of possibilities could be pursued, some of them together:
>>
>> 1. The current implementation is to sort by Unicode code points. This
>> works particularly well with numeric keys. A quick solution for languages
>> for which such sorting is not alphabetical would be to follow David's
>> suggestion of using keys that the user does not even see. This has the
>> advantage of providing a workable solution right away, but there are some
>> problems with this. First, we could create a new "strongs" standard because
>> the current implementation does not actually hide keys. That could be
>> solved by making the keys so obscure that no one would remember them.
>> Second, any future, more robust solution would require reworking all
>> modules keyed to it. I have toyed with this solution, and it might be the
>> pragmatic way forward, but it is not ideal.
>>
>> 2. A localized sort order, which I think is what David means by
>> true alphabetical order, would be a better long-term solution.
>>
>> 3. In addition, using genbooks for lexica would work for lexica that
>> are sorted by root, with subentries nested in a hierarchy, just like in the
>> Hesychius module and BDB. I have been working with Troy on this.
>> Unfortunately, front-ends do not recognize the Feature=HebrewDef option in
>> the conf file and allow genbooks as lexica. I can send anyone an example
>> lexicon if you are interested in working on this. In that case, instead of
>> @n as the key, */x-entry/@osisID would be the key.
>>
>> Any thoughts?
>
> I think there is a problem with the sorting of entries in dictionaries
> where the keys are not ASCII. I don't remember the details, but I seem to
> remember it having been discussed here.
>
> For JSword, we'll be building a Lucene search index for the key, the term
> and the whole entry. A user lookup will be normalized and the search will
> return the key with which lookup will proceed internally as it does today.
> ICU provides the ability to create a localized sort key (not at all
> suitable for display) that can be used to sort dictionary entries for the
> end user's locale. I'm thinking that for TEI dictionaries the representation
> of the key should not be shown at all.
BPBible, and I believe some other frontends as well, use binary search
on the original module order to locate a key in a virtual list. This
provides very noticeable speedups on large dictionaries like ISBE. I
think this would require the module to be created in localised key
order if we really wanted to order by that, rather than just having a
lookup, which as I understand it would only be done when actually
looking for a key. It also means that a module can be sorted in one and
only one way. Then again, I'm not even sure we can guarantee any kind
of binary search on localised keys.

A related issue for English dictionaries is allowing mixed-case
dictionary keys (and I think I have heard similar comments about Greek
and maybe other languages). At the moment I think SWORD requires
dictionary keys to be upper-case to ensure that they sort correctly,
but really "Aaron's Rod" looks much better than "AARON'S ROD". BPBible
now attempts to turn keys to mixed case automatically and
heuristically, which I think looks a lot better, but ideally this would
be done in the same way as for other languages: by separating sort
order from codepoint order in some way.

Jon
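
To make "separating sort order from codepoint order" a bit more
concrete, here is a minimal sketch in Python. It is not how SWORD,
JSword or BPBible do it. The names sort_key and KeyIndex are made up for
illustration, and sort_key is only a crude stand-in (strip accents,
fold case) for a real locale-aware collation key such as the ones ICU
can produce. The point is just that binary search still works as long
as the list is ordered by the same key that the lookup uses.

```python
import bisect
import unicodedata


def sort_key(display_key):
    """Hypothetical stand-in for a real collation key (e.g. from ICU).

    Decomposes the string, drops combining marks and folds case. This is
    NOT a proper localized collation; it only shows the separation
    between the key used for ordering and the key shown to the user.
    """
    decomposed = unicodedata.normalize("NFD", display_key)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


class KeyIndex:
    """In-memory index ordered by sort key, keeping display keys intact."""

    def __init__(self, display_keys):
        # Pair each display key with its sort key and order by the sort key.
        pairs = sorted((sort_key(k), k) for k in display_keys)
        self._sort_keys = [p[0] for p in pairs]
        self._display_keys = [p[1] for p in pairs]

    def ordered(self):
        """Display keys in sort-key order (not codepoint order)."""
        return list(self._display_keys)

    def lookup(self, query):
        """Binary search on the sort keys; returns the display key or None."""
        wanted = sort_key(query)
        i = bisect.bisect_left(self._sort_keys, wanted)
        if i < len(self._sort_keys) and self._sort_keys[i] == wanted:
            return self._display_keys[i]
        return None


index = KeyIndex(["Zion", "Aaron's Rod", "abba", "Ébal", "Aaron"])
print(index.ordered())
print(index.lookup("AARON'S ROD"))
```

The cost is that the index has to be built (or stored) separately from
the module's own order, which is the trade-off mentioned above: the
module itself can only be sorted one way, so any other ordering needs
its own list of sort keys.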
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page