Re: [sword-devel] Chinese PinYin, OSIS, SWORD and front-ends

Matthew Talbert Tue, 19 Oct 2010 14:39:55 -0700

> I'm really about as ignorant of (C)Lucene as a person can be, so someone 
> please correct me if I'm wrong. I believe our indexing just indexes at the 
> record level (verses or dictionary entries). So, upon creation of the index, 
> you could just concatenate the text and the transliterated text and do 
> indexing for that. Unless you need to support exact string matches across 
> record boundaries, the concatenation shouldn't affect results.


Yes, you could probably do that, but I think it would be more useful
in the long term to store the transliteration in a separate field. That
would require more work on the frontend (at least to make it simple to
use), but I think it would be the better approach.
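
To make the separate-field idea concrete, here is a minimal sketch of a toy inverted index that keeps one posting table per field. It is not (C)Lucene and not SWORD code; the class name `FieldedIndex`, the field names `content` and `xlit`, and the sample text are all assumptions made up for illustration.

```python
from collections import defaultdict


class FieldedIndex:
    """Toy inverted index: one posting table per field name (hypothetical)."""

    def __init__(self):
        # field name -> token -> set of record keys
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, key, fields):
        # fields maps a field name (e.g. "content", "xlit") to its text
        for name, text in fields.items():
            for token in text.lower().split():
                self.postings[name][token].add(key)

    def search(self, field, token):
        # Look only in the requested field; .get avoids creating empty entries
        return sorted(self.postings.get(field, {}).get(token.lower(), set()))


# Illustrative sample records: pre-segmented text plus toneless pinyin
index = FieldedIndex()
index.add("John 1:1", {"content": "太初 有 道", "xlit": "taichu you dao"})
index.add("Gen 1:1", {"content": "起初 神 创造 天地",
                      "xlit": "qichu shen chuangzao tiandi"})

print(index.search("xlit", "dao"))     # searches the transliteration only
print(index.search("content", "dao"))  # the original text is unaffected
```

With separate fields, the frontend has to decide which field a query goes to, which is the extra work mentioned above. In exchange, a transliterated query can never match by accident inside the original text, or the other way around.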

>
> Something I mention on the wiki, that I think you're also advocating, is 
> doing transliteration of the text on a word-by-word basis and placing the 
> result in the <w xlit="..."> attribute (all via a filter). That partly 
> depends on the sourcetype being OSIS (though we could do it to plaintext too, 
> and change its sourcetype at runtime). We could certainly run such a filter 
> process prior to indexing, which would mean that the transliterated text 
> could be searched, even if transliteration is turned off in the current view.

I wasn't really thinking of that, but it should be possible right now
in the filters. I have considered doing something similar for
Greek/Hebrew in different situations. It's one of the things I may
someday get around to...
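
As a rough sketch of what such a word-by-word filter might do (this is plain Python, not the SWORD filter API), the following adds an `xlit` attribute to each `<w>` element it can look up. The `PINYIN` table, the function name `add_xlit`, the namespace-free markup, and the bare attribute value format are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table; a real filter would need a full transliterator
PINYIN = {"太初": "tàichū", "有": "yǒu", "道": "dào"}


def add_xlit(verse_xml, table):
    """Return verse_xml with xlit="..." set on every <w> found in table."""
    root = ET.fromstring(verse_xml)
    for w in root.iter("w"):
        word = (w.text or "").strip()
        # Leave words we cannot transliterate, or that already have xlit
        if word in table and "xlit" not in w.attrib:
            w.set("xlit", table[word])
    return ET.tostring(root, encoding="unicode")


# Simplified markup; real OSIS text carries a namespace and more attributes
sample = '<verse osisID="John.1.1"><w>太初</w><w>有</w><w>道</w></verse>'
result = add_xlit(sample, PINYIN)
print(result)
```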

As for the filters used for indexing, they don't have to match what is
currently being viewed at all. In fact, we currently turn off all (or
selected) filters prior to indexing; the important thing is to ensure
that the filters you have set for indexing are the same ones you set
prior to searching.
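
The point about matching filters can be shown with a small sketch, again in plain Python rather than SWORD or (C)Lucene code. The helper names (`strip_marks`, `apply_filters`, `build_index`, `search`) and the sample record are hypothetical, and the search handles a single token only.

```python
import unicodedata


def strip_marks(text):
    # Decompose, then drop combining marks (e.g. pinyin tone marks)
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def apply_filters(text, filters):
    # Run the text through each filter in order
    for f in filters:
        text = f(text)
    return text


def build_index(records, filters):
    # token -> set of record keys, built from the *filtered* text
    index = {}
    for key, text in records.items():
        for token in apply_filters(text, filters).split():
            index.setdefault(token, set()).add(key)
    return index


def search(index, query, filters):
    # Single-token lookup; the query goes through the given filters first
    return sorted(index.get(apply_filters(query, filters), set()))


records = {"John 1:1": "tàichū yǒu dào"}  # illustrative sample data
filters = [str.lower, strip_marks]
index = build_index(records, filters)

print(search(index, "Dào", filters))  # same filters as indexing: finds it
print(search(index, "Dào", []))       # different filters: no match
```

The index only ever contains filtered tokens, so a query that skips those filters is looking for strings that were never stored.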

Matthew

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
