SWORD has a number of filtering stages which occur at different places and events.

Specifically interesting for this discussion are "strip filters". These are called immediately before searching, and should also be called on the search string before passing it to search:

    ListKey results = module.search(module.stripText(searchTerm));

I am pretty sure most of our frontends do this. This ensures not only that the module text is normalized for searching, but also that the search term itself is normalized using the same rules.

We strip markup and other things from the module buffer before doing the comparison. We obviously aren't stripping soft hyphens at present, but I suggest we simply add the soft hyphen character to the list of characters we remove.

Additionally, each module can specify its own additional strip filters with a conf entry:

    LocalStripFilter=

But that would require a filter that can strip out soft hyphens, and I don't believe one exists.

I have committed to trunk the addition of stripping out soft hyphens to the strip filter for OSIS SourceType modules, for now, if you'd like to test.

A legacy issue we've had, and would like to eventually get rid of, is that we often use the same filter to do double duty as both the strip filter (before searching) and the render filter for plain text. This means that, as a side effect, soft hyphens will no longer be present if you ask diatheke for plain text output.

Let me know if you have a chance to test,

Troy

On 11/02/2017 03:28 AM, David Haslam wrote:
> I am recommending the complete removal of soft hyphens because their use is a
> typographical kludge not semantic construction.
>
> See https://crosswire.org/wiki/Converting_SFM_Bibles_to_OSIS#Soft_hyphens
>
> Being a kludge, there could never be any possibility that any particular
> word would always have the soft hyphen.
>
> They result because the USFM files were derived retrospectively from files
> exported from Quark XPress.
>
> It's been a useful discussion, prompted by my assistance to Fr Cyrille with
> his LinVB repository in GitLab.
>
> Best regards,
>
> David
>
> --
> Sent from: http://sword-dev.350566.n4.nabble.com/
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
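
For anyone who wants to experiment outside the engine, the normalization idea discussed above (strip soft hyphens from both the module text and the search term, then compare) can be sketched in standalone C++. This is only an illustration and not the code committed to trunk: stripSoftHyphens and containsNormalized are hypothetical names, no SWORD classes are used, and the input is assumed to be valid UTF-8, where the soft hyphen (U+00AD) is the byte pair 0xC2 0xAD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT part of the SWORD API: returns a copy of a
// UTF-8 string with every soft hyphen (U+00AD, bytes 0xC2 0xAD) removed.
std::string stripSoftHyphens(const std::string &utf8) {
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const bool isSoftHyphen =
            i + 1 < utf8.size() &&
            static_cast<unsigned char>(utf8[i]) == 0xC2 &&
            static_cast<unsigned char>(utf8[i + 1]) == 0xAD;
        if (isSoftHyphen) {
            ++i;  // skip the second byte of the pair as well
            continue;
        }
        out.push_back(utf8[i]);
    }
    return out;
}

// Hypothetical stand-in for a search: the text and the search term are
// normalized with the same rule before the substring comparison.
bool containsNormalized(const std::string &text, const std::string &term) {
    return stripSoftHyphens(text).find(stripSoftHyphens(term)) !=
           std::string::npos;
}
```

With this, a term typed without soft hyphens matches text that contains them, and the reverse also holds, because the same rule is applied to both sides before comparing.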