We have a large body of documents that have xml
and ocr embedded within one of the xml fields.

Searches such as "group effect"

are returning hits for docs such as ones that include the following:

 ...group of ~a- The effect...

because, I take it, stop words like 'of' and 'the' and punctuation
are ignored. Is there anything I can do about this other
than write an alternative to the Standard Analyzer?

thanks,

Bob Mason
UCSF Tobacco Industy Digital Library



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to