Thanks Erik, Is there a .NET version of Solr?
Cheers Sachin -----Original Message----- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: 08 February 2007 15:26 To: [email protected] Subject: Re: 'a', 's' and 't' don't index properly >From the javadoc... public final class *SimpleAnalyzer*extends Analyzer<file:///C:/lucene-2.0.0/docs/api/org/apache/lucene/analysis/Ana lyzer.html> An Analyzer that filters LetterTokenizer with LowerCaseFilter. On 2/8/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote: > > Thanks Erik, > > Do you know of an analyzer which doesn't remove the characters 'a', 's' > and 't'. > > Sachin > > -----Original Message----- > From: Erick Erickson [mailto:[EMAIL PROTECTED] > Sent: 08 February 2007 13:54 > To: [email protected] > Subject: Re: 'a', 's' and 't' don't index properly > > This really should be posted on the dotlucene list, but.... > > Your indexing analyzer is probably removing them. For instance, > StandardAnalyzer uses a default set of stop words, and a, s, and t are > definitely among them. You need to use a different analyzer than you > are using. > > These will also be removed from queries if you use QueryParser with > one of several analyzers that remove stop words. > > StandardAnalyzer, for instance, also lower-cases tokens, removes most > puncutation, etc, so take some care to understand the analyzers and > what they do. > > Oh, and get a copy of Luke if you haven't already. It'll let you > examine your index, see the results of using various analyzers etc. > > Best > Erick > > On 2/8/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote: > > > > > Hello, > > > > > > I have a database of tracks, artists and albums and I'm indexing > > > these > > > 3 attributes plus also the first letter of the track thus > > > (incidently I'm using dotlucene but the implementation of > > > dotlucene is similar to the Java one): > > > > > > Document Doc = new Document(); > > > String Album = ... > > > String Artist = ... > > > String Track = ... > > > Doc.Add(Field.Text("album", Album)); > > > Doc.Add(Field.Text("artist", Artist)); > > > Doc.Add(Field.Text("track", Track)); > > > Doc.Add(Field.Text("firstletter", Track.Substring(0,1))); > > > > > > Problem is I don't think certain first letters are being indexed > > > properly or at all, either that or there is some problem elsewhere. > > > > I have noticed that the letters 'a', 's' and 't' (there may be > > > others) cause me problems. I shall explain the problem I have. > > > When I search for the documents I perform a sorting operation on > > > the > > > > firstletter field but where the firstletter was 'a', 's' or 't' > > > the returned list does not contain those records in sorted order > > > (all other records are sorted correctly). > > > > > > Here is my search command: > > > > > > Hits hits = searcher.Search(query, new Sort(new SortField[] { new > > > SortField("firstletter", SortField.STRING)})); > > > > > > What I don't know is whether the fault lies in the indexing or in > > > this or other code. Does anyone know what could have happened. > > > > > > Thanks > > > > > > Sachin > > > > > > This email and any attached files are confidential and copyright > > protected. If you are not the addressee, any dissemination of this > > communication is strictly prohibited. Unless otherwise expressly > > agreed in writing, nothing stated in this communication shall be > legally binding. > > > > The ultimate parent company of the Atkins Group is WS Atkins plc. > > Registered in England No. 1885586. Registered Office Woodcote > > Grove, Ashley Road, Epsom, Surrey KT18 5BW. > > > > Consider the environment. Please don't print this e-mail unless you > > really need to. > > > > > This message has been scanned for viruses by MailControl - (see > http://bluepages.wsatkins.co.uk/?4318150) > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
