Re: Lucene Website Integration

2009-06-05 Thread listan...@gmail.com
Hi Gary, Yes, that certainly helped. Thanks for your reply. Personally for me, it is interesting to hear about Grails, Ruby etc. I usually only see folks using PHP, CGI etc. But, maybe it's just me. I felt like I was being a little anachronistic in using JSP/Servlets. Thanks again. Anand On W

Debugging file lock problem

2009-06-05 Thread Newman, Billy
I am having a problem where I am getting lock timeouts when trying to write to my index file. It would be nice if I could turn on logging to see which server/application has the lock and when. Is there a way to see the lock information without changing code? Thanks, Billy

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread KK
Robert, I tried to use the worddelimiterfilterfactory as well, but I faced the same problem [I saw solr using the factory instead of the filter]. I think I should try using the first option [copying that to local directory] and use it with the options you mentioned. I'll try it out and will post it h

Re: Custom sorting!

2009-06-05 Thread Ian Lea
Hi One would indeed expect that JAHN would come before JOHNSON. I can't spot anything wrong with your code but it isn't all there and the problem could lie with something not shown. Why don't you cut it down to a nice short self-contained program or test case and if that doesn't help you find t
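A short self-contained check along the lines suggested above might look like the following. It is only a sketch in plain Java (Java 8 or later), not Lucene code: the `Person` and `SortCheck` classes and the integer `networkStatus` field are hypothetical stand-ins for the indexed fields, and the comparator simply mirrors the two sort keys mentioned in the thread (last name, then network status).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an indexed person record; not a Lucene class. */
class Person {
    final String lastName;
    final int networkStatus;

    Person(String lastName, int networkStatus) {
        this.lastName = lastName;
        this.networkStatus = networkStatus;
    }
}

class SortCheck {
    /** Orders by last name first, then by network status. */
    static final Comparator<Person> BY_NAME_THEN_STATUS =
        Comparator.comparing((Person p) -> p.lastName)
                  .thenComparingInt(p -> p.networkStatus);

    /** Returns the last names of the given people in sorted order. */
    static List<String> sortedLastNames(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        copy.sort(BY_NAME_THEN_STATUS);
        List<String> names = new ArrayList<>();
        for (Person p : copy) {
            names.add(p.lastName);
        }
        return names;
    }

    public static void main(String[] args) {
        // Same three names, in the order reported in the thread.
        List<Person> hits = Arrays.asList(
            new Person("JACOBSON", 1),
            new Person("JOHNSON", 1),
            new Person("JAHN", 1));
        System.out.println(sortedLastNames(hits)); // prints [JACOBSON, JAHN, JOHNSON]
    }
}
```

If a plain string sort gives the expected order but the index does not, the difference probably lies in how the field was indexed or in how the sort was configured, which is what a cut-down test case would help isolate.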

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
kk an easier solution to your first problem is to use worddelimiterfilterfactory if possible... you can get an instance of worddelimiter filter from that. thanks, robert On Fri, Jun 5, 2009 at 10:06 AM, Robert Muir wrote: > kk as for your first issue, that WordDelimiterFilter is package > protect

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
kk as for your first issue, that WordDelimiterFilter is package protected, one option is to make a copy of the code and change the class declaration to public. the other option is to put your entire analyzer in 'org.apache.solr.analysis' package so that you can access it... for the 2nd issue, yes

Re: Custom sorting!

2009-06-05 Thread vanshi
Can somebody take a look at why I'm not getting the correct alphabetical order? I changed the last_name field to be 'UN_TOKENIZED' and am sorting on last name only, then network status. But with last name as 'J' in the query, I am getting results like: JACOBSON, JOHNSON, JAHN. I would expect 'JAHN' to come b

Re: cannot retrieve the values of a field is not stored in the index

2009-06-05 Thread Erick Erickson
Enumerating terms will be inefficient compared to getting the stored field. I'd try storing the fields first until and unless you can demonstrate a problem. BTW, if you're not going to *search* on the field, there's no reason to index it at all. Why do you think you don't want to store the paths? How bi

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread KK
Thanks Robert. There is one problem though, I'm able to plugin the word delimiter filter from solr-nightly jar file. When I tried to do something like, TokenStream ts = new WhitespaceTokenizer(reader); ts = new WordDelimiterFilter(ts); ts = new PorterStemmerFilter(ts); ...rest as in the l

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
KK, you got the right idea. though I think you might want to change the order, move the stopfilter before the porter stem filter... otherwise it might not work correctly. On Fri, Jun 5, 2009 at 8:05 AM, KK wrote: > Thanks Robert. This is exactly what I did and its working but delimiter is > mi
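One likely reason the order matters: a stemmer can rewrite a stop word into a form the stop list no longer recognises. The sketch below shows the effect in plain Java without any Lucene classes. The `FilterOrder` class, the three-word stop list, and the `toyStem` method (which only strips a trailing "s" and is not the Porter algorithm) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FilterOrder {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("this", "was", "the"));

    /** Toy stemmer: strips a single trailing "s". NOT the Porter algorithm. */
    static String toyStem(String token) {
        if (token.length() > 2 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    /** Drops every token that appears in the stop list. */
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Applies the toy stemmer to every token. */
    static List<String> stemAll(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(toyStem(t));
        }
        return out;
    }

    /** Recommended order: stop filter first, then stemmer. */
    static List<String> stopThenStem(List<String> tokens) {
        return stemAll(removeStopWords(tokens));
    }

    /** Reversed order: stemmer first, then stop filter. */
    static List<String> stemThenStop(List<String> tokens) {
        return removeStopWords(stemAll(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("this", "was", "the", "books");
        System.out.println(stopThenStem(tokens)); // prints [book]
        System.out.println(stemThenStop(tokens)); // prints [thi, wa, book]
    }
}
```

With the stemmer first, "this" and "was" have already become "thi" and "wa" by the time the stop filter sees them, so they slip through into the index.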

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread KK
Thanks Robert. This is exactly what I did and it's working, but the delimiter is missing; I'm going to add that from solr-nightly.jar /** * Analyzer for Indian language. */ public class IndicAnalyzer extends Analyzer { public TokenStream tokenStream(String fieldName, Reader reader) { TokenStream

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
i think you are on the right track... once you build your analyzer, put it in your classpath and play around with it in luke and see if it does what you want. On Fri, Jun 5, 2009 at 3:19 AM, KK wrote: > Hi Robert, > This is what I copied from ThaiAnalyzer @ lucene contrib > > public class ThaiAn

Phrase search

2009-06-05 Thread Abhi
Say I have indexed the following strings: 1. "cool gaming laptop" 2. "cool gaming lappy" 3. "gaming laptop cool" Now when I search with a query say "cool gaming computer", I want string 1 and 2 to appear on top (where search terms are closer to each other) followed by 3. I can use a Term query t
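The ranking asked for above can be illustrated with a small plain-Java sketch. It is not Lucene scoring and not a suggested Lucene query; the `ProximityScore` class and its `adjacentPairs` method are hypothetical names. The method counts how many consecutive pairs of query terms also occur next to each other, in the same order, in a document.

```java
class ProximityScore {
    /**
     * Counts the consecutive query-term pairs that also appear adjacent,
     * in the same order, somewhere in the document text.
     */
    static int adjacentPairs(String query, String doc) {
        String[] q = query.toLowerCase().split("\\s+");
        String[] d = doc.toLowerCase().split("\\s+");
        int count = 0;
        for (int i = 0; i + 1 < q.length; i++) {
            for (int j = 0; j + 1 < d.length; j++) {
                if (q[i].equals(d[j]) && q[i + 1].equals(d[j + 1])) {
                    count++;
                    break; // count each query pair at most once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String query = "cool gaming computer";
        System.out.println(adjacentPairs(query, "cool gaming laptop")); // prints 1
        System.out.println(adjacentPairs(query, "cool gaming lappy"));  // prints 1
        System.out.println(adjacentPairs(query, "gaming laptop cool")); // prints 0
    }
}
```

Strings 1 and 2 score higher because "cool gaming" appears as an adjacent pair, while string 3 contains both terms but never side by side in query order.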

Re: cannot retrieve the values of a field is not stored in the index

2009-06-05 Thread Ian Lea
You can't get at the field values from the document hits. Field.Store.NO means it isn't stored and what isn't there can't be retrieved. But you should be able to get at the indexed paths via a TermEnum. Something like this IndexReader reader = IndexReader.open(...); String field =

Re: Query:Adding all docs at once or creating smaller indexes and merge

2009-06-05 Thread Ian Lea
Hi My guess is that one big index would be more efficient since the total IO read and write load would be less. The big reason for creating smaller intermediate indexes is that you could spread their creation over multiple jobs/disks/servers. There is lots of good advice in http://wiki.apache.o

One character highlight

2009-06-05 Thread Piotr Jakubowski
Hello, I am wondering whether using Lucene Highlighter it is possible to highlight parts of words. So for instance, if I type A then in word America I would have America, or if I type er I would get America. Thanks for replies

Re: P2P Lucene

2009-06-05 Thread Shashi Kant
Thanks for the heads-up, Otis. I will give this some more thought, prototype some, and possibly put in a proposal for the Apache Incubator. Yes, I am not aware of Sixearch, but there are several P2P applications e.g. WiredReach, Grub, Neurogrid etc. However, my idea is quite a bit different from the exis

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread KK
Hi Robert, This is what I copied from ThaiAnalyzer @ lucene contrib public class ThaiAnalyzer extends Analyzer { public TokenStream tokenStream(String fieldName, Reader reader) { TokenStream ts = new StandardTokenizer(reader); ts = new StandardFilter(ts); ts = new ThaiWordFilter(ts