Re: segment_N file is missed

2010-06-13 Thread Lance Norskog
The CheckIndex class/program will recreate the segment files when it removes a segment from an index. That's the only source I've found for how to make these files. If you are able to hack this up, making a CFSDirectory would be a wonderful addition to the Lucene Directory suite. A CFS file is a c

Re: A question about google search index?

2010-06-13 Thread Lance Norskog
http://research.google.com/pubs/DistributedSystemsandParallelComputing.html On Thu, Jun 10, 2010 at 1:51 AM, Yuval Feinstein wrote: > Most of the implementation of Google's search index is kept secret by Google. > Based on publicly available information, the indexes are quite different - > Google

Re: Scores equality

2010-06-13 Thread Naama Kraus
Thanks a lot, this is very helpful ! Naama On Sun, Jun 13, 2010 at 5:05 PM, Erick Erickson wrote: > Last I knew, ties were decided by the internal document id. > > you can control this any way you want, just include a Sort > object in your query with multiple SortFields. Two pre-defined > SortFie

Re: Scores equality

2010-06-13 Thread Erick Erickson
Last I knew, ties were decided by the internal document id. You can control this any way you want: just include a Sort object in your query with multiple SortFields. Two predefined SortField types you can use are FIELD_SCORE and FIELD_DOC, and you can add any number of other fields to sort by, se
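
A minimal sketch of the ordering described in this reply (score first, internal document id as the tie-breaker), written in plain Java so it runs without Lucene on the classpath. The names `TieBreakSketch`, `Hit`, `SCORE_THEN_DOC` and `rankedDocIds` are hypothetical and are not part of any Lucene API; the sample scores are made up for illustration. In Lucene itself, the reply suggests getting the same effect by passing a Sort built from FIELD_SCORE followed by FIELD_DOC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakSketch {
    // Hypothetical stand-in for a search hit: internal doc id plus score.
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Primary key: score, descending. Secondary key: doc id, ascending,
    // which only matters when two scores are exactly equal.
    public static final Comparator<Hit> SCORE_THEN_DOC = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
    };

    // Returns the doc ids in ranked order without modifying the input list.
    public static int[] rankedDocIds(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(SCORE_THEN_DOC);
        int[] ids = new int[sorted.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = sorted.get(i).doc;
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit(7, 1.5f),
            new Hit(3, 2.0f),
            new Hit(5, 1.5f),  // same score as doc 7
            new Hit(1, 0.5f));
        // Docs 5 and 7 tie on score, so the lower doc id comes first.
        System.out.println(Arrays.toString(rankedDocIds(hits)));  // prints [3, 5, 7, 1]
    }
}
```

Adding further keys works the same way: each extra comparison is consulted only when every earlier key compares equal, which mirrors listing several SortFields in order.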

Long queries evaluation

2010-06-13 Thread Naama Kraus
Hi All, I have several questions regarding the evaluation of long queries. I'd appreciate your input. If this issue is documented somewhere, I'd be glad for any pointers. How do long queries affect search performance? E.g., a search query composed of a few tens of terms? A few hundred terms

Scores equality

2010-06-13 Thread Naama Kraus
Hi All, I wanted to ask about ties in search result scores: if two documents get an equal score, how does Lucene break the tie? I.e., by which criterion would one document be ranked before another? Random? Indexing time? Anything else? Can I control this somehow? (I am using

Re: Any Solid State Drive performance comparisons?

2010-06-13 Thread Rob Bygrave
Thanks !! On Sun, Jun 13, 2010 at 9:43 PM, Toke Eskildsen wrote: > Rob Bygrave [robin.bygr...@gmail.com] wrote: > > Has anyone done a performance comparison for an index on a Solid State > Drive > > (vs any other hard drive ... SATA/SCSI)? > > We did a fair amount of testing two years ago and put

RE: Any Solid State Drive performance comparisons?

2010-06-13 Thread Toke Eskildsen
Rob Bygrave [robin.bygr...@gmail.com] wrote: > Has anyone done a performance comparison for an index on a Solid State Drive > (vs any other hard drive ... SATA/SCSI)? We did a fair amount of testing two years ago and put some graphs at http://wiki.statsbiblioteket.dk/summa/Hardware The short vers