Re: why did I build index slower and slower ?

2013-05-14 Thread wgggfiy
up - -- Email: wuqiu.m...@qq.com -- -- View this message in context: http://lucene.472066.n3.nabble.com/why-did-I-build-index-slower-and-slower-tp4062798p4063395.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

AW: lucene and mongodb

2013-05-14 Thread Hendrik Lücke-Tieke
Hi mate, we did that (with Lucene 3.6) and afterwards considered it a "very bad idea". Why? (a) Out of the box, MongoDB only handles files up to 16 MB; Lucene files grow (much) larger than that. (b) Lucene indices seem highly optimized for good performance when read from disk. A layer li

IndexReader doc method performance troubles

2013-05-14 Thread G B
Hi there, we've been having performance trouble with IndexReader's *document*(int docID) method. In summary: Why would the *document

Re: Retrieving FieldInfo

2013-05-14 Thread Nicola Buso
Thanks for the explanation. I'm not in this situation, but it helps me understand Lucene better. Nicola Michael McCandless wrote: >On Tue, May 14, 2013 at 10:02 AM, Nicola Buso wrote: > >> I know this can sound horrible/flexible/... but this means I can add two >> documents with the same fie

Re: Retrieving FieldInfo

2013-05-14 Thread Michael McCandless
On Tue, May 14, 2013 at 10:02 AM, Nicola Buso wrote: > I know this can sound horrible/flexible/... but this means I can add two > documents with the same field name, but different configurations, for > example different IndexOptions? Yes and no :) Lucene will happily index such drastically differ

Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Ravikumar Govindarajan
Thanks for the help, Mike. I was quick to jump to a wrong conclusion. My codec does not implement Term-Vectors, Payloads, DocValues and Norms. It should be trivial to implement Payloads, but I am not sure about the others. Anyway, I can generate an HTML report and identify failures based on individual t

unindexed field boost

2013-05-14 Thread Tamer Gür
Hello all, I was wondering why unindexed fields can no longer be boosted, compared to Lucene 3, since these fields still appear in the score calculation when I check the score explanation. Is there any clean way to work around this? Thanks, Tamer

Re: Retrieving FieldInfo

2013-05-14 Thread Nicola Buso
I know this can sound horrible/flexible/... but this means I can add two documents with the same field name, but different configurations, for example different IndexOptions? Nicola. On Tue, 2013-05-14 at 12:52 +0100, Nicola Buso wrote: > OK, thanks for the reply! > > > Nicola. > > On Tue, 2013

Re: lucene and mongodb

2013-05-14 Thread Jack Krupansky
That was tried with Lucandra/Solandra, which stored the Lucene index in Cassandra, but was less than optimal, so that model was discarded in favor of indexing Cassandra data directly into Solr/Lucene, side-by-side in each Cassandra node, but in native Lucene. The latter approach is now available

Re: lucene and mongodb

2013-05-14 Thread Adrien Grand
Hi, On Tue, May 14, 2013 at 1:34 PM, Rider Carrion Cleger wrote: > So, can I have for sure scalability and safety with a distribution on top > of Lucene like Solr ? Yes, Solr can help you shard your index and add replicas, see http://wiki.apache.org/solr/SolrCloud. -- Adrien

Re: Retrieving FieldInfo

2013-05-14 Thread Nicola Buso
OK, thanks for the reply! Nicola. On Tue, 2013-05-14 at 14:19 +0300, Shai Erera wrote: > If your documents always contain the same fields then yes. But in > general, you can do: > > > addDocument("f:value"); > commit(); > > addDocument("c:value"); > commit(); > > And each AtomicReader will c

Re: lucene and mongodb

2013-05-14 Thread Rider Carrion Cleger
Thank you, Adrien. I was not thinking about the index's scalability but about a safe place to store it (like a relational DBMS... or NoSQL). So, can I be sure of scalability and safety with a distribution on top of Lucene like Solr? On Tue, May 14, 2013 at 11:08 AM, Adrien Grand wrote: > Hi, > > On Tue

Re: Retrieving FieldInfo

2013-05-14 Thread Shai Erera
If your documents *always* contain the same fields then yes. But in general, you can do: addDocument("f:value"); commit(); addDocument("c:value"); commit(); And each AtomicReader will contain different fields. As getFieldInfos() documents "Get the {@link FieldInfos} describing all fields in *this
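
To illustrate the point above, here is a minimal sketch in plain Java, not Lucene code. Each `Set<String>` is only a stand-in for the field names that one segment (one AtomicReader) would report, and the class and method names (`MergedFieldsSketch`, `mergedFieldNames`) are hypothetical. It shows why looking at the first leaf alone can miss fields, and that a full description of the index needs the union over all leaves. In Lucene 4.x, `MultiFields.getMergedFieldInfos(IndexReader)` appears to perform this merge on real FieldInfos; check the Javadoc of your version before relying on it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java model only: each Set<String> stands in for the field names
// that one segment (one AtomicReader) reports. No Lucene classes are used.
class MergedFieldsSketch {

    // Union of the field names over every segment, sorted for stable output.
    static Set<String> mergedFieldNames(List<Set<String>> perSegmentFields) {
        Set<String> merged = new TreeSet<String>();
        for (Set<String> segmentFields : perSegmentFields) {
            merged.addAll(segmentFields);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two commits as in the message above: the first segment only
        // knows field "f", the second only knows field "c".
        List<Set<String>> segments = Arrays.asList(
                Collections.singleton("f"),
                Collections.singleton("c"));

        // Looking at the first leaf alone misses field "c".
        System.out.println("first leaf only: " + segments.get(0));
        // The union over all leaves describes the whole index.
        System.out.println("all leaves:      " + mergedFieldNames(segments));
    }
}
```

If every document always carries the same fields, the first leaf and the union agree, which matches the "then yes" case in the reply.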

Retrieving FieldInfo

2013-05-14 Thread Nicola Buso
Hi, I was looking for a way to obtain FieldInfo(s) from the IndexReader; we need some way to describe the index. Can I do this? AtomicReader ar = .leaves().get(0).reader(); // then call ar.getFieldInfos(); What I mean is, can I suppose every AtomicReader in leaves() contain the

Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Michael McCandless
On Tue, May 14, 2013 at 3:03 AM, Ravikumar Govindarajan wrote: > We ran the checkIndex and a simple test case. It passes. Actually, I had > assumed a problem with Lucene, whereas it was an issue with our custom codec. Phew, thanks for bringing closure! > I do not know how to confirm whether a new

Re: Find index version with an index reader

2013-05-14 Thread Ramprakash Ramamoorthy
On Tue, May 14, 2013 at 1:58 PM, Ian Lea wrote: > Take a look at org.apache.lucene.index.CheckIndex. That displays the > versions of the segment files. Note the plurals - that's a > complication you may need to deal with. > > Or read|store whatever you want with > IndexWriter.get|setCommitData(

Re: lucene and mongodb

2013-05-14 Thread Adrien Grand
Hi, On Tue, May 14, 2013 at 10:35 AM, Rider Carrion Cleger wrote: > - Can I store the lucene index in a mongodb database ? I don't know whether it's possible, but even if it was, I would not recommend it. Lucene works best on local filesystems, and even better if the disk is an SSD. If your inte

lucene and mongodb

2013-05-14 Thread Rider Carrion Cleger
Hi team, I'm working with Apache Lucene 4.2.1 and I would like to store the Lucene index in a NoSQL database. So my question is: can I store the Lucene index in a MongoDB database? Thank you, team!

Re: Find index version with an index reader

2013-05-14 Thread Ian Lea
Take a look at org.apache.lucene.index.CheckIndex. That displays the versions of the segment files. Note the plurals - that's a complication you may need to deal with. Or read|store whatever you want with IndexWriter.get|setCommitData(...). You can get the currently running version via LucenePa

RE: Performance of NULL check *:* -category:[* TO *]

2013-05-14 Thread srividhyau
Hi - We are using 3.0.3. Could you point me to similar functionality prior to 4.0? - Vidhya

Re: TermsEnum.docFreq() returns 0

2013-05-14 Thread Ravikumar Govindarajan
We ran the checkIndex and a simple test case. It passes. Actually, I had assumed a problem with Lucene, whereas it was an issue with our custom codec. I do not know how to confirm whether a new codec works correctly. Are there any tools/existing test-cases available for validation? -- Ravi On Mo