Re: New Lucene User

2013-06-18 Thread Ashwin Tandel
Raghav, I would like to second Jack. Solr would take care of indexing your documents without your writing any code, and it has scalability features such as replication and sharding, if required, that would handle large volumes of data. http://lucene.apache.org/solr/ Regards, Ashwin On Tue, Jun 18, 2013 at

Re: segments and sorting

2013-06-18 Thread Sriram Sankar
> You can sort each segment independently or have a single segment, both > options are available. To have a single segment, you just need to wrap > your top-level index reader with SlowCompositeReaderWrapper before > wrapping it again in a SortingAtomicReader and calling > IndexWriter.addIndexes.

Re: TestGrouping.Java seems to combine multiple tests into one huge test

2013-06-18 Thread Tom Burton-West
Thanks Mike and Robert, >>Refactoring this test would be fantastic, but I wouldn't want to take it on :) >>Maybe an easier step would be to rename this test something like TestRandomGrouping, and add some brand new very simple-easy-to-understand tests to a new file(s). I opened LUCENE-5065. I'l

Re: TestGrouping.Java seems to combine multiple tests into one huge test

2013-06-18 Thread Michael McCandless
+1 to somehow refactor this scary test to make it more understandable! Mike McCandless http://blog.mikemccandless.com On Tue, Jun 18, 2013 at 12:48 PM, Tom Burton-West wrote: > Hello, > > I'm trying to understand BlockGroupingCollector. I thought I would start > by running the tests in the d

Re: TestGrouping.Java seems to combine multiple tests into one huge test

2013-06-18 Thread Robert Muir
On Tue, Jun 18, 2013 at 9:48 AM, Tom Burton-West wrote: > Hello, > > I'm trying to understand BlockGroupingCollector. I thought I would start > by running the tests in the debugger. However the only test I can find is > > lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.

TestGrouping.Java seems to combine multiple tests into one huge test

2013-06-18 Thread Tom Burton-West
Hello, I'm trying to understand BlockGroupingCollector. I thought I would start by running the tests in the debugger. However the only test I can find is lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java In TestGrouping.java, in the second test, "testRandom" it see

Looking for Search Engineers

2013-06-18 Thread Jagdish Nomula
Hello, SimplyHired.com, a job search engine with the biggest job index in the world, is looking for engineers to help us with our core search and auction systems. Some of the problems you will be working on are: a) Scaling to millions of requests b) Working with millions of jobs c) Maximizing the

[ANNOUNCE] Apache Lucene 4.3.1 released

2013-06-18 Thread Shalin Shekhar Mangar
June 2013, Apache Lucene™ 4.3.1 available The Lucene PMC is pleased to announce the release of Apache Lucene 4.3.1 Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text

RE: New Lucene User

2013-06-18 Thread raghavendra.k.rao
Heikki, Thank you very much. I tried it out and the initial results look good, although I get "java.lang.OutOfMemoryError: Java heap space" when I search a single TextField over 70 million records. Probably my code needs tuning. I'll research more to figure it out. But this is a great start

Re: Upgrading from 3.6.1 to 4.3.0 and Custom collector

2013-06-18 Thread Peyman Faratin
Hi Adrien, thank you very much. It worked. Have a good day. On Jun 18, 2013, at 5:35 AM, Adrien Grand wrote: > Hi, > > You didn't say specifically what your problem is so I assume it is > with the following method: > > On Tue, Jun 18, 2013 at 4:37 AM, Peyman Faratin > wrote: >>

[ANN] Lux XML search engine

2013-06-18 Thread Michael Sokolov
I'm pleased to announce the first public release of Lux (version 0.9.1), an XML search engine embedding Saxon 9 and Lucene/Solr 4. Lux offers many features found in XML databases: persistent XML storage, index-optimized querying, an interactive query window, and some application support feature

Programmatically create proximity or slop queries using Lucene's flexible parser

2013-06-18 Thread Kasper van den Berg
Hi all, (Cross posted to stackoverflow yesterday (http://stackoverflow.com/q/17154510/814206), no answers at SO yet, perhaps java-users@lucene is a better place for this question; I hope not to annoy any of you with these duplicate messages) In Lucene (currently using version 4.1) using Lucene

Re: Upgrading from 3.6.1 to 4.3.0 and Custom collector

2013-06-18 Thread Adrien Grand
Hi, You didn't say specifically what your problem is so I assume it is with the following method: On Tue, Jun 18, 2013 at 4:37 AM, Peyman Faratin wrote: > public void setNextReader(IndexReader reader, int docBase) > throws IOException{ > this.docBase =

Re: segments and sorting

2013-06-18 Thread Adrien Grand
On Tue, Jun 18, 2013 at 1:05 AM, Sriram Sankar wrote: > I'm sorry - I meant "DocValue" not "FieldValue". Slide 20 in the following > deck talks about the 2 GB limit. Doc values don't have this limit anymore. However, there is a limit of ~32 KB per term, but this shouldn't be a problem with reasona