Re: Similarity Implementation

2016-07-07 Thread Đạt Cao Mạnh
Hi Siraj, I think https://lucene.apache.org/core/6_1_0/core/index.html?org/apache/lucene/search/ConstantScoreQuery.html should be good enough. On Fri, Jul 8, 2016 at 12:27 AM Siraj Haider wrote: > We are in the process of upgrading from 2.x to 6.x. In 2.x we implemented > our own similarity whe

Port of Custom value source from v4.10.3 to v6.1.0

2016-07-07 Thread paule_lecuyer
Hi all, I wrote some time ago a ValueSourceParser + ValueSource to allow using results produced by an external system as a facet query : - in solrconfig.xml : added my parser : http://lucene.472066.n3.nabble.com/Port-of-Custom-value-source-from-v4-10-3-to-v6-1-0-tp4286236.html Sent from the Luc

Similarity Implementation

2016-07-07 Thread Siraj Haider
We are in the process of upgrading from 2.x to 6.x. In 2.x we implemented our own similarity where all the functions return 1.0f. How can we implement such a thing in 6.x? Is there an existing implementation that we can use to get the same results? -- Regards -Siraj Haider (212) 306-01

Query with modulo function in Lucene without Solr?

2016-07-07 Thread Randy Tidd
I would like to use the mod() function in a query, for example to fetch every 10th or 100th matching document, or to return documents for which mod() on a numeric field yields a certain result. I know this question has come up in the past and I have seen answers that suggest using
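
For illustration only, the two selections described in this question can be sketched in plain Java as a post-filter over hits or field values that have already been retrieved. This is not a Lucene query and does not use any Lucene API; the class and method names (ModSelect, everyNth, withRemainder) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: modulo-based selection over already-fetched results. */
final class ModSelect {

    /** Keeps every n-th element of an ordered hit list, i.e. the 1-based positions n, 2n, 3n, ... */
    static <T> List<T> everyNth(List<T> hits, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<T> selected = new ArrayList<>();
        for (int i = 0; i < hits.size(); i++) {
            if ((i + 1) % n == 0) {
                selected.add(hits.get(i));
            }
        }
        return selected;
    }

    /** Keeps the numeric field values whose value mod n equals the given remainder. */
    static List<Long> withRemainder(List<Long> values, long n, long remainder) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<Long> selected = new ArrayList<>();
        for (long value : values) {
            // floorMod keeps the result in [0, n) even for negative values.
            if (Math.floorMod(value, n) == remainder) {
                selected.add(value);
            }
        }
        return selected;
    }
}
```

A post-filter like this still visits every match, so it only shows the intended semantics; it says nothing about how to push the test into the query itself.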

Re: Document retrieval, performance, and DocValues

2016-07-07 Thread Michael McCandless
You should do the MultiDocValues.getBinaryDocValues(indexReader, "pos_id") once up front, not per hit. You could operate per-segment instead by making a custom Collector. Are you sorting by your pos_id field? If so, the value is already available in each FieldDoc and you don't need to separately

Re: Hierarchical Facets need duplicated counts

2016-07-07 Thread Nicola Buso
Any hint on how to calculate these values without requesting the whole facet hierarchy and counting them? Is there a specific point in the code where I can check for this distinct count, and maybe modify the code? Nicola On Wed, 2016-07-06 at 13:42 +0100, Nicola Buso wrote: > Hello everyone, > > we

Re: Lucene cluster with NFS or synchronization tool such as rsync

2016-07-07 Thread Michael McCandless
Alas, there are no more docs than the classes themselves, in the lucene/replicator module, under the oal.replicator.nrt package. Essentially, you create a PrimaryNode (equivalent of IndexWriter) for indexing documents, in a JVM on machine 1, and a ReplicaNode in a JVM on machine 2, but you must su

Re: IndexWriter and IndexReader in a shared environment

2016-07-07 Thread Michael McCandless
The API is pretty simple. Create IndexWriter and leave it open forever, using it to index/delete documents, and periodically calling IW.commit when you need durability. Create a SearcherManager, passing it the IndexWriter, and use it per-search to acquire/release the searcher. Periodically (idea

Re: lucene index reader performance

2016-07-07 Thread Michael McCandless
Somehow you need to get the sorting server-side ... that's really the only way to do your use case efficiently. Why can't you sort each request to your N shards, and then do a merge sort on the client side, to get the top hits? Mike McCandless http://blog.mikemccandless.com On Thu, Jul 7, 2016
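
A minimal sketch of the client-side merge suggested above, assuming each shard returns its hits already sorted by descending score. The Hit and ShardMerge types are hypothetical stand-ins written in plain Java, not Lucene classes; for real TopDocs results Lucene provides TopDocs.merge, which may be the better fit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper: k-way merge of per-shard hit lists, each sorted by descending score. */
final class ShardMerge {

    /** One hit from one shard; illustrative fields only, not a Lucene type. */
    static final class Hit {
        final int shard;
        final int doc;
        final float score;

        Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Read position inside one shard's sorted list. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit current() {
            return hits.get(pos);
        }
    }

    /** Returns the k best hits overall without re-sorting the full result set. */
    static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
        // The queue is ordered so that the cursor pointing at the highest score comes first.
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
                (a, b) -> Float.compare(b.current().score, a.current().score));
        for (List<Hit> hits : perShard) {
            if (!hits.isEmpty()) {
                queue.add(new Cursor(hits));
            }
        }
        List<Hit> merged = new ArrayList<>();
        while (merged.size() < k && !queue.isEmpty()) {
            Cursor top = queue.poll();
            merged.add(top.current());
            top.pos++;
            if (top.pos < top.hits.size()) {
                queue.add(top); // re-insert with its next hit as the new key
            }
        }
        return merged;
    }
}
```

With N shards the merge costs roughly O(k log N) comparisons, so each shard only needs to return its own top k hits rather than every matching docId.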

Re: lucene index reader performance

2016-07-07 Thread Tarun Kumar
Any suggestions, please? On Mon, Jul 4, 2016 at 3:37 PM, Tarun Kumar wrote: > Hey Michael, > > docIds from multiple indices (from multiple machines) need to be > aggregated, sorted and first few thousand need to be queried. These few > thousand docs can be distributed among multiple machines. Each ma

Re: dv field is too large

2016-07-07 Thread Michael McCandless
I agree, I'll improve the docs about this limit. Thanks Sheng. Mike McCandless http://blog.mikemccandless.com On Wed, Jul 6, 2016 at 10:59 PM, Sheng wrote: > I agree. That said, wouldn't it also make sense to clearly point it out by > adding the comments to the corresponding classes. This is