Congrats!
A couple of questions:
1) Which version of Solr is this based on?
2) How is LWE different from standard Solr? How should one choose between the
two?
Thanks.
--- On Wed, 12/15/10, Grant Ingersoll wrote:
> From: Grant Ingersoll
> Subject: [ANN] General Availability of LucidWorks Enterp
Thanks for your information.
My current stats:
250 GB of data, a 40 GB index, and 60 million records are working fine with 1
GB of RAM. We store a minimal amount of data in the index and sort on
date. Even on a single system, the databases are sharded.
We are planning to build hosted sol
I have an app that seems to be locking on some search calls. I am
including the stack trace for the blocked and blocker threads. We are
using the following jars: lucene-snowball-2.1.0.jar and lucene-2.1.0.jar.
The indexes are located on the local disk. We are running on multiple
JVMs against the
I don't understand your problem well, but knowing when a new
term occurs is a hard problem because when a new document is added, it
is added to a new segment. I think you can only do this in the
last merge of the optimization stage. You can read the code in
SegmentMerger.mergeTermInfos(). I
I'm using MultiPhraseQuery to implement a fuzzy phrase query.
E.g. the user enters "blue lorry" and I expand 'blue' to 'turquoise' and 'glue',
and 'lorry' to 'truck', 'van', 'lory' and 'lorrie'. I can then construct a
MultiPhraseQuery with those lists of terms.
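
A minimal sketch of the per-position expansion described above, in plain Java with no Lucene dependency. It only illustrates the idea of "one set of alternative terms per phrase position" and is not how MultiPhraseQuery is implemented. The class name PhraseAlternatives and its methods blueLorry() and matches() are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PhraseAlternatives {

    /** Hypothetical helper: the expanded "blue lorry" example, one term set per position. */
    static List<Set<String>> blueLorry() {
        List<Set<String>> positions = new ArrayList<>();
        positions.add(new HashSet<>(Arrays.asList("blue", "turquoise", "glue")));
        positions.add(new HashSet<>(Arrays.asList("lorry", "truck", "van", "lory", "lorrie")));
        return positions;
    }

    /**
     * Returns true if some run of consecutive tokens contains, at each offset,
     * one of the alternatives allowed for that phrase position.
     */
    static boolean matches(List<Set<String>> positions, List<String> tokens) {
        for (int start = 0; start + positions.size() <= tokens.size(); start++) {
            boolean allPositionsMatch = true;
            for (int i = 0; i < positions.size(); i++) {
                if (!positions.get(i).contains(tokens.get(start + i))) {
                    allPositionsMatch = false;
                    break;
                }
            }
            if (allPositionsMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> query = blueLorry();
        // Alternatives in phrase order: expected to match.
        System.out.println(matches(query, Arrays.asList("a", "turquoise", "van", "parked")));
        // Same words in the wrong order: expected not to match.
        System.out.println(matches(query, Arrays.asList("van", "turquoise")));
    }
}
```

In Lucene itself, each of these term sets would be passed as a Term[] to MultiPhraseQuery.add(), one call per position.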
The search works correctly but the
I’m using Lucene to index database records and text documents.
I want to provide efficient fuzzy queries over the data so I’m using a secondary
Lucene index for all of the distinct terms encountered in the primary index.
Each ‘document’ in the secondary index is a term from the primary index wi
As Ryan mentions, you really should consider piling them all into
a single index. Yes, it seems really wasteful to re-index author,
URL with every last photo, but try it and see if the size is acceptable.
Or, more accurately, whether performance is acceptable.
Best
Erick
On Wed, Dec 15, 2010 at 1
Would you be able to create a single index with all photos? Your searches would
go against the photo index. At that point, you would have the most relevant
photos regardless of album. You could then introduce a sort to your Lucene
search to ensure all photos from a given album are grouped togeth
Lucid Imagination is pleased to announce the general availability of our Apache
Solr/Lucene powered LucidWorks Enterprise (LWE). LWE is designed to make it
easier for people to get up to speed on search by providing easier management,
integration with libraries commonly used in building search
Have a look at http://lucene.apache.org/java/3_0_2/scoring.html on how Lucene's
scoring works. You can override the Similarity class in Solr as well via the
schema.xml file.
On Dec 15, 2010, at 10:28 AM, Pavel Minchenkov wrote:
> Hi,
> Please give me advice on how to create custom scoring. I ne
On Wed, Dec 15, 2010 at 1:41 PM, Chris Hostetter
wrote:
> files with the same names should be the same, files with different names
> should be very different -- but if your binary diff tool is finding
> commonalities between files in new segments as the index grows over time,
> and you feel like yo
: In my testing, when the filenames are the same, doing an xdelta on the
: files (mainly the file that contains most of the data, the .cfs file),
: there is a significant reduction in the size of the patch file created.
As noted elsewhere in this thread, the filenames themselves are
significant
Also, when taking the Similarity suggestion below note two things in
Lucene's default behavior that you seem to wish to avoid:
The first is IDF - but only for multi-term queries - otherwise ignore this
comment.
For multi-term queries to only consider term frequency and doc length, you
may want to
Hi,
We are using a Lucene 3.x index to search for photo albums based on
textual properties such as photo album title/author/URL and photo
captions/URLs. The goal is to find the most relevant photo albums for a user
query and display the best matching photos for these albums.
In our current solution w
Sounds to me like Lucene should do a pretty good job without any extra
work on your part. See javadocs for
org.apache.lucene.search.Similarity
for details on how it works. You can change things by providing your
own implementation.
There is also the org.apache.lucene.search.function package but
Hi,
Please give me advice on how to create custom scoring. I need the resulting
documents to be ordered by how popular each term in the document is
(popular = how many times it appears in the index) and by the length of the
document (fewer terms = higher in the search results).
For example, index contai
On Wed, Dec 15, 2010 at 7:49 AM, Doron Cohen wrote:
> Perhaps I'll change my mind after understanding the scenario that creates
> this, but for now I'd rather not ignore the file name differences.
It may be possible to control the data generation process, so
the filenames are consistent. Chan
> I could make an exception in the patch creation program to detect
> that there is a Lucene directory, and diff the .cfs files, even if
> they have different names, but was seeing if I can avoid that
> so the patch program can be agnostic about the contents of the
> directory tree.
>
Doing only th
On Wed, 2010-12-15 at 09:42 +0100, Ganesh wrote:
> What is the advantage of going for 64-bit?
Larger maximum heap, more memory in the machine.
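
A small sketch for checking what a given JVM actually reports, using only standard Java calls. The class name JvmMemoryReport and the method maxHeapMb() are hypothetical names for this illustration. Note that "sun.arch.data.model" is a vendor-specific property and may be absent on some JVMs, so the code falls back to "unknown".

```java
public class JvmMemoryReport {

    /** Maximum amount of memory the JVM will attempt to use for the heap, in whole megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // "os.arch" is a standard system property.
        System.out.println("os.arch             = " + System.getProperty("os.arch"));
        // Vendor-specific; typically "32" or "64" where it is defined (assumption: may be missing).
        System.out.println("sun.arch.data.model = "
                + System.getProperty("sun.arch.data.model", "unknown"));
        System.out.println("max heap (MB)       = " + maxHeapMb());
    }
}
```

Running it with different -Xmx settings on a 32-bit and a 64-bit JVM shows how large a heap each one accepts on the machine in question.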
> People claim performance and usage of more RAM.
Yes, pointers normally take up 64 bits on a 64-bit machine. Depending on
the application, the overhead can
What is the advantage of going for 64-bit? People claim better performance and
the ability to use more RAM.
On a 32-bit OS, a JVM handles 1 to 1.5 GB of RAM. On a 64-bit OS, is a single
JVM still unable to use more than 1.5 GB of RAM? What if we host multiple JVM
instances on a single system?
Please help me with some mor
20 matches