I will look more closely at the information you supplied and comment
further, but here are some suggestions from an initial cursory review:
1 - You have 32GB of memory. With the 64-bit VM, try a 16GB or 24GB heap;
2 - Turn on huge pages:
-XX:+UseLargePages
-XX:LargePageSizeInBytes=256m
3 - Tu
Hi,
The details are as follows:
Solaris version: Solaris 10 U5 and U6
For the Java Setup, I have tried with:
Sun JDK 1.5 (32 & 64)
Sun JDK 1.6 (32 & 64)
Heap Space: 2G for 32-bit and 4G for 64-bit (set the same values for
both -Xms and -Xmx)
Disk: Tried with ZFS (U6) and UFS (U5)
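
A quick way to confirm that a given JVM actually picked up the heap settings and data model listed above is to print them at startup. This is only a minimal sketch: the class name `HeapSettingsCheck` is made up for illustration, and the `sun.arch.data.model` property is specific to Sun/HotSpot JVMs, so other VMs may report "unknown".

```java
/** Hypothetical helper: reports the data model and heap limits the running JVM sees. */
final class HeapSettingsCheck {

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // HotSpot-specific property; falls back to "unknown" on other VMs.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        return "data model: " + dataModel
                + ", max heap: " + (rt.maxMemory() / mb) + " MB"
                + ", committed heap: " + (rt.totalMemory() / mb) + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

When -Xms and -Xmx are set to the same value, the committed and max figures should be close to each other right from startup.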
I reduced the
Fuzzy search tends to be very heavy on CPU because of the Levenshtein
distance algorithm. We use it on a small (60MB) index for spell correction,
and our QPS suffers as a result.
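
To illustrate where the CPU time goes, here is a minimal sketch of a textbook two-row Levenshtein distance. It is not Lucene's actual implementation, and the class name `EditDistanceSketch` is made up for illustration. The point is that each comparison costs on the order of (query length x term length) operations, and a fuzzy query repeats that for every candidate term.

```java
/** Illustrative two-row Levenshtein distance; not Lucene's implementation. */
final class EditDistanceSketch {

    /** Minimum number of single-character inserts, deletes, or substitutions turning a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
    }
}
```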
There was recently a discussion of a new fuzzy algorithm:
https://issues.apache.org/jira/browse/LUCENE-1513?page=com.atlassian
Could you give some configuration details:
- Solaris version
- Java VM version, heap size, and any other flags
- disk setup
You should also consider using huge pages (see
http://zzzoot.blogspot.com/2009/02/java-mysql-increased-performance-with.html)
I will also be posting performance gains using
Re: Lucene search performance on Sun UltraSparc T2 (T5120) servers
>
>
> I was having some thoughts recently about speeding up fuzzy search.
>
> The current system does edit-distance on all terms A-Z, single threaded. Prefix
> length can reduce the search space a
The suggested method would speed up the search, but I doubt the gain
would be substantial on processors with slower clock speeds. Since
most processors are going multi-core, it would make sense to
multi-thread the scan.
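
As a rough sketch of what multi-threading the scan could look like (this is not Lucene code; it assumes Java 8 or later, and the names `ParallelFuzzyScanSketch`, `scan`, and `similarity` are invented for illustration), the term list can be split into one slice per thread, with each slice filtered independently by a similarity threshold. The similarity measure here, 1 - distance / longer length, is an assumption and not necessarily the formula Lucene uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: edit-distance scan over a term list, split across threads. */
final class ParallelFuzzyScanSketch {

    /** Textbook two-row Levenshtein distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed similarity measure: 1 - distance / length of the longer string. */
    static double similarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        return longer == 0 ? 1.0 : 1.0 - (double) editDistance(a, b) / longer;
    }

    /** Returns terms whose similarity to query is at least minSimilarity, in input order. */
    static List<String> scan(List<String> terms, String query, double minSimilarity, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (terms.size() + threads - 1) / threads; // ceiling division
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int start = 0; start < terms.size(); start += chunk) {
                List<String> slice = terms.subList(start, Math.min(start + chunk, terms.size()));
                // Each task scans its own slice; no shared mutable state.
                futures.add(pool.submit(() -> {
                    List<String> hits = new ArrayList<>();
                    for (String term : slice) {
                        if (similarity(term, query) >= minSimilarity) hits.add(term);
                    }
                    return hits;
                }));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                try {
                    result.addAll(f.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `ParallelFuzzyScanSketch.scan(terms, "lucene", 0.8, 8)` would spread the comparisons over eight threads. Whether that helps in practice depends on how many terms survive any prefix filter and on how the per-thread overhead compares with the per-term cost.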
Any remarks are welcome!
Varun Dhussa
Product Architect
I was having some thoughts recently about speeding up fuzzy search.
The current system does edit-distance on all terms A-Z, single threaded. Prefix
length can reduce the search space and there is a "minimum similarity"
threshold but that's roughly where we are. Multithreading this to make use o