Hi, I have the following situation: I have two pretty large indices. One consists of about 1 billion documents (~6GB on disk) and the other has about 2 billion documents (~10GB on disk). The documents are very short (4-5 terms each in the text field, plus one numeric field holding a long value). The indices are read-only: I'm only going to read from them and never write. There is only one segment in each index (at least there should be; I called forceMerge(1) on them).
Search latency is the most important thing to me. I need it to be blazing fast, ~20ms per query. Queries are always of the type +term1 +term2 +term3, and I'm asking for 10 results from each index (both indices are searched simultaneously).

I have a fast server (12 cores at 3GHz each) with 32GB of RAM (running Linux), and I can keep both indices in memory when using a RAMDirectory. This didn't achieve the expected result (average query time = ~43ms). I'm also seeing latency spikes, where the same query is sometimes answered in 10ms but on a different occasion takes 2-3 seconds. I'm guessing this is due to GC (as explained here<http://lucene.472066.n3.nabble.com/Plans-to-remove-RAMDirectory-td3601156.html>).

Using a warmed-up MMapDirectory didn't help; the average query time was a bit slower. I tried using InstantiatedIndex, but its memory consumption is huge, and I couldn't even load the smaller 6GB index.

Any ideas about what the ideal configuration for me could be? Thanks.
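
P.S. Since the average (~43ms) and the occasional 2-3 second spikes are two different problems, it may help to look at percentiles rather than the mean alone. Below is a minimal sketch of how per-query latencies could be collected and summarized. The class name LatencyStats and its methods are hypothetical (they are not part of Lucene), and the Runnable passed to time() is only a stand-in for whatever search call is actually being measured.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Lucene class): collects per-query latencies. */
public class LatencyStats {
    private final long[] samples;
    private int count = 0;

    public LatencyStats(int capacity) {
        samples = new long[capacity];
    }

    /** Times one query in nanoseconds; the Runnable stands in for the real search call. */
    public void time(Runnable query) {
        long start = System.nanoTime();
        query.run();
        record(System.nanoTime() - start);
    }

    /** Stores one latency sample; the caller decides the unit (ns, ms, ...). */
    public void record(long elapsed) {
        if (count == samples.length) {
            throw new IllegalStateException("capacity exceeded");
        }
        samples[count++] = elapsed;
    }

    /** Nearest-rank percentile for p in (0, 100]. */
    public long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * count);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Arithmetic mean of all recorded samples. */
    public double mean() {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += samples[i];
        }
        return sum / (double) count;
    }
}
```

With something like this, a run where most queries are fast but a few stall would show a low median and a very high 99th percentile, which would be consistent with pauses such as GC rather than with uniformly slow searches.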