Hi All,
I have a sort performance question. I have a fairly large index consisting of chunks of full-text transcriptions of television, radio and other media, and I'm trying to make it searchable and sortable by date. The search front-end uses a ParallelMultiSearcher to search up to three indexes at a time (each index contains a month of live data).

When I search for the word "toast" (for example) sorted by score, the results come back in about 1200ms; when I sort by DateTime, the results come back in 3900ms.

Initially I was sorting on a unixtime field, but having read up on it, I switched to a slightly easier format: "yyyyMMddHHmm". This value is still larger than an int, so I went one step further and created two more fields for test purposes: SortDate, which is yyyyMMdd, and SortTime, which is HHmm. When I sort by SortDate then SortTime, the results come back even slower, around 4300ms.

To summarize:

    // The sorting fields look like this:
    new Field("SortDateTime", sdfDateTime.format(dMySortDateTime), Field.Store.YES, Field.Index.UN_TOKENIZED);
    new Field("SortDate", sdfDate.format(dMySortDateTime), Field.Store.YES, Field.Index.UN_TOKENIZED);
    new Field("SortTime", sdfTime.format(dMySortDateTime), Field.Store.YES, Field.Index.UN_TOKENIZED);

    // and the performance looks like this
    // (three alternatives, one per test run):

    // sort by score
    Sort sSortOrder = Sort.RELEVANCE; // 1200ms

    // sort by datetime
    Sort sSortOrder = new Sort("SortDateTime", true); // 3900ms

    // sort by date then time
    Sort sSortOrder = new Sort(new SortField[] {
        new SortField("SortDate", SortField.INT, bReverse),
        new SortField("SortTime", SortField.INT, bReverse)}); // 4300ms

The two indexes that are being searched at the moment look like this:

Index 1:
    Index Path: /storage/unisearch/MMS_index/2007.02/
    Index Size on Disk: 1,400,569 KB
    Number of Records: 2682238
    Index Version: 03/13/2007

Index 2:
    Index Path: /storage/unisearch/MMS_index/2007.03/
    Index Size on Disk: 2,055,199 KB
    Number of Records: 3457434
    Index Version: 03/13/2007

The search is being performed in tomcat and I'm running: org.apache.lucene - build 2007-02-14 on a Dual 3.4GHz Xeon w/ 2GB memory and Red Hat 3.4.3-22.

So, on to the question: is this fast, slow, or normal? Along with the obvious follow-up: if it's slow, how can I make it faster?

Thanks for your help!

-Dave
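
P.S. To make the point about int size concrete, here is a minimal standalone sketch (plain JDK, no Lucene) of how the three sort values are formatted and which of them fit in a 32-bit int. The class name SortKeySketch, the helper names and the sample timestamp are made up for illustration and are not taken from the actual indexing code; the patterns assume lowercase "dd" (day of month) is what was intended.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class SortKeySketch {

    // Format a date with the given SimpleDateFormat pattern.
    // Locale.US is fixed so the output is always plain Gregorian ASCII digits.
    static String format(String pattern, Date d) {
        return new SimpleDateFormat(pattern, Locale.US).format(d);
    }

    // True if the (non-negative) decimal string can be stored in a Java int.
    static boolean fitsInInt(String digits) {
        return Long.parseLong(digits) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Hypothetical sample timestamp: 13 March 2007, 15:45 local time.
        Date d = new GregorianCalendar(2007, Calendar.MARCH, 13, 15, 45).getTime();

        String sortDateTime = format("yyyyMMddHHmm", d); // 12 digits
        String sortDate = format("yyyyMMdd", d);         // 8 digits
        String sortTime = format("HHmm", d);             // 4 digits

        System.out.println("SortDateTime=" + sortDateTime
                + " fits in int: " + fitsInInt(sortDateTime));
        System.out.println("SortDate=" + sortDate
                + " fits in int: " + fitsInInt(sortDate));
        System.out.println("SortTime=" + sortTime
                + " fits in int: " + fitsInInt(sortTime));
    }
}
```

The 12-digit combined value exceeds Integer.MAX_VALUE (2147483647), which is why I split it into the separate 8-digit date and 4-digit time fields for the SortField.INT test.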