Hi Scott,

sdeck wrote:
> I can't combine each of the movie queries together into one, because I get a
> memory error because of how many clauses there are (setting the clause
> higher did not help)

Have you tried increasing the memory available to the JVM? Sun's JVM takes an
option "-Xmx" to change the maximum amount of heap space to use (defaults to
64MB). For Java 1.5, see
<http://java.sun.com/j2se/1.5.0/docs/tooldocs/windows/java.html#Xms> for
Windows or
<http://java.sun.com/j2se/1.5.0/docs/tooldocs/solaris/java.html#Xms> for
Solaris and Linux.

You may have to increase the maximum number of allowed clauses too (sounds
like you're already aware of this one):
<http://lucene.apache.org/java/docs/api/org/apache/lucene/search/BooleanQuery.html#setMaxClauseCount(int)>

If this doesn't help, you may want to look into QueryFilter
<http://lucene.apache.org/java/2_0_0/api/org/apache/lucene/search/QueryFilter.html>.
You might try using a ChainedFilter (from the Lucene Sandbox - note the latest
release of this class is not located in lucene-core-2.0.0.jar, but rather in
lucene-misc-2.0.0.jar)
<http://lucene.apache.org/java/2_0_0/api/org/apache/lucene/misc/ChainedFilter.html>
to connect movie QueryFilters for a genre. To improve performance (beyond the
first query execution), you could wrap the individual QueryFilters in
CachingWrapperFilters
<http://lucene.apache.org/java/2_0_0/api/org/apache/lucene/search/CachingWrapperFilter.html>.

For something completely different, since you seem to be interested in online
query performance, you could run all possible queries offline, and use the
results to construct a derived index, in which documents contain "actor",
"movie" and "genre" fields. This derived index would be plenty fast, I expect.
And if running all possible genre queries is too resource-intensive, then you
could compromise and construct your derived index with just an "actor" field,
or both an "actor" and a "movie" field.

In any case, it sounds like the number of documents in your index is fairly
small -- have you tried using RAMDirectory
<http://lucene.apache.org/java/docs/api/org/apache/lucene/store/RAMDirectory.html>?
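
To make the QueryFilter / CachingWrapperFilter / ChainedFilter suggestion a bit
more concrete, here is a rough sketch of the idea using only java.util, not
Lucene itself. In Lucene 2.0 a Filter reports its matches as a java.util.BitSet
over document numbers, so caching one bit set per movie and OR-ing them for a
genre is roughly what the filter chain would do for you. The class name, the
movie ids and the document numbers are all made up for illustration; in a real
setup the bit sets would come from running each movie query once.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the cached-filter idea; this is NOT Lucene code.
final class GenreFilterSketch {
    // One cached match set per (hypothetical) movie id.
    private final Map<String, BitSet> movieBits = new HashMap<String, BitSet>();

    // Remember which document numbers matched a movie's query,
    // so the query does not have to be executed again.
    void cacheMovie(String movieId, int... docIds) {
        BitSet bits = new BitSet();
        for (int doc : docIds) {
            bits.set(doc);
        }
        movieBits.put(movieId, bits);
    }

    // OR together the cached sets of every movie in a genre.
    // Movies with no cached set are skipped.
    BitSet genreBits(List<String> movieIds) {
        BitSet result = new BitSet();
        for (String id : movieIds) {
            BitSet bits = movieBits.get(id);
            if (bits != null) {
                result.or(bits);
            }
        }
        return result;
    }
}
```

The point of the sketch is that after the first pass, a genre lookup costs one
bit-set OR per movie rather than one query execution per movie.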

Hope it helps,
Steve

> Steven Rowe wrote:
>> Hi Sdeck,
>>
>> sdeck wrote:
>>> The query for collecting a specific actor is around 200-300 milliseconds,
>>> and the movie one, that actually queries each actor, takes roughly
>>> 500-700 milliseconds. Yet, for a genre, where you may have 50-100 movies,
>>> it takes 500 milliseconds*# of movies
>> I'm having trouble visualizing both what your documents and your queries
>> look like. Can you please provide more concrete information?
>> Sometimes, actual code helps.
>>
>> For example, how do actors, movies and genres relate to your documents?
>> Do you have some external source(s) of information (i.e. external to
>> your Lucene index) that relate actors to movies? And movies to genres?
>>
>> If actors, movies and genres are supposed to be a metaphor for what
>> you're "really" representing, then you'll have to extend your metaphor a
>> little bit to make sense (for "me" anyway) of what you're trying to "do".
>>
>> Steve