Chris,

The limitation on huge genome map ranges is for display software (putting all
those features into an image a person can understand).  500Kb is about an
average viewable size, though some uses will draw on 1-10 Mb of data.  I used
a real-world test case that will be directly applicable to how fast biologists
get to see their interesting genes.   Some time maybe I'll benchmark bigger
ranges.  This use of Lucene for bio-data is perhaps not where it can show its
advantage most dramatically (as noted, the SQL databases are pretty good at
numeric searches and Lucene only edges them out by a nose :).  It is really
with the text-rich (biology-jargon) literature and experimental data sets that
Lucene can show its stuff.  Phrase searching of biology experimental
phrases is a good example: almost impossible to do easily with SQL systems
(even MySQL textsearch is weak here), while Lucene in my tests easily picks out
relevant biology phrases.  Lion Bioscience's SRS is a widely used commercial
system that is text-search based, but it lacks phrase search ability.
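
To give a rough idea of why phrase search comes naturally to a text index
that stores word positions, here is a toy sketch in Python.  It is not
Lucene's code or API; the function names (build_index, phrase_search) and
the sample sentences are made up for illustration only.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to {doc_id: [positions]} (a toy positional inverted index)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

def phrase_search(index, phrase):
    """Return sorted ids of docs where the terms of `phrase` occur consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Only documents containing every term can match.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in candidates:
        following = [set(index[t][doc_id]) for t in terms[1:]]
        # A match needs term i+1 exactly i+1 positions after the first term.
        for start in index[terms[0]][doc_id]:
            if all(start + i + 1 in following[i] for i in range(len(following))):
                hits.append(doc_id)
                break
    return sorted(hits)

# Hypothetical sample "literature" snippets.
docs = {
    1: "mutants show a segment polarity defect in the embryo",
    2: "polarity of the segment is lost in these mutants",
    3: "the segment polarity genes include wingless and engrailed",
}
index = build_index(docs)
print(phrase_search(index, "segment polarity"))
```

Document 2 contains both words but not next to each other, so only the
position check separates it from the real phrase matches.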

-- Don
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- [EMAIL PROTECTED]://marmot.bio.indiana.edu/
