Hi,

I have hacked together a small web front end to the Glimpse text indexing engine (see http://webglimpse.net/ for information). I am very happy with how Glimpse indexes and searches data. If I understand it correctly, it combines an index with searching directly in the files themselves, the way grep or similar tools do. The problem is that I discovered it is not open source, and now that I want to extend its use from private to company-wide, I will run into license problems and costs.
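To make that index-plus-scan idea concrete, here is a rough sketch of how I picture it. This is only my guess at the general technique, not Glimpse's actual implementation: the file names, file contents, the word pattern, and the case folding are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "files"; in practice these would be read from disk.
FILES = {
    "Example.java": "package demo;\nimport foo.bar.baz;\nclass Example {}\n",
    "billing.sql": "BEGIN\n  My_Nice_API.Some_Method;\nEND;\n",
    "notes.txt": "nothing relevant here\n",
}

# Assumption: a "word" is a run of letters, digits, and underscores.
WORD = re.compile(r"[A-Za-z0-9_]+")


def build_index(files):
    """Map each lowercased word to the set of files that contain it."""
    index = defaultdict(set)
    for name, text in files.items():
        for word in WORD.findall(text):
            index[word.lower()].add(name)
    return index


def search(query, files, index):
    """Stage 1: narrow the candidates via the index.
    Stage 2: scan only those candidates for the literal query string."""
    words = [w.lower() for w in WORD.findall(query)]
    candidates = set(files)
    for w in words:
        # Keep only files that contain every word of the query.
        candidates &= index.get(w, set())
    needle = query.lower()
    return sorted(name for name in candidates if needle in files[name].lower())


INDEX = build_index(FILES)
print(search("foo", FILES, INDEX))
print(search("foo.bar.baz", FILES, INDEX))
print(search("My_Nice_API.Some_Method", FILES, INDEX))
```

The point of the second stage is that punctuation such as "." never has to be indexed: the index only picks the candidate files, and the plain scan decides whether the exact string is really there.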
So I decided to try out Lucene. I tried the examples and changed them a bit to use another analyzer. But when I started to think about it, I realized that I will not be able to build something like Glimpse, at least not easily. Why? I will try to explain.

As stated above, Glimpse uses a combination of index and in-file search. This makes it very powerful, in the sense that I can get hits for things that are not necessarily indexed as terms. Let's say I have a file with this content:

    ...
    import foo.bar.baz;
    ...

With Glimpse, and without telling it how to index the content, I can find the above file using a search string like "foo" or "bar" but also, and this is important, using "foo.bar.baz".

Another example: we have a lot of PL/SQL source code, and often you can find code like this:

    ...
    My_Nice_API.Some_Method
    ...

Here too, Glimpse is almost magic since it combines index and normal search. I can find the file above using "My_Nice_API" or "My_Nice_API.Some_Method". In a sense, I can have my cake and eat it too.

If I want to do similar "free" searching with Lucene, I think I have to create analyzers for the different kinds of source code files, with fields for this and that. Quite an undertaking.

Does anyone understand my point here, and am I correct that it would be hard to implement something as "free" as Glimpse? I am not trying to criticize, just to understand how Lucene (and Glimpse) work.

Oh, yes, Glimpse has one big drawback: it only supports search strings of up to 32 characters.

Thanks!
/Mathias