Thanks for the input! Seems I should give this another chance using the hints you all sent me. I'll report back my findings here.
/Mathias

On Mon, Feb 4, 2013 at 7:01 PM, Mathias Dahl <mathias.d...@gmail.com> wrote:
> Hi,
>
> I have hacked together a small web front end to the Glimpse text
> indexing engine (see http://webglimpse.net/ for information). I am
> very happy with how Glimpse indexes and searches data. If I understand
> it correctly, it combines an index with searching directly in the
> files themselves, the way grep or similar tools do. The problem is that
> I discovered it is not open source, and now that I want to extend the
> use from private to company-wide I will run into license problems/costs.
>
> So, I decided to try out Lucene. I tried the examples and changed them
> a bit to use another analyzer. But when I started to think about it I
> realized that I will not be able to build something like Glimpse. At
> least not easily.
>
> Why? I will try to explain:
>
> As stated above, Glimpse uses a combination of index and in-file
> search. This makes it very powerful in the sense that I can get hits
> for things that are not necessarily indexed as terms. Let's say
> I have a file with this content:
>
> ...
> import foo.bar.baz;
> ...
>
> With Glimpse, and without telling it how to index the content, I can
> find the above file using a search string like "foo" or "bar" but
> also, and this is important, using "foo.bar.baz".
>
> Another example:
>
> We have a lot of PL/SQL source code, and often you can find code like this:
>
> ...
> My_Nice_API.Some_Method
> ...
>
> Here too, Glimpse is almost magic since it combines index and normal
> search. I can find the file above using "My_Nice_API" or
> "My_Nice_API.Some_Method".
>
> In a sense I can have my cake and eat it too.
>
> If I want to do similar "free" search stuff with Lucene I think I have
> to create analyzers for the different kinds of source code files, with
> fields for this and that. Quite an undertaking.
>
> Does anyone understand my point here, and am I correct that it would
> be hard to implement something as "free" as with Glimpse? I am not
> trying to criticize, just to understand how Lucene (and Glimpse) work.
>
> Oh, yes, Glimpse has one big drawback: it only supports search strings
> up to 32 characters.
>
> Thanks!
>
> /Mathias
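
For reference, the "index plus in-file search" combination described in the quoted message can be sketched in a few lines of Python. This is only an illustration of the idea as I understand it, not Glimpse's actual algorithm and not Lucene API code. The names `build_index` and `search`, the token pattern, the case-insensitive matching, and the in-memory `docs` dictionary standing in for files on disk are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a token is a run of letters, digits, or underscores.
TOKEN = re.compile(r"[A-Za-z0-9_]+")


def build_index(docs):
    """Map each lowercased token to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in TOKEN.findall(text.lower()):
            index[token].add(name)
    return index


def search(query, docs, index):
    """Narrow the candidates via the index, then confirm with a literal scan."""
    tokens = TOKEN.findall(query.lower())
    if not tokens:
        return []
    # Stage 1: keep only documents that contain every token of the query.
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    # Stage 2: grep-like check of the raw text for the exact query string.
    needle = query.lower()
    return sorted(name for name in candidates if needle in docs[name].lower())


# Hypothetical sample data; in practice the text would be read from files.
docs = {
    "Foo.java": "import foo.bar.baz;\n",
    "api.sql": "BEGIN\n  My_Nice_API.Some_Method;\nEND;\n",
    "other.txt": "baz bar foo\n",
}
index = build_index(docs)

print(search("foo", docs, index))
print(search("foo.bar.baz", docs, index))
print(search("My_Nice_API.Some_Method", docs, index))
```

With this sample data, "foo" matches both Foo.java and other.txt, while "foo.bar.baz" matches only Foo.java, because the second stage requires the literal string. One limitation of the sketch: a query that is only part of a token (say "fo") finds nothing, since stage 1 looks up whole tokens.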