Here's another thought: if you desperately need complex searches, you could
do some heuristic filtering to narrow down the search: use an analyzer that
splits the input into terms (removing excess whitespace, or even producing
n-grams from the input), then do the same for the query.
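A minimal sketch of such an analyzer with Lucene 4.x (the gram sizes and the
choice of WhitespaceTokenizer are just assumptions, and exact constructors vary
a bit between Lucene versions):

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.ngram.NGramTokenFilter;
import org.apache.lucene.util.Version;

// Splits on whitespace, lowercases, then emits 3..5-character grams.
// Use the same analyzer at index time and query time so the terms line up.
public class NGramFilterAnalyzer extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        WhitespaceTokenizer source = new WhitespaceTokenizer(Version.LUCENE_41, reader);
        TokenStream result = new LowerCaseFilter(Version.LUCENE_41, source);
        result = new NGramTokenFilter(result, 3, 5);
        return new TokenStreamComponents(source, result);
    }
}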
So you should probably ask your question on the Elasticsearch mailing list.
I think that some ES users already scale to x billion docs.
Even though ES is Lucene based, it adds features to scale out (sharding,
routing...).
HTH
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 5
Thanks for the input! Seems I should give this another chance using
the hints you all sent me. I'll report back my findings here.
/Mathias
On Mon, Feb 4, 2013 at 7:01 PM, Mathias Dahl wrote:
> Hi,
>
> I have hacked together a small web front end to the Glimpse text
> indexing engine (see http:/
The records are mostly logging events; they will have:
1. a timestamp
2. the type of the event
3. potentially a set of key/value properties
Then I would want to be able to slice and dice the records based on time
(required), type and/or the key/values.
In addition, I would want to have sta
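For records like that, one Lucene document per event with a numeric timestamp
field is the usual shape. A rough sketch with Lucene 4.x (all field names here
are made up):

import java.util.Map;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;

public class EventIndexer {
    // One Lucene document per logging event.
    public static void indexEvent(IndexWriter writer, long timeMillis, String type,
                                  Map<String, String> properties) throws Exception {
        Document doc = new Document();
        doc.add(new LongField("timestamp", timeMillis, Field.Store.YES)); // numeric, range-queryable
        doc.add(new StringField("type", type, Field.Store.YES));          // not analyzed, exact match
        for (Map.Entry<String, String> p : properties.entrySet()) {
            // One keyword field per key/value property.
            doc.add(new StringField("prop_" + p.getKey(), p.getValue(), Field.Store.YES));
        }
        writer.addDocument(doc);
    }
}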
Part of the answer depends on what kind of records you have. For instance,
are you dealing with a lot of numeric data?
If you need all those functions and only want to support exact matches and
basic boolean comparisons, then I'd go with an RDBMS instead of Lucene.
You'll get better support for the
Hey Guys,
I'm trying to figure out what would be a better approach to indexing when it
comes to a large number of records (say 1 billion).
As far as queries:
1. Only support exact matches (a field is equal to some constant value) or
range matches (a field is larger/smaller than some constant va
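For exact and range matches like those, the corresponding Lucene 4.x query
could look roughly like this (field names are made up):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class ExactAndRangeQuery {
    // Exact match on "type" AND a numeric range on "timestamp".
    public static Query build(String type, long fromMillis, long toMillis) {
        BooleanQuery query = new BooleanQuery();
        query.add(new TermQuery(new Term("type", type)), BooleanClause.Occur.MUST);
        query.add(NumericRangeQuery.newLongRange("timestamp", fromMillis, toMillis, true, true),
                  BooleanClause.Occur.MUST);
        return query;
    }
}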
I am looking at the formats supported by the newer version of Tika (1.3) and was
not sure which version(s) of Microsoft Office it supports
(97/2000/2010/2013) for each of the below?
http://tika.apache.org/1.3/formats.html#Microsoft_Office_document_formats
Microsoft Word (also, does it support bot
Hi,
For the basics of Lucene, such as how to create a Lucene index and other
fundamentals, look into the Lucene in Action book.
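As a small taste of what the book covers, a bare-bones index creation with
Lucene 4.x might look like this (the index path and field name are arbitrary):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class CreateIndex {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("lucene-index"));  // where the index lives
        IndexWriterConfig config =
                new IndexWriterConfig(Version.LUCENE_41, new StandardAnalyzer(Version.LUCENE_41));
        IndexWriter writer = new IndexWriter(dir, config);

        Document doc = new Document();
        doc.add(new TextField("contents", "hello lucene", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();
    }
}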
On Tue, Feb 5, 2013 at 6:28 PM, Álvaro Vargas Quezada wrote:
> Hello,
> I want to implement a central index, and I heard about Lucene, so I would
> like to ask your help to install it an
You're probably better off using Solr, which is tightly linked with Lucene.
http://lucene.apache.org/solr/
I'm sure there are installation and getting started guides there.
--
Ian.
On Tue, Feb 5, 2013 at 12:58 PM, Álvaro Vargas Quezada wrote:
> Hello,
> I want to implement a central index, an
Hello,
I want to implement a central index, and I heard about Lucene, so I would like
to ask for your help installing and configuring it. My OS is Windows 7/XP/Server
2008. If I could index just one database and run a search, I would be happy.
I would be grateful if you can send me any info about th
Hi!
I wonder where one can get information about current Lucene (v 4.1) core search
classes - AtomicReader, CompositeReader, ReaderContexts - and how to use them
properly for building custom search algorithms.
Although "Lucene in Action" is really good, I can't find something on these
classes t
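In the meantime, the pattern most custom search code follows is to open a
CompositeReader (e.g. a DirectoryReader) and walk its leaves, each of which is
an AtomicReaderContext wrapping one segment. A minimal sketch with Lucene 4.1
(the index path is just an example):

import java.io.File;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class WalkLeaves {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("lucene-index"));
        DirectoryReader reader = DirectoryReader.open(dir);   // a CompositeReader
        for (AtomicReaderContext ctx : reader.leaves()) {     // one context per segment
            AtomicReader leaf = ctx.reader();
            System.out.println("segment: " + leaf.maxDoc() + " docs, docBase=" + ctx.docBase);
        }
        reader.close();
    }
}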
Glimpse seems to use something similar to StandardAnalyzer. So I would give
it a try. For program code this should work quite well. To make the
"auto-phrases" work (which might be a good idea here, too), enable this feature
in the query parser (I am referring to the comment by Jack about auto-
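For reference, turning that feature on with the classic query parser is a
one-liner (Lucene 4.x; the field name and analyzer are just placeholders):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class AutoPhraseExample {
    public static void main(String[] args) throws Exception {
        QueryParser parser = new QueryParser(Version.LUCENE_41, "contents",
                new StandardAnalyzer(Version.LUCENE_41));
        // If the analyzer splits a single query token into several terms,
        // generate a PhraseQuery instead of OR-ing the terms together.
        parser.setAutoGeneratePhraseQueries(true);
        Query q = parser.parse("some query text");
        System.out.println(q);
    }
}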
Jack,
What you say sounds hopeful, but it also sounds like quite some work
to define/select the correct analyzer for each type of programming
language (we use SQL, PL/SQL, Java and C# mainly). Compared to what I
do now, which is just to throw all files at Glimpse and it makes them
searchable in a
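If you do end up going down that road, Lucene's PerFieldAnalyzerWrapper at
least keeps the per-language wiring in one place. A sketch (the field names
and analyzer choices are only placeholders, not recommendations):

import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public class PerLanguageAnalyzers {
    public static Analyzer build() {
        Map<String, Analyzer> perField = new HashMap<String, Analyzer>();
        perField.put("java_source", new StandardAnalyzer(Version.LUCENE_41));
        perField.put("sql_source", new WhitespaceAnalyzer(Version.LUCENE_41));
        // Fields not listed above fall back to the default analyzer.
        return new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_41), perField);
    }
}

Pass the returned analyzer both to IndexWriterConfig and to the query parser so
indexing and searching stay consistent.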