OK, thanks Yuval. I'll take a look.
Could you (or anyone) please elaborate on why payloads "seem like a worse fit"?
TX, Naama
On Wed, Jun 23, 2010 at 11:00 PM, Yuval Feinstein wrote:
> Naama, Maybe you could use the new flexible indexing mechanism.
> Some information is in this lecture:
>
> http:/
Coincidentally, just after I replied to this thread I received an email from
one of our customers. In that email was a quote from one of the commercial
search vendors. My jaw didn't drop because I've seen similar numbers from
other commercial search vendors before, but I won't mention the
Are you sure that the term enum returns the terms in the correct order? For all
types of RangeQueries, the term enumeration has to be correctly sorted as
specified in the docs; if it is not, the enumeration may be
incomplete. It’s a good thing to turn on assertions for the lucene package, a
I won't comment on Attivio, as I think I might have signed some NDA with them.
But they do claim to combine full-text search with DB-like joins. Can't
MarkLogic do that, too?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.co
How do you add documents to the index? Is it synchronized (such that
basically only one thread can add documents at a time)?
The same goes for removing documents as well.
Also, did you encounter any exceptions during the run? If, say, an addDoc
fails on one of the slices, then you need to revert th
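
One possible reading of the two points above, as a rough sketch: add to all
slices under a single lock, and undo the partial add if any slice fails. Slice,
ListSlice and SyncedSlices are hypothetical stand-ins written for illustration,
not Lucene classes, and the in-memory lists only simulate index slices:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one index slice; not a Lucene class. */
interface Slice {
    void add(String doc) throws Exception;
    void removeLast();
    int size();
}

/** In-memory slice used only for illustration. */
class ListSlice implements Slice {
    private final List<String> docs = new ArrayList<>();
    private final String rejectPrefix; // docs starting with this prefix fail (simulated error)

    ListSlice(String rejectPrefix) {
        this.rejectPrefix = rejectPrefix;
    }

    public void add(String doc) throws Exception {
        if (rejectPrefix != null && doc.startsWith(rejectPrefix)) {
            throw new Exception("simulated failure for " + doc);
        }
        docs.add(doc);
    }

    public void removeLast() {
        docs.remove(docs.size() - 1);
    }

    public int size() {
        return docs.size();
    }
}

/** Keeps several slices at the same document count. */
class SyncedSlices {
    private final Slice[] slices;

    SyncedSlices(Slice... slices) {
        this.slices = slices;
    }

    /** Adds the doc to every slice or to none; only one thread at a time. */
    synchronized boolean addToAll(String doc) {
        int done = 0;
        try {
            for (Slice s : slices) {
                s.add(doc);
                done++;
            }
            return true;
        } catch (Exception e) {
            // Revert the slices that already accepted the doc.
            for (int i = 0; i < done; i++) {
                slices[i].removeLast();
            }
            return false;
        }
    }

    /** True when every slice holds the same number of docs. */
    synchronized boolean inSync() {
        for (Slice s : slices) {
            if (s.size() != slices[0].size()) {
                return false;
            }
        }
        return true;
    }
}
```

With real index writers the revert step would look different (for example a
delete or a rollback), so treat this only as an outline of the control flow.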
Hi Uwe,
Thank you for your help, it is greatly appreciated. Unfortunately, my
tests all fail except for RangeInclusive. I've changed the step to be 6
as per your recommendation. I had it at max to eliminate step precision
as the cause of the test failure. Essentially, all keys in Cassandra
a
Otis's comments reminded me of one of the astonishing things
I've seen in the Lucene/SOLR ecosystem; I've seen issues
reported, commented on, fixed, and patches made available
*for free* in a matter of hours.
Of course, you have to be willing to use a patched version, but
it sure beats waiting six
Hi all,
We've been waiting for LUCENE-1879 and LUCENE-2425 and have written our own
ParallelWriter class in the meantime. Apparently our indexes are falling out
of sync (I suspect my colleague is seeing error messages come from
ParallelReader stating that the number of documents must be the sam
Yes, in my case the competition is one on the list...
On Wed, Jun 23, 2010 at 11:41 PM, Otis Gospodnetic
wrote:
> Off the top of my head:
>
> FAST
> Endeca
> Coveo
> Attivio
> Vivisimo
> Google Search Appliance
> (tell me when to stop)
> Dieselpoint
> IBM OmniFind
> Exalead
> Autonomy
> dtSearch
Otis, I'm 99% sure Attivio is just a wrapper around Lucene...
And I personally wouldn't count full text search solutions such as Oracle's.
Itamar.
> -Original Message-
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Thursday, June 24, 2010 12:42 AM
> To: java-user@
Off the top of my head:
FAST
Endeca
Coveo
Attivio
Vivisimo
Google Search Appliance
(tell me when to stop)
Dieselpoint
IBM OmniFind
Exalead
Autonomy
dtSearch
ISYS
Oracle
...
...
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com
Just curious. What commercial alternatives are out there?
On Wed, Jun 23, 2010 at 04:01, jm wrote:
> Hi,
>
> I am trying to compile some arguments in favour of Lucene as
> management is deciding whether to standardize on Lucene or a competing
> commercial product (we have a couple of products, one
Thanks guys, those links are cool. I welcome any other positive points
anyone can add, especially references to products/sites moving to
Lucene/Solr.
javier
On Wed, Jun 23, 2010 at 10:49 PM, Otis Gospodnetic
wrote:
> Lucene/Solr choice typically means:
>
> * lower cost of ownership (think about var
Lucene/Solr choice typically means:
* lower cost of ownership (think about various crazy licensing models some of
the commercial search vendors have: per doc, per server, per query, per
year)
* faster implementation (just think about the duration of the sales/negotiation
phase for commerci
Naama, Maybe you could use the new flexible indexing mechanism.
Some information is in this lecture:
http://lucene-eurocon.org/slides/Lucene-Forecast-Version-Unicode-Flex-and-Mod_Willnauer&Schindler.pdf
Alternatively, you may use payloads, but they seem like a worse fit.
Good Luck,
Yuval
Hi Sudha,
There is such a tokenizer, named NewStandardTokenizer, in the most recent patch
on the following JIRA issue:
https://issues.apache.org/jira/browse/LUCENE-2167
It keeps (HTTP(S), FTP, and FILE) URLs together as single tokens, and e-mails
too, in accordance with the relevant IETF R
Hi,
I am new to lucene and I am using Lucene 3.0.2.
I am using Lucene to parse text which may contain URLs. I noticed the
StandardTokenizer keeps the email addresses in one token, but not the URLs.
I also looked at Solr wiki pages, and even though the wiki page for
solr.StandardTokenizerFactory s
One thing to consider is that you have access to the source,
so worst-case you won't be cut off at the knees by the commercial
vendor.
Case in point: Fast was acquired by Microsoft, who have since
dropped all future Unix development. Hope all Fast users
really like running their apps on Windows se
On the chance that this is an XY problem
(http://people.apache.org/~hossman/#xyproblem),
why can't you use StopFilter and PorterStemFilter in
your filter chain rather than try to do this yourself?
Best
Erick
On Tue, Jun 22, 2010 at 10:49 PM, Vinicius Carvalho <
viniciusccarva...@gmail.com> wrote:
Hi,
Is there a way for an application to index a document along with its "term
weighted vector" (Lucene's TermFreqVector)? I.e., to override the term
frequencies computed by Lucene with an application's computed term weights
(non-frequency-based)?
I don't think I want to use Scorer#score() for appl
Hi,
I am trying to compile some arguments in favour of Lucene as
management is deciding whether to standardize on Lucene or a competing
commercial product (we have a couple of products, one using Lucene,
another using a commercial product; imagine which one I am using). I searched
the lists but could not f
21 matches