I am applying the PorterStemFilter at both indexing and search time.
As for schema, I have 3 fields: title, subtitle and notes. When the user
enters a query string of */a*itis/*, my software turns this into an actual
Lucene query of */title: a*itis OR subtitle: a*itis OR notes: a*itis/* and I
get
You can easily use just the CommonGrams stuff from Solr in your pure
Lucene project.
There are a couple of useful docs on stop words and common grams et al at
http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1
http://www.hathitrust.org/blogs/large-scale-search
Hi Steve,
On 28/11/2011 19:43, Steven A Rowe wrote:
I assume that when you refer to "the impact of stop words," you're concerned
about query-time performance? You should consider the possibility that performance
without removing stop words is good enough that you won't have to take any steps
Hi Dawn,
I assume that when you refer to "the impact of stop words," you're concerned
about query-time performance? You should consider the possibility that
performance without removing stop words is good enough that you won't have to
take any steps to address the issue.
That said, there are
Hi Meghana,
You can only do that by directly instantiating the FuzzyQuery, not via
parsed queries.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: meghana [mailto:meghana.rav...@amultek.com]
> Sent
Hi folks,
I'm researching the best options to use for analysing/storing newspaper
pages in our online archive, and wondered if anyone has any good hints
or tips on good practice for this type of media?
I'm currently thinking along the lines of using a customised
StandardAnalyser (no stop wor
Awesome. Thanks guys!
On Mon, Nov 28, 2011 at 12:19 PM, Uwe Schindler wrote:
> You can store the index in the WEB-INF directory; just use something like:
> ServletContext.getRealPath("/WEB-INF/data/myIndexName");
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
>
Hi Uwe,
I need to do something similar. Can you please tell me how I can pass an
integer in my fuzzy search query?
Say, for example, I am searching with q=major~0.6 and I want to match only
terms that start with the prefix "maj". How can I pass an integer to do
that?
Thanks.
Uwe Schindler wrote
>
> Hi,
>
> You can pass
Hi Stephen,
We are doing something similar, and we store as a multifield with each
document as (d,z) pairs where we store the z's (scores) as payloads for
each d (topic). We have had to build a custom similarity which
implements the scorePayload function. So to find docs for a given d
(topic), we
List,
I am trying to incorporate the Latent Dirichlet Allocation (LDA) topic
model into Lucene. Briefly, the LDA model extracts topics
(distribution over words) from a set of documents, and then represents
each document with topic vectors. For example, documents could be
represented as:
d1 = (0,
You can store the index in the WEB-INF directory; just use something like:
ServletContext.getRealPath("/WEB-INF/data/myIndexName");
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -Original Message-
> From: Ian Lea [mailto:ian@gmail.co
Using a static string is fine - it just wasn't clear from your
original post what it was.
I usually use a full path read from a properties file so that I can
change it without a recompile, have different settings on
test/live/whatever systems, etc. Works for me, but isn't the only way
to do it.
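
As a rough illustration of the properties-file approach described above, here is a minimal sketch. The class name IndexPathConfig and the key lucene.index.dir are made up for this example and are not part of Lucene; only java.util.Properties from the JDK is used.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

final class IndexPathConfig {
    // Hypothetical property key, chosen for this example only.
    static final String KEY = "lucene.index.dir";

    /** Reads the index directory from a properties source; fails loudly if the key is absent. */
    static Path indexDir(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(KEY);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + KEY);
        }
        return Paths.get(value.trim());
    }
}
```

In an application you would pass a reader over the real properties file and hand the resulting path to whatever opens the index. Note that a backslash is an escape character in .properties files, so a Windows path has to be written as c:/index or c:\\index.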
Hi,
Thanks for your response. Yes, LUCENE_INDEX_DIRECTORY is a static string
which contains the file system path of the index (for example, c:\\index).
Is this good practice? If not, what should the full path to an index
look like?
Thanks
On Mon, Nov 28, 2011 at 4:54 AM, Ian Lea wrote:
> W
Even though the NumericRangeQuery.new* methods do not support
BigInteger, the underlying recursive algorithm supports any sized
number.
Has this been explored?
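
To make the question concrete, here is a sketch of trie-style range splitting done on BigInteger. It is written in the spirit of the splitting that NumericRangeQuery relies on, but it is not Lucene code: the class and method names are hypothetical, it only handles non-negative values, and it says nothing about how such values would be encoded as index terms.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class BigRangeSplitter {
    /** Inclusive sub-range; min and max + 1 are both multiples of 2^shift. */
    static final class SubRange {
        final BigInteger min;
        final BigInteger max;
        final int shift;
        SubRange(BigInteger min, BigInteger max, int shift) {
            this.min = min;
            this.max = max;
            this.shift = shift;
        }
    }

    /** Splits [min, max] into sub-ranges aligned to ever coarser precision levels. */
    static List<SubRange> split(BigInteger min, BigInteger max, int valueBits, int precisionStep) {
        if (min.signum() < 0 || max.compareTo(min) < 0
                || max.bitLength() > valueBits || precisionStep < 1) {
            throw new IllegalArgumentException("invalid range or parameters");
        }
        List<SubRange> out = new ArrayList<>();
        for (int shift = 0; ; shift += precisionStep) {
            BigInteger diff = BigInteger.ONE.shiftLeft(shift + precisionStep);
            BigInteger mask = BigInteger.ONE.shiftLeft(precisionStep)
                    .subtract(BigInteger.ONE).shiftLeft(shift);
            // Does either edge stick out of a fully covered block at this level?
            boolean hasLower = min.and(mask).signum() != 0;
            boolean hasUpper = !max.and(mask).equals(mask);
            BigInteger nextMin = (hasLower ? min.add(diff) : min).andNot(mask);
            BigInteger nextMax = (hasUpper ? max.subtract(diff) : max).andNot(mask);
            if (shift + precisionStep >= valueBits || nextMin.compareTo(nextMax) > 0) {
                // Nothing left to cover at a coarser level: emit the remainder and stop.
                out.add(range(min, max, shift));
                break;
            }
            if (hasLower) out.add(range(min, min.or(mask), shift));
            if (hasUpper) out.add(range(max.andNot(mask), max, shift));
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    /** Fills in the bits below the current level so that max is an inclusive bound. */
    private static SubRange range(BigInteger min, BigInteger max, int shift) {
        BigInteger lowBits = BigInteger.ONE.shiftLeft(shift).subtract(BigInteger.ONE);
        return new SubRange(min, max.or(lowBits), shift);
    }
}
```

For example, split(3, 20, 8, 2) yields the pieces [3,3] and [20,20] at shift 0 and [4,19] at shift 2. Since BigInteger cannot overflow, the wrap-around checks a fixed-width version needs are left out here.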
> > Could you minimize this to a small stand-alone program that does not work
> > as expected?
>
> This will be hard, because the bug only appears after a couple of days
> or more, and I'm starting to think that it is triggered by high data
> volumes. I'll try to minimize the code and serve mor
> Could you minimize this to a small stand-alone program that does not work
> as expected?
This will be hard, because the bug only appears after a couple of days
or more, and I'm starting to think that it is triggered by high data
volumes. I'll try to minimize the code and serve more data to i
The sequence of operations seems logical; I don't immediately see why this
does not work.
Could you minimize this to a small stand-alone program that does not work
as expected? That would allow us to recreate the problem here and debug it.
It is interesting that facet 3.5 is used with core 3.4 and queries 3.4
All packages used: core3.4, queries3.4, facet3.5.
Once every 3 minutes I *refreshTax* and once per day I *reopenEveryting*.
*InitWriters()*
writer = new ThreadedIndexWriter
taxWriter = new LuceneTaxonomyWriter
// because the reader can't start if it doesn't have a valid taxIndex directory
taxWriter.c
What is LUCENE_INDEX_DIRECTORY? Some static string in your app?
Lucene knows nothing about your app, JSP, or what app server you are
using. It requires a file system path and it is up to you to provide
that. I always use a full path since I prefer to store indexes
outside the app and it avoids
As far as I'm aware, recent versions of Lucene, including the
highlighter, should work out of the box.
I'd guess that highlighting would be the most resource-intensive and
therefore troublesome bit.
I'm not aware of any sample code showing Lucene working on Android,
but from my very limited experi
Just use one of the search() methods that does sorting and specify an
array of sort fields with SortField.SCORE first, then your name
fields. But be aware that complex real world textual queries and docs
rarely produce identical scores.
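
To show what that ordering means in practice, here is a plain-Java sketch that applies it to results already in memory. It does not use the Lucene API; the Hit class and its fields are invented for the example. Name order only comes into play when two scores are exactly equal, which is why the caveat above matters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ScoreThenName {
    /** Hypothetical result holder; a real app would fill it from its search hits. */
    static final class Hit {
        final float score;
        final String lastName;
        final String firstName;
        Hit(float score, String lastName, String firstName) {
            this.score = score;
            this.lastName = lastName;
            this.firstName = firstName;
        }
    }

    // Highest score first; ties broken by last name, then first name.
    static final Comparator<Hit> ORDER =
            Comparator.comparingDouble((Hit h) -> h.score).reversed()
                    .thenComparing((Hit h) -> h.lastName)
                    .thenComparing((Hit h) -> h.firstName);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Hit> sorted(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(ORDER);
        return copy;
    }
}
```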
You could post-process the results and group them into "good
Lucene won't be aware that you've got duplicate documents, but scoring
does take account of the number of documents in which search terms
appear. See http://lucene.apache.org/java/3_5_0/scoring.html and the
javadocs for oal.search.Similarity.
Only you can say whether or not you need to worry abou
Hi Guys,
I am using Lucene with Neo4j. Currently I have queries working well with a
combination of Exact and Fuzzy matches in one query.
However, we want a report that first takes the ranking and boosting as the
highest priority, but then we want to sort by first name and last name, and
alwa