bhecht <[EMAIL PROTECTED]> wrote on 07/05/2007 10:26:27:
> I have implemented my own analyzer for each country.
> So as I see it, when I index these records, I want to
> provide Lucene with a specific analyzer per record
> I'm indexing.
>
> When a user performs a query in my JSF form, I will
> us
The only commercial options that I have seen do not have a web presence
(that I know of or can find), and I don't recall the company names (I was
only peripherally involved).
Here is a web page where a guy does a nice writeup on a few options:
http://dsanalytics.com/dsblog/the-start-of-the-art-in-key
http://snowball.tartarus.org/
That is the Snowball page. There exists a Snowball version of the
Porter2 Stemming algorithm. If you hunt around the download page you
will find it.
- Mark
sandeep chawla wrote:
Hi All.
Is there an implementation of the Porter2 stemming algorithm in Java?
I dont w
> > > Is there a way to require a portion of a query only if there are values for
> > > that field in the document?
> > > e.g. If I know that I only want to match movies made between 1973 and 1975,
> > > I would like to be able to say in my query that if the document has a year,
> > > it mu
Sure thing. I actually haven't taken a sufficiently close look at
NearSpansOrdered (I was concentrating more on NearSpansUnordered, which has
got next to no documentation).
- Moti
On 5/7/07, Paul Elschot <[EMAIL PROTECTED]> wrote:
Moti,
I have not yet looked into all the details of your comme
Moti,
I have not yet looked into all the details of your comments,
but I remember I had some trouble in trying to define the precise
semantics of NearSpansOrdered. I'll have another look at
being more precise for the overlaps.
NearSpansUnordered is a specialisation of the previous NearSpans
for t
Does anyone know of a good language detection library that can detect what
language a document (text) is in?
Language detection is easy. It's just a simple
text classification problem.
One way you can do this is using Lucene
itself. Create a so-called pseudo-document
for each language consisting
Sorry,
I didn't realize I needed to use the PerFieldAnalyzerWrapper for this task,
and tried to index the document twice.
Sorry for the previous post.
Thanks for the great help.
But since you asked, I will be happy to explain what my goal is, and
maybe see if I'm approaching this correctly
7 maj 2007 kl. 15.45 skrev bhecht:
OK, thanks, I think I got it.
Just to see if I understood correctly:
When I do the search on both stemmed and unstemmed fields, I will do the
following:
1) If I know the country of the requested search - I will use the stemmed
analyzer, and then the
OK, thanks, I think I got it.
Just to see if I understood correctly:
When I do the search on both stemmed and unstemmed fields, I will do the
following:
1) If I know the country of the requested search - I will use the stemmed
analyzer, and then the unstemmed field
7 maj 2007 kl. 13.27 skrev bhecht:
The last option seems to be the right one for me, using a stemmed and
unstemmed field.
I assume that by "unstemmed" you mean indexing the field using the
UN_TOKENIZED parameter.
No, I mean TOKENIZED, but not using a stemmer analyzer.
--
karl
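
The stemmed-plus-unstemmed approach discussed above can be pictured with a small sketch. This is only an illustration under stated assumptions: it does not use Lucene at all, the field names `name` and `name_stemmed` are hypothetical, and `toyStem` is a deliberately naive stand-in for a real stemmer such as a Snowball one. Both fields are tokenized; only one of them is stemmed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class StemmedUnstemmedSketch {

    // Tokenize on runs of non-letter characters and lowercase; no stemming.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Toy stemmer (assumption): strips one trailing "s" from tokens longer
    // than three characters. A real stemmer is far more involved.
    static String toyStem(String token) {
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Produce both variants of the same text, keyed by hypothetical field name.
    static Map<String, List<String>> analyze(String text) {
        List<String> unstemmed = tokenize(text);
        List<String> stemmed = new ArrayList<>();
        for (String t : unstemmed) {
            stemmed.add(toyStem(t));
        }
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("name", unstemmed);        // tokenized, not stemmed
        fields.put("name_stemmed", stemmed);  // tokenized and stemmed
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Acme Bakeries Ltd"));
    }
}
```

A query can then be run against either field: the unstemmed one when exact word forms matter, the stemmed one when the language of the record is known and looser matching is wanted.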
Hi All:
Can I make Nutch crawl and create separate indices based on scope, where
scope is determined from the query string?
For example:
Let's assume that I have a URL like:
http://localhost/admin/orchindex/crawl.asp?lCrpID=0&lPrjID=609&lStrtID=3605&l
then,
lCrpId=0 is one scope
lCorpi
OK, thanks for the reply.
The last option seems to be the right one for me, using a stemmed and
unstemmed field.
I assume that by "unstemmed" you mean indexing the field using the
UN_TOKENIZED parameter.
My problem starts when trying to implement this with "Hibernate
Search", which al
7 maj 2007 kl. 12.16 skrev bhecht:
My question regarding "the way to go" was whether it is a good solution
to index the contents of a table using more than one analyzer, choosing the
analyzer by the country value of each record.
I'm not sure what you mean, but I'll try.
Do you ask if it makes
I know indexing and searching need to use the same analyzer.
My question regarding "the way to go" was whether it is a good solution
to index the contents of a table using more than one analyzer, choosing the
analyzer by the country value of each record.
Couldn't find a post that describes exactly my
7 maj 2007 kl. 10.02 skrev bhecht:
This means I index and search using the same analyzer.
I was interested to know if this is the way to go?
That would be the way to go (unless you are really sure what you're
doing).
--
karl
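
To make the "same analyzer for indexing and searching" point concrete, here is a minimal sketch, not Lucene code. It assumes a hypothetical registry that maps a country code to a text normalizer (a stand-in for a per-country analyzer); the class name, the `DE` normalizer, and the fallback are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PerCountryAnalysisSketch {

    // Hypothetical registry: country code -> normalizer (stand-in for an analyzer).
    private final Map<String, UnaryOperator<String>> byCountry = new HashMap<>();

    // Fallback used when no country-specific normalizer is registered.
    private final UnaryOperator<String> fallback = s -> s.toLowerCase(Locale.ROOT);

    void register(String country, UnaryOperator<String> normalizer) {
        byCountry.put(country, normalizer);
    }

    // The same lookup serves both indexing and searching.
    String analyze(String country, String text) {
        return byCountry.getOrDefault(country, fallback).apply(text);
    }

    public static void main(String[] args) {
        PerCountryAnalysisSketch analysis = new PerCountryAnalysisSketch();
        // Toy German normalizer (assumption): lowercase and fold sharp s to "ss".
        analysis.register("DE", s -> s.toLowerCase(Locale.ROOT).replace("\u00df", "ss"));

        String indexed = analysis.analyze("DE", "Stra\u00dfe");
        // Same analysis at query time: the terms agree.
        System.out.println(indexed.equals(analysis.analyze("DE", "STRASSE")));
        // Different analysis at query time: the terms no longer agree.
        System.out.println(indexed.equals(analysis.analyze("XX", "Stra\u00dfe")));
    }
}
```

The first comparison matches and the second does not, which is the practical reason for routing both the record and the query through the analysis chosen for the same country.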
--
Hi All.
Is there an implementation of the Porter2 stemming algorithm in Java?
I don't want to build a SnowballFilter based on the Snowball English stemmer.
Thanks
Sandeep
--
SANDEEP CHAWLA
House No- 23
10th main
BTM 1st
Hello all,
I need to index a table containing company details (name, address, city ...
country).
Each record contains data written in the language appropriate to the record's
country.
I was thinking of indexing each record using an analyzer chosen according to the
record's country value.
Then when searchi
Paul,
The comment should be moved up into SpanNearQuery itself (as opposed to the
comments in the package private implementation classes). Still though, that
comment is inaccurate (regarding overlap - only "exact" overlap is handled).
Here are some additional tests for SpanNearQuery. They all fai