Re: Does Lucene Supports Billions of data

2008-05-02 Thread mark harwood
>> If your terms are roughly equally distributed in all N indices (e.g. random 
>> doc->index/shard assignment), the relevance score will roughly match.


Agreed. I did some formal benchmarking of local IDF vs global IDF relevance 
ranking recently.
I measured the movement of the top ranked document in a single index's results 
(global IDF) vs the same document's position in results merged from 2 remote 
indexes with randomized doc->shard assignment (a local IDF scheme). This 
distance was measured for a large number of real-world queries.
Results were very promising - the distributed ranking scheme very rarely 
differed from that of the single large index.
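
To illustrate why random assignment keeps local and global IDF close, here is a minimal sketch. It is not the benchmark described above. It assumes the classic Lucene idf formula, 1 + ln(numDocs / (docFreq + 1)), and all corpus sizes and document frequencies are made-up numbers; the class name `ShardIdfSketch` is hypothetical.

```java
// Sketch only: compares a per-shard idf value with the idf over the whole corpus.
// The formula is assumed to match classic Lucene scoring; the numbers are invented.
final class ShardIdfSketch {

    /** idf = 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical corpus: 1,000,000 docs, the term occurs in 1,000 of them.
        double global = idf(1_000, 1_000_000);

        // Random doc->shard assignment over 2 shards: each shard sees about
        // half the documents and about half the occurrences of the term.
        double randomShard = idf(500, 500_000);

        // Business-rule assignment: the term is concentrated in shard A.
        double skewedShardA = idf(990, 500_000);
        double skewedShardB = idf(10, 500_000);

        System.out.printf("global idf         = %.4f%n", global);
        System.out.printf("random shard idf   = %.4f%n", randomShard);
        System.out.printf("skewed shard A idf = %.4f%n", skewedShardA);
        System.out.printf("skewed shard B idf = %.4f%n", skewedShardB);
    }
}
```

With random assignment the shard-local value stays within a small fraction of the global one, while the skewed shards disagree with each other substantially, which is why their scores stop being comparable.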

- Original Message 
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 2 May, 2008 1:35:04 AM
Subject: Re: Does Lucene Supports Billions of data

Right.  And the typical answer to that is:

- If your terms are roughly equally distributed in all N indices (e.g. random 
doc->index/shard assignment), the relevance score will roughly match.

- If you have business rules for doc->index/shard distribution, then your 
relevance scores will not be comparable.

Otis 

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message 
> From: Toke Eskildsen <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Friday, May 2, 2008 12:13:04 AM
> Subject: Re: Does Lucene Supports Billions of data
> 
> From: John Wang 
> [...]
> > sub index 1: 1 billion docs
> > sub index 2: 1 billion docs
> > sub index 3: 1 billion docs
> > 
> > federating search to these subindexes, you represent an index of 3 
> > billion docs, and all internal doc ids are of type int.
> 
> That falls under Daniel's "...unless you wrap your own framework around it".
> The problem with the solution you're describing is that it's not
> functionally equivalent to a single index of 3 billion docs.
> 
> If you just create 3 independent indexes and merge the top hits from all 3,
> the ranking of the documents will be messed up. You'll need to make sure
> that the scores from the different indexes can be compared. That's tricky
> when the score depends on the frequency of the terms in the whole corpus.
> 
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 






RE: lucene farsi problem

2008-05-02 Thread esra

Hi Steven,

sorry i made a mistake. unicodes are like this:

> د=U+62F
> ژ = U+632
> and the first letter of "ساب ووفر " is  س = U+633

you can also check them here:
http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html

Esra


Steven A Rowe wrote:
> 
> Hi Esra,
> 
> Going back to the original problem statement, I see something that looks
> illogical to me - please correct me if I'm wrong:
> 
> On Apr 30, 2008, at 3:21 AM, esra wrote:
>> i am using lucene's "IndexSearcher" to search the given xml by
>> keyword which contains farsi information.
>> while searching i use ranges like
>> 
>> آ-ث  |  ج-خ  |  د-ژ  |  س-ظ  |  ع-ق  |  ک-ل  |  م-ی
>> 
>> when i do search for  "د-ژ"  range the results are wrong , they
>> are the results of  " س-ظ "range.
>> 
>> for example when i do search for "د-ژ"  one of the results is
>> "ساب ووفر", this result is also shown on the " س-ظ " range's result
>> list which is the correct range.
>> 
>> As IndexSearcher uses the "compareTo" method and this method compares
>> Unicode code points, i found the code points of the characters.
>> 
>> د=U+62F
>> ژ = U+698
>> and the first letter of "ساب ووفر " is  س = U+633
> 
> It appears to me that *both* the "د-ژ" range [ U+062F - U+0698 ] and the
> "س-ظ" range [ U+0633 - U+0638 ] contain the first letter of "ساب ووفر",
> which is "س" = U+0633.  
> 
> You stated that U+0633 should be contained in the [ U+0633 - U+0638 ]
> range - I agree - but why do you think U+0633 should not be contained in
> the [ U+062F - U+0698 ] range?
> 
> In other words, it looks to me like your problem is not a problem at all.
> 
> Steve
> 
> 

-- 
View this message in context: 
http://www.nabble.com/lucene-farsi-problem-tp16977096p17019498.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





RE: lucene farsi problem

2008-05-02 Thread Steven A Rowe
Hi Esra,

I still think you're wrong :).

On 05/02/2008 at 9:31 AM, esra wrote:
> > ژ = U+632

According to the website you linked to, the above character, which has three 
dots over it, is named "zhe", and its Unicode code point is U+698.  (I had to 
increase the font size to see the three dots.)

I think you are confusing "ژ"/"zhe"/U+698 with the letter "ز"/"ze"/U+632, which 
has just one dot over it.

Unless you were mistaken in all of your emails when you included the character 
"ژ"/"zhe" instead of "ز"/"ze", then what I said in my previous email still 
stands: there is no problem here.
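
To make the comparison concrete, here is a small sketch of an inclusive range check using plain `String.compareTo`, which compares UTF-16 code unit values. It does not use the Lucene API; the class name `CodePointRangeSketch` and the helper `inRange` are hypothetical, and the assumption is only that the range query orders terms the same way `compareTo` does.

```java
// Sketch only: inclusive range membership by String.compareTo (code unit order).
final class CodePointRangeSketch {

    /** True if lower <= term <= upper under String.compareTo. */
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String term = "\u0633\u0627\u0628"; // first word of the example, starts with sin (U+0633)
        String dal = "\u062F";              // dal
        String ze  = "\u0632";              // ze, one dot
        String zhe = "\u0698";              // zhe, three dots

        System.out.println(inRange(term, dal, zhe)); // true:  U+062F <= U+0633 <= U+0698
        System.out.println(inRange(term, dal, ze));  // false: U+0633 >  U+0632
    }
}
```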

Steve


 



Re: hybrid query (lucene + db)

2008-05-02 Thread Stephane Nicoll
I had a look at this but didn't find anything that corresponds to my problem.

Apparently there is a bug in Hibernate Search. If I use the same load
test on the same index with the same data with a direct access to the
lucene API, I get much better performance (and no deadlock on
SegmentReader).

I will report the problem there.
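
For reference, the merge step from my original mail quoted below (walking the two primary-key lists to collect one page of matching IDs) can be sketched roughly as follows. This is a simplified illustration, not our production code: it assumes both lists arrive as arrays sorted in ascending order, and the names `SortedKeyIntersectionSketch` and `intersectPage` are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: intersect two ascending-sorted key arrays and return one page of matches.
final class SortedKeyIntersectionSketch {

    /**
     * Returns up to pageSize keys present in both arrays,
     * skipping the first `offset` matches. Both inputs must be sorted ascending.
     */
    static List<Long> intersectPage(long[] luceneKeys, long[] dbKeys, int offset, int pageSize) {
        List<Long> page = new ArrayList<>();
        int i = 0, j = 0, matched = 0;
        while (i < luceneKeys.length && j < dbKeys.length && page.size() < pageSize) {
            if (luceneKeys[i] < dbKeys[j]) {
                i++;                       // key only in the Lucene result
            } else if (luceneKeys[i] > dbKeys[j]) {
                j++;                       // key only in the database result
            } else {
                if (matched >= offset) {   // skip matches that belong to earlier pages
                    page.add(luceneKeys[i]);
                }
                matched++;
                i++;
                j++;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        long[] lucene = {1, 3, 5, 7, 9, 11};
        long[] db = {3, 4, 5, 9, 10, 11};
        System.out.println(intersectPage(lucene, db, 1, 2)); // prints [5, 9]
    }
}
```

The walk stops as soon as the page is full, so only the requested page of IDs is kept in memory.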

Thanks,
Stéphane






On Thu, May 1, 2008 at 11:15 AM, mark harwood <[EMAIL PROTECTED]> wrote:
> The issue here is a general one of trying to perform an efficient join 
> between an external resource (rdbms) and Lucene.
>  This experiment may be of interest:
> http://issues.apache.org/jira/browse/LUCENE-434
>
>  KeyMap.java embodies the core service which translates from lucene doc ids 
> to DB primary keys or vice versa.
>  There are a couple of implementations of KeyMap that are not optimal (they 
> pre-date Lucene's FieldCache) but it may give you food for thought.
>
>  Cheers
>  Mark
>
>
>
>
>  - Original Message 
>  From: Stephane Nicoll <[EMAIL PROTECTED]>
>  To: java-user@lucene.apache.org
>  Sent: Thursday, 1 May, 2008 9:00:33 AM
>  Subject: hybrid query (lucene + db)
>
>  Hi there,
>
>  We're using lucene with Hibernate Search and we're very happy so far
>  with the performance and the usability of lucene. We have, however, a
>  specific use case that prevents us from using only lucene: spatial
>  queries. I already sent a mail on this list a while back about the
>  problem and we started investigating multiple solutions.
>
>  When the user selects a geographic area and some keywords we do the
>  following:
>
>  * Perform a search on the lucene index for the keywords with a
>  projection that returns only the primaryKey of the element sorted by
>  primary key
>  * Perform a search on the database with other criteria and a
>  projection that returns only the primary key of the elements
>  * Iterate on both lists to find N matching IDs, optionally with paging
>  (i.e. from X to X + N where X is the first result of the page)
>  * Run a query on the database to return the actual objects (select a
>  from MyClass a where a.id IN (the list of matching IDs) ). We limit the
>  page to 1000 results
>
>  We have searched for a way to optimize the queries and to avoid consuming
>  too much memory, knowing that we must support paging.
>
>  With a single user a search by keywords takes 30msec to complete, a
>  search by box takes 45msec. With both (keywords + spatial area)  it
>  takes 300msec
>
>  With 10 concurrent users, a search by keywords takes 150msec/user  but
>  for both it takes 3 sec/user !!!
>
>  I had the profiler running on this scenario and I've found that *all*
>  threads are waiting on org.apache.lucene.index.SegmentReader. I then
>  configured Hibernate Search to use a separate index reader per thread.
>  The deadlocks disappeared but it's still very slow (2.8sec).
>
>  Some questions:
>
>  * Does anyone know where the deadlocks on SegmentReader are coming from?
>  * Is the sorting on the primary keys a bad idea regarding performance
>  and memory usage?
>  * Does anyone have an idea how to perform this kind of hybrid query in an
>  efficient way?
>
>  I am using lucene 2.3.1 and Hibernate Search 3.0.1. I already asked for
>  support on the Hibernate Search forum but did not get any answer so
>  far.
>
>  Thanks,
>  Stéphane
>
>  --
>  Large Systems Suck: This rule is 100% transitive. If you build one,
>  you suck" -- S.Yegge
>



-- 
Large Systems Suck: This rule is 100% transitive. If you build one,
you suck" -- S.Yegge




Re: Lucene Indexing structure

2008-05-02 Thread Chris Hostetter

: Hi Lucene-user and Lucene-dev,

Please do not cross post -- java-user is the suitable place for your 
question.

: Obviously there is something wrong with the above approach (as to get the
: correct document we need to get all the documents and then do the required
: distance calculation), but that's due to my lack of knowledge of Lucene and
: Lucene's index storage.
: 
: What I want to know is how to improve upon the existing architecture other than
: making the number of fields in Lucene equal to the total number of
: features * size of each feature.

I suspect one of the reasons you haven't gotten much of a response yet is 
that people may not understand your problem statement -- I know nothing of 
Image Processing and even after googling "Color Histogram" I don't really 
understand how the examples you gave represent Color Histograms, or what 
it would mean to search on it with your example input.

Perhaps you could describe in more detail what exactly some sample 
data looks like, why certain objects should match certain queries (and 
just as importantly, why other objects shouldn't match), and give examples 
of why one object is a "better" match than another object for each example 
query.

Don't worry about Lucene Document/Field/QueryParser specifics -- just 
explain the concepts you are dealing with.



-Hoss





RE: lucene farsi problem

2008-05-02 Thread esra

Hi Steven ,

yes the correct one is "ژ "/"ze"/U+632.

my problem is when i do search for the "د-ژ" range. The result is "ساب ووفر"
and this word's first letter is "س", its code point is U+633, and it
is not in the [ U+062F - U+0632 ] range.

am i wrong?

Esra



-- 
View this message in context: 
http://www.nabble.com/lucene-farsi-problem-tp16977096p17022861.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Re: Lucene Indexing structure

2008-05-02 Thread Glen Newton
Vaijanath,

I think I would do things in a different fashion:
Lucene's default distance metric is based on tf/idf and the cosine
model, i.e. the frequencies of items. I believe the values that you
are adding as Fields are the values in n-space for each of these
image-based attributes. I don't believe Lucene's default ranking will
work for this.

You need to alter Lucene so that it understands that the Fields you
are adding represent the n-space values and not tokens, and alter
Lucene so that it uses this n-space to determine distance.
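
As a rough illustration of the distance computation this would need, here is a sketch that re-ranks stored vectors by distance to a query vector, using the CH values from the original question. It is independent of Lucene; Euclidean distance is my assumption (the original mail does not name a metric), and the names `HistogramDistanceSketch`, `parse`, `euclidean` and `rank` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: rank whitespace-separated feature vectors by Euclidean distance to a query.
final class HistogramDistanceSketch {

    /** Parses a field value such as "3 1 4 5" into a float vector. */
    static float[] parse(String field) {
        String[] parts = field.trim().split("\\s+");
        float[] v = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            v[i] = Float.parseFloat(parts[i]);
        }
        return v;
    }

    /** Euclidean distance between two vectors of equal length. */
    static double euclidean(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the documents ordered from smallest to largest distance to the query. */
    static List<String> rank(String query, List<String> docs) {
        float[] q = parse(query);
        List<String> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingDouble((String d) -> euclidean(q, parse(d))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("1 3 5 4");
        docs.add(" 3 1 5 4");
        // The second document is closer to the query "3 1 4 5", so it is listed first.
        System.out.println(rank("3 1 4 5", docs));
    }
}
```

This still computes the distance for every candidate, so it only shows the metric, not a way to avoid scanning all documents.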

I am not a Lucene internals expert, but I think you need to write a
custom Similarity[1] class for use in your IndexSearcher[2] and
IndexWriter[3] and I think you might need a custom analyser that
understands that you are putting in actual numbers, not tokens, that
you use when building the index as well as querying it.

There are probably things I am missing and there may be a better way
to do this.

-Glen

[1]http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Similarity.html
[2]http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Searcher.html#setSimilarity(org.apache.lucene.search.Similarity)
[3]http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/index/IndexWriter.html#getSimilarity()

2008/4/26 Vaijanath N. Rao <[EMAIL PROTECTED]>:
> Hi Lucene-user and Lucene-dev,
>
>  I want to use lucene as a backend for Image search (Content Based
> Image Retrieval).
>
>  Indexing Mechanism:
>  a) Get the Image properties such as Texture Tamura (TT), Texture Edge
> Histogram (TE), Color Coherence Vector (CCV), Color Histogram (CH) and
> Color Correlogram (CC).
>  b) Convert each of these vectors into a String and index them into lucene as
> fields; thus each Image (a document in lucene terms) consists of 6 fields:
> Image name, TT field, TE field, CCV field, CH field and CC field.
>
>  Searching Mechanism:
>  a) For the search Image, convert the Image into the above 5 properties.
>  b) For every field and for every value within the field, construct the
> query. For example, let's say the user wants to search only by Color Histogram
> based similarity and the query Image has 3 1 4 5 as the CH value; the query
> will look like:
>    query = "CH:3 CH:1 CH:4 CH:5"
>  c) For the results returned, convert all the field values back into floats,
> do the distance computation and re-rank the documents with the lower
> distances at the top and the larger distances at the bottom.
>  For example:
>    For the above query assume that the output has two documents,
>    one having CH as "1 3 5 4" and the other having CH as " 3 1 5 4", so
> the distance computation will rank the second document higher than the
> first.
>
>  Obviously there is something wrong with the above approach (as to get the
> correct document we need to get all the documents and then do the required
> distance calculation), but that's due to my lack of knowledge of Lucene and
> Lucene's index storage.
>
>  What I want to know is how to improve upon the existing architecture other
> than making the number of fields in Lucene equal to the total number of
> features * size of each feature.
>
>  Any other pointers will be welcome. Is there any Range tree
> implementation within lucene which I can use for this operation?
>
>  --Thanks and Regards
>  Vaijanath N. Rao
>
>
>






RE: lucene farsi problem

2008-05-02 Thread Steven A Rowe
Hi Esra,

You are *still* incorrectly referring to the glyph with three dots over it:

On 05/02/2008 at 12:18 PM, esra wrote:
> yes the correct one is "ژ "/"ze"/U+632.

"ژ" is *not* "ze"/U+632 - it is "zhe"/U+698.

Have you increased the font size?  Can you see the difference between these 
two?:

"ژ"/"zhe"/U+698
"ز"/"ze"/U+632

> my problem is when i do search for  "د-ژ" range. The result
> is  "ساب ووفر" and this word's first letter is "س" and its unicode is
> "U+633"  and it is not in the [ U+062F - U+0632 ] range.

Like I keep saying, in the above description, you're using the glyph 
"ژ"/"zhe"/U+698, while at the same time incorrectly referring to it as 
"ze"/U+632.

I don't mean to continually bang on about this - if you're *sure* that when you 
search, you're using the character represented by the glyph with one dot (and 
not three), i.e. "ز"/"ze"/U+632, then the problem lies elsewhere.

Steve


 



Re: hybrid query (lucene + db)

2008-05-02 Thread Marcelo Ochoa
Hi Stéphane:
  If you are using Oracle Spatial I assume that you are using Oracle
too for storing text :)
  Have you taken a look at the Oracle-Lucene integration project sponsored
by LendingClub.com?
http://docs.google.com/Doc?id=ddgw7sjp_54fgj9kg
http://sourceforge.net/project/showfiles.php?group_id=56183&package_id=255524&release_id=589900
  It's a new domain index for Oracle using Lucene inside the Oracle JVM.
  By doing that we can use Lucene as Oracle Text, but with many other
features, and using inline pagination we can get better performance
than the latest 11g Text Compound Domain Index.
  If you are interested in this implementation simply drop me an email.
  Best regards, Marcelo.

On Fri, May 2, 2008 at 3:58 AM, Stephane Nicoll
<[EMAIL PROTECTED]> wrote:
> Well for the moment we don't. The lucene index only contains the full
>  text content (indexed, not stored). We use lucene to perform full text
>  and fuzzy searches on the keywords field. Once we have the result, we
>  match them with the geospatial box provided by the user (we use Oracle
>  Spatial for that). We have no notion of city, state or zip code. Data
>  overlaps more than one country most of the time actually.
>
>  We are thinking of reimplementing a quad tree in lucene to flag each
>  item with a spatial area. That way we will be able to pre-filter the
>  zone accordingly.
>
>  Still, this does not explain the deadlock on SegmentReader. If anyone
>  has an idea...
>
>  Thanks,
>  Stéphane
>
>
>
>  On Thu, May 1, 2008 at 8:50 PM, Michael Stoppelman <[EMAIL PROTECTED]> wrote:
>  > Stephane,
>  >
>  >  Could you describe how you set up the spatial area? Having a BooleanQuery
>  >  with 200 terms in it definitely slows things down (I'm not sure exactly
>  >  why yet -- it seems like it shouldn't be "that" slow). If you can describe
>  >  your spatial area in fewer terms you can get much better performance. It
>  >  just depends on how you're describing your spatial areas and the number of
>  >  results in each zipcode. If you had a field like "city,state" in your
>  >  index you would have far fewer terms in your query than if that query had
>  >  all the zipcodes in a "city,state" combo, thus making your query much
>  >  faster.
>  >
>  >  M
>  >

RE: lucene farsi problem

2008-05-02 Thread esra

Hi Steven ,

yes you are right, sorry i am a bit confused.

i checked again and the correct one is  "zhe"/U+698. 

It seems the word is in the range but my customer says it shouldn't be.

I think the problem occurs because "zhe" is a Persian letter outside the Arabic
alphabet. In the farsi alphabet this letter does not come after the "س" letter, but its
code point is greater than that of "س", and the searcher compares code points.
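
The difference between alphabet order and code point order can be shown with a small sketch that ranks letters by their position in the Persian alphabet instead of by code point. It does not use Lucene, and it would not change how the index orders terms by itself; the names `PersianOrderSketch`, `ALPHABET`, `rank`, `PERSIAN_ORDER` and `inRange` are made up for the illustration.

```java
import java.util.Comparator;

// Sketch only: compare strings by Persian alphabet position rather than code point.
final class PersianOrderSketch {

    // Alef with madda followed by the 32 letters of the Persian alphabet in
    // dictionary order. Note that zhe (U+0698) sits between ze (U+0632) and sin (U+0633).
    static final String ALPHABET =
          "\u0622\u0627\u0628\u067E\u062A\u062B\u062C\u0686\u062D\u062E"
        + "\u062F\u0630\u0631\u0632\u0698\u0633\u0634\u0635\u0636\u0637"
        + "\u0638\u0639\u063A\u0641\u0642\u06A9\u06AF\u0644\u0645\u0646"
        + "\u0648\u0647\u06CC";

    /** Position in the alphabet; unknown characters sort after all letters, by code unit. */
    static int rank(char c) {
        int idx = ALPHABET.indexOf(c);
        return idx >= 0 ? idx : ALPHABET.length() + c;
    }

    /** Compares character by character using rank(), then by length. */
    static final Comparator<String> PERSIAN_ORDER = (a, b) -> {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(rank(a.charAt(i)), rank(b.charAt(i)));
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length(), b.length());
    };

    /** True if lower <= term <= upper in Persian alphabet order. */
    static boolean inRange(String term, String lower, String upper) {
        return PERSIAN_ORDER.compare(term, lower) >= 0
            && PERSIAN_ORDER.compare(term, upper) <= 0;
    }

    public static void main(String[] args) {
        String term = "\u0633\u0627\u0628"; // starts with sin
        System.out.println(inRange(term, "\u062F", "\u0698")); // false: sin comes after zhe in the alphabet
        System.out.println(inRange(term, "\u0633", "\u0638")); // true: inside the sin..za range
    }
}
```

Under this ordering the word falls outside the dal..zhe range, which matches what the customer expects, whereas the code point comparison puts it inside.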

Esra


Steven A Rowe wrote:
> 
> Hi Esra,
> 
> You are *still* incorrectly referring to the glyph with three dots over
> it:
> 
> On 05/02/2008 at 12:18 PM, esra wrote:
>> yes the correct one is "ژ "/"ze"/U+632.
> 
> "ژ" is *not* "ze"/U+632 - it is "zhe"/U+698.
> 
> Have you increased the font size?  Can you see the difference between
> these two?:
> 
> "ژ"/"zhe"/U+698
> "ز"/"ze"/U+632
> 
>> my problem is when i do search for  "د-ژ" range. The result
>> is  "ساب ووفر" and this word's first letter is "س" and it's unicode is
>> "U+633"  and it is not in the in the [ U+062F - U+0632 ] range.
> 
> Like I keep saying, in the above description, you're using the glyph
> "ژ"/"zhe"/U+698, while calling at the same time incorrectly referring to
> it as "ze"/U+632.
> 
> I don't mean to continually bang on about this - if you're *sure* that
> when you search, you're using the character represented by the glyph with
> one dot (and not three), i.e. "ز"/"ze"/U+632, then the problem lies
> elsewhere.
> 
> Steve
> 
> On 05/02/2008 at 12:18 PM, esra wrote:
>> yes the correct one is "ژ "/"ze"/U+632.
>> 
>> my problem is when i do search for  "  د-ژ" range. The result
>> is  ""ساب ووفر
>> " and this word's first letter is "س " and it's unicode is
>> "U+633"  and  it
>> is not in the in the [ U+062F - U+0632 ] range.
>> 
>> am i wrong?
>> 
>> Esra
>> 
>> Steven A Rowe wrote:
>> > 
>> > Hi Esra,
>> > 
>> > I still think you're wrong :).
>> > 
>> > On 05/02/2008 at 9:31 AM, esra wrote:
>> > > > ژ = U+632
>> > 
>> > According to the website you linked to, the above character, which has
>> > three dots over it, is named "zhe", and its Unicode code point is
>> > U+698. (I had to increase the font size to see the three dots.)
>> > 
>> > I think you are confusing "ژ"/"zhe"/U+698 with the letter
>> > "ز"/"ze"/U+632, which has just one dot over it.
>> > 
>> > Unless you were mistaken in all of your emails when you included the
>> > character "ژ"/"zhe" instead of "ز"/"ze", then what I said in my
>> > previous email still stands: there is no problem here.
>> > 
>> > Steve
>> > 
>> > On 05/02/2008 at 9:31 AM, esra wrote:
>> > > 
>> > > Hi Steven,
>> > > 
>> > > sorry i made a mistake. unicodes are like this:
>> > > 
>> > > > د=U+62F
>> > > > ژ = U+632
>> > > > and the first letter of "ساب ووفر " is  س = U+633
>> > > 
>> > > you can also check them here
>> > > > http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html
>> > > 
>> > > Esra
>> > > 
>> > > 
>> > > Steven A Rowe wrote:
>> > > > 
>> > > > Hi Esra,
>> > > > 
>> > > > Going back to the original problem statement, I see something that
>> > > > looks illogical to me - please correct me if I'm wrong:
>> > > > 
>> > > > On Apr 30, 2008, at 3:21 AM, esra wrote:
>> > > > > i am using lucene's "IndexSearcher" to search the given xml by
>> > > > > keyword which contains farsi information.
>> > > > > while searching i use ranges like
>> > > > > 
>> > > > > آ-ث  |  ج-خ  |  د-ژ  |  س-ظ  |  ع-ق  |  ک-ل  |  م-ی
>> > > > > 
>> > > > > when i do search for  "د-ژ"  range the results are wrong , they
>> > > > > are the results of  " س-ظ "range.
>> > > > > 
>> > > > > for example when i do search for "د-ژ"  one of the results is
>> > > > > "ساب ووفر", this result is also shown on the " س-ظ " range's result
>> > > > > list which is the correct range.
>> > > > > 
>> > > > > As IndexSearcher uses the "compareTo" method and this method uses
>> > > > > unicodes for comparing, i found the unicodes of the characters.
>> > > > > 
>> > > > > د=U+62F
>> > > > > ژ = U+698
>> > > > > and the first letter of "ساب ووفر " is  س = U+633
>> > > > 
>> > > > It appears to me that *both* the "د-ژ" range [ U+062F - U+0698 ] and
>> > > > the "س-ظ" range [ U+0633 - U+0638 ] contain the first letter of "ساب
>> > > > ووفر", which is "س" = U+0633.
>> > > > 
>> > > > You stated that U+0633 should be contained in the [ U+0633 - U+0638 ]
>> > > > range - I agree - but why do you think U+0633 should not be contained
>> > > > in the [ U+062F - U+0698 ] range?
>> > > > 
>> > > > In other words, it looks to me like your problem is not a problem at
>> > > > all.
>> > > > 
>> > > > Steve
>> > > > 
>> > > > 
>> > > 

RE: lucene farsi problem

2008-05-02 Thread Steven A Rowe
Hi Esra,

I have created an issue for this - see 
.

I'll try to take a crack at a patch this weekend.

Steve

On 05/02/2008 at 12:55 PM, esra wrote:
> 
> Hi Steven ,
> 
> yes you are right, sorry i am a bit confused.
> 
> i checked again and the correct one is  "zhe"/U+698.
> 
> It seems the word is in the range but my customer says it
> shouldn't be.
> 
> I think the problem occurs because "zhe" is a Persian letter outside the
> Arabic alphabet. In the Farsi alphabet this letter does not come after the
> "س" letter, but its unicode is bigger than that of the "س" letter, and the
> searcher works with unicodes.
> 
> Esra
> 
> 
> Steven A Rowe wrote:
> > 
> > Hi Esra,
> > 
> > You are *still* incorrectly referring to the glyph with three dots over
> > it:
> > 
> > On 05/02/2008 at 12:18 PM, esra wrote:
> > > yes the correct one is "ژ "/"ze"/U+632.
> > 
> > "ژ" is *not* "ze"/U+632 - it is "zhe"/U+698.
> > 
> > Have you increased the font size?  Can you see the difference between
> > these two?:
> > 
> > "ژ"/"zhe"/U+698
> > "ز"/"ze"/U+632
> > 
> > > my problem is when i do search for  "د-ژ" range. The result is  "ساب
> > > ووفر" and this word's first letter is "س" and its unicode is "U+633"
> > > and it is not in the [ U+062F - U+0632 ] range.
> > 
> > Like I keep saying, in the above description, you're using the glyph
> > "ژ"/"zhe"/U+698, while at the same time incorrectly referring
> > to it as "ze"/U+632.
> > 
> > I don't mean to continually bang on about this - if you're *sure* that
> > when you search, you're using the character represented by the glyph
> > with one dot (and not three), i.e. "ز"/"ze"/U+632, then the problem
> > lies elsewhere.
> > 
> > Steve
> > 
> > On 05/02/2008 at 12:18 PM, esra wrote:
> > > yes the correct one is "ژ "/"ze"/U+632.
> > > 
> > > > my problem is when i do search for  "د-ژ" range. The result
> > > > is  "ساب ووفر" and this word's first letter is "س" and its unicode is
> > > > "U+633"  and it is not in the [ U+062F - U+0632 ] range.
> > > 
> > > am i wrong?
> > > 
> > > Esra
> > > 
> > > Steven A Rowe wrote:
> > > > 
> > > > Hi Esra,
> > > > 
> > > > I still think you're wrong :).
> > > > 
> > > > On 05/02/2008 at 9:31 AM, esra wrote:
> > > > > > ژ = U+632
> > > > 
> > > > According to the website you linked to, the above character, which
> > > > has three dots over it, is named "zhe", and its Unicode code point is
> > > > U+698. (I had to increase the font size to see the three dots.)
> > > > 
> > > > I think you are confusing "ژ"/"zhe"/U+698 with the letter
> > > > "ز"/"ze"/U+632, which has just one dot over it.
> > > > 
> > > > Unless you were mistaken in all of your emails when you included the
> > > > character "ژ"/"zhe" instead of "ز"/"ze", then what I said in my
> > > > previous email still stands: there is no problem here.
> > > > 
> > > > Steve
> > > > 
> > > > On 05/02/2008 at 9:31 AM, esra wrote:
> > > > > 
> > > > > Hi Steven,
> > > > > 
> > > > > sorry i made a mistake. unicodes are like this:
> > > > > 
> > > > > > د=U+62F
> > > > > > ژ = U+632
> > > > > > and the first letter of "ساب ووفر " is  س = U+633
> > > > > 
> > > > > you can also check them here
> > > > > > 
> http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html
> > > > > 
> > > > > Esra
> > > > > 
> > > > > 
> > > > > Steven A Rowe wrote:
> > > > > > 
> > > > > > Hi Esra,
> > > > > > 
> > > > > > Going back to the original problem statement, I see something that
> > > > > > looks illogical to me - please correct me if I'm wrong:
> > > > > > 
> > > > > > On Apr 30, 2008, at 3:21 AM, esra wrote:
> > > > > > > i am using lucene's "IndexSearcher" to search the given xml by
> > > > > > > keyword which contains farsi information. while searching i use
> > > > > > > ranges like
> > > > > > > 
> > > > > > > آ-ث  |  ج-خ  |  د-ژ  |  س-ظ  |  ع-ق  |  ک-ل  |  م-ی
> > > > > > > 
> > > > > > > when i do search for  "د-ژ"  range the results are wrong , they
> > > > > > > are the results of  " س-ظ "range.
> > > > > > > 
> > > > > > > > for example when i do search for "د-ژ"  one of the results is
> > > > > > > > "ساب ووفر", this result is also shown on the " س-ظ " range's result
> > > > > > > > list which is the correct range.
> > > > > > > > 
> > > > > > > > As IndexSearcher uses the "compareTo" method and this method uses
> > > > > > > unicodes for comparing, i found the unicodes of the characters.
> > > > > > > 
> > > > > > > د=U+62F
> > > > > > > ژ = U+698
> > > > > > > and the first letter of "ساب ووفر " is  س = U+633
> > > > > > 
> > > > > > > It appears to me that *both* the "د-ژ" range [ U+062F - U+0698 ] and
> > > > > > > the "س-ظ" range [ U+0633 - U+0638 ] contain the first letter of "ساب
> > > > > > > ووفر", which is "س" = U+0633.
> > > > > > > 
> > > > > > > You stated that U+0633 should be contained in the [ U+0633 - U+0638 ]
> > > > > > > range - I agree - but why do you think U+0633 should not be
> > > > > > > contained in the [

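The mismatch discussed in the thread above (code-point order versus Persian alphabetical order) can be reproduced outside Lucene. The following is only a minimal sketch: the class name, the helper methods and the hard-coded six-letter slice of the Persian alphabet are assumptions made for illustration, and nothing here is Lucene API. It compares a range test based on String.compareTo, which orders by UTF-16 value, with a range test based on position in the alphabet.

```java
class FarsiRangeCheck {
    // Hypothetical helper data: a slice of the Persian alphabet in dictionary
    // order (dal, zal, re, ze, zhe, sin), hard-coded for illustration only.
    static final String PERSIAN_ORDER = "\u062F\u0630\u0631\u0632\u0698\u0633";

    // Range test by UTF-16 value, which is how String.compareTo orders strings.
    static boolean inCodePointRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    // Range test by position of a letter in PERSIAN_ORDER.
    static boolean inAlphabetRange(char c, char lower, char upper) {
        int pos = PERSIAN_ORDER.indexOf(c);
        return pos >= 0
                && pos >= PERSIAN_ORDER.indexOf(lower)
                && pos <= PERSIAN_ORDER.indexOf(upper);
    }

    public static void main(String[] args) {
        String dal = "\u062F"; // U+062F
        String ze = "\u0632";  // U+0632, one dot
        String zhe = "\u0698"; // U+0698, three dots
        String sin = "\u0633"; // U+0633, first letter of the disputed result

        // U+0633 lies between U+062F and U+0698, so sin is inside [dal, zhe].
        System.out.println(inCodePointRange(sin, dal, zhe)); // true
        // U+0633 is greater than U+0632, so sin is outside [dal, ze].
        System.out.println(inCodePointRange(sin, dal, ze));  // false
        // In alphabet order sin comes after zhe, so it is outside [dal, zhe].
        System.out.println(inAlphabetRange('\u0633', '\u062F', '\u0698')); // false
    }
}
```

The first and third lines of output disagree, which is the behaviour the customer objected to: a range search that follows the alphabet needs a locale-aware comparison rather than raw code-point order.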
Re: hybrid query (lucene + db)

2008-05-02 Thread Stephane Nicoll
Hi,

Thanks for the response. The main reason we're using Lucene is
because we're building a product that must support different databases
(Oracle 10, Oracle 11 and PostgreSQL with spatial extensions).

I had a look at this project already, but we cannot stick to one database vendor.

Cheers,
Stéphane

On Fri, May 2, 2008 at 6:55 PM, Marcelo Ochoa <[EMAIL PROTECTED]> wrote:
> Hi Stéphane:
>   If you are using Oracle Spatial I assume that you are using Oracle
>  too for storing text :)
>   Have you taken a look at the Oracle-Lucene integration project sponsored
>  by LendingClub.com?
>  http://docs.google.com/Doc?id=ddgw7sjp_54fgj9kg
>  http://sourceforge.net/project/showfiles.php?group_id=56183&package_id=255524&release_id=589900
>   It's a new domain index for Oracle using Lucene inside the Oracle JVM.
>   By doing that we can use Lucene as Oracle Text, but with many other
>  features, and using inline pagination we can get better performance
>  than the latest 11g Text Compound Domain Index.
>   If you are interested in this implementation simply drop me an email.
>   Best regards, Marcelo.
>
>
>
>  On Fri, May 2, 2008 at 3:58 AM, Stephane Nicoll
>  <[EMAIL PROTECTED]> wrote:
>  > Well for the moment we don't. The lucene index only contains the full
>  >  text content (indexed, not stored). We use lucene to perform full text
>  >  and fuzzy searches on the keywords field. Once we have the result, we
>  >  match them with the geospatial box provided by the user (we use Oracle
>  >  Spatial for that). We have no notion of city, state or zip code. Data
>  >  overlaps more than one country most of the time, actually.
>  >
>  >  We are thinking of reimplementing a quad tree in lucene to flag each
>  >  item with a spatial area. That way we will be able to pre-filter the
>  >  zone accordingly.
>  >
>  >  Still, this does not explain the deadlock on SegmentReader. If anyone
>  >  has an idea...
>  >
>  >  Thanks,
>  >  Stéphane
>  >
>  >
>  >
>  >  On Thu, May 1, 2008 at 8:50 PM, Michael Stoppelman <[EMAIL PROTECTED]> wrote:
>  >  > Stephane,
>  >  >
>  >  >  Could you describe how you set up the spatial area? Having a BooleanQuery
>  >  >  with 200 terms in it definitely slows things down (I'm not sure exactly
>  >  >  why yet -- it seems like it shouldn't be "that" slow). If you can describe
>  >  >  your spatial area in fewer terms you can get much better performance. It
>  >  >  just depends on how you're describing your spatial areas and the number of
>  >  >  results in each zipcode. If you had a field like "city,state" in your
>  >  >  index you would have far fewer terms in your query than if that query had
>  >  >  all the zipcodes in a "city,state" combo, thus making your query much
>  >  >  faster.
>  >  >
>  >  >  M
>  >  >
>  >  >  On Thu, May 1, 2008 at 2:15 AM, mark harwood <[EMAIL PROTECTED]>
>  >  >  wrote:
>  >  >
>  >  >
>  >  >
>  >  >  > The issue here is a general one of trying to perform an efficient join
>  >  >  > between an external resource (rdbms) and Lucene.
>  >  >  > This experiment may be of interest:
>  >  >  > http://issues.apache.org/jira/browse/LUCENE-434
>  >  >  >
>  >  >  > KeyMap.java embodies the core service which translates from lucene
>  >  >  > doc ids to DB primary keys or vice versa.
>  >  >  > There are a couple of implementations of KeyMap that are not optimal
>  >  >  > (they pre-date Lucene's FieldCache) but it may give you food for thought.
>  >  >  >
>  >  >  > Cheers
>  >  >  > Mark
>  >  >  >
>  >  >  >
>  >  >  > - Original Message 
>  >  >  > From: Stephane Nicoll <[EMAIL PROTECTED]>
>  >  >  > To: java-user@lucene.apache.org
>  >  >  > Sent: Thursday, 1 May, 2008 9:00:33 AM
>  >  >  > Subject: hybrid query (lucene + db)
>  >  >  >
>  >  >  > Hi there,
>  >  >  >
>  >  >  > We're using lucene with Hibernate search and we're very happy so far
>  >  >  > with the performance and the usability of lucene. We have however a
>  >  >  > specific use case that prevents us from using only lucene: spatial
>  >  >  > queries. I already sent a mail on this list a while back about the
>  >  >  > problem and we started investigating multiple solutions.
>  >  >  >
>  >  >  > When the user selects a geographic area and some keywords we do the
>  >  >  > following:
>  >  >  >
>  >  >  > * Perform a search on the lucene index for the keywords with a
>  >  >  > projection that returns only the primaryKey of the element sorted by
>  >  >  > primary key
>  >  >  > * Perform a search on the database with other criteria and a
>  >  >  > projection that returns only the primary key of the elements
>  >  >  > * Iterate on both lists to find N matching IDs, optionally with paging
>  >  >  > (so from X to X + N where X is the first result of the page)
>  >  >  > * Run a query on the database to return the actual objects (select a
>  >  >  > from MyClass a where a.id IN (the list of matching IDs) ). We limit
>  >  >  > the page to 1000 results
>  >  >  >
>  
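
The "iterate on both lists to find N matching IDs" step in the quoted message above can be sketched as a merge over two sorted ID lists. This is only an illustration under stated assumptions: the class and method names are hypothetical, the IDs are assumed to be numeric, and both inputs are assumed to be sorted ascending (the original message only says the Lucene result is sorted by primary key, so the database query would need a matching ORDER BY).

```java
import java.util.ArrayList;
import java.util.List;

class SortedIdIntersection {
    // Returns up to pageSize IDs present in both sorted arrays, skipping the
    // first `offset` matches (offset plays the role of X in the description).
    static List<Long> intersectPage(long[] luceneIds, long[] dbIds,
                                    int offset, int pageSize) {
        List<Long> page = new ArrayList<>();
        int i = 0;       // cursor into luceneIds
        int j = 0;       // cursor into dbIds
        int matched = 0; // number of common IDs seen so far
        while (i < luceneIds.length && j < dbIds.length && page.size() < pageSize) {
            if (luceneIds[i] < dbIds[j]) {
                i++; // ID only in the Lucene result
            } else if (luceneIds[i] > dbIds[j]) {
                j++; // ID only in the database result
            } else {
                if (matched >= offset) {
                    page.add(luceneIds[i]);
                }
                matched++;
                i++;
                j++;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        long[] fromLucene = {1, 3, 5, 7, 9};
        long[] fromDb = {3, 4, 5, 9, 10};
        System.out.println(intersectPage(fromLucene, fromDb, 0, 2)); // [3, 5]
        System.out.println(intersectPage(fromLucene, fromDb, 1, 2)); // [5, 9]
    }
}
```

The loop stops as soon as the page is full, so a page near the start of the result set does not require walking either list to the end.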

Hibernate search (Problem adding new Record)

2008-05-02 Thread oyesiji

I am using Hibernate Search in my application. The first time I attempt to
index records from the database it works, but the second time I attempt to
add records I notice that it does not work.

FullTextSession fullTextSession = Search.createFullTextSession(session);
for (JobDescription jobDescription : jobDescriptions) {
    fullTextSession.index(jobDescription);
}

Any suggestion is welcome
-- 
View this message in context: 
http://www.nabble.com/Hibernate-search-%28Problem-adding-new-Record%29-tp17029563p17029563.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Hibernate search (Problem adding new Record)

2008-05-02 Thread Otis Gospodnetic
Hi,

Hibernate Search hasn't been talked about much on this list, so you may not get 
much help, if any.  Have you tried asking on the Hibernate Search mailing list? 
(I don't know its address/site).

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message 
> From: oyesiji <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Saturday, May 3, 2008 1:23:42 AM
> Subject: Hibernate search (Problem adding new Record)
> 
> 
> I am using Hibernate Search in my Application, the first time i attempt to
> index records from the database it works and the second time i attempt to
> add records i notice that it does not work
> 
> FullTextSession fullTextSession = Search.createFullTextSession(session);
> for (JobDescription jobDescription : jobDescriptions) {
>     fullTextSession.index(jobDescription);
> }
> 
> Any suggestion is welcome



RE: Hibernate search (Problem adding new Record)

2008-05-02 Thread John Griffin
Try this:

FullTextSession fullTextSession = Search.createFullTextSession(session);
for (JobDescription jobDescription : jobDescriptions) {
    fullTextSession.save(jobDescription);  // save() here instead of index()
}

.index causes a reindex, not a save.

John G.

-Original Message-
From: oyesiji [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 02, 2008 5:24 PM
To: java-user@lucene.apache.org
Subject: Hibernate search (Problem adding new Record)


I am using Hibernate Search in my Application, the first time i attempt to
index records from the database it works and the second time i attempt to
add records i notice that it does not work

FullTextSession fullTextSession = Search.createFullTextSession(session);
for (JobDescription jobDescription : jobDescriptions) {
    fullTextSession.index(jobDescription);
}

Any suggestion is welcome



RE: Hibernate search (Problem adding new Record)

2008-05-02 Thread John Griffin
P.S.

The Hibernate Search forum is at 

http://forum.hibernate.org/viewforum.php?f=9

John G.



-Original Message-
From: oyesiji [mailto:[EMAIL PROTECTED] 
Sent: Friday, May 02, 2008 5:24 PM
To: java-user@lucene.apache.org
Subject: Hibernate search (Problem adding new Record)


I am using Hibernate Search in my Application, the first time i attempt to
index records from the database it works and the second time i attempt to
add records i notice that it does not work

FullTextSession fullTextSession = Search.createFullTextSession(session);
for (JobDescription jobDescription : jobDescriptions) {
    fullTextSession.index(jobDescription);
}

Any suggestion is welcome