hi Ritu,
>
This is Jaison. I am new to this kind of search. Can you help me
out? I want somebody who can guide me in Lucene.
> Thanks in advance.
>
>
> -
>
Hi
I think you might want to look at Hibernate Search. You can use projections
which basically store instance fields in the index. It does not store the
object in a serialised form in the index. It holds a reference (id) to the
persistent entity.
Cheers
Amin
On Sat, Jul 4, 2009 at 2:39 AM, Er
Hi there,
On Fri, Jul 3, 2009 at 9:32 PM, MilleBii wrote:
> I want to store in the index a data structure and load it back at search
> time.
>
> Is it safe to serialize the java object store it and load it back later ?
It won't be particularly fast or efficient, but it will work.
> Presumably
You can add a serialized object easily as a stored field to a document, just
serialize the object to a byte[] array and store this in the index, e.g.:
ByteArrayOutputStream serData=new ByteArrayOutputStream();
ObjectOutputStream out=new ObjectOutputStream(serData);
try {
out.writeObject(d
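For reference, the serialize-to-byte[] idea above can be sketched as a complete round trip using only java.io. This is a minimal sketch, not the code from the original mail: the class name SerializationSketch and the helpers toBytes/fromBytes are made up for illustration, and the Lucene side (adding the bytes to a document as a stored field and reading them back) is left out on purpose.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of Lucene.
public class SerializationSketch {

    // Serialize a Serializable object into a byte[] that could be stored in the index.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) before the bytes are read.
        return buffer.toByteArray();
    }

    // Rebuild the object from bytes loaded back at search time.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of the stored object is not on the classpath", e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = toBytes("some text fragment");
        System.out.println(fromBytes(stored));
    }
}
```

Note that the class of the stored object has to be available, in a compatible version, wherever the index is read.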
> That is one way, or you do it base64 encoded in a text field if don't
> care about space at all. :)
Lucene also has binary fields for storing. Searching on such fields does
not make sense, so it's OK not to be able to index them (how should that
work?).
I have this use case, too. Sometimes it is
On Sat, Jul 4, 2009 at 10:15 AM, Uwe Schindler wrote:
>> That is one way, or you do it base64 encoded in a text field if don't
>> care about space at all. :)
just for clarification:
One way, Java object serialization, is not efficient at all. It takes a
lot of space and performance is crap.
other wa
Well,
During indexing phase (I'm actually running Nutch), I'm also extracting data
about my pages including some text fragments.
So I'd like to store the resulting objects in lucene index, and reload them
at search time for further manipulation.
I was wondering which way was the simplest.
2009/7
Right I'm not indexing such fields, they are actually a kind of document
property of my own
2009/7/4 Uwe Schindler
> > That is one way, or you do it base64 encoded in a text field if don't
> > care about space at all. :)
>
> Lucene also has binary fields for storing. Searching on such fields do
Then see my other mail about Java Serialization. It works (but not so fast),
but is the simplest way to do it.
I do not use the serialized fields during searching, I store them only for
usage in some special maintenance tasks on the indexed documents. So it's
the same use-case.
For this use case
OK thanks for the tip on Java object serialization performance.
Most of what I have to store/retrieve is straightforward so I can do it by
hand.
What pushed me toward object serialization is that I want to store/retrieve text
fragments of undefined content.
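Doing it "by hand" could look roughly like the sketch below, which also covers text fragments of arbitrary length by writing each one as a length-prefixed UTF-8 byte run. The class and method names (ManualEncodingSketch, encodeFragments, decodeFragments) are hypothetical, and the layout (an int count, then an int length plus bytes per fragment) is just one possible choice, not something from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Lucene or Nutch.
public class ManualEncodingSketch {

    // Layout: [int count] then, per fragment, [int byteLength][UTF-8 bytes].
    static byte[] encodeFragments(List<String> fragments) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(fragments.size());
            for (String fragment : fragments) {
                byte[] utf8 = fragment.getBytes(StandardCharsets.UTF_8);
                out.writeInt(utf8.length);
                out.write(utf8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads back exactly what encodeFragments wrote.
    static List<String> decodeFragments(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int count = in.readInt();
            List<String> fragments = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                byte[] utf8 = new byte[in.readInt()];
                in.readFully(utf8);
                fragments.add(new String(utf8, StandardCharsets.UTF_8));
            }
            return fragments;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The explicit length prefix is used instead of DataOutputStream.writeUTF because writeUTF cannot encode strings whose encoded form exceeds 65535 bytes.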
2009/7/4 Simon Willnauer
> On Sat, Jul 4,
OK, thanks guys. I see the different options more clearly now.
2009/7/4 Uwe Schindler
> Then see my other mail about Java Serialization. It works (but not so
> fast),
> but is the simplest way to do it.
>
> I do not use the serialized fields during searching, I store them only for
> usage in some sp
Hi,
I used the StandardAnalyzer. I changed to WhitespaceAnalyzer, so now I get
results when I search for '1%', for example, but if I type only '%' I
still get results.
/***
doc = new Document();
nameField = new
Field("name",strN,Field.Store.YES,Field.Index.TOKENIZED,Field.TermVector.WITH_POSITI
I might be skirting the issue here, but wouldn't it be easier and
faster if you removed the sid before you add it to the index?
Cheers,
Shayak
On Sat, Jul 4, 2009 at 3:03 AM, Erick Erickson wrote:
> WARNING: I haven't actually tried using RegexTermEnum in a
> long time, but...
>
> I *think* that th
It works, thanks.
I thought I had to call next() to know IF there was a term, as you normally
do with hasNext() - next() using iterators, but I was wrong.
So, in order to know if there is a match, I have to check if rte.term() is
null, correct?
Then I can use next() to look for additional matches.
Yes, I thought about this solution too, but the problem is that the "sid"
part can be different in different domains.
So, sometimes we have sid=..., other times we have s= and so on.
If we decide to solve the problem by removing the sid from the url in the
index, when we discover a new "patter
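For what it's worth, stripping the session parameter before indexing could be sketched like this, with the parameter names passed in so that sid, s, or any other name can be configured per domain. This is only an illustration under assumptions: the class UrlCleanerSketch, the method stripParams, and the example URLs are invented here, and the sketch only handles the query string of a URL.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
public class UrlCleanerSketch {

    // Removes every query parameter whose name is in 'names'; everything else is kept as-is.
    static String stripParams(String url, Set<String> names) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url; // no query string, nothing to strip
        }
        String base = url.substring(0, q);
        String query = url.substring(q + 1);
        String fragment = "";
        int hash = query.indexOf('#');
        if (hash >= 0) {
            fragment = query.substring(hash);
            query = query.substring(0, hash);
        }
        StringBuilder kept = new StringBuilder();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (names.contains(name)) {
                continue; // drop the session parameter
            }
            if (kept.length() > 0) {
                kept.append('&');
            }
            kept.append(pair);
        }
        return base + (kept.length() > 0 ? "?" + kept : "") + fragment;
    }

    public static void main(String[] args) {
        Set<String> sessionParams = new HashSet<>(Arrays.asList("sid", "s"));
        System.out.println(stripParams("http://example.com/page?sid=abc123&id=7", sessionParams));
    }
}
```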
This is about an experiment comparing plain Boolean retrieval with
vector-space-based retrieval.
I would like to disable all of Lucene's scoring mechanisms and just
run a true Boolean query that returns exactly the documents that match a
query specified in Boolean syntax (OR, AND, NOT). No scori
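To make concrete what plain Boolean retrieval means in this experiment, here is a small self-contained sketch that evaluates AND, OR and NOT as set operations over posting lists, with no scores involved. It is not Lucene code: the class BooleanRetrievalSketch and its methods are invented for illustration, and documents are simply numbered in insertion order.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy index, independent of Lucene.
public class BooleanRetrievalSketch {
    // term -> set of document ids containing that term
    private final Map<String, BitSet> postings = new HashMap<>();
    private int docCount = 0;

    // Adds a document given its terms and returns its id.
    int addDocument(String... terms) {
        int id = docCount++;
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new BitSet()).set(id);
        }
        return id;
    }

    // Documents containing the term (a copy, so callers can modify it freely).
    BitSet term(String term) {
        BitSet hits = postings.get(term);
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }

    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    // "a AND NOT b"
    static BitSet andNot(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.andNot(b);
        return result;
    }
}
```

A document either is in the result set or it is not; there is no ranking, which is the behaviour asked for above.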
Check out BooleanFilter in contrib/queries. It can be wrapped in a
ConstantScoreQuery.
On 4 Jul 2009, at 17:37, Lukas Michelbacher
wrote:
This is about an experiment comparing plain Boolean retrieval with
vector-space-based retrieval.
I would like to disable all of Lucene's scoring mechani
It is also possible to use the HitCollector API and simply ignore
the score values.
Regards,
Paul Elschot
On Saturday 04 July 2009 21:14:41 Mark Harwood wrote:
>
> Check out booleanfilter in contrib/queries. It can be wrapped in a
> constantScoreQuery
>
>
>
> On 4 Jul 2009, at 17:37, Lukas
I want to have a "citystate" field in Lucene index which will store various
city state values like:
Chicago, IL
Boston, MA
San Diego, CA
How do I store these values (should it be tokenized or non-tokenized?) in
Lucene and
how do I generate a query (should it be phrasequery or termquery or
somet