newly created
documents; it would be very computationally intensive.
--
Regards
Kasun Perera
be created only once, so that I can access the
in-memory index as long as the web app is live?
--
Regards
Kasun Perera
Resending, since my question didn't get much attention.
-- Forwarded message --
From: Kasun Perera
Date: Tue, Jun 19, 2012 at 3:26 PM
Subject: Different Weights to Lucene fields with Okapi Similarity
To: java-user@lucene.apache.org
Based on this link http://www200
tested it some time back and the code worked for
me, but I think it needed Lucene-core-2.9.jar. Hope this helps.
I can't see any Java code files in your attached ZIP file; it only contains
some text files.
Regards
Kasun Perera
On Mon, Jul 2, 2012 at 12:09 PM, nadeesha meththananda <
neranja
freq(t, doc) is the frequency of term t in document doc.
Choosing b=0.25 and k = 1.2 you get
w(t, doc) = idf(t) * 2.2*freq(t, doc) / (1.2*(0.25+0.75*ls(doc)) + freq(t, doc))
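
The weight above can be checked numerically with a few lines of plain Java. This is only a sketch of the formula exactly as written in this message, not Lucene code: the class and method names are hypothetical, the constant 2.2 is taken to be k + 1 for k = 1.2, and ls(doc) is assumed to be the document length divided by the average document length.

```java
public class Bm25Weight {
    // k = 1.2 as chosen in the message; 2.2 in the formula is then k + 1.
    static final double K = 1.2;

    /**
     * w(t, doc) = idf(t) * 2.2*freq(t, doc) / (1.2*(0.25 + 0.75*ls(doc)) + freq(t, doc))
     *
     * idf  : inverse document frequency of the term, computed elsewhere
     * freq : frequency of the term in the document
     * ls   : assumed here to be docLength / averageDocLength
     */
    static double weight(double idf, double freq, double ls) {
        double lengthNorm = 0.25 + 0.75 * ls;
        return idf * (K + 1.0) * freq / (K * lengthNorm + freq);
    }

    public static void main(String[] args) {
        // A term occurring 3 times in a document of average length (ls = 1).
        System.out.println(weight(1.0, 3.0, 1.0));
        // The same term in a document twice the average length scores lower.
        System.out.println(weight(1.0, 3.0, 2.0));
    }
}
```

With ls = 1 the length term is 1, so the weight depends only on idf and freq. Longer documents (ls > 1) are penalised and shorter ones are favoured.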
--
Regards
Kasun Perera
On Mon, Jun 18, 2012 at 8:48 AM, Kasun Perera wrote:
> I want to calculate the average document length for a document collection in which
> each document has 3 different fields (field1, field2, field3)
>
> This is the program to calculate average length when only one field is
> th
calculating the average document
length for 3 fields is correct?
--
Regards
Kasun Perera
equation that I can use for calculating cosine
similarity between documents?
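
For reference, the usual definition is cos(d1, d2) = (d1 . d2) / (|d1| * |d2|), taken over the two documents' term-weight vectors. Below is a minimal sketch in plain Java, assuming each document has already been reduced to a map from term to weight (for example, TF-IDF values built from stored term vectors). The class and method names are hypothetical and are not part of Lucene.

```java
import java.util.HashMap;
import java.util.Map;

public class CosineSimilarity {

    /** Cosine of the angle between two sparse term-weight vectors. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            double wa = e.getValue();
            normA += wa * wa;
            Double wb = b.get(e.getKey());
            if (wb != null) {
                dot += wa * wb; // only terms shared by both documents contribute
            }
        }
        for (double wb : b.values()) {
            normB += wb * wb;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here for empty documents
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Double> d1 = new HashMap<>();
        d1.put("lucene", 2.0);
        d1.put("index", 1.0);
        Map<String, Double> d2 = new HashMap<>();
        d2.put("lucene", 1.0);
        d2.put("search", 3.0);
        System.out.println(cosine(d1, d2));
    }
}
```

Returning 0 for an empty document is a design choice in this sketch. Throwing an exception would be equally defensible.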
--
Regards
Kasun Perera
Lucene?
Thanks
> --
> Ian.
>
>
> On Fri, May 11, 2012 at 8:58 AM, Kasun Perera
> wrote:
> > I have a collection of documents (say 10 documents) and I'm indexing them this
> > way, by storing the term vector
> >
> > StringReader strRdElt = new Str
I have a collection of documents (say 10 documents) and I'm indexing them this
way, by storing the term vector
StringReader strRdElt = new StringReader(content);
Document doc = new Document();
String docname=docNames[docNo];
doc.add(new Field("doccontent", strRdElt, Field.TermVector.Y
ne that can be used to index by semantics, so that
it indexes "owe", "owed" and "owing" as one word "owe" with term frequency = 3?
If not, I'd welcome any suggestions for achieving this task.
--
Regards
Kasun Perera
eDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
searcher.search(q, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
return hits.length;
}
--
Regards
Kasun Perera
eight to Taxonomy and Ontology terms in
> > document similarity calculation?
> >
> >
> > Are there Lucene functions that can be used to give higher weights to
> > certain fields when calculating TFIDF values using TermFreqVector? Can I
> > jus
y calculation?
Are there Lucene functions that can be used to give higher weights to
certain fields when calculating TFIDF values using TermFreqVector? Can I
just use the setBoost() function for this purpose, and if so, how?
--
Regards
Kasun Perera