-
--
Email: wuqiu.m...@qq.com
--
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-get-payload-of-a-term-after-IndexSearch-search-tp4021789p4073708.html
Sent from the Lucene - Java Users mailing list archive at Nabble
up
View this message in context:
http://lucene.472066.n3.nabble.com/why-did-I-build-index-slower-and-slower-tp4062798p4063395.html
Jack, going by what you said, how can I implement this requirement? Could you give me
a clue? Thank you very much. The regex query did not seem to work. I built the
field like this:
FieldType fieldType = new FieldType();
FieldInfo.IndexOptions indexOptions =
FieldInfo.IndexOptions.DOCS
Yes, thank you. I also found the problem: I should make the writer a
singleton, but my writer was committed and closed after every batch. That is, in every
build:
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
iwc.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
iwc.setR
My situation is that there are 10,000,000 documents, and I build the index every
5,000 documents. In *every build*, I follow these steps:
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
iwc.setOpenMode(IndexWriterConfig.OpenMode.CREATE_O
OK, thanks. But how can I implement this requirement? Jack gave me a clue, but I
failed: it returns no docs when I try a regex query like
"jakarta.{1,10}apache". Are there some limitations when using a regex query, such as
the field not being indexed, and so on?
That's the question. When I get the docs with QueryParser("jakarta apache"~10),
they do match the query syntax, but the match depends on word position
and not on offset, and that is not my intent. There are some docs which
satisfy ("jakarta apache"~10) but do not satisfy the regex
"jakarta.{1,1
As far as I know, the syntax *"jakarta apache"~10* is a PhraseQuery with
slop=10 measured in positions, but what I want is *based on offset*, not on position.
Can anyone help me? Thanks.
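One possible way to get offset-based matching, sketched here under assumptions: a regex query in Lucene is matched against individual indexed terms, not against the whole field text, so a pattern that spans two words is unlikely to hit anything. A workaround is to fetch candidates with the position-based phrase query and then check the stored text of each candidate with java.util.regex. The class and method names below are hypothetical, and the candidate texts are assumed to come from a stored field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical post-filter: keeps only texts where "apache" starts
// 1 to 10 characters after the end of "jakarta" (character distance, not token positions).
class OffsetProximityFilter {
    // DOTALL lets '.' also match line breaks inside the gap.
    static final Pattern JAKARTA_NEAR_APACHE =
            Pattern.compile("jakarta.{1,10}apache", Pattern.DOTALL);

    // candidateTexts: stored field values of docs returned by the phrase query (assumption).
    static List<String> filter(List<String> candidateTexts) {
        List<String> kept = new ArrayList<>();
        for (String text : candidateTexts) {
            if (JAKARTA_NEAR_APACHE.matcher(text).find()) {
                kept.add(text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList(
                "jakarta and apache",            // gap of 5 characters
                "jakarta a b c d e f g apache"); // gap of 15 characters
        System.out.println(filter(candidates));
    }
}
```

The second candidate would pass a phrase query with slop=10 in most analyzers, but it is dropped here because the character gap is too wide.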
View this message in context:
I found SortField and Sort, but they only sort by a field,
and what I want is to sort by groupDocs.totalHits.
Does anyone know how? Thanks.
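A minimal sketch of one workaround, assuming the group results have already been collected and that each group exposes a hit count: copy the value and count of each group into a small holder and sort in memory. GroupCount and sortByTotalHitsDesc are hypothetical names for illustration, not part of the grouping module.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GroupSortSketch {
    // Hypothetical holder for one group's value and its hit count.
    static final class GroupCount {
        final String groupValue;
        final int totalHits;

        GroupCount(String groupValue, int totalHits) {
            this.groupValue = groupValue;
            this.totalHits = totalHits;
        }
    }

    // Returns a new list ordered by totalHits, largest first; the input is left untouched.
    static List<GroupCount> sortByTotalHitsDesc(List<GroupCount> groups) {
        List<GroupCount> sorted = new ArrayList<>(groups);
        sorted.sort(Comparator.comparingInt((GroupCount g) -> g.totalHits).reversed());
        return sorted;
    }
}
```

This only orders the groups that were actually retrieved, so every group of interest has to be requested from the search first.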
View this message in context:
http://lucene.472066.n3.nabble.com/Could-g
As the title says, I'm totally puzzled.
Can anyone explain it with an example?
Thanks.
View this message in context:
http://lucene.472066.n3.nabble.com/what-s-the-difference-of-facet-and-group-search-tp4037914.html
Hmm, it seems nice, but I'm puzzled by your answer and the one from Andrew Gilmartina above:
what's the difference between the two approaches?
I'm also reading the reference about how to
*extract relevant terms from the top document(s)*.
Anyway, thanks.
In short,
you put in a term like "Lucene",
and the ideal output would be "solr", "index", "full-text search", and so
on.
How can I do this, i.e. find the related words? Thanks.
My idea is to use FuzzyQuery or MoreLikeThis, or to calculate a score for all
the terms and then sort.
Any ideas?
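To make the "score all the terms and then sort" idea concrete, here is a rough sketch under assumptions: each document is already tokenized into a list of terms, and the score of a term is simply the number of documents in which it appears together with the seed term. The names relatedTerms, seed, docs and topN are hypothetical, and a real system would likely use a better weighting than raw co-occurrence counts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RelatedTermsSketch {
    // Ranks terms by how many documents contain both the term and the seed.
    static List<String> relatedTerms(String seed, List<List<String>> docs, int topN) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> unique = new HashSet<>(doc); // count each term once per document
            if (!unique.contains(seed)) {
                continue;
            }
            for (String term : unique) {
                if (!term.equals(seed)) {
                    counts.merge(term, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken alphabetically so the order is deterministic.
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < topN; i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Common words will dominate raw counts, so stop-word removal or some normalization by overall term frequency is probably needed before the output looks like the "ideal" list above.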
It seems that a doc is not really deleted until the next index merge or something like that.
I'm not sure.
View this message in context:
http://lucene.472066.n3.nabble.com/IndexWriter-deleteDocuments-tp4037365p4037377.html
I found it is very easy to run into OutOfMemoryError.
My idea is that Lucene could size its RAM usage automatically,
but I couldn't find the API. My code:
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
int mb = 1024 * 1024;
double ram = Runtime.getRuntime().maxMemory
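For illustration only, one way to derive a buffer size from the JVM heap could look like the sketch below. It is an assumption about the intent of the truncated code above, not a reconstruction of it: it takes a fraction of Runtime.getRuntime().maxMemory() and clamps it to a range, so the value could then be handed to iwc.setRAMBufferSizeMB(...). The method name ramBufferMb and the fraction and bounds are hypothetical choices.

```java
class RamBufferSketch {
    // Returns heap * fraction in MB, clamped to [minMb, maxMb].
    static double ramBufferMb(long maxHeapBytes, double fraction, double minMb, double maxMb) {
        double mb = maxHeapBytes / (1024.0 * 1024.0) * fraction;
        return Math.max(minMb, Math.min(maxMb, mb));
    }

    public static void main(String[] args) {
        // maxMemory() can return Long.MAX_VALUE when the JVM has no limit; the clamp covers that.
        double mb = ramBufferMb(Runtime.getRuntime().maxMemory(), 0.25, 16.0, 512.0);
        System.out.println("RAM buffer (MB): " + mb);
        // With Lucene on the classpath this value would be passed on, e.g.:
        // iwc.setRAMBufferSizeMB(mb);
    }
}
```

A larger buffer means fewer flushes but more heap held by the writer, so a small fraction of the heap is the safer starting point if OutOfMemoryError is already a problem.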
Hello, but there is no getCommitUserData in IndexReader.
How can I get the user data?
Thanks.
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-add-attributes-to-a-field-just-like-term-s-p
Ha, sounds good, and that perhaps already satisfies my need.
Thanks so much.
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-add-attributes-to-a-field-just-like-term-s-payload-tp4031045p4
Hello. As we know, we can add a payload to a term,
but can we also add extra custom info to a field,
such as a description of the field, i.e. a property shared by that field
across all documents?
How can I do this? Thanks.
Thanks very much!
LingPipe and GATE are very useful, and new to me,
but is it too big a task to implement a custom item like
class TestPostingItem
{
    int termId;
    long startOffset;
    long endOffset;
    float score;
    int segId;
    long timeStamp;
} ?
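As a sketch of the payload route, assuming a fixed-width binary layout is acceptable: the fields of TestPostingItem can be packed into a byte array with java.nio.ByteBuffer and unpacked again at search time. PostingItemCodec and its layout are hypothetical; in Lucene 4.x the resulting bytes would, as far as I know, be wrapped in a BytesRef and attached to the token through PayloadAttribute in a TokenFilter, which is not shown here.

```java
import java.nio.ByteBuffer;

// Same fields as the class in the question.
class TestPostingItem {
    int termId;
    long startOffset;
    long endOffset;
    float score;
    int segId;
    long timeStamp;
}

// Hypothetical fixed-width codec: int, long, long, float, int, long (big-endian).
class PostingItemCodec {
    static final int ENCODED_BYTES = 4 + 8 + 8 + 4 + 4 + 8; // 36 bytes per item

    static byte[] encode(TestPostingItem item) {
        return ByteBuffer.allocate(ENCODED_BYTES)
                .putInt(item.termId)
                .putLong(item.startOffset)
                .putLong(item.endOffset)
                .putFloat(item.score)
                .putInt(item.segId)
                .putLong(item.timeStamp)
                .array();
    }

    static TestPostingItem decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        TestPostingItem item = new TestPostingItem();
        item.termId = buf.getInt();
        item.startOffset = buf.getLong();
        item.endOffset = buf.getLong();
        item.score = buf.getFloat();
        item.segId = buf.getInt();
        item.timeStamp = buf.getLong();
        return item;
    }
}
```

At 36 bytes per occurrence the payloads can grow the index noticeably, so a variable-length encoding may be worth considering if most values are small.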
After I finish "packing your information into a payload",
is there some way to search using that information?
What is "PayloadTermQuery" for?
Thanks.
View this message in context:
http://lucene.47206
Thanks, Mike.
About the third question: is "encode them all into the payload" better than
"a new postings format with the codec"?
I mean replacing the original posting item (position, startOffset, endOffset,
payload) with my own inverted item such as
class TestPostingItem
{
int termId;
l
s the
offset of the term 'lucene', and 33.2 is a score, and 2 is some id. My
question is how I can get it indexed.
My first idea is to implement my own posting list format, but is it possible
to do it with the startOffset, endOffset and payload?
Thanks.
wgggfiy
--
View this message in
Wow, exactly!!
Thanks, Jack. Good idea.
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-do-re-get-the-doc-after-the-doc-was-indexed-tp4020865p4020868.html
For example:
I indexed a doc with path=c:/books/appach.txt, author=Mike.
After a long time, I wanted to change the author to John.
But the question is how I can get exactly the same doc quickly.
My idea is to traverse the docs from id=0 to id=maxDoc(),
retrieve each one with its stored fields, and check its
Has anyone resolved this?
Thanks.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Retrieval-of-the-position-of-indexed-terms-tp4015079p4020835.html
I'm studying the index format in depth
and writing Java utilities to log all of it.
So far I have successfully logged .si, .fnm, .fdx, .fdt,
but .tim and .tip are too complicated...
--
View this message in context:
http://lucene.472066.n3.nabble.com/Lucene-Index-File-Format-tp4011133p4020685.html
Me too!
Could you explain how you solved it?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Lucene-4-0-Get-All-Index-Terms-tp3686023p4020683.html
--
26 matches