Hi all,
During my time unemployed, the happiest thing has been diving into the
Lucene source code. Thanks for all the work.
About BooleanQuery: I have run into a question about how a BooleanQuery
executes. Although BooleanQuery#rewrite already does some work to
remove duplicate FILTER,
I was wondering if anyone can shed some light on an issue we're having:
we're comparing two different indexes on the same collection - one with
lots of different segments (default settings), and one with a force
merged into one segment. It seems that search is sometimes faster with
multiple segmen
Erick,
Thank you very much. Is there any performance comparison between these two
versions? Is there any data on the performance?
Thank you very much
Best Regards
---
Shuangyang Yang
linkedin.com/in/everyoung
On 7/17/15, 10:06 AM, "Erick Erickson" wrote:
>Please look at th
Erick, Amish,
Thank you very much for your replies. They are really helpful. Can you tell
me what the compelling features are from 2.9.1 to 5.X? I'm sure there are
many.
Thank you very much
Best Regards
---
Shuangyang Yang
linkedin.com/in/everyoung
On 7/16/15, 8:41 PM, "Erick Eri
Hi there,
We are using lucene 2.9.1 in our product and thinking of upgrading to 5.X. Is
it plausible? Is the file format compatible? Is there any guide document that
can give tips on how to do it?
Thank you very much
Best Regards
---
Shuangyang Yang
linkedin.com/in/everyoung
(num,
DateTools.Resolution.SECOND));
Then you get dt as a string in the right format.
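For reference, here is a minimal sketch of what that string looks like, written without any Lucene dependency. It assumes that `num` is a timestamp in epoch milliseconds (the snippet above is truncated, so that is a guess) and that second resolution means the `yyyyMMddHHmmss` pattern in GMT, which is how I understand `DateTools.Resolution.SECOND`. The class and method names are made up for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class DateStringSketch {
    // Assumed layout for second resolution: yyyyMMddHHmmss, always in GMT/UTC.
    private static final DateTimeFormatter SECOND_RESOLUTION =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    // Hypothetical helper: formats epoch milliseconds as a sortable date string.
    static String toSecondResolution(long epochMillis) {
        return SECOND_RESOLUTION.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(toSecondResolution(0L)); // prints 19700101000000
    }
}
```

Strings in this layout sort lexicographically in the same order as the dates they encode, which is why the format is convenient for indexed date fields.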
Ming-
On Fri, Jun 14, 2013 at 4:24 PM, Mingfeng Yang wrote:
> I did System.out.println(d.get("date")), and the output is
> "stored,binary,omitNorms,indexOptions=DOCS_ONLY"
>
> Emmm
I did System.out.println(d.get("date")), and the output is
"stored,binary,omitNorms,indexOptions=DOCS_ONLY"
Emmm.
On Fri, Jun 14, 2013 at 4:05 PM, Chris Hostetter
wrote:
>
> : I used solr to query the index, and verified that each document does have a
> : non-blank date field. I suspect that i
Hoss,
I checked in two ways. The first is 1) in your list: q=date:* returns the
same results as q=*:*.
Second, all fields are stored in the index: I took a doc id (say 3315), ran
q=id:3315, and the output contains the date field and its value.
Anyway, I am 100% sure every doc has a date field and value indexed and
stored there.
I have a Solr index that was built with Solr 1.4 a few years ago and later
upgraded to Solr 3.6; the index now consists of 150 million documents.
Now I want to read all values of a DateField from the index. But it turns
out that for nearly 100 million documents, document.get("date") returns
null
OK, found it:
we are using Cloudera CDHu3u, which changes the ulimit for child jobs.
But I still don't know how to change its default settings.
On Wed, Jun 13, 2012 at 2:15 PM, Yang wrote:
> I got the OutOfMemoryError when I tried to open a Lucene index.
>
> it's very
I'm using a disjunction (OR) query. Unfortunately, all of the clauses are
optional.
On Sat, May 26, 2012 at 4:38 AM, Simon Willnauer <
simon.willna...@googlemail.com> wrote:
> On Sat, May 26, 2012 at 2:59 AM, Yang wrote:
> > I tested with more threads / processes. indeed this i
, so the search quality is not
a huge issue here), so that, for example, fewer fields are evaluated or a
simpler scoring function is used?
thanks
Yang
On Fri, May 25, 2012 at 5:47 PM, Yang wrote:
> thanks a lot guys
>
>
> On Tue, May 22, 2012 at 1:34 AM, Ian Lea wrote:
>
>
ributed searching.
> > if the CPU is not fully used, you can do this on one physical machine
> >
> > On 2012-5-22 at 8:50 AM, "Li Li" wrote:
> >>
> >>
> >> On 2012-5-22 at 4:59 AM, "Yang" wrote:
> >>
> >> >
> >> > I
o no copy overhead is incurred.
The only argument in favor of sharding is that a smaller index might
be faster. But since an index search takes only O(lg(n)) time, the
time saved is probably very small.
So will sharding be worth the effort?
thanks
yang
--
yes, that's why many search engines do not allow users to visit page
> numbers greater than a threshold. For most applications, users usually
> only visit the top results. That's why the ranking algorithm is important. If
> you find your users always turning to the next page, I think you should
> consider your appli
Thanks Ralf.
Basically you are talking about the selectivity of columns in a JOIN, right?
But in my example above, "yellow dog", both terms are very common, and both
have long postings lists.
Yang
On Thu, Apr 26, 2012 at 12:17 AM, Ralf Heyde wrote:
> Hi,
>
> i do
ingle term query would quickly return the top-k, but if it's
multi-term, they would have to traverse the entire lists to find the
intersection set, because the lists are not sorted by docId, as in the
Lucene paper case)
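To make the docId-ordering point concrete: when both postings lists are sorted by docId, the intersection can be found with one linear merge, advancing whichever cursor points at the smaller docId. The sketch below is only an illustration of that idea under the assumption of plain sorted int arrays; the class name and the sample docIds are made up, and real engines typically add skipping on top of this rather than using a plain merge.

```java
import java.util.ArrayList;
import java.util.List;

public class PostingsIntersectionSketch {
    // Intersects two postings lists that are each sorted by ascending docId.
    // Runs in O(a.length + b.length) because each cursor only moves forward.
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]); // docId contains both terms
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++; // a is behind, advance it
            } else {
                j++; // b is behind, advance it
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] yellow = {1, 4, 7, 9, 12};   // hypothetical docIds for "yellow"
        int[] dog = {2, 4, 9, 10, 12, 15}; // hypothetical docIds for "dog"
        System.out.println(intersect(yellow, dog)); // prints [4, 9, 12]
    }
}
```

If the lists were instead sorted by score or impact, this merge would not work, and finding the common documents would mean scanning or hashing one full list, which is the cost described above.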
On Wed, Apr 25, 2012 at 2:13 PM, Yang wrote:
> I read the paper b
thanks a lot for the detailed info!
On Wed, Apr 13, 2011 at 4:43 AM, Grant Ingersoll wrote:
>
> On Apr 7, 2011, at 9:17 PM, Yang wrote:
>
>> I'm new to Lucene and search engines, and have been struggling with these
>> questions recently.
>> I'd appreciate a lot
me good articles on how
Lucene/search engines work? I've read "The Anatomy of a Search Engine"
(the Google paper by Sergey Brin & Larry Page),
"Introduction to Information Retrieval" (Manning et al.), and "Lucene
in Action".
Thanks
Yang
-
"~5 work? If not, why not?
>
> You might get some use from
> http://lucene.apache.org/java/2_4_0/queryparsersyntax.html
>
> Or if that's not germane, perhaps you can explain your use case.
>
> Best
> Erick
>
> On Wed, Mar 30, 2011 at 5:49 PM, Andy Yang wrote:
>> Is there a minimum string length requirement for proximity search?
Is there a minimum string length requirement for proximity search? For
example, would "a~" or "an~" trigger proximity search? The result
would be horrible if there is no such requirement.
Thanks,
Andy
-
To unsubscribe, e-mail: ja
What is the difference between the "AND" and "+" operators?
Also, what is the difference between a query and a filter?
For example:
String[] fields={"name","address","classId"};
If I want to search for documents whose classId is "4" and whose name or
address contains "Zhongzhou Road No 200", I can use t
function?
After the add is done, should the reader be reopened?
Or could someone show me a simple example? Thanks a lot!
--
Wishing you all the best~
Best Regards,
杨建东
=
Jiandong Yang
Mobile Phone:15921536660
email: yangjiand...@snda.com
Architect management office
BTW, for a search that requires some condition, I can turn the condition into
a filter and then filter the results.
Alternatively, I can build a BooleanQuery from the condition, just like the
code in the range search. I wonder which is better?
2010/11/18 yang Yang
> Thank you very much!!! :)
>
> I wi
k at the various approaches
> there are for the same. Have a look at the contrib module in lucene.
> http://wiki.apache.org/lucene-java/SpatialSearch
> For your understanding, you could have a look at the bounding box approach.
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
We are using Hibernate Search, which is based on Lucene, as the search
engine to build full-text search for our position-related data in a
MySQL database.
This is the main structure of the table (it stores the id, coordinates, and
name of one Surface_Feature):
+++-++
| id
I would like to use MultiFieldQueryParser to search multiple fields, and in
each field I want to use fuzzy search. How can that be done? Any example
will be appreciated.
Thanks,
Andy
Hi,
I am using MySQL, but its full-text search is rather weak.
So I tried Sphinx; however, I found it cannot support Chinese word
searching perfectly.
So I wonder if Lucene can work better?
Hi all,
While playing with an algorithm (summarize/highlight based on sliding windows),
I found that IndexReader.termPositions(Term term) does not support
wildcard terms. Would it be meaningful to write a patch to support
wildcard terms?
e new classes to address the problem. To be able to customize ranking
is very important to a search engine.
Yang
Urvashi Gadi wrote:
No...the information is available only at search time
Quoting Erik Hatcher <[EMAIL PROTECTED]>:
Could your computation be done at indexing time rather tha
rch keyword1 in content and keyword2 in record and
they should also have the same pid. Is there any way to do this? Or is
there any relational database that can be integrated with Lucene?
Thanks,
Yang
y. Don't know if I can figure out
something.
Any suggestions? Thanks.
Yang
-Original Message-
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: March 8, 2006 21:35
To: java-user@lucene.apache.org
Subject: Re: Lucene Ranking/scoring
Hi Yang,
Boosting works at query time as well as
nd an
answer. Implementing the SortDocComparator seems the closest, but it can only
sort the results by one field. The Field boost does not work because the
boosting factor has to be set at index time. What I need is to set the
weight at query time.
Please help. Thanks.