The queries I'm doing really aren't anything clever...just searching for
phrases on pages of text, sometimes narrowing results by other words that
must appear on the page, or words that cannot appear on the same page. I
don't have experience with those span queries, so I can't say much about
them.
SpanTermQuery is a TermQuery and not a WildcardQuery. You could use a
SpanRegexQuery. You could also make your own SpanWildcardQuery based
on either WildcardQuery or SpanRegexQuery.
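As a rough illustration of the "make your own" route, a wildcard pattern can be translated into a regular expression before it is handed to a regex-based query. The sketch below is plain Java and does not touch Lucene at all; the class name WildcardToRegex and the method toRegex are hypothetical, and it assumes the regex engine in use accepts java.util.regex syntax and that '*' and '?' carry their usual wildcard meaning.

```java
import java.util.regex.Pattern;

public class WildcardToRegex {

    // Hypothetical helper (not part of Lucene): turns a wildcard pattern
    // into a java.util.regex pattern. '*' becomes ".*", '?' becomes ".",
    // and every other run of characters is quoted so it matches literally.
    static String toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("libr*y");
        // Check the translated pattern against a sample term.
        System.out.println(Pattern.matches(regex, "library"));
    }
}
```

The resulting string would then be the term text passed to whichever regex query class is available in your Lucene version.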
You should probably tell us a bit about the problem you try to solve
rather than asking about the solution y
On 27 Nov 2008, at 10:15, Toke Eskildsen wrote:
On Thu, 2008-11-27 at 07:30 +0100, Karl Wettin wrote:
The most scary part is that you will have to score each and every
document that has a source, probably all of the documents in your
corpus.
I now see my query-logic was flawed. In order t
Thanks for the tip,
but I can't imagine the number of documents Google has to join in order
to process such results...
There must be a trick.
Maybe stopwords are not indexed alone but twice with previous and next
token, some sort of 2-gram index?
David.
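To make the 2-gram guess above concrete, here is a minimal plain-Java sketch. It is only an illustration of the idea as stated (a stop word is never emitted alone, only paired with its previous and next token) and makes no claim about what Google or any real analyzer does. The class name, the stop list, and the "_" separator are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopwordBigrams {

    // Illustrative stop list only; a real one would be much longer.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("how", "at", "of", "a", "the"));

    // Emits ordinary words as single terms. A stop word is never emitted
    // alone; it only appears inside a 2-gram with its previous and its
    // next token (when those exist).
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (!STOPWORDS.contains(words[i])) {
                out.add(words[i]);
            }
            boolean hasNext = i + 1 < words.length;
            if (hasNext && (STOPWORDS.contains(words[i])
                    || STOPWORDS.contains(words[i + 1]))) {
                out.add(words[i] + "_" + words[i + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("the library of information"));
    }
}
```

With such terms in the index, a phrase made only of stop words could be answered by looking up a few rare 2-gram terms instead of joining the huge posting lists of each individual stop word.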
Aleksander M. Stensby wrote:
Your que
Below is a document in Lucene:
--
ID : 1
110_a : library information
--
Case 1:
Term term1 = new Term("110_a", "library");
SpanFirstQuery spanFirstQuery = new SpanFirstQuery(new SpanTermQuery(term1), 1);
Case 2
That's a phrase search, so it's conceivable Google could be doing
something similar to Nutch, whereby adjacent n-grams are indexed as
unique terms.
But if you do the same search without quotes:
http://www.google.fr/search?hl=fr&q=HOW+at+at+of+a+A+a&btnG=Rechercher&meta=
they still find
Your query includes quotation marks, which tell Google to include common
words in the query.
But if you remove the quotation marks, you will still get results, as Google
states:
"Google ignores stop words when they're placed in searches alongside less
common words. For example, a search for [ The
Hi Greg,
Thanks for quick and detailed answer.
What kind of queries do you run? Is it going to work for
SpanNearQueries/SpanNotQueries as well?
Do you also get the word itself at each position?
It would be great if I could search on the content of each payload as well,
but since the payload cont
Hi,
Look at this Google query:
http://www.google.fr/search?q=%22HOW+at+at+of+a+A+a%22
What do you think about that concerning stop words?
Does Google have no stop words?
David.
Fantastic, now it's working perfectly.
Thank you,
Albert
prabin meitei wrote:
>
> Hi,
>
> You can use MUST at the end.
> Using your code, use it as:
> codisFiltre = "XX07_04141_00853#XX06_03002_00852#UX06_07019_02994";
> String[] codi = codisFiltre.split("#");
> *finalFilter = new BooleanFilter();*
>
On Thu, 2008-11-27 at 07:30 +0100, Karl Wettin wrote:
> The most scary part is that you will have to score each and every
> document that has a source, probably all of the documents in your
> corpus.
I now see my query-logic was flawed. In order to avoid matching all
documents every time,