Query, facets,
boosting, duplicates, span queries and highlighting stuff.
Even Luke and the other Lucene related projects.
Thanks,
Peter W.
Otis Gospodnetic wrote:
Peter - LIA2 is in progress! :) LIA2IP?
-
To unsubscribe, e
Hello,
How is progress on the new Lucene in Action coming?
Thanks,
Peter W.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Lukas,
One last thing: be sure to log only when a user clicks on a result.
In Hadoop, document_id will be the key in the map phase.
Lucene related steps are the same.
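A minimal sketch of the counting step in plain Java, with no Hadoop
involved. The log line format ("document_id<TAB>ip") and the class name
are assumptions made for illustration; the loop only mimics keying on
document_id in a map phase and summing per key in a reduce phase.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ClickCounts {
    // Hypothetical log format: "<document_id>\t<ip>", one line per result click.
    static Map<String, Integer> countByDocument(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            String[] parts = line.split("\t");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            // "map": key on document_id with value 1; "reduce": sum per key
            counts.merge(parts[0], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
            "doc42\t10.0.0.1",
            "doc7\t10.0.0.2",
            "doc42\t10.0.0.3");
        System.out.println(countByDocument(log));
    }
}
```

The per-document totals are what would later be written back into the
index as the click or vote field.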
Best,
Peter W.
On Aug 14, 2007, at 1:28 PM, Peter W. wrote:
When users perform
a search, log the unique document_id, IP
rt results (reverse order) by score field.
A more advanced version could store previous result
positions as Payloads but I don't understand this
new Lucene concept.
Regards,
Peter W.
On Aug 10, 2007, at 5:56 AM, Lukas Vlcek wrote:
Enis,
Thanks for your time.
I gave a quick glance at Pig an
))
};
MultiReader mr=new MultiReader(indexr_a);
IndexSearcher is=new IndexSearcher(mr);
Regards,
Peter W.
On May 22, 2007, at 1:10 AM, Chris Hostetter wrote:
...and if you are "Multi Searching" over a bunch of local directories
anyway, then use a single INdexSea
a try in the servlet init() method!
Regards,
Peter W.
On May 21, 2007, at 2:46 PM, Erick Erickson wrote:
Why are you doing this in the first place? Do you actually have
evidence that the default Lucene behavior (caching, etc) is inadequate
for your needs?
I'd *strongly* recommend, if y
(searcher_a);
...
}
catch(Exception e)
{
System.out.println(e);
}
For example, one of several indexes is 768MB. Is there possibly a
better way to do this?
Regards,
Peter W.
which Lucene index user data is written to.
Another option to consider is Solr.
Regards,
Peter W.
On Apr 13, 2007, at 1:39 AM, Dan Wiggin wrote:
But so often, when a developer searches for how to work with Lucene,
he normally finds the same code for the same problems.
I think it will be useful creat
thn_f,morethn_f};
Filter rf=new ChainedFilter(fa,ChainedFilter.AND);
return rf;
}
It's more expensive at index time, has a bigger storage requirement and
is slower than in-memory but should give the desired functionality.
Regards,
Peter W.
On Apr 3, 2007, at 10:59 AM, Andy Liu wrote:
One more thing...
It could optionally be indexed and stored as a String; the contents
of the Hits object could then be placed into a Collection with a
comparator that sorts double values in reverse order.
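A rough sketch of that comparator in plain Java. The Row class is a
hypothetical stand-in for whatever gets copied out of the Hits object
(a document id plus the stored String value); no Lucene calls are shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ReverseDoubleSort {
    // Hypothetical holder for one hit: id plus the value stored as a String.
    static final class Row {
        final String docId;
        final String storedValue;

        Row(String docId, String storedValue) {
            this.docId = docId;
            this.storedValue = storedValue;
        }
    }

    // Returns a new list ordered by the parsed double value, largest first.
    static List<Row> sortDescending(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator
            .comparingDouble((Row r) -> Double.parseDouble(r.storedValue))
            .reversed());
        return sorted;
    }
}
```

Parsing back to a double matters here: compared as plain Strings,
"9.5" would sort above "10.25".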
Regards,
Peter W.
On Apr 2, 2007, at 12:02 PM, "Peter W." <[EMAIL PROTECT
ayments and
pay-per-action (conversions) would be gravy.
Beyond just cloning what's out there, the collective experience
of the Lucene community could take leadership in paid search.
Best,
Peter W.
On Mar 27, 2007, at 12:45 PM, Doron Cohen wrote:
Assuming you don't mean UI design - ho
Howdy,
Does anyone have any design considerations for implementing
a contextual text-link advertising system using Lucene?
The emphasis would be strictly on monetizing search results with
light, non-intrusive behavior (query terms match sponsored results).
Thanks,
Peter W
are unique for each query.
Utilizing "user feedback to improve search results" with clickstream
data could be a sub-project in itself. It moves into future areas of
personalization and would be a cool add-on to Lucene.
Hope that helps,
Peter W.
Because scoring
The way it appears to
On Ma
ilter would be constructed using two RangeFilters setting upper
and lower date boundaries (Strings) combined using NumberTools and
ChainedFilter.
With a subset of your matching results sorting should be much faster.
Regards,
Peter W.
On Mar 20, 2007, at 12:39 PM, David Seltzer wrote:
Hi All,
ing a Sort Object and
passing in an array of SortFields with
"votes" as type SortField.STRING first.
Precedence of sort order kicks in and
your docs with more clicks rank higher.
If everything goes well you will have
results ordered by user generated
scoring.
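To illustrate the precedence idea outside Lucene, here is a small
plain-Java sketch. The Doc class and its fields are made up for the
example, and votes is compared as an int here rather than as a String;
it only shows "votes first, relevance score as tie-breaker".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class VoteSort {
    // Hypothetical result row: id, click/vote count, relevance score.
    static final class Doc {
        final String id;
        final int votes;
        final double score;

        Doc(String id, int votes, double score) {
            this.id = id;
            this.votes = votes;
            this.score = score;
        }
    }

    // More votes first; among equal votes, higher score first.
    static List<Doc> rank(List<Doc> docs) {
        List<Doc> ranked = new ArrayList<>(docs);
        Comparator<Doc> byVotesDesc =
            Comparator.comparingInt((Doc d) -> d.votes).reversed();
        Comparator<Doc> byScoreDesc =
            Comparator.comparingDouble((Doc d) -> d.score).reversed();
        ranked.sort(byVotesDesc.thenComparing(byScoreDesc));
        return ranked;
    }
}
```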
Regards,
Peter W.
On Mar
StringBuffer delimited by commas, then make
one long String (holding all your dates) and add
to the Lucene doc as one Field.Text.
You might be able to set that Field to indexed, but not
stored to save space.
Regards,
Peter W.
On Feb 28, 2007, at 11:22 AM, Aigner, Thomas wrote:
Walt,
I am no
r.
For someone trying to get work done, use incremental updates to
one local index first. Then explore writing to multiple indexes and
reading them using MultiSearcher.
Afterward, use HTTP-based updates/requests with Solr to scale out.
Hope that helps.
Peter W.
On Feb 20, 2007, at 5:29 PM, ori
ariable to keep track of which page you are on and a static method
which returns min/max values to be included in your iteration loop.
You can also see my previous attempt at solving this:
http://www.gossamer-threads.com/lists/lucene/java-user/43595
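A minimal sketch of such a static method, assuming 1-based page numbers
and a fixed number of hits per page. The class and method names are
hypothetical, and the loop in main() only prints indexes where a real
version would fetch each hit.

```java
final class HitPager {
    // Returns {start, endExclusive} for a 1-based page number.
    // The range is empty when the page lies past the last hit.
    static int[] bounds(int page, int hitsPerPage, int totalHits) {
        if (page < 1 || hitsPerPage < 1 || totalHits < 0) {
            throw new IllegalArgumentException("bad paging arguments");
        }
        long first = (long) (page - 1) * hitsPerPage; // long avoids int overflow
        int start = (int) Math.min(first, totalHits);
        int end = (int) Math.min(first + hitsPerPage, totalHits);
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] b = bounds(3, 10, 25);
        for (int i = b[0]; i < b[1]; i++) {
            System.out.println("hit index " + i);
        }
    }
}
```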
Regards,
Peter W.
On Feb 21, 2007, at
Hello,
Using a parser to get text out of HTML, XML (including RSS, ATOM) is
only easy if you control the source documents.
HTML pages in the wild are much different, generating exceptions you
must catch and deal with. For most projects you can probably use
java.util.regex
to obtain keywo
outside searching need to be turned off during updates?
Also, assuming this runs hourly, and if I need to close then open
each time, how can a seamless user experience (no frozen queries,
minimal delays) be achieved?
Thanks.
Peter W
ath, those who are can find an explanation of the
latter here:
http://www.ams.org/featurecolumn/archive/pagerank.html
Regards,
Peter W.
On Jan 22, 2007, at 12:00 PM, Mark Miller wrote:
Well first Lucene checks all of the other documents in the world
for any that refer to the document
est 2.0 version release the Lucene in Action
book provides good background on combining separate indexes.
Regards,
Peter W.
On Jan 4, 2007, at 7:51 AM, Mark Mei wrote:
So this question has two parts:
1. How does Lucene scale, exactly? Do we distribute the index to multiple
servers somehow? Or
separated data
files would be exposed thru a web service
where load balanced remote boxes access them using servlets.
They connect in rotation downloading batched index updates. Heck,
start splitting up big files using Hadoop's
HDFS and make it a party!
Re
g*hpp);
else
// few results
ri=hc;
} // inner if
else
ri=hc;
} // else
return ri;
}
Also, is there an available sample of using TopDocs .search()?
Peter W.
On Dec 27, 2006, at 10:33 PM,
Hello,
I'm trying to iterate or page through Lucene document hits results.
Before reinventing this, is there an existing solution out there or
in Solr?
Thanks in advance,
Peter
Hello,
I just got this working in three or four steps:
1. go to http://www.apache.org/dyn/closer.cgi/lucene/java/
2. click on any of the mirrors and download "lucene-2.0.0.zip"
3. unzip into preferred directory (step not shown), then use jar to
look at snowball items:
jar tvf /opt/lucene-2.0.
Another solution is to work with plain Java Date and Calendar objects,
convert them into Lucene strings using DateTools (resolution day), then
query this field with two RangeFilters combined using ChainedFilter.
You will never get the BooleanQuery error.
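A small plain-Java illustration of why day-resolution strings work for
range checks. It uses java.time and a yyyyMMdd format as a stand-in; the
exact string DateTools produces is not reproduced here, and the class
and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class DayRange {
    // Fixed-width yyyyMMdd: lexicographic order matches chronological order.
    static String toDayString(LocalDate day) {
        return day.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Inclusive check on both ends, comparing plain Strings only.
    static boolean inRange(String day, String lower, String upper) {
        return day.compareTo(lower) >= 0 && day.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String lower = toDayString(LocalDate.of(2006, 10, 1));
        String upper = toDayString(LocalDate.of(2006, 10, 31));
        String day = toDayString(LocalDate.of(2006, 10, 17));
        System.out.println(day + " in range: " + inRange(day, lower, upper));
    }
}
```

The two comparisons play the role of the lower and upper RangeFilters,
and requiring both corresponds to combining them with AND.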
Peter
On Oct 17, 2006, at 10:57 AM, Bushey, John wrote
undary:
Filter filter=RangeFilter.Less("num",NumberTools.longToString(10L));
// field num < 10
...
FilteredQuery fq = new FilteredQuery(query,filter);
The NumberTools.longToString() method is supposed to replace manual
padding with leading 0's, eliminating string comparison issues.
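The underlying string comparison issue can be seen in plain Java. The
sketch below uses simple zero padding via String.format as a stand-in;
it is not the encoding NumberTools itself produces, and the fixed width
and class name are assumptions for the example.

```java
final class PaddedCompare {
    // Assumed fixed width: enough digits for any non-negative long.
    static final int WIDTH = 19;

    // Left-pads with zeros so that String order matches numeric order.
    static String pad(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        return String.format("%0" + WIDTH + "d", n);
    }

    public static void main(String[] args) {
        // Unpadded: "5" sorts after "10" because '5' > '1'.
        System.out.println("5".compareTo("10") > 0);
        // Padded: 5 sorts before 10, as expected numerically.
        System.out.println(pad(5).compareTo(pad(10)) < 0);
    }
}
```

Whatever encoding is chosen, the same one has to be applied both when
the field is indexed and when the filter boundary is built.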
Hopefully, so
eld.Index.TOKENIZED) );
writer.addDocument(doc);
writer.optimize();
writer.close();
}
}
Since five is less than ten, why doesn't it work?
Thanks.
Peter W.