how newer documents have a better score

2005-11-15 Thread gekkokid
Hi, can anyone give me some pointers on making newer documents have a better ranking/score? I.e. documents I indexed today should have a higher ranking/score in the index than documents that were indexed yesterday, etc. Thanks, _gk

Re: How to give weight to document when adding to the index?

2005-11-15 Thread gekkokid
Boost the document using the method setBoost(float) (the param is a float, not a double); it works on both Document and Field objects. Little example: Document d = new Document(); d.add(Field.Keyword("name", "gekkokid")); d.setBoost(1.1f); //

Re: how newer documents have a better score

2005-11-15 Thread gekkokid
thanks :) _gk - Original Message - From: "Chris Hostetter" <[EMAIL PROTECTED]> To: Sent: Wednesday, November 16, 2005 6:10 AM Subject: Re: how newer documents have a better score : Hi, can anyone give me some pointers on making newer documents have a : better ranking/score? i.e. do

Re: What is stemming?

2005-11-20 Thread gekkokid
Hello fellow Lucener :), firstly I'm no tutor but I will try my best to explain; if anyone believes I'm wrong, please say so, so our friend doesn't get the wrong idea. Here it goes. O_o Stemming is reducing a word to its root form, whereas lemmatisation is concerned with linguistics, I believe l
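To make "reducing the word to the root form" a bit more concrete, here is a toy sketch of suffix stripping. It is not the Porter algorithm and not what Lucene ships; the class name, suffix list and minimum stem length are all made up for illustration.

```java
import java.util.Locale;

// Toy suffix-stripping stemmer (illustrative only, NOT the Porter algorithm).
class ToyStemmer {
    // Hypothetical suffix list; longer suffixes come first so "es" is tried before "s".
    private static final String[] SUFFIXES = {"ing", "ed", "es", "s"};
    // Assumed minimum length of what remains, so short words like "is" are left alone.
    private static final int MIN_STEM_LENGTH = 3;

    static String stem(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = lower.length() - suffix.length();
            if (lower.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return lower.substring(0, stemLength);
            }
        }
        return lower; // no suffix matched, return the word unchanged
    }

    public static void main(String[] args) {
        for (String w : new String[] {"indexing", "indexed", "indexes", "index"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

A stemmer like this has no dictionary, so it is crude ("running" comes out as "runn") and it cannot relate irregular forms such as "better" and "good"; that kind of mapping is where lemmatisation comes in.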

Re: Throughput doesn't increase when using more concurrent threads

2005-11-21 Thread gekkokid
Oren Shir wrote: I tested this in version 1.4.3 and 1.9rc1, and they are both the same in this aspect. 1.9rc1 is faster, but does not benefit from multi threading. Some newbie questions I have: does 1.4.3 benefit from multi-threading? Is 1.9 the version in the source repository? _gk ---

Re: Lucene + LSI

2005-11-30 Thread gekkokid
Sorry, have to ask: what's LSI? "Latent semantic indexing"? _gk - Original Message - From: "Lorenzo Viscanti" <[EMAIL PROTECTED]> To: ; <[EMAIL PROTECTED]> Sent: Thursday, December 01, 2005 12:02 AM Subject: Re: Lucene + LSI It depends on the kind of implementation you are thinking o

Re: ApacheCon next week

2005-12-11 Thread gekkokid
please :) - Original Message - From: "Luke Nezda" <[EMAIL PROTECTED]> To: Sent: Sunday, December 11, 2005 6:28 PM Subject: Re: ApacheCon next week Hello Grant- Could you post the material you present (eg slides, handouts, etc) for those of us who cannot attend? Thanks in advance, -Luk

Re: Top n Searches

2005-12-13 Thread gekkokid
Would 'x y z' and 'y z x' give the same results? I didn't think that was the case. - Original Message - From: "Paul Williams" <[EMAIL PROTECTED]> To: Sent: Tuesday, December 13, 2005 5:22 PM Subject: RE: Top n Searches That was the approach I was planning to take but I've been asked to

(lucene 1.4.*) Field.Text = (lucene repository) Field.TermVector

2005-12-28 Thread gekkokid
Hi, I have recently updated to the latest version of Lucene in the source repository. The enclosed classes (field types) in Field have been deprecated; is the old Field.Text now Field.TermVector, where it is analyzed, indexed and stored? Thanks, _gk

Re: (lucene 1.4.*) Field.Text = (lucene repository) Field.TermVector

2005-12-28 Thread gekkokid
thank you - Original Message - From: "Erik Hatcher" <[EMAIL PROTECTED]> To: Sent: Wednesday, December 28, 2005 11:52 PM Subject: Re: (lucene 1.4.*) Field.Text = (lucene repository) Field.TermVector On Dec 28, 2005, at 5:27 PM, gekkokid wrote: Hi, i have recentl

Re: how do I connect to the SVN repository to grab the latest source?

2006-01-03 Thread gekkokid
If you're using Windows, just download Subversion from subversion.tigris.org and install it, then enter the command found on the Lucene homepage or wiki :) via the command prompt (i.e. cmd). Pretty much the same for Linux, I guess. _gk - Original Message - From: "Colin Young" <[EMAIL

No sub-file with id _18.f0 found

2006-01-23 Thread gekkokid
Hi, when I try to view my index with Luke I get the loading error: "No sub-file with id _18.f0 found". Any ideas what could be causing this? I'm using IndexWriter.setUseCompoundFile(true); in the past it has worked fine without any problems. I'm on Win XP with Java 1.5. Regards, [EMAIL PROTECTED

Re: No sub-file with id _18.f0 found

2006-01-24 Thread gekkokid
Is there a web page that lists all the files created in an index, so I can track down the problem I'm having? I'm using the latest source via svn and have rebuilt using ant. Every time I create an index, no matter how basic, I get errors from Luke. - Original Message - From: "gek

Re: Sorting by calculated custom score at search time

2006-01-24 Thread gekkokid
How does TSS boost by date? Does it give a small boost increase like 0.1 or 0.2 x (ArticlePublishDate - IndexCreationDate)? - Original Message - From: "Nick Vincent" <[EMAIL PROTECTED]> To: Sent: Tuesday, January 24, 2006 5:42 PM Subject: Sorting by calculated custom score at search time I
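The formula guessed at above could look roughly like the sketch below. This is only one reading of it, not how TSS actually works: the class name, the per-day unit, the step size and the lower clamp are all assumptions. It needs no Lucene classes; the resulting value is the sort of float you would hand to Document.setBoost(float) at index time.

```java
// Sketch of a date-based boost: 1.0 plus a small step per day between
// index creation and article publication. All names here are hypothetical.
class DateBoost {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;
    // Assumed floor so an article published long before the index was created
    // never ends up with a zero or negative boost.
    static final float MIN_BOOST = 0.1f;

    static float boostFor(long publishMillis, long indexCreationMillis, float stepPerDay) {
        float days = (publishMillis - indexCreationMillis) / (float) MILLIS_PER_DAY;
        float boost = 1.0f + stepPerDay * days;
        return Math.max(boost, MIN_BOOST);
    }

    public static void main(String[] args) {
        long indexCreated = 0L;                    // placeholder timestamp
        long publishedLater = 10 * MILLIS_PER_DAY; // ten days after index creation
        System.out.println(boostFor(publishedLater, indexCreated, 0.1f));
    }
}
```

With a linear step like this the boost grows without bound as the index ages, so in practice you would probably want a cap or a decay curve instead; that choice is left open here.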

deleting duplicate documents from my index

2006-01-28 Thread gekkokid
Hi, I'm trying to delete duplicate documents from my index; the unique identifier is the document's URL (aka field "url"). My initial thought of how to accomplish this is to open the index via a reader, sort the documents by URL, and then iterate through them looking for a match with the c
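The sort-then-scan idea described above can be sketched without any Lucene classes. Here the array index stands in for a document number and the array value for the stored "url" field; reading those values from an index and actually deleting the documents is left out, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: find duplicate documents by URL by sorting, then comparing neighbours.
class DuplicateUrlFinder {
    // urls[i] is the URL of (hypothetical) document number i.
    // Returns the document numbers to delete, keeping the lowest number per URL.
    static List<Integer> findDuplicates(String[] urls) {
        Integer[] ids = new Integer[urls.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        // Arrays.sort on objects is stable, so equal URLs stay in document-number order.
        Arrays.sort(ids, Comparator.comparing((Integer id) -> urls[id]));

        List<Integer> duplicates = new ArrayList<>();
        for (int i = 1; i < ids.length; i++) {
            // After sorting, duplicates sit next to each other.
            if (urls[ids[i]].equals(urls[ids[i - 1]])) {
                duplicates.add(ids[i]);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String[] urls = {"http://a", "http://b", "http://a", "http://c"};
        System.out.println(findDuplicates(urls));
    }
}
```

Sorting costs more than a single pass with a hash set of seen URLs would, but it matches the approach in the post and keeps memory use predictable.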

Re: deleting duplicate documents from my index

2006-01-30 Thread gekkokid
hi, thats exactly what i did :) works perfectly thanks _gk - Original Message - From: "Chris Hostetter" <[EMAIL PROTECTED]> To: Sent: Monday, January 30, 2006 5:56 AM Subject: Re: deleting duplicate documents from my index : Hi, im trying to delete duplicate documents from my inde

Re: Inappropriate content detection

2006-02-05 Thread gekkokid
Hi, what scale is this website? Millions of posts or fewer? Wouldn't it be easier to use a Bayesian algorithm to scan each new post before it is posted, to detect whether it is acceptable or not? Just a quick idea off the top of my head. _gk - Original Message - From: "Jeff Thorne" <[EMAIL PRO

Re: Accessing Lucene Index stored in a jar file

2006-02-18 Thread gekkokid
Couldn't you use the Java zip library (http://java.sun.com/j2se/1.5.0/docs/api/java/util/zip/package-summary.html) and compress and uncompress it separately? Just an idea. - Original Message - From: "Ahmed El-dawy" <[EMAIL PROTECTED]> To: Sent: Saturday, February 18, 2006 8:38 PM Subjec
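For what the java.util.zip route might look like: a jar is a zip archive, so the same classes can pack and unpack index files. The sketch below does an in-memory round trip with ZipOutputStream and ZipInputStream. The class name is made up and the file names in main are placeholders; writing the unpacked files to a directory that Lucene can open is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Sketch: pack named byte arrays into a zip archive and unpack them again.
class ZipRoundTrip {
    static byte[] zip(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue());
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed by now, so the archive is complete.
        return buffer.toByteArray();
    }

    static Map<String, byte[]> unzip(byte[] archive) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(archive))) {
            byte[] chunk = new byte[4096];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int n;
                while ((n = in.read(chunk)) != -1) { // -1 marks the end of this entry
                    data.write(chunk, 0, n);
                }
                files.put(entry.getName(), data.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("segments", new byte[] {1, 2, 3}); // placeholder names and contents
        files.put("_0.cfs", new byte[] {4, 5, 6, 7});
        System.out.println(unzip(zip(files)).keySet());
    }
}
```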

Re: How to intergrate lucene with my web application

2006-03-01 Thread gekkokid
Hi, I would download the 1.9 version as you're starting fresh (unless you need the 1.4.3 version for some reason). What is your web application, and what should Lucene be doing when integrated with your web app? There is a simple example in the binary 1.9 download, /src/jsp, look at "results.js

Re: Question

2006-03-07 Thread gekkokid
Would Lucene even have to be accessed? Couldn't you save the queries when submitted and search them via a SQL database? _gk - Original Message - From: "Thomas Papke" <[EMAIL PROTECTED]> To: Sent: Tuesday, March 07, 2006 12:11 PM Subject: Question Hello, anyone implement the "Google
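A rough sketch of the "save the queries when submitted" idea, kept outside Lucene entirely. An in-memory map stands in for the SQL table mentioned above; the class name, the normalisation (trim plus lowercase) and the tie-break rule are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch: tally submitted queries and report the most frequent ones.
class QueryLog {
    private final Map<String, Integer> counts = new HashMap<>();

    // Call this each time a user submits a query.
    void record(String query) {
        String key = query.trim().toLowerCase(Locale.ROOT);
        if (!key.isEmpty()) {
            counts.merge(key, 1, Integer::sum);
        }
    }

    // Most frequent queries first; ties are broken alphabetically.
    List<String> top(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        QueryLog log = new QueryLog();
        for (String q : new String[] {"lucene", "Lucene ", "nutch", "luke", "nutch", "lucene"}) {
            log.record(q);
        }
        System.out.println(log.top(2));
    }
}
```

With a real database the same thing would be one table of query text and count, with the ranking done by an ORDER BY on the count column.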

Re: Can i use lucene to search the internet.

2006-03-22 Thread gekkokid
Hi, are you asking whether it has a crawler? No, it doesn't, but Nutch does: http://lucene.apache.org/nutch/ :) _gk - Original Message - From: Babu, KameshNarayana (GE, Research, consultant) To: java-user@lucene.apache.org

Re: Can i use lucene to search the internet.

2006-03-22 Thread gekkokid
Message- From: gekkokid [mailto:[EMAIL PROTECTED] Sent: Thursday, March 23, 2006 11:22 AM To: java-user@lucene.apache.org Subject: Re: Can i use lucene to search the internet. Hi, are you asking does it have a crawler? no it doesn't but nutch does http://lucene.apach

Re: Hi Experts

2006-03-29 Thread gekkokid
Hi, Lucene is a component that indexes data and allows you to search that indexed data. You need to be able to program in Java (various ports for other languages are available) or find a crawler you can adapt to download the required data off the internet (still requires basic knowledge of Ja

Re: searching offline

2006-04-05 Thread gekkokid
http://regain.sourceforge.net/ ? - Original Message - From: "Delip Rao" <[EMAIL PROTECTED]> To: Sent: Wednesday, April 05, 2006 2:23 PM Subject: searching offline Hi, I have a large collection of text documents that I want to search using lucene. Is there any command line utility th

Re: Theoretical Lucene Performance

2006-05-16 Thread gekkokid
http://lucenebook.com http://www.amazon.com/exec/obidos/asin/1932394281 :) - Original Message - From: "Andreas Harth" <[EMAIL PROTECTED]> To: Sent: Tuesday, May 16, 2006 10:51 PM Subject: Theoretical Lucene Performance Hello, I'd like to learn a bit more about the index organizati

Re: Krishnendra Nandi is out of the office.

2006-06-02 Thread gekkokid
not again plz - Original Message - From: "Krishnendra Nandi" <[EMAIL PROTECTED]> To: Sent: Friday, June 02, 2006 2:34 PM Subject: Krishnendra Nandi is out of the office. Regarding your message: Re: Num of a term in a Doc I will be out of the office starting 01-Jun-2006 and will n

Re: An interesting thing

2006-06-11 Thread gekkokid
In Windows XP, can't you change the registry to use only physical RAM? - Original Message - From: "yueyu lin" <[EMAIL PROTECTED]> To: Sent: Sunday, June 11, 2006 12:31 PM Subject: Re: An interesting thing In some OS, the ram is not only "RAM". The virtual ram uses the disk. That's ve