solr facet query with Tagging and Excluding Filters

2014-09-18 Thread Andy Yu
Hi guys, I want to facet with a facet query and have it support the [Tagging and Excluding Filters](https://cwiki.apache.org/confluence/display/solr/Faceting) style that facet.field has. How can I do this? Please guide me! Thanks, Andy

RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
Hi Uwe, thanks a lot, I will try with that. Uwe Schindler wrote > Hi andy, > > unfortunately, that is not easy to show with one simple code. You have to > change the Similarity used. > > Before starting to do this, you should be sure, that this affects you > users. Th

RE: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
lable in the Solr > server. But Andy uses Lucene directly. In his case he should use > IndexSearcher's explain functionalities to retrieve a structured output of > how the documents are scored for this query for debugging: > > http://lucene.apache.org/core/4_6_0/core/org/a

Re: Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-02-12 Thread andy
Thanks for your reply, Erick. That is the case, but how can I keep the precision of the fields' length? -- View this message in context: http://lucene.472066.n3.nabble.com/Length-of-the-filed-does-not-affect-the-doc-score-accurately-for-chinese-analyzer-SmartChineseAnalyz-tp4111390p4116832.html

Length of the filed does not affect the doc score accurately for chinese analyzer(SmartChineseAnalyzer)

2014-01-15 Thread andy
Hi guys, As the subject says, it seems that the length of the field does not affect the doc score accurately for the Chinese analyzer in my source code. Index source code: private static Directory DIRECTORY; @BeforeClass public static void before() throws IOException { DIRECTORY = new RAMDire

custom solr sort problem

2013-01-05 Thread Andy Yu
t compareDocToValue(int arg0, Object arg1) throws IOException { // TODO Auto-generated method stub return 0; } } } } and solrconfig.xml configuration is mySortComponent Andy

Re: sort by field and score

2012-12-02 Thread Andy Yu
eak your code down into a simple standalone program > and post that if it still doesn't work. > > > -- > Ian. > > On Thu, Nov 29, 2012 at 4:20 AM, Andy Yu wrote: > > I revise the code to > > > > SortField sortField[] = {new Sor

Re: sort by field and score

2012-11-28 Thread Andy Yu
NaN I think you'll need > to use a TopFieldCollector. See for example > http://www.gossamer-threads.com/lists/lucene/java-user/86309 > > > -- > Ian. > > > On Tue, Nov 27, 2012 at 3:51 AM, Andy Yu wrote: > > Hi All, > > > > > > Now I want to sor

Re: minimum string length for fuzzy search

2011-03-30 Thread Andy Yang
My question should really be on "fuzzy search". Is there a minimum length requirement for fuzzy search to start? For example, would "an~0.8" kick off fuzzy search? Thanks, Andy On Wed, Mar 30, 2011 at 4:02 PM, Erick Erickson wrote: > Uhhhm, doesn't "term1 term2&
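A rough way to reason about the short-term case, as a sketch only. It assumes the classic FuzzyQuery scores a candidate as 1 - editDistance / min(query term length, candidate length); that is my understanding of older Lucene releases and is not confirmed anywhere in this thread, and it ignores any prefix length. The class and method names are hypothetical.

```java
final class FuzzySimilaritySketch {
    // Classic Levenshtein distance (insert, delete, substitute), two-row DP.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: similarity = 1 - distance / length of the shorter string.
    static float similarity(String term, String candidate) {
        int shorter = Math.min(term.length(), candidate.length());
        if (shorter == 0) return term.equals(candidate) ? 1.0f : 0.0f;
        return 1.0f - (float) editDistance(term, candidate) / shorter;
    }
}
```

Under that assumption, a two-letter term such as "an" can never clear a 0.8 threshold with a non-identical candidate: a single edit already drops the similarity to 0.5 or below, so "an~0.8" would behave like an exact match.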

Re: minimum string length for proximity search

2011-03-30 Thread Andy Yang
~2 term2~2". I am wondering if we should skip short words if it is not done automatically by the engine. Thanks, Andy On Wed, Mar 30, 2011 at 4:02 PM, Erick Erickson wrote: > Uhhhm, doesn't "term1 term2"~5 work? If not, why not? > > You might get some use from > htt

minimum string length for proximity search

2011-03-30 Thread Andy Yang
Is there a minimum string length requirement for proximity search? For example, would "a~" or "an~" trigger proximity search? The result would be horrible if there is no such requirement. Thanks, Andy - To

Re: [ANN] General Availability of LucidWorks Enterprise

2010-12-15 Thread Andy
Congrats! A couple questions: 1) Which version of Solr is this based on? 2) How is LWE different from standard Solr? How should one choose between the two? Thanks. --- On Wed, 12/15/10, Grant Ingersoll wrote: > From: Grant Ingersoll > Subject: [ANN] General Availability of LucidWorks Enterp

combining MultiFieldQueryParserparser with FuzzyQuery

2010-10-18 Thread Andy Yang
I would like to use MultiFieldQueryParser to search multiple fields, and in each field I want to use fuzzy search. How can that be done? Any example would be appreciated. Thanks, Andy

RE: How to search by numbers

2010-04-19 Thread Andy
That works, and now that I re-test my original code, it also works. > Date: Mon, 19 Apr 2010 10:52:45 -0700 > From: iori...@yahoo.com > Subject: Re: How to search by numbers > To: java-user@lucene.apache.org > > > > Hi, I have indexed the following two fields: > > org_id - NOT_ANALYZEDorg_name

How to search by numbers

2010-04-19 Thread Andy
Hi, I have indexed the following two fields: org_id - NOT_ANALYZED, org_name - ANALYZED. However, when I try to search by org_id, for example, 12345, I get no hits. I am using the StandardAnalyzer to index and search. And I am using: Query query = queryParser.parse("org_id:12345"); Any ideas? Th

RE: How to search multiple fields using multiple search terms

2010-04-15 Thread Andy
o: java-user@lucene.apache.org > > Why are you locked into using MultiFieldQueryParser? The simpler approach is > just send something like +title:abc +desc:123 through the regular query > parser > > HTH > Erick > > On Thu, Apr 15, 2010 at 6:34 PM, Andy wrote

How to search multiple fields using multiple search terms

2010-04-15 Thread Andy
Hi, I am trying to use the MultiFieldQueryParser to search "title" and "desc" fields. However the Lucene API appears to only let me provide a single search term. Is it possible to use multiple search terms (one for each field)? For example, the SQL equivalent would be: select * from luce

Re: lucene search

2010-01-29 Thread andy green
Thanks -- View this message in context: http://old.nabble.com/lucene-search-tp27358766p27367213.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To unsubscribe, e-mail: java-user-unsu

lucene search

2010-01-28 Thread andy green
hello, I programmed with Lucene code to handle the search on my site ... the articles indexed are those stored in a database, then I do a search with "lucene.queryparser" on the field "code" of various objects (a "code" is a word of 3 6-character) ... My problem is the fact that when I search, I

Is my app a good fit for Lucene?

2009-07-10 Thread Andy Faibishenko
cause the text format is tripping up the tokenizing. I am trying to figure out whether using Lucene to implement this is a good thing or whether I should just try to implement my own search logic. Andy Faibishenko

Re: Scaling out/up or a mix

2009-06-30 Thread Andy Goodell
rches, but overall I only spent about a week on the project, and got a 60x speed improvement on the target set. (from minutes to seconds) YMMV however, since the app requires the collection of the complete set of results for analysis. - andy g On Mon, Jun 29, 2009 at 12:47 AM, Marcus Herou wrot

Piece of coded needed

2009-04-24 Thread Andy
--- On Sat, 4/25/09, andykan1...@yahoo.com wrote: From: andykan1...@yahoo.com Subject: Piece of coded needed To: java-user@lucene.apache.org Date: Saturday, April 25, 2009, 1:37 AM Hi every body I know it may seem stupid, but I'm in the middle of a research and I need a piece of code in luc

Index in text format

2009-04-09 Thread Andy
Is there a way to have Lucene write the index to a text file?

Re: Vector space implemantion

2009-04-09 Thread Andy
solve? On Apr 9, 2009, at 2:33 AM, Andy wrote: > Hello all, > > I'm trying to implement a vector space model using lucene. I need to have a > file (or on memory) with TF/IDF weight of each term in each document. (in > fact that is a matrix with documents presented

Vector space implemantion

2009-04-09 Thread Andy
Hello all, I'm new to Lucene and trying to implement a vector space model with it. I need to have a file (or an in-memory structure) with the TF/IDF weight of each term in each document. (In fact, that is a matrix with documents represented as vectors, in which the elements of each vector are the TF weight ...
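For illustration, the matrix described above can be sketched in plain Java, independent of Lucene. This assumes the documents are already tokenized and uses the textbook weight tf * ln(N / df), which is not Lucene's own scoring formula; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TfIdfSketch {
    // Returns one sparse vector (term -> weight) per input document.
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        List<Map<String, Integer>> termFreqs = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Integer> tf = new TreeMap<>();
            for (String term : doc) tf.merge(term, 1, Integer::sum);
            // Each distinct term counts once per document towards df.
            for (String term : tf.keySet()) docFreq.merge(term, 1, Integer::sum);
            termFreqs.add(tf);
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (Map<String, Integer> tf : termFreqs) {
            Map<String, Double> vector = new TreeMap<>();
            for (Map.Entry<String, Integer> e : tf.entrySet()) {
                double idf = Math.log((double) docs.size() / docFreq.get(e.getKey()));
                vector.put(e.getKey(), e.getValue() * idf);
            }
            vectors.add(vector);
        }
        return vectors;
    }
}
```

Each map is one row of the document-by-term matrix; terms absent from a document are simply missing from its map, which keeps the representation sparse.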

Vector space implemantion

2009-04-08 Thread Andy
Hello all, I'm trying to implement a vector space model using Lucene. I need to have a file (or an in-memory structure) with the TF/IDF weight of each term in each document. (In fact, that is a matrix with documents represented as vectors, in which the elements of each vector are the TF weight ...) Please Please h

Re: Lucene searching across documents

2009-04-08 Thread Andy
Hello all, I'm trying to implement a vector space model using Lucene. I need to have a file (or an in-memory structure) with the TF/IDF weight of each term in each document. (In fact, that is a matrix with documents represented as vectors, in which the elements of each vector are the TF weight ...) Please Please

Re: Luke is coming .. not there yet.

2008-10-30 Thread Andy Triana
whichever is chosen. Just a huge thank you for making this tool available! Great tool! //andy On Thu, Oct 30, 2008 at 4:06 AM, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: > Hi all, > > Many people ask me when the next version of Luke becomes available. It's > almost

phrases and slop

2008-08-28 Thread Andy Goodell
brown" to require a 4 instead of a 3, two to transpose brown and fox, two to transpose quick and fox. Why is this only 3? - andy g - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: delete by doc id

2008-08-08 Thread Andy Triana
a "best practice" to treat Lucene as described. //andy On Fri, Aug 8, 2008 at 2:39 PM, Cam Bazz <[EMAIL PROTECTED]> wrote: > hello, > > what would happen if I modified the class IndexWriter, and made the delete > by id method public? > > I have two fields in my d

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Andy Liu
be merged and data ends up getting copied over again at certain points. So if you're running a batch process with a lot of inserts, you might get better throughput with BDB as opposed to Lucene, but, of course, benchmark to confirm ;) Andy On Thu, Jul 31, 2008 at 9:12 AM, Karsten F. &l

Re: Using Lucene to find duplicate/similar names

2008-04-16 Thread Andy DePue
ch other as the same NGrams in the search string. I'm hoping NGrams would avoid the need for a whole index scan. Does Lucene already factor this into its hit score, or would I need to do some custom work? - Andy Grant Ingersoll wrote: I believe there were some posts on this about a year

Using Lucene to find duplicate/similar names

2008-04-16 Thread Andy DePue
or similar name. Based on the little I know of Lucene, I'm thinking an NGram algorithm (based on characters, not words) would work best... but, I'm not sure if Lucene takes proximity or edit distances into account? For example, say you have these two names: Andrew John John Andrew I
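To make the character n-gram idea concrete, here is a minimal plain-Java sketch that compares two names by their character bigrams using the Dice coefficient. It is an illustration under my own assumptions (lower-casing, bigrams as sets, Dice as the measure), not something Lucene does out of the box; the names are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class BigramSimilaritySketch {
    // Distinct character bigrams of the lower-cased input.
    static Set<String> bigrams(String s) {
        String t = s.toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= t.length(); i++) grams.add(t.substring(i, i + 2));
        return grams;
    }

    // Dice coefficient: 2 * |shared| / (|A| + |B|), between 0 and 1.
    static double dice(String a, String b) {
        Set<String> ga = bigrams(a);
        Set<String> gb = bigrams(b);
        if (ga.isEmpty() && gb.isEmpty()) return a.equalsIgnoreCase(b) ? 1.0 : 0.0;
        Set<String> shared = new HashSet<>(ga);
        shared.retainAll(gb);
        return 2.0 * shared.size() / (ga.size() + gb.size());
    }
}
```

For "Andrew John" and "John Andrew" the two bigram sets have ten entries each and share eight, so the score is 0.8: word order barely matters, but position and edit distance are not taken into account by this measure.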

Re: Indexing Wikipedia dumps

2007-12-12 Thread Andy Goodell
My firm uses a parser based on javax.xml.stream.XMLStreamReader to break (english and nonenglish) wikipedia xml dumps into lucene-style "documents and fields." We use wikipedia to test our language-specific code, so we've probably indexed 20 wikipedia dumps. - andy g On Dec 1

Searching with a score cutoff

2007-06-04 Thread Andy Goodell
n the hits are gathered. The only way I can see of doing this is by over-riding Similarity, which seems like an incredibly complex procedure. What am I missing? - andy g.

Re: How many Searches is a Searcher Worth?

2007-04-05 Thread Andy Goodell
heap dump, and will start an http listener on port 7000 by default. Interesting statistics can be found at the bottom of the front page. These will enable you to discover whether it is a memory leak in the java runtime or in the lucene library. - andy g On 4/5/07, Craig W Conway <[EM

Re: Index updates between machines

2007-04-03 Thread Andy Liu
chine you can load balance 2 search servers and take one out of the cluster when the index is being copied. Alternatively, if it's possible, you can copy the index at an offpeak hour. Andy On 4/3/07, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: How fast are your disks? Perhaps they ar

Re: Range search in numeric fields

2007-04-03 Thread Andy Liu
nding on your data. Andy On 4/3/07, Ivan Vasilev <[EMAIL PROTECTED]> wrote: Hi All, I have the following problem: I have to implement range search for fields that contain numbers. For example the field size that contains file size. The problem is that the numbers are not kept in strings
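The reply above is cut off, so the following is only a sketch of one common workaround from that period, not necessarily what was suggested. It assumes range queries compare terms as strings, so numbers are indexed zero-padded to a fixed width to make lexicographic order agree with numeric order. The class name and width are hypothetical.

```java
import java.util.Locale;

final class PaddedNumberSketch {
    // 19 digits is enough for any non-negative long.
    static final int WIDTH = 19;

    // Encodes a non-negative number so that string order equals numeric order.
    static String pad(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }
}
```

Unpadded, "9" sorts after "10"; padded, pad(9) sorts before pad(10), so a string range over the padded size field returns the numerically correct documents. The same encoding has to be applied at both index time and query time.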

Re: Using ParallelReader over large immutable index and small updatable index

2007-03-07 Thread Andy Liu
r field4 is a field that would be updated frequently and as real-time as possible. However, once I update field4, the docId's are no longer synchronized, and ParallelReader fails. Andy On 3/6/07, Alexey Lef <[EMAIL PROTECTED]> wrote: We use MultiSearcher for a similar scenario. This

Using ParallelReader over large immutable index and small updatable index

2007-03-06 Thread Andy Liu
is issue on the list, but nothing pointing to a solution. Can somebody help me out? Andy

Re: Lopsided scores for each term in BooleanQuery

2006-09-18 Thread Andy Liu
I'm just not seeing? Andy On 9/18/06, Paul Elschot <[EMAIL PROTECTED]> wrote: On Monday 18 September 2006 23:08, Andy Liu wrote: > For multi-word queries, I would like to reward documents that contain a more > even distribution of each word and penalize documents that have a skewe

Lopsided scores for each term in BooleanQuery

2006-09-18 Thread Andy Liu
would implement this? Thanks, Andy

Re: SQL-Like Join in Lucene

2006-08-10 Thread hu andy
4. Search for records with filter. If the filter returns a lot of ids, it won't be fast. Recently I ran a test. I customized a filter which gets a list of ids from a MySQL database table of size 5000. Then I invoked search(query, filter, hitcollector); it took me more than 40s to retrieve th

Re: About the use of HitCollector

2006-08-08 Thread hu andy
filter. Can you give me some advice? 2006/8/8, Ryan O'Hara <[EMAIL PROTECTED]>: Hey Andy, If you have enough RAM, try using FieldCache: String[] fieldYouWant = FieldCache.DEFAULT.getStrings (searcher.getIndexReader(), "fieldYouWant"); searcher.search(query, new HitColle

Re: About the use of HitCollector

2006-08-07 Thread hu andy
, then use the list to check whether the Lucene searched results should be returned. Can you give some suggestions? Also, can you show me how you use the filter? 2006/8/8, Simon Willnauer <[EMAIL PROTECTED]>: Hey Andy, i don't know how you determinate whether a document has to be displ

Re: About the use of HitCollector

2006-08-07 Thread hu andy
document to determine whether I should return the document. The total number of documents is about two hundred thousand. So I'm afraid the performance 2006/8/7, Martin Braun <[EMAIL PROTECTED]>: hi andy, > How can I use HitCollector to iterate over every returned document

About the use of HitCollector

2006-08-07 Thread hu andy
How can I use HitCollector to iterate over every returned document? Thank you in advance.

Re: Consult some information about adding index while searching

2006-07-30 Thread hu andy
Thank you

Re: Consult some information about adding index while searching

2006-07-28 Thread hu andy
This code is written in C#. There is a C# version of Lucene 1.9, which can be downloaded from http://www.dotlucene.net This implements the indexing: public void CreateIndex() { try { AddDirectory(directory); writer.Optimize();

Re: Consult some information about adding index while searching

2006-07-27 Thread hu andy
Yes, I have closed IndexWriter. But it doesn't work. 2006/7/27, Michael McCandless <[EMAIL PROTECTED]>: > I met this problem: when searching, I add documents to index. Although I > instantiates a new IndexSearcher, I can't retrieve the newly added > documents. I have to close the program an

Consult some information about adding index while searching

2006-07-27 Thread hu andy
I ran into this problem: while searching, I add documents to the index. Although I instantiate a new IndexSearcher, I can't retrieve the newly added documents. I have to close the program and start it again; then it is OK. The platform is Windows XP. Is it the fault of XP? Thank you in advance.

Re: Maybe a bug of lucene 1.9

2006-06-05 Thread hu andy
stribute it to you. I am glad You understand Chinese. How I should deliver it to you? Because the api includes a Chinese lexis which is nearly 10M in size. Maybe I can mail it to you. 2006/5/30, Erik Hatcher <[EMAIL PROTECTED]>: On May 29, 2006, at 6:34 AM, hu andy wrote: > I indexed

Re: Maybe a bug of lucene 1.9

2006-05-30 Thread hu andy
2006/5/29, hu andy <[EMAIL PROTECTED]>: I indexed a collection of Chinese documents. I use a special segmentation api to do the analysis, because the segmentation of Chinese is different from English. A strange thing happened. With lucene 1.4 or lucene 2.0, it will be all right to re

Maybe a bug of lucene 1.9

2006-05-29 Thread hu andy
I indexed a collection of Chinese documents. I use a special segmentation api to do the analysis, because the segmentation of Chinese is different from English. A strange thing happened. With lucene 1.4 or lucene 2.0, it will be all right to retrieve the corresponding documents given the terms

Ask for a better solution for the case

2006-04-28 Thread hu andy
Hi, I have an application that needs to mark the retrieved documents which have been read, so that next time I needn't read the marked documents again. I have an idea of adding a particular field to the indexed document. But as Lucene has no update method, I have to delete that document, and

Re: performance differences between 1.4.3 and 1.9.1

2006-04-26 Thread Andy Goodell
For my application we have several hundred indexes, different subsets of which are searched depending on the situation. Aside from not upgrading to lucene 1.9, or making a big index for every possible subset, do you have any ideas for how can we maintain fast performance? - andy g On 4/26/06

What is the retrieval modle for lucene?

2006-04-10 Thread hu andy
I have seen in some documents that there are three kinds of retrieval model which are often used: Boolean, vector space and probabilistic. So I want to know which one is used by Lucene. Thank you in advance

Re: Update or Delete Document for Lucene 1.4.x

2006-04-03 Thread hu andy
IndexReader.delete(int docNum) or IndexReader.delete(Term term) 2006/4/1, Don Vaillancourt <[EMAIL PROTECTED]>: > > Hi All, > > I need to implement the ability to update one document within a Lucene > collection. > > I haven't been able to find anything in the API. Is there a way to > update one

Speed up Indexing

2006-03-22 Thread hu andy
Hi, everyone. I have a large amount of XML files of size 1G. I use Lucene (the dotNet edition) to index them. There are 8 fields per document, with 4 keyword fields and 4 unstored fields. I have set minMergeDocs to 1 and mergeFactor to 100. It took about 2.5 hours (main memory 3G, CPU P4). I a

Re: question...

2006-03-16 Thread hu andy
Do you mean you pack the index files into the file *.luc? If that is the case, Lucene can't read it. If you put the index files and *.luc together under some directory, that's OK. Lucene knows how to find these files. 2006/3/14, Aditya Liviandi <[EMAIL PROTECTED]>: > > Hi all, > > > > If I want to embed

About index deletion

2006-03-16 Thread hu andy
Because I will delete indexed documents periodically, the index files must be deleted after that. If I just want to delete from the index the documents added before some past day, how should I do it? Thank you in advance

who can tell me how lucene search in the index files

2006-03-14 Thread hu andy
I see there are seven different files with extensions .fnm, .tis, etc. I just can't work out how a term is looked up in the .tis file. Does Lucene use binary search to locate the term?
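As far as I understand the classic file format (this is not confirmed in the thread), the answer is "partly": the sorted term dictionary in .tis is sampled at a fixed interval into a small in-memory index (.tii), that index is binary-searched, and the lookup finishes with a short sequential scan in .tis. The sketch below models this with plain arrays; the class name, the arrays and the interval used are hypothetical stand-ins, not Lucene code.

```java
import java.util.Arrays;

final class SparseTermIndexSketch {
    private final String[] terms;      // full sorted dictionary (stands in for .tis)
    private final String[] indexTerms; // every interval-th term (stands in for .tii)
    private final int interval;

    SparseTermIndexSketch(String[] sortedTerms, int interval) {
        this.terms = sortedTerms.clone();
        this.interval = interval;
        int n = (terms.length + interval - 1) / interval;
        this.indexTerms = new String[n];
        for (int i = 0; i < n; i++) indexTerms[i] = terms[i * interval];
    }

    // Returns the position of the term in the dictionary, or -1 if absent.
    int lookup(String term) {
        int pos = Arrays.binarySearch(indexTerms, term);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the block to
        // scan starts at the last sampled term <= the target.
        int block = pos >= 0 ? pos : -pos - 2;
        if (block < 0) return -1; // sorts before the first term
        int start = block * interval;
        int end = Math.min(start + interval, terms.length);
        for (int i = start; i < end; i++) {
            int cmp = terms[i].compareTo(term);
            if (cmp == 0) return i;
            if (cmp > 0) break; // passed the spot where it would be
        }
        return -1;
    }
}
```

The design trades a little scanning for a much smaller in-memory structure: only one term per block has to be held in RAM.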

Re: sub search

2006-03-07 Thread hu andy
etween that and search described below: > > TermQuery termQuery = new TermQuery( > BooleanQuery bq = .. > bq.add(termQuery,true,false); > bq.add(query,true,false); > hits = Searcher.search(bq,queryFilter); > > > > -Original Message- > From: hu andy [mailto:[

Re: sub search

2006-03-07 Thread hu andy
2006/3/7, Anton Potehin <[EMAIL PROTECTED]>: > > Is it possible to make search among results of previous search? > > > > > > For example: I made search: > > > > Searcher searcher =... > > > > Query query = ... > > > > Hits hits = > > > > hits = Searcher.search(query); > > > > > > > > After it

IO Error/Jira

2005-12-01 Thread Andy Hind
Jira it is not clear to me what version of lucene I need to include a fix. Has version 1.4.3 been fixed up beyond the latest official binary dated 29-Nov-2004? Should I be getting and building from the repository? Any help appreciated, Regards Andy

Re: Reverse sorting by index order

2005-11-03 Thread Andy Lee
On Nov 3, 2005, at 10:22 AM, Oren Shir wrote: There is no constructor for Sort(SortField, boolean) in Lucene API. Which version are you using? I think 1.9rc1. I have a pretty recent svn checkout -- maybe this constructor is new. --Andy

Re: Reverse sorting by index order

2005-11-03 Thread Andy Lee
erever you were using Sort.INDEXORDER. --Andy

Re: trying to boost a phrase higher than its individual words

2005-10-28 Thread Andy Lee
case the IndexSearcher classes. All I could find was the *Nutch* IndexSearcher's getExplanation() method, which I see sends toHtml() rather than toString() to its internal Lucene IndexSearcher. --Andy

Re: trying to boost a phrase higher than its individual words

2005-10-28 Thread Andy Lee
it seems to be only HTML now. Finally I wrote a convenience method that dumps the HTML to a file, which I view in a browser. Thanks, Chris and Erik! --Andy

trying to boost a phrase higher than its individual words

2005-10-27 Thread Andy Lee
oosts, and if so, can someone explain (at least roughly) how to achieve the desired result? --Andy

Re: Do you believe in Clause sanity?

2005-10-13 Thread Andy Lee
Oops, I'm confusing libraries. I meant I want to remove a Nutch Clause from a Nutch Query. --Andy On Oct 13, 2005, at 4:45 PM, Andy Lee wrote: The API for BooleanQuery only seems to allow adding clauses. The nearest way I can see to *remove* a clause is by laboriously construct

Do you believe in Clause sanity?

2005-10-13 Thread Andy Lee
me to want to remove clauses from a query. Is there some reasonable way of doing this that I'm missing? --Andy

Query to return all documents in the index

2005-10-05 Thread Andy Goodell
that i could use that in this case? thanks, andy g

Re: A very technical question.

2005-09-28 Thread Andy Liu
"long", "verylong", depending on how granular you need it. Then at query time you can specify the field and a given boost value, i.e. civil war docLength:verylong^5 docLength:long^3 Andy On 9/28/05, Dawid Weiss <[EMAIL PROTECTED]> wrote: > > > Hi. > > I

Re: n-gram indexing

2005-07-18 Thread Andy Roberts
e than one term in the search query. Also, there is obviously going to be some duplication of hits, so you could use a HashMap when iterating over the Hits to ensure you get unique hits when the queries are collated. Andy
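The de-duplication step suggested above might look like the following plain-Java sketch. It assumes each query's results have already been copied out of Lucene into a map of document id to score, and it keeps the best score per document; both choices, and the names, are my own assumptions rather than anything stated in the thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HitMergeSketch {
    // Collates several result sets into one, with each document id appearing once.
    static Map<Integer, Float> merge(List<Map<Integer, Float>> resultSets) {
        Map<Integer, Float> merged = new LinkedHashMap<>();
        for (Map<Integer, Float> hits : resultSets) {
            for (Map.Entry<Integer, Float> hit : hits.entrySet()) {
                // Keep the highest score seen for a duplicated document.
                merged.merge(hit.getKey(), hit.getValue(), Float::max);
            }
        }
        return merged;
    }
}
```

A LinkedHashMap is used so that documents stay in the order in which they were first encountered; a plain HashMap works equally well if order does not matter.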

Re: n-gram indexing

2005-07-18 Thread Andy Roberts
hieve more generally, we can confirm that you don't need to mess with explicit indexing of indexing. Andy

Re: Hypenated word

2005-06-13 Thread Andy Roberts
On Monday 13 Jun 2005 14:52, Markus Wiederkehr wrote: > On 6/13/05, Andy Roberts <[EMAIL PROTECTED]> wrote: > > On Monday 13 Jun 2005 13:18, Markus Wiederkehr wrote: > > > I see, the list of exceptions makes this a lot more complicated than I > > > thought... Tha

Re: Hypenated word

2005-06-13 Thread Andy Roberts
a hyphen, you can manipulate the buffer to merge the hyphenated tokens. Andy
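A minimal sketch of that merging step, outside of any Lucene TokenFilter: it assumes the tokens are available as a list of strings and that a token ending in "-" should be joined to the token that follows it. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HyphenMergeSketch {
    // Joins tokens that end in a hyphen with the token that follows them.
    static List<String> merge(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        for (String token : tokens) {
            if (token.length() > 1 && token.endsWith("-")) {
                // Buffer the fragment without its trailing hyphen.
                pending.append(token, 0, token.length() - 1);
            } else {
                out.add(pending.append(token).toString());
                pending.setLength(0);
            }
        }
        // A dangling fragment at the end of input is emitted as-is.
        if (pending.length() > 0) out.add(pending.toString());
        return out;
    }
}
```

Note that this always drops the hyphen, so genuinely hyphenated compounds are fused as well; handling those is exactly the exception-list problem discussed earlier in the thread.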

Relative term frequency?

2005-06-06 Thread Andy Liu
f(int freq) It would be nice to have something like: float tf(int freq, String fieldName, int numTerms) If this isn't available out of the box, how difficult would it be to hack up Lucene to allow for this? Thanks, Andy - To u

Re: Indexing multiple languages

2005-06-03 Thread Andy Roberts
atin1 encoding doesn't support such characters. You need to specify Big5 yourself. Read the info on InputStreamReaders: http://java.sun.com/j2se/1.5.0/docs/api/java/io/InputStreamReader.html Andy > > Btw, I did try running the lucene demo (web template) to index the HTML > files af
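As a small illustration of the advice above, the sketch below reads a byte stream through an InputStreamReader with an explicitly named charset instead of relying on the platform default. The helper name is hypothetical, and whether "Big5" is available depends on the Java runtime in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

final class EncodingReaderSketch {
    // Decodes the whole stream using the named charset, e.g. "Big5".
    static String readAll(InputStream in, String charsetName) {
        try (Reader reader = new InputStreamReader(in, Charset.forName(charsetName))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) sb.append(buf, 0, n);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting String (or the Reader itself) can then be handed to the indexing code; decoding the same bytes as Latin-1 would yield different, garbled characters.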

Re: Retrieve all terms

2005-05-19 Thread Andy Roberts
termFreq += tp.freq(); } System.out.println(currentTerm.text() + "(" + termFreq + "|" + te.docFreq() + ")"); } reader.close(); } HTH, Andy ---

Digester and simple XML files

2005-04-22 Thread Andy Roberts
t I've been hacking at it for a while with little fun. Any suggestions? Thanks, Andy public class DigesterTest { private Digester dig; public DigesterTest(File inFile) throws IOException, SAXException { dig = new Digester();

Re: Best way to purposely corrupt an index?

2005-04-21 Thread Andy Roberts
nce of their indexes! (And I'm sure they'd prefer it that way too) Andy

Re: Best way to purposely corrupt an index?

2005-04-20 Thread Andy Roberts
On Wednesday 20 Apr 2005 08:27, Maik Schreiber wrote: > > As the index is rather critical to my program, I just wanted to make it > > really robust, and able to cope should a problem occur with the index > > itself. Otherwise, the user will be left with a non-functioning program > > with no explana

Re: Best way to purposely corrupt an index?

2005-04-20 Thread Andy Roberts
, I just wanted to make it really robust, and able to cope should a problem occur with the index itself. Otherwise, the user will be left with a non-functioning program with no explanation. That's my reasoning anyway. Andy > Handle > catch/throw/finally correctly and it should not p

Best way to purposely corrupt an index?

2005-04-19 Thread Andy Roberts
stions, or will removing any file from the directory be sufficient? Many thanks, Andy

Re: getting the number of occurrences within a document

2005-04-14 Thread Andy Roberts
Docs(currentTerm); int docCounter = 1; while (docs.next()) { System.out.println(currentTerm.text() + ", doc" + docCounter + ", " + docs.freq()); docCounter++; } } HTH, Andy

Re: Multi-analyzer ?

2005-04-11 Thread Andy Roberts
. However, it's clear that you can't really accommodate multi-language documents. It would be much easier to ensure all docs were in a single language before indexing. Andy

Re: Multi-analyzer ?

2005-04-11 Thread Andy Roberts
ser to specify their input language because otherwise, results will be poor. Andy Roberts > -MB > > On Apr 11, 2005, at 6:02 AM, Andy Roberts wrote: > > Can you not provide the user with a option list to specify their input > > language? > > > > Language identificat

Re: Terms & Postion from Hits ...

2005-04-11 Thread Andy Roberts
5 in the field "contents" of the index ir. HTH, Andy Roberts On Sunday 10 Apr 2005 15:52, Patricio Galeas wrote: > Hello, > I am new with Lucene. I have following problem. > When I execute a search I receive the list of document Hits. > I get without problem the

Re: Multi-analyzer ?

2005-04-11 Thread Andy Roberts
uages was to build a model based on character bigrams (that is, sequences of two letters) [1] At the end of the day, Lucene cannot help you in choosing the correct language as it doesn't know, and so it'll be up to you to add the necessary logic to tell Lucene which Analyzers to utilis

Re: Escaping special characters

2005-04-07 Thread Andy Roberts
m the book (http://www.lucenebook.com/LuceneInAction.zip). If you unzip this file you will find a directory called "LuceneInAction/src/lia/analysis" and in there is a class called AnalyzerDemo (which depends on AnalyzerUtils). Compile this and run it to see how the Analysers work. Put in your hyp

Highlighter compile error

2005-03-10 Thread Andy Roberts
successfully built the code in the lucene-1.4.2-dev branch, but that doesn't contain that class either! Any hints? Google didn't shed any light, btw. Cheers, Andy Roberts

Obtaining the contexts of hits

2005-03-09 Thread Andy Roberts
Hi, I've been using Lucene for a few months now, although not in a typical "building a search engine" kind of way*. Basically, I have some large documents. I would like a system whereby I search for a term, and then I receive a hit for each match, with its context, e.g., ten words either side
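The kind of output described above (each match with a window of words either side) can be sketched in plain Java as follows. This works on the raw text rather than on a Lucene index, splits on whitespace only and matches case-insensitively; these are simplifying assumptions of mine, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KwicSketch {
    // Returns one context string per occurrence of term, with up to
    // `window` words on each side of the match.
    static List<String> contexts(String text, String term, int window) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (words[i].equalsIgnoreCase(term)) {
                int from = Math.max(0, i - window);
                int to = Math.min(words.length, i + window + 1);
                out.add(String.join(" ", Arrays.copyOfRange(words, from, to)));
            }
        }
        return out;
    }
}
```

Calling contexts(text, "fox", 10) would give the ten-words-either-side view; punctuation attached to a word (for example "fox,") is not stripped in this sketch, so such occurrences would be missed.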