Does MMapDirectory need twice as much virtual memory as is actually needed?

2010-01-29 Thread luocanrao
Environment: 64-bit Linux, 8 GB memory. When I used the pmap command to inspect virtual memory, I found two large anonymous mappings that grow with the index file size. I have the following two pictures to show the problem; can you explain? This is why I got an out-of-memory exception on a 32-bit machine. B

Re: index a mysql database -blob field

2010-01-29 Thread Chris Lu
For a BLOB it is not so simple, since a BLOB can contain different file types, such as HTML, PDF, Word, or zip files. So besides getting results out via the resultSet.getBlob() function, you will need to convert the binary stream into simple text strings. DBSight free version already can read the blog

Re: "one of the terms"

2010-01-29 Thread Jake Mannix
coord won't help him, I don't think. Doesn't he just want a DisjunctionMaxQuery instead of BooleanQuery? -jake On Fri, Jan 29, 2010 at 9:28 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Paul, > > Custom Similarity perhaps, oui. Not 100% sure, maybe have this always > return 1.0
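To make the DisjunctionMaxQuery suggestion concrete, here is a minimal sketch in plain Java of the arithmetic behind a disjunction-max score, compared with a plain sum of sub-scores. It does not use Lucene at all; the class and method names (DisMaxSketch, sum, disMax) and the sample scores are hypothetical. With a tie-breaker of 0, only the best-matching sub-query contributes, so matching more of the alternative terms no longer raises the score. Term frequency inside a single sub-query would still influence that sub-query's own score.

```java
class DisMaxSketch {
    // Plain sum: every additional matching term raises the score.
    static float sum(float[] subScores) {
        float total = 0f;
        for (float s : subScores) {
            total += s;
        }
        return total;
    }

    // Disjunction-max: best sub-score plus tieBreaker times the remaining sub-scores.
    static float disMax(float[] subScores, float tieBreaker) {
        float max = 0f;
        float total = 0f;
        for (float s : subScores) {
            total += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (total - max);
    }

    public static void main(String[] args) {
        float[] oneMatch = {0.5f};
        float[] threeMatches = {0.5f, 0.5f, 0.5f};
        System.out.println("sum, one match:       " + sum(oneMatch));
        System.out.println("sum, three matches:   " + sum(threeMatches));
        System.out.println("dismax, one match:    " + disMax(oneMatch, 0f));
        System.out.println("dismax, three matches: " + disMax(threeMatches, 0f));
    }
}
```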

Re: "one of the terms"

2010-01-29 Thread Otis Gospodnetic
Paul, Custom Similarity perhaps, oui. Not 100% sure, maybe have this always return 1.0f. /** Computes a score factor based on the fraction of all query terms that a * document contains. This value is multiplied into scores. * * The presence of a large portion of the query terms ind

Re: Modifying IDF

2010-01-29 Thread Franz Allan Valencia See
How should I go about identifying the domain? Thanks, -- Franz Allan Valencia See | Java Software Engineer franz@gmail.com LinkedIn: http://www.linkedin.com/in/franzsee Twitter: http://www.twitter.com/franz_see On Fri, Jan 29, 2010 at 6:42 PM, Ian Lea wrote: > Instead of playing around wi

Re: FastVectorHighlighter and query with multiple fields

2010-01-29 Thread Koji Sekiguchi
Marc Sturlese wrote: I have FastVectorHighlighter working with a query like: title:Ipod OR title:IPad but it's not working when (0 snippets are returned): title:Ipod OR content:IPad This is true when you are going to highlight IPad in title field and set fieldMatch to true at the FVH constr

RE: index demo throws LockObtainFailedException

2010-01-29 Thread Teruhiko Kurosaka
Thank you, Otis and Mike! I verified that the older version of Lucene (2.3.1) works without this problem on this machine. Kuro - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail:

"one of the terms"

2010-01-29 Thread Paul Libbrecht
Hello luceners, In our project, we are building queries from a long list of possible terms (expanded through ontology deduction). However, I would like the rank to be unaffected by the number of matches: one or thirty occurrences of one of the many words should give the same score. Did

Re: index a mysql database -blob field

2010-01-29 Thread Ian Lea
Search Google for converting MySQL BLOBs to text. Or look at the ResultSet stream methods in conjunction with the Field constructors that take Readers. Maybe something like new Field("txt", new InputStreamReader(rs.getBinaryStream("txt"))) -- Ian. On Fri, Jan 29, 2010 at 6:10 PM, luciusvorenus w
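As a rough illustration of the Reader approach above, the following plain-Java sketch wraps a binary stream in an InputStreamReader and reads it back as text. It deliberately avoids JDBC and Lucene so that it runs on its own: a ByteArrayInputStream stands in for rs.getBinaryStream("txt"), and the class name BlobTextSketch, the helper names, and the UTF-8 charset are all assumptions. This only makes sense when the BLOB actually holds plain text; formats such as PDF or Word would need a text extractor first, as noted elsewhere in the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BlobTextSketch {
    // Wrap the raw BLOB bytes in a Reader; the charset is an assumption about the stored data.
    static Reader asReader(InputStream blobStream) {
        return new InputStreamReader(blobStream, StandardCharsets.UTF_8);
    }

    // Drain the stream into a String, e.g. to inspect what an analyzer would see.
    static String blobToText(InputStream blobStream) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try (Reader reader = asReader(blobStream)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for rs.getBinaryStream("txt")
        InputStream fakeBlob =
            new ByteArrayInputStream("hello blob".getBytes(StandardCharsets.UTF_8));
        System.out.println(blobToText(fakeBlob));
    }
}
```

In an indexing loop, the Reader returned by asReader could presumably be handed straight to a Field constructor that takes a Reader, instead of being drained into a String first.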

index a mysql database -blob field

2010-01-29 Thread luciusvorenus
Hello. One more question about the blob: "d.add(new Field("txt", rs.getString("subject"), Field.Store.NO, Field.Index.ANALYZED));" But how can I index a blob? The field txt is a blob ... with rs.getBlob(txt)? Thank you -- View this message in context: http://old.nabble.com/index-a-mysql-

Re: How further reward documents matching more query terms?

2010-01-29 Thread Ian Lea
I presume that quote is from the javadocs for Similarity. You can write your own Similarity class that extends DefaultSimilarity and provides an implementation of public float coord(int overlap, int maxOverlap) that does what you want, perhaps by scaling up the value returned, if I've understood the
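The arithmetic behind coord can be sketched in plain Java without Lucene. As far as I know, DefaultSimilarity computes coord as overlap / (float) maxOverlap; the other two variants below are hypothetical alternatives, one that ignores the number of matching terms (the "always return 1.0f" idea from the "one of the terms" thread) and one that rewards documents matching more terms more steeply by squaring the fraction. The class name CoordSketch and the method names are assumptions made for this example; in a real setup the chosen formula would go into an overridden coord method of a DefaultSimilarity subclass.

```java
class CoordSketch {
    // Fraction of query terms matched; believed to mirror DefaultSimilarity.coord.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Hypothetical: the score no longer depends on how many terms matched.
    static float flatCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    // Hypothetical: partial matches are penalised more strongly than with the default.
    static float steepCoord(int overlap, int maxOverlap) {
        float fraction = overlap / (float) maxOverlap;
        return fraction * fraction;
    }

    public static void main(String[] args) {
        int maxOverlap = 4;
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + "/" + maxOverlap
                + " default=" + defaultCoord(overlap, maxOverlap)
                + " flat=" + flatCoord(overlap, maxOverlap)
                + " steep=" + steepCoord(overlap, maxOverlap));
        }
    }
}
```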

Re: AW: index a database

2010-01-29 Thread luciusvorenus
One more question: "d.add(new Field("txt", rs.getString("subject"), Field.Store.NO, Field.Index.ANALYZED));" The field txt is a blob ... how can I index a blob? With rs.getBlob(txt)? Thanks. luciusvorenus wrote: > > > "" > > Exception in thread "main" java.lang.NullPointerException >

Re: AW: AW: index a database

2010-01-29 Thread luciusvorenus
Thank you Marc, and thank you all... it's working :) Marc Schwarz wrote: > > Maybe you should separate the add method from the database function... > > Separate the db loop something like that: > > try >{ > ResultSet rs2 = stm.executeQuery(sql); > while(

How further reward documents matching more query terms?

2010-01-29 Thread Phan The Dai
"When searching with a query as a multi term query, users can further reward documents matching more query terms through a coordination factor: coord-factor(q,d)" How do we configure this factor? I need documents that match more query terms to get a higher score. Please show me more

AW: Highlighter / cannot be instantiated

2010-01-29 Thread Marc Schwarz
Yep, that was it :-) Thanks! -Original Message- From: Illés Solt [mailto:illes.s...@gmail.com] Sent: Friday, 29 January 2010 15:42 To: java-user@lucene.apache.org Subject: Re: Highlighter / cannot be instantiated Are you sure you imported Highlighter from the correct lucene

FastVectorHighlighter and query with multiple fields

2010-01-29 Thread Marc Sturlese
I have FastVectorHighlighter working with a query like: title:Ipod OR title:IPad but it's not working when (0 snippets are returned): title:Ipod OR content:IPad Could this be because when FieldQuery is created the query to build it must have just one field? If it's not the case I may be missing

Re: Highlighter / cannot be instantiated

2010-01-29 Thread Illés Solt
Are you sure you imported Highlighter from the correct lucene namespace org.apache.lucene.search.highlight.Highlighter and not something else like javax.swing.text.Highlighter? Illes 2010/1/28 Marc Schwarz : > I'm trying to get the highlighter running, but didn't get it work. > > Everywhere i

RE: Email Filter using Lucene 3.0

2010-01-29 Thread Uwe Schindler
We talked about that internally; I would change the recursion to a while loop. Otherwise it looks correct. And for efficiency I would always reuse the same linked list rather than create a new one each time. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@

Re: Email Filter using Lucene 3.0

2010-01-29 Thread Jamie
Hi Uwe Thanks so much for your help. I now understand Token Filters much better and your suggestion worked! Here's the code for anyone else who is interested. import org.apache.commons.logging.*; import org.apache.lucene.analysis.TokenStream; import org.apache.lucene.analysis.TokenFilter; imp

RE: Email Filter using Lucene 3.0

2010-01-29 Thread Uwe Schindler
Here is another variant without recursion: In ctor: Define a class member (!!!) LinkedList for your split email addresses, initially empty termAtt = addAttribute(TermAttribute.class); In incrementToken: while (true) { if (!linkedlist.isEmpty()) { clea
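The non-recursive pattern described above (one reusable LinkedList member, drained inside a while (true) loop) can be modelled in plain Java without the Lucene TokenFilter API. This is only a sketch under assumptions: the class name EmailSplitSketch, the next() method returning null at end of input, and the splitting rule (emit the full address, then its parts split on '@' and '.') are hypothetical, since the original EmailFilter code is not shown here. In a real filter, next() would correspond to incrementToken() returning a boolean and writing each term into the registered TermAttribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class EmailSplitSketch {
    private final Iterator<String> input;
    // Single queue instance, reused for every token rather than reallocated.
    private final LinkedList<String> pending = new LinkedList<>();

    EmailSplitSketch(List<String> tokens) {
        this.input = tokens.iterator();
    }

    // Returns the next output token, or null when the input is exhausted.
    String next() {
        while (true) {
            if (!pending.isEmpty()) {
                return pending.removeFirst();   // emit queued tokens first
            }
            if (!input.hasNext()) {
                return null;                    // nothing queued, nothing left to read
            }
            String token = input.next();
            pending.add(token);                 // always keep the original token
            if (token.indexOf('@') >= 0) {
                for (String part : token.split("[@.]")) {
                    if (!part.isEmpty()) {
                        pending.add(part);      // queue the address parts behind it
                    }
                }
            }
        }
    }

    // Convenience helper: collect the whole output stream into a list.
    static List<String> drain(List<String> tokens) {
        EmailSplitSketch stream = new EmailSplitSketch(tokens);
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(Arrays.asList("hi", "jane@example.com")));
    }
}
```

Because the loop re-checks the queue before pulling from the input, no recursive call is needed to move on to the next input token once the queue is empty.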

RE: Email Filter using Lucene 3.0

2010-01-29 Thread Uwe Schindler
Can you send us the original filter? The implementation below is wrong in its whole design. All attributes are singletons in each instance of this TokenStream, so your code cannot work. addAttribute always returns the same instance. You have to register the singletons in the ctor using addAttrib

Re: Email Filter using Lucene 3.0

2010-01-29 Thread Otis Gospodnetic
Hi Jamie, Could you say more about how it's not working? Not compiling? Run-time exceptions? Doesn't work as expected after you run a unit test for it? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Mes

Email Filter using Lucene 3.0

2010-01-29 Thread Jamie
Hi there. In the absence of documentation, I am trying to convert an EmailFilter class to Lucene 3.0. It's not working! Obviously, my understanding of the new token filter mechanism is misguided. Can someone in the know help me out for a sec and let me know where I am going wrong? Thanks. impo

Re: Lucene 2.9.0-rc2 [PROBLEM] : TokenStream API (incrementToken / AttributeSource), cannot implement a LookaheadTokenFilter.

2010-01-29 Thread Jamie
Hi there. In the absence of documentation, I am trying to convert an EmailFilter class to Lucene 3.0. It's not working! Obviously, my understanding of the new token filter mechanism is misguided. Can someone in the know help me out for a sec and let me know where I am going wrong? Thanks. impo

Re: Modifying IDF

2010-01-29 Thread Ian Lea
Instead of playing around with tf/idf, how about just indexing and searching the domain? -- Ian. On Fri, Jan 29, 2010 at 3:43 AM, Franz Allan Valencia See wrote: > Good day, > > I am currently using lucene for my searches. And one of the problems that I'm > facing is when keyword is a url. The

Re: index demo throws LockObtainFailedException

2010-01-29 Thread Michael McCandless
Likely you'll have to modify the demo to use SimpleFSLockFactory -- NativeFSLockFactory (now the default for Lucene, as of 2.9) often does not work on NFS. Mike On Thu, Jan 28, 2010 at 8:15 PM, Teruhiko Kurosaka wrote: > We have many Linux machines of different brands, sharing the same NFS > fi

Re: lucene search

2010-01-29 Thread andy green
Thanks -- View this message in context: http://old.nabble.com/lucene-search-tp27358766p27367213.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.