Re: Java 17 and Lucene

2021-10-19 Thread Kevin Rosendahl
ombination than just emailing the user group, or is this our best bet in the future as well? Thanks again! Kevin On Tue, Oct 19, 2021 at 5:07 AM Michael Sokolov wrote: > > I would be a bit careful: On our Jenkins server running with AMD Ryzen CPU > it happens quite often that JDK 16, JDK 1

Java 17 and Lucene

2021-10-18 Thread Kevin Rosendahl
any other orgs using Java 17 with Lucene? - Any other considerations we should be aware of? Best, Kevin Rosendahl

Implementing Custom DoubleValues

2021-02-15 Thread Kevin Manuel
true' in this case?) I've read the Javadocs as well as multiple other questions on this topic on this channel, but it's still confusing to me. Appreciate your time and help. Thanks, Kevin

Re: [VOTE] Lucene logo contest, third time's a charm

2020-09-03 Thread Kevin Risden
A1, A2, D (binding) Kevin Risden On Thu, Sep 3, 2020 at 4:44 PM jim ferenczi wrote: > A1 (binding) > > Le jeu. 3 sept. 2020 à 07:09, Noble Paul a écrit : > >> A1, A2, D binding >> >> On Thu, Sep 3, 2020 at 7:22 AM Jason Gerlowski >> wrote: >> &

Custom DoubleValuesSource to Read from Multiple Indexed DocValue Fields

2020-07-16 Thread Kevin Manuel
urce or do I need one for reading from each of the indexed docValue fields and then use a combination of MultiFloatFunctions and SumFloatFunctions to achieve this? Appreciate your time and help. Thanks, Kevin

Re: Using FunctionScoreQuery vs CustomScoreQuery

2020-02-25 Thread Kevin Manuel
I see, thank you Adrien ! I'll look into it and get back to you if I have any questions. On Fri, Feb 21, 2020 at 1:45 AM Adrien Grand wrote: > Hi Kevin, > > FunctionScoreQuery can also work with dynamically-computed values, you just > need to provide it with a DoubleValuesSou

Using FunctionScoreQuery vs CustomScoreQuery

2020-02-20 Thread Kevin Manuel
t the above use case probably needs something more dynamic due to the distance calculation. Was wondering if you had any suggestions on how to achieve this or if maybe I'm misunderstanding something? Thanks, Kevin

Upper limit on Score

2019-04-17 Thread Kevin Manuel
Hi, I was just wondering: is there an upper limit to the score that can be generated for a non-constant score query? Thanks, Kevin

Re: Question about BytesRef and BinaryDocValues

2018-08-23 Thread Kevin Manuel
Hi Vadim, Thank you so much for your reply. I think you were right. So if a field is 'analyzed' how can I get both terms 'hey' and 'tom'? Thanks, Kevin On Thu, Aug 23, 2018, 20:26 Vadim Gindin wrote: > Hi Kevin! > > I think that your field is "ana

Question about BytesRef and BinaryDocValues

2018-08-23 Thread Kevin Manuel
wondering if there's something wrong with the way I'm accessing it or it was an issue in these versions. Thanks, Kevin

Sort merge strategy ?

2016-11-16 Thread Kevin Burton
What's the current status of the sort merge strategy? I want to sort an index by a given field and keep it in that order on disk. It seems to have evolved over the years and I can't easily figure out the current status via the Javadoc in 6.x.

Re: Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Kevin Burton
hen you are really concerned > with something else. > 500GB per day... additionally, disk is cheap, but IOPS are not. The more we can keep in ram and on SSD the better. And we're trying to get as much in RAM then SSD as possible... plus we have about 2 years of content. It adds up ;) Kevi

Possible to cause documents to be contiguous after forceMerge?

2016-11-15 Thread Kevin Burton
I have a large index (say 500GB) with a large percentage of near-duplicate documents. I have to keep the documents there (can't delete them) as the metadata is important. Is it possible to get the documents to be contiguous somehow? Once they are contiguous then they will compress very well

Modify the StandardTokenizerFactory to concatenate all words

2013-11-05 Thread Kevin
Currently I'm using StandardTokenizerFactory, which tokenizes the words based on spaces. For Toy Story it will create the tokens toy and story. Ideally, I would want to extend the functionality of StandardTokenizerFactory to create the tokens toy, story, and toy story. How do I do that?
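
A minimal sketch of the output the question asks for, written in plain Java with no Lucene calls: each word is emitted as a token, followed by every pair of adjacent words joined by a space. The class and method names are hypothetical and the whitespace splitting and lowercasing are assumptions. Inside Lucene, a token filter such as ShingleFilter is the usual place for this kind of logic; the code below only illustrates the intended result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: shows single words plus two-word shingles.
final class ShingleSketch {

    /** Returns the lowercased words of the text, then each adjacent word pair. */
    static List<String> tokensWithShingles(String text) {
        List<String> tokens = new ArrayList<>();
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return tokens; // nothing to tokenize
        }
        String[] words = normalized.split("\\s+");
        // Single-word tokens first, in document order.
        for (String word : words) {
            tokens.add(word);
        }
        // Then two-word shingles built from neighbouring words.
        for (int i = 0; i + 1 < words.length; i++) {
            tokens.add(words[i] + " " + words[i + 1]);
        }
        return tokens;
    }
}
```

For the input Toy Story this yields toy, story, and toy story, which matches the three tokens described in the message.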

Urgent! Forgot to close IndexWriter after adding Documents to the index.

2011-03-20 Thread Kevin Tse
Hi experts, I had a program running for 2 days to build an index for around 160 million text files, and after the program ended, I tried searching the index and found the index was not correctly built: *indexReader.numDocs()* returns 0. I checked the index directory, it looked good, all the index data

RE: CLucene and Lucene

2008-05-16 Thread Kevin Daly (kedaly)
From: Kevin Daly (kedaly) Sent: Friday, May 16, 2008 1:34 PM To: 'java-user@lucene.apache.org' Subject: CLucene and Lucene I have a question concerning interop between CLucene and Lucene. It is possible to have a C++ Application using CLucene

CLucene and Lucene

2008-05-16 Thread Kevin Daly (kedaly)
test where I can write/read to/from index using CLucene and Lucene. - Kevin.

index update problems with Linux

2008-01-18 Thread Kevin Dewi
) at de.gesichterparty.LuceneServlet.run(LuceneServlet.java:140) at java.lang.Thread.run(Thread.java:595) On Mac OS X Leopard this code works fine. Thanks Kevin

term location in doc

2007-08-08 Thread Kevin Chen
I can see that TermPositions gives an enum with all positions of a term in a document. I want to do the opposite. Given a position, can I query the document for the term at that position?
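
One way to picture the reverse lookup, as a hedged sketch in plain Java rather than Lucene code: if the positions of every term in one document are already available (the map below is a hypothetical stand-in for that data), they can be inverted once into a position-to-term map and then queried by position. The class and method names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: inverts term -> positions into position -> term for one document.
final class PositionIndexSketch {

    /** Builds a lookup from token position to the term found at that position. */
    static Map<Integer, String> invert(Map<String, List<Integer>> termPositions) {
        Map<Integer, String> byPosition = new HashMap<>();
        for (Map.Entry<String, List<Integer>> entry : termPositions.entrySet()) {
            for (int position : entry.getValue()) {
                // Each position holds one term in this simplified model.
                byPosition.put(position, entry.getKey());
            }
        }
        return byPosition;
    }
}
```

The inversion costs one pass over all positions in the document, so it only pays off when several positions are looked up per document.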

TermFreqVector

2007-07-19 Thread Kevin Chen
I need to use getTermFreqVector on a subset of docs that belong to the hits for a query. I understand I need to pass the docNumber as an argument in this case. How do I access that? For example: doc = hits.doc(0); TermFreqVector vector = reader.getTermFreqVector(docId, "field"); How do I get docI

Re : The localized Languages.

2007-06-21 Thread sejourne kevin
Thanks, I found it. I wasn't aware of both of those source trees. Kévin. - Original Message From: Doron Cohen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, 20 June 2007, 23:42:17 Subject: Re: The localized Languages. Hi Kevin, are you looking

The localized Languages.

2007-06-20 Thread sejourne kevin
Hi, It seems that all the localized language Analyzers are absent from org.apache.lucene.analysis.* in the latest 2.2 source release of Lucene. Is this normal or not? Regards, Kévin.

HELP: how to highlight the search key word in lucene's search results?

2006-08-11 Thread kevin
Hi, how do I highlight the search keyword in Lucene's search results? Please advise, thanks! - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: Index-Format difference between 1.4.3 and 2.0

2006-07-18 Thread kevin
Hi, how do I highlight the keyword in the search result summary? Can I use the /highlight/ package? Thanks!

RE: Commercial vendors monitoring this ML? was: Lucene Performance Issues

2006-03-28 Thread Runde, Kevin
ical RAM on the box. -Kevin -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 28, 2006 12:47 PM To: java-user@lucene.apache.org Subject: Commercial vendors monitoring this ML? was: Lucene Performance Issues Weird, I was just about to comment on the

RE: Vector Space Model <-> Probabilistic Model

2006-03-15 Thread Runde, Kevin
Hello, I recently came across this email in the Lucene user list and am interested in this article. I tried to access it from the link you provided, but couldn't find any link to access it. Do you still have an electronic copy? Thanks, Kevin Runde -Original Message- From: Ma

Re: Too many required clauses for a BooleanQuery

2006-02-09 Thread Kevin Dutcher
Thanks Hoss... You're absolutely right! Kevin On 2/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: > > > : I need all the documents returned from the search and am manipulating > the > : results with a custom HitCollector, therefore I can't use filters. >

Re: Too many required clauses for a BooleanQuery

2006-02-09 Thread Kevin Dutcher
> > One more thing: in case these queries are generated, you might > consider building the corresponding (nested) BooleanQuery yourself > instead of using the QueryParser. > > Regards, > Paul Elschot I'll give that a try. Thanks Paul.

Re: Too many required clauses for a BooleanQuery

2006-02-09 Thread Kevin Dutcher
ed all the documents returned from the search and am manipulating the results with a custom HitCollector, therefore I can't use filters. Kevin

Too many required clauses for a BooleanQuery

2006-02-08 Thread Kevin Dutcher
st scenario also. Is there any way around this error? As a side note, it is very unlikely that this will be encountered in the real world, but because we are dealing with content categorization it is still possible. Thanks in advance, Kevin

RE: Search clustering question

2005-11-23 Thread Runde, Kevin
Does anyone have examples of using Carrot2? I've been looking into it lately and am not finding good documentation. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 23, 2005 2:23 PM To: java-user@lucene.apache.org Subject: Re: Search clusteri

RE: Help with Search Java Code set up

2005-10-26 Thread Kevin L. Cobb
e discerning, so that the term "cat" returns a hit but less than 100%. The term "big green cat" should return 100%, the term "big green" or "green big" should return something less than 100% and the term "big" or "green" or "cat"

RE: Help with Search Java Code set up

2005-10-26 Thread Kevin L. Cobb
er to build the query for the keywordField (only one field to search) 4. Can I combine these separate queries together into one? -Kevin -Original Message- From: Jeff Rodenburg [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 26, 2005 1:04 PM To: java-user@lucene.apache.org Subject

Help with Search Java Code set up

2005-10-26 Thread Kevin L. Cobb
the keyword field. At this point, I'm thinking that I'll need to do two distinct searches, one using the search term in what I'm calling my searchable fields, and the other using the other term in the keyword field. Then join the two HIT lists together. Looking for some advice. Thanks, Kevin

Re: Is Lucene right for my app?

2005-09-18 Thread Kevin Stembridge
ow grease on my part. Thanks very much for the advice. Cheers, Kevin Jeff Rodenburg wrote: Kevin - You've come to the right list to get information to help you make a decision. That said, the responsible answer to your question will be "it depends". The supporter in me s

Is Lucene right for my app?

2005-09-18 Thread Kevin Stembridge
t direction. Either way I would be very grateful for any advice. Cheers, Kevin

Serialized Java Objects

2005-08-25 Thread Kevin L. Cobb
ot be searchable. Thanks, -Kevin

Re: Lucene and Xanga.com

2005-08-25 Thread Kevin Burton
n optimizing REALLY huge indexes like 300G or so. Then you run out of available system memory (4G on 32-bit machines) and you hit disk. Then it starts to take weeks to optimize :-) Of course you could use multiple machines or get more memory. Kevin -- Kevin A. Burton, Location - San Francis

Re: NGram Language Categorization Source

2005-08-21 Thread Kevin Burton
it lacking. I started off just trying to find a library to use in our crawler but never found anything. Which is why I ended up writing my own. > Of these, the Nutch one is certainly under active development, the > others don't seem to be as far as I can tell. They should just use ngramcat

Re: NGram Language Categorization Source

2005-08-21 Thread Kevin Burton
Yes. We don't handle the mixed language case very well. The chunking method is something I wanted to approach. > So, there is still a lot to do in this area, if you come up with some > unique way of improving LI performance... Maybe I'm being dense but what is LI performance?

Re: NGram Language Categorization Source

2005-08-21 Thread Kevin Burton
hat's a good place to find out about multilingual > corpora. Yeah. That was my biggest problem. This area had never really been solved in the OSS world.

NGram Language Categorization Source

2005-08-20 Thread Kevin Burton
ld use language categorization to help deal with the chaos of tagging and full-text search. Google has done this for a long time now and Technorati has it in beta. http://www.feedblog.org/2005/08/ngram_language_.html

RE: New Site Live Using Lucene

2005-08-08 Thread Kevin L. Cobb
Open Source C/C++ only? When are you going to include Open Source Java? We demand fair treatment ;) -Original Message- From: Robert Schultz [mailto:[EMAIL PROTECTED] Sent: Sunday, August 07, 2005 6:18 PM To: java-user@lucene.apache.org Subject: New Site Live Using Lucene Not sure if

Re: Optimizing indexes with multiple processors?

2005-06-10 Thread Kevin Burton
y creating a 5G file and then cat-ing that to /dev/null but I have no way to verify that this actually works. I just made the BUFFER_SIZE variables non-final so that I can set them at any time. Kevin -- Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com. See irc.freenode.net #rojo if y

Re: Optimizing indexes with multiple processors?

2005-06-10 Thread Kevin Burton
f anyone has successfully increased the FSOutputStream and FSInputStream buffers and got it not to blow up on array copies I would love to know the shortcut. Maybe that was my problem... Kevin

Re: Optimizing indexes with multiple processors?

2005-06-09 Thread Kevin Burton
indexing. I'm more interested in merging multiple indexes... Kevin

Re: Optimizing indexes with multiple processors?

2005-06-09 Thread Kevin Burton
Bill Au wrote: Optimize is disk I/O bound. So I am not sure what multiple CPUs will buy you. Now on my system with large indexes... I often have the CPU at 100%... Kevin

Optimizing indexes with multiple processors?

2005-06-09 Thread Kevin Burton
Is it possible to get Lucene to do an index optimize on multiple processors? It's a single-threaded algorithm currently, right? It's a shame since I have a quad machine but I'm only using 1/4th of the capacity. That's a heck of a performance hit. Kevin

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-09 Thread Kevin Burton
Andrew Boyd wrote: Kevin, Those results are awesome. Could you please give those of us that were following but not quite understanding everything some pseudo code or some more explanation? Ug.. I hate to say this but ignore these numbers. Turns out that I was hitting a cache ... I

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-07 Thread Kevin Burton
e to the filesystem buffer cache but I can't imagine why they'd be faster in the second round. It might be that Linux is deciding not to buffer the document blocks. Kevin

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-07 Thread Kevin Burton
searches on 20 TermQueries. Actually.. it wasn't... :-/ It was about 4x slower. Ug... Kevin

use of LinkedList in ConjunctionScorer hurting performance?

2005-06-07 Thread Kevin Burton
this should be fast ... maybe we're calling it too often? I didn't have much time to look at it but I wanted to illuminate the issue. Kevin

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
o one responded. So it seems like my bottleneck is in seek() so it would make sense to figure out how to limit this. Is this O(log(N)) btw or is it O(N)? Kevin

Re: Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
s? I just assumed that termDocs was already sorted... I don't see any mention of this in the API... Kevin

Fastest way to fetch N documents with unique keys within large numbers of indexes..

2005-06-06 Thread Kevin Burton
as I'm on a SCSI RAID array at RAID0 on FAST SCSI disks... I also tried tweaking InputStream.BUFFER_SIZE with no visible change in performance. Kevin

Performance tuning and org.apache.lucene.store.InputStream.BUFFER_SIZE

2005-06-01 Thread Kevin Burton
other filesystems? I know that XFS is 4096. What about ext2? ext3? JFS? ReiserFS? NTFS? UFS? etc.... Kevin

Re: Ability to load a document with ONLY a few fields for performance?

2005-06-01 Thread Kevin Burton
ing fun to do tomorrow! w00t! Kevin

Re: Finding minimum and maximum value of a field?

2005-05-31 Thread Kevin Burton
rc.freenode.net #rojo if you want to chat. Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html Kevin A. Burton, Location - San Francisco, CA AIM/YIM - sfburtonator, Web - http://peerfear.org/ GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412 -

Re: Finding minimum and maximum value of a field?

2005-05-31 Thread Kevin Burton
"); String maxDateString = maxDoc.get("dateField"); This certainly is an interesting solution. How would Lucene score this result set? The first and last will depend on the score... I guess I can build up a quick test. Kevin

Finding minimum and maximum value of a field?

2005-05-31 Thread Kevin Burton
ideas? Kevin

Possible to find min and max values for a Date field?

2005-05-30 Thread Kevin Burton
Is it possible to find the minimum and maximum values for a date field with a given reader? I guess I could use TermEnum to do a binary search until I get a hit but this seems a bit kludgy. Thoughts? I don't see any APIs for doing this and a google/grep of the source doesn't he
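
As a rough illustration of the idea behind this question, the sketch below uses a plain Java TreeSet as a stand-in for a sorted list of a field's terms. It assumes the date values are stored as strings whose lexicographic order matches chronological order (for example yyyyMMdd); under that assumption the first and last terms are the minimum and maximum, so no binary search is needed. The class and method names are hypothetical and no Lucene API is used.

```java
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical helper: finds the smallest and largest term of a field.
final class MinMaxTermSketch {

    /** Returns {min, max} for the given terms, or an empty array if there are none. */
    static String[] minMax(Collection<String> terms) {
        if (terms.isEmpty()) {
            return new String[0];
        }
        // The TreeSet keeps terms in lexicographic order, like a sorted term list.
        TreeSet<String> sorted = new TreeSet<>(terms);
        return new String[] { sorted.first(), sorted.last() };
    }
}
```

If the date encoding does not sort lexicographically in date order, this shortcut does not apply and the values would need to be parsed and compared instead.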

Ability to load a document with ONLY a few fields for performance?

2005-05-28 Thread Kevin Burton
API for doing this and that I'd have to dive into SegmentReader stuff. Any idea?

RE: Best way to purposely corrupt an index?

2005-04-20 Thread Kevin L. Cobb
My policy on this type of exception handling is to only bite off what you can chew. If you catch an IOException, then you simply report to the user that an unexpected error has occurred and the search engine is unobtainable at the moment. Errors should be logged and developers should look at the sp
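
The policy described in this message could look roughly like the following sketch: catch the IOException at the boundary, log it for developers, and hand the user a generic message. The Searcher interface, the class name, and the message text are all hypothetical stand-ins, not Lucene types.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper around a search call that may fail with an I/O error.
final class SearchFacade {

    /** Stand-in for whatever component actually runs the search. */
    interface Searcher {
        String search(String query) throws IOException;
    }

    static final String UNAVAILABLE =
            "An unexpected error has occurred; search is unavailable at the moment.";

    private static final Logger LOG = Logger.getLogger(SearchFacade.class.getName());

    /** Runs the search; on IOException logs the cause and returns a generic message. */
    static String safeSearch(Searcher searcher, String query) {
        try {
            return searcher.search(query);
        } catch (IOException e) {
            // Keep the details for developers; do not attempt recovery here.
            LOG.log(Level.SEVERE, "Search failed for query: " + query, e);
            return UNAVAILABLE;
        }
    }
}
```

Returning a message string keeps the example short; a real application would more likely return a result object or rethrow an application-level exception.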

RE: Lucene bulk indexing

2005-04-19 Thread Kevin L. Cobb
I think your bottleneck is most likely the DB hit. I assume by 2 products you mean 2 distinct entries into the Lucene Index, i.e. 2 rows in the DB to select from. I index about 1.5 million rows from a SQL Server 2000 database with several fields for each entry and it finishes in about

RE: How do you make "protected content" searchable by Google?

2005-03-17 Thread Kevin L. Cobb
I worked on a website that had the same issue. We made a "search engine" page that listed all the documents that we wanted to index as links to documents that contained summaries of those documents with links to the entire document on the limited access site - Google won't be able to follow these l