[ANNOUNCE] Apache Lucene 10.2.2 released

2025-06-20 Thread Chris Hegarty
available in the change log available at: http://lucene.apache.org/core/10_2_2/changes/Changes.html -Chris. - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h

[ANNOUNCE] Apache Lucene 9.12.2 released

2025-06-20 Thread Chris Hegarty
://lucene.apache.org/core/9_12_2/changes/Changes.html. -Chris.

[ANNOUNCE] Apache Lucene 10.2.1 released

2025-05-01 Thread Chris Hegarty
/10_2_1/changes/Changes.html -Chris.

[ANNOUNCE] Apache Lucene 9.12.1 released

2024-12-13 Thread Chris Hegarty
log available at: http://lucene.apache.org/core/9_12_1/changes/Changes.html. -Chris.

Confused by DiversifyingChildrenFloatKnnVectorQuery javadocs

2024-10-21 Thread Chris Hostetter
I believe I understand the *purpose* of the DiversifyingChildrenFloatKnnVectorQuery (and DiversifyingChildrenByteKnnVectorQuery) classes, but what I don't understand is the java code example from the javadocs... https://lucene.apache.org/core/9_12_0/join/org/apache/lucene/search/join/Divers

[ANNOUNCE] Apache Lucene 9.12.0 released

2024-09-28 Thread Chris Hegarty
features and changes: https://lucene.apache.org/core/9_12_0/changes/Changes.html -Chris.

[ANNOUNCE] Apache Lucene 9.9.2 released

2024-01-29 Thread Chris Hegarty
Lucene99HnswScalarQuantizedVectorsFormat (Ben Trent) * GITHUB#13014: Rollback the tmp storage of BytesRefHash to -1 after sort (Guo Feng) Further details of changes are available in the change log available at: http://lucene.apache.org/core/9_9_2/changes/Changes.html. -Chris

[ANNOUNCE] Apache Lucene 9.9.1 released

2023-12-16 Thread Chris Hegarty
. This patch release contains bug fixes that are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/downloads.html Lucene 9.9.1 Release Highlights Bug fixes • JVM SIGSEGV crash when compiling computeCommonPrefixLengthAndBuildHistogram (Chris

[ANNOUNCE] Apache Lucene 9.9.0 released

2023-12-04 Thread Chris Hegarty
CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/9_9_0/changes/Changes.html -Chris.

Re: TermVectorOffsetStrategy producing Passages with matches out of order? (causing IndexOutOfBoundsException)

2023-07-04 Thread Chris Hostetter
I hacked up the test a bit so it would compile against 9.0 and confirmed the problem existed there as well. So going back a little farther with some manual bisection (to account for the transition from ant to gradle) led me to the following... # first bad commit: [2719cf6630eb2bd7cb37d0e8462

Re: TermVectorOffsetStrategy producing Passages with matches out of order? (causing IndexOutOfBoundsException)

2023-06-29 Thread Chris Hostetter
With some trial and error I realized two things... 1) the order of the terms in the BooleanQuery seems to matter - but in terms of their "natural order", not the order in the doc (which is why i was so confused trying to reproduce it) 2) the problem happens when using termVectors but

TermVectorOffsetStrategy producing Passages with matches out of order? (causing IndexOutOfBoundsException)

2023-06-29 Thread Chris Hostetter
I've got a user getting java.lang.IndexOutOfBoundsException from the UnifiedHighlighter in Solr 9.1.0 w/Lucene 9.3.0 (And FWIW, this same data, w/same configs, in 8.11.1, purportedly didn't have this problem) I don't really understand the highlighter code very well, but AFAICT: - Defaul

Re: Reproducible crash matching phrases

2021-02-10 Thread Chris Hostetter
: I'm attaching an updated file as well this this changes. : : This happens in Lucene 8.8.0 (and probably since 8.4.0). Ok -- cool ... with the updated code i was able to reproduce on branch_8x, and with 8.8 & 8.7 (but not 8.4) -- I've distilled your patch into a test case and attached to a n

Re: Reproducible crash matching phrases

2021-02-10 Thread Chris Hostetter
: I've been able to reproduce a crash we are seeing in our product with newer : Lucene versions. Can you be specific? What exact versions of Lucene are you using that reproduces this failure? If you know of other "older" versions where you can't reproduce the problem, that info would also be

Re: explainOther SOLR concept?

2019-06-27 Thread Chris Hostetter
: It’s a Solr-only param for adding to debug=true…. at the Lucene level it's just calling the explain() method on an arbitrary docId regardless of whether that doc appears in the topN results for that query (or if it matches the query at all) -Hoss http://www.lucidworks.com/

Re: Question about usage of LuceneTestCase

2018-08-27 Thread Chris Hostetter
: Current version of Luke supports FS based directory implementations only. : (I think it will be better if future versions support non-FS based custom : implementations, such as HdfsDirectoryFactory for users who need it.) : Disabling the randomization, at least for now, sounds reasonable to me

Re: Practical usages of arbitrary Shingles when using a query parser?

2018-07-31 Thread Chris Hostetter
: The query parser is confused by these overlapping positions indeed, which : it interprets as synonyms. I was going to write that you should set the Sure -- i'm not blaming the QueryParser, what it does with the Shingles output makes sense (and actually works! .. just not as efficiently as poss

Practical usages of arbitrary Shingles when using a query parser?

2018-07-30 Thread Chris Hostetter
Although I've been aware of Shingles and some of the useful applications for a long time, today is the first time i really sat down and tried to do something non-trivial with them myself. My objective seems relatively straightforward: given a corpus of text and some analyzer (for sake of dis

Re: Size of Document

2018-07-05 Thread Chris Hostetter
: Subject: Size of Document : To: java-user@lucene.apache.org : References: : : : Message-ID: : In-Reply-To: : https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing me

Re: Size of Document

2018-07-04 Thread Chris Bamford
. Thanks everyone. Chris > On 5 Jul 2018, at 03:31, Erick Erickson wrote: > > I think we're not talking about the same thing. > > You asked "How can I calculate the total size of a Lucene Document"... > > I was responding to the Terry's comment "

Re: Size of Document

2018-07-04 Thread Chris Bamford
Hi Erick Yes, size on disk is what I’m after as it will feed into an eventual calculation regarding actual bytes written (not interested in the source data document size, just real disk usage). Thanks Chris Sent from my iPhone > On 4 Jul 2018, at 17:08, Erick Erickson wrote: > >

Re: Size of Document

2018-07-04 Thread Chris Bamford
> IndexWriter.ramBytesUsed() gives you access to the current memory usage of > IndexWriter's buffers, but it can't tell you by how much it increased for a > given document assuming concurrent access to the IndexWriter. > Thanks, although I can’t find that API. Is there an equivalent call for Lucen

Re: Size of Document

2018-07-04 Thread Chris Bamford
maybe some sneaky way of peeking inside the IndexWriter before and after a write to compare buffer sizes? Thanks Chris > Le mer. 4 juil. 2018 à 11:26, Chris and Helen Bamford a > écrit : > >> Hi there, >> >> How can I calculate the total size of a Lucene Document tha

Size of Document

2018-07-04 Thread Chris and Helen Bamford
Hi there, How can I calculate the total size of a Lucene Document that I'm about to write to an index so I know how many bytes I am writing please? I need it for some external metrics collection. Thanks - Chris

Re: analyzer context during search

2018-04-13 Thread Chris Tomlinson
analyzers seems to be likely non-performant. LUCENE-8240 did not appear to me to be a solution direction. Thanks, Chris > On Apr 12, 2018, at 5:24 AM, Michael Sokolov wrote: > > I think you can achieve what you are asking by having a field for every > possible combination of pairs

analyzer context during search

2018-04-11 Thread Chris Tomlinson
sort of scenario has been solved by others numerous times but I’m stumped as to how to implement. Thanks in advance for any help, Chris

upgrading to lucene 5.5.5

2018-04-05 Thread Chris Salem
oost(wildQuery.getBoost()); return query; } The problem is the clause.getQuery() doesn't return a TermQuery anymore, it returns a BoostQuery. How would I get it to return a TermQuery? Or how would I get the term from a BoostQuery? Thanks for your help. Thanks, Chris Salem

Synonyms with multiple alternatives

2017-11-15 Thread Chris . Hill
I am using Lucene 4.8 (.net flavour) and cannot find a decent working example to answer my issue. In our source data we have lots of similar items that can be described in the same way - for example "lawnmower", "lawn mower" & "grass cutter". Obviously we have no control over how people choose

Re: ClassicAnalyzer Behavior on accent character

2017-10-26 Thread Chris Hostetter
Classic is ... "classic" ... it exists largely for historical purposes to provide a tokenizer that does exactly what the javadocs say it does (regarding punctuation, "product numbers", and email addresses), so that people who depend on that behavior can continue to rely on it. Standard is ...

DocValues and SearcherManager

2017-10-20 Thread Chris and Helen Bamford
ory.open(new File(indexPath)); BinaryDocValues docValues2 = MultiDocValues.getBinaryValues(DirectoryReader.open(newDirectory), "id"); Assert.assertNotSame(null, docValues2); if(newDirectory != null){ newDirectory.close(); } } } Can anyone advise? Thanks - Chris

Re: payload at the document level

2017-10-05 Thread Chris Hostetter
what you're describing is essentially just DocValues -- for each document, you can have an arbitrary bytes[] (or number, or sorted list of numbers), and you could write a custom query/similarity/collector that can access that "docvalue" at search time to decide if it's a match (or how to score

Re: LongPoint.newRangeQuery results differ from LegacyNumericRangeQuery.newLongRange

2017-07-24 Thread Chris Hostetter
The Points data structures are completely different and distinct from the Term Index structures used by LegacyNumeric fields -- just having the backwards codecs (or using merges to convert indexes to the new index format) isn't enough -- you have to reindex. -Hoss http://www.lucidworks.com/

Re: Changing the default FSLockFactory implementation

2017-05-31 Thread Chris Hostetter
: We are experiencing some “Lock obtain timed out: NativeFSLock@” issues : on our NFS file system, could someone please show me, what’s the right : way to switch the Lucene default NativeFSLockFactory to : SimpleFSLockFactory? You can specify the LockFactory used when opening your Directory...

RE: Un-used index files are not getting released

2017-05-11 Thread Chris Hostetter
: We do not open any IndexReader explicitly. We keep one instance of : IndexWriter open (and never close) and for searching we use : SearcherManager. I checked the lsof and did not find any files with : delete status. what exactly does your SearcherManager usage look like? is every searcher =

Re: will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?

2017-04-21 Thread Chris Hostetter
: Lucene by default will search all segments, because it does not know that : your field is a primary key. : : Trejkaz's suggestion to early-terminate should work well. You could also : write custom code that uses TermsEnum on each segment. Before you go too far down the rabbit hole of writing a

Re: How to get document effectively. or FieldCache example

2017-04-21 Thread Chris Hostetter
: then which one is right tool for text searching in files. please can you : suggest me? so far all you've done is show us your *indexing* code; and said that after you do a search, calling searcher.doc(docid) on 500,000 documents is slow. But you still haven't described the usecase you are tr

Re: Automata and Transducer on Lucene 6

2017-04-19 Thread Chris Hostetter
: pairs). It is this kind of : "high-level goal" I asked about. Your answer only adds to the mystery: https://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an "XY Problem" ... that is: you are dealing with "X", you are assuming "Y" will help you, and you are asking

Limiting terms / field

2017-03-20 Thread Chris Bamford
best way we can do this? I have found some references to a class called LimitTokenCountFilter, but I believe it is only found in later versions. Thanks - Chris

Re: Blog post about upcoming Lucene 7.0 major release changes

2017-03-15 Thread Chris Bamford
Thanks Mike, looking forward to it! Great work folks. Chris Sent from my iPhone > On 15 Mar 2017, at 21:19, Adrien Grand wrote: > > Excellent! > > Le mer. 15 mars 2017 à 15:46, Michael McCandless > a écrit : > >> Hi all, >> >> I just posted a blog po

Re: Index size variation

2017-03-04 Thread Chris Bamford
Thanks Uwe, very useful indeed. Chris > On 3 Mar 2017, at 18:37, Uwe Schindler wrote: > > Hi Chris, > > as always: "it depends". Generally I would reserve space of approximately the > "original" index size. Most indexes that are continuously updated

Index size variation

2017-03-03 Thread Chris Bamford
Hello I have observed that sometimes my index size temporarily increases by a large amount, presumably while it merges segments. Is there some documentation on this subject? I am trying to estimate total disk space I'll need for a project. Thanks

Dealing with index format changes

2017-01-28 Thread Chris Bamford
tively enable/disable position increments at query time, is that correct? Is there anything else to consider? I assume that if I want to revert the format with the new indexer I would need to switch positions off? Thanks Chris

RE: question

2017-01-19 Thread Chris Hostetter
: Yes, they should be the same unless the field is indexed with shingles, in that case order matters. : Markus just to clarify... The examples provided show *strings* which would have to be parsed into Query objects by a query parser. the *default* QueryParser will produce queries that resul

Re: Where did earthDiameter go?

2017-01-12 Thread Chris Hostetter
I don't know the rhyme/reason but it looks like it was removed (w/o any deprecation first i guess) as part of LUCENE-7123 in commit: ce3114233bdc45e71a315cb6ece64475d2d6b1d4 in that commit, existing callers in the lucene code base were changed to use "2 * GeoProjectionUtils.SEMIMAJOR_AXIS" (o

Request to add CXAIR to 'PoweredBy' page on wiki.apache.org

2017-01-11 Thread Chris Lewis
(ChrisLewis) to be registered as a contributor on the wiki, or for a user to add ‘CXAIR’ as a tool on the ‘PoweredBy’ page? More information on CXAIR is available on our website at http://www.connexica.com Kind regards, Chris Lewis Bid Manager Connexica Limited

Re: Problem sorting long integers

2016-12-13 Thread Chris Hostetter
How are you constructing your SortField at query time? Are you sure you are using SortField.Type.LONG ? Can you show us some minimally self contained reproducible code demonstrating your problem? (ie: create an index with 2 docs, then do a simple search for both and sort them and show that th

Re: How exclude empty fields?

2016-11-16 Thread Chris Hostetter
: The issue I have is that some promotions are permanent so they don't have : an endDate set. : : I tried doing: : : ( +Promotion.endDate:[210100TOvariable containing yesterday's date] : || -Promotion.endDate:* ) 1) mixing prefix ops with "||" like this is most certainly not doing what

Re: Luke alternative

2016-11-10 Thread Chris Bamford
Hi Erick, Good to know, I'll try and help if I can. No Solr here, though, just pure Lucene. Best, Chris Sent from my iPhone > On 10 Nov 2016, at 15:56, Erick Erickson wrote: > > Please do work with Alan, he does good stuff ;)... > > In the meantime, you migh

Luke alternative

2016-11-10 Thread Chris Bamford
Hi I recently heard about an alternative (API?) to Luke for examining indexes. Can someone please point me to it? Thanks Chris

Re: Unsubscribing problems

2016-09-07 Thread Chris Hostetter
Peyman: I'll contact you off list to try and address your specific problem. As a general reminder for all users: If you need help with the mailing list, step #1 should be to email the automated help system via java-user-help@lucene (identified in the Mailin-List and List-Help mail MIME header

Re: BooleanQuery rewrite optimization

2016-08-08 Thread Chris Hostetter
Off the top of my head, i think any optimization like that would also need to account for minNrShouldMatch, wouldn't it? if your query is "(X Y Z #X)" w/minshouldmatch=2, and you rewrite that query to "(+X Y Z)" w/minshouldmatch=2 you now have a semantically diff query that won't match as many
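
The difference described in this message can be seen in a toy model. The following is a minimal sketch, not Lucene code: `MinShouldMatchSketch` and its methods are hypothetical names, documents are modelled as plain term arrays, and it assumes that the minimum-should-match count applies only to optional (SHOULD) clauses while filter (`#`) and required (`+`) clauses simply have to match.

```java
class MinShouldMatchSketch {
    // True if the document's term list contains the given term.
    static boolean contains(String[] docTerms, String term) {
        for (String t : docTerms) {
            if (t.equals(term)) {
                return true;
            }
        }
        return false;
    }

    // Toy boolean query: every required term must be present, and at least
    // minShouldMatch of the optional terms must be present.
    static boolean matches(String[] docTerms, String[] required,
                           String[] optional, int minShouldMatch) {
        for (String term : required) {
            if (!contains(docTerms, term)) {
                return false;
            }
        }
        int optionalHits = 0;
        for (String term : optional) {
            if (contains(docTerms, term)) {
                optionalHits++;
            }
        }
        return optionalHits >= minShouldMatch;
    }

    public static void main(String[] args) {
        String[] doc = {"X", "Y"};
        // "(X Y Z #X)" with minShouldMatch=2: X is both optional and a filter.
        boolean original = matches(doc, new String[] {"X"},
                new String[] {"X", "Y", "Z"}, 2);
        // "(+X Y Z)" with minShouldMatch=2: X no longer counts as optional.
        boolean rewritten = matches(doc, new String[] {"X"},
                new String[] {"Y", "Z"}, 2);
        System.out.println("original matches:  " + original);
        System.out.println("rewritten matches: " + rewritten);
    }
}
```

In this model a document containing only X and Y satisfies the original query, because X and Y are two optional hits, but not the rewritten one, where only Y counts toward the minimum.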

Re: disable field length normalization on specific fields?

2016-03-28 Thread Chris Hostetter
yep, just use a customized similarity that doesn't include a length factor when computing the norm. If you are currently using TFIDFSimilarity (or one of its subclasses) then the computeNorm method delegates to a lengthNorm method, and you can override that to return "1" for fields with a cert

Re: 500 millions document for loop.

2015-11-15 Thread Chris Hostetter
: public void collect(int docID) throws IOException { : Document doc = indexSearcher.doc(docID, loadFields); : found.found(doc); : } Based on your description of the calculation you are doing

Re: Lucene 4.x -> 5.x: Converting FieldValueFilter to FieldValueQuery

2015-11-05 Thread Chris Hostetter
: > The fact that you need to index doc values is related to another change in : > which we removed Lucene's FieldCache and now recommend to use doc values : > instead. Until you reindex with doc values, you can temporarily use : > UninvertingReader[1] to have the same behaviour as in Lucene 4.x.

Re: sizes of non-fdt flies affected by compression settings

2015-11-04 Thread Chris Hostetter
: This setting can only affect the size of the fdt (and fdx) files. I suspect : you saw differences in the size of other files because it caused Lucene to : run different merges (because segments had different sizes), and the : compression that we use for postings/terms worked better, but it could

Re: sizes of non-fdt flies affected by compression settings

2015-11-04 Thread Chris Hostetter
: This setting can only affect the size of the fdt (and fdx) files. I suspect : you saw differences in the size of other files because it caused Lucene to : run different merges (because segments had different sizes), and the : compression that we use for postings/terms worked better, but it could

Re: Pagination using searchAfter

2015-09-04 Thread Chris Hostetter
: I want to use the searchAfter API in IndexSearcher. This API takes ScoreDoc as : argument. Do we need to store the last ScoreDoc value (ScoreDoc value from : previous search)? When multiple users perform search, then it might be : difficult to store the last ScoreDoc value. : : I guess, docid v

Re: Getting a proper ID value into every document

2015-06-05 Thread Chris Hostetter
: If you cannot do this for whatever reason, I vaguely remember someone : posting a link to a program they'd put together to do this for a : docValues field, you'd have to search the archives to find it. It was Toke - he generated DocValues for an existing index by writing an IndexReader Filter

Re: multi valued facets

2015-06-04 Thread Chris Hostetter
: Set the field to multiValued="true" in your schema. How'd you manage to : get multiple values in there without an indexing error? An existing : index built with Lucene directly? Erik: this isn't a Solr question -- the error message mentioned comes from the lucene/facets FacetsConfig class.

Re: Specifying a Version vs. not specifying a Version

2015-05-29 Thread Chris Hostetter
: Now StandardTokenizer(Version, Reader) is deprecated and the docs say : to use StandardTokenizer(Reader) instead. But I can't do that, because : that constructor hardcodes Version.LATEST, which will break backwards : compatibility in the future (its Javadoc even confirms that this is : the case.

Re: BytesRef violates the principle of least astonishment

2015-05-20 Thread Chris Hostetter
: I already know how Object#clone() works: May i humbly suggest that you: a) relax a bit; b) keep reading the rest of the javadocs for that method? : As BytesRef#clone() is overriding Object#clone(), I expect it to : comply with that. BytesRef#clone() functions virtually identical to the way O
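
For readers following this thread: `Object#clone()` performs a shallow copy, so a cloned object that holds an array reference shares that array with the original. A minimal sketch of that behaviour, using a hypothetical `ByteSlice` class as a stand-in (it is not Lucene's `BytesRef`, and the field names are assumptions for illustration only):

```java
class ShallowCloneSketch {
    // Hypothetical stand-in for a "reference into a byte array"; NOT Lucene's BytesRef.
    static final class ByteSlice implements Cloneable {
        byte[] bytes;
        int offset;
        int length;

        ByteSlice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public ByteSlice clone() {
            try {
                // Object#clone copies field values, so the array reference is shared.
                return (ByteSlice) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    // Returns true if a write through the clone is visible through the original.
    static boolean cloneSharesBackingArray() {
        ByteSlice original = new ByteSlice(new byte[] {1, 2, 3}, 0, 3);
        ByteSlice copy = original.clone();
        copy.bytes[0] = 42;
        return original.bytes[0] == 42;
    }

    public static void main(String[] args) {
        System.out.println("clone shares backing array: " + cloneSharesBackingArray());
    }
}
```

A caller that needs an independent copy of the bytes has to copy the array explicitly rather than rely on a shallow clone.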

Lucene/Solr Revolution 2015 - Austin Oct 13-16 - CFP ends next Week

2015-04-30 Thread Chris Hostetter
(cross posted, please confine any replies to general@lucene) A quick reminder and/or heads up for those who haven't heard yet: this year's Lucene/Solr Revolution is happening in Austin Texas in October. The CFP and Early bird registration are currently open. (CFP ends May 8, Early Bird ends

Re: Lucene indexing speed on NVMe drive

2015-04-30 Thread Chris Hostetter
: Hi. I am studying Lucene performance and in particular how it benefits from faster I/O such as SSD and NVMe. : parameters as used in nightlyBench. (Hardware: Intel Xeon, 2.5GHz, 20 : processor ,40 with hyperthreading, 64G Memory) and study indexing speed ... : I get best performance

Re: Filters execution efficiency

2015-03-26 Thread Chris Hostetter
FWIW: If you're reading LIA, part of your confusion may be that Filters, and when/how they are factored into iterating over scorers, has changed significantly over the years. : Date: Fri, 27 Mar 2015 00:45:14 +0100 : From: Adrien Grand : Reply-To: java-user@lucene.apache.org : To: java-user@lu

Re: Filtering question

2015-03-11 Thread Chris Bamford
Field:"foo" OR newField:"foo" Where oldField is a StringField and newField is a BinaryDocValues. I must add that a full reindex all in one go is currently not an option, so the solution must support this mixed mode. Any thoughts on how this could be best achieved ..? Thanks C

Re: Filtering question

2015-03-11 Thread Chris Bamford
Additional - I'm on lucene 4.10.2 If I use a BooleanFilter as per Ian's suggestion I still get a null acceptDocs being passed to my NDV filter. Sent from my iPhone > On 11 Mar 2015, at 17:19, Chris Bamford wrote: > > Hi Shai > > I thought that might be what acce

Re: Filtering question

2015-03-11 Thread Chris Bamford
Hi Shai I thought that might be what acceptDocs was for, but in my case it is null and throws a NPE if I try your suggestion. What am I doing wrong? I'd like to really understand this stuff .. Thanks Chris > On 11 Mar 2015, at 13:05, Shai Erera wrote: > > I don'

Filtering question

2015-03-10 Thread Chris Bamford
ilter >> " + matchTag + " matched " + i + " [" + strval + "]"); } } } } return new DVDocSetId(bitSet);// just wraps a FixedBitSet } }

Re: Eclipse Compiled lucene-core-5.0.0.jar Not Working in Solr

2015-03-09 Thread Chris Hostetter
: If you need to make changes to an existing 4.10 installation, pull down the 4.10 : source code and work from _that_, which you can do with something like: based on the error, i don't think he's trying to drop the lucene-core-5.0.0.jar into a Solr 4 install -- i suspect he's compiled & built

Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Chris Hostetter
: I seem to remember reading that certain versions of lucene were : incompatible with some java versions although I cannot find anything to : verify this. As we have tens of thousands of large indexes, backwards : compatibility without the need to reindex on an upgrade is of prime : importance

REMINDER: ApacheCon 2015 Call For Papers Ends This Week (February 1st)

2015-01-26 Thread Chris Hostetter
(cross posted, please confine replies to general@lucene) ApacheCon 2015 Will be in Austin Texas April 13-17. http://apachecon.com/ The Call For Papers is currently open, but it ends 2015-02-01 (11:55PM GMT-0600) https://events.linuxfoundation.org/events/apachecon-north-america/progra

Re: Details on setting block parameters for Lucene41PostingsFormat

2015-01-13 Thread Chris Hostetter
: : The first int to Lucene41PostingsFormat is the min block size (default : 25) and the second is the max (default 48) for the block tree terms : dict. we were discussing over on the solr-user mailing list how Tom would/could go about configuring Solr to use a custom subclass of Lucene41Postin

RE: Looking for docs that have certain fields empty (an/or not set)

2015-01-07 Thread Chris Hostetter
: In Lucene you don't need to use a query parser for that, especially : because range Queries is suboptimal and slow: There is already a very : fast query/filter available. Ahmet Arslan already mentioned that, we had : the same discussion a few weeks ago: : http://find.searchhub.org/document/a

ANNOUNCE: CFP and Travel Assistance now open for ApacheCon North America 2015

2014-12-16 Thread Chris Hostetter
(NOTE: cross posted to several lucene lists, if you have replies, please confine them to general@lucene) -- Forwarded message -- In case you've missed it: - ApacheCon North America returns to Austin, Texas, 13-17 April 2015 http://apachecon.com/ - Call for Papers open until

Re: Compiling and running Lucene/Solr based on github does not seem to work

2014-12-05 Thread Chris Hostetter
For future questions about solr, please use solr-user@lucene ... : ant compile : ant test : : successfully. Also Jetty seems to startup fine, but when I access : : http://localhost:8983/solr/ : : then I receive ... Note the "Instructions for Building Apache Solr from Source" section

Re: How best to compare tow sentences

2014-12-04 Thread Chris Hostetter
: For a number of years I've been doing this for some time by creating a : RAMDirectory, creating a document for one of the sentence and then doing a : search using the other sentence and seeing if we get a good match. This has : worked reasonably well but since improving the performance of other

Re: How to map lucene scores to range from 0~100?

2014-11-12 Thread Chris Hostetter
: I met a new trouble. In my system, we should score the doc range from 0 : to 100. There are some easy ways to map lucene scores to this scope. : Thanks for your help~ https://wiki.apache.org/lucene-java/ScoresAsPercentages -Hoss http://www.lucidworks.com/

Re: Dangerous reflection access to sun.misc.Cleaner by class org.apache.lucene.store.MMapDirectory$MMapIndexInput$1 detected!

2014-11-03 Thread Chris Hostetter
FYI: random googling for "Dangerous reflection access" indicates these are logged by "TopSecurityManager" in Netbeans random clicking on random messages in the Netbeans forums suggests: 1) these INFO messages are designed to only show up if you run with assertions on (evidently under the assum

Re: Getting min/max of numeric doc-values facets

2014-10-09 Thread Chris Hostetter
: Is there some way when faceted search is executed, we can retrieve the : possible min/max values of numeric doc-values field with supplied custom : ranges in (LongRangeFacetCounts) or some other way to do it ? : : As i believe this can give application hint, and next search request can be : muc

Re: Notifications of new Lucene-Releases

2014-10-06 Thread Chris Hostetter
: Lucene doesn't have a dedicated announce list; maybe subscribe to : Apache's announce list? But then you get announcements for all Apache : projects ... maybe add a mail filter ;) there's also the "product" info feeds which you can subscribe to... https://projects.apache.org/projects/lucene_c

Re: NOTICE: Seeking Moderators for java-user@lucene

2014-10-03 Thread Chris Hostetter
: After a few days (probably on friday?) i'll file an infra request to replace : all current moderators with the new list of volunteers. Thanks to all our volunteers, watch this jira to know when the change happens... https://issues.apache.org/jira/browse/INFRA-8429 -Hoss http://www.lucidwork

NOTICE: Seeking Moderators for java-user@lucene

2014-09-30 Thread Chris Hostetter
Hey folks, I was on vacation for the past 7 days - 6 days ago someone sent an email directly to the java-user moderator list asking for subscription help and never got any response -- indicating that all of our other list moderators are either no longer active, or just happened to be on vacat

Re: Snowball filter - Error instantiating stemmer for a language

2014-09-05 Thread Chris Hostetter
To see about improving the error messages when users make mistakes like this... https://issues.apache.org/jira/browse/LUCENE-5926 -Hoss http://www.lucidworks.com/

Re: Snowball filter - Error instantiating stemmer for a language

2014-09-04 Thread Chris Hostetter
Odd ... the class org/tartarus/snowball/ext/CatalanStemmer.class should exist in the same jar as SnowballPorterFilterFactory, can you please confirm that you see it there? $ jar tf lucene-analyzers-common-4.6-SNAPSHOT.jar | grep CatalanStemmer org/tartarus/snowball/ext/CatalanStemmer.class Th

Re: Should .tip/.doc/.tii files be missing/deleted?

2014-09-03 Thread Chris Hostetter
: following files (I'm not listing all extensions) are deleted immediately : upon IndexWriter.close() being called: : : *.fdt, *.tip, *.tii, .*pos : : Only the following 5 files are left in all cases : _0.cfe : _0.cfs ...you've got the CompoundFileFormat configured, so each time a segment is f

RE: escaping characters

2014-08-12 Thread Chris Salem
#setAutoGeneratePhraseQueries(boolean) -- Jack Krupansky -Original Message- From: Chris Salem Sent: Monday, August 11, 2014 1:03 PM To: java-user@lucene.apache.org Subject: RE: escaping characters I'm not using Solr. Here's my code: FSDirectory fsd = FSDirectory.open(new File("C:\\i

RE: escaping characters

2014-08-11 Thread Chris Salem
and that's breaking things up. Best, Erick On Mon, Aug 11, 2014 at 8:54 AM, Chris Salem wrote: > Hi everyone, > > > > I'm trying to escape special characters and it doesn't seem to be working. > If I do a search like resume_text: (LS\/MS) it searches for LS AND M

escaping characters

2014-08-11 Thread Chris Salem
Hi everyone, I'm trying to escape special characters and it doesn't seem to be working. If I do a search like resume_text: (LS\/MS) it searches for LS AND MS instead of LS/MS. How would I escape the slash so it searches for LS/MS? Thanks

Re: Seeking Additional Moderator Volunteers for java-user@lucene

2014-07-29 Thread Chris Hostetter
On Wed, 23 Jul 2014, Yalamarthi, Vineel wrote: : Can I be volunteer too Vineel: sorry i didn't see your response until now. Thanks for volunteering but asfinfra already processed the request and now we've got plenty of moderators. (i think it was actually processed before you even replied)

Re: Seeking Additional Moderator Volunteers for java-user@lucene

2014-07-23 Thread Chris Hostetter
Thanks folks, plenty of new volunteers https://issues.apache.org/jira/browse/INFRA-8082 -Hoss http://www.lucidworks.com/

Seeking Additional Moderator Volunteers for java-user@lucene

2014-07-23 Thread Chris Hostetter
We're doing some housekeeping of the moderators of this list, and looking for any new folks that would like to volunteer. (we currently have 3 active moderators, 1-2 additional mods would be helpful for good coverage) If you'd like to volunteer to be a moderator, please reply back to this th

Re: Different Scores For Same Query on Identical Index

2014-07-16 Thread Chris Hostetter
: I created an index with three documents, ran a query, and noted the scores. : Then I deleted one of the documents using IndexWriter.tryDeleteDocument, and : then re-added the exact same document. (I saved the Document in an instance : variable, so I couldn't have accidentally changed any of the

Re: IndexSearcher.doc thread safe problem

2014-07-09 Thread Chris Hostetter
: 4. Synchronized searcher.doc method call in multi-thread(like this: public : synchronized Document getValue( IndexSearcher searcher, int docId ) { : return searcher.doc( docId ); }) : ==> every execution is same. :but If I use this method, It is no difference with single thread :

Re: Query rewriting - caching rewritten quries

2014-07-02 Thread Chris Hostetter
: In the system which I develop I have to store many query objects in memory. : The system also receives documents. For each document MemoryIndex is : instantiated. I execute all stored queries on this MemoryIndex. I realized : that searching over MemoryIndex takes much time for query rewriting. I'

ANNOUNCE: ApacheCon deadlines: CFP June 25 / Travel Assistance Jul 25

2014-06-12 Thread Chris Hostetter
(NOTE: cross-posted announcement, please confine any replies to general@lucene) As you may be aware, ApacheCon will be held this year in Budapest, on November 17-23. (See http://apachecon.eu for more info.) ### ### 1 - Call For Papers - June 25 The CFP for the conference is still open, but w

RE: will score get changed as document continuously added.

2014-06-11 Thread Chris Hostetter
: Yes the score will change, because the new documents change the : statistics. In general, scores cannot be seen as absolute numbers, they : are only useful to compare between search results of the exact same : query at the same index snapshot. They have no global meaning. This wiki page goes
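
The point about index statistics can be illustrated with a toy inverse-document-frequency calculation. This is a minimal sketch that assumes a classic TF-IDF style formula, idf = 1 + ln((N + 1) / (df + 1)); the similarity actually configured on an index may use a different formula, and `IdfDriftSketch` is a hypothetical name, not a Lucene class.

```java
class IdfDriftSketch {
    // Assumed classic-style IDF; not necessarily the formula a given Lucene version uses.
    static double idf(long docCount, long docFreq) {
        return 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
    }

    public static void main(String[] args) {
        // The same term, found in one document, before and after one more
        // document (that does not contain the term) is added to the index.
        double before = idf(3, 1);
        double after = idf(4, 1);
        System.out.println("idf with 3 docs: " + before);
        System.out.println("idf with 4 docs: " + after);
    }
}
```

Because the term's weight depends on the total document count, the same query against the same matching document yields a different score once the index contents change, which is why scores are only comparable within one query on one index snapshot.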

Re: absence of searchAfter method with Collector parameter in Lucene IndexSearcher

2014-06-06 Thread Chris Hostetter
: I was wondering why there is no search method in lucene Indexsearcher to : search after last reference by passing collector. Say a method with : signature like searchAfter(Query query, ScoreDoc after, Collector results). searchAfter only makes sense if there is a Sort involved -- either explic

Re: Question about multi-valued fields

2014-05-21 Thread Chris Bamford
ow come span queries are heading for extinction? Thanks - Chris -Original Message- From: Allison, Timothy B. To: java-user@lucene.apache.org Sent: Tue, 20 May 2014 16:59 Subject: RE: Question about multi-valued fields Chris, Good to see you over here. There's probably an

Question about multi-valued fields

2014-05-20 Thread Chris Bamford
); TopDocs hits = searcher.search(query, 10); This should match our Document. At this point is there a way to also find out which entry in the array making up "multival-field" was responsible? Thanks, - Chris

Re: What is the proper use of stop words in Lucene?

2014-04-28 Thread Chris Tomlinson
ated filters and tokenizers for both Tibetan Unicode and the Extended Wylie transliteration system. If there is interest we will be happy to donate this work to Apache Lucene. This includes paying attention to the myriad punctuation characters, stemming and so on. Thank you, Chris > > Uwe
