I am trying to debug an issue we are seeing with various deployments of
solr with lucene 4.6
We are seeing many errors like:
ERROR: could not read any segments file in directory
java.nio.file.NoSuchFileException: /Users/ryan/Downloads/indexV2/v0/index/_of.si
The file listing looks like this:
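For reference, a small diagnostic along these lines (a sketch against the Lucene 4.x API; the path is a placeholder) prints the files the Directory actually contains and the segments generation Lucene will try to read:

import java.io.File;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class ListIndexFiles {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("/path/to/index"));
    // everything Lucene can see in the index directory
    for (String file : dir.listAll()) {
      System.out.println(file + "\t" + dir.fileLength(file) + " bytes");
    }
    // the commit point (segments_N) Lucene expects to load
    System.out.println("last commit generation: " + SegmentInfos.getLastCommitGeneration(dir));
    dir.close();
  }
}

Comparing that output with the missing file from the stack trace usually shows whether the whole commit is gone or just one segment's files.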
On Tue, Jan 3, 2012 at 1:44 PM, Robert Muir wrote:
> On Tue, Jan 3, 2012 at 4:30 PM, Ryan McKinley wrote:
>>
>> Just brainstorming, it seems like an FST could be a good/efficient way
>> to match documents. My plan would be to:
>>
>> 1. Use an Analyzer to create
Happy new year!
I'm working on a simple way to geocode documents as they are indexed.
I'm hoping to use existing Lucene infrastructure to do this as much as
possible. My plan is to build an index of known place names then look
for matches in incoming text. When there is a match, some extra
field
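For what it's worth, a minimal sketch of the lookup side of that idea against the Lucene 4.6 FST API (the exact FST classes shifted a bit between releases; the place names and ids here are made up):

import org.apache.lucene.util.IntsRef;
import org.apache.lucene.util.fst.Builder;
import org.apache.lucene.util.fst.FST;
import org.apache.lucene.util.fst.PositiveIntOutputs;
import org.apache.lucene.util.fst.Util;

public class PlaceNameFst {
  public static void main(String[] args) throws Exception {
    // place names -> ids; inputs must be added in sorted order
    String[] places = { "berlin", "london", "paris" };

    PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
    Builder<Long> builder = new Builder<Long>(FST.INPUT_TYPE.BYTE4, outputs);
    for (int id = 0; id < places.length; id++) {
      builder.add(toIntsRef(places[id]), (long) (id + 1));  // ids start at 1; 0 is the FST's "no output"
    }
    FST<Long> fst = builder.finish();

    // at index time, each analyzed token could be looked up like this:
    Long id = Util.get(fst, toIntsRef("london"));
    System.out.println("london -> " + id);   // prints 2; null means "not a known place"
  }

  // encode a term as code points, the input alphabet the FST was built with (BYTE4)
  static IntsRef toIntsRef(String s) {
    IntsRef ref = new IntsRef(s.length());
    int j = 0;
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      ref.ints[j++] = cp;
      i += Character.charCount(cp);
    }
    ref.length = j;
    return ref;
  }
}

At index time each token coming out of the Analyzer could be run through Util.get(); a non-null output means the token is a known place name.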
>
> Where do you get your Lucene/Solr downloads from?
>
> [] ASF Mirrors (linked in our release announcements or via the Lucene website)
>
> [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
>
> [X] I/we build them from source via an SVN/Git checkout.
>
--
Is there any way to walk the terms in reverse order?
I have a query that needs to find the last matching term -- if it could
start checking from the end, it would avoid a lot of work.
Thanks
Ryan
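The terms enumeration only runs forward, so short of indexing a reversed form of the term, the usual fallback is to walk forward and remember the last hit. A sketch against the Lucene 4.x API (field name and match test are placeholders):

import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class LastMatchingTerm {
  public static void main(String[] args) throws Exception {
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("/path/to/index")));
    Terms terms = MultiFields.getTerms(reader, "myField");   // placeholder field; null if absent
    BytesRef last = null;
    if (terms != null) {
      TermsEnum te = terms.iterator(null);
      for (BytesRef t = te.next(); t != null; t = te.next()) {
        if (t.utf8ToString().startsWith("foo")) {            // stand-in for the real match test
          last = BytesRef.deepCopyOf(t);                     // the enum reuses its BytesRef
        }
      }
    }
    System.out.println(last == null ? "no match" : last.utf8ToString());
    reader.close();
  }
}

If the matching terms share a common prefix, TermsEnum.seekCeil() can at least jump to where they start instead of scanning from the beginning.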
On Jan 7, 2010, at 12:21 PM, Marvin Humphrey wrote:
On Thu, Jan 07, 2010 at 10:13:00AM -0500, Ryan McKinley wrote:
With the new flexible indexing stuff, would it be possible to
natively
write an rtree to disk in the index process?
The question I'd have is, how would you h
I'm getting back into the spatial search stuff and wanted to get some
feedback before starting down any path...
I'm building an app where I need R-tree-like functionality -- that is,
search for all items within some extent / that touch some extent. If
anyone has ideas for how this could ma
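One way to get extent/touch semantics without a real R-tree is to index each item's bounding box as four numeric fields and express rectangle intersection as two open-ended ranges per axis. A sketch against the Lucene 4.x API (field names are illustrative):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.DoubleField;
import org.apache.lucene.document.Field;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

public class BBoxQueries {

  // index time: store each item's extent as four numeric fields
  public static Document toDocument(double minLat, double maxLat, double minLon, double maxLon) {
    Document doc = new Document();
    doc.add(new DoubleField("minLat", minLat, Field.Store.NO));
    doc.add(new DoubleField("maxLat", maxLat, Field.Store.NO));
    doc.add(new DoubleField("minLon", minLon, Field.Store.NO));
    doc.add(new DoubleField("maxLon", maxLon, Field.Store.NO));
    return doc;
  }

  // query time: extents that touch the query box
  // (per axis: item.min <= query.max AND item.max >= query.min)
  public static Query touches(double qMinLat, double qMaxLat, double qMinLon, double qMaxLon) {
    BooleanQuery bq = new BooleanQuery();
    bq.add(NumericRangeQuery.newDoubleRange("minLat", null, qMaxLat, true, true), Occur.MUST);
    bq.add(NumericRangeQuery.newDoubleRange("maxLat", qMinLat, null, true, true), Occur.MUST);
    bq.add(NumericRangeQuery.newDoubleRange("minLon", null, qMaxLon, true, true), Occur.MUST);
    bq.add(NumericRangeQuery.newDoubleRange("maxLon", qMinLon, null, true, true), Occur.MUST);
    return bq;
  }
}

It is not an R-tree -- there is no spatial tree pruning -- but NumericRangeQuery's trie-encoded terms keep the per-axis ranges cheap.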
Hello-
I'm looking for a way to make tokens navigate a directory structure.
For example:
Given:
/aaa/bbb/ccc
Make three Tokens:
/aaa/
/aaa/bbb/
/aaa/bbb/ccc
A while ago, I added:
http://issues.apache.org/jira/browse/SOLR-1057
Is there a "standard" way to do this?
thanks
ryan
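Lucene's analyzers module now ships a PathHierarchyTokenizer that does essentially this (note it emits /aaa, /aaa/bbb, /aaa/bbb/ccc -- no trailing delimiter). A sketch against the Lucene 4.x API:

import java.io.StringReader;
import org.apache.lucene.analysis.path.PathHierarchyTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class PathTokens {
  public static void main(String[] args) throws Exception {
    PathHierarchyTokenizer tok = new PathHierarchyTokenizer(new StringReader("/aaa/bbb/ccc"));
    CharTermAttribute term = tok.addAttribute(CharTermAttribute.class);
    tok.reset();
    while (tok.incrementToken()) {
      System.out.println(term.toString());   // /aaa, /aaa/bbb, /aaa/bbb/ccc
    }
    tok.end();
    tok.close();
  }
}

In a Solr schema the same thing is available as solr.PathHierarchyTokenizerFactory on a fieldType's analyzer.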
--
thanks -- I'll move this discussion to solr-user since I am now
delving into SolrIndexReader...
On Apr 15, 2009, at 9:06 PM, Yonik Seeley wrote:
On Wed, Apr 15, 2009 at 8:35 PM, Ryan McKinley
wrote:
uggg. So there is no longer a consistent docId I can use in a
filter?
There are
-Original Message-
From: Ryan McKinley [mailto:ryan...@gmail.com]
Sent: Thursday, April 16, 2009 1:34 AM
To: java-user@lucene.apache.org
Subject: Re: BitSet Filter ArrayIndexOutOfBoundsException?
Are you saying the lucene document could have
xpects you to use the docID space for that
IndexReader (ie, a single segment)?
Mike
On Wed, Apr 15, 2009 at 1:37 PM, Ryan McKinley
wrote:
I am working on a Filter that uses an RTree to test for inclusion.
This
Filter works great *most* of the time -- if the index is optimized,
it works
I am working on a Filter that uses an RTree to test for inclusion.
This Filter works great *most* of the time -- if the index is
optimized, it works all of the time. I feel like I am missing
something basic, but not sure what it could be.
Each time the reader opens (and the index has chan
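The usual culprit for "works only when optimized" is that Filter.getDocIdSet() is called once per segment, while an external structure like the RTree hands back top-level docIDs; on an optimized index there is only one segment, so the two docID spaces happen to line up. A hedged sketch of the per-segment translation using the Lucene 4.x Filter API (the RTree lookup itself is elided; names are illustrative):

import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

public class RTreeFilter extends Filter {

  private final int[] topLevelMatches;   // hypothetical: sorted top-level docIDs produced by the RTree

  public RTreeFilter(int[] topLevelMatches) {
    this.topLevelMatches = topLevelMatches;
  }

  @Override
  public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException {
    // called once per segment: translate top-level docIDs into this segment's docID space
    int docBase = context.docBase;
    int maxDoc = context.reader().maxDoc();
    FixedBitSet bits = new FixedBitSet(maxDoc);
    for (int doc : topLevelMatches) {
      int segDoc = doc - docBase;
      if (segDoc >= 0 && segDoc < maxDoc) {
        bits.set(segDoc);
      }
    }
    return bits;   // FixedBitSet implements DocIdSet in 4.x (acceptDocs ignored here for brevity)
  }
}

The cleaner long-term fix is to key the RTree's answers per segment reader rather than per top-level reader, so reopening the index doesn't invalidate everything.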
maybe try:
http://hudson.zones.apache.org/hudson/view/Solr/job/Solr-trunk/
On Jan 16, 2009, at 4:47 PM, Kay Kay wrote:
I am trying to access the nightly lucene builds here:
http://people.apache.org/builds/lucene/java/nightly/
It does not seem to have been available for some time. Just cur
dooh, never hit paste in the subject line
On Jan 16, 2009, at 1:54 PM, Ryan McKinley wrote:
The PMC is pleased to announce that Patrick O'Leary has been voted
to be a Lucene-Java Contrib committer.
Patrick has contributed a great foundation for integrating spatial
search with l
The PMC is pleased to announce that Patrick O'Leary has been voted to
be a Lucene-Java Contrib committer.
Patrick has contributed a great foundation for integrating spatial
search with lucene. I look forward to future development in this area.
Patrick - traditionally we ask you to send o
I think a link is probably enough...
it's funny to have "lucene-current.xxx" listed on:
http://archive.apache.org/dist/jakarta/lucene/
On Dec 19, 2008, at 11:25 PM, Chris Hostetter wrote:
a couple of references to "Lucene 1.2" in the last few months got me
thinking and made me realize that
dexed". Where is
sortMissingLastAttribute?
thanks.
On Jan 8, 2008, at 4:13 PM, Ryan McKinley wrote:
what do you mean by "fail"? -- there is the sortMissingLast attribute
Michael Prichard wrote:
ok... i should read the manual more often.
i went ahead and just added untokenized,
what do you mean by "fail"? -- there is the sortMissingLast attribute
Michael Prichard wrote:
ok... i should read the manual more often.
i went ahead and just added untokenized, unstored sort fields
question: if I put a field in to sort on, but say I have not indexed any
as of yet...will
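For reference, sortMissingLast is an attribute on the fieldType in Solr's schema.xml; a minimal sketch (type and field names are made up):

<fieldType name="sortableString" class="solr.StrField"
           sortMissingLast="true" omitNorms="true"/>

<field name="title_sort" type="sortableString" indexed="true" stored="false"/>

With sortMissingLast="true", documents that have no value in the sort field come back after the ones that do, regardless of sort direction.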
Andrzej Bialecki wrote:
Lukas Vlcek wrote:
So starring will be accommodated only during the indexing phase. Does it
mean it will be a pretty static value, not a dynamically changing variable...
correct?
In other words, if I add my stars to some document it won't affect the
scoring immediately but afte
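A minimal sketch of what "accommodated during indexing" amounts to in Lucene terms: the star value gets folded into an index-time boost, so changing it later means re-adding the document (Lucene 4.x API; field names and the scaling factor are made up):

import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class StarBoost {
  // hypothetical: fold a star rating into an index-time field boost
  public static void addWithStars(IndexWriter writer, String id, String text, int stars) throws IOException {
    Document doc = new Document();
    doc.add(new StringField("id", id, Field.Store.YES));
    TextField body = new TextField("body", text, Field.Store.YES);
    body.setBoost(1.0f + 0.2f * stars);   // baked into the field norms at index time
    doc.add(body);
    writer.updateDocument(new Term("id", id), doc);   // changing the stars later means re-adding the doc
  }
}

If the stars need to affect scoring immediately, the value has to live outside the index-time norms -- e.g. something like Solr's ExternalFileField driven through a function query.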
Ryan can you post the output of CheckIndex on your now-working index?
(1800 is still too many files I think, certainly after having
optimized).
ok, 1800 was wrong - that was from a botched attempt where I:
1. ran optimize on the broken 18K file index. It crashed midway through.
2. run Check
I just used the CheckIndex tool to try to salvage a corrupt index
(http://www.nabble.com/restoring-a-corrupt-index--tf4783866.html)
It's a great tool, thanks!
I'm wondering about adding support for this tool in the solr admin
interface, but have a few questions about how it works before I see if
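For the admin-interface idea, CheckIndex can be driven programmatically as well as from the command line; a minimal sketch against the Lucene 4.x API (the path is a placeholder):

import java.io.File;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class CheckIndexReport {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("/path/to/index"));
    CheckIndex checker = new CheckIndex(dir);
    checker.setInfoStream(System.out);              // verbose per-segment output
    CheckIndex.Status status = checker.checkIndex();
    System.out.println(status.clean ? "index is clean" : "problems found in one or more segments");
    dir.close();
  }
}

The same checks run from the command line via the class's main(), with an optional -fix flag; -fix drops unreadable segments along with their documents, so back up the index first.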
Ryan are you able to update to that commit I just did? If so I think
you should run the tool without -fix and post back what it printed. It
should report an error on that one segment due to the missing file.
Then, run -fix to remove that segment (please backup your index first!).
Then, if you
Or maybe the index is not corrupt, but we are hitting the descriptor limit
when opening a searcher and it's being reported as "no such file or directory"?
Hmmm, yes that's possible too...
Should be easy to tell by checking if the file _cf9.fnm already exists.
Oh yeah. Ryan, does that file
thanks for all the replies
Yonik do you understand why so many unreferenced files are being produced
here? What's the root cause?
This is an index where the same documents get updated many times;
that could build up old files w/o optimizing.
Just guesses... but perhaps new index fi
Using solr, we have been running an indexing process for a while and
when I checked on it today, it was spitting out an error:
java.lang.RuntimeException: java.io.FileNotFoundException:
/path/to/index/_cf9.fnm (No such file or directory)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.j
customizable Solr really is (rather the ease with which we can do it). Also
Solr doesn't support queryFilter out of the box (Hossman: there's nothing to
stop a solr request handler from using QueryFilters if they want). How much
extra work is it?
out of the box, solr supports query filters.
mark harwood wrote:
I want to return the "interesting" terms used for MLT
Could you do this using Query.extractTerms() on the rewritten version of the
MoreLikeThis query (a BooleanQuery)?
thanks! that works and avoids the PriorityQueue traversal problems. I
can even get the boost (norma
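A sketch of that suggestion against the Lucene 4.x API (field name and docNum are placeholders; like(int) needs term vectors on the fields, or an Analyzer set via mlt.setAnalyzer()):

import java.io.File;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.FSDirectory;

public class InterestingTerms {
  public static void main(String[] args) throws Exception {
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("/path/to/index")));
    MoreLikeThis mlt = new MoreLikeThis(reader);
    mlt.setFieldNames(new String[] { "text" });      // placeholder field
    Query like = mlt.like(42);                       // placeholder docNum
    Query rewritten = like.rewrite(reader);          // a BooleanQuery of TermQuerys
    Set<Term> terms = new HashSet<Term>();
    rewritten.extractTerms(terms);                   // Query.extractTerms in 4.x
    for (Term t : terms) {
      System.out.println(t.field() + ":" + t.text());
    }
    reader.close();
  }
}

Current MoreLikeThis also has retrieveInterestingTerms(), which returns the term strings directly without going through the rewritten query.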
2. Do retrieveTerms(int docNum) and createQuery(PriorityQueue q) need
to be private? Can they be public? If not public, could they at
least be protected?
I would think protected would be fine, what is your case for it being
public?
From the solr RequestHandler, I want to return the "
I'm trying to build a custom MoreLikeThis implementation that will run
within solr and I've run into a few API hurdles...
1. Can MLT.java be modified to optionally take the Similarity
implementation in the constructor? Currently it is hardcoded to:
private Similarity similarity = new Default
very, very short.
But I wouldn't do any of this until I was sure that the performance
of the simple approach of walking the TermEnum was too
expensive. Premature optimization and all that..
If this is clear as mud, I'll add details, but I'm 1/2 the way through a
bottle of Mer
Is there an efficient way to know how many distinct terms there are
for a given field name?
I know I can walk through a TermEnum and put them into a hash, but it
would be useful to know beforehand if you are going to get 4 distinct
values or 40,000
I don't need to know what the terms are, just h
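In the current API the cheap check is Terms.size(), which reports the distinct term count when the codec knows it (it is often -1 for a multi-segment reader), with the term walk as the fallback -- and no hash is needed, since the enum already returns each distinct term exactly once. A sketch against Lucene 4.x (field name and path are placeholders):

import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.FSDirectory;

public class DistinctTermCount {
  public static void main(String[] args) throws Exception {
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("/path/to/index")));
    Terms terms = MultiFields.getTerms(reader, "category");   // placeholder field; null if absent
    long count = (terms == null) ? 0 : terms.size();          // -1 when the codec cannot report it
    if (count < 0) {
      // fallback: walk the enum and count; each distinct term appears exactly once
      TermsEnum te = terms.iterator(null);
      count = 0;
      while (te.next() != null) {
        count++;
      }
    }
    System.out.println("distinct terms: " + count);
    reader.close();
  }
}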
Is there any way to find frequent phrases without knowing what you are
looking for?
I could index "A B C D E" as "A B C", "B C D", "C D E" etc, but that
seems kind of clunky particularly if the phrase length is large. Is
there any position offset magic that will surface frequent phrases
automati
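Indexing those overlapping word n-grams is exactly what ShingleFilter automates, and once they are indexed the frequent ones can be read off the term statistics (e.g. with the HighFreqTerms tool in the misc module). A sketch against Lucene 4.6:

import java.io.StringReader;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class PhraseShingles {
  public static void main(String[] args) throws Exception {
    WhitespaceTokenizer tok = new WhitespaceTokenizer(Version.LUCENE_46, new StringReader("A B C D E"));
    ShingleFilter shingles = new ShingleFilter(tok, 2, 3);   // emit 2- and 3-word shingles
    shingles.setOutputUnigrams(false);                       // skip the single words
    CharTermAttribute term = shingles.addAttribute(CharTermAttribute.class);
    shingles.reset();
    while (shingles.incrementToken()) {
      System.out.println(term.toString());   // "A B", "A B C", "B C", ...
    }
    shingles.end();
    shingles.close();
  }
}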