subject:"content disappears in the index"

Re: content disappears in the index

2012-11-15 Thread Erick Erickson

Oddly I had the exact same thought. Although it's not obvious from the name (and common usage) of trim-like functions that you'd also have a way to specify maximum length (after trimming I'd assume). And the other thought I had was that TrimFilter should optionally take a list of characters to tri

Re: content disappears in the index

2012-11-13 Thread Bernd Fehling

Hi Geoff, cool, that will eliminate possible regex pitfalls in schema.xml. I was thinking about enhancing an existing filter as a multi-purpose filter. E.g. TrimFilter: if maxLength is set, then also limit the termAtt to maxLength. This will keep the number of available filters small, especially for s

Re: content disappears in the index

2012-11-13 Thread Geoff Cooney

Hi, I've been following this thread and happen to have a simple TruncatingFilter class I wrote for the same purpose. I think this should do what you want: import java.io.IOException; import org.apache.lucene.analysis.TokenFilter; import org.apache.lucene.analysis.TokenStream; import org.apach

Re: content disappears in the index

2012-11-13 Thread Erick Erickson

There's nothing in Solr that I know of that does this. It would be a pretty easy custom filter to create, though. FWIW, Erick On Tue, Nov 13, 2012 at 7:02 AM, Robert Muir wrote: > On Mon, Nov 12, 2012 at 10:47 PM, Bernd Fehling > wrote: > > By the way, why does TrimFilter option updateOffse

Re: content disappears in the index

2012-11-13 Thread Robert Muir

On Mon, Nov 12, 2012 at 10:47 PM, Bernd Fehling wrote: > By the way, why does the TrimFilter option updateOffset default to false, > just to keep it backwards compatible? > In my opinion this option should be removed. TokenFilters shouldn't muck with offsets, for a lot of reasons, but especially becau

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling

Hi Erick, I like the fortune cookie :-) I came to the same solution as you did, but with a short Java proggy trying different patterns, so trial and error ;-) This brings me to the question: is there now (with 4.0) any filter doing the job for me? I took a look at LengthFilter but it has a differ

Re: content disappears in the index

2012-11-12 Thread Erick Erickson

Because your regex is wrong? (sorry, couldn't resist). Regexes always give me indigestion. But if you look at your results, your regex isn't working in any case at all. The second group is being removed from the end of the string. I _think_ what's happening is that the longest possible string is b
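
The backtracking behaviour described above can be reproduced outside Solr with plain java.util.regex. This is only a sketch: the thread snippets do not show the actual pattern from schema.xml, so the pattern in `BACKTRACKING` below is a hypothetical one chosen to illustrate the pitfall, and the input is an assumed normalized (lowercase, letters only) form of one author string from the thread. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

public class RegexTruncationDemo {
    // Hypothetical pattern (NOT taken from the thread): group 2 demands at
    // least 31 characters up to the end of the input. For inputs that are
    // only slightly longer than 30 characters, the engine has to shrink
    // group 1 until enough characters are left over for group 2.
    static final Pattern BACKTRACKING = Pattern.compile("(.{1,30})(.{31,}$)");

    // Anchored alternative: keep up to 30 leading characters, drop the rest.
    static final Pattern ANCHORED = Pattern.compile("^(.{1,30}).*$");

    static String truncateWithBacktracking(String s) {
        return BACKTRACKING.matcher(s).replaceAll("$1");
    }

    static String truncateAnchored(String s) {
        return ANCHORED.matcher(s).replaceAll("$1");
    }

    public static void main(String[] args) {
        // Assumed normalized form of "Arslanagic, Aida ; Siqveland, Elisabeth"
        // (32 characters).
        String key = "arslanagicaidasiqvelandelisabeth";
        System.out.println(truncateWithBacktracking(key)); // prints "a"
        System.out.println(truncateAnchored(key));         // prints "arslanagicaidasiqvelandelisabe"
    }
}
```

With the 32-character input, group 1 of the hypothetical pattern can keep only one character, because group 2 needs the remaining 31. The anchored variant has no minimum length on the discarded tail, so group 1 stays at its greedy maximum of 30.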

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling

Yes, it is the second PatternReplaceFilterFactory. The string "Arslanagic, Aida ; Siqveland, Elisabeth" is reduced to "a", whereas the other strings are: "Alexander, Kvam ; Bjørn, Nyland ; Bjørn, Reiten ; Øystein, Huse" --> "alexanderkvambj" "Brennmoen, Ingar ; Hauklien, Øystein ; Hedalen, Trond

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling

The field type is derived from the distributed alphaOnlySort as follows: It reduces long lists of author names (100 and more authors) to the first 30 chars for sorting and removes some illegal chars to keep sorting with UTF-8 solid. I don't see any problems there.
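
The intent of such a sort field (lowercase, drop characters that are not wanted in the sort key, keep the first 30 chars) can be sketched in plain Java without any regex-based truncation. This is an illustration under assumptions, not the field type from the thread: the actual filter chain is not shown in the snippet, the `[^a-z]` character class is an assumption, and `AuthorSortKey` and `sortKey` are hypothetical names.

```java
import java.util.Locale;

public class AuthorSortKey {
    // From the thread: only the first 30 chars are kept for sorting.
    static final int MAX_LENGTH = 30;

    static String sortKey(String authors) {
        // Lowercase independently of the default locale.
        String lowered = authors.toLowerCase(Locale.ROOT).trim();
        // Assumption: everything except a-z is dropped. Note that this also
        // drops non-ASCII letters such as "ø".
        String lettersOnly = lowered.replaceAll("[^a-z]", "");
        // Truncate by length instead of by regex, so no backtracking is involved.
        return lettersOnly.length() <= MAX_LENGTH
                ? lettersOnly
                : lettersOnly.substring(0, MAX_LENGTH);
    }

    public static void main(String[] args) {
        System.out.println(sortKey("Arslanagic, Aida ; Siqveland, Elisabeth"));
        // prints "arslanagicaidasiqvelandelisabe"
    }
}
```

Cutting with `substring` keeps the length limit separate from the character clean-up, which is the same separation a dedicated truncating filter would give inside an analysis chain.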

Re: content disappears in the index

2012-11-12 Thread Jack Krupansky

: http://wiki.apache.org/solr/CommonQueryParameters For example, have an "author" field that is "text" and an "author_s" (or "author_sorted" or "author_string") field that you copy the name to: Query on "author", but sort on "

Re: content disappears in the index

2012-11-12 Thread Erick Erickson

First, sorting on tokenized fields is undefined/unsupported. You _might_ get away with it if the author field always reduces to one token, i.e. if you're always indexing only the last name. I should say unsupported/undefined when more than one token is the result of analysis. You can do things lik

RE: content disappears in the index

2012-11-12 Thread Uwe Schindler

rg > Subject: content disappears in the index > > Hi list, > a user reported wrong sorting of our search service running on solr. > While chasing this issue I traced it back through lucene into the index. > I have a text field for sorting > (stored,indexed,tokenized,omitNorms,sortM

content disappears in the index

2012-11-12 Thread Bernd Fehling

Hi list, a user reported wrong sorting of our search service running on Solr. While chasing this issue I traced it back through Lucene into the index. I have a text field for sorting (stored,indexed,tokenized,omitNorms,sortMissingLast) and three docs with author names. If I trace at org.apache.lu

13 matches