[jira] Updated: (SOLR-1997) analyzed field: Store internal value instead of input one

Joan Codina (JIRA) Mon, 12 Jul 2010 07:55:18 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Joan Codina updated SOLR-1997:
------------------------------

    Attachment: SOLR-1997-1.4.patch
                SOLR-1997-1.5.patch

Patches for the 1.4 and 1.5 versions.

> analyzed field: Store internal value instead of input one
> ---------------------------------------------------------
>
>                 Key: SOLR-1997
>                 URL: https://issues.apache.org/jira/browse/SOLR-1997
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4, 1.4.1, 1.5
>            Reporter: Joan Codina
>             Fix For: 1.4, 1.4.1, 1.5
>
>         Attachments: SOLR-1997-1.4.patch, SOLR-1997-1.5.patch
>
>
> Solr implements a set of filters and tokenizers that allow the filtering and 
> treatment of text, but when a field is set to be stored, the text stored is 
> the original input. This may be useful when the end user reads the stored 
> text, but not in other cases, for example when there are payloads and the 
> text is something like A|2.0 good|1.0 day|3.0, or when the result of a 
> query is processed by something like Carrot2.
> So this is a simple new kind of field that takes as input the output of a 
> given type (the source), and then performs the normal processing with the 
> desired tokenizers and filters. The difference is that the stored value is 
> the output of the source type, and this is what is retrieved when getting 
> the document.
> The name of the field type is AnalyzedField, and it is introduced in the 
> schema in the following way to create the analyzedSourceType from the 
> SourceType:
>     <fieldType name="SourceType" class="solr.TextField">
>         <analyzer type="index">
>             <tokenizer class="solr.StandardTokenizerFactory" />
>             <filter class......." />
>         </analyzer>
>         <analyzer type="query">
>             <tokenizer class="solr.StandardTokenizerFactory" />
>             <filter ....." />
>         </analyzer>
>     </fieldType>
>     <fieldType name="analyzedSourceType" class="solr.AnalyzedField"
>                positionIncrementGap="100" preProcessType="SourceType">
>         <analyzer>
>             <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         </analyzer>
>     </fieldType>
> Often just the WhitespaceTokenizerFactory is needed, as the tokens have 
> already been split by the SourceType.
> Finally, a field can be declared as
>     <field name="analyzedData" type="analyzedSourceType" indexed="true"
>            stored="true" termVectors="true" multiValued="true"/>
> which can be written to directly or can be defined as a copy of the source 
> one:
>     <field name="Data" type="analyzedSourceType" indexed="true" stored="true"
>            termVectors="true" multiValued="true"/>
>     ...
>     <copyField source="Data" dest="analyzedData"/>
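
The behaviour described in the issue can be illustrated with a small, library-free Java sketch. It is not the code from the attached patches and uses no Solr or Lucene classes: the class AnalyzedFieldSketch and its methods are hypothetical, the "source" analysis (split on non-alphanumerics, then lowercase) is only a stand-in for whatever SourceType is configured to do, and joining the source tokens with single spaces is an assumption about how the stored value might be serialized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzedFieldSketch {

    // Hypothetical stand-in for the SourceType index analyzer:
    // split on anything that is not a letter or digit, then lowercase.
    public static List<String> sourceAnalyze(String input) {
        List<String> tokens = new ArrayList<>();
        for (String t : input.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // What an AnalyzedField-style type would store: the output of the
    // source analysis rather than the raw input. Joining with single
    // spaces is an assumption made for this sketch.
    public static String storedValue(String input) {
        return String.join(" ", sourceAnalyze(input));
    }

    // Stand-in for a whitespace tokenizer applied to the stored value;
    // the tokens were already split by the source analysis.
    public static List<String> indexTokens(String stored) {
        List<String> tokens = new ArrayList<>();
        for (String t : stored.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String stored = storedValue("A Good Day, indeed!");
        System.out.println(stored);              // a good day indeed
        System.out.println(indexTokens(stored)); // [a, good, day, indeed]
    }
}
```

In this sketch a plain whitespace split recovers exactly the tokens the source analysis produced, which is why the second field type in the description needs only a whitespace tokenizer.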

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

