Github user justinleet commented on the issue:

    https://github.com/apache/metron/pull/995
  
    @merrimanr Let me replay my understanding to see if I'm on the right track.
    
    The problem we have is that when we run a glob query ("*"), we get back 
fields that we can't reindex as part of a whole document. In particular, the 
ones we've seen are the subfields of LatLon.  We can't reindex the _coordinate 
fields, but they come back in a search.
    
    These fields will come back if they are either
    * stored (stored fields are returned normally), or
    * docValues that aren't stored, which are returned in the case of a glob 
query per the Solr docs:
      >Field values retrieved during search queries are typically returned from 
stored values. However, non-stored docValues fields will be also returned along 
with other stored fields when all fields (or pattern matching globs) are 
specified to be returned (e.g. “fl=*”)
    
    This is why explicitly setting the dynamic field solves the problem: it 
makes the subfields neither stored nor docValues.
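
    As a sanity check on the rule above, here is a minimal sketch of it as a
    predicate. It is a simplified model of the quoted behaviour, not Solr's own
    code: the class and method names are hypothetical, and it only covers the
    glob case (e.g. `fl=*`) under the default behaviour the quote describes.

    ```java
    // Hypothetical model of the quoted rule; not part of Solr.
    final class GlobReturnModel {

        /**
         * Would a field come back when all fields are requested (e.g. fl=*)?
         *
         * @param stored    field is declared stored="true"
         * @param docValues field is declared docValues="true"
         */
        static boolean returnedByGlob(boolean stored, boolean docValues) {
            // Stored fields are returned normally; non-stored docValues
            // fields are returned as well when a glob is requested.
            return stored || docValues;
        }

        public static void main(String[] args) {
            System.out.println("stored only:    " + returnedByGlob(true, false));
            System.out.println("docValues only: " + returnedByGlob(false, true));
            System.out.println("neither:        " + returnedByGlob(false, false));
        }
    }
    ```

    In this model, turning off both flags is the only combination that keeps a
    subfield out of a glob result.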
    
    Is this correct so far?
    
    So I dug a little into the Lucene/Solr source for the Currency field (as a 
specific example of a non-problematic field, per your test).
    
    Here's a snippet:
    
    ```
      private void createDynamicCurrencyField(String suffix, FieldType type) {
        String name = "*" + POLY_FIELD_SEPARATOR + suffix;
        Map<String, String> props = new HashMap<>();
        props.put("indexed", "true");
        props.put("stored", "false");
        props.put("multiValued", "false");
        props.put("omitNorms", "true");
        int p = SchemaField.calcProps(name, type, props);
        schema.registerDynamicFields(SchemaField.create(name, type, p, null));
      }
    
    ...
      @Override
      public void inform(IndexSchema schema) {
        this.schema = schema;
        createDynamicCurrencyField(FIELD_SUFFIX_CURRENCY,   fieldTypeCurrency);
        createDynamicCurrencyField(FIELD_SUFFIX_AMOUNT_RAW, fieldTypeAmountRaw);
      }
    ```
    
    What's interesting is that it appears to create an entirely new dynamic 
field, `*____currency`, to catch everything under the hood.  This field is not 
stored and uses the default docValues setting, which is false.
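
    To illustrate how such a pattern would catch the subfields, here is a rough
    sketch of the name construction and a suffix match. The two constant values
    are assumptions, chosen only so that the result matches the `*____currency`
    name seen in the Luke output; the class, the `matches` helper, and the
    `price` field name are hypothetical, and this is not Solr's matching code.

    ```java
    // Hypothetical sketch; constants are assumed, not copied from Solr.
    final class DynamicFieldNameSketch {

        // Assumed split of "____currency" into separator and suffix.
        static final String POLY_FIELD_SEPARATOR = "___";
        static final String FIELD_SUFFIX_CURRENCY = "_currency";

        // Mirrors the name expression in the snippet above.
        static String dynamicPattern(String suffix) {
            return "*" + POLY_FIELD_SEPARATOR + suffix;
        }

        // Leading-'*' glob: the field name must end with the rest of the pattern.
        static boolean matches(String pattern, String fieldName) {
            return pattern.startsWith("*")
                    && fieldName.endsWith(pattern.substring(1));
        }

        public static void main(String[] args) {
            String pattern = dynamicPattern(FIELD_SUFFIX_CURRENCY);
            System.out.println(pattern);
            System.out.println(matches(pattern, "price____currency"));
            System.out.println(matches(pattern, "price"));
        }
    }
    ```

    Under these assumptions, a subfield of a hypothetical `price` currency field
    would pick up the non-stored, non-docValues properties registered for the
    `*____currency` pattern.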
    
    Output from a LukeRequest similar to the test above:
    ```
    KEY: *____currency
    VALUE NAME: *____currency
    FLAGS: [INDEXED, OMIT_NORMS]
    
    KEY: *____amount_raw
    VALUE NAME: *____amount_raw
    FLAGS: [INDEXED, OMIT_NORMS]
    
    KEY: *
    VALUE NAME: *
    FLAGS: [DOC_VALUES, OMIT_NORMS, OMIT_TF]
    
    KEY: *.c
    VALUE NAME: *.c
    FLAGS: [INDEXED, STORED, OMIT_TF]
    ```
    
    Note that only the catch-all (`*`) has the docValues flag; the currency 
type's `*____currency` and `*____amount_raw` fields do not.
    
    The short version is that it seems like the majority of the default types 
manage their subfields more reasonably and therefore aren't a problem.
    


