I am new to all of this myself, so caveat emptor.  However, I think I know of at 
least two ways to do this.  

The first is to use two different buckets, but with the same key universe.  One 
bucket is for the portion of the records that you want indexed, and the other 
is for everything else.  Depending on the application, you may want there to be 
some duplication between the data that is stored, but this way you can ensure 
that only the right bits are indexed.  For example, I have an application where 
some source HTML is stored in one bucket (along with other fields that never 
change -- like creation date and source site), and another bucket stores the 
stuff that should be indexed and that also periodically changes (i.e., the text 
of the HTML that should be indexed and some user-assigned tags).
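
To make the two-bucket layout concrete, here is a rough sketch in Python.  It 
uses plain dicts as stand-ins for the two buckets, so none of this is a Riak 
client API; the names raw_bucket, indexed_bucket, store_page and load_page are 
made up for illustration.  The only point is that both halves of a record live 
under the same key, and only one half ever lands in the bucket that has the 
search schema installed.

```python
# Hypothetical in-memory stand-ins for two Riak buckets; not a Riak client API.
raw_bucket = {}      # never indexed: source HTML plus fields that never change
indexed_bucket = {}  # would have the search schema installed: text and tags


def store_page(key, html, created, site, text, tags):
    """Write one logical record as two objects that share the same key."""
    raw_bucket[key] = {"html": html, "created": created, "site": site}
    indexed_bucket[key] = {"text": text, "tags": list(tags)}


def load_page(key):
    """Rebuild the full record by reading both buckets with the same key."""
    record = dict(raw_bucket[key])
    record.update(indexed_bucket[key])
    return record
```

Updating the text or the tags then only rewrites the object in the indexed 
bucket; the stored HTML is left alone.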

The second approach is not one that I have tried, but it should work 
in principle.  You could modify the precommit hooks to selectively index based 
on your criteria (in this case, MIME type).
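
Since I have not tried this, take the following only as a sketch of the 
decision such a hook would make, written in Python so it can be run on its 
own.  A real hook would have to be written for Riak itself (for example in 
Erlang) and would wrap whatever the search hook does today.  The names 
INDEXED_TYPES, should_index and selective_precommit, and the dict used to 
represent an object, are all assumptions of mine.

```python
# Hypothetical names throughout; this models the decision only, not Riak's API.
INDEXED_TYPES = {"application/json"}


def should_index(content_type):
    """Return True if an object with this content type should be indexed."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in INDEXED_TYPES


def selective_precommit(obj, index_fn):
    """Call index_fn only for qualifying objects; always return obj unchanged."""
    if should_index(obj.get("content_type", "")):
        index_fn(obj)
    return obj  # the write itself proceeds either way
```

The object is returned in both cases, so objects of other types are still 
stored; they just never reach the indexer.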

-- GWF



On Jan 21, 2011, at 3:46 PM, Gordon Tillman wrote:

> Howdy Folks,
> 
> Is there a way to configure Riak search so that only objects with certain 
> content-types (e.g., application/json) are to be indexed?
> 
> I have installed a schema on a test bucket that contains field definitions 
> for the (JSON) fields that I wish to have indexed.  At the end I include a 
> dynamic_field definition that instructs search to skip everything else, like 
> this:
> 
>        %% anything left over is not indexed
>        {dynamic_field, [
>            {name, "*"},
>            {skip, true}
>        ]}
> 
> After populating my bucket with some test data and then dumping the contents 
> of the corresponding index bucket (_rsid_test), I see that there are entries 
> for every key that is in the test bucket, even for keys that correspond to 
> objects that do not have a content-type of application/json and that do not 
> contain any data that corresponds to one of the field definitions.
> 
> I would really like to avoid having any indexing information stored for 
> objects that are not application/json.  Is that possible?
> 
> Many thanks!
> 
> --gordon
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

