It really depends on your use case. Guido's tips on not-null and
ignore-properties will help. With *@JsonIgnoreProperties* you can also
specify exactly which properties to ignore. That helps guard against
silently ignoring properties that you know *should* exist.
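
For example (a minimal sketch; the class and property names below are made
up):

  import org.codehaus.jackson.annotate.JsonIgnoreProperties;

  // Ignore only the properties we deliberately deprecated. Any other
  // unknown property still fails deserialization, so a property that
  // *should* exist cannot silently go missing.
  @JsonIgnoreProperties({"legacyScore", "oldAddress"})
  public class Customer
  {
      public String name;
      public String email;
  }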

Converting your documents to a new format is a lot of work. I assume your
buckets are filled with millions of keys that need to take on a new format.
The complexity depends on your N / R / W values and the format of your keys
(are they sequential, or random with a uniform distribution?). You will need
to take into account the eventual consistency of the system and how the
migration should be done on a live environment. It can get messy.
Performance would be a concern too, since you would be iterating through all
keys in a bucket, which Riak frowns upon. Not to mention recreating indexes
/ links / metadata (if applicable).
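
If you do decide to migrate in bulk, the shape of it is roughly this (a
sketch only; listKeys, fetchJson, storeJson and transform are hypothetical
stand-ins for your Riak client's calls and your schema change):

  // Iterating every key in a bucket streams the whole key space and is
  // expensive on a live cluster; plan for throttling and retries.
  for (String key : listKeys("my_bucket"))
  {
      String oldJson = fetchJson("my_bucket", key);  // subject to eventual consistency
      String newJson = transform(oldJson);           // old schema to new schema
      storeJson("my_bucket", key, newJson);          // may race with live writes
  }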

It will be easier to just keep adding attributes and ignore the ones that
older models will not understand. Think of the model as a JSON object that
behaves like a protobuf message (a .proto file, if you have come across
one) or like a representation of a row in a table. It is easy to add data,
but when you delete something that someone's model was depending on, it
can get ugly.
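
If older code may also write documents back, one Jackson trick (a sketch,
and my own suggestion rather than something from this thread) is to carry
along the attributes a model does not understand, so a round-trip through
an old client does not drop the newer fields:

  import java.util.LinkedHashMap;
  import java.util.Map;
  import org.codehaus.jackson.annotate.JsonAnyGetter;
  import org.codehaus.jackson.annotate.JsonAnySetter;

  public class Document
  {
      public String id;  // the fields this version actually understands

      private final Map<String, Object> unknown = new LinkedHashMap<String, Object>();

      @JsonAnySetter  // collect attributes added by newer models...
      public void set(final String name, final Object value)
      {
          unknown.put(name, value);
      }

      @JsonAnyGetter  // ...and write them back out untouched
      public Map<String, Object> any()
      {
          return unknown;
      }
  }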

Thanks
Deepak Bala

On Wed, Sep 19, 2012 at 10:03 PM, Guido Medina <guido.med...@temetra.com> wrote:

>  We have done similar things, but it always depends on the available
> tools and your requirements. I will give you a short example: our main
> application uses standard SQL, and for historical data that does not
> change (for example audit trails, daily chunks from different sources,
> and so on) we use Riak.
>
> We make sure our data is 100% JSON compatible, and our tools are the
> available JSON libraries. It is fairly easy to add new "columns" to your
> data (I know, columns, right?) and keep your fetches valid by ignoring
> deprecated properties; when writing back you just overwrite the old data.
> That way your schema can change all the time, evolving without losing old
> data.
>
> 2i is fine for stamping versions and migrating if you wish. Checking for
> nulls all the time makes your code tedious, so for a migration you can
> simply use JSON transformers that map one property to another (from the
> old schema to the new) without even coding, but that will depend on your
> tools.
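>
> Hand-coded with Jackson's tree model, a one-property rename is tiny (a
> sketch; the property names are made up):
>
>   import org.codehaus.jackson.map.ObjectMapper;
>   import org.codehaus.jackson.node.ObjectNode;
>
>   ObjectMapper mapper = new ObjectMapper();
>   ObjectNode doc = (ObjectNode) mapper.readTree(oldJson);
>   doc.put("ranges", doc.remove("entries"));  // old property, new name
>   String newJson = mapper.writeValueAsString(doc);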
>
> In Java, for example, all of that can be aided by Jackson, which happens
> to be the de facto JSON provider of the Riak Java client. Here is a list
> of a few annotations that accomplish most of the things you are worried
> about:
>
>
> *@JsonIgnoreProperties(ignoreUnknown=true)* (Say an old property just
> got deprecated and you do not want your POJO to throw exceptions while
> converting from JSON to your POJO)
> *@JsonSerialize(include=JsonSerialize.Inclusion.NON_NULL)* (Saves a lot
> of space by not writing null fields)
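>
> Together on one POJO (a minimal sketch; AuditEntry is a made-up name):
>
>   import org.codehaus.jackson.annotate.JsonIgnoreProperties;
>   import org.codehaus.jackson.map.annotate.JsonSerialize;
>
>   @JsonIgnoreProperties(ignoreUnknown = true)                 // tolerate unknown properties
>   @JsonSerialize(include = JsonSerialize.Inclusion.NON_NULL)  // skip null fields when writing
>   public class AuditEntry
>   {
>       public String source;
>       public String detail;  // often null, so often omitted from the stored JSON
>   }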
>
>
> *@JsonProperty("ranges")* (Say a property just changed its type, and
> because of that I need to map it to a new property; in this case I no
> longer have a list of integers but a list of ranges, so a transformation
> is required...)
>
>   @Override
>   @JsonProperty("ranges")  // serialize the new representation under "ranges"
>   public List<Integer[]> getEntries()
>   {
>     return intRangeCollection.getRangesAsArray();
>   }
>
>   @JsonProperty("ranges")  // rebuild the internal collection on read
>   public void setEntries(final List<Integer[]> entries)
>   {
>     this.intRangeCollection = IntRangeCollection.buildIntRangesCollectionFromArrays(entries);
>   }
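>
> With that pair in place Jackson can read and write both directions; a
> quick usage sketch (MyDoc is a made-up name for the POJO holding the
> methods above):
>
>   import org.codehaus.jackson.map.ObjectMapper;
>
>   ObjectMapper mapper = new ObjectMapper();
>   MyDoc doc = mapper.readValue("{\"ranges\":[[1,5],[7,9]]}", MyDoc.class);
>   String json = mapper.writeValueAsString(doc);  // "ranges" written back out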
>
> Well, there is so much I could show you, but examples are limitless, so
> depending on your use cases you will figure out your own ways to keep
> your code reasonably clean while your schema changes constantly.
>
> Hope that helps,
>
> Guido.
>
>
> On 19/09/12 16:08, Pinney Colton wrote:
>
> Wow, that's a pretty big question.
>
>  IMO, it depends upon what you're doing with your data.  Personally, I'm
> storing up to 4 different "versions" of the same data in Riak; each
> version is optimized for different types of analytical operations.
>
>  That's probably not ideal for everybody.  Heck, storing 4 copies of my
> data isn't even optimal for me from a storage perspective, but it does
> help optimize the performance of different queries.  I care more about
> that than disk or memory.
>
On Tue, Sep 18, 2012 at 5:55 PM, Allen Johnson <akjohnso...@gmail.com> wrote:
>
>> Hey everyone,
>>
>> I'm beginning to experiment with Riak and I'm trying to better
>> understand how to model my data.  One question I have at the moment is
>> how to evolve my data in a data store such as Riak.  I know that
>> it's schema-less and that I can add new fields as needed, but I'm
>> thinking more about the existing documents.
>>
>> For example, say, hypothetically, that I have a fairly successful
>> riak-based app.  As the application code continues to evolve over
>> several versions, I begin to find a "better" way to model my data.  By
>> this time I have already stored many, many documents.  What is the
>> appropriate path here?  Do I version my documents with metadata and
>> rely on my application code to continue to deal with old-style
>> documents -- or do I perform some sort of bulk transformation on these
>> existing documents?
>>
>> Thanks,
>> Allen
>>
>
>
>
>  --
> *Pinney H. Colton*
>  *Bitwise Data, LLC*
>  +1.763.220.0793 (o)
> +1.651.492.0152 (m)
> http://www.bitwisedata.com
>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
