Re: CorrugatedIron v0.1.3 Released

OJ Reeves Sat, 01 Oct 2011 03:36:32 -0700

Hi Kyle,

Thanks for the response. My comments are inline.

On 1 October 2011 17:17, Kyle Quest <kcq.li...@gmail.com> wrote:

> You guys have done an amazing job with the library in terms of its
> capabilities. The docs are ok in most cases, but not always great.
> Though compared to a lot of other code I've seen you are way ahead :-)
>

Thank you. We realise we still have a way to go with the docs, but we wanted
to make sure that we at least had sample applications and enough
documentation to get people going. So far, we seem to have managed to get
just over the line. More work certainly needs to be done and it's something
that Jeremiah and I are very aware of.

> I didn't mean to imply that it would be a good idea to do a carbon
> copy of the Erlang client library, but picking a language that's close
> enough would make sense.
>

I was using Erlang as an extreme example, but the principle holds regardless
of what we compare it to. The closest language that has a Riak client is the
official Java client. While Russell has done an outstanding job with all the
work he has done I really don't feel that the .NET version should
necessarily take on the same look and feel. This is for a few reasons:

   - There are obvious language differences, some of which are much nicer in
   .NET than in Java (lambdas in particular). The .NET API should make good use
   of features like this rather than suffer from trying to mimic another
   language that doesn't have support for them (though I believe Java either
   has lambdas now or will have them soon; I'm not sure, as I'm not a big Java
   person :)).
   - The Java client has been around for quite a long time and its code base
   has a long history. It has the pain of backwards compatibility and
   non-breaking changes for existing users. Our library is obviously very new
   and doesn't need to carry around any design issues that the Java client may
   or may not have. Don't get me wrong, I am not saying that the Java client
   has design issues. What I'm trying to say is that a library's design should
   do its best to model the requirements of the target DB as it currently
   stands and not take on the look and feel of another library which has been
   around longer.
   - The approach to Java apps isn't necessarily the same as .NET ones and I
   don't believe that Java-like idioms should be brought into the .NET world.
   .NET people have their own ways and they like to stick to them.

Apart from those points, keeping APIs consistent across the different
libraries isn't something that I consider important. I think it's more
important to have the library make sense to the community that it has been
targeted at. Asking someone to use your code is hard enough without saying
"please think like a Java guy when you use this" :)

> .NET 4.0 has/had a bug in ConfigurationManager.GetSection() and that's
> exactly what your library uses, so your library gets that bug too :-)
>

Are you referring to the exception that gets thrown when you run the
application on a network share (i.e. this one:
<http://social.msdn.microsoft.com/Forums/en/netfxbcl/thread/9ab6b9ee-7337-43de-8f82-177dd32aecbc>)?
If not, please let me know which one it is as I haven't noticed any news of
it at all. If it *is* this one, it honestly doesn't concern me at all.
Nobody is going to run an application like this, which talks to Riak, across
a network share or using UNC paths. Particularly not in production.
Production applications will run in IIS or will run on local machines. If
this issue does crop up, I'm happy to deal with it at that point. My gut
feeling is that .NET 4.x will be released with the fix before it is an issue
for any of our users.

> What you can't do with CorrugatedIron is something simple like this
> (preferred):
>
> var client = new RiakClient(port: 8098, transport:
> RiakClient.HttpTransport);
> client.Blah()... etc
>

There are multiple issues with this approach:

   - You're completely bypassing clustering, load balancing and connection
   pooling.
   - You have hard-coded references to ports, hosts, etc.
   - The user of the API is responsible for the lifetime of the client
   connection.

The above approach really only suits rapid development. It's the kind of
thing that you're going to throw into a REPL and nothing more. Personally I
don't like the idea of exposing this kind of functionality to the user and
running the risk of it being misused in a production application. During
development you might want to quickly determine if you're able to connect to
the server and this can be done in so many ways, even outside of CI.
Configuration will need to be specified at some point anyway, and won't end
up living in this version of the Config API.

We made a point from the outset with CI to make sure the user doesn't have
to think about the lifetime of the connections they're using. In fact, we
didn't want them to think about connections at all. We wanted users to focus
on what they wanted to do, the data they wanted to work with and to focus on
the business problem they're solving. Boilerplate code which worries about
connection lifetimes is not what we wanted to see ripple (pun intended)
through the client's code. Instead, when you use the RiakClient object in
CI, you just say "please do this!" without saying where it should happen,
how it should happen and what to do with the resources it has used while
doing it once it has finished. I think this has really helped in making the
API clean, and I'm quite pleased with the result :)

Let's face it, asking a dev to remember to put a connection back in a pool
is giving them rope to hang themselves with.
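
A minimal sketch of that idea, assuming a pool that lends a connection to a
delegate and always takes it back. `FakeConnection`, `LendingPool` and `Use`
are hypothetical names for illustration only; they are not CorrugatedIron
types and this is not how CI is necessarily implemented.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a pooled connection; not a CorrugatedIron type.
public sealed class FakeConnection
{
    public int Id { get; }
    public FakeConnection(int id) { Id = id; }
}

// Hypothetical pool: callers hand over the work and never hold a connection.
public sealed class LendingPool
{
    private readonly ConcurrentBag<FakeConnection> _idle =
        new ConcurrentBag<FakeConnection>();

    public LendingPool(int size)
    {
        for (var i = 0; i < size; i++) _idle.Add(new FakeConnection(i));
    }

    public int IdleCount => _idle.Count;

    public TResult Use<TResult>(Func<FakeConnection, TResult> work)
    {
        if (!_idle.TryTake(out var conn))
            throw new InvalidOperationException("No idle connections available.");
        try
        {
            return work(conn);
        }
        finally
        {
            // The connection goes back even if the delegate throws.
            _idle.Add(conn);
        }
    }
}
```

A caller would write something like `pool.Use(c => c.Id)`. There is no
"return to pool" step for them to forget.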

BTW as a small side point: you can't specify the transport type for the
client connections in CI. You have no choice but to use both REST and PBC.
This is because CorrugatedIron is responsible for making the right choice
for the transport to use for a given API call. That decision shouldn't be
made by the user of the library; instead, they should only care about using
the client and asking for a function to be performed. How that function is
performed and what transport is used are implementation details they
should be totally ignorant of. Not only does this allow people to just focus
on what they want to do, it means that we can easily change the transports
for the relevant API functions across releases and the users of the API do
not have to change anything at all in their code. We're trying to protect
the users and I think this is a good idea.
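
As a rough illustration of keeping the transport an implementation detail,
here is a sketch with made-up names (`Transport`, `Operation`,
`RoutingClient`). The mapping is arbitrary and says nothing about which
transport CI actually uses for which call; the returned string stands in for
a real network request.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; none of these are CorrugatedIron types.
public enum Transport { Rest, Pbc }

public enum Operation { Get, Put, Delete, MapReduce }

public sealed class RoutingClient
{
    // Arbitrary mapping for illustration only. It can change between
    // releases without any change to the calling code.
    private static readonly Dictionary<Operation, Transport> Routes =
        new Dictionary<Operation, Transport>
        {
            { Operation.Get, Transport.Pbc },
            { Operation.Put, Transport.Pbc },
            { Operation.Delete, Transport.Pbc },
            { Operation.MapReduce, Transport.Rest },
        };

    // Callers ask for an operation and never name a transport.
    public string Execute(Operation op, string key)
    {
        var transport = Routes[op];
        return string.Format("{0} {1} via {2}", op, key, transport);
    }
}
```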

> or at least something like that (to be close to the model you have):
>
> var cluster = RiakCluster.FromParams(port: 8098, transport:
> RiakClient.HttpTransport);
> var client = cluster.CreateClient();
>
> The above statement (RiakCluster.FromParams(...)) can optionally take
> an array of strings to represent IP addresses for each node in the
> cluster defaulting to one node on the localhost if nothing is
> specified.
>

This approach would require every Riak node to run the exact same
configuration. This goes all the way to the number of
connections in the pool per node and the ports that REST and PBC are running
on. While this may be the most common scenario (and probably the recommended
one) it doesn't allow you to specify that configuration per node. To add
that support you'd have to add an (optional?) array for each of those
fields, at which point you're better off having objects/records which
represent each node. Now we're moving back towards what we already have.
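
To make the "one record per node" point concrete, here is a small sketch.
`NodeSettings` and `ClusterSettings` are hypothetical names, and the hosts,
ports and pool sizes are assumed values picked for illustration; this is not
CI's actual configuration model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for one node; property names are illustrative only.
public sealed class NodeSettings
{
    public string Host { get; set; }
    public int PbcPort { get; set; }
    public int RestPort { get; set; }
    public int PoolSize { get; set; }
}

public static class ClusterSettings
{
    // Each node carries its own values, so nothing forces them to match.
    public static List<NodeSettings> Example()
    {
        return new List<NodeSettings>
        {
            new NodeSettings { Host = "node-a.example", PbcPort = 8087, RestPort = 8098, PoolSize = 20 },
            new NodeSettings { Host = "node-b.example", PbcPort = 9087, RestPort = 9098, PoolSize = 5 },
        };
    }

    // Total connections the client would hold across the cluster.
    public static int TotalConnections(IEnumerable<NodeSettings> nodes)
    {
        return nodes.Sum(n => n.PoolSize);
    }
}
```

Compare this with a single call taking parallel optional arrays for hosts,
ports and pool sizes: the record form keeps each node's values together.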

If the list of IP addresses/host names was optional, where should we connect
to if they don't specify a host? The idea of "sensible defaults" is almost
impossible to apply. Every deployment of Riak should have its parameters
carefully considered and tweaked appropriately so that it is optimal for
that scenario.

Again, other than adding this to a REPL, I don't see this happening in
production.

> As for XML... I personally try to stay away from it and most of my
> configs are in JSON even in .NET apps :-) By the way, you are using a
> good JSON library, which is pretty good at dealing with not always
> perfectly formatted JSON, so you can easily support JSON-based
> configs.
>

If we were talking about streams of data that are pushed from client to
server and vice versa then I would agree and I would look to use JSON. JSON
is much nicer and easier to deal with.

For one-off configuration, I really don't see an issue. XML and JSON are
really no different when it's a single-shot configuration block. I think the
XML vs JSON issue really isn't the deal breaker; for me it's the question of
where the configuration lives.

Almost every production .NET application has some sort of app.config or
web.config. This is where configuration lives and is where .NET people go
when they have their own configuration or if they're looking to modify
existing configuration. If we were to create a JSON-based configuration over
an XML one, where would it go? Bearing in mind that .NET config is XML by
default, we'd either have to have the files elsewhere on disk (resulting in
more assets to manage) or put the configuration in an XML field. The thought
of either of these doesn't appeal to me at all. What makes sense to me, and
to many other .NET people, is to remain consistent with the approach that
.NET takes as a whole, and use the existing features that come out of the
box. That is, use the configuration sections/elements/etc, use the
attributes that define the properties to get rudimentary validation and easy
deserialisation.

I just don't find the argument for JSON configuration compelling. I'm
certainly open to being convinced, but I'm yet to see an argument which
really tells me that I should go out on a limb and put in something
different to what most of the .NET world is already doing. And let's be
honest, we're talking about *configuration* here. Once it's done, it's done.
It's not really an operational expense, a real-time data risk or anything
else that can drastically affect the success of your application. We're
literally just saying "Hey, CI, please talk to that thing over there".

> Once again I'd like to say that you did a great job.

Thank you again for the feedback. It's always hard to get feedback of any
kind, and Jeremiah and I really appreciated it.

> Having inline API
> docs would be helpful.

Agreed. Jeremiah has been bugging me about this (and rightly so). My
argument so far for not having it has been that our code churn has been
really high while building the early version of the library. Keeping code
comments and inline API documentation in sync while building it would have
been a huge overhead for us. Now that we're getting to the point where the
API is stable, we'll be investing the time to make sure that the comments
(which ironically have to be XML ;)) will be added.

> And having a simple way to create clients
> without having to deal with configs would be great (especially
> considering the .NET 4.0 bug you are stuck with).
>
>
Again, I'm not too concerned about the bug. I think it's an edge case that
won't hurt us at all. I also think that most users don't mind having a bit
of XML config as it's something they end up dealing with most of the time
anyway.

Having said all this, I don't want to appear ignorant of the needs of the
users. My question is: does a block of XML configuration really stop you
from using the library?

Thank you again for your comments. We've dived into the configuration stuff
a bit here, but I'm still *very* interested in hearing about the other areas
of the API which you found complicated. Configuration is one thing that,
even if you don't like it, you can get past without too much pain. API
complexity is a very different and, in my view, a much more important issue
to fix. So when you have time I would love to hear your thoughts on where
the complexity has made your use of CI painful.

All the best!
OJ

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
