Proposal for chunked storage (related to Lowlevel Access to RiakCS objects)

2014-03-29 Thread Timo Gatsonides

Related to the recent thread about "Lowlevel Access to RiakCS objects" I plan 
to implement an extension to a Riak Object (in the Golang driver at 
https://github.com/tpjg/goriakpbc) that will cover two use cases:
1) storing larger files directly in Riak (not Riak CS)
2) store growing files, e.g. log files, efficiently in Riak

I have some questions for the Riak community. First: am I about to reinvent the 
wheel, and if so can someone please point me to an example implementation?

If not, I would like to know if there is more interest and maybe more use cases 
so we can develop a standard way to store these objects in Riak, allowing 
access from multiple programming languages. If there is, please read the 
proposal below and provide feedback.

Store the meta-information about a “BigObject” in a regular Riak value, stored 
under its own bucket/key. The object will have the following meta tags:
- segment_size - size of each segment in bytes
- segment_count - the number of segments of the object
- total_size - optional total size of the entire object (see below).

The object’s data would be stored in segments, each under its own key. The 
Content-Type would be the same for all segment objects. Depending on the use 
case, total_size could be filled in.
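
To make the layout concrete, here is a minimal sketch in Go (the language of 
the goriakpbc driver). The struct name, the ToMeta helper and the "-<i>" 
segment-key suffix are illustrative assumptions, not part of the proposal:

package bigobject

import "strconv"

// BigObjectMeta mirrors the meta tags on the meta-information K/V.
type BigObjectMeta struct {
	SegmentSize  int   // segment_size: size of each segment in bytes
	SegmentCount int   // segment_count: number of segments
	TotalSize    int64 // total_size: optional, only set for static objects
}

// ToMeta renders the struct as the string meta tags stored on the Riak value.
func (m BigObjectMeta) ToMeta() map[string]string {
	meta := map[string]string{
		"segment_size":  strconv.Itoa(m.SegmentSize),
		"segment_count": strconv.Itoa(m.SegmentCount),
	}
	if m.TotalSize > 0 {
		meta["total_size"] = strconv.FormatInt(m.TotalSize, 10)
	}
	return meta
}

// SegmentKey derives a segment's key from the base key; the "-<i>"
// suffix is an assumption made for this sketch only.
func SegmentKey(baseKey string, i int) string {
	return baseKey + "-" + strconv.Itoa(i)
}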

Using the two use cases above as an example:
1) Storing large files, e.g. video: use a large segment_size, e.g. 1 MB, and 
store the total_size, since the file will be static and the meta-information can 
be written at once.
2) Storing growing files, e.g. daily log files: use a smaller segment_size, e.g. 
10-100 KB, and do not store total_size, as otherwise each “Append” operation 
would require two PUTs. If a segment grows beyond segment_size, update the 
meta-information K/V; otherwise only PUT the last segment again. In the client 
driver extension a BigObject would keep some state (the meta-information K/V 
and the last segment) to make the Append operation reasonably efficient - note 
though that each append overwrites the last segment.
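
Building on the sketch above, the Append bookkeeping could look roughly like 
this; putSegment and putMeta are placeholders for the actual driver calls, and 
overflow handling is simplified to just starting a new segment:

// BigObject keeps the client-side state that makes Append reasonably
// efficient: the meta-information and the content of the last segment.
type BigObject struct {
	baseKey     string
	meta        BigObjectMeta
	lastSegment []byte
}

// Append overwrites the last segment with the new data added (assumes
// SegmentCount >= 1). Only when the segment has grown beyond SegmentSize
// is the meta-information K/V rewritten as well, which is the second PUT
// the proposal tries to avoid on every append.
func (b *BigObject) Append(data []byte,
	putSegment func(key string, val []byte) error,
	putMeta func(meta BigObjectMeta) error) error {

	b.lastSegment = append(b.lastSegment, data...)

	key := SegmentKey(b.baseKey, b.meta.SegmentCount-1)
	if err := putSegment(key, b.lastSegment); err != nil {
		return err
	}

	if len(b.lastSegment) >= b.meta.SegmentSize {
		// Segment is full: start a new one and update the meta K/V.
		b.meta.SegmentCount++
		b.lastSegment = nil
		return putMeta(b.meta)
	}
	return nil
}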

Any input is highly appreciated.

Kind regards,
Timo

 





Re: Lowlevel Access to RiakCS objects?

2014-03-29 Thread Martin Alpers
Hi Brian,

thanks for your input, it triggered the idea to make the application query the 
cache for a short range; one byte is sufficient.
If this request is served within a very short period, say 25 ms, then the real 
request is forwarded to the cache, and to the backend otherwise.
While it would lose me a few caching opportunities, it looks simpler than 
segmenting, and should still allow me to serve most range requests from cache.
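
For what it's worth, a minimal Go sketch of that probe; the 25 ms budget and 
the one-byte range come from the idea above, while the function name, URL 
handling and error cases are simplified assumptions:

package proxy

import (
	"context"
	"net/http"
	"time"
)

// servedQuickly issues a one-byte range request to the cache. If it is
// answered within ~25 ms we assume the object is cached and forward the
// real request to the cache; otherwise it goes to the backend.
func servedQuickly(cacheURL string) bool {
	ctx, cancel := context.WithTimeout(context.Background(), 25*time.Millisecond)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, cacheURL, nil)
	if err != nil {
		return false
	}
	req.Header.Set("Range", "bytes=0-0") // one byte is sufficient

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false // slow or failed: route the real request to the backend
	}
	resp.Body.Close()
	return true
}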

Best regards,
Martin

On 2014-03-28 18:34, Brian Akins wrote:
> I know of at least one case that uses riak cs for live HLS video at scale, 
> which is somewhat similar to your use case, so this is not uncharted 
> territory. While technically it's possible you can design and implement 
> something more efficient than CS, that may be effort better spent in other 
> areas of your application. JMO.
> 
> Sent from my iPhone
> 
> > On Mar 28, 2014, at 10:53 AM, Martin Alpers  wrote:
> > 
> > Thanks for your answer, Tom.
> > 
> > I wanted to avoid the overhead of one segmenting layer whose objects are 
> > segmented by yet another layer, but come to think of it, since Riak can 
> > store chunks up to 1 MB in size, why not use Riak directly?
> > And I wanted to avoid the additional complexity of code which basically 
> > mimics RiakCS.
> > 
> > But I see nothing to prevent me from doing all that segmentation stuff. 
> > Having slept a night over it, I think it is the way to go.
> > 
> > So thanks for the feedback, and for taking the time to read my question in 
> > the first place.
> > 
> > Best regards,
> > Martin
> > 
> >> Hi Martin,
> >> 
> >> The short answer is no, Riak CS does not expose lower level access to the
> >> blocks that are replicated to Riak.
> >> 
> >> That said, I'm curious, is anything preventing you from segmenting your
> >> videos and writing a playlist + the segments to Riak CS? This would allow
> >> you to seed varnish, while at the same time reducing the cost of a cache
> >> miss to something reasonable for users.
> >> 
> >> Regards,
> >> Tom
> >> 
> >>> On Thu, Mar 27, 2014 at 3:05 PM, Martin Alpers  
> >>> wrote:
> >>> 
> >>> Hi all,
> >>> 
> >>> is there a canonical way to access RiakCS objects on a lower level? If I
> >>> remember correctly, RiakCS basically distributes larger objects into 
> >>> chunks
> >>> of one megabyte each, and mapreduces them together on retrieval.
> >>> I would like to read those chunks for caching purposes.
> >>> 
> >>> For those interested in why I would want that:
> >>> A Riak/RiakCS cluster is the heart of our yet-to-be-implemented video
> >>> delivery cluster. A video management system will enable registered users to
> >>> upload their videos and the public can watch them.
> >>> In order to reduce intra-cluster traffic, we intend to cache the videos,
> >>> preferably in RAM.
> >>> We do not have any numbers on how often users would skip parts of the
> >>> video and generate range requests. If that case is really common, we would
> >>> prefer to serve them from cache as well; at least with Varnish and Squid,
> >>> some users would otherwise experience unacceptably long delays.
> >>> We looked for a cache that could pipe through any request for a URL on
> >>> which caching is in progress, and serve from cache afterwards.
> >>> 
> >>> The problem with both Varnish and Squid (and I suppose most caches,
> >>> because this behaviour seems reasonable in most cases) boils down to
> >>> treating caching in progress as a cache hit.
> >>> My colleague started to write his own caching proxy in NodeJS, but using
> >>> asynchronous callbacks to check if a file exists, and to create it if it
> >>> does not, strikes me as somewhat courageous for production.
> >>> 
> >>> Now while we cannot risk letting some users wait for hundreds of megabytes
> >>> to be cached before delivery begins, and while we want at least to be
> >>> prepared to face many more range requests than the average "wget was
> >>> interrupted" case, it occurred to me thata few megabytes are not an issue
> >>> at multi-fast-ethernet speed.
> >>> So if we can split our files into objects small enough, we could code a
> >>> proxy that translates a range request into one or more normal requests for
> >>> those chunks, cuts off a certain offset of the first chunk if the range of
> >>> the original request began somewhere off the boundary, and re-concatenates
> >>> those chunks in correct order for delivery.
> >>> So the cache would never have to be bypassed, and the whole headache of
> >>> telling a complete hit from one "in progress" would be gone.
> >>> 
> >>> Since RiakCS has already split our files into small pieces, and somehow
> >>> tracks them, could we possibly piggyback on that?
> >>> 
> >>> And by the way, I just came across the memory backend. I assume it is
> >>> distributed like the persistent ones, so it will not help me reduce
> >>> internal traffic, right?
> >>> 
> >>> Any input is highly appreciated.
> >>> 
> >>> Best Regards
> >>> Martin
> >>> 
> >>>