This new part of the series focuses on the use of fast compressors such as
Snappy to improve access speed to image data:

https://medium.com/@p.rozas.larraondo/divide-compress-and-conquer-building-an-earth-data-server-in-go-part-2-88670cafc167

IMO fast compressors will play a very important role in the design of
high-performance systems, as a way to overcome the RAM speed limitations
that CPUs face. Feedback, comments and experiences from the community are
highly appreciated!

Pablo

On Thu, Dec 21, 2017 at 2:04 AM, Michael Jones <michael.jo...@gmail.com>
wrote:

> Certainly as you say, individual user patterns are not generally
> predictable. Sometimes aggregate patterns can be. The "sea of tiles" is the
> natural design and works great in normal cases. It seems the way to teach
> it in any case.
>
> Where the filesystem issue comes in would be, for example, the nominal 1
> meter per pixel Google Earth, which in plate carrée or like form with
> 400x400 pixel tiles consists of 253,701,184 tiles at that one ground sample
> distance. That is a lot for "ls" and a lot for most file systems to enjoy
> quickly accessing in a single directory. A pyramidal reduced resolution
> dataset hierarchy will require 4/3rds of this in total, or 338,268,246
> tiles. Finer detail, such as 50cm advanced satellite and 10cm aerial/drone
> images, scales the portions covered by 4x to 100x. So in the limit one will
> find the edge of what any OS designer expects "sane" developers to need.
> :-)
>
> On Wed, Dec 20, 2017 at 3:33 PM, Pablo Rozas Larraondo <
> p.rozas.larrao...@gmail.com> wrote:
>
>> Hi Michael,
>>
>> Thanks for your comments, I totally agree with them. File systems will
>> struggle with the explosion of files resulting from the tile operation. As
>> you point out, other formats, such as geoTIFF, HDF5 or NetCDF define the
>> tiling or chunking process internally at the file level.
>>
>> The reason for creating the tiles as individual files in the article was
>> because this is ultimately intended to be stored on the cloud as objects
>> (this will be covered in the 3rd article). As far as I know, cloud object
>> stores (e.g. AWS S3, Google Cloud Storage) do not limit the number of
>> objects stored in a bucket (if someone has more information about this,
>> please share). That is why I proposed to split the tiles as separate
>> files in the article.
>>
>> I also find the caching considerations quite amusing. It is a complex
>> matter and, in my experience, cache optimisations are quite dependent on
>> the user access patterns, which are normally hard to predict.
>>
>> Cheers,
>> Pablo
>>
>>
>>
>> On Wednesday, December 20, 2017 at 2:24:01 AM UTC+1, Michael Jones wrote:
>>>
>>> Thank you, Pablo. Very helpful to have this kind of step by step example
>>> for Go developers.
>>>
>>> I have some familiarity in this area and I'd say the practical issues in
>>> large-scale, high-throughput operation tend to relate to the native
>>> filesystem. Too many small files overwhelm it and can make directory
>>> lookups slow. Too many directory levels lead to slow filesystem traversal.
>>> Sometimes it can help to dice the big image into small independent tiles
>>> and store those tiles as a mosaic in one's own file type. This is the
>>> nature of TILED vs ROW storage in the TIFF format. The next level of tuning
>>> is about leveraging the operating system's cache of data read from disk in
>>> a productive way. You can have your own cache in RAM, of course, but the OS
>>> likely has that same data cached. There are cases where memory mapping the
>>> small tile files does what you would want.
>>>
>>> There are also dynamic considerations. It may well be that a client
>>> accessing tile [i][j] will soon want one of the eight surrounding tiles.
>>> Over time, it may be that a direction of browsing through tile-space can be
>>> established and this can encourage read-ahead, though the benefit is not
>>> always assured; maybe the accesses are structured and maybe they are not.
>>>
>>> Some high-throughput servers in the era of smart web clients (aka Google
>>> Maps / Leaflet / etc.) refuse to build custom images and only supply tiles
>>> in response to a request--leaving tile assembly to the client.
>>>
>>> Just some thoughts. None of them would help make what you've done any
>>> clearer or more helpful to the reader.
>>>
>>> Best,
>>> Michael
>>>
>>>
>>> On Tue, Dec 19, 2017 at 3:37 PM, Pablo Rozas Larraondo <
>>> p.rozas....@gmail.com> wrote:
>>>
>>>> Thank you Thomas for the link to the vips library. I didn't know about
>>>> it and now I want to read more about its design and internals.
>>>>
>>>> The objective of the article was to set a baseline using the Go image
>>>> library and play with several factors to see how they affect performance.
>>>> In this first article, I wasn't really trying to come up with the fastest
>>>> possible image server but to point out a few basic techniques that can
>>>> improve access speed and reduce memory consumption. These techniques should be
>>>> applicable to any image library, so similar relative performance gains can
>>>> be achieved with any language or library.
>>>>
>>>> The next part, which I'm currently writing, proposes Snappy
>>>> compression as a way of improving access speed to the data.
>>>>
>>>> Cheers,
>>>> Pablo
>>>>
>>>> On Tuesday, December 19, 2017 at 10:28:48 AM UTC+1, Thomas Bruyelle
>>>> wrote:
>>>>>
>>>>> Interesting and nice pieces of code. I wonder if the performance can
>>>>> be compared to something like `vips` (https://jcupitt.github.io/libvips).
>>>>>
>>>>> On Monday, December 18, 2017 at 10:51:49 PM UTC+1, Pablo Rozas Larraondo
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> For those interested in serving or using satellite imagery, I've just
>>>>>> published the first of a three-part series on this subject using Go:
>>>>>>
>>>>>> https://medium.com/@p.rozas.larraondo/divide-compress-and-conquer-building-an-earth-data-server-in-go-part-1-d82eee2eceb1
>>>>>>
>>>>>> Any feedback or comment that you might have would be greatly
>>>>>> appreciated!
>>>>>>
>>>>>> Thanks,
>>>>>> Pablo
>>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "golang-nuts" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to golang-nuts...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>> --
>>> Michael T. Jones
>>> michae...@gmail.com
>>>
>
>
>
> --
> Michael T. Jones
> michael.jo...@gmail.com
>
