If that is mandatory, you can use a local cache such as Alluxio.

On 1 June 2017 at 10:23 AM, "Mich Talebzadeh" <mich.talebza...@gmail.com>
wrote:

> Thanks Vincent. I assume by physical data locality you mean you are going
> through Isilon and HCFS and not through direct HDFS.
>
> Also, I agree with you that the shared network could be an issue.
> However, it allows you to reduce data redundancy (you do not need R3 in
> HDFS anymore) and you can also build virtual clusters on the same data. One
> cluster for reads/writes and another for reads? That is what has been
> suggested.
>
> regards
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 1 June 2017 at 08:55, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> I don't recommend this kind of design because you lose physical data
>> locality and you will be affected by "bad neighbors" that are also using
>> the network storage... We have one similar design, but it is restricted to
>> small clusters (more for experiments than production).
>>
>> 2017-06-01 9:47 GMT+02:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>
>>> Thanks Jorn,
>>>
>>> This was a proposal made by someone, as the firm is already using this
>>> tool on other SAN-based storage and the idea is to extend it to Big Data.
>>>
>>> On paper it seems like a good idea; in practice it may be a WANdisco
>>> scenario again. Of course, as ever, one needs to ask EMC for reference
>>> calls and whether anyone is using this product in anger.
>>>
>>>
>>>
>>> At the end of the day it's not HDFS. It is OneFS with an HCFS API.
>>> However, that may suit our needs, but we would need to PoC it and test it
>>> thoroughly!
>>>
>>>
>>> Cheers
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 1 June 2017 at 08:21, Jörn Franke <jornfra...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have done this (not Isilon, but another storage system). It can be
>>>> efficient for small clusters, depending on how you design the network.
>>>>
>>>> What I have also seen is the microservice approach with object stores
>>>> (e.g. S3 in the cloud, Swift on premises), which is somewhat similar.
>>>>
>>>> If you want additional performance you could fetch the data from the
>>>> object stores and store it temporarily in a local HDFS. Not sure to what
>>>> extent this affects regulatory requirements though.
>>>>
>>>> Best regards
>>>>
>>>> On 31. May 2017, at 18:07, Mich Talebzadeh <mich.talebza...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I realize this may not have direct relevance to Spark, but has anyone
>>>> tried to create virtualized HDFS clusters using tools like Isilon or
>>>> similar?
>>>>
>>>> The prime motive behind this approach is to minimize the propagation or
>>>> copying of data, which has regulatory implications. In short, you want
>>>> your data to be in one place regardless of the artefacts used against it,
>>>> such as Spark.
>>>>
>>>> Thanks,
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>
>>
>
