If it is mandatory, you can use a local cache such as Alluxio.

On 1 June 2017 at 10:23 AM, "Mich Talebzadeh" <mich.talebza...@gmail.com> wrote:
> Thanks Vincent. I assume by physical data locality you mean you are going
> through Isilon and HCFS and not through direct HDFS.
>
> Also, I agree with you that the shared network could be an issue as well.
> However, it allows you to reduce data redundancy (you do not need R3, i.e.
> threefold replication, in HDFS anymore) and you can also build virtual
> clusters on the same data. One cluster for reads/writes and another for
> reads? That is what has been suggested.
>
> regards
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 1 June 2017 at 08:55, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> I don't recommend this kind of design because you lose physical data
>> locality and you will be affected by "bad neighbors" that are also using
>> the network storage... We have one similar design, but restricted to small
>> clusters (more for experiments than production).
>>
>> 2017-06-01 9:47 GMT+02:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>
>>> Thanks Jorn,
>>>
>>> This was a proposal made by someone, as the firm is already using this
>>> tool on other SAN-based storage and wants to extend it to Big Data.
>>>
>>> On paper it seems like a good idea; in practice it may be a Wandisco
>>> scenario again. Of course, as ever, one needs to ask EMC for reference
>>> calls and whether anyone is using this product in anger.
>>>
>>> At the end of the day it's not HDFS. It is OneFS with an HCFS API.
>>> However, that may suit our needs, but we would need to PoC it and test it
>>> thoroughly!
>>>
>>> Cheers
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 1 June 2017 at 08:21, Jörn Franke <jornfra...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have done this (not Isilon, but another storage system). It can be
>>>> efficient for small clusters, depending on how you design the network.
>>>>
>>>> What I have also seen is the microservice approach with object stores
>>>> (e.g. S3 in the cloud, Swift on premises), which is somewhat similar.
>>>>
>>>> If you want additional performance you could fetch the data from the
>>>> object stores and store it temporarily in a local HDFS. Not sure to what
>>>> extent this affects regulatory requirements though.
>>>>
>>>> Best regards
>>>>
>>>> On 31. May 2017, at 18:07, Mich Talebzadeh <mich.talebza...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I realize this may not have direct relevance to Spark, but has anyone
>>>> tried to create virtualized HDFS clusters using tools like ISILON or
>>>> similar?
>>>>
>>>> The prime motive behind this approach is to minimize the propagation or
>>>> copying of data, which has regulatory implications. In short, you want
>>>> your data to be in one place regardless of the artefacts used against
>>>> it, such as Spark.
>>>>
>>>> Thanks,
>>>>
>>>> Dr Mich Talebzadeh
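
To make the local-cache suggestion at the top of this reply a bit more concrete, here is a minimal sketch of the read-through idea in plain Python. It is not Alluxio's API and not anything from the Isilon/HCFS setup discussed above; the class name `ReadThroughCache`, its counters, and the paths are all hypothetical, and ordinary file paths stand in for the shared network storage. The first read of a file copies it to local disk, and later reads are served locally.

```python
import hashlib
import shutil
from pathlib import Path


class ReadThroughCache:
    """Hypothetical helper: copy a file from shared storage to local disk on first read."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.hits = 0    # reads served from the local copy
        self.misses = 0  # reads that had to go to the shared storage

    def _local_path(self, remote_path):
        # Hash the full remote path so that files with the same name
        # in different directories do not collide in the cache.
        digest = hashlib.sha256(str(remote_path).encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def read_bytes(self, remote_path):
        local = self._local_path(remote_path)
        if local.exists():
            self.hits += 1
        else:
            self.misses += 1
            # Copy to a temporary name first, then rename, so that a
            # half-copied file is never mistaken for a valid cache entry.
            tmp = local.with_suffix(".tmp")
            shutil.copyfile(remote_path, tmp)
            tmp.replace(local)
        return local.read_bytes()
```

Usage would look like `cache = ReadThroughCache("/tmp/cache")` followed by `cache.read_bytes("/mnt/shared/part-00000")`, where both paths are placeholders. Note that this sketch never invalidates or evicts entries, and a cached file is still a temporary copy of the data, so whether it is acceptable under the regulatory constraint raised in the original question would need to be checked.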