Re: HDFS read/write data throttling

2013-11-18 Thread Andrew Wang
https://issues.apache.org/jira/browse/HDFS-5499 On Mon, Nov 18, 2013 at 10:46 AM, Jay Vyas wrote: > Where is the jira for this? > > Sent from my iPhone > > > On Nov 18, 2013, at 1:25 PM, Andrew Wang > wrote: > > > > Thanks for asking, here's a link: > > > > http://www.umbrant.com/papers/socc12

Re: HDFS read/write data throttling

2013-11-18 Thread Jay Vyas
Where is the jira for this? Sent from my iPhone > On Nov 18, 2013, at 1:25 PM, Andrew Wang wrote: > > Thanks for asking, here's a link: > > http://www.umbrant.com/papers/socc12-cake.pdf > > I don't think there's a recording of my talk unfortunately. > > I'll also copy my comments over to the

Re: HDFS read/write data throttling

2013-11-18 Thread Andrew Wang
Thanks for asking, here's a link: http://www.umbrant.com/papers/socc12-cake.pdf I don't think there's a recording of my talk unfortunately. I'll also copy my comments over to the JIRA, though I'd like to not distract too much from what Lohit's trying to do. On Wed, Nov 13, 2013 at 2:54 AM, Ste

Re: HDFS read/write data throttling

2013-11-13 Thread Steve Loughran
This is interesting. I've moved my comments over to the JIRA, and it would be good for yours to go there too. Is there a URL for your paper? On 13 November 2013 06:27, Andrew Wang wrote: > Hey Steve, > > My research project (Cake, published at SoCC '12) was trying to provide > SLAs for mixed wo

Re: HDFS read/write data throttling

2013-11-12 Thread Andrew Wang
Hey Steve, My research project (Cake, published at SoCC '12) was trying to provide SLAs for mixed workloads of latency-sensitive and throughput-bound applications, e.g. HBase running alongside MR. This was challenging because seeks are a real killer. Basically, we had to strongly limit MR I/O to k

Re: HDFS read/write data throttling

2013-11-12 Thread Steve Loughran
I've looked at it a bit within the context of YARN. YARN containers are where this would be ideal, as then you'd be able to request IO capacity as well as CPU and RAM. For that to work, the throttling would have to be outside the App, as you are trying to limit code whether or not it wants to be,

Re: HDFS read/write data throttling

2013-11-11 Thread lohit
2013/11/11 Andrew Wang > Hey Lohit, > > This is an interesting topic, and something I actually worked on in grad > school before coming to Cloudera. It'd help if you could outline some of > your use cases and how per-FileSystem throttling would help. For what I was > doing, it made more sense to t

Re: HDFS read/write data throttling

2013-11-11 Thread Andrew Wang
Hey Lohit, This is an interesting topic, and something I actually worked on in grad school before coming to Cloudera. It'd help if you could outline some of your use cases and how per-FileSystem throttling would help. For what I was doing, it made more sense to throttle on the DN side since you hav

Re: HDFS read/write data throttling

2013-11-11 Thread Haosong Huang
Hi, lohit. There is a class named ThrottledInputStream in hadoop-distcp; you could check it out for more details. In addition to this, I am wor
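
For readers unfamiliar with the idea, here is a minimal sketch of a bandwidth-capped input stream. It is not the hadoop-distcp ThrottledInputStream and does not reproduce its API; the class name SimpleThrottledInputStream, the maxBytesPerSec parameter, and the sleep-based pacing are all assumptions made for illustration. It only shows the general technique of wrapping a stream and sleeping whenever the average rate since construction exceeds a cap.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical illustration only; not the hadoop-distcp class. */
class SimpleThrottledInputStream extends InputStream {
    private final InputStream in;
    private final long maxBytesPerSec;              // assumed cap, bytes per second
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    SimpleThrottledInputStream(InputStream in, long maxBytesPerSec) {
        if (maxBytesPerSec <= 0) {
            throw new IllegalArgumentException("maxBytesPerSec must be positive");
        }
        this.in = in;
        this.maxBytesPerSec = maxBytesPerSec;
    }

    @Override
    public int read() throws IOException {
        throttle();
        int b = in.read();
        if (b >= 0) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        throttle();
        int n = in.read(buf, off, len);
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }

    long getBytesRead() {
        return bytesRead;
    }

    /** Sleeps until the average rate since construction is back under the cap. */
    private void throttle() throws IOException {
        // Earliest elapsed time (in ns) at which bytesRead bytes are permitted.
        long allowedNanos = (long) (bytesRead * 1e9 / maxBytesPerSec);
        long sleepNanos = allowedNanos - (System.nanoTime() - startNanos);
        if (sleepNanos > 0) {
            try {
                Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

A client would wrap any existing stream, for example new SimpleThrottledInputStream(rawStream, 10L * 1024 * 1024) for roughly 10 MB/s. Because the pacing is based on the average since construction, a reader that has been idle can burst briefly; a token bucket would bound that, at the cost of more state.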

Re: HDFS read/write data throttling

2013-11-11 Thread lohit
Hi Adam, Thanks for the reply. The changes I was referring to are in the FileSystem.java layer, which should not affect HDFS Replication/NameNode operations. To give a better idea, this would affect clients something like this: Configuration conf = new Configuration(); conf.setInt("read.bandwitdh.mbpersec",

Re: HDFS read/write data throttling

2013-11-11 Thread Adam Muise
See https://issues.apache.org/jira/browse/HDFS-3475 Please note that this has met with many unexpected impacts on workload. Be careful and be mindful of your Datanode memory and network capacity. On Mon, Nov 11, 2013 at 1:59 PM, lohit wrote: > Hello Devs, > > Wanted to reach out and see if a

HDFS read/write data throttling

2013-11-11 Thread lohit
Hello Devs, Wanted to reach out and see if anyone has thought about the ability to throttle data transfer within HDFS. One option we have been thinking about is to throttle on a per-FileSystem basis, similar to Statistics in FileSystem. This would mean anyone with a handle to HDFS/Hftp will be throttled globa