Hello,

Spark collects HDFS read/write metrics per application/job; see the details at
http://spark.apache.org/docs/latest/monitoring.html.

I have connected the Spark metrics to Graphite and am displaying nice graphs
in Grafana.
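
For reference, wiring Spark's metrics system to Graphite only takes a few lines
in conf/metrics.properties (the host, port, and prefix below are placeholders
for your own setup):

```properties
# Send all Spark metrics to a Graphite/Carbon endpoint
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=spark

# Optionally also report JVM metrics from driver and executors
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
```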

BR,

Arek

On Thu, Dec 31, 2015 at 2:00 PM, Steve Loughran <ste...@hortonworks.com> wrote:
>
>> On 30 Dec 2015, at 13:19, alvarobrandon <alvarobran...@gmail.com> wrote:
>>
>> Hello:
>>
>> Is there any way of monitoring the number of bytes or blocks read and written
>> by a Spark application? I'm running Spark on YARN and I want to measure how
>> I/O-intensive a set of applications is. The closest thing I have seen is the
>> HDFS DataNode logs in YARN, but they don't seem to contain reads and writes
>> specific to Spark applications.
>>
>> 2015-12-21 18:29:15,347 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>> /127.0.0.1:53805, dest: /127.0.0.1:50010, bytes: 72159, op: HDFS_WRITE,
>> cliID: DFSClient_NONMAPREDUCE_-1850086307_1, offset: 0, srvID:
>> a9edc8ad-fb09-4621-b469-76de587560c0, blockid:
>> BP-189543387-138.100.13.81-1450715936956:blk_1073741837_1013, duration:
>> 2619119
>> hadoop-alvarobrandon-datanode-usuariop81.fi.upm.es.log:2015-12-21
>> 18:29:15,429 INFO org.apache.hadoop.hdfs.server.d
>>
>> Is there any trace about this kind of operations to be found in any log?
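
(As a rough ad-hoc approach, the clienttrace lines above can be summed
directly; a quick sketch, with the field layout assumed from the sample line
and likely to vary across Hadoop versions:)

```python
import re
from collections import defaultdict

# Matches the "bytes: N, op: OP" fields in a DataNode clienttrace line,
# as in the sample log excerpt above.
CLIENTTRACE = re.compile(r"bytes: (\d+), op: (\w+)")

def total_bytes_per_op(lines):
    """Sum bytes per HDFS operation (e.g. HDFS_READ, HDFS_WRITE)."""
    totals = defaultdict(int)
    for line in lines:
        m = CLIENTTRACE.search(line)
        if m:
            totals[m.group(2)] += int(m.group(1))
    return dict(totals)
```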
>
>
> 1. The HDFS NameNode and DataNodes all collect metrics on their use, with 
> org.apache.hadoop.hdfs.server.datanode.metrics.DataNodeMetrics being the most 
> interesting for I/O.
> 2. FileSystem.Statistics is a static structure collecting counts of operations 
> and bytes for each thread in a client process.
> 3. The HDFS input stream also exposes some read statistics (ReadStatistics 
> via getReadStatistics).
> 4. Recent versions of HDFS are also adding HTrace support, to trace 
> end-to-end performance.
>
> I'd start with FileSystem.Statistics; if that's not currently being collected 
> across Spark jobs, it should be possible to add.
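
To make point 2 above concrete: FileSystem.Statistics accumulates counters
per thread and aggregates them on demand. A toy sketch of that pattern in
Python (purely illustrative, not the Hadoop implementation):

```python
import threading

class PerThreadStats:
    """Each thread accumulates its own byte counters; totals are
    aggregated across all threads on demand, mirroring the pattern
    behind Hadoop's FileSystem.Statistics."""

    def __init__(self):
        self._local = threading.local()   # per-thread counter dict
        self._all = []                    # registry of every thread's counters
        self._lock = threading.Lock()

    def _counters(self):
        # Lazily create this thread's counters and register them.
        if not hasattr(self._local, "c"):
            self._local.c = {"bytes_read": 0, "bytes_written": 0}
            with self._lock:
                self._all.append(self._local.c)
        return self._local.c

    def record_read(self, n):
        self._counters()["bytes_read"] += n

    def record_write(self, n):
        self._counters()["bytes_written"] += n

    def totals(self):
        # Aggregate across all threads that ever recorded anything.
        with self._lock:
            return {
                "bytes_read": sum(c["bytes_read"] for c in self._all),
                "bytes_written": sum(c["bytes_written"] for c in self._all),
            }
```

In the real API you would read FileSystem.getAllStatistics() (or the
per-scheme FileSystem.getStatistics) on the client side instead.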
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
