> On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 67 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line67> > > > > How stable is this across versions of Linux? It appears to be totally > > non-standard across platforms. For example, FreeBSD doesn't appear to have > > a stat file. It has "status" but the format is different: > > https://www.freebsd.org/cgi/man.cgi?query=procfs&apropos=0&sektion=5&manpath=FreeBSD+11.0-stable&arch=default&format=html
Fixed! With the new ps command based approach, this is slightly more portable. > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 72 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line72> > > > > It looks like you can drop the first two groups - I don't see them used > > anywhere. Nice find. dropped. Thanks for noticing. > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 90 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line90> > > > > Looks like this can be private. Done! > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > lines 150-151 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line150> > > > > Use blocks. Also, should we log a warning? Done. the method procFsFileExists already logs a warning. > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 178 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line178> > > > > You might be able to log an error with a little more context here. > > Also, not sure if log4j is going to emit the stack trace for the error if > > you're asking it to stringify the first arg. Fixed the log lines. they now read : `log.error("Exception during obtaining statistics: ", e)` > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 181 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line181> > > > > We should probably be catching exceptions here to prevent one listener > > from blowing up others. I believe the code you copied this from had the > > same flaw (shame on the original author :P). lol, https://s-media-cache-ak0.pinimg.com/originals/c6/5f/17/c65f1775a6c152947a2081ac72c0d885.gif fixed the original author's version too ;-) > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > line 249 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line249> > > > > In the code you copied this from I prefix the thread name with > > "Samza-". The reason to do that is it becomes much easier when looking at a > > thread dump to determine if the thread is from Samza or thirdy party code. > > I'd recommend indicating this is a Samza thread somehow. Thanks! Fixed it. > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > lines 140-141 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line140> > > > > Prefer blocks '{' ... '}' for better safety as the code is changed. Fixed! > On July 12, 2016, 5:22 p.m., Chris Pettitt wrote: > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java, > > lines 93-94 > > <https://reviews.apache.org/r/49877/diff/1/?file=1441036#file1441036line93> > > > > It's the default, but it can be changed. If the page size is off it > > could result in undercounting. Ideally we would rely on something like `getconf PAGE_SIZE` to get the page size, and scale it. But, with the revised RB, the default behavior is to use `ps`. The Procfs based codepath only exists if we want to add more stuff to the monitor in the future. - Jagadish ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/49877/#review141915 ----------------------------------------------------------- On July 12, 2016, 12:17 a.m., Jagadish Venkatraman wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/49877/ > ----------------------------------------------------------- > > (Updated July 12, 2016, 12:17 a.m.) > > > Review request for samza, Boris Shkolnik, Chris Pettitt, Fred Ji, Jake Maes, > Yi Pan (Data Infrastructure), and Navina Ramesh. > > > Repository: samza > > > Description > ------- > > This feature introduces physical memory monitoring in SamzaContainer. > > Context: > Often memory used by the SamzaContainer process includes > A. JVM Heap memory: This is where all JVM variables live. > B. Native memory: This memory lives out of the JVM heap and is not visible to > the JVM. Examples include used by RocksDb, native libraries that user code > depends on etc. > > User jobs could be killed by Yarn if their total memory (A+B) exceeds the > configured maximum of yarn.container.memory.mb. > > Currently, while our existing metrics provide visibility into [A] via JMX, we > don't have visibility into [B]. (as it's totally external to the JVM). > > This feature uses Linux ProcFS to provide a complete view of the memory (both > A & B) to help Samza users understand memory better. (Schedulers like Apache > Yarn that require a holistic view of memory (A+B) also use ProcFS. For the > curious, here's the Yarn implementation - > http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-yarn-common/0.23.1/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java > that inspired this idea) > > Scope: The scope of this RB is only to Linux distributions. (Mac based > implementation may be a separate change list.) > > > Diffs > ----- > > build.gradle ba4a9d14fe24e1ff170873920cd5eeef656955af > checkstyle/import-control.xml 325c38131047836dc8aedaea4187598ef3ba7666 > > samza-core/src/main/java/org/apache/samza/container/host/ProcfsBasedStatisticsMonitor.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsMonitor.java > PRE-CREATION > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala > 18c09224bbae959342daf9b2b7a7d971cc224f48 > > samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala > 2044ce01ffded8434e762d99355d5df43642c66b > > samza-core/src/test/java/org/apache/samza/container/host/TestProcfsBasedStatisticsMonitor.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/49877/diff/ > > > Testing > ------- > > 1. Unit tests with mock PROC-FS snapshots of processes > 2. Deployed actual jobs on my dev box. > 2.1 Obtained the operating system's view of the container memory using > 'ps' and other tools. > 2.2 Verified that the total memory reported by the monitor is the same as > the OS's view of memory[2.1] > 3. Tested on various Linux distributions I could find internally: > - RHEL release 6.4, 6.5, 6.6 (Santiago) > > > Thanks, > > Jagadish Venkatraman > >