Great. Post your patch at reviews.apache.org Thanks Rajesh Battala
-----Original Message----- From: Saurav Lahiri [mailto:saurav.lah...@sungard.com] Sent: Wednesday, March 19, 2014 8:49 PM To: dev@cloudstack.apache.org Subject: Re: system vm disk space issue in ACS 4.3 Thanks Rajesh. I have created a jira ticket for this https://issues.apache.org/jira/browse/CLOUDSTACK-6258. Will send in the fix for review in a couple of days. Thanks Saurav On Wed, Mar 19, 2014 at 8:03 PM, Rajesh Battala <rajesh.batt...@citrix.com>wrote: > Can you please file a bug and send your fix for review. > > Thanks > Rajesh Battala > > -----Original Message----- > From: Saurav Lahiri [mailto:saurav.lah...@sungard.com] > Sent: Wednesday, March 19, 2014 7:20 PM > To: dev@cloudstack.apache.org > Subject: Re: system vm disk space issue in ACS 4.3 > > The problem appears to be the start function in the /etc/init.d/cloud > service for console proxy. > More specifically the following line also writes to /var/log/cloud.out > > > ---------------------------------------------------------------------- > ------------------------------------------------------------ > (cd $CLOUD_COM_HOME/systemvm; nohup ./run.sh > > /var/log/cloud/cloud.out > 2>&1 & ) > > ---------------------------------------------------------------------- > ------------------------------------------------------------ > > since run.sh calls _run.sh and both has "set -x" enabled, in certain > situations they can keep logging messages to cloud.out without being > aware of the settings in log4j-cloud.xml > > > One way to fix that could be that run.sh and _run.sh would log to > cloud.out only if a debug flag was set to true, otherwise only the > java process would write to cloud.out and log4j would respect the > settings in log4j-cloud.xml > > > Thanks > Saurav > > > > On Mon, Mar 17, 2014 at 8:47 PM, Saurav Lahiri > <saurav.lah...@sungard.com > >wrote: > > > Could it have something to do with the RollingFileAppender that is > > being used. > > The following > > rollingfileappender<http://apache-logging.6191.n7.nabble.com/Rolling > > Fi leAppender-not-working-consistently-td8582.html> link appears to > > be a > bit outdated but they more or less describe a similar problem that we > are seeing? > > > > > > On our environment that is what we have seeing for sometime on > > console proxy. The root filesystem goes full with the cloud.out.* > > occupying all the space. This happens pretty frequently and we have > > to regularly recycle the console proxy to resolve this issue. > > > > > > As seen below, cloud.out.2 should not have exceeded 10MB but it > > stands at 217MB now. > > > > drwxr-xr-x 2 root root 4.0K Mar 17 14:57 . > > drwxr-xr-x 8 root root 4.0K Mar 17 15:01 .. > > -rw-r--r-- 1 root root 0 Mar 12 18:18 api-server.log > > -rw-r--r-- 1 root root 357K Mar 17 15:06 cloud.out > > -rw-r--r-- 1 root root 2.1M Mar 17 14:56 cloud.out.1 > > -rw-r--r-- 1 root root 217M Mar 17 15:06 cloud.out.2 > > > > root@v-zzzz-VM:/var/log/cloud# lsof | grep cloud.out > > sleep 649 root 1w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > sleep 649 root 2w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2312 root 1w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2312 root 2w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2339 root 1w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2339 root 2w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2786 root 1w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > bash 2786 root 2w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > java 2805 root 1w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > java 2805 root 2w REG 202,1 226122291 181737 > > /var/log/cloud/cloud.out.2 > > java 2805 root 116w REG 202,1 319382 181769 > > /var/log/cloud/cloud.out > > root@v-zzzz-VM:/var/log/cloud# ls -alh > > > > Thanks > > Saurav > > > > > > On Tue, Mar 11, 2014 at 7:58 AM, Chiradeep Vittal < > > chiradeep.vit...@citrix.com> wrote: > > > >> Yes, it was deliberate. I can¹t find the discussion, but it > >> revolved around a security best practice of having separate > >> partitions for /, /swap, home directories > >> > >> > >> On 3/10/14, 11:35 AM, "Marcus" <shadow...@gmail.com> wrote: > >> > >> >There have been several raised, actually regarding /var/log. As > >> >for the system vm partitioning, it was explicitly changed from > >> >single to multiple partitions last year. I have no idea why, but I > >> >generally don't file bugs without community discussion on things > >> >that seem deliberate. > >> > > >> >On Sat, Mar 8, 2014 at 11:32 AM, Marcus <shadow...@gmail.com> wrote: > >> >> Yeah, I've just seen on busy systems where even with log > >> >>rotation working properly the little space left in var after OS > >> >>files is barely enough, for example the conntrackd log on a busy > >> >>VPC. We actually ended up rolling our own system vm, the > >> >>existing image has plenty of space, its just locked up in other > >> >>partitions. > >> >> > >> >> On Mar 8, 2014 8:58 AM, "Rajesh Battala" > >> >><rajesh.batt...@citrix.com> > >> >>wrote: > >> >>> > >> >>> Yes, only 435MB is available for /var . we can increase the > >> >>> space > >> also. > >> >>> But we need to find out the root cause which services are > >> >>>causing the /var to fill up. > >> >>> Can you please find out and post which log files are taking up > >> >>>more space in /var > >> >>> > >> >>> Thanks > >> >>> Rajesh Battala > >> >>> > >> >>> -----Original Message----- > >> >>> From: Marcus [mailto:shadow...@gmail.com] > >> >>> Sent: Saturday, March 8, 2014 8:19 PM > >> >>> To: dev@cloudstack.apache.org > >> >>> Subject: RE: system vm disk space issue in ACS 4.3 > >> >>> > >> >>> Perhaps there's a new service. I know in the past we've seen > >> >>>issues with this , specifically the conntrackd log. I think the > >> >>>cloud logs weren't getting rolled either, but I thought it was > >> >>>all fixed. > >> >>> > >> >>> There's also simply not a ton of space on /var, I wish we would > >> >>>go back to just having one partition because it orphans lots of > >> >>>free space in other filesystems. > >> >>> On Mar 8, 2014 12:37 AM, "Rajesh Battala" > >> >>><rajesh.batt...@citrix.com> > >> >>> wrote: > >> >>> > >> >>> > AFAIK, log roation is enabled in the systemvm. > >> >>> > Can you check whether the logs are getting zipped .? > >> >>> > > >> >>> > -----Original Message----- > >> >>> > From: Anirban Chakraborty [mailto:abc...@juniper.net] > >> >>> > Sent: Saturday, March 8, 2014 12:46 PM > >> >>> > To: dev@cloudstack.apache.org > >> >>> > Subject: system vm disk space issue in ACS 4.3 > >> >>> > > >> >>> > Hi All, > >> >>> > > >> >>> > I am seeing system vm disk has no space left after running > >> >>> > for few > >> >>>days. > >> >>> > Cloudstack UI shows the agent in v-2-VM in alert state, while > >> >>> > agent state of s-1-VM shows blank (hyphen in the UI). > >> >>> > Both the system vms are running and ssh-able from the host. > >> >>> > The log > >> >>>in > >> >>> > s-1-Vm shows following errors: > >> >>> > > >> >>> > root@s-1-VM:~# grep 'Exception' /var/log/cloud/*.* > >> >>> > /var/log/cloud/cloud.out.2:java.io.IOException: No space left > >> >>> > on device > >> >>> > /var/log/cloud/cloud.out.2:java.io.IOException: No space left > >> >>> > on device > >> >>> > > >> >>> > whereas logs in v-1-VM shows > >> >>> > /var/log/cloud/cloud.out.3:java.io.IOException: No space left > >> >>> > on device > >> >>> > /var/log/cloud/cloud.out.3:java.io.IOException: No space left > >> >>> > on device > >> >>> > /var/log/cloud/cloud.out.3:07:18:00,547 INFO > >> CSExceptionErrorCode:87 > >> >>> > - Could not find exception: > >> >>> > com.cloud.exception.AgentControlChannelException > >> >>> > in error code list for exceptions > >> >>> > > >> >>> > > >> > >> >>>/var/log/cloud/cloud.out.3:com.cloud.exception.AgentControlChann > >> >>>el > >> >>>Except > >> >>>ion: > >> >>> > Unable to post agent control request as link is not available > >> >>> > > >> >>> > Looks like cloud agent is filling up the log, which is > >> >>> > leading to > >> the > >> >>> > disk full state. > >> >>> > > >> >>> > Is this a known issue? Thanks. > >> >>> > > >> >>> > Anirban > >> >>> > > >> > >> > >> > > > >