I have a script that runs as a cron job every minute (on Ubuntu 10.10 and R 2.11.1), querying a database for new data. Most of the time it takes a few seconds to run, but once in a while it takes more than a minute and the next run starts (on the same data) before the previous one has finished. In extreme cases this fills up memory with a large number of runs of the same script on the same data. My 'solution' has been to have the script write a process id file, after first checking whether a pid file already exists and whether the process it names is still running. I use the following code:
# pgrep returns a character vector, so convert to numeric before
# taking max(); otherwise the comparison is lexical ("9999" > "10000")
pid <- max(as.numeric(system("pgrep -x R", intern = TRUE)))

if (file.exists("/var/run/myscript.pid")) {
    oldpid <- read.table("/var/run/myscript.pid")[[1]]
    # ps -p prints a header line plus one line per live process,
    # so two lines of output mean the earlier run is still going
    if (length(system(paste("ps -p", oldpid), intern = TRUE)) == 2) {
        stop("Myscript is already running in another process.")
    } else {
        # the pid file is stale; claim it for this run
        write(pid, "/var/run/myscript.pid")
    }
} else {
    write(pid, "/var/run/myscript.pid")
}

....my script .....

file.remove("/var/run/myscript.pid")  # The End

The trouble here is that I also have other R scripts running on the same system, so while max(as.numeric(system("pgrep -x R", intern = TRUE))) will almost always give me the right pid, it is not guaranteed to. There are two situations where it can fail: when the process id numbers wrap around (at 32768 by default on Linux) and start over, and when another R process starts at the same moment, in which case the two runs could pick up each other's pids. Is there a way to query for the process id of this specific R script, rather than for all R processes?

Mikkel
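P.S. In case it helps clarify what I am after, here is a minimal sketch of the same locking logic with Sys.getpid() (base R) in place of pgrep, so the recorded pid is that of the current R process itself. The acquire_lock name is just for illustration; the code is untested on 2.11.1, and the gap between the liveness check and the write is still not atomic, so two runs starting in the same instant could both pass the check:

acquire_lock <- function(pidfile = "/var/run/myscript.pid") {
    if (file.exists(pidfile)) {
        oldpid <- scan(pidfile, what = integer(), quiet = TRUE)
        # two lines from ps -p (header + process) mean the pid is alive
        alive <- length(system(paste("ps -p", oldpid),
                               intern = TRUE, ignore.stderr = TRUE)) == 2
        if (alive)
            stop("Myscript is already running in another process.")
    }
    write(Sys.getpid(), pidfile)  # pid of this exact R process
    invisible(pidfile)
}

pidfile <- acquire_lock()
## ... my script ...
file.remove(pidfile)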