Hi,

Thanks for sending the updated information. To make sure I know how the files relate, let's take snow-3.2.x, BiocParallel 1.3.34 run on July 14 as an example.

memory files:

The mem_email files report max virtual memory (Max vmem) on the master, for the whole job. I see 3 outputs here - one run from July 9 and two from July 14:

https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/mem_emails/snow-3.2.x.txt

This last entry corresponds to BiocParallel 1.3.34 from July 14:

Job 6463239 (snow-3.2.x) Complete
 User             = lcollado
 Queue            = share...@compute-082.cm.cluster
 Host             = compute-082.cm.cluster
 Start Time       = 07/14/2015 12:20:48
 End Time         = 07/14/2015 12:22:27
 User Time        = 00:00:18
 System Time      = 00:00:04
 Wallclock Time   = 00:01:39
 CPU              = 00:10:14
 Max vmem         = 7.286G
 Exit Status      = 0


log files:

Corresponding log files for BiocParallel 1.3.34 on July 14 show memory used on workers:

https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463213
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/snow-3.2.x.o6463239



On 07/14/2015 10:23 AM, Leonardo Collado Torres wrote:
Hi Valerie,

My other recent thread about SnowParam
(https://stat.ethz.ch/pipermail/bioc-devel/2015-July/007788.html)
allowed me to see a small difference.

In R 3.1.x, using 'outfile' as in
https://github.com/lcolladotor/SnowParam-memory/blob/d9d70086016c7e720714bec48f121aeca05b1416/SnowParam-memory-derfinder.R#L83
showed more info than in R 3.2.0 and 3.2.1 as shown below:

* R 3.1.x 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.e6416599
* R 3.2.0 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6416601
* R 3.2.1 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.e6416603

using BiocParallel versions

* 1.0.3 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.1.x.o6416599#L57
* 1.2.8 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6416601#L58
* 1.3.33 
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.x.o6416603#L58

Basically, only some of the message() calls get printed in stderr in R
3.2.0 and 3.2.1 which is something I wasn't expecting. However, using
"log = TRUE", all the message() calls get printed in stdout, as
expected given the docs, in which case using 'outfile' is kind of
useless.

The idea was to phase out the snow-centric 'outfile' and replace it with 'log' and 'logdir'. You're right, previous versions of BiocParallel did not capture stdout and stderr on the workers; the current release and devel do. As you've now seen, when log = TRUE stderr and stdout messages print to the console, if logdir is given they go to a file. One thing I have not done yet is capture stdout and stderr from the master to a 'logdir' file (workers captured only).


See below for the output under R 3.2.0 when log = TRUE
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.o6463201
https://github.com/lcolladotor/SnowParam-memory/blob/gh-pages/logs/der-snow-3.2.e6463201

Assuming that it's all working as intended, I'll switch to using log =
TRUE in some of my code when users specify 'verbose = TRUE'.

Sounds good.


Cheers,
Leo

PS In page 1 of
http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf
there's a typo. It reads "SnowParma" instead of "SnowParam".

Thanks, I'll fix that.

Valerie


On Fri, May 22, 2015 at 5:33 PM, Valerie Obenchain
<voben...@fredhutch.org> wrote:
Hi,

Thanks for reporting the bug. Now fixed in 1.13.4 (devel) and 1.2.2
(release).

bplapply(1:2, print, BPPARAM=SnowParam(outfile = NULL))
starting worker for localhost:11031
starting worker for localhost:11031
Type: EXEC
[1] 1
Type: EXEC
[1] 2
Type: DONE
Type: DONE
[[1]]
[1] 1

[[2]]
[1] 2


Some new features have been added to BiocParallel - logging with
futile.logger, writing out log and result files and control over how a job
is divided into tasks.

## send log file to 'logdir':
bplapply(1:2, print, BPPARAM=SnowParam(log=TRUE, logdir=tempdir())

## write results to 'resdir':
bplapply(1:2, print, BPPARAM=SnowParam(resdir=tempdir())

## by default jobs are divided evenly over the workers
bplapply(1:100, print, BPPARAM=SnowParam(workers=4))

## force the job to be run in 2 tasks
## (useful when you know runtime or memory requirements)
bplapply(1:100, print, BPPARAM=SnowParam(tasks=2))


It would be great if you had a chance to try any of these out - I'd be
interested in the feedback. Logging was intended to take the idea of the
'outfile' from snow further with the ability to add messages that can be
filtered by threshold.

The package also has a new errors/logging vignette:

http://www.bioconductor.org/packages/3.2/bioc/vignettes/BiocParallel/inst/doc/Errors_Logs_And_Debugging.pdf

Valerie




On 05/21/2015 07:58 AM, Leonardo Collado Torres wrote:

Hi,

This might be a BioC support website question, but maybe it's a bug.

In previous versions of BiocParallel, you could specify where to
direct the output from the workers by using the 'outfile' argument.
For example, SnowParam(outfile = Sys.getenv('SGE_STDERR_PATH')). I'm
not finding how to do so with the current version (1.3.12).

I understand that SnowParam() has a ... argument that gets passed to
snow::makeCluster(), according to the SnowParam() docs. I also see
that when log = TRUE, a script is used instead of snow. But log =
FALSE by default.

If I use snow, the 'outfile' argument does work as shown below:

## Nothing gets printed by default

cl <- makeCluster(2, type = 'SOCK')
y <- clusterApply(cl, 1:2, print)



## Use outfile now, print works

cl <- makeCluster(2, type = 'SOCK', outfile = NULL)

starting worker for localhost:11671
starting worker for localhost:11671

y <- clusterApply(cl, 1:2, print)

Type: EXEC
Type: EXEC
[1] 1
[1] 2

packageVersion('snow')

[1] ‘0.3.13’


However, I can't use 'outfile' with SnowParam:

## SerialParam works

x <- bplapply(1:2, print, BPPARAM = SerialParam())

[1] 1
[1] 2

## No printing by default with SnowParam, as expected

x <- bplapply(1:2, print, BPPARAM = SnowParam(workers = 1))


## Can't pass 'outfile' argument

x <- bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile =
NULL))

Error in bplapply(1:2, print, BPPARAM = SnowParam(workers = 1, outfile
= NULL)) :
    error in evaluating the argument 'BPPARAM' in selecting a method for
function 'bplapply': Error in envRefSetField(.Object, field, classDef,
selfEnv, elements[[field]]) :
    ‘outfile’ is not a field in class “SnowParam”


Digging at the code, I see that in SnowParam() the ... argument is
saved in .clusterargs

args <- c(list(spec = workers, type = type), list(...))
      .clusterargs <- lapply(args, force)

However, ... is still passed to .SnowParam(), and .SnowParam() fields are:

fields=list(
          cluster="cluster",
          .clusterargs="list",
          .controlled="logical",
          log="logical",
          threshold="ANY",
          logdir="character",
          resultdir="character")


So, I'm wondering if ... should not be passed to .SnowParam().



## Trying to pass outfile to .clusterargs directly doesn't work
## Actually, I'm surprised it didn't crash .SnowParam()

x <- bplapply(1:2, print, BPPARAM = SnowParam(workers = 1,
.clusterargs=list(outfile = NULL)))



Full log: https://gist.github.com/9ac957c6cad1c07f4ea4


Thanks,
Leo

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: voben...@fredhutch.org
Phone: (206) 667-3158


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: voben...@fredhutch.org
Phone: (206) 667-3158

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to