Hi all, Please find attached the diff relative to the code that I cloned from Bioconductor-mirror yesterday (please ignore the previous diff file).
Basically three new features: - As per previous email: display up to the first 6 lines that are over 80 characters long - *New*: display up to the first 6 lines that are not indented by a multiple of 4 spaces - *New*: display up to the first 6 lines that use TAB instead of 4 spaces for indentation I also attach the output of the updated code <https://github.com/kevinrue/BiocCheck>, to illustrate the changes. Notes: - For demonstration purpose, I indented a handful of lines from the checks.R file itself with TAB characters. I assume that's OK, as some lines were already longer than 80 characters and not indented by a multiple of 4 spaces. All the best, Kevin On Wed, Nov 2, 2016 at 10:00 PM, Kevin RUE <kevinru...@gmail.com> wrote: > Me again :) > > Please find attached the first patch to print the first 6 lines over 80 > characters long. (I'll get to the tabulation offenders next). > > Note that all the offending lines are stored in the "df.length" > data.frame. How about an option like "fullReport=c(FALSE, TRUE)" that print > *all* the offending lines? > The data.frame also stores the content of the lines for the record, but > does not print them. I think Kasper is right: filename and line should be > enough to track down the line. > > All the best, > Kevin > > > > On Wed, Nov 2, 2016 at 8:08 PM, Kevin RUE <kevinru...@gmail.com> wrote: > >> Thanks for the feedback! >> >> I also tend to prefer *all* the lines being reported (or to be honest, >> that was really true when I had lots of them; a problem that I largely >> mitigated by fixing all of them once and subsequently paying more attention >> while developing). >> >> Printing the content of the offending line somewhat helps me spot the >> line faster (more so for tab issues). But I must admit that showing the >> whole line is somewhat "overkill". I just started thinking of a compromise >> being to only show the first N characters of the line, with N being 80 >> minus the number of characters necessary to print the filename and line >> number. >> >> Thanks Martin for pointing out the lines in BiocCheck. (Now I feel bad >> for not having checked sooner.. hehe!) >> I think the idea of BiocCheck showing the first 6 offenders in BiocCheck >> quite nice, as I rarely have more since I use using the RStudio "Tools > >> Global Options > Code > Display > Show Margin > Margin column: 80" feature. >> >> I'll give a go at both approaches (developing BiocCheck and my own >> scripts) >> >> Cheers, >> Kevin >> >> >> On Wed, Nov 2, 2016 at 7:41 PM, Kasper Daniel Hansen < >> kasperdanielhan...@gmail.com> wrote: >> >>> I would prefer all line numbers reported, but on the other hand I am >>> indifferent wrt. the content of the line, unless (say) TABs are marked up >>> somehow. >>> >>> Kasper >>> >>> On Wed, Nov 2, 2016 at 3:17 PM, Martin Morgan < >>> martin.mor...@roswellpark.org> wrote: >>> >>>> On 11/02/2016 02:49 PM, Kevin RUE wrote: >>>> >>>>> Dear all, >>>>> >>>>> Just thought I'd share a handful of scripts that I wrote to follow up >>>>> on >>>>> certain NOTE messages thrown by R CMD BiocCheck. >>>>> >>>>> https://github.com/kevinrue/BiocCheckTools >>>>> >>>>> They're very simple, but I occasionally find them quite convenient. >>>>> Apologies if something similar already exists somewhere :) >>>>> >>>> >>>> Maybe consider creating a diff against the source code that, e.g., >>>> reported the first 6 offenders? The relevant lines are near >>>> >>>> https://github.com/Bioconductor-mirror/BiocCheck/blob/master >>>> /R/checks.R#L1081 >>>> >>>> Martin >>>> >>>> >>>>> All the best, >>>>> Kevin >>>>> >>>>> [[alternative HTML version deleted]] >>>>> >>>>> _______________________________________________ >>>>> Bioc-devel@r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel >>>>> >>>>> >>>> >>>> This email message may contain legally privileged and/or...{{dropped:2}} >>>> >>>> >>>> _______________________________________________ >>>> Bioc-devel@r-project.org mailing list >>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel >>>> >>> >>> >> >
diff --git a/R/checks.R b/R/checks.R index 02e3841..9b1f273 100644 --- a/R/checks.R +++ b/R/checks.R @@ -1057,15 +1057,12 @@ checkFormatting <- function(pkgdir) tablines <- 0L badindentlines <- 0L ok <- TRUE - - df.length <- data.frame(stringsAsFactors=FALSE) - df.indent <- data.frame(stringsAsFactors=FALSE) - df.tab <- data.frame(stringsAsFactors=FALSE) + for (file in files) { - pkgname <- getPkgNameFromPkgDir(pkgdir) if (file.exists(file) && file.info(file)$size == 0) { + pkgname <- getPkgNameFromPkgDir(pkgdir) handleNote(sprintf("Add content to the empty file %s.", mungeName(file, pkgname))) } @@ -1075,22 +1072,14 @@ checkFormatting <- function(pkgdir) lines <- readLines(file, warn=FALSE) totallines <- totallines + length(lines) n <- nchar(lines, allowNA=TRUE) - n <- n[!is.na(n)]; lines <- lines[!is.na(n)] + n <- n[!is.na(n)] names(n) <- seq_along(1:length(n)) - long <- which(n > 80) - + long <- n[n > 80] if (length(long)) { - df.i <- 1 - for (i in long){ - df.length[df.i,1] <- mungeName(file, pkgname) # filename - df.length[df.i,2] <- names(n)[i] # line number - df.length[df.i,3] <- lines[i] # content - df.length[df.i,4] <- n[i] # length - df.i <- df.i + 1 - } - + ## TODO/FIXME We could tell the user here which lines are long + ## in which files. longlines <- longlines + length(long) } @@ -1098,89 +1087,42 @@ checkFormatting <- function(pkgdir) if (any(tabs)) { tablines <- tablines + length(which(tabs)) - df.i <- 1 - for (i in which(tabs)){ - df.tab[df.i,1] <- mungeName(file, pkgname) # filename - df.tab[df.i,2] <- names(n)[i] # line number - df.tab[df.i,3] <- lines[i] # content - df.i <- df.i + 1 - } } res <- regexpr("^([ ]+)", lines) if (any(res > -1)) { match.length <- attr(res, "match.length") - badindents <- which( - match.length > -1 & - match.length %% 4 != 0 - ) - badindentlinesinthisfile <- length(badindents) + indents <- match.length[match.length > -1] + badindentlinesinthisfile <- length(which(indents %% 4 != 0)) badindentlines <- badindentlines + badindentlinesinthisfile } - if (badindentlinesinthisfile){ - df.i <- 1 - for (i in badindents){ - df.indent[df.i,1] <- mungeName(file, pkgname) # filename - df.indent[df.i,2] <- names(n)[i] # line number - df.indent[df.i,3] <- lines[i] # content - df.indent[df.i,4] <- match.length[i] # indent width - df.i <- df.i + 1 - } - } } } - colnames(df.length) <- c("File", "Line", "Content", "Length") - colnames(df.tab) <- c("File", "Line", "Content") - colnames(df.indent) <- c("File", "Line", "Content", "Indent") - if (longlines > 0) { ok <- FALSE - h.length <- head(df.length) handleNote(sprintf( "Consider shorter lines; %s lines (%i%%) are > 80 characters long.", longlines, as.integer((longlines/totallines) * 100))) - message(sprintf(" The first %i lines are:", nrow(h.length))) - for (i in 1:nrow(h.length)) - { - row <- h.length[i,] - message(sprintf(" %s (line %s): %s characters", - row$File, row$Line, row$Length)) - } } if (tablines > 0) { ok <- FALSE - h.tab <- head(df.tab) handleNote(sprintf( "Consider 4 spaces instead of tabs; %s lines (%i%%) contain tabs.", tablines, as.integer((tablines/totallines) * (100/1) ))) - message(sprintf(" The first %i lines are:", nrow(h.tab))) - for (i in 1:nrow(h.tab)) - { - row <- h.tab[i,] - message(sprintf(" %s (line %s)", - row$File, row$Line)) - } } if (badindentlines > 0) { ok <- FALSE - h.tab <- head(df.indent) handleNote(sprintf(paste( "Consider indenting lines with a multiple of 4 spaces;", "%s lines (%i%%) are not."), badindentlines, as.integer((badindentlines/totallines) * 100))) - message(sprintf(" The first %i lines are:", nrow(h.tab))) - for (i in 1:nrow(h.tab)) - { - row <- h.tab[i,] - message(sprintf(" %s (line %s): %s spaces", - row$File, row$Line, row$Indent)) - } + } if (!ok)
> BiocCheck("BiocCheck") * This is BiocCheck, version 1.11.0. * BiocCheck is a work in progress. Output and severity of issues may change. * Installing package... * Checking vignette directory... * This is a software package, checking vignette directories... # of chunks: 3, # of eval=FALSE: 2 (66%) * WARNING: Evaluate more vignette chunks. * Checking version number... * Checking version number validity... * Checking R Version dependency... * WARNING: Update R version dependency from 3.3.0 to 3.4. * Checking biocViews... * Checking that biocViews are present... * Checking for non-trivial biocViews... * Checking that biocViews come from the same category... * Checking biocViews validity... * Checking for recommended biocViews... * Checking build system compatibility... * Checking for blank lines in DESCRIPTION... * Checking for whitespace in DESCRIPTION field names... * Checking that Package field matches dir/tarball name... * Checking for Version field... * Checking for valid maintainer... * Checking unit tests... * Checking native routine registration... * Checking for deprecated package usage... * Checking parsed R code in R directory, examples, vignettes... * Checking for direct slot access... * Checking for T... * Checking for F... * Checking for browser()... * Checking for <<-... Found <<- in R/checks.R (line 521, column 14) * NOTE: Avoid '<<-'' if possible (found in 1 files) * Checking for library/require of BiocCheck... * Checking DESCRIPTION/NAMESPACE consistency... * WARNING: Import utils in DESCRIPTION as well as NAMESPACE. * Checking function lengths....... The longest function is 218 lines long The longest 5 functions are: BiocCheck() (R/BiocCheck.R, line 45): 218 lines checkFormatting() (R/checks.R, line 1044): 147 lines checkForBadDepends() (R/checks.R, line 570): 106 lines checkBiocViews() (R/checks.R, line 222): 93 lines checkBBScompatibility() (R/checks.R, line 316): 93 lines * Checking man pages... * Checking exported objects have runnable examples... * Checking package NEWS... * NOTE: Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements. * Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source... * NOTE: Consider shorter lines; 26 lines (1%) are > 80 characters long. The first 6 lines are: cCheck/R/util.R (line 127): 87 characters cCheck/R/util.R (line 218): 82 characters cCheck/R/checks.R (line 435): 82 characters cCheck/R/checks.R (line 459): 110 characters cCheck/R/checks.R (line 472): 81 characters cCheck/R/checks.R (line 644): 87 characters * NOTE: Consider 4 spaces instead of tabs; 4 lines (0%) contain tabs. The first 4 lines are: cCheck/R/checks.R (line 1103) cCheck/R/checks.R (line 1104) cCheck/R/checks.R (line 1105) cCheck/R/checks.R (line 1106) * NOTE: Consider indenting lines with a multiple of 4 spaces; 99 lines (3%) are not. The first 6 lines are: cCheck/R/util.R (line 288): 7 spaces cCheck/R/checks.R (line 1251): 13 spaces cCheck/R/checks.R (line 1252): 13 spaces cCheck/R/checks.R (line 1253): 13 spaces cCheck/vignettes/BiocCheck.Rmd (line 22): 2 spaces cCheck/vignettes/BiocCheck.Rmd (line 24): 2 spaces See http://bioconductor.org/developers/how-to/coding-style/ * Checking for canned comments in man pages... * Checking if package already exists in CRAN... * Checking for support site registration... * Maintainer email is ok. Summary: ERROR count: 0 WARNING count: 0 NOTE count: 0 For detailed information about these checks, see the BiocCheck vignette, available at http://bioconductor.org/packages/3.5/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output
_______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel