On July 22nd, Nadeem Abdul Hamid wrote:
> Apparently the coverage thing only works if the evaluator is given a
> (byte) string (as opposed to an input port or anything else), so using
> port->bytes when reading from a file produces the expected result:
>
> (define Ev
>   (call-with-input-file* "test.rkt"
>     (λ(inp)
>       (parameterize ([sandbox-coverage-enabled #t])
>         (make-evaluator 'racket
>                         (port->bytes inp)
>                         )))))
> (get-uncovered-expressions Ev)

Apologies for the late reply -- I originally thought that there was a
bug there, and now that I got to it I see that there isn't a proper bug
-- but there's a change that will make things more convenient and less
confusing.  Please read on if you're using the sandbox, to make sure
that this won't break your code (it should make things easier).

So here's the short version of the issue around coverage information
from the sandbox.  `get-uncovered-expressions' filters the result that
it gives you according to the syntax source information.  The syntax
source (the result of `syntax-source') is an arbitrary value that
indicates where the syntax came from -- it's usually a string or a path
for the source code.  In the sandbox case, it uses `read-syntax' over
the given path -- if given a path -- so that produces a similar syntax
source.  But if you give it a quoted sexpr, or a string, or a byte
string, then there is no way for it to know the source, and it makes
one up.  The one that it makes up is 'program.

Now, getting to `get-uncovered-expressions' -- it has three arguments:
the evaluator; a boolean flag indicating whether you want the
expressions that were uncovered after the module was first evaluated,
or including any following interactions; and some value which is used
to filter the results -- it will leave in only syntaxes with that
source.

This last argument is the problematic one in this case.  The default
value for it is 'program -- which makes it work well *if* you
originally used some source-less data for the code (a byte string, in
your case).  But if you give it a path, then that path will get used as
the source, and if you filter out expressions that don't have 'program
as their source you're left with ... nothing (which is often
surprising).

So the change that I'm thinking of is this:

* Each sandbox will have its own default source to filter on, which
  will be used as the default third argument to
  `get-uncovered-expressions'.

* When a sandbox is constructed, it will *try* to decide what this
  default value should be.  If it's given a string or a sexpr, it will
  use 'program; and if it's given a path, it will use that path.

* This will work fine for strings and for paths, and will *usually*
  work for sexprs too.  The problem with sexprs is that they might
  contain some syntax values in them, and those will keep their own
  source instead of getting 'program -- and as a result they will be
  filtered out.  This is probably a much less common case, so the
  question is how confusing it can be if people do run into it.  I
  think that the overall benefit outweighs this, and that rare case is
  already confusing in the same way.

There are two other ways to improve this (but note that so far the
above seems to me like the best):

A. A much simpler change -- just use #f as the default value.  This
   means that you get the unfiltered results (by default).  The problem
   with that is expressions from other files that get in via macros.
   These expressions get annotated too, but they're almost always
   useless for you, since you usually don't care about how a macro was
   implemented.  I dislike this since I think it will lead to more
   confusion, when you keep getting uncovered expressions from random
   racket libraries.  (E.g., `match' can be very amusing, since it
   creates a *lot* of hairy expressions, and you'll see all of that in
   the output.)

B. Another option is to scan the input syntax (that is, after the
   string is read, or the sexpr is converted to syntax), and filter out
   anything that is not in that syntax.  This is very precise in the
   sense that you never get expressions that were not in the original
   code -- but the problem is that it's too conservative.
   Specifically, some (foo) macro can expand to some different (bar)
   expression with the same source, and whether the (bar) gets executed
   or not should usually mark the "(foo)" in the source as touched or
   not.  This makes the results very broken, as you'll see below.

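
To make the filtering argument concrete, here is a minimal sketch of
the two kinds of call discussed above: the default (filtered) one, and
the unfiltered one that option A would make the default.  It assumes a
stock `racket/sandbox'; the program string and the helper
`uncovered-count-by-source' are made up for illustration and are not
part of the library.

```racket
#lang racket/base
(require racket/sandbox)

;; A small module given as a string, so the sandbox has no real source
;; for it (the source-less case described above).
(define program
  (string-append "#lang racket/base\n"
                 "(define (f x) (if x 'yes 'no))\n"
                 "(f #t)\n"))

;; Coverage has to be enabled while the evaluator is being created.
(define ev
  (parameterize ([sandbox-coverage-enabled #t])
    (make-module-evaluator program)))

;; Default call: the results are filtered by the default source.
(define filtered (get-uncovered-expressions ev))

;; Passing #f as the third argument asks for the unfiltered list.
(define unfiltered (get-uncovered-expressions ev #t #f))

;; Hypothetical helper (not part of racket/sandbox): count the
;; unfiltered uncovered expressions per `syntax-source' value, to see
;; which value a filter would need to match.
(define (uncovered-count-by-source ev)
  (for/fold ([counts (hash)])
            ([stx (in-list (get-uncovered-expressions ev #t #f))])
    (hash-update counts (syntax-source stx) add1 0)))

(uncovered-count-by-source ev)
```

When the default call unexpectedly comes back empty, looking at the
keys of that table shows which source value would have to be passed as
the third argument.
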
*** Here's a concrete example:

----< some-library.rkt >----
#lang racket/base
(provide foo)
(define-syntax-rule (foo) (bar))
(define (bar) (+ 1 2))

----< sandboxed-code.rkt >-----
#lang racket
(require "some-library.rkt")
(and #f (foo))

Running the second file through the sandbox, and getting the unfiltered
list of uncovered expressions shows:

  '(#<syntax:/tmp/sandboxed-code.rkt:3:8 (#%app bar)>
    #<syntax:/tmp/some-library.rkt:3:27 bar>
    #<syntax:/tmp/some-library.rkt:4:14 (#%app + (quote 1) (quote 2))>
    #<syntax:/tmp/some-library.rkt:4:15 +>
    #<syntax:/tmp/some-library.rkt:4:17 (quote 1)>
    #<syntax:/tmp/some-library.rkt:4:19 (quote 2)>)

You can see that unfiltered results are not great: they contain pieces
from the macro, which the sandboxed code should not care about.  You
can also see that the (foo) expression is not in there -- the
expressions that are collected are from the expanded code, so (foo)
disappeared, and doing the precise filtering will leave nothing
visible.  (And it should be clear now why this would be completely
broken: every function application in racket is a macro, which means
that all applications in the original code are never seen as
uncovered.)

--
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!