steakhal added a comment. Excellent work. Don't be afraid of the number of nits I dump on you! Good job.
================ Comment at: clang/docs/analyzer/checkers.rst:2338 +Default sources defined by `GenericTaintChecker`: +``fdopen``, ``fopen``, ``freopen``, ``getch``, ``getchar``, ``getchar_unlocked``, ``gets``, ``scanf``, ``socket``, ``wgetch`` ---------------- Same for the rest. ================ Comment at: clang/docs/analyzer/checkers.rst:2351 + + clang --analyze ... -Xclang -analyzer-config -Xclang alpha.security.taint.TaintPropagation:Config=taint_config.yaml + ---------------- Per https://reviews.llvm.org/D113004#inline-1078695 we should not advocate users use the `-Xclang` machinery, we should rather refer to it by other tools such as `scan-build`. However, we haven't reached a consensus about this decision yet. Consider moving some parts of this doc to the proposed Configuration documentation file - housing the //more// user-facing analyzer options. ================ Comment at: clang/docs/analyzer/checkers.rst:2356 + +For a more detailed description of configuration options, please see the :doc:`user-docs/TaintAnalysisConfiguration`. For an example see :ref:`taint-configuration-example`. + ---------------- Please note that this configuration file is not stable (yet). We might change it in a non-backward compatible way without any notice in the upcoming releases. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:19 +Taint analysis works by checking for the occurrence of special events during the symbolic execution of the program. +Taint analysis defines sources, sinks, and propagation rules. It identifies errors by detecting a flow of information that originates in a taint source, touches a taint sink, and propagates through the program paths via propagation rules. +A source, sink, or an event that propagates taint is mainly domain-specific knowledge, but there are some built-in defaults provided by :ref:`alpha-security-taint-TaintPropagation`. ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:41-51 + # sources: + # signature: + # size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream) + # + # example: + # FILE* f = fopen("file.txt"); + # char buf[1024]; ---------------- Since there the `SrcArgs` is //empty//, `fread` will work as an //unconditional// propagator - aka. a taint source. It's probably better to phrase it something like that. By doing so you could signify that in the next example you have a //non-empty// `SrcArgs` list, thus the propagation is //conditional//. I think it would be better to follow this pattern in the corresponding section later as well. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:49 + # size_t read = fread(buf, sizeof(buf[0]), sizeof(buf)/sizeof(buf[0]), f); + # // read and buf is tainted + - Name: fread ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:76 +In the example file above, the entries under the `Propagation` key implement the conceptual sources and propagations, and sinks have their dedicated `Sinks` key. +The user can define program points where the tainted values should be cleansed by listing entries under the `Filters` key. +Filters model the sanitization of values done by the programmer, and providing these is key to avoiding false-positive findings. ---------------- The literature probably refers to //cleansing// functions as **sanitizers**. [[ https://en.wikipedia.org/wiki/Taint_checking | Wikipedia ]] ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:89 + +Under the `Filters` entry, the user can specify a list of events that remove taint (see :ref:`taint-filter-details` for details). + ---------------- Previously it was referred to by the `key` word. Chose one for consistency. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:89 + +Under the `Filters` entry, the user can specify a list of events that remove taint (see :ref:`taint-filter-details` for details). + ---------------- steakhal wrote: > Previously it was referred to by the `key` word. Chose one for consistency. I would rather use the //operation// or //function call// instead of //event// everywhere. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:91 + +Under the `Propagations` entry, the user can specify a list of events that generate and propagate taint (see :ref:`taint-propagation-details` for details). +The user can identify taint sources with a `SrcArgs` key in the `Propagation` entry, while propagations have none. ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:92 +Under the `Propagations` entry, the user can specify a list of events that generate and propagate taint (see :ref:`taint-propagation-details` for details). +The user can identify taint sources with a `SrcArgs` key in the `Propagation` entry, while propagations have none. + ---------------- This sentence definitely needs some polishing. 1) It should emphasize that a source is an unconditional propagation - at least in the current design. 2) The user might want to //enlist// or //mark// a specific function as //taint source// but definitely not //identify// them. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:94 + +Under the `Sinks` entry, the user can specify a list of events where the checker should emit a bug report if taint reaches there (see :ref:`taint-sink-details` for details). + ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:102 + - `Name` is a string that specifies the name of a function. + Encountering this function during symbolic execution will clean taint on some parameters or the return value. + - `Args` is a list of numbers in the range of [-1..int_max]. ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:109 +The following keys are optional: + - `Scope` is a string that specifies the prefix of the function's name in its fully qualified name. This option restricts the set of matching function calls. + ---------------- ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:114 +Propagation syntax and semantics +################################ +An entry under `Propagation` is a `YAML <https://yaml.org/>`_ object with the following mandatory keys: ---------------- In the following section, you have an empty line after this marker line. Consolidate these. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:117 + - `Name` is a string that specifies the name of a function. + Encountering this function during symbolic execution propagate taint from one or more parameters to other parameters and possibly the return value. + It helps model the taint-related behavior of functions that are not analyzable otherwise. ---------------- You sometimes use `parameter` and in other cases `argument` seemingly interchangeably. You should consider consolidating these. ================ Comment at: clang/docs/analyzer/user-docs/TaintAnalysisConfiguration.rst:133 + The value of ``None`` will not consider the arguments that are part of a variadic argument list (this option is redundant but can be used to temporarily switch off handling of a particular variadic argument option without removing the entire variadic entry). + - `VariadicIndex` is a number in the range of [0..int_max]. It indicates the starting index of the variadic argument in the signature of the function. + ---------------- It's not exactly for this patch, but we should investigate If we could infer this index from the declaration of the function. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113251/new/ https://reviews.llvm.org/D113251 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits