If you guys go down this road, I would suggest using the multiprocessing module (https://docs.python.org/2/library/multiprocessing.html) rather than Python threads, if threads are what is being proposed.
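
As a rough sketch of what I mean, assuming a per-file entry point (analyze_file, resolve_jobs and run_checks below are made-up names, not Bandit's API, and the line count is just a placeholder for real checks), something like this keeps to the standard library, stays serial by default, and sorts results at the end so the output is repeatable:

```python
import multiprocessing


def analyze_file(path):
    # Hypothetical stand-in for running the checks against one file;
    # this is not Bandit's actual API. It only counts lines.
    with open(path) as handle:
        line_count = sum(1 for _ in handle)
    return path, line_count


def resolve_jobs(value):
    # 'auto' (the special value floated in the thread) maps to the CPU count.
    if value == 'auto':
        return multiprocessing.cpu_count()
    return max(1, int(value))


def run_checks(paths, jobs='1'):
    workers = resolve_jobs(jobs)
    if workers == 1:
        # Default: plain serial run, no extra processes.
        results = [analyze_file(path) for path in paths]
    else:
        pool = multiprocessing.Pool(processes=workers)
        try:
            # Each file is handed to exactly one worker process.
            results = pool.map(analyze_file, paths)
        finally:
            pool.close()
            pool.join()
    # Sort once everything is collected so the output does not depend on
    # the order in which files were discovered.
    return sorted(results)
```

Calling run_checks(paths, jobs='auto') would use one worker per CPU. On platforms that spawn rather than fork worker processes, that call needs to sit under an if __name__ == '__main__': guard.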

On 6/8/15, 10:17 AM, "Finnigan, Jamie" <jamie.finni...@hp.com> wrote:

>On 6/8/15, 8:26 AM, "Ian Cordasco" <ian.corda...@rackspace.com> wrote:
>
>>Hey everyone,
>>
>>I drew up a blueprint
>>(https://blueprints.launchpad.net/bandit/+spec/use-threading-when-running-checks)
>>to add the ability to use multiprocessing (or threading) to Bandit.
>>This essentially means that each "thread" will be fed a file and analyze
>>it and return the results. (A file will only ever be analyzed by one
>>thread.)
>>
>>This has led to significant speed improvements in Flake8 when running
>>against a project like Nova and I think the same improvements could be
>>made to Bandit.
>
>We skipped parallel processing earlier in Bandit development to keep
>things simple, but if we can speed up execution against the bigger code
>bases with minimal additional complexity (still needs to be 'easy' to add
>new checks) then that would be a nice win.
>
>I don't think we'd lose anything by processing in parallel vs. serial. If
>we do ever add additional state tracking more broadly than per-file,
>checks would need to be run at the end of execution anyway to take
>advantage of the full state.
>
>>
>>I'd love some feedback on the following points:
>>
>>1. Should this be on by default?
>>
>>   Technically, this is backwards incompatible (unless we decide to order
>>the output before printing results) but since we're still in the 0.x
>>release series of Bandit, SemVer allows backwards incompatible releases.
>>I don't know if we want to take advantage of that or not though.
>
>It looks like flake8 default behavior is off (1 "thread"), which makes
>sense to me...
>
>>
>>2. Is output ordering significant/important to people?
>
>Output ordering is important - output for repeated runs against an
>unchanged code base should be predictable/repeatable. We also want to
>continue to support aggregating output by filename or by issue. Under this
>model though, we'd just collect results then order/output at completion
>rather than during execution.
>
>>
>>3. If this is off by default, should the flag accept a special value,
>>e.g., 'auto', to tell Bandit to always use the number of CPUs present on
>>the machine?
>
>That seems like a tidy way to do things.
>
>
>Overall, this feels like a nice evolutionary step. Not opposed (in fact,
>I'd support it), but would want to make sure it doesn't overly complicate
>things.
>
>What library/ies would you suggest using? I still like the idea of
>keeping as few external dependencies as possible.
>
>
>Jamie
>
>
>__________________________________________________________________________
>OpenStack Development Mailing List (not for usage questions)
>Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev