On Wed, Jun 29, 2016 at 9:52 AM, Inspectre Gadget <inspec...@inspect.re> wrote:
> Hey everyone,
>
> Here’s my issue, I will try to keep this short and concise:
>
> I have written a program that will accept a URL, spider that URL’s domain
> and scheme (http/https), and return back all input fields found throughout
> to the console. The purpose is largely for web application security testing,
> as input fields are the most common vulnerability entry points (sinks), and
> this program automates that part of the reconnaissance phase.
>
> Here is the problematic code:
> https://github.com/insp3ctre/input-field-finder/blob/ce7983bd336ad59b2e2b613868e49dfb44110d09/main.go
>
> The issue lies in the last for loop in the main() function. If you were to
> run this program, it would check the queue and workers so frequently that it
> is bound to find a point where there are both no workers working, and no
> URLs in the queue (as proved by the console output statements before it
> exits). Never mind that the problem is exacerbated by network latency. The
> number of URLs actually checked varies on every run, which causes some
> serious inconsistencies, and prevents the program from being at all
> reliable.
>
> The issue was fixed here:
> https://github.com/insp3ctre/input-field-finder/blob/f0032bb550ced0b323e63be9c4f40d644257abcd/main.go
>
> I fixed it by removing all concurrency from network requests, leaving it
> only in the internal HTML processing functions.
>
> So, the question is: how does one run efficient concurrent code when the
> number of wait groups is dynamic, and unknown at program initialization?
>
> I have tried:
>
> - Using “worker pools”, which consist of channels of workers. The for loop
>   checks the length of the URL queue and the number of workers available. If
>   the URL queue is empty and all the workers are available, then it exits the
>   loop.
> - Dynamically adding wait groups (wg.Add(1)) every time I pull a URL from the
>   URL queue. I can’t set the wait group numbers before the loop, because I can
>   never know how many URLs are going to be checked.
>
> So I have tried using both channels and wait groups to check alongside the
> URL queue length to determine whether more concurrent network requests are
> needed. In both cases, the for loop checks the values so fast that it
> eventually stumbles upon a non-satisfied loop condition, and exits. This
> usually results in either the program hanging as it waits for wait groups to
> exit that never do, or it simply exits prematurely, as more URLs are added
> to the queue after the fact.
>
> I would really like to know if there is a way to actually do this well in
> Go.

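The first thing I noticed in your code is

    for len(urlQueue) > 0 {
        ... <- urlQueue ...

Never do that.  It is racy: the channel's length can change between the
len check and the receive, and the loop exits as soon as the queue is
momentarily empty, even if a worker is about to send more URLs.  Instead,
do

    for url := range urlQueue {
    }

A range loop over a channel blocks until a value arrives and ends only
once the channel has been closed.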

Ian

