On Wed, Jun 29, 2016 at 9:52 AM, Inspectre Gadget <inspec...@inspect.re> wrote:
> Hey everyone,
>
> Here’s my issue, I will try to keep this short and concise:
>
> I have written a program that will accept a URL, spider that URL’s domain
> and scheme (http/https), and return back all input fields found throughout
> to the console. The purpose is largely for web application security testing,
> as input fields are the most common vulnerability entry points (sinks), and
> this program automates that part of the reconnaissance phase.
>
> Here is the problematic code:
> https://github.com/insp3ctre/input-field-finder/blob/ce7983bd336ad59b2e2b613868e49dfb44110d09/main.go
>
> The issue lies in the last for loop in the main() function. If you were to
> run this program, it would check the queue and workers so frequently that it
> is bound to find a point where there are both no workers working, and no
> URLs in the queue (as proved by the console output statements before it
> exits). Nevermind that the problem is exacerbated by network latency. The
> number of URLs actually checked varies on every run, which causes some
> serious inconsistencies, and prevents the program from being at all
> reliable.
>
> The issue was fixed here:
> https://github.com/insp3ctre/input-field-finder/blob/f0032bb550ced0b323e63be9c4f40d644257abcd/main.go
>
> I fixed it by removing all concurrency from network requests, leaving it
> only in the internal HTML processing functions.
>
> So, the question is- how does one run efficient concurrent code when the
> number of wait groups is dynamic, and unknown at program initialization?
>
> I have tried:
>
> - Using “worker pools”, which consist of channels of workers. The for loop
>   checks the length of the URL queue and the number of workers available. If
>   the URL queue is empty and all the workers are available, then it exits the
>   loop.
> - Dynamically adding wait groups (wg.Add(1)) every time I pull a URL from the
>   URL queue. I can’t set the wait group numbers before the loop, because I can
>   never know how many URLs are going to be checked.
>
> So I have tried using both channels and wait groups to check alongside the
> URL queue length to determine whether more concurrent network requests are
> needed. In both cases, the for loop checks the values so fast that it
> eventually stumbles upon a non-satisfied loop condition, and exits. This
> usually results in either the program hanging as it waits for wait groups to
> exit that never do, or it simply exits prematurely, as more URLs are added
> to the queue after the fact.
>
> I would really like to know if there is a way to actually do this well in
> Go.
First thing I noticed in your code is

    for len(urlQueue) > 0 { ... <- urlQueue ...

Never do that. That is racy. Instead, do

    for url := range urlQueue { }

Ian