Hey everyone,

Here's my issue; I'll try to keep it short and concise.
I have written a program that accepts a URL, spiders that URL's domain and scheme (http/https), and prints every input field it finds to the console. The purpose is largely web application security testing: input fields are the most common vulnerability entry points (sinks), and this program automates that part of the reconnaissance phase.

Here is the problematic code: https://github.com/insp3ctre/input-field-finder/blob/ce7983bd336ad59b2e2b613868e49dfb44110d09/main.go

The issue lies in the last for loop in main(). If you run the program, that loop polls the queue and the workers so frequently that it is bound to catch a moment where there are both no workers working and no URLs in the queue (as shown by the console output statements just before it exits). Network latency only exacerbates the problem. The number of URLs actually checked varies from run to run, which is a serious inconsistency and makes the program unreliable.

The issue was fixed here: https://github.com/insp3ctre/input-field-finder/blob/f0032bb550ced0b323e63be9c4f40d644257abcd/main.go

I fixed it by removing all concurrency from the network requests, leaving it only in the internal HTML-processing functions.

So, the question is: how does one run efficient concurrent code when the amount of work, and therefore the wait group count, is dynamic and unknown at program initialization?

I have tried:

- Using "worker pools", which consist of channels of workers. The for loop checks the length of the URL queue and the number of available workers; if the queue is empty and all the workers are available, it exits the loop. (A stripped-down sketch of this attempt is appended at the end of this message.)
- Dynamically incrementing a wait group (wg.Add(1)) every time I pull a URL from the URL queue. *I can't set the wait group count before the loop, because I can never know how many URLs are going to be checked.*

In both approaches I used the channel or wait group, alongside the URL queue length, to decide whether more concurrent network requests were needed. And in both cases the for loop polls those values so fast that it eventually stumbles on a momentary state that satisfies the exit condition, and the loop ends. The result is usually one of two failure modes: the program hangs, waiting on wait groups that will never finish, or it exits prematurely because more URLs are added to the queue after the check.

I would really like to know if there is a way to actually do this well in Go.

Cheers,
Inspectre
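
P.S. For concreteness, here is a stripped-down, self-contained sketch of the first attempt. The names and the fake process function are illustrative stand-ins, not the actual code from the repo; the network fetch is simulated with a sleep so the race is easy to reproduce.

    package main

    import (
        "fmt"
        "sync/atomic"
        "time"
    )

    func main() {
        queue := make(chan string, 1024) // URLs waiting to be fetched
        var busy int32                   // workers currently mid-request

        // Stand-in for the real fetch+parse: pretend every page links
        // to two more pages, with fake network latency, capped at a
        // fixed depth so the "crawl" is finite.
        process := func(url string) []string {
            time.Sleep(10 * time.Millisecond)
            if len(url) >= 8 {
                return nil
            }
            return []string{url + "a", url + "b"}
        }

        queue <- "seed"

        // Fixed pool of workers pulling from the queue. (They leak
        // when main returns; that's fine for a demo.)
        for i := 0; i < 4; i++ {
            go func() {
                for url := range queue {
                    atomic.AddInt32(&busy, 1)
                    for _, next := range process(url) {
                        queue <- next
                    }
                    atomic.AddInt32(&busy, -1)
                }
            }()
        }

        // The problematic exit check. Between a worker receiving a URL
        // and incrementing busy, the queue can be empty AND busy can be
        // zero, so this loop can exit while work is still in flight.
        for {
            if len(queue) == 0 && atomic.LoadInt32(&busy) == 0 {
                break
            }
        }
        fmt.Println("done (sometimes before the crawl actually finished)")
    }

Run it a few times: in the window between a worker receiving a URL and marking itself busy, the queue is empty and the busy count is zero, so the exit check can fire while work is still in flight. As far as I can tell, the wg.Add(1)-on-dequeue variant dies the same way, because the Add can happen after Wait has already observed a zero counter.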