On Wed, 11 Jan 2017 14:46:29 -0800 (PST) Nako Zeta <nakotoff...@gmail.com> wrote:
> Is there any package to read http directories?
>
> For example to read an Apache index of /files/
>
> I been doing this using goquery and parsing the HTML to know if its a
> folder or a file but I am looking for better alternatives

Do you control that Apache instance?

I mean, the path "/files/" in a URL served by a server is just the
location of a resource, and it can be served in several different ways
depending on multiple conditions -- for instance, on whatever the client
sent in the Accept header of its HTTP request.

What your client receives now when requesting the resource by that URL
is an HTML document generated by Apache; such documents do not follow any
standard structure and differ between server implementations.
In other words, they are meant for humans to read, rendered in their
browsers.

What I'm leading you to is that if you control the server, you can
attach your own handler to serve that resource programmatically and then
teach that handler to understand, say, "application/json" in the client's
Accept header and generate a JSON document in response -- which is
trivial to parse programmatically.

With Apache, it may even be possible to use a combination of mod_mime
and mod_negotiation settings to have Apache serve the directory index
by itself -- unless the client explicitly asked for something more
interesting, like that index returned as JSON -- in which case your
custom handler would be called.

If you do not control the server, I'm afraid the situation is no
different from any common "web scraping" task. You might get help from
packages such as [1] for it, or by directly using [2].

1. https://godoc.org/github.com/PuerkitoBio/goquery
2. http://golang.org/x/net/html

-- 
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.