Hi all,

I have attached an update.

1. It is against the latest master.
2. It includes an originprompt.html and an originprompt-nojs.html that works properly when JavaScript is disabled.
3. The Web Storage database has been moved into the per-origin folder, even though it is probably already compliant with the same-origin policy. This just makes certain, in case that changes. The spec allows the same-origin policy to be broken here, and if cookies get blocked due to industry pressure, then I want protections in place to prevent this feature from taking the place of trojan cookies.
4. I added a randomized User-Agent if it is NULL in the config file. WebKit normally returns a default when the user-agent property is NULL or "".
5. I added an Accept-Language header that forwards the locale in $LANG, and adds some additional random locales at a lower quality to throw off naive profiling.
6. I left in the download directory config.
7. I fixed one case where the originprompt was being used even when the navigation was explicit.

I read some papers on the profiling issue, and most seem to say that lowering the diversity is the key, effectively lowering the "bandwidth" of the "signal", and they want to avoid randomizing anything. However:

1. If noise is added to this "signal", then noise reduction techniques must be used, and such techniques usually need an appropriate model or profile of the noise to discard it, and that is a fairly difficult thing to do at scale.
2. A valid concern is that semantics could suffer. But it is not difficult to add noise that is semantically valid. If a profiling method needed to rely on semantics, then the available bandwidth is limited even further. For example, the order of values may be semantically insignificant, but different orderings would be a profiling value in themselves, because they would alter a digest of the header. By randomizing the order, the semantics would need to be understood, and would provide less signal entropy. Naive digests would be useless.
3. Digests are commonly used to share device identifiers in the tracking industry, and it is trivial for the industry to apply that same code to other headers, like User-Agent. By breaking naive digest methods, the tracking industry would need to use more sophisticated methods that return less value.

Future plans:

1. I plan on doing more semantically valid randomization, like what I did to the Accept-Language header.
2. I was thinking of using dmenu instead of the HTML prompt, by using a wrapper script that launches surf or aborts. This wrapper could then isolate by merely exporting a different $HOME to surf for each origin. This would allow me to move a bunch of code out of surf.c and into a shell script. If I can get the changes to surf.c down to just a few lines, then I can package up the wrapper separately, and make changes to it without affecting the surf build.
3. This also may make it easier to support other embeddable browsers, like dillo, since the per-origin $HOME would work there. The prompt could even map different browsers to different origins. A simple origin library with a standard interface could be used by various browsers, just calling out to it whenever navigation occurs.
4. I thought about using GtkMenu when you click a link, but dmenu is surf's conventional menu, and suits surf's keyboard-driven use cases.
5. I am thinking of using the stylesheet regex technique to map URLs to origins, so that grouped origins like google subdomains can be easier to set up. Currently, I use symbolic links to map origin folders together. The main benefit of the regex approach is that the configuration can all be in one place. Symbolic links are easy to create, but can be difficult to maintain. However, if I break the code out into a separate library, I would probably adopt thttpd's glob patterns ("*" selects anything in between delimiters, while "**" selects anything across delimiters).
6. I ran into a cross-origin POST issue. I still need to figure out a good way to handle that, other than mapping the origin profiles together with a symbolic link.

As always, any input would be appreciated, and thanks again for providing such an easy browser to work with.

Thank you,
Ben

On 1/8/15, sta...@cs.tu-berlin.de <sta...@cs.tu-berlin.de> wrote:
> Hi
>
> sounds very interesting. thanks. will review, test and report when I get
> some spare time…
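
To illustrate item 5 of the update and the point about naive digests, here is a minimal Python sketch. It is not the code from the patch, which is C in surf.c. The decoy locale pool, the 0.1 to 0.5 quality range, the fallback to "en", and all function names are assumptions made for illustration. It forwards the locale from $LANG at the implicit q=1 and adds shuffled decoys at a lower quality, so a quality-aware parser should still find the real preference while a digest of the raw header changes from request to request.

```python
import hashlib
import os
import random

# Hypothetical pool of decoy locales; any valid language tags would do.
DECOY_LOCALES = ["de", "fr", "es", "it", "nl", "sv", "pt", "pl", "ja", "ko"]


def locale_to_tag(lang_env):
    """Turn a $LANG value such as 'en_US.UTF-8' into a tag such as 'en-US'."""
    base = lang_env.split(".")[0].split("@")[0]
    if base in ("", "C", "POSIX"):
        return "en"  # assumption: fall back to plain English
    return base.replace("_", "-")


def accept_language(lang_env, rng, decoys=3):
    """Real locale at implicit q=1, plus random decoys at a lower quality."""
    primary = locale_to_tag(lang_env)
    primary_lang = primary.split("-")[0].lower()
    pool = [tag for tag in DECOY_LOCALES if tag != primary_lang]
    parts = [primary]
    for tag in rng.sample(pool, decoys):
        q = rng.randint(1, 5) / 10  # 0.1 .. 0.5, always below the real locale
        parts.append(f"{tag};q={q:.1f}")
    rng.shuffle(parts)  # randomize the order so raw-header digests differ
    return ", ".join(parts)


def preferred(header):
    """What a quality-aware parser sees: the tag with the highest q."""
    best_tag, best_q = None, -1.0
    for item in header.split(","):
        tag, _, param = item.strip().partition(";q=")
        q = float(param) if param else 1.0
        if q > best_q:
            best_tag, best_q = tag, q
    return best_tag


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(2):
        header = accept_language(os.environ.get("LANG", "C"), rng)
        digest = hashlib.sha1(header.encode()).hexdigest()[:12]
        print(digest, "|", header, "|", preferred(header))
```

A server that ignores quality values and simply takes the first entry would pick a decoy whenever the shuffle puts one first, so this only stays semantically valid for parsers that honor q.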
d60efcd7608930dd055add8ea699db86686f2733.patch
Description: Binary data
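
The glob idea from future plan 5 could look roughly like the following Python sketch. It only assumes the behavior described in the message ("*" stays within one delimited part, "**" crosses delimiters), uses "." as the delimiter for hostnames, and is not thttpd's actual matcher. The table entries, profile names, and function names are hypothetical.

```python
import re


def glob_to_regex(pattern, delim="."):
    """Compile a glob: '*' matches within one part, '**' across delimiters."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")  # anything, including the delimiter
            i += 2
        elif pattern[i] == "*":
            out.append("[^" + re.escape(delim) + "]*")  # stop at the delimiter
            i += 1
        else:
            out.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(out))


# Hypothetical table kept in one place; the first matching pattern wins.
ORIGIN_MAP = [
    ("**.google.com", "google"),
    ("google.com", "google"),
    ("*.example.org", "example"),
]


def origin_profile(host, table=ORIGIN_MAP):
    """Return the profile (per-origin folder name) for a hostname."""
    for pattern, profile in table:
        if glob_to_regex(pattern).fullmatch(host):
            return profile
    return host  # unmapped hosts keep a profile of their own
```

With this table, mail.google.com and accounts.mail.google.com would share the "google" profile, while a lookalike such as evilgoogle.com falls through and gets its own.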