Joel on Software published an article that I recently saw linked on PerlMonks: http://www.joelonsoftware.com/articles/Wrong.html
The article discusses writing robust software, specifically by dealing with data separation. In my interpretation the article introduces a type system. This type system helps write robust software, but has some limitations:

* Type information is checked by the programmer
* Full annotations must be supplied by the programmer
* Lack of annotation is hard to detect

The system helps you keep data that has not been massaged for a certain piece of code from touching that code. The only way to let that data reach the code is by using a filter that sanitizes it.

Joel uses 'Request("Foo")' to mean something akin to $q->param("Foo") in CGI.pm land, and 'Write' to mean something like 'print' (assuming an HTML output). His example shows how cross site scripting can arise, and how to use the type system to avoid this problem.

The type system is implemented using coding standards: you tag variable names, much like a tagged union. In his example, the union type describes data safety, and has two subtypes: safe and unsafe.

This relates very closely to tainting, but differs in one respect - it's a static analysis. Tainting does the same thing with no user annotation, at runtime, in very specific situations. Perl 6 will need support for this kind of tainting, and I raised it before, but now I would like to propose something else.

Let's look at Joel's code for a second:

    us = UsRequest("name")
    usName = us
    recordset("usName") = usName
    sName = SFromUs(recordset("usName"))
    WriteS sName

At the top, the 'us' annotations denote that Request will return an unsafe value, and that 'us' is an unsafe value. Then 'us' is assigned to 'usName' (in a far away piece of code, btw). The programmer knows that 'usName' cannot be named 'sName' because it's getting its value from a variable that is also tagged with 'us'.

Later, the value is stored in a DB. When it is extracted from the DB, we know the value is unsafe, because it is tagged as such.
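To make the convention concrete, here is a minimal sketch of the same flow with the tags moved out of the variable names and into the values. It is written in Python only so that it can be run; the names Unsafe, Safe, us_request, s_from_us and write_s are hypothetical and merely mirror Joel's prefixes, and html.escape stands in for whatever encoding the application really needs. Note that the check here happens at runtime, so it is closer to tainting than to the static analysis discussed above.

```python
import html


class Unsafe(str):
    """Raw user input; corresponds to Joel's 'us' prefix."""


class Safe(str):
    """HTML-encoded text; corresponds to Joel's 's' prefix."""


def us_request(form: dict, key: str) -> Unsafe:
    # Hypothetical stand-in for Request("name") / $q->param("name").
    return Unsafe(form[key])


def s_from_us(value: Unsafe) -> Safe:
    # The only "cast" from Unsafe to Safe: HTML-encode the text.
    return Safe(html.escape(value))


def write_s(value: Safe) -> str:
    # Refuse anything that is not explicitly tagged Safe.
    if not isinstance(value, Safe):
        raise TypeError(f"refusing to write {type(value).__name__} value")
    return str(value)


# Same steps as Joel's example; a plain dict plays the recordset.
recordset = {}
us_name = us_request({"name": "<b>Bob</b>"}, "name")
recordset["usName"] = us_name            # the tag travels with the value
s_name = s_from_us(recordset["usName"])
print(write_s(s_name))
```

Because the tag lives on the value, calling write_s(us_name) raises a TypeError no matter what the variable is called. Any string operation on a tagged value returns a plain str, which write_s also refuses, so the sketch fails closed.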
Then SFromUs is like a complex casting operator that makes something unsafe into something safe.

The naming convention is supposed to help the programmer *see* when things go wrong.

In Perl 6, ideally, this would look like this, IMHO, because type annotation sucks:

    my $str = $q.param("name");
    ...
    my $name = $str;
    $storage.store("name", $name);
    ...
    my $name = $storage.get("name");
    print encode($name);

Superficially, this code does not have the property that both Joel and I want it to have - safety - but I think this can be resolved.

Perl 6 has the notion of roles. Let's say we were to decorate the param method of the http request object, asking for a symbolic role to be attached to all the values it returns.

What we want to get out of it is that in the scope of our code (the lexical scope, the current class and its subclasses, the consumers of this module, etc etc), any retrieval of a param will tag the data as unsafe, without param even knowing about this.

Then the view is also tagged - no data may enter the Template namespace with this tag, or even more anally, for the scope in which we use Template, the only data we allow ourselves to put into it is something that is explicitly tagged as safe.

The implementation of this system is trivial with Perl 6's tools: roles and compile time type inference allow the user to build a system that gives the exact same features as Joel's system does, by wrapping interfaces.

However, what I'm more interested in is decorating existing interfaces, in a limited scope. The reason we want to limit the scope is that it is not our concern how other pieces of code use $q.param, safely or unsafely, with our definition of safety or with someone else's definition of it.

What I'd like to be able to do is declare something that applies to all code in my system (application, module, script, whatever) that does this:

    my $str = $q.param("name");
    ...
    my $name = $str;
    $storage.store("name", $name);
    ...
    my $name = $storage.get("name");
    print encode($name);

and enables me to say that

    print $name;

is disallowed, using the following rules:

* everything from $q.param is also of the type Unsafe
* everything going into $storage.store needs to get a callback triggered if it is unsafe (and more data about it will be stored in the DB)
* everything coming out of $storage.get must also trigger a callback, which will retag it as necessary
* everything going into print must be of the type Safe
* the function encode has the type Unsafe -> Safe

Using these 5 rules I can then gain control over much larger bits of code.

The only questions left unanswered are how I say which code the rules apply to, and what the syntax for these decorations is.

This tagging gets very interesting with his examples later on. Here's an excerpt from Joel's article:

> In Excel's source code you see a lot of rw and col and when you see
> those you know that they refer to rows and columns. Yep, they're both
> integers, but it never makes sense to assign between them.

There is a real benefit to be gained here, but the usability of e.g. int formatting functions should not be hindered by overzealous typing.

-- 
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /me has realultimatepower.net: neeyah!!!!!!!!!!!!