named sub-expressions, n-ary functions, things and stuff

Darren Duncan Mon, 13 Nov 2006 02:05:06 -0800

All,

As I've continued to develop my Perl-implemented and integratable RDBMS, a number of aspects have inspired thoughts about possible improvements to the Perl 6 language design.

For context, the query and command language of my RDBMS intentionally overlaps with Perl 6 as much as is reasonable; that is, it is a subset of Perl 6 with a very simple syntax and with domain-specific additions, so using it should be loosely like using Perl 6. Suffice it to say that the more of these "additions" that end up being provided by Perl 6 itself as options or features, the easier my job will be in making an RDBMS product that integrates easily with Perl 6.


The language has a partial profile like this:

- The type system consists of just strong types; each value and variable is of a specific type, and all type conversions are explicit.
- The type system is explicitly finite, so no Inf etc values, and all type generators take parameters which specify applicable limits (eg, 0 <= N < 256); a notable exception is that the Bool type is used as is, without parameterization, because it is already a finite domain.
- There are no Undef or NaN etc values or variables.
- All type definitions include an explicit default value, eg 0 or ''.
- A failure always manifests as a thrown exception, and an exception is the result of an operator that can't return a value within the allowed domain, eg when one divides by zero.
- All logic is 2VL not 3+VL.
- All data types are immutable.
- All operators are prefix operators, invoked on their package name, like with modules that don't export, and not as object methods.
- All operators and functions take exclusively named arguments, and argument lists are always bounded in parentheses.
- All core operators and types are pure functions, with no side-effects, except for the assignment operator, certain shorthands, and IO-like or monad functions.
- System defined storable types/type-generators include, otherwise as defined in Perl 6: Bool, Int, Num, Str, Blob.
- Additional system defined storable types include: DateTime etc, spatial types, the set based concept of a Tuple type, the set based Relation type.
- All operators that make sense in an n-ary form are declared with just one main argument which is the list of operands; this includes: '+', '*', '~', 'and', 'or', 'min', 'max', 'avg', 'union', 'intersection', (relational) 'join'; said operators can also double for use in list (eg, relation) summarization.
- System defined transient (non-storable) types include: Seq, Set, Bag; their primary purpose is to facilitate list arguments, such as for n-ary operators that hold the operands, or to serve as a short hand for representing a sorted query result; note that if one wants to store the same sort of thing, they define an appropriate Relation type instead.
- It is valid for all generic collection type values to consist of zero elements; so eg, a Tuple can have zero attributes; zero-ary values also happen to be the default values for their corresponding types.
- Users can define their own types and operators.
- Operators can be recursive.
- Any collection type can be composed of any other type, including collection types.
- Multiple update operations aka variable assignments can be performed in a single statement, and this statement is atomic; rvalue expressions see the same consistent system state before any assignments, and all assignments are performed after all rvalues are computed; I suppose like Perl's list assignment.
- Multi-level transactions are supported, where any statements within a transaction level are collectively atomic and can succeed or fail; any block marked as atomic, and all named routines and try-catch blocks are atomic; in the last case, a thrown exception indicates a failure of the block.
- A database is centrally a persistent-like collection of Relation variables.
- A database as a whole, and each of its parts by extension, is always perceived by users as being in a consistent state, where all of its defined constraints or business rules are satisfied; any given mutating statement will only change it from one consistent state to another, with no inconsistent state visible between statement boundaries at any level (in ACID terms, it is serializable isolation).

Note that a number of the above features in combination result in a language grammar that is extremely simple, though somewhat verbose. But then, it is largely meant to be an explicit intermediate language or AST that others can target.
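
To make the multi-update item in the profile above concrete, here is a minimal sketch in Python (used purely for illustration; it is not the RDBMS's language or implementation). The name multi_update and the dict-based "state" are hypothetical. The point it shows is that every rvalue is computed against the old state, and a new state appears only if all of them succeed.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def multi_update(state: State, updates: Dict[str, Callable[[State], Any]]) -> State:
    """Apply several assignments as one atomic statement (hypothetical helper).

    Every rvalue expression is evaluated against the original state. Only
    after all of them succeed is a new state built. If any expression raises,
    no new state is produced and the original is left untouched.
    """
    new_values = {name: expr(state) for name, expr in updates.items()}
    return {**state, **new_values}


before = {"x": 1, "y": 2}
# Swap x and y in one statement; neither rvalue sees the other's assignment.
after = multi_update(before, {"x": lambda s: s["y"], "y": lambda s: s["x"]})
```

Because the old state is never mutated, a thrown exception in any rvalue leaves the caller holding the previous consistent state, which is the behaviour the profile describes for atomic statements.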


Anyway, a few questions or suggestions about Perl 6 ...

1. I'm not sure if it is possible yet, but like Haskell et al (or the "WITH" clause of some SQL dialects), it should be possible to write a Perl 6 routine or program in a pure functional notation or paradigm, such that the entire routine body is a single expression, but one that has named reusable sub-expressions.


For example, in pseudo-code:

  routine foo ($bar) {
    return
      with
        $bar * 17 -> $baz,
        $baz - 3 -> $quux,
        $baz / $quux;
  }

This is instead of either of:

  routine foo ($bar) {
    return ($bar * 17) / ($bar * 17 - 3);
  }

  routine foo ($bar) {
    my $baz = $bar * 17;
    my $quux = $baz - 3;
    return $baz / $quux;
  }

The first form (using 'with') is an expression that can be embedded in other expressions, and any redundant parts are explicitly coded and calculated only once.
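
For comparison, the same idea can be approximated today in languages that lack such a construct by desugaring each named sub-expression into a function parameter. The following is a rough Python sketch of that well-known trick, not a proposal for Perl 6 syntax; foo, baz and quux simply mirror the pseudo-code above.

```python
def foo(bar):
    # The whole body is one expression. Each lambda parameter plays the role
    # of a named sub-expression: baz is computed once and then used twice.
    return (lambda baz:
            (lambda quux: baz / quux)(baz - 3)
            )(bar * 17)
```

It works, but the nesting reads inside-out, which is part of why a dedicated 'with'-style notation seems worth having.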

2. While it is not strictly necessary, I think it would provide a useful syntactical short-hand to add an actual immutable "Bag" type. In context of Synopsis 6, it could look like this:


    Seq         Completely evaluated (hence immutable) sequence
    Set         Unordered Seqs that allow no duplicates
    Bag         Unordered Seqs that do allow duplicates

Declaration of Bag values can be parameterized with 'of' etc the same as Set or Seq or Array etc can.

A Bag type could be implemented as a Mapping of values to occurrence counts, the latter of which are Int > 0.

Unlike a Seq, which conceptually preserves either an input order of its elements or a specific sorting of its elements, a Bag preserves neither, because the order doesn't matter.
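
A minimal sketch of the "Mapping of values to occurrence counts" implementation, written in Python for illustration only. The class and its method names are hypothetical, and immutability here is by convention (no mutating methods are offered) rather than enforced.

```python
from collections import Counter


class Bag:
    """Unordered collection that keeps duplicates (illustrative sketch)."""

    def __init__(self, items=()):
        # Counter maps each value to its occurrence count. Only values that
        # actually occur get an entry, so every stored count is > 0.
        self._counts = Counter(items)

    def count(self, value):
        # Number of occurrences of value; 0 if it is not in the bag.
        return self._counts.get(value, 0)

    def __len__(self):
        # Total number of elements, duplicates included.
        return sum(self._counts.values())

    def __eq__(self, other):
        # Two bags are equal when they hold the same values the same number
        # of times; input order plays no part.
        return isinstance(other, Bag) and self._counts == other._counts

    def __hash__(self):
        return hash(frozenset(self._counts.items()))
```

With this, Bag([1, 2, 2]) and Bag([2, 1, 2]) compare equal, while Bag([1, 2]) differs from both because the duplicate is significant.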

Within the context of the n-ary operators I mentioned earlier, each one would conceptually take its list of arguments as a Seq|Set|Bag of the values:

- Seq: '~'.
- Set: 'and', 'or', 'min', 'max', 'union', 'intersection', (relational) 'join'.
- Bag: '+', '*', 'avg'.

With string concatenation, both input duplicates and the order they appear will determine the output. With math ops like sum, product, average, the order of the input doesn't affect the output, but any duplicates do. With the other above ops, neither order nor duplicates affect the output, so dups can conceptually be filtered out first via set construction for efficiency of use.
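
The distinction can be sketched as follows, again in Python for illustration. The function names are hypothetical; a tuple stands in for Seq, a frozenset for Set, and collections.Counter for Bag.

```python
import math
from collections import Counter


def concat(operands: tuple) -> str:
    # Seq operand list: both order and duplicates affect the result.
    return "".join(operands)


def total(operands: Counter):
    # Bag operand list: order is irrelevant, but each duplicate counts.
    return sum(value * count for value, count in operands.items())


def product(operands: Counter):
    # Bag operand list: a value occurring n times contributes value ** n.
    return math.prod(value ** count for value, count in operands.items())


def maximum(operands: frozenset):
    # Set operand list: duplicates were already discarded on construction.
    return max(operands)
```

Note that the zero-operand case falls out naturally for some of these (an empty sum is 0, an empty product is 1, an empty concatenation is ''), whereas max over an empty set has no obvious answer and raises ValueError in this sketch.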

If nothing else, I note that a lot of code examples in the Synopses reference a Bag type and it has a generic-enough appearance to look like a built-in.

3. I don't know if it is the case now, but there should be separate operators (which can have the same base name) for Int and Num ops, particularly division: a division of 2 Ints always returns an Int; a division involving a Num will return a Num; a division of 2 weak types that contain numbers will do the Num version even if they look like integers, so that type alone can determine behaviour, which I see as being more predictable and consistent.
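
A rough Python sketch of type-directed division along these lines. Two things are assumptions of the sketch, not something stated above: the Int version rounds toward negative infinity (Python's '//'), and strings are used as a stand-in for "weak types that contain numbers".

```python
def divide(dividend, divisor):
    # Int / Int: stays within Int. Rounding toward negative infinity is an
    # assumption of this sketch; the rounding rule is not specified above.
    if type(dividend) is int and type(divisor) is int:
        return dividend // divisor
    # Anything else, including weakly typed values such as the string "6",
    # takes the Num path, even when the values look like integers.
    return float(dividend) / float(divisor)
```

So divide(7, 2) stays an Int, divide(7.0, 2) yields a Num, and divide("6", "3") also yields a Num because the operand types, not the values, select the operator.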

4. There should be floor() and ceil() functions that take a Num as input and return an Int; likewise with round() etc. FYI, this is the method I use for explicit Num->Int type conversion; users can specify how the conversion is done by which function they explicitly use to do it.
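
As an illustration of choosing the conversion by choosing the function, this is how the same idea looks in Python, whose math.floor, math.ceil, math.trunc and built-in round all return an integer for a float input. It is shown only as an analogy, not as a claim about what Perl 6 provides.

```python
import math

x = 2.5
conversions = {
    "floor": math.floor(x),    # 2: toward negative infinity
    "ceil": math.ceil(x),      # 3: toward positive infinity
    "round": round(x),         # 2: Python rounds halves to the nearest even
    "trunc": math.trunc(-x),   # -2: toward zero
}
```

The round() case shows why the explicit choice matters: the tie-breaking rule differs between languages, so an implicit Num->Int conversion would hide a real decision.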

5. It would help simplify my implementation tasks if all the built-in Perl 6 types had multis for their operators such that the operators could all be invoked exclusively with named arguments, even if there is just 1 argument. Though if you don't want to do this, then it's not a big deal, and I'll just subclass them with wrappers that do provide such.
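
The wrapper fallback might look roughly like this, sketched in Python rather than Perl 6. The helper named_only and the argument names (minuend, subtrahend, operand) are hypothetical, chosen only for the example.

```python
import operator


def named_only(func, *names):
    """Wrap a positional function so it accepts exactly the given named arguments."""
    def wrapper(**kwargs):
        # Positional calls are rejected by the signature itself; here we
        # additionally insist on exactly the declared names.
        if set(kwargs) != set(names):
            raise TypeError(f"expected exactly these named arguments: {names}")
        return func(*(kwargs[name] for name in names))
    return wrapper


subtract = named_only(operator.sub, "minuend", "subtrahend")
negate = named_only(operator.neg, "operand")
```

Callers then write subtract(minuend=5, subtrahend=3) or negate(operand=4), and the order in which the named arguments are given no longer matters.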


Thank you in advance for any consideration or feedback.

-- Darren Duncan
