All,

As I've continued to develop my Perl-implemented and integratable RDBMS, a number of aspects have inspired thoughts on possible improvements to the Perl 6 language design.

For context, the query and command language of my RDBMS intentionally overlaps with Perl 6 as much as is reasonable; that is, it is a subset of Perl 6 with a very simple syntax and with domain-specific additions, so using it should be loosely like using Perl 6. Suffice it to say that the more of these "additions" that end up being provided by Perl 6 itself as options or features, the easier my job will be in making an RDBMS product that integrates easily with Perl 6.

The language has a partial profile like this:
- The type system consists of just strong types; each value and variable is of a specific type, and all type conversions are explicit.
- The type system is explicitly finite, so no Inf etc values, and all type generators take parameters which specify applicable limits (eg, 0 <= N < 256); a notable exception is that the Bool type is used as is, without parameterization, because it is already a finite domain.
- There are no Undef or NaN etc values or variables.
- All type definitions include an explicit default value, eg 0 or ''.
- A failure always manifests as a thrown exception, and an exception is the result of an operator that can't return a value within the allowed domain, eg when one divides by zero.
- All logic is 2VL not 3+VL.
- All data types are immutable.
- All operators are prefix operators, invoked on their package name, like with modules that don't export, and not as object methods.
- All operators and functions take exclusively named arguments, and argument lists are always bounded in parentheses.
- All core operators and types are pure functions, with no side-effects, except for the assignment operator, certain shorthands, and IO-like or monad functions.
- System defined storable types/type-generators include, otherwise as defined in Perl 6: Bool, Int, Num, Str, Blob.
- Additional system defined storable types include: DateTime etc, spatial types, the set based concept of a Tuple type, the set based Relation type.
- All operators that make sense in an n-ary form are declared with just one main argument which is the list of operands; this includes: '+', '*', '~', 'and', 'or', 'min', 'max', 'avg', 'union', 'intersection', (relational) 'join'; said operators can also double for use in list (eg, relation) summarization.
- System defined transient (non-storable) types include: Seq, Set, Bag; their primary purpose is to facilitate list arguments, such as those for n-ary operators that hold the operands, or to serve as a shorthand for representing a sorted query result; note that if one wants to store the same sort of thing, they define an appropriate Relation type instead.
- It is valid for all generic collection type values to consist of zero elements; so eg, a Tuple can have zero attributes; zero-ary values also happen to be the default values for their corresponding types.
- Users can define their own types and operators.
- Operators can be recursive.
- Any collection type can be composed of any other type, including collection types.
- Multiple update operations aka variable assignments can be performed in a single statement, and this statement is atomic; rvalue expressions see the same consistent system state before any assignments, and all assignments are performed after all rvalues are computed; I suppose like Perl's list assignment.
- Multi-level transactions are supported, where any statements within a transaction level are collectively atomic and can succeed or fail; any block marked as atomic, and all named routines and try-catch blocks are atomic; in the last case, a thrown exception indicates a failure of the block.
- A database is centrally a persistent-like collection of Relation variables.
- A database as a whole, and each of its parts by extension, is always perceived by users as being in a consistent state, where all of its defined constraints or business rules are satisfied; any given mutating statement will only change it from one consistent state to another, with no inconsistent state visible between statement boundaries at any level (in ACID terms, it is serializable isolation).
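
The multi-update rule above (all rvalues computed against the same state, then all assignments performed together) can be sketched roughly in Python. This is only an illustration of the semantics, not code from the RDBMS; the `multi_update` helper and the dictionary standing in for system state are hypothetical.

```python
from types import MappingProxyType

def multi_update(state, **updates):
    """Return a new state in which every update was computed from the OLD state.

    Hypothetical helper: `state` maps variable names to values, and each
    update is a function of a read-only snapshot of that old state.
    """
    snapshot = MappingProxyType(dict(state))
    # Phase 1: evaluate every rvalue against the same snapshot.
    # If any of these raises, nothing has been assigned yet.
    new_values = {name: compute(snapshot) for name, compute in updates.items()}
    # Phase 2: perform all assignments together.
    result = dict(snapshot)
    result.update(new_values)
    return result

before = {"a": 1, "b": 2}
after = multi_update(before, a=lambda s: s["b"], b=lambda s: s["a"] + s["b"])
# after == {"a": 2, "b": 3}; `before` is left untouched
```

Because the new state is only built after every rvalue has been computed, an exception thrown by any rvalue leaves the original state as it was, which is the all-or-nothing behaviour described for atomic statements.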

Note that a number of the above features in combination result in a language grammar that is extremely simple, though somewhat verbose. But then, it is largely meant to be an explicit intermediate language or AST that others can target.

Anyway, a few questions or suggestions about Perl 6 ...

1. I'm not sure if it is possible yet, but like Haskell et al (or some SQL dialects' "WITH" clause), it should be possible to write a Perl 6 routine or program in a pure functional notation or paradigm, such that the entire routine body is a single expression, but one that has named reusable sub-expressions.

For example, in pseudo-code:

  routine foo ($bar) {
    return
      with
        $bar * 17 -> $baz,
        $baz - 3 -> $quux,
        $baz / $quux;
  }

This is instead of either of:

  routine foo ($bar) {
    return ($bar * 17) / ($bar * 17 - 3);
  }

  routine foo ($bar) {
    my $baz = $bar * 17;
    my $quux = $baz - 3;
    return $baz / $quux;
  }

The first form (the one using 'with') is an expression that can be embedded in other expressions, and any redundant parts are explicitly only coded or calculated once.
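
For comparison, here is one way the same idea can be approximated in an existing language. Python has no 'with' expression clause of this kind, so the sketch below substitutes nested lambdas, where each lambda parameter names a sub-expression that is computed once. It is an analogy, not a proposal for Perl 6 syntax.

```python
def foo(bar):
    # Equivalent in spirit to:
    #   with bar * 17 -> baz, baz - 3 -> quux, baz / quux
    # The outer lambda binds baz, the inner one binds quux,
    # and the whole body is a single expression.
    return (lambda baz: (lambda quux: baz / quux)(baz - 3))(bar * 17)
```

As with the pseudo-code, `bar * 17` appears and is evaluated only once, and the whole body can be embedded in a larger expression.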

2. While it is not strictly necessary, I think it would provide a useful syntactical short-hand to add an actual immutable "Bag" type. In context of Synopsis 6, it could look like this:

    Seq         Completely evaluated (hence immutable) sequence
    Set         Unordered Seqs that allow no duplicates
    Bag         Unordered Seqs that do allow duplicates

Declaration of Bag values can be parameterized with 'of' etc the same as Set or Seq or Array etc can.

A Bag type could be implemented as a Mapping of values to occurrence counts, the latter of which are Int > 0.

Unlike a Seq, which conceptually preserves either an input order of its elements or a specific sorting of its elements, the Bag doesn't care to preserve them because the order doesn't matter.
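
To make the "Mapping of values to occurrence counts" idea concrete, here is a minimal Python sketch. The `Bag` class and its method names are hypothetical, and it is immutable only in the sense that it exposes no mutating methods; it says nothing about how Perl 6 would actually implement such a type.

```python
from collections import Counter

class Bag:
    """Multiset stored as a mapping of value -> occurrence count."""

    def __init__(self, items=()):
        # Counter only creates entries for values that were actually seen,
        # so every stored count is an integer greater than zero.
        self._counts = Counter(items)

    def count(self, value):
        # Values that are not in the bag occur zero times.
        return self._counts.get(value, 0)

    def __len__(self):
        # Total number of elements, duplicates included.
        return sum(self._counts.values())

    def __eq__(self, other):
        # Two bags are equal when they hold the same values the same
        # number of times; input order plays no part.
        return isinstance(other, Bag) and self._counts == other._counts

    def __hash__(self):
        return hash(frozenset(self._counts.items()))
```

With this, `Bag([1, 2, 2])` and `Bag([2, 1, 2])` compare equal, while `Bag([1, 2])` differs from both because the duplicate is significant.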

Within the context of the n-ary operators I mentioned earlier, each one would conceptually take its list of arguments as a Seq|Set|Bag of the values:
- Seq: '~'.
- Set: 'and', 'or', 'min', 'max', 'union', 'intersection', (relational) 'join'.
- Bag: '+', '*', 'avg'.

With string concatenation, both input duplicates and the order they appear will determine the output. With math ops like sum, product, average, the order of the input doesn't affect the output, but any duplicates do. With the other above ops, neither order nor duplicates affect the output, so dups can conceptually be filtered out first via set construction for efficiency of use.
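
The three cases can be illustrated with a small Python sketch. The function names `concat`, `total` and `largest` are made up for the example, and plain Python lists stand in for the operand collections.

```python
from collections import Counter

def concat(operands):
    # Seq semantics: both order and duplicates affect the result.
    return "".join(operands)

def total(operands):
    # Bag semantics: duplicates affect the result, order does not.
    # The operands are first reduced to value -> count pairs.
    return sum(value * count for value, count in Counter(operands).items())

def largest(operands):
    # Set semantics: neither order nor duplicates affect the result,
    # so duplicates can be dropped before the operator runs.
    return max(set(operands))
```

For example, `concat(["a", "b", "a"])` and `concat(["a", "a", "b"])` differ, `total([1, 2, 2])` equals `total([2, 1, 2])` but not `total([1, 2])`, and `largest([3, 1, 3])` equals `largest([1, 3])`.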

If nothing else, I note that a lot of code examples in the Synopsis reference a Bag type and it has the generic-enough appearance to look like a built-in.

3. I don't know if it is the case now, but there should be separate operators (which can have the same base name) for Int and Num ops, particularly the division; a division of 2 Int always returns an Int; a division involving a Num will return a Num; a division of 2 weak types that contain numbers will do the Num version even if they look like integers, so that type alone can determine behaviour, which I see as being more predictable and consistent.
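
A rough Python sketch of "type alone determines behaviour" follows. The `divide` helper is hypothetical, Python's `int` and `float` stand in for Int and Num, and strings stand in for weakly typed values. Rounding the Int result toward negative infinity is an assumption made for the example, since the rounding rule is not specified above.

```python
def divide(dividend, divisor):
    """Choose the division operator from the operand types alone."""
    if isinstance(dividend, int) and isinstance(divisor, int):
        # Int / Int -> Int. Flooring is an assumption of this sketch;
        # truncation toward zero would be another possible choice.
        return dividend // divisor
    # Anything else (a Num operand, or a weakly typed value such as a
    # string holding digits) takes the Num path and returns a Num.
    return float(dividend) / float(divisor)
```

So `divide(7, 2)` gives the integer 3, `divide(7.0, 2)` gives 3.5, and `divide("6", "2")` gives 3.0 even though both operands look like integers.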

4. There should be floor() and ceil() functions that take a Num as input and return an Int; likewise with round() etc. FYI, this is the method I use for explicit Num->Int type conversion; users can specify how the conversion is done by which function they explicitly use to do it.
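
As a point of comparison, Python's standard library already works this way: `math.floor`, `math.ceil`, `math.trunc` and one-argument `round` each take a float and return an int, so the caller picks the conversion rule by picking the function. The variable names below are only for illustration.

```python
import math

value = -2.5

floored = math.floor(value)    # rounds toward negative infinity
ceiled = math.ceil(value)      # rounds toward positive infinity
truncated = math.trunc(value)  # rounds toward zero
rounded = round(value)         # Python 3 rounds ties to the nearest even integer
```

All four results are of type `int`, and for a negative input they show how much the choice of function matters.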

5. It would help simplify my implementation tasks if all the built-in Perl 6 types had multis for their operators such that the operators could all be invoked exclusively with named arguments, even if there is just 1 argument. Though if you don't want to do this, then it's not a big deal, and I'll just subclass them with wrappers that do provide such.
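
The wrapper fallback can be sketched in Python as follows. The `named_only` helper and the parameter names `operand`, `minuend` and `subtrahend` are hypothetical choices for the example; only `operator.neg` and `operator.sub` are real standard-library functions.

```python
import operator

def named_only(op, *param_names):
    """Wrap a positional operator so it accepts only the given named arguments."""
    def wrapper(**kwargs):
        # Positional calls are rejected by Python itself, since the wrapper
        # declares no positional parameters; here we check the names.
        if set(kwargs) != set(param_names):
            raise TypeError(f"expected exactly the named arguments {param_names}")
        # Pass the values on in the order the underlying operator expects.
        return op(*(kwargs[name] for name in param_names))
    return wrapper

negate = named_only(operator.neg, "operand")
subtract = named_only(operator.sub, "minuend", "subtrahend")
```

With these wrappers, `negate(operand=5)` and `subtract(subtrahend=3, minuend=10)` work regardless of argument order, while `subtract(10, 3)` raises a TypeError.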

Thank you in advance for any consideration or feedback.

-- Darren Duncan
