The problem with generating a permutation vector for an "arbitrary" Unicode string is still a problem of collating order. There is no inherent order in Unicode; someone has to decide what makes sense as a collating order for the subset of code points used by the application.
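As an illustration only, here is a minimal Python sketch (not APL) of sorting under an explicitly chosen collating order. The names `collation_key` and `grade_up` and the sample orders are hypothetical, made up for this example. It assumes that characters missing from the chosen order rank after all listed ones, and it returns 0-based indices where APL's grade would follow ⎕io.

```python
def collation_key(text, order):
    """Map each character to its position in `order`.

    Assumption of this sketch: characters not listed in `order`
    all share one rank placed after every listed character.
    """
    rank = {ch: i for i, ch in enumerate(order)}
    last = len(order)
    return [rank.get(ch, last) for ch in text]


def grade_up(strings, order):
    """Return the 0-based permutation that sorts `strings` under `order`."""
    return sorted(range(len(strings)),
                  key=lambda i: collation_key(strings[i], order))


# Hypothetical default order: code points 0..255 by ordinal value.
ascii_order = [chr(cp) for cp in range(256)]

words = ['aa', 'xx', 'aaa', 'xxx']
perm = grade_up(words, ascii_order)
sorted_words = [words[i] for i in perm]
print(perm)          # [0, 2, 1, 3]
print(sorted_words)  # ['aa', 'aaa', 'xx', 'xxx']
```

Passing a different `order` changes the result without touching the sorting code; for instance `grade_up(words, 'xa')` puts the `x` strings first. The point is that the order is a decision the application makes, not something Unicode supplies.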
You should use ⎕ucs with a vector of code points to define your own collating order for Unicode, and pass the result as the left argument of dyadic grade; any code points not explicitly specified in the collating order will sort to the end. For example (and this is an easy case) you can use this to specify a default collating order (based upon ordinal value of the code points themselves) for the first 256 code points (ASCII plus Latin-1):

      ⎕ucs ⎕io-⍨⍳256

On Tue, 2014-07-08 at 12:09 +0800, Elias Mårtenson wrote:
> Dyadic grade doesn't make much sense in the context of Unicode though.
> How do you grade an arbitrary Unicode string?
>
> That issue is there even if we completely disregard all the other
> Unicode-related collating issues.
>
> Regards,
> Elias
>
> On 8 July 2014 12:00, David B. Lamkins <dlamk...@gmail.com> wrote:
>         Check my follow-up post.
>
>         I'm fairly certain that the issue is whether monadic grade
>         applied to a list of strings should do anything but signal a
>         domain error. The ISO spec says that monadic grade is defined
>         only on numeric arguments.
>
>         My test case appears to have monadic grade treating strings as
>         if they encode numbers in a sufficiently large base.
>
>         If you want to sort strings, use dyadic grade. The left
>         argument specifies a collating sequence.
>
>         On Tue, 2014-07-08 at 11:43 +0800, Elias Mårtenson wrote:
>         > Ordering by size first makes very little sense to me. It
>         > makes it very hard to sort any list of strings.
>         >
>         > I was hoping that the following would have done so, but it
>         > also suffers from the "length first" issue:
>         >
>         >       z[⍋ ⎕UCS¨ z←'aa' 'xx' 'aaa' 'xxx']
>         > aa xx aaa xxx
>         >
>         > What is the proper way to sort strings given the existing
>         > semantics of grade?
>         >
>         > Regards,
>         > Elias
>         >
>         > On 8 July 2014 02:34, David Lamkins <da...@lamkins.net> wrote:
>         >         Looking at the spec, it seems that monadic grade is
>         >         defined only for numeric data.
>         >
>         >         That leaves open the question of whether my example
>         >         should have signaled a domain error.
>         >
>         >         On Mon, Jul 7, 2014 at 11:25 AM, David Lamkins
>         >         <da...@lamkins.net> wrote:
>         >                 Given a list of character vectors (and
>         >                 scalars), grade appears to generate the
>         >                 permutation vector first by length then by
>         >                 content.
>         >
>         >                       ⍋'aaa' 'xx' 'y' 'bbb' 'cc'
>         >                 3 5 2 1 4
>         >
>         >                 This seems counterintuitive. It seems as if ⍋
>         >                 treats character strings like numbers. Is this
>         >                 a bug?
>         >
>         >                 --
>         >                 "The secret to creativity is knowing how to
>         >                 hide your sources."
>         >                    Albert Einstein
>         >
>         >                 http://soundcloud.com/davidlamkins
>         >                 http://reverbnation.com/lamkins
>         >                 http://reverbnation.com/lcw
>         >                 http://lamkins-guitar.com/
>         >                 http://lamkins.net/
>         >                 http://successful-lisp.com/