Re: Fwd: [fricas-devel] Extentions to symbols and strings

Waldek Hebisch Sat, 25 Nov 2023 16:08:34 -0800

On Sat, Nov 25, 2023 at 05:24:10PM +0100, Prof. Dr. Johannes Grabmeier wrote:
> 
> I enhanced Character and String with greek letters and other useful
> functions. Code string-enhanced.spad is included. Also symbol-jg.spad with
> signatures like alpha: () -> Symbol
<snip> 
> I am curious to learn what is going wrong now. And, if this can be fixed I
> would appreciate to include my enhancements to the distribution.


After fixing the make_full_CVEC problem your changes compile fine.

Some comments about including them in the FriCAS distribution:
currently we cannot assume that a Unicode character is a single
FriCAS character.  We have a utility function that produces a FriCAS
string consisting of a single Unicode character.  Currently GCL uses
UTF-8 encoding, where character codes are 8-bit and non-ASCII
Unicode characters produce strings of length bigger than 1.  And I
would like to have the possibility of representing FriCAS strings as
byte arrays even when using a different Lisp.  So I do not like the
idea of adding Unicode characters to FriCAS Character.
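
To illustrate the length issue outside FriCAS, here is a small Python
sketch (not FriCAS code; the helper name utf8_length is made up for
this example).  It counts the 8-bit units that one code point occupies
in UTF-8, which is the string length seen when character codes are
8-bit:

```python
def utf8_length(code_point: int) -> int:
    """Number of 8-bit units needed to store one code point in UTF-8."""
    return len(chr(code_point).encode("utf-8"))

# The source stays pure ASCII: code points are given as numbers.
# 0x41 is 'A', 0x3B1 is Greek small alpha, 0x221E is the infinity sign,
# 0x1D6FC is mathematical italic small alpha.
for cp in (0x41, 0x3B1, 0x221E, 0x1D6FC):
    print(hex(cp), utf8_length(cp))
```

Only the ASCII code point fits in a single unit; the others need two,
three and four units respectively.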

I also do not like the idea of representing CharacterClass by an
array with more than 256 entries.  First, it does not work when
character codes are 8-bit.  Second, once we add more codes there is
no natural limit; logically we should have the entire Unicode range,
that is 1114112 positions.  But such tables are very troublesome
performance-wise.  They take a lot of space and it takes time
to initialize them.  One may think that with gigabyte memories
this is not a problem.  Well, modern machines are fast when
data is in caches.  Caches have limited size and big tables
may exceed the size of caches.  Also, FriCAS normally is usable
on quite small machines; spending a lot of memory on something
that looks trivial means that it would be less usable on
small machines.
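
The size difference can be made concrete with a little arithmetic,
sketched here in Python (not FriCAS code).  The helper table_bytes and
the two entry widths are assumptions for illustration only; nothing is
assumed about how FriCAS actually stores a CharacterClass:

```python
UNICODE_POSITIONS = 0x110000  # 1114112 code points, 0 .. 0x10FFFF

def table_bytes(positions: int, bits_per_entry: int) -> int:
    """Bytes needed for a membership table with the given entry width."""
    return (positions * bits_per_entry + 7) // 8

for label, positions in (("8-bit codes", 256),
                         ("all of Unicode", UNICODE_POSITIONS)):
    # Print the size with one bit per entry, then with one byte per entry.
    print(label, table_bytes(positions, 1), table_bytes(positions, 8))
```

With one bit per position the 256-entry table takes 32 bytes, while the
full Unicode table takes 139264 bytes (136 KiB), and 1114112 bytes with
one byte per position.  Even the packed form is larger than the
first-level data cache of many CPUs.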

We already have equivalent functionality for some of the things
that you want to add.  In package ScanningUtilities there is a
'parse_integer' function which converts a string to an integer.
This covers the functionality of 'toInteger'.  If a function to
convert digits to integers is desired, it would be natural to add
it in the same package.  I am not sure how important
'toDecimalExpansion' is, but AFAICS the effect may be obtained by

decimal(parse_integer(s)::FRAC(INT))

Also, we have 'newline()' to get the end-of-line character,
so your 'endOfLine' just introduces a duplicate name.

Concerning adding Greek letters and other symbols to Symbol,
I am afraid of "namespace pollution".  Symbol is used
a lot and users may want to use names of Greek letters for
their own purposes.  So I would put such functions in a separate
unexposed package.  Concerning implementation, 'getUTF'
may be implemented as

getUTF(i : PositiveInteger) : Symbol == ucodeToString(i)::Symbol

which should work with our convention.  Similarly, 'newGreek'
should use an array of strings as the set of Greek letters (or
use ucodeToString to produce them on the fly).  ATM FriCAS
sources should use ASCII (non-ASCII characters may cause
failure in some cases).  So documentation strings should use
the spelled-out name of a character, not the character itself.
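
The idea of producing the letters on the fly from code points, while
keeping the source file pure ASCII, can be sketched in Python (not
FriCAS code; greek_lowercase is a hypothetical helper, and Python's
chr plays the role that ucodeToString has above).  The standard
unicodedata module supplies the spelled-out name that a documentation
string could use:

```python
import unicodedata

def greek_lowercase() -> list[str]:
    """Lowercase Greek letters alpha..omega, built from code points."""
    # 0x3B1 is alpha, 0x3C9 is omega; 0x3C2 (final sigma) is skipped.
    return [chr(cp) for cp in range(0x3B1, 0x3CA) if cp != 0x3C2]

letters = greek_lowercase()
print(len(letters))
# Spelled-out names are plain ASCII, unlike the letters themselves.
print(unicodedata.name(letters[0]))
print(unicodedata.name(letters[-1]))
```

No non-ASCII character appears in the source: the letters exist only
at run time, and their names are ordinary ASCII strings.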

More generally, for handling Unicode strings there may be some
use in functions that extract the next Unicode code point (an Integer
from 0 to 1114112 - 1) and functions that give the first string index
after/before a Unicode character.  Such functions allow iteration
over Unicode strings in both directions, which should be
enough for basic Unicode-aware string operations.
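
A minimal sketch of such functions, written in Python over a byte
array (not FriCAS code).  The names next_code_point, prev_index and
code_points are hypothetical, indices are Python's 0-based ones, and
the input is assumed to be valid UTF-8 with the index on a character
boundary:

```python
def next_code_point(data: bytes, i: int) -> tuple[int, int]:
    """Decode the code point starting at index i.

    Returns (code_point, index_after).  Assumes data is valid UTF-8
    and i points at the first byte of a character.
    """
    first = data[i]
    if first < 0x80:            # 0xxxxxxx: ASCII, one byte
        return first, i + 1
    if first < 0xE0:            # 110xxxxx: two bytes
        length, cp = 2, first & 0x1F
    elif first < 0xF0:          # 1110xxxx: three bytes
        length, cp = 3, first & 0x0F
    else:                       # 11110xxx: four bytes
        length, cp = 4, first & 0x07
    for b in data[i + 1:i + length]:
        cp = (cp << 6) | (b & 0x3F)   # each continuation byte adds 6 bits
    return cp, i + length


def prev_index(data: bytes, i: int) -> int:
    """Index of the first byte of the character that ends just before i."""
    i -= 1
    while i > 0 and (data[i] & 0xC0) == 0x80:   # skip 10xxxxxx bytes
        i -= 1
    return i


def code_points(data: bytes) -> list[int]:
    """Forward iteration built on next_code_point."""
    out, i = [], 0
    while i < len(data):
        cp, i = next_code_point(data, i)
        out.append(cp)
    return out
```

Forward iteration repeatedly calls next_code_point; backward iteration
starts at the end of the array and repeatedly calls prev_index.  No
error handling for malformed input is attempted here.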

-- 
                              Waldek Hebisch
