[Bug-apl] Request for enhancement

2015-07-23 Thread Fred Weigel
The current definition of letter includes a..z and A..Z, _, del, high
minus and underscored del.

I would like to request three additional UTF-8 characters to be added:

Overline E2 80 BE (UTF-8), complementary to _ (underline)

Combining underline CC B2 (UTF-8) -- if this follows a letter, the
letter is rendered with an underline.

Combining overline CC 85 (UTF-8) -- if this follows a letter, the letter
is rendered with an overline.

As well, I would like combining overline to be included in the
definition of numeric digit: this can be used to mark digits as
repeating in code: 0.3(overbar) could be expanded to 0.33.. (as many as
needed to fill the precision). 2.1(overbar)2(overbar) should become
2.121212... (as many as needed).
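
The requested expansion can be sketched outside the interpreter. The C++ below is only an illustration, not GNU APL code: the name `expand_repeating` and the `frac_digits` parameter are hypothetical, and it assumes each repeating digit is followed directly by the UTF-8 bytes CC 85 and that the overlined digits come last in the literal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expand a numeric literal whose repeating digits are marked with U+0305
// (combining overline, UTF-8 bytes CC 85) to `frac_digits` fractional digits.
// Hypothetical helper for illustration only.
std::string expand_repeating(const std::string& literal, std::size_t frac_digits)
{
    const std::string overline = "\xCC\x85";
    std::string fixed;     // characters before the repeating block
    std::string repetend;  // the overlined digits, in order

    for (std::size_t i = 0; i < literal.size(); ++i) {
        if (literal.compare(i + 1, overline.size(), overline) == 0) {
            repetend += literal[i];
            i += overline.size();          // skip the combining mark
        } else {
            assert(repetend.empty());      // assumption: repetend comes last
            fixed += literal[i];
        }
    }

    const std::size_t dot = fixed.find('.');
    if (dot == std::string::npos || repetend.empty()) return fixed;

    // Append repetend digits cyclically until the precision is filled.
    std::size_t have = fixed.size() - dot - 1;
    for (std::size_t k = 0; have < frac_digits; ++k, ++have)
        fixed += repetend[k % repetend.size()];
    return fixed;
}
```

Called with the two literals from the text and a precision of six fractional digits, this yields 0.333333 and 2.121212.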

Of course, I may be insane (my mother never had me tested).

Fred Weigel



Re: [Bug-apl] Prime Performance and others

2015-08-28 Thread Fred Weigel
Juergen

I just built 664 -- the performance of the factorial 300 that Mike Duvos
posted has indeed improved enormously - now running 8 seconds instead of
20 on my reference platform.

In line with expectations from Elias' profiling results.

Very well done!

FredW





[Bug-apl] Near Proof of concept of an )edit somefunction_name

2016-03-21 Thread Fred Weigel
Christian

Back in August, I started using GNU APL -- the first thing I did was to
bring up the Toronto Toolkit. The second was to implement edit function
and edit array. Find attached my work.

I convert the text of the function to a 'del' definition, edit it, and
then read it in with )copy. A )reset could be added, but I decided
against that.

Yes, I understand that xed and xeda are horrible hacks -- they work for
me. If you want to generalize and add to bits&pieces, feel free to do
so.

cr2lf is defined here to allow "easier" bringing in of external data
into a workspace. howdel is a reminder of the features of the del editor
(and some tricks, like capturing an interactive input line for a definition).

Fred Weigel


#!/usr/local/bin/apl --script
⍝
⍝ These functions are written as a sample extension of the Toronto
⍝ Toolkit. "explain 'xed'" will work, as will "comments 'xed'".
⍝ 'xed', 'xeda' and 'cr2lf' are added to toolkitlist, but only if the
⍝ toolkit is present. So, use ")copy 1 toolkit" before ")copy 1 xed"
⍝ for proper integration with the toolkit.
⍝
⍝ xed is set to call up my "e" editor, which is a terminal application.
⍝ It launches xterm to run a new terminal with the editor. This is done
⍝ because )host doesn't restore terminal settings before running its
⍝ given command.
⍝
⍝ As well, "explain 'del'" will provide a reminder of the features of
⍝ the del editor in GNU APL. (I used del until xed itself was
⍝ operational).
⍝
⍝ 'xeda array' edits an array by converting to csv, and using an
⍝ external editor (libreoffice calc). It returns the edited array.
⍝ It uses CSV to convert the array to a CSV, libreoffice calc to edit
⍝ the array, and CSV to read it in again. The CSV library brings in
⍝ FILE_IO, which is also needed.
⍝

⍝ )COPY 5 FILE_IO
)COPY 1 CSV

g∆nl ← ⎕UCS 10
g∆tmp ← '/tmp/'

g∆editor ← 'e '
g∆editorcsv ← 'oocalc '
g∆terminal ← 'xterm -xrm XTerm.vt100.initialFont:6 -e '

'→' ⎕EA 'toolkitlist ← toolkitlist on ''xed'''
'→' ⎕EA 'toolkitlist ← toolkitlist on ''xeda'''
'→' ⎕EA 'toolkitlist ← toolkitlist on ''cr2lf'''

∇r←cr2lf s
 ⍝convert CR to LF in string <s>
 ⍝.k text-editing
 ⍝.n fmgw
 ⍝.t 2015.8.5.10.33.0
 ⍝.v 1.0 / 05aug15
 r←s
 r[(s = ⎕UCS 13)/(⍳⍴s)]←g∆nl
∇

∇del
 ⍝describe 'del' editor usage
 ⍝.k text-editing
 ⍝.n fmgw
 ⍝.t 2015.8.5.10.33.0
 ⍝.v 1.0 /  05aug15
 'Enter: explain ''del'''
 'for a brief explanation of the ∇ editor and its use with GNU APL.'
∇

howcr2lf←'DOS (and APL2) used ASCII CR (carriage return, 13)'
howcr2lf←howcr2lf, ' to separate lines.', g∆nl
howcr2lf←howcr2lf, 'Linux (GNU APL) uses ASCII LF (line feed, 10)'
howcr2lf←howcr2lf, ' for this function.', g∆nl
howcr2lf←howcr2lf, 'cr2lf converts any CR characters in a string'
howcr2lf←howcr2lf, ' to LF characters.'

howdel←'Help for GNU APL built-in del (∇) editor', g∆nl, g∆nl
howdel←howdel, '∇FUN      open function FUN', g∆nl
howdel←howdel, '∇FUN[⎕]   open with command, ∆ and → not'
howdel←howdel, ' allowed', g∆nl
howdel←howdel, '∇FUN[⎕]∇  open, list, close', g∆nl, g∆nl
howdel←howdel, '∇         close', g∆nl
howdel←howdel, '⍫         close and lock', g∆nl, g∆nl
howdel←howdel, '[⎕] show   [⎕] [n⎕] [⎕m] [n⎕m] [⎕n-m]'
howdel←howdel, g∆nl
howdel←howdel, '[∆] delete [n∆] [∆m] [n∆m] [∆n-m]'
howdel←howdel, ' [∆n1 n2 ...]', g∆nl
howdel←howdel, '[→] escape (clear definition, keep'
howdel←howdel, ' header)', g∆nl
howdel←howdel, '[n] goto', g∆nl
howdel←howdel, '[n] text   replace existing text on line n'
howdel←howdel, g∆nl
howdel←howdel, 'text       replace existing text', g∆nl, g∆nl
howdel←howdel, 'After opening function, up and down arrows will'
howdel←howdel, ' retrieve function', g∆nl
howdel←howdel, '^K - kill to end of line, ^Y yank, ^A/^E, ^N/^P (can'
howdel←howdel, ' retrieve input', g∆nl
howdel←howdel, 'line, ^K, define a function, then ^Y).'

howxed←'xed ''function''. Edits function by calling out to an external'
howxed←howxed, ' editor.', g∆nl 
howxed←howxed, 'Uses globals g∆nl, g∆tmp, g∆editor and g

[Bug-apl] Defect in 722 and 723

2016-04-26 Thread Fred Weigel
A defect was introduced in revision 722.

In libapl.cc, line 410, a parameter "false" should be added.

FredW



[Bug-apl] Not a bug, need help coding search&replace on a vector

2016-06-21 Thread Fred Weigel
Christian Robert

For this, try the "toronto toolkit" in our bits&pieces.

There is a function "condense"

' ' condense '  a b   c d  '

a b c d

Which is what you want (I think). Uses ∆db (delete blanks). Preferable
to get from the toolkit -- functions reproduced below.

∇y←d condense v;b
 ⍝remove redundant blanks and blanks around characters specified in <d>
 ⍝.d 1985.6.20.10.10.10
 ⍝.e 'apple,betty,cat,,dog' = ',' condense '  apple, betty, cat, , dog'
 ⍝.k delete-characters
 ⍝.t 1992.3.10.20.19.23
 ⍝.v 1.1
 ⍝v is character vector (rank 1) only
 ⍝remove leading, trailing, and multiple internal blanks
 y←∆db,v
 ⍝remove blanks around characters specified in <d>
 ⍝e.g. if <d> = <,>, blanks are removed around commas in 'a , b , d'
 b←y∈d
 y←(∼(y=' ')∧(1⌽b)∨¯1⌽b)/y
∇

∇y←∆db v;b
 ⍝delete blanks (leading, trailing and multiple) from v (rank 0 - 2)
 ⍝.e 'apple betty cat' = ∆db '  apple  betty  cat  '
 ⍝.k delete-characters
 ⍝.v 1.1
 →((0 1 2=⍴⍴v),1)/l1,l1,l2,err1
 l1:
 b←' '≠v←' ',v
 y←1↓(b∨1⌽b)/v
 →0
 l2:
 b←∨⌿' '≠v←' ',v
 y←0 1↓(b∨1⌽b)/v
 →0
 err1:⎕←'∆db rank error'
∇
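
For readers who do not read APL, the behaviour of the two functions on a character vector can be sketched in C++. This is a rough equivalent written for illustration, not a translation from the toolkit: the names `delete_blanks` and `condense` are hypothetical, and only the rank 1 case is covered.

```cpp
#include <cstddef>
#include <string>

// Vector case of ∆db: drop leading, trailing and repeated blanks.
std::string delete_blanks(const std::string& v)
{
    std::string y;
    for (char c : v) {
        // skip a blank that would lead the result or repeat a blank
        if (c == ' ' && (y.empty() || y.back() == ' ')) continue;
        y += c;
    }
    if (!y.empty() && y.back() == ' ') y.pop_back();  // trailing blank
    return y;
}

// condense: after delete_blanks, also drop blanks that touch a
// character listed in d.
std::string condense(const std::string& d, const std::string& v)
{
    const std::string y = delete_blanks(v);
    std::string out;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const bool next_in_d =
            i + 1 < y.size() && d.find(y[i + 1]) != std::string::npos;
        const bool prev_in_d =
            i > 0 && d.find(y[i - 1]) != std::string::npos;
        if (y[i] == ' ' && (next_in_d || prev_in_d)) continue;
        out += y[i];
    }
    return out;
}
```

With `d` set to a comma, the `.e` example from the function header gives the same result as the APL version.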






[Bug-apl] Not a bug, need help coding search&replace on a vector

2016-06-21 Thread Fred Weigel
An alternative (from Hans-Peter Sorge on this list, April 13, 2016).
Careful though, this does have some edge conditions which will bite:
 ⍝replace <u> with <v> (<o> is 2 elements) in string <s>
 ⍝.k text-editing
 ⍝.n hans-peter sorge
 ⍝.t 2016.4.13.0.0.0
 ⍝.v 1.0 / 13apr16
 r←((0=⍴z)/a),z←∊((e×⍴v)↑¨⊂v),¨((⍴u)×e←+/¨(1+p)⊂p)↓¨(1++\p←u⍷a)⊂a←s⊣(u v)←o
∇
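
As a point of comparison, a plain left-to-right search and replace can be sketched in C++. This is not a translation of the one-liner above; `replace_all` is a hypothetical name, and the two edge cases it handles explicitly (an empty search string and overlapping matches) are assumptions about what is likely to bite, not a statement about where the APL version fails.

```cpp
#include <cstddef>
#include <string>

// Replace every non-overlapping occurrence of `from` in `s` with `to`,
// scanning left to right. An empty `from` leaves `s` unchanged.
std::string replace_all(const std::string& s,
                        const std::string& from,
                        const std::string& to)
{
    if (from.empty()) return s;          // nothing sensible to search for

    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t hit = s.find(from, pos);
        if (hit == std::string::npos) break;
        out.append(s, pos, hit - pos);   // text before the match
        out += to;
        pos = hit + from.size();         // continue after the match
    }
    out.append(s, pos, std::string::npos);  // remaining tail
    return out;
}
```

Because the scan resumes after each match, replacing 'aa' in 'aaa' touches only the first two characters.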






[Bug-apl] Windows Linux Subsystem

2016-08-03 Thread Fred Weigel
Mike

Sure, why not try? Indeed, as I understand it, this will run Linux
binaries. You will need:


libpthread.so.0 => /lib64/libpthread.so.0 (0x7fdeabe6c000)
libncurses.so.6 => /lib64/libncurses.so.6 (0x7fdeabc42000)
libtinfo.so.6 => /lib64/libtinfo.so.6 (0x7fdeaba16000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x7fdeab7fd000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x7fdeab475000)
libm.so.6 => /lib64/libm.so.6 (0x7fdeab16b000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x7fdeaaf54000)
libc.so.6 => /lib64/libc.so.6 (0x7fdeaab91000)

(or similar) shared objects. These are all standard. I don't even think
a recompile will be needed.

However, (again, as far as I know), the "linux" environment will not be
able to share files with the Windows side. Perhaps a modification of
AP210/APserver could be made, if needed. I think I will scratch at that
problem.

Then again, Cygwin does not suffer from that particular problem. And, it
shouldn't even be that big of an issue, because workspaces should be ok.


FredW



[Bug-apl] 2 Defects in 81x

2016-12-13 Thread Fred Weigel
Two defects to report.

I use GNU APL as libapl from another application (not lua). I also use
parallel.

Defect 1:

In libapl.cc, function Unicode_to_UTF8(), the memcpy() second argument
is &utf8.at(0). This is not correct with the current Simple_string.hh.

Easiest fix is to remove the "protected:" line in Simple_string.hh

Defect 2:

In Parallel.cc, there is a call all_CPUs.resize(count). With the current
Simple_string.hh this does not work.

Easiest fix is to change to all_CPUs.shrink(count)

Fred Weigel



Re: [Bug-apl] 2 Defects in 81x

2016-12-14 Thread Fred Weigel

Jürgen


Appears that both the libapl and parallel compilation are ok now.
Thanks!


I am beginning to scale out some problems. Will have to (either)
parallelize automatically, or do it manually. Just beginning to work in
that space. So far, you are right (just a slight benefit), but I haven't
yet had any lock-ups.


FredW



[Bug-apl] 844 Build Issue

2017-01-11 Thread Fred Weigel
Juergen

I just updated from 838 to 844. I (primarily) use libapl.so.

This does not work -- libapl needs an update. But, the easiest fix
is in SharedValuePointer.hh: at line 52, comment out the "protected:"
line, because libapl uses decrement_owner_count() and
increment_owner_count().

Fred Weigel



Re: [Bug-apl] Free APL reference documentation, any takers?

2017-04-17 Thread Fred Weigel
Elias, Juergen

I use the "Toronto Toolkit" convention, which looks like this:

∇y←x adjust d;⎕IO;ex;i;line;lmrg;pw;w
 ⍝adjust each row of matrix <d> according to parameters <x>
 ⍝.e ('/' ∆box 'please do not  / enter') = 15 adjust 'please do not enter'
 ⍝.k formatting
 ⍝.n rml
 ⍝.t 1992.4.24.14.4.17
 ⍝.v 1.0 / 05jan82
 ⍝.v 2.0 / 05apr88 / change order of , use subroutines
 ⍝.v 2.1 / 24apr92 / using signalerror
 ⍝ x[1] width of result in columns
 ⍝ x[2] width of left margin (i.e. number of blank columns)
 ⍝ x[3] number of blank lines to insert between each row
 ⎕IO←1

...and more of the function...

I suspect that if the first line is a lamp (comment) line, that can be
taken as the description, with a double lamp treated the same as a single
lamp. This lamp line may be indented, and could then be the "help" for
the function. The "Toronto Toolkit" format then has lamp lines with ".x"
(a dot followed by a letter) and data: .e is an example, .t is a
timestamp, .v is a version, .k is for keywords, .n is for name/initials,
and there are more. There are functions in the Toolkit that handle this
data. But, again, the first lamp line gives the overview.
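
To make the convention concrete, here is a small C++ sketch that recognises one tagged lamp line. It is an illustration only: `Tag` and `parse_tag` are hypothetical names, and the rule it applies (optional leading blanks, the lamp character, a dot, one letter, then the data) is inferred from the example above, not taken from the Toolkit or from GNU APL.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct Tag {
    char key;           // the letter after the dot, e.g. 'k' or 'v'
    std::string value;  // the rest of the line
};

// Return true and fill `out` if `line` looks like " ⍝.k formatting".
bool parse_tag(const std::string& line, Tag& out)
{
    const std::string lamp = "\xE2\x8D\x9D";   // U+235D in UTF-8

    std::size_t p = line.find_first_not_of(' ');
    if (p == std::string::npos) return false;
    if (line.compare(p, lamp.size(), lamp) != 0) return false;
    p += lamp.size();

    // need ".x" directly after the lamp
    if (p + 1 >= line.size() || line[p] != '.') return false;
    const char key = line[p + 1];
    if (!std::isalpha(static_cast<unsigned char>(key))) return false;

    const std::size_t v = line.find_first_not_of(' ', p + 2);
    out.key = key;
    out.value = (v == std::string::npos) ? std::string() : line.substr(v);
    return true;
}
```

Plain comment lines such as the `x[1]` descriptions in the example are rejected, so they can be kept as free-form help text.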

Just food for thought.
FredW



[Bug-apl] )HELP for defined functions

2017-04-20 Thread Fred Weigel
Juergen

The help extraction works beautifully! I am making a change to toolkit
to change first line of each comment to double-lamp (because that works
perfectly). And this gives great help when working interactively!

Again, many thanks!
FredW



[Bug-apl] wsid backups

2017-04-22 Thread Fred Weigel
Don't )DROP the .bak files as well!! If that is done, the DROP cannot be
undone!

As it is, if the .bak is needed, special consideration is needed; this
should not be considered a "normal" case.

So, the existing behaviour is good. The important thing is consistency.

Just my 2c

FredW



[Bug-apl] Use with word2vec

2017-04-28 Thread Fred Weigel
Juergen, and other GNU APL experts.

I am exploring neural nets, word2vec and some other AI related areas.

Right now, I want to tie in google's word2vec trained models (the
billion word one GoogleNews-vectors-negative300.bin.gz)

This is a binary file containing a lot of floating point data -- about
3.5GB of data. These are words, followed by cosine distances. I could
attempt to feed this in the slow way, and put it into an APL workspace.
But... I also intend to feed the data to a GPU. So, what I
am looking for is a modification to GNU APL (and yes, I am willing to do
the work) -- to allow for the complete suppression of normal C++
allocations, etc. and allow the introduction of simple float/double
vectors or matrices. It would be helpful to allow "C"-ish or UTF-8-ish
strings: the data is (C string containing word name) (fixed number of
floating point values)... repeated LOTs of times.

The data set(s) may be compressed, so I don't want to read them directly --
possibly from a shared memory region (64 bit system only, of course), or,
perhaps, using shared variables... but I don't think that would be fast
enough.

Anyway, this begins to allow the push into "big data" and AI
applications. Just looking for some input and ideas here.

Many thanks
Fred Weigel



Re: [Bug-apl] Use with word2vec

2017-04-29 Thread Fred Weigel
Thanks!

I'll probably go with SHMEM for future cuda/opencl use (I was thinking
along those lines). I don't yet need typical size -- the model I am
working with this weekend is vector8.bin, which is 71000 x 200 floats
(71000 words, each with 200 floats = 57MB) in size, but the *big* one is
much larger.
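
The size figures follow from rows x columns x bytes per element. A short C++ check, assuming 4-byte floats and 8-byte doubles (the helper name `matrix_bytes` is made up for this note):

```cpp
#include <cstdint>

// Storage needed for a dense rows x cols matrix, in bytes.
constexpr std::uint64_t matrix_bytes(std::uint64_t rows,
                                     std::uint64_t cols,
                                     std::uint64_t elem_size)
{
    return rows * cols * elem_size;
}

// 71000 words x 200 floats: 56,800,000 bytes, i.e. about 57MB.
static_assert(matrix_bytes(71000, 200, 4) == 56800000ULL, "small model");

// 3,000,000 words x 300 floats: 3.6e9 bytes as float, twice that as double.
static_assert(matrix_bytes(3000000, 300, 4) == 3600000000ULL, "big model");
static_assert(matrix_bytes(3000000, 300, 8) == 7200000000ULL, "as double");
```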

Fred Weigel

On Fri, 2017-04-28 at 21:32 -0400, Xiao-Yong Jin wrote:
> If shared variables can go through SHMEM, you can probably interface
> cuda that way without much bottle neck.
> But with the way GNU APL is implemented now, there are just too many
> other limitations on performance with arrays of such size.



Re: [Bug-apl] Use with word2vec

2017-04-29 Thread Fred Weigel
Leslie

It is not so much "interpret speed". The data is an array of floats (32
bit) - 71,000 to 3,000,000 rows each with 200 to 300 columns. Each row
will be subject to a vector multiplication for a query (obviously 71000
to millions, depending on number of rows). Yes, I am interested in
parallel computation (one of the reasons I started looking at GNU APL).

The data is completely clean -- no NANs, etc. Each row corresponds to a
word from a corpus. The word list is separate when computation begins
(but, in the model data, interleaved; I extract and build the memory
structures separately).

My test model is 71,000 x 200 floats, the "standard" model is 3,000,000
x 300 floats (3.5GB of memory)

The use is for low end AI (alternate word/concept selection, basic
analogies) to begin the process of deriving "meaning" from documents. I
figure around one billion operations per word in a document for this
processing. I am looking at APL specification and testing, and
deployment on GPGPU (OpenCL or CUDA). For example Futhark or something
like that.
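
The per-query work described above is one dot product per row. A minimal C++ sketch of that loop, assuming a dense row-major float matrix (the name `best_row` and the flat `std::vector` layout are choices made for this example, not taken from the actual code). If the rows and the query are normalized to unit length, the dot product is the cosine similarity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the index of the row of `m` (rows x cols, row-major) whose dot
// product with the query vector `q` is largest.
std::size_t best_row(const std::vector<float>& m,
                     std::size_t rows, std::size_t cols,
                     const std::vector<float>& q)
{
    assert(rows > 0 && m.size() == rows * cols && q.size() == cols);

    std::size_t best = 0;
    float best_score = 0.0f;
    for (std::size_t r = 0; r < rows; ++r) {
        float score = 0.0f;
        for (std::size_t c = 0; c < cols; ++c)
            score += m[r * cols + c] * q[c];   // one multiply-add per element
        if (r == 0 || score > best_score) {
            best = r;
            best_score = score;
        }
    }
    return best;
}
```

Each row is independent of the others, which is what makes the problem a natural fit for parallel hardware.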

FredW

On Sat, 2017-04-29 at 01:50 +, Leslie S Satenstein wrote:
> Hi Fred,
> Following up on Xiao-Yong Jin's response.
> 
> You did not mention if you need the data in realtime or if you can
> work at the APL interpreter speed. Do you have a structure for your
> data? You mentioned a format of [text][floats] without
> specifying the size of text and number of floats. Is your data clean or
> does it need to be vetted (NANs excluded)?
> I believe you should create a data dictionary constructed with
> sqlite. That data would be loaded into sqlite via some C, CPP, python
> code and subsequently read via shared variables. APL is an
> interpreter. What would take hours with APL to do what you want to
> do could take a few
> minutes by externally loading the sql database and then using APL for
> presentation.
> It's an interesting idea you have. Can you put out a more formal draft
> starter document?
> Something to fill in the topics below.
> Aim:
> Data Descriptions/Quantities:
> Vetting and Filtering:
> Processing speed:
> Frequency of use:
> 
> Since you propose to do the work, who can estimate the cost?



Re: [Bug-apl] Use with word2vec

2017-04-30 Thread Fred Weigel
Juergen

This is useful -- I was looking at LApack.cc already. It is in line with
what I need (as a template).

I am not worried about saving these things, but I have a 3,000,000 x 300
array of C float, and do a 300 element vector by 300 element multiply on
each of the 3 million rows in a "typical" processing step. I don't want
to convert to C double (that would increase memory from 3.6GB to
7.2GB). I don't really want to copy the data at all! I can generate a
descriptor to the data (memory pointer, dimensions). I think I want to
plant the data into a shared memory region (and, in future, pass it to a
GPU).

I think I want to do some specific functions on the data -- right now I
pass in row sets to GNU APL using the API, and execute APL code using the
API. However, the control is exclusively from outside APL, meaning I
cannot experimentally analyze using APL.

I can work on the model given by LApack.cc, and supply some functions
which (basically) provide a "virtual memory/workspace".

The main problem with these array sizes is saving and loading -- this
array would be around 30GB in GNU APL (as far as I can tell). If ever
saved, it would then take 300GB. I can convert from float to double, and
create the Cell structures, but I would want to simply mmap() the thing
into GNU APL (and, of course, never have the thing participate in memory
management). Again, I was leaning towards partial mapping, because when
I start with tensors, the arrays will be sparse.

So, two real problems -- (1) how to deal with LARGE non-sparse matrices,
and (2) how to deal with LARGE sparse matrices.

I really like the expression afforded by APL.

It may be possible to use the APL parser, and provide new
implementations of primitives -- thanks for that idea.

LApack.cc seems to provide for something I can start with -- the actual
LARGE arrays won't change so this provides a good demarcation point and
start for something workable.

Thanks!
Fred Weigel



On Sat, 2017-04-29 at 13:04 +0200, Juergen Sauermann wrote:
> Hi Fred,
> 
> I have not fully understood what you want to do exactly, but it
> looks to me as if you want to go for native GNU APL functions.
> Native functions provide the means to bypass the GNU APL interpreter
> itself to the extent desired. For example you can use APL variables
> but not the APL parser, or the APL parser but not the implementation
> of primitives, or whatever else you are up to.
> 
> As to plain double vectors, it is very difficult to introduce them
> as a new built-in data type because that change would affect: every
> APL primitive, every APL operator, )LOAD, )SAVE, )DUMP, and a lot
> more.
> 
> However, you can have a look at (the top-level of) the implementation
> of the matrix divide primitive which is doing what you are maybe after.
> The implementation of matrix divide expects either a double vector or
> a complex vector as argument(s) and returns such a vector as result.
> Before and after the computation of matrix divide a conversion between
> APL values and the plain double or complex vector is performed.
> This conversion is very lightweight. If you have a homogeneous GNU
> APL value, say all ravel items being double, then that value is almost
> like a C double *. The difference is a space between adjacent ravel
> elements. In other words (expressed in APL):
> 
>   C_vector ←→ 1 0 1 0 ... / APL_vector
> 
> I can provide you with more information if you want to go along
> this path.
> 
> /// Jürgen

[Bug-apl] 950 )HELP Issue, and mem.cc

2017-05-16 Thread Fred Weigel
Jürgen


In 950 (and going back a few releases) Command.cc has broken )HELP.

At (around) line 150, add

if (!strcmp(command, ")HELP")) return false;

Without this, things like ")HELP +" (or whatever) won't work.

Second: )HELP doesn't actually behave well with NATIVE functions. At
(around) line 962, try

if (ufun) ufun->help(CERR);

instead of

Assert(ufun);

Now, it would be useful to implement a "help" option to NATIVE -- but I
haven't done that.

As a side-note, it would be useful to have a CHANGES.txt file detailing
the changes from one release to the next.

Find attached a first cut at shared memory support (mem.cc). It allows
mmap() of a region (get the file handle from fopen), and supports
floating, integer, character, and complex vectors of different sizes,
transferring them between GNU APL and the shared memory region. I am
working with this currently, and when I get some experience will try to
graft it into some of the primitives. But this gives an idea of what I
was talking about at the beginning of the month.
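
For readers unfamiliar with the underlying mechanism, the following standalone C++ sketch maps a file read-only and views it as an array of float without copying it. It is not taken from mem.cc: `FloatView`, `map_floats` and `unmap_floats` are hypothetical names, it uses the POSIX calls open(), fstat(), mmap() and munmap() directly, and it assumes the file holds raw native-endian floats.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// A descriptor for mapped data: pointer plus element and byte counts.
struct FloatView {
    const float* data;
    std::size_t count;
    std::size_t bytes;
};

// Map the whole file at `path` read-only. Pages are only read from disk
// when they are touched. Returns false on any failure.
bool map_floats(const char* path, FloatView& out)
{
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);

    void* p = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       // the mapping stays valid after close
    if (p == MAP_FAILED) return false;

    out.data = static_cast<const float*>(p);
    out.count = bytes / sizeof(float);
    out.bytes = bytes;
    return true;
}

// Release the mapping and clear the descriptor.
void unmap_floats(FloatView& v)
{
    if (v.data) munmap(const_cast<float*>(v.data), v.bytes);
    v.data = nullptr;
    v.count = 0;
    v.bytes = 0;
}
```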

Fred Weigel


  


/**********************************************************************
 * mem.cc
 *
 * Implement shared memory for GNU APL. Vectors of the same type are
 * stored in memory buffers (mmap and munmap interfaces are provided).
 * The handle to access the file is provided by Quad FIO (fopen), and
 * a pointer to memory is provided. The size in bytes (chars) of each
 * supported type is provided, and is used to compute strides. Vectors
 * of each type can be transferred between the buffer and an APL
 * vector. The memory vectors must be of only one type. In future,
 * the APL primitives may be enhanced to support these uni-type
 * vectors directly, as this maximizes cache/memory performance.
 * The current advantage to using mem.cc is that the APL Workspace is
 * not bloated with data, and the data is not even read if it is not
 * used.
 *
 * Copyright (C) 2017 Fred Weigel
 *
 * This program is free software: you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation, either version 3 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see
 * <http://www.gnu.org/licenses/>.
 **********************************************************************/

#include 
#include 
#include 
#include 
#include 

#include "../Value.icc"
#include "../Native_interface.hh"
#include "../Quad_FIO.hh"

class NativeFunction;

static Fun_signature get_signature() {
return SIG_Z_A_F2_B;
}

static Token help(ostream &out) {

out <<
"   Functions provided by MEM...\n"
"\n"
"   Legend: e - error code (integer)\n"
"   h - file handle (integer)\n"
"   p - pointer (integer)\n"
"   i - integer\n"
"   s - string\n"
"\n"
"   MEM[  0] ''   print this text\n"
"   Zp ←MEM[  1] B1h B2i B3s  mmap fd B1h, length B2i, mode B3s,\n"
"B4i   offset B4i\n"
" mode: r read\n"
"   w write\n"
"   s shared (default private)\n"
"   h huge pages\n"
"   n noreserve\n"
"   p populate\n"
" B3s default: rw private\n"
" B1h = -1: anonymous map\n"
"   Ze ←MEM[  2] B1p B2i  munmap pointer B1p, leng

[Bug-apl] More word2vec

2017-05-25 Thread Fred Weigel
Jürgen, GNU APL Gurus

More on my current AI in APL work. I have implemented functions
setup∆word2vec, distance and analogy in GNU APL. Run setup∆word2vec
first, and then distance (try 'dog' when prompted for input). Try
analogy with 'paris france berlin' (which should, of course, yield
germany). The file vector8 must be in the current directory when running
the setup function.

To use this, you will have to build mem.cc -- put it into your GNU APL
source in src/native, and add lib_mem.la to pkglib_LTLIBRARIES, and add
a line 'lib_mem_la_SOURCES = mem.cc'

You then need to 'autoreconf', 'configure' and 'make'. Since this is
still early development, none of that has been automated. Also, this has
ONLY been run on Linux 64 bit (no other platform has been tried). 

See describe∆word2vec for some details on data sizing. You can, of
course, examine the functions in the workspace without having a
lib_mem.so file, but those native functions are needed to run the
sample.

Here are the files (gzip compressed).

https://www.dropbox.com/s/cfcaojjuzjxra7j/mem.cc.gz?dl=0
https://www.dropbox.com/s/97f5umkh3xd72cb/vector8.gz?dl=0
https://www.dropbox.com/s/pfheb6qic9wefqd/word2vec.xml.gz?dl=0

I am still using C code to generate vector8, but I would like to convert
the training to APL as well.

This is an embarrassingly parallel problem. I am thinking about how to
push the access to the dataset lower into the APL to achieve more
efficiency.

Any comments/feedback/ideas are welcome. This is a very simple AI
application, using (at present) a very very small model. I am looking to
begin "scaling" this development soon. I need to be able to support both
very dense datasets and sparse datasets (using additional transfer
calls). The sparse datasets will be for tensor support. Again, feedback
is welcome. I haven't yet implemented any of the tensor stuff -- right
now, concentrating on tooling issues (I like APL for this work).

Fred Weigel



[Bug-apl] 956 Command.cc issue

2017-06-08 Thread Fred Weigel
Juergen

I noticed in 956, that there is a defect in Command.cc. At line 170:

if (args_ucs[1] == '.' && args_ucs[2] == '.') many = true;

has two defects. First, it may read off the end of the prototype
string. Second, it doesn't work as intended. I propose:

if ((a > 1) && (args_ucs[a - 1] == '.') && (args_ucs[a - 2] == '.'))
many = true;

That works (try the command )HOST ls -l for a quick test case).
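
The idea behind the proposed check can be shown in isolation. In the sketch below, `ends_with_two_dots` is a hypothetical helper and `std::string` stands in for the UCS string used in Command.cc; it assumes `a` in the proposal is the length of `args_ucs`. The length test comes first, so the two index expressions are never evaluated for strings shorter than two characters.

```cpp
#include <cstddef>
#include <string>

// True if `args` ends in "..". The size check guards both index
// expressions, so nothing is read outside the string.
bool ends_with_two_dots(const std::string& args)
{
    const std::size_t a = args.size();
    return a > 1 && args[a - 1] == '.' && args[a - 2] == '.';
}
```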

Fred Weigel

  



[Bug-apl] Dyalog 16.0

2017-06-30 Thread Fred Weigel
Dyalog 16.0 has incorporated CEF (the Chromium Embedded Framework). I
have also (somewhat over 6 months ago) incorporated CEF into GNU APL. I
use it for some interface stuff.
Only tested on Linux 64 bit (but the underlying native library works on
Windows 64 bit as well -- I have no plans for Mac). Is there any
interest in this work?

Also, has anyone tried my word2vec stuff? Is there interest in an update
to the shared memory interface? I have it complete and (mostly)
documented.

You can also email me directly: fred_weigel at hotmail dot com will
work.

FredW



[Bug-apl] Issues building with GCC 8.1.1

2018-05-23 Thread Fred Weigel
I just updated to Fedora 28. Had some issues compiling GNU APL:

    ______ _   __ __  __    ___     ____   __
   / ____// | / // / / /   /   |   / __ \ / /
  / / __ /  |/ // / / /   / /| |  / /_/ // /
 / /_/ // /|  // /_/ /   / ___ | / ____// /___
 \____//_/ |_/ \____/   /_/  |_|/_/    /_____/

 Welcome to GNU APL version 1.7 local / 1050M

Copyright (C) 2008-2016  Dr. Jürgen Sauermann
   Banner by FIGlet: www.figlet.org

This program comes with ABSOLUTELY NO WARRANTY;
  for details run: apl --gpl.

 This program is free software, and you are welcome to redistribute it
 according to the GNU Public License (GPL) version 3 or later.

DUMPED 2017-08-06  18:41:50 (GMT-4)


Note that this has some local changes (memory mapping support, and some
other minor changes), thus the designation "1.7 local".

But, these notes apply to the unaltered version 1050 as well (and the
pragma notes apply to 1047).

c++ (GCC) 8.1.1 20180502 (Red Hat 8.1.1-1)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This version of GCC is more stringent: it emits new warnings that stop
the build. Adding the following pragmas lets it compile:

LibPaths.cc:#pragma GCC diagnostic ignored "-Wstringop-truncation"
Quad_SVx.cc:#pragma GCC diagnostic ignored "-Wformat-truncation"
Svar_DB.cc:#pragma GCC diagnostic ignored "-Wformat-truncation"
Svar_record.cc:#pragma GCC diagnostic ignored "-Wformat-truncation"
Svar_record.hh:#pragma GCC diagnostic ignored "-Wclass-memaccess"
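For anyone who would rather not silence a warning for a whole file, the same pragma can be scoped with push/pop. This is only a sketch: the function below is a made-up example, not GNU APL code, and the pragmas listed above are file-wide as written.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not GNU APL code: copies at most cap-1 bytes of
// src into dest and always NUL-terminates the result.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-truncation"
void copy_truncated(char *dest, const char *src, std::size_t cap)
{
    if (cap == 0) return;              // nothing can be stored
    std::strncpy(dest, src, cap - 1);  // may truncate src on purpose
    dest[cap - 1] = '\0';              // terminate explicitly
}
#pragma GCC diagnostic pop             // warning is active again from here on
```

The push/pop pair restores the previous diagnostic state, so the rest of the file is still checked.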

If RATIONAL_NUMBERS_DEFINED is defined:

FloatCell.cc line 527

 const FloatCell inv_B(denom, numer);

should be

 const FloatCell inv_B(B_denom, B_numer);

IntCell.cc line 533

 const APL_Integer b = get_int_value();

should be

 APL_Integer b = get_int_value();

and

 const APL_Integer a = A->get_int_value();

should be

 APL_Integer a = A->get_int_value();

(the const has to go in both places because the code later assigns
a = -a and b = -b)
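The const problem is easy to reproduce in isolation. The sketch below is not the code from IntCell.cc; the typedef and the function are stand-ins chosen for illustration. It shows the pattern: a local that gets sign-flipped later cannot be declared const.

```cpp
#include <cstdint>

typedef std::int64_t APL_Integer;   // assumption: stand-in for GNU APL's type

// Hypothetical function, not from IntCell.cc: greatest common divisor of
// the magnitudes of two integers.
APL_Integer gcd_of_magnitudes(APL_Integer a_in, APL_Integer b_in)
{
    APL_Integer a = a_in;   // must not be const: reassigned below
    APL_Integer b = b_in;   // likewise
    if (a < 0) a = -a;      // with "const APL_Integer a" this line is an error
    if (b < 0) b = -b;
    while (b != 0)
    {
        const APL_Integer r = a % b;   // r is never reassigned, so const is fine
        a = b;
        b = r;
    }
    return a;
}
```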

Fred Weigel



Re: [Bug-apl] Issues building with GCC 8.1.1

2018-05-24 Thread Fred Weigel
Juergen

Thanks! All worked, except (as you suspected) Svar_record. Here is the
error message from GCC 8.1.1:

libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I.. -Wall -I sql -Wold-style-cast -Werror -I/usr/include -I/usr/include -rdynamic -O3 -MT libapl_la-Archive.lo -MD -MP -MF .deps/libapl_la-Archive.Tpo -c Archive.cc  -fPIC -DPIC -o .libs/libapl_la-Archive.o
In file included from Svar_DB.hh:32,
                 from Symbol.hh:30,
                 from SystemVariable.hh:26,
                 from Quad_RL.hh:24,
                 from Workspace.hh:32,
                 from Archive.hh:28,
                 from Archive.cc:29:
Svar_record.hh: In member function 'void Svar_record::clear()':
Svar_record.hh:183:56: error: 'void* memset(void*, int, size_t)' clearing an object of non-trivial type 'struct Svar_record'; use assignment or value-initialization instead [-Werror=class-memaccess]
void clear()   { memset(this, 0, sizeof(Svar_record)); }
^
Svar_record.hh:174:8: note: 'struct Svar_record' declared here
 struct Svar_record
^~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:1176: libapl_la-Archive.lo] Error 1
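One way to follow the hint in the message ("use assignment or value-initialization instead") is sketched below. The Record struct is a hypothetical stand-in; the real Svar_record has different members, and this is not necessarily how it should be fixed in GNU APL.

```cpp
#include <string>

// Hypothetical stand-in for a non-trivial record type (not Svar_record).
struct Record
{
    int         id;
    std::string name;   // non-trivial member: memset() over it is unsafe

    Record() : id(0), name() {}

    // Reset every member by assigning a freshly constructed object,
    // instead of memset(this, 0, sizeof(Record)).
    void clear()   { *this = Record(); }
};
```

The assignment goes through the members' own assignment operators, so nothing is overwritten behind the type's back, and GCC 8 has nothing to complain about.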

Fred Weigel

On Thu, 2018-05-24 at 13:53 +0200, Juergen Sauermann wrote:
> Hi Fred,
> 
> thanks, hopefully fixed in SVN 1051.
> 
> The -Wclass-memaccess warning is not documented in the GCC 8.1 manual,
> so the warnings in Svar_record.cc and/or Svar_record.hh may have
> survived my attempt to fix them. If so, please send me the complete
> warning output (containing the source line number) so that I can give
> it another try.
> 
> Best regards,
> 
> /// Jürgen
> 
> 