We need a predicate like
(: flonum-matrix? (All (A) (-> (Matrix A) Boolean : (Matrix Flonum))))
Then `matrix-solve` could dispatch to `flmatrix-solve` and still be
well-typed. We could/should do something similar for every operation for
which checking flonum-ness is cheap compared to computing the result,
which at least includes everything O(n^3).
One thing we should really do is get your LAPACK FFI into the math
library and have `flmatrix-solve` use that, but fail over to Racket code
systems that don't have LAPACK. If I remember right, it would have to
transpose the data because LAPACK is column-major.
Some thoughts, in no particular order:
1. Because of transposition and FFI overhead, there's a matrix size
threshold under which we ideally should use the code below, even on
systems with LAPACK installed.
2. Because of small differences in how it finds pivots, LAPACK's
solver can return slightly different results. Should we worry about that
at all?
3. A design decision: if a matrix contains just one flonum, should we
convert it to (Matrix Flonum) and solve it quickly with
`flmatrix-solve`, or use the current `matrix-solve` to preserve some of
its exactness?
I lean toward regarding a matrix with one flonum as a flonum matrix.
It's definitely easier to write library code for, and would make it
easier for users to predict when a result entry will be exact.
Currently, we have this somewhat confusing situation, in which how
pivots are chosen determines which result entries are exact:
> (matrix-row-echelon (matrix ([1 2 3] [4.0 5 4])) #t #t 'first)
(mutable-array #[#[1 0 -2.333333333333333]
#[-0.0 1.0 2.6666666666666665]])
> (matrix-row-echelon (matrix ([1 2 3] [4.0 5 4])) #t #t 'partial)
(mutable-array #[#[1.0 0.0 -2.333333333333333]
#[0 1.0 2.6666666666666665]])
I doubt this has caused problems for anyone, but it bothers me a little.
Neil ⊥
On 05/13/2014 12:45 PM, Jens Axel Søgaard wrote:
That's great!
The question is now how to automate this sort of thing.
/Jens Axel
2014-05-13 1:39 GMT+02:00 Neil Toronto <neil.toro...@gmail.com>:
When I change it to operate on (Vectorof FlVector) instead of (Vectorof
(Vectorof Flonum)), I get this:
cpu time: 996 real time: 995 gc time: 22
1.0000000000009335
cpu time: 15387 real time: 15384 gc time: 13006
1.0000000000009335
cpu time: 1057 real time: 1056 gc time: 85
1.0000000000009335
cpu time: 11514 real time: 11510 gc time: 9097
1.0000000000009335
cpu time: 1079 real time: 1079 gc time: 100
1.0000000000009335
cpu time: 15425 real time: 15426 gc time: 13072
1.0000000000009335
When I use `racket/unsafe/ops` instead of `math/private/unsafe`, I get this:
cpu time: 591 real time: 591 gc time: 17
1.0000000000016622
cpu time: 11514 real time: 11509 gc time: 9195
1.0000000000016622
cpu time: 604 real time: 604 gc time: 31
1.0000000000016622
cpu time: 15739 real time: 15737 gc time: 13358
1.0000000000016622
cpu time: 596 real time: 595 gc time: 24
1.0000000000016622
cpu time: 11498 real time: 11493 gc time: 9154
1.0000000000016622
Racket's floating-point math is fast if the flonums aren't allocated
separately on the heap.
Neil ⊥
On 05/12/2014 02:17 PM, Jens Axel Søgaard wrote:
Hi Eric,
You were absolute right. The version below cuts the time in half.
It is mostly cut and paste from existing functions and removing
non-Flonum cases.
/Jens Axel
#lang typed/racket
(require math/matrix
math/array
math/private/matrix/utils
math/private/vector/vector-mutate
math/private/unsafe
(only-in racket/unsafe/ops unsafe-fl/)
racket/fixnum
racket/flonum
racket/list)
(define-type Pivoting (U 'first 'partial))
(: flonum-matrix-gauss-elim
(case-> ((Matrix Flonum) -> (Values (Matrix Flonum) (Listof Index)))
((Matrix Flonum) Any -> (Values (Matrix Flonum) (Listof
Index)))
((Matrix Flonum) Any Any -> (Values (Matrix Flonum) (Listof
Index)))
((Matrix Flonum) Any Any Pivoting -> (Values (Matrix
Flonum) (Listof Index)))))
(define (flonum-matrix-gauss-elim M [jordan? #f] [unitize-pivot? #f]
[pivoting 'partial])
(define-values (m n) (matrix-shape M))
(define rows (matrix->vector* M))
(let loop ([#{i : Nonnegative-Fixnum} 0]
[#{j : Nonnegative-Fixnum} 0]
[#{without-pivot : (Listof Index)} empty])
(cond
[(j . fx>= . n)
(values (vector*->matrix rows)
(reverse without-pivot))]
[(i . fx>= . m)
(values (vector*->matrix rows)
;; None of the rest of the columns can have pivots
(let loop ([#{j : Nonnegative-Fixnum} j] [without-pivot
without-pivot])
(cond [(j . fx< . n) (loop (fx+ j 1) (cons j
without-pivot))]
[else (reverse without-pivot)])))]
[else
(define-values (p pivot)
(case pivoting
[(partial) (find-partial-pivot rows m i j)]
[(first) (find-first-pivot rows m i j)]))
(cond
[(zero? pivot) (loop i (fx+ j 1) (cons j without-pivot))]
[else
;; Swap pivot row with current
(vector-swap! rows i p)
;; Possibly unitize the new current row
(let ([pivot (if unitize-pivot?
(begin (vector-scale! (unsafe-vector-ref rows
i)
(unsafe-fl/ 1. pivot))
(unsafe-fl/ pivot pivot))
pivot)])
(flonum-elim-rows! rows m i j pivot (if jordan? 0 (fx+ i 1)))
(loop (fx+ i 1) (fx+ j 1) without-pivot))])])))
(: flonum-elim-rows!
((Vectorof (Vectorof Flonum)) Index Index Index Flonum
Nonnegative-Fixnum -> Void))
(define (flonum-elim-rows! rows m i j pivot start)
(define row_i (unsafe-vector-ref rows i))
(let loop ([#{l : Nonnegative-Fixnum} start])
(when (l . fx< . m)
(unless (l . fx= . i)
(define row_l (unsafe-vector-ref rows l))
(define x_lj (unsafe-vector-ref row_l j))
(unless (= x_lj 0)
(flonum-vector-scaled-add! row_l row_i (fl* -1. (fl/ x_lj
pivot)) j)
;; Make sure the element below the pivot is zero
(unsafe-vector-set! row_l j (- x_lj x_lj))))
(loop (fx+ l 1)))))
(: flonum-matrix-solve
(All (A) (case->
((Matrix Flonum) (Matrix Flonum) -> (Matrix Flonum))
((Matrix Flonum) (Matrix Flonum) (-> A) -> (U A (Matrix
Flonum))))))
(define flonum-matrix-solve
(case-lambda
[(M B) (flonum-matrix-solve
M B (λ () (raise-argument-error 'flonum-matrix-solve
"matrix-invertible?" 0 M B)))]
[(M B fail)
(define m (square-matrix-size M))
(define-values (s t) (matrix-shape B))
(cond [(= m s)
(define-values (IX wps)
(parameterize ([array-strictness #f])
(flonum-matrix-gauss-elim (matrix-augment (list M B)) #t
#t)))
(cond [(and (not (empty? wps)) (= (first wps) m))
(submatrix IX (::) (:: m #f))]
[else (fail)])]
[else
(error 'flonum-matrix-solve
"matrices must have the same number of rows; given ~e
and ~e"
M B)])]))
(define-syntax-rule (flonum-vector-generic-scaled-add! vs0-expr
vs1-expr v-expr start-expr + *)
(let* ([vs0 vs0-expr]
[vs1 vs1-expr]
[v v-expr]
[n (fxmin (vector-length vs0) (vector-length vs1))])
(let loop ([#{i : Nonnegative-Fixnum} (fxmin start-expr n)])
(if (i . fx< . n)
(begin (unsafe-vector-set! vs0 i (+ (unsafe-vector-ref vs0 i)
(* (unsafe-vector-ref vs1
i) v)))
(loop (fx+ i 1)))
(void)))))
(: flonum-vector-scaled-add!
(case-> ((Vectorof Flonum) (Vectorof Flonum) Flonum -> Void)
((Vectorof Flonum) (Vectorof Flonum) Flonum Index -> Void)))
(define (flonum-vector-scaled-add! vs0 vs1 s [start 0])
(flonum-vector-generic-scaled-add! vs0 vs1 s start + *))
(: mx Index)
(define mx 600)
(: r (Index Index -> Flonum))
(define (r i j) (random))
(: A : (Matrix Flonum))
(define A (build-matrix mx mx r))
(: sum : Integer Integer -> Flonum)
(define (sum i n)
(let loop ((j 0) (acc 0.0))
(if (>= j mx) acc
(loop (+ j 1) (+ acc (matrix-ref A i j))) )))
(: b : (Matrix Flonum))
(define b (build-matrix mx 1 sum))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
/Jens Axel
2014-05-11 23:26 GMT+02:00 Eric Dobson <eric.n.dob...@gmail.com>:
Where is the time spent in the algorithm? I assume that most of it is
in the matrix manipulation work not the orchestration of finding a
pivot and reducing based on that. I.e. `elim-rows!` is the expensive
part. Given that you only specialized the outer part of the loop, I
wouldn't expect huge performance changes.
On Sun, May 11, 2014 at 2:13 PM, Jens Axel Søgaard
<jensa...@soegaard.net> wrote:
I tried restricting the matrix-solve and matrix-gauss-elim to (Matrix
Flonum).
I can't observe a change in the timings.
#lang typed/racket
(require math/matrix
math/array
math/private/matrix/utils
math/private/vector/vector-mutate
math/private/unsafe
(only-in racket/unsafe/ops unsafe-fl/)
racket/fixnum
racket/list)
(define-type Pivoting (U 'first 'partial))
(: flonum-matrix-gauss-elim
(case-> ((Matrix Flonum) -> (Values (Matrix Flonum) (Listof Index)))
((Matrix Flonum) Any -> (Values (Matrix Flonum) (Listof
Index)))
((Matrix Flonum) Any Any -> (Values (Matrix Flonum) (Listof
Index)))
((Matrix Flonum) Any Any Pivoting -> (Values (Matrix
Flonum) (Listof Index)))))
(define (flonum-matrix-gauss-elim M [jordan? #f] [unitize-pivot? #f]
[pivoting 'partial])
(define-values (m n) (matrix-shape M))
(define rows (matrix->vector* M))
(let loop ([#{i : Nonnegative-Fixnum} 0]
[#{j : Nonnegative-Fixnum} 0]
[#{without-pivot : (Listof Index)} empty])
(cond
[(j . fx>= . n)
(values (vector*->matrix rows)
(reverse without-pivot))]
[(i . fx>= . m)
(values (vector*->matrix rows)
;; None of the rest of the columns can have pivots
(let loop ([#{j : Nonnegative-Fixnum} j] [without-pivot
without-pivot])
(cond [(j . fx< . n) (loop (fx+ j 1) (cons j
without-pivot))]
[else (reverse without-pivot)])))]
[else
(define-values (p pivot)
(case pivoting
[(partial) (find-partial-pivot rows m i j)]
[(first) (find-first-pivot rows m i j)]))
(cond
[(zero? pivot) (loop i (fx+ j 1) (cons j without-pivot))]
[else
;; Swap pivot row with current
(vector-swap! rows i p)
;; Possibly unitize the new current row
(let ([pivot (if unitize-pivot?
(begin (vector-scale! (unsafe-vector-ref
rows i)
(unsafe-fl/ 1.
pivot))
(unsafe-fl/ pivot pivot))
pivot)])
(elim-rows! rows m i j pivot (if jordan? 0 (fx+ i 1)))
(loop (fx+ i 1) (fx+ j 1) without-pivot))])])))
(: flonum-matrix-solve
(All (A) (case->
((Matrix Flonum) (Matrix Flonum) -> (Matrix
Flonum))
((Matrix Flonum) (Matrix Flonum) (-> A) -> (U A (Matrix
Flonum))))))
(define flonum-matrix-solve
(case-lambda
[(M B) (flonum-matrix-solve
M B (λ () (raise-argument-error 'flonum-matrix-solve
"matrix-invertible?" 0 M B)))]
[(M B fail)
(define m (square-matrix-size M))
(define-values (s t) (matrix-shape B))
(cond [(= m s)
(define-values (IX wps)
(parameterize ([array-strictness #f])
(flonum-matrix-gauss-elim (matrix-augment (list M B))
#t #t)))
(cond [(and (not (empty? wps)) (= (first wps) m))
(submatrix IX (::) (:: m #f))]
[else (fail)])]
[else
(error 'flonum-matrix-solve
"matrices must have the same number of rows; given
~e and ~e"
M B)])]))
(: mx Index)
(define mx 600)
(: r (Index Index -> Flonum))
(define (r i j) (random))
(: A : (Matrix Flonum))
(define A (build-matrix mx mx r))
(: sum : Integer Integer -> Flonum)
(define (sum i n)
(let loop ((j 0) (acc 0.0))
(if (>= j mx) acc
(loop (+ j 1) (+ acc (matrix-ref A i j))) )))
(: b : (Matrix Flonum))
(define b (build-matrix mx 1 sum))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (flonum-matrix-solve A b))]
(matrix-ref m 0 0)))
(time
(let [(m (matrix-solve A b))]
(matrix-ref m 0 0)))
2014-05-11 21:48 GMT+02:00 Neil Toronto <neil.toro...@gmail.com>:
The garbage collection time is probably from cleaning up boxed flonums,
and
possibly intermediate vectors. If so, a separate implementation of
Gaussian
elimination for the FlArray type would cut the GC time to nearly zero.
Neil ⊥
On 05/11/2014 01:36 PM, Jens Axel Søgaard wrote:
Or ... you could take a look at
https://github.com/plt/racket/blob/master/pkgs/math-pkgs/math-lib/math/private/matrix/matrix-gauss-elim.rkt
at see if something can be improved.
/Jens Axel
2014-05-11 21:30 GMT+02:00 Jens Axel Søgaard <jensa...@soegaard.net>:
Hi Eduardo,
The math/matrix library uses the arrays from math/array to represent
matrices.
If you want to try the same representation as Bigloo, you could try
Will
Farr's
matrix library:
http://planet.racket-lang.org/package-source/wmfarr/simple-matrix.plt/1/1/planet-docs/simple-matrix/index.html
I am interested in hearing the results.
/Jens Axel
2014-05-11 21:18 GMT+02:00 Eduardo Costa <edu50...@gmail.com>:
What is bothering me is the time Racket is spending in garbage
collection.
~/wrk/scm/rkt/matrix$ racket matrix.rkt
0.9999999999967226
cpu time: 61416 real time: 61214 gc time: 32164
If I am reading the output correctly, Racket is spending 32 seconds
out
of
61 seconds in garbage collection.
I am following Junia Magellan's computer language comparison and
I
cannot
understand why Racket needs the garbage collector for doing Gaussian
elimination. In a slow Compaq/HP machine, solving a system of 800
linear
equations takes 17.3 seconds in Bigloo, but requires 58 seconds in
Racket,
even after removing the building of the linear system from
consideration.
Common Lisp is also much faster than Racket in processing arrays. I
would
like to point out that Racket is very fast in general. The only
occasion
that it lags badly behind Common Lisp and Bigloo is when one needs
to
deal
with arrays.
Basically, Junia is using Rasch method to measure certain latent
traits
of
computer languages, like productivity and coaching time. In any
case,
she
needs to do a lot of matrix calculations to invert the Rasch model.
Since
Bigloo works with homogeneous vectors, she wrote a few macros to
access
the
elements of a matrix:
(define (mkv n) (make-f64vector n))
(define $ f64vector-ref)
(define $! f64vector-set!)
(define len f64vector-length)
(define-syntax $$
(syntax-rules ()
(($$ m i j) (f64vector-ref (vector-ref m i) j))))
(define-syntax $$!
(syntax-rules ()
(($$! matrix row column value)
($! (vector-ref matrix row) column value))))
I wonder whether homogeneous vectors would speed up Racket. In the
same
computer that Racket takes 80 seconds to build and invert a system
of
equations, Bigloo takes 17.3 seconds, as I told before. Common Lisp
is
even
faster. However, if one subtracts the gc time from Racket's total
time,
the
result comes quite close to Common Lisp or Bigloo.
~/wrk/bgl$ bigloo -Obench bigmat.scm -o big
~/wrk/bgl$ time ./big
0.9999999999965746 1.000000000000774 0.9999999999993039
0.9999999999982576
1.000000000007648 0.999999999996588
real 0m17.423s
user 0m17.384s
sys 0m0.032s
~/wrk/bgl$
Well, bigloo may perform global optimizations, but Common Lisp
doesn't.
When
one is not dealing with matrices, Racket is faster than Common Lisp.
I
hope
you can tell me how to rewrite the program in order to avoid garbage
collection.
By the way, you may want to know why not use Bigloo or Common Lisp
to
invert
the Rasch model. The problem is that Junia and her co-workers are
using
hosting services that do not give access to the server or to the
jailshell.
Since Bigloo requires gcc based compilation, Junia discarded it
right
away.
Not long ago, the hosting service stopped responding to the sbcl
Common
Lisp
compiler for reasons that I cannot fathom. Although Racket 6.0
stopped
working too, Racket 6.0.1 is working fine. This left Junia, her
co-workers
and students with Racket as their sole option. As for myself, I am
just
curious.
2014-05-11 6:23 GMT-03:00 Jens Axel Søgaard <jensa...@soegaard.net>:
2014-05-11 6:09 GMT+02:00 Eduardo Costa <edu50...@gmail.com>:
The documentation says that one should expect typed/racket to be
faster
than
racket. I tested the math/matrix library and it seems to be almost
as
slow
in typed/racket as in racket.
What was (is?) slow was a call in an untyped module A to a function
exported
from a typed module B. The functions in B must check at runtime
that
the values coming from A are of the correct type. If the A was
written
in Typed Racket, the types would be known at compile time.
Here math/matrix is written in Typed Racket, so if you are writing
an
untyped module, you will in general want to minimize the use
of,say,
maxtrix-ref. Instead operations that works on entire matrices or
row/columns are preferred.
(: sum : Integer Integer -> Flonum)
(define (sum i n)
(let loop ((j 0) (acc 0.0))
(if (>= j mx) acc
(loop (+ j 1) (+ acc (matrix-ref A i j))) )))
(: b : (Matrix Flonum))
(define b (build-matrix mx 1 sum))
The matrix b contains the sums of each row in the matrix.
Since matrices are a subset of arrays, you can use array-axis-sum,
which computes sum along a given axis (i.e. a row or a column when
speaking of matrices).
(define A (matrix [[0. 1. 2.]
[3. 4. 5.]
[6. 7. 8.]]))
(array-axis-sum A 1)
- : (Array Flonum)
(array #[3.0 12.0 21.0])
However as Eric points out, matrix-solve is an O(n^3) algorithm,
so the majority of the time is spent in matrix-solve.
Apart from finding a way to exploit the relationship between your
matrix A and the column vector b, I see no obvious way of
speeding up the code.
Note that when you benchmark with
time racket matrix.rkt
you will include startup and compilation time.
Therefore if you want to time the matrix code,
insert a literal (time ...) call.
--
Jens Axel Søgaard
--
--
Jens Axel Søgaard
____________________
Racket Users list:
http://lists.racket-lang.org/users
--
--
Jens Axel Søgaard
____________________
Racket Users list:
http://lists.racket-lang.org/users
____________________
Racket Users list:
http://lists.racket-lang.org/users