Just for fun: I find that the parallelized results are actually false :-) even for low indices. The non-parallel result:

C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4))

is different from the parallelized result below:

C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))

I have to check things again. I think futures are easy to use, but limited for trickier parallel problems, such as those that mutate a global variable like a hash table...

Damien
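If several futures really do write to the same global hash table (as in function-unify-two-minterms-and-tag), unsynchronized hash-set! calls could explain lost or spurious entries; whether or not that is what produces the extra (B3 ∧ B4) term, a cheap experiment is to serialize those writes. A minimal sketch with a mutex from (ice-9 threads); tag-table and tag-minterm! are made-up names for illustration, not the real logiki+ code:

  (use-modules (ice-9 futures)   ; future, touch
               (ice-9 threads))  ; make-mutex, with-mutex

  (define tag-table (make-hash-table))   ; hypothetical shared table
  (define tag-table-mutex (make-mutex))

  (define (tag-minterm! key value)
    ;; with-mutex serializes the hash-set! calls coming from different futures
    (with-mutex tag-table-mutex
      (hash-set! tag-table key value)))

  ;; each future writes one entry; touch waits for all of them
  (define futs
    (map (lambda (i) (future (tag-minterm! i (* i i))))
         (iota 8)))
  (for-each touch futs)

The mutex costs some of the speedup, of course, but it makes it possible to check whether the shared table is the source of the divergence.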
On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mat...@gmail.com> wrote:

> Hello,
>
> sorry for my late answer (I wrote the code on Friday but could only debug it today, being busy and travelling last weekend).
>
> I modified my code to work with 'futures'; the speed is now incredible!!! (and the result is exact). The speed even seems to be multiplied by more than the number of CPUs (6):
>
>   cat /proc/cpuinfo | grep processor
>   processor : 0
>   processor : 1
>   processor : 2
>   processor : 3
>   processor : 4
>   processor : 5
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900
>
> (declare minterms-vector unified-minterms-vector-1)
>
> (define (funct-unify-minterms-set-1-unit-future set1 set2)
>
>   {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
>
>   ;; note: sorting is useless
>   ;; set multiplication: create the list of pairs of minterms
>   ;; (alternatives: product-set-with-set-imperative, product-set-with-set, associate-set-with-set)
>   {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)}
>
>   {minterms-vector <- (list->vector minterms-set)}
>
>   {minterms-vector-length <+ (vector-length minterms-vector)}
>
>   {nb-procs <+ (current-processor-count)}
>
>   {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute the segments
>
>   {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}
>
>   (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code
>
>   {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}
>
>   {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;; remove #f results
>
>   {unified-minterms-set <+ (remove-duplicates-sorted unified-minterms-set-2)} ;; uniq
>
>   unified-minterms-set)
>
> I kept your 'segment' definitions to index an array of data after converting it from a list to a vector (list->vector), which probably consumes additional time on big data lists (I had lists of more than 1000000 elements at some point).
>
> I used a simplified version of run-in-parallel (one that does not 'reduce' after processing the data):
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794
>
> The computation on a segment is done by these functions:
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842
>
> {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
>
> ;; proc to be called with futures
> (define (proc-unify-minterms-seg seg)
>   {start <+ (segment-start seg)}
>   {end <+ (segment-end seg)}
>   (for ({i <+ start} {i <= end} {i <- {i + 1}})
>        {mtL <+ {minterms-vector[i]}}
>        {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)}))
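In plain Guile, without the curly-infix syntax, the pattern above (cut the index range into one segment per core, let each future fill its own slice of a result vector, then wait for them all) could be sketched roughly like this; segments, work and parallel-map-vector are placeholder names, not the real logiki+ procedures:

  (use-modules (ice-9 futures))   ; future, touch

  ;; dummy per-element work, standing in for function-unify-minterms-list
  (define (work x) (* x x))

  ;; contiguous index ranges (start . end) covering 0 .. len-1, at most nb of them
  (define (segments len nb)
    (let ((size (max 1 (quotient (+ len nb -1) nb))))   ; ceiling of len/nb
      (let loop ((start 0) (acc '()))
        (if (>= start len)
            (reverse acc)
            (loop (+ start size)
                  (cons (cons start (min (- len 1) (+ start size -1))) acc))))))

  ;; one future per segment; each future fills its own slice of the result vector
  (define (parallel-map-vector f vec)
    (let* ((len  (vector-length vec))
           (out  (make-vector len #f))
           (futs (map (lambda (seg)
                        (future
                         (let loop ((i (car seg)))
                           (when (<= i (cdr seg))
                             (vector-set! out i (f (vector-ref vec i)))
                             (loop (+ i 1))))))
                      (segments len (current-processor-count)))))
      (for-each touch futs)    ; wait until every segment is done
      out))

  ;; (vector->list (parallel-map-vector work (list->vector (iota 10))))
  ;; => (0 1 4 9 16 25 36 49 64 81)

Since every future writes to a disjoint slice of the result vector, the vector itself needs no locking; only shared structures such as the tagging hash table do.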
> I use this code in another program to symbolically compute a value named Cn:
>
> C0 = T
> C1 = T
> C2 = T
> C3 = (B1 ∨ B2)
> C4 = (B2 ∨ (B1 ∧ B3))
> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5) ∨ (B5 ∧ B6))
> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ B8))
>
> C1 to C9 used to be fast: less than a minute for the whole set, with or without //.
>
> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))
>
> C10 takes a few minutes.
>
> C11 used to take one day before //; with 'future' I now got the result in less than one hour!!!
>
> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))
>
> I never got C12 in the past, even after 7 days of computation! And I got it today during the lunch break!!!
>
> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))
>
> (Note that a clever person could find a general formula for Cn based on Cn-1, but that is another (mathematical) story....)
>
> I am computing C13, but I see that it consumes 12 GB out of the 16 GB of my system! It stopped on the Linux box with:
>
>   GC Warning: Repeated allocation of very large block (appr. size 510935040):
>       May lead to memory leak and poor performance
>   Processus arrêté   (process killed)
>
> but it is still running on the Mac laptop... OK, we will see; I think the data set is now simply huge and there is nothing more to be done about that...
>
> Note that there is still a hash table used in function-unify-two-minterms-and-tag, and I will use another data structure and algorithm to avoid it. I feared that concurrent access to the hash table could make the Guile 'future' crash, fail, or slow the process down, but no! The result is incredibly fast. I also use continuations, and I have read that they can cause problems with 'future'; I will improve that too...
>
> I will see where I can improve the algorithm and the data structures to optimize further, but this is already really good.
>
> Thanks
>
> Damien
>
> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl <zelphirkaltst...@posteo.de> wrote:
>
>> Hello Damien,
>>
>> On 10/14/22 10:21, Damien Mattei wrote:
>>
>> yes Zelphir, I think there is a problem in the threading part of Guile: whatever the code is, it runs well the first time, and after that it is unstable. Long ago, at the very beginning of the Java language, during my master's project at university, I experienced the same problem with Java "green" threads, under Windows and/or Linux, I can't remember which.
>>
>> I'm trying your code and futures, which perhaps also have the advantage of portability to other Schemes; futures can provide "light" //, and with careful code it could be OK.
>>
>> In your segment.scm code, can segment-count be replaced by the number of available CPUs or nodes, if I understand it correctly?
>>
>> Regards,
>> Damien
>>
>> Correct. For each segment one future is used.
>>
>> However, there are scenarios where one would rather want to interleave the numbers of the whole range, to get a "fairer" workload per future, for example:
>>
>> (1 2 3 4 5 6 7 8 9)
>>
>> -> (1 4 7), (2 5 8), (3 6 9)
>>
>> instead of
>>
>> -> (1 2 3) (4 5 6) (7 8 9)
>>
>> (I seem to remember this being the case for Mandelbrot set calculations, but I might be wrong.)
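That interleaving amounts to a round-robin split of the indices: worker k takes every n-th index starting at k, which spreads expensive items more evenly when they cluster in contiguous regions (as Mandelbrot rows tend to). A small sketch, assuming an ad-hoc helper name interleave-indices that is not part of segment.scm:

  ;; Round-robin split of the indices 0 .. len-1 into nb groups:
  ;; worker k gets indices k, k+nb, k+2*nb, ...
  (define (interleave-indices len nb)
    (map (lambda (k)
           (let loop ((i k) (acc '()))
             (if (>= i len)
                 (reverse acc)
                 (loop (+ i nb) (cons i acc)))))
         (iota nb)))

  ;; (interleave-indices 9 3) => ((0 3 6) (1 4 7) (2 5 8))

Each group can then be handed to one future exactly like a contiguous segment.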
>> But that is all a matter of forming the segments and of what a segment is, not really part of the logic of creating futures. So one could create them in any way that fits one's use case. I have not generalized the segment logic that far, but it is doable.
>>
>> Anyway, if anyone more knowledgeable could chime in on the performance differences between futures and parallel map, that would be great! Or point out flaws that this kind of future usage might have. Or say at which point it becomes worth switching to the guile-fibers library.
>>
>> Regards,
>> Zelphir
>>
>> --
>> repositories: https://notabug.org/ZelphirKaltstahl
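For reference, the per-element counterpart using the parallel forms from (ice-9 threads) is a one-liner; the practical difference from hand-rolled segment futures is mostly task granularity (one task per element, or a bounded number in flight, versus one per core), since as far as I know the parallel forms are themselves implemented on top of futures. A small sketch with the same dummy work function as above:

  (use-modules (ice-9 threads))   ; par-map, n-par-map

  (define (work x) (* x x))       ; dummy per-element work

  ;; one parallel task per element:
  (par-map work (iota 10))

  ;; bounded variant: at most n tasks in flight at once:
  (n-par-map (current-processor-count) work (iota 10))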