Dear Linh Chi,

the exercise is basically Lab 7 from the last instance of the Northeastern beginners course:

  http://www.ccs.neu.edu/course/cs2500f16/lab7.html

If you work through this lab, you will get a performant solution based on lists. Once you have that, it should be fairly obvious how to replace the ‘table’ with a hashtable in this solution, though you have to bump the language to ASL and include the 2htdp/abstraction teachpack to get the style your teacher wants.

For help with the solution, consider working through Part III of the textbook:

  http://www.ccs.neu.edu/home/matthias/HtDP2e/

For your own learning experience, it might be best if you followed the design recipe and then asked questions about specific problems with function design. People on the list will gladly help then.

-- Matthias

> On Feb 21, 2017, at 3:55 AM, Linh Chi Nguyen <nguyenlinhch...@gmail.com> wrote:
>
> hello,
> i'm doing this exercise and would appreciate any comments. i want to create a
> machine to scan a text, then split the text into elements (storing in a hash
> table). then we connect these hash keys in a probabilistic way, so that if we
> start from a word, we can jump to other words in a probabilistic way. hence,
> we can generate a sentence that is sufficiently independent from our bias.
> the point is, if these probabilities make sense (for example, they are real
> statistics from real high quality text, i.e. from famous writers), i hope
> that the machine can generate one or two sentences that are entertaining.
>
> (require 2htdp/batch-io)
> (require racket/hash)
>
> (define input (read-file "sample.txt"))
> (define data (remove-duplicates (string-split input)))
>
> to make it very easy at first, i don't use the frequency of the elements
> (words mostly) just yet. i just make a hash table that takes each element as a
> key, the associated value will be a dispatching rule. to begin with, the
> dispatching rule is simple: the current key will be connected to two other
> keys, with probability. for example: the element "run" is connected to
> "instead."
> with probability .48 and "helmets" with probability .08.
>
> (hash "run" (hash 0.48 "instead." 0.08 "helmets"))
>
>
> (struct state (word dispatch) #:transparent)
>
> (define (random-member lst)
>   (list-ref lst (random (length lst))))
>
> (define (make-sample-machine lst)
>   (define l (length lst))
>   (define (make-transition)
>     (hash (round-2 (random)) (random-member lst)
>           (round-2 (random)) (random-member lst)))
>   (foldl (lambda (word h) (hash-union h (hash word (make-transition))))
>          (hash) lst))
>
> (define (round-n x n)
>   (/ (round (* x (expt 10 n))) (expt 10 n)))
> (define (round-2 x)
>   (round-n x 2))
>
> (define m (make-sample-machine data))
>
> data of the machine looks like this:
>
> '#hash(("spread" . #hash((0.9 . "right") (0.21 . "deep")))
>        ("instead" . #hash((0.64 . "then") (0.19 . "dark")))
>        ("through" . #hash((0.3 . "meadow") (0.95 . "white,")))
>        ("their" . #hash((0.56 . "instead") (0.98 . "valley,")))
>
> now i try to generate a sentence of 10 words, i guess it is some kind of
> loop, but when i write the function, it is super slow.
>
> the idea is that we randomise to choose the first word, then this first word
> has an associated dispatching rule. we use the probability in this rule to
> randomise for the next word.
>
> is it because i use too much randomisation that the function is super slow?
>
> (define (accumulate lst)
>   (define total (apply + lst))
>   (let absolute->relative ([elements lst] [so-far #i0.0])
>     (cond
>       [(empty? elements) '()]
>       [else (define nxt (+ so-far (round-2 (/ (first elements) total))))
>             (cons nxt (absolute->relative (rest elements) nxt))])))
>
> (define (randomise accumulated-lst)
>   (define r (random))
>   (for/last ([p (in-naturals)] [% (in-list accumulated-lst)] #:final (< r %))
>     p))
>
> (define (generate-text m n)
>   (define l (hash-count m))
>   (define r (random l))
>   (match-define (cons first-word dispatch) (hash-iterate-pair m r))
>   (cons first-word
>         (let generate ([count-down n] [next-batch dispatch])
>           (cond
>             [(zero? count-down) '()]
>             [else
>              (define proba (hash-keys dispatch))
>              (define next-word (hash-iterate-key m (randomise (accumulate proba))))
>              (define next-dispatch (hash-ref m next-word))
>              (cons next-word (generate (- n 1) next-dispatch))]))))
>
> this exercise is at the beginner level, so i guess someone must have done it
> before. does anyone have experience in doing this? like, is it a good way to
> represent the data in a hash table? how to handle when the sample text (so
> the hash table) becomes very large?
>
> here is the sample text:
>
> They had marched more than thirty kilometres since dawn, along the white, hot
> road where occasional thickets of trees threw a moment of shade, then out
> into the glare again. On either hand, the valley, wide and shallow, glittered
> with heat; dark green patches of rye, pale young corn, fallow and meadow and
> black pine woods spread in a dull, hot diagram under a glistening sky. But
> right in front the mountains ranged across, pale blue and very still, snow
> gleaming gently out of the deep atmosphere. And towards the mountains, on and
> on, the regiment marched between the rye fields and the meadows, between the
> scraggy fruit trees set regularly on either side the high road. The
> burnished, dark green rye threw off a suffocating heat, the mountains drew
> gradually nearer and more distinct.
> While the feet of the soldiers grew
> hotter, sweat ran through their hair under their helmets, and their knapsacks
> could burn no more in contact with their shoulders, but seemed instead to
> give off a cold, prickly sensation.
>
> thank you,
> and have a good day,
> (if you read until this point)
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
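
For readers following the thread, here is a minimal sketch of the hashtable idea. It is not the Lab 7 solution and not the poster's code. It assumes plain `#lang racket` rather than ASL, and it swaps the randomly generated transition probabilities for the successors actually observed in the text, so frequent word pairs are picked more often without storing explicit probabilities. The names `successor-table` and `generate-text` (redefined here with a different signature) are hypothetical.

```racket
#lang racket
;; Hypothetical sketch: none of these definitions come from the thread or the lab.

;; successor-table : (listof string) -> (hash string (listof string))
;; Maps each word to the list of words that directly follow it in the text.
;; Repeated followers are kept, so a frequent pair is proportionally more
;; likely to be picked later.
(define (successor-table words)
  (for/fold ([table (hash)])
            ([w (in-list words)]
             [nxt (in-list (if (empty? words) '() (rest words)))])
    (hash-update table w (lambda (followers) (cons nxt followers)) '())))

;; random-member : (listof X) -> X
;; Picks a uniformly random element of a non-empty list.
(define (random-member lst)
  (list-ref lst (random (length lst))))

;; generate-text : table string natural -> (listof string)
;; Starts at `start` and takes at most n further steps; stops early when the
;; current word has no recorded follower (e.g. the last word of the text).
(define (generate-text table start n)
  (let loop ([word start] [count-down n])
    (define followers (hash-ref table word '()))
    (cond
      [(or (zero? count-down) (empty? followers)) (list word)]
      [else (cons word (loop (random-member followers) (- count-down 1)))])))
```

With the definitions from the quoted message, a call could look like `(generate-text (successor-table (string-split input)) "They" 10)`. Note that the loop recurs on `(- count-down 1)`; the quoted `generate-text` recurs on `(- n 1)`, so for `n` greater than 1 its countdown never reaches zero, which is a likely cause of the apparent slowness.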