Re: community interest in machine learning (?)

Jim - FooBar(); Sat, 04 Aug 2012 12:19:11 -0700

Hmmm, I think it is worth downloading the source for encog 3.1 for Java and looking into: org.encog.ml.data.temporal.TemporalMLDataSet

I think this is what you need to add several columns... Unfortunately I've not wrapped this yet, so you will have to do some interop to get it going... I promise you it will be the first thing I look at as soon as I find some time...


hope that helps...

Jim


On 04/08/12 20:08, Jim - FooBar(); wrote:

I will address your second issue shortly... You say you have a lazy-seq of arrays that have 5 strings? Why strings?
Jim

On 04/08/12 20:02, Jim - FooBar(); wrote:
Clojars has been updated with a clojure-encog jar containing all the namespaces... I'm really sorry, I can't believe I hadn't noticed that! The code is in complete sync with github at the moment, so instead of typing 'doc' all the time feel free to have a browser open... I've not changed much - I just removed some redundant let bindings and added the ability to create an empty dataset... I also added a simple k-means clustering example. If I understood correctly what you're doing, the closest example regarding preparing/normalising your data is the predict-sunspots example...
Hope that helps... :)

Jim
ps: empirically, tanh and sigmoid almost always work best... I can say the same for the Nguyen-Widrow randomiser... Also, just so you know, I'll be renaming clojure-encog to "enclog" for the 0.5 release...
On 04/08/12 19:18, Jim - FooBar(); wrote:
poooo this is very strange... I'll update clojars within the next hour... sorry about this!
Jim

On 04/08/12 18:52, Timothy Washington wrote:
Hey Jim,
So I started playing around with clojure-encog <https://github.com/jimpil/clojure-encog>, and I'm pretty excited about it so far. Again, I'm trying to make a financial series predictor. And I'm trying to go through the steps of 1) normalizing / preparing the data 2) creating a feed-forward neural network with back-prop (I'll try sigmoid & gaussian activations). Then I'll 3) train and 4) run the network.
*A)* The first problem I'm having is a library one. I'm trying to normalize the data with the (*prepare* ...) function, but the *normalization* namespace isn't in *[clojure-encog "0.4.0-SNAPSHOT"]*. Here, we see that the *nnets* and *training* namespaces are in the snapshot jar, but not the *normalization* namespace. So I don't know how easy it is to update the snapshot jar. But in the meantime, I'll see if I can use the github version.
    webkell@ubuntu:~/Projects/nn$ jar tvf lib/clojure-encog-0.4.0-20120518.170223-1.jar
        72 Fri May 18 17:58:04 PDT 2012 META-INF/MANIFEST.MF
      1961 Fri May 18 17:58:04 PDT 2012 META-INF/maven/clojure-encog/clojure-encog/pom.xml
       111 Fri May 18 17:58:04 PDT 2012 META-INF/maven/clojure-encog/clojure-encog/pom.properties
       584 Fri May 18 17:00:30 PDT 2012 project.clj
    *  9839 Fri May 18 17:01:38 PDT 2012 clojure_encog/nnets.clj*
     11532 Fri May 18 17:57:20 PDT 2012 clojure_encog/examples.clj
    * 10144 Fri May 18 17:43:58 PDT 2012 clojure_encog/training.clj*
      2177 Mon May 14 21:57:20 PDT 2012 java/NeuralPilot.java
      7574 Wed May 16 20:34:30 PDT 2012 java/PredictSunspotSVM.java
      2338 Mon May 14 21:56:42 PDT 2012 java/LanderSimulator.java
      1794 Fri May 18 16:02:22 PDT 2012 java/XORNEAT.java
      1672 Fri May 18 16:04:14 PDT 2012 java/XORNEAT.class
      1872 Mon May 14 14:53:26 PDT 2012 java/LanderSimulator.class
      1943 Mon May 14 14:53:26 PDT 2012 java/NeuralPilot.class
      7357 Wed May 16 20:37:20 PDT 2012 java/PredictSunspotSVM.class
*B)* The second problem I see is when trying to deal with the input data. The example in clojure-encog <https://github.com/jimpil/clojure-encog/blob/master/src/clojure_encog/examples.clj#L107> has just an array of doubles. But my input data is slightly different in that I'm dealing with a LazySeq of arrays. Each of those arrays contains tick data: Time, Ask, Bid, AskVolume and BidVolume:
    (["01.05.2012 20:00:00.676" "1.32390" "1.32379" "3000000.00" "2250000.00"]
     ["01.05.2012 20:00:00.888" "1.32390" "1.33238" "3000000.10" "2200000.00"]
     ...)
So of course a call to (*make-data* ...) fails with the error "clojure.lang.LazySeq cannot be cast to [Double..". So I need to figure out 1) a way to get each one of those input data points into an input-layer neuron. I've started to think about that when I was dabbling with code <https://github.com/twashing/nn/blob/master/src/nn/neuralnet.clj>. If you like, I can look into trying to jerry-rig these kinds of tick data mappings into (training/make-data <https://github.com/jimpil/clojure-encog/blob/master/src/clojure_encog/training.clj#L43>). But I need a better understanding of the concept of a temporal window. The other thing is 2) to figure out how to transform the time field into data the nn can use. I've been spitting the DateTime object out to longs.
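A minimal sketch of the conversion step described above, turning each row of five strings into a double array. It uses only Clojure core and the JDK, not clojure-encog. The names `parse-time` and `tick->doubles` are hypothetical, and the day.month.year ordering and UTC time zone are assumptions read off the sample rows:

```clojure
(import '(java.text SimpleDateFormat)
        '(java.util TimeZone))

(defn parse-time
  "Parses a timestamp such as \"01.05.2012 20:00:00.676\" into epoch millis.
  ASSUMPTION: the format is dd.MM.yyyy and the zone is UTC."
  [^String s]
  (let [fmt (doto (SimpleDateFormat. "dd.MM.yyyy HH:mm:ss.SSS")
              (.setTimeZone (TimeZone/getTimeZone "UTC")))]
    (.getTime (.parse fmt s))))

(defn tick->doubles
  "Turns [time ask bid ask-volume bid-volume] (all strings) into a double array.
  The time becomes epoch millis; the other fields are parsed as doubles."
  [[ts & fields]]
  (double-array (cons (double (parse-time ts))
                      (map #(Double/parseDouble ^String %) fields))))

;; Sample rows copied from the message above.
(def ticks
  [["01.05.2012 20:00:00.676" "1.32390" "1.32379" "3000000.00" "2250000.00"]
   ["01.05.2012 20:00:00.888" "1.32390" "1.33238" "3000000.10" "2200000.00"]])

;; A lazy seq of double arrays, one per tick.
(def rows (map tick->doubles ticks))
```

The raw epoch value is far larger than the prices, so each column would most likely still need normalizing before it is fed to a network.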
Thanks

Tim Washington
Interruptsoftware.ca <http://Interruptsoftware.ca>
416.843.9060
On Sun, Jul 29, 2012 at 11:35 AM, Dimitrios Jim Piliouras <jimpil1...@gmail.com <mailto:jimpil1...@gmail.com>> wrote:
    Hi Tim,

    According to :
    
http://www.heatonresearch.com/content/encog-30-article-2-design-goals-overview


    encog 3 should have decent support for temporal
    (time-series) prediction, in particular for
    financial predictions... I'm afraid however that the only
    example that I've ported to clojure-encog which uses temporal
    data is the sunspot example (SVM not NN).

    Also, you shouldn't have any problems with the data (most
    likely you need to normalize them - I usually find (-1 1) or
    (0 1) to work best).
    For an example of how exactly you would do it, look for
    "PREDICT-SUNSPOT-SVM" here:
    
https://github.com/jimpil/clojure-encog/blob/master/src/clojure_encog/examples.clj


    these 2 lines do all the job with regards to your input data:

    normalizedSunspots (prepare :array-range nil nil :raw-seq spots :ceiling 0.9 :floor 0.1)



    train-set ((make-data :temporal-window normalizedSunspots)
    window-size 1)
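To make the two ideas in those lines concrete, here is a rough sketch in plain Clojure of range normalization and a temporal (sliding) window. This is not how clojure-encog implements `prepare` or `make-data`; `normalize-range` and `windows` are hypothetical helpers written only for illustration:

```clojure
(defn normalize-range
  "Linearly rescales xs so the smallest value maps to floor and the largest
  to ceiling. A constant series maps to floor throughout."
  [floor ceiling xs]
  (let [lo   (apply min xs)
        hi   (apply max xs)
        span (- hi lo)]
    (if (zero? span)
      (map (constantly (double floor)) xs)
      (map #(+ floor (* (- ceiling floor) (/ (- % lo) span))) xs))))

(defn windows
  "Pairs each run of `size` consecutive values (the input) with the single
  value that follows it (the ideal output to predict)."
  [size xs]
  (for [w (partition (inc size) 1 xs)]
    {:input (vec (butlast w)) :ideal (last w)}))

;; Usage: rescale a small series into roughly [0.1, 0.9], then window it.
(def normalized (normalize-range 0.1 0.9 [0.0 5.0 10.0 7.5]))

(windows 2 [1 2 3 4])
;; => ({:input [1 2], :ideal 3} {:input [2 3], :ideal 4})
```

With a window size of 2, each training pair asks the model to predict the next value from the two before it.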


    As far as algorithmic problems go encog has been around for
    quite a while...even though I don't necessarily agree with all
    the design decisions made along the way I find it is a rather
    mature lib...of course it is written in Java so being large
    means it is a bit of a mess! also there is a lot of duplication
    in random places...anyways, what I'm trying to say is:

    if you've got a specific example in mind, (like the financial
    prediction) maybe it's worth trying it out using clojure-encog
    or the encog-workbench (the gui) or any other already-made lib
    and see how it goes...writing your own will certainly teach you
    loads but it might take a while until you actually test what
    you want to test...

    Normalisation, randomisation or both are almost always needed...

    Hope that helps...

    Jim



    On Sun, Jul 29, 2012 at 5:41 PM, Timothy Washington
    <twash...@gmail.com <mailto:twash...@gmail.com>> wrote:

        Hey Ben,

        It's the same problem.

            user> (incanter/exp (incanter/minus 3254604.9658621363))
            0.0


        But it's not the functions. It's the math. Euler's number
        2.71828... raised to the power of 3254604.9658621363, gives
        Infinity. So for my neural net's activation func, either i)
        I shouldn't use a sigmoid, or ii) my linear combiner needs
        to keep values within a certain bound. My neuron inputs are
        below. And it's the bid and ask volumes and the long time
        value that's giving me such a large number.

          * 1.3239 (bid price)
          * 1.32379 (ask price)
          * 3000000.0 (bid volume)
          * 2250000.0 (ask volume)
          * 1335902400676 ( #<DateTime 2012-05-01T20:00:00.676Z>
            long value)
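The saturation described here can be reproduced with nothing but `Math/exp`. The sketch below is illustrative only: `sigmoid` and `scale` are hypothetical helpers, and the 0 to 3000000 bounds passed to `scale` are an assumed range for the volume column, not something taken from the data set:

```clojure
(defn sigmoid
  "Logistic function 1 / (1 + e^-x)."
  [x]
  (/ 1.0 (+ 1.0 (Math/exp (- x)))))

;; With a raw weighted sum this large, e^-x underflows and e^x overflows,
;; so the sigmoid pins to exactly 1.0 or 0.0 and carries no gradient.
(Math/exp (- 3254604.9658621363)) ; => 0.0
(sigmoid 3254604.9658621363)      ; => 1.0
(sigmoid -3254604.9658621363)     ; => 0.0

(defn scale
  "Min-max scales x from [lo, hi] into [0, 1]."
  [lo hi x]
  (/ (- x lo) (- hi lo)))

;; ASSUMPTION: volumes lie between 0 and 3000000.
(def scaled-volume (scale 0.0 3000000.0 2250000.0)) ; => 0.75

;; The scaled input lands in the responsive part of the curve.
(sigmoid scaled-volume)
```

The same reasoning applies to the epoch-millisecond time value, which dwarfs every other input unless it is rescaled or replaced by a smaller derived feature.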


        I just had the idea to try a Gaussian or tanh activation
        function. I think this is the point where I'll give
        clojure-encog <https://github.com/jimpil/clojure-encog> a
        whirl. I have a feeling I'll be running into a lot of these
        data and other algorithmic problems. And it'd be good to
        work with something that has already dealt with these
        issues. I still don't know if I need to normalize my input
        data, how to untangle the activation result for
        back propagation, etc. Any insights are welcome.


        Tim Washington
        Interruptsoftware.ca <http://Interruptsoftware.ca>
        416.843.9060 <tel:416.843.9060>

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


