We are very pleased to announce the beta release of the BIDMach machine
learning toolkit, version 0.9. The main page, which includes
precompiled downloads for 64-bit Windows, Linux, and Mac OS X, is here:
http://bid2.berkeley.edu/bid-data-project/
BIDMach has several unique features:
Speed: BIDMach is currently the fastest tool for many common machine
learning tasks, and the list is growing. When run on a single machine
with a graphics processor, BIDMach is faster than any other system *on a
single node or cluster* for regression, clustering, classification, and
matrix factorization. Every compute primitive has been "rooflined",
meaning it has been optimized to run close to its theoretical
performance limit. Check out the benchmarks at:
https://github.com/BIDData/BIDMach/wiki/Benchmarks
Scalability: BIDMach has run larger calculations on one node than most
cluster systems: with a large RAID, it has run LDA (Latent Dirichlet
Allocation) on a 10 TB dataset. BIDMach can also run on a cluster, and
includes a new communication protocol called "Kylix" which gives
nearly-optimal throughput for distributed ML and graph tasks. It
currently holds the record for PageRank analysis of large graphs, and
was 3-6x faster than any other system on 64 nodes.
Usability: BIDMach inherits a powerful command line/batch file
interpreter from the Scala language in which it is written. It has the
simplicity of R or Python, but with uniformly high performance. It
fully taps Scala's extensible syntax, so that math written in BIDMach
looks like math. BIDMach includes a simple plotting class, and we are
adding "interactive models" which allow interactive tuning.
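To illustrate what "math looks like math" means in practice, here is a minimal, self-contained sketch of the Scala operator-overloading mechanism that makes this possible. The `Vec` type and its operators are toy names invented for this example, not BIDMach's actual BIDMat classes:

```scala
// Toy vector type (hypothetical, not BIDMach's API) showing how Scala
// operator overloading lets numerical code read like the math it computes.
case class Vec(v: Array[Double]) {
  def +(b: Vec): Vec    = Vec(v.zip(b.v).map { case (x, y) => x + y })
  def *(s: Double): Vec = Vec(v.map(_ * s))
  def dot(b: Vec): Double = v.zip(b.v).map { case (x, y) => x * y }.sum
}

object MathSyntax extends App {
  val x = Vec(Array(1.0, 2.0, 3.0))
  val y = Vec(Array(4.0, 5.0, 6.0))
  val z = x * 2.0 + y             // reads like math: z = 2x + y
  println(z.v.mkString(","))      // 6.0,9.0,12.0
  println(x dot y)                // 32.0
}
```

Because operators are ordinary methods in Scala, BIDMach can define matrix arithmetic that dispatches transparently to dense, sparse, CPU, or GPU implementations behind the same notation.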
Customizability: BIDMach includes likelihood "mixins" that let the
behavior of basic models be tailored to more specific needs. For
example, topic models can be tuned to favor more coherent or more
mutually independent topics.
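The mixin idea can be sketched with plain Scala traits. The names below (`Loss`, `SquaredError`, `L2Mixin`) are hypothetical stand-ins, not BIDMach's actual Mixins API; the point is how an extra likelihood term stacks onto a base objective without modifying the base model:

```scala
// Sketch of the mixin pattern (hypothetical names, not BIDMach's API).
trait Loss { def loss(w: Array[Double]): Double }

// A base objective: squared error against a fixed target.
class SquaredError(target: Array[Double]) extends Loss {
  def loss(w: Array[Double]): Double =
    w.zip(target).map { case (a, b) => (a - b) * (a - b) }.sum
}

// A mixin adding an L2 penalty on the weights; analogous in spirit to
// a topic-model mixin that rewards more mutually independent topics.
trait L2Mixin extends Loss {
  def lambda: Double = 0.1
  abstract override def loss(w: Array[Double]): Double =
    super.loss(w) + lambda * w.map(x => x * x).sum
}

object MixinDemo extends App {
  val base  = new SquaredError(Array(0.0, 0.0))
  val mixed = new SquaredError(Array(0.0, 0.0)) with L2Mixin
  val w = Array(1.0, 2.0)
  println(base.loss(w))   // 5.0
  println(mixed.loss(w))  // 5.5  (5.0 + 0.1 * (1 + 4))
}
```

Scala's `abstract override` lets mixins stack: each trait wraps `super.loss`, so several likelihood terms can be composed onto one model at instantiation time.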
Modularity: BIDMach favors mini-batch algorithms and includes core
classes that take care of optimization, data sourcing, and model
tailoring. Writing a new model typically requires only a small generic
model class with a gradient method. The learner classes take care of
running the model, handling sparse or dense data, running on CPU or
GPU, and using single or double precision.
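A minimal sketch of this pattern in plain Scala, using hypothetical `Model` and `Learner` names rather than BIDMach's actual classes: the model supplies only a gradient, and a generic learner drives mini-batch SGD over it.

```scala
// Hypothetical sketch of the model/learner split (not BIDMach's API).
// A model only has to say how to compute a gradient on a mini-batch.
trait Model {
  def gradient(w: Array[Double],
               batch: Seq[(Array[Double], Double)]): Array[Double]
}

// Example model: linear regression, mean-squared-error gradient.
object LinReg extends Model {
  def gradient(w: Array[Double],
               batch: Seq[(Array[Double], Double)]): Array[Double] = {
    val g = new Array[Double](w.length)
    for ((x, y) <- batch) {
      val err = x.zip(w).map { case (a, b) => a * b }.sum - y
      for (i <- w.indices) g(i) += 2 * err * x(i) / batch.length
    }
    g
  }
}

// A generic learner: splits the data into mini-batches and runs SGD.
object Learner {
  def fit(m: Model, data: Seq[(Array[Double], Double)],
          batchSize: Int, lr: Double, epochs: Int): Array[Double] = {
    val w = new Array[Double](data.head._1.length)
    for (_ <- 0 until epochs; batch <- data.grouped(batchSize))
      for ((gi, i) <- m.gradient(w, batch).zipWithIndex) w(i) -= lr * gi
    w
  }
}

object SGDDemo extends App {
  // Learn y = 3x from four points.
  val data = Seq(1.0, 2.0, 3.0, 4.0).map(x => (Array(x), 3.0 * x))
  val w = Learner.fit(LinReg, data, batchSize = 2, lr = 0.05, epochs = 200)
  println(w(0))   // converges to approximately 3.0
}
```

In BIDMach itself the optimization strategy, data source, and mixins are separate pluggable pieces, so the same model class can run over files, memory, sparse or dense matrices, CPU or GPU.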
_______________________________________________
uai mailing list
uai@ENGR.ORST.EDU
https://secure.engr.oregonstate.edu/mailman/listinfo/uai