Hi Dragan,

The situation as I see it:
- You've created a matrix library that performs well on one benchmark (dense matrix multiplication).
- Neanderthal meets your own personal use cases. Great job!
- Neanderthal *doesn't* fit the use cases of many others (e.g. some need a portable pure JVM implementation, so Neanderthal is immediately out)
- Fortunately, in the Clojure world we have a unique way for such libraries to interoperate smoothly with a common API (core.matrix)
- Neanderthal could fit nicely in this ecosystem (possibly it could even replace Clatrix, which as you note hasn't really been maintained for a while...)
- For some strange reason, it *appears to me* that you don't want to collaborate. If I perceive wrongly, then I apologise.

If you want to work together with the rest of the community, that's great. I'm personally happy to help you make Neanderthal into a great matrix implementation that works well with core.matrix. I'm 100% sure that is a relatively simple and achievable goal, having done it already with vectorz-clj.

If, on the other hand, your intention is to go your own way and build something that is totally independent and incompatible, that is of course your right, but I think that's a really bad idea and would be detrimental to the community as a whole. Fragmentation is a likely result. At worst, you'll be stuck maintaining a library with virtually no users (the Clojure community is fairly small anyway... and it is pretty lonely to be a minority within a minority).

I can see from your comments below that you still don't understand core.matrix. I'd be happy to help clarify if you are seriously interested in being part of the ecosystem.

Ultimately I think you have some talent, you have obviously put in a decent amount of work and Neanderthal could be a great library *if and only if* it works well with the rest of the ecosystem and you are personally willing to collaborate.

Your call.

On Monday, 22 June 2015 10:05:15 UTC+1, Dragan Djuric wrote:
>
> On Monday, June 22, 2015 at 2:02:19 AM UTC+2, Mikera wrote:
>>
>> There is nothing fundamentally wrong with BLAS/LAPACK, it just isn't
>> suitable as a general purpose array programming API. See my comments
>> further below.
>>
>
> I was discussing it from the *matrix API* perspective. My comments follow:
>
>> If you think the core.matrix API is "unintuitive and complicated" then
>> I'd love to hear specific examples. We're still open to changing things
>> before we hit 1.0
>>
>
> I will only give a couple basic ones, but I think they draw a bigger
> picture. Let's say I am a Clojure programmer with no huge experience in
> numerical computing.
> I do have some knowledge about linear algebra and have
> a textbook or a paper with an algorithm that I need, which is based on some
> linear algebra operations. I'd say that this is the most common use case
> for an API such as core.matrix, and I hope you agree. After trying to write
> my own loops and recursion and failing to do it well, I shop around and find
> core.matrix with its cool proposal: a lot of numerical stuff in Clojure,
> with pluggable implementations. Yahooo! My problem is almost solved. Go to
> the main work right away:
>
> 1. I add it to my project and try the + example from the GitHub page. It
> works.
> 2. Now I start implementing my algorithm. How to add-and-multiply a few
> matrices? THERE IS NO API DOC. I have to google and find
> https://github.com/mikera/core.matrix/wiki/Vectors-vs.-matrices so I
> guess it's mmul, but there is a lot of talk of some loosely related
> implementation details. Column matrices, slices, ndarrays... What? A lot of
> implementation dependent info, almost no info on what I need (API).
> 3. I read the mailing list and the source code, and, if I manage to filter
> API information from a lot of implementation discussion, I manage to draw a
> rough sketch of what I need (API).
> 4. I implement my algorithm with the default implementation (vectorz) and
> it works. I measure the performance, and as soon as the data size becomes a
> little more serious, it's too slow. No problem - pluggable implementations
> are here. Surely that Clatrix thing must be blazingly fast, it's native. I
> switch the implementations in no time, and get even poorer performance.
> WHAT?
> 5. I try to find help on the mailing list. I was using the implementation
> in a wrong way. WHY? It was all right with vectorz! Well, we didn't quite
> implement it fully. A lot of functions are fallback. The implementation
> is not suitable for that particular call... Seriously? It's featured on the
> front page!
> 6. But, what is the right way to use it?
> I want to learn. THERE IS NO
> INFO. But, look at this, you can treat a Clojure vector as a quaternion and
> multiply it with a JSON hash-map, which is treated as a matrix of
> characters (OK, I am exaggerating, but not that much :)
> etc, etc...
>
>> But it certainly isn't "arbitrarily invented". Please note that we have
>> collectively considered a *lot* of previous work in the development of
>> core.matrix. People involved in the design have had experience with BLAS,
>> Fortran, NumPy, R, APL, numerous Java libraries, GPU acceleration, low
>> level assembly coding etc. We'd welcome your contributions too.... but I
>> hope you will first take the time to read the mailing list history etc. and
>> gain an appreciation for the design decisions.
>>
>
> I read lots of those discussions before. I may or may not agree with what
> was written, fully or partially, but I see that the result is far from what I
> find recommended in numerical computing literature that I read, and I do
> not see the core.matrix implementations show that literature wrong.
>
>>
>>>
>>> In my opinion, the best way to create a standard API is to grow it from
>>> successful implementations, instead of writing it first, and then
>>> shoehorning the implementations to fit it.
>>>
>>
>> It is (comparatively) easy to write an API for a specific implementation
>> that supports a few specific operations and/or meets a specific use case.
>> The original Clatrix is an example of one such library.
>>
>
> Can you point me to some of the implementations where switching the
> implementation of an algorithm from vectorz to clatrix shows a performance
> boost?
> And, easy? Surely then the Clatrix implementation would be fully
> implemented and properly supported (and documented) after 2-3 years since
> it was included?
>
>> But that soon falls apart when you realise that the API+implementation
>> doesn't meet broader requirements, so you quickly get fragmentation e.g.
>> - someone else creates a pure-JVM API for those who can't use native code
>> (e.g. vectorz-clj)
>>
>
> So, what is wrong with that? There are dozens of Clojure libraries for
> SQL, http, visualization, etc, and all have their place.
>
>> - someone else produces a similar library with a new API that wins on
>> some benchmarks (e.g. Neanderthal)
>>
>
> I get your point, but would just note that Neanderthal wins *ALL*
> benchmarks (that fit use cases that I need). Not because it is something too
> clever, but because it stands on the shoulders of giants (ATLAS).
>
>> - someone else needs arrays that support non-numerical scalar types (e.g.
>> core.matrix NDArray)
>> - a library becomes unmaintained and someone forks a replacement
>> - someone wants to integrate a Java matrix library for legacy reasons
>> - someone else has a bad case of NIH syndrome and creates a whole new
>> library
>>
>
> That could be said about virtually every application domain. Why are there
> many http, html, javascript, database APIs? Why don't we have one API that
> could be used for any existing library? It's not that people didn't try. I
> prefer the microframework approach to a monolithic framework that has one
> true way.
>
>>
>> Before long you have a fragmented ecosystem with many libraries, many
>> different APIs and many annoyed / confused users who can't easily get their
>> tools to work together. Many of us have seen this happen before in other
>> contexts, and we don't want to see the same thing happen for Clojure.
>>
>
> How many implementations of core.matrix work *WELL* together, for all
> supported use cases today?
>
>> core.matrix solves the problem of library fragmentation by providing a
>> common abstract API, while allowing users choice over which underlying
>> implementation suits their particular needs best.
>> To my knowledge Clojure
>> is the *only* language ecosystem that has developed such a capability, and
>> it has already proved extremely useful for many users.
>>
>
> How many choices are there today that fully and properly implement
> core.matrix?
>
>> So if you see people asking for Neanderthal to join the core.matrix
>> ecosystem, hopefully this helps to explain why.
>>
>
> As I explained, I would *LOVE* to be able to do such integration and
> benefit from it myself. But, I failed to see how to do it properly, and
> satisfy core.matrix goals (and my goals with Neanderthal at the same time).
>
> Currently, I see core.matrix as a formula: idea + ? = success
> I do not say that I do not like the idea generally. I would *LOVE* to see
> such a thing. I do not see what is the "?" yet, and the current offering does
> not convince me that other people (core.matrix) can see it.
>
>>
>>> a) I would rather see the core.matrix interoperability as an additional
>>> separate project first, and when/if it shows its value, and there is a
>>> person willing to maintain that part of the code, consider adding it to
>>> Neanderthal. I wouldn't see it as second rate, and no fork is needed
>>> because of Clojure's extend-type/extend-protocol mechanism.
>>>
>>
>> While this could work from a technical perspective, I would encourage you
>> to integrate core.matrix support directly into Neanderthal, for at least
>> three reasons:
>> a) It will allow you to save the effort of creating and maintaining a
>> whole duplicate API, when you can simply adopt the core.matrix API (for
>> many operations)
>>
>
> If I could do it simply, I would have already done that. I do not have to
> maintain a duplicate API now, though.
> I can maintain a simple Neanderthal
> API that I understand, which is based on BLAS/LAPACK, with lots of
> literature and know-how available online and offline, which does one thing
> and (in my opinion) does it well, and leave core.matrix integration for
> anyone that needs it.
>
>> b) It will reduce maintenance, testing and deployment effort (for you and
>> for others)
>>
>
> If core.matrix was a good fit. However, I have failed to see it that way so far.
>
>> c) You are much more likely to get outside contributors if the library
>> forms a coherent whole and plays nicely with the rest of the ecosystem
>>
>
> That is true. However, I would rather have a library that fits well to my
> needs even if it attracts fewer people. And, I do not see how it is
> difficult to integrate Neanderthal with other libraries, since I used it
> with plotting libraries (clojure/java and external), and the integration
> was straightforward.
>
>> This really isn't hard - in the first instance it is just a matter of
>> implementing a few core protocols. To get full performance, you would need
>> to implement more of the protocols, but that could be added over time.
>>
>>> b) I am not sure about what's exactly "wrong" with core.matrix. Maybe
>>> nothing is wrong. The first thing that I am interested in is what the
>>> core.matrix team thinks is wrong with BLAS/LAPACK in the first place to be
>>> able to form an opinion in that regard
>>>
>>
>> BLAS/LAPACK is a low level implementation. core.matrix is a higher level
>> abstraction of array programming. They simply aren't comparable in a
>> meaningful way. It's like comparing the HTTP protocol with the Apache web
>> server.
>>
>
> Here I have to disagree. BLAS/LAPACK is not a low level implementation. It
> is a *DE FACTO STANDARD* for numerical linear algebra for dense matrices:
> http://www.netlib.org/blas/blast-forum/blas-report.pdf. There are many
> implementations of that standard, and they set a really high mark.
> Besides ATLAS, there are Intel MKL, OpenBLAS, cuBLAS, clBLAS, and many other
> highly performant libraries. All implementing an API that's been crafted for
> decades and is as battle tested as a library could be.
>
>> You could certainly use BLAS/LAPACK to create a core.matrix
>> implementation (which is roughly what Clatrix does, and what Neanderthal
>> could do if it became a core.matrix implementation). Performance of this
>> implementation should roughly match raw BLAS/LAPACK (all that core.matrix
>> requires is the protocol dispatch overhead, which is pretty minimal and
>> only O(1) per operation so it quickly becomes irrelevant for operations on
>> large arrays).
>>
>
> Looking at the state of Clatrix integration, I have to disagree with that.
> Certainly anybody *COULD* program anything (hypothetically), but I'd stay
> more down to earth: what *IS*, and what are the reasons that it
> *IS NOT (yet)*.
>
>>
>> In terms of API, core.matrix is *far* more powerful than BLAS/LAPACK.
>>
>
> I do not agree. For *numerical linear algebra* it is not even close to
> BLAS/LAPACK. I agree BLAS/LAPACK is not a good date parser, and I am glad
> that it is not :)
>
>> Some examples:
>> - Support for arbitrary N-dimensional arrays (slicing, reshaping,
>> multi-dimensional transposes etc.)
>>
>
> And core.matrix is? Compared to NumPy? Compared to state of the art tensor
> libraries? (Torch, etc.)
>
>> - General purpose array programming operations (analogous to NumPy and
>> APL)
>>
>
> See previous.
>
>> - Independence from underlying implementation. You can support pure-JVM
>> implementations (like vectorz-clj for example), native implementations, GPU
>> implementations.
>>
>
> That could be said for any API. Neanderthal has interfaces, which I think
> are simpler than core.matrix, and it also *can* support pure-JVM, native,
> GPU. The question is: which of those things *ARE* supported?
> Or maybe the answer to the question "why they are not supported" would
> answer why I do not feel core.matrix is such a wonderful solution for a
> generally noble goal.
>
>> - Support for arbitrary scalar types (complex numbers? strings? dates?
>> quaternions anyone?)
>>
>
> Where is that support for complex numbers? (To remind you, BLAS/LAPACK
> already has it, although I didn't implement that part in Neanderthal yet,
> since I didn't need it).
>
>> - Transparent support for both dense and sparse matrices with the same API
>>
>
> What is the performance of such operations? There is a reason dense/sparse
> APIs are different in numerical libraries.
>
>> - Support for both mutable and immutable arrays
>>
>
> While I support such a thing for the sake of effort in the name of elegance,
> in *numerical computing* mutable structures are what is important, and that
> is the first thing that any literature stresses.
>
>> - Transparent support for the in-built Clojure data structures (Clojure
>> persistent vectors etc.)
>>
>
> Transparent is what worries me. Looks appealing, but shoots me in the
> foot. Again, in *numerical computing*, when I want to convert something, I
> want it to be explicit.
>
>> - Support for mixing different array types
>> - Support for convenience operations such as broadcasting, coercion
>>
>
> See the previous point.
>
>> If you build an API that supports all of that with a reasonably coherent
>> design... then you'll probably end up with something very similar to
>> core.matrix
>>
>
> On the other hand, I may think such an API an unnecessarily complex and
> overblown thing. So, I guess that we just have a different perspective here.
>
> Thank you for the great effort, though. Even if I do not see that it fits
> my needs, I am glad that it is a great tool for other people.
>

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en