Renjin's and Spark's dataframes are not going to be easily extracted from their respective codebases, as far as my brief perusal of the source can tell. I agree that N-D DataFrames would be a good addition to the ecosystem, similar to the goals of Python's xarray (xarray.pydata.org). However, it is not a priority for me at this time. Thanks for pointing out the DataSet proposal. I'll take a look at that later.
On a slightly related note, where is the best place to ask core.matrix questions? I have some small questions about sparse matrix support in core.matrix, and what sparse formats are implemented.

On Thursday, March 10, 2016 at 7:45:44 PM UTC-5, Mikera wrote:
>
> core.matrix maintainer here.
>
> I think it would be great to have more work on dataframe-type support. I think the right strategy is as follows:
> a) Make use of the core.matrix Dataset protocols where possible (or add new ones)
> b) Create implementation(s) for these protocols for whatever back-end data frame implementation is being used
>
> The beauty of core.matrix is that we *can* support multiple implementations without fragmentation, because the protocol based approach means that every implementation can use the same API. This is already working well for the array programming APIs (it's easy to mix and match Clojure data structures, Vectorz Java-based arrays, GPU backed arrays in computations). We just need to do the same for DataFrames.
>
> Now: the current core.matrix Dataset API is a bit focused on 2D data tables, but I think it can be extended to general N-dimensional dataframe capability. Would be a great project for someone to take on, happy to give guidance and help merge in changes as needed.
>
> I don't have a particularly strong opinion on which Dataframe implementations are best, but it looks like Spark and Renjin are both great candidates and would be very useful additions to the Clojure numerical ecosystem. If we do things right, they should interoperate easily with the core.matrix APIs, making Clojure ideal for "glue" code across such implementations.
>
> On Thursday, 10 March 2016 04:57:31 UTC+8, arthur.ma...@gmail.com wrote:
>>
>> Is there any desire or need for a Clojure DataFrame?
>>
>> By DataFrame, I mean a structure similar to R's data.frame, and Python's pandas.DataFrame.
>>
>> Incanter's DataSet may already be fulfilling this purpose, and if so, I'd like to know if and how people are using it.
>>
>> From quickly researching, I see that some prior work has been done in this space, such as:
>>
>> * https://github.com/cardillo/joinery
>> * https://github.com/mattrepl/data-frame
>> * http://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes
>>
>> Rather than going off and creating a competing implementation (https://xkcd.com/927/), I'd like to know if anyone here is actively working on, or would like to work on a DataFrame and related utilities for Clojure (and by extension Java)? Is it something that's sorely needed, or is everybody happy with using Incanter or some other library that I'm not aware of? If there's already a defacto standard out there, would anyone care to please point it out?
>>
>> As background information:
>>
>> My specific use-case is in NLP and ML, where I often explore and prototype in Python, but I'm then left to deal with a smattering of libraries on the JVM (Mallet, Weka, Mahout, ND4J, DeepLearning4j, CoreNLP, etc.), each with their own ad-hoc implementations of algorithms, matrices, and utilities for reading data. It would be great to have a unified way to explore my data in the Clojure REPL, and then serve the same code and models in production.
>>
>> I would love for Clojure to have a broadly compatible ecosystem similar to Python's Numpy/Pandas/Scikit-*/Scipy/matplotlib/GenSim, etc. Core.Matrix and Incanter appear to fulfill a large chunk of those roles, but I am not aware if they've yet become the defacto standards in the community.
>>
>> Any feedback is greatly appreciated.
>>
>

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.