Package: wnpp
Severity: wishlist

* Package name    : python-sframe
  Version         : 1.8.4
  Upstream Author : Dato, Inc.
* URL             : https://github.com/dato-code/SFrame
* License         : BSD
  Programming Lang: C++, Python
  Description     : scalable tabular (SFrame, SArray) and graph (SGraph) data structures built for out-of-core data analysis

The SFrame package provides the complete implementation of:

 * SFrame
 * SArray
 * SGraph
 * The C++ SDK surface area (gl_sframe, gl_sarray, gl_sgraph)

SFrame contains the open source components of GraphLab Create from
Dato. For more details on GraphLab Create (including documentation and
tutorials) see http://dato.com.

Some of the key features of this package are:

 * A scalable, column-compressed, disk-backed dataframe optimized for
   machine learning and data science needs.
 * Designed for both tabular data (SFrame, SArray) and graph data
   (SGraph).
 * Support for strictly typed columns (int, float, str, datetime),
   weakly typed columns (schema-free lists, dictionaries), and
   specialized types such as Image.
 * Uniform support for missing data.
 * Query optimization and lazy evaluation.
 * A C++ API (gl_sarray, gl_sframe, gl_sgraph) with direct native
   access via the C++ SDK.
 * A Python API (SArray, SFrame, SGraph) with indirect access via an
   interprocess layer.

----

Since I am interested in this package, I am willing to help co-maintain
it (as soon as I orphan some packages of mine), especially if a more
experienced module packager is willing to guide me through the process
of packaging a hybrid module like this one.

Also, since this package is very similar in spirit to Pandas, I'm CCing
the pandas maintainers, in case they are interested.

Thanks,

-- 
Rogério Brito : rbrito@{ime.usp.br,gmail.com} : GPG key 4096R/BCFCAAAA
http://cynic.cc/blog/ : github.com/rbrito : profiles.google.com/rbrito
DebianQA: http://qa.debian.org/developer.php?login=rbrito%40ime.usp.br