Thanks, this is really super cool work!
Let me also point out that Query.jl works great with DataStreams sources and sinks. For example, say you want to load some data from a SQLite database, apply some filtering and transformations, and write the result out as a CSV file. You can do that like this:

    using Query, SQLite, CSV

    sqlite_db = SQLite.DB(joinpath(Pkg.dir("SQLite"), "test", "Chinook_Sqlite.sqlite"))

    q = @from i in SQLite.Source(sqlite_db, "SELECT * FROM Employee") begin
        @where i.ReportsTo==2
        @select {Name=i.LastName, Adr=i.Address}
        @collect CSV.Sink("test-output.csv")
    end

    Data.close!(q)

Note that this never materializes the data into a DataFrame or anything like that; instead everything is streamed throughout, from the DataStreams source to the sink, including the whole query part in the middle.

Best,
David

From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On Behalf Of Jacob Quinn
Sent: Thursday, October 27, 2016 11:33 PM
To: julia-users <julia-users@googlegroups.com>
Subject: [julia-users] [ANN] DataStreams v0.1: Blog post + Package Release Notes

Hey everyone,

Just wanted to put out the announcement of the release of DataStreams v0.1 (it was actually tagged a few weeks ago, but I've been letting a few last things shake out before announcing). I've written up a blog post on the updates and release here: http://quinnj.github.io/datastreams-jl-v0-1/

The TL;DR is that DataStreams.jl now defines concrete interfaces for Data.Sources and Data.Sinks, with each being completely decoupled from the other. This has also allowed some cool new features like appending to Data.Sinks and allowing simple transform functions to be applied to data "in-transit".

I included release notes of existing packages in the blog post, but I'll copy-paste them here below for easier access.

Do note that the DataStreams.jl framework is now Julia 0.5-only.

* CSV.jl
    o Docs: <http://juliadata.github.io/CSV.jl/stable/>
    o Supports a wide variety of delimited file options such as delim, quotechar, escapechar, and custom null strings; a header can be provided manually or on a specified row or range of rows; types can be provided manually, and results can be requested as nullable or not (nullable=true by default); and the number of rows can be provided manually (if known) for efficiency.
    o CSV.parsefield(io::IO, ::Type{T}) can be called directly on any IO type to tap into the delimited-parsing functionality manually

* SQLite.jl
    o Docs: <http://juliadb.github.io/SQLite.jl/stable/>
    o Query results will now use the declared table column type by default, which can help resultset column typing in some cases
    o Parameterized SQL statements are fully supported, with the ability to bind Julia values to be sent to the DB
    o Full serialization/deserialization of native and custom Julia types is supported, so Complex{Int128} can be stored in its own SQLite table column and retrieved without any issue
    o Pure Julia scalar and aggregation functions can be registered with an SQLite database and then called from within SQL statements; full docs here: <http://juliadb.github.io/SQLite.jl/stable/#User-Defined-Functions-1>

* Feather.jl
    o Docs: <http://juliastats.github.io/Feather.jl/stable/>
    o Full support for feather release v0.3.0 to ensure compatibility
    o Full support for returning "factor" or "category" type columns as native CategoricalArray and NullableCategoricalArray types in Julia, thanks to the new CategoricalArrays.jl package <https://github.com/JuliaData/CategoricalArrays.jl>
    o nullable::Bool=true keyword argument; if false, columns without null values will be returned as Vector{T} instead of NullableVector{T}
    o Feather.Sink now supports appending, so multiple DataFrames or CSV.Source or any Data.Source can all be streamed to a single feather file

* ODBC.jl
    o Docs: <http://juliadb.github.io/ODBC.jl/stable/>
    o A new ODBC.DSN type that represents a valid, open connection to a database; used in all subsequent API calls; it can be constructed using a previously configured system/user DSN with username and password, or as a full custom connection string
    o Full support for the DataStreams.jl framework through the ODBC.Source and ODBC.Sink types, along with their high-level convenience methods ODBC.query and ODBC.load
    o A new ODBC.prepare(dsn, sql) => ODBC.Statement method which can send an SQL statement to the database to be compiled and planned before being executed one or more times. SQL statements can include parameters to be prepared that can have dynamic values bound before each execution.
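
To make the streaming point from the reply at the top of this thread concrete, here is a minimal sketch in plain Julia. It does not use the Query.jl or DataStreams.jl APIs at all; it only mimics the source -> filter -> transform -> sink shape with lazy iterators. It assumes a current Julia 1.x (named tuples did not exist in Julia 0.5), and the sample rows, the `write_rows` helper, and the in-memory buffer are hypothetical stand-ins rather than anything from the packages above.

```julia
# Hypothetical rows standing in for a streamed source (not the real Employee table).
source = [
    (LastName = "Alpha", Address = "1 First St",  ReportsTo = 2),
    (LastName = "Beta",  Address = "2 Second St", ReportsTo = 1),
    (LastName = "Gamma", Address = "3 Third St",  ReportsTo = 2),
]

# Lazy filter and lazy projection: neither step builds an intermediate collection.
filtered  = Iterators.filter(r -> r.ReportsTo == 2, source)
projected = ((Name = r.LastName, Adr = r.Address) for r in filtered)

# Hypothetical sink: writes a header, then each row as it is pulled from `rows`.
function write_rows(io::IO, rows)
    println(io, "Name,Adr")
    for r in rows
        println(io, r.Name, ",", r.Adr)
    end
    return io
end

# An IOBuffer stands in for the output file so the sketch has no side effects.
buf = write_rows(IOBuffer(), projected)
output = String(take!(buf))
print(output)
```

Because `Iterators.filter` and the generator are both lazy, each row passes through the filter and the projection only when the loop in `write_rows` asks for it, which is the same idea as streaming from a source to a sink without materializing a table in between.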