Thanks, this is really super cool work!

 

Let me also point out that Query.jl works great with DataStreams sources and 
sinks. For example, say you want to load some data from an SQLite database, 
apply some filtering and transformations, and write the result out as a CSV 
file. You can do that like this:

 

using Query, SQLite, CSV

sqlite_db = SQLite.DB(joinpath(Pkg.dir("SQLite"), "test", "Chinook_Sqlite.sqlite"))

q = @from i in SQLite.Source(sqlite_db, "SELECT * FROM Employee") begin
    @where i.ReportsTo == 2
    @select {Name=i.LastName, Adr=i.Address}
    @collect CSV.Sink("test-output.csv")
end

Data.close!(q)

 

Note that this never actually materializes the data into a DataFrame or 
anything like that; instead everything is streamed throughout, from the 
DataStreams source to the sink, including the whole query part in the middle.

 

Best,

David

 

From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On 
Behalf Of Jacob Quinn
Sent: Thursday, October 27, 2016 11:33 PM
To: julia-users <julia-users@googlegroups.com>
Subject: [julia-users] [ANN] DataStreams v0.1: Blog post + Package Release Notes

 

Hey everyone, 

 

Just wanted to put out the announcement of the release of DataStreams v0.1. 
(It was actually tagged a few weeks ago, but I've been letting a few last 
things shake out before announcing.)

 

I've written up a blog post on the updates and release here: 
http://quinnj.github.io/datastreams-jl-v0-1/

 

The TL;DR is DataStreams.jl now defines concrete interfaces for Data.Sources 
and Data.Sinks, with each being completely decoupled from the other. This has 
also allowed some cool new features like appending to Data.Sinks and allowing 
simple transform functions to be applied to data "in-transit".
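To make the appending and "in-transit" transform ideas concrete, here is a 
hypothetical sketch. It assumes the Julia 0.5-era DataStreams API, with a 
`transforms` keyword mapping a column index to a function, as described in the 
blog post; the exact signatures are assumptions, not a definitive recipe:

```julia
using DataStreams, CSV

# Hypothetical sketch (Julia 0.5-era APIs): stream one CSV file into another,
# upper-casing the first column "in-transit" and appending to an existing sink.
source = CSV.Source("input.csv")
sink   = CSV.Sink("output.csv"; append=true)

# `transforms` maps column index => function applied to each value as it streams.
Data.stream!(source, sink; transforms=Dict(1 => uppercase))
Data.close!(sink)
```

Because source and sink are fully decoupled, the same streaming call works for 
any Data.Source/Data.Sink pairing, not just CSV-to-CSV.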

 

I included release notes for the existing packages in the blog post, but I'll 
copy-paste them below for easier access:

 

Do note that the DataStreams.jl framework is now Julia 0.5-only.

 

 

·      CSV.jl

o    Docs: <http://juliadata.github.io/CSV.jl/stable/>

o    Supports a wide variety of delimited file options such as delim, 
quotechar, escapechar, custom null strings; a header can be provided manually 
or on a specified row or range of rows; types can be provided manually, and 
results can be requested as nullable or not (nullable=true by default); and the 
# of rows can be provided manually (if known) for efficiency.

o    CSV.parsefield(io::IO, ::Type{T}) can be called directly on any IO type 
to tap into the delimited-parsing functionality manually
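A rough sketch of the CSV.jl options listed above (Julia 0.5-era keyword names, 
taken from the docs of that release; treat the exact names and signatures as 
assumptions):

```julia
using CSV

# Sketch (Julia 0.5-era API): reading a delimited file with custom options.
df = CSV.read("data.tsv";
              delim='\t',                    # custom delimiter
              quotechar='"',
              escapechar='\\',
              null="NA",                     # custom null string
              header=2,                      # header found on row 2
              types=[Int, Float64, String],  # manual column types
              nullable=false,                # Vector{T} instead of NullableVector{T}
              rows=10_000)                   # row-count hint for efficiency

# Tap into the field parser directly on any IO:
io = IOBuffer("42,")
CSV.parsefield(io, Int)  # parses the next Int field from the stream
```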

·      SQLite.jl

o    Docs: <http://juliadb.github.io/SQLite.jl/stable/>

o    Query results will now use the declared table column type by default, 
which can help resultset column typing in some cases

o    Parameterized SQL statements are fully supported, with the ability to bind 
julia values to be sent to the DB

o    Full serialization/deserialization of native and custom Julia types is 
supported; so Complex{Int128} can be stored in its own SQLite table column and 
retrieved without any issue

o    Pure Julia scalar and aggregation functions can be registered with an 
SQLite database and then called from within SQL statements; full docs here: 
<http://juliadb.github.io/SQLite.jl/stable/#User-Defined-Functions-1>

·      Feather.jl

o    Docs: <http://juliastats.github.io/Feather.jl/stable/>

o    Full support for feather release v0.3.0 to ensure compatibility

o    Full support for returning "factor" or "category" type columns as native 
CategoricalArray and NullableCategoricalArray types in Julia, thanks to the new 
CategoricalArrays.jl package (<https://github.com/JuliaData/CategoricalArrays.jl>)

o    nullable::Bool=true keyword argument; if false, columns without null 
values will be returned as Vector{T} instead of NullableVector{T}

o    Feather.Sink now supports appending, so multiple DataFrames or CSV.Source 
or any Data.Source can all be streamed to a single feather file
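A hypothetical sketch of appending multiple sources to one feather file 
(Julia 0.5-era API; the `append` and `nullable` keywords follow the notes 
above, but the exact method signatures are assumptions):

```julia
using Feather, CSV

# Hypothetical sketch (Julia 0.5-era API): stream two CSV sources into a
# single feather file, appending the second to the first.
Feather.write("out.feather", CSV.Source("part1.csv"))
Feather.write("out.feather", CSV.Source("part2.csv"); append=true)

# Read back; null-free columns come back as plain Vector{T}.
df = Feather.read("out.feather"; nullable=false)
```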

·      ODBC.jl

o    Docs: <http://juliadb.github.io/ODBC.jl/stable/>

o    A new ODBC.DSN type that represents a valid, open connection to a 
database and is used in all subsequent API calls; it can be constructed from a 
previously configured system/user DSN with username and password, or from a 
full custom connection string

o    Full support for the DataStreams.jl framework through the ODBC.Source and 
ODBC.Sink types, along with their high-level convenience methods ODBC.query and 
ODBC.load

o    A new ODBC.prepare(dsn, sql) => ODBC.Statement method, which sends an SQL 
statement to the database to be compiled and planned before being executed one 
or more times; prepared statements can include parameters whose values can be 
bound dynamically before each execution
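Putting the ODBC pieces above together in a hypothetical sketch (Julia 0.5-era 
ODBC.jl API; the method names follow the release notes, but the exact 
signatures, including the execute call, are assumptions):

```julia
using ODBC

# Hypothetical sketch (Julia 0.5-era API).
dsn = ODBC.DSN("mydsn", "username", "password")  # or a full connection string

# High-level convenience query:
df = ODBC.query(dsn, "SELECT * FROM sales")

# Prepare once, execute many times with different bound values:
stmt = ODBC.prepare(dsn, "INSERT INTO sales VALUES (?, ?)")
ODBC.execute!(stmt, [1, "widget"])
ODBC.execute!(stmt, [2, "gadget"])

ODBC.disconnect!(dsn)
```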
