On Thursday, 3 November 2016 at 13:35 -0700, LeAnthony Mathews wrote:
> Thanks Michael,
> I have been thinking about this all day. Yes, basically I am going to
> have to create a macro CSVreadtable that mimics the readtable
> command, but in the expansion uses CSV.read. The macro will manually
> construct a DataFrame of the same size as the one readtable returns,
> but use the column types I specify or inherit from the original
> readtable command. The macro can use the current CSV.read parameters.
>
> So this would work.
> df1_CSVreadtable = CSVreadtable("$df1_path"; types=Dict(1=>String))
>
> so that:
> eltypes(df1_CSVreadtable)
> 3-element Array{Type,1}:
> Int32
> String
> String
>
>
> Anyway, I was looking for a quick fix, but at least I will learn
> some Julia.
If you don't have missing values and just want a Vector{String}, you
can pass nullable=false to CSV.read().
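
If you end up staying with readtable instead, the conversion Michael
suggested also works on a plain vector. A minimal sketch using only Base
Julia; the sample values and the names `account_numbers` and
`account_strings` are made up for illustration and are not taken from
your file:

```julia
# Hypothetical sample data standing in for the account_number column
# that readtable parsed as Int32.
account_numbers = Int32[1001, 1002, 1003]

# Build a Vector{String} holding the string representation of each value.
account_strings = String[string(n) for n in account_numbers]

println(eltype(account_strings))
```

Note that this only changes the element type after parsing: any leading
zeros that were dropped when the column was read as integers are not
restored.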
Regards
>
>
> > DataFrames is currently undergoing a very major change. Looks like
> > CSV creates the new type of DataFrames. I hope someone can help you
> > with using that. As a workaround, on the normal DataFrames version,
> > I have generally just replaced the column with its string representation:
> > ```
> > df[:account_numbers] = ["$account_number" for account_number in
> > df[:account_numbers]]
> > ```
> >
> > On Thu, Nov 3, 2016 at 3:05 PM, LeAnthony Mathews wrote:
> > > Sure, so I need col #1 in my CSV to be a string in my data frame.
> > >
> > >
> > > So as a test I tried to load the file 3 different ways:
> > >
> > > df1_CSV = CSV.read("$df1_path"; types=Dict(1=>String)) # forcing the column to stay a string
> > > df1_readtable = readtable("$df1_path") # Do not know how to force the column to stay a string
> > > df1_convertDF = convert(DataFrame, df1_CSV)
> > >
> > > Here is the output. If they are all DataFrames, then showcols
> > > should work on all three:
> > >
> > > julia> names(df1_CSV)
> > > 3-element Array{Symbol,1}:
> > > :account_number
> > > Symbol("Discharge Date")
> > > :site
> > >
> > > julia> names(df1_readtable)
> > > 3-element Array{Symbol,1}:
> > > :account_number
> > > :Discharge_Date
> > > :site
> > >
> > > julia> names(df1_convertDF)
> > > 3-element Array{Symbol,1}:
> > > :account_number
> > > Symbol("Discharge Date")
> > > :site
> > >
> > >
> > > julia> eltypes(df1_CSV)
> > > 3-element Array{Type,1}:
> > > Nullable{String}
> > > Nullable{WeakRefString{UInt8}}
> > > Nullable{WeakRefString{UInt8}}
> > >
> > > julia> eltypes(df1_readtable)
> > > 3-element Array{Type,1}:
> > > Int32 #Do not know how to force the column to stay a string
> > > String
> > > String
> > >
> > > julia> eltypes(df1_convertDF)
> > > 3-element Array{Type,1}:
> > > Nullable{String}
> > > Nullable{WeakRefString{UInt8}}
> > > Nullable{WeakRefString{UInt8}}
> > >
> > > julia> showcols(df1_convertDF)
> > > 1565x3 DataFrames.DataFrame
> > > ERROR: MethodError: no method matching countna(::NullableArrays.NullableArray{String,1})
> > > Closest candidates are:
> > > countna(::Array{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:115
> > > countna(::DataArrays.DataArray{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:128
> > > countna(::DataArrays.PooledDataArray{T,R<:Integer,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:143
> > > in colmissing(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\abstractdataframe.jl:657
> > > in showcols(::Base.TTY, ::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:574
> > > in showcols(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:581
> > >
> > > julia> showcols(df1_readtable)
> > > 1565x3 DataFrames.DataFrame
> > > │ Col # │ Name │ Eltype │ Missing │
> > > ├───────┼────────────────┼────────┼─────────┤
> > > │ 1 │ account_number │ Int32 │ 0 │
> > > │ 2 │ Discharge_Date │ String │ 0 │
> > > │ 3 │ site │ String │ 0 │
> > >
> > > julia> showcols(df1_CSV)
> > > 1565x3 DataFrames.DataFrame
> > > ERROR: MethodError: no method matching countna(::NullableArrays.NullableArray{String,1})
> > > Closest candidates are:
> > > countna(::Array{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:115
> > > countna(::DataArrays.DataArray{T,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:128
> > > countna(::DataArrays.PooledDataArray{T,R<:Integer,N}) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\other\utils.jl:143
> > > in colmissing(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\abstractdataframe.jl:657
> > > in showcols(::Base.TTY, ::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:574
> > > in showcols(::DataFrames.DataFrame) at C:\Users\lmathews\.julia\v0.5\DataFrames\src\abstractdataframe\show.jl:581
> > >
> > >
> > >
> > > > The result of CSV should be a DataFrame by default. What
> > > > return type do you get?
> > > >
> > >
> >
> >