You could use CSV.jl: http://juliadata.github.io/CSV.jl/stable/

In this case, you'd do:

using CSV

df1 = CSV.read(file1; types=Dict(1=>String)) # assuming your account number is column 1
df2 = CSV.read(file2; types=Dict(1=>String))
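
As for why file1 shows negative numbers: my guess (the original post doesn't confirm it) is that the column was inferred as Int32 and the ten-digit values wrapped around, since they are larger than typemax(Int32). Here is a minimal sketch in plain Julia, no packages needed, that reproduces the numbers you posted. The variable names are just for illustration.

```julia
# Account numbers as they appear in the CSV from the original post.
accounts = [8018884596, 8018893530, 8018909633]

# typemax(Int32) is 2147483647, so none of these values fit in 32 bits.
@assert all(a -> a > typemax(Int32), accounts)

# `x % Int32` keeps only the low 32 bits of x, which mimics the wraparound.
wrapped = [a % Int32 for a in accounts]
println(wrapped)

# Keeping the values as strings sidesteps the overflow and preserves any
# leading zeros an account number might have.
as_strings = [string(a) for a in accounts]
println(as_strings)
```

Reading the column as String in both files should also give both tables the same key type for the join.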

-Jacob


On Mon, Oct 31, 2016 at 12:50 PM, LeAnthony Mathews <leanthon...@gmail.com>
wrote:

> Using v0.5.0
> I have two different 10,000 line CSV files that I am reading into two
> different dataframe variables using the readtable function.
> Each table has in common a ten digit account_number that I would like to
> use as an index and join into one master file.
>
> Here is the account number example in the original CSV from file1:
> 8018884596
> 8018893530
> 8018909633
>
> When I do a readtable of this CSV into file1 then do a
> *typeof(file1[:account_number])* I get:
> *DataArrays.DataArray(Int32,1)*
>  -571049996
>  -571041062
>  -571024959
>
> when I do a
> *typeof(file2[:account_number])*
> *DataArrays.DataArray(String,1)*
>
>
> *Question:  *
> My CSV files give no guidance that account_number should be Int32 or
> string type.  How do I force it to make both account_number elements type
> String?
>
> I would like this join command to work:
> *new_account_join = join(file1, file2, on =:account_number,kind = :left)*
>
> But I am getting this error:
> *ERROR: TypeError: typeassert: expected Union{Array{Symbol,1},Symbol}, got
> Array{Array{Symbol,1},1}*
> * in (::Base.#kw##join)(::Array{Any,1}, ::Base.#join,
> ::DataFrames.DataFrame, ::DataFrames.DataFrame) at .\<missing>:0*
>
>
> Any help would be appreciated.
>
>
>
