[julia-users] Re: Efficient way to split an array/dataframe?

veryluckyxyz Thu, 26 Mar 2015 11:58:13 -0700

Thank you Gunner, Tim, and James! 
These are great solutions and many times faster than my implementation.
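
For anyone finding this thread later, here is a minimal sketch of one way to get both sets at once, assuming a random permutation of the indices is acceptable. It is not necessarily what was suggested in the replies above. The helper name split_indices is hypothetical, and the `using Random` line assumes Julia 1.x (on the 0.3 release current in this thread, randperm is available from Base without it).

```julia
using Random  # assumption: Julia 1.x, where randperm lives in the Random stdlib

# Hypothetical helper: randomly partition 1:N into two disjoint index sets,
# the first holding ceil(N * frac) indices and the second holding the rest.
function split_indices(N::Integer, frac::Real)
    n = ceil(Int, N * frac)
    perm = randperm(N)              # random ordering of 1:N, no repeats
    return perm[1:n], perm[n+1:end]
end

N = 100
A = rand(N)
testindex, trainindex = split_indices(N, 0.7)
testA  = A[testindex]   # 70 elements, following the naming in the question
trainA = A[trainindex]  # remaining 30 elements
```

If testindex has already been drawn with sample, setdiff(1:N, testindex) should give the complementary indices. Both variants work on indices rather than values, so repeated values in A do not affect the split.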



On Thursday, March 26, 2015 at 10:17:10 AM UTC-4, James Fairbanks wrote:
>
> Since you mentioned a test set and a training set, you might want to use 
> MLBase.jl, which has reusable tools for conducting good ML research.
> In particular, cross-validation with random subsamples has already been 
> implemented:
>
> RandomSub(n, sn, k)
> https://mlbasejl.readthedocs.org/en/latest/crossval.html
>
> On Thursday, March 26, 2015 at 1:21:42 AM UTC-4, [email protected] 
> wrote:
>>
>> Hi,
>> I have an array of 100 elements. I want to split the array randomly into 
>> 70 (test set) and 30 (train set).
>>
>> using StatsBase  # provides sample
>>
>> N=100
>> A = rand(N);
>> n = convert(Int, ceil(N*0.7))
>> testindex = sample(1:size(A,1), n, replace=false)
>> testA = A[testindex];
>>
>> How can I get the train set?
>>
>> I could loop through testA and A to get trainA as below
>>
>> trainA = Array(eltype(testA), N-n);
>> k=1
>> for elem in A
>>     if !(elem in testA)
>>         trainA[k] = elem
>>         k=k+1
>>     end
>> end
>>
>> Is there a more efficient or elegant way to do this?
>>
>> Thanks!
>>
>
