[julia-users] Re: Efficient way to split an array/dataframe?

James Fairbanks Thu, 26 Mar 2015 07:17:50 -0700

Since you mentioned a test set and a training set, you might want to use 
MLBase.jl, which has reusable tools for conducting good ML research.
In particular, cross-validation with random subsamples has already been 
implemented:


RandomSub(n, sn, k)
https://mlbasejl.readthedocs.org/en/latest/crossval.html
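
If you only need a single random split and would rather not pull in a 
package, a random permutation of the indices gives you both sets at once. 
The following is a minimal sketch using only the Random standard library 
(it does not use MLBase.jl). The helper name split_indices is made up for 
illustration, and the code assumes a Julia version where randperm lives in 
the Random module.

```julia
using Random  # standard library; provides randperm

# Hypothetical helper (not part of MLBase.jl): randomly partition the
# indices 1:N into a first part holding roughly `frac` of them and a
# second part holding the rest.
function split_indices(N::Integer, frac::Real)
    n = ceil(Int, N * frac)          # size of the first part
    perm = randperm(N)               # random permutation of 1:N
    return perm[1:n], perm[n+1:end]  # disjoint by construction
end

N = 100
A = rand(N)

# Same naming as in the question: the 70% part is called the test set.
testindex, trainindex = split_indices(N, 0.7)
testA  = A[testindex]
trainA = A[trainindex]
```

Because both index vectors come from the same permutation, they never 
overlap and together cover every element, even when A contains repeated 
values. If you already have testindex from sample, setdiff(1:N, testindex) 
should give you the remaining indices without a loop.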

On Thursday, March 26, 2015 at 1:21:42 AM UTC-4, [email protected] wrote:
>
> Hi,
> I have an array of 100 elements. I want to split the array randomly into 
> 70 elements (test set) and 30 elements (train set).
>
> N=100
> A = rand(N);
> n = convert(Int, ceil(N*0.7))
> testindex = sample(1:size(A,1), replace=false,n)
> testA = A[testindex];
>
> How can I get the train set?
>
> I could loop through testA and A to get trainA as below
>
> trainA = Array(eltype(testA), N-n);
> k=1
> for elem in A
>     if !(elem in testA)
>         trainA[k] = elem
>         k=k+1
>     end
> end
>
> Is there a more efficient or elegant way to do this?
>
> Thanks!
>
