Any other reasons, or can you give a thorough analysis?

2014-11-05 11:00 GMT+08:00 Ted Dunning <[email protected]>:
> Yes, type conversion is a reason.
>
> Sent from my iPhone
>
> > On Nov 4, 2014, at 18:59, Lee S <[email protected]> wrote:
> >
> > e.g. kmeans input:
> > 1,2,3,4 //text file
> > kmeans output:
> > point1, point2, point3 (text file of center points)
> >
> > I just thought of one reason. The input data should be stored in
> > vector (dense or sparse) format, so a conversion step
> > needs to be done before the algorithms deal with the data. Is that right?
> >
> > 2014-11-04 23:56 GMT+08:00 Ted Dunning <[email protected]>:
> >
> >> What should the input be?
> >>
> >>> On Tue, Nov 4, 2014 at 12:28 AM, Lee S <[email protected]> wrote:
> >>>
> >>> Hi all:
> >>> I'm wondering why the input and output of most algorithms, like
> >>> kmeans and naivebayes, are all sequencefiles. One more conversion step
> >>> needs to be done if we want the algorithms to work, and
> >>> I think that step is time consuming, because it's also a mapreduce job.
> >>> Is the reason to deal with small files and to compress to save disk
> >>> space?
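
For readers following the thread, the type conversion being discussed can be sketched roughly as below. This is only an illustration in plain Python, not the converter the library actually runs: the function names `parse_dense` and `to_sparse` are hypothetical, and the sketch assumes each input line is a bare comma-separated list of numbers such as `1,2,3,4`. It shows a text line becoming a dense vector, and the same data in a sparse index-to-value form.

```python
def parse_dense(line, sep=","):
    """Parse one text line such as '1,2,3,4' into a dense list of floats."""
    return [float(token) for token in line.strip().split(sep)]


def to_sparse(dense):
    """Keep only the non-zero entries as an {index: value} mapping."""
    return {index: value for index, value in enumerate(dense) if value != 0.0}


if __name__ == "__main__":
    dense = parse_dense("1,0,3,0")
    print(dense)             # dense form: every position is stored
    print(to_sparse(dense))  # sparse form: only non-zero positions are stored
```

Doing this parsing once, up front, is the extra step the original question asks about; the algorithms then read typed vectors instead of re-parsing text on every pass.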
