Hi all,

We're (ab)using LibLinear (linear SVM) as a multi-class classifier, with 200+ labels and 400K features.
This results in a model that's > 800MB, which is a bit unwieldy. Unfortunately LibLinear uses a full (dense) array of weights, nothing sparse, since it's a port of the C version. I could do feature reduction (removing rows from the matrix) with Mahout prior to training the model, but I'd prefer to reduce the in-memory n x m array of weights.

Any suggestions for approaches to take?

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
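[For context, one common way to shrink a dense linear model in memory is to prune near-zero weights into a sparse index/value representation. This is a minimal sketch of that idea, not LibLinear's API: the `SparseWeights` class, its `prune` threshold, and the `dot` helper are all hypothetical names for illustration.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: compact a dense weight vector (one per class in a
// one-vs-rest multi-class model) into parallel index/value arrays,
// dropping weights whose magnitude falls below a threshold.
public class SparseWeights {
    final int[] indices;   // feature indices of the retained weights
    final double[] values; // the retained weight values

    SparseWeights(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    // Keep only weights with |w| > threshold; everything else is
    // treated as zero at prediction time.
    static SparseWeights prune(double[] dense, double threshold) {
        List<Integer> idx = new ArrayList<>();
        List<Double> val = new ArrayList<>();
        for (int i = 0; i < dense.length; i++) {
            if (Math.abs(dense[i]) > threshold) {
                idx.add(i);
                val.add(dense[i]);
            }
        }
        int[] ia = new int[idx.size()];
        double[] va = new double[val.size()];
        for (int i = 0; i < ia.length; i++) {
            ia[i] = idx.get(i);
            va[i] = val.get(i);
        }
        return new SparseWeights(ia, va);
    }

    // Dot product against a dense feature vector, touching only the
    // stored (non-pruned) weights.
    double dot(double[] x) {
        double s = 0.0;
        for (int i = 0; i < indices.length; i++) {
            s += values[i] * x[indices[i]];
        }
        return s;
    }
}
```

Whether this is safe depends on how many weights are genuinely near zero; with L1-regularized solvers many are exactly zero, so the savings can be large with no accuracy loss.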
