The first thing to try is feature hashing to reduce your feature vector size.
With multiple probes (hashing each feature several times) and possibly random
±1 weights to cancel collision bias, you might be able to cut the size by 10x.
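For concreteness, here's a minimal sketch of the idea in Python — the dimension, probe count, and hash choice are all illustrative, not a recommendation:

```python
import hashlib

def hashed_features(features, dim=40_000, probes=2):
    """Feature hashing ("hashing trick") sketch: map a sparse feature
    dict into a fixed-size dense vector. Each feature is hashed with
    `probes` independent hash functions, and one hash bit supplies a
    random +/-1 sign to reduce collision bias. All sizes/names here
    are illustrative assumptions."""
    vec = [0.0] * dim
    for name, value in features.items():
        for p in range(probes):
            # Salt the feature name with the probe index to get
            # independent hash functions from one underlying hash.
            h = hashlib.md5(f"{p}:{name}".encode()).digest()
            idx = int.from_bytes(h[:4], "little") % dim
            sign = 1.0 if h[4] & 1 else -1.0  # hash-derived random sign
            vec[idx] += sign * value / probes
    return vec
```

In your case you'd hash the 400K original features down to a much smaller `dim` before handing vectors to LibLinear, so the weight matrix shrinks proportionally; the probes spread each feature over several slots so a single collision doesn't wipe it out.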

Sent from my iPhone

On Apr 12, 2013, at 18:30, Ken Krugler <[email protected]> wrote:

> Hi all,
> 
> We're (ab)using LibLinear (linear SVM) as a multi-class classifier, with 200+ 
> labels and 400K features.
> 
> This results in a model that's > 800MB, which is a bit unwieldy. 
> Unfortunately LibLinear uses a full array of weights (nothing sparse), being 
> a port from the C version.
> 
> I could do feature reduction (removing rows from the matrix) with Mahout 
> prior to training the model, but I'd prefer to reduce the (in memory) nxm 
> array of weights.
> 
> Any suggestions for approaches to take?
> 
> Thanks,
> 
> -- Ken
> 
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
> 
