Re: training-data.libsvm Vs model.libsvm (RelationExtractor)

Steven Bethard Tue, 21 May 2013 06:45:14 -0700

On May 20, 2013, at 9:34 PM, giri vara prasad nambari <girinamb...@gmail.com> 
wrote:
> Here is the link where I found these files:
> https://svn.apache.org/repos/asf/ctakes/tags/ctakes-3.0.0-incubating/ctakes-relation-extractor/resources/models/modifier_extractor/

I see. You're not working from the current repository; you're working from the 
3.0.0-incubating tag. Those files were erroneously included in that release.

> "The training-data.libsvm file will not be generated during classification.
> It's only generated during training", I understood this part.
> 
> "LIBSVM-formatted features and labels": this part I am not clear on. How
> are these generated from training data?

They're generated from the features created by the RelationFeaturesExtractors I 
pointed you to in RelationExtractorAnnotator. The conversion from Feature 
objects to LIBSVM-formatted feature strings is performed by ClearTK, 
specifically by the LIBSVMStringOutcomeDataWriter. (You can see that the models 
are trained with that data writer class in RelationExtractorTrain.)
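
To make that conversion step concrete, here is a rough sketch in Python of 
what a data writer of this kind has to do: map each label and each feature 
name to an integer, then write one line per training instance. This is not 
ClearTK's actual code. The function name encode_instance, the feature names, 
and the labels are all made up for illustration, and I'm assuming simple 
first-seen numbering for the indices.

```python
def encode_instance(label, features, label_index, feature_index):
    """Turn one (label, {feature name: value}) pair into a LIBSVM-style line."""
    # Give each previously unseen label the next free integer (1-based).
    label_id = label_index.setdefault(label, len(label_index) + 1)
    pairs = []
    for name, value in features.items():
        # Same idea for feature names: first-seen order decides the index.
        idx = feature_index.setdefault(name, len(feature_index) + 1)
        pairs.append((idx, float(value)))
    # LIBSVM expects feature indices in ascending order within a line.
    pairs.sort()
    return " ".join([str(label_id)] + [f"{i}:{v:g}" for i, v in pairs])


# Hypothetical instances; the two index maps are shared across all instances.
label_index, feature_index = {}, {}
line1 = encode_instance(
    "location_of", {"distance": 3, "arg1_type=Sign": 1},
    label_index, feature_index)
line2 = encode_instance(
    "-NONE-", {"arg1_type=Sign": 1, "distance": 7, "token_between=in": 1},
    label_index, feature_index)
print(line1)  # 1 1:3 2:1
print(line2)  # 2 1:7 2:1 3:1
```

The point is only that the same index maps have to be reused at 
classification time, so that a feature name always lands on the same index 
the model was trained with.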

> Is it based on the SVM algorithm?

The LIBSVM format for providing features and labels is defined by LIBSVM 
(http://www.csie.ntu.edu.tw/~cjlin/libsvm/), a library for support vector 
machines. Each line is basically <label> <feature>:<value> <feature>:<value> 
..., where each <feature> is an integer index.
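
If you want to inspect a training-data.libsvm file yourself, a line in that 
format can be read back with a few lines of Python. This is just a sketch: 
the function name parse_libsvm_line and the sample line are made up, and it 
assumes a numeric label and well-formed index:value pairs.

```python
def parse_libsvm_line(line):
    """Split one LIBSVM-formatted line into (label, {feature index: value})."""
    parts = line.split()
    # The first field is the label; everything after it is index:value.
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Made-up example line, not taken from the cTAKES files.
label, features = parse_libsvm_line("2 1:7 3:0.5 10:1")
print(label)     # 2.0
print(features)  # {1: 7.0, 3: 0.5, 10: 1.0}
```

Indices that do not appear on a line (2 and 4 through 9 here) are simply 
treated as zero, which is why the format works well for sparse features.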

Steve

> On Mon, May 20, 2013 at 5:54 PM, Steven Bethard <steven.beth...@colorado.edu> wrote:
> 
>> On May 17, 2013, at 2:29 PM, giri vara prasad nambari <girinamb...@gmail.com> wrote:
>>> Can someone please clarify the difference between training-data.libsvm
>>> and model.libsvm in ctakes-relation-extractor module?
>> 
>> Where are you seeing these? Neither should be in the repository.
>> 
>> That said, training-data.libsvm is the LIBSVM-formatted features and
>> labels, and model.libsvm is the LIBSVM model file.
>> 
>>> If so,
>>> could someone provide any references/sample on how this file will be
>>> generated for a sample annotated sentence?
>> 
>> The training-data.libsvm file will not be generated during classification.
>> It's only generated during training. If you want to see what features are
>> generated during classification, take a look at RelationExtractorAnnotator.
>> Its List<RelationFeaturesExtractor> getFeatureExtractors() method defines
>> the various feature extractors used by the relation extractors.
>> 
>> Not sure if I answered your question. Please feel free to follow up.
>> 
>> Steve
