I am trying to measure the time a trained NLTK HMM tagger needs to tag one sentence (or one file). I have written the code below; please suggest any revisions.
import nltk
from nltk.corpus.reader import TaggedCorpusReader
import time

# HMM: read the tagged corpus files
reader = TaggedCorpusReader('/python27/', r'.*\.pos')
f1 = reader.fileids()
print f1
sents = reader.tagged_sents()
ls = len(sents)
print "Total No of Sentences:", ls
train_sents = sents[0:40]
test_sents = sents[41:46]  # note: sents[40] is skipped; use sents[40:46] if that is unintended

# TRAINING & TESTING
hmm_tagger = nltk.HiddenMarkovModelTagger.train(train_sents)
test = hmm_tagger.test(test_sents)
appli_sent1 = reader.sents(fileids='minicv.pos')[0]
print "SAMPLE INPUT:", appli_sent1

# TIME CALCULATION
start_time = time.clock()
application = hmm_tagger.tag(appli_sent1)  # I MAY REPLACE WITH ONE DOCUMENT
print "ENTITY RECOGNIZED:", application
print "Time Taken Is:", time.clock() - start_time, "seconds"

NB: This is a toy example, and I did not pay much attention to the training/testing split sizes; my question is only about the time-calculation part. I know this is not a Machine Learning forum, but since many people here have deep knowledge of the subject, anyone is most welcome to give feedback that may improve my understanding. The code was pasted here from IDLE (Python 2.7 on MS Windows 7), so the indentation may have been mangled; apologies for that.
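On the timing itself: time.clock() behaves differently across platforms (on Windows it measures wall-clock time with high resolution, on Unix it measures CPU time), so time.time() is the more portable choice. Also, a single tag() call on one short sentence may finish too quickly for a meaningful reading, so averaging over many repetitions gives a more stable number. A minimal sketch, assuming hmm_tagger and appli_sent1 are already defined as above (the repetition count of 100 is an arbitrary choice of mine):

import time

runs = 100  # arbitrary repetition count, just for illustration
start_time = time.time()
for _ in range(runs):
    hmm_tagger.tag(appli_sent1)  # repeat the call being measured
elapsed = time.time() - start_time
print "Average time per sentence:", elapsed / runs, "seconds"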
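And since the code comment mentions replacing the single sentence with one document, here is a sketch of timing the tagger over every sentence of one file, reusing the 'minicv.pos' file name from the example above (untested, just to show the shape):

sents_in_file = reader.sents(fileids='minicv.pos')
start_time = time.time()
for sent in sents_in_file:
    hmm_tagger.tag(sent)  # tag each sentence of the document in turn
elapsed = time.time() - start_time
print "Tagged", len(sents_in_file), "sentences in", elapsed, "seconds"
print "Average per sentence:", elapsed / len(sents_in_file), "seconds"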