On Wednesday, May 18, 2016 at 6:58:03 AM UTC-7, Oguz Erkman wrote:
>
> Hello everyone,
>
> I am a novice programmer so please forgive me if my question is too simple.
>
> I created a module which finds lemmas of tokens in pretokenized text and
> writes them into a file and into the database.
> I got what I wanted in the output file but I had a problem while inserting
> the output into the database, it seems like it inserts an empty value after
> each lemma.
>
> My code is as follows:
>
> ...
> with io.open(outfile, 'w') as f:
>     for token in doc:
>         f.write(token.lemma_)
>         _id = db.en_lemmata_analysis.insert(lemmata=token.lemma_)
> ...
>
> I attached the resulting view to this email.
> Please help.
>
> Oz
>
I had to look that up in order to make sense of the visual sample. Is this the appropriate context?

    Stemming usually refers to a crude heuristic process that chops off the ends of words in the hope of achieving this goal correctly most of the time, and often includes the removal of derivational affixes. Lemmatization usually refers to doing things properly with the use of a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only and to return the base or dictionary form of a word, which is known as the lemma.
    <URL:http://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html>

It does look like you've lost the connection between your token and your lemma. It would help us if you showed your table definition, I think. Your insert line doesn't seem to be including the token itself; without a token, a NULL or 0-length string would be used as a placeholder.

I am not sure how you did your picture. Is it a screenshot of the appadmin view of the table? (such as from http://127.0.0.1:8000/lemmaloader/appadmin/select/db?query=db.en_lemmata_analysis.id%3E0 ... %3E encodes '>')

(Are you using the sqlite3 DB engine that came with web2py? That would influence how empty values are represented, but not the logic of the insert.)

/dps
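
To illustrate the point about the missing token, here is a minimal sketch. It uses Python's built-in sqlite3 module directly rather than the web2py DAL, and the table layout is only a guess, since the real definition of en_lemmata_analysis was not posted: the `token` column and the sample word pairs are assumptions made for the example. It shows that an insert naming only the lemma column leaves the other column NULL, while an insert that names both keeps token and lemma together.

```python
import sqlite3

# Hypothetical table layout: the real en_lemmata_analysis definition
# was not posted, so the "token" column is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE en_lemmata_analysis ("
    "id INTEGER PRIMARY KEY, token TEXT, lemmata TEXT)"
)

# Stand-in (token, lemma) pairs; in the original code these would
# come from the tokens in doc.
pairs = [("cats", "cat"), ("were", "be"), ("running", "run")]

# Insert that names only the lemma column: token is left NULL.
conn.execute(
    "INSERT INTO en_lemmata_analysis (lemmata) VALUES (?)", ("cat",)
)

# Insert that stores each token alongside its lemma.
conn.executemany(
    "INSERT INTO en_lemmata_analysis (token, lemmata) VALUES (?, ?)", pairs
)

rows = conn.execute(
    "SELECT token, lemmata FROM en_lemmata_analysis ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

The first row printed has None in the token position, which is how Python's sqlite3 module reports a NULL. Whether that is what the attached view shows depends on the actual table definition, which is why seeing it would help.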