On Wednesday, May 18, 2016 at 6:58:03 AM UTC-7, Oguz Erkman wrote:
>
> Hello everyone,
>
> I am a novice programmer so please forgive me if my question is too simple.
>
> I created a module which finds lemmas of tokens in pretokenized text and 
> writes them into a file and into the database.
> I got what I wanted in the output file but I had a problem while inserting 
> the output into the database, it seems like it inserts an empty value after 
> each lemma.
>
> My code is as follows:
>
> ...
>         with io.open(outfile, 'w') as f:
>             for token in doc:
>                 f.write(token.lemma_)
>                 _id = db.en_lemmata_analysis.insert(lemmata=token.lemma_)
> ...
>
> I attached the resulting view to this email.
> Please help.
>
> Oz
>

I had to look up "lemma" in order to make sense of the visual sample.  Is 
this the appropriate context? 

Stemming usually refers to a crude heuristic process that chops off the 
ends of words in the hope of achieving this goal correctly most of the 
time, and often includes the removal of derivational affixes. Lemmatization 
usually refers to doing things properly with the use of a vocabulary and 
morphological analysis of words, normally aiming to remove inflectional 
endings only and to return the base or dictionary form of a word, which is 
known as the lemma.


<URL:http://nlp.stanford.edu/IR-book/html/htmledition/stemming-and-lemmatization-1.html>

It does look like you've lost the connection between your token and your 
lemma.  It would help us if you showed your table definition, I think.  
Your insert line passes only lemmata=token.lemma_ and not the token itself; 
if the table also has a field for the token, that field would get a NULL or 
0-length string as a placeholder.  I am not sure how you produced your 
picture.  Is it a screenshot of the appadmin view of the table?
(such as from 
http://127.0.0.1:8000/lemmaloader/appadmin/select/db?query=db.en_lemmata_analysis.id%3E0 
... %3E encodes '>')
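
In case the encoding in that URL is unclear, here is a small sketch of how 
the query part could be built with Python's standard library.  The app name 
"lemmaloader" is only my guess at what your application is called; the host 
and port are the usual local web2py defaults.

```python
from urllib.parse import quote, unquote

# The DAL query as you would type it in appadmin.
query = "db.en_lemmata_analysis.id>0"

# Percent-encode characters that are not safe in a URL; '>' becomes '%3E'.
encoded = quote(query)

# "lemmaloader" is an assumed application name, not taken from your post.
url = "http://127.0.0.1:8000/lemmaloader/appadmin/select/db?query=" + encoded
print(url)

# Decoding gives back the original query text.
print(unquote(encoded))
```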

(Are you using the sqlite3 DB engine that came with web2py?  That would 
influence how empty values are represented, but not the logic of the 
insert.)
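
To illustrate what I mean about the placeholder, here is a minimal sketch 
using plain sqlite3 from the Python standard library rather than the web2py 
DAL.  The table layout (a "token" column next to "lemmata") and the sample 
word pairs are assumptions of mine, since I have not seen your table 
definition; they stand in for whatever your tokenizer produces.

```python
import sqlite3

# Hypothetical schema: the column names are guesses, not your real table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE en_lemmata_analysis ("
    "id INTEGER PRIMARY KEY, token TEXT, lemmata TEXT)"
)

# Made-up (token, lemma) pairs standing in for the tokenizer output.
pairs = [("cats", "cat"), ("were", "be"), ("running", "run")]

# Supplying only the lemma leaves the token column as NULL.
conn.execute("INSERT INTO en_lemmata_analysis (lemmata) VALUES (?)", ("cat",))

# Supplying both values keeps each token connected to its lemma.
conn.executemany(
    "INSERT INTO en_lemmata_analysis (token, lemmata) VALUES (?, ?)", pairs
)

rows = conn.execute(
    "SELECT token, lemmata FROM en_lemmata_analysis ORDER BY id"
).fetchall()
for row in rows:
    print(row)  # the first row shows None (NULL) for the missing token
```

If your table does have a token field, the web2py equivalent would be to 
pass both values in the same insert() call, so that each row carries the 
token together with its lemma.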

/dps



-- 
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
--- 
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.
