2011/4/16 Karen McNeil <karenlmcn...@gmail.com>

> I have the following view set up:
>
>     def concord(request):
>         searchterm = request.GET['q']
>         ... more stuff ...
>         return render_to_response('concord.html', locals())
>
> With URL "http://mysite.com/concord/?q=برشه", and template code
>
>     <p>Your search for {{ searchterm }} returned {{ results }}
>     results in {{ texts.count }} texts.</p>
>
> I get this result on the page:
>
>     Your search for برشه returned 0 results in 8 texts.
>
> The search term (برشه) is being passed successfully, but there should
> be 13 results, not zero. When I hard-code "searchterm = 'برشه' " into
> the concord view, instead of "searchterm = request.GET['q']", the page
> displays perfectly.
>
With this line of code:

    searchterm = 'برشه'

searchterm will be a bytestring, utf-8 encoded if that is the encoding of your file. With this:

    searchterm = request.GET['q']

it will be unicode. Django returns unicode from the DB and request dictionaries.

I notice you are explicitly encoding to utf-8 the texts you are searching before passing them into the nltk code. However, you never do this for the search term, and I'd guess that is why the results differ when you hard-code it vs. pulling it from request.GET: a unicode search term will not compare equal to utf-8 bytestrings containing non-ASCII characters, so nothing matches.

Can't you pass unicode to the nltk code? If you can, I'd get rid of the explicit encoding to utf-8 of the DB content you are searching. If you really must pass it bytestrings instead of unicode, then explicitly encoding the search term to utf-8 as well will probably fix the problem.

Karen
--
http://tracey.org/kmt/
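To illustrate the mismatch, here is a minimal sketch. It is not the code from the concord view: count_matches and the token lists are hypothetical stand-ins for whatever the nltk-based search does, and the only assumption is that the search boils down to comparing the term against tokens. It shows that a unicode term finds nothing among utf-8 encoded tokens, while either of the two fixes (encode the term as well, or keep everything unicode) finds the matches.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch: names and data are made up for illustration.

def count_matches(term, tokens):
    """Count the tokens that compare equal to the search term."""
    return sum(1 for token in tokens if token == term)

# Unicode text, as Django hands it over from the DB and request.GET.
tokens_unicode = [u'برشه', u'كلمة', u'برشه']
term_unicode = u'برشه'

# The same data explicitly encoded to utf-8 bytestrings.
tokens_bytes = [token.encode('utf-8') for token in tokens_unicode]
term_bytes = term_unicode.encode('utf-8')

# Mixed types: a unicode term never equals an encoded token.
mixed = count_matches(term_unicode, tokens_bytes)

# Fix 1: encode the search term too, so both sides are bytestrings.
both_bytes = count_matches(term_bytes, tokens_bytes)

# Fix 2: drop the explicit encoding and compare unicode to unicode.
both_unicode = count_matches(term_unicode, tokens_unicode)

print(mixed, both_bytes, both_unicode)
```

In the view itself, the first fix would amount to something like searchterm = request.GET['q'].encode('utf-8'), assuming the rest of the code really does need bytestrings.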