2011/4/16 Karen McNeil <karenlmcn...@gmail.com>

> I have the following view set up:
>
> def concord(request):
>    searchterm = request.GET['q']
>    ... more stuff ...
>    return render_to_response('concord.html', locals())
>
> With URL "http://mysite.com/concord/?q=برشه", and template code
>     <p>Your search for {{ searchterm }} returned {{ results }}
> results in {{ texts.count }} texts.</p>
> I get this result on the page:
>    Your search for برشه returned 0 results in 8 texts.
>
> The search term (برشه) is being passed successfully, but there should
> be 13 results, not zero.  When I hard-code "searchterm = 'برشه' " into
> the concord view, instead of "searchterm = request.GET['q']", the page
> displays perfectly.
>

With this line of code:

"searchterm = 'برشه' "

searchterm will be a bytestring, utf-8 encoded if that is the encoding of
your file.

With this:

"searchterm = request.GET['q']"

it will be unicode. Django returns unicode from the DB and request
dictionaries.

I notice you are explicitly encoding to utf-8 the texts you are searching
before passing them into the nltk code. However, you never do this for the
search term, and I'd guess that is why the results differ when you
hard-code it vs. pulling it from request.GET.

Can't you pass unicode to the nltk code? If you can, I'd get rid of the
explicit encoding to utf-8 of the DB content you are searching. If you
really must pass it bytestrings instead of unicode, then explicitly
encoding the search term to utf-8 as well will probably fix the problem.
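
To illustrate the mismatch, here is a minimal sketch. It does not use nltk
or your actual view code: count_matches and the token lists are
hypothetical stand-ins for whatever does the comparison in your search.
It only shows that a unicode search term never compares equal to utf-8
encoded bytestrings, and that making both sides the same type (either
one) gives the expected count.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the search step; not the nltk API.
def count_matches(term, tokens):
    """Count tokens that compare equal to the search term."""
    return sum(1 for token in tokens if token == term)

# What request.GET['q'] gives you: a unicode (text) string.
term_text = u'برشه'

# Text as Django returns it from the DB: unicode.
tokens_text = [u'برشه', u'كتاب', u'برشه']

# The same text after an explicit encode to utf-8: bytestrings.
tokens_bytes = [token.encode('utf-8') for token in tokens_text]

# Unicode term against bytestring tokens: nothing compares equal.
mismatch = count_matches(term_text, tokens_bytes)

# Option 1: encode the search term as well, so both sides are bytestrings.
fixed_by_encoding = count_matches(term_text.encode('utf-8'), tokens_bytes)

# Option 2: drop the encoding step and keep both sides unicode.
fixed_by_text = count_matches(term_text, tokens_text)

print(mismatch, fixed_by_encoding, fixed_by_text)
```

Here the first count is 0 and the other two are 2. Which option is right
for you depends on whether the nltk code you call accepts unicode.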

Karen
-- 
http://tracey.org/kmt/
