Ben Wolfson wrote:
I've got a db some of whose elements have been created automatically
from filesystem data (whose encoding is iso-8859-1).  If I try to
select one of those elements using a standard SQL construct, things
work fine:
[...]

How can I get around this?  I really want to be able to search by
regexp, and not just the standard SQL %-pattern.

Looks like SQLite does not want to pass non-UTF-8 strings to user-defined functions. The attached script shows that it works with unicode and buffer (BLOB) parameters, but not with non-UTF-8 byte strings.

Text has to be encoded in UTF-8 in SQLite; it's just not usually enforced. Looks like SQLite enforces it here, though. Kind of ...
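A possible way around this, as a minimal sketch only: decode the ISO-8859-1 bytes to Unicode before they go into the database, so that SQLite stores valid UTF-8, and register a REGEXP function with create_function. The sketch is written for Python 3, where str is always Unicode. The table name `files` and the sample file name are hypothetical and not taken from the original database.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X), so the pattern comes first.
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


con = sqlite3.connect(":memory:")
con.create_function("regexp", 2, regexp)
con.execute("create table files (name text)")  # hypothetical table

# Hypothetical file name as raw ISO-8859-1 bytes read from the filesystem.
raw_name = b"caf\xe9.txt"

# Decode before binding, so the value is stored as proper UTF-8 text.
con.execute("insert into files (name) values (?)",
            (raw_name.decode("iso-8859-1"),))

rows = con.execute("select name from files where name regexp ?",
                   (r"^caf.\.txt$",)).fetchall()
print(rows)
```

Rows that are already stored with non-UTF-8 text would presumably need to be re-encoded first, or kept as BLOBs and decoded inside the function.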

-- Gerhard
import sqlite3 as sqlite
import re

def func(x):
    if x is not None:
        return "ok"
    else:
        return "input data did not arrive"

con = sqlite.connect(":memory:")
con.create_function("func", 1, func)
# A byte string that is valid ISO-8859-1 but not valid UTF-8.
raw_latin1 = unicode("\x86", "latin1").encode("latin1")
print repr(raw_latin1)
unicode_str = unicode(raw_latin1, "latin1")
as_buffer = buffer(raw_latin1)

def test(input_data):
    try:
        print "-" * 50
        print type(input_data)
        print con.execute("select func(?)", (input_data,)).fetchone()[0]
    except Exception, e:
        print "ERROR", e

test(raw_latin1)
test(unicode_str)
test(as_buffer)