On Friday 9 Oct 2015 15:12 CEST, Peter Otten wrote: > Dennis Lee Bieber wrote: > >> On Fri, 09 Oct 2015 08:56:18 +0200, Cecil Westerhof >> <ce...@decebal.nl> declaimed the following: >> >>> My bad, I intended to mention that ORDER BY gives the wrong order >>> (é comes after z and with sort it comes after e), so that is why I >>> use the external sort command. >>> >> My books show how to define collation functions for SQLite3 -- from >> C; don't show if Python can interface to that side (custom >> aggregation functions are shown, however). >> >> Even if you have to use the external sort, you can still avoid one >> process level by not spawning the external SQLite3 process. Collect >> the data from the Python level and sort the output (possibly, as >> mentioned elsewhere, if the data fits in memory, you can sort using >> Python itself and avoid all external processes). >> >> AH! the help file (still Python 2.7 -- though a quick glance at the >> P3 has the same text) shows: >> >> -=-=-=-=- >> create_collation(name, callable) >> >> Creates a collation with the specified name and callable. The >> callable will be passed two string arguments. It should return -1 >> if the first is ordered lower than the second, 0 if they are >> ordered equal and 1 if the first is ordered higher than the second. >> Note that this controls sorting (ORDER BY in SQL) so your >> comparisons don’t affect other SQL operations. >> >> Note that the callable will get its parameters as Python >> bytestrings, which will normally be encoded in UTF-8. >> >> The following example shows a custom collation that sorts “the >> wrong way”: ... -=-=-=-=- >> >> So if one can write a short comparison function that gives the >> desired order, one can extend the SQL query with >> >> ... order by <field> collate <newFunction> > > Turns out such a function already exists; it's called > > locale.strcoll() > > Modified example from the docs: > > $ cat sqlite3_collation.py > import sqlite3 > import locale > > print("Using locale", locale.setlocale(locale.LC_ALL, "")) > > con = sqlite3.connect(":memory:") > con.create_collation("localized", locale.strcoll) > > cur = con.cursor() > cur.execute("create table test(x)") > cur.executemany( > "insert into test(x) values (?)", > "aäbAÄB") > cur.execute("select x from test order by x collate localized") > > for row in cur: > print(row) > con.close() > $ python3 sqlite3_collation.py > Using locale de_DE.UTF-8 > ('a',) > ('A',) > ('ä',) > ('Ä',) > ('b',) > ('B',) > $ PYTHONIOENCODING=UTF-8 LANG=C python3 sqlite3_collation.py > Using locale C > ('A',) > ('B',) > ('a',) > ('b',) > ('Ä',) > ('ä',)
Thanks very much. I now have in my initialisation: conn.create_collation("localized", locale.strcoll) cursor = conn.cursor() locale.setlocale(locale.LC_ALL, 'en_GB.UTF-8') I use: with open('spreekwoorden2.txt', 'w') as spreekwoorden: cursor.execute(export_spreekwoorden) for spreekwoord in cursor: spreekwoorden.write('%s\n' % spreekwoord) and export_spreekwoorden is defined as: export_spreekwoorden = ''' SELECT spreekwoord FROM spreekwoorden ORDER BY spreekwoord COLLATE LOCALIZED ''' -- Cecil Westerhof Senior Software Engineer LinkedIn: http://www.linkedin.com/in/cecilwesterhof -- https://mail.python.org/mailman/listinfo/python-list