Hi Hackers, PFA updated patch as the previous one was not working as expected. I have tried to make it similar to that of pgAdmin3 and you do not need to change client_encoding as it is set now based on server encoding. It works fine with "view data" also. The only problem is, I cannot find equivalent codec for wxConvLibc in python. The closest one I could find is raw_unicode_escape. So, in a SQL_ASCII database, non ASCII characters may differ in pgAdmin4 and pgAdmin3, but it will display results.
Dave, You need to add "E" before the string to be inserted, otherwise \x will be considered as a plain string. INSERT INTO sql_ascii (data) VALUES (E'[Invalid UTF-8] Blob: \xf4\xa5\xa3\xa5'); Kindly review. Thanks and Regards, Aditya Toshniwal Software Engineer | EnterpriseDB Software Solutions | Pune "Don't Complain about Heat, Plant a tree" On Tue, Jun 5, 2018 at 6:42 PM, Dave Page <dp...@pgadmin.org> wrote: > Hi > > On Tue, Jun 5, 2018 at 2:03 PM, Aditya Toshniwal <aditya.toshniwal@ > enterprisedb.com> wrote: > >> Hi >> >> On Tue, Jun 5, 2018 at 6:25 PM, Dave Page <dp...@pgadmin.org> wrote: >> >>> >>> >>> On Tue, Jun 5, 2018 at 1:49 PM, Aditya Toshniwal < >>> aditya.toshni...@enterprisedb.com> wrote: >>> >>>> Hi Dave, >>>> >>>> The problem of SQL ASCII is solved with the patch, and not related to >>>> setting the client encoding of the sql window. >>>> >>> >>> No it's not. It doesn't work for me as I said (and showed the example >>> of). >>> >> >> After setting the client_encoding to SQL_ASCII you got the output. >> Previously, it used to fail in the back end itself because python encoding >> failure. That is fixed. >> The error ERROR: invalid byte sequence for encoding "UTF8": 0x80 is >> thrown by postgres and not python or pgAdmin4. You will get the same error >> even if you >> connect from psql. >> > > Sure - but that is not a fix. You have no way of running the SET command > if you're using "view data" - and in the query tool, users just expect it > to work (as it did in pgAdmin 3). > > >> >>> >>>> I can see there is no SET call in pgAdmin3 for client_encoding. I can >>>> remove the SET client_encoding='UNICODE'; that will solve the problem. >>>> But, can you please let me know why that was added. >>>> >>> >>> There is, but it's inside an API call (PQsetClientEncoding): >>> >>> 300 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l300> >>> wxLogInfo(wxT("Setting client_encoding to '%s'") >>> , encoding.c_str()); >>> 301 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l301> >>> if (PQsetClientEncoding(conn, encoding.ToAscii())) >>> 302 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l302> >>> { >>> 303 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l303> >>> wxLogError(wxT("%s"), GetLastError().c_str()); >>> 304 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l304> >>> } >>> 305 >>> <https://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp#l305> >>> >>> Oops ! Missed that. Apologies. >> >>> >>> >>>> >>>> Will remove the set call and will send you the updated patch if >>>> everything works fine. >>>> >>> >>> No, we need to ensure the client encoding is set correctly. It just >>> needs to be set to SQL_ASCII if it's a SQL_ASCII database (I believe). >>> >>> >> Need to rework on the initialise method. Will come with an updated. >> patch. Sorry for trouble. >> >>> >>>> >>>> On Tue, Jun 5, 2018 at 6:05 PM, Dave Page <dp...@pgadmin.org> wrote: >>>> >>>>> Hi >>>>> >>>>> On Tue, Jun 5, 2018 at 1:21 PM, Aditya Toshniwal < >>>>> aditya.toshni...@enterprisedb.com> wrote: >>>>> >>>>>> Hi Dave, >>>>>> >>>>>> >>>>>> On Tue, Jun 5, 2018 at 4:56 PM, Dave Page <dp...@pgadmin.org> wrote: >>>>>> >>>>>>> Hi >>>>>>> >>>>>>> On Tue, Jun 5, 2018 at 9:50 AM, Aditya Toshniwal < >>>>>>> aditya.toshni...@enterprisedb.com> wrote: >>>>>>> >>>>>>>> Hi Hackers, >>>>>>>> >>>>>>>> PFA updated patch. The sqleditor change is sent separately and >>>>>>>> removed from current patch as suggested. >>>>>>>> The test cases were running fine when the module was specified >>>>>>>> using --pkg but were failing in complete run. Fixed that. >>>>>>>> >>>>>>> >>>>>>> I did a quick test by creating a SQL_ASCII database containing a >>>>>>> simple table: >>>>>>> >>>>>>> CREATE TABLE sql_ascii (id serial primary key, data text); >>>>>>> >>>>>>> And then populated it with data: >>>>>>> >>>>>>> /Library/PostgreSQL/9.4/bin/psql -d sql_ascii -U postgres -c >>>>>>> "INSERT INTO sql_acsii (data) VALUES ('[Windows-1252] Euro: \x80 >>>>>>> Double >>>>>>> dagger: \x87');" >>>>>>> /Library/PostgreSQL/9.4/bin/psql -d sql_ascii -U postgres -c >>>>>>> "INSERT INTO sql_ascii (data) VALUES ('[Latin-1] Yen: \xa5 Half: >>>>>>> \xbd');" >>>>>>> /Library/PostgreSQL/9.4/bin/psql -d sql_ascii -U postgres -c >>>>>>> "INSERT INTO sql_ascii (data) VALUES ('[Japanese] Ship: >>>>>>> \xe8\x88\xb9');" >>>>>>> /Library/PostgreSQL/9.4/bin/psql -d sql_ascii -U postgres -c >>>>>>> "INSERT INTO sql_ascii (data) VALUES ('[Invalid UTF-8] Blob: >>>>>>> \xf4\xa5\xa3\xa5');" >>>>>>> >>>>>>> I then right-clicked the table in the treeview, and selected the >>>>>>> option to view all rows, and immediately saw an error: >>>>>>> >>>>>>> 2018-06-05 12:23:27,319: SQL pgadmin: Execute (async) for server #1 >>>>>>> - CONN:1187535 (Query-id: 8522474): >>>>>>> SELECT * FROM public.sql_ascii >>>>>>> ORDER BY id ASC >>>>>>> 2018-06-05 12:23:27,320: ERROR pgadmin: Failed to execute query >>>>>>> (execute_async) for the server #1 - CONN:1187535(Query-id: 8522474): >>>>>>> Error Message:ERROR: invalid byte sequence for encoding "UTF8": 0x80 >>>>>>> SQL state: 22021 >>>>>>> >>>>>>> Running "SELECT * FROM sql_ascii" in the query tool resulted in the >>>>>>> same error, however, if I ran "SET client_encoding = 'SQL_ASCII';" >>>>>>> first, I >>>>>>> do see results. >>>>>>> >>>>>>> I have confirmed that I've restarted the server after applying the >>>>>>> patch. >>>>>>> >>>>>>> What am I missing? Why don't we just set the client_encoding to >>>>>>> SQL_ASCII if it's a SQL_ASCII database? >>>>>>> >>>>>> >>>>>> It is by default same as the server encoding. But, the following >>>>>> existing code in web/pgadmin/utils/driver/psycopg2/connection.py makes >>>>>> the client_encoding as UNICODE for every connection. I am not sure it >>>>>> should be removed. >>>>>> >>>>>> status = _execute(cur, "SET DateStyle=ISO;" >>>>>> >>>>>> "SET client_min_messages=notice;" >>>>>> >>>>>> "SET bytea_output=escape;" >>>>>> >>>>>> "SET client_encoding='UNICODE';") >>>>>> >>>>> >>>>> It was probably before you joined, but I have said a number of times >>>>> that pgAdmin 3 handled this differently and that maybe we should do it the >>>>> same way here. See https://git.postgresql.org >>>>> /gitweb/?p=pgadmin3.git;a=blob;f=pgadmin/db/pgConn.cpp, in the >>>>> pgConn::Initialize() function. >>>>> >>>>> Either way, your patch isn't working for me. >>>>> >>>>> >>>>>> >>>>>> >>>>>> Note that this testing was on Python 2.7.10 on MacOS. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Kindly review. >>>>>>>> >>>>>>>> Thanks and Regards, >>>>>>>> Aditya Toshniwal >>>>>>>> Software Engineer | EnterpriseDB Software Solutions | Pune >>>>>>>> "Don't Complain about Heat, Plant a tree" >>>>>>>> >>>>>>>> On Tue, Jun 5, 2018 at 10:15 AM, Aditya Toshniwal < >>>>>>>> aditya.toshni...@enterprisedb.com> wrote: >>>>>>>> >>>>>>>>> Hi >>>>>>>>> >>>>>>>>> On Tue, Jun 5, 2018 at 1:08 AM, Joao De Almeida Pereira < >>>>>>>>> jdealmeidapere...@pivotal.io> wrote: >>>>>>>>> >>>>>>>>>> Hello Aditya, >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> There is no change related to notifications in this patch. >>>>>>>>>>> The below code is minor fix related to connection status of sql >>>>>>>>>>> editor. Can you please share the code snippet if it is not the >>>>>>>>>>> below. >>>>>>>>>>> >>>>>>>>>>> - # Check for the asynchronous notifies statements. >>>>>>>>>>> - conn.check_notifies(True) >>>>>>>>>>> - notifies = conn.get_notifies() >>>>>>>>>>> + if status is not None: >>>>>>>>>>> + # Check for the asynchronous notifies statements. >>>>>>>>>>> + conn.check_notifies(True) >>>>>>>>>>> + notifies = conn.get_notifies() >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> This is a minor fix, but is it related to querying SQL_ASCII >>>>>>>>>> database? >>>>>>>>>> >>>>>>>>> No its not. It is something I found when I was working on >>>>>>>>> SQL_ASCII related changes. >>>>>>>>> Well then, will send a separate patch for it. >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> Victoria && Joao >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Dave Page >>>>>>> Blog: http://pgsnake.blogspot.com >>>>>>> Twitter: @pgsnake >>>>>>> >>>>>>> EnterpriseDB UK: http://www.enterprisedb.com >>>>>>> The Enterprise PostgreSQL Company >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Dave Page >>>>> Blog: http://pgsnake.blogspot.com >>>>> Twitter: @pgsnake >>>>> >>>>> EnterpriseDB UK: http://www.enterprisedb.com >>>>> The Enterprise PostgreSQL Company >>>>> >>>> >>>> >>> >>> >>> -- >>> Dave Page >>> Blog: http://pgsnake.blogspot.com >>> Twitter: @pgsnake >>> >>> EnterpriseDB UK: http://www.enterprisedb.com >>> The Enterprise PostgreSQL Company >>> >> >> > > > -- > Dave Page > Blog: http://pgsnake.blogspot.com > Twitter: @pgsnake > > EnterpriseDB UK: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
diff --git a/web/pgadmin/tools/sqleditor/tests/test_encoding_charset.py b/web/pgadmin/tools/sqleditor/tests/test_encoding_charset.py new file mode 100644 index 0000000..0826dec --- /dev/null +++ b/web/pgadmin/tools/sqleditor/tests/test_encoding_charset.py @@ -0,0 +1,103 @@ +########################################################################## +# +# pgAdmin 4 - PostgreSQL Tools +# +# Copyright (C) 2013 - 2018, The pgAdmin Development Team +# This software is released under the PostgreSQL Licence +# +########################################################################## + +from pgadmin.utils.route import BaseTestGenerator +from pgadmin.browser.server_groups.servers.databases.tests import utils as \ + database_utils +from regression import parent_node_dict +from regression.python_test_utils import test_utils +import json + + +class TestEncodingCharset(BaseTestGenerator): + """ + This class validates character support in pgAdmin4 for + different PostgresDB encodings + """ + scenarios = [ + ( + 'With Encoding UTF8', + dict( + db_encoding='UTF8', + lc_collate='C', + test_str='A' + )), + ( + 'With Encoding WIN1252', + dict( + db_encoding='WIN1252', + lc_collate='C', + test_str='A' + )), + ( + 'With Encoding EUC_CN', + dict( + db_encoding='EUC_CN', + lc_collate='C', + test_str='A' + )), + ( + 'With Encoding SQL_ASCII', + dict( + db_encoding='SQL_ASCII', + lc_collate='C', + test_str='\\255' + )), + ] + + def setUp(self): + self.encode_db_name = 'encoding_' + self.db_encoding + self.encode_sid = self.server_information['server_id'] + self.encode_did = test_utils.create_database( + self.server, self.encode_db_name, + (self.db_encoding, self.lc_collate)) + + def runTest(self): + + db_con = database_utils.connect_database(self, + test_utils.SERVER_GROUP, + self.encode_sid, + self.encode_did) + if not db_con["info"] == "Database connected.": + raise Exception("Could not connect to the database.") + + # Initialize query tool + url = '/datagrid/initialize/query_tool/{0}/{1}/{2}'.format( + test_utils.SERVER_GROUP, self.encode_sid, self.encode_did) + response = self.tester.post(url) + self.assertEquals(response.status_code, 200) + + response_data = json.loads(response.data.decode('utf-8')) + self.trans_id = response_data['data']['gridTransId'] + + # Check character + url = "/sqleditor/query_tool/start/{0}".format(self.trans_id) + sql = "select E'{0}';".format(self.test_str) + response = self.tester.post(url, data=json.dumps({"sql": sql}), + content_type='html/json') + self.assertEquals(response.status_code, 200) + url = '/sqleditor/poll/{0}'.format(self.trans_id) + response = self.tester.get(url) + self.assertEquals(response.status_code, 200) + response_data = json.loads(response.data.decode('utf-8')) + self.assertEquals(response_data['data']['rows_fetched_to'], 1) + + database_utils.disconnect_database(self, self.encode_sid, + self.encode_did) + + def tearDown(self): + main_conn = test_utils.get_db_connection( + self.server['db'], + self.server['username'], + self.server['db_password'], + self.server['host'], + self.server['port'], + self.server['sslmode'] + ) + test_utils.drop_database(main_conn, self.encode_db_name) diff --git a/web/pgadmin/utils/driver/psycopg2/connection.py b/web/pgadmin/utils/driver/psycopg2/connection.py index cfd161a..e2538d1 100644 --- a/web/pgadmin/utils/driver/psycopg2/connection.py +++ b/web/pgadmin/utils/driver/psycopg2/connection.py @@ -33,7 +33,7 @@ from pgadmin.utils import get_complete_file_path from ..abstract import BaseDriver, BaseConnection from .cursor import DictCursor from .typecast import register_global_typecasters, \ - register_string_typecasters, register_binary_typecasters, \ + register_binary_typecasters, \ register_array_to_string_typecasters, ALL_JSON_TYPES @@ -387,8 +387,6 @@ class Connection(BaseConnection): else: self.conn.autocommit = True - register_string_typecasters(self.conn) - if self.array_to_string: register_array_to_string_typecasters(self.conn) @@ -397,10 +395,19 @@ class Connection(BaseConnection): if self.use_binary_placeholder: register_binary_typecasters(self.conn) - status = _execute(cur, "SET DateStyle=ISO;" - "SET client_min_messages=notice;" - "SET bytea_output=escape;" - "SET client_encoding='UNICODE';") + if self.conn.encoding not in ('SQL_ASCII', 'SQLASCII', + 'MULE_INTERNAL', 'MULEINTERNAL'): + status = _execute(cur, "SET DateStyle=ISO;" + "SET client_min_messages=notice;" + "SET bytea_output=escape;" + "SET client_encoding='UNICODE';") + + encodings[self.conn.encoding] = 'utf-8' + else: + status = _execute(cur, "SET DateStyle=ISO;" + "SET client_min_messages=notice;" + "SET bytea_output=escape;") + encodings[self.conn.encoding] = 'raw_unicode_escape' if status is not None: self.conn.close() diff --git a/web/pgadmin/utils/driver/psycopg2/typecast.py b/web/pgadmin/utils/driver/psycopg2/typecast.py index f136604..a8f6c38 100644 --- a/web/pgadmin/utils/driver/psycopg2/typecast.py +++ b/web/pgadmin/utils/driver/psycopg2/typecast.py @@ -163,49 +163,6 @@ def register_global_typecasters(): psycopg2.extensions.register_type(pg_array_types_to_array_of_string_type) -def register_string_typecasters(connection): - if connection.encoding != 'UTF8': - # In python3 when database encoding is other than utf-8 and client - # encoding is set to UNICODE then we need to map data from database - # encoding to utf-8. - # This is required because when client encoding is set to UNICODE then - # psycopg assumes database encoding utf-8 and not the actual encoding. - # Not sure whether it's bug or feature in psycopg for python3. - if sys.version_info >= (3,): - def return_as_unicode(value, cursor): - if value is None: - return None - # Treat value as byte sequence of database encoding and then - # decode it as utf-8 to get correct unicode value. - return bytes( - value, encodings[cursor.connection.encoding] - ).decode('utf-8') - - unicode_type = psycopg2.extensions.new_type( - # "char", name, text, character, character varying - (19, 18, 25, 1042, 1043, 0), - 'UNICODE', return_as_unicode) - else: - def return_as_unicode(value, cursor): - if value is None: - return None - # Decode it as utf-8 to get correct unicode value. - return value.decode('utf-8') - - unicode_type = psycopg2.extensions.new_type( - # "char", name, text, character, character varying - (19, 18, 25, 1042, 1043, 0), - 'UNICODE', return_as_unicode) - - unicode_array_type = psycopg2.extensions.new_array_type( - # "char"[], name[], text[], character[], character varying[] - (1002, 1003, 1009, 1014, 1015, 0 - ), 'UNICODEARRAY', unicode_type) - - psycopg2.extensions.register_type(unicode_type) - psycopg2.extensions.register_type(unicode_array_type) - - def register_binary_typecasters(connection): psycopg2.extensions.register_type( psycopg2.extensions.new_type( diff --git a/web/regression/python_test_utils/test_utils.py b/web/regression/python_test_utils/test_utils.py index 3e517b6..464a09e 100644 --- a/web/regression/python_test_utils/test_utils.py +++ b/web/regression/python_test_utils/test_utils.py @@ -116,7 +116,7 @@ def clear_node_info_dict(): del node_info_dict[node][:] -def create_database(server, db_name): +def create_database(server, db_name, encoding=None): """This function used to create database and returns the database id""" try: connection = get_db_connection( @@ -130,8 +130,14 @@ def create_database(server, db_name): old_isolation_level = connection.isolation_level connection.set_isolation_level(0) pg_cursor = connection.cursor() - pg_cursor.execute( - '''CREATE DATABASE "%s" TEMPLATE template0''' % db_name) + if encoding is None: + pg_cursor.execute( + '''CREATE DATABASE "%s" TEMPLATE template0''' % db_name) + else: + pg_cursor.execute( + '''CREATE DATABASE "%s" TEMPLATE template0 + ENCODING='%s' LC_COLLATE='%s' LC_CTYPE='%s' ''' % + (db_name, encoding[0], encoding[1], encoding[1])) connection.set_isolation_level(old_isolation_level) connection.commit()