On Thu, Jul 19, 2018 at 4:45 PM, Hoffman, Zachary Robert <zrhoff...@ku.edu> wrote:
> On Thu, 2018-07-19 at 22:35 +0200, Niklas Keller wrote:
> > Hey Rasmus
> >
> > I just found this bug: https://bugs.php.net/bug.php?id=76553
> >
> > Has this bug been like that before the migration, too? Or did
> > something go wrong?
>
> No, those used to be Unicode characters from the cyrillic block.

This appears to be database-related. Something got messed up with encodings on the mysql dump/import from MySQL 5.1.73 into MariaDB 10.1.26:

mysql> select sdesc from bugdb where id=76553;
+----------------------------------------------------------------------------------+
| sdesc                                                                            |
+----------------------------------------------------------------------------------+
| Имя переменной может содержать управляющие                                       |
+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sdesc                                                                                                                                                              |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Ð˜Ð¼Ñ Ð¿ÐµÑ€ÐµÐ¼ÐµÐ½Ð½Ð¾Ð¹ может Ñ Ð¾Ð´ÐµÑ€Ð¶Ð°Ñ‚ÑŒ ÑƒÐ¿Ñ€Ð°Ð²Ð»Ñ ÑŽÑ‰Ð¸Ðµ |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The dumped table schema from MySQL has:

DROP TABLE IF EXISTS `bugdb`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `bugdb` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `package_name` varchar(80) CHARACTER SET latin1 DEFAULT NULL,
  `bug_type` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Bug',
  `email` varchar(40) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `reporter_name` varchar(80) CHARACTER SET latin1 DEFAULT '',
  `sdesc` varchar(80) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `ldesc` text CHARACTER SET latin1 NOT NULL,
  `php_version` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `php_os` varchar(32) CHARACTER SET latin1 DEFAULT NULL,
  `status` varchar(16) CHARACTER SET latin1 DEFAULT NULL,
  `ts1` datetime DEFAULT NULL,
  `ts2` datetime DEFAULT NULL,
  `assign` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  `passwd` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
  `registered` tinyint(1) NOT NULL DEFAULT '0',
  `block_user_comment` char(1) DEFAULT 'N',
  `cve_id` varchar(15) DEFAULT NULL,
  `private` char(1) DEFAULT 'N',
  `visitor_ip` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `php_version` (`php_version`(1)),
  KEY `status` (`status`),
  KEY `package_name` (`package_name`),
  FULLTEXT KEY `email` (`email`,`sdesc`,`ldesc`)
) ENGINE=MyISAM AUTO_INCREMENT=76637 DEFAULT CHARSET=utf8 PACK_KEYS=1;
/*!40101 SET character_set_client = @saved_cs_client */;

When I dump it from MariaDB I see:

DROP TABLE IF EXISTS `bugdb`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `bugdb` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `package_name` varchar(80) CHARACTER SET latin1 DEFAULT NULL,
  `bug_type` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Bug',
  `email` varchar(40) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `reporter_name` varchar(80) CHARACTER SET latin1 DEFAULT '',
  `sdesc` varchar(80) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `ldesc` text CHARACTER SET latin1 NOT NULL,
  `php_version` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `php_os` varchar(32) CHARACTER SET latin1 DEFAULT NULL,
  `status` varchar(16) CHARACTER SET latin1 DEFAULT NULL,
  `ts1` datetime DEFAULT NULL,
  `ts2` datetime DEFAULT NULL,
  `assign` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  `passwd` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
  `registered` tinyint(1) NOT NULL DEFAULT '0',
  `block_user_comment` char(1) DEFAULT 'N',
  `cve_id` varchar(15) DEFAULT NULL,
  `private` char(1) DEFAULT 'N',
  `visitor_ip` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `php_version` (`php_version`(1)),
  KEY `status` (`status`),
  KEY `package_name` (`package_name`),
  FULLTEXT KEY `email` (`email`,`sdesc`,`ldesc`)
) ENGINE=MyISAM AUTO_INCREMENT=76650 DEFAULT CHARSET=utf8 PACK_KEYS=1;
/*!40101 SET character_set_client = @saved_cs_client */;

Other than the autoincrement they are identical. I normally use utf8mb4, but I figured I would play it safe and copy it over verbatim. I guess it wasn't safe. I'll do some research, but ideas welcome.

-Rasmus
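
The garbled MariaDB output above is consistent with a common failure mode, though nothing in this message confirms it: UTF-8 bytes sitting in columns declared `latin1` get read as latin1 (which in MySQL and MariaDB is effectively Windows-1252) and are then converted to UTF-8 a second time. The Python sketch below reproduces that effect on one word from the affected row and reverses it. The function names are hypothetical, and Python's `cp1252` codec only approximates the server's latin1: it rejects the five byte values that Windows-1252 leaves undefined, so text containing those bytes would need different handling.

```python
def double_encode(text: str) -> str:
    """Simulate the suspected corruption: misread UTF-8 bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_double_encode(garbled: str) -> str:
    """Reverse the misreading: recover the original bytes, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


# One word from the sdesc of bug 76553 (assumed example input).
word = "может"

# Each Cyrillic letter is two bytes in UTF-8, so the misread string
# has twice as many characters as the original.
garbled = double_encode(word)
print(len(word), len(garbled))

# True when the round trip is lossless for this input.
print(undo_double_encode(garbled) == word)
```

If this is what happened, the repair would have to be applied to the stored data itself (in SQL or during a re-import), and only to values that were actually double-encoded; the sketch only illustrates the byte-level transformation.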