On Thu, Jul 19, 2018 at 4:45 PM, Hoffman, Zachary Robert <zrhoff...@ku.edu>
wrote:

> On Thu, 2018-07-19 at 22:35 +0200, Niklas Keller wrote:
> > Hey Rasmus
> >
> > I just found this bug: https://bugs.php.net/bug.php?id=76553
> >
> > Was this bug report already like that before the migration, or did
> > something go wrong?
>
> No, those used to be Unicode characters from the Cyrillic block.


This appears to be database-related. Something went wrong with the
encodings during the dump/import from MySQL 5.1.73 into MariaDB 10.1.26:

mysql> select sdesc from bugdb where id=76553;
+----------------------------------------------------------------------------------+
| sdesc                                                                            |
+----------------------------------------------------------------------------------+
| Имя переменной может содержать управляющие |
+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB [phpbugsdb]> select sdesc from bugdb where id=76553;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sdesc                                                                                                                                                              |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Ð˜Ð¼Ñ Ð¿ÐµÑ€ÐµÐ¼ÐµÐ½Ð½Ð¾Ð¹ может Ñ Ð¾Ð´ÐµÑ€Ð¶Ð°Ñ‚ÑŒ управлÑ
ющие
                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
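
The garbled output above has the typical look of UTF-8 bytes being read as
Latin-1 and then converted to UTF-8 a second time. The following is only a
minimal sketch of that mechanism in Python, not a diagnosis of what the
dump/import actually did. One assumption to keep in mind: Python's "latin-1"
codec is ISO-8859-1, while MySQL's latin1 is closer to cp1252, so the glyphs
shown for bytes 0x80-0x9F can differ from what the MariaDB client printed.

```python
# Sketch: reproduce the garbling by decoding UTF-8 bytes as Latin-1.
# "latin-1" here is Python's ISO-8859-1 codec, used as a stand-in for
# MySQL's latin1 (an assumption; the two differ for bytes 0x80-0x9F).

original = "Имя"                           # first word of the sdesc value
utf8_bytes = original.encode("utf-8")      # the bytes a UTF-8 client sends
garbled = utf8_bytes.decode("latin-1")     # each byte read as one character
double_encoded = garbled.encode("utf-8")   # those characters encoded again

print(utf8_bytes.hex(" "))                 # raw bytes of the original text
print(repr(garbled))                       # one character per original byte
print(len(utf8_bytes), len(double_encoded))
```

Every Cyrillic letter takes two bytes in UTF-8, and each of those bytes
becomes a two-byte character after the second conversion, which would also
explain why the MariaDB result column is about twice as wide.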

The dumped table schema from MySQL has:

DROP TABLE IF EXISTS `bugdb`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `bugdb` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `package_name` varchar(80) CHARACTER SET latin1 DEFAULT NULL,
  `bug_type` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Bug',
  `email` varchar(40) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `reporter_name` varchar(80) CHARACTER SET latin1 DEFAULT '',
  `sdesc` varchar(80) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `ldesc` text CHARACTER SET latin1 NOT NULL,
  `php_version` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `php_os` varchar(32) CHARACTER SET latin1 DEFAULT NULL,
  `status` varchar(16) CHARACTER SET latin1 DEFAULT NULL,
  `ts1` datetime DEFAULT NULL,
  `ts2` datetime DEFAULT NULL,
  `assign` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  `passwd` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
  `registered` tinyint(1) NOT NULL DEFAULT '0',
  `block_user_comment` char(1) DEFAULT 'N',
  `cve_id` varchar(15) DEFAULT NULL,
  `private` char(1) DEFAULT 'N',
  `visitor_ip` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `php_version` (`php_version`(1)),
  KEY `status` (`status`),
  KEY `package_name` (`package_name`),
  FULLTEXT KEY `email` (`email`,`sdesc`,`ldesc`)
) ENGINE=MyISAM AUTO_INCREMENT=76637 DEFAULT CHARSET=utf8 PACK_KEYS=1;
/*!40101 SET character_set_client = @saved_cs_client */;

When I dump it from MariaDB I see:

DROP TABLE IF EXISTS `bugdb`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `bugdb` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `package_name` varchar(80) CHARACTER SET latin1 DEFAULT NULL,
  `bug_type` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Bug',
  `email` varchar(40) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `reporter_name` varchar(80) CHARACTER SET latin1 DEFAULT '',
  `sdesc` varchar(80) CHARACTER SET latin1 NOT NULL DEFAULT '',
  `ldesc` text CHARACTER SET latin1 NOT NULL,
  `php_version` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
  `php_os` varchar(32) CHARACTER SET latin1 DEFAULT NULL,
  `status` varchar(16) CHARACTER SET latin1 DEFAULT NULL,
  `ts1` datetime DEFAULT NULL,
  `ts2` datetime DEFAULT NULL,
  `assign` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  `passwd` varchar(64) CHARACTER SET latin1 DEFAULT NULL,
  `registered` tinyint(1) NOT NULL DEFAULT '0',
  `block_user_comment` char(1) DEFAULT 'N',
  `cve_id` varchar(15) DEFAULT NULL,
  `private` char(1) DEFAULT 'N',
  `visitor_ip` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `php_version` (`php_version`(1)),
  KEY `status` (`status`),
  KEY `package_name` (`package_name`),
  FULLTEXT KEY `email` (`email`,`sdesc`,`ldesc`)
) ENGINE=MyISAM AUTO_INCREMENT=76650 DEFAULT CHARSET=utf8 PACK_KEYS=1;
/*!40101 SET character_set_client = @saved_cs_client */;

Other than the AUTO_INCREMENT value, they are identical. I normally use
utf8mb4, but I figured I would play it safe and copy it over verbatim. I
guess it wasn't safe. I'll do some research, but ideas are welcome.
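
One idea, sketched in Python under the assumption that the affected values
are UTF-8 text that went through one extra Latin-1 to UTF-8 conversion: try
to reverse that conversion and keep the original string whenever the
reversal is not clean. The function name undo_double_encoding is
hypothetical, and the real fix may well belong in the connection or dump
settings rather than in the data.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one Latin-1 -> UTF-8 double encoding, if the text has one.

    Hypothetical helper for illustration. Returns the input unchanged when
    it does not round-trip cleanly, so correct strings are left alone.
    """
    try:
        # Turn each character back into the single byte it came from,
        # then interpret the resulting byte string as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Characters above U+00FF, or bytes that are not valid UTF-8:
        # the text was not double-encoded in this way.
        return text


# Build a double-encoded sample from known-good text, then repair it.
sample = "Имя переменной".encode("utf-8").decode("latin-1")
print(undo_double_encoding(sample))
print(undo_double_encoding("Bug"))
```

The guard matters because a table like this would hold a mix of plain
ASCII, genuine Latin-1 text, and double-encoded text, and a blind conversion
could damage the rows that are already fine.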

-Rasmus
