错误的字符串值: “ xF0 x9F x8E xB6 xF0 x9F...”MySQL

我试图存储在我的 MYSQL 表中的 tweet。 推特是:

我希望你能听我说话,不要胡言乱语,不要轻言放弃,每天晚上你都会做一个梦,这个梦是我生命中的一部分

最后两个字符都是 《多重音符》(U + 1 F3B6),其 UTF-8编码是 0xf09f8eb6

表中的 tweet_text字段用 utf8mb4编码。但是,当我试图将 tweet 存储在该列中时,会得到以下错误消息:

错误的字符串值: 第1行的“ tweet _ text”列为“ xF0 x9F x8E xB6 xF0 x9F...”。

出什么问题了?我该怎么补救?我还需要存储多种语言,这个字符集适用于所有语言,但不适用于像表情符号和表情符号这样的特殊字符。

这是我的 create table 语句:

CREATE TABLE `twitter_status_data` (
`unique_status_id` bigint(20) NOT NULL AUTO_INCREMENT,
`metadata_result_type` text CHARACTER SET utf8,
`created_at` text CHARACTER SET utf8 NOT NULL COMMENT 'UTC time when this Tweet was    created.',
`id` bigint(20) unsigned NOT NULL COMMENT 'Unique tweet identifier',
`id_str` text CHARACTER SET utf8 NOT NULL,
`tweet_text` text COMMENT 'Actual UTF-8 text',
`user_id_str` text CHARACTER SET utf8,
`user_name` text COMMENT 'User''s name',
`user_screen_name` text COMMENT 'Twitter handle',
`coordinates` text CHARACTER SET utf8,
PRIMARY KEY (`unique_status_id`),
KEY `user_id_index` (`user_id`),
FULLTEXT KEY `tweet_text_index` (`tweet_text`)
) ENGINE=InnoDB AUTO_INCREMENT=82451 DEFAULT CHARSET=utf8mb4;
155125 次浏览

I was finally able to figure out the issue. I had to change some settings in mysql configuration my.ini This article helped a lot http://mathiasbynens.be/notes/mysql-utf8mb4#character-sets

First i changed the character set in my.ini to utf8mb4 Next i ran the following commands in mysql client

SET NAMES utf8mb4;
ALTER DATABASE dreams_twitter CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci;

Use the following command to check that the changes are made

SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';

I had hit the same problem and learnt the following-

Even though database has a default character set of utf-8, it's possible for database columns to have a different character set in MySQL. Modified dB and the problematic column to UTF-8:

mysql> ALTER DATABASE MyDB CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci'


mysql> ALTER TABLE database.table MODIFY COLUMN column_name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;

Now creating new tables with:

> CREATE TABLE My_Table_Name (
twitter_id_str VARCHAR(255) NOT NULL UNIQUE,
twitter_screen_name VARCHAR(512) CHARACTER SET utf8 COLLATE utf8_unicode_ci,
.....
) CHARACTER SET utf8 COLLATE utf8_unicode_ci;

It may be obvious, but it still was surprising to me, that SET NAMES utf8 is not compatible with utf8mb4 encoding. So for some apps changing table/column encoding was not enough. I had to change encoding in app configuration.

Redmine (ruby, ROR)

In config/database.yml:

production:
adapter: mysql2
database: redmine
host: localhost
username: redmine
password: passowrd
encoding: utf8mb4

Custom Yii application (PHP)

In config/db.php:

return [
'class' => yii\db\Connection::class,
'dsn' => 'mysql:host=localhost;dbname=yii',
'username' => 'yii',
'password' => 'password',
'charset' => 'utf8mb4',
],

If you have utf8mb4 as a column/table encoding and still getting errors like this, make sure that you have configured correct charset for DB connection in your application.

According to the create table statement, the default charset of the table is already utf8mb4. It seems that you have a wrong connection charset.

In Java, set the datasource url like this:

jdbc:mysql://127.0.0.1:3306/testdb?useUnicode=true&characterEncoding=utf-8`.

?useUnicode=true&characterEncoding=utf-8 is necessary for using utf8mb4.

It works for my application.

FOR SQLALCHEMY AND PYTHON

The encoding used for Unicode has traditionally been 'utf8'. However, for MySQL versions 5.5.3 on forward, a new MySQL-specific encoding 'utf8mb4' has been introduced, and as of MySQL 8.0 a warning is emitted by the server if plain utf8 is specified within any server-side directives, replaced with utf8mb3. The rationale for this new encoding is due to the fact that MySQL’s legacy utf-8 encoding only supports codepoints up to three bytes instead of four. Therefore, when communicating with a MySQL database that includes codepoints more than three bytes in size, this new charset is preferred, if supported by both the database as well as the client DBAPI, as in:

e = create_engine(
"mysql+pymysql://scott:tiger@localhost/test?charset=utf8mb4")
All modern DBAPIs should support the utf8mb4 charset.

enter link description here

Change database charset and collation

ALTER DATABASE
database_name
CHARACTER SET = utf8mb4
COLLATE = utf8mb4_unicode_ci;

change specific table's charset and collation

ALTER TABLE
table_name
CONVERT TO CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

change connection charset in mysql driver

before

charset=utf8&parseTime=True&loc=Local

after

charset=utf8mb4&collation=utf8mb4_unicode_ci&parseTime=True&loc=Local

From this article https://hackernoon.com/today-i-learned-storing-emoji-to-mysql-with-golang-204a093454b7

I had use an emoji in my string that was the reason for this error.

So make sure you are not using some incorrect string that is not valid to save into the database.

As others said, it's because you are trying to save a 4 bytes of data into less space.

If you are facing the similar issue in java and don't have the flexibility to change the charset and collate encoding of database than this answer is for you.

you can use the Emoji Java library to achieve the same. You can convert into alias before saving/updating into database and convert back to unicode post save/update/load from database. The main benefit is readability of the text even after the encoding because this library only alias the emoji's rather than whole string.

I changed MySQL settings and still the same. Finally I used the function utf8_decode() on the string before insert.