Displaying Unicode symbols in HTML

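I simply want to display the tick (✔) and cross symbols in an HTML page, but they show up as either a box or garbled characters, which obviously has something to do with the encoding.

I have set the meta tag to declare UTF-8, but I am obviously missing something.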

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

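Edit/Solution: Following the comments and using Firebug, I found that the header sent with the page was in fact "Content-Type: text/html", with no UTF-8 charset. Checking the file format in Notepad++ showed that my file was saved as "UTF-8 without BOM". After changing this to plain UTF-8, the symbols now display correctly... but Firebug still seems to report the same content type.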

You should ensure the HTTP server headers are correct.

In particular, the header:

Content-Type: text/html; charset=utf-8

should be present.

The meta tag is ignored by browsers if the HTTP header is present.
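
To see which charset a given Content-Type value actually declares, you can parse it. This is only a minimal sketch using Python's standard library; the function name `declared_charset` is made up for illustration and is not part of any tool mentioned in this thread.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


print(declared_charset("text/html; charset=utf-8"))  # utf-8
print(declared_charset("text/html"))                 # None
```

The second call corresponds to the header reported in the question: no charset at all, so the browser has to guess.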

Also ensure that your file is actually encoded as UTF-8 before serving it. Check or try the following:

  • Ensure your editor saves it as UTF-8.
  • Ensure your FTP client or any other file transfer program does not mess with the file.
  • Try HTML-encoded entities, like &#uuu;.
  • To be really sure, hexdump the file and look at the character; for the ✔, it should be E2 9C 94.
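
If no hexdump tool is at hand, the last check can be sketched in Python. The helper names `utf8_hex` and `contains_utf8` are hypothetical, and the path you pass in is whatever file you are serving.

```python
CHECK_MARK = "\u2714"  # U+2714 HEAVY CHECK MARK, the tick from the question


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


def contains_utf8(path, text):
    """Return True if the file at path contains text encoded as UTF-8."""
    with open(path, "rb") as f:  # binary mode: look at raw bytes, no decoding
        return text.encode("utf-8") in f.read()


print(utf8_hex(CHECK_MARK))  # E2 9C 94
```

If `contains_utf8("page.html", CHECK_MARK)` returns False for a page that is supposed to contain the tick, the file was probably saved in some other encoding.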

Note: If you use a Unicode character for which your system can't find a glyph (no installed font contains that character), your browser should display a question mark or some block-like symbol. But if you see several Latin characters in its place, as you do, that indicates an encoding problem.
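
The "several Latin characters" symptom can be reproduced by decoding the UTF-8 bytes with the wrong codec. This sketch assumes the browser guessed Windows-1252 (`cp1252`), which is a common fallback but not something the question confirms; `misdecode` is a made-up name.

```python
def misdecode(text, wrong_codec="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


garbled = misdecode("\u2714")  # the tick, U+2714

# One character went in, three came out: the three UTF-8 bytes E2 9C 94
# were each read as a separate single-byte character.
print(len(garbled))                          # 3
print([f"U+{ord(c):04X}" for c in garbled])  # ['U+00E2', 'U+0153', 'U+201D']
```

Those three code points are â, œ and ”, which is the kind of output that points to an encoding mismatch and not to a missing font.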

Make sure that you actually save the file as UTF-8. Alternatively, use HTML entities (&#nnn;) for the special characters.

Contrary to what Nicolas suggested, the meta tag isn’t actually ignored by browsers. However, the Content-Type HTTP header always takes precedence over a meta tag in the document.

So make sure that you either send the correct encoding via the HTTP header, or don’t send this HTTP header at all (not recommended). The meta tag is mainly a fallback option for local documents which aren’t sent via HTTP traffic.
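
As an illustration of sending the correct encoding via the HTTP header, here is a minimal WSGI application sketch. It is not how the asker's server is configured (the thread never says which server is in use); `PAGE` and `app` are names invented for this example.

```python
PAGE = "<!DOCTYPE html><html><body><p>\u2714</p></body></html>"


def app(environ, start_response):
    """Minimal WSGI app that declares UTF-8 in the HTTP header itself."""
    body = PAGE.encode("utf-8")  # the bytes must match the declared charset
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, pass `app` to `wsgiref.simple_server.make_server` from the standard library and call `serve_forever()` on the result. The point is that the declared charset and the actual byte encoding are set in the same place, so they cannot drift apart.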

Using HTML entities should also be considered a workaround – that’s tiptoeing around the real problem. Configuring the web server properly prevents a lot of nuisance.

I think this is a file problem: you simply saved your file in a single-byte encoding like Latin-1. Look up how to make your editor save files as UTF-8.

I wonder why there are editors that don't default to utf-8.

I know an answer has already been accepted, but wanted to point a few things out.

Setting the content type and charset is obviously good practice. Doing it on the server is much better, because it ensures consistency across your application.

However, I would use UTF-8 only when the language of my application uses many characters that only a Unicode encoding such as UTF-8 can represent. If you want to show a Unicode character or symbol in one-off cases, you can do so without changing the charset of your page.

HTML renderers have always been able to display symbols that are not part of the page's character encoding, as long as you write the symbol as a numeric character reference (NCR). It sounds weird, but it's true.

So, even if your HTML has a header stating that it is encoded as ANSI or any of the ISO charsets, you can display a check mark by using its HTML character reference, in decimal (&#10003;) or in hex (&#x2713;).
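
If you need to produce such references for other symbols, a small sketch like the following can generate them. `to_ncr` is a hypothetical helper; the only library features it relies on are Python's built-in `xmlcharrefreplace` error handler and `html.unescape`.

```python
import html


def to_ncr(text, hexadecimal=False):
    """Replace every non-ASCII character with a numeric character reference."""
    if hexadecimal:
        return "".join(
            c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text
        )
    # The built-in error handler emits decimal references such as &#10003;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ncr("\u2713"))                       # &#10003;
print(to_ncr("\u2713", hexadecimal=True))     # &#x2713;
print(html.unescape("&#10003;") == "\u2713")  # True
```

The output is pure ASCII, so it survives whatever single-byte encoding the page is saved or served in.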

So it's a little difficult to understand why you are facing this issue on your pages. Can you check whether the NCR value is correct? This is a good reference: http://www.fileformat.info/info/unicode/char/2713/index.htm