How do I print wchar_t values to the console?

小开

Can I suggest std::wcout?

So, something like this:

std::cout << "ASCII and ANSI" << std::endl;
std::wcout << L"INSERT MULTIBYTE WCHAR* HERE" << std::endl;

You might find more information in a related question here.

小开

Best answer

Edit: This doesn’t work if you are trying to write text that cannot be represented in your default locale. :-(

Use std::wcout instead of std::cout.

wcout << ru << endl << en;

小开

You could use a normal char array that is actually filled with UTF-8 characters. This should allow mixing characters from different languages.

小开

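#include <clocale>
#include <iostream>
using namespace std;
int main()
{
    // Locale name in the form accepted by the Windows C runtime
    setlocale(LC_ALL, "Russian");
    cout << "\tДОБРО ПОЖАЛОВАТЬ В КИНО!\n"; // "WELCOME TO THE CINEMA!"
}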

小开

You can print wide characters with wprintf.

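#include <locale.h>
#include <wchar.h>

int main()
{
    setlocale(LC_ALL, ""); // use the environment's locale so non-ASCII text can be converted
    wchar_t en[] = L"Hello";
    wchar_t ru[] = L"Привет"; // Russian language
    wprintf(L"%ls\n", en); // pass the text as an argument, not as the format string
    wprintf(L"%ls\n", ru);
    return 0;
}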

Output:

Hello
Привет

小开

The Windows information on this topic is very confusing. It helps to learn the C/C++ concepts on Unix/Linux before programming on Windows.

On Windows, wchar_t is 16 bits wide and stores UTF-16 code units (on Linux it is typically 32 bits wide and stores UTF-32). wprintf() and std::wcout often fail to print non-English wide characters correctly because consoles do not accept UTF-16 directly. Windows outputs in the current code page, while Unix/Linux normally outputs in UTF-8; both are multi-byte encodings. So you have to convert wide characters to multi-byte before printing. The standard C function wcstombs() depends on the current C locale and is unreliable for this on Windows; use WideCharToMultiByte() instead.

First, save the source file as UTF-8 using Notepad or another editor. Then choose a font in the command prompt console that can display your language, and change the console code page to UTF-8 by typing "chcp 65001" in the command prompt (Cygwin already defaults to UTF-8). Here is what I did for Thai.

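#include <windows.h>
#include <stdio.h>

int main()
{
    const wchar_t* in = L"ทดสอบ"; // Thai language
    char out[15];                  // output buffer, large enough for this string
    // -1 means the input is null-terminated, so the terminator is converted too
    WideCharToMultiByte(874, 0, in, -1, out, sizeof(out), NULL, NULL);
    printf("%s\n", out); // result is correctly in Thai although not neat
    return 0;
}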

Note that 874 is the Thai code page in the operating system, and 15 is the buffer size passed to WideCharToMultiByte().

My suggestion is to avoid printing non-English wide characters to the console unless necessary, because it is not easy.

小开

You cannot portably print wide strings using standard C++ facilities.

Instead you can use the open-source {fmt} library to portably print Unicode text. For example (https://godbolt.org/z/nccb6j):

#include <fmt/core.h>


int main() {
const char en[] = "Hello";
const char ru[] = "Привет";
fmt::print("{}\n{}\n", ru, en);
}

prints

Привет
Hello

This requires compiling with the /utf-8 compiler option in MSVC.

For comparison, writing to wcout on Linux:

wchar_t en[] = L"Hello";
wchar_t ru[] = L"Привет";
std::wcout << ru << std::endl << en;

may transliterate the Russian text into Latin (https://godbolt.org/z/za5zP8):

Privet
Hello

This particular issue can be fixed by switching to a locale that uses UTF-8, but a similar problem exists on Windows that cannot be fixed with standard facilities alone.

Disclaimer: I'm the author of {fmt}.

小开

The way to do it is to convert UTF-16 LE (Default Windows encoding) into UTF-8, and then print to console (chcp 65001 first, to switch codepage to UTF-8).

It's pretty trivial to convert UTF-16 to UTF-8. Use this page as a guide if you need characters that take more than 2 bytes.

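// cmd points to a null-terminated UTF-16 string
const unsigned short* cmd_s = (const unsigned short*)cmd; // unsigned, so values above 0x7FFF compare correctly
size_t i = 0;
while (cmd_s[i] != 0)
{
    unsigned short u16 = cmd_s[i++];
    if (u16 > 0x7F)
    {
        // Two-byte UTF-8 sequence; only valid for code points up to U+07FF
        unsigned char c0 = (unsigned char)((u16 & 0x3F) | 0x80);        // Least significant
        unsigned char c1 = (unsigned char)(((u16 >> 6) & 0x1F) | 0xC0); // Most significant
        cout << c1 << c0; // Lead byte first, then continuation byte
    }
    else
    {
        unsigned char c0 = (unsigned char)u16;
        cout << c0;
    }
}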

Of course, you can put it in a function and extend it to handle wider characters (for Cyrillic, the two-byte case should be enough). I just wanted to show the basic algorithm and to prove that it's not hard at all: you don't need any libraries, just a few lines of code.