Update: after several more hours of Googling and experimenting, I have found a way to display Japanese in the Console. For more information, check out my new post, “Windows Console and Double/Multi Byte Character Set”. The rest of this post is still accurate with regard to Unicode support and Western system locales.
Have you been hoping to see Japanese (or Thai, Hindi, Arabic, etc.) characters appear when you type dir into your command prompt? Well, prepare to be disappointed, as the Windows CMD.exe Console cannot display Unicode characters. You’ll have to use the PowerShell ISE if you want to see full Unicode text output.
The best that the Command Shell can do is write out boxes or question marks. When those characters are marked and copied, however, the clipboard is populated with the correct Unicode characters, which can then be pasted into smarter executables, like Notepad.
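To get a feel for where the question marks come from, here is a minimal Python sketch. It assumes the substitution happens when text is converted to a legacy code page that has no slot for the character; cp437 stands in for the OEM code page of a US-English system, and `to_code_page` is a hypothetical helper name of my own, not anything the Console exposes. It imitates the symptom only and does not reproduce the Console's actual rendering path.

```python
def to_code_page(text: str, code_page: str = "cp437") -> str:
    """Round-trip text through a legacy code page.

    Characters the code page cannot represent are replaced with '?',
    which mimics the question marks described above.
    """
    encoded = text.encode(code_page, errors="replace")
    return encoded.decode(code_page)


# ASCII survives the round trip unchanged.
print(to_code_page("dir"))

# Japanese has no representation in cp437, so each character becomes '?'.
print(to_code_page("日本語"))
```

The original string is never altered by this conversion; only the converted copy loses information, which is in line with the clipboard still receiving the correct characters.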
Michael S. Kaplan, an expert on all things Unicode and Microsoft, wrote about this at great length on MSDN Blogs. Unfortunately, Microsoft decided to wipe his blog from the Internet, even though it breaks links from the likes of Raymond Chen’s The Old New Thing.
Michael’s relevant blog posts can be found on The Internet Archive’s Wayback Machine:
- Anyone who says the console can’t do Unicode isn’t as smart as they think they are from 4/7/2010 (explains that the Unicode characters won’t display, but they will copy to the clipboard).
- The real problem(s) with all of these console “fallback” discussions from 2/15/2010
- Cunningly conquering communicated console caveats. Comprende, mon Capitán? from 5/7/2010 (provides functions to determine whether output is to the Console or the PowerShell ISE).
- A confluence of circumstances leaves a stone unturned… from 9/23/2010 (discusses problems with stdin).
- Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT? from 3/18/2008 (explains wide-character output).
Actually, you’ll find this is factually untrue. The Windows console window can display, for example, Korean and Japanese characters just fine (the example I know is a US-English Windows 7 with the Korean and Japanese language packs installed and the system set to use them). I am not Korean or Japanese, but I had to use this with a colleague to work with a Korean partner and his Japanese customer.
And the first link provided kind of gives the notion that likely only a font or code table of some kind is missing.
However, I didn’t know they removed Kaplan’s blog. It was always a great treasure trove of information. Of course, it could be that Kaplan left MS and demanded it be taken down.