Will C#, .NET and MS Visual Studio save the day?

Problem: take a file from one source in UTF-16 encoding, perform some string operations, and write the new file in UTF-8 encoding for use in a different system. Real problem: diacritics that repeatedly turn into garbage in the second system. Before the source files were UTF-16 encoded, this was not an issue.


How one might feel when fighting with character encoding

As I struggle to convert a UTF-16 file to UTF-8 encoding, I find that Python, even Python 3 with its supposedly improved Unicode support, is not working for me. The Linux tricks for converting file encodings don’t work either. HexEdit only lets me see whether a BOM (byte order mark) is present at the beginning of the file.
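For reference, the conversion in question can be sketched in Python 3 roughly as follows. This is a minimal sketch, not a guaranteed fix: it assumes the source file really is UTF-16 and starts with a BOM, and the names `detect_bom` and `utf16_to_utf8` are made up for illustration.

```python
import codecs


def detect_bom(raw: bytes) -> str:
    """Report which BOM, if any, starts the given bytes.

    Note: a UTF-32 LE BOM also begins with FF FE, so this simple
    check cannot tell UTF-32 LE apart from UTF-16 LE.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "none"


def utf16_to_utf8(src_path: str, dst_path: str) -> None:
    """Read a UTF-16 file (assumed to have a BOM) and rewrite it as UTF-8."""
    # The "utf-16" codec reads the BOM to pick the byte order and strips it.
    # newline="" leaves the line endings exactly as they are in the source.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()

    # String operations on `text` would go here.

    # "utf-8" writes no BOM; use "utf-8-sig" if the target system wants one.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

If the second system still shows garbage after a conversion like this, it is worth checking whether that system is actually reading the file as UTF-8, and whether it expects a BOM or chokes on one.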

I have been informed that the answer to my problems is C# (C Sharp), a language for the .NET (“dot net”) framework. Microsoft Visual Studio is a tool for programming in various languages for .NET. Microsoft software, such as MS Excel, uses UTF-16 encoding natively, which is likely where my original problem comes from. Python 3, by contrast, treats source files as UTF-8 by default.

Visual Studio Community 2013 is available for free, and so is Microsoft Virtual Academy (MVA). MVA is, so far, a lot like a MOOC with no time limits. I’ve printed “Hello World” to the console and done some if-else statements. I'm not at the point where I can do anything useful yet; I’ll keep plugging along. One advantage of using Visual Studio is that you can create Windows apps that are usable by layfolks, without the gritty hobby feel of Linux and Python. Which is not to say that I’ll abandon Linux and Python, but this is another weapon in the war chest against Unicode hassles.
