Friday, April 25, 2014

Unicode anyone?

As you may have guessed from my previous post I've been digging into Unicode. One may think it odd that I haven't done this sooner, but when working on legacy applications it is hard to justify supporting more modern things when you spend all your time trying to make legacy software better.

Fortunately, I'm now in a position of writing fresh new code that can be built on a strong basis. As I thought through many of the things I wanted to do I fairly quickly came to the conclusion that moving forward I needed to write things using Unicode.

Should be easy. Unicode has been around for 20 years now so, unlike the mere 3 years since adoption of C++ 11, everything should work and the skids should be smooth and well greased.

Of course not!

Windows seems to like UTF-16. That's fine. Other operating systems seem to like UTF-8 or UTF-16.

Out of the box on Mac and Ubuntu I can do something like the following:

    std::cout << u8"Α-Ωα-ω\n";

Of course on Windows not only is the u8" syntax not yet supported, but even if you get a std::string with a proper UTF-8 encoding that won't work either.

Turns out that I can set the cmd.exe console encoding to use UTF-8, and it works great if I use printf for my string, but std::cout doesn't work. To top things off Microsoft decided to explicitly disallow UTF-8 in their std::locale implementation. So I can't tell std::cout to send things to the console as UTF-8. Instead it appears that I will need to use printf or find another obscure way of outputting Unicode in my unit test console based application.
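A minimal sketch of the printf workaround, under a few assumptions: the helper names `greek_range_utf8` and `print_utf8` are hypothetical, the bytes are spelled out with hex escapes because the u8" syntax is not available in Visual Studio 2013, and `SetConsoleOutputCP(CP_UTF8)` stands in for setting the cmd.exe encoding by hand. Whether the glyphs actually show up still depends on the console font.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>   // SetConsoleOutputCP, CP_UTF8
#endif

// "Α-Ωα-ω\n" written out as raw UTF-8 bytes (hypothetical helper).
inline std::string greek_range_utf8() {
    return "\xCE\x91-\xCE\xA9\xCE\xB1-\xCF\x89\n";
}

// Send already-encoded UTF-8 bytes to the console through C stdio.
// Returns the number of bytes written, or a negative value on error.
inline int print_utf8(const std::string& s) {
    // printf("%s") stops at the first NUL, so embedded NULs are not supported.
    assert(s.find('\0') == std::string::npos);
#ifdef _WIN32
    // Assumption: switching the console code page here is equivalent to
    // configuring cmd.exe for UTF-8 beforehand.
    SetConsoleOutputCP(CP_UTF8);
#endif
    return std::printf("%s", s.c_str());
}
```

Calling `print_utf8(greek_range_utf8())` pushes the eleven encoded bytes straight through, bypassing std::cout and its locale machinery entirely.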

I'm not sure what this means, but it does give hope to those worried that the machines will take over. It will likely take them several decades to figure out the mess we have made with software.


g++ is dead

Lately I've been working on a project that involves using C++ to develop libraries for Windows, Mac, Ubuntu, iOS and Android, with an eye toward quality and portability.

Recently I decided that I needed to build a Unicode class to support a forward thinking basis for things. Of course C++ is generally not Unicode friendly, but it isn't all that unfriendly either. Especially with C++ 11.

So I dug in, started learning and coding, and came up with a first pass of my class on Windows using Visual Studio 2013. So far so good.

Now let's go over to Ubuntu, where we have compilers like g++ 4.8 and Clang 3.3 that are reportedly more compliant than Visual Studio.

The first thing I notice is a missing include. The next thing I notice is that the API for std::basic_string that I was trying to emulate is, well, just plain wrong! Not even close!

I start digging and find that the GNU standard C++ library is about 6 sigmas off of supporting C++ 11. How can you claim your compiler is C++ 11 compliant but the library is not?

Fortunately, I eventually got to the point of using the new libc++ library, which Apple switched to a while ago, and things work, in clang. But when I try to build for Android I can't get the experimental clang support to work. Probably because of 2 or 3 really important steps that I missed, but nonetheless I gave up.

Working through the problem I managed to hack around the deficiencies in the GNU standard library with regard to std::basic_string, but got nowhere with trying to get the codecvt stuff working. Turns out that the other option of using iconv isn't built into Android. I would have needed to jump through several major hoops to compile it and hook it in.

I eventually wrote my own conversion routines between UTF-8, UTF-16 and UTF-32.
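To give a feel for what such routines involve, here is a minimal sketch, not the actual class from this project. The function names (`append_utf8`, `utf32_to_utf8`, `utf32_to_utf16`, `utf8_to_utf32`) and the choice to throw on bad input are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// True for code points that may be encoded: in range and not a surrogate.
inline bool is_scalar(char32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Append the UTF-8 encoding (1 to 4 bytes) of one code point.
inline void append_utf8(std::string& out, char32_t cp) {
    if (!is_scalar(cp)) throw std::invalid_argument("not a Unicode scalar value");
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

inline std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) append_utf8(out, cp);
    return out;
}

// Code points above U+FFFF become a high/low surrogate pair.
inline std::u16string utf32_to_utf16(const std::u32string& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (!is_scalar(cp)) throw std::invalid_argument("not a Unicode scalar value");
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;
            assert(cp <= 0xFFFFF);  // 20 bits remain, 10 per surrogate
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}

// Decode UTF-8, rejecting truncated, overlong, and out-of-range sequences.
inline std::u32string utf8_to_utf32(const std::string& in) {
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (b < 0x80)                { cp = b;        len = 1; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size()) throw std::invalid_argument("truncated UTF-8");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) throw std::invalid_argument("bad continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len] || !is_scalar(cp))
            throw std::invalid_argument("overlong or out-of-range sequence");
        out += cp;
        i += len;
    }
    return out;
}
```

Routing everything through UTF-32 keeps each direction small: a UTF-8 to UTF-16 conversion is then just `utf32_to_utf16(utf8_to_utf32(s))`, at the cost of a temporary string.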

So to the point of my post.

It is 2014. C++ 11 was ratified in March of 2011 and formally adopted in August of 2011. So why, 3 years later, is the GNU library languishing? I would not have expected full support on day 1 or even 1 year after adoption, but 3 years? And the thing that is particularly annoying is that some of the stuff that isn't supported is trivial to correct.

My sense is that the ideas behind GNU and the "Free" software thing are fallacious, which I speculate is a major reason that Apple decided to ditch GNU, and why a more open "open source" implementation in Clang and libc++ is now displacing it.

While I will likely continue to keep an eye on the GNU C++ compiler for a while, it is now a distant #3 in my compiler recommendation list after Visual Studio and Clang. This leads me to the prediction that G++ will be a footnote in the history of C++ compilers. They could turn this around, but I'm not going to hold my breath.

Thursday, January 30, 2014

Software Quality Ceiling

Over the past 3 years I have worked on a project porting a large legacy C++ code base so that it will run on Mac, Windows and in the future other platforms. In doing so we used a number of third party toolkits including Qt.

Portable libraries are nice to use when they work right. The problem is that sometimes you run up against quality issues with the libraries. Things that work right on one platform and not on another or things that simply don't work right on any platform.

Qt is a nice toolkit, but it has many issues. We worked closely with the support team at Digia to correct many of them, but every software update adds a new set of problems. This means a long and laborious process of first creating a reproducible bit of code, reporting it, sometimes needing to convince the support people that it really is a bug, getting a patch, trying it out and often iterating several times until the patch is correct. Then we hope that the problem is corrected in the next update, which it often is, but about as often the update brings new issues.

Of course Qt is not the only problem. Yesterday I was investigating the new threading library for C++ 11 in Visual Studio 2013. Everything was going along just fine as I was working on a simple ThreadPool class. It worked great except that it would hang when shutting down. I finally did a search and ran across the following: http://connect.microsoft.com/VisualStudio/feedback/details/747145. It is a bug report against Visual Studio 2012, marked Closed Deferred.

I'm able to work around the problem by ensuring that my thread pool shuts down before returning from main, but now I'm faced with how to write a class that logically would be global and that will shut down correctly in Visual Studio. I have yet to test it on Mac or Ubuntu, but I'm fairly confident that I won't have the same problem there.
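A minimal sketch of that workaround, assuming a pool shaped roughly like the one described here; this `ThreadPool` and its `submit` and `shutdown` members are hypothetical stand-ins, not the actual class. The idea is an explicit, repeatable `shutdown()` that can be called before main returns, with the destructor only as a fallback.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t thread_count) {
        for (std::size_t i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Fallback only; the point of the workaround is to call shutdown() earlier.
    ~ThreadPool() { shutdown(); }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!stopping_ && "submit() called after shutdown()");
            jobs_.push(std::move(job));
        }
        ready_.notify_one();
    }

    // Lets the workers finish all queued jobs, then joins them.
    // Calling it again later (from the same thread) is a no-op.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (std::thread& worker : workers_)
            if (worker.joinable()) worker.join();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};
```

With a global pool, making `pool.shutdown()` the last statement of main keeps the joins out of static destruction, which is where the hang showed up. It does not solve the design problem of a truly self-managing global.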

Over the years I have encountered numerous code generation problems with compilers. The good news is that I haven't run across anything recently. But in the back of my mind I continue to worry that one may show up. This is really bad because unless you actually run the code affected in the manner that would exhibit the bug you may never know you have this problem.

Then there are things like the OpenGL drivers for video cards. In some cases something works fine on one card but not on another. At that point you are faced with answering the question, did I do something wrong? Sometimes I did. In some cases the fact that it worked on one card was wrong. But in other cases I used the API correctly and there is a bug in the driver.

Then there is the unspecified behavior clause in languages. In C++ there are many places in the standard where the behavior is listed as unspecified or compiler dependent. In these cases I would prefer that it didn't compile, or at the very least that it crashed when I tried to use it. In my recent porting project I ran across several cases where the compiler on the Mac crashed on badly written code, but on Windows it didn't. I was happy to find and fix those issues. But they should not have worked in the first place.
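As one small illustration of unspecified behavior (an invented example, not one of the cases from the porting project): C++ 11 does not say which function argument is evaluated first, so the first function below may return (1, 2) on one compiler and (2, 1) on another, and both are conforming. The names `Counter`, `pair_of`, `unspecified_order` and `explicit_order` are hypothetical.

```cpp
#include <cassert>
#include <utility>

// Hands out 1, 2, 3, ... on successive calls.
struct Counter {
    int value = 0;
    int next() { return ++value; }
};

inline std::pair<int, int> pair_of(int first, int second) {
    return std::make_pair(first, second);
}

// Compiles cleanly, but the order of the two next() calls is unspecified,
// so the result is either (1, 2) or (2, 1) depending on the compiler.
inline std::pair<int, int> unspecified_order(Counter& c) {
    return pair_of(c.next(), c.next());
}

// Separate statements force the order: always (1, 2) for a fresh Counter.
inline std::pair<int, int> explicit_order(Counter& c) {
    const int first = c.next();
    const int second = c.next();
    return pair_of(first, second);
}
```

Nothing here crashes or fails to compile on any platform, which is exactly what makes this class of problem hard to notice until a port exposes it.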

All of these issues result in a quality ceiling. My goal is to write code that is 100% bug free. Occasionally, I achieve that for simple cases. More often I have a few bugs that testing should eventually uncover. But if I'm working with tools, compilers, libraries, operating systems, etc. that are buggy I will always be faced with the fact that my software will be of a lower quality than I'm theoretically capable of writing.

As an industry we need to figure out how to break through the quality ceiling.

With that I close, because I have no answers.