A Tale of Two Character Encodings
Sunday, June 12, 2005 by GreenReaper | Discussion: Software Development
This Friday I finished up some work on a new version of one of Stardock's products, which'll probably see the light shortly after the company finishes moving to Plymouth. So what do geeks do with their down-time? Well, in my case, it's often pretty much the same as what I do for money, only for the communities I'm interested in. Recently, a lot of my time has been spent on the Creatures Community, the group of people who've played the Creatures series of artificial life games. When I'm not contributing to the Creatures Wiki, I'll be writing some sort of tool, like this sprite thumbnail viewer, or polishing the next version of JRNet. But enough about my projects, as this one is actually about someone else's . . .
GEL is a genetic editor for Creatures 2. It is used to edit the genes for the various creatures (Norns, Ettins and Grendels). There are other editors, but people get attached to their favourite programs, and GEL is no different.
The trouble is, although GEL worked great on Windows 98, it didn't seem to want to work on XP. OK, so that wasn't great, but the real problem was that the source code - the words that the programmer types and gives to the compiler to turn into a program - had been lost in a hard disk accident, and so he couldn't fix the problem. Without the source code, you just have the "compiled" version, and it's very hard to make any changes to that.
There were several people upset about this, though, and it's always a shame to lose a useful program, so I decided to see if I could do something about it. Overall it took about two days of work to get it back up and running, which I thought was pretty good. I figured it would be kinda neat to tell you how I did it, and show you some of the different tools I used, so I wrote this article. Just skip over the bits that get too technical!
Let's start with what I got when I installed the program and tried starting it up myself:
OK. I got this error, and then when I pressed OK it closed on me. That didn't work out that well. I'm sure you've seen similarly confusing errors on your own computers! Turns out, it's not always easy for programmers to figure out what it means, either . . .
So, we have a problem. But where? "Path not found" isn't a very helpful message - it doesn't tell you what path, for a start! I decided this would be the first thing to try and find out, so I started up FileMon, a utility that monitors what files are accessed by running programs. I was looking for any "not found" messages, and there were a few, but they all turned out to be dead ends.
By now it was clear that it wasn't going to be as easy as a missing file or a permission problem. The next thing I tried was another of the Sysinternals tools, RegMon. This does much the same thing as FileMon - monitor what's happening - but for the Windows Registry, so you can see what settings are being written and read. I consider both of these tools essential if you want to know what's really going on.
This was the last registry read before the error
As it happens, RegMon did turn up something - the last thing that GEL read before it all started to go wrong was the main path of Creatures 2. The thing was, this registry read didn't fail. This just happened to be the last thing that it did with the registry that I couldn't narrow down to other causes. I did try modifying this value in the registry, but this just resulted in slightly different errors.
After that, I briefly tried using another utility called API Monitor in order to see what calls the program was making to the operating system. This program is rather like a general version of RegMon and FileMon - while they monitor specific things, API Monitor "hooks" pretty much every system function that there is and records their use. Unfortunately, I couldn't find what I was looking for; I later found out that it didn't even start sending messages until a window had been created.
To recap: I'd found it wasn't a case of failed registry entries or a file not being there. It was time to bring out the big guns.
My first tool is recognizable to pretty much all Windows programmers, even if they don't use it themselves - Microsoft's Visual Studio. This is the number one tool for Windows development, and although it has its detractors, it's pretty good as development environments go. I would use this to run the program and stop it halfway, examining and changing the memory that it used.
The second might be a little less familiar to most programmers - IDA, the Interactive Disassembler. A disassembler is a program that turns compiled programs back one step into "assembly code", the last point at which it can be considered remotely readable. Few programmers actually write code at this level - most use a higher-level language like C++, Pascal, Java or Visual Basic - but it is usually possible to get a good idea of how parts of a program work through reading it in assembly.
Disassembling programs (also known as reverse-engineering) is something of a shady activity - one of IDA's most popular uses (though not one they advertise) is to figure out how to get around serial code checks, and this is one reason why disassembly is forbidden in most software licenses. However, all tools have their uses, and when you need to know exactly what a program is doing in order to fix it, but don't have the source, a good disassembler is a requirement.
Anyway, I started the program running in the Visual Studio debugger - a mode in which you can control exactly how a program executes, and modify the variables it is using - and ran through the code to see where the problem occurred. It was pretty easy to see what part the error was in - a file called glsupcts.dll that came with the program. To see exactly what the code did, I set IDA running on it; after a few minutes it had an assembly listing of the code ready for me to read.
Of course, the assembly code wasn't actually all that easy to read. Something that made it even more difficult was that the program had been written in Visual Basic, a language that I like and which has a very easy to use system of programming, but which is often more general than required. As a result, it often did things in an odd way, and the code made a lot of calls to functions in the Visual Basic library. Of course, since these library calls were not documented, I ended up having to decompile this library as well, just to figure out what the program was doing! Hopefully nobody from Microsoft who cares is reading this.
Reading through the IDA output, I found the check for the registry value just before the error occurred. It certainly seemed these were linked in some way. Then I found a reference to "AllChemicals.str", a file that contains the names of chemicals in Creatures. It made sense that GEL would try to load this file, so that it knew what each of the chemicals was called!
Now I had a clue - since I knew from reading the FileMon output that it never actually managed to load that file, it was probably failing while trying to. Using Visual Studio to look at the memory when the program crashed, I saw there was something odd about the path it had given to the "open file" function. It started off fine, but the end didn't look at all right. Here was my problem!
The system had used part of the memory given to it to work out the path (see the end for details), and GEL had thought this was part of the path itself. It was all clear now - the buffer was not being trimmed of the working copy, and this was getting left after the path name, so when the program put "AllChemicals.str" on the end, the middle of the path was invalid. This was the reason it wasn't showing up on FileMon - it didn't even get to the point where it looked for the file on the disk.
So what could I do? Well, I knew it was trimming off the last part of the string - the trouble was, it thought it was twice as big as it actually was, so it was keeping twice as much as it should. The length had to be stored as a number somewhere. Eventually I found the number being returned from a call to a function called vbaLenBstr - which naturally calculated the length of the incorrect path. Now I just needed to divide it by two and it would only use the correct portion of the string.
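To make the "twice as big" problem concrete, here is a minimal sketch in Python. It assumes the doubling came from counting bytes in a two-bytes-per-character encoding instead of counting characters, which is what the title hints at; the path, the filler character and the helper names are all made up for illustration and are not taken from GEL.

```python
def keep_prefix(buffer: str, length: int) -> str:
    """Keep only the first `length` characters of a scratch buffer."""
    return buffer[:length]


def demo_trim():
    # Hypothetical install path; the real one came from the registry.
    path = "C:\\Games\\Creatures 2\\"
    # Stand-in for whatever working data was left in the buffer after the path.
    leftover = "?" * 40
    buffer = path + leftover

    char_count = len(path)                      # length in characters
    byte_count = len(path.encode("utf-16-le"))  # length in bytes, 2 per character here

    good = keep_prefix(buffer, char_count)  # just the path
    bad = keep_prefix(buffer, byte_count)   # the path plus the same amount of junk
    return path, char_count, byte_count, good, bad


if __name__ == "__main__":
    path, chars, nbytes, good, bad = demo_trim()
    print(chars, nbytes)
    print(good + "AllChemicals.str")
    print(bad + "AllChemicals.str")
```

With the byte count used as if it were a character count, the junk ends up in the middle of the final file name, which matches the kind of broken path described above. Halving the length puts things right.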
Remembering my computer operations, I knew that the best way to divide by two was to shift the number to the right. What does this mean? Well, you can think of numbers inside a computer as being like a group of people all standing in a row, with flags with numbers on - starting from the right, they'd go 1, 2, 4, 8, 16 . . . all the powers of 2. When you shift right, the people all look at whoever's holding the next-highest flag, and do what they're doing. It looks like this:
Flag num  128   64   32   16    8    4    2    1
Before:     1    0    1    0    1    0    1    1   = 128 + 32 + 8 + 2 + 1 = 171
After:      0    1    0    1    0    1    0    1   = 64 + 16 + 4 + 1 = 85

Voila - division by two! Of course, you lose any remainder, since there's no 0.5 flag. Fortunately there's no such thing as half a character.
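If you'd rather check the flag-waving with a computer than with a row of people, the same shift can be tried in a couple of lines of Python. This is just an illustration of the arithmetic; the helper name is made up.

```python
def flags(n: int) -> str:
    """Show an 8-bit number as its row of 1/0 flags, highest flag first."""
    return format(n, "08b")


before = 0b10101011   # 128 + 32 + 8 + 2 + 1
after = before >> 1   # shift right by one place: integer division by two

if __name__ == "__main__":
    print(flags(before), before)
    print(flags(after), after)
```

The bit that falls off the right-hand end is the lost remainder.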
Of course, I'm not a whiz at assembly, so I had to look up exactly how to do the shift - I actually found the one I needed elsewhere in the code, so I could just use that. Now I had my instruction, and I knew where it had to go. It should be simple from here, right?
Well, no. The trouble is, you can't just add another instruction to the middle of a compiled program, moving all the others along. It would be like rearranging pages in a book and not updating the index (which is regenerated each time you "compile" a book). Worse, since machine instructions usually take more than one byte, moving them means instructions would start in the wrong place, changing their whole meaning - imagine what would happen if you kept all the spaces in a book in the same place but moved each letter along one position! When things get out of order in a computer, programs crash.
One thing I could have done would be to overwrite what was there already (perhaps something that didn't matter much). It had been enough trouble figuring out what one piece of assembly did, though - I didn't want to have to go through that all over again!
Fortunately, I didn't have to try that, because there was a convenient area of NOP instructions nearby. NOP stands for "no op" - it's an instruction that does nothing but move onto the next instruction. This seems useless, but it can in fact be useful for various things.
In this case, it was useful because it meant I had some space to work with. Because this space was free, I could fill it with more code. I needed to add just one instruction, but to get to that instruction I needed to put a jump instruction in. I looked these up, and it turned out the one I needed was a whole five bytes long, including the place to jump to.
That meant I had to move the code it replaced down into the section of NOPs as well, after my right shift. I then needed to jump back up to the point after the first jump instruction, so that the code could continue as if nothing had happened.
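For the curious, the jump-out-and-back trick can be sketched on a made-up block of bytes. This is not the actual GEL patch: it assumes 32-bit x86 code, assumes the length is sitting in the EAX register when the jump is taken, treats buffer offsets as addresses, and uses invented offsets and function names. It also only works if the displaced bytes are whole instructions that don't care where they run from.

```python
import struct

NOP = 0x90                       # x86 "no op"
JMP_REL32 = 0xE9                 # x86 near jump, followed by a 4-byte offset
SHR_EAX_1 = bytes([0xD1, 0xE8])  # x86 "shr eax, 1" (assumes the length is in EAX)


def jmp_rel32(src: int, dst: int) -> bytes:
    """Encode a 5-byte jump placed at src that lands on dst.

    The offset is counted from the end of the jump instruction itself.
    """
    return bytes([JMP_REL32]) + struct.pack("<i", dst - (src + 5))


def patch_with_cave(code: bytearray, site: int, displaced_len: int, cave: int) -> None:
    """Detour execution from `site` through a run of NOPs at `cave`."""
    if displaced_len < 5:
        raise ValueError("the jump needs at least 5 bytes of room")
    displaced = bytes(code[site:site + displaced_len])
    needed = len(SHR_EAX_1) + displaced_len + 5
    if len(code) < cave + needed or any(b != NOP for b in code[cave:cave + needed]):
        raise ValueError("not enough free NOPs at the cave")

    # 1. Overwrite the original instructions with a jump down to the cave.
    code[site:site + 5] = jmp_rel32(site, cave)
    for i in range(site + 5, site + displaced_len):
        code[i] = NOP  # pad any leftover bytes of the displaced instructions

    # 2. In the cave: the new shift, then the displaced code, then jump back.
    pos = cave
    code[pos:pos + 2] = SHR_EAX_1
    pos += 2
    code[pos:pos + displaced_len] = displaced
    pos += displaced_len
    code[pos:pos + 5] = jmp_rel32(pos, site + displaced_len)


def demo_patch() -> bytearray:
    # Made-up "program": one 5-byte instruction (mov ebx, 0x11223344), a ret,
    # some padding, then 16 free NOPs starting at offset 16.
    code = bytearray([0xBB, 0x44, 0x33, 0x22, 0x11, 0xC3])
    code += bytearray([0xCC] * 10) + bytearray([NOP] * 16)
    patch_with_cave(code, site=0, displaced_len=5, cave=16)
    return code


if __name__ == "__main__":
    print(demo_patch().hex(" "))
```

The order inside the cave follows the description above: the right shift first, then the code the jump replaced, then the jump back to just after the first jump.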
So, after that, the big question is did it work? . . . Yes! It finally loads!
For those who've read this far, congratulations! I hope you found this little view into my world educational.
This is a bit more than you'd usually have to do when debugging a problem, but it's pretty representative of what most programmers do in real life - it's not all fast cars, mansions and stock options! Sure, you don't usually have to go as hardcore as writing assembly-code patches for broken DLLs (I'm sure I'm going to get nasty comments from the real hardcore folks out there who do stuff like this every day), but a lot of the time you're figuring out problems with existing code, not just writing new code.
Often it's not our code, either - it'll be written by someone else (who left six months ago) in a way that seems totally nonsensical. Sometimes you're right to think that, other times you just don't understand it yet; either way, you have to fix it, and probably add a few new things, too! Ahh, well, all in a day's work . . .