A simple way to obfuscate string literals

By Andre M. Maier

While I was sitting in a doctor’s waiting room last week, it occurred to me that I could send out a geeky holiday greeting to my younger programming students. Naturally, as a teacher, I wanted to make it educational. It should allow my students to recap on the very basics of computer programming, and maybe provide a new aspect that is not covered in my regular curriculum. These thoughts led me to the idea of “hiding” the characters of the holiday note within integer literals.

I first came across this principle back in the late 80s when I spent hours after hours studying the machine code of various commercial computer games. Some programmers have used this method to prevent level passwords or product keys from showing up as plain ASCII in any text editor. Many years later, when I worked in computer forensics, I learned that many famous hardware and software manufacturers used similar, albeit more effective, techniques to obfuscate proprietary information.

So, I took my laptop out of my bag and wrote the following program.


The innermost while loop retrieves three subsequent digits from the value of an int variable n. The digits are weighted with powers of 10 according to their position, and then summed up in a char variable which represents an actual ASCII character. The inner for loop repeats this procedure three times, because this simple variant allows you to hide up to three ASCII characters per int literal. Thus, the first int value in the array (101105076) contains the ASCII characters 76, 105, 101 in decimal representation, which resemble the printed letters L, i, and e. The rest of the program should be pretty much self-explanatory.

As a result, the output of my program looks like this:

Liebe BKI15, 
wuenscht Euch
Andre Maier

I must admit that I have both seen and used far more sophisticated methods of hiding information in source code or in binary files. One of the most impressive variants I have seen involved some weird algorithm that was used to pick 4 bit of each character from a JPEG image. What made it impressive to me was the fact that the algorithm retrieved its changing parameters from the same source of data. With only disassembled RISC machine code and no debugging tools available, it took me quite a while to figure out the exact function and design of the algorithm.

The program above is simple to understand and straight forward. Nevertheless, I think it makes a neat little exercise on the essentials of computer programming.

Merry Christmas and a Happy New Year 2017 to all of you!



About bitjunkie

Teacher, Lecturer, and BITJUNKIE ...
This entry was posted in Information Theory, Java, Programming Essentials and tagged , , , , , . Bookmark the permalink.