Introducing elfpatch, for safely patching ELF binaries


I recently had a problem with a program behaving badly. As a developer familiar with open source, my normal strategy in this case would be to find the source and debug or patch it. Although I was familiar with the source code, I didn't have it on hand and would have faced significant inconvenience having it patched, recompiled and introduced to the runtime environment.

Conveniently, the program has not been stripped of symbol names, and it was running on Solaris. This made it possible for me to whip up a quick dtrace script to print a log message as each function was entered and exited, along with the return values. This gives a precise record of the runtime code path. Within a few minutes, I could see that just changing the return value of a couple of function calls would resolve the problem.

On the x86 platform, functions set their return value by putting the value in the EAX register. This is a trivial thing to express in assembly language and there are many web-based x86 assemblers that will allow you to enter the instructions in a web-form and get back hexadecimal code instantly. I used the bvi utility to cut and paste the hex code into a copy of the binary and verify the solution.

All I needed was a convenient way to apply these changes to all the related binary files, with a low risk of error. Furthermore, it needed to be clear for a third-party to inspect the way the code was being changed and verify that it was done correctly and that no other unintended changes were introduced at the same time.

Finding or writing a script to apply the changes seemed like the obvious solution. A quick search found many libraries and scripts for reading ELF binary files, but none offered a patching capability. Tools like objdump on Linux and elfedit on Solaris show the raw ELF data, such as virtual addresses, which must be converted manually into file offsets, which can be quite tedious if many binaries need to be patched.

My initial thought was to develop a concise C/C++ program using libelf to parse the ELF headers and then calculating locations for the patches. While searching for an example, I came across pyelftools and it occurred to me that a Python solution may be quicker to write and more concise to review.

elfpatch (on github) was born. As input, it takes a text file with a list of symbols and hexadecimal representations of the patch for each symbol. It then reads one or more binary files and either checks for the presence of the symbols (read-only mode) or writes out the patches. It can optionally backup each binary before changing it.