Divert Functions: A recipe for fault injection
It was a statement made by my manager at Hewlett-Packard overseeing the OpenSSI project. I was still pursuing my post-graduation at IIIT, Bangalore and this project was to be the culmination of that two-year program. This was the beginning of the fault-injection system that was to be built for the OpenSSI cluster. The enormity of the task was beyond most people there. The only thing that made me take up that project was my own belief in myself. Although I had written a small operating system in the previous 3 semesters and had a lot of experience writing all kinds of user-space programs on Linux, I had not done any major programming in the Linux kernel apart from some small kernel modules. A presentation on OpenSSI by Dr. Badrinath from HP earlier at IIITB had left most people high and dry owing to the complexity of the OpenSSI kernel. It had excited me to the point of obsession. There was no other course at IIITB specifically related to systems programming in general operating systems (although embedded systems was there) and this was my only chance to gate-crash the party. Even the small OS I had built earlier, I had done on my own time under the guidance of my mentor Prof. P. C. P. Bhatt. If it hadn't been for his faith in me and his enormous support I probably wouldn't have made it this far on my own. I had to work on OpenSSI, no matter what. I found my Zahir for the most satisfying stint of my life in software so far. It was an enormous leap of faith.
Starting from that statement above by the HP manager - except for the initial fault-injection survey (which was done by 3 of us under the guidance of Prof. Purnendu Sinha), the design and development of the whole system was done by me. The result was FISSI - Fault Injection for OpenSSI. The article on KProbes was a result of working on this project. KProbes, as I have described in a previous post, was the work of IBM and had only recently been included in the Linux kernel at that time. Its antecedents go further back, to Cygnus, where the idea germinated and was described in the paper: The Heisenberg Debugging Technology.
In very simple terms, KProbes allows some user-written code to be executed just before and after the execution of a particular instruction. This allows a user to insert tracing statements as well as modify registers in the context in which the instruction is executed. This much information was gleaned from the Linux-Bangalore 2004 summit that we attended just before the start of the FISSI project. There was a special session on KProbes technology, given by IBM's Prasanna Panchmukhi. Not all the implications of what he described were immediately clear to me, but I decided to pursue it further. I had a faint feeling KProbes could be of some help in implementing a fault-injection system in the kernel because it did something similar to what I had to do: it wrested control in a running system and allowed arbitrary code to be executed.
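The before-and-after behaviour described above can be modelled in a few lines of user-space C. This is only a sketch of the idea, not kernel code and not the KProbes API: `struct model_regs`, `struct model_probe`, `step` and the handler names are all made up for illustration, and an "instruction" is just a C function operating on a pretend register file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register file for this model; NOT the kernel's struct pt_regs. */
struct model_regs {
    long ax; /* accumulator */
    long ip; /* instruction pointer */
};

typedef void (*handler_fn)(struct model_regs *regs);
typedef void (*insn_fn)(struct model_regs *regs);

/* Hypothetical probe: one handler before the instruction, one after. */
struct model_probe {
    handler_fn pre;
    handler_fn post;
};

/* Execute one "instruction", wrapped by the probe's handlers if a probe is set. */
static void step(insn_fn insn, const struct model_probe *probe,
                 struct model_regs *regs)
{
    assert(insn != NULL && regs != NULL);
    if (probe && probe->pre)
        probe->pre(regs);   /* runs just before the instruction */
    insn(regs);
    regs->ip += 1;          /* advance past the instruction */
    if (probe && probe->post)
        probe->post(regs);  /* runs just after the instruction */
}

/* The probed "instruction": doubles the accumulator. */
static void insn_double_ax(struct model_regs *regs)
{
    regs->ax *= 2;
}

static long seen_before; /* value traced by the pre-handler */
static long seen_after;  /* value traced by the post-handler */

/* Pre-handler: trace the accumulator, then modify it in the probed context. */
static void trace_and_bump(struct model_regs *regs)
{
    seen_before = regs->ax;
    regs->ax += 1;
}

/* Post-handler: trace the accumulator after the instruction has run. */
static void trace_after(struct model_regs *regs)
{
    seen_after = regs->ax;
}
```

Stepping `insn_double_ax` with `trace_and_bump` and `trace_after` attached lets the pre-handler both observe and change the accumulator before the instruction sees it, which is the property that made KProbes look like a candidate for fault injection.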
In our survey of fault-injection systems we came across all kinds of them - hardware based, injection libraries which needed developers to insert fault-injection statements in their code and recompile, communication protocol fault-injectors, user-space fault-injectors, kernel based fault-injectors etc. From the beginning I was very clear about what kind of fault-injector we had to build - something that was immediately useful to developers without the need to modify their own source code. I knew no one would want to use a fault-injection system where extra code was needed to simulate faults. That would make the project code (the OpenSSI kernel in this case) unmaintainable, apart from making it incomprehensible to newcomers. Besides, testing and QA teams are usually separate from development teams, so the people who test the system have little or no idea about the code inside a software package. The fault-injection system should be immediately useful to these people. It should allow them to use it without knowing the source code, while still being flexible enough to let others more knowledgeable about the code try out complex scenarios by writing custom fault-injectors.
Working day and night to get into the guts of KProbes allowed me to completely understand the code path in the kernel used by KProbes. It was complex, to say the least, and very delicate. It also made it clear to me that KProbes was not designed to inject faults per se (although it could to a certain degree); rather, it was more of a trace tool for tracing the execution of instructions inside the kernel. It was designed as a component of DProbes - IBM's kernel tracing framework.
I started thinking about the method I could use to inject faults. The initial design involved changing the value of the accumulator just before a function returned, which would change the function's return value. Since most errors inside the kernel are indicated by functions returning pre-defined error values, this seemed to be a good idea, and it was easily achievable using KProbes. A probe would be inserted just before the ret instruction in a function; it would alter the value in the accumulator and then let the ret instruction execute, effectively changing the return value of the function. On deeper probing it became clear that just changing the return value of a function was not enough. By the time the probe fired, the function could already have executed some code successfully and altered the state of the system, leaving it inconsistent with the return value (akin to carrying out an operation successfully and then returning an error code). This would not work.
The next design iteration called for a probe at the very start of the function, which would change the accumulator value and immediately transfer control to the return address on the stack (by changing the instruction pointer register), effectively hijacking the function completely. This would produce the illusion that the function had simply returned an error without changing the state of the system. This looked really promising. The problem was that unmodified KProbes was useless in this case, since KProbes does not allow a probe handler to change the instruction pointer (although other general purpose registers can be changed). On return from the probe handler it overwrites the instruction pointer with what it thinks the next address of execution should be, which in this case is the address of the next instruction in the function. Now we had a design for injecting faults in a running system by simulating error returns from lower level functions, but no support in the kernel for implementing it directly.
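The difference between the two designs is easiest to see side by side. The following user-space C sketch only models the effect of each design; it does not touch registers or use KProbes. The function `alloc_inode`, the counter `inode_count` and the two wrappers are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical piece of system state that the probed function modifies. */
static int inode_count;

/* Hypothetical kernel-like function: changes state, returns 0 on success. */
static int alloc_inode(void)
{
    inode_count++;
    return 0;
}

/*
 * Design 1 (model): probe just before "ret". The body runs to completion,
 * then the probe replaces the return value held in the accumulator.
 */
static int call_with_ret_probe(int (*fn)(void), int injected)
{
    assert(fn != NULL);
    (void)fn();      /* side effects have already happened */
    return injected; /* caller sees an error anyway */
}

/*
 * Design 2 (model): probe at function entry. The return value is set and
 * control goes straight back to the caller, so the body never runs.
 */
static int call_with_entry_probe(int (*fn)(void), int injected)
{
    assert(fn != NULL);
    (void)fn;        /* deliberately not called */
    return injected;
}
```

With the first wrapper the caller is told the allocation failed although `inode_count` has been incremented, which is exactly the inconsistency that ruled out the first design. With the second wrapper the counter is untouched, so the injected error looks like a genuine early failure.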
However, it turned out that there could be cases where a function changes the state of the system to some extent and then returns an error value. Simply hijacking the function and returning an error value would not work in this case. Changing the return values of functions could potentially cover a large number of faults inside the kernel (as I found out by checking functions in the kernel, somewhat arbitrarily though), but there could be other cases where changing the return value was not enough. An example is accessing a wrong address inside the kernel, which triggers a page fault and subsequently an oops. This could only be simulated by actually executing an instruction which accessed a wrong address. Moreover, this had to be done inside some function in the kernel whose code could not be changed at runtime. All of this led to looking for a way to replace code-paths completely inside the kernel - where execution flow could be transferred to some other dynamically injected code in the kernel, which could do anything at all to simulate a fault.
The final result was divert functions. A divert function is a function dynamically inserted into the kernel (using a kernel module) with the same prototype as some other function, from which the execution flow has to be diverted to it. The diversion happens just before the other function executes, so that the divert function effectively replaces the original function. This produces GOD mode for the fault-injector. Anything can be done within a divert function to simulate a fault inside the original function in the kernel. We had a powerful capability now. The problem was how to implement it.
Then I paid another visit to JProbes, which is a part of the KProbes infrastructure and allows access to a function's arguments. This is done by transferring control to another function with the same prototype as the probed function. That function receives a copy of the probed function's arguments, which it can print out or use. The problem was that JProbes always transferred control back to the original function, and restored the original context (processor registers and stack) before giving back control. This meant I had to modify JProbes to implement divert functions. I converted JProbes into Divert Functions by modifying the implementation. Now divert functions could switch between acting like JProbes (transferring back control and restoring context) and replacing the original function entirely. This provided even greater flexibility, because divert functions now had the ability, when they deemed it necessary, to skip fault-injection and carry on as usual as if nothing had happened. This meant that faults could be injected conditionally. A powerful capability had been achieved for fault-injection.
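The two modes of a divert function, replacing the original or stepping aside like a JProbe, can be modelled in user-space C as follows. This is a sketch of the behaviour only and not the FISSI implementation: `read_block`, `read_block_entry`, `active_divert` and `fail_block_7` are hypothetical names. One simplification is labelled in the code: a real divert function shares the original's prototype exactly, whereas this model adds an out-parameter so the divert function can report its decision without any register tricks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* What a divert function decided to do for this particular call. */
enum divert_action {
    DIVERT_PASS_THROUGH, /* behave like a JProbe: run the original */
    DIVERT_REPLACE       /* replace the original entirely */
};

/*
 * Model-only signature (assumption): the original's argument plus an
 * out-parameter for the replacement return value.
 */
typedef enum divert_action (*divert_fn)(int block, int *retval);

static int reads_done; /* hypothetical state changed by the original */

/* Hypothetical original function: counts the read, returns some data. */
static int read_block(int block)
{
    reads_done++;
    return block * 10;
}

/* Currently registered divert function; NULL means none. */
static divert_fn active_divert;

/* Stands in for the probe sitting at the original's first instruction. */
static int read_block_entry(int block)
{
    int retval = 0;

    if (active_divert && active_divert(block, &retval) == DIVERT_REPLACE)
        return retval;        /* original body never runs */
    return read_block(block); /* no divert, or divert declined */
}

/* Example divert function: inject an I/O error for block 7 only. */
static enum divert_action fail_block_7(int block, int *retval)
{
    assert(retval != NULL);
    if (block != 7)
        return DIVERT_PASS_THROUGH;
    *retval = -EIO;
    return DIVERT_REPLACE;
}
```

With `active_divert` set to `fail_block_7`, a read of block 3 reaches `read_block` as usual, while a read of block 7 returns `-EIO` without incrementing `reads_done`. Clearing `active_divert` restores normal behaviour, which mirrors how a fault can be injected conditionally and then withdrawn.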
Apart from fault-injection, I allowed tracing capabilities to be intertwined with fault-injection capabilities in FISSI so that the effects of fault-injection could be traced inside the kernel. A lot of other things were planned, but my term at IIITB came to an end sooner than I realized.
Today FISSI is being developed further under a collaboration between IIITB and HP. Hopefully, it'll be mature enough soon to be released as part of OpenSSI.