Friday, May 26, 2006


Long time since I last wrote here. Life has been busy of course!

In the meantime I have finished a project and am going to move on to another in a few days. I did quite a lot of research on caching algorithms, and was simply amazed at how far behind the industry is when it comes to caching strategies. Solaris 10 still uses a two-handed clock algorithm! Linux is still at LRU with page ageing. Oracle still uses multiple LRU queues in its buffer cache. Hitachi TagmaStore, arguably the best cache-based array solution, still uses an LRU algorithm for caching! The surprise was PostgreSQL, which moved on to ARC only to discover that it was patented by IBM! The poor guys had to go back to 2Q after dumping ARC.
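
For readers who haven't met these policies, here is a minimal sketch of the basic one-handed clock (second-chance) idea in Python. It is only a toy: the `ClockCache` class and its method names are my own invention, it is not the two-handed variant Solaris uses, and it is not taken from any real kernel. I have also assumed that a newly inserted entry starts with its reference bit cleared, which is just one of the possible conventions.

```python
class ClockCache:
    """Toy one-handed CLOCK (second-chance) cache; illustrative only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []   # circular buffer of cached keys
        self.ref = {}    # key -> reference bit
        self.hand = 0    # index of the next eviction candidate

    def access(self, key):
        """Touch a key; return True on a hit, False on a miss."""
        if key in self.ref:
            self.ref[key] = True      # hit: only the reference bit changes
            return True
        if len(self.keys) < self.capacity:
            self.keys.append(key)     # a free slot is still available
            self.ref[key] = False     # assumption: new entries start cleared
            return False
        # Sweep the hand, clearing reference bits, until an entry with a
        # cleared bit is found. That entry becomes the victim.
        while self.ref[self.keys[self.hand]]:
            self.ref[self.keys[self.hand]] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.keys[self.hand]
        del self.ref[victim]
        self.keys[self.hand] = key
        self.ref[key] = False
        self.hand = (self.hand + 1) % self.capacity
        return False
```

With a capacity of three, caching a, b and c, touching a again and then asking for d evicts b: the hand gives a its second chance and moves on. The attraction over strict LRU is that a hit only flips a bit instead of reordering a list.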

But things are improving. A patch came in for Linux with a generalized page-replacement framework implementing not one but four different caching strategies! But as they say - you can't have all the good things - it got rejected on the grounds that it would introduce maintainability and stability issues into the Linux kernel. There's still hope, though - the Clock-PRO algorithm is still being considered for the current page-replacement scheme in the Linux kernel. Hopefully, things will get better.

Monday, December 19, 2005


This is a follow-up to the previous post. Let's tackle each problem one by one. Note that each problem can have several solutions depending on your preferences/requirements. Of course, I am assuming the system is our dear old Linux box.

  1. You wrote the communication protocol subsystem for a local client-server system (e.g. X-server & the TrueType font server) running on the system and want to verify everything being exchanged between the two over a Unix domain socket or a FIFO or a pipe. What do you do?

    Solution: Since you are the developer of the system, the obvious way to tackle this problem is to echo everything being exchanged over the Unix domain socket/FIFO/pipe to stdout. You have the source code, so you are the master. If the messages are binary you can print them out in hex or store them in separate files - one for the client, the other for the server - and examine them later with a hex editor. Obviously, these are required for this to work:

    - You own the source code and know the innards enough to insert debugging statements
    - You are willing to write extra code to debug the client-server

  2. You have recently noticed a suspicious program which seems to be reading/writing to sockets and communicating with another program non-stop over a pipe. And you need a way to find out everything being exchanged over the sockets and the pipe. What do you do?

    Solution: Ok, this one is tricky. If the program is genuinely malicious, it will not ask the user for the address of a server to connect to, the way regular FTP/mail clients do - instead it will most probably use a hard-coded address (IP/host). This means that placing a man-in-the-middle server in the path to look at everything being exchanged is not a straightforward task. The first step is to find out the IP address of the machine the program is trying to connect to. This can be done using simple tools like "lsof" or "netstat" to list the sockets opened by the program and check the remote address there. The second step is to fool the program into connecting to a server with "that IP address". Of course you'll have to write this server, or use some freely available tool, to pose as the man in the middle. Apply your networking knowledge there (another task for you ;-). Ok, you got hold of the sockets by the ears, what about the pipes?
    Use a system call tracer such as "strace" to find out what is being written to the pipe file descriptors. To see which file descriptor is which, scour the /proc/pid/fd directory and pick out the entries that point to a pipe. Then attach the system call tracer to the program using its pid.

    strace -p pid -x -s 1024

    will do for a start ;-). By the way, this trick should work for sockets as well. Told ya, there are more solutions than one to these depending on what you prefer. And I kind of don't prefer any of these. They are too messy. More on this later.
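
If you'd rather not eyeball /proc/pid/fd by hand, the scan for pipe descriptors can be sketched in a few lines of Python. This is only an illustration under assumptions: it expects a Linux-style /proc where each pipe descriptor is a symlink whose target starts with "pipe:", and the helper name `pipe_fds` is made up for this post. You also need permission to read the target process's fd directory.

```python
import os


def pipe_fds(pid):
    """Return {fd number: link target} for the descriptors of pid that are pipes."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were scanning
        if target.startswith("pipe:"):
            result[int(name)] = target
    return result
```

Calling `pipe_fds(1234)` for a hypothetical pid 1234 gives you the descriptor numbers to look for in the read/write calls that strace prints. Two descriptors showing the same target are the two ends of the same pipe.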

Well, for now these two will do for a start. Me back to work... Meanwhile you muse about the remaining three. The ideal solution hasn't been discussed, mind you. Well, of course you know. Don't ya?
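
Going back to the first problem for a moment: the "print them out in hex, one file per side" idea could look roughly like the sketch below. The function names `hex_dump` and `log_message` and the log layout are my own assumptions for illustration. In a real client-server you would call something like this from wherever your code sends and receives messages.

```python
def hex_dump(data, width=16):
    """Format bytes as offset, hex columns and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots in the text column.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def log_message(path, direction, payload):
    """Append one message to a per-side log file (one file for each peer)."""
    with open(path, "a") as log:
        log.write(f"--- {direction} ({len(payload)} bytes)\n")
        log.write(hex_dump(payload) + "\n")
```

The client side might call `log_message("client.log", "client -> server", msg)` just before writing `msg` to the socket, and the server would do the same with its own file. Both file names are placeholders.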

Friday, December 02, 2005

A Brainteaser


  1. You wrote the communication protocol subsystem for a local client-server system (e.g. X-server & the TrueType font server) running on the system and want to verify everything being exchanged between the two over a Unix domain socket or a FIFO or a pipe. What do you do?
  2. You have recently noticed a suspicious program which seems to be reading/writing to sockets and communicating with another program non-stop over a pipe. And you need a way to find out everything being exchanged over the sockets and the pipe. What do you do?
  3. You are a system administrator and have noticed suspicious commands being executed by a particular user on one of your servers, and you want to find out the exact transcript of that user's session any time that user logs in. What do you do?
  4. You are a developer working on a Volume Manager which implements a storage virtualization system. And you want to see every update being done to a particular region of the virtual disk (a volume) by your I/O daemons. What do you do?
  5. You are responsible for security on a high-profile project which stores all of its files on servers completely isolated yet accessible to internal users. And you fear that the Mossad has a mole planted inside your organization (I can get carried away sometimes ;-). Now you want to monitor access to sensitive directories and log user/date details whenever these are accessed. How do you do that?

I'll give you time to think about these and then probably make comments on what I think the ideal solution(s) should be like soon...

Wednesday, October 12, 2005

Divert Functions: A recipe for fault injection

"Given a point in the code, you have to build a system to inject a fault in the operating system."

It was a statement made by my manager at Hewlett-Packard, who was overseeing the OpenSSI project. I was still pursuing my post-graduation at IIIT, Bangalore and this project was to be the culmination of that two-year program. This was the beginning of the fault-injection system that was to be built for the OpenSSI cluster. The enormity of the task was beyond most people there. The only thing that made me take up the project was my belief in myself. Although I had written a small operating system over the previous three semesters and had a lot of experience writing all kinds of user-space programs on Linux, I had not done any major programming in the Linux kernel apart from some small kernel modules.

A presentation on OpenSSI by Dr. Badrinath from HP earlier at IIITB had left most people high and dry owing to the complexity of the OpenSSI kernel. It had excited me to the point of obsession. There was no other course at IIITB specifically related to systems programming in general-purpose operating systems (although embedded systems was there), and this was my only chance to gate-crash the party. Even the small OS I had built earlier, I had done on my own time under the guidance of my mentor Prof. P. C. P. Bhatt. If it hadn't been for his faith in me and his enormous support I probably wouldn't have made it this far on my own. I had to work on OpenSSI, no matter what. I had found my Zahir for the most satisfying stint of my life in software so far. It was an enormous leap of faith.

Starting from that statement by the HP manager, and except for the initial fault-injection survey (which was done by three of us under the guidance of Prof. Purnendu Sinha), the design and development of the whole system was done by me. The result was FISSI - Fault Injection for OpenSSI. The article on KProbes was a result of working on this project. KProbes, as I have described in a previous post, was the work of IBM and had only recently been included in the Linux kernel at that time. Its antecedents go further back, to Cygnus, where the whole idea germinated and was described in the paper: The Heisenberg Debugging Technology.

In very simple terms, KProbes allows some user-written code to be executed just before and after the execution of a particular instruction. This allows a user to insert tracing statements as well as modify registers in the context in which the instruction was executed. This much I gleaned from the Linux-Bangalore 2004 summit that we attended just before the start of the FISSI project. There was a special session on KProbes, given by IBM's Prasanna Panchmukhi. Not all the implications of what he described were immediately clear to me, but I decided to pursue it further. I had a faint feeling KProbes could be of some help in implementing a fault-injection system in the kernel because it did something similar to what I had to do: in a running system it wrested control for itself and allowed arbitrary code to be executed.

In our survey of fault-injection systems we came across all kinds of them - hardware based, injection libraries which needed developers to insert fault-injection statements in their code and recompile, communication protocol fault-injectors, user-space fault-injectors, kernel based fault-injectors etc. I was very clear from the beginning about what kind of fault-injector we had to build - something that was immediately useful to developers without the need to modify their own source code. I knew no one would want to use a fault-injection system where extra code was needed to simulate faults. That would make the project code (the OpenSSI kernel in this case) unmaintainable, apart from making it incomprehensible to newcomers. Besides, testing and QA teams are usually separate from development teams, so the people who test the system have little or no idea about the code inside a software package. The fault-injector should be immediately useful to these people. It should allow them to use it without knowing the source code, while still being flexible enough to let others more knowledgeable about the code try out complex scenarios by writing custom fault-injectors.

Working day and night to get into the guts of KProbes allowed me to completely understand the code path in the kernel used by KProbes. It was complex to say the least and very delicate. It also made it clear to me that KProbes was not designed to inject faults per se (although it could to a certain degree), rather it was more of a trace tool to trace the execution of instructions inside the kernel. It was designed as a component of DProbes - IBM's kernel tracing framework.

I started thinking about the method I could use to inject faults. The initial design involved changing the value of the accumulator just before a function returned, which would change the return value of the function. Since most errors inside the kernel are indicated by functions returning predefined error values, this seemed to be a good idea, and it was easily achievable using KProbes. A probe would be inserted just before the ret instruction in a function; it would alter the value in the accumulator and then execute the ret instruction, effectively changing the return value of the function. On deeper probing it became clear that just changing the return value of a function was not enough. The function could already have executed code that altered the state of the system, leaving that state inconsistent with the return value (akin to carrying out an operation successfully and then returning an error code). This would not work.

The next design iteration called for a probe right at the start of the function, which would change the accumulator value and immediately transfer control to the return address on the stack (by changing the instruction pointer register), effectively hijacking the function completely. This would produce the illusion that the function had simply returned an error without changing the state of the system. This looked really promising. The problem was that unmodified KProbes was useless in this case, since KProbes does not allow a probe handler to change the instruction pointer (although other general purpose registers can be changed). On return from the probe handler it overwrites the instruction pointer with what it thinks the next address of execution should be, which in this case is the address of the next instruction in the function. So now we had a design for injecting faults in a running system by simulating error returns from lower-level functions, but no support in the kernel for implementing it directly.
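
The difference between the two designs is easier to see in a toy user-space analogy. The Python sketch below has nothing to do with the real KProbes code: `Disk`, `override_return` and `hijack_entry` are names I made up, and wrapping a Python function only stands in for planting a probe before ret or at function entry.

```python
EIO = -5  # stand-in for a kernel-style negative error code


class Disk:
    """Hypothetical object whose method has a visible side effect."""

    def __init__(self):
        self.blocks_written = 0

    def write_block(self):
        self.blocks_written += 1  # side effect: the state changes
        return 0                  # success


def override_return(func, error):
    """First design: let the body run, then replace its return value."""
    def wrapper(*args, **kwargs):
        func(*args, **kwargs)     # the side effects still happen
        return error
    return wrapper


def hijack_entry(func, error):
    """Second design: return the error before the body runs at all."""
    def wrapper(*args, **kwargs):
        return error              # the original body is never entered
    return wrapper
```

Wrap `write_block` with `override_return` and the caller sees an error even though `blocks_written` has gone up by one, which is exactly the inconsistency that sank the first design. Wrap it with `hijack_entry` and the caller sees the same error with the counter untouched.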

However, it turned out that there could be cases where a function changes the state of the system to some extent and then returns an error value. Simply hijacking the function and returning an error value would not work in this case. Changing the return values of functions could potentially cover a large number of faults inside the kernel (as I found out by checking functions in the kernel, somewhat arbitrarily though), but there were other cases where changing the return value was not enough. An example is accessing a wrong address inside the kernel, which triggers a page fault and subsequently an oops. This could only be simulated by actually executing an instruction which accessed a wrong address. Moreover, this had to be done inside some function in the kernel whose code could not be changed at runtime. All of this led to looking for a way to replace code paths inside the kernel completely, so that execution flow could be transferred to some other dynamically injected code where anything could be done to simulate a fault.

The final result was divert functions. A divert function is a function dynamically inserted into the kernel (using a kernel module) with the same prototype as some other function, from which the execution flow is to be diverted to it. The diversion happens just before the other function executes, so that the divert function effectively replaces the original function. This produces god mode for the fault-injector: anything can be done within a divert function to simulate a fault inside the original function in the kernel. We had a powerful capability now. The problem was how to implement it.

Then I paid another visit to JProbes, which is part of the KProbes infrastructure and allows access to a function's arguments. This is done by transferring control to another function with the same prototype as the probed function. That function receives a copy of the probed function's arguments, which it can print out or use. The problem was that JProbes always transferred control back to the original function, and restored the original context (processor registers and stack) before giving back control. This meant I had to modify JProbes to implement divert functions, and that is what I did. Divert functions could now switch between acting like JProbes (transferring back control and restoring the context) and replacing the original function entirely. This provided even greater flexibility, because divert functions now had the ability, when they deemed it necessary, to skip fault-injection and carry on as usual as if nothing had happened. This meant that faults could be injected conditionally. A powerful capability had been achieved for fault-injection.
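
To give a feel for the idea without any kernel code, here is a rough user-space analogy in Python. It is not the FISSI implementation: `install_divert`, `RunOriginal`, the `fs` namespace and `failing_read` are all hypothetical names, and raising an exception is simply my stand-in for "hand control back to the original function, JProbes style".

```python
import functools
import types


class RunOriginal(Exception):
    """Raised by a divert function to decline injection for this call."""


def install_divert(owner, name, divert):
    """Route calls to owner.name through divert; return an uninstall callable."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def dispatcher(*args, **kwargs):
        try:
            return divert(*args, **kwargs)    # same prototype as the original
        except RunOriginal:
            return original(*args, **kwargs)  # behave like a plain probe

    setattr(owner, name, dispatcher)
    return lambda: setattr(owner, name, original)


def read_block(device, block):
    """Hypothetical function standing in for the code under test."""
    return f"{device}:{block}"


fs = types.SimpleNamespace(read_block=read_block)


def failing_read(device, block):
    """Divert function: inject an I/O error, but only for device 'sdb'."""
    if device != "sdb":
        raise RunOriginal()
    return -5  # stand-in for a kernel-style -EIO
```

After `undo = install_divert(fs, "read_block", failing_read)`, reads from "sdb" come back as -5 while every other device still goes through the real function, and calling `undo()` removes the divert again. That is the conditional injection described above, with nothing changed in the code under test.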

Apart from fault-injection, I allowed tracing capabilities to be intertwined with fault-injection capabilities in FISSI so that the effects of fault-injection could be traced inside the kernel. A lot of other things were planned, but my term at IIITB came to an end sooner than I expected.

Today FISSI is being developed further under a collaboration between IIITB and HP. Hopefully, it'll be mature enough soon to be released as part of OpenSSI.

Saturday, August 06, 2005

Yahoo!: Audio Search Debut

Audio search, ahoy! Within a few days of my last post on audio search, there comes news of Yahoo! launching its own audio search engine in beta. The results? Disappointingly, not as much as we want. But before looking at the disappointments let's look at what it has to offer.

Yahoo! has linked with around 18 music providers online and has a database of what content is available with which provider. Search results contain references to these music providers, which can make the content available to the users. The results also contain links to other relevant information from Yahoo!'s other services such as image, video and news search. This information is shown on the right-hand side when it is available for the search made. A nice touch, but I wouldn't be interested in those results if I really am searching for audio here; nevertheless a nice feature in case you do want it. Going by the popularity of Britney Spears searches made on search engines, people certainly would like to see suggestive images of Britney Spears if they listen to her music I guess, who wouldn't? Ok, deny if you want to.

The engine by default provides links to online music providers where you can buy the music. However, in case you are looking for samples to listen to before making a decision to buy, you can go to the "More options" link and narrow the search by selecting the duration or format of the search results. However, there's no option to choose between links to online music stores OR samples available on the web in the search results. If you really want samples available on the web, then you'll have to choose one of the formats, to tell the engine to show only links to files in that format. That means apart from the links to online stores, samples available in all the other formats will also not appear in the results. But that's not what we want, we want to listen to samples in any format before going to an online store to buy the music. Even more disappointing is the fact that the search results for samples contain links to the web pages which contained the file. I speak in the past tense because more often than not the links to those files are gone. Moreover, the extra level of clicks and search on the web page (which might contain hundreds of 'em) only results in more time being spent on visually searching for the correct piece of information we seek (in this case the link to the sample). Sorry, you can't have everything your way. Shrug.

Well, you needn't despair. Singing Fish is still there, and it links to the sample file directly, mercifully. Singing Fish also does not have the problem of only getting links to online stores or a single format in the search results. I know you are still confused about that one... even I am.

Apart from the audio content, the search engine can help you search the web for an artist's biography and reviews as well. In this department as well, beats Yahoo! in the sheer ease of accessing the biography as well as other recommendations for the artist. That's because the biography, popular songs and other recommendations are directly available in the site's interface instead of pointing to other web pages on the Internet containing them like Yahoo! does. really understands its audience - the music lovers who not only want to know about an artist, but also want to know about their most popular songs or albums, as well as other artists in the same genre.

Yahoo! is alien to the concept of measuring the popularity of audio content as well as the concept of genres. Sadly, the folks at Yahoo! perhaps think that search for audio is done only by those people who know what music they want, but don't know where to get it. What about the rest of us, who are not present in the Western hemisphere to know through word of mouth or other media what is/was good music? I want to listen to tracks which were a craze in the 60's or 70's, but I can neither search for any recommendations nor gauge the popularity of a song I have never heard about. That is an important part, since it saves me a lot of trouble, as well as time spent listening to unheard-of music to determine whether I like it. The engine Yahoo! offers is really a stringing together of Yahoo!'s web searching technology with a database of songs/artists/albums/online music stores.

Another aspect of music does not figure at all. How I wish there was the slightest hint of providing a feature to allow searching for the lyrics of a song. As I pointed out before, lyrics search is an important aspect that we fans out here look for - as this guy pointed out as well on Slashdot. We want it - you can take that as a yes! Till then try lyrics search here.

Sadly, the guys behind this search engine do not understand a music aficionado. Overall, we are still where we were, as far as music search is concerned. No breakthroughs here.

Monday, July 18, 2005

Search Engines: An Audio Trail

Life without a search engine on the Internet is unimaginable now. If it weren't for the likes of Google and Yahoo!, the Internet would have been a strange and mysterious place covered in darkness, most of it inaccessible. Search engines have pervaded every nook and corner of the WWW. A look at the database of WWW URLs amassed at DMOZ blows one away - this site is apparently the starting point for several kinds of speciality search engines which people are designing. If the DMOZ database containing all the URLs is downloaded and decompressed, it takes more than a gigabyte of space. The potential for search is just unlimited with the amount of data out there. To an extent search engines like Google have solved the initial problem of searching textual data in an efficient and scalable way. But this is just the beginning...

The next generation of the search war is already being fought over video content. Google Video and Yahoo! Video are already fighting it out, although they are based on different semantics. Google not only allows you to search for playable video content, it also suggests the possible time the video will be aired by TV channels. In the Google preferences users can, for example, choose to enter the zip code for the area in the U.S. where the video will be aired, as well as set their preference for a particular channel it will be aired on. In the Yahoo! preferences, on the other hand, users can set the format of the video files, their size and their duration as search criteria. Well, to each his own. Apart from the big players, there are smaller fish as well, hoping to carve out a niche for themselves in the video search market, however small. Take a look at Blinkx.

But that's just the beginning. Audio comes next. Yahoo! already seems to be working on an audio search engine. But there are other players out there already offering more than just search. If you forgive the quality of their interfaces and take a look at the content, perhaps you'd find the utility of these search engines. The first one is a fabulous site offering track recommendations, artist biographies sourced from Wikipedia, playlist collections of users for the interested, and searches based on track titles, artists and albums. An excellent way of finding those hidden gems out there. It also shows the most popular songs by an artist on that artist's page. Apart from that there's the concept of communities, where artists are grouped under a community tag such as Rock, Country etc. From an artist's page, for example, you can go and look for other artists in the same community. But then what if you want to listen to those tracks instead of just going by the popularity ratings?

Ok, that one will not solve that problem of yours - admitted. However, there's another that will! Take a look at Singing Fish. This search engine will allow you to search for clips of songs in various formats such as RealPlayer, WMA, MP3 etc. Combine these together with a lyrics search engine and that completes the whole gamut of features you'd want in an audio search engine. But wait a minute, where do you search for lyrics? Well there's this odd one. But hey it works! What we need is a search engine out there which combines all these three aspects of audio search and we have the next big winner on our hands! Hey, anybody listening?

If you'd like to take a look at what I found out there with the above, you need not wait. Here are the winners:

  1. Tori Amos - Silent all these years
  2. Tori Amos - Professional widow
  3. Peaches & Herb - Shake your groove thing
  4. Blondie - Atomic
  5. Dido - White flag
  6. Garbage - I'm only happy when it rains
  7. The Cardigans - Lovefool
  8. Whitesnake - Is this love?
  9. Billy Idol - White wedding
  10. Dolly Parton - I will always love you
  11. Dolly Parton - Jolene
  12. Pat Benatar - Hit me with your best shot
  13. Men At Work - Down under
  14. Men At Work - Who can it be now?

Try searching these and you might find other gems as well...

Thursday, May 12, 2005

The essence of work.

Then a ploughman said, Speak to us of Work.
And he answered, saying:

You work that you may keep pace with
the earth and the soul of the earth.
For to be idle is to become a stranger unto the seasons,
and to step out of life's procession, that marches in majesty
and proud submission towards the infinite.

When you work you are a flute through
whose heart the whispering of the hours
turns to music.
Which of you would be a reed, dumb and silent,
When all else sings together in unison?

Always you have been told that work is
a curse and labour a misfortune.
But I say to you that when you work you fulfil
a part of earth's furthest dream, assigned to you
when the dream was born,
And in keeping yourself with labour you
are in truth loving life,
And to love life's labour is to be intimate
with life's innermost secret.

But if in your pain you would call birth an affliction
and the support of the flesh a curse
written upon your brow,
then I answer that naught but the sweat of your brow
shall wash away that which is written.

You have been told that life is darkness,
and in your weariness you echo what
was said by the weary.
And I say that life is indeed a darkness
save when there is urge.
And all urge is blind save when there is knowledge,
And all knowledge is vain save when there is work,
And all work is empty save when there is love;
And when you work with love you bind
yourself to yourself, and to one another,
and to God.

And what is it to work with love?
It is to weave the cloth with threads drawn
from your own heart,
even as if your beloved
were to wear that cloth.
It is to build a house with affection, even as if your beloved
were to dwell in that house.
It is to sow seeds with tenderness and
reap the harvest with joy, even as if your beloved
were to eat the fruit.
It is to charge all things you fashion
with a breath of your own spirit,
And to know that all the blessed dead
are standing about you and watching.

Often have I heard you say, as if speaking in sleep,
"He who works in marble, and finds the shape
of his own soul in the stone,
is nobler than he who ploughs the soil.
And he who seizes the rainbow to lay it on a
cloth in the likeness of man, is more
than he who makes the sandals for our feet."
But I say, not in sleep but in the over-wakefulness of noontide,
that the wind speaks not more sweetly to the giant oaks
than to the least of all the blades of grass.
And he alone is great who turns the voice
of the wind into a song made sweeter by
his own loving.
Work is love made visible.
And if you cannot work with love but only
with distaste, it is better that you should
leave your work and sit at the gate of the
temple and take alms of those who work with joy.
For if you bake bread with indifference
you bake a bitter bread that feeds but half
man's hunger.
And if you grudge the crushing of the
grapes, your grudge distills a poison in the wine.
And if you sing though as angels, and
love not the singing, you muffle man's ears
to the voices of the day and the voices of
the night.

- "The Prophet" by Kahlil Gibran