Creating a Multi-Call Linux Binary
Note: This publication is now archived. For reference only.
A multi-call binary is an executable, written in C, that performs the action of more than one utility. A prime example of a multi-call binary is the BusyBox package. BusyBox implements a large number of standard Linux utilities (such as the ls and ln commands) in a single executable. This enables specialized Linux distributions to have a reduced size. This tip describes how multi-call binaries are written.
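There are two ways to invoke BusyBox functions: pass the function name as the first argument to the BusyBox executable, or invoke a symbolic link named after the function: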
# pwd
In the first output you can also see login, which is a symbolic link to /bin/tinylogin. TinyLogin is a partner program to BusyBox, and performs the functions of programs like login and sulogin. These functions could have been implemented in BusyBox, but for security reasons it is preferred to have a separate executable for login processing.
This example also shows another feature of BusyBox. In the full GNU implementation of ls, the -G option is valid (it suppresses the display of the group name in the directory listing). To save space, however, BusyBox does not implement every option of every utility. This is appropriate for BusyBox, since the idea is to eliminate unused (or little-used) functions in order to reduce the size of the executable.
So, how does a multi-call binary like BusyBox, when invoked using a symbolic link, know what function to perform? The answer is that a multi-call binary is written differently from a normal program.
The C language is used for most systems programming on UNIX/POSIX systems. Programs written in C always have a main() function, which is the first part of the program to be executed. The main function is written in a particular way, to allow the operating system to pass parameters to it. A typical main() function declaration appears here:
int main(int argc, char *argv[])
The parameters passed to the main() function are argc, an integer containing the number of parameters passed by the system to the program, and argv, the list of the parameters passed. By convention (on UNIX/POSIX systems, at least), there will always be at least one parameter passed to the program: the name used to invoke the program. This is usually the command typed by the user at the shell prompt to invoke the command, and will just about always be the name of the file that contains the program. In C notation, this value (the first item in the array called argv) is argv[0].
Most single-call binaries ignore the contents of argv[0], as the program is designed to perform a single task and it is irrelevant what name the system used to invoke the program. Some programs, for security reasons, do make sure that the command issued is correct. This can prevent a malicious user from executing a program they should not have access to.
A multi-call binary pays attention to this parameter, however, and uses it to determine which function to execute. In the case of BusyBox, if argv[0] is the same as the executable file name, it will use the second item in the parameter list (argv[1]) as the name of the function to be executed. If argv[0] is not the same as the name of the BusyBox executable file, it will attempt to use the contents of argv[0] as the name of the requested function.