Central Iowa Railroad Herald

Compile Time

Function Pointers and dlopen()


This month we're going to take a short hiatus from the Programming Portably series being presented for the last three months. We'll resume that series next month with a discussion on source management systems.

In the meantime, I'll be presenting an interesting C language feature: function pointers, and how to use them.

When to use?

First off, I'm going to harp on code clarity and portability. (Hey, I'm writing a series on code portability... :-) Function pointers, if used poorly, can obscure the nature of the code, and perhaps impair its portability.

Used well, they can provide clarity, and probably even enhance performance of the resulting program (but be careful of praying to the false idols of efficiency or performance.) To provide clarity, descriptive variable names need to be used.

The first time I really used pointers to functions was a number of years ago (say, 1987 or so). I was writing a program that had to parse telephone switch call record tapes. By and large, all the tape records had the same information, just in different places and formats. The interesting information in each call record was the date, time, and duration of the call, plus the phone numbers involved.

The problem was there were about eight different time formats in common use, six different date formats, and several duration formats. Fortunately, the call record tapes were all from the United States, so they largely used the common North American phone number format (although some of the smaller private branch exchange switches did various forms of truncation, but that was handled in preprocessing.)

So, the program needs to call one of eight different time format standardization routines, and one of six different date format standardization routines, for each call record. Assuming that the time formats are identified by a small integer timefmt, and the date formats are identified by a small integer datefmt, the naive implementation looks like this:

switch(timefmt) {
  case 0:
      stdtime = 
	timeformat0(timestr, len(timestr));
      break;
  case 1:
      stdtime = 
	timeformat1(timestr, len(timestr));
      break;
    .
    .
    .
  default:
      /*
       * of course, this case should be
       * caught at option parsing time,
       * but... 
       */
      fprintf(stderr, 
	"%s: unknown time format: %d\n",
	pgm, timefmt);
      exit(1);
}

LISTING 0

The block for datefmt looks much the same. While the switch makes it plain what is happening, it also makes the processing loop long and harder to read, or at least harder to keep in your head, because so many lines are replicated with only minimal differences.

Every time through the processing loop, the switch must be evaluated, and the appropriate function call made. What if we could remove the evaluation of the switch statement? What if we could replace those two switch statement blocks with just two lines of code? That certainly would make the actual processing loop much clearer, wouldn't it?

Suppressing switches

Creating a pair of function pointers, one for the time format standardization and one for the date format standardization, would reduce the code related to date and time conversion to just two lines. The switch blocks of Listing 0 are replaced by Listing 1 in the processing loop, and Listing 2 is added at argument processing time. (FYI: both the date and time format standardization routines return a string in the standard format. It was up to the processes beyond the tape reader to actually convert them to something useful.)

  char *(*timeconv)(char *, int);
  char *(*dateconv)(char *, int);
  char *standardtime, *standarddate;
    .
    .
    .
  while (!EOT(tapedevice)) {
    standardtime = (*timeconv)(timestr, len(timestr));
    standarddate = (*dateconv)(datestr, len(datestr));
    .
    .
    .
  }

LISTING 1

Using function pointers

char *timeformat0(char *, int), *timeformat1(char *, int);
  .
  .
  .
switch(timefmt) {
  case 0:
      timeconv = timeformat0;
      break;
  case 1:
      timeconv = timeformat1;
      break;
    .
    .
    .
  default:
      /*
       * since we're doing option processing
       * at this point, it's reasonable to
       * bail now..
       */
      fprintf(stderr, 
	"%s: unknown time format: %d\n",
	pgm, timefmt);
      exit(1);
}

LISTING 2

Initializing function pointers

Now we've moved the complexity of figuring out which function to call to option parsing time, and condensed the processing loop a great deal. In the process, assuming we've used good variable names, we've also improved the readability. As a side effect, we've probably improved performance as well.

Gory Details

For those of us who probably didn't immediately grasp how to declare and use function pointers from the above examples, let's go into the gory details of the matter.

In both editions of K&R, function pointers are introduced, and explained with the following statement:

"In C, a function itself is not a variable, but it is possible to define pointers to functions, which can be assigned, placed in arrays, passed to functions, returned by functions, and so on." (K&R 2nd ed, pg 118, section 5.11)

Harbison and Steele (Prentice-Hall, 1987) has little more to say. It's no wonder function pointers might be considered a black art (after a fashion.)

In ANSI C, a function pointer declaration has three parts: the return type, the variable name, and the argument list declaration. Thus, for the declaration used above, we have:

char *(*timeconv)(char *, int);

The return type of the function being pointed to is char *. The name of the function pointer is timeconv, with the normal pointer syntax leading the variable name. The argument list declaration is (char *, int), saying that the function being called will have a first argument of a pointer to a character (what C programmers commonly think of as a character string), and a second argument of an integer.

It is possible to declare an empty argument list, and such a declaration was required in pre-ANSI C. However, the ANSI C standard has been around for 13 years now, and explicitly listing the prototype for the function pointer will provide the C compiler hints on usage, allowing better error checking at compile time.

Assigning a function to a function pointer is easy. Just put the function pointer on the left side, and the name of the function to be assigned on the right side. The function name must not be followed by an argument list; otherwise, you'll call the function and assign its return value instead (which is fine if the function returns a pointer to a function, and may be exactly what you wish).

Invoking a function through a function pointer is much like dereferencing any other pointer.

K&R used to say you must dereference the function pointer and wrap it in parentheses, as in Listing 1. However, ANSI C relaxed that restriction and allows a call through a function pointer to look exactly like a call to any other function. Updating our example from earlier to use ANSI C semantics:

  char *(*timeconv)(char *, int);
  char *(*dateconv)(char *, int);
  char *standardtime, *standarddate;
    .
    .
    .
  while (!EOT(tapedevice)) {
    standardtime = timeconv(timestr, len(timestr));
    standarddate = dateconv(datestr, len(datestr));
    .
    .
    .
  }

LISTING 3

Using ANSI-style function pointers

A complete example

Listing 4 contains a complete, if rather simplistic, example of using function pointers.

#include <stdio.h>
#include <stdlib.h>	/* exit() */

void hello(void) {
    puts("hello world!");
}

void goodbye(void) {
    puts("goodbye (cruel) world");
}

void whatami(void) {
    puts("what am i, a postage stamp?");
}

int main(int argc, char **argv) {
void (*speak)(void);

    if (argc < 2) {
	printf("usage: %s [h|g]\n", argv[0]);
	exit(1);
    }

    switch (argv[1][0]) {
	case 'h':
		speak = hello;
		break;
	case 'g':
		speak = goodbye;
		break;
	default:
		speak = whatami;
		break;
    }
    speak();
    exit(0);
}

LISTING 4

A simple example

Function pointers and dynamically loaded modules

Function pointers are extremely useful when using dynamically (run time) loaded shared objects. In fact, they're the only way a programmer can make direct use of the functions in a dynamically loaded object.

Examples of dynamically loaded objects are web browser plug-ins, PAM modules, and name service switch modules. In each case, the functionality is exported through a well-described interface, and the parent program can load the dynamic object and call the functions as needed.

Netscape (for example) loads all of the shared objects on a search path, calling a well defined initialization routine that registers the MIME type the shared object handles, and the routines to be used to handle that MIME type.

PAM uses a configuration file to tell the framework what shared objects may be available, and which services the shared object should be used for. The PAM framework then loads the desired shared objects based on what's been requested by its caller.

The most common dynamic loading interface is the dlopen(3/3C) interface as originally defined by SunOS 4. The defined functions are dlopen, dlsym, dlclose, and dlerror. These interfaces are defined on Solaris, HP-UX, Linux, NetBSD, FreeBSD, and others. Check your local manual pages. Some operating systems have added additional interfaces to add functionality.

Here are some example sources that demonstrate how to use dynamically loaded objects and their symbols.

Listing 5 contains the command lines to build the objects.

gcc -o dl dl.c -ldl
gcc -fPIC -shared -o dh0.so dh0.c
gcc -fPIC -shared -o dh1.so dh1.c

LISTING 5

Listings 6 and 7 are the shared objects that get loaded by the main program.

#include <stdio.h>

void
greeting(void)
{
    puts("hello world!");
}

LISTING 6

(dh0.c)

#include <stdio.h>

void
greeting(void)
{
    puts("goodbye (cruel) world");
}

LISTING 7

(dh1.c)

Listing 8 is the main program, which loads the shared objects, locates symbols within them, and then calls them.

#include <stdio.h>
#include <stdlib.h>		/* exit() */
#include <dlfcn.h>		/* the dynamic loading interface */

int
main(int argc, char **argv)
{
void *dh0, *dh1;			/* handles for the loaded objects */
void (*dh0f)(void), (*dh1f)(void);	/* functions found in them */

    dh0 = dlopen("./dh0.so", RTLD_LAZY);
    if (dh0 == NULL) {
	fprintf(stderr, "%s: open/load of dh0.so failed: %s\n",
		argv[0], dlerror());
	exit(1);
    }

    dh1 = dlopen("./dh1.so", RTLD_LAZY);
    if (dh1 == NULL) {
	fprintf(stderr, "%s: open/load of dh1.so failed: %s\n",
		argv[0], dlerror());
	exit(1);
    }

    dh0f = (void(*)(void))dlsym(dh0, "greeting");
    if (dh0f == NULL) {
	fprintf(stderr, "%s: symbol lookup in dh0.so failed: %s\n",
		argv[0], dlerror());
	exit(2);
    }

    dh1f = (void(*)(void))dlsym(dh1, "greeting");
    if (dh1f == NULL) {
	fprintf(stderr, "%s: symbol lookup in dh1.so failed: %s\n",
		argv[0], dlerror());
	exit(2);
    }

    dh0f();
    dh1f();

    exit(0);
}

LISTING 8

(dl.c)

So, function pointers are far more useful and versatile than you might have ever thought! With dynamically loaded modules, they make designing and implementing an easily expandable software architecture relatively simple and fairly painless.

That wasn't so hard

Now, function pointers shouldn't be nearly so mystifying any more. They certainly provide a useful mechanism for certain programming tasks. But be careful to make sure their use doesn't obscure the flow of the program, or make it hard for someone else (or yourself, two years hence) to maintain the software.

References

The C Programming Language (1st Edition)
Brian W. Kernighan, Dennis M. Ritchie
Prentice-Hall, Inc, Englewood Cliffs, NJ 07632
(C) 1978
ISBN: 0-13-110163-3

The C Programming Language (2nd Edition)
Brian W. Kernighan, Dennis M. Ritchie
Prentice-Hall, Inc, Englewood Cliffs, NJ 07632
(C) 1988
ISBN: 0-13-110362-8

C, A Reference Manual (2nd Edition)
Samuel P. Harbison, Guy L. Steele, Jr
Prentice-Hall, Inc, Englewood Cliffs, NJ 07632
(C) 1984
ISBN: 0-13-109802-0


Copyright 2000,2001 Central Iowa (Model) Railroad
$Id: 2003-Jan.html,v 1.2 2007/10/19 14:46:44 eric Exp $