5. Miscellaneous programming

5.1 How do I compare strings using wildcards?

The answer to that depends on what exactly you mean by `wildcards'.

There are two quite different concepts that qualify as `wildcards'. They are:

Filename patterns
These are what the shell uses for filename expansion (`globbing').
Regular Expressions
These are used by editors, grep, etc. for matching text, but they normally aren't applied to filenames.

5.1.1 How do I compare strings using filename patterns?

Unless you are unlucky, your system should have a function fnmatch() to do filename matching. This generally allows only the Bourne shell style of pattern; i.e. it recognises `*', `[...]' and `?', but probably won't support the more arcane patterns available in the Korn and Bourne-Again shells.

If you don't have this function, then rather than reinvent the wheel, you are probably better off snarfing a copy from the BSD or GNU sources.

Also, for the common cases of matching actual filenames, look for glob(), which will find all existing files matching a pattern.
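
As a rough sketch of how these two calls are used (the function names name_matches and list_matching_files are made up for this example, and error handling is kept to a minimum):

```c
#include <fnmatch.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name matches the shell-style pattern, 0 if it does not,
 * and -1 if fnmatch() reports an error. */
int name_matches(const char *pattern, const char *name)
{
    int rc = fnmatch(pattern, name, 0);

    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    return -1;
}

/* Print every existing file matching pattern, one per line.  Returns
 * the number of files found, or -1 on error. */
int list_matching_files(const char *pattern)
{
    glob_t results;
    size_t i;
    int count;
    int rc;

    memset(&results, 0, sizeof results);
    rc = glob(pattern, 0, NULL, &results);

    if (rc == 0)
    {
        for (i = 0; i < results.gl_pathc; i++)
            puts(results.gl_pathv[i]);
        count = (int)results.gl_pathc;
    }
    else if (rc == GLOB_NOMATCH)
        count = 0;
    else
        count = -1;

    globfree(&results);
    return count;
}
```

For instance, name_matches("*.c", "main.c") reports a match, while list_matching_files("/etc/*.conf") would list whatever configuration files happen to exist on the system.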

5.1.2 How do I compare strings using regular expressions?

There are a number of slightly different syntaxes for regular expressions; most systems use at least two: the one recognised by ed, sometimes known as `Basic Regular Expressions', and the one recognised by egrep, `Extended Regular Expressions'. Perl has its own slightly different flavour, as does Emacs.

To support this multitude of formats, there is a corresponding multitude of implementations. Systems will generally have regexp-matching functions (usually regcomp() and regexec()) supplied, but be wary; some systems have more than one implementation of these functions available, with different interfaces. In addition, there are many library implementations available. (It's common, BTW, for regexps to be compiled to an internal form before use, on the assumption that you may compare several separate strings against the same regexp.)
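
If your system supplies the POSIX flavour of regcomp() and regexec() in <regex.h>, the compile-once, match-many pattern might look something like the following sketch. The function name count_regex_matches is invented for the example, and other implementations of these functions may have different interfaces.

```c
#include <regex.h>
#include <stddef.h>

/* Compile an Extended Regular Expression once, then test each of the
 * n strings against it.  Returns the number of strings that match,
 * or -1 if the pattern fails to compile. */
int count_regex_matches(const char *pattern,
                        const char *const *strings, size_t n)
{
    regex_t re;
    size_t i;
    int count = 0;

    /* REG_NOSUB: we only want a yes/no answer, not match offsets */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (i = 0; i < n; i++)
        if (regexec(&re, strings[i], 0, NULL, 0) == 0)
            ++count;

    regfree(&re);    /* release the compiled form */
    return count;
}
```

Dropping REG_EXTENDED from the flags would make regcomp() treat the pattern as a Basic Regular Expression instead.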

One library available for this is the `rx' library, available from the GNU mirrors. This seems to be under active development, which may be a good or a bad thing depending on your point of view :-)

5.2 What's the best way to send mail from a program?

There are several ways to send email from a Unix program. Which is the best method to use in a given situation varies, so I'll present two of them. A third possibility, not covered here, is to connect to a local SMTP port (or a smarthost) and use SMTP directly; see RFC 821.

5.2.1 The simple method: /bin/mail

For simple applications, it may be sufficient to invoke mail (usually `/bin/mail', but could be `/usr/bin/mail' on some systems).

WARNING: Some versions of UCB Mail may execute commands prefixed by `~!' or `~|' given in the message body even in non-interactive mode. This can be a security risk.

Invoked as `mail -s 'subject' recipients...' it will take a message body on standard input, and supply a default header (including the specified subject), and pass the message to sendmail for delivery.

This example mails a test message to root on the local system:

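#include <stdio.h>
#include <stdlib.h>    /* exit() */

#define MAILPROG "/bin/mail"

int main(void)
{
    /* popen() runs the command via the shell; we write the body to its stdin */
    FILE *mail = popen(MAILPROG " -s 'Test Message' root", "w");
    if (!mail)
    {
        perror("popen");
        exit(1);
    }

    fprintf(mail, "This is a test.\n");

    /* pclose() returns the mailer's wait status; nonzero means failure */
    if (pclose(mail))
    {
        fprintf(stderr, "mail failed!\n");
        exit(1);
    }

    return 0;
}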

If the text to be sent is already in a file, then one can do:

    system(MAILPROG " -s 'file contents' root </tmp/filename");

These methods can be extended to more complex cases, but there are many pitfalls to watch out for.

5.2.2 Invoking the MTA directly: /usr/lib/sendmail

The mail program is an example of a Mail User Agent, a program intended to be invoked by the user to send and receive mail, but which does not handle the actual transport. A program for transporting mail is called an MTA, and the most commonly found MTA on Unix systems is called sendmail. There are other MTAs in use, such as MMDF, but these generally include a program that emulates the usual behaviour of sendmail.

Historically, sendmail has usually been found in `/usr/lib', but the current trend is to move library programs out of `/usr/lib' into directories such as `/usr/sbin' or `/usr/libexec'. As a result, one normally invokes sendmail by its full path, which is system-dependent.

To understand how sendmail behaves, it's useful to understand the concept of an envelope. This is very much like paper mail; the envelope defines who the message is to be delivered to, and who it is from (for the purpose of reporting errors). Contained in the envelope are the headers, and the body, separated by a blank line. The format of the headers is specified primarily by RFC 822; see also the MIME RFCs.
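
To make the layout concrete, here is a small sketch that builds the contents of the envelope: two headers, the blank line, then the body. The function name format_message is hypothetical, and a real message would normally carry more headers than this.

```c
#include <stdio.h>

/* Write a minimal message to buf: two headers, the blank line that ends
 * the header section, then the body.  Returns the number of characters
 * written, or -1 if buf is too small.  The caller must make sure that
 * to and subject contain no newlines. */
int format_message(char *buf, size_t size,
                   const char *to, const char *subject, const char *body)
{
    int n = snprintf(buf, size,
                     "To: %s\n"
                     "Subject: %s\n"
                     "\n"
                     "%s\n",
                     to, subject, body);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Note that nothing here says who the message is actually delivered to; that is the envelope's job, and the To: header is just part of the contents.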

There are two main ways to use sendmail to originate a message: either the envelope recipients can be explicitly supplied, or sendmail can be instructed to deduce them from the message headers. Both methods have advantages and disadvantages.

5.2.2.1 Supplying the envelope explicitly

The recipients of a message can simply be specified on the command line. This has the drawback that mail addresses can contain characters that give system() and popen() considerable grief, such as single quotes, quoted strings etc. Passing these constructs successfully through shell interpretation presents pitfalls. (One can do it by replacing any single quotes by the sequence single-quote backslash single-quote single-quote, then surrounding the entire address with single quotes. Ugly, huh?)
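
The quoting trick just described could be written roughly as follows. This is only a sketch: the name shell_quote is made up, and it assumes the string is headed for a Bourne-style shell, which is what system() and popen() normally use.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly malloc()ed copy of s, quoted so that a Bourne-style
 * shell treats it as a single literal word; NULL if out of memory.
 * The caller must free() the result. */
char *shell_quote(const char *s)
{
    size_t len = 2;              /* opening and closing quote */
    const char *p;
    char *out, *q;

    for (p = s; *p; p++)
        len += (*p == '\'') ? 4 : 1;    /* ' becomes '\'' */

    out = malloc(len + 1);
    if (!out)
        return NULL;

    q = out;
    *q++ = '\'';
    for (p = s; *p; p++)
    {
        if (*p == '\'')
        {
            memcpy(q, "'\\''", 4);      /* close, escaped quote, reopen */
            q += 4;
        }
        else
            *q++ = *p;
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

Given the address o'brien@example.com, this produces 'o'\''brien@example.com', which the shell turns back into the original string.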

Some of this unpleasantness can be avoided by eschewing the use of system() or popen(), and resorting to fork() and exec() directly. This is sometimes necessary in any event; for example, user-installed handlers for SIGCHLD will usually break pclose() to a greater or lesser extent.

Here's an example:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>    /* memcpy() */
#include <fcntl.h>
#include <sysexits.h>
/* #include <paths.h> if you have it */

#ifndef _PATH_SENDMAIL
#define _PATH_SENDMAIL "/usr/lib/sendmail"
#endif

/* -oi means "don't treat . as a message terminator"
 * remove ,"--" if using a pre-V8 sendmail (and hope that no-one
 * ever uses a recipient address starting with a hyphen)
 * you might wish to add -oem (report errors by mail)
 */

#define SENDMAIL_OPTS "-oi","--"

/* this is a macro for returning the number of elements in array */

#define countof(a) ((sizeof(a))/sizeof((a)[0]))

/* send the contents of the file open for reading on FD to the
 * specified recipients; the file is assumed to contain RFC822 headers
 * & body, the recipient list is terminated by a NULL pointer; returns
 * -1 if error detected, otherwise the return value from sendmail
 * (which uses <sysexits.h> to provide meaningful exit codes)
 */

int send_message(int fd, const char **recipients)
{
    static const char *argv_init[] = { _PATH_SENDMAIL, SENDMAIL_OPTS };
    const char **argvec = NULL;
    int num_recip = 0;
    pid_t pid;
    int rc;
    int status;

    /* count number of recipients */

    while (recipients[num_recip])
        ++num_recip;

    if (!num_recip)
        return 0;    /* sending to no recipients is successful */

    /* alloc space for argument vector */

    argvec = malloc(sizeof(char*) * (num_recip+countof(argv_init)+1));
    if (!argvec)
        return -1;

    /* initialise argument vector */

    memcpy(argvec, argv_init, sizeof(argv_init));
    memcpy(argvec+countof(argv_init),
           recipients, num_recip*sizeof(char*));
    argvec[num_recip + countof(argv_init)] = NULL;

    /* may need to add some signal blocking here. */

    /* fork */

    switch (pid = fork())
    {
    case 0:   /* child */

        /* Plumbing */
        if (fd != STDIN_FILENO)
            dup2(fd, STDIN_FILENO);

        /* defined elsewhere -- closes all FDs >= argument */
        closeall(3);

        /* go for it: execv() only returns if it failed */
        execv(_PATH_SENDMAIL, (char *const *)argvec);
        _exit(EX_OSFILE);

    default:  /* parent */

        free(argvec);
        rc = waitpid(pid, &status, 0);
        if (rc < 0)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        return -1;

    case -1:  /* error */
        free(argvec);
        return -1;
    }
}
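
Since send_message() hands back sendmail's exit status, the caller can turn it into something readable using the names from <sysexits.h>. The following is only a sketch: describe_send_result is a hypothetical helper, it lists just a handful of the codes, and the wording of the descriptions is mine rather than sendmail's.

```c
#include <sysexits.h>

/* Turn the value returned by send_message() into a short description.
 * Only a few of the <sysexits.h> codes are listed here. */
const char *describe_send_result(int rc)
{
    switch (rc)
    {
    case -1:             return "could not run or wait for sendmail";
    case EX_OK:          return "message accepted";
    case EX_USAGE:       return "bad command line";
    case EX_NOUSER:      return "no such recipient";
    case EX_NOHOST:      return "host name unknown";
    case EX_UNAVAILABLE: return "service unavailable";
    case EX_OSFILE:      return "system file missing, or execv() failed";
    case EX_TEMPFAIL:    return "temporary failure; try again later";
    default:             return "exit code not listed here";
    }
}
```

Note that EX_OSFILE is also what the child in send_message() returns if the execv() call itself fails, so that code is ambiguous.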

5.2.2.2 Allowing sendmail to deduce the recipients

The `-t' option to sendmail instructs sendmail to parse the headers of the message, and use all the recipient-type headers (i.e. To:, Cc: and Bcc:) to construct the list of envelope recipients. This has the advantage of simplifying the sendmail command line, but makes it impossible to specify recipients other than those listed in the headers. (This is not usually a problem.)

As an example, here's a program to mail a file on standard input to specified recipients as a MIME attachment. Some error checks have been omitted for brevity. This requires the `mimencode' program from the metamail distribution.

#include <stdio.h>
#include <stdlib.h>    /* exit(), atexit(), system() */
#include <unistd.h>
#include <fcntl.h>
/* #include <paths.h> if you have it */

#ifndef _PATH_SENDMAIL
#define _PATH_SENDMAIL "/usr/lib/sendmail"
#endif

#define SENDMAIL_OPTS "-oi"
#define countof(a) ((sizeof(a))/sizeof((a)[0]))

char tfilename[L_tmpnam];
char command[128+L_tmpnam];

void cleanup(void)
{
    unlink(tfilename);
}

int main(int argc, char **argv)
{
    FILE *msg;
    int i;

    if (argc < 2)
    {
        fprintf(stderr, "usage: %s recipients...\n", argv[0]);
        exit(2);
    }

    if (tmpnam(tfilename) == NULL
        || (msg = fopen(tfilename,"w")) == NULL)
        exit(2);

    atexit(cleanup);

    fclose(msg);
    msg = fopen(tfilename,"a");
    if (!msg)
        exit(2);

    /* construct recipient list */

    fprintf(msg, "To: %s", argv[1]);
    for (i = 2; i < argc; i++)
        fprintf(msg, ",\n\t%s", argv[i]);
    fputc('\n',msg);

    /* Subject */

    fprintf(msg, "Subject: file sent by mail\n");

    /* sendmail can add its own From:, Date:, Message-ID: etc. */

    /* MIME stuff */

    fprintf(msg, "MIME-Version: 1.0\n");
    fprintf(msg, "Content-Type: application/octet-stream\n");
    fprintf(msg, "Content-Transfer-Encoding: base64\n");

    /* end of headers -- insert a blank line */

    fputc('\n',msg);
    fclose(msg);

    /* invoke encoding program */

    sprintf(command, "mimencode -b >>%s", tfilename);
    if (system(command))
        exit(1);

    /* invoke mailer */

    sprintf(command, "%s %s -t <%s",
            _PATH_SENDMAIL, SENDMAIL_OPTS, tfilename);
    if (system(command))
        exit(1);

    return 0;
}
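
One weakness of the program above is tmpnam(): there is a window between choosing the name and opening the file in which someone else could create a file of that name. Where mkstemp() is available, a sketch like the following avoids the race by creating and opening the file in one step. The function name open_temp_message and the `/tmp/mailXXXXXX' template are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() and fdopen() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a fresh temporary file and return a stream open for writing,
 * or NULL on failure.  On success the file's name is stored in path,
 * which must have room for size bytes. */
FILE *open_temp_message(char *path, size_t size)
{
    static const char tmpl[] = "/tmp/mailXXXXXX";
    int fd;
    FILE *fp;

    if (size < sizeof tmpl)
        return NULL;
    memcpy(path, tmpl, sizeof tmpl);

    fd = mkstemp(path);    /* fills in the Xs, creates and opens the file */
    if (fd < 0)
        return NULL;

    fp = fdopen(fd, "w");
    if (!fp)
    {
        close(fd);
        unlink(path);
    }
    return fp;
}
```

The returned name can then be used in the mimencode and sendmail command lines exactly as tfilename is above, and removed with unlink() when finished.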