What are C macros useful for?

I have written a little bit of C, and I can read it well enough to get a general idea of what it is doing, but every time I have encountered a macro it has thrown me completely. I end up having to remember what the macro is and substitute it in my head as I read. The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.

I can understand the need to define different build types for debug or cross platform builds in the preprocessor but the ability to define arbitrary substitutions seems to be useful only to make an already difficult language even more difficult to understand.

Why was such a complex preprocessor introduced for C? And does anyone have an example of using it that will make me understand why it still seems to be used for purposes other than simple if #debug style conditional compilations?

Edit:

Having read a number of answers I still just don't get it. The most common answer is to inline code. If the inline keyword doesn't do it then either it has a good reason to not do it, or the implementation needs fixing. I don't understand why a whole different mechanism is needed that means "really inline this code" (aside from the code being written before inline was around). I also don't understand the idea that was mentioned that "if it's too silly to be put in a function". Surely any piece of code that takes an input and produces an output is best put in a function. I think I may not be getting it because I am not used to the micro optimisations of writing C, but the preprocessor just feels like a complex solution to a few simple problems.


Macros .. for when your &#(*$& compiler just refuses to inline something.

That should be a motivational poster, no?

In all seriousness, google preprocessor abuse (you may see a similar SO question as the #1 result). If I'm writing a macro that goes beyond the functionality of assert(), I usually try to see if my compiler would actually inline a similar function.

Others will argue against using #if for conditional compilation .. they would rather you:

if (RUNNING_ON_VALGRIND)

rather than

#if RUNNING_ON_VALGRIND

.. for debugging purposes, since you can see the if() but not #if in a debugger. Then we dive into #ifdef vs #if.
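A minimal sketch of the plain if() style, assuming a hypothetical TRACE_ENABLED flag that the build may override with -DTRACE_ENABLED=1. The disabled branch is still parsed and type-checked, and an optimizing compiler will typically drop it as dead code:

```c
#include <stdio.h>

/* Hypothetical flag; defaults to off unless the build defines it. */
#ifndef TRACE_ENABLED
#define TRACE_ENABLED 0
#endif

static int scaled(int x)
{
    /* Ordinary if(): visible in a debugger, compiled even when disabled. */
    if (TRACE_ENABLED)
        fprintf(stderr, "scaled(%d)\n", x);
    return x * 2;
}
```

With #if instead, the fprintf line would not even be compiled when the flag is 0, so a typo inside it could go unnoticed until someone enables tracing.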

If it's under 10 lines of code, try to inline it. If it can't be inlined, try to optimize it. If it's too silly to be a function, make a macro.

It's good for inlining code and avoiding function call overhead. As well as using it if you want to change the behaviour later without editing lots of places. It's not useful for complex things, but for simple lines of code that you want to inline, it's not bad.

One of the obvious reasons is that by using a macro, the code will be expanded at compile time, and you get a pseudo function-call without the call overhead.

Otherwise, you can also use it for symbolic constants, so that you don't have to edit the same value in several places to change one small thing.

Remember that macros (and the pre-processor) come from the earliest days of C. They used to be the ONLY way to do inline 'functions' (because, of course, inline only became a standard keyword with C99), and in standard C they are still the only way to FORCE something to be inlined.

Also, macros are the only way you can do such tricks as inserting the file and line into string constants at compile time.
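As a rough sketch of that trick: the two-level STRINGIFY/TOSTRING pair below is a common idiom, and the names SOURCE_LOCATION, where_am_i and line_of are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

#define STRINGIFY(x) #x
#define TOSTRING(x)  STRINGIFY(x)  /* extra level so __LINE__ is expanded first */

/* Adjacent string literals are joined by the compiler: "file.c" ":" "12" */
#define SOURCE_LOCATION __FILE__ ":" TOSTRING(__LINE__)

/* Returns a string constant naming the line this function body is on. */
static const char *where_am_i(void)
{
    return SOURCE_LOCATION;
}

/* Extracts the line number from a "file:line" string, or -1 if none. */
static int line_of(const char *location)
{
    const char *colon = strrchr(location, ':');
    return colon ? atoi(colon + 1) : -1;
}
```

The string is built entirely at compile time; where_am_i() just returns a pointer to it.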

These days, many of the things that macros used to be the only way to do are better handled through newer mechanisms. But they still have their place, from time to time.

While I'm not a big fan of macros and don't tend to write much C anymore, based on my current tasking, something like this (which could obviously have some side-effects) is convenient:

#define MIN(X, Y)  ((X) < (Y) ? (X) : (Y))

Now I haven't written anything like that in years, but 'functions' like that were all over code that I maintained earlier in my career. I guess the expansion could be considered convenient.
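The side-effects mentioned above come from the argument text being pasted in twice. A small sketch, where next_value, min_int and the two counting helpers are hypothetical names used only for this demonstration:

```c
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))

/* Function equivalent, fixed to int. */
static inline int min_int(int x, int y) { return x < y ? x : y; }

static int calls;  /* how many times next_value() has run */

static int next_value(void)
{
    calls++;
    return 3;
}

/* Counts next_value() calls caused by one use of the macro. */
static int calls_via_macro(void)
{
    calls = 0;
    int m = MIN(next_value(), 10);  /* expands to two next_value() calls */
    (void)m;
    return calls;
}

/* Counts next_value() calls caused by one use of the function. */
static int calls_via_function(void)
{
    calls = 0;
    int m = min_int(next_value(), 10);  /* argument evaluated once */
    (void)m;
    return calls;
}
```

Here the macro version evaluates next_value() once for the comparison and once more for the result, while the function version evaluates it a single time.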

I end up having to remember what the macro is and substitute it in my head as I read.

That seems to reflect poorly on the naming of the macros. I would assume you wouldn't have to emulate the preprocessor if it were a log_function_entry() macro.

The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.

Usually they should be, unless they need to operate on generic parameters.

#define max(a,b) ((a)<(b)?(b):(a))

will work on any type with an < operator.

More than just functions, macros let you perform operations using the symbols in the source file. That means you can create a new variable name, or reference the source file and line number the macro is on.

In C99, macros also allow you to call variadic functions such as printf:

#define log_message(guard, format, ...) \
    do { if (guard) printf("%s:%d: " format "\n", __FILE__, __LINE__, __VA_ARGS__); } while (0)


log_message(foo == 7, "x %d", x);

In which the format works like printf. If the guard is true, it outputs the message along with the file and line number that printed the message. If it were a function call, it would not know the file and line you called it from, and using vprintf would be a bit more work.
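For comparison, here is roughly what the function-based route looks like. This is only a sketch: format_log and FORMAT_LOG are hypothetical names, and it writes into a caller-supplied buffer instead of printing. A thin macro is still needed to capture __FILE__ and __LINE__ at the call site.

```c
#include <stdarg.h>
#include <stdio.h>

/* Writes "file:line: " followed by the formatted message into out.
   Returns the number of characters the message needed, or a negative
   value on a formatting error. */
static int format_log(char *out, size_t size, const char *file, int line,
                      const char *format, ...)
{
    va_list args;
    int used = snprintf(out, size, "%s:%d: ", file, line);
    if (used < 0 || (size_t)used >= size)
        return used;  /* error, or no room left for the message */

    va_start(args, format);
    int more = vsnprintf(out + used, size - (size_t)used, format, args);
    va_end(args);
    return more < 0 ? more : used + more;
}

/* The macro only supplies the caller's location. */
#define FORMAT_LOG(out, size, ...) \
    format_log((out), (size), __FILE__, __LINE__, __VA_ARGS__)
```

A call such as FORMAT_LOG(buf, sizeof buf, "x %d", x) then records the location of the call, not of format_log itself.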

One of the cases where macros really shine is when doing code generation with them.

I used to work on an old C++ system that used a plugin system with its own way of passing parameters to the plugin (using a custom map-like structure). Some simple macros were used to deal with this quirk, and they allowed us to use real C++ classes and functions with normal parameters in the plugins without too many problems, with all the glue code being generated by macros.
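That plugin code isn't available here, but a common small-scale form of macro code generation in C is the so-called X-macro, sketched below with hypothetical names (COLOR_LIST, color_names, color_from_name). One list generates both an enum and a matching table of strings, so the two cannot drift apart:

```c
#include <string.h>

/* Single source of truth: add a color here and everything below follows. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

#define AS_ENUM(name)   COLOR_##name,  /* RED -> COLOR_RED, */
#define AS_STRING(name) #name,         /* RED -> "RED",     */

enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };

static const char *const color_names[COLOR_COUNT] = { COLOR_LIST(AS_STRING) };

/* Returns the enumerator whose name matches, or COLOR_COUNT if none does. */
static enum color color_from_name(const char *name)
{
    for (int i = 0; i < COLOR_COUNT; i++)
        if (strcmp(color_names[i], name) == 0)
            return (enum color)i;
    return COLOR_COUNT;
}
```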

Apart from inlining for efficiency and conditional compilation, macros can be used to raise the abstraction level of low-level C code. C doesn't really insulate you from the nitty-gritty details of memory and resource management and exact layout of data, and supports very limited forms of information hiding and other mechanisms for managing large systems. With macros, you are no longer limited to using only the base constructs in the C language: you can define your own data structures and coding constructs (including classes and templates!) while still nominally writing C!

Preprocessor macros actually offer a Turing-complete language executed at compile time. One of the impressive (and slightly scary) examples of this is over on the C++ side: the Boost Preprocessor library uses the C99/C++98 preprocessor to build (relatively) safe programming constructs which are then expanded to whatever underlying declarations and code you input, whether C or C++.

In practice, I'd recommend regarding preprocessor programming as a last resort, when you don't have the latitude to use high level constructs in safer languages. But sometimes it's good to know what you can do if your back is against the wall and the weasels are closing in...!

This excerpt pretty much sums up my view on the matter, by comparing several ways that C macros are used, and how to implement them in D.

copied from DigitalMars.com

Back when C was invented, compiler technology was primitive. Installing a text macro preprocessor onto the front end was a straightforward and easy way to add many powerful features. The increasing size & complexity of programs have illustrated that these features come with many inherent problems. D doesn't have a preprocessor; but D provides a more scalable means to solve the same problems.

Macros

Preprocessor macros add powerful features and flexibility to C. But they have a downside:

  • Macros have no concept of scope; they are valid from the point of definition to the end of the source. They cut a swath across .h files, nested code, etc. When #include'ing tens of thousands of lines of macro definitions, it becomes problematical to avoid inadvertent macro expansions.
  • Macros are unknown to the debugger. Trying to debug a program with symbolic data is undermined by the debugger only knowing about macro expansions, not the macros themselves.
  • Macros make it impossible to tokenize source code, as an earlier macro change can arbitrarily redo tokens.
  • The purely textual basis of macros leads to arbitrary and inconsistent usage, making code using macros error prone. (Some attempt to resolve this was introduced with templates in C++.)
  • Macros are still used to make up for deficits in the language's expressive capability, such as for "wrappers" around header files.

Here's an enumeration of the common uses for macros, and the corresponding feature in D:

  1. Defining literal constants:

    • The C Preprocessor Way

      #define VALUE 5
      
    • The D Way

      const int VALUE = 5;
      
  2. Creating a list of values or flags:

    • The C Preprocessor Way

      int flags;
      #define FLAG_X  0x1
      #define FLAG_Y  0x2
      #define FLAG_Z  0x4
      ...
      flags |= FLAG_X;
      
    • The D Way

      enum FLAGS { X = 0x1, Y = 0x2, Z = 0x4 };
      FLAGS flags;
      ...
      flags |= FLAGS.X;
      
  3. Setting function calling conventions:

    • The C Preprocessor Way

      #ifndef _CRTAPI1
      #define _CRTAPI1 __cdecl
      #endif
      #ifndef _CRTAPI2
      #define _CRTAPI2 __cdecl
      #endif
      
      
      int _CRTAPI2 func();
      
    • The D Way

      Calling conventions can be specified in blocks, so there's no need to change it for every function:

      extern (Windows)
      {
      int onefunc();
      int anotherfunc();
      }
      
  4. Simple generic programming:

    • The C Preprocessor Way

      Selecting which function to use based on text substitution:

      #ifdef UNICODE
      int getValueW(wchar_t *p);
      #define getValue getValueW
      #else
      int getValueA(char *p);
      #define getValue getValueA
      #endif
      
    • The D Way

      D enables declarations of symbols that are aliases of other symbols:

      version (UNICODE)
      {
      int getValueW(wchar[] p);
      alias getValueW getValue;
      }
      else
      {
      int getValueA(char[] p);
      alias getValueA getValue;
      }
      

There are more examples on the DigitalMars website.

Given the comments in your question, what you may not fully appreciate is that calling a function can entail a fair amount of overhead. The parameters and key registers may have to be copied to the stack on the way in, and the stack unwound on the way out. This was particularly true of the older Intel chips. Macros let the programmer keep the abstraction of a function (almost) while avoiding the costly overhead of a function call. The inline keyword is only advisory, and the compiler may not always get it right. The glory and peril of 'C' is that you can usually bend the compiler to your will.

In your bread and butter, day-to-day application programming this kind of micro-optimization (avoiding function calls) is generally worse than useless, but if you are writing a time-critical function called by the kernel of an operating system, then it can make a huge difference.

They are a programming language (a simpler one) on top of C, so they are useful for doing metaprogramming at compile time... in other words, you can write macro code that generates C code in fewer lines and less time than it would take to write it directly in C.

They are also very useful to write "function like" expressions that are "polymorphic" or "overloaded"; e.g. a max macro defined as:

#define max(a,b) ((a)>(b)?(a):(b))

is useful for any numeric type; and in C you could not write:

int max(int a, int b) {return a>b?a:b;}
float max(float a, float b) {return a>b?a:b;}
double max(double a, double b) {return a>b?a:b;}
...

even if you wanted, because you cannot overload functions.
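Since C11 there is a middle ground, and it still needs a macro: _Generic picks a typed function based on the argument types. A sketch, assuming a C11 compiler; max_int, max_float and max_double are hypothetical helper names:

```c
static int    max_int(int a, int b)          { return a > b ? a : b; }
static float  max_float(float a, float b)    { return a > b ? a : b; }
static double max_double(double a, double b) { return a > b ? a : b; }

/* (a) + (b) is never evaluated; only its type (after the usual
   arithmetic conversions) is used to select the function. */
#define max(a, b)                \
    _Generic((a) + (b),          \
             int:    max_int,    \
             float:  max_float,  \
             double: max_double)(a, b)
```

Unlike the plain macro, each argument is evaluated exactly once, but only the listed types are accepted; anything else is a compile error.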

And not to mention conditional compiling and file including (that are also part of the macro language)...

By leveraging the C preprocessor's text manipulation one can construct the C equivalent of a polymorphic data structure. Using this technique we can construct a reliable toolbox of primitive data structures that can be used in any C program, since they take advantage of C syntax and not the specifics of any particular implementation.

Detailed explanation on how to use macros for managing data structure is given here - http://multi-core-dump.blogspot.com/2010/11/interesting-use-of-c-macros-polymorphic.html
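This is not the code from the linked post, just a minimal sketch of the general idea: a macro that stamps out a fixed-capacity stack for any element type. DEFINE_STACK, STACK_CAPACITY and the generated names are assumptions made for this example.

```c
#include <stddef.h>

#define STACK_CAPACITY 16

/* Generates a struct NAME plus NAME_push / NAME_pop for element type T.
   Both functions return 1 on success and 0 on overflow / underflow. */
#define DEFINE_STACK(T, NAME)                           \
    typedef struct {                                    \
        T items[STACK_CAPACITY];                        \
        size_t count;                                   \
    } NAME;                                             \
    static int NAME##_push(NAME *s, T value)            \
    {                                                   \
        if (s->count == STACK_CAPACITY) return 0;       \
        s->items[s->count++] = value;                   \
        return 1;                                       \
    }                                                   \
    static int NAME##_pop(NAME *s, T *out)              \
    {                                                   \
        if (s->count == 0) return 0;                    \
        *out = s->items[--s->count];                    \
        return 1;                                       \
    }

DEFINE_STACK(int, int_stack)
DEFINE_STACK(double, double_stack)
```

Each DEFINE_STACK line produces an independent, fully typed set of functions, so pushing a double onto an int_stack is caught by the compiler's normal type checking.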

Macros allow someone to modify the program behavior during compilation time. Consider this:

  • C constants allow fixing program behavior at development time
  • C variables allow modifying program behavior at execution time
  • C macros allow modifying program behavior at compilation time

At compilation time means that unused code won't even go into the binary and that the build process can modify the values, as long as it's integrated with the macro preprocessor. Example: make ARCH=arm (assumes forwarding macro definition as cc -DARCH=arm)

Simple examples: (from glibc limits.h, define the largest value of long)

#if __WORDSIZE == 64
#define LONG_MAX 9223372036854775807L
#else
#define LONG_MAX 2147483647L
#endif

Verifies (using the #define __WORDSIZE) at compile time if we're compiling for 32 or 64 bits. With a multilib toolchain, using parameters -m32 and -m64 may automatically change bit size.

(POSIX version request)

#define _POSIX_C_SOURCE 200809L

Requests POSIX 2008 support at compile time. The standard library may support many (incompatible) standards but with this definition, it will provide the correct function prototypes (example: getline(), no gets(), etc.). If the library doesn't support the standard it may give an #error during compile time, instead of crashing during execution, for example.

(hardcoded path)

#ifndef LIBRARY_PATH
#define LIBRARY_PATH "/usr/lib"
#endif

Defines a hardcoded directory at compile time. It could be changed with -DLIBRARY_PATH='"/home/user/lib"', for example (the inner double quotes are needed so that the macro still expands to a string literal). If that were a const char *, how would you configure it during compilation?
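One thing a macro string buys you over a const char * is that it remains a literal, so it can be glued to other literals at compile time. A small sketch, where plugin_dir and the "/plugins" suffix are made up for illustration:

```c
#ifndef LIBRARY_PATH
#define LIBRARY_PATH "/usr/lib"  /* default when the build passes no -D */
#endif

/* Adjacent string literals are concatenated by the compiler, so this
   returns a single constant string with no runtime work. */
static const char *plugin_dir(void)
{
    return LIBRARY_PATH "/plugins";
}
```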

(pthread.h, complex definitions at compile time)

# define PTHREAD_MUTEX_INITIALIZER \
{ { 0, 0, 0, 0, 0, 0, { 0, 0 } } }

Large pieces of text that otherwise couldn't be simplified may be declared this way (always at compile time). It's not possible to do this with functions or constants (at compile time).

To avoid really complicating things and to avoid suggesting poor coding styles, I won't give an example of code that compiles on different, incompatible operating systems. Use your cross build system for that, but it should be clear that the preprocessor allows that without help from the build system, without breaking compilation because of absent interfaces.

Finally, think about the importance of conditional compilation on embedded systems, where processor speed and memory are limited and systems are very heterogeneous.

Now, if you ask whether it is possible to replace all macro constant definitions and function calls with proper definitions, the answer is yes, but that simply won't make the need for changing program behavior during compilation go away. The preprocessor would still be required.

From Computer Stupidities:

I've seen this code excerpt in a lot of freeware gaming programs for UNIX:

/*
* Bit values.
*/
#define BIT_0 1
#define BIT_1 2
#define BIT_2 4
#define BIT_3 8
#define BIT_4 16
#define BIT_5 32
#define BIT_6 64
#define BIT_7 128
#define BIT_8 256
#define BIT_9 512
#define BIT_10 1024
#define BIT_11 2048
#define BIT_12 4096
#define BIT_13 8192
#define BIT_14 16384
#define BIT_15 32768
#define BIT_16 65536
#define BIT_17 131072
#define BIT_18 262144
#define BIT_19 524288
#define BIT_20 1048576
#define BIT_21 2097152
#define BIT_22 4194304
#define BIT_23 8388608
#define BIT_24 16777216
#define BIT_25 33554432
#define BIT_26 67108864
#define BIT_27 134217728
#define BIT_28 268435456
#define BIT_29 536870912
#define BIT_30 1073741824
#define BIT_31 2147483648

A much easier way of achieving this is:

#define BIT_0 0x00000001
#define BIT_1 0x00000002
#define BIT_2 0x00000004
#define BIT_3 0x00000008
#define BIT_4 0x00000010
...
#define BIT_28 0x10000000
#define BIT_29 0x20000000
#define BIT_30 0x40000000
#define BIT_31 0x80000000

An easier way still is to let the compiler do the calculations:

#define BIT_0 (1)
#define BIT_1 (1 << 1)
#define BIT_2 (1 << 2)
#define BIT_3 (1 << 3)
#define BIT_4 (1 << 4)
...
#define BIT_28 (1 << 28)
#define BIT_29 (1 << 29)
#define BIT_30 (1 << 30)
#define BIT_31 (1 << 31)

But why go to all the trouble of defining 32 constants? The C language also has parameterized macros. All you really need is:

#define BIT(x) (1 << (x))

Anyway, I wonder if the guy who wrote the original code used a calculator or just computed it all out on paper.

That's just one possible use of Macros.
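One caveat about the quoted BIT(x): the literal 1 is a signed int, so where int is 32 bits wide, 1 << 31 overflows and is undefined behavior. A sketch of an unsigned variant (BIT_U32 and has_bit are hypothetical names, not from the excerpt):

```c
#include <stdint.h>

/* Shift an unsigned 32-bit one, so bit 31 is well defined. */
#define BIT_U32(x) ((uint32_t)1 << (x))

/* Returns nonzero if bit n (0..31) is set in flags. */
static int has_bit(uint32_t flags, unsigned n)
{
    return (flags & BIT_U32(n)) != 0;
}
```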

Unlike regular functions, macros can wrap control-flow constructs (if, while, for, ...) that then act on the surrounding code. Here's an example:

#include <stdio.h>


#define Loop(i,x) for((i)=0; (i)<(x); (i)++)


int main(int argc, char *argv[])
{
int i;
int x = 5;
Loop(i, x)
{
printf("%d", i); // Output: 01234
}
return 0;
}
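Going one step further, a macro can change the caller's control flow, which a function cannot do, because the macro body is pasted into the caller. A sketch with hypothetical names (RETURN_IF_NULL, length_or_error):

```c
#include <stddef.h>

/* Makes the *enclosing* function return code when ptr is NULL.
   do { ... } while (0) lets it be used like a single statement. */
#define RETURN_IF_NULL(ptr, code) \
    do { if ((ptr) == NULL) return (code); } while (0)

/* Returns the length of s, or -1 if s is NULL. */
static int length_or_error(const char *s)
{
    RETURN_IF_NULL(s, -1);  /* early exit happens here, in this function */

    int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Hidden returns like this are also a good example of why such macros can make code harder to follow, so they are best kept few and clearly named.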

I will add to what's already been said.

Because macros work on text substitutions they allow you to do very useful things which wouldn't be possible using functions.

Here are a few cases where macros can be really useful:

/* Get the number of elements in array 'A'. */
#define ARRAY_LENGTH(A) (sizeof(A) / sizeof((A)[0]))

This is a very popular and frequently used macro. This is very handy when you for example need to iterate through an array.

int main(void)
{
int a[] = {1, 2, 3, 4, 5};
int i;
for (i = 0; i < ARRAY_LENGTH(a); ++i) {
printf("a[%d] = %d\n", i, a[i]);
}
return 0;
}

Here it doesn't matter if another programmer adds five more elements to a in the declaration. The for-loop will always iterate through all elements.

The C library's functions to compare memory and strings are quite ugly to use.

You write:

char *str = "Hello, world!";


if (strcmp(str, "Hello, world!") == 0) {
/* ... */
}

or

char *str = "Hello, world!";


if (!strcmp(str, "Hello, world!")) {
/* ... */
}

to check whether str is equal to "Hello, world!". I personally think that both these solutions look quite ugly and confusing (especially !strcmp(...)).

Here are two neat macros some people (including me) use when they need to compare strings or memory using strcmp/memcmp:

/* Compare strings */
#define STRCMP(A, o, B) (strcmp((A), (B)) o 0)


/* Compare memory (N bytes; memcmp requires a length) */
#define MEMCMP(A, o, B, N) (memcmp((A), (B), (N)) o 0)

Now you can write the code like this:

char *str = "Hello, world!";


if (STRCMP(str, ==, "Hello, world!")) {
/* ... */
}

Here the intention is a lot clearer!

These are cases where macros are used for things functions cannot accomplish. Macros should not be used to replace functions but they have other good uses.

Macros let you get rid of copy-pasted fragments, which you can't eliminate in any other way.

For instance (the real code, syntax of VS 2010 compiler):

for each (auto entry in entries)
{
sciter::value item;
item.set_item("DisplayName",    entry.DisplayName);
item.set_item("IsFolder",       entry.IsFolder);
item.set_item("IconPath",       entry.IconPath);
item.set_item("FilePath",       entry.FilePath);
item.set_item("LocalName",      entry.LocalName);
items.append(item);
}

This is the place where you pass a field value under the same name into a script engine. Is this copy-pasted? Yes. DisplayName is used as a string for a script and as a field name for the compiler. Is that bad? Yes. If you refactor your code and rename LocalName to RelativeFolderName (as I did) and forget to do the same with the string (as I did), the script will work in a way you don't expect (in fact, in my example it depends on whether you forgot to rename the field in a separate script file, but if the script is used for serialization, it would be a 100% bug).

If you use a macro for this, there will be no room for the bug:

for each (auto entry in entries)
{
#define STR_VALUE(arg) #arg
#define SET_ITEM(field) item.set_item(STR_VALUE(field), entry.field)
sciter::value item;
SET_ITEM(DisplayName);
SET_ITEM(IsFolder);
SET_ITEM(IconPath);
SET_ITEM(FilePath);
SET_ITEM(LocalName);
#undef SET_ITEM
#undef STR_VALUE
items.append(item);
}

Unfortunately, this opens a door for other types of bugs. You can make a typo while writing the macro and never see the broken code, because the compiler doesn't show you what it looks like after preprocessing. Someone else could use the same name (that's why I "release" macros ASAP with #undef). So, use it wisely. If you see another way of getting rid of copy-pasted code (such as functions), use that way. If you see that getting rid of copy-pasted code with macros isn't worth the result, keep the copy-pasted code.
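The snippet above is C++ with a VS-specific loop, but the same # (stringify) trick works in plain C. A minimal sketch, where struct entry and FORMAT_INT_FIELD are hypothetical and only int fields are handled:

```c
#include <stdio.h>

struct entry {
    int is_folder;
    int size;
};

/* Writes "field=value" into buf. The field name appears once in the
   source and is used both as text (#field) and as a member (obj.field),
   so a rename cannot leave the two out of sync. */
#define FORMAT_INT_FIELD(buf, bufsize, obj, field) \
    snprintf((buf), (bufsize), "%s=%d", #field, (obj).field)
```

For example, with struct entry e = {1, 42}, the call FORMAT_INT_FIELD(buf, sizeof buf, e, size) fills buf with "size=42". Renaming the member without updating the call is a compile error rather than a silent mismatch.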

I didn't see anyone mentioning this, so, regarding function-like macros, e.g.:

#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))

Generally it's recommended to avoid using macros when not necessary, for many reasons, readability being the main concern. So:

When should you use these over a function?

Almost never, since there's a more readable alternative, which is inline; see https://www.greenend.org.uk/rjk/tech/inline.html or http://www.cplusplus.com/articles/2LywvCM9/ (the second link is a C++ page, but the point is applicable to C compilers as far as I know).

Now, the slight difference is that macros are handled by the pre-processor and inline is handled by the compiler, but there's no practical difference nowadays.

when is it appropriate to use these?

For small functions (two or three liners max). The goal is to gain some advantage during the run time of a program, as function-like macros (and inline functions) are code replacements done during preprocessing (or compilation in the case of inline) and are not real functions living in memory, so there's no function call overhead (more details in the linked pages).