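Simulating templates in C (for a queue data type)

I'm trying to implement a queue structure in C. My implementation is very simple: the queue can hold only ints and nothing else. I'd like to know whether I can simulate C++ templates in C (probably with the preprocessor's #define) so that my queue can hold any data type.

Note: I don't want to use void*. I think it's a bit risky and can easily lead to bizarre runtime errors.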

Implement a queue containing void* data, and interpret this void* as a pointer to any type, or even as a primitive type like int.

Using #define is possible, but think about debugging, if something is wrong...

If you really want to do this, it could be solved by a simple typedef:

typedef int data_t;


struct queue
{
data_t* data;
};

You can now use data_t in all places instead of plain ints. Note, however, that you will not be able to use multiple types at once (at least, I don't see a way how this particular behaviour of C++ templates can be simulated in plain C).

You can't really get a high-quality template work-alike in C with preprocessor macros, because those macros expand only once. At best you get a data structure that can be retyped, but once processed it has that one type for the whole program.

That means you need to consider void * type solutions, which weaken C's type checking. To attempt to fix the weakened type checking, consider embedding a "type" field in your struct which is an "assign once at construction" string that represents the non void* type. Then you can possibly improve on the lack of type checking within the functions related to maintaining the struct. That is, if such a thing is even important to you.

Well, the only possibility that comes to my mind is macros (#defines). Maybe something like:

queue.h:

#define TYPE int
#define TYPED_NAME(x) int_##x
#include "queue_impl.h"
#undef TYPE
#undef TYPED_NAME


#define TYPE float
#define TYPED_NAME(x) float_##x
#include "queue_impl.h"
#undef TYPE
#undef TYPED_NAME
...

queue_impl.h:

//no include guard, of course
typedef struct
{
TYPE *data;
...
} TYPED_NAME(queue);


void TYPED_NAME(queue_insert) (TYPED_NAME(queue) *queue, TYPE data)
{
...
}

If it works (which I'm not 100% sure of, being not such a preprocessor expert), it should give you the structs int_queue and float_queue, along with the functions

void int_queue_insert(int_queue *queue, int data);
void float_queue_insert(float_queue *queue, float data);

Of course you will have to do the instantiation of the "template" yourself for all the types you need, but this amounts to repeating the 5-line block in queue.h. The actual implementation has to be written only once. Of course you can refine this even more, but the basic idea should be clear.

This will at least give you perfectly type-safe queue templates, though lacking the convenience of completely matching interfaces (the functions have to carry the type name, since C doesn't support overloaded functions).

#define q(t) \
typedef struct q_##t { t v; struct q_##t *next; } q_##t


q(char);
q(int);


int main(void)
{
q_char qc;
q_int qi;


qc.v = 'c';
qc.next = (void *) 0;


qi.v = 42;
qi.next = (void *) 0;


return 0;
}

But I am not sure that's what you are looking for...

Here is a version that can let you instantiate (through preprocessor) and use multiple types in the same C file (Careful, it uses token concatenation):

#include <stdio.h>


#define DEFINE_LL_NODE(CONCRETE_TYPE) \
struct node_of_ ## CONCRETE_TYPE \
{ \
CONCRETE_TYPE data; \
struct node_of_ ## CONCRETE_TYPE *next; \
};


#define DECLARE_LL_NODE(CONCRETE_TYPE,VARIABLE_NAME) \
struct node_of_ ## CONCRETE_TYPE VARIABLE_NAME;


/* Declarations for each type.  */
DEFINE_LL_NODE(int)
DEFINE_LL_NODE(char)


int main (void)
{
/* Declaration of instances of each type.  */
DECLARE_LL_NODE (int, foo)
DECLARE_LL_NODE (char, bar)


/* And you can then use these instances.  */
foo.data = 1;
foo.next = NULL;


bar.data = 'c';
bar.next = NULL;
}

If I preprocess it with cpp, I get:

struct node_of_int { int data; struct node_of_int *next; };


struct node_of_char { char data; struct node_of_char *next; };


int main (void)
{
struct node_of_int foo;
struct node_of_char bar;


foo.data = 1;
foo.next = ((void *)0);


bar.data = 'c';
bar.next = ((void *)0);
}

You can use subtle and ugly tricks in order to create that kind of templates. Here's what I would do:

Creation of a templated list

Macro to define the list

I would first create a macro - let's call it say define_list(type) - that would create all the functions for a list of a given type. I would then create a global structure containing function pointers to all the list's functions and then have a pointer to that global structure in each instance of the list (note how similar it is to a virtual method table). This kind of thing:

#include <stdbool.h> /* bool */
#include <stdlib.h>  /* malloc, size_t, NULL */

#define define_list(type) \
\
struct _list_##type; \
\
typedef struct \
{ \
bool (*is_empty)(const struct _list_##type*); \
size_t (*size)(const struct _list_##type*); \
const type (*front)(const struct _list_##type*); \
void (*push_front)(struct _list_##type*, type); \
} _list_functions_##type; \
\
typedef struct _list_elem_##type \
{ \
type _data; \
struct _list_elem_##type* _next; \
} list_elem_##type; \
\
typedef struct _list_##type \
{ \
size_t _size; \
list_elem_##type* _first; \
list_elem_##type* _last; \
_list_functions_##type* _functions; \
} List_##type; \
\
List_##type* new_list_##type(); \
bool list_is_empty_##type(const List_##type* list); \
size_t list_size_##type(const List_##type* list); \
const type list_front_##type(const List_##type* list); \
void list_push_front_##type(List_##type* list, type elem); \
\
bool list_is_empty_##type(const List_##type* list) \
{ \
return list->_size == 0; \
} \
\
size_t list_size_##type(const List_##type* list) \
{ \
return list->_size; \
} \
\
const type list_front_##type(const List_##type* list) \
{ \
return list->_first->_data; \
} \
\
void list_push_front_##type(List_##type* list, type elem) \
{ \
... \
} \
\
_list_functions_##type _list_funcs_##type = { \
&list_is_empty_##type, \
&list_size_##type, \
&list_front_##type, \
&list_push_front_##type, \
}; \
\
List_##type* new_list_##type() \
{ \
List_##type* res = (List_##type*) malloc(sizeof(List_##type)); \
res->_size = 0; \
res->_first = NULL; \
res->_last = NULL; \
res->_functions = &_list_funcs_##type; \
return res; \
}


#define List(type) \
List_##type


#define new_list(type) \
new_list_##type()

Generic interface

Here are some macros that simply call the list's functions via the stored function pointers:

#define is_empty(collection) \
collection->_functions->is_empty(collection)


#define size(collection) \
collection->_functions->size(collection)


#define front(collection) \
collection->_functions->front(collection)


#define push_front(collection, elem) \
collection->_functions->push_front(collection, elem)

Note that if you design other collections around the same structure, these macros will work for any collection that stores the right function pointers.

Example of use

And to conclude, a small example of how to use our new list template:

/* Define the data structures you need */
define_list(int)
define_list(float)


int main()
{
List(int)* a = new_list(int);
List(float)* b = new_list(float);


push_front(a, 5);
push_front(b, 5.2);
}

You can use that amount of tricks if you really want to have some kind of templates in C, but that's rather ugly (just use C++, it'll be simpler). The only overhead will be one more pointer per instance of data structure, and thus one more indirection whenever you call a function (no cast is done, you don't have to store void* pointers, yeah \o/). Hope you won't ever use that :p

Limitations

There are of course some limitations since we are using mere text replacement macros, and not real templates.

Define once

You can only define each type once per compile unit, otherwise, your program will fail to compile. This can be a major drawback for example if you write a library and some of your headers contain some define_ instructions.

Multi-word types

If you want to create a List whose template type is made of several words (signed char, unsigned long, const bar, struct foo...) or whose template type is a pointer (char*, void*...), you will have to typedef that type first.

define_list(int) /* OK */
define_list(char*) /* Error: pointer */
define_list(unsigned long) /* Error: several words */


typedef char* char_ptr;
typedef unsigned long ulong;
define_list(char_ptr) /* OK */
define_list(ulong) /* OK */

You will have to resort to the same trick if you want to create nested lists.

Use one of the code-generation macros from another answer and then finish it off with C11 generic-selection (_Generic) macros, so you don't have to litter your call sites with too much type information.

http://en.cppreference.com/w/c/language/generic

I was wondering about this for a long time but I now have a definite answer that anyone can understand; so behold!

When I was taking Data Structures course, I had to read Standish's book on Data Structures, Algorithms in C; it was painful; it had no generics, it was full of poor notations and whole bunch of global state mutation where it had no warrant being there; I knew adopting his code style meant screwing over all my future projects, but I knew there was a better way, so behold, the better way:

This is what it looked like before I touched it (actually, I touched it anyway to format it in a way humans can read; you're welcome). It is really ugly and wrong on many levels, but I'll list it for reference:

#include <stdio.h>


#define MaxIndex 100


int Find(int A[])
{
int j;


for (j = 0; j < MaxIndex; ++j) {
if (A[j] < 0) {
return j;
}
}


return -1;
}


int main(void)
{
// reminder: MaxIndex is 100.
int A[MaxIndex];


/**
* anonymous scope #1
*     initialize our array to the squares of [0..99],
*     then negate the 18th element (index 17, 289 becomes -289)
*     to make the search more interesting.
*/
{
// loop index, nothing interesting here.
int i;


// fill our array with the squares of [0..99].
for (i = 0; i < MaxIndex; ++i) {
A[i] = i * i;
}


A[17]= -A[17];
}


/**
* anonymous scope #2
*     find the index of the first negative number and print it.
*/
{
int result = Find(A);


printf(
"First negative integer in A found at index = %d.\n",
result
);
}


// wait for user input before closing.
getchar();


return 0;
}

This program does multiple things in a horrifyingly bad style. In particular, it sets a global macro that is only used within a single scope but then persists, polluting all the code that follows; that is very bad, and at scale it causes Windows-API levels of global scope pollution.

Furthermore, this program passes the argument as array without a struct to contain it; in other words, the array is dead on arrival once it reaches the function Find; we no longer know the size of the array, so we now have main and Find depend on a global macro, very bad.

There are two brute-force ways to make this problem go away while keeping the code simple. The first is to create a global struct that defines the array as an array of 100 integers; passing the struct then preserves the length of the array. The second is to pass the length of the array as an argument to Find, #define the size only on the line before creating the array, and #undef it right afterwards; the scope still knows the size of the array via sizeof(A)/sizeof(A[0]), which has zero runtime overhead because the compiler deduces 100 and pastes it in.

To solve this problem in a third way, I made a header that plays nicely for creating generic arrays; it is an abstract data type, but I'd like to call it an automated data structure.

SimpleArray.h

/**
* Make sure that all the options needed are given in order to create our array.
*/
#ifdef OPTION_UNINSTALL
#undef OPTION_ARRAY_TYPE
#undef OPTION_ARRAY_LENGTH
#undef OPTION_ARRAY_NAME
#else
#if (!defined OPTION_ARRAY_TYPE) || !defined OPTION_ARRAY_LENGTH || (!defined OPTION_ARRAY_NAME)
#error "type, length, and name must be known to create an Array."
#endif


/**
* Use the options to create a struct that preserves our array's structure
*    (its length), in contrast to raw arrays, which decay to pointers.
*/
struct {
OPTION_ARRAY_TYPE data[OPTION_ARRAY_LENGTH];
} OPTION_ARRAY_NAME;


/**
* if we are asked to also zero out the memory, we do it.
* if we are not granted access to string.h, brute force it.
*/
#ifdef OPTION_ZERO_MEMORY
#ifdef OPTION_GRANT_STRING
memset(&OPTION_ARRAY_NAME, 0, OPTION_ARRAY_LENGTH * sizeof(OPTION_ARRAY_TYPE));
#else
/* anonymous scope */
{
int i;
for (i = 0; i < OPTION_ARRAY_LENGTH; ++i) {
OPTION_ARRAY_NAME.data[i] = 0;
}
}
#endif
#undef OPTION_ZERO_MEMORY
#endif
#endif

This header essentially is what every C data structure header should look like if you are forced to use the C preprocessor(in contrast to PHP/Templating toolkit/ASP/your own embeddable scripting language, be it lisp).

Let's take it for a spin:

#include <stdio.h>


int Find(int A[], int A_length)
{
int j;


for (j = 0; j < A_length; ++j) {
if (A[j] < 0) {
return j;
}
}


return -1;
}


int main(void)
{
// std::array<int, 100> A;
#define OPTION_ARRAY_TYPE int
#define OPTION_ARRAY_LENGTH 100
#define OPTION_ARRAY_NAME A
#include "SimpleArray.h"


/**
* anonymous scope #1
*     initialize our array to the squares of [0..99],
*     then negate the 18th element (index 17, 289 becomes -289)
*     to make the search more interesting.
*/
{
// loop index, nothing interesting here.
int i;


// fill our array with the squares of [0..99].
for (i = 0; i < (sizeof(A.data) / sizeof(A.data[0])); ++i) {
A.data[i] = i * i;
}


A.data[17]= -A.data[17];
}


/**
* anonymous scope #2
*     find the index of the first negative number and print it.
*/
{
int result = Find(A.data, (sizeof(A.data) / sizeof(A.data[0])));


printf(
"First negative integer in A found at index = %d.\n",
result
);
}


// wait for user input before closing.
getchar();


// making sure all macros of SimpleArray do not affect any code
// after this function; macros are file-wide, so we want to be
// respectful to our other functions.
#define OPTION_UNINSTALL
#include "SimpleArray.h"


return 0;
}

BEHOLD, we have invented a naive std::array in pure C and C preprocessor! We used macros, but we are not evil, because we clean up after ourselves! All our macros are undefd at the end of our scope.

There is a problem: we no longer know the size of the array unless we write (sizeof(A.data) / sizeof(A.data[0])). This costs nothing at run time, because the compiler evaluates it, but it's not child-friendly; neither are macros, but we are working within the box here, and we can later use a friendlier preprocessor like PHP to make it child-friendly.

To solve this, we can create a utility library which acts as methods on our "free" array data structure.

SimpleArrayUtils.h

/**
* this is a smart collection that is created using options and is
*      removed from scope when included with uninstall option.
*
* there are no guards because this header is meant to be strategically
*     installed and uninstalled, rather than kept at all times.
*/
#ifdef OPTION_UNINSTALL
/* clean up */
#undef ARRAY_FOREACH_BEGIN
#undef ARRAY_FOREACH_END
#undef ARRAY_LENGTH
#else
/**
* array elements vary in number of bytes, encapsulate common use case
*/
#define ARRAY_LENGTH(A) \
((sizeof A.data) / (sizeof A.data[0]))


/**
* first half of a foreach loop, create an anonymous scope,
* declare an iterator, and start accessing the items.
*/
#if defined OPTION_ARRAY_TYPE
#define ARRAY_FOREACH_BEGIN(name, iter, arr)\
{\
unsigned int iter;\
for (iter = 0; iter < ARRAY_LENGTH(arr); ++iter) {\
OPTION_ARRAY_TYPE name = arr.data[iter];
#endif


/**
* second half of a foreach loop, close the loop and the anonymous scope
*/
#define ARRAY_FOREACH_END \
}\
}
#endif

This is a fairly feature rich library, which basically exports

ARRAY_LENGTH :: Anything with data field -> size_t

and if we still have OPTION_ARRAY_TYPE defined, or have redefined it, the header also defines how to do a foreach loop, which is cute.

Now let's go crazy:

SimpleArray.h

/**
* Make sure that all the options needed are given in order to create our array.
*/
#ifdef OPTION_UNINSTALL
#ifdef OPTION_ARRAY_TYPE
#undef OPTION_ARRAY_TYPE
#endif


#ifdef OPTION_ARRAY_LENGTH
#undef OPTION_ARRAY_LENGTH
#endif


#ifdef OPTION_ARRAY_NAME
#undef OPTION_ARRAY_NAME
#endif


/* OPTION_UNINSTALL is known to be defined in this branch. */
#undef OPTION_UNINSTALL
#else
#if (!defined OPTION_ARRAY_TYPE) || !defined OPTION_ARRAY_LENGTH || (!defined OPTION_ARRAY_NAME)
#error "type, length, and name must be known to create an Array."
#endif


/**
* Use the options to create a struct that preserves our array's structure
*    (its length), in contrast to raw arrays, which decay to pointers.
*/
struct {
OPTION_ARRAY_TYPE data[OPTION_ARRAY_LENGTH];
} OPTION_ARRAY_NAME;


/**
* if we are asked to also zero out the memory, we do it.
* if we are not granted access to string.h, brute force it.
*/
#ifdef OPTION_ZERO_MEMORY
#ifdef OPTION_GRANT_STRING
memset(&OPTION_ARRAY_NAME, 0, OPTION_ARRAY_LENGTH * sizeof(OPTION_ARRAY_TYPE));
#else
/* anonymous scope */
{
int i;
for (i = 0; i < OPTION_ARRAY_LENGTH; ++i) {
OPTION_ARRAY_NAME.data[i] = 0;
}
}
#endif
#undef OPTION_ZERO_MEMORY
#endif
#endif

SimpleArrayUtils.h

/**
* this is a smart collection that is created using options and is
*      removed from scope when included with uninstall option.
*
* there are no guards because this header is meant to be strategically
*     installed and uninstalled, rather than kept at all times.
*/
#ifdef OPTION_UNINSTALL
/* clean up, be mindful of undef warnings if the macro is not defined. */
#ifdef ARRAY_FOREACH_BEGIN
#undef ARRAY_FOREACH_BEGIN
#endif


#ifdef ARRAY_FOREACH_END
#undef ARRAY_FOREACH_END
#endif


#ifdef ARRAY_LENGTH
#undef ARRAY_LENGTH
#endif
#else
/**
* array elements vary in number of bytes, encapsulate common use case
*/
#define ARRAY_LENGTH(A) \
((sizeof A.data) / (sizeof A.data[0]))


/**
* first half of a foreach loop, create an anonymous scope,
* declare an iterator, and start accessing the items.
*/
#if defined OPTION_ARRAY_TYPE
#define ARRAY_FOREACH_BEGIN(name, iter, arr)\
{\
unsigned int iter;\
for (iter = 0; iter < ARRAY_LENGTH(arr); ++iter) {\
OPTION_ARRAY_TYPE name = arr.data[iter];
#endif


/**
* second half of a foreach loop, close the loop and the anonymous scope
*/
#define ARRAY_FOREACH_END \
}\
}
#endif

main.c

#include <stdio.h>


// std::array<int, 100> A;
#define OPTION_ARRAY_TYPE int
#define OPTION_ARRAY_LENGTH 100
#define OPTION_ARRAY_NAME A
#include "SimpleArray.h"
#define OPTION_UNINSTALL
#include "SimpleArray.h"


int Find(int A[], int A_length)
{
int j;


for (j = 0; j < A_length; ++j) {
if (A[j] < 0) {
return j;
}
}


return -1;
}


int main(void)
{
#define OPTION_ARRAY_NAME A
#define OPTION_ARRAY_LENGTH (sizeof(A.data) / sizeof(A.data[0]))
#define OPTION_ARRAY_TYPE int


#include "SimpleArray.h"


/**
* anonymous scope #1
*     initialize our array to the squares of [0..99],
*     then negate the 18th element (index 17, 289 becomes -289)
*     to make the search more interesting.
*/
{
#include "SimpleArrayUtils.h"


printf("size: %d.\n", (int) ARRAY_LENGTH(A));


ARRAY_FOREACH_BEGIN(item, i, A)
A.data[i] = i * i;
ARRAY_FOREACH_END


A.data[17] = -A.data[17];




// uninstall all macros.
#define OPTION_UNINSTALL
#include "SimpleArrayUtils.h"
}


/**
* anonymous scope #2
*     find the index of the first negative number and print it.
*/
{
#include "SimpleArrayUtils.h"
int result = Find(A.data, (sizeof(A.data) / sizeof(A.data[0])));


printf(
"First negative integer in A found at index = %d.\n",
result
);


// uninstall all macros.
#define OPTION_UNINSTALL
#include "SimpleArrayUtils.h"
}


// wait for user input before closing.
getchar();


// making sure all macros of SimpleArray do not affect any code
// after this function; macros are file-wide, so we want to be
// respectful to our other functions.
#define OPTION_UNINSTALL
#include "SimpleArray.h"


return 0;
}

As you can see, we now have the power to express free abstractions (the compiler substitutes them in for us); we pay only for what we need (the structs), and the rest gets tossed out and does not pollute the global scope.

I emphasize the power of PHP here because few have seen it used outside HTML documents, but you can use it in C files or any other text files. You can use Templating Toolkit to have any scripting language you like generate the macros for you. These languages are much better than the C preprocessor because they have namespaces, variables, and actual functions; that makes them easier to debug, since you are debugging an actual script that generates the code rather than the C preprocessor, which is hell to debug, largely for lack of familiarity (who in their right mind spends hours playing with the C preprocessor to get familiar with it? Few do).

Here's an example of doing this with PHP:

SimpleArray.php

<?php
class SimpleArray {
public $length;
public $name;
public $type;


function __construct($options) {
$this->length = $options['length'];
$this->name = $options['name'];
$this->type = $options['type'];
}


function getArray() {
echo ($this->name . '.data');
}


function __toString() {
return sprintf (
"struct {\n" .
"    %s data[%d];\n" .
"} %s;\n"
,
$this->type,
$this->length,
$this->name
);
}
};
?>

main.php

#include <stdio.h>
<?php include('SimpleArray.php'); ?>


int Find(int *A, int A_length)
{
int i;


for (i = 0; i < A_length; ++i)
{
if (A[i] < 0) {
return i;
}
}


return -1;
}


int main(int argc, char **argv)
{
<?php
$arr = new SimpleArray(array(
'name' => 'A',
'length' => 100,
'type' => 'int'
));
echo $arr;
?>


printf("size of A: %d.\n", <?php echo($arr->length); ?>);


/* anonymous scope */
{
int i;


for (i = 0; i < <?php echo($arr->length)?>; ++i) {
<?php $arr->getArray(); ?>[i] = i * i;
}
<?php $arr->getArray(); ?>[17] = -<?php $arr->getArray()?>[17];
}


int result = Find(<?php $arr->getArray();?>, <?php echo $arr->length; ?>);
printf(
"First negative integer in A found at index = %d.\n",
result
);


getchar();


return 0;
}

run php main.php > main.c

then

gcc main.c -o main
./main

This looks a lot like Objective C, because this is essentially what Objective C does, except that it tends to link the compile-time "macros" to an actual runtime (as if PHP were available at run time while the C program was running, so that your C could talk to PHP and PHP could talk to C, except that the PHP is the Smalltalk-ish language with many square brackets). The main difference is that Objective C does not, to my knowledge, have a way to make "static" constructs as we did here; its objects are actual runtime objects, and as such are much more expensive to access, but they are much more flexible and preserve structure, whereas C structs collapse to bytes as soon as the header leaves the scope (whereas objects can be reflected back to their original state using internal tagged unions)...

Couple of comments based on what I've seen in responses.

  1. Indeed, the way to tackle this in C is to play around with #define macros and function pointers. The thing is, C++ templates, at least in their first version, were pretty much that: a mere way to copy some code and generate some symbols, to textually "parametrize" the original. The sky is the limit in how you use your imagination here; just be careful, because in many respects you're on your own when it comes to type checking etc., so don't expect much help from the compiler.

  2. Saying now and then "why not just use C++" is counter-productive here. The question was specifically about how to simulate the feature when you don't have it. I have to say, from my experience, I once simulated templates even in C++. Dare to guess why? Because it was the beginning of 1990: there was C++, and an idea of templates in C++, but hardly any implementations of them. That's why. Well, I did it in C before that as well. C++ made it somewhat easier just because at least you didn't have to simulate class methods with function pointers anymore, because, well, you got native language support for that. Otherwise, back then, just like in C, #define was your only friend for a-la parametric programming.