Nargery (Reply) [Matthew's Journal]

I've been thinking for a while now that I should maybe re-organise some of my work code. The question turns out to be slightly less trivial than I thought...

Essentially, I have a library of useful graph/network analysis and simulation tools, written in C. There are currently three ways you can represent a graph in this library:

an n-by-n array of integers

an n-by-n array of bytes

an n-array of adjacency lists

...though I might want to add ones later. Currently you chose between these at compile time (it's a switch in the Makefile), and then matrix.h does things like:

#ifdef S_MATRIX
#include "smatrix.h"
#elif defined I_MATRIX
#include "imatrix.h"
...

The various header files then #define various useful functions, eg:

/*is bit x of a set?*/
#define MAT_TEST(m,a,x) (m[(a)][SM_BYTE(x)] & SM_BIT(x))
/*set bit x of a*/
#define MAT_SET(m,a,x) (m[(a)][SM_BYTE(x)] |= SM_BIT(x))

The various functions I've written call MAT_SET and so on without having to care what type of graph type you're using (since generally it's only an efficiency issue); in the odd case that a function needs a particuar representation, it can check what #define you used (again, this is currently a compile-time thing).

There are a few obvious problems. Firstly, for this library to be more generally useful than for just me (which I think would be a good aim), people shouldn't have to re-build the entire library frequently (consider multi-user systems). Secondly, it is occasionally useful to be able to transform between the different structures (for example, reading in some data formats is best done using an adjacency-list representation), and only with a bit of hackery can you do this currently.

I can think of a few possibilities for going ahead, and I was also wondering what my readers (many of whom are far better at this coding malarky than I am) thought:

Firstly, I could carry on as I am doing, which is the past of least effort currently. I'm already finding myself having to do unnecessary work as a result, though. I'd like to find something better

Secondly, I could have different types for each data type, so you could say SMAT_TEST or IMAT_TEST, and so on. You'd still need some mechanism for chosing the default, though. Also, you'd need some mechanism for a library function to know which sort of structure it had been passed. Maybe the matrix_t structure could be a union of the other types, with a flag to say which it was? That has the problem that the TEST and SET functions are often macros. I guess you could compile your code three times into one vast library, and then have a piece of code to work out what type of graph type you were using, and call the relevant version. This all seems terribly complicated, though...

If I seem a little obsessed with all this, it makes a noticable difference to run-time which data structure type I use for different algorithms - some graph algorithms are very expensive, and I'm working with large graphs, hence the desire to get this sort of thing right.

[if anyone wants to see the complete header files, then do ask, they just seem a bit large to dump into LJ]

Sun	Mon	Tue	Wed	Thu	Fri	Sat
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25 1
26	27	28	29	30	31

Matthew's Journal

Nargery

Reply

October