|
|
 |
 |
 |
 |
Is this legal? Aesthetically acceptable?
I'm writing an abstraction layer over an OS's ability to do non-blocking reads and writes on various things. The basic idea is that the caller will be able to let the I/O happen in the background, and will poll for completion using something along the lines of *nix's select() or Win32's WaitForMultipleObjects, and when the OS signals that a read has completed or a write will be accepted the abstraction layer will be called to actually do the I/O call and call any code that's interested in it. The read and write handling have some basic asymmetry, so the abstraction layer's data and the interface to the code that requests reads and writes will be different depending on which direction the data is going, but the basic poll-and-call-handler pattern in the main loop will be the same either way. What I'd like to do is something like this: -------- /*in overlapped.h*/ struct overlapped_cookie { sync_type need_attention; struct overlapped_data *internal; };
-------- /*in overlapped-read.c*/ struct overlapped_data { /*Internal data for read bookkeeping*/ };
struct overlapped_cookie *setup_read(/*args*/) { struct overlapped_cookie *ret=malloc(sizeof *ret); if(!ret) return NULL; ret->internal=malloc(sizeof ret->internal); if(!ret->internal) { free(ret); return NULL; } /*Continue with initialization*/ }
-------- /*in overlapped-write.c*/ struct overlapped_data { /*Internal data for write bookkeeping*/ };
struct overlapped_cookie *setup_write(/*args*/) { struct overlapped_cookie *ret=malloc(sizeof *ret); if(!ret) return NULL; ret->internal=malloc(sizeof ret->internal); if(!ret->internal) { free(ret); return NULL; } /*Continue with initialization*/ }
-------- The idea is that the main loop can do something like: -------- /*These arrays are populated by other code to keep an up-to-date list of reads and writes (possibly to be extended to other events) that are being waited for */ sync_type sync[N]; int (*handler[N])(struct overlapped_cookie *); struct overlapped_cookie *all_overlapped[N]; /*main loop, minus error handling:*/ while(!are_we_done_yet()) { int which_one=do_wait(sync,num_filled); handler[which_one](all_overlapped[which_one]); }
-------- I have two questions about this: (1) Is it legal and well-defined for external code to use a pointer to the same incomplete struct type for what is actually different types in different translation units, as long as no read struct ever gets passed to the write functions and no write struct ever gets passed to the read functions? (I'm fairly sure, but not certain, that the struct pointer representation requirements make this valid.) (2) Is this considered acceptable, maintainable, and not aesthetically unpleasant by other C programmers? Or, at least, no worse than the alternative of forcing the main poll loop to handle reads and writes separately? (This one I'm not so sure about.) (2b) Is there a better way to do this? dave -- Dave Vandervies dj3va@csclub.uwaterloo.ca I _am_ consistent - if one of those other pointer guide writers came here and asked for comments, they'd get chewed out just as badly. --Richard Bos in comp.lang.c
Dave Vandervies wrote On 05/29/07 13:33,: > [code and exposition snipped for brevity; see up-thread] > I have two questions about this: > (1) Is it legal and well-defined for external code to use a pointer to > the same incomplete struct type for what is actually different types > in different translation units, as long as no read struct ever gets > passed to the write functions and no write struct ever gets passed to > the read functions? > (I'm fairly sure, but not certain, that the struct pointer representation > requirements make this valid.)
As far as I can see, it's legal. The incomplete type is completed differently in the read and write implementations, but as long as the barrier between them remains unbroken I don't see how any trouble can arise. > (2) Is this considered acceptable, maintainable, and not aesthetically > unpleasant by other C programmers? Or, at least, no worse than the > alternative of forcing the main poll loop to handle reads and writes > separately? > (This one I'm not so sure about.)
It's perhaps a matter of taste, but I'd consider the "identifier overloading" to be Asking For Trouble, hence UNmaintainable and UNacceptable. A programmer encounters a `struct overlapped_data', educates himself about what it is and how it's used, and then runs into another `struct overlapped_data' that looks and behaves differently. That would be unremarkable if the two divergent struct types came from independent libraries being integrated into the same program, but to find the conflict within one single package of related functions ... Ugh. Double ugh. > (2b) Is there a better way to do this?
One possibility would be to use a `void*' as the "externally visible" pointer to the internal struct, and to convert it to a `struct overlapped_read_data*' or to a `struct overlapped_write_data*' internally. Completely gives up on type safety, but does little or no violence to your framework as it stands. Another approach might be to rearrange things a bit. Instead of having the overlapped_cookie struct point at different varieties of overlapped_data structures, make the overlapped_cookie be the first element of both flavors of overlapped_data: struct overlapped_cookie { sync_type need_attention; /* any other externally-visible stuff * that you always want */ }; In the internals of the read and write implementations: struct overlapped_read_data { struct overlapped_cookie cookie; /* internal stuff specific to read */ }; and struct overlapped_write_data { struct overlapped_cookie cookie; /* internal stuff specific to write */ }; Your external interface would traffic in pointers to struct overlapped_cookie objects, which are elements of the struct overlapped_*_data objects allocated internally. Thanks to 6.7.2.1p13 it is permissible to convert a struct pointer to a pointer to its first element and vice versa, so an internal function can convert a cookie pointer to a pointer to the struct that contains it. It's still a matter of taste, but I find the second approach far more palatable than the first. -- Eric.Sos@sun.com
Eric Sosman writes: >Dave Vandervies wrote On 05/29/07 13:33,: >> (1) Is it legal and well-defined for external code to use a pointer to >> the same incomplete struct type for what is actually different types >> in different translation units, as long as no read struct ever gets >> passed to the write functions and no write struct ever gets passed to >> the read functions? >> (I'm fairly sure, but not certain, that the struct pointer representation >> requirements make this valid.) > As far as I can see, it's legal. The incomplete type is > completed differently in the read and write implementations, > but as long as the barrier between them remains unbroken I > don't see how any trouble can arise.
What barrier? Google "link time optimization". I don't really expect trouble in this case either, but... > (...) >> (2b) Is there a better way to do this? > One possibility would be to use a `void*' as the > "externally visible" pointer to the internal struct, and > to convert it to a `struct overlapped_read_data*' or to a > `struct overlapped_write_data*' internally. Completely > gives up on type safety, but does little or no violence to > your framework as it stands.
Or point to a union which contains both structs. Wastes memory for the smallest struct though. > Another approach might be to rearrange things a bit. > Instead of having the overlapped_cookie struct point at > different varieties of overlapped_data structures, make > the overlapped_cookie be the first element of both flavors > of overlapped_data: > (...) > Thanks to 6.7.2.1p13 it is permissible to convert a struct > pointer to a pointer to its first element and vice versa, > so an internal function can convert a cookie pointer to > a pointer to the struct that contains it.
Alternatively, declare both structs in a header file and make them members of a union in that header. Then thanks to C99 6.5.2.3p5 it's OK to use the common initial members of either of the struct types - as long as the union is in scope (even if you do not use it). -- Regards, Hallvard
Hallvard B Furuseth wrote On 05/30/07 14:55,: > Eric Sosman writes: >>Dave Vandervies wrote On 05/29/07 13:33,: >>>(2b) Is there a better way to do this? >> One possibility would be to use a `void*' as the >>"externally visible" pointer to the internal struct, and >>to convert it to a `struct overlapped_read_data*' or to a >>`struct overlapped_write_data*' internally. Completely >>gives up on type safety, but does little or no violence to >>your framework as it stands. > Or point to a union which contains both structs. Wastes > memory for the smallest struct though.
It also exposes the details of the no-longer-opaque structs, making it difficult to change those details at a later date. (The O.P. did not say in so many words that opacity was a goal, so I may be reading more than is there.)
>> Another approach might be to rearrange things a bit. >>Instead of having the overlapped_cookie struct point at >>different varieties of overlapped_data structures, make >>the overlapped_cookie be the first element of both flavors >>of overlapped_data: >>(...) >>Thanks to 6.7.2.1p13 it is permissible to convert a struct >>pointer to a pointer to its first element and vice versa, >>so an internal function can convert a cookie pointer to >>a pointer to the struct that contains it. > Alternatively, declare both structs in a header file and > make them members of a union in that header. Then thanks > to C99 6.5.2.3p5 it's OK to use the common initial members > of either of the struct types - as long as the union is > in scope (even if you do not use it).
Again, opacity is lost. Also, I don't think this quite obeys the letter of the law: 6.5.2.3p5 specifically talks about accessing struct elements inside union objects, not about free-standing structs of the same types. I once fixed a bug that had to do with exactly this distinction. As in 6.5.2.3, we had a bunch of structs with a common initial subsequence, and a union containing all of them. Vastly simplified and violently paraphrased: struct One { int type; /* always equal to 1 */ int payload; }; struct Two { int type; /* always equal to 2 */ double data; }; union All { struct One eins; struct Two zwei; }; The bug arose when a pointer to a free-standing struct One instance was passed to a function expecting a union pointer: void printValue(const union All *ptr) { switch (ptr->eins.type) { case 1: printf ("one: %d\n", ptr->eins.payload); break; case 2: printf ("two: %g\n", ptr->zwei.data); break; ... } } ... struct One instance = { 1, 42 }; printValue ((union All*) &instance); At first glance this might look all right, but what it actually produced was the platform's equivalent of a bus error. The problem was that the alignment requirement for a struct Two was stricter than that for a struct One, so the alignment for a union All was also stricter than for a struct One. The free-standing struct One instance was properly aligned for a struct One (of course), but happened not satisfy the stricter union All requirement. Unfortunately, the compiler "knew" that its argument would point to strictly-aligned memory, and generated instructions that faulted when the actual argument turned out to be too leniently aligned. To put it another way: every valid union All pointer was a valid struct One pointer, but the converse didn't hold. Now, this might not be an issue in the O.P.'s case, because he's using malloc() to obtain memory for instances of his "variant" structs and malloc()'s alignment suffices for anything at all. But that doesn't strike me as a safe practice for the long haul: Someday, somebody may embed a struct One in a struct BigBox and not put it at the very start, or somebody may mallocate an array of struct One instances -- the [0] element will be splendidly aligned, but what about the [1] element? From 6.5.2.3p5 and from other parts of the Standard, we can deduce that the common initial subsequences of the various structs that appear together in unions must be arranged identically: You could walk through them with offsetof() and get the same answers from all variants. But identical layout isn't in itself enough to avoid undefined behavior in type punning. And when there's a perfectly clean (and opacity-preserving) method available, there seems little incentive to juggle hand grenades. -- Eric.Sos@sun.com
Eric Sosman writes: > Hallvard B Furuseth wrote On 05/30/07 14:55,: >> Alternatively, declare both structs in a header file and >> make them members of a union in that header. Then thanks >> to C99 6.5.2.3p5 it's OK to use the common initial members >> of either of the struct types - as long as the union is >> in scope (even if you do not use it). > Again, opacity is lost. Also, I don't think this quite > obeys the letter of the law: 6.5.2.3p5 specifically talks > about accessing struct elements inside union objects, not > about free-standing structs of the same types.
Oops, you are right. -- Regards, Hallvard
In article <1180559022.950171@news1nwk>, Eric Sosman <Eric.Sos@Sun.COM> wrote: >Hallvard B Furuseth wrote On 05/30/07 14:55,: >> Or point to a union which contains both structs. Wastes >> memory for the smallest struct though. > It also exposes the details of the no-longer-opaque >structs, making it difficult to change those details at >a later date. (The O.P. did not say in so many words that >opacity was a goal, so I may be reading more than is there.)
It was, but more as a matter of good habits than for any specific reason. (I've seen too much code that was made into a crawling horror by intricate cross-linking of dependencies between different parts, so I like to make it as difficult as possible to introduce dependencies on anything other than a published interface.) After some further thought, hiding the opaque structs behind a void pointer seems to have been what I was after. (I was clearly too close to the initial code that I was trying to abstract to actually see that; it's probably what I would have suggested had somebody else asked the question.) I ended up deciding to make my abstraction layer even more abstract as well, and having done that it makes even more sense to hide as many of the details as possible behind a void *. dave -- Dave Vandervies dj3va@csclub.uwaterloo.ca > I sense some Perl coming on. --Mike Andrews and Steve VanDevender
Be sure to [...] have someone nearby if a particularly in the scary severe attack requires restraint or sedation. devil monastery
Dave Vandervies wrote: > In article <1180559022.950171@news1nwk>, > Eric Sosman <Eric.Sos @Sun.COM> wrote: >> Hallvard B Furuseth wrote On 05/30/07 14:55,: >>> Or point to a union which contains both structs. Wastes >>> memory for the smallest struct though. >> It also exposes the details of the no-longer-opaque >> structs, making it difficult to change those details at >> a later date. (The O.P. did not say in so many words that >> opacity was a goal, so I may be reading more than is there.) > It was, but more as a matter of good habits than for any specific reason. > (I've seen too much code that was made into a crawling horror by intricate > cross-linking of dependencies between different parts, so I like to make > it as difficult as possible to introduce dependencies on anything other > than a published interface.) > After some further thought, hiding the opaque structs behind a void > pointer seems to have been what I was after. (I was clearly too close > to the initial code that I was trying to abstract to actually see that; > it's probably what I would have suggested had somebody else asked the > question.) > I ended up deciding to make my abstraction layer even more abstract as > well, and having done that it makes even more sense to hide as many of > the details as possible behind a void *.
An advantage of using a pointer to an incomplete struct type instead of a pointer to void is that you retain a little bit of type safety. Consider the scenarios: /* in the header: */ void doSomething(void *); /* pass a Gizmo pointer */ /* stupid caller: */ doSomething ("Gizmo"); /* no murmur from compiler */ vs. /* in the header: */ void doSomething(struct gizmo*); /* stupid caller: */ doSomething ("Gizmo"); /* compile-time diagnostic */ Now, it's clear from the outset that users of your software are of above average intelligence, nowhere near stupid enough to pass a string when something else is required. But if there are a lot of different but related types floating about -- struct gizmo, struct pseudoGizmo, struct gizmoExtended -- even an elite audience may suffer occasional confusion. If that should happen, it's usually better to rescue them with the compiler than with the debugger ... The cost of providing such extra safety is usually small: One new identifier pollutes the user's name space for each type you need to invent. That cost is usually tolerable, considering that doSomething() and its brethren are even more polluting (they probably have external linkage). A handful of no-linkage type names will not usually inconvenience the user who's already made up his mind to accept your package's more numerous (usually) external identifiers. Treasure my free advice: It's worth every cent you pay ... -- Eric Sosman esos@acm-dot-org.invalid
|
 |
 |
 |
 |
|