ptr1 == ptr2 but (int)ptr1 != (int)ptr2
Does anyone know of an architecture where this can break?

    T *ptr1, *ptr2;
    ...
    if (ptr1 == ptr2)
        if (CHAR_BIT*sizeof(T*) <= (width of int)) /*otherwise undefined*/
            assert((int)ptr1 == (int)ptr2);

(Feel free to replace int with another integer type if that helps to break something.)

I know an address can have several representations, at least on some DOS memory models, but I don't know whether the implementation normalizes pointers before converting them to integers.

--
Regards,
Hallvard
Hallvard B Furuseth wrote:
> Does anyone know of an architecture where this can break?
>     T *ptr1, *ptr2;
>     ...
>     if (ptr1 == ptr2)
>         if (CHAR_BIT*sizeof(T*) <= (width of int)) /*otherwise undefined*/
>             assert((int)ptr1 == (int)ptr2);
> (Feel free to replace int with another integer type if that helps to
> break something.)
> I know an address can have several representations, at least on some DOS
> memory models, but I don't know if it normalizes pointers before
> converting to integer.
I would suspect that, on most platforms where a pointer fits into an int, pointers that compare equal will also compare equal after conversion to int. However, I expect the standard says the result is implementation-defined at best.

Consider a segmented memory architecture, such as "real mode" on the x86 line of processors. On such platforms "far" pointers are 32 bits (16-bit segment plus 16-bit offset), and I suspect that two pointers may compare equal even if their bit patterns are not identical. (For example, FFFF:0000 and F000:FFF0 may compare "equal", but 0xFFFF0000 and 0xF000FFF0 as ints [or perhaps longs] will not.)

-- +-------------------------+--------------------+-----------------------+ | Kenneth J. Brody | www.hvcomputer.com | #include | | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> | +-------------------------+--------------------+-----------------------+ Don't e-mail me at: <mailto:ThisIsASpamT@gmail.com>
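The arithmetic behind the FFFF:0000 and F000:FFF0 example can be sketched in portable C. This is only a model: the struct and function names are made up for illustration, and it says nothing about what any particular DOS compiler did when converting a far pointer to an integer. It assumes the usual real-mode rule that the linear address is segment * 16 + offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a real-mode far pointer:
   16-bit segment plus 16-bit offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Linear address formed the real-mode way: segment * 16 + offset. */
uint32_t linear_address(struct far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* The raw 32 bits, as a conversion that merely reinterprets the
   pointer's bits as a 32-bit integer would see them. */
uint32_t raw_bits(struct far_ptr p)
{
    return ((uint32_t)p.segment << 16) | p.offset;
}
```

With FFFF:0000 and F000:FFF0, linear_address gives 0xFFFF0 for both, while raw_bits gives 0xFFFF0000 and 0xF000FFF0. So a comparison that resolves the address and an integer conversion that copies the bits can disagree.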
"Hallvard B Furuseth" <h.b.furus@usit.uio.no> wrote in message news:hbf.20070525jpa2@bombur.uio.no... > Do anyone know of an architecture where this can break? > T *ptr1, *ptr2; > ... > if (ptr1 == ptr2) > if (CHAR_BIT*sizeof(T*) <= (width of int)) /*otherwise undefined*/ > assert((int)ptr1 == (int)ptr2); > (Feel free to replace int with another integer type if that helps to > break something.) > I know an addess can have several representations at least on some DOS > memory models, but I don't know if it normalizes pointers before > converting to integer.
I'd be surprised if it does. If someone wants to convert a pointer to an integer, normally they want to get at the bits, so reinterpretation seems likely. That would mean that if the same address has two interpretations, it would break.

However, in reality ptr1 == ptr2 would almost certainly break as well. That is why it is illegal to create a pointer that roams over more than one object. If the architecture is segmented, probably the only C way of generating two different pointers to the same physical address is to move one beyond its range. So at runtime the program can cheaply compare pointers for equality by comparing bits, and doesn't need a normalisation step.

-- Free games and programming goodies. http://www.personal.leeds.ac.uk/~bgy1mm
Malcolm McLean wrote: > "Hallvard B Furuseth" <h.b.furus@usit.uio.no> wrote in message > news:hbf.20070525jpa2@bombur.uio.no... >> Do anyone know of an architecture where this can break? >> T *ptr1, *ptr2; >> ... >> if (ptr1 == ptr2) >> if (CHAR_BIT*sizeof(T*) <= (width of int)) /*otherwise undefined*/ >> assert((int)ptr1 == (int)ptr2); >> (Feel free to replace int with another integer type if that helps to >> break something.) >> I know an addess can have several representations at least on some DOS >> memory models, but I don't know if it normalizes pointers before >> converting to integer. > I'd be surprised if it does. If someone wants to convert a pointer to an > integer normally they want to get at the bits, so reinterpretation seems > likely. That would mean that if the same address has two > interpretations, it would break. > However in reality ptr1 == ptr2 would almost certainly break as well.
No, ptr1==ptr2 is guaranteed by the standard to always work[1] whether or not the pointers have any relationship to each other and whether or not they use different representations for the same value. > That is why it is illegal to create a pointer that roams over more than > one object.
That is a separate matter from whether different representations can be used for the same pointer value. It allows for range checking implementations, something which does exist and does not require there to be different representations of the same physical address. > If the architecture is segmented, probably the only C way of > generating two different pointers to the same physical address is to > move one beyond its range.
Wrong. The architecture could have overlapping segments such that:

    {
        int arr[4096];          /* arr starts in segment 1 */
        int *ptr1 = arr + 1024; /* ptr1 starts in segment 2 */
        int *ptr2 = arr + 4096; /* ptr2 starts in segment 3 */
        while (ptr1 != ptr2) ptr2--;
    }

Then they meet when ptr1 is using segment 2 but ptr2 is using segment 3. The compiler has to make it work. The good old x86 range of processors uses overlapping segments, although I don't know if any compilers allowed objects (or malloced regions) larger than a segment.

> So at runtime the program can cheaply compare
> pointers for equality by comparing bits, and doesn't need a normalisation
> step.
Wrong. The standard guarantees that comparing for equality always works. It is only relational operators excluding equality and inequality that do not have to work. [1] If one of the pointers is neither null, nor a pointer to an object, nor a pointer to 1 past an object, then undefined behaviour occurs on evaluating the pointer before you get as far as the comparison. -- Flash Gordon
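The loop from the overlapping-segments example can be wrapped in a small function to make the point concrete. This is a sketch only, and the function name is invented here. It relies on nothing beyond what the post describes: both pointers stay inside one array (or one past its end), so every `!=` test and every decrement is well defined, however the implementation represents the pointers internally.

```c
#include <assert.h>
#include <stddef.h>

/* Walk hi down towards lo and count the steps.
   Precondition (assumed, not checked): lo and hi point into the same
   array, or one past its end, with lo <= hi. Under that precondition
   the loop must terminate on any conforming implementation, even one
   where the two pointers start out in different segments. */
size_t steps_until_equal(const int *lo, const int *hi)
{
    size_t steps = 0;
    while (lo != hi) {
        hi--;
        steps++;
    }
    return steps;
}
```

Called with arr + 1024 and arr + 4096 for an int arr[4096], it returns 3072.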
"Flash Gordon" <s @flash-gordon.me.uk> wrote in message news:sbsli4x8pt.ln2@news.flash-gordon.me.uk... > Malcolm McLean wrote: > > So at runtime the program can cheaply compare >> pointer for equality by comaring bits, and doesn't need a normalisation >> step. > Wrong. The standard guarantees that comparing for equality always works. > It is only relational operators excluding equality and inequality that do > not have to work.
Once your pointer holds an illegal address, any other operations on it become undefined, including tests for equality. So if we move a pointer outside its object, the test for equality with another pointer can either fail or pass; it is undefined. Writing to that pointer may write to the same memory location, or it might trigger a segfault; again, the behaviour is UB.

So by holding objects to the size of a segment, we can implement equality tests by a simple comparison of bits, and still conform. You have, however, homed in on a problem with the standard, which is that the "1 past is valid" rule can make correct implementation of segmented objects difficult.

> [1] If one of the pointers is neither null, nor a pointer to an object,
> nor a pointer to 1 past an object, then undefined behaviour occurs on
> evaluating the pointer before you get as far as the comparison.
You've put the right answer here. Once you execute UB, all subsequent operations also become undefined. -- Free games and programming goodies. http://www.personal.leeds.ac.uk/~bgy1mm
Malcolm McLean writes: >Hallvard B Furuseth wrote in message > news:hbf.20070525jpa2@bombur.uio.no... >> Do anyone know of an architecture where this can break? >> T *ptr1, *ptr2; >> ... >> if (ptr1 == ptr2) >> if (CHAR_BIT*sizeof(T*) <= (width of int)) /*otherwise undefined*/ >> assert((int)ptr1 == (int)ptr2); >> (Feel free to replace int with another integer type if that helps to >> break something.) >> I know an addess can have several representations at least on some DOS >> memory models, but I don't know if it normalizes pointers before >> converting to integer. > I'd be surprised if it does. If someone wants to convert a pointer to an > integer normally they want to get at the bits, so reinterpretation seems > likely. That would mean that if the same address has two > interpretations, it would break.
So, two opposite guesses about what would happen so far... that's why I wondered if anyone knew of real-world examples. > However in reality ptr1 == ptr2 would almost certainly break as > well.
No, that can only break if ptr1 or ptr2 does not contain a valid pointer representation. I.e. a trap representation, as C99 calls it. That's not a different representation of another pointer value, it's more like invalid values which accidentally could compare equal to a valid value. > That is why it is illegal to create a pointer that roams over more > than one object. If the architecture is segmented, probably the only C > way of generating two different pointers to the same physical address is > to move one beyond its range.
I'm pretty sure I've seen counterexamples of this, and that some DOS memory model was one of them. -- Hallvard
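For anyone who wants to try the check from the original question on their own implementation, here is a minimal sketch. It assumes C99 and that uintptr_t exists (the type is optional in the standard); the function names are made up for illustration. Only the round trip in the second function is guaranteed by the standard. Whether equal pointers convert to equal integers is exactly the open question, so the first function reports what the implementation does rather than asserting it.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the property from the question holds for this pair:
   pointers that compare equal also convert to equal integers.
   Returns 1 trivially when the pointers are unequal, because then
   there is nothing to check. A 0 result would be the counterexample
   the thread is looking for. */
int equal_pointers_convert_equal(const void *p, const void *q)
{
    if (p != q)
        return 1;
    return (uintptr_t)p == (uintptr_t)q;
}

/* The guarantee C99 does give: void * -> uintptr_t -> void *
   yields a pointer that compares equal to the original. */
int round_trip_ok(void *p)
{
    return (void *)(uintptr_t)p == p;
}
```

On a flat-address implementation both functions would be expected to return 1 for any valid pointers, which is why a real counterexample has to come from something like a segmented memory model.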
On May 26, 3:21 pm, Flash Gordon <s@flash-gordon.me.uk> wrote: > The good old x86 range of processors uses overlapping segments, although > I don't know if any compilers allowed objects (or malloced regions) > larger than a segment.
<OT, but not very>
I seem to remember a C89 implementation in which malloc wouldn't return a block larger than a segment, but by using the right compiler switches and an implementation-specific header and library, it was possible to get objects larger than a segment. You could declare

    char far* s;

to get a pointer that could point anywhere in memory (the compiler switch made far into a keyword) but compared like intptr_t would (so occasionally you got counter-intuitive and non-compliant behaviour, such as greater-than tests on two pointers into the same object returning the wrong result), or

    char huge* s;

to get a pointer which behaved correctly but was more expensive (i.e. the segmentation was taken into account with extra instructions). 'far' was enough in most cases, and you needed a special implementation-specific farmalloc to get big objects.

Of course, the special things you had to do to get this to happen weren't C89, or any other standard that I know of. This was one of the sorts of things that caused confusion for beginning C programmers on DOS systems. It's linked to the whole 'memory model' thing that compilers nowadays can thankfully take care of themselves. (Hint to anyone who actually ends up having to learn C on such a system: set it to 'small' and you'll get correct C89 behaviour without having to worry about it any further.)
</OT>
-- ais523
Malcolm McLean wrote: > "Flash Gordon" <s@flash-gordon.me.uk> wrote in message > news:sbsli4x8pt.ln2@news.flash-gordon.me.uk... > > Malcolm McLean wrote: > > > So at runtime the program can cheaply compare > >> pointer for equality by comaring bits, and doesn't need a normalisation > >> step. > > Wrong. The standard guarantees that comparing for equality always works. > > It is only relational operators excluding equality and inequality that do > > not have to work. > Once your pointer holds an illegal address, any other operations on it > become undefined. Including tests for equality.
[...] If you have two valid pointers to separate "objects", must an equality test return "false"? I seem to recall that in the segmented world of real-mode x86 systems, "far" pointers (which consisted of a 16-bit segment and a 16-bit offset) only compared the offset part, meaning that, if the two "objects" were at the same offset within different segments, the pointers would compare as equal. Of course, it's been many years since I've done such work, so I may be remembering incorrectly. -- +-------------------------+--------------------+-----------------------+ | Kenneth J. Brody | www.hvcomputer.com | #include | | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> | +-------------------------+--------------------+-----------------------+ Don't e-mail me at: <mailto:ThisIsASpamT@gmail.com>
Flash Gordon wrote:
[...] > The good old x86 range of processors uses overlapping segments, although > I don't know if any compilers allowed objects (or malloced regions) > larger than a segment.
[...] Yes. In addition to "near" pointers (16 bit offset into the default data segment) and "far" pointers (32 bit segment:offset), there were also "huge" pointers (32 bit segment:offset) which could point to memory regions larger than a single 64K segment. Attempting to make this on-topic... On such architectures, the compilers would assume (rightfully so, as far as the standard is concerned) that only the offsets would need to be compared in non-huge pointers. Because, even though a "far" pointer was 32 bits, you can only compare pointers within a single object. Therefore, 1111:0080 and 2222:0080 could compare "equal". Also, I believe that xxxx:0000 would compare equal to NULL, regardless of the segment "xxxx" value. This would mean that 1111:0000 and 2222:0000 would both compare equal to each other, and both would compare equal to NULL, but storing them in long ints would make them compare unequal to each other. (Remember, the above assumes that these pointers are "far" and not "huge".) -- +-------------------------+--------------------+-----------------------+ | Kenneth J. Brody | www.hvcomputer.com | #include | | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> | +-------------------------+--------------------+-----------------------+ Don't e-mail me at: <mailto:ThisIsASpamT@gmail.com>
In article <465C5791.4DBEA@spamcop.net>, Kenneth Brody <kenbr@spamcop.net> wrote: >If you have two valid pointers to separate "objects", must an >equality test return "false"?
Yes. Equality and inequality tests, unlike ordering tests, must work on any two valid pointers of the same type, whether or not they point into the same object, and must report inequality for pointers to different objects. > I seem to recall that in the >segmented world of real-mode x86 systems, "far" pointers (which >consisted of a 16-bit segment and a 16-bit offset) only compared >the offset part, meaning that, if the two "objects" were at the >same offset within different segments, the pointers would compare >as equal. >Of course, it's been many years since I've done such work, so I >may be remembering incorrectly.
An implementation that did that would be non-conforming. dave -- Dave Vandervies dj3va@csclub.uwaterloo.ca I _am_ consistent - if one of those other pointer guide writers came here and asked for comments, they'd get chewed out just as badly. --Richard Bos in comp.lang.c
Kenneth Brody wrote, On 29/05/07 17:31: > Flash Gordon wrote: > [...] >> The good old x86 range of processors uses overlapping segments, although >> I don't know if any compilers allowed objects (or malloced regions) >> larger than a segment. > [...] > Yes. In addition to "near" pointers (16 bit offset into the default > data segment) and "far" pointers (32 bit segment:offset), there were > also "huge" pointers (32 bit segment:offset) which could point to > memory regions larger than a single 64K segment.
Ah, but could you use such pointers without using the non-standard far/huge keywords? Say with an option on the compiler? > Attempting to make this on-topic... > On such architectures, the compilers would assume (rightfully so, as > far as the standard is concerned) that only the offsets would need to > be compared in non-huge pointers. Because, even though a "far" > pointer was 32 bits, you can only compare pointers within a single > object. Therefore, 1111:0080 and 2222:0080 could compare "equal". > Also, I believe that xxxx:0000 would compare equal to NULL, regardless > of the segment "xxxx" value. This would mean that 1111:0000 and > 2222:0000 would both compare equal to each other, and both would > compare equal to NULL, but storing them in long ints would make them > compare unequal to each other. (Remember, the above assumes that > these pointers are "far" and not "huge".)
That would be non-conforming. -- Flash Gordon
Kenneth Brody <kenbr @spamcop.net> writes: > Flash Gordon wrote: > [...] >> The good old x86 range of processors uses overlapping segments, although >> I don't know if any compilers allowed objects (or malloced regions) >> larger than a segment. > [...] > Yes. In addition to "near" pointers (16 bit offset into the default > data segment) and "far" pointers (32 bit segment:offset), there were > also "huge" pointers (32 bit segment:offset) which could point to > memory regions larger than a single 64K segment.
<OT> I could be wrong about this, since I've never actually used such a system, but I *think* that "near" and "far" were kinds of pointers, and "huge" was one of several memory models. I don't think there was (is?) such a thing as a "huge" pointer. </OT> -- Keith Thompson (The_Other_Keith) k@mib.org <http://www.ghoti.net/~kst> San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst> "We must do something. This is something. Therefore, we must do this." -- Antony Jay and Jonathan Lynn, "Yes Minister"
Keith Thompson said:
> Kenneth Brody <kenbr @spamcop.net> writes: >> Flash Gordon wrote: >> [...] >>> The good old x86 range of processors uses overlapping segments, >>> although I don't know if any compilers allowed objects (or malloced >>> regions) larger than a segment. >> [...] >> Yes. In addition to "near" pointers (16 bit offset into the default >> data segment) and "far" pointers (32 bit segment:offset), there were >> also "huge" pointers (32 bit segment:offset) which could point to >> memory regions larger than a single 64K segment. > <OT> > I could be wrong about this, since I've never actually used such a > system, but I *think* that "near" and "far" were kinds of pointers, > and "huge" was one of several memory models. I don't think there was > (is?) such a thing as a "huge" pointer. > </OT>
You're right. Early PCs had six memory models: tiny, small, medium, large, compact, and huge. (Come to think of it, they're probably still in there somewhere!) -- Richard Heathfield "Usenet is a strange place" - dmr 29/7/1999 http://www.cpax.org.uk email: rjh at the above domain, - www.
Op Tue, 29 May 2007 16:11:20 -0700 schreef Keith Thompson:
> Kenneth Brody <kenbr @spamcop.net> writes: >> Flash Gordon wrote: >> [...] >>> The good old x86 range of processors uses overlapping segments, although >>> I don't know if any compilers allowed objects (or malloced regions) >>> larger than a segment. >> [...] >> Yes. In addition to "near" pointers (16 bit offset into the default >> data segment) and "far" pointers (32 bit segment:offset), there were >> also "huge" pointers (32 bit segment:offset) which could point to >> memory regions larger than a single 64K segment. > <OT> > I could be wrong about this, since I've never actually used such a > system, but I *think* that "near" and "far" were kinds of pointers, > and "huge" was one of several memory models. I don't think there was > (is?) such a thing as a "huge" pointer. > </OT>
<OT> TC had huge pointers, from the help: "The huge modifier is similar to the far "modifier except for two additional features. "Its segment is normalized during pointer "arithmetic so that pointer comparisons are "accurate. And, huge pointers can be "incremented without suffering from segment "wraparound. </OT> So I think that's conforming, although the range was 20 bits. -- Coos
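The "segment is normalized" part of that help text can be illustrated with a small model. This is a sketch of the general idea, not Turbo C's actual code: the struct and function names are invented, and the 20-bit mask is an assumption reflecting the 20-bit range mentioned above. Normalizing means rewriting segment:offset so that the offset is below 16, which gives each linear address exactly one representation and makes a plain bitwise comparison accurate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16:16 segmented pointer. */
struct seg_off {
    uint16_t segment;
    uint16_t offset;
};

/* Normalize so that offset < 16. Assumes a 20-bit address space,
   so the linear address is masked to 20 bits. */
struct seg_off normalize(struct seg_off p)
{
    uint32_t linear = (((uint32_t)p.segment << 4) + p.offset) & 0xFFFFFu;
    struct seg_off n;
    n.segment = (uint16_t)(linear >> 4);
    n.offset = (uint16_t)(linear & 0xFu);
    return n;
}

/* After normalization, comparing both fields is the same as
   comparing linear addresses. */
int same_address(struct seg_off a, struct seg_off b)
{
    struct seg_off na = normalize(a);
    struct seg_off nb = normalize(b);
    return na.segment == nb.segment && na.offset == nb.offset;
}
```

In this model FFFF:0000 and F000:FFF0 both normalize to FFFF:0000, while 1111:0080 and 2222:0080 normalize to 1119:0000 and 222A:0000 and so stay distinct.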
Flash Gordon wrote: > Kenneth Brody wrote, On 29/05/07 17:31: > > Flash Gordon wrote: > > [...] > >> The good old x86 range of processors uses overlapping segments, although > >> I don't know if any compilers allowed objects (or malloced regions) > >> larger than a segment. > > [...] > > Yes. In addition to "near" pointers (16 bit offset into the default > > data segment) and "far" pointers (32 bit segment:offset), there were > > also "huge" pointers (32 bit segment:offset) which could point to > > memory regions larger than a single 64K segment. > Ah, but could you use such pointers without using the non-standard > far/huge keywords? Say with an option on the compiler?
I believe so. You told the compiler the memory model you wanted to use (there were at least "small", "medium", "compact" and "large") which determined the type of data and code pointers. (The "medium" and "compact" models had one type as 16 bits and the other as 32. I don't recall which was which.) There may have been a flag to set "huge" as the default. -- +-------------------------+--------------------+-----------------------+ | Kenneth J. Brody | www.hvcomputer.com | #include | | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> | +-------------------------+--------------------+-----------------------+ Don't e-mail me at: <mailto:ThisIsASpamT@gmail.com>
"Dave Vandervies" <dj3va @csclub.uwaterloo.ca> wrote in message news:f3hp5l$qdd$1@rumours.uwaterloo.ca...
> In article <465C5791.4DBEA @spamcop.net>, > Kenneth Brody <kenbr @spamcop.net> wrote: >>If you have two valid pointers to separate "objects", must an >>equality test return "false"? > Yes. Equality and inequality tests, unlike ordering tests, must work on > any two valid pointers of the same type, whether or not they point into > the same object, and must report inequality for pointers to different > objects. >> I seem to recall that in the >>segmented world of real-mode x86 systems, "far" pointers (which >>consisted of a 16-bit segment and a 16-bit offset) only compared >>the offset part, meaning that, if the two "objects" were at the >>same offset within different segments, the pointers would compare >>as equal. >>Of course, it's been many years since I've done such work, so I >>may be remembering incorrectly. > An implementation that did that would be non-conforming.
If two pointers have the same bit pattern then, unless you have some seriously weird architecture, they must be equal. However, on some architectures there are several representations of the same physical address.

However, if we increment a pointer to one object so that it points into another then, except in the special case of one past, that is an error, and all subsequent operations, including pointer comparison, become undefined. Therefore, if you restrict objects to one "segment", equality tests can be implemented with a simple comparison of bits. There is no need to resolve to a physical address. Comparing pointers into different objects will always return false. Pointers into the same object have the same base, so no two representations ever address the same memory. With illegal pointers the behaviour is undefined, so the compiler can return true, false, or segfault at whim.

-- Free games and programming goodies. http://www.personal.leeds.ac.uk/~bgy1mm
On Tue, 29 May 2007 23:29:28 +0000, in comp.lang.c , Richard Heathfield <r @see.sig.invalid> wrote: >Early PCs had six memory models: tiny, small, medium, >large, compact, and huge. Strictly speaking, these were 'features' of early PC compilers, allowing you to choose between different stack and heap sizes and addressable space layout. They weren't features of the hardware per se, whereas far and near pointers were. -- Mark McIntyre "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." --Brian Kernighan
Malcolm McLean wrote: > If two pointers have the same bit pattern then, unless you have some > seriously weird architecture, they must be equal. However on some > architectures there are several representations of the same physical > address. > However if we increment a pointer to one object so that it points into > another then, except in the special case of one past, that is an > error, and all subsequent operations, including pointer comparision, > become undefined. Therefore, if you restrict objects to one "segment", > equality tests can be implemented with a simple comparison of bits. > There is no need to resolve to a physical address. Pointers in > different objects will always return false. Pointers in the same > object have the same base, so not two representations ever address the > same memory. With illegal pointers the behaviour is undefined so the > compiler can return true, false, or segfault at whim.
Bitwise comparison is not good enough, if it is possible to write a program where two pointers to the same object have different representations without invoking UB in creating both representations. If a valid program can create different representations for a pointer to the same object, then you must resolve pointers to the physical address before you can compare them. Otherwise, I agree with your analysis. Bart v Ingen Schenau -- a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq c.l.c FAQ: http://www.eskimo.com/~scs/C-faq/top.html c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/
"Coos Haak" <chfo @hccnet.nl> wrote in message news:j8rq4nztpbmu.12dqzp2v6808k$.dlg@40tude.net...
> Op Tue, 29 May 2007 16:11:20 -0700 schreef Keith Thompson: >> Kenneth Brody <kenbr@spamcop.net> writes: >>> Flash Gordon wrote: >>> [...] >>>> The good old x86 range of processors uses overlapping segments, >>>> although >>>> I don't know if any compilers allowed objects (or malloced regions) >>>> larger than a segment. >>> [...] >>> Yes. In addition to "near" pointers (16 bit offset into the default >>> data segment) and "far" pointers (32 bit segment:offset), there were >>> also "huge" pointers (32 bit segment:offset) which could point to >>> memory regions larger than a single 64K segment. >> <OT> >> I could be wrong about this, since I've never actually used such a >> system, but I *think* that "near" and "far" were kinds of pointers, >> and "huge" was one of several memory models. I don't think there was >> (is?) such a thing as a "huge" pointer. >> </OT> > <OT> > TC had huge pointers, from the help: > "The huge modifier is similar to the far > "modifier except for two additional features. > "Its segment is normalized during pointer > "arithmetic so that pointer comparisons are > "accurate. And, huge pointers can be > "incremented without suffering from segment > "wraparound. > </OT> > So I think that's conforming, although the range was 20 bits.
Huge pointers were saying "forget about performance, I just want standard C". Unfortunately the processors were rather slow, so normally you couldn't. Then we were all younger in those days, and thought hacking was cleverer than writing clean and portable software. -- Free games and programming goodies. http://www.personal.leeds.ac.uk/~bgy1mm