Use of "static" in array parameter of function
In both C and C++ a function declared:
void function(Type t[]);
is converted to
void function(Type* t);
That is, within the function, any information about the size of the array is lost. (You are free to index t as far as you like; good luck.)
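A minimal sketch of that adjustment, for anyone who wants to see it rather than take it on faith. The names takes_array, takes_pointer and bytes_seen_by_callee are made up for illustration; nothing here comes from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Declared with array syntax...
void takes_array(int t[]);
// ...and declared with pointer syntax.
void takes_pointer(int* t);

// The two declarations have exactly the same function type,
// because the array parameter is adjusted to a pointer.
static_assert(std::is_same<decltype(takes_array), decltype(takes_pointer)>::value,
              "array parameter is adjusted to a pointer parameter");

// Inside the callee only the pointer is visible, so sizeof reports
// the size of a pointer, not the size of the caller's array.
std::size_t bytes_seen_by_callee(int* t) {
    return sizeof(t);
}
```

Calling bytes_seen_by_callee on an int array of any length should give sizeof(int*), which is the "size is lost" point above.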
C99 adds this syntax:
void function(Type t[static 3]);
This allows the compiler to optimize code within the function in ways that assume *at least* 3 elements. (How much, I don't know.)
It doesn't look like this syntax is allowed in either the current or coming C++ standards. Correct?
Right. C++ doesn't support this feature and there seem to be some misgivings about adding it to C++0x. Anyway, it's too late now since the FCD was approved in March. Therefore, if you're writing code that's meant to be compiled with a C++ compiler, it's best to avoid this feature.
To be fair, anywhere you have
Originally Posted by hendrixj
you are always allowed to access it with
You are lucky to get a compiler warning; compilers cannot understand complex logic well enough to warn on this sort of thing.
For example, just doing this:
x = 27;
a[x]= 1; //woah, way, way outta bounds now
the compilers can't catch it.
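Since the compiler won't catch it, the check has to happen at run time if you want one. A rough sketch, assuming you can use std::vector instead of a raw array; checked_store is a hypothetical helper name, not something from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: stores value at index x and reports whether
// the index was in range. std::vector::at() does the bounds check
// at run time and throws std::out_of_range on a bad index.
bool checked_store(std::vector<int>& a, std::size_t x, int value) {
    try {
        a.at(x) = value;
        return true;
    } catch (const std::out_of_range&) {
        return false;  // index was outside [0, a.size())
    }
}
```

With a 10-element vector, an index of 27 is rejected instead of silently scribbling on memory. The cost is a comparison per access, which is exactly what the raw-array version skips.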
The trouble is that, as you noted, the ability to optimize code is lost when you have no way to TELL the compiler how many elements you have. The biggest loss here is the heuristic that says "hey, this loop only loops one time, let's eliminate the baggage" or "this only loops twice, let's inline it and discard the jump/loop increment/etc." I think the rule is that at about 3-5 iterations or fewer you are better off unrolling the loop, but that's a rule of thumb from way, way back when such things mattered.
Basically, if you *know* it's only a[3] in your code, go ahead and unroll your loop inside the function since the compiler cannot, something like this:
void dosomething(int *a)
{
    // unrolled by hand: no loop counter, no jumps
    a[0] = whateva(0);
    a[1] = whateva(1);
    a[2] = whateva(2);
    // instead of:
    // for(x = 0; x < 3; x++)
    //     a[x] = whateva(x);
}
which saves you processing x entirely (modify, write back, allocate, etc.), jumps that can break the pipelines, ugly assembly to do addressing on a variable (a[x] is harder to do than a constant index, at least in Intel assembly), and other such things.
And, after doing all that, you are lucky to save a whole nanosecond unless you are on a slow, tiny embedded processor. So it's almost always unimportant to do this stuff anymore.
The loop version works no matter how many you have; an enum or defined constant or whatever can govern it. The unrolled loop requires tinkering if you ever move from 3 to 4 in your code. So would the syntax that was rejected. I do not care for this sort of tweak to C++. IMO, if you are at the level where this type of thing is an issue, you should code that block in assembly yourself to be certain that it's tweaked to the max.
I cannot think of any other major type of optimization that is lost by not allowing the explicit size syntax, but there may be other minor tweaks that an ultra smart compiler could do with the information.
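For what it's worth, C++ does have one way to hand the element count to the compiler: take the array by reference in a template. The sketch below assumes that approach; the body of whateva is a made-up stand-in, since the post never shows it.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for whateva() from the post above; the real body is unknown.
inline int whateva(int i) { return i * 10; }

// N is deduced from the argument's type, so the loop bound is a
// compile-time constant. Whether the compiler actually unrolls the
// loop is up to its optimizer.
template <std::size_t N>
void dosomething(int (&a)[N]) {
    for (std::size_t x = 0; x < N; ++x)
        a[x] = whateva(static_cast<int>(x));
}
```

Moving from 3 elements to 4 then needs no edits inside the function; only the array declaration at the call site changes. The trade-off is that it only accepts true arrays, not pointers.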
If by "access" you mean taking the address, you're right: a pointer to one element past the end of the array is valid. However, reading or writing through that address causes undefined behavior. This allowance exists only for overflow. Underflow is a different story (you're not allowed to even form a pointer to an element before element 0).
Originally Posted by jonnin
Last edited by Danny; 05-20-2010 at 02:39 PM.
What I meant was "the compiler doesn't give a hoot". You can happily type a = 11; and it will happily compile, and you are lucky to get a warning (depending on the complexity of the actual statements). It may crash or do all sorts of horrible things, but the compiler does not care; the statement is legal (one of many perfectly legal statements that are bad news). My main point was that this is totally unrelated to the static keyword language extension, and that you can always cause a problem in this way (with or without that language feature/extension).
Originally Posted by Danny
Interesting, because my bucket sort does use negative addressing. It takes an array of twice the size of the max value +1, moves to the middle (taking a pointer addressed there), and then sorts (or, really, just counts instances of data) signed values into it via potentially negative array access. So if the value in question is, say, -10, it would access that pointer at -10, or p[-10]++ for example. This routine has always worked just fine, but I dunno how legal that is. The accessed memory IS valid, since it's a pointer into the middle of valid memory.
Is that a bad piece of code? I do not use it much; it's rare to have values that work with a bucket sort anyway, but since you mentioned it, I thought I would ask.
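To make the question concrete, here is a rough sketch of the kind of routine described above. It is a guess at the shape, not the poster's actual code: bucket_sort_signed and max_abs are invented names, and it assumes every input value lies in [-max_abs, max_abs].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counting-style bucket sort for values known to lie in [-max_abs, max_abs].
// Precondition (not checked): max_abs >= 0 and every value is in range.
std::vector<int> bucket_sort_signed(const std::vector<int>& values, int max_abs) {
    // One bucket per possible value: 2 * max_abs + 1 of them.
    std::vector<std::size_t> counts(2 * static_cast<std::size_t>(max_abs) + 1, 0);

    // Pointer to the middle bucket, which counts occurrences of 0.
    std::size_t* mid = counts.data() + max_abs;

    for (int v : values)
        ++mid[v];  // v may be negative; mid[v] still lands inside counts

    std::vector<int> sorted;
    sorted.reserve(values.size());
    for (int v = -max_abs; v <= max_abs; ++v)
        sorted.insert(sorted.end(), mid[v], v);  // append mid[v] copies of v
    return sorted;
}
```

Every index used here resolves to an element of counts, so the negative subscripts never leave the allocated block. An out-of-range input value would, which is why the precondition matters.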
Yes, your code works by fluke. It has undefined behavior, and that's not just a theoretical term. The compiler of course doesn't care because it rarely keeps track of the size of an array. However, the runtime code might hold nasty surprises. Take for example an array allocated using new. In some implementations, the size of the array is stashed as a cookie at address [-1]. That is, the first four bytes just before the array's first element are also reserved for the implementation. Any attempt to modify these bytes will cause delete to return an incorrectly sized buffer to the heap, which will corrupt the heap.
Overflow might just get by because implementations often place a few sentry bytes after each array (for debugging purposes). This enables debuggers to trap overflows.
I never bypass the original sizes though. Its like this, in short:
int *storage = new int[SIZE]; // SIZE is larger than 500
int *navigate = &(storage[500]);
navigate[-20]++; // Is this a problem? It's a valid address; it's really storage[480] or whatever.
I never intentionally overstep allocated memory (i.e., I would never poke past the end of storage or at storage[-10] or the like, not even to read them).
I don't know that it works by a fluke; it's addressing valid memory at all times, so it's simply a question of "is negative indexing itself illegal"?
(Note that in many a code segment, pointers like navigate have been ++ and -- modified to move around inside another memory block legally, as have "navigate += 5" or "navigate -= 10". If it goes out of bounds, it's a bug, but this sort of pointer walking has been done since vanilla C. It should be the exact same as what I did, except mine keeps navigate where it was and offsets from it.)
Last edited by jonnin; 05-21-2010 at 09:28 AM.
Everything that's undefined behavior is a problem. If it works, it's sheer luck. You can't assume that the code will be portable (even a new compiler version might make the code break) and of course, you can never tell what's hidden in address [-20] in a portable and reliable fashion. So I wouldn't use this technique in production code. It will explode one day.
So, then, are any of these also undefined?
int *x = new int[N];
int *p = &(x[N / 2]); // p points into the middle of the block
p++; // legal?? (legal = defined behavior)
p += 2; // legal?
p -= 5; // legal?
I do know that what's at -20 in my other example is the same integer that's in the original array at 480. It's the same address, we know *that* for sure, except for one issue: the compiler could force the offset into an unsigned int and calculate a bad address (even if legal/in bounds/defined, it would be the wrong address for the logic). Other than that, we know that a block of memory from new is sequential, so if we move to the middle of it and backtrack, it's still in that block and exactly the same as going to the original pointer's positive offset (the code looks different but the computed address is the same).
It's easy enough to fix, for that reason: I can just change the way a code segment does addressing and it's all good (use the original pointer and positive offsets instead), but I am asking to make absolutely certain I understand what is and isn't defined for these manipulations. It's only in a place or two; normally I prefer to go from 0-N, but a few algorithms divide things in half or iterate backwards over data and are less intuitive when coded in the other direction (which, you know, doesn't bother ME in the least, but for the sake of my peers I try to make stuff at least quasi-readable).
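The "same address" claim can be checked directly. A small sketch, assuming a block of 1000 ints with the navigating pointer at index 500; the size 1000 is my assumption (only the 500/480 relationship is implied by the posts), and same_element is a hypothetical name.

```cpp
#include <cassert>

// Checks that offsetting backwards from a pointer into the middle of a
// block reaches the same object as a positive index from the start.
bool same_element() {
    int* storage = new int[1000]();   // assumed size; zero-initialized
    int* navigate = &storage[500];    // pointer into the middle of the block

    navigate[-20]++;                  // same object as storage[480]
    bool same = (&navigate[-20] == &storage[480]) && (storage[480] == 1);

    int* p = navigate;
    p++;                              // storage[501]
    p += 2;                           // storage[503]
    p -= 5;                           // storage[498], still inside the block
    same = same && (p == &storage[498]);

    delete[] storage;
    return same;
}
```

Every pointer formed here stays inside the allocated block, which is the case the pointer-arithmetic rules cover.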
The standard says (from 5.2.1):
A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions shall have the type “pointer to T” and the other shall have enumeration or integral type. The result is an lvalue of type “T.” The type “T” shall be a completely-defined object type. The expression E1[E2] is identical (by definition) to *((E1)+(E2)). [Note: see 5.3 and 5.7 for details of * and + and 8.3.4 for details of arrays. ]
It doesn't appear that a[-3], per se, is undefined syntax, since *(a - 3) is fully defined.
The question is what address a contains. If it's the beginning of the array (i.e., a is a true array, declared like int a[N]; ) then the expression a[-3] is invalid because it refers to an address before the first element, which is never legal. The relevant section in the standard says clearly which pointer arithmetic operations are valid for arrays:
-5- When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
(emphases mine, DK)
Last edited by Danny; 05-21-2010 at 02:34 PM.
So now I am more confused than usual.
"blah .. integral type is added to or subtracted from a pointer blah..."
indicates that what I did is perfectly fine, then, since the pointer in question is in the middle of an existing block, and that all 4 of those statements I asked about are defined/legal in that code snippet, and that the undefined stuff is the same old same old as always (going beyond the boundary of the memory that is allocated in either direction).
I'll summarize it: pointer arithmetic is fine (adding or subtracting an integer, ++, --) so long as the resulting address points to a valid array element or one past the last element. Underflow means going to an array element that's before element 0. Overflow means going past [n+1], where n is the index of the last array element. Forming the address of [n+1] is valid so long as you never read or write through it.
So yes, accessing elements in the middle of an array is perfectly fine. I was under the impression that your code does something different: accessing elements before element 0.
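Going by the 5.7 text quoted earlier, the one-past-the-end address exists so that loops have something to compare against. A minimal sketch of that idiom; sum is an illustrative name, and the pointer is only ever compared, never dereferenced, at the end position.

```cpp
#include <cassert>
#include <cstddef>

// Sums the half-open range [first, last). The pointer last may be
// one past the final element; it is compared against but never read.
int sum(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p)
        total += *p;
    return total;
}
```

For int a[3], the call sum(a, a + 3) passes a + 3 as the end marker. Starting from the middle, as in sum(a + 1, a + 3), is the same in-bounds pointer walking discussed above.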