I've been looking for this kind of information for some time, but still haven't been able to find it. I've heard from some people that it's actually faster if you declare a variable outside a loop than inside, even if you assign it a new value each iteration. For example:
Point p;
for (int i = 0; i < 10; i++) {
    p = new Point(i, i);
}
is said to be faster than:
for (int i = 0; i < 10; i++) {
    Point p = new Point(i, i);
}
However, the second option is much more elegant, less error-prone (since the variable is inside the loop) and it doesn't seem hard for the compiler to see such a thing and optimize for it.
If the first code actually runs faster than the latter, then it would be useful to be able to do something like:
for (int i = 0, Point p = null; i < 10; i++) {
    p = new Point(i, i);
}
I'd like also to know about other variations, such as:
for (int i = 0; i < 10; i++) {
    double d = (double)i;
}
versus:
double d;
for (int i = 0; i < 10; i++) {
    d = (double)i;
}
or even, if such a declaration were allowed:
for (int i = 0, double d = 0; i < 10; i++) {
    d = (double)i;
}
And, also, if the final modifier could make things faster:
for (int i = 0; i < 10; i++) {
    final Point p = new Point(i, i);
}
Thanks in advance,
Dhakir.
07-21-2005, 05:23 AM
ractoc
Defining a variable outside a loop is indeed faster, although at low iteration counts the difference can be ignored.
The reason for the performance difference lies in the way Java handles objects.
An object is created in two steps. The first is the declaration, which creates space in memory for a pointer. At this point the pointer points nowhere. After the declaration comes the initialisation. During this step, the pointer is set to point to a different memory area, where the actual data of the object is located.
Now, in your example above, placing the declaration outside the loop means this step is done only once, whereas placing the declaration inside the loop means that the application has to create that pointer on each iteration of the loop.
07-21-2005, 06:18 AM
sjalle
I may be wrong here, but wouldn't the compiler/Java VM optimize the loop in the example and
reuse the memory space allocated for the Point instance? I have seen different
performance depending on the platform for other similar cases. E.g. on Windows
an applet worked fine, but "hit the roof" on Unix as the applet reallocated fonts all
the time. This piled up on Unix, but Windows reused previously allocated memory space.
07-21-2005, 07:13 AM
ractoc
I think this partly depends on the JVM running on the machine where you run your application, since this is where the actual memory allocation takes place.
07-21-2005, 10:42 AM
dhakir
Thanks for the replies.
If someone could point me to where such information would be in the JLS (if it is there at all), I'd appreciate it. I've tried looking for it, but didn't find anything.
Could I try that using a profiler? I've never used one, so I don't know much about them, but could Sun's profiler help?
07-22-2005, 01:06 AM
htayod
Modern JVMs treat all these kinds of loops identically. That is, no matter whether you declare a variable inside the loop or outside the loop, performance won't change.
You needn't worry about such small things. If you want to write good and fast Java, just choose the right algorithms. The JVM will do the rest.
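To make the comparison concrete, here is a minimal sketch of the two forms side by side. The class name LoopScopeDemo, the two method names and the small Point stand-in are made up for illustration (the snippets in the question presumably use java.awt.Point). As far as I know, a local variable declaration by itself compiles to no instructions; the variable is just a slot in the method's frame. So running javap -c on the compiled class should show essentially the same instructions for both loops, differing at most in slot numbers.

```java
class LoopScopeDemo {
    // Hypothetical stand-in for the Point class used in the question.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Variable declared once outside the loop and reassigned each iteration.
    static long sumDeclaredOutside(int n) {
        long sum = 0;
        Point p;
        for (int i = 0; i < n; i++) {
            p = new Point(i, i);
            sum += p.x + p.y;
        }
        return sum;
    }

    // Variable declared inside the loop body.
    static long sumDeclaredInside(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i);
            sum += p.x + p.y;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both calls do the same work: one allocation per iteration.
        System.out.println(sumDeclaredOutside(10));
        System.out.println(sumDeclaredInside(10));
    }
}
```

In both versions the only real cost per iteration is the new Point(i, i) allocation, which is the same whichever way the variable is declared.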
Thanks for the help. I generally try to follow Bloch's advice from Effective Java (correctness over performance), and I never really worry about performance (I prefer elegant code).
But, if I can maintain elegance while improving performance (even in small steps), I believe it would be better to learn it. The real problem is, I can't recall an example of such a loop, where a well-known programmer states: "do this", or "do that". All I could find were some VERY strange sites on the net, like some stating that you should use while + Exception to exit a for loop (to avoid the condition check). From trusted sources, I've seen dozens of similar examples, but not one of these. But these loops are so common for me that I thought something was wrong.
Anyway, about such kinds of automatic optimization, I had a surprise the other day, when I decided to analyse the speed of divisions by 2 and shifts (using integers, in Sun's 1.5.0 JVM for Windows). I thought division would be automatically optimized, but in my tests there was a noticeable difference between them. In a very stupid test (a for loop with about 10 million iterations and some simple arithmetic inside it), the shift version ran in half the time. Multiplications by 2 and shifts had no noticeable difference, perhaps because multiplication is a simpler operation, perhaps because it's optimized. But I don't know. Since my testing was very simple, it's prone to a lot of mistakes, so I'd like some expert opinion on the issue.
For now, I'll stick with htayod's advice, but if someone has had similar experiences, I'd be very interested to know about them. Also, if there's an explanation for the non-optimization of division by 2, please let me know.
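For anyone who wants to repeat the experiment, a naive timing loop of this kind might look like the sketch below. It is not the original test code; the class and method names are made up, and the warm-up count is an arbitrary choice. Timings from a loop like this depend heavily on the JVM, the machine and JIT warm-up, so treat the numbers as a rough hint only.

```java
class DivVsShiftTiming {
    // Sums i / 2 for i in [0, n). Only non-negative values are used,
    // so division and shift give the same results here.
    static long sumOfHalvesByDivision(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i / 2;
        }
        return sum;
    }

    // Same computation using an arithmetic right shift.
    static long sumOfHalvesByShift(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i >> 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10000000;
        long sink = 0;

        // Warm-up runs so the JIT has a chance to compile both methods.
        for (int r = 0; r < 5; r++) {
            sink += sumOfHalvesByDivision(n);
            sink += sumOfHalvesByShift(n);
        }

        long t0 = System.nanoTime();
        sink += sumOfHalvesByDivision(n);
        long t1 = System.nanoTime();
        sink += sumOfHalvesByShift(n);
        long t2 = System.nanoTime();

        System.out.println("division: " + (t1 - t0) + " ns");
        System.out.println("shift:    " + (t2 - t1) + " ns");
        // Printing the accumulated value keeps the results in use.
        System.out.println("sink: " + sink);
    }
}
```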
07-24-2005, 07:14 AM
sjalle
Just a comment: it's very typical that we have to code monster loops to even notice
any difference. If execution speed is absolutely vital for a complex, memory-munching monster-loop algorithm, I would recommend a C or assembler
executable (or perhaps Fortran..hehe)
07-26-2005, 05:55 AM
htayod
I have two explanations for the /2 phenomenon:
1. Maybe HotSpot does not perform this optimization.
What you are doing is so-called "microbenchmarking", that is, you test Java performance on a very small program that tests only one feature. This is not similar to typical Java code, which is rather big, spaghetti-like, with many virtual calls, heap accesses, jumps and checks. So I guess that Sun's JVM guys just optimize the JVM for typical Java applications, and they have more significant code patterns to optimize than the replacement of /2 with >>.
2. Maybe it does, but your test may be slightly incorrect.
For instance, our JVM does replace /2 with >>. However, a simple test shows that j = j/2 is still slower than j = j >> 1. This is because these two operations are not equivalent. If j < 0, then j = j / 2 is equivalent to j = j + 1; j = j >> 1; (I may be wrong with the sign of "1" here). So the compiler generates a more complex code pattern for /2 than a simple shift.
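The non-equivalence is easy to check by hand. Below is a small sketch (class and method names are made up for illustration) that compares j / 2, j >> 1 and the adjusted shift described above. Java's integer division rounds toward zero, while >> rounds toward negative infinity, so the two only differ for negative odd values. The adjustment adds 1 only when j is negative, and for the values tried here the "+ 1" sign appears to be the right one.

```java
class DivideVersusShift {
    // Integer division: rounds toward zero, so -3 / 2 is -1.
    static int halveByDivision(int j) {
        return j / 2;
    }

    // Arithmetic right shift: rounds toward negative infinity, so -3 >> 1 is -2.
    static int halveByShift(int j) {
        return j >> 1;
    }

    // Shift with the correction for negative values:
    // (j >>> 31) is 1 when j is negative and 0 otherwise.
    static int halveByAdjustedShift(int j) {
        return (j + (j >>> 31)) >> 1;
    }

    public static void main(String[] args) {
        for (int j = -5; j <= 5; j++) {
            System.out.println(j + ": /2 = " + halveByDivision(j)
                    + ", >>1 = " + halveByShift(j)
                    + ", adjusted = " + halveByAdjustedShift(j));
        }
    }
}
```

The extra add (plus whatever is needed to derive the sign bit) is the "more complex code pattern" a compiler has to emit when it cannot prove that j is non-negative.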