One "feature" of the Java compiler is the lack of optimization. For example, this code:
int x = 10;
int y = 20;
int z = x + y;
compiles into bytecodes that execute literally, meaning 10 is pushed on the stack, stored in local var 0, etc. A good C compiler would optimize this code (assuming x and y are not used elsewhere in the method) to simply store 30 in local var z.
I have been told that the compiler does not optimize such code because then the JIT compiler can most efficiently optimize depending on the hardware platform.
My question is: is there a way to get the standard compiler to do a more efficient job of optimizing the bytecodes? Are there compilers out there that will?
Anyway, back to the original post's points/questions.
One "feature" of the Java compiler is the lack of optimization.
This is wrong. It does optimize. It may not do all the optimizations that a C compiler does.
I have been told that the compiler does not optimize
such code because then the JIT compiler can most
efficiently optimize depending on the hardware
platform.
I think it's not that, but rather that the HotSpot JVM can optimize individual areas of bytecode based on actual usage stats. I don't know if it actually compiles down to the hardware; it may apply bytecode->bytecode optimizations.
My question is: is there a way to get the standard
compiler to do a more efficient job of optimizing the
bytecodes?
Yes, by writing code such that the compiler can more easily determine what can be compiled away. A really good way to do this is by making more things immutable, and identifying within your code what things are already immutable, and declaring them final. I'm sure there are other areas, but it's debatable whether it's worth the effort. Write clean code with efficient algorithms, and let the JVM handle the picayune optimizations.
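To make the point about final concrete, here is a minimal sketch. The class and method names are made up for illustration. It relies on the language rule that a final local of primitive or String type initialized with a constant expression is a "constant variable", so an expression built only from such variables is itself a compile-time constant. The String comparison is just a trick to observe the folding at runtime, since constant String expressions are interned while runtime concatenation produces a new object.

```java
class ConstantFoldingDemo {

    // x and y are final and initialized with constants, so x + y is a
    // compile-time constant expression that javac evaluates itself.
    static int sum() {
        final int x = 10;
        final int y = 20;
        int z = x + y;
        return z;
    }

    // a + b is a constant expression here, so it is interned and is the
    // same object as the literal "foobar".
    static boolean foldedWithFinal() {
        final String a = "foo";
        final String b = "bar";
        return (a + b) == "foobar";
    }

    // Without final, the concatenation happens at runtime and yields a
    // new String object, so the identity comparison fails.
    static boolean foldedWithoutFinal() {
        String a = "foo";
        String b = "bar";
        return (a + b) == "foobar";
    }

    public static void main(String[] args) {
        System.out.println(sum());                // prints 30
        System.out.println(foldedWithFinal());    // prints true
        System.out.println(foldedWithoutFinal()); // prints false
    }
}
```

Running javap -c on the compiled class is the way to see exactly which instructions javac emitted for each method.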
Are there compilers out there that will?
I suspect not, because the JVM optimizations reduce the bang for the buck that those optimizations would provide. But I could easily be wrong. I recall an article/tutorial about BCEL that showed an application of BCEL to optimize bytecodes...you may want to look into that.
...let the JVM handle the picayune optimizations.
...the JVM optimizations reduce the bang for the buck that those optimizations would provide.
Valid points with a standard JVM. But ours is a 100% hardware implementation. It's kind of hard to do dynamic optimization algorithms in silicon. So, we need as much pre-processing optimization as we can get, both to reduce the memory footprint and to increase performance.
Thanks for the BCEL suggestion. I'll check it out.
Don't you just hate it when people assume that a
simplified example to illustrate a point is taken
directly from actual code?
Please point to where anyone says the example appears to be taken from actual code...
Anyway, that is beside the point and does not change the fact that your post seems to imply that the compiler ought to compensate for poor design / coding skills
(because you failed to tell us why you wanted "full" optimization).
I have been told that the compiler does not optimize
such code because then the JIT compiler can most
efficiently optimize depending on the hardware
platform.
The JIT compiler might very well become bloatware if it had to cater for every imaginable bytecode sequence.
It is quite possible that "hand optimized" bytecode would leave the JIT compiler clueless.
Unless you planned to run with the JIT compiler disabled (for whatever reason) this is not going to help performance.
If you plan to make a hardware implementation, similar considerations to those that apply to the JIT compiler will most likely apply to your pipeline prediction logic.
My question is: is there a way to get the standard
compiler to do a more efficient job of optimizing the
bytecodes? Are there compilers out there that will?
There are other Java compilers. There are also Java bytecode assemblers. Google is your friend.
It is possible that you could get ideas from C optimizer logic and implement post-processing of class files (or pre-processing of bytecode).
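As a rough idea of what such a post-processing pass might look like, here is a toy peephole constant folder. Everything in it is hypothetical: the Insn class and the PUSH/ADD/STORE names are a simplified stand-in for real bytecode, not the class file format and not the BCEL API. It also assumes straight-line code with no branch targets in the middle of the sequence, which a real optimizer would have to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PeepholeSketch {

    // Hypothetical, simplified instruction; not a real JVM opcode encoding.
    static final class Insn {
        final String op; // "PUSH", "ADD" or "STORE"
        final int arg;   // operand for PUSH/STORE, ignored for ADD

        Insn(String op, int arg) {
            this.op = op;
            this.arg = arg;
        }

        @Override
        public String toString() {
            return op.equals("ADD") ? op : op + " " + arg;
        }
    }

    // Replaces every "PUSH a, PUSH b, ADD" with "PUSH (a + b)".
    // Working on the output list lets folded results feed later folds.
    static List<Insn> foldConstants(List<Insn> in) {
        List<Insn> out = new ArrayList<>();
        for (Insn insn : in) {
            int n = out.size();
            if (insn.op.equals("ADD") && n >= 2
                    && out.get(n - 1).op.equals("PUSH")
                    && out.get(n - 2).op.equals("PUSH")) {
                int sum = out.get(n - 2).arg + out.get(n - 1).arg;
                out.remove(n - 1);
                out.remove(n - 2);
                out.add(new Insn("PUSH", sum));
            } else {
                out.add(insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Insn> code = Arrays.asList(
                new Insn("PUSH", 10),
                new Insn("PUSH", 20),
                new Insn("ADD", 0),
                new Insn("STORE", 2));
        System.out.println(foldConstants(code)); // prints [PUSH 30, STORE 2]
    }
}
```

Note that this only folds constants that are pushed directly. Folding the original example, where 10 and 20 go through local variables first, would additionally need some tracking of which locals hold known constants.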
But ours is a 100%
hardware implementation. It's kind of hard to do
dynamic optimization algorithms in silicon. So, we
need as much pre-processing optimization as we can
get, both to reduce the memory footprint and to
increase performance.
Ah, you should have said this earlier. This is interesting! Are you also producing all the class files as well, or are your customers going to be writing their own classes so you need to provide tools to your customer that will provide maximum compile-time optimizations?
Another question that comes to mind, is whether you could build the dynamic optimizations into the silicon. If they really are bytecode->bytecode optimizations, then this may be an option.
... are your customers going to be
writing their own classes so you need to provide
tools to your customer that will provide maximum
compile-time optimizations?
Exactly. The customers will be using standard javac, but because of the nature of our systems, we have to reduce the memory footprint as much as possible.
We already do much of what commercial tools do, such as removing unused code, removing unused fields, etc. We also do much more--stuff that's probably specific to our platform. But what we really need is something that can work on the bytecodes themselves.
Another question that comes to mind, is whether you
could build the dynamic optimizations into the
silicon. If they really are bytecode->bytecode
optimizations, then this may be an option.
Impossible. Too complicated, not enough space, not enough funding, ...
Yeah if I had to do this... first I'd google to find specialized compilers that optimize the hell out of the code, and if that didn't work out I'd try to write some optimizers using BCEL. The cool thing about the latter option is that the customer could continue to use whatever compiler they happened to use already.
If x, y are local variables, I would say you are right. If not, think about the effect of multiple threads accessing x and y. In this second case the compiler is right not to optimize the code.