embryo2 wrote:
alexfru wrote:
The problem of selecting the optimal jump/branch instructions (when you have a few forms of different range and therefore length) is actually an NP problem in the general case. So, every assembler either limits the number of labels and jumps/branches (since trying every possible option means "hanging" on large inputs) or at best produces suboptimal object code. And we haven't gotten anywhere near compilers yet. The battle is already lost in assemblers.
But what is the problem in the case of compilers? It is absolutely possible to include the jump optimization pass in a compiler. And even if this kind of optimization is also quite natural for assemblers, that doesn't mean we have to stop using it in compilers.
Let's go back a little to where you expressed discontent with the lack of a theory for arriving at the optimal solution:
embryo2 wrote:
It's craft and art mostly. There's no sound foundation like derivative-based optimization in mathematics, for example. A bunch of hints and many graph-processing algorithms. It's like many different bullets for different guns, but there's no theory for how to get an optimal bullet for the target. Trial and error, heuristics and other black magic.
It doesn't matter whether jump/branch optimization remains in an assembler (possibly a standalone assembler) or is integrated into a compiler. Moving the problem around doesn't change its nature. The point I'm making is that a compiler as a whole (including everything required to make an executable out of source code, e.g. the assembler, the linker, etc.) needs to solve problems which have no optimal solutions in practice. I'm talking about NP problems. And there are more of them than this one. So, inevitably you end up with suboptimal solutions. With heuristics, trial and error, black magic, craft, art, you name it. I don't see this as much different from some math problems having no closed-form solutions. You can still solve them, but for that you either simplify/redefine them or use numerical methods, both of which result in suboptimal or inexact solutions.
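To make the jump-sizing case concrete, here's a rough sketch (in C, with a made-up data layout, not any particular assembler's implementation) of the kind of fixed-point "relaxation" pass assemblers typically fall back on: encode every branch in its short form first, then widen any branch whose displacement no longer fits, and repeat until nothing changes. It's fast, but once alignment directives and similar constraints enter the picture it's exactly the sort of heuristic I mean.

Code:
/* Sketch of branch relaxation: start every branch short, widen whatever
 * stops fitting, iterate to a fixed point.  Data layout is hypothetical. */
#include <stdio.h>

enum { SHORT_LEN = 2, LONG_LEN = 5 };   /* e.g. x86 jmp rel8 vs jmp rel32 */

typedef struct {
    int offset;        /* index of the branch instruction in the item list */
    int target;        /* index of the item the branch jumps to */
    int len;           /* current encoded length: SHORT_LEN or LONG_LEN */
} Branch;

/* items[i] is the byte length of instruction i; branches[] refers into it */
static void relax(int *items, int nitems, Branch *branches, int nbranches)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 0; b < nbranches; b++) {
            if (branches[b].len == LONG_LEN)
                continue;                       /* never shrink back */
            /* address of each item = sum of lengths of the items before it */
            int addr = 0, from = 0, to = 0;
            for (int i = 0; i < nitems; i++) {
                if (i == branches[b].offset) from = addr + items[i];
                if (i == branches[b].target) to = addr;
                addr += items[i];
            }
            int disp = to - from;               /* rel8 is end-relative */
            if (disp < -128 || disp > 127) {    /* short form no longer reaches */
                branches[b].len = LONG_LEN;
                items[branches[b].offset] = LONG_LEN;
                changed = 1;                    /* sizes moved; iterate again */
            }
        }
    }
}

int main(void)
{
    /* toy program: a short branch over a block that grows past 127 bytes */
    int items[] = { SHORT_LEN, 60, 70, 1 };     /* branch, filler, filler, target */
    Branch branches[] = { { 0, 3, SHORT_LEN } };
    relax(items, 4, branches, 1);
    printf("branch encoded as %d-byte form\n", branches[0].len);
    return 0;
}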
Another important thing to note is that the optimal solution can only be obtained when we have defined the problem completely. IOW, a compiler that can produce perfect executables for a given target environment (hardware + OS + other software running at the same time) must know (pretty much?) everything about said environment. And then, to make use of that info, one would need to recompile the software of interest from source code. However, as end users we don't do that often. We get pre-compiled OSes, pre-compiled compilers, pre-compiled browsers, etc., and use those.
Oh, did I say that a compiler would also need to understand what it compiles?
But maybe we don't need optimal code all the time after all? Maybe we're OK with the 2x or 5x or whatever perf gains an optimizing compiler gives us, despite it never generating perfect code?