There are a couple of questions arising from the OP and some of the comments:
1. Does this mean we shouldn't bother with ASLR? (Isn't it almost "security through obscurity" anyway?)
First; "security through obscurity" is where you're relying on the attacker not being able to figure out how the security works. If the attacker knows exactly how the security works but doesn't know something else (a password, an encryption key, a seed for a random number generator, etc) then it's not considered "security through obscurity".
It's best to think of security as many layers, where an attacker has to penetrate multiple layers to succeed. The question is whether ASLR is a worthwhile additional layer - do the benefits (for security) justify the disadvantages (for code complexity, performance, etc.)? There's no right answer - it's more a question of design goals.
Note that for 32-bit 80x86 you usually can't get enough entropy for ASLR to be very effective (the "random" addresses end up with the lowest 12 bits known due to page alignment and several upper bits known due to splitting the virtual address space into zones, which only leaves about 18 "random middle bits"); and because the CPU doesn't support efficient "IP relative addressing" for 32-bit code, there's a performance cost involved with relocated code and data. This means that ASLR is closer to "not worthwhile" for 32-bit 80x86. For 64-bit 80x86 it's different - there are more address bits that can be random, and there is support for "IP relative addressing" so there's less performance cost involved; so it's "more worthwhile" for 64-bit 80x86 than it would be for 32-bit 80x86.
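To give a rough idea of how little entropy that is, here's a minimal sketch of picking a randomized base for a page-aligned mapping on 32-bit 80x86 (not any real kernel's code - the zone base, the helper name, and the use of rand() as the entropy source are all made up for illustration):

    #include <stdint.h>
    #include <stdlib.h>

    #define ZONE_BASE 0x40000000u    /* assumed start of the zone the mapping lives in */

    uint32_t pick_random_base(void) {
        /* 18 random "middle" bits; rand() is just a placeholder for a real entropy source */
        uint32_t middle = ((uint32_t)rand() ^ ((uint32_t)rand() << 15)) & 0x3FFFFu;
        return ZONE_BASE + (middle << 12);   /* lowest 12 bits stay zero (page aligned) */
    }

With only 2^18 (262144) possible bases, an attacker that can retry, or that can leak even a few address bits, doesn't need long to defeat it.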
2. Regarding timing attacks, how precise can timing be before it becomes a security concern? This is something I've wondered for quite a while.
Assume you have an "uncertainty value" that's used like this whenever software asks for the time:

    do {
        last_tsc = rdtsc();
    } while(last_tsc % uncertainty != 0);   /* only return a TSC that's a multiple of "uncertainty" */
I'd expect that an "uncertainty value" of a few hundred cycles would be enough to make half of the attacks (those that rely on cache timing or "shared resource in SMT timing") go from "vaguely plausible" to "impractical". I'd also be tempted to make the amount of uncertainty proportional to how often software asks for the time (e.g. 200 cycles of uncertainty if you ask for the TSC infrequently, and 999999 cycles of uncertainty if you ask for the TSC millions of times per second).
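As a hypothetical sketch of that last idea (the helper name, the per-task "reads per second" bookkeeping, and the middle threshold are all made up, not something any OS actually does), the kernel could track how often each task reads the TSC and pick the uncertainty from that:

    /* Pick how much uncertainty to add, based on how often this task reads the TSC. */
    uint64_t choose_uncertainty(uint64_t tsc_reads_per_second) {
        if (tsc_reads_per_second < 1000) {
            return 200;        /* asks infrequently: ~200 cycles of uncertainty */
        }
        if (tsc_reads_per_second < 1000000) {
            return 50000;      /* somewhere in between (arbitrary middle tier) */
        }
        return 999999;         /* asks millions of times per second: very coarse time */
    }

The idea being that code which hammers the TSC in a tight loop (like a timing attack does) gets much less precise time than code that only asks occasionally.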