Hi,
embryo2 wrote:
Brendan wrote:
embryo2 wrote:
The checks are not free (development complexity, vendor lock, less space for caches, more power consumed, etc.). And "anything dangerous" is still dangerous - it kills the application.
Except it doesn't necessarily kill the application at all (signals, hardware exceptions reflected back to the process, etc).
You already agreed that such "recovery" should be avoided, but here you use the impotent recovery as an argument against managed. It's a bit unnatural, at least.
Brendan wrote:
I've already explained that this is a language design thing and not a managed vs. unmanaged thing.
I've already explained it's a managed vs. unmanaged thing and not a language design thing. Managed can recover safely while unmanaged can't. A bit more on this follows later.
Both managed and unmanaged can allow safe recovery in some situations but not others; and in my opinion neither needs to allow it, and neither should.
embryo2 wrote:
Brendan wrote:
The only reason you think the application must be killed is that (despite the fact that most unmanaged languages do provide facilities for the process to attempt recovery after a crash) very few people use these facilities because very few people care if a buggy application is killed (as long as it doesn't effect other processes or the kernel as a whole).
If some people don't care about application crashes, it doesn't mean every developer shouldn't care about them.
It's not the developers - it's users who don't want "slow because the programmer thought they were too incompetent to write software that isn't buggy".
It's like... Would you buy a sports car that has to be surrounded by 2-metre-thick bubble wrap at all times because the brakes probably have design flaws? Of course not - you'd want a car where the manufacturer made sure the brakes work, one that doesn't need bubble wrap.
embryo2 wrote:
Brendan wrote:
Yes; if you add a whole layer of bloated puss (a managed environment) in theory you can check for more errors at run-time than hardware can. However; in practice (at least for 80x86) hardware has the ability to check for mathematical errors (even including things like precision loss - see FPU exception conditions), array index overflows ("bound" instruction), integer overflows ("into" instruction)
Well, now you are proposing a "whole layer of bloated puss" in hardware instead of a much simpler layer in software, for the sake of efficiency. Let's look at the actual "efficiency". Instead of using just one "jo" instruction you suggest using interrupt handling bloat for a lot of very simple things. And yes, now you should show how we can avoid all the inter-privilege-level overhead and how it is faster than just one "jo" instruction.
No. I suggest only having necessary protection (e.g. isolation between processes and the kernel) and not having any unnecessary protection (protecting a process from itself) that nobody cares about and that shouldn't be needed at all for released software. I also suggest finding all the programmers who have so little confidence in their own abilities that they think they need this unnecessary protection, and sending them on a nice free two-week vacation to the centre of the Sun.
embryo2 wrote:
Brendan wrote:
and also has the ability to split virtual address spaces up into thousands of smaller segments; and for every single one of these features the CPU can do it faster than software can
Well, what do we need these thousands of smaller segments for? Just to trade a jo instruction for interrupt-related overhead? A very well done "optimization".
We don't need it (and nobody uses it because it's not needed); but that doesn't change the fact that hardware is able to do it better/faster than software can.
Note: "hardware is able to do it better/faster than software" does not imply that any CPU manufacturer cares enough about it to bother making it fast.embryo2 wrote:
Brendan wrote:
and still nobody uses them because everyone would rather have higher performance.
Yes, one jo is better than an interrupt.
One jo just gives you undefined behaviour. You need a comparison or something before it; plus something to jump to; plus some way to tell the CPU that it's an extremely unlikely branch (so it doesn't waste the CPU's branch target buffer). Then you realise it only checks one limit, and that you typically need two limits ("0 < x < 1234"), so you actually need a pair of them. Finally, you create a product and get to add the words "slower than everything else because we suck" to all your advertising, and wonder why everyone buys the faster alternative from your competitors while your company goes bankrupt.
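To make that concrete, here's what that check looks like written out by hand - a minimal sketch (the method and variable names are mine, not from any real code):

Code:
static int checkedLoad(int[] a, int i) {
    // Two limits means a pair of comparisons plus a rarely-taken branch.
    // Note: a JVM inserts exactly this check implicitly on every array
    // access; writing it out just shows what "one jo" has to become.
    if (i < 0 || i >= a.length) {
        throw new ArrayIndexOutOfBoundsException(i);
    }
    // A good compiler folds both limits into a single unsigned comparison,
    // since "(unsigned)i >= a.length" also catches negative i - but that
    // still leaves a compare and a branch that unchecked code doesn't pay.
    return a[i];
}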
embryo2 wrote:
Brendan wrote:
As soon as you allow assembly, any code can do anything to anything (e.g. maliciously tamper with the JVM) and it's no longer "managed".
If I need the best performance and I know my compiler is unable to optimize enough, then yes, I allow my OS to use unsafe code. But it's my informed choice, while unmanaged gives no choice at all. So, managed allows us to select what has top priority right now - security and safety, or speed - while unmanaged simply denies us that choice.
It's also the "informed choice" of the malicious attacker who's writing a trojan utility. Yay!
embryo2 wrote:
Brendan wrote:
Also note that most "managed environments" (e.g. Java, .NET, etc) exist for portability (so that the same byte-code can run on multiple different platforms) and not to protect babies from their own code; and as soon as you allow assembly you've killed the main reason for them to exist (portability).
Baby code affects not only babies. It's users who should be protected from baby code, so you just misunderstand the purpose of widely used code. You also misunderstand the importance of choice. If a user has the choice of getting better speed on a particular platform and understands its security and safety consequences, then that's much better than the situation where the user has no such choice.
As a user; have you ever purchased any software written in Java?
I have - in my entire life I've bought two different "beta" games that were written in Java, and both of them combined cost me less than I spend on coffee in one day. For "unmanaged" code I've probably spent about $300 this year; not because I like spending money, but because software that's actually worth paying for is never written in Java.
If a user has a choice of getting better speed for a particular platform and understands its security and safety consequences, then they never choose "managed".
embryo2 wrote:
Brendan wrote:
embryo2 wrote:
If the jump table is accessible to a smart compiler it can optimize it. In the worst case it can issue a warning about a possible performance decrease and the developer can fix the too-complex code.
Translation: You weren't able to understand what I wrote.
Think of a utility like "grep" as a compiler that compiles command line arguments (e.g. a regular expression or something) into native machine code (where that native machine code is the deterministic finite automaton) and then executes that native machine code. For an "ahead of time" compiler that jump table doesn't exist when "grep" is being compiled (it only exists at run-time, after it's been generated by grep); and a smart compiler can't optimise something that doesn't exist.
Your "grep compiler" seems to be a superintelligent beast that knows more than the programmer who writes the compiler. But it's not true. If a developer knows compiler design principles then he will pay great attention to the effectiveness of the compiler. It means the developer can first check the details at higher level (e.g. there's no way of exceeding the switch range just because the number of options is limited to 100). But if we forget about higher levels (as you have said it's whole layer of bloated puss) then yes, we are unable to predict behavior of our programs.
Brendan wrote:
Yes, you can have a warning about the massive performance decrease (e.g. display some sort of "orange coffee cup" logo so people know it's running in a managed environment).
Sometimes your comments are really funny.
Brendan wrote:
When there's no hardware virtualisation; most whole system emulators (VMWare, VirtualPC, VirtualBox, Qemu, etc - all of them except Bochs) fall back to converting guest code into run-time generated native code. In a managed environment software can't generate and execute its own native code.
What's the problem with moving code generation into the environment? Any sane algorithm can be compiled independently of the application it is used in. So, in a managed environment it is possible to generate application-related code.
Either it's possible for software to generate native code and execute it at run-time (and therefore managed environments are incapable of protecting anyone from anything because "managed" code can just bypass any/all protection that the environment provides); or it's impossible for (some types of) software to get good performance.
You cannot have it both ways: you can't pretend that a managed environment provides protection and no protection at the same time.
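To restate the grep example in code: the only thing an ahead-of-time compiler ever gets to see is a generic table walker like the sketch below; the transition table itself - the part a "smart compiler" would need to optimise - is only built at run-time from whatever regular expression the user typed:

Code:
// A table-driven DFA executor; 'table' and 'accepting' are built at
// run-time from the user's regex, so no AOT compiler sees their contents.
static boolean matches(int[][] table, boolean[] accepting, String input) {
    final int REJECT = -1;
    int state = 0;
    for (int i = 0; i < input.length(); i++) {
        int c = input.charAt(i) & 0x7F;   // ASCII-only, for brevity
        state = table[state][c];          // the run-time "jump table"
        if (state == REJECT) {
            return false;
        }
    }
    return accepting[state];
}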
embryo2 wrote:
Brendan wrote:
embryo2 wrote:
The problem is the code can corrupt memory within the guarded boundaries. So, it's better not to use proposed "recovery".
For "managed" its the same problem - a bug can corrupt anything that its allowed to modify. If correct code can do "myObject.setter(correctValue);" then buggy code can do "myObject.setter(incorrectValue);".
No. An array access bug will corrupt the stack or heap beyond the array. But a managed environment prevents all such bugs from ever existing. So, in the case of unmanaged only "recovery" is possible, while in the case of managed normal recovery is a widely used practice.
Sigh. It's like talking to a brainless zombie that's incapable of seeing anything beyond "C vs. Java".
Yes; Java is better at detecting problems like "out of bounds array index" than C and C++, because they don't even try (in the same way that it's easy for an obese 90-year-old man to run faster than an Olympic athlete while that Olympic athlete is sleeping soundly).
Nothing says a managed environment must protect against these bugs properly; nothing says an unmanaged language can't detect/prevent these problems at compile time; and nothing says an unmanaged language even has to support arrays to begin with (in the same way that the Olympic athlete might wake up instead of sleeping forever).
Note that in the case of managed, normal recovery typically fails spectacularly in practice (unless your idea of "recovery" just means appending exception details to a log and terminating the process anyway - which is exactly what most Java software does).
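For the record, that "widely used practice" usually boils down to something like this (a sketch - the logger name and handler are hypothetical):

Code:
import java.util.logging.Level;
import java.util.logging.Logger;

class CrashHandlingSketch {
    static final Logger LOG = Logger.getLogger("app");

    static void serve(Runnable handleRequest) {
        try {
            handleRequest.run();                        // the actual application code
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "request failed", e); // "recovery", step 1: log it
            System.exit(1);                             // "recovery", step 2: die anyway
        }
    }
}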
embryo2 wrote:
Brendan wrote:
I'm ignoring things like Apache (which is still the most dominant web server and is written in an unmanaged language), which creates a new process for each connection so that if any process crashes/terminates all the other connections aren't affected?
You can compare the performance. What are the costs of creating and destroying a process vs. the costs of taking a thread from a thread pool and then releasing it? If you perform a sane comparison then the problem with unmanaged will be obvious to you.
None of the dominant web servers are written for a managed environment, so it's hard to do a fair comparison (but it's fairly obvious that nobody capable of writing a serious web server wanted to write one for a managed environment - probably because they all know that "managed" is a worthless joke when you want anything close to acceptable performance).
Note that for Linux the kernel itself doesn't really know the difference between processes and threads - they're just "tasks" that may or may not share resources (both are created by clone(), just with different sharing flags), so there's very little difference between forking a process and spawning a thread.
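For completeness, the "thread from a pool" pattern being priced here is just this (a minimal sketch - the port, pool size and handler are made up):

Code:
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledServer {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(64); // threads are reused
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();
                pool.submit(() -> handle(client)); // no process create/destroy per request
            }
        }
    }

    static void handle(Socket client) {
        // hypothetical request handler
    }
}

Yes, it skips the per-connection process setup cost - but every connection now shares one address space, which is exactly the isolation that Apache's "process per connection" model buys.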
embryo2 wrote:
Brendan wrote:
embryo2 wrote:
Bugs in critical code are also extremely important. The extra code base introduced with the environment isn't so much of an issue if we remember the size of the Linux kernel and drivers, for example.
I agree - bugs in critical code are extremely important (regardless of whether it's a layer of pointless "managed environment" bloat, or a kernel or anything else). Note that this is why a lot of people (including me) think micro-kernels are better (and why Microsoft requires digital signatures on third-party drivers, and why Linux people wet their pants when they see a "binary blob").
So, you agree that the environment's complexity is not an issue, because it increases the number of bugs only a little over the level of the other existing critical code.
What? It's obvious that I think that reducing both the complexity and amount of critical code is important. A 64 KiB micro-kernel alone is far better than a 10 MiB monolithic kernel alone, or a 64 KiB micro-kernel plus a 10 MiB "managed environment"; and all 3 of these options are better than a 10 MiB monolithic kernel plus a 10 MiB "managed environment".
embryo2 wrote:
Brendan wrote:
In theory, it would be better to simplify Intel's instruction set and remove old baggage nobody uses any more (e.g. most of segmentation, hardware task switching, virtual8086 mode, real mode, the FPU and MMX, etc) and also rearrange opcodes so instructions can be smaller (and not "multiple escape codes/bytes"). In practice, the majority of the silicon is consumed by things like caches, and backward compatibility is far more important than increasing cache sizes by 0.001%.
Well, the 0.001% here is too bold to be true. A completely rearchitected Intel processor would look something like ARM's 64-bit model, which uses much less silicon and power. So, instruction simplification is just as important now, as all the mobile devices with ARM processors show. And only the vendor lock (the monopoly trick) is allowing Intel to persist.
ARM's CPUs are smaller because they don't clock as fast, have smaller caches, have a crappy "uncore" (no built-in PCI-e, etc) and have near-zero RAS features. It has nothing to do with baggage.
The things allowing Intel to maintain its monopoly are backward compatibility, and the fact that no other company can come close to its single-threaded performance.
embryo2 wrote:
Brendan wrote:
I still think that eventually, after UEFI is ubiquitous and nobody cares about BIOS (e.g. maybe in the next 10 years if we're lucky) Intel might start removing some of the old stuff (e.g. develop "64-bit only" CPUs and sell them alongside the traditional "everything supported" CPUs, and then spend another 10+ years waiting for everything to shift to "64-bit only").
Yes, it's time required to fade away the monopoly effect.
It's time required to maintain the monopoly - Intel can't break too much backward compatibility too quickly.
embryo2 wrote:
Brendan wrote:
I don't know if you mean 1024 registers or 1024-bit wide registers. For both cases it comes down to "diminishing returns" - exponentially increasing performance costs to support it, with negligible benefits to justify it.
Well, as you have said - "silicon is cheap and no sane person cares". Then what diminishing returns are you talking about? We have plenty of silicon - just use it; no problem if it's useless for many applications, but for some applications it's good to have more registers. So, it's better to recognize the value of silicon instead of insisting that "Intel is of great value for humanity".
Using more silicon is not the problem; it's how you use it. For an analogy: you can add 10000 lines of code to an existing application (e.g. add a new feature or something) without affecting the performance of the old code; or you can add 5 lines in a critical spot and completely destroy performance.
embryo2 wrote:
Brendan wrote:
This is also why we can't just have fast and large L1 caches (and why L1 cache sizes have remained at 64 KiB for about a decade while Intel have been adding larger/slower L2, L3, ..). Increasing the size decreases performance/latency.
No. The size is not the problem. The problem is the way Intel's processors work. A cache is of limited use if we can't predict the right content for it. So Intel introduces all those additional useless instructions and improves its vendor lock instead of simplifying the processor's instruction set.
Was this even supposed to make sense?
If a pink elephant farts on a sunny day then accountants should eat more bagels because the problem is that water boils at 100 degrees Celsius.
embryo2 wrote:
Brendan wrote:
In my case the ideal is temporarily far away, but over time will get closer. In your case the ideal is permanently far away because you're not getting closer over time.
Your "temporarily" vs. "permanently" is greatly exaggerated.
As exaggerated as "always trailing the competition" vs. "actually trying to overtake the competition"?
embryo2 wrote:
Brendan wrote:
You failed to read it properly. The author only used direct access to create the attacks and explain them, and direct access is not needed to deploy the attacks. The vulnerability is that a lot of software de-serialises "untrusted" data (e.g. from a random attacker's network connection) and the Java code responsible for serialisation/de-serialisation allows arbitrary code execution (and the "managed environment" does nothing to prevent this).
Well, if you think that direct access is not needed then maybe you can prove it? I can open HTTP access on my server for server-side deserialization and you can show us how easy it is to run your code on my server without direct access.
It's not my speciality - I've never tried to compromise anything, and Java isn't worth learning enough about.
embryo2 wrote:
The "arbitrary" code in the link is the code somebody deploys using direct access to the critical part of the file system (where server's libraries are).
Yes; it's almost like "white hat" security researchers have some sort of code of conduct to prevent them from being mistaken for "black hat" attackers, and don't just give everyone on the Internet fully working exploits for unpatched vulnerabilities. I'm sure nobody will ever feel like using the exploit to run any other code.
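For reference, the attack surface being discussed isn't a secret - it's any server code shaped like the sketch below (the method and socket are hypothetical; this is the vulnerable pattern, not an exploit):

Code:
import java.io.ObjectInputStream;
import java.net.Socket;

class DeserializationSketch {
    static Object readUntrusted(Socket socket) throws Exception {
        ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
        // Attacker-chosen classes run their deserialisation hooks inside
        // this call - before any cast or type check the application does
        // afterwards - and the "managed environment" doesn't object.
        return in.readObject();
    }
}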
Cheers,
Brendan