OSDev.org

The Place to Start for Operating System Developers
It is currently Tue Mar 19, 2024 1:24 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 27 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 11:21 am 
Offline
Member
Member
User avatar

Joined: Thu Aug 11, 2005 11:00 pm
Posts: 1109
Location: Tartu, Estonia
I was just pointed to an interesting article that I would like to share:

http://www.cs.vu.nl/~herbertb/download/ ... ndss17.pdf

tl;dr: The article presents a method how address space layout randomization, a security feature used to prevent attacks which are based on knowing the location of a program's code and data, can be broken by probing the memory cache. This probing is done in such as way that an attacker figures out which entries from page tables at all levels of the paging hierarchy have been recently fetched (as a consequence of a page walk, when a TLB miss occurs), and this basically tells the attacker for which virtual address the page walk was performed (and this is the information that ASLR is supposed to disguise). This is demonstrated in JavaScript.

_________________
Programmers' Hardware Database // GitHub user: xenos1984; OS project: NOS


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 12:51 pm 
Offline
Member
Member

Joined: Sat Mar 01, 2014 2:59 pm
Posts: 1146
I'm not the most clued-up about these things but, in my opinion, relying on ASLR to prevent an exploit is a bit like hiding a file containing all your passwords in a long tree of subdirectories with a false name. In other words, if information is held in a place that's accessible to an attacker, assume that they can access it, and don't think "oh, they won't find this".

_________________
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 1:03 pm 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

XenOS wrote:
I was just pointed to an interesting article that I would like to share:


Once upon a time Intel invented a new instruction - "RDTSC". Intel weren't stupid and knew that access to such precise timing would be a security problem, so they also provided a flag in CR4 that an OS can use to prevent CPL=3 code from being able to use the RDTSC instruction, and in the "Volume 3: System Programming Guide" (starting from the very first version that mentioned the time stamp counter) they wrote (highlighting is mine):

Intel in 1995 wrote:
A secure operating system would set the TSD flag during system initialization to disable user access to the time stamp counter. An operating system that disables user access to the timestamp counter should emulate the instruction through a user-accessible programming interface.


This same warning (using the exact same wording) has existed in every single version of the Intel manuals since, and still exists today.

Sadly; incompetent morons ignored Intel then and continued to ignore Intel since. The end result is that every 6 months or so (for a period of over 20 years) there's yet another white-paper showing yet another security problem that relies on very precise timing side-channels that could've and should've been impossible.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 2:31 pm 
Offline
Member
Member
User avatar

Joined: Sun Sep 19, 2010 10:05 pm
Posts: 1074
I recently watched a YouTube video on the CCCGen channel about measuring the timings of memory access to both find memory addresses that have been used by the OS, and also to communicate between processes by having one process either access memory or clear cache entries, and the second process measure how long it takes to access the same memory to determine if the access was slow (meaning the bit would be a 1) or fast (meaning the bit would be a 0).

I think they said that they got up to 75Kbps using this approach... Pretty fascinating stuff. Unfortunately, that channel has been (quite recently) shut down by YouTube for breaking TOS, somehow.

EDIT: Found it. https://www.youtube.com/watch?v=iS8YPDuZbKM


_________________
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 6:49 pm 
Offline
Member
Member

Joined: Mon Jul 05, 2010 4:15 pm
Posts: 595
Is there a good reason cache flush instructions should be available to user space at all?


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Wed Feb 15, 2017 10:06 pm 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

OSwhatever wrote:
Is there a good reason cache flush instructions should be available to user space at all?


CLFLUSH (and the non-temporal moves that came later) are important for performance in cases where the data you touched recently won't be data you need again soon (especially when it's a lot of data you won't need again soon). They're an effective work around for pathological cases caused by the cache's "least recently used eviction" algorithm that are needed to avoid potentially severe performance problems.

There are also a few other uses - mostly, in code that benchmarks RAM chip latency and/or code that tests if RAM is faulty; where you don't want to accidentally test the CPUs cache and not test RAM.

If CLFLUSH was only allowed at CPL=0 and you had a kernel API function to do it instead, then the only thing it'd do is make good code much slower (because it's mostly used inside loops in performance critical code that processes a large amount of data, the overhead of an exception of kernel API call would be disastrous); and bad/malicious code would just use the kernel API function. If you didn't allow CLFLUSH (or non-temporal moves) at all (disabled with no alternative provided by kernel) you'd be stuck with the potentially severe performance problems (and it'd be harder to benchmark or test RAM).

Note that there is no actual security problem with CLFLUSH itself. There's a security problem caused by dodgy RAM chips ("rowhammer") where even good/non-malicious software can cause unintended RAM corruption; and there's a problem with "too precise timing" that leads to all kinds of timing side-channels (including data caches, and instruction caches, and TLBs, and shared resources in "attacker and victim sharing core" SMT cases, and ....).

The simple solution is for the kernel to disable RDTSC (and RDTSCP) at CPL=3 and either emulate them (for backward compatibility) or provide kernel API function/s. This allows kernel to nerf precision and/or nerf accuracy (and make a lot of timing side-channels far less practical). It also allows kernel to avoid the "unsynchronised time stamp on different CPUs" problem and/or allows kernel to provide a more convenient calibrated time stamp (that doesn't depend on CPU speed - e.g. always "microseconds since something" for all CPUs in all power saving states).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 4:47 am 
Offline
Member
Member
User avatar

Joined: Wed Aug 17, 2016 4:55 am
Posts: 251
Brendan wrote:
Note that there is no actual security problem with CLFLUSH itself. There's a security problem caused by dodgy RAM chips ("rowhammer") where even good/non-malicious software can cause unintended RAM corruption; and there's a problem with "too precise timing" that leads to all kinds of timing side-channels (including data caches, and instruction caches, and TLBs, and shared resources in "attacker and victim sharing core" SMT cases, and ....).

That was my thought when I first heard about it, although rather than blame the DRAM chips themselves I'd blame the DRAM controller instead. Under absolutely no circumstance should any memory write interfere with memory refresh, yet that's exactly what seemed to be happening (hence the RAM corruption).


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 6:46 am 
Offline
Member
Member

Joined: Sat Feb 27, 2010 8:55 pm
Posts: 147
There are a couple questions I have arising from the OP and some of the comments:

1. Does this mean we shouldn't bother with ASLR? (Isn't it almost "security through obscurity" anyway?)

2. Regarding timing attacks, how precise can timing be before it becomes a security concern? This is something I've wondered for quite a while.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 8:22 am 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

azblue wrote:
There are a couple questions I have arising from the OP and some of the comments:

1. Does this mean we shouldn't bother with ASLR? (Isn't it almost "security through obscurity" anyway?)


First; "security through obscurity" is where you're relying on the attacker not being able to figure out how the security works. If the attacker knows exactly how the security works but doesn't know something else (a password, an encryption key, a seed for a random number generator, etc) then it's not considered "security through obscurity".

It's best to think of security as many layers, where an attacker has to penetrate through multiple layers to succeed. The question is whether or not ASLR is a worthwhile additional layer - do the benefits (for security) justify the disadvantages (for code complexity, performance, etc). There's no right answer - it's more a question of design goals.

Note that for 32-bit 80x86 usually you can't get enough entropy for ASLR to be very effective (the "random" addresses end up with the lowest 12 bits known due to page alignment and several upper bits known due splitting the virtual address space into zones; and that only leaves about 18 "random middle bits"); and because the CPU doesn't support efficient "IP relative addressing" for 32-bit code there's a performance cost involved with relocated code and data. This means that ASLR is closer to "not worthwhile" for 32-bit 80x86. For 64-bit 80x86 it's different - there are more addressing bits than can be random, and there is support for "IP relative addressing" and less performance cost involved; so it's "more worthwhile" for 64-bit 80x86 than it would be for 32-bit 80x86.

azblue wrote:
2. Regarding timing attacks, how precise can timing be before it becomes a security concern? This is something I've wondered for quite a while.


Assume you have an "uncertainty value" like this:
Code:
    do {
        last_tsc = rdtsc();
    } while(last_tsc % uncertainty != 0);
    return last_tsc/uncertainty;


I'd expect that an "uncertainty value" of a few hundred cycles would be enough to make half of the attacks (those that rely on cache timing or "shared resource in SMT timing") go from "vaguely plausible" to "impractical". I'd also be tempted to make the amount of uncertainty proportional to how often software asks for the time (e.g. 200 cycles of uncertainty if you ask for the TSC infrequently, and 999999 cycles of uncertainty if you ask for TSC millions of times per second).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 8:37 am 
Offline
Member
Member

Joined: Thu May 17, 2007 1:27 pm
Posts: 999
Brendan wrote:
I'd expect that an "uncertainty value" of a few hundred cycles would be enough to make half of the attacks (those that rely on cache timing or "shared resource in SMT timing") go from "vaguely plausible" to "impractical". I'd also be tempted to make the amount of uncertainty proportional to how often software asks for the time (e.g. 200 cycles of uncertainty if you ask for the TSC infrequently, and 999999 cycles of uncertainty if you ask for TSC millions of times per second).

Note that there are programs that legitimately ask for TSC 10s - 100s of thousands times per second. The overhead of a syscall would totally kill those applications.

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 8:46 am 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Korona wrote:
Brendan wrote:
I'd expect that an "uncertainty value" of a few hundred cycles would be enough to make half of the attacks (those that rely on cache timing or "shared resource in SMT timing") go from "vaguely plausible" to "impractical". I'd also be tempted to make the amount of uncertainty proportional to how often software asks for the time (e.g. 200 cycles of uncertainty if you ask for the TSC infrequently, and 999999 cycles of uncertainty if you ask for TSC millions of times per second).

Note that there are programs that legitimately ask for TSC 10s - 100s of thousands times per second. The overhead of a syscall would totally kill those applications.


Which applications, and why?


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 8:53 am 
Offline
Member
Member

Joined: Sat Mar 15, 2014 3:49 pm
Posts: 96
azblue wrote:
There are a couple questions I have arising from the OP and some of the comments:

1. Does this mean we shouldn't bother with ASLR? (Isn't it almost "security through obscurity" anyway?)

2. Regarding timing attacks, how precise can timing be before it becomes a security concern? This is something I've wondered for quite a while.


1. You should absolutely bother with exploit mitigation in general and ASLR in particular.

https://www.openbsd.org/papers/auug04/mgp00001.html is old but still excellent background.

You can harden your exploit mitigation further by W^X, even for JITed code, having an at-install-time or at-each-load-time code order randomizer and so on. I think I heard that apple and google are now doing the latter, e.g. apple for watch apps.

There's no substitute for strong memory protection and carefully audited kernel APIs. But you absolutely want exploit mitigation too.

Re 2 iirc they demonstrated a pure-Javascript timer loop that worked for this attack.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 8:59 am 
Offline
Member
Member

Joined: Thu May 17, 2007 1:27 pm
Posts: 999
Brendan wrote:
Which applications, and why?

For example if you're polling (IRQs do not make sense at high I/O rates) a low-latency link (e.g. Infiniband) and want to keep track of RTT or wait times. I'm sure there are more examples in real-time applications, this is just one I know of.

EDIT: Another example would be performance measurements in HPC applications like SAT or ILP solvers or physics/weather/fluid/whatever simulation.

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 9:40 am 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Korona wrote:
Brendan wrote:
Which applications, and why?

For example if you're polling (IRQs do not make sense at high I/O rates) a low-latency link (e.g. Infiniband) and want to keep track of RTT or wait times. I'm sure there are more examples in real-time applications, this is just one I know of.

EDIT: Another example would be performance measurements in HPC applications like SAT or ILP solvers or physics/weather/fluid/whatever simulation.


That doesn't really sound like enough to matter; given that NIC drivers are drivers (which tend to be "more trusted" and could or would be exempt) and the rest don't need precision at all (e.g. benchmark how much work they can do in a 5 day period and you can use a sundial instead ;-) ).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject: Re: Breaking ASLR on systems with memory caches
PostPosted: Thu Feb 16, 2017 5:56 pm 
Offline
Member
Member
User avatar

Joined: Wed Jan 06, 2010 7:07 pm
Posts: 792
Brendan wrote:
That doesn't really sound like enough to matter; given that NIC drivers are drivers (which tend to be "more trusted" and could or would be exempt) and the rest don't need precision at all (e.g. benchmark how much work they can do in a 5 day period and you can use a sundial instead ;-) ).
Low-latency networking tends to be done in userspace, not in a driver; performance measurements that only measure overall throughput are useful but insufficient.

Someone's always going to be able to come up with a good reason to want high-precision timing information. The solution is not to get rid of it, but to use a well-designed execution environment with enough mitigations that it becomes impractical to take advantage of timing for malicious purposes.

_________________
[www.abubalay.com]


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 27 posts ]  Go to page 1, 2  Next

All times are UTC - 6 hours


Who is online

Users browsing this forum: Bing [Bot], songziming and 6 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group