Memory handling with newlib

Agola · **Posted:** Sat Apr 01, 2017 8:13 am

Hello OSDevers.

The word "handling" in the title may not be quite right for this subject, but I couldn't find a better one. Newlib allocates memory by moving the heap break upward with sbrk, so the sbrk heap has to be contiguous in virtual memory. I already have a page frame allocator, so that is not a problem for me: the virtual memory grows contiguously.

But how can I handle virtual memory with newlib? When I run two or more tasks, how can sbrk handle all of them? Since the page directory, in other words the set of mapped pages, is different for each task, sbrk would need multiple heaps, but it has only one? I got confused.

And can I use mmap instead of sbrk? I already have mmap and munmap implemented.

Thanks in advance.

Nable · **Joined:** Tue Nov 08, 2011 11:35 am **Posts:** 453

It's quite simple: newlib lives in user space and doesn't have to know anything about virtual memory. Handling virtual memory is a task for kernel-space components. These are two separate worlds. Don't forget about stability and security considerations: an application shouldn't be able to break the whole system or look into other applications' internals (except through explicitly allowed kinds of IPC).

OSwhatever · **Joined:** Mon Jul 05, 2010 4:15 pm **Posts:** 595

Agola wrote:

And can I use mmap instead of sbrk? I already have mmap and munmap implemented.

Typically sbrk uses "mmap" or an equivalent function for mapping virtual memory. You can use mmap outside sbrk too if you want to, for custom allocators for example. sbrk and mmap aren't mutually exclusive, but you need to implement sbrk so that malloc and free work properly in Newlib.
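To make the "sbrk on top of a page mapper" idea concrete, here is a minimal sketch that runs on a hosted system. Everything in it is an assumption for illustration: `demo_sbrk` stands in for the `_sbrk` stub you would give Newlib, `map_next_page` stands in for whatever mmap-like call your kernel offers, and the static `arena` replaces real page mappings so the code can be compiled and tried anywhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAGE_SIZE      4096u
#define HEAP_MAX_PAGES 16u     /* arbitrary limit for this sketch */

/* Hypothetical backing store: a real port would map pages at a fixed
 * heap base with its own mmap-like system call instead. */
static unsigned char arena[HEAP_MAX_PAGES * PAGE_SIZE];
static size_t mapped_pages;    /* pages "mapped" so far */
static size_t heap_used;       /* current break, as an offset into arena */

/* Stand-in for the kernel call that maps one more page at the heap end. */
static int map_next_page(void)
{
    if (mapped_pages == HEAP_MAX_PAGES)
        return -1;
    mapped_pages++;
    return 0;
}

/* sbrk-style call: moves the break by incr bytes and returns the OLD
 * break, or (void *)-1 with errno set on failure. */
void *demo_sbrk(ptrdiff_t incr)
{
    size_t old = heap_used;

    if (incr < 0) {
        size_t dec = (size_t)0 - (size_t)incr;
        if (dec > heap_used) {
            errno = EINVAL;
            return (void *)-1;
        }
        heap_used -= dec;      /* pages stay mapped in this sketch */
        return arena + old;
    }

    size_t new_used = heap_used + (size_t)incr;
    if (new_used > sizeof arena) {
        errno = ENOMEM;
        return (void *)-1;
    }
    /* Map pages only when the break crosses into an unmapped page. */
    while (mapped_pages * PAGE_SIZE < new_used) {
        if (map_next_page() != 0) {
            errno = ENOMEM;
            return (void *)-1;
        }
    }
    assert(mapped_pages * PAGE_SIZE >= new_used);
    heap_used = new_used;
    return arena + old;
}
```

The point of the sketch is only that sbrk keeps a single "break" value and asks for more pages when that value crosses a page boundary; where the break lives and how pages are mapped is up to your kernel.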

mallard · **Posted:** Sat Apr 01, 2017 11:55 am

Personally, I've replaced newlib's memory allocator with liballoc and no longer have an "sbrk". sbrk is a very old-fashioned idea that isn't really compatible with things like flat memory spaces and dynamically loaded libraries. I have system calls that allow userspace applications to request and release memory on the page level (in a way that's fairly similar to POSIX "mmap"), which work perfectly with liballoc.
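For anyone curious what the glue looks like, here is a rough sketch of the hooks liballoc expects, written against POSIX mmap/munmap so it can be tried on a hosted system. The hook names and signatures are as I recall them from liballoc's header, so treat them as an assumption and check `liballoc.h` in the version you use. The 4096-byte page size and the do-nothing lock are also assumptions that only hold for a single-threaded test.

```c
#define _DEFAULT_SOURCE        /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096u        /* assumed to match liballoc's page size */

/* Single-threaded sketch: there is nothing to lock. A real port would
 * take a spinlock or mutex here. Returning 0 means success. */
int liballoc_lock(void)   { return 0; }
int liballoc_unlock(void) { return 0; }

/* Hand liballoc `pages` contiguous pages, or NULL on failure. */
void *liballoc_alloc(size_t pages)
{
    if (pages == 0 || pages > SIZE_MAX / PAGE_SIZE)
        return NULL;

    void *p = mmap(NULL, pages * PAGE_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give `pages` pages starting at ptr back to the system; 0 on success. */
int liballoc_free(void *ptr, size_t pages)
{
    assert(ptr != NULL);
    return munmap(ptr, pages * PAGE_SIZE);
}
```

In your own OS the two mmap/munmap calls would be replaced by whatever page-level system calls you provide.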

The brk/sbrk mechanism is considered deprecated and advised against in many more recent OSs, including UNIX-like systems, for example:

QNX:

Don't use brk() and sbrk() with any other memory functions (such as malloc(), mmap(), and free()). The brk() function assumes that the heap is contiguous; in Neutrino, memory is returned to the system by the heap, causing the heap to become sparse. The Neutrino malloc() function is based on mmap(), and not on brk().

Oracle Solaris:

The behavior of brk() and sbrk() is unspecified if an application also uses any other memory functions (such as malloc(3C), mmap(2), free(3C)). The brk() and sbrk() functions have been used in specialized cases where no other memory allocation function provided the same capability. The use of mmap(2) is now preferred because it can be used portably with all other memory allocation functions and with any function that uses other allocation functions.

NetBSD:

Note that ordinary application code should use malloc(3) and related
functions to allocate memory, or mmap(2) for lower-level page-granularity
control. While the brk() and/or sbrk() functions exist in most Unix-like
environments, their semantics sometimes vary subtly and their use is not
particularly portable. Also, one must take care not to mix calls to
malloc(3) or related functions with calls to brk() or sbrk() as this will
ordinarily confuse malloc(3); this can be difficult to accomplish given
that many things in the C library call malloc(3) themselves.

Darwin/Mac OS X:

The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management.

(Note that the source code of Apple's implementation shows that it simply allocates a 4MB block to simulate a "data segment" for brk/sbrk).

Even version 2 of the Single UNIX Specification points out that brk and sbrk are basically useless:

The behaviour of brk() and sbrk() is unspecified if an application also uses any other memory functions (such as malloc(), mmap(), free()). Other functions may use these other memory functions silently.

brk and sbrk were removed completely from SUSv3 and POSIX:2001. "mmap" is now considered the "standard" way to allocate memory on UNIX-like systems.

Agola · **Posted:** Sun Apr 02, 2017 3:58 am

That makes sense. But I still don't understand how I manage the heap with two or more tasks.

Since the mapped pages change from task to task, the memory allocator would have to know which task is currently running. If it mapped memory for task n but the system is now in task m, its variables recording where the last allocation ended are incorrect. The allocator can't assume that task m uses the same pages or memory addresses as task n. How can the kernel and memory manager handle this situation?

How can I manage the heap for each task?

I'm really confused; maybe I need to understand the concept of virtual and physical memory management in both user and kernel space.

Thanks in advance

LtG · **Joined:** Thu Aug 13, 2015 4:57 pm **Posts:** 384

Below is a simplistic model:

During boot/initialization you tell the PMM (physical memory manager) what free physical memory exists; after that, only the VMM (virtual memory manager) ever talks to the PMM. The VMM only ever asks for (free) physical pages, and the PMM hands them out from its list until it runs out, which you have to handle somehow (a kernel panic to start with, though you will want to handle it better later). The VMM also returns pages to the PMM once they have been freed from some virtual address space. So the PMM is quite simple, as is the interface between the VMM and the PMM.
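As a rough illustration of how small that interface can be, here is a sketch of a stack-based PMM. The names `pmm_add_region`, `pmm_alloc_page` and `pmm_free_page`, the fixed `MAX_FRAMES` capacity, and the use of 0 as the "out of memory" value are all assumptions made for this example, not anything a particular kernel requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define MAX_FRAMES 1024u       /* arbitrary capacity for this sketch */

/* Stack of free physical frame addresses. */
static uintptr_t free_stack[MAX_FRAMES];
static size_t    free_count;

/* Called by the VMM when a page is unmapped, and at boot for every
 * usable frame. */
void pmm_free_page(uintptr_t phys)
{
    assert(phys % PAGE_SIZE == 0);
    assert(free_count < MAX_FRAMES);
    free_stack[free_count++] = phys;
}

/* Boot-time helper: register a page-aligned region from the memory map. */
void pmm_add_region(uintptr_t base, size_t length)
{
    for (uintptr_t p = base; p + PAGE_SIZE <= base + length; p += PAGE_SIZE)
        pmm_free_page(p);
}

/* Called by the VMM. Returns a free frame, or 0 when none are left
 * (this assumes frame 0 is never handed to the PMM). */
uintptr_t pmm_alloc_page(void)
{
    if (free_count == 0)
        return 0;
    return free_stack[--free_count];
}
```

The VMM would call `pmm_alloc_page` when it needs a frame to back a virtual page and decide what to do when it gets 0 back.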

For VMM allocation/deallocation you need to keep track of allocations for each _virtual address space_, not each process. Of course for most people these two things are the same, but I think it's important to realize that it's the VAS that's relevant here. How you keep track of that is up to you. One way is to keep no extra records: you already have the page tables and can use those, but that's pretty slow, since page tables are "optimized" for the CPU's "forward" lookups, not for fast allocation (though deallocation would be fast).

In practice you should have some virtual address space reserved for keeping track of process/VAS-specific information. If you only want to implement something simple (but relatively bad) like brk/sbrk, you could reserve one page per VAS/process and store the current "limit" there (the brk/sbrk high-water mark, or whatever they called it). This page of course has to be VAS/process specific, so I'd probably make it part of the lower half (if I were doing a higher/lower-half kernel) but not accessible to the process. That way, whenever you switch to a different process/task, you can access the brk/sbrk "limit" variable at a fixed virtual address, since switching the process also switches the VAS.
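Here is a small hosted simulation of that idea, under several assumptions. `struct address_space`, `switch_address_space` and `sys_sbrk` are hypothetical names, the `current_vas` pointer plays the role of the fixed virtual address whose contents change when the VAS is switched, and `mapped_pages` is just a counter standing in for real page-table updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Per-VAS bookkeeping; in the scheme described above this would live in
 * the reserved page that is mapped differently in every address space. */
struct address_space {
    uintptr_t heap_start;      /* page aligned, fixed at process creation */
    uintptr_t brk;             /* current break ("limit") */
    uintptr_t heap_limit;      /* highest break the kernel allows */
    size_t    mapped_pages;    /* stand-in for real page mappings */
};

static struct address_space *current_vas;

/* A real kernel would also load the new page directory here. */
void switch_address_space(struct address_space *next)
{
    current_vas = next;
}

static uintptr_t page_align_up(uintptr_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
}

/* sbrk system call: always operates on whichever VAS is current.
 * Returns the old break, or -1 if the request is out of range. */
intptr_t sys_sbrk(intptr_t incr)
{
    struct address_space *vas = current_vas;
    assert(vas != NULL);

    uintptr_t old     = vas->brk;
    uintptr_t new_brk = old + (uintptr_t)incr;

    if (new_brk < vas->heap_start || new_brk > vas->heap_limit)
        return -1;

    /* Here the kernel would map or unmap pages to match the new break. */
    vas->mapped_pages = (page_align_up(new_brk) - vas->heap_start) / PAGE_SIZE;
    vas->brk = new_brk;
    return (intptr_t)old;
}
```

Two tasks can both start their heap at the same virtual address and grow it independently, because each call only ever touches the record of the address space that is active at that moment.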

Does that explain it? Or did I misunderstand the question?

mallard · **Posted:** Mon Apr 03, 2017 2:16 am

Newlib (or whatever userspace memory allocator you end up using) exists within the userspace process. When you switch to another process, you're switching to another instance of Newlib. Neither instance needs to know or care what the other is doing.

As LtG said, each userspace process will (usually; unless you're working on an AmigaOS clone or something) have its own virtual address space. Your kernel should provide system calls ("mmap"/"munmap" or similar) to allow a process to manage that address space. The userspace memory allocator would then use those system calls. When you switch to another userspace process, you switch to another virtual address space with its own allocator.
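If it helps to see the "separate instance" point in action, the sketch below can be run on any hosted POSIX system. It uses fork() only as a convenient way to get two address spaces; in your own kernel the separation comes from each process having its own page directory. The variable `allocator_state` is a made-up stand-in for the bookkeeping a userspace allocator keeps in its static data.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for an allocator's private bookkeeping (e.g. its idea of
 * where the heap currently ends). */
static int allocator_state;

/* Forks a child that overwrites its copy of allocator_state, then
 * returns the value the parent still sees (or -1 on error). */
int parent_value_after_child_write(void)
{
    allocator_state = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's address space only. */
        allocator_state = 2;
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status));

    /* Parent: its own copy was never touched by the child. */
    return allocator_state;
}
```

The parent still sees 1 after the child has written 2, which is the same reason one task's Newlib heap state cannot be disturbed by another task's.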

OSDev.org
