Address space layout randomization (ASLR) and position-independent executables (PIE) improve the security of modern operating systems by making memory addresses less predictable.
Position-independent executables let a system apply ASLR to the executable itself, so its memory layout is randomized each time it is loaded. The offsets of the entry point and of every function relative to the load base stay fixed, while the base address itself is randomized.
$ readelf -h /usr/bin/ls
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Position-Independent Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x6d30
  Start of program headers:          64 (bytes into file)
  Start of section headers:          140328 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         13
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 30
For a PIE binary, the addresses stored in the ELF file, such as the entry point 0x6d30 above, are offsets rather than absolute addresses. At load time the kernel chooses a randomized base address and adds it to each offset, so runtime addresses are essentially calculated like this:
Runtime Address = Randomized Base Address + Entry Point Offset
Part of the kernel code that implements address space layout randomization for ELF files can be found in fs/binfmt_elf.c in Linus Torvalds' Linux source tree on GitHub:
for(i = 0, elf_ppnt = elf_phdata;
    i < elf_ex->e_phnum; i++, elf_ppnt++) {
    int elf_prot, elf_flags;
    unsigned long k, vaddr;
    unsigned long total_size = 0;
    unsigned long alignment;
    if (elf_ppnt->p_type != PT_LOAD)
        continue;
    elf_prot = make_prot(elf_ppnt->p_flags, &arch_state,
                         !!interpreter, false);
    elf_flags = MAP_PRIVATE;
    vaddr = elf_ppnt->p_vaddr;
    /*
     * The first time through the loop, first_pt_load is true:
     * layout will be calculated. Once set, use MAP_FIXED since
     * we know we've already safely mapped the entire region with
     * MAP_FIXED_NOREPLACE in the once-per-binary logic following.
     */
    if (!first_pt_load) {
        elf_flags |= MAP_FIXED;
    } else if (elf_ex->e_type == ET_EXEC) {
        /*
         * This logic is run once for the first LOAD Program
         * Header for ET_EXEC binaries. No special handling
         * is needed.
         */
        elf_flags |= MAP_FIXED_NOREPLACE;
    } else if (elf_ex->e_type == ET_DYN) {
        /*
         * This logic is run once for the first LOAD Program
         * Header for ET_DYN binaries to calculate the
         * randomization (load_bias) for all the LOAD
         * Program Headers.
         */
        /*
         * Calculate the entire size of the ELF mapping
         * (total_size), used for the initial mapping,
         * due to load_addr_set which is set to true later
         * once the initial mapping is performed.
         *
         * Note that this is only sensible when the LOAD
         * segments are contiguous (or overlapping). If
         * used for LOADs that are far apart, this would
         * cause the holes between LOADs to be mapped,
         * running the risk of having the mapping fail,
         * as it would be larger than the ELF file itself.
         *
         * As a result, only ET_DYN does this, since
         * some ET_EXEC (e.g. ia64) may have large virtual
         * memory holes between LOADs.
         *
         */
        total_size = total_mapping_size(elf_phdata,
                                        elf_ex->e_phnum);
        if (!total_size) {
            retval = -EINVAL;
            goto out_free_dentry;
        }
        /* Calculate any requested alignment. */
        alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum);
        /*
         * There are effectively two types of ET_DYN
         * binaries: programs (i.e. PIE: ET_DYN with PT_INTERP)
         * and loaders (ET_DYN without PT_INTERP, since they
         * _are_ the ELF interpreter). The loaders must
         * be loaded away from programs since the program
         * may otherwise collide with the loader (especially
         * for ET_EXEC which does not have a randomized
         * position). For example to handle invocations of
         * "./ld.so someprog" to test out a new version of
         * the loader, the subsequent program that the
         * loader loads must avoid the loader itself, so
         * they cannot share the same load range. Sufficient
         * room for the brk must be allocated with the
         * loader as well, since brk must be available with
         * the loader.
         *
         * Therefore, programs are loaded offset from
         * ELF_ET_DYN_BASE and loaders are loaded into the
         * independently randomized mmap region (0 load_bias
         * without MAP_FIXED nor MAP_FIXED_NOREPLACE).
         */
        if (interpreter) {
            /* On ET_DYN with PT_INTERP, we do the ASLR. */
            load_bias = ELF_ET_DYN_BASE;
            if (current->flags & PF_RANDOMIZE)
                load_bias += arch_mmap_rnd();
            /* Adjust alignment as requested. */
            if (alignment)
                load_bias &= ~(alignment - 1);
            elf_flags |= MAP_FIXED_NOREPLACE;
        } else {
            /*
             * For ET_DYN without PT_INTERP, we rely on
             * the architectures's (potentially ASLR) mmap
             * base address (via a load_bias of 0).
             *
             * When a large alignment is requested, we
             * must do the allocation at address "0" right
             * now to discover where things will load so
             * that we can adjust the resulting alignment.
             * In this case (load_bias != 0), we can use
             * MAP_FIXED_NOREPLACE to make sure the mapping
             * doesn't collide with anything.
             */
            if (alignment > ELF_MIN_ALIGN) {
                load_bias = elf_load(bprm->file, 0, elf_ppnt,
                                     elf_prot, elf_flags, total_size);
                if (BAD_ADDR(load_bias)) {
                    retval = IS_ERR_VALUE(load_bias) ?
                             PTR_ERR((void*)load_bias) : -EINVAL;
                    goto out_free_dentry;
                }
                vm_munmap(load_bias, total_size);
                /* Adjust alignment as requested. */
                if (alignment)
                    load_bias &= ~(alignment - 1);
                elf_flags |= MAP_FIXED_NOREPLACE;
            } else
                load_bias = 0;
        }
        /*
         * Since load_bias is used for all subsequent loading
         * calculations, we must lower it by the first vaddr
         * so that the remaining calculations based on the
         * ELF vaddrs will be correctly offset. The result
         * is then page aligned.
         */
        load_bias = ELF_PAGESTART(load_bias - vaddr);
    }
//snipped
retval = create_elf_tables(bprm, elf_ex, interp_load_addr,
                           e_entry, phdr_addr);
if (retval < 0)
    goto out;
mm = current->mm;
mm->end_code = end_code;
mm->start_code = start_code;
mm->start_data = start_data;
mm->end_data = end_data;
mm->start_stack = bprm->p;
if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) {
    /*
     * For architectures with ELF randomization, when executing
     * a loader directly (i.e. no interpreter listed in ELF
     * headers), move the brk area out of the mmap region
     * (since it grows up, and may collide early with the stack
     * growing down), and into the unused ELF_ET_DYN_BASE region.
     */
    if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) &&
        elf_ex->e_type == ET_DYN && !interpreter) {
        mm->brk = mm->start_brk = ELF_ET_DYN_BASE;
    } else {
        /* Otherwise leave a gap between .bss and brk. */
        mm->brk = mm->start_brk = mm->brk + PAGE_SIZE;
    }
    mm->brk = mm->start_brk = arch_randomize_brk(mm);
//snipped
We can visualize this with a simple C program that prints the address of its main function. PIE would also randomize any other function addresses.
#include <stdio.h>

int main(void) {
    /* %p expects a void *, so cast the function pointer explicitly. */
    printf("Address of main: %p\n", (void *)main);
    return 0;
}
$ gcc -fpie -pie main.c -o main
hexagr@vr:~$ ./main
Address of main: 0x62498320c149
hexagr@vr:~$ ./main
Address of main: 0x61bcab5a7149
hexagr@vr:~$ ./main
Address of main: 0x587688ff8149
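The low 12 bits of the address of main are 0x149 in every run, while the high bits, which come from the randomized base address, change each time. Because the base address is page-aligned, splitting each address at the 4 KiB page boundary makes this visible. Note that this split isolates the offset of main within its page; its full offset from the load base of the binary may be larger than 0x149.
High bits (randomized, page-aligned) | Low bits (fixed offset of main within its page) |
---|---|
0x62498320c000 | 0x149 |
0x61bcab5a7000 | 0x149 |
0x587688ff8000 | 0x149 |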
PIC
If we additionally pass the -fPIC flag, gcc generates position-independent code (assembly) which, if not optimized away, may access symbols through the Global Offset Table (GOT).
The position-independent code (PIC) feature is distinctly different from the position-independent executable (PIE) feature.
PIE builds on PIC and applies to an entire executable: because all of its code is position-independent, the kernel can load it at a base address randomized by ASLR. This is useful for standalone executables.
PIC enables code to be loaded at any address by using relative addressing. Absolute addresses of global variables and functions that are not known at link time are kept in the Global Offset Table, whose entries the dynamic linker fills in at load time or at run time. This is useful for shared libraries, since they are loaded at random addresses on systems with ASLR enabled. The code locates the GOT itself through relative addressing, so the table serves as a layer of indirection that lets code such as shared libraries reach absolute addresses while remaining position-independent.
Thus, the -fpie flag is for executables, whereas the -fPIC flag is appropriate for a shared library. For standalone executables, the compiler typically uses RIP-relative addressing: leaq main(%rip), %rax. Note that PIE executables often do not need the GOT for internal symbols (like main): the symbol is defined in the executable itself, so its distance from the referencing instruction is fixed at link time and RIP-relative addressing is enough. For external symbols, such as functions from shared libraries, position-independent executables still need the Global Offset Table.
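However, with -fPIC the compiler must assume that the code may end up in a shared library, where a global symbol such as main could be overridden by another definition at run time. It therefore emits assembly that loads the address of main from its Global Offset Table entry: movq main@GOTPCREL(%rip), %rax.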
$ gcc -fPIC -S entry.c -o entry_fpic.s
$ gcc -fpie -S entry.c -o entry_fpie.s
$ diff entry_fpic.s entry_fpie.s
18c18
< movq main@GOTPCREL(%rip), %rax
---
> leaq main(%rip), %rax