Chapter 8: Memory Management
Created: 2025-12-16
Updated: 2025-12-16
Background
- Registers and Main Memory are the only storage directly accessible to the CPU
- A collection of processes is loaded into memory to be executed
- Multiple processes are brought into memory and run concurrently
- A process may be moved between memory and disk during its execution
Address Binding
Multistep Processing of a Program
- compile time: translate source code to machine code
- load time: link the program and load it into memory
- execution time: run the program
Address Binding Timing
- Compile Time
- compiler translates symbolic code into absolute code
- if starting location changes → recompile
- Load Time
- compiler translates symbolic code into relocatable code (e.g. BS+0x1234)
- when loading, translate to physical addresses (e.g. 0xABCD)
- if starting location changes → reload the code
- Execution Time
- compiler translates symbolic code into logical-address (virtual-address) code (e.g. 0x18)
- even when loaded, keep using logical addresses (0x18)
- special hardware, the Memory Management Unit (MMU), maps logical addresses to physical addresses
- even the CPU doesn't know the physical address; it only uses logical addresses
Memory-Management Unit (MMU)
- hardware device that maps logical addresses to physical addresses
- the user program deals with logical addresses; it never sees the physical addresses
Dynamic Loading
- a routine is loaded into memory only when it is called (not at load time)
- unused routines are never loaded
- better memory space utilization
- implemented through program design; no special support from the OS is required:

```c
#include <stdio.h>
#include <dlfcn.h>

int main() {
    double (*cosine)(double);
    /* load the math library only when this code runs */
    void *handle = dlopen("/lib/libm.so.6", RTLD_LAZY);
    cosine = (double (*)(double)) dlsym(handle, "cos");
    printf("%f\n", (*cosine)(2.0));
    dlclose(handle);
    return 0;
}
```
Linking
- Static Linking: libraries are combined with object code at load time
- results in a larger executable file, duplicated code
- Dynamic Linking: linking postponed until execution time
- only one copy of the library code in memory, shared by all processes
- the library must be compiled as a dynamic link library
- stub is included in the executable to ask the OS to load the library when needed
stub call → check if library is in memory → if not, load it → execute the library
Swapping
- a process can be swapped temporarily out of memory to a backing store (disk) and then brought back into memory for continued execution
- also used by medium-term scheduling, not context switching
- Backing Store: a chunk of disk, separated from the file system, to provide direct access to memory images
- Advantages
- free up memory
- swap a lower-priority process out in favor of a higher-priority one
- Swap back: if binding is done at compile/load time, the process must be swapped back into the same memory location
- A process to be swapped MUST be completely idle, which is a problem if the process is waiting for I/O
- solution:
- Never swap a process with pending I/O
- I/O ops are done through OS buffers (user process can be swapped out, OS does I/O on its behalf), then copy data to/from user process when swapped back
- I/O device → I/O Controller (own buffer) → OS Buffer → User Process
- Major problem: swap time is too long; it is proportional to the amount of memory swapped
- solution: use demand paging instead of swapping entire process
Contiguous Memory Allocation
- Fixed-partition allocation
- each process loads into one partition of fixed size
- degree of multiprogramming = # of partitions
- Variable-partition allocation
- Hole: block of contiguous free memory, (scattered in memory)
- First-fit: find the first hole that fits
- Best-fit: find the smallest hole that fits
- Worst-fit: find the largest hole
- Buddy Cell Allocation Algorithm
- a heuristic algorithm for the best-fit policy
- split memory blocks into halves recursively until the smallest block that fits the request is found
- the two halves are called "buddies" (same size, contiguous) and can be merged back later
- search time: O(log n)
- merge and split: O(1)
- External Fragmentation
- total memory space is big enough, but it is not contiguous
- occurs only in variable-partition allocation
- Solution: compaction
- shuffle memory contents to place all free memory together in one large block
- only possible if relocation is dynamic, done at execution time
- Internal Fragmentation
- allocated memory may be slightly larger than the requested memory
- occurs only in fixed-partition allocation
Paging -- Non-Contiguous Memory Allocation
- Divide physical memory into fixed-size blocks called frames
- Divide logical memory into blocks of the same size called pages
- to run a program of n pages, need to find n free frames and load the program
- Benefits
- non-contiguous allocation
- no external fragmentation
- limited internal fragmentation
- provide shared memory/page
- Page Table: maps the base of each page (page number) to the base of each frame
- page table only includes pages owned by a process
- a process cannot access memory outside its allocated pages (mem protection)
- page number p: base address of a page
- frame number f: base address of a frame
- logical address = page number (p) + page offset (d)
- N bits of p → can allocate at most 2^N pages; M bits of d → 2^M bytes per page
- physical address = frame number (f) + page offset (d)
- Address Translation: p + d; p --MMU--> f; physical address = f + d
- # of pages: determines the logical memory size of a process
- # of frames: depends on the size of physical memory
- free frame linked list: maintained by OS
- Page Size: a power of 2, 512 ~ 16MB, typically 4KB
- too small -> large page table, more pages per process -> more I/O ops for loading pages
- too large -> internal fragmentation
- Page Table
- kept in memory
- PTBR (Page-Table Base Register): the physical address of the page table (used by MMU)
- stored in the PCB (Process Control Block) and reloaded during a context switch
- With PTBR, each memory reference requires 2 memory accesses (one for the page-table entry, one for the data) → solved by TLB
- TLB (Translation Lookaside Buffer)
- associative memory: parallel search O(1)
- limited entries: 64 ~ 2048
- caches (page number, frame number) pairs as the MMU translates addresses
- TLB hit: get frame number from TLB
- TLB miss: fall back to the page table via PTBR (2 memory accesses), then update the TLB
- flush TLB during context switch
- EMAT (Effective Memory Access Time)
- example: 20ns for TLB lookup + 100ns for memory access
- 70% TLB hit ratio: EMAT = 0.70 × (20 + 100) + 0.30 × (20 + 100 + 100) = 150ns
- 98% TLB hit ratio: EMAT = 0.98 × (20 + 100) + 0.02 × (20 + 100 + 100) = 122ns
- Memory Protection: each page is associated with a set of protection bits
- e.g. valid/invalid bit
- valid: the page/frame is in the process's logical/physical address space
- invalid: the page/frame is not in the process's logical/physical address space
- PTLR (Page-Table Length Register): indicates the size of the page table (used by MMU)
- Shared Pages
- reentrant code (pure code): never change during execution, can be shared
- only one copy in physical memory
- multiple virtual addresses map to the same physical address
The page table itself can be huge and difficult to load. e.g. 4GB (2^32 B) of logical memory with 4KB (2^12 B) pages would need 2^20 (≈1 million) page-table entries. Three solutions:
Hierarchical Paging
- break the page table into multiple levels
- Two-level paging (32-bit address with 4KB page size)
- 10 bits for outer page number → 1K (2^10) outer page-table entries
- 10 bits for inner page number → 1K (2^10) inner page-table entries
- 12 bits for offset d → 4KB (2^12 B) page size
- → 3 memory accesses (outer table, inner table, then the data)
- 64-bit systems:
- 12 (p1) + 10 (p2) + 10 (p3) + 10 (p4) + 10 (p5) + 12 (d)
- requires 6 memory accesses (5 table levels + the data)
Hashed Page Tables
- commonly used for address spaces larger than 32 bits
- virtual page number is hashed into a hash table
- hash table size varies; a larger table means shorter chains in each entry
- each entry
- (virtual page number, frame number, next pointer)
- pointers waste memory, and traversing the linked list wastes time
Inverted Page Tables
- no page table per process
- maintains a frame table for the whole memory (one entry per frame)
- each entry: (PID, virtual page number)
- increases memory access time
- each access needs to search the whole frame table
- solution: use hashing for the frame table
- hard to support shared pages (a critical drawback)
Segmentation -- Non-Contiguous Memory Allocation
- a program = a collection of segments
- a segment = a logical unit e.g. main program, procedure, function, data structure, stack
- logical address = (segment number, offset)
- Segment Table: maps segment number to (base (4B), limit(4B))
- STBR (Segment-Table Base Register): the physical address of the segment table
- STLR (Segment-Table Length Register): indicates the size of the segment table
- translation process:
- segment number -> segment table -> get base and limit
- check offset < limit?
- physical address = base + offset
- shared segments: same logical address in different processes
- Protection & Sharing
- Read-only segment: code (code shared at segment level)
- Read-Write segment: data, heap, stack
Segmentation with Paging
- CPU --logical address--> Segmentation --linear address--> Paging --physical address--> Memory
- Example: The Intel Pentium
- logical address: (selector: 16 bits, offset: 32 bits)
- selector: segment number (13 bits), GDT/LDT flag (1 bit), protection info (2 bits)
- LDT: Local Descriptor Table (2^13 entries)
- GDT: Global Descriptor Table (2^13 entries)
- max # of segments per process: 2^14 = 16K (8K in the LDT + 8K in the GDT)
- size of a segment: up to 4GB (32-bit offset)
- linear address: segment base address (output of the segment descriptor) + offset = 32-bit linear address
- take the 32-bit linear address as the paging logical address
- 10 (p1) + 10 (p2) + 12 (d)
- a page-size flag in the page-directory entry selects the page size
- 0: 4KB page (normal 10-bit p2 + 12-bit d)
- 1: 4MB page (the remaining 22 bits are used directly as d)
