Chapter 4: Multithreaded Programming

Created: 2025-12-16
Updated: 2025-12-16

Thread

  • aka lightweight process (LWP)
  • shared among threads of the same process: code section, data section, OS resources (open files ...)
  • private to each thread: thread control block, PC, registers, stack
  • Motivation: Web browser, web server, RPC server
  • Benefits
    • Responsiveness: remain responsive even when part of the program is blocked
    • Resource Sharing: share resources of process
    • Utilization of Multiprocessor Architectures: run in parallel on multiple processors
    • Economy: less overhead to create and context switch.
      • register set switch is required, but memory map switch is not required
  • Challenges
    • Dividing Activities
    • Data Splitting
    • Data Dependency: synchronization
    • Balance: equal workload
    • Test and Debugging

User Threads vs. Kernel Threads

  • User Threads
    • Thread library in user space --- fast
    • OS kernel unaware of them
  • Kernel Threads
    • kernel performs thread creation, scheduling, management --- slow

Multithreading Models

  • Many-to-One Model
    • pros: efficient, no kernel mode switch
    • cons: a blocking system call by one thread blocks all threads of the process
  • One-to-One Model (Common)
    • pros: more concurrency, can run in parallel on multiprocessors, a blocked thread does not block the others
    • cons: more overhead, limit on # of kernel threads
  • Many-to-Many Model
    • pros: can create many user-level threads, can run in parallel on multiprocessors
    • cons: complex to implement

Shared-Memory Programming

  • Def: processes (or threads) work together through a shared memory space that all of them can access
    • faster & more efficient than message-passing
  • issues: synchronization, deadlock, cache coherence
  • programming techniques: parallelizing compilers, Unix processes, threads (Pthreads, Java)

Pthread

Pthreads := POSIX Threads, a standard API for thread creation and synchronization; the API is the same everywhere, but each OS can implement it with different syscalls
POSIX := Portable Operating System Interface for Unix

C++
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*routine)(void *), void *arg);
    // thread: pointer that receives the new thread's ID
    // attr: thread attributes (NULL for default)
    // routine: function to be executed by the thread
    // arg: argument passed to routine
    // returns 0 on success, an error number on failure
int pthread_join(pthread_t thread, void **retval); // blocking
    // thread: thread ID to wait for
    // retval: receives the thread's return value (NULL if not needed)
int pthread_detach(pthread_t thread);
    // thread: thread ID to detach; a detached thread cannot be joined
void pthread_exit(void *retval);
    // retval: return value to be collected by pthread_join
C
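// example
#include <pthread.h>
#include <stdio.h>
#define NUM_THREADS 5
// thread entry point: prints the ID passed by value through the void* argument
void *PrintHello(void *threadid) {
    long tid = (long)threadid;
    printf("Hello World! Thread ID, %ld\n", tid);
    return NULL;  // same effect as pthread_exit(NULL)
}
int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long tid = 0; tid < NUM_THREADS; ++tid) {
        // when passing only a single value, can cast directly to (void*)
        // otherwise, we can define a struct to hold, and pass its pointer
        if (pthread_create(&threads[tid], NULL, PrintHello, (void *)tid) != 0) {
            fprintf(stderr, "pthread_create failed for thread %ld\n", tid);
            return 1;
        }
    }
    // wait for every thread before returning
    // calling pthread_exit(NULL) here instead would end main() while the
    // process stays alive until the other threads finish; not recommended
    for (int i = 0; i < NUM_THREADS; ++i)
        pthread_join(threads[i], NULL);
    return 0;
}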

Java Threads

  • Java threads are implemented by a thread library on top of native OS threads
  • Thread mapping depends on JVM implementation and OS (one-to-one or many-to-many)

Linux Threads

  • Linux does not distinguish between processes and threads; both are tasks, and a thread is a task that shares resources with its parent
  • clone() syscall: creates a new task that can point to (share) the parent's data instead of copying it
    • flags: specify what to share between parent and child
      • no sharing flags: nothing is shared, behaves like fork()
      • CLONE_VM: share memory space
      • CLONE_FS: share file system info
      • CLONE_FILES: share file descriptors
      • CLONE_SIGHAND: share signal handlers

Threading Issues

  • Semantics of fork() and exec()
    • fork(): either "copy all threads" or "only the calling thread"
    • exec() family (e.g. execlp()): replaces the entire process image; all other threads are terminated
  • Thread Cancellation
    • Asynchronous: terminate immediately --- unsafe
      • can leave shared data in inconsistent state
    • Deferred: terminate at defined cancellation points --- safe
  • Signal Handling
    • signals: generated by events → deliver to process → signal handler
    • options:
      • deliver to the thread that generated the signal
      • deliver to all threads
      • deliver to specific thread (e.g. pthread_kill())
      • designate a specific thread to receive all signals (e.g. main thread handles file IO)
  • Thread Pools
    • create a number of threads in advance and keep them in a pool, where they wait for tasks
    • pros
      • faster than creating/destroying threads for each task
      • better resource management (limit # of threads)
    • e.g. web server