System Calls MCQ 60 Practice Tests With Answers (2026)

System Calls MCQ practice questions are essential for preparing for competitive exams, certifications, and technical interviews. This comprehensive MCQ platform provides 60 carefully curated practice questions covering the system call lifecycle, privilege modes, parameter passing, and advanced Linux kernel optimization techniques.
These questions are organized into three progressive difficulty levels of 20 questions each: Basics (covering core definitions, POSIX APIs, and the system call table), Concepts (covering parameter passing, fork/exec, mmap, and ptrace), and Advanced (covering SYSCALL/SYSRET, vDSO, seccomp, KPTI, eBPF, and futex). Each question includes a verified, in-depth explanation to reinforce learning.
Practice in Study Mode to reveal answers and detailed explanations instantly, or use Exam Mode for timed testing and real-time scoring to simulate GATE, university, or FAANG technical interview conditions. The interactive engine tracks your progress and identifies knowledge gaps across modern kernel designs.
Contents
- 1. Basics (20 Questions): Core definitions · Mode Bits · POSIX abstraction · Syscall categories
- 2. Concepts (20 Questions): Parameter passing · fork/exec/wait · mmap · ptrace · async I/O
- 3. Advanced (20 Questions): vDSO · seccomp · KPTI · futex · zero-copy splice · eBPF · rootkits
- 4. Conclusion: summary · next steps · study tips
- 5. Key Takeaways: quick-fire bullet recap of essential facts
- 6. Quick Review Summary: concept · definition · key fact table
- 7. FAQ: common questions answered
System Calls: Basics
1. What is the fundamental definition of a "System Call"?
Correct answer (A): The programmatic interface through which a user-space process requests a privileged service from the operating system kernel
A system call is the formal boundary between user space and kernel space. It is the only sanctioned mechanism for an application to request services (such as file I/O, process creation, or network communication) that require elevated kernel privileges. The CPU switches from User Mode (Ring 3) to Kernel Mode (Ring 0) when the call is made, ensuring that the kernel mediates all sensitive hardware access.
2. Why do software developers typically use an Application Programming Interface (API) rather than invoking raw system calls directly?
Correct answer (B): APIs provide cross-platform portability and are significantly easier to implement than handling low-level, architecture-specific CPU registers
Raw system calls require the developer to manually load the correct syscall number into a register (e.g., RAX on x86-64) and execute a trap instruction, details that vary per OS and CPU architecture. Library APIs such as the POSIX wrappers in libc hide this completely. For example, calling `fopen()` in C works the same way on Linux, macOS, and any POSIX-compliant system, while the libc implementation internally translates it to the platform's native syscall (open() or openat() on Linux, for example).
3. In which operational mode must the CPU be executing for a system call to actually perform its underlying task?
Correct answer (C): Kernel Mode
System calls bridge User Mode (Ring 3) and Kernel Mode (Ring 0). The preliminary work (argument setup, trap instruction) occurs in User Mode. The actual privileged work (accessing hardware, modifying kernel data structures, scheduling) executes exclusively in Kernel Mode. The CPU enforces this via the Current Privilege Level (CPL) field in the CS register and the Mode Bit in the processor status word.
4. What is the most common API used by developers to interface with system calls on UNIX and Linux operating systems?
Correct answer (D): The POSIX API (often accessed via `libc`)
The POSIX (Portable Operating System Interface) standard defines a common API for UNIX-like systems. On Linux, glibc (the GNU C Library) provides the POSIX wrappers, functions like open(), read(), write(), and fork(), each of which internally loads the appropriate Linux syscall number and executes the architecture's trap instruction (SYSCALL on x86-64). Other libc implementations (musl, dietlibc, bionic on Android) follow the same POSIX interface while mapping to the same kernel syscalls.
5. What specific event triggers the CPU to transition from User Mode to Kernel Mode to execute a system call?
Correct answer (B): A synchronous software interrupt (also known as a trap or exception) executed by the application
System calls use synchronous software interrupts, also called traps, deliberately initiated by the application. On x86-64 Linux this is the SYSCALL instruction; legacy 32-bit Linux used INT 0x80. Unlike hardware interrupts (which are asynchronous and triggered by external devices), software traps occur at a precisely defined point in the instruction stream. The CPU transfers control to the entry point registered in the IDT (for INT 0x80) or in the LSTAR MSR (for SYSCALL), which then runs the kernel-side syscall dispatcher.
6. Which of the following is a classic example of a "Process Control" system call in UNIX-like systems?
Correct answer (C): `fork()`
Process Control system calls manage the lifecycle of processes. fork() is the canonical example: it creates an exact (Copy-on-Write) copy of the calling process. Other Process Control calls include exec() (replace a process image), exit() (terminate a process), wait() (reap a child), nice() (adjust scheduling priority), and kill() (send a signal). chmod() is Protection, ioctl() is Device Management, and mmap() bridges File Manipulation and Memory Management.
7. Which of the following functions represents a standard "File Management" system call?
Correct answer (D): `open()`
File Management system calls handle persistent data: open() (get a file descriptor), read()/write() (transfer data), close() (release the descriptor), stat() (query metadata), rename(), unlink() (delete), and lseek() (reposition). kill() is Process Control (send a signal), sleep() is Process Control (suspend), and gettimeofday() is Information Maintenance.
8. For applications designed to run on Microsoft Windows, which API provides the primary wrappers for interacting with the NT kernel's system calls?
Correct answer (A): The Win32 API
The Win32 API (also called the Windows API or WinAPI), implemented in kernel32.dll, user32.dll, and advapi32.dll, provides the standard application-facing interface on Windows. These DLLs call into ntdll.dll, which makes the actual NT kernel system calls (NtReadFile, NtCreateProcess, etc.) via the SYSCALL instruction. The Cocoa API is Apple's macOS/iOS framework; Berkeley Sockets is the POSIX network API; the GNU C Library (glibc) belongs to Linux.
9. When a system call executes successfully in a UNIX environment, what is the most common type of value returned to the calling application?
Correct answer (C): A zero (0) or a positive integer (such as a file descriptor handle)
UNIX/POSIX system call wrappers return 0 on success for calls that return no meaningful value (like close(), chmod()), or a non-negative integer for calls that produce a result: open() returns a file descriptor (≥0), fork() returns the child PID (>0) to the parent, read() returns the number of bytes read, and so on. This convention lets callers distinguish success (≥0) from failure (-1) with a simple comparison, checking errno on -1.
10. If a system call fails to execute properly in a POSIX environment, what does the wrapper function typically return to the application?
Correct answer (D): It returns `-1` and sets the global `errno` variable to a specific error code
The POSIX error convention: a libc wrapper returns -1 and sets the thread-local errno variable to an error code (e.g., ENOENT for "no such file", EACCES for "permission denied", EAGAIN for "try again"). The application then inspects errno to determine the failure reason. This two-step mechanism (return value + errno) allows detailed error reporting without changing the function signature. Modern code should use strerror(errno) or perror() to get a human-readable message.
11. Which system call is frequently used for "Device Management" to send highly specific, non-standard control commands to hardware peripherals?
Correct answer (A): `ioctl()`
ioctl() (Input/Output Control) is the Swiss Army knife of Device Management. It accepts a file descriptor (to a device, socket, or file), an operation code, and optional arguments, providing a generic, extensible mechanism for hardware-specific commands that don't fit the standard read/write model. Examples: eject a CD tray, query the terminal window size (TIOCGWINSZ), configure a network interface (SIOCSIFFLAGS), or set serial port baud rate (TCSETS).
12. Which of the following is an example of an "Information Maintenance" system call?
Correct answer (B): `gettimeofday()`
Information Maintenance system calls query or modify OS state data. gettimeofday() returns the current wall-clock time and timezone: pure kernel state, no hardware writes. Other Information Maintenance syscalls: getpid(), getuid(), uname() (OS version), sysinfo() (memory/CPU stats), and alarm() (set a timer). pipe() is Communications IPC; chown() is Protection; exit() is Process Control.
13. If an application needs to establish a network connection or facilitate inter-process communication (IPC), which category of system call will it invoke?
Correct answer (D): Communications (e.g., `pipe()` or `socket()`)
Communications system calls handle data exchange between processes or machines. Shared-memory model calls include shmget()/shmat(); message-passing model calls include pipe(), msgget()/msgsnd()/msgrcv(). Network communications use the full socket API: socket(), bind(), listen(), accept(), connect(), send()/recv(). This category maps both to local IPC and to network sockets; the unified file-descriptor model in UNIX means many of these share the same read()/write() interface once connected.
14. How does the operating system kernel map a generic system call request to the exact internal function that needs to be executed?
Correct answer (A): By utilizing an indexed System Call Table that maps an integer passed by the application to the corresponding kernel function pointer
When a SYSCALL instruction executes, the CPU transfers control to the kernel's syscall entry point (defined in the LSTAR MSR on x86-64). The kernel reads the syscall number from RAX and indexes into sys_call_table[], a static array of function pointers compiled into the kernel. For example, syscall #1 on Linux x86-64 is sys_write; #57 is sys_fork. The kernel then calls the corresponding handler with the arguments passed in RDI, RSI, RDX, R10, R8, R9.
15. How does a program written in Java handle system calls, given that Java is meant to be platform-independent?
Correct answer (B): The Java Virtual Machine (JVM) intercepts the Java API calls and translates them into the native, host-specific system calls of the underlying OS
Java's platform independence is achieved by having the JVM act as an intermediate translation layer. When Java code calls java.io.FileInputStream.read(), the JVM's native implementation (written in C/C++) calls the OS-specific syscall (read() on Linux/macOS, NtReadFile on Windows) through JNI bindings compiled into the JVM binary. Java bytecode itself contains no syscall instructions; all OS interaction flows through the JVM runtime, which is compiled per-platform.
16. What is the primary security benefit of requiring applications to use system calls rather than accessing hardware directly?
Correct answer (C): It establishes a strict boundary that prevents errant or malicious User Mode applications from crashing the system or reading protected kernel memory
The dual-mode operation model (User Mode vs Kernel Mode) enforced by system calls is the cornerstone of OS security and stability. User Mode processes cannot directly access I/O ports, modify page tables, disable interrupts, or read kernel memory. Any attempt to execute a privileged instruction in User Mode triggers a General Protection Fault. All hardware access must go through the system call interface, where the kernel validates the request, checks permissions (owner/group/ACL), and mediates the operation safely.
17. Which of the following system calls is utilized for "Protection" and access control?
Correct answer (A): `chmod()`
Protection/Security system calls control access rights to system resources. chmod() modifies the permission bits (rwxr-xr-x) on a file, making it the most direct access control call on UNIX. Other Protection calls: chown() (change file owner/group), setuid()/setgid() (change process identity), getuid()/getgid() (query identity), umask() (set default permission mask), and capabilities-related calls (capset/capget). read() is File Management, wait() is Process Control, lseek() is File Management.
18. What happens to the thread of a User Mode application while a standard, synchronous system call is being executed by the kernel?
Correct answer (C): It is suspended or blocked until the kernel finishes processing the request and returns control
A synchronous system call that has to wait (for example, a read() on data that is not yet available) puts the calling thread into the S (interruptible sleep) or D (uninterruptible sleep) state in the kernel scheduler until the operation completes. The thread's registers and program counter are saved on its kernel stack, and the scheduler may run other threads during the wait. When the operation completes (e.g., data arrives from disk), the kernel restores the thread's execution context and returns the result to User Mode. This blocking is why servers use asynchronous or event-driven I/O (io_uring, epoll) to avoid stalling threads.
19. What physical CPU mechanism strictly enforces the transition that occurs during a system call?
Correct answer (B): The Mode Bit, which toggles from User Mode (1) to Kernel Mode (0)
The Mode Bit (also called the Privilege Level or Current Privilege Level in x86 terminology) is a hardware-enforced field in the processor status register. In x86-64 it is encoded in bits [1:0] of the CS segment register: Ring 3 for User Mode, Ring 0 for Kernel Mode. When SYSCALL executes, the CPU automatically transitions to Ring 0, preventing User Mode code from executing privileged instructions. SYSRET restores Ring 3. The hardware enforces this without software intervention, making it tamper-proof.
20. Why does invoking a system call introduce performance overhead compared to calling a standard function within the application's own code?
Correct answer (D): It requires context switching, saving CPU registers, validating permissions, and flushing specific memory caches
A system call incurs multiple layers of overhead: (1) The SYSCALL/SYSRET instruction pair disrupts the instruction pipeline. (2) General-purpose registers are saved and restored on the kernel stack. (3) On KPTI systems (post-Meltdown), CR3 must switch from the user page tables to the kernel page tables, which can flush TLB entries. (4) The kernel validates arguments and permissions. (5) CPU cache state may be partially displaced. As a rough figure, a minimal SYSCALL round-trip takes on the order of 100 to 300 ns on a current CPU (vs ~1 ns for a function call), which is why mechanisms like the vDSO and io_uring minimize round-trips.
System Calls: Concepts
1. When passing parameters to a system call, which method is the fastest but highly constrained by CPU architecture?
Correct answer (D): Passing parameters directly within the CPU registers
Register-based parameter passing is the fastest because it avoids any memory access: arguments are already in CPU registers when the SYSCALL instruction fires. The constraint is the limited number of general-purpose registers. On Linux x86-64 (System V ABI), 6 argument registers are available: RDI, RSI, RDX, R10, R8, R9 (with RAX holding the syscall number and R11/RCX used by SYSCALL/SYSRET internally). System calls with >6 arguments (rare on Linux) must fall back to passing a pointer to an in-memory structure.
2. If an application needs to pass a large array of parameters to a system call that exceeds the available CPU registers, how is this typically handled?
Correct answer (B): The parameters are stored in a block or table in main memory, and the address of that block is passed as a pointer in a single register
When more parameters are needed than CPU registers can hold, the caller places all arguments in a contiguous struct/block in its own address space and passes a single pointer (a user-space virtual address, not a physical one) to that block in one register. The kernel then reads the parameters from that user-space memory using copy_from_user(), which validates and copies the data safely across the user/kernel boundary. This is how Linux handles the structured arguments of syscalls like sigaction() (a struct sigaction) and futex() (a struct timespec timeout).
3. Aside from registers and memory blocks, what is the third common method for passing parameters to the operating system during a syscall?
Correct answer (C): Pushing the parameters onto the stack by the program, which are then popped off by the operating system
The three standard parameter-passing methods in OS textbooks are: (1) Registers (fastest, limited count), (2) Block/Table in memory (pass a pointer), and (3) Stack-based (push args onto user stack, OS pops them). Some older systems (notably early UNIX on PDP-11) and some embedded OSes use the stack approach. Modern Linux exclusively uses registers (method 1) for its syscall ABI; Win32 historically used stack-based calling conventions (cdecl, stdcall) for user-mode API calls but uses register-based for the actual NT syscall layer.
4. Which command-line diagnostic tool is used in Linux to intercept, record, and display all system calls made by a running process?
Correct answer (A): `strace`
strace uses the ptrace() system call to attach to a process and intercept every system call it makes, printing the name, arguments, and return value. It is invaluable for debugging: diagnosing why a program fails to open a file (strace shows ENOENT), finding configuration file paths, tracing performance issues (which syscalls are slow), and security auditing. Usage: `strace -p <PID>` to attach to a running process, or `strace ./program` to trace from startup. The macOS equivalent is `dtruss`, the BSDs offer `truss` and `ktrace`, and Windows has Process Monitor.
5. After the `fork()` system call successfully creates a child process, what values does it return?
Correct answer (B): It returns `0` to the newly created child process, and returns the child's Process ID (PID) to the parent
fork() is unusual in that a single call returns twice, once in each process. In the parent, it returns the child's PID (a positive integer), allowing the parent to track or wait() for the child. In the newly created child process, fork() returns 0, allowing the child to identify itself and take a different code path. If fork() returns -1, it failed (e.g., EAGAIN if the process limit was reached) and no child exists. The standard idiom: `pid = fork(); if (pid < 0) { /* error */ } else if (pid == 0) { /* child */ } else { /* parent, pid = child PID */ }`.
6. What is the fundamental behavior of the `exec()` family of system calls?
Correct answer: A: It entirely replaces the memory space and core image of the current process with a new, executable program
exec() replaces the calling process's entire address space (text, data, heap, stack segments) with a new executable: the PID stays the same but the program image changes completely. The ELF loader reads the new binary's headers, maps its segments into virtual memory, and jumps to the entry point. Caught signals are reset to their default dispositions (ignored signals stay ignored), open file descriptors are preserved unless they carry FD_CLOEXEC, and the process effectively becomes a brand-new program. The classic UNIX pattern is fork() + exec(): fork to clone, then exec to become a different program.
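The fork() + exec() pattern can be sketched as follows. This is a minimal example under the assumption of a POSIX system with `sh` on the PATH; `spawn_and_wait` is a hypothetical helper, and 127 is used only as a conventional "exec failed" exit code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up via PATH) in a child
 * process and return its exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);     /* on success this never returns */
        _exit(127);                /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the child's PID is the one returned by fork(); execvp() changes the program image, not the process identity.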
7. Why does a parent process typically invoke the `wait()` system call after using `fork()`?
Correct answer: D: To suspend its own execution until the child terminates, allowing the parent to read the exit status and reap the child's "zombie" state
When a child process exits, its Process Control Block (PCB) is not immediately freed: it becomes a "zombie" process that retains little more than its exit status and PID. The parent must call wait() (or waitpid()) to read that exit status, after which the kernel removes the zombie from the process table. If the parent never calls wait(), the zombie persists until the parent itself exits (at which point init/systemd inherits and reaps it). A high zombie count indicates a programming error in the parent process.
8. When a software trap is triggered to execute a system call, where is the execution context (program counter, registers) of the calling process safely stored?
Correct answer: C: Pushed onto the process's dedicated kernel stack
Every thread has two stacks: a user-space stack (used for function calls and local variables) and a small, fixed-size kernel-mode stack. With an interrupt-style trap such as `int 0x80`, the CPU automatically switches to the kernel stack and pushes SS, RSP, RFLAGS, CS, and RIP (the return address). With SYSCALL on x86-64, the CPU only copies RIP into RCX and RFLAGS into R11; the kernel's entry code then switches to the kernel stack and pushes those values along with the other general-purpose registers. Either way, the saved context (the pt_regs struct on Linux) ends up on the kernel stack, and it is what the kernel restores when it returns control to the user process, ensuring transparent resumption.
9. What action does a standard C library (libc) wrapper function actually perform to initiate a system call?
Correct answer: C: It loads the specific system call number into a designated CPU register (like EAX/RAX) and executes a specialized trap instruction
A libc syscall wrapper typically: (1) marshals arguments into the ABI-correct registers (RDI, RSI, RDX, R10, R8, R9 on Linux x86-64); (2) loads the syscall number into RAX (e.g., 1 for sys_write); (3) executes the SYSCALL instruction, which traps into the kernel; (4) on return, checks whether RAX holds a value in the error range [-4095, -1], and if so, negates it, stores it in errno, and returns -1 to the caller. The entire wrapper is typically just a few lines of assembly.
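The wrapper behaviour described above can be observed from C through glibc's generic `syscall()` function, which performs the same register setup and errno conversion for an arbitrary syscall number. This is a sketch assuming Linux with glibc and the compiler's default GNU dialect; `raw_getpid` and `raw_write_errno` are hypothetical names.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke getpid through the generic entry point: syscall() places
 * SYS_getpid in the syscall-number register and traps into the kernel. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Write to a deliberately invalid descriptor and return the errno that
 * the wrapper derived from the kernel's negative return value. */
int raw_write_errno(void)
{
    errno = 0;
    long rc = syscall(SYS_write, -1, "x", (size_t)1);
    return rc == -1 ? errno : 0;
}
```

The kernel itself returns -EBADF in the result register for the second call; the caller only ever sees -1 plus errno because the wrapper translates it.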
10. Which system call allows a process to project the contents of a file directly into its virtual memory address space, enabling rapid I/O via standard memory pointers?
Correct answer: D: `mmap()`
mmap() (memory map) creates a mapping between a region of the process's virtual address space and either a file or anonymous memory. Once mapped, the process reads and writes the file by reading and writing memory; the kernel's page fault mechanism automatically fetches pages from disk on demand. This removes the per-call overhead of repeated read()/write() calls (at the price of page faults) and is the mechanism behind memory-mapped databases such as LMDB, SQLite's optional memory-mapped I/O, and the dynamic linker's loading of ELF shared libraries. The same page-fault machinery also underlies Copy-on-Write in fork().
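A minimal sketch of reading a file through a mapping instead of read(): the function below writes a message to a temporary file, maps it, and compares the mapped bytes with the original. It assumes a POSIX system with a writable `/tmp`; the function name and temp-file template are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write `msg` to a temporary file, map the file, and compare the mapped
 * bytes with the original. Returns 0 on a match, -1 on any failure. */
int mmap_roundtrip(const char *msg)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";   /* hypothetical temp name */
    size_t len = strlen(msg);
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* file disappears on close */

    int rc = -1;
    if (len > 0 && write(fd, msg, len) == (ssize_t)len) {
        char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            /* Ordinary memory access; the kernel pages the file in. */
            rc = (memcmp(p, msg, len) == 0) ? 0 : -1;
            munmap(p, len);
        }
    }
    close(fd);
    return rc;
}
```

The length check matters because mmap() rejects zero-length mappings.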
11. Despite its aggressive-sounding name, what is the actual purpose of the `kill()` system call?
Correct answer: A: To send a specified asynchronous signal (such as SIGTERM or SIGSTOP) to a target process
kill() is a signal-delivery mechanism, not a process terminator. It takes a PID and a signal number and delivers the specified signal to the target process. SIGTERM (15) requests graceful shutdown; SIGKILL (9) forces immediate termination; SIGSTOP (19) suspends; SIGCONT (18) resumes (these two numbers apply to Linux on x86 and ARM and differ on some other architectures); SIGUSR1/SIGUSR2 are application-defined. The name is historical: the `kill` command's default signal, SIGTERM, does terminate a process that has not installed a handler, but the syscall itself merely delivers whatever signal you specify. `kill -0 PID` sends no signal at all; it only checks that the process exists and that the caller is permitted to signal it.
12. What is the primary advantage of utilizing Asynchronous I/O system calls (like POSIX `aio_read`) over synchronous ones?
Correct answer: B: They return control to the application immediately without blocking, allowing the process to execute other tasks while the I/O operation finishes in the background
Asynchronous I/O decouples the submission of an I/O request from its completion. The application calls aio_read() (POSIX AIO), io_submit() (Linux AIO), or io_uring_enter() (io_uring on modern Linux), which normally return as soon as the request is queued, and then continues processing. When the I/O completes, the kernel notifies the application via a signal, a callback, or a completion queue. This lets a single thread keep thousands of I/O operations in flight without dedicating one thread to each. Event-driven servers such as Nginx, Node.js, and Redis get the same single-threaded scalability for sockets mainly from non-blocking I/O with epoll; asynchronous interfaces extend the idea to operations, such as disk reads, that readiness notification does not cover.
13. In the Windows NT architecture, which API function essentially combines the operations of the UNIX `fork()` and `exec()` system calls into a single step?
Correct answer: A: `CreateProcess()`
CreateProcess() is the Windows counterpart of fork() and exec() combined. It creates a new process object, allocates an address space, loads the specified executable, creates the primary thread, and starts execution, all in one call. Unlike fork(), it does not clone the calling process's address space. CreateProcess() takes 10 parameters and is implemented on top of Native API calls: NtCreateProcess/NtCreateThread in older releases, NtCreateUserProcess since Windows Vista. Process creation on Windows is generally regarded as more expensive than fork() on UNIX.
14. In legacy 32-bit x86 Linux environments, which specific assembly instruction was famously used to trigger the software interrupt required for system calls?
Correct answer: B: `int 0x80`
On 32-bit x86 Linux, before the kernel adopted Intel's SYSENTER (introduced with the Pentium II) and before AMD64 brought SYSCALL to 64-bit code, system calls used `int 0x80`: a software interrupt that transfers control to the handler registered at vector 0x80 in the IDT. The syscall number went in EAX and the arguments in EBX, ECX, EDX, ESI, EDI, and EBP. Linux 2.6 added SYSENTER/SYSEXIT support for 32-bit processes through the linux-gate vDSO. Native 64-bit code on modern Linux uses SYSCALL/SYSRET, commonly cited as roughly 5× faster than INT 0x80; 64-bit kernels still accept `int 0x80` from 32-bit compatibility-mode programs.
15. Which system calls are heavily utilized in network programming to efficiently monitor multiple file descriptors (like sockets) simultaneously?
Correct answer: D: `select()` and `poll()` (or `epoll`)
select(), poll(), and the epoll family are I/O multiplexing interfaces that let a single thread wait on many file descriptors at once. select() and poll() rescan every monitored descriptor on each call (O(n)), and select() is further limited to FD_SETSIZE descriptors (1024 in glibc). epoll (Linux-specific) keeps the interest set in the kernel and returns only the descriptors that are ready, so the cost of a wakeup does not grow with the number of idle connections; this makes it suitable for 100,000+ concurrent connections. epoll_create(), epoll_ctl(), and epoll_wait() form the API used by the Nginx, Redis, and Node.js event loops. The closest Windows counterpart is IOCP (I/O Completion Ports), which reports completed operations rather than readiness.
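Readiness notification can be illustrated with poll() on a pipe, which behaves like a socket for this purpose. This is a minimal sketch assuming a POSIX system; `pipe_readiness_demo` is a hypothetical name, and a real server would poll many descriptors in a loop.

```c
#include <poll.h>
#include <unistd.h>

/* Poll the read end of a fresh pipe before and after writing a byte.
 * Returns 1 if it was idle first and readable afterwards, 0 otherwise,
 * -1 on setup failure. */
int pipe_readiness_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN, .revents = 0 };
    int before = poll(&pfd, 1, 0);            /* timeout 0: do not block */

    int ok = 0;
    if (write(fds[1], "x", 1) == 1) {
        pfd.revents = 0;
        int after = poll(&pfd, 1, 1000);      /* wait up to one second */
        ok = (before == 0 && after == 1 && (pfd.revents & POLLIN) != 0);
    }
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

poll() returns the number of descriptors with events, so 0 means "nothing ready" and 1 means the single watched descriptor became readable.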
16. Which system call changes the apparent root directory for the current running process and its children, effectively isolating them from the rest of the file system?
Correct answer: C: `chroot()`
chroot() changes the process's root directory to a specified path. All subsequent pathname lookups starting with "/" are then resolved relative to this new root, and ".." at the root leads nowhere above it. This is the basis of "chroot jails" for lightweight sandboxing (FTP servers, build systems). However, chroot() is not a security boundary on its own: it does not change the working directory, so a process that still holds root privileges (CAP_SYS_CHROOT) can escape, for example by calling chroot() on a subdirectory while its working directory remains outside the new root and then walking upward with chdir(".."). True container isolation combines a root change (chroot() or pivot_root()) with Linux namespaces (unshare()) and cgroups.
17. When an application passes a memory pointer as an argument to a system call, why must the kernel rigorously validate that pointer?
Correct answer: B: To ensure the pointer directs to valid, accessible memory strictly within the user's address space, preventing malicious attempts to read or overwrite protected kernel memory
Unvalidated user-space pointers are a critical attack surface. A malicious process could pass a kernel-space address hoping the kernel dereferences it, leaking or corrupting kernel memory. Linux uses copy_from_user()/copy_to_user(), which check that the source or destination falls within the user's virtual address space before copying. A NULL or otherwise invalid pointer causes EFAULT to be returned to the caller. Without this validation, a simple write() call with a crafted buffer address could read kernel secrets, and a read() could overwrite kernel data structures.
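The EFAULT behaviour can be observed safely from user space. The sketch below asks the kernel to copy from an address inside page 0, which is assumed to be unmapped (the usual case on Linux); a pipe is used as the target because its write path really copies from the buffer. `write_from_bad_pointer` is a hypothetical name.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Ask the kernel to write from an address that is not mapped in this
 * process and return the resulting errno (EFAULT is expected). */
int write_from_bad_pointer(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* volatile keeps the compiler from reasoning about the bogus address */
    volatile uintptr_t addr = 1;              /* page 0: assumed unmapped */
    const void *bogus = (const void *)addr;

    errno = 0;
    ssize_t n = write(fds[1], bogus, 16);
    int err = (n == -1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

The process is not killed: the kernel's checked copy fails, and the syscall simply reports the error instead of dereferencing the pointer blindly.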
18. Which system call modifies the data segment size of a process, traditionally serving as the underlying mechanism for dynamic heap memory allocation (e.g., `malloc()`)?
Correct answer: C: `sbrk()` (or `brk()`)
brk() and sbrk() set or adjust the program break, the address marking the end of the process's heap segment (on Linux, brk() is the system call and sbrk() is a libc wrapper around it). malloc() traditionally calls sbrk() to grow the heap when it needs more memory, then manages the returned region in user space. Modern glibc malloc() uses mmap(MAP_ANONYMOUS) for large allocations (128 KiB and above by default) because mmap() regions can be returned to the OS independently, whereas the brk heap can only shrink from the top. free() is a libc function, not a syscall.
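Moving the program break by hand shows what an allocator does underneath. This is a sketch only, assuming Linux with glibc and the compiler's default GNU dialect (which exposes `sbrk()`); `grow_break` is a hypothetical name, and real programs should leave the break to malloc().

```c
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes`, report how far the program break moved, then
 * shrink it back. Returns the observed growth, or -1 if sbrk() failed. */
long grow_break(long bytes)
{
    char *before = sbrk(0);                  /* current program break */
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;
    char *after = sbrk(0);
    sbrk(-(intptr_t)bytes);                  /* restore the old break */
    return (long)(after - before);
}
```

`sbrk(0)` is the conventional way to query the break without changing it.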
19. In the standard x86 and x86-64 calling conventions, where does the kernel place the return value of a completed system call before handing control back to the user process?
Correct answer: A: In the EAX or RAX register
On both 32-bit x86 (EAX) and 64-bit x86-64 (RAX), the kernel places the system call return value in the accumulator register before returning to user mode (via SYSRET, SYSEXIT, or IRET, depending on the entry path). A non-negative value indicates success (or the actual result, such as bytes read, a PID, or a file descriptor). A value in the range [-4095, -1] indicates an error: the libc wrapper negates it, stores it in errno, and returns -1 to the application. Returning the result in a register is cheap, since no memory write is needed.
20. Which powerful system call allows a parent process to observe, control, and manipulate the execution of another process, forming the backbone of debuggers like GDB?
Correct answer: D: `ptrace()`
ptrace() (process trace) is the kernel's debugging primitive. With PTRACE_ATTACH, a tracer (often, but not necessarily, the parent) attaches to a target (the tracee); PTRACE_SYSCALL then makes the kernel stop the tracee at every syscall entry and exit. PTRACE_PEEKDATA reads the tracee's memory; PTRACE_POKEDATA writes to it; PTRACE_GETREGS reads CPU registers. GDB, strace, ltrace, and rr all use ptrace() as their foundation. seccomp-bpf (used by the Chrome and Docker sandboxes) filters system calls inside the kernel without ptrace overhead; a filter can still hand selected calls to a tracer by returning SECCOMP_RET_TRACE, which the tracer opts into with the PTRACE_O_TRACESECCOMP option.
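A heavily simplified tracer shows the PTRACE_SYSCALL stop-and-resume loop that tools like strace build on. This sketch assumes Linux and an environment where ptrace is permitted (some sandboxes block it, in which case the function returns -1); it uses PTRACE_TRACEME in a forked child rather than PTRACE_ATTACH, only counts stops instead of decoding registers, and `count_syscall_stops` is a hypothetical name.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Trace a short-lived child and count its syscall entry/exit stops.
 * Returns the number of stops, or -1 if tracing could not be set up. */
long count_syscall_stops(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);            /* tracing not permitted here */
        raise(SIGSTOP);            /* pause until the parent resumes us */
        getppid();                 /* a syscall for the tracer to see */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;                 /* child exited instead of stopping */

    long stops = 0;
    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) == -1) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);
            return -1;
        }
        if (waitpid(pid, &status, 0) != pid)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return stops;          /* child is gone and already reaped */
        stops++;
    }
}
```

A real tracer would call PTRACE_GETREGS at each stop to read the syscall number and arguments, and would forward any signals it intercepts instead of discarding them.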
System Calls: Advanced
1. To eliminate the severe performance bottleneck of triggering legacy software interrupts (`int 0x80`), modern x86-64 processors introduced which specialized instruction pair for rapid privilege transitions?
Correct answer: D: `SYSCALL` and `SYSRET`
SYSCALL/SYSRET (defined by AMD and available on every x86-64 CPU in 64-bit mode) replace the slow INT/IRET mechanism. SYSCALL saves RIP→RCX and RFLAGS→R11, loads RIP from the LSTAR MSR (the kernel entry point), loads fixed CS/SS selectors, and switches to Ring 0, all without touching memory; it does not switch stacks, so the kernel's entry code loads the kernel stack pointer itself. SYSRET restores RIP from RCX and RFLAGS from R11 and returns to Ring 3. Intel's SYSENTER/SYSEXIT filled a similar role for 32-bit Linux. The transition costs on the order of tens of clock cycles, and the speedup over INT 0x80 is commonly cited as around 5× for syscall-heavy workloads.
2. What is the vDSO (Virtual Dynamically linked Shared Object) mechanism used for in modern Linux kernels?
Correct answer: A: It maps safe, fast kernel routines directly into user space, allowing certain read-only system calls to execute without the costly overhead of switching into Kernel Mode
The vDSO is a small ELF shared library that the kernel maps into every process's address space at a randomized (ASLR-compliant) address. It contains kernel-provided implementations of frequently called, read-only system calls: primarily gettimeofday(), clock_gettime(), time(), and getcpu(). Since reading the current time only requires reading kernel variables (not modifying kernel state), the vDSO lets these calls execute entirely in user space without a SYSCALL trap. On modern x86-64, clock_gettime(CLOCK_MONOTONIC) through the vDSO typically takes a few tens of nanoseconds at most, versus on the order of 100 ns or more for a real kernel entry; exact figures vary with the CPU and enabled mitigations.
3. Which of the following system calls is the prime candidate for optimization via the vDSO mechanism because it only requires reading non-sensitive kernel state?
Correct answer: C: `gettimeofday()`
gettimeofday() and clock_gettime() are the classic vDSO candidates because they only read kernel timekeeping variables, which the kernel exposes in a read-only shared data page (the vvar page, successor to the legacy fixed-address vsyscall page). The kernel updates these variables on timer ticks; the vDSO implementation obtains a consistent snapshot using a sequence lock (seqlock), with no ring transition needed. By contrast, fork() must create a new process (it modifies the process table), write() must copy into kernel buffers (it modifies kernel state), and kill() must modify the target's signal state, so all three require Ring 0.
4. What is the function of the `seccomp` (Secure Computing) facility in the Linux kernel?
Correct answer: B: It acts as an application firewall, severely restricting the specific system calls a process is allowed to make to minimize its attack surface if compromised
seccomp (Secure Computing) was originally a single strict mode (SECCOMP_MODE_STRICT) allowing only read(), write(), exit(), and sigreturn(). Modern seccomp-bpf (SECCOMP_MODE_FILTER) lets a process install classic BPF programs that filter syscalls: the filter inspects the syscall number and the raw argument values (it cannot dereference pointers), then allows the call, fails it with a chosen errno (SECCOMP_RET_ERRNO), kills the thread or process, raises SIGSYS, notifies a tracer, or logs the call. Chrome uses seccomp-bpf to sandbox renderer processes; Docker/runc applies a seccomp profile to containers; OpenSSH uses it as well. If a sandbox escape is attempted via an unexpected syscall, a kill policy terminates the process before the call can do any harm.
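The original strict mode is small enough to demonstrate in a few lines. In this sketch a forked child enables SECCOMP_MODE_STRICT and then issues getppid(), which is not among the four permitted calls. It assumes Linux with seccomp support enabled and usable in the current environment; `strict_mode_kills_child` and the exit code 42 are hypothetical choices.

```c
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that enters strict mode and then makes a forbidden call.
 * Returns 1 if the kernel killed it with SIGKILL, 0 if strict mode could
 * not be enabled, -1 on other failures. */
int strict_mode_kills_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(42);                 /* seccomp unavailable here */
        syscall(SYS_getppid);          /* not on the strict allow-list */
        syscall(SYS_exit, 1);          /* reached only if it was allowed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 42)
        return 0;
    return -1;
}
```

The fallback uses the raw `exit` syscall rather than `_exit()`, because glibc's `_exit()` issues `exit_group`, which strict mode also forbids. Filter mode follows the same idea but replaces the fixed allow-list with a BPF program.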
5. How does a classic "Kernel-level Rootkit" subvert the operating system by manipulating system calls?
Correct answer: C: By overwriting function pointers within the `sys_call_table`, redirecting standard system calls to malicious functions that hide processes, files, or network connections
Classic kernel rootkits operate by corrupting the sys_call_table, a kernel-space array of function pointers indexed by syscall number. A rootkit with kernel code execution (via a driver exploit or a kernel module) overwrites, for example, the sys_call_table entry for __NR_getdents64 with a pointer to a malicious function that omits certain directory entries (hiding files or processes). Modern defenses include mandatory module signing (CONFIG_MODULE_SIG_FORCE); placing the table in read-only kernel memory, which CR0.WP makes the kernel itself honor; no longer exporting the table's symbol to modules; Secure Boot, which prevents booting unsigned kernels; and runtime integrity monitoring that detects table modifications.
6. What fundamental architectural rule prevents a User Mode application from executing the `CLI` (Clear Interrupts) instruction?
Correct answer: B: `CLI` is a privileged instruction that alters fundamental hardware control states, strictly requiring execution within Ring 0 (Kernel Mode)
CLI (Clear Interrupt Flag) disables maskable hardware interrupts by clearing the IF flag in EFLAGS. This is an inherently dangerous operation: if User Mode code could disable interrupts, it could prevent the scheduler from preempting it, causing system-wide hangs. On x86, CLI is allowed only when CPL ≤ IOPL; operating systems run user code with IOPL = 0, so executing it at Ring 3 raises a General Protection Fault (#GP). STI is restricted the same way, and instructions such as HLT, LGDT, LIDT, MOV to CR0, and WRMSR are fully privileged (CPL 0 only). Together these rules enforce the isolation that makes preemptive multitasking possible.
7. Which specialized system call provides "Zero-Copy" network performance by transferring data directly from a file descriptor to a socket entirely within kernel space?
Correct answer: D: `sendfile()`
sendfile() transfers data between two file descriptors inside the kernel, straight from the page cache; no data is copied to user space. The traditional path (read() + write()) requires four copies: disk→kernel buffer, kernel→user buffer, user→socket buffer, and socket buffer→NIC. sendfile() eliminates the two copies that cross the user-space boundary, and with scatter-gather DMA the NIC can read directly from the page cache (disk→kernel buffer→NIC). Apache and Nginx use sendfile() for static file serving; proxies such as HAProxy can use the related splice() call. Since Linux 2.6.33, out_fd may be any file rather than only a socket; in_fd must still support mmap()-like operations, so it cannot be a socket. Throughput improvements of roughly 2-4× are often reported for large file transfers, though results depend on hardware and workload.
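A minimal sendfile() call looks like this. For the sake of a self-contained example the destination is a second regular file rather than a socket, which assumes Linux 2.6.33 or later and a writable `/tmp`; the function name, temp-file templates, and 256-byte verification buffer are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <unistd.h>

/* Copy `msg` from one temporary file to another with sendfile(), then
 * read the destination back to verify it. Returns 0 on success, -1 on
 * failure. */
int sendfile_copy_demo(const char *msg)
{
    char src[] = "/tmp/sf_src_XXXXXX", dst[] = "/tmp/sf_dst_XXXXXX";
    char back[256];
    size_t len = strlen(msg);
    if (len == 0 || len > sizeof back)
        return -1;

    int in = mkstemp(src);
    int out = mkstemp(dst);
    int rc = -1;
    if (in >= 0 && out >= 0 && write(in, msg, len) == (ssize_t)len) {
        off_t off = 0;                        /* start of the source file */
        /* The kernel moves the bytes; they never enter a user buffer. */
        ssize_t sent = sendfile(out, in, &off, len);
        if (sent == (ssize_t)len &&
            pread(out, back, len, 0) == (ssize_t)len &&
            memcmp(back, msg, len) == 0)
            rc = 0;
    }
    if (in >= 0)  { close(in);  unlink(src); }
    if (out >= 0) { close(out); unlink(dst); }
    return rc;
}
```

In a web server, `out` would be the connected client socket and the call would usually sit in a loop, since sendfile() may transfer fewer bytes than requested.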
8. How do modern operating systems drastically optimize the performance of the `fork()` system call using Copy-on-Write (CoW)?
Correct answer: A: The parent and child initially share the exact same read-only memory pages; a duplicate physical page is only created when one of the processes attempts to modify the data
Copy-on-Write (CoW) keeps fork() fast even for large address spaces, because only the page tables are duplicated, not the pages themselves. After fork(), both parent and child point to the same physical page frames, and the kernel marks the shared writable pages read-only in both page tables. When either process writes to such a page, the MMU raises a page fault; the kernel allocates a new physical frame, copies the original page, maps the copy read-write in the writing process, and leaves the original for the other. CoW is critical for shell scripting: `bash` may fork() thousands of times, and CoW means only the pages actually modified (typically a few stack and data pages) are ever copied.
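The visible effect of CoW is that a write in the child never shows up in the parent, even though both started out on the same physical page. The sketch below demonstrates that isolation; it cannot observe the page copy itself, and `cow_isolation_demo` is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;    /* same virtual address in parent and child */

/* The child overwrites the variable and reports its value through the
 * exit status; the parent then checks that its own copy is unchanged.
 * Returns 1 if the child saw 99 and the parent still sees 1. */
int cow_isolation_demo(void)
{
    counter = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter = 99;      /* write fault: the child gets a private page */
        _exit(counter);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 99 && counter == 1;
}
```

Only the page holding `counter` needs to be duplicated; every page the child never writes stays shared with the parent.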
9. What is the exact sequence of system calls and kernel actions required to successfully clean up a "Zombie" process?
Correct answer: A. The parent invokes `wait()`, the kernel reads the terminated child's exit status, and the kernel finally deallocates the child's Process Control Block (PCB) from the process table
A zombie (defunct) process has exited, but its PCB (task_struct on Linux) remains in the process table because its exit status has not been collected yet. The cleanup sequence is: (1) the child calls exit(); the kernel frees its memory and closes its file descriptors but keeps the PCB with the exit status and PID; (2) the kernel sends SIGCHLD to the parent; (3) the parent calls wait()/waitpid(), and the kernel hands over the exit status, which the parent decodes with macros such as WIFEXITED and WEXITSTATUS; (4) the kernel removes the PCB from the process table and frees the PID. Only a wait() call by the parent can reap a zombie. If the parent exits first, the zombie is reparented to init (PID 1) or a designated subreaper, which reaps it. SIGKILL has no effect on a zombie, because the process has already terminated.
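The four steps above can be watched from user space. This is a rough sketch, assuming Linux with /proc mounted; `proc_state` and `zombie_demo` are hypothetical helper names, and the exit code 7 is arbitrary.

```python
import os
import time

def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        # Format is "pid (comm) state ..."; comm may itself contain ')'.
        return f.read().rsplit(")", 1)[1].split()[0]

def zombie_demo(exit_code=7):
    """Return (state_before_wait, exit_status, gone_after_wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)              # step 1: the child terminates
    # Until the parent waits, the child lingers as a zombie ("Z").
    deadline = time.monotonic() + 5.0
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)
    _, status = os.waitpid(pid, 0)       # step 3: collect the exit status
    gone = not os.path.exists(f"/proc/{pid}")  # step 4: PCB and PID released
    return state, os.WEXITSTATUS(status), gone
```

The polling loop only covers the short window in which the child has been created but has not finished exiting yet.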
10. In the Windows Operating System, what is the "Native API" (NTAPI)?
Correct answer: D. A set of mostly undocumented, internal system calls handled by `ntdll.dll`, traditionally prefixed with `Nt` or `Zw`, which sit below the standard Win32 API
The Windows Native API (NTAPI) is the actual kernel interface, the closest equivalent of Linux's raw syscall table. ntdll.dll exports user-mode stubs such as NtCreateFile, NtReadFile, NtWriteFile, NtCreateProcess, and ZwQuerySystemInformation; each stub loads a system service number and enters the kernel, where the call is implemented. The standard Win32 API (kernel32.dll) calls into ntdll.dll, which performs the transition with the SYSCALL instruction on x64. Microsoft officially documents only part of the NTAPI (for example in winternl.h), but the community has documented much of the rest (for example in the ReactOS source code). Malware and security tools often call the NTAPI directly to bypass Win32 API hooks placed by EDR software.
11. How does the Linux `clone()` system call differ from the traditional `fork()` system call?
Correct answer: B. `clone()` provides fine-grained control over exactly what execution context (memory space, file descriptors, signal handlers) is shared between the parent and child, serving as the underlying mechanism for implementing threads
clone() is the generalized process/thread creation syscall on Linux. Its flags parameter controls exactly what is shared: CLONE_VM (share the address space, as threads do), CLONE_FILES (share the file descriptor table), CLONE_FS (share the current directory, root, and umask), CLONE_SIGHAND (share signal handlers), CLONE_THREAD (join the caller's thread group), and CLONE_NEWPID / CLONE_NEWNET / CLONE_NEWNS (create new namespaces, as containers do). On Linux, glibc implements fork() as a clone() call with no sharing flags and SIGCHLD as the termination signal. The POSIX thread library (NPTL) calls clone() with CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD, plus a few flags for thread-local storage and thread ID bookkeeping.
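The practical difference between the two flag sets shows up in process and thread IDs. The sketch below assumes Linux and Python 3.8+ (for `threading.get_native_id()`); the helper names are hypothetical. A thread created with CLONE_THREAD reports the same PID as its creator but has its own kernel thread ID, while a forked child gets a new PID.

```python
import os
import threading

def thread_ids():
    """Start a thread and return (pid, tid) as seen from inside it."""
    seen = {}

    def worker():
        seen["pid"] = os.getpid()                 # shared thread group ID
        seen["tid"] = threading.get_native_id()   # per-thread kernel ID

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return seen["pid"], seen["tid"]

def forked_child_pid():
    """Fork a child and return the PID the child reports for itself."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        os.write(write_fd, str(os.getpid()).encode())
        os._exit(0)
    os.close(write_fd)
    reported = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return reported
```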
12. What hardware vulnerability forced OS developers to implement KPTI (Kernel Page-Table Isolation), significantly increasing the overhead (latency) of executing system calls?
Correct answer: C. Meltdown
Meltdown (CVE-2017-5754), which affects many Intel CPUs and some ARM cores, exploited speculative execution to read kernel-mapped memory from user space, breaking the fundamental kernel/user isolation. The kernel fix was KPTI: while a process runs in User Mode, the CR3 register points to a page table with almost no kernel mappings (only the syscall/interrupt entry stubs); on entry to Kernel Mode, CR3 is switched to the full kernel page table. Without PCID, every CR3 switch flushes the non-global TLB entries, which adds on the order of 30-200 ns per syscall. PCID (Process-Context Identifiers) tags TLB entries with an address-space ID, so modern CPUs can switch CR3 without a full flush.
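On Linux 4.15 and later, the kernel reports whether the Meltdown mitigation is active through sysfs. A minimal sketch for reading it follows; the function name `meltdown_status` is hypothetical, and the function returns `None` when the file does not exist (older kernels, non-Linux systems, or restricted sandboxes).

```python
from pathlib import Path

def meltdown_status(path="/sys/devices/system/cpu/vulnerabilities/meltdown"):
    """Return the kernel's Meltdown status line, or None if unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None
```

Typical values include "Mitigation: PTI" when KPTI is enabled and "Not affected" on CPUs that are not vulnerable.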
13. Why are Direct Memory Access (DMA) transfers considered highly efficient compared to performing repetitive `read()` and `write()` system calls?
Correct answer: B. DMA utilizes a dedicated controller to move data between peripherals and memory bypassing the CPU entirely, eliminating the heavy context-switching overhead of continuous system calls
Without DMA, the CPU must move every word itself (programmed I/O, PIO): it reads a word from the device's I/O port into a register and then stores it to memory, spending CPU cycles on the whole transfer. With DMA, the driver programs the device (or a DMA controller) with a buffer address and length, the device writes directly to system RAM over the memory bus, and the CPU is notified by an interrupt only when the entire transfer is complete. A 1 MB NVMe read therefore costs the CPU almost nothing during the data movement itself, only the setup and one completion interrupt, whereas PIO would need more than a hundred thousand load/store instructions. Note that DMA does not replace read() and write(): those system calls still start the I/O, and the kernel driver uses DMA underneath to carry it out.
14. What modern Linux kernel technology allows users to safely execute sandboxed, event-driven programs directly inside the kernel without loading custom kernel modules or changing source code?
Correct answer: C. eBPF (Extended Berkeley Packet Filter)
eBPF is a kernel technology (the bpf() syscall first appeared in Linux 3.18 and was greatly extended throughout the 4.x series) that allows verified, sandboxed programs to run in the kernel. Programs are compiled to eBPF bytecode, checked by the in-kernel verifier (memory safety, guaranteed termination via bounded loops, no unchecked null-pointer dereferences), JIT-compiled to native code, and attached to hooks: tracepoints, kprobes, syscall entry/exit, and network packet paths. Typical uses are performance profiling (bpftrace, BCC), fast packet processing (XDP), container networking and observability (Cilium), and syscall tracing, all without writing kernel modules. Note that seccomp-bpf syscall filters use the older classic BPF (cBPF) instruction set rather than eBPF.
15. Which system call is foundational to the creation of Linux Containers (like Docker), allowing a process to disassociate parts of its execution context from its parent's namespaces?
Correct answer: A. `unshare()`
unshare() lets a process disassociate parts of its execution context by moving into new namespaces: CLONE_NEWPID (new PID namespace, in which the caller's next child becomes PID 1), CLONE_NEWNET (isolated network stack), CLONE_NEWNS (new mount namespace, an isolated filesystem view), CLONE_NEWUTS (separate hostname), CLONE_NEWIPC, and CLONE_NEWUSER. Container runtimes create the isolated environment with unshare() or with clone() using the same CLONE_NEW* flags, join existing namespaces with setns(), and then apply resource limits with cgroups (by writing to the cgroup filesystem, not via a dedicated syscall). Among the distractors, cgroups() is not a syscall and jail() belongs to FreeBSD.
16. To avoid the expensive kernel-trap overhead of acquiring a mutex, what highly optimized IPC mechanism do Linux environments use for uncontended locks?
Correct answer: D. `futex` (Fast Userspace Mutex), which resolves uncontended locks entirely in user space and only invokes a system call if a thread actually needs to block/wait
futex (Fast Userspace Mutex) uses a 32-bit integer in shared memory as the lock state. When the lock is uncontended, lock and unlock are single atomic CPU instructions (LOCK CMPXCHG on x86) executed entirely in user space: zero syscalls and a cost of only a few nanoseconds. Only when a thread fails to acquire the lock does it call futex(FUTEX_WAIT) to sleep in the kernel; the kernel rechecks the lock word before sleeping, so a concurrent unlock is not missed. On unlock, the lock word tells the releasing thread whether there may be waiters; if so, it calls futex(FUTEX_WAKE) to wake one. glibc's pthread mutexes, condition variables, and rwlocks are built on futex, and so is std::mutex in the common Linux C++ standard libraries. This design makes uncontended locking essentially free.
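The kernel side of this protocol can be probed directly. The sketch below is illustrative only: it assumes Linux on x86-64 or aarch64, hard-codes the futex syscall numbers for those two architectures (an assumption that does not hold elsewhere), and uses a hypothetical helper name `futex_probe`. It shows that FUTEX_WAIT refuses to sleep when the lock word no longer holds the expected value, and that FUTEX_WAKE with no waiters wakes nobody.

```python
import ctypes
import errno
import platform

# Assumed futex syscall numbers; they differ per architecture.
SYS_FUTEX = {"x86_64": 202, "aarch64": 98}
FUTEX_WAIT, FUTEX_WAKE = 0, 1

def futex_probe():
    """Return (wait_errno, woken), or None on an unsupported platform."""
    nr = SYS_FUTEX.get(platform.machine())
    if platform.system() != "Linux" or nr is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    word = ctypes.c_uint32(0)  # lock word: 0 means "unlocked" here

    # FUTEX_WAIT sleeps only if the word still equals the expected value (1).
    # The word is 0, so the kernel returns immediately with EAGAIN.
    ctypes.set_errno(0)
    rc = libc.syscall(ctypes.c_long(nr), ctypes.byref(word),
                      ctypes.c_int(FUTEX_WAIT), ctypes.c_uint32(1),
                      None, None, ctypes.c_int(0))
    wait_errno = ctypes.get_errno() if rc == -1 else 0

    # FUTEX_WAKE returns the number of waiters woken; nobody is waiting.
    woken = libc.syscall(ctypes.c_long(nr), ctypes.byref(word),
                         ctypes.c_int(FUTEX_WAKE), ctypes.c_int(1),
                         None, None, ctypes.c_int(0))
    return wait_errno, woken
```

The EAGAIN result is the recheck that closes the race between a failed user-space acquire and the call into the kernel.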
17. Which highly specialized Linux system call moves data between two file descriptors completely within kernel address space without ever copying the data to user space?
Correct answer: C. `splice()`
splice() transfers data between two file descriptors, at least one of which must be a pipe, by moving references to pages inside the kernel; the data is never copied to user space. To send a file to a socket, a program first splices from the file (its page cache pages) into a pipe, whose buffer is a ring of page references, and then splices from the pipe to the destination socket or file. This enables network proxying without user-space copies. tee() duplicates data from one pipe to another, also without user-space copies, and vmsplice() maps user-space memory pages into a pipe. Together these syscalls form Linux's zero-copy I/O toolkit.
18. When a Guest Operating System running inside a Virtual Machine attempts to execute a privileged system call, how does the underlying Type-1 Hypervisor maintain control?
Correct answer: A. The execution of the privileged instruction in non-root mode triggers a hardware-assisted "VM Exit" (or trap), returning control to the Hypervisor to emulate or handle the request
With Intel VT-x, the CPU has two operating modes: VMX root mode (the hypervisor) and VMX non-root mode (the guest, with its own rings 0 and 3). An ordinary system call made by a guest application traps only to the guest kernel and does not involve the hypervisor. When the guest executes a sensitive operation in non-root mode (for example VMCALL, port I/O, or, depending on configuration, a control-register access such as MOV CR3), the CPU triggers a "VM Exit": it saves key guest state in the VMCS (Virtual Machine Control Structure) and transfers control to the hypervisor's entry point. The hypervisor inspects the exit reason, emulates the operation, updates the guest state, and executes VMRESUME to return, all transparently to the guest OS. AMD-V works the same way in principle, using a VMCB and the VMRUN instruction instead.
19. What does the FUSE (Filesystem in Userspace) framework do with VFS system calls targeting a FUSE-mounted filesystem?
Correct answer: D. It catches VFS system calls (like `open`, `read`) in the kernel and routes them back up to a specialized user-space daemon to handle the actual file operations
FUSE allows user-space programs to implement filesystems without writing kernel drivers. When the kernel's VFS receives a syscall targeting a FUSE-mounted path (e.g., open("/mnt/sshfs/file")), the FUSE kernel module queues a request on /dev/fuse, where a user-space daemon (sshfs, NTFS-3G, EncFS, etc.) reads it via a kernel↔userspace message protocol. The daemon performs the actual operation (an SSH read, decryption, a cloud API call) and writes the result back, after which the original syscall returns. Each request therefore costs at least two extra kernel/user boundary crossings plus context switches compared with a native in-kernel filesystem, which is acceptable for network or encrypted filesystems where network or crypto latency dominates.
20. Under the System V AMD64 ABI standard used by Linux and macOS, how are the first six integer or pointer arguments passed to a system call?
Correct answer: B. They are placed sequentially into six specific CPU registers: RDI, RSI, RDX, R10, R8, and R9 (with the syscall number in RAX)
The Linux x86-64 syscall ABI specifies: RAX = syscall number; RDI = arg1; RSI = arg2; RDX = arg3; R10 = arg4; R8 = arg5; R9 = arg6. R10 is used instead of RCX because the SYSCALL instruction overwrites RCX with the return address (and R11 with the saved RFLAGS). The user-mode function-calling ABI passes arg4 in RCX, so the libc syscall wrapper moves it to R10 before executing SYSCALL. The return value comes back in RAX; values from -4095 to -1 encode a negated errno. RCX and R11 are clobbered by SYSCALL/SYSRET (SYSRET restores RIP from RCX and RFLAGS from R11). A system call takes at most six register arguments; anything more must be passed through a pointer to a structure in memory.
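The register convention can be exercised without writing assembly by going through libc's generic `syscall()` wrapper, which loads the number and arguments into the registers listed above. This is a hedged sketch: it assumes Linux, hard-codes the getpid syscall numbers for x86-64 and aarch64 only, and uses a hypothetical helper name `raw_getpid`.

```python
import ctypes
import os
import platform

# Assumed getpid syscall numbers; they differ per architecture.
SYS_GETPID = {"x86_64": 39, "aarch64": 172}

def raw_getpid():
    """Call getpid by number via libc syscall(); None if unsupported."""
    nr = SYS_GETPID.get(platform.machine())
    if platform.system() != "Linux" or nr is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    # On x86-64, syscall() places nr in RAX and shifts any remaining
    # arguments into RDI, RSI, RDX, R10, R8, R9 before executing SYSCALL.
    return libc.syscall(ctypes.c_long(nr))
```

The result should match `os.getpid()`, which reaches the same kernel entry point through the regular libc wrapper.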
Conclusion: Mastering System Calls
These 60 MCQs cover the full spectrum of system call mechanics: from identifying how a hardware trap elevates privilege to Ring 0, to understanding why the vDSO allows time reads without entering the kernel, and why seccomp-bpf is crucial for container security and sandboxing.
The key to mastering these questions is building a system-wide mental model: the kernel runs in Ring 0 with privileged hardware access, and system calls are the strictly controlled gateway into it. Once you understand the ABI conventions and the cost of each user/kernel transition, modern kernel features such as futex and the vDSO (which avoid syscalls) and KPTI (which deliberately makes them more expensive for the sake of security) become logical design decisions.
After completing this MCQ set, deepen your knowledge with the full System Calls Theory Notes and practice with OS Services MCQs to see these capabilities applied in broader operating system architectures.
Key Takeaways: System Calls
- System calls are the only sanctioned gateway between user applications and the kernel: all hardware access must pass through this controlled interface, which the CPU mode bit enforces.
- SYSCALL/SYSRET (x86-64) is several times faster than the legacy INT 0x80 mechanism (roughly 5× is often quoted) because it loads the entry point directly from the LSTAR MSR without traversing the IDT.
- vDSO maps kernel time-keeping routines into user space: clock_gettime() via the vDSO takes on the order of 5 ns versus about 100 ns via SYSCALL, roughly a 20× speedup for high-frequency time queries (exact figures vary by CPU and kernel).
- fork() + CoW: fork() is fast because Copy-on-Write defers page duplication until a write occurs; only modified pages ever need copying.
- seccomp-bpf provides application-level syscall filtering: Chrome, Docker, and OpenSSH all use it to limit which syscalls their sandboxed processes can invoke.
- KPTI (Kernel Page-Table Isolation) forces a CR3 page-table switch on every syscall, adding roughly 30-200 ns of overhead on pre-PCID hardware; it was introduced as a Meltdown mitigation.
- futex makes uncontended mutex locking essentially free (a user-space atomic operation only); the kernel is involved only when a thread actually needs to block.
- sendfile() and splice() implement zero-copy I/O: data moves between kernel buffers without any user-space copy, enabling Nginx-class throughput.
Quick Review & Summary
Use this table to consolidate System Call internals before or after attempting the questions above.
| Concept | Mechanism / Detail | Key Example |
|---|---|---|
| System Call Trigger | Software trap: SYSCALL (x86-64) or INT 0x80 (legacy x86) | read(), write(), fork() |
| Syscall Number (RAX) | Index into sys_call_table[] → kernel function pointer | RAX=1 → sys_write |
| Parameter Passing | Registers: RDI, RSI, RDX, R10, R8, R9; or pointer to block | Linux x86-64 ABI |
| Error Convention | Kernel returns -errno in RAX; the libc wrapper returns -1 and sets errno to the positive error code | ENOENT, EACCES, EAGAIN |
| fork() + CoW | Clones process; pages shared read-only until write triggers page fault | Shell scripting, exec() pattern |
| exec() family | Replaces current address space with new ELF; same PID | execve(), execl(), execlp() |
| vDSO | Kernel routines mapped into user space; no SYSCALL needed | clock_gettime(), gettimeofday() |
| seccomp-bpf | BPF filter on syscall number/args; ALLOW / KILL / ERRNO | Chrome renderer, Docker runtime |
| futex | User-space atomic lock; kernel only on contention | pthreads mutex, std::mutex |
| sendfile() / splice() | Zero-copy: page cache → socket/pipe, no user-space copy | Nginx, Apache static files |
| KPTI / Meltdown | CR3 switch per syscall to isolate kernel mappings; mitigated by PCID | Linux 4.15+ post-Meltdown |
| VM Exit | Privileged instruction in VMX non-root mode → hypervisor takes control via VMCS | KVM, VMware ESXi, Hyper-V |
Frequently Asked Questions
Q. How many System Calls MCQs are available on this page?
Q. What is the difference between a system call and an API in operating systems?
Q. What are the three methods for passing parameters to a system call?
Q. What is the difference between fork(), exec(), and clone() system calls?
Q. What is the vDSO and why is it needed?
Q. Are these System Calls MCQs suitable for GATE CS exam preparation?
Struggling with some questions? Re-read the full Theory Guide: System Calls