direct execution

limited direct execution

To make a program run as fast as one might expect, not surprisingly OS developers came up with a technique, which we call limited direct execution. The “direct execution” part of the idea is simple: just run the program directly on the CPU.

截屏2023-03-21 17.35.29.png

system call

why system calls look like procedure calls?

The simple reason: it is a procedure call, but hidden inside that procedure call is the famous trap instruction.

when you call open() (for example), you are executing a procedure call into the C library. Therein, whether for open() or any of the other system calls provided, the library uses an agreed-upon calling convention with the kernel to put the arguments to open() in well-known locations (e.g., on the stack, or in specific registers), puts the system-call number into a well-known location as well (again, onto the stack or a register), and then executes the aforementioned trap instruction. The code in the library after the trap unpacks return values and returns control to the program that issued the system call.

Thus, the parts of the C library that make system calls are hand-coded in assembly, as they need to carefully follow convention in order to process arguments and return values correctly, as well as execute the hardware-specific trap instruction. And now you know why you personally don’t have to write assembly code to trap into an OS; somebody has already written that assembly for you

code that runs in user mode is restricted in what it can do. For example, when running in user mode, a process can’t issue I/O requests; In contrast to user mode is kernel mode, which the operating system (or kernel) runs in. In this mode, code that runs can do what it likes, including privileged operations such as issuing I/O requests and executing all types of restricted instructions.

what should a user process do when it wishes to perform some kind of privileged operation, such as reading from disk? To enable this, virtually all modern hardware provides the ability for user programs to perform a system call.

The hardware provides different modes of execution to assist the operating system. Special instructions are provided to trap into the kernel and return-from-trap back to user-mode programs. Additionally, instructions are provided that allow the operating system to tell the hardware where the trap table resides in memory. To execute a system call, a program must execute a special trap instruction. When finished, the OS calls a special return-from-trap instruction.

how does the trap know which code to run inside the OS?

The kernel does so by setting up a trap table at boot time. One of the first things the OS thus does is to tell the hardware what code to run when certain exceptional events occur.

截屏2023-03-21 20.24.56.png

To specify the exact system call, a system-call number is usually assigned to each system call. This level of indirection serves as a form of protection; user code cannot specify an exact address to jump to, but rather must request a particular service via number.

Switching Between Processes

A Cooperative Approach: Wait For System Calls

A Non-Cooperative Approach: The OS Takes Control. timer interrupt. A timer device can be programmed to raise an interrupt every so many milliseconds; when the interrupt is raised, the currently running process is halted, and a pre-configured interrupt handler in the OS runs.

Saving and Restoring Context

This decision is made by a part of the operating system known as the scheduler;

If the decision is made to switch, the OS then executes a low-level piece of code which we refer to as a context switch. A context switch is conceptually simple: all the OS has to do is save a few register values for the currently-executing process (onto its kernel stack, for example) and restore a few for the soon-to-be-executing process (from its kernel stack).

Note that there are two types of register saves/restores that happen during this protocol. The first is when the timer interrupt occurs; in this case, the user registers of the running process are implicitly saved by the hardware, using the kernel stack of that process. The second is when the OS decides to switch from A to B; in this case, the kernel registers are explicitly saved by the software (i.e., the OS), but this time into memory in the process structure of the process.

截屏2023-03-21 20.59.32.png

The xv6 Context Switch Code

void swtch(struct context **old, struct context *new);
#
# Save current register context in old
# and then load register context from new.
.globl swtch
swtch:
# Save old registers
movl 4(%esp), %eax # put old ptr into eax
popl 0(%eax) # save the old IP
movl %esp, 4(%eax) # and stack
movl %ebx, 8(%eax) # and other registers
movl %ecx, 12(%eax)
movl %edx, 16(%eax)
movl %esi, 20(%eax)
movl %edi, 24(%eax)
movl %ebp, 28(%eax)

# Load new registers
movl 4(%esp), %eax # put new ptr into eax
movl 28(%eax), %ebp # restore other registers
movl 24(%eax), %edi
movl 20(%eax), %esi
movl 16(%eax), %edx
movl 12(%eax), %ecx
movl 8(%eax), %ebx
movl 4(%eax), %esp # stack is switched here
pushl 0(%eax) # return addr put in place
ret # finally return into new ctxt

concurrency issues

One simple thing an OS might do is disable interrupts during interrupt processing;

Operating systems also have developed a number of sophisticated locking schemes to protect concurrent access to internal data structures.

CPU scheduling

metrics

turnaround time. the time at which the job completes minus the time at which the job arrived in the system

response time as the time from when the job arrives in a system to the first time it is scheduled

scheduler

First In, First Out (FIFO)

The most basic algorithm we can implement is known as First In, First Out (FIFO) scheduling or sometimes First Come, First Served (FCFS).

Shortest Job First (SJF)

it runs the shortest job first, then the next shortest, and so on. SJF by our definition is a non-preemptive scheduler

Shortest Time-to-Completion First (STCF)

or Preemptive Shortest Job First (PSJF) scheduler. Any time a new job enters the system, the STCF scheduler determines which of the remaining jobs (including the new job) has the least time left, and schedules that one

Round-Robin (RR) scheduling

instead of running jobs to completion, RR runs a job for a time slice (sometimes called a scheduling quantum) and then switches to the next job in the run queue. RR is sometimes called time-slicing

the length of a time slice must be a multiple of the timer-interrupt period; thus if the timer interrupts every 10 milliseconds, the time slice could be 10, 20, or any other multiple of 10 ms.

Incorporating I/O

截屏2023-03-21 22.40.29.png

multi-level feedback queue

The MLFQ has a number of distinct queues, each assigned a different priority level. At any given time, a job that is ready to run is on a single queue. MLFQ uses priorities to decide which job should run at a given time: a job with higher priority (i.e., a job on a higher queue) is chosen to run.

more than one job may be on a given queue, and thus have the same priority. In this case, we will just use round-robin scheduling among those jobs

basic rules for MLFQ:

Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
Rule 2: If Priority(A) = Priority(B), A & B run in RR.

The key to MLFQ scheduling therefore lies in how the scheduler sets priorities. Rather than giving a fixed priority to each job, MLFQ varies the priority of a job based on its observed behavior.

How To Change Priority

Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
Rule 4a: If a job uses up an entire time slice while running, its priority is reduced (i.e., it moves down one queue).
Rule 4b: If a job gives up the CPU before the time slice is up, it stays at the same priority level.

The Priority Boost

avoid the problem of starvation. if there are “too many” interactive jobs in the system, they will combine to consume all CPU time, and thus long-running jobs will never receive any CPU time (they starve).

Rule 5: After some time period S, move all the jobs in the system to the topmost queue.

Better Accounting

how to prevent gaming of our scheduler?

The solution here is to perform better accounting of CPU time at each level of the MLFQ. Instead of forgetting how much of a time slice a process used at a given level, the scheduler should keep track; once a process has used its allotment, it is demoted to the next priority queue. Whether it uses the time slice in one long burst or many small ones does not matter. We thus rewrite Rules 4a and 4b to the following single rule:

Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).

Tuning MLFQ And Other Issues

One big question is how to parameterize such a scheduler. For example, how many queues should there be? How big should the time slice be per queue?

There are no easy answers to these questions, and thus only some experience with workloads and subsequent tuning of the scheduler will lead to a satisfactory balance.

TIP: AVOID VOO-DOO CONSTANTS (OUSTERHOUT’S LAW)

目录CONTENT

OSTEP-notes-virtualization-chap6-to-8