If a concurrent thread causes the condition to be true, that thread must either hold the lock on the condition before the sleeping thread acquired it, or after the sleeping thread released it in sleep. If before, the sleeping thread must have seen the new condition value, and decided to sleep anyway, so it doesn’t matter if it misses the wakeup. If after, then the earliest the waker could acquire the lock on the condition is after sleep acquires ptable.lock, so that wakeup’s acquisition of pt- able.lock must wait until sleep has completely finished putting the sleeper to sleep…
xv6 book
I thought I could read English. 🙂
A few days ago I tried to start working on Lab 4: Preemptive Multitasking. However, jeez, multitasking is harder than I expected… So I decided to take a look at LEC 10 and LEC 11 first. Well, sometimes when reading the “Scheduling” chapter of the xv6 book I felt like I understood every single word in a paragraph but still couldn’t figure out what it was talking about. So…I decided to go breadth-first! 🙂 I hope I will figure out how sleep() and wakeup() really work in the Lab.
Anyway let’s first take a look at this homework. This one is fairly simple.
Download and read uthread.c and uthread.asm:
#include "types.h"
#include "stat.h"
#include "user.h"
/* Possible states of a thread; */
#define FREE 0x0
#define RUNNING 0x1
#define RUNNABLE 0x2
#define STACK_SIZE 8192
#define MAX_THREAD 4
typedef struct thread thread_t, *thread_p;
typedef struct mutex mutex_t, *mutex_p;
struct thread {
int sp; /* saved stack pointer */
char stack[STACK_SIZE]; /* the thread's stack */
int state; /* FREE, RUNNING, RUNNABLE */
};
static thread_t all_thread[MAX_THREAD];
thread_p current_thread;
thread_p next_thread;
extern void thread_switch(void);
void
thread_init(void)
{
// main() is thread 0, which will make the first invocation to
// thread_schedule(). it needs a stack so that the first thread_switch() can
// save thread 0's state. thread_schedule() won't run the main thread ever
// again, because its state is set to RUNNING, and thread_schedule() selects
// a RUNNABLE thread.
current_thread = &all_thread[0];
current_thread->state = RUNNING;
}
static void
thread_schedule(void)
{
thread_p t;
/* Find another runnable thread. */
next_thread = 0;
for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
if (t->state == RUNNABLE && t != current_thread) {
next_thread = t;
break;
}
}
if (t >= all_thread + MAX_THREAD && current_thread->state == RUNNABLE) {
/* The current thread is the only runnable thread; run it. */
next_thread = current_thread;
}
if (next_thread == 0) {
printf(2, "thread_schedule: no runnable threads\n");
exit();
}
if (current_thread != next_thread) { /* switch threads? */
next_thread->state = RUNNING;
thread_switch();
} else
next_thread = 0;
}
void
thread_create(void (*func)())
{
thread_p t;
for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
if (t->state == FREE) break;
}
t->sp = (int) (t->stack + STACK_SIZE); // set sp to the top of the stack
t->sp -= 4; // space for return address
* (int *) (t->sp) = (int)func; // push return address on stack
t->sp -= 32; // space for registers that thread_switch expects
t->state = RUNNABLE;
}
void
thread_yield(void)
{
current_thread->state = RUNNABLE;
thread_schedule();
}
static void
mythread(void)
{
int i;
printf(1, "my thread running\n");
for (i = 0; i < 100; i++) {
printf(1, "my thread 0x%x\n", (int) current_thread);
thread_yield();
}
printf(1, "my thread: exit\n");
current_thread->state = FREE;
thread_schedule();
}
int
main(int argc, char *argv[])
{
thread_init();
thread_create(mythread);
thread_create(mythread);
thread_schedule();
return 0;
}
Doing uthread runs this C program. It initializes some data structures, creates two threads, and runs a scheduler. Each thread is described using a thread structure, including its saved stack pointer int sp, a stack (char stack[STACK_SIZE]) as well as a state (“enum” int state, one of FREE, RUNNING and RUNNABLE).
It maintains some global variables: current_thread, next_thread and a thread table, all_thread[MAX_THREAD].
thread_init() registers the current “thread” (main()) as all_thread[0] and marks it RUNNING.
thread_create() finds the next free thread_t in all_thread, sets up some registers on its stack so that later when this thread is scheduled for the first time, it will start at func(), just like a simple trapframe.
Finally the scheduler, thread_schedue(), iterates over all_thread to find the next RUNNABLE threads, updates current_thread and next_thread, change its state to RUNNING, then thread_switch to it.
If the current thread is the only RUNNABLE thread in all_thread, thread_schedule() sets next_schedule to zero, and returns back to the same thread. The next call to thread_schedule() is guaranteed to exit(), so the whole user process can terminate.
If an user thread wants to yield, it calls, well, thread_yield():
void
thread_yield(void)
{
current_thread->state = RUNNABLE;
thread_schedule();
}
thread_yield() marks the caller thread as RUNNABLE so that later thread_schedule() can find it and call it again. The “main thread” created in thread_init() sets itself to RUNNING instead and will never be scheduled again, which is intended.
We need to implement thread_switch(). My solution saves the status of the old thread, switch to the new stack, set current_thread to next_thread, set next_thread to zero, then load the status of the new thread. Nothing fancy:
.text /* Switch from current_thread to next_thread. Make next_thread * the current_thread, and set next_thread to 0. * Use eax as a temporary register; it is caller saved. */ .globl thread_switch thread_switch: /* we do whatever we want to eax */ /* save old */ pushal movl current_thread, %eax movl %esp, (%eax) /* switch stack */ movl next_thread, %eax movl (%eax), %esp /* set current_thread to next_thread */ movl %eax, current_thread /* set next_thread to 0 */ movl $0, next_thread /* load new */ popal ret
We run the program with one CPU:
peilin@PWN:~/6.828/2018/xv6-public$ make CPUS=1 qemu-nox
qemu-system-i386 -nographic -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp 1 -m 512
…
$ uthread
my thread running
my thread 0x2DA8
my thread running
my thread 0x4DB0
my thread 0x2DA8
my thread 0x4DB0
my thread 0x2DA8
my thread 0x4DB0
…
my thread 0x2DA8
my thread 0x4DB0
my thread: exit
my thread: exit
thread_schedule: no runnable threads
$
Good.
The only thing that bothered me for a while is: Where does thread_swtich() return to, really? We are not explicitly storing the %eip. Take the normal example of one user thread thread_switch()ing to another. From the C code we know that the calling sequence should be: (In the for loop of mythread(),) thread_yield(), thread_schedule() then thread_switch. Let’s follow this example in uthread.asm:
1b0: 83 ec 04 sub $0x4,%esp
1b3: ff 35 cc 8d 00 00 pushl 0x8dcc
1b9: 68 23 0a 00 00 push $0xa23
1be: 6a 01 push $0x1
1c0: e8 cb 04 00 00 call 690 <printf>
current_thread->state = RUNNABLE;
1c5: a1 cc 8d 00 00 mov 0x8dcc,%eax
1ca: c7 80 04 20 00 00 02 movl $0x2,0x2004(%eax)
1d1: 00 00 00
thread_schedule();
1d4: e8 27 ff ff ff call 100 <thread_schedule>
for (i = 0; i < 100; i++) {
1d9: 83 c4 10 add $0x10,%esp
1dc: 83 eb 01 sub $0x1,%ebx
1df: 75 cf jne 1b0 <mythread+0x20>
This is the for loop part. To my surprise, there’s no calls to thread_yield() at all – These highlighted lines do the same effect. Anyway, at 0x1d4 we call thread_schedule():
00000100 <thread_schedule>:
if (t->state == RUNNABLE && t != current_thread) {
100: 8b 15 cc 8d 00 00 mov 0x8dcc,%edx
next_thread = 0;
106: c7 05 d0 8d 00 00 00 movl $0x0,0x8dd0
10d: 00 00 00
for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
110: b8 a0 0d 00 00 mov $0xda0,%eax
if (t->state == RUNNABLE && t != current_thread) {
115: 83 b8 04 20 00 00 02 cmpl $0x2,0x2004(%eax)
11c: 74 22 je 140 <thread_schedule+0x40>
for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
11e: 05 08 20 00 00 add $0x2008,%eax
123: 3d c0 8d 00 00 cmp $0x8dc0,%eax
128: 75 eb jne 115 <thread_schedule+0x15>
if (t >= all_thread + MAX_THREAD && current_thread->state == RUNNABLE) {
12a: 83 ba 04 20 00 00 02 cmpl $0x2,0x2004(%edx)
131: 75 43 jne 176 <thread_schedule+0x76>
next_thread = 0;
133: c7 05 d0 8d 00 00 00 movl $0x0,0x8dd0
13a: 00 00 00
13d: c3 ret
13e: 66 90 xchg %ax,%ax
if (t->state == RUNNABLE && t != current_thread) {
140: 39 c2 cmp %eax,%edx
142: 74 da je 11e <thread_schedule+0x1e>
if (t >= all_thread + MAX_THREAD && current_thread->state == RUNNABLE) {
144: 3d c0 8d 00 00 cmp $0x8dc0,%eax
next_thread = t;
149: a3 d0 8d 00 00 mov %eax,0x8dd0
if (t >= all_thread + MAX_THREAD && current_thread->state == RUNNABLE) {
14e: 73 13 jae 163 <thread_schedule+0x63>
if (next_thread == 0) {
150: 85 c0 test %eax,%eax
152: 74 22 je 176 <thread_schedule+0x76>
next_thread->state = RUNNING;
154: c7 80 04 20 00 00 01 movl $0x1,0x2004(%eax)
15b: 00 00 00
thread_switch();
15e: e9 65 01 00 00 jmp 2c8 <thread_switch>
...
Highlighted lines are instructions actually executed. There’s no stack operations at all, and we jmp to thread_switch(), instead of calling it! Wow. This means next time when thread_switch() switches to this thread, it will ret from the call to thread_schedule (at 0x1d4), and resume executing at 0x1d9.
You may wonder what is my point for all of this. I’m just wondering: How did we tell the compiler to make sure that, whenever we jmp to thread_switch(), the stack top will always be the intended return address? You know, thread_schedule() has a local variable, thread_p t, but for some reason the compiler just didn’t make room for it on the stack (it used %eax instead).
I researched a little bit and learned about this is something called Tail Call Optimization (TCO), a kind of optimization performed by compilers. Indeed, in our case, thread_switch() is the last statement in thread_schedule() (in one of the if branches, of course). Well, maybe one day I’ll learn more about compilers. 🙂
See you next time!