BUAA OS 2023 Spring


Due to misunderstanding of some concepts and the meaning of some functions, I failed on lab 2 extra. So much thanks to the TA who helped me understand these stuffs. Thus, a reflection here, I have.

To be honest, the problem is not that difficult, but if you’re not clear with related concepts, you may get into much trouble. So, here I just want to demonstrate some key points of it.

Here is the original problem description, you can expand it optionally.

Problem

image-20230406202127222


To get a better understanding of this article, you can refer to my previous post here:

Traverse Page Table Entries

In this problem, to swap in or out a page need a complete traverse of the page table of current process. But why? Why not just a single page? Well, you know, multiple virtual page could be mapped to the same physical address, so we have to make sure all these pages are correctly marked. Therefore, a complete traverse is needed.

Then, you may ask again, there are so many page table entries, won’t it take too long? Emm… some page directories are not valid, so the page table entries will be skipped.

So, how do we traverse page table entries?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Traverse page table entries.
const int PAGE_ENTRY_CNT = 1024;
for (int pdx = 0; pdx < PAGE_ENTRY_CNT; pdx++)
{
if (!(pgdir[pdx] & PTE_V)) // skip invalid page directory entry
continue;

// The address of page table should be kernel address for os access,
// but the content of page table entry is physical address.
Pte* pgtbl = (Pte*)KADDR(PTE_ADDR(pgdir[pdx]));
for (int ptx = 0; ptx < PAGE_ENTRY_CNT; ptx++)
{
if (!(pgtbl[ptx] & PTE_V))
continue;
// operations on valid page table entry
}
}

Flush TLB

Another point that is difficult to understand is the action to clear old TLB entry. The structure of TLB entry is as follows.

image-20230406204026763

VPN is virtual page number, if we takes it out and make low 12 bits zero, it will be the start virtual address of a virtual page. The same goes with PFN, which is Page Frame number, a.k.a. Physical Page number (PPN).

To invalidate a page directory entry, we actually flush the VPN it mapped to. Yeah, the VPN it mapped to, but how to get it? In pgdir_walk() function, we create or get a page table entry of the given va, which is exactly what we want here. So, again, what does pgdir_walk() really mean?

pgdir_walk()

I found that I didn’t understand this function before. Here is the definition of this tricky function.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
/*
* Overview:
* Given 'pgdir', a pointer to a page directory, 'pgdir_walk' returns a pointer
* to the page table entry (with permission PTE_D|PTE_V) for virtual address 'va'.
*
* Pre-Condition:
* 'pgdir' is a two-level page table structure.
*
* Post-Condition:
* If we're out of memory, return -E_NO_MEM.
* Otherwise, we get the page table entry, store the value of page table entry
* to *ppte, and return 0, indicating success.
*/
static int pgdir_walk(Pde* pgdir, u_long va, int create, Pte** ppte);

Depending on the value of create, we can get the page table entry mapped to virtual address va, or create the missing page table. (Since we use two-level page table, where the second level won’t be created until needed.) And what we used here is the so called VPN, which could be further explained as PDX and PTX. Again, here is a picture of it.

image-20230406205510542

Get it? Are we getting it? (By Steve Jobs on iPhone’s release conference.)

The va here is exactly what we use later in tlb_invalidate(). What? You still don’t know what va is? Just take some time to read the previous post Page Directory Self Mapping.

So the purpose of pgdir_walk() is just to get the corresponding page table entry of the virtual address va, or create it if not allocated before.

tlb_invalidate()

Here is the definition of related functions. Now you should know what’s the correct va to pass.

1
2
3
4
5
6
7
8
// Invalidate the TLB entry with specified 'asid' and virtual address 'va'.
void tlb_invalidate(u_int asid, u_long va)
{
tlb_out(PTE_ADDR(va) | (asid << 6));
}

// Flush the TLB entry with EntryHi Register set to 'entryhi'.
extern void tlb_out(u_int entryhi);

In tlb_invalidate(), we called tlb_out() function, which is written in MIPS. To access TLB in MIPS, we use EntryHi and EntryLo registers. EntryHi as key and EntryLo as value. So, here, to flush the entry, we need the key to specify which entry to flush, and the key is what we call entryhi. As a parameter and the first parameter, it will be passed as $a0 in MIPS.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
LEAF(tlb_out)
.set noreorder
mfc0 t0, CP0_ENTRYHI # Store previous value for context restore.
mtc0 a0, CP0_ENTRYHI # Make CP0.EntryHi the parameter we passed.

# Get corresponding value of CP0.EntryHi and store it to CP0.Index register.
# nop has to be added for pipline execution reason.
nop
tlbp
nop

# Get the result of tlbp from CP0.Index
mfc0 t1, CP0_INDEX

.set reorder
# If no such entry, highest bit of Index will be set to 1, which
# makes it a negative value, and then skip flush operation.
bltz t1, NO_SUCH_ENTRY # bltz: branch (if) less than zero

.set noreorder
# Clear TLB entry by setting key and value to zero.
# Here, CP0.EntryHi and CP0.EntryLo0 are just a step stone for us
# to operate TLB, they will be used later in tlbwi
mtc0 zero, CP0_ENTRYHI
mtc0 zero, CP0_ENTRYLO0 # MIPS 3000 only has CP0.EntryLo0.
nop

# Write CP0.EntryLo0 and CP0.EntryLo1 to TLB
tlbwi

.set reorder

# These instructions will be executed whether there is such an entry or not.
NO_SUCH_ENTRY:
mtc0 t0, CP0_ENTRYHI # Restore old value.
j ra
END(tlb_out)

“The even page entries in the TLB (such as PFN0) come from CP0.EntryLo0. Similarly, odd page entries come from CP0.EntryLo1.” From MD00090-2B-MIPS32PRA-AFP-06.02, Page 32.

However, we use MIPS 3000, which doesn’t have such feature. It only has CP0.EntryLo0.

Here are some explanation to TLB related instructions.

tlbr (read): Using the value in CP0.Index to read the corresponding entry to CP0.EntryHi and CP0.EntryLo.

tlbwi (write): Using the value in CP0.Index to write the value of CP0.EntryHi and CP0.EntryLo to corresponding TLB entry.

tlbwr (write random): Write CP0.EntryHi and CP0.EntryLo to a random TLB entry. Here, a register with random value is used, which is in fact a counter.

tlbp (probe): Using the value in CP0.EntryHi as key (including VPN and ASID) to find corresponding entry in TLB. If there is such an entry, the index of it will be stored at CP0.Index. Other wise, the highest bit of CP0.Index will be set to 1 to make it a negative number, indicating a failure.

So, let get back to the problem one last time. Now, everything is clear. We traverse the page table entries, and get the legal ones. Then, do the … things.

As we know, “Talk is cheap, show me the code.” So, just show the code. ;)

Code

To make it shorter, I removed some blank lines. We’re really sorry about this. :(

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
// Interface for 'Passive Swap Out'
static void ensure_free_page(Pde* pgdir, u_int asid)
{
// Select a page to swap out to disk.
struct Page* pp = pa2page(SWAP_PAGE_BASE);
// Allocate swap page on disk.
u_char* da = disk_alloc();
for (int pdx = 0; pdx < PAGE_ENTRY_CNT; pdx++)
{
if (!(pgdir[pdx] & PTE_V))
continue;
Pte* pgtbl = (Pte*)KADDR(PTE_ADDR(pgdir[pdx]));
for (int ptx = 0; ptx < PAGE_ENTRY_CNT; ptx++)
{
if (!(pgtbl[ptx] & PTE_V))
continue;
if (PPN(pgtbl[ptx]) == page2ppn(pp))
{
u_long va = (pdx << PDSHIFT) | (ptx << PGSHIFT);
u_long perm = PTE_PERM(pgtbl[ptx]);
perm = PTE_CLR(perm, PTE_V);
perm = PTE_SET(perm, PTE_SWP);
pgtbl[ptx] = PTE_ADDR(da) | perm;
// Now you should know what 'va' means. :P
tlb_invalidate(asid, va);
}
}
}
memcpy(da, (void*)page2kva(pp), BY2PG);
LIST_INSERT_HEAD(&page_free_swappable_list, pp, pp_link);
}

struct Page* swap_alloc(Pde* pgdir, u_int asid)
{
// Must ensure there is page to swap.
if (LIST_EMPTY(&page_free_swappable_list))
ensure_free_page(pgdir, asid);

struct Page* pp = LIST_FIRST(&page_free_swappable_list);
LIST_REMOVE(pp, pp_link);
memset((void*)page2kva(pp), 0, BY2PG);

return pp;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
// Interfaces for 'Active Swap In'
static int is_swapped(Pde* pgdir, u_long va)
{
Pte* pte;
/*
* Not page_lookup() here, since the page we look for might be
* swapped out with PTE_V set to zero. This will be regarded as
* error by page_lookup(), which will then make pte NULL, causing
* the final return to be false, rather than true.
*/
pgdir_walk(pgdir, va, 0, &pte);
return pte && (*pte & PTE_SWP) && !(*pte & PTE_V);
}

static void swap(Pde* pgdir, u_int asid, u_long va)
{
// va may not be aligned by page, but this doesn't affect this lab.
va = ROUNDDOWN(va, BY2PG);

struct Page* pp = swap_alloc(pgdir, asid);
Pte* pte;

/*
* Find the pte corresponding to given va.
* Notice that, this va is one of the pte that mapped to the swapped out
* page, while all pte(s) that mapped to the swapped page contains the
* da we want.
*/
pgdir_walk(pgdir, va, &pte);
assert(pte);

u_char* da = (u_char*)PTE_ADDR(*pte);
memcpy((void*)page2kva(pp), da, BY2PG);

for (int pdx = 0; pdx < PAGE_ENTRY_CNT; pdx++)
{
if (!(pgdir[pdx] & PTE_V))
continue;
Pte* pgtbl = (Pte*)KADDR(PTE_ADDR(pgdir[pdx]));
for (int ptx = 0; ptx < PAGE_ENTRY_CNT; ptx++)
{
if (!((pgtbl[ptx] & PTE_SWP) && !(pgtbl[ptx] & PTE_V)))
continue;
if (PPN(pgtbl[ptx]) == PPN(da))
{
u_long vaddr = (pdx << PDSHIFT) | (ptx << PGSHIFT);
u_long perm = PTE_PERM(pgtbl[ptx]);
perm = PTE_SET(perm, PTE_V);
perm = PTE_CLR(perm, PTE_SWP);
pgtbl[ptx] = page2pa(pp) | perm;
tlb_invalidate(asid, vaddr);
}
}
}
disk_free(da);
}

Pte swap_lookup(Pde* pgdir, u_int asid, u_long va)
{
// If corresponding page is swapped out, swap it in
if (is_swapped(pgdir, va))
swap(pgdir, asid, va);

// Look up page table element.
// Here, the pte to va is sure to exist
Pte* ppte;
page_lookup(pgdir, va, &ppte);

return ppte == NULL ? 0 : *ppte;
}