MIT 6.s081 lab2.1–trace [clear thinking explanation version]

No.1 Preface

I am very excited. I worked through this lab twice by following a reference solution and still had no real understanding of it. This time I started over, reasoned through it on my own without the solution, and passed smoothly. Progress! One point worth stressing: read the lab instructions and the referenced code until you fully understand them.

At first, I didn’t understand what Professor Frans meant when he said that the examples were enough. Now I see that the hints really do contain almost everything you need.

No.2 Experimental Idea

The lab starts by asking for a trace system call that takes one argument, a mask. The requirements are clearly written: each bit of the mask selects a system call number to trace.

For example, mask = 2 is 0010 in binary. Bit 1 is set (counting from bit 0), so the system call with number 1 is traced. Keep this in mind; it becomes important later.
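To make the bit-to-call-number mapping concrete, here is a minimal user-space sketch. The helper traced_calls is hypothetical and not part of xv6; it only lists which call numbers a given mask selects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of xv6: writes the system call numbers
 * selected by `mask` into out[] (at most `cap` entries) and returns how
 * many were written. Bit n of the mask selects call number n. */
size_t traced_calls(uint32_t mask, int out[], size_t cap)
{
    size_t count = 0;

    assert(out != NULL);
    for (int num = 0; num < 32 && count < cap; num++) {
        if ((mask >> num) & 1u)
            out[count++] = num;
    }
    return count;
}
```

With mask = 2 this yields the single call number 1; with mask = 34 (binary 100010) it yields 1 and 5; with 2147483647 it yields all 31 numbers from 0 to 30.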

I will not repeat the routine steps from the lab guide for declaring trace in each file. The main steps are as follows:

2.1 How to utilize system calls?

The hint says: [implement the system call by saving its argument in a new variable in the proc structure]. This raises two questions: where does the new variable come from, and how does the system call use it?

First, look at the code of the structure proc in kernel/proc.h:

struct proc {
  struct spinlock lock;

  // p->lock must be held when using these:
  enum procstate state; // Process state
  struct proc *parent; // Parent process
  void *chan; // If non-zero, sleeping on chan
  int killed; // If non-zero, have been killed
  int xstate; // Exit status to be returned to parent's wait
  int pid; // Process ID

  // these are private to the process, so p->lock need not be held.
  uint64 kstack; // Virtual address of kernel stack
  uint64 sz; // Size of process memory (bytes)
  pagetable_t pagetable; // User page table
  struct trapframe *trapframe; // data page for trampoline.S
  struct context context; // swtch() here to run process
  struct file *ofile[NOFILE]; // Open files
  struct inode *cwd; // Current directory
  char name[16]; // Process name (debugging)
};

The fields describing a process fall into two groups: those that require p->lock and those that do not. None of them is the new variable, so we have to add it ourselves. Since the lab is built around a trace mask, a field named mask is the natural choice.

Fields that are private to a process, such as its kernel stack and name, are used without locking. The mask also belongs to this group and needs no special protection, so it goes at the bottom:

  char name[16]; // Process name (debugging)
  int mask; // Trace mask. The lab's example 2147483647 sets all 31 low bits, the largest value an int can hold.

Now the second question: how does the system call set the mask? The hint points to the examples in kernel/sysproc.c. Let’s take a look:

uint64
sys_wait(void)
{
  uint64 p;
  if(argaddr(0, &p) < 0)
    return -1;
  return wait(p);
}

First, we need to understand argaddr. The section "Code: System calls" in the "Traps, interrupts, and drivers" chapter of the official xv6 book explains it clearly. In short, argint/argaddr/argfd retrieve the nth system call argument from the trap frame as an int, a pointer, or a file descriptor. Let’s trace this one level deeper.

First, the address of p is passed to argaddr:

int
argaddr(int n, uint64 *ip)
{
  *ip = argraw(n);
  return 0;
}

Next, look up argraw:

static uint64
argraw(int n)
{
  struct proc *p = myproc();
  switch (n) {
  case 0:
    return p->trapframe->a0;
  case 1:
    return p->trapframe->a1;
  case 2:
    return p->trapframe->a2;
  case 3:
    return p->trapframe->a3;
  case 4:
    return p->trapframe->a4;
  case 5:
    return p->trapframe->a5;
  }
  panic("argraw");
  return -1;
}

The 0 we pass in means: read register a0 from the current process's trapframe. argaddr stores that value through the ip pointer and returns 0 to the if check. A negative return value would signal that fetching the argument failed, in which case sys_wait returns -1; otherwise it calls wait(). With this pattern, our sys_trace is easy to write:

uint64
sys_trace(void){
  int n;
  if(argint(0, &n) < 0) // trace takes an int argument, so fetch it with argint
    return -1;
  myproc()->mask = n; // Save the value from a0 in the new mask field of struct proc. myproc() returns the current process and is defined in kernel/proc.c.
  return 0;
}
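The path from register a0 to the mask field can be tried outside the kernel with a small mock. This is only a sketch under stated assumptions: the structs are cut down to the fields used here, myproc() returns a single fake process, and run_trace is a hypothetical driver that plays the role of the user-side trap.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-ins for the xv6 types; only the fields used here. */
struct trapframe { uint64_t a0, a1, a2, a3, a4, a5; };
struct proc { struct trapframe *trapframe; int mask; };

static struct trapframe mock_tf;
static struct proc mock_proc = { &mock_tf, 0 };

/* Stand-in for myproc(): this mock has exactly one "process". */
static struct proc *myproc(void) { return &mock_proc; }

/* Same shape as xv6's argraw: pick argument register n from the trapframe. */
static uint64_t argraw(int n)
{
    struct trapframe *tf = myproc()->trapframe;
    switch (n) {
    case 0: return tf->a0;
    case 1: return tf->a1;
    case 2: return tf->a2;
    case 3: return tf->a3;
    case 4: return tf->a4;
    case 5: return tf->a5;
    }
    assert(0 && "argraw: bad argument index");  /* xv6 panics here */
    return (uint64_t)-1;
}

static int argint(int n, int *ip)
{
    *ip = (int)argraw(n);
    return 0;
}

uint64_t sys_trace(void)
{
    int n;
    if (argint(0, &n) < 0)
        return (uint64_t)-1;
    myproc()->mask = n;
    return 0;
}

/* Hypothetical driver: pretend user code put `mask` in a0 and trapped. */
int run_trace(int mask)
{
    mock_tf.a0 = (uint64_t)mask;
    sys_trace();
    return mock_proc.mask;
}
```

Calling run_trace(32) leaves 32 in the mock process's mask, which is all sys_trace has to do: the printing happens later, in syscall().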

Next, as the hint suggests, look at fork(). Only the first part matters here:

// Create a new process, copying the parent.
// Sets up child kernel stack to return as if from fork() system call.
int
fork(void)
{
  int i, pid;
  struct proc *np;
  struct proc *p = myproc();

  // Allocate process.
  if((np = allocproc()) == 0){
    return -1;
  }

The child is a copy of the parent, so fork() first allocates a new process. Here np is the child and p is the current (parent) process. Further down, copy the mask from the parent to the child as the hint says:

  np->sz = p->sz;
  np->mask = p->mask; // Trace mask copied from parent to child
  np->parent = p;
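Why this one line matters can be seen in a small mock: without it, a child created by fork() would start with whatever mask its proc slot happened to hold, and tracing would stop at the first fork. The struct and the fork_copy helper below are hypothetical simplifications, not the xv6 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down process record with only the fields this sketch needs. */
struct proc {
    int pid;
    int mask;
    struct proc *parent;
};

/* Hypothetical stand-in for the copying part of fork(): the child np
 * inherits the trace mask of the parent p. */
void fork_copy(struct proc *np, struct proc *p, int new_pid)
{
    assert(np != NULL && p != NULL);
    np->pid = new_pid;
    np->mask = p->mask;   /* trace mask copied from parent to child */
    np->parent = p;
}
```

Applying fork_copy twice in a row (parent to child, child to grandchild) carries the same mask all the way down, which is how a traced shell command keeps tracing the processes it spawns.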

At this point, two-thirds of the experiment has been completed.

2.2 syscall()

The hint asks us to print the trace output and to add an array of system call names indexed by call number. At this point I was stumped for a moment: Print? How? Where? And what does the indexing mean? Let’s keep exploring, starting with syscall():

void
syscall(void)
{
  int num;
  struct proc *p = myproc();

  num = p->trapframe->a7;
  if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
    // NELEM gives the number of elements in a fixed-size array:
    // defs.h: #define NELEM(x) (sizeof(x)/sizeof((x)[0]))

    // In other words: the call number is > 0, lies within the bounds of
    // the syscalls array, and has a handler registered.
    // Call that handler and store its return value in a0.
    p->trapframe->a0 = syscalls[num]();
  } else {
    // Otherwise it is an unknown system call
    printf("%d %s: unknown sys call %d\n",
            p->pid, p->name, num);
    p->trapframe->a0 = -1;
  }
}

Our trace has not been added to the syscalls array yet, so calling it now would end up in the unknown-system-call branch. Add the extern declaration and the array entry above syscall():

extern uint64 sys_trace(void);
[SYS_trace] sys_trace,
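The effect of a missing table entry can be reproduced in user space. The sketch below is an assumption-laden miniature, not the kernel code: the call numbers and handlers are made up, the trapframe holds only a0 and a7, and it uses the standard C99 `[index] = value` form of designated initializers rather than the older GNU form without `=` that xv6 uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Call numbers made up for this sketch; xv6 keeps the real ones in
 * kernel/syscall.h. */
#define SYS_getpid 1
#define SYS_trace  3   /* slot 2 is deliberately left unregistered */

/* Only the two registers the dispatcher touches. */
struct trapframe { uint64_t a0, a7; };

static uint64_t sys_getpid(void) { return 42; }  /* dummy handlers */
static uint64_t sys_trace(void)  { return 0; }

/* Designated initializers: every index not listed is a null pointer. */
static uint64_t (*syscalls[])(void) = {
    [SYS_getpid] = sys_getpid,
    [SYS_trace]  = sys_trace,
};

/* Mirrors the shape of syscall(): number in a7, result out in a0. */
void dispatch(struct trapframe *tf)
{
    assert(tf != NULL);
    uint64_t num = tf->a7;
    if (num > 0 && num < NELEM(syscalls) && syscalls[num])
        tf->a0 = syscalls[num]();
    else
        tf->a0 = (uint64_t)-1;  /* unknown or unregistered call */
}
```

Number 2 is inside the array bounds but has no handler, so it takes the same failure path as number 99. That is exactly the situation of trace before its entry is added to the table.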

Now back to the questions from the start of this section. If your train of thought stalls here, go back to the beginning and work through it again.

No.3 System call process

So far I have filled in nearly everything by following the hints, but the overall picture is still fuzzy. When do we trap into the kernel, and where does it all start? Let’s review how the first process comes to life, as Professor Frans demonstrated in the lecture [video at 1:13:14]:

  1. Assembly

    # exec(init, argv)
    .globl start
    start:
            la a0, init
            la a1, argv
            li a7, SYS_exec
            ecall
    

    Load the exec system call number into a7, then ecall traps into the kernel.

  2. userinit creates the initial process and returns to user space, which executes the assembly above, issues the system call, and traps back into kernel space.

  3. In syscall(), num reads the system call number from a7, syscalls[num] is the matching handler function, and its return value is stored in a0.

  4. exec loads init, which completes the startup of the first process.

The implementation of trace follows the same path:

[Figure: flow of the trace system call]

With that picture in mind, it is clear how to finish.

The lab requires each traced system call to print one line just before it returns, containing the process ID, the name of the system call, and the return value. Start with the printf:

printf("%d: syscall %s -> %d\n", p->pid, ?, p->trapframe->a0);

But how do we fill in the system call name in the middle? This is where the hint's array of system call names, indexed by call number, comes in:

char *syscall_names[] = {
  "", "fork", "exit", "wait", "pipe",
  "read", "kill", "exec", "fstat", "chdir",
  "dup", "getpid", "sbrk", "sleep", "uptime",
  "open", "write", "mknod", "unlink", "link",
  "mkdir", "close", "trace",};
  // Note that call numbers start from 1, so index 0 is an empty string.
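The name lookup and the output format can be checked in isolation with standard C. This is a sketch, not the kernel code: syscall_name and format_trace_line are hypothetical helpers, snprintf stands in for the kernel's printf, and the bounds check is an extra precaution the lab solution does not strictly need because syscall() has already validated the number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static const char *const syscall_names[] = {
    "", "fork", "exit", "wait", "pipe",
    "read", "kill", "exec", "fstat", "chdir",
    "dup", "getpid", "sbrk", "sleep", "uptime",
    "open", "write", "mknod", "unlink", "link",
    "mkdir", "close", "trace",
};

/* Hypothetical helper: the name for a call number, or "unknown" when
 * the number falls outside the table. */
const char *syscall_name(int num)
{
    size_t count = sizeof(syscall_names) / sizeof(syscall_names[0]);

    if (num <= 0 || (size_t)num >= count)
        return "unknown";
    return syscall_names[num];
}

/* Hypothetical helper: formats one trace line into buf and returns
 * snprintf's result (the length of the full line). */
int format_trace_line(char *buf, size_t len, int pid, int num, long ret)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%d: syscall %s -> %ld\n",
                    pid, syscall_name(num), ret);
}
```

For pid 3, call number 5 and return value 1023, the formatted line reads "3: syscall read -> 1023" followed by a newline. Because the array index is the call number itself, the names must stay in the same order as the numbers in kernel/syscall.h.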

Almost done, but how do we decide what to trace? Recall that trace reports the system calls a process makes, and that the mask specifies which ones to report.

So shift 1 left by the number of the system call that was just made and AND it with the process's mask: if the result is non-zero, that call is traced. The complete modified syscall() is as follows:

void
syscall(void)
{
  int num;
  struct proc *p = myproc();

  char *syscall_names[] = {
  "", "fork", "exit", "wait", "pipe",
  "read", "kill", "exec", "fstat", "chdir",
  "dup", "getpid", "sbrk", "sleep", "uptime",
  "open", "write", "mknod", "unlink", "link",
  "mkdir", "close", "trace",};
  // Note that call numbers start from 1.

  num = p->trapframe->a7;
  if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
    p->trapframe->a0 = syscalls[num]();
    // What is tracing? Reporting the system calls a process makes.
    // How do we decide whether this call should be traced?
    // Check whether the process's mask has the bit for this call number set
    // (the number that the stub generated in user/usys.S placed in a7).
    if((1 << num) & p->mask){
      // I made a mistake here at first, using == instead of &.
      // & tests whether this one bit is set; == only matches when the mask
      // has exactly this bit set and no other.
      printf("%d: syscall %s -> %d\n", p->pid, syscall_names[num], p->trapframe->a0);
    }
  } else {
    printf("%d %s: unknown sys call %d\n",
            p->pid, p->name, num);
    p->trapframe->a0 = -1;
  }
}
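The == versus & mistake mentioned in the comments is easy to reproduce outside the kernel. The two function names below are hypothetical and exist only for this comparison.

```c
#include <assert.h>

/* Correct test: is the bit for call number `num` set in `mask`? */
int traced_with_and(int mask, int num)
{
    assert(num >= 0 && num < 31);  /* keep 1 << num within a signed int */
    return ((1 << num) & mask) != 0;
}

/* The mistaken test: true only when `mask` has that bit and no other. */
int traced_with_eq(int mask, int num)
{
    assert(num >= 0 && num < 31);
    return (1 << num) == mask;
}
```

With a single-call mask such as 32, both versions agree for call number 5, so the bug can go unnoticed. It shows up as soon as the mask selects more than one call: for mask 34 (calls 1 and 5) or 2147483647 (all calls), traced_with_eq never matches while traced_with_and does.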

Solved. [It took half a day.]