6.1810: Operating System Engineering Lab: Xv6 and Unix utilities By:Haostart

Foreword

After school starts, I have to prepare for finding an internship in one year. To lay a foundation, I will do MIT-6.S081, which is the 2022 version. The address is as follows
6.1810: Operating System Engineering Lab

Lab

Be sure to know the system calls of Xv6 before doing experiments!!!

Be sure to know the system calls of Xv6 before doing experiments!!!

1. sleep

Since sleep is already included in the system call, just handle some other details.
Pay attention to parameter quantity and legality judgment.
code show as below:

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int main(int argc, char *argv[])
{<!-- -->
  if(argc != 2){<!-- -->
    fprintf(2, "sleep: missing operand\\
");
    exit(1);
  }
  int time = atoi(argv[1]);//Convert string numeric type to int type
  sleep(time);
  exit(0);
}

There is atoi implementation in qemu’s libc:

2. pingpong

Write a program to make two processes perform a “ping-pong” through a pipe, that is, the parent process sends a byte to the child process, and the child process reads it and prints: received ping, and then sends a byte to the parent process, and the parent process After the process reads in, it prints: received pong, and finally exits.

Before doing this experiment, you must understand the pipe. The specific pipe implementation can be found in kernel/pipe.c. See the following program example to help understand.


The following is the relevant description of the program by book-riscv-rev3

The program calls pipe, creating a new pipe and recording the read and write file descriptors in the array p. After the fork, both the parent and child processes have file descriptors referencing the pipe. The child process calls close and dup, points file descriptor zero to the read end of the pipe, closes the file descriptor in p, and calls exec to run wc. When wc reads from its standard input, it actually reads from the pipe. The parent process closes the reading end of the pipe, writes data to the pipe, and then closes the writing end.
If no data is available, a read from the pipe will wait for data to be written or for all file descriptors referencing the write end to be closed; in the latter case, the read will return 0 as if the end of the data file had been reached . One reason for pipes is that reads block until no more new data can arrive, which is why it is important for the child process to close the write end of the pipe before executing wc above: if one of wc’s file descriptors points to the pipe’s On the writing side, wc will never see the end-of-file character.

Following this example, we can create a pipeline and perform some simple operations:

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int
main(int argc,char *argv[]){<!-- -->

    int p[2],pid;
    char buff[2];

    pipe(p); //system call, create pipe

    if(fork()==0){<!-- --> // =0 is a child process
        pid=getpid(); //Get process id

        read(p[0],buff,1);
        printf("%d: received ping\\
", pid);
        close(p[0]);
        write(p[1],buff,1);
        close(p[1]);
        // printf("%d: send pong\\
", pid);
        exit(0);
    }
    else{<!-- -->
        pid=getpid();
        write(p[1],buff + 1,1); //The parent process writes to the pipe
        close(p[1]);
        // printf("%d: send ping\\
", pid);
        wait(0); //Wait for the child process to end
        read(p[0],buff + 1,1);
        close(p[0]);
        printf("%d: received pong\\
", pid);
        exit(0);

    }
}

3. primes

The idea of writing a concurrent version of the prime sieve method implemented using pipes was proposed by Doug McIlroy, the inventor of Unix pipes. Your solution should be in the file user/primes.c.

Your goal is to use pipes and fork to build pipeline lines. The first process feeds the numbers 2 to 35 into the pipe. For each prime number, you would arrange to create a process that reads from its left neighbor through one pipe and writes to its right neighbor through another pipe. Due to xv6’s limited number of file descriptors and processes, the first process can be stopped at 35.

Some tips:

  1. Be careful to close file descriptors that are not needed by the process, otherwise your program will exhaust xv6’s resources before the first process reaches 35.
  2. Once the first process reaches 35, it should wait for the entire pipeline to terminate, including all children, grandchildren, etc. Therefore, the main prime process can only exit after all output has been printed and all other prime processes have exited.
  3. Tip: When the writing end of the pipe is closed, the read function returns zero.
  4. The easiest way is to write a 32-bit (4-byte) integer directly to the pipe instead of using formatted ASCII I/O.
  5. Processes in the pipeline should be created only when needed.
  6. Add the program to UPROGS in the Makefile.

It was the first time I encountered this kind of thing. It was very difficult to do it without any algorithm experience. I later said

4. find

Write a simple version of a UNIX finder that finds all files with a specific name in a directory tree

Before writing this, be sure to be familiar with the code of the ls.c file, especially the implementation of the ls function.

After understanding how to get various types of information about files, it’s a matter of implementation. I used a string matching algorithm here.

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/fs.h"

//Determine whether the string contains
int
containsSubstring(const char *path, const char *target) {<!-- -->
    int pathLength = strlen(path);
    int targetLength = strlen(target);

    for (int i = 0; i <= pathLength - targetLength; i + + ) {<!-- -->
        int j;
        for (j = 0; j < targetLength; j + + ) {<!-- -->
            if (path[i + j] != target[j]) {<!-- -->
                break;
            }
        }
        if (j == targetLength) {<!-- -->
            return 1; // Found the target string
        }
    }
    return 0; // Target string not found
}

//Here is a simple return of the file name
char*
fmtname(char *path)
{<!-- -->
  static char buf[DIRSIZ + 1];
  char *p;

  // Find first character after last slash.
  for(p=path + strlen(path); p >= path & amp; & amp; *p != '/'; p--)
    ;
  p + + ;

  // Return blank-padded name.
  if(strlen(p) >= DIRSIZ)
    return p;
  memmove(buf, p, strlen(p));
  memset(buf + strlen(p), 0, DIRSIZ-strlen(p));
// printf("buf:%s\\
",buf);
  return buf;
}

void
find(char *path,char *target)
{<!-- -->
  char buf[512],*p;
  int fd;
  struct dirent de;
  struct stat st;

  if((fd = open(path, 0)) < 0){<!-- -->
    fprintf(2, "find: cannot open %s\\
", path);
    return;
  }

  if(fstat(fd, & amp;st) < 0){<!-- -->
    fprintf(2, "find: cannot stat %s\\
", path);
    close(fd);
    return;
  }

  switch(st.type){<!-- -->
    
  case T_DEVICE:
    // printf("it's a device.\\
");
  case T_FILE:
    // printf("it's a file %s your find is %s\\
",fmtname(path),target);
    if (containsSubstring(fmtname(path),target)) {<!-- -->
      printf("%s\\
", path);
    }

    break;

  case T_DIR:
    while(read(fd, & amp;de, sizeof(de)) == sizeof(de)){<!-- -->
        if(de.inum == 0)
            continue;
        if (strcmp(de.name, ".") == 0 || strcmp(de.name, "..") == 0)
            continue;
        // Build a new path
        strcpy(buf, path);
        p=buf + strlen(path);
        *p + + ='/';
        strcpy(p, de.name);
        // printf("%s\t\t%d\\
",buf,strlen(buf));
        // strcat(buf, de.name);
        find(buf,target); //Recursive implementation
        memset(buf, 0, sizeof(buf)); //Be careful to clear it
    }
    break;
  }
  close(fd);
}

int
main(int argc, char *argv[])
{<!-- -->
  if(argc == 2){<!-- -->
    find(".",argv[1]);
    exit(0);
  }
  else if(argc==3){<!-- -->
    find(argv[1], argv[2]);
    exit(0);
  }
  else{<!-- -->
    fprintf(2, "Usage: find <path> <target>\\
");
    exit(1);
  }
}

5. xargs

Write a simple version of the UNIX xargs program. xargs is a tool for combining multiple commands. Simply put, it can accept strings through pipes, split the received strings into many parameters by spaces, and then pass the parameters behind it. command as the command line parameter of the following command

In the Shell of Unix-like operating systems, the pipe character |
Used to create a pipe (Pipe), which allows the standard output (stdout) of one command to be connected to the standard input (stdin) of another command, thereby achieving inter-process communication and data transfer.

The principle is as follows:

  1. When the shell interpreter encounters the | symbol, it creates a pipe.

  2. The Shell divides the command to be executed into two parts: the left command and the right command. The standard output of the command on the left is redirected to the writing end of the pipe (the output of the pipe), and the standard input of the command on the right is redirected to the reading end of the pipe (the input of the pipe).

  3. Next, the Shell creates two child processes, one for executing the command on the left and the other for executing the command on the right.

  4. The output of the command on the left is written to the pipe, while the command on the right reads input data from the pipe.

  5. The two sub-processes run in parallel, and the data generated by the left command is passed to the right command through a pipe, thereby realizing the transmission of data flow.

  6. When the command on the left finishes executing, it closes the write end of the pipe, which triggers the end condition of the command on the right. The command on the right continues to read data from the pipe until there is no more data to read.

  7. When the command on the right is executed, the Shell will wait for both child processes to end and wait for their exit status.

Simply put, the output on the left becomes the input on the right.
for example

$ echo hello too | xargs echo bye
bye hello too
$

Output hello too on the left and append it to echo bye to become echo bye hello too
The following is the code for reference

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/param.h"

#defineMAXARGS 128
#defineMAXLEN 128

int main(int argc, char *argv[]) {<!-- -->
    if (argc < 2) {<!-- -->
        fprintf(2, "Usage: %s [command [initial-arguments]]\\
", argv[0]);
        exit(1);
    }

    char *buf[MAXARGS];
    int j = 0;
    for (int i = 1; i < argc; i + + )
        buf[j + + ] = argv[i]; //This is the original parameter
    for (; j < MAXARGS; j + + ) //Prepare memory space for pipe input parameters
        buf[j] = (char *)malloc(sizeof(MAXLEN)); // Allocate memory for each argument

    j = argc - 1; //Represents the starting position of the string input by the pipe
    char buff;
    
    int cmd=0;

    while (1){<!-- -->
      int m = 0; //m is the string character index
      while ((cmd=read(0, & amp;buff, 1)) != 0) {<!-- -->
          if (buff == ' ') {<!-- --> //space goes to the next string
              buf[j + + ][m] = 0;
              m = 0;
          } else if (buff == '\\
') {<!-- --> //Exit on newline, execute a command whenever \\
 is read
              buf[j + + ][m] = 0;
              m=0;
              break;
          } else {<!-- --> //Read parameter characters
              buf[j][m + + ] = buff;
          }
      }
      buf[j] = 0;
      j=argc-1; //String index initialization
      int pid;
      if ((pid = fork()) == 0) {<!-- -->
        exec(buf[0], buf);
          
      } else if (pid < 0) {<!-- -->
          // Fork failed
          exit(1);
      } else {<!-- -->
          // Parent process
          wait(0);
      }
        if(cmd<=0) //All reading is completed, exit
        break;
    }

    exit(0);
}

The result of make grade is as follows (it will not prime Ehrstein sieve):


Summary

It is once again familiar with the basic syntax of C language, especially string-related operations.
The system calls of Xv6 are concise and clear in structure. It is suitable for beginners to learn.
I still have a lot of shortcomings in terms of algorithms.