BUAA 2023 Spring OS

image-20230618215842372

Primary architecture based on PassBash. Code available at:

Copyright © Tony’s Studio 2023


1. Overall Design

1.1 Demand Analysis

The guiding book may differ each year, and right now, as I’m writing, is 2023. However, the basic requirements for a Shell 🐚 are generally the same.

  1. Friendly user interaction.
  2. Command & Conquer argument parsing.
  3. Job scheduling.

I think the first one is quite comprehensive, since it is always difficult to provide better user experiences. Here, my understanding of a friendly Shell is that it must at least support free editing, history command and, completion perhaps.

So, I implemented Pash, with primary features as follows.

  1. Free editing experience. Not only direction key, but also Home, End, Delete, even Ctrl.
  2. Enhanced user experience. Provide clear error info, support command history, and auto-completion on Tab.
  3. Complete redirection. Add support for >>.
  4. Beautiful interface with ANSI colors.
  5. Clean and tidy directory structure.

1.2 Architecture

The original shell looks like this.

shell_old

And Pash, is like this. Only better. With internal command and external command unified, we can parse them altogether, and enable redirection and pipe for internal commands.

shell_new

Here is another figure showing all modules of Pash.

module


2. Implementation

All functions that are exclusive to Pash are declared in user/include/pash.h, and the definition of Pash is in user/pash.c, while auxiliary functions and other modules are in user/lib/pashlib.c.

2.1 Input Module

2.1.1 Raw Input

Input module is what I considered most well designed. It is powerful and flexible. For low-level input, it uses getch function that is similar to function with the same name in Windows conio.h. It rely on a system call.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// kern/syscall_all.c
int sys_getch(void)
{
return scancharc();
}

// user/lib/syscall_lib.c
int syscall_getch(void)
{
int ch = 0;
while ((ch = msyscall(SYS_getch)) == 0)
syscall_yield();
return ch;
}

2.1.2 Input Recognition

Well, in MOS, special keys are in form of ANSI code, just like what in Linux. So it could be a little annoying to recognize these keys. However, with finite state machine, it would be a problem no more. The state machine looks just like what is shown below, but some layers are omitted since there is only one node in those layers. Each leaf node is registered with a event handler corresponding to the input, and non-leaf node with a handler to determine the next node.

lab6-challenge-input

Well, in code, this state machine consists of these two types of functions. And to maintain a consistent input state, I use a input context structure. input_opt_t will be elaborated later.

1
2
3
4
5
6
7
8
9
10
11
12
13
// user/lib/pashlib.c
typedef void _input_action_t(const input_opt_t* opt, input_ctx_t* ctx);
typedef int _input_handler_t(const input_opt_t* opt, input_ctx_t* ctx);

typedef struct _input_ctx_t
{
char* buffer; // input buffer
int pos; // current cursor position
int length; // total input length
int ch; // next character to put into buffer
int index; // current history record index
int count; // total number of history records
} input_ctx_t __attribute__((aligned(32)));

Important: One thing to notice is that, for this struct, we have to use __attribute__((aligned(32))), or unexpected error will raise. It is sure to be a problem of byte alignment, but I could not explain it yet.

2.1.3 Input Options

One of the primary features of this input module is that is support custom options. This is the part I’m most satisfied with.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
typedef struct _input_history_t
{
int (*init)(int*); // initialize number of history records
int (*append)(const char*); // append new history record
int (*get)(int, char*); // get specific history record
} input_history_t __attribute__((aligned(16)));

// current input, completion, revert distance
typedef int (*input_competer_t)(const char*, char*, int*);

typedef struct _input_opt_t
{
int minLen; // minimum input length
int maxLen; // maximum input length
int interruptible; // whether can be interrupted if minimum length not satisfied
input_history_t* history; // history service callback
input_completer_t completer; // completion service callback
} input_opt_t __attribute__((aligned(32)));

Here, I’d like to show you some examples. Expand it as you wish.

Input Option Example

This is a piece of code extracted from history handling part of input. Some extra checks are removed to reduce passage length.

1
2
3
4
5
6
7
8
9
10
// command starts with white space will not be saved (like Linux)
if (opt.history && (!(*buffer == '\0' || *buffer == ' ')))
{
strstripr(buffer, ' '); // remove trailing white spaces
if (!is_the_same(buffer, last_record)) // do not record repeated command
{
opt.history->append(buffer);
strcpy(last_record, buffer);
}
}

And another one is about completion. It is registered to Tab key.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
static void _input_tab(const input_opt_t* opt, input_ctx_t* ctx)
{
if (!opt->completer) // no completion service registered
return;

char buffer[MAXPATHLEN];
int revert;
int ret = opt->completer(ctx->buffer, buffer, &revert);
if (ret == 0) // no available completions
return;

_input_end(opt, ctx); // set cursor to the end of input
while (revert-- > 0)
_input_backspace(opt, ctx);

const char* completion = buffer;
while (*completion)
{
ctx->ch = *completion;
_input_char(opt, ctx);
completion++;
}
}

For the concrete implementation of history and completion service, please refer to my git repository. 😉

2.2 Command Resolving

2.2.1 Command Parsing

Basic command parsing remains almost the same. However, it no longer do this in a forked child process. And the execution flow changes slightly for ; and & support. Mainly use hasNext and needWait flags to tell Shell to continue parsing, or end. Here, I only showed code essentially related to ; and &.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
static int _runcmd(char* cmd)
{
int hasNext;
int needWait;
do
{
hasNext = 0;
needWait = 1;
ret = _parsecmd(cmd, &argc, argv, &rightpipe);
if (ret > 0) // ; or &
{
hasNext = 1;
if (ret == 2) // &
needWait = 0;
cmd = NULL; // next turn will continue parse on previous cmd
}
argv[argc] = NULL;
int child = _execv(argv[0], argv);
if ((child > 0) && needWait)
wait(child);
if (rightpipe)
wait(rightpipe);
} while (hasNext);

return 0;
}

static int _parsecmd(char* cmd, int* argc, char* argv[], int* rightpipe)
{
for (; ; )
{
type = get_token(NULL, &token);
switch (type)
{
// ...
case TK_SEMI_COLON:
return 1;
case TK_AMPERSAND:
return 2;
// ...
}
}
return 0;
}

Here, the process of getting token in command also uses a state machine. (It is a good thing.)

lab6-challenge-parse

2.2.2 Command Execution

After parsing, shell will then execute the command. Here, both internal and external command are first handled in _execv of Pash itself. If it is not a internal command, Pash will use library function execv to execute the command.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// user/pash.c
static int _execv(char* cmd, char* argv[])
{
int ret = execvi(cmd, argv); // return -1 if internal command not exists
if (ret != -1)
return 0;

return execv(cmd, argv);
}

// user/lib/lib.c
int execv(const char* path, char* argv[])
{
char prog[1024] = "/bin/";

if (strchr(path, '/')) // user directory to call command
strcpy(prog, path);
else // default command will be looked up in /bin
strcat(prog, path);

if (is_ends_with(prog, ".sh")) // auto-recognize .sh script
return execl("pash", "pash", prog, NULL);
if (!is_ends_with(prog, ".b")) // auto-complete .b extention
strcat(prog, ".b");

return spawn(prog, argv);
}

2.2.3 Patch for Redirection

Since we moved command parsing in Pash process, and internal command also support redirection, so Pash it self will be affected by commands with redirection and pipe. Therefore, the standard input and output must be reset after the execution of these commands.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// user/pash.c
int main(int argc, char* argv[])
{
// ...
if (!interactive) // backup original I/O
{
backupfd[0] = dup(0, 3);
panic_on(backupfd[0] < 0);
backupfd[1] = dup(1, 4);
panic_on(backupfd[1] < 0);
}
// ...
}

static void _restore_stream()
{
if (!redirect) // redirect flag
return;

if (!interactive) // restore redirected I/O
{
dup(backupfd[0], 0);
dup(backupfd[1], 1);
}
else // restore standard console I/O
{
close(0);
panic_on(opencons() != 0);
panic_on(dup(0, 1) < 0);
}
}

2.2.4 Argument Parsing

The original MOS uses an argument parsing macro from Plan 9… It is good, but is almost unreadable. So here, I implemented getopt function to resemble the one in Linux. It’s a little long, so just refer to my repository. Here I just give you a simple example of how it is used. (Error handling are removed to reduce passage length.)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// user/tree.c
static int parse_args(int argc, char* argv[])
{
int opt;
while ((opt = getopt(argc, argv, "dh")))
{
switch (opt)
{
case 'd':
dirOnly = 1;
break;
case 'h':
showHelp = 1;
break;
case '!':
strcpy(targetPath, optarg);
break;
case '?':
printfc(ERROR_COLOR, "Unknown parameter \"-%c\"\n", optopt);
break;
default:
break;
}
}

return 0;
}

2.3 File System

2.3.1 Present Working Directory

This is simple, just place this stuff to PCB is OK.

1
2
3
4
5
struct Env
{
// ...
char env_pwd[1024];
};

Then, just implement some functions to handle this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// user/include/lib.h
int access(const char* path, int type); // type is used to perform extra check if exists
void getcwd(char* path);
int chdir(const char* path);

// kern/syscall_all.c
void sys_set_pwd(const char* path)
{
strcpy(curenv->env_pwd, path);
}

void sys_get_pwd(char* path)
{
strcpy(path, curenv->env_pwd);
}

2.3.2 Absolute Path

Once we have complicated path, it comes the problem that, how to get absolute path? It is impossible (at least for not) for a file to get its parent, even worse, we have to deal with path that looks like home/../../opt/./el.... So we have to request file system process to do this for us.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// user/lib/file.c
int fullpath(const char* filename, char* path)
{
char dir[MAXPATHLEN] = { '\0' };
_make_fullpath(filename, dir);

return fsipc_fullpath(dir, path);
}

// fs/fs.c
int file_fullpath(const char* path, char* fullpath)
{
return walk_fullpath(path, fullpath);
}

Actually, we can get the parent of a struct File using File::f_dir. But I failed to do so in user library function. I didn’t have time to figure out why, perhaps you can do it. 🧐

Anyway, when we only have filename, isn’t it more simpler to directly get the absolute path than open a directory, then traverse its parent?

2.3.3 Open Mode

To realize >>, and history function, we have to implement O_APPEND mode for opening file, and of course, O_CREAT also. What’s more, a detail when open with O_WRONLY, we should clear the content of the file.

2.4 Extra

2.4.1 Directory Structure

To make everything in place, I refactored file structure of MOS to resemble Linux. Isn’t it tidy?

image-20230618230931509

And we have to change how we burn files into MOS disk in fs/Makefile.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
FSIMGFILES := rootfs/init.b  \
rootfs/bin/ \
rootfs/home/ \
rootfs/etc/ \
$(fs-files)

image: $(tools_dir)/fsformat
dd if=/dev/zero of=../target/fs.img bs=4096 count=1024 2>/dev/null
mkdir -p rootfs/home/mos/
mkdir -p rootfs/bin

cp -vf $(USERAPPS) -t rootfs/home/mos/
cp -vf $(USERBINS) -t rootfs/bin
mv -vf rootfs/bin/init.b rootfs/init.b

$(tools_dir)/fsformat ../target/fs.img $$(printf '%s\n' $(FSIMGFILES))

2.4.2 User Profile

In Pash, we stored user profile in file, rather than hard code. And uses a library function to get this.

image-20230618231304158

1
2
// lib.h
int profile(char* username, char* path, int create);

2.4.3 Colorful Output

Here, we also use ANSI code to output colorful characters. And we wrapped it in printfc. (I renamed previous user/lib/fprintf.c to user/lib/printf.c.)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
// user/lib/printf.c
int printfc(int color, const char* fmt, ...)
{
printf("\033[%dm", color); // set color
va_list ap;
va_start(ap, fmt);
int r = vfprintf(1, fmt, ap);
va_end(ap);
printf("\033[0m"); // reset color
return r;
}

// user/include/lib.h
#define BLACK 0
#define RED 1
#define ... ...

#define FOREGROUND(COLOR) ((COLOR) + 30)
#define BACKGROUND(COLOR) ((COLOR) + 40)

#define FOREGROUND_INTENSE(COLOR) (FOREGROUND(COLOR) + 60)
#define BACKGROUND_INTENSE(COLOR) (BACKGROUND(COLOR) + 60)

3. Epilogue

Well, I guess this is a general overview of Pash. However, there are a lot pitfalls that I didn’t mention. Some of them were really obscure that I could not fully comprehend. But, it is truly exciting to see Pash running on MOS. 😁

image-20230618215754490

"When I left you, I was but the learner. Next time, I will be, the master."