.chapter CH:UNIX "Operating system interfaces"
.PP
The job of an operating system is to share a computer among
multiple programs and to provide a more useful set of services
than the hardware alone supports.
The operating system manages and abstracts
the low-level hardware, so that, for example,
a word processor need not concern itself with which type
of disk hardware is being used.
It also shares the hardware among multiple programs so
that they run (or appear to run) at the same time.
Finally, operating systems provide controlled ways for programs
to interact, so that they can share data or work together.
.PP
An operating system provides services to user programs through an interface.
.index "interface design"
Designing a good interface turns out to be
difficult. On the one hand, we would like the interface to be
simple and narrow because that makes it easier to get the
implementation right. On the other hand,
we may be tempted to offer many sophisticated features to applications.
The trick in
resolving this tension is to design interfaces that rely on a few
mechanisms that can be combined to provide much generality.
.PP
This book uses a single operating system as a concrete example to
illustrate operating system concepts. That operating system,
xv6, provides the basic interfaces introduced by Ken Thompson and
Dennis Ritchie's Unix operating system, as well as mimicking Unix's
internal design. Unix provides a
narrow interface whose mechanisms combine well, offering a surprising
degree of generality. This interface has been so successful that
modern operating systems—BSD, Linux, Mac OS X, Solaris, and even, to a
lesser extent, Microsoft Windows—have Unix-like interfaces.
Understanding xv6 is a good start toward understanding any of these
systems and many others.
.PP
As shown in
.figref os ,
xv6 takes the traditional form of a
.italic-index kernel ,
a special program that provides
services to running programs.
Each running program, called a
.italic-index process ,
has memory containing instructions, data, and a stack. The
instructions implement the
program's computation. The data are the variables on which
the computation acts. The stack organizes the program's procedure calls.
.PP
When a
process needs to invoke a kernel service, it calls a procedure
in the operating system interface. Such a procedure is called a
.italic-index "system call" .
The system call enters the kernel;
the kernel performs the service and returns.
Thus a process alternates between executing in
.italic-index "user space"
and
.italic-index "kernel space" .
.PP
The kernel uses the CPU's hardware protection mechanisms to
ensure that each process executing in user space can access only
its own memory.
The kernel executes with the hardware privileges required to
implement these protections; user programs execute without
those privileges.
When a user program invokes a system call, the hardware
raises the privilege level and starts executing a pre-arranged
function in the kernel.
.figure os
.PP
The collection of system calls that a kernel provides
is the interface that user programs see.
The xv6 kernel provides a subset of the services and system calls
that Unix kernels traditionally offer.
.figref api
lists all of xv6's system calls.
.PP
The rest of this chapter outlines xv6's services—\c
processes, memory, file descriptors, pipes, and file system—\c
and illustrates them with code snippets and discussions
of how the
.italic-index "shell" ,
which is the primary user interface to
traditional Unix-like systems, uses them.
The shell's use of system calls illustrates how carefully they
have been designed.
.PP
The shell is an ordinary program that reads commands from the user
and executes them.
The fact that the shell is a user program, not part of the kernel,
illustrates the power of the system call interface: there is nothing
special about the shell.
It also means that the shell is easy to replace; as a result,
modern Unix systems have a variety of
shells to choose from, each with its own user interface
and scripting features.
The xv6 shell is a simple implementation of the essence of
the Unix Bourne shell. Its implementation can be found at line
.line sh.c:1 .
.\"
.\" Processes and memory
.\"
.section "Processes and memory"
.PP
An xv6 process consists of user-space memory (instructions, data, and stack)
and per-process state private to the kernel.
Xv6 can
.italic-index time-share
processes: it transparently switches the available CPUs
among the set of processes waiting to execute.
When a process is not executing, xv6 saves its CPU registers,
restoring them when it next runs the process.
The kernel associates a process identifier, or
.code-index pid ,
with each process.
.figure api
.PP
A process may create a new process using the
.code-index fork
system call.
.code Fork
creates a new process, called the
.italic-index "child process" ,
with exactly the same memory contents
as the calling process, called the
.italic-index "parent process" .
.code Fork
returns in both the parent and the child.
In the parent,
.code-index fork
returns the child's pid;
in the child, it returns zero.
For example, consider the following program fragment:
.P1
int pid = fork();
if(pid > 0){
printf("parent: child=%d\en", pid);
pid = wait();
printf("child %d is done\en", pid);
} else if(pid == 0){
printf("child: exiting\en");
exit();
} else {
printf("fork error\en");
}
.P2
The
.code-index exit
system call causes the calling process to stop executing and
to release resources such as memory and open files.
The
.code-index wait
system call returns the pid of an exited child of the
current process; if none of the caller's children
has exited,
.code-index wait
waits for one to do so.
In the example, the output lines
.P1
parent: child=1234
child: exiting
.P2
might come out in either order, depending on whether the
parent or child gets to its
.code-index printf
call first.
After the child exits, the parent's
.code-index wait
returns, causing the parent to print
.P1
child 1234 is done
.P2
Although the child has the same memory contents as the parent initially, the
parent and child are executing with different memory and different registers:
changing a variable in one does not affect the other. For example, when the
return value of
.code wait
is stored into
.code pid
in the parent process,
it doesn't change the variable
.code pid
in the child. The value of
.code pid
in the child will still be zero.
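.PP
The separation of parent and child memory can be checked directly.
The following is a minimal sketch for a POSIX host rather than xv6
(POSIX
.code waitpid
takes arguments that xv6's
.code wait
does not); the function name is hypothetical.
The child changes its copy of a variable, and the parent then
reads its own copy.
.P1
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's copy of x after the child has changed its own
 * copy, or -1 if fork fails. */
int parent_value_after_child_write(void)
{
    int x = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        x = 2;                  /* changes only the child's copy */
        _exit(x == 2 ? 0 : 1);
    }
    waitpid(pid, 0, 0);         /* wait for the child to exit */
    return x;                   /* the parent's copy is untouched */
}
```
.P2
Under these assumptions the function returns 1, the value the parent
stored before calling
.code fork .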
.PP
The
.code-index exec
system call
replaces the calling process's memory with a new memory
image loaded from a file stored in the file system.
The file must have a particular format, which specifies which part of
the file holds instructions, which part is data, at which instruction
to start, etc. Xv6
uses the ELF format, which Chapter \*[CH:MEM] discusses in
more detail.
When
.code-index exec
succeeds, it does not return to the calling program;
instead, the instructions loaded from the file start
executing at the entry point declared in the ELF header.
.code Exec
takes two arguments: the name of the file containing the
executable and an array of string arguments.
For example:
.P1
char *argv[3];
argv[0] = "echo";
argv[1] = "hello";
argv[2] = 0;
exec("/bin/echo", argv);
printf("exec error\en");
.P2
This fragment replaces the calling program with an instance
of the program
.code /bin/echo
running with the argument list
.code echo
.code hello .
Most programs ignore the first argument, which is
conventionally the name of the program.
.PP
The xv6 shell uses the above calls to run programs on behalf of
users. The main structure of the shell is simple; see
.code main
.line sh.c:/main/ .
The main loop reads a line of input from the user with
.code-index getcmd .
Then it calls
.code fork ,
which creates a copy of the shell process. The
parent calls
.code wait ,
while the child runs the command. For example, if the user
had typed
.code "echo hello" '' ``
to the shell,
.code runcmd
would have been called with
.code "echo hello" '' ``
as the argument.
.code runcmd
.line sh.c:/runcmd/
runs the actual command. For
.code "echo hello" '', ``
it would call
.code exec
.line sh.c:/exec.ecmd/ .
If
.code exec
succeeds then the child will execute instructions from
.code echo
instead of
.code runcmd .
At some point
.code echo
will call
.code exit ,
which will cause the parent to return from
.code wait
in
.code main
.line sh.c:/main/ .
You might wonder why
.code-index fork
and
.code-index exec
are not combined in a single call; we
will see later that having separate calls for creating a process
and loading a program is a clever design.
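.PP
The fork, exec, and wait sequence described above can be sketched
as a single helper.
This is not the code in
.code sh.c ;
it is an illustration for a POSIX host, where
.code execv
and
.code waitpid
stand in for xv6's
.code exec
and
.code wait ,
and the name
.code run_command
is hypothetical.
.P1
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at path with argv in a child
 * process. Returns the child's exit status, or -1 on failure. */
int run_command(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);      /* returns only if the exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
.P2
The child reports a failed exec by exiting with status 127,
a convention borrowed from common Unix shells.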
.PP
Xv6 allocates most user-space memory
implicitly:
.code-index fork
allocates the memory required for the child's copy of the
parent's memory, and
.code-index exec
allocates enough memory to hold the executable file.
A process that needs more memory at run-time (perhaps for
.code-index malloc )
can call
.code sbrk(n)
to grow its data memory by
.code n
bytes;
.code-index sbrk
returns the location of the new memory.
.PP
Xv6 does not provide a notion of users or of protecting
one user from another; in Unix terms, all xv6 processes
run as root.
.\"
.\" I/O and File descriptors
.\"
.section "I/O and File descriptors"
.PP
A
.italic-index "file descriptor"
is a small integer representing a kernel-managed object
that a process may read from or write to.
A process may obtain a file descriptor by opening a file, directory,
or device, or by creating a pipe, or by duplicating an existing
descriptor.
For simplicity we'll often refer to the object a file descriptor
refers to as a ``file'';
the file descriptor interface abstracts away the differences between
files, pipes, and devices, making them all look like streams of bytes.
.PP
Internally, the xv6 kernel uses the file descriptor
as an index into a per-process table,
so that every process has a private space of file descriptors
starting at zero.
By convention, a process reads from file descriptor 0 (standard input),
writes output to file descriptor 1 (standard output), and
writes error messages to file descriptor 2 (standard error).
As we will see, the shell exploits the convention to implement I/O redirection
and pipelines. The shell ensures that it always has three file descriptors
open
.line sh.c:/open..console/ ,
which are by default file descriptors for the console.
.PP
The
.code read
and
.code write
system calls read bytes from and write bytes to
open files named by file descriptors.
The call
.code read(fd,
.code buf,
.code n)
reads at most
.code n
bytes from the file descriptor
.code fd ,
copies them into
.code buf ,
and returns the number of bytes read.
Each file descriptor that refers to a file
has an offset associated with it.
.code Read
reads data from the current file offset and then advances
that offset by the number of bytes read:
a subsequent
.code read
will return the bytes following the ones returned by the first
.code read .
When there are no more bytes to read,
.code read
returns zero to signal the end of the file.
.PP
The call
.code write(fd,
.code buf,
.code n)
writes
.code n
bytes from
.code buf
to the file descriptor
.code fd
and returns the number of bytes written.
Fewer than
.code n
bytes are written only when an error occurs.
Like
.code read ,
.code write
writes data at the current file offset and then advances
that offset by the number of bytes written:
each
.code write
picks up where the previous one left off.
.PP
The following program fragment (which forms the essence of
.code cat )
copies data from its standard input
to its standard output. If an error occurs, it writes a message
to the standard error.
.P1
char buf[512];
int n;
for(;;){
n = read(0, buf, sizeof buf);
if(n == 0)
break;
if(n < 0){
fprintf(2, "read error\en");
exit();
}
if(write(1, buf, n) != n){
fprintf(2, "write error\en");
exit();
}
}
.P2
The important thing to note in the code fragment is that
.code cat
doesn't know whether it is reading from a file, the console, or a pipe.
Similarly,
.code cat
doesn't know whether it is printing to the console, a file, or something else.
The use of file descriptors and the convention that file descriptor 0
is input and file descriptor 1 is output allows a simple
implementation
of
.code cat .
.PP
The
.code close
system call
releases a file descriptor, making it free for reuse by a future
.code open ,
.code pipe ,
or
.code dup
system call (see below).
A newly allocated file descriptor
is always the lowest-numbered unused
descriptor of the current process.
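.PP
The lowest-numbered rule can be observed with a short experiment.
The sketch below assumes a single-threaded process on a POSIX host;
the function name is hypothetical.
It obtains two descriptors from
.code pipe ,
closes the lower one, and checks that
.code dup
hands that same number back.
.P1
```c
#include <unistd.h>

/* Returns 1 if a descriptor freed by close is the next one allocated,
 * 0 if not, and -1 if a system call fails. */
int closed_fd_is_reused(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    int freed = p[0] < p[1] ? p[0] : p[1];
    int kept  = p[0] < p[1] ? p[1] : p[0];
    close(freed);               /* freed is now the lowest unused number */
    int fd = dup(kept);
    if (fd < 0) {
        close(kept);
        return -1;
    }
    int reused = (fd == freed);
    close(fd);
    close(kept);
    return reused;
}
```
.P2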
.PP
File descriptors and
.code-index fork
interact to make I/O redirection easy to implement.
.code Fork
copies the parent's file descriptor table along with its memory,
so that the child starts with exactly the same open files as the parent.
The system call
.code-index exec
replaces the calling process's memory but preserves its file table.
This behavior allows the shell to
implement I/O redirection by forking, reopening chosen file descriptors,
and then execing the new program.
Here is a simplified version of the code a shell runs for the
command
.code cat
.code <
.code input.txt :
.P1
char *argv[2];
argv[0] = "cat";
argv[1] = 0;
if(fork() == 0) {
close(0);
open("input.txt", O_RDONLY);
exec("cat", argv);
}
.P2
After the child closes file descriptor 0,
.code open
is guaranteed to use that file descriptor
for the newly opened
.code input.txt :
0 will be the smallest available file descriptor.
.code Cat
then executes with file descriptor 0 (standard input) referring to
.code input.txt .
.PP
The code for I/O redirection in the xv6 shell works in exactly this way
.line sh.c:/case.REDIR/ .
Recall that at this point in the code the shell has already forked the
child shell and that
.code runcmd
will call
.code exec
to load the new program. Now it should be clear why it is a good idea that
.code fork
and
.code exec
are separate calls. Because they are separate, the shell can fork a child,
use
.code open ,
.code close ,
and
.code dup
in the child to change the standard input and output
file descriptors, and then
.code exec .
No changes to the program being exec-ed
.code cat "" (
in our example)
are required.
If
.code fork
and
.code exec
were combined into a single
system call, some other (probably more complex) scheme would be required for the
shell to redirect standard input and output, or the program itself would have to
understand how to redirect I/O.
.PP
Although
.code fork
copies the file descriptor table, each underlying file offset is shared
between parent and child.
Consider this example:
.P1
if(fork() == 0) {
write(1, "hello ", 6);
exit();
} else {
wait();
write(1, "world\en", 6);
}
.P2
At the end of this fragment, the file attached to file descriptor 1
will contain the data
.code hello
.code world .
The
.code write
in the parent
(which, thanks to
.code wait ,
runs only after the child is done)
picks up where the child's
.code write
left off.
This behavior helps produce sequential output from sequences
of shell commands, like
.code (echo
.code hello;
.code echo
.code world)
.code >output.txt .
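.PP
The shared offset can be demonstrated on a regular file.
The following sketch assumes a POSIX host with a writable
.code /tmp ;
.code mkstemp ,
.code lseek ,
and
.code waitpid
are POSIX calls that xv6 does not provide, and the function name
is hypothetical.
The child and the parent write through the same inherited descriptor,
and the parent then reads the file back.
.P1
```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copies the file contents produced by a child write followed by a
 * parent write into out (NUL-terminated). Returns the byte count,
 * or -1 on failure. */
int shared_offset_demo(char *out, size_t cap)
{
    char path[] = "/tmp/offsetXXXXXX";  /* hypothetical scratch name */
    if (cap == 0)
        return -1;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives until fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)
        _exit(write(fd, "hello ", 6) == 6 ? 0 : 1);
    waitpid(pid, 0, 0);
    if (write(fd, "world", 5) != 5) {   /* continues at offset 6 */
        close(fd);
        return -1;
    }
    lseek(fd, 0, SEEK_SET);             /* rewind to read the result */
    ssize_t n = read(fd, out, cap - 1);
    close(fd);
    if (n < 0)
        return -1;
    out[n] = 0;
    return (int)n;
}
```
.P2
If the offset were not shared, the parent's write would start at
offset 0 and overwrite the child's data.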
.PP
The
.code dup
system call duplicates an existing file descriptor,
returning a new one that refers to the same underlying I/O object.
Both file descriptors share an offset, just as the file descriptors
duplicated by
.code fork
do.
This is another way to write
.code hello
.code world
into a file:
.P1
fd = dup(1);
write(1, "hello ", 6);
write(fd, "world\en", 6);
.P2
.PP
Two file descriptors share an offset if they were derived from
the same original file descriptor by a sequence of
.code fork
and
.code dup
calls.
Otherwise file descriptors do not share offsets, even if they
resulted from
.code open
calls for the same file.
.code Dup
allows shells to implement commands like this:
.code ls
.code existing-file
.code non-existing-file
.code >
.code tmp1
.code 2>&1 .
The
.code 2>&1
tells the shell to give the command a file descriptor 2 that
is a duplicate of descriptor 1.
Both the name of the existing file and the error message for the
non-existing file will show up in the file
.code tmp1 .
The xv6 shell doesn't support I/O redirection for the error file
descriptor, but now you know how to implement it.
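.PP
One possible shape for such an implementation is sketched below.
It targets a POSIX host and uses
.code dup2 ,
which xv6 does not provide; on xv6 the same effect would come from
closing descriptor 2 and calling
.code dup
on descriptor 1.
The helper name
.code run_merged
is hypothetical.
.P1
```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run path with argv, sending both standard
 * output and standard error to outfile, as a shell does for
 * "cmd > outfile 2>&1". Returns the exit status, or -1 on failure. */
int run_merged(const char *path, char *const argv[], const char *outfile)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (fd != 1) {
            dup2(fd, 1);        /* standard output now refers to outfile */
            close(fd);
        }
        dup2(1, 2);             /* 2>&1: descriptor 2 duplicates 1 */
        execv(path, argv);
        _exit(127);             /* reached only if the exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
.P2
Because descriptors 1 and 2 share one offset, output and error
messages land in the file one after the other instead of
overwriting each other.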
.PP
File descriptors are a powerful abstraction,
because they hide the details of what they are connected to:
a process writing to file descriptor 1 may be writing to a
file, to a device like the console, or to a pipe.
.\"
.\" Pipes
.\"
.section "Pipes"
.PP
A
.italic-index pipe
is a small kernel buffer exposed to processes as a pair of
file descriptors, one for reading and one for writing.
Writing data to one end of the pipe
makes that data available for reading from the other end of the pipe.
Pipes provide a way for processes to communicate.
.PP
The following example code runs the program
.code wc
with standard input connected to
the read end of a pipe.
.P1
int p[2];
char *argv[2];
argv[0] = "wc";
argv[1] = 0;
pipe(p);
if(fork() == 0) {
close(0);
dup(p[0]);
close(p[0]);
close(p[1]);
exec("/bin/wc", argv);
} else {
close(p[0]);
write(p[1], "hello world\en", 12);
close(p[1]);
}
.P2
The program calls
.code pipe ,
which creates a new pipe and records the read and write
file descriptors in the array
.code p .
After
.code fork ,
both parent and child have file descriptors referring to the pipe.
The child dups the read end onto file descriptor 0,
closes the file descriptors in
.code p ,
and execs
.code wc .
When
.code wc
reads from its standard input, it reads from the pipe.
The parent closes the read side of the pipe,
writes to the pipe,
and then closes the write side.
.PP
If no data is available, a
.code read
on a pipe waits for either data to be written or all
file descriptors referring to the write end to be closed;
in the latter case,
.code read
will return 0, just as if the end of a data file had been reached.
The fact that
.code read
blocks until it is impossible for new data to arrive
is one reason that it's important for the child to
close the write end of the pipe
before executing
.code wc
above: if one of
.code wc 's
file descriptors referred to the write end of the pipe,
.code wc
would never see end-of-file.
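.PP
The same rule applies to any reader of a pipe.
In the following sketch, written for a POSIX host with a hypothetical
function name, the parent is the reader: it must close its own copy
of the write end, or its
.code read
loop would never see end-of-file.
.P1
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A child writes len bytes of msg to a pipe and exits; the parent
 * reads until read returns 0 or out is full. Returns the number of
 * bytes read, or -1 if pipe or fork fails. */
int read_until_eof(const char *msg, size_t len, char *out, size_t cap)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(p[0]);
        ssize_t w = write(p[1], msg, len);
        _exit(w == (ssize_t)len ? 0 : 1);   /* exiting closes p[1] */
    }
    close(p[1]);    /* otherwise the parent's copy keeps the pipe open */
    size_t total = 0;
    ssize_t n;
    while (total < cap && (n = read(p[0], out + total, cap - total)) > 0)
        total += (size_t)n;
    close(p[0]);
    waitpid(pid, 0, 0);
    return (int)total;
}
```
.P2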
.PP
The xv6 shell implements pipelines such as
.code "grep fork sh.c | wc -l"
in a manner similar to the above code
.line sh.c:/case.PIPE/ .
The child process creates a pipe to connect the left end of the pipeline
with the right end. Then it calls
.code fork
and
.code runcmd
for the left end of the pipeline
and
.code fork
and
.code runcmd
for the right end, and waits for both to finish.
The right end of the pipeline may be a command that itself includes a
pipe (e.g.,
.code a
.code |
.code b
.code |
.code c ),
which itself forks two new child processes (one for
.code b
and one for
.code c ).
Thus, the shell may
create a tree of processes. The leaves of this tree are commands and
the interior nodes are processes that wait until the left and right
children complete. In principle, you could have the interior nodes
run the left end of a pipeline, but doing so correctly would complicate the
implementation.
.PP
Pipes may seem no more powerful than temporary files:
the pipeline
.P1
echo hello world | wc
.P2
could be implemented without pipes as
.P1
echo hello world >/tmp/xyz; wc </tmp/xyz
.P2
Pipes have at least four advantages over temporary files
in this situation.
First, pipes automatically clean themselves up;
with the file redirection, a shell would have to
be careful to remove
.code /tmp/xyz
when done.
Second, pipes can pass arbitrarily long streams of
data, while file redirection requires enough free space
on disk to store all the data.
Third, pipes allow for parallel execution of pipeline stages,
while the file approach requires the first program to finish
before the second starts.
Fourth, if you are implementing inter-process communication,
pipes' blocking reads and writes are more efficient
than the non-blocking semantics of files.
.\"
.\" File system
.\"
.section "File system"
.PP
The xv6 file system provides data files,
which are uninterpreted byte arrays,
and directories, which
contain named references to data files and other directories.
The directories form a tree, starting
at a special directory called the
.italic-index root .
A
.italic-index path
like
.code /a/b/c
refers to the file or directory named
.code c
inside the directory named
.code b
inside the directory named
.code a
in the root directory
.code / .
Paths that don't begin with
.code /
are evaluated relative to the calling process's
.italic-index "current directory" ,
which can be changed with the
.code chdir
system call.
Both these code fragments open the same file
(assuming all the directories involved exist):
.P1
chdir("/a");
chdir("b");
open("c", O_RDONLY);
open("/a/b/c", O_RDONLY);
.P2
The first fragment changes the process's current directory to
.code /a/b ;
the second neither refers to nor changes the process's current directory.
.PP
There are multiple system calls to create a new file or directory:
.code mkdir
creates a new directory,
.code open
with the
.code O_CREATE
flag creates a new data file,
and
.code mknod
creates a new device file.
This example illustrates all three:
.P1
mkdir("/dir");
fd = open("/dir/file", O_CREATE|O_WRONLY);
close(fd);
mknod("/console", 1, 1);
.P2
.code Mknod
creates a file in the file system,
but the file has no contents.
Instead, the file's metadata marks it as a device file
and records the major and minor device numbers
(the two arguments to
.code mknod ),
which uniquely identify a kernel device.
When a process later opens the file, the kernel
diverts
.code read
and
.code write
system calls to the kernel device implementation
instead of passing them to the file system.
.PP
.code Fstat
retrieves information about the object a file
descriptor refers to.
It fills in a
.code struct
.code stat ,
defined in
.code stat.h
as:
.P1
.so ../xv6/stat.h
.P2
.PP
A file's name is distinct from the file itself;
the same underlying file, called an
.italic-index inode ,
can have multiple names,
called
.italic-index links .
The
.code link
system call creates another file system name
referring to the same inode as an existing file.
This fragment creates a new file named both
.code a
and
.code b .
.P1
open("a", O_CREATE|O_WRONLY);
link("a", "b");
.P2
Reading from or writing to
.code a
is the same as reading from or writing to
.code b .
Each inode is identified by a unique
.italic inode
.italic number .
After the code sequence above, it is possible
to determine that
.code a
and
.code b
refer to the same underlying contents by inspecting the
result of
.code fstat :
both will return the same inode number
.code ino ), (
and the
.code nlink
count will be set to 2.
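.PP
This check can be sketched as follows for a POSIX host.
POSIX names the fields
.code st_ino
and
.code st_nlink
rather than
.code ino
and
.code nlink ,
the sketch uses
.code stat
on each name instead of
.code fstat
on a descriptor, and the function name and temporary directory
are hypothetical.
.P1
```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a file with two names in a fresh temporary directory.
 * Returns 1 if both names refer to one inode with a link count of 2,
 * 0 if not, and -1 on failure. */
int two_names_one_inode(void)
{
    char dir[] = "/tmp/linkXXXXXX";     /* hypothetical scratch name */
    char a[64], b[64];
    struct stat sa, sb;
    int result = -1;

    if (mkdtemp(dir) == 0)
        return -1;
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    int fd = open(a, O_CREAT | O_WRONLY, 0600);
    if (fd >= 0) {
        close(fd);
        if (link(a, b) == 0 && stat(a, &sa) == 0 && stat(b, &sb) == 0)
            result = (sa.st_ino == sb.st_ino && sa.st_nlink == 2);
        unlink(b);                      /* clean up both names */
        unlink(a);
    }
    rmdir(dir);
    return result;
}
```
.P2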
.PP
The
.code unlink
system call removes a name from the file system.
The file's inode and the disk space holding its content
are only freed when the file's link count is zero and
no file descriptors refer to it.
Thus adding
.P1
unlink("a");
.P2
to the last code sequence leaves the inode
and file content accessible as
.code b .
Furthermore,
.P1
fd = open("/tmp/xyz", O_CREATE|O_RDWR);
unlink("/tmp/xyz");
.P2
is an idiomatic way to create a temporary inode
that will be cleaned up when the process closes
.code fd
or exits.
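.PP
The idiom can be exercised on a POSIX host with the sketch below.
It uses
.code mkstemp
to pick a unique name and
.code lseek
to rewind, neither of which xv6 provides; the function name is
hypothetical.
After the unlink, the file has no name, yet the descriptor still
reads and writes it.
.P1
```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens a scratch file, unlinks its name, and checks that the inode
 * is still usable through the descriptor. Returns the link count
 * reported by fstat after the unlink, or -1 on failure. */
int nameless_file_link_count(void)
{
    char path[] = "/tmp/scratchXXXXXX"; /* hypothetical scratch name */
    char buf[4];
    struct stat st;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }
    int ok = write(fd, "data", 4) == 4
          && lseek(fd, 0, SEEK_SET) == 0
          && read(fd, buf, 4) == 4
          && buf[0] == 'd'
          && fstat(fd, &st) == 0;
    int count = ok ? (int)st.st_nlink : -1;
    close(fd);                          /* the inode is freed here */
    return count;
}
```
.P2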
.PP
Shell commands for file system operations are implemented
as user-level programs such as
.code mkdir ,
.code ln ,
.code rm ,
etc. This design allows anyone to extend the shell with new user commands by
just adding a new user-level program. In hindsight this plan seems obvious,
but other systems designed at the time of Unix often built such commands into
the shell (and built the shell into the kernel).
.PP
One exception is
.code cd ,
which is built into the shell
.line sh.c:/if.buf.0..==..c./ .
.code cd
must change the current working directory of the
shell itself. If
.code cd
were run as a regular command, then the shell would fork a child
process, the child process would run
.code cd ,
and
.code cd
would change the
.italic child 's
working directory. The parent's (i.e.,
the shell's) working directory would not change.
.\"
.\" Real world
.\"
.section "Real world"
.PP
Unix's combination of the ``standard'' file
descriptors, pipes, and convenient shell syntax for
operations on them was a major advance in writing
general-purpose reusable programs.
The idea sparked a whole culture of ``software tools'' that was
responsible for much of Unix's power and popularity,
and the shell was the first so-called ``scripting language.''
The Unix system call interface persists today in systems like
BSD, Linux, and Mac OS X.
.PP
The Unix system call interface has been standardized through the Portable
Operating System Interface (POSIX) standard.
Xv6 is
.italic not
POSIX compliant. It lacks some system calls (including basic ones such as
.code lseek ),
it implements others only partially, and so on. Our main goals for xv6 are
simplicity and clarity while providing a simple Unix-like system-call interface.
Several people have extended xv6 with a few more basic system calls and a simple
C library so that they can run basic Unix programs. Modern kernels, however,
provide many more system calls, and many more kinds of kernel services, than
xv6. For example, they support networking, window systems, user-level threads,
drivers for many devices, and so on. Modern kernels evolve continuously and
rapidly, and offer many features beyond POSIX.
.PP
For the most part, modern Unix-derived operating systems
have not followed the early
Unix model of exposing devices as special files, like the
.code console
device file discussed above.
The authors of Unix went on to build Plan 9,
which applied the ``resources are files''
concept to modern facilities,
representing networks, graphics, and other resources
as files or file trees.
.PP
The file system abstraction has been a powerful
idea, most recently applied to network resources in the form of the
World Wide Web.
Even so, there are other models for operating system interfaces.
Multics, a predecessor of Unix,
abstracted file storage in a way that made it look like memory,
producing a very different flavor of interface.
The complexity of the Multics design had a direct influence
on the designers of Unix, who tried to build something simpler.
.ig
XXX can we cut this, since its point is the same as the next paragraph?
An operating system interface that went out of fashion
decades ago but has recently returned is the idea of a virtual machine monitor.
Such systems provide a superficially different interface from xv6,
but the basic concepts are still the same:
a virtual machine, like a process, consists of some memory and
one or more register sets;
the virtual machine has access to one large file called
a virtual disk instead of a file system;
virtual machines send messages to each other
and the outside world using virtual network devices
instead of pipes or files.
..
.PP
This book examines how xv6 implements its Unix-like interface,
but the ideas and concepts apply to more than just Unix.
Any operating system must multiplex processes onto
the underlying hardware, isolate processes from each
other, and provide mechanisms for controlled
inter-process communication.
After studying xv6, you should be able to
look at other, more complex operating systems
and see the concepts underlying xv6 in those systems as well.