Commit a43033a3 authored by Rob Pike's avatar Rob Pike

develop interfaces through cats

sort
2,3,5

R=gri
DELTA=648  (647 added, 0 deleted, 1 changed)
OCL=15315
CL=15352
parent d01a1ec2
...@@ -4,7 +4,7 @@ Let's Go ...@@ -4,7 +4,7 @@ Let's Go
Rob Pike Rob Pike
---- ----
(September 10, 2008) (September 14, 2008)
This document is a tutorial introduction to the basics of the Go systems programming This document is a tutorial introduction to the basics of the Go systems programming
...@@ -300,3 +300,242 @@ and run the program: ...@@ -300,3 +300,242 @@ and run the program:
can't open file; errno=2 can't open file; errno=2
% %
Rotting cats
----
Building on the FD package, here's a simple version of the Unix utility "cat(1)", "progs/cat.go":
--PROG progs/cat.go
By now this should be easy to follow, but the "switch" statement introduces some
new features. Like a "for" loop, an "if" or "switch" can include an
initialization statement. The "switch" on line 12 uses one to create variables
"nr" and "er" to hold the return values from "fd.Read()". (The "if" on line 19
has the same idea.) The "switch" statement is general: it evaluates the cases
from top to bottom looking for the first case that matches the value; the
case expressions don't need to be constants or even integers.
Since the "switch" value is just "true", we could leave it off -- as is also true
in a "for" statement, a missing value means "true". In fact, such a "switch"
is a form of "if-else" chain.
Line 19 calls "Write()" by slicing (a pointer to) the array, creating a
<i>reference slice</i>.
Now let's make a variant of "cat" that optionally does "rot13" on its input.
It's easy to do by just processing the bytes, but instead we will exploit
Go's notion of an <i>interface</i>.
The "cat()" subroutine uses only two methods of "fd": "Read()" and "Name()",
so let's start by defining an interface that has exactly those two methods.
Here is code from "progs/cat_rot13.go":
--PROG progs/cat_rot13.go /type.Reader/ /^}/
Any type that implements the two methods of "Reader" -- regardless of whatever
other methods the type may also contain -- is said to <i>implement</i> the
interface. Since "FD.FD" implements these methods, it implements the
"Reader" interface. We could tweak the "cat" subroutine to accept a "Reader"
instead of a "*FD.FD" and it would work just fine, but let's embellish a little
first by writing a second type that implements "Reader", one that wraps an
existing "Reader" and does "rot13" on the data. To do this, we just define
the type and implement the methods and with no other bookkeeping,
we have a second implementation of the "Reader" interface.
--PROG progs/cat_rot13.go /type.Rot13/ /end.of.Rot13/
(The "rot13" function called on line 39 is trivial and not worth reproducing.)
To use the new feature, we define a flag:
--PROG progs/cat_rot13.go /rot13_flag/
and use it from within a mostly unchanged "cat()" function:
--PROG progs/cat_rot13.go /func.cat/ /^}/
Lines 53 and 54 set it all up: If the "rot13" flag is true, wrap the "Reader"
we received into a "Rot13" and proceed. Note that the interface variables
are values, not pointers: the argument is of type "Reader", not "*Reader",
even though under the covers it holds a pointer to a "struct".
Here it is in action:
<pre>
% echo abcdefghijklmnopqrstuvwxyz | ./cat
abcdefghijklmnopqrstuvwxyz
% echo abcdefghijklmnopqrstuvwxyz | ./cat --rot13
nopqrstuvwxyzabcdefghijklm
%
</pre>
Fans of dependency injection may take cheer from how easily interfaces
made substituting the implementation of a file descriptor.
Interfaces are a distinct feature of Go. An interface is implemented by a
type if the type implements all the methods declared in the interface.
This means
that a type may implement an arbitrary number of different interfaces.
There is no type hierarchy; things can be much more <i>ad hoc</i>,
as we saw with "rot13". "FD.FD" implements "Reader"; it could also
implement a "Writer", or any other interface built from its methods that
fits the current situation. Consider the <i>empty interface</i>
<pre>
type interface Empty {}
</pre>
<i>Every</i> type implements the empty interface, which makes it
useful for things like containers.
Sorting
----
As another example of interfaces, consider this simple sort algorithm,
taken from "progs/sort.go":
--PROG progs/sort.go /func.Sort/ /^}/
The code needs only three methods, which we wrap into "SortInterface":
--PROG progs/sort.go /interface/ /^}/
We can apply "Sort" to any type that implements "len", "less", and "swap".
The "sort" package includes the necessary methods to allow sorting of
arrays of integers, strings, etc.; here's the code for arrays of "int":
--PROG progs/sort.go /type.*IntArray/ /swap/
And now a routine to test it out, from "progs/sortmain.go". This
uses a function in the "sort" package, omitted here for brevity,
to test that the result is sorted.
--PROG progs/sortmain.go /func.ints/ /^}/
If we have a new type we want to be able to sort, all we need to do is
to implement the three methods for that type, like this:
--PROG progs/sortmain.go /type.Day/ /swap/
The 2,3,5 program
----
Now we come to processes and communication - concurrent programming.
It's a big subject so to be brief we assume some familiarity with the topic.
The prime sieve program in the language specification document is
an excellent illustration of concurrent programming, but for variety
here we'll solve a different problem in a similar way.
An old interview question is to write a program that prints all the
integers that can be written as multiples of 2, 3, and 5 only.
One way to solve it is to generate streams of numbers multiplied
by 2, 3, and 5, and to provide as input to the stream generators
the output of the program so far. To generate the correct output,
we pick the least number generated each round and eliminate
duplicates (6 appears twice, as 2*3s and as 3*2), but that's easy.
Here's a flow diagram:
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src=go235.jpg >
<br>
To create a stream of integers, we use a Go <i>channel</i>, which,
borrowing from CSP and its descendants, represents a communications
channel that can connect two computations. In Go, channel variables are
always pointers to channels -- it's the (hidden) object they point to that
does the communication.
Here are the first few lines of "progs/235A.go":
--PROG progs/235A.go /package/ /^}/
The numbers can get big, so we'll use 64-bit unsigned integers,
using the shorthand "INT" defined on line 3.
The function M is a multiplication generator. It receives data
on the channel "in", using the unary receive operator "&lt;-"; the expression
"&lt;-in" retrieves the next value on the the channel. The value
is multiplied by the factor "f" and then sent out on channel "out",
using the binary send operator "-&lt". Channels block, so if there's
nothing available on "in" or no recipient for the the value on "out",
the function will block until it can proceed.
To deal with blocking, we want M to run in a separate thread. Go has
its own model of process/threads/light-weight processes/coroutines,
so to avoid notational confusion we'll call concurrently executing
computations in Go <i>goroutines</i>. To start a goroutine,
invoke the function, prefixing the call with the keyword "go";
this starts the function running independently of the current
computation but in the same address space:
go sum(huge_array); // calculate sum in the background
If you want to know when the calculation is done, pass a channel
on which it can report back:
ch := new(chan int);
go sum(huge_array, ch);
// ... do something else for a while
result := <-ch; // wait for, and retrieve, result
Back to our 2-3-5 program. Here's how "main" sets up the
calculation:
--PROG progs/235A.go /func.main/ /go.M.5/
Lines 17 through 22 create the channels to connect the multipliers,
and lines 24 through 26 launch the goroutines. The "100" parameter
to the input channels ("c2i" etc.) is a buffer size. By default,
Go channels are unbuffered (synchronous) but the "Multipler" inputs need to
be buffered because the main loop will generate data faster than
they process it.
Next we initialize a few variables.
--PROG progs/235A.go /x.:=/ /x5/
The "x" variable will be the value we generate; the others will
hold the latest value received from each "Multiplier" goroutine.
Finally, here is the main loop:
--PROG progs/235A.go /for.*100/ /^.}/
The algorithm is simple: We send the current value to each of
the "Multiplier" goroutines; it needs to be multiplied by 2, 3, and 5 to
produce the full list. Next, we advance the streams: each
channel whose latest value is the current value needs to step
to the next value. Finally, we choose the least of the current
values, and iterate.
This program can be tightened up a little using a pattern common
in this style of programming. Here is a variant version of "Multiplier",
from "progs/235B.go":
--PROG progs/235B.go /func.M/ /^}/
This version does all the setup internally. It creates the channels,
launches a goroutine internally using a function literal, and
returns the channels to the caller. It is a concurrent factory,
starting the goroutine and returning its connections.
The "main" function starts out simpler as a result:
--PROG progs/235B.go /func.main/ /x5/
The rest is the same.
The program "progs/235_gen.go" generalizes the problem; by
filling in the elements of an array "F"
--PROG progs/235_gen.go /F.*INT/
we can produces outputs from multiples of any integers.
Here is the full program, without further elucidation.
--PROG progs/235_gen.go
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
type INT uint64
func Multiplier(f INT, in, out *chan INT) {
for {
out -< (f * <-in);
}
}
func min(a, b INT) INT {
if a < b { return a }
return b;
}
func main() {
c2i := new(chan INT, 100);
c2o := new(chan INT);
c3i := new(chan INT, 100);
c3o := new(chan INT);
c5i := new(chan INT, 100);
c5o := new(chan INT);
go Multiplier(2, c2i, c2o);
go Multiplier(3, c3i, c3o);
go Multiplier(5, c5i, c5o);
var x INT = 1;
x2 := x;
x3 := x;
x5 := x;
for i := 0; i < 100; i++ {
print(x, "\n");
c2i -< x;
c3i -< x;
c5i -< x;
if x2 == x { x2 = <- c2o }
if x3 == x { x3 = <- c3o }
if x5 == x { x5 = <- c5o }
x = min(min(x2, x3), x5);
}
sys.exit(0);
}
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
type INT uint64
func Multiplier(f INT) (in, out *chan INT) {
inc := new(chan INT, 100);
outc := new(chan INT);
go func(f INT, in, out *chan INT) {
for {
out -< f * <-in;
}
}(f, inc, outc)
return inc, outc
}
func min(a, b INT) INT {
if a < b { return a }
return b;
}
func main() {
c2i, c2o := Multiplier(2);
c3i, c3o := Multiplier(3);
c5i, c5o := Multiplier(5);
var x INT = 1;
x2, x3, x5 := x, x, x;
for i := 0; i < 100; i++ {
print(x, "\n");
c2i -< x;
c3i -< x;
c5i -< x;
if x2 == x { x2 = <- c2o }
if x3 == x { x3 = <- c3o }
if x5 == x { x5 = <- c5o }
x = min(min(x2, x3), x5);
}
sys.exit(0);
}
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
type INT uint64
func Multiplier(f INT) (in, out *chan INT) {
in = new(chan INT, 100);
out = new(chan INT, 100);
go func(in, out *chan INT, f INT) {
for {
out -< f * <- in;
}
}(in, out, f);
return in, out;
}
func min(xs *[]INT) INT {
m := xs[0];
for i := 1; i < len(xs); i++ {
if xs[i] < m {
m = xs[i];
}
}
return m;
}
func main() {
F := []INT{2, 3, 5};
const n = len(F);
x := INT(1);
ins := new([]*chan INT, n);
outs := new([]*chan INT, n);
xs := new([]INT, n);
for i := 0; i < n; i++ {
ins[i], outs[i] = Multiplier(F[i]);
xs[i] = x;
}
for i := 0; i < 100; i++ {
print(x, "\n");
t := min(xs);
for i := 0; i < n; i++ {
ins[i] -< x;
}
for i := 0; i < n; i++ {
if xs[i] == x { xs[i] = <- outs[i]; }
}
x = min(xs);
}
sys.exit(0);
}
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
import (
FD "fd";
Flag "flag";
)
func cat(fd *FD.FD) {
const NBUF = 512;
var buf [NBUF]byte;
for {
switch nr, er := fd.Read(&buf); true {
case nr < 0:
print("error reading from ", fd.Name(), ": ", er, "\n");
sys.exit(1);
case nr == 0: // EOF
return;
case nr > 0:
if nw, ew := FD.Stdout.Write((&buf)[0:nr]); nw != nr {
print("error writing from ", fd.Name(), ": ", ew, "\n");
}
}
}
}
func main() {
Flag.Parse(); // Scans the arg list and sets up flags
if Flag.NArg() == 0 {
cat(FD.Stdin);
}
for i := 0; i < Flag.NArg(); i++ {
fd, err := FD.Open(Flag.Arg(i), 0, 0);
if fd == nil {
print("can't open ", Flag.Arg(i), ": error ", err, "\n");
sys.exit(1);
}
cat(fd);
fd.Close();
}
}
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
import (
FD "fd";
Flag "flag";
)
var rot13_flag = Flag.Bool("rot13", false, nil, "rot13 the input")
func rot13(bb byte) byte {
var b int = int(bb) /// BUG: until byte division is fixed
if 'a' <= b && b <= 'z' {
b = 'a' + ((b - 'a') + 13) % 26;
}
if 'A' <= b && b <= 'Z' {
b = 'A' + ((b - 'A') + 13) % 26
}
return byte(b)
}
type Reader interface {
Read(b *[]byte) (ret int64, errno int64);
Name() string;
}
type Rot13 struct {
source Reader;
}
func NewRot13(source Reader) *Rot13 {
r13 := new(Rot13);
r13.source = source;
return r13
}
func (r13 *Rot13) Read(b *[]byte) (ret int64, errno int64) {
r, e := r13.source.Read(b);
for i := int64(0); i < r; i++ {
b[i] = rot13(b[i])
}
return r, e
}
func (r13 *Rot13) Name() string {
return r13.source.Name()
}
// end of Rot13 implementation
func cat(r Reader) {
const NBUF = 512;
var buf [NBUF]byte;
if rot13_flag.BVal() {
r = NewRot13(r)
}
for {
switch nr, er := r.Read(&buf); {
case nr < 0:
print("error reading from ", r.Name(), ": ", er, "\n");
sys.exit(1);
case nr == 0: // EOF
return;
case nr > 0:
nw, ew := FD.Stdout.Write((&buf)[0:nr]);
if nw != nr {
print("error writing from ", r.Name(), ": ", ew, "\n");
}
}
}
}
func main() {
var bug FD.FD;
Flag.Parse(); // Scans the arg list and sets up flags
if Flag.NArg() == 0 {
cat(FD.Stdin);
}
for i := 0; i < Flag.NArg(); i++ {
fd, err := FD.Open(Flag.Arg(i), 0, 0);
if fd == nil {
print("can't open ", Flag.Arg(i), ": error ", err, "\n");
sys.exit(1);
}
cat(fd);
fd.Close();
}
}
...@@ -56,3 +56,7 @@ func (fd *FD) Write(b *[]byte) (ret int64, errno int64) { ...@@ -56,3 +56,7 @@ func (fd *FD) Write(b *[]byte) (ret int64, errno int64) {
r, e := Syscall.write(fd.fildes, &b[0], int64(len(b))); r, e := Syscall.write(fd.fildes, &b[0], int64(len(b)));
return r, e return r, e
} }
func (fd *FD) Name() string {
return fd.name
}
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package sort
export type SortInterface interface {
len() int;
less(i, j int) bool;
swap(i, j int);
}
export func Sort(data SortInterface) {
// Bubble sort for brevity
for i := 0; i < data.len(); i++ {
for j := i; j < data.len(); j++ {
if data.less(j, i) {
data.swap(i, j)
}
}
}
}
export func IsSorted(data SortInterface) bool {
n := data.len();
for i := n - 1; i > 0; i-- {
if data.less(i, i - 1) {
return false;
}
}
return true;
}
// Convenience types for common cases
export type IntArray struct {
data *[]int;
}
func (p *IntArray) len() int { return len(p.data); }
func (p *IntArray) less(i, j int) bool { return p.data[i] < p.data[j]; }
func (p *IntArray) swap(i, j int) { p.data[i], p.data[j] = p.data[j], p.data[i]; }
export type FloatArray struct {
data *[]float;
}
func (p *FloatArray) len() int { return len(p.data); }
func (p *FloatArray) less(i, j int) bool { return p.data[i] < p.data[j]; }
func (p *FloatArray) swap(i, j int) { p.data[i], p.data[j] = p.data[j], p.data[i]; }
export type StringArray struct {
data *[]string;
}
func (p *StringArray) len() int { return len(p.data); }
func (p *StringArray) less(i, j int) bool { return p.data[i] < p.data[j]; }
func (p *StringArray) swap(i, j int) { p.data[i], p.data[j] = p.data[j], p.data[i]; }
// Convenience wrappers for common cases
export func SortInts(a *[]int) { Sort(&IntArray{a}); }
export func SortFloats(a *[]float) { Sort(&FloatArray{a}); }
export func SortStrings(a *[]string) { Sort(&StringArray{a}); }
export func IntsAreSorted(a *[]int) bool { return IsSorted(&IntArray{a}); }
export func FloatsAreSorted(a *[]float) bool { return IsSorted(&FloatArray{a}); }
export func StringsAreSorted(a *[]string) bool { return IsSorted(&StringArray{a}); }
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
import Sort "sort"
func ints() {
data := []int{74, 59, 238, -784, 9845, 959, 905, 0, 0, 42, 7586, -5467984, 7586};
a := Sort.IntArray{&data};
Sort.Sort(&a);
if !Sort.IsSorted(&a) {
panic()
}
}
func strings() {
data := []string{"monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"};
a := Sort.StringArray{&data};
Sort.Sort(&a);
if !Sort.IsSorted(&a) {
panic()
}
}
type Day struct {
num int;
short_name string;
long_name string;
}
type DayArray struct {
data *[]*Day;
}
func (p *DayArray) len() int { return len(p.data); }
func (p *DayArray) less(i, j int) bool { return p.data[i].num < p.data[j].num; }
func (p *DayArray) swap(i, j int) { p.data[i], p.data[j] = p.data[j], p.data[i]; }
func days() {
Sunday := Day{ 0, "SUN", "Sunday" };
Monday := Day{ 1, "MON", "Monday" };
Tuesday := Day{ 2, "TUE", "Tuesday" };
Wednesday := Day{ 3, "WED", "Wednesday" };
Thursday := Day{ 4, "THU", "Thursday" };
Friday := Day{ 5, "FRI", "Friday" };
Saturday := Day{ 6, "SAT", "Saturday" };
data := []*Day{&Tuesday, &Thursday, &Sunday, &Monday, &Friday};
a := DayArray{&data};
Sort.Sort(&a);
if !Sort.IsSorted(&a) {
panic()
}
for i := 0; i < len(data); i++ {
print(data[i].long_name, " ")
}
print("\n")
}
func main() {
ints();
strings();
days();
}
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment