Add description of how compiling and linking handle dependencies.

SVN=115807

Commit b806ba4d authored Apr 15, 2008 by Rob Pike

Showing with 264 additions and 1 deletion

doc/candl.txt +263 -0

src/lib/container/vector.go +1 -1

--- a/doc/candl.txt
+++ b/doc/candl.txt
+Compiling and Linking
+----
+
+Assume we have:
+
+	- one or more source files, *.go, perhaps in different directories
+	- a compiler, C; it takes one .go file and generates a .o file.
+	- a linker, L; it takes one or more .o files and generates a go.out (!) file.
+
+There is a question around naming of the files.  Let's avoid that
+problem for now and state that if the input is X.go, the output of
+the compiler is X.o, ignoring the package declaration in the file.
+This is not current behavior and probably not correct behavior, but
+it keeps the exposition simpler.
+
+Let's also assume that the linker knows about the run time and we
+don't have to specify bootstrap and runtime linkage explicitly.
+
+
+Basics
+----
+
+Given a single file, main.go, with no dependencies, we do:
+
+	C main.go  # compile
+	L main.o  # link
+	go.out  # run
+
+Now let's say that main.go contains
+
+	import "fmt"
+
+and that fmt.go contains
+
+	import "sys"
+
+Then to build, we must compile in dependency order:
+
+	C sys.go
+	C fmt.go
+	C main.go
+
+and then link
+
+	L main.o fmt.o sys.o
+
+To the linker itself, the order of arguments is unimportant.
+
+When we compile fmt.go, we need to know the details of the functions
+(etc.) exported by sys.go and used by fmt.go.  When we run
+
+	C fmt.go
+
+it discovers the import of sys, and must then read sys.o to discover
+the details.  We must therefore compile the exporting source file before we
+can compile the importing source.  Moreover, if there is a mismatch
+between export and import, we can discover it during compilation
+of the importing source.
+
+To be explicit, then, what we say is, in effect
+
+	C sys.go
+	C fmt.go sys.o
+	C main.go fmt.o sys.o
+	L main.o fmt.o sys.o
+
+
+The contents of .o files (I)
+----
+
+It's necessary to include in fmt.o the information for linking
+against the functions etc. in sys.o.  It's also possible to identify
+sys.o explicitly inside fmt.o, so we need to say only
+
+	L main.o fmt.o
+
+with sys.o discovered automatically.  Iterating again, it's easy
+to reduce the link step to
+
+	L main.o
+
+with L discovering automatically the .o files it needs to process
+to create the final go.out.
+
+
+Automation of dependencies (I)
+----
+
+It should be possible to automate discovery of the dependencies of
+main.go and therefore the order necessary to compile.  Since the
+source files contain explicit import statements, it is possible,
+given a source file, to discover the dependency tree automatically.
+(This will require rules and/or conventions about where to find
+things; for now assume everything is in the same directory.)
+
+The program that does this might be a variant of the
+compiler, since it must parse import statements at least, but for
+clarity let's call it D for dependency.  It can be a little like
+make, but let's not call it make because that brings along properties
+we don't want. In particular, it reads the sources to discover the
+dependencies; it doesn't need a separate description such as a
+Makefile.
+
+In a directory with the source files above, including main.go, but
+with no .o files, we say:
+
+	D main.go
+
+D reads main.go, finds the import for fmt, and in effect descends,
+automatically running
+
+	D fmt.go
+
+which in turn invokes
+
+	D sys.go
+
+The file sys.go has no dependencies, so it can be compiled; D
+therefore says in effect
+
+	"compile sys.go"
+
+and returns; then we have what we need for fmt.go since the exports
+in sys.go are known (or at least the recipe to discover them is
+known).  So the next level says
+
+	"compile fmt.go"
+
+and pops up, whereupon the top D says
+
+	"compile main.go"
+
+The output of D could therefore be described as a script to run to
+compile the source.
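
As an illustration outside the patch itself, the descent that D performs can be sketched in present-day Go as a depth-first walk that emits a compile line for a file only after all of its imports have been handled. The `imports` table and the `plan` function are hypothetical names; the table stands in for actually parsing import statements, and import cycles are not detected.

```go
package main

import "fmt"

// imports is a hypothetical stand-in for parsing the import statements
// of each source file. It encodes the example from the text:
// main.go imports fmt, and fmt.go imports sys.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// plan appends to script the compile commands needed for pkg. It visits
// every dependency first, so each .o file exists before anything that
// imports it is compiled. Packages already visited are skipped.
func plan(pkg string, done map[string]bool, script []string) []string {
	if done[pkg] {
		return script
	}
	done[pkg] = true
	for _, dep := range imports[pkg] {
		script = plan(dep, done, script)
	}
	return append(script, "C "+pkg+".go")
}

func main() {
	// Prints the script: C sys.go, then C fmt.go, then C main.go.
	for _, line := range plan("main", map[string]bool{}, nil) {
		fmt.Println(line)
	}
}
```

Returning the script as a list rather than running the compiler directly mirrors the distinction the text draws between D describing the work and D doing it.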
+
+We could imagine that instead, D actually runs the compiler.
+(Conversely, we could imagine that C uses D to make sure the
+dependencies are built, but that has the danger of causing unnecessary
+dependency checking and compilation; more on that later.)
+
+To build, therefore, all we need to say is:
+
+	D -c main.go  # -c means 'run the compiler'
+	L main.o
+
+Obviously, D at this stage could just run L.  Therefore, we can
+simplify further by having it do so, whereupon
+
+	D -c main.go
+
+can automate the complete compilation and linking process.
+
+Automation of dependencies (II)
+----
+
+Let's say we now edit main.go without changing its imports.  To
+recompile, we have two options. First, we could be explicit:
+
+	C main.go
+
+Or we could use D to automate running the compiler, as described
+in the previous section:
+
+	D -c main.go
+
+The D command will discover the import of fmt, but can see that fmt.o
+already exists.  Assuming its existence implies its currency, it need
+go no further; it can invoke C to compile main.go and link as usual.
+Whether it should make this assumption might be controlled by a flag.
+For the purpose of discussion, let's say it makes the assumption if
+the -c flag is set.
+
+There are two implications to this scheme. First, running D when D
+is going to turn around and run C anyway implies we could just run
+C directly and save one command invocation.  (We could decide
+independently whether C should automatically invoke the linker.)
+
+The other implication is more interesting.  If we stop traversing
+the dependency hierarchy as soon as we discover a .o file, then we
+may not realize that fmt.o is out of date and link against a stale
+binary. To fix this problem, we need to stat() or checksum the .o
+and .go files to see if they need recompilation.  Doing this every
+time is expensive and gets us back into the make-like approach.
+
+The great majority of compilations do not require this full check,
+however; this is especially true during the compile-debug-edit
+cycle.  We therefore propose splitting the model into two scenarios.
+
+Scenario 1: General
+
+In this scenario, we ask D to update the full dependency tree by
+stat()-ing or checksumming files to check currency.  The generated
+go.out will always be up to date but incremental compilation will
+be slower.  Typically, this will be necessary only after a major
+operation like syncing or checking out code, or if there are known
+changes being made to the dependencies.
+
+Scenario 2: Fast
+
+In this scenario, we explicitly tell D -c what has changed and have
+it compile only what is required.  Typically, this will mean compiling
+only the single active file or maybe a few files.  If an IDE is
+present or there is some watcher tool, it's easy to avoid the common
+mistake of forgetting to compile a changed file.
+
+If an edit has caused skew between export and import, this will be
+caught by the compiler, so it should be type-safe at least.  If D is
+running the compilation, it might be possible to arrange that C tells
+it there is a dependency problem and have D then try to resolve it
+by reevaluation.
+
+
+The contents of .o files (II)
+----
+
+For scenario 2, we can make things even faster if the .o files
+identify not just the files that must be imported to satisfy the
+imports, but details about the imports themselves.  Let's say main.go
+uses only one function from fmt.go, called F. If the compiled main.o
+says, in effect
+
+	from package fmt get F
+
+then the linker will not need to read all of fmt.o to link main.o;
+instead it can extract only the necessary function.
+
+Even better, if fmt is a package made of many files, it may be
+possible to store in main.o specific information about the exact
+files needed:
+
+	from file fmtF.o get F
+
+The linker then need not even open the other .o files that
+form package fmt.
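
As an illustration outside the patch itself, the "from file ... get ..." records could be modeled as below in present-day Go. The `need` type and `filesToOpen` function are hypothetical; the note does not specify an actual .o file format.

```go
package main

import (
	"fmt"
	"sort"
)

// need is a hypothetical record stored in a .o file, meaning
// "from file <file> get <symbol>".
type need struct {
	file   string
	symbol string
}

// filesToOpen returns the sorted, de-duplicated list of .o files the
// linker has to open to satisfy the given records. Files that belong
// to the same package but are not named in any record never appear.
func filesToOpen(needs []need) []string {
	seen := map[string]bool{}
	var files []string
	for _, n := range needs {
		if !seen[n.file] {
			seen[n.file] = true
			files = append(files, n.file)
		}
	}
	sort.Strings(files)
	return files
}

func main() {
	// main.o records that it needs only F, which lives in fmtF.o.
	mainNeeds := []need{{file: "fmtF.o", symbol: "F"}}
	fmt.Println(filesToOpen(mainNeeds)) // prints [fmtF.o]
}
```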
+
+The compiler should therefore be explicit and detailed within the .o
+files it generates about what elements of a package are needed by
+the program being compiled.
+
+Earlier, we said that when we run
+
+	C fmt.go
+
+it discovers the import of sys, and must then read sys.o to discover
+the details.  Note that if we record the information as specified here,
+when we then do
+
+	C main.go
+
+and it reads fmt.o, it does not in turn need to read sys.o; the necessary
+information has already been pulled up into fmt.o by C when fmt.go was
+compiled.
+
+Thus, once the dependency information is properly constructed, to
+compile a program X.go we must read X.go plus N .o files, where N
+is the number of packages explicitly imported by X.go.  The transitive
+closure need not be evaluated to compile a file, only the explicit
+imports.  By this result, we hope to dramatically reduce the amount
+of I/O necessary to compile a Go source file.
+
+To put this another way, if a package P imports packages Xi, the
+existence of Xi.o files is all that is needed to compile P because the
+Xi.o files contain the export information.  This is what breaks the
+transitive dependency closure.
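
As an illustration outside the patch itself, the saving can be made concrete by counting .o files read under the two models, using the main, fmt, sys example from the note. This is a sketch in present-day Go; `imports`, `direct`, and `closure` are hypothetical names.

```go
package main

import "fmt"

// imports maps each package to the packages it imports explicitly.
// It encodes the example from the text: main -> fmt -> sys.
var imports = map[string][]string{
	"main": {"fmt"},
	"fmt":  {"sys"},
	"sys":  nil,
}

// direct returns the number of .o files read when every .o file already
// carries the export information its importers need: one per explicit import.
func direct(pkg string) int {
	return len(imports[pkg])
}

// closure returns the number of .o files read if the compiler had to
// follow imports transitively instead.
func closure(pkg string) int {
	seen := map[string]bool{}
	var walk func(p string)
	walk = func(p string) {
		for _, dep := range imports[p] {
			if !seen[dep] {
				seen[dep] = true
				walk(dep)
			}
		}
	}
	walk(pkg)
	return len(seen)
}

func main() {
	// Prints: direct: 1 transitive: 2
	fmt.Println("direct:", direct("main"), "transitive:", closure("main"))
}
```

The gap is small for a three-file example, but it grows with the depth of the import graph, which is where the hoped-for reduction in I/O would come from.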
--- a/src/lib/container/vector.go
+++ b/src/lib/container/vector.go
@@ -123,7 +123,7 @@ func Test() {
 	for i := 0; i < v.Len(); i++ {
 		var x *I;
 		x = v.At(i);
-		print i, " ", x.val, "\n";  // BUG: can't use I(v.At(i))
+		print i, " ", x.val, "\n";
 	}
 }