Rewrite lexical section.

Put grammar productions into a box with a separate background color.

R=gri
DELTA=397 (132 added, 49 deleted, 216 changed)
OCL=25235
CL=25258

ff70f09d · Rob Pike · fd1f3830 · ff70f09d
Commit ff70f09d authored Feb 20, 2009 by Rob Pike

Showing with 320 additions and 243 deletions

doc/go_spec.html +320 -243

--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -156,13 +156,13 @@ compile/link model to generate executable binaries.
 The grammar is compact and regular, allowing for easy analysis by
 automatic tools such as integrated development environments.
 </p>
-<hr>
+<hr/>
 <h2>Notation</h2>
 <p>
 The syntax is specified using Extended Backus-Naur Form (EBNF):
 </p>
-<pre>
+<pre class="grammar">
 Production  = production_name "=" Expression .
 Expression  = Alternative { "|" Alternative } .
 Alternative = Term { Term } .
@@ -176,7 +176,7 @@ Repetition  = "{" Expression "}" .
 Productions are expressions constructed from terms and the following
 operators, in increasing precedence:
 </p>
-<pre>
+<pre class="grammar">
 |   alternation
 ()  grouping
 []  option (0 or 1 times)
@@ -199,23 +199,21 @@ The form <tt>"a ... b"</tt> represents the set of characters from
 Where possible, recursive productions are used to express evaluation order
 and operator precedence syntactically.
 </p>
-<hr>
+<hr/>
 <h2>Source code representation</h2>
-Source code is Unicode text encoded in UTF-8.
-<p>
-Tokenization follows the usual rules.  Source text is case-sensitive.
 <p>
-White space is blanks, newlines, carriage returns, or tabs.
-<p>
-Comments are // to end of line or /* */ without nesting and are treated as white space.
+Source code is Unicode text encoded in UTF-8. The text is not
+canonicalized, so a single accented code point is distinct from the
+same character constructed from combining an accent and a letter;
+those are treated as two code points.  For simplicity, this document
+will use the term <i>character</i> to refer to a Unicode code point.
+</p>
 <p>
-Some Unicode characters (e.g., the character U+00E4) may be representable in
-two forms, as a single code point or as two code points.  For simplicity of
-implementation, Go treats these as distinct characters: each Unicode code
-point is a single character in Go.
+Each code point is distinct; for instance, upper and lower case letters
+are different characters.
+</p>
 <h3>Characters</h3>
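The "not canonicalized" rule in the hunk above can be illustrated with a small Go sketch. It assumes a present-day Go toolchain, and the helper name `describe` is hypothetical. It compares the precomposed code point U+00E4 with the letter `a` followed by the combining diaeresis U+0308.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports the number of code points and the number of bytes in s.
// (Hypothetical helper, introduced only for this illustration.)
func describe(s string) (runes, bytes int) {
	return utf8.RuneCountInString(s), len(s)
}

func main() {
	precomposed := "\u00e4" // single code point U+00E4
	combining := "a\u0308"  // letter a followed by U+0308 (combining diaeresis)

	r1, b1 := describe(precomposed)
	r2, b2 := describe(combining)
	fmt.Println(r1, b1) // 1 2
	fmt.Println(r2, b2) // 2 3

	// No normalization is applied, so the two spellings are different strings.
	fmt.Println(precomposed == combining) // false
}
```

Both spellings usually render as the same glyph, which is why the spec text has to state that they are distinct.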
@@ -223,37 +221,66 @@ point is a single character in Go.
 The following terms are used to denote specific Unicode character classes:
 </p>
 <ul>
-	<li>unicode_char      an arbitrary Unicode code point
+	<li>unicode_char      an arbitrary Unicode code point</li>
-	<li>unicode_letter    a Unicode code point classified as "Letter"
+	<li>unicode_letter    a Unicode code point classified as "Letter"</li>
-	<li>capital_letter    a Unicode code point classified as "Letter, uppercase"
+	<li>capital_letter    a Unicode code point classified as "Letter, uppercase"</li>
+	<li>unicode_digit     a Unicode code point classified as "Digit"</li>
 </ul>
 (The Unicode Standard, Section 4.5 General Category - Normative.)
 <h3>Letters and digits</h3>
-<pre>
+<p>
+The underscore character <tt>_</tt> (U+005F) is considered a letter.
+</p>
+<pre class="grammar">
 letter        = unicode_letter | "_" .
 decimal_digit = "0" ... "9" .
 octal_digit   = "0" ... "7" .
 hex_digit     = "0" ... "9" | "A" ... "F" | "a" ... "f" .
 </pre>
-<hr>
-<h2>Vocabulary</h2>
-Tokens make up the vocabulary of the Go language. They consist of
-identifiers, numbers, strings, operators, and delimitors.
+<hr/>
+<h2>Lexical elements</h2>
+<h3>Comments</h3>
+<p>
+There are two forms of comments.  The first starts at the character
+sequence <tt>//</tt> and continues through the next newline.  The
+second starts at the character sequence <tt>/*</tt> and continues
+through the character sequence <tt>*/</tt>.  Comments do not nest.
+</p>
+<h3>Tokens</h3>
+<p>
+Tokens form the vocabulary of the Go language.
+There are four classes: identifiers, keywords, operators
+and delimiters, and literals.  <i>White space</i>, formed from
+blanks, tabs, and newlines, is ignored except as it separates tokens
+that would otherwise combine into a single token.  Comments
+behave as white space.  While breaking the input into tokens,
+the next token is the longest sequence of characters that form a
+valid token.
+</p>
 <h3>Identifiers</h3>
-An identifier is a name for a program entity such as a variable, a
-type, a function, etc.
-<pre>
-identifier = letter { letter | decimal_digit } .
+<p>
+Identifiers name program entities such as variables and types.
+An identifier is a sequence of one or more letters and digits.
+The first character in an identifier must be a letter.
+</p>
+<pre class="grammar">
+identifier    = letter { letter | unicode_digit } .
 </pre>
-Exported identifiers (§Exported identifiers) start with a capital_letter.
+<p>
+Exported identifiers (§Exported identifiers) start with a <tt>capital_letter</tt>.
+<br>
+<font color=red>TODO: This sentence feels out of place.</font>
+</p>
 <pre>
 a
 _x9
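The longest-match rule and the "comments behave as white space" rule from the Tokens paragraph can be observed with the standard `go/scanner` package. This is a sketch under the assumption that today's scanner still follows the rules as worded in this commit. The helper name `tokenize` is hypothetical.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize splits src into token strings using the standard Go scanner.
// (Hypothetical helper, introduced only for this illustration.)
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: comments are skipped

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok == token.SEMICOLON && lit == "\n" {
			continue // semicolon inserted automatically by the modern scanner
		}
		if lit != "" {
			out = append(out, lit) // identifiers and literals carry their text
		} else {
			out = append(out, tok.String()) // operators and delimiters
		}
	}
	return out
}

func main() {
	fmt.Println(tokenize("a+++b"))       // [a ++ + b]
	fmt.Println(tokenize("a /* c */ b")) // [a b]
}
```

In `a+++b` the scanner takes `++` first because it is the longest valid token at that position, leaving a single `+` behind.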
@@ -262,16 +289,46 @@ ThisVariableIsExported
 </pre>
 Some identifiers are predeclared (§Predeclared identifiers).
-<h3>Numeric literals</h3>
-An integer literal represents a mathematically ideal integer constant
-of arbitrary precision, or 'ideal int'.
-<pre>
-int_lit     = decimal_int | octal_int | hex_int .
-decimal_int = ( "1" ... "9" ) { decimal_digit } .
-octal_int   = "0" { octal_digit } .
-hex_int     = "0" ( "x" | "X" ) hex_digit { hex_digit } .
+<h3>Keywords</h3>
+<p>
+The following keywords are reserved and may not be used as identifiers.
+</p>
+<pre class="grammar">
+break        default      func         interface    select
+case         defer        go           map          struct
+chan         else         goto         package      switch
+const        fallthrough  if           range        type
+continue     for          import       return       var
+</pre>
+<h3>Operators and Delimiters</h3>
+<p>
+The following character sequences represent operators, delimiters, and other special tokens:
+</p>
+<pre class="grammar">
++    &amp;     +=    &amp;=     &amp;&amp;    ==    !=    (    )
+-    |     -=    |=     ||    &lt;     &lt;=    [    ]
+*    ^     *=    ^=     &lt;-    &gt;     &gt;=    {    }
+/    <<    /=    <<=    ++    =     :=    ,    ;
+%    >>    %=    >>=    --    !     ...   .    :
+</pre>
+<h3>Integer literals</h3>
+<p>
+An integer literal is a sequence of one or more digits in the
+corresponding base, which may be 8, 10, or 16.  An optional prefix
+sets a non-decimal base: <tt>0</tt> for octal, <tt>0x</tt> or
+<tt>0X</tt> for hexadecimal.  In hexadecimal literals, letters
+<tt>a-f</tt> and <tt>A-F</tt> represent values 10 through 15.
+</p>
+<pre class="grammar">
+int_lit       = decimal_lit | octal_lit | hex_lit .
+decimal_lit   = ( "1" ... "9" ) { decimal_digit } .
+octal_lit     = "0" { octal_digit } .
+hex_lit       = "0" ( "x" | "X" ) hex_digit { hex_digit } .
 </pre>
 <pre>
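The prefix rules for integer literals (`0` for octal, `0x` or `0X` for hexadecimal) match what `strconv.ParseInt` does when its base argument is 0. The sketch below uses that as a stand-in for the lexer. The helper name `parseIntLit` is hypothetical. Current Go versions of `ParseInt` also accept `0b` and `0o` prefixes and underscores with base 0, which go beyond the grammar in this commit.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIntLit parses s as an integer, letting the prefix select the base.
// (Hypothetical helper; strconv.ParseInt with base 0 is more permissive
// than the int_lit production shown in the diff.)
func parseIntLit(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	for _, s := range []string{"42", "017", "0x1F", "0X1f"} {
		v, err := parseIntLit(s)
		fmt.Println(s, v, err)
	}
	// 42 42 <nil>
	// 017 15 <nil>
	// 0x1F 31 <nil>
	// 0X1f 31 <nil>
}
```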
@@ -281,14 +338,20 @@ hex_int     = "0" ( "x" | "X" ) hex_digit { hex_digit } .
 170141183460469231731687303715884105727
 </pre>
-A floating point literal represents a mathematically ideal floating point
-constant of arbitrary precision, or 'ideal float'.
-<pre>
-float_lit =
-	decimals "." [ decimals ] [ exponent ] |
-	decimals exponent |
-	"." decimals [ exponent ] .
+<h3>Floating-point literals</h3>
+<p>
+A floating-point literal is a decimal representation of a floating-point
+number.  It has an integer part, a decimal point, a fractional part,
+and an exponent part.  The integer and fractional part comprise
+decimal digits; the exponent part is an <tt>e</TT> or <tt>E</tt>
+followed by an optionally signed decimal exponent.  One of the
+integer part or the fractional part may be elided; one of the decimal
+point or the exponent may be elided.
+</p>
+<pre class="grammar">
+float_lit    = decimals "." [ decimals ] [ exponent ] |
+               decimals exponent |
+               "." decimals [ exponent ] .
 decimals = decimal_digit { decimal_digit } .
 exponent = ( "e" | "E" ) [ "+" | "-" ] decimals .
 </pre>
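The three alternatives of `float_lit` (fraction elided, decimal point elided, integer part elided) can be exercised with `strconv.ParseFloat`. This is a rough sketch only. The helper name `parseFloatLit` is hypothetical, and `ParseFloat` accepts more than the grammar does (signs, plain integers, hexadecimal floats, "inf"), so it is an approximation of the production, not an implementation of it.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseFloatLit parses s as a 64-bit floating-point value.
// (Hypothetical helper; strconv.ParseFloat is more permissive than float_lit.)
func parseFloatLit(s string) (float64, error) {
	return strconv.ParseFloat(s, 64)
}

func main() {
	inputs := []string{
		"0.",        // decimals "." with the fractional part elided
		"72.40",     // decimals "." decimals
		"1E6",       // decimals exponent, decimal point elided
		".25",       // "." decimals, integer part elided
		".12345E+5", // "." decimals exponent
	}
	for _, s := range inputs {
		v, err := parseFloatLit(s)
		fmt.Println(s, v, err)
	}
}
```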
@@ -303,79 +366,90 @@ exponent = ( "e" | "E" ) [ "+" | "-" ] decimals .
 .12345E+5
 </pre>
-Numeric literals are unsigned. A negative constant is formed by
-applying the unary prefix operator "-" (§Arithmetic operators).
-<p>
-An 'ideal number' is either an 'ideal int' or an 'ideal float'.
+<h3>Ideal numbers</h3>
 <p>
-Only when an ideal number (or an arithmetic expression formed
-solely from ideal numbers) is bound to a variable or used in an expression
-or constant of fixed-size integers or floats it is required to fit
-a particular size.  In other words, ideal numbers and arithmetic
-upon them are not subject to overflow; only use of them in assignments
-or expressions involving fixed-size numbers may cause overflow, and thus
-an error (§Expressions).
+Integer literals represent values of arbitrary precision, or <i>ideal
+integers</i>.  Similarly, floating-point literals represent values
+of arbitrary precision, or <i>ideal floats</i>.  These <i>ideal
+numbers</i> have no size or type and cannot overflow.  However,
+when (used in an expression) assigned to a variable or typed constant,
+the destination must be able to represent the assigned value.
+</p>
 <p>
 Implementation restriction: A compiler may implement ideal numbers
-by choosing a "sufficiently large" internal representation of such
-numbers.
+by choosing a large internal representation of such numbers.
+<br>
+<font color=red>TODO: This is too vague. It used to say "sufficiently"
+but that doesn't help.  Define a minimum?</font>
+</p>
-<h3>Character and string literals</h3>
+<h3>Character literals</h3>
 <p>
-Character and string literals are almost the same as in C, with the
-following differences:
+A character literal represents an integer value, typically a
+Unicode code point, as one or more characters enclosed in single
+quotes.  Within the quotes, any character may appear except single
+quote and newline. A single quoted character represents itself,
+while multi-character sequences beginning with a backslash encode
+values in various formats.
 </p>
-<ul>
-	<li>The encoding is UTF-8
-	<li>`` strings exist; they do not interpret backslashes
-	<li>Octal character escapes are always 3 digits ("\077" not "\77")
-	<li>Hexadecimal character escapes are always 2 digits ("\x07" not "\x7")
-</ul>
-The rules are:
-<pre>
-escaped_char = "\" ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | "\" | "'" | """ ) .
-</pre>
 <p>
-A unicode_value takes one of four forms:
+The simplest form represents the single character within the quotes;
+since Go source text is Unicode characters encoded in UTF-8, multiple
+UTF-8-encoded bytes may represent a single integer value.  For
+instance, the literal <tt>'a'</tt> holds a single byte representing
+a literal <tt>a</tt>, Unicode U+0061, value <tt>0x61</tt>, while
+<tt>'ä'</tt> holds two bytes (<tt>0xc3</tt> <tt>0xa4</tt>) representing
+a literal <tt>a</tt>-dieresis, U+00E4, value <tt>0xe4</tt>.
 </p>
-<ul>
-	<li>The UTF-8 encoding of a Unicode code point.  Since Go source
-text is in UTF-8, this is the obvious translation from input
-text into Unicode characters.
-	<li>The usual list of C backslash escapes: "\n", "\t", etc.
-Within a character or string literal, only the corresponding quote character
-is a legal escape (this is not explicitly reflected in the above syntax).
-	<li>A `little u' value, such as "\u12AB".  This represents the Unicode
-code point with the corresponding hexadecimal value.  It always
-has exactly 4 hexadecimal digits.
-	<li>A `big U' value, such as "\U00101234".  This represents the
-Unicode code point with the corresponding hexadecimal value.
-It always has exactly 8 hexadecimal digits.
-</ul>
-Some values that can be represented this way are illegal because they
-are not valid Unicode code points.  These include values above
-0x10FFFF and surrogate halves.
 <p>
-An octal_byte_value contains three octal digits.  A hex_byte_value
-contains two hexadecimal digits.  (Note: This differs from C but is
-simpler.)
+Several backslash escapes allow arbitrary values to be represented
+as ASCII text.  There are four ways to represent the integer value
+as a numeric constant: <tt>\x</tt> followed by exactly two hexadecimal
+digits; <tt>\u</tt> followed by exactly four hexadecimal digits;
+<tt>\U</tt> followed by exactly eight hexadecimal digits, and a
+plain backslash <tt>\</tt> followed by exactly three octal digits.
+In each case the value of the literal is the value represented by
+the digits in the corresponding base.
+</p>
 <p>
-It is erroneous for an octal_byte_value to represent a value larger than 255. 
-(By construction, a hex_byte_value cannot.)
+Although these representations all result in an integer, they have
+different valid ranges.  Octal escapes must represent a value between
+0 and 255 inclusive.  (Hexadecimal escapes satisfy this condition
+by construction). The `Unicode' escapes <tt>\u</tt> and <tt>\U</tt>
+represent Unicode code points so within them some values are illegal,
+in particular those above <tt>0x10FFFF</tt> and surrogate halves.
+</p>
 <p>
-A character literal is a form of unsigned integer constant.  Its value
-is that of the Unicode code point represented by the text between the
-quotes.
+After a backslash, certain single-character escapes represent special values:
+</p>
+<pre class="grammar">
+\a   U+0007 alert or bell
+\b   U+0008 backspace
+\f   U+000C form feed
+\n   U+000A line feed or newline
+\r   U+000D carriage return
+\t   U+0009 horizontal tab
+\v   U+000b vertical tab
+\\   U+005c backslash
+\'   U+0027 single quote  (valid escape only within character literals)
+\"   U+0022 double quote  (valid escape only within string literals)
+</pre>
+<p>
+All other sequences are illegal inside character literals.
+</p>
+<pre class="grammar">
+char_lit         = "'" ( unicode_value | byte_value ) "'" .
+unicode_value    = unicode_char | little_u_value | big_u_value | escaped_char .
+byte_value       = octal_byte_value | hex_byte_value .
+octal_byte_value = "\" octal_digit octal_digit octal_digit .
+hex_byte_value   = "\" "x" hex_digit hex_digit .
+little_u_value   = "\" "u" hex_digit hex_digit hex_digit hex_digit .
+big_u_value      = "\" "U" hex_digit hex_digit hex_digit hex_digit
+                           hex_digit hex_digit hex_digit hex_digit .
+escaped_char     = "\" ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | "\" | "'" | """ ) .
+</pre>
 <pre>
 'a'
 'ä'
@@ -390,30 +464,47 @@ quotes.
 '\U00101234'
 </pre>
-String literals come in two forms: double-quoted and back-quoted.
-Double-quoted strings have the usual properties; back-quoted strings
-do not interpret backslashes at all.
-<pre>
-string_lit = raw_string_lit | interpreted_string_lit .
-raw_string_lit = "`" { unicode_char } "`" .
+<p>
+The value of a character literal is an ideal integer, just as with
+integer literals.
+</p>
+<h3>String literals</h3>
+<p>
+String literals represent constant values of type <tt>string</tt>.
+There are two forms: raw string literals and interpreted string
+literals.
+</p>
+<p>
+Raw string literals are character sequences between back quotes
+<tt>``</tt>.  Within the quotes, any character is legal except
+newline and back quote. The value of a raw string literal is the
+string composed of the uninterpreted bytes between the quotes;
+in particular, backslashes have no special meaning.
+</p>
+<p>
+Interpreted string literals are character sequences between double
+quotes <tt>&quot;&quot;</tt>. The text between the quotes forms the
+value of the literal, with backslash escapes interpreted as they
+are in character literals (except that <tt>\'</tt> is illegal and
+<tt>\"</tt> is legal).  The three-digit octal (<tt>\000</tt>)
+and two-digit hexadecimal (<tt>\x00</tt>) escapes represent individual
+<i>bytes</i> of the resulting string; all other escapes represent
+the (possibly multi-byte) UTF-8 encoding of individual <i>characters</i>.
+Thus inside a string literal <tt>\377</tt> and <tt>\xFF</tt> represent
+a single byte of value <tt>0xFF</tt>=255, while <tt>ÿ</tt>,
+<tt>\u00FF</tt>, <tt>\U000000FF</tt> and <tt>\xc3\xbf</tt> represent
+the two bytes <tt>0xc3 0xbf</tt> of the UTF-8 encoding of character
+U+00FF.
+</p>
+<pre class="grammar">
+string_lit             = raw_string_lit | interpreted_string_lit .
+raw_string_lit         = "`" { unicode_char } "`" .
 interpreted_string_lit = """ { unicode_value | byte_value } """ .
 </pre>
-A string literal has type "string" (§Strings).  Its value is constructed
-by taking the byte values formed by the successive elements of the
-literal.  For byte_values, these are the literal bytes; for
-unicode_values, these are the bytes of the UTF-8 encoding of the
-corresponding Unicode code points.  Note that
-	"\u00FF"
-and
-	"\xFF"
-are
-different strings: the first contains the two-byte UTF-8 expansion of
-the value 255, while the second contains a single byte of value 255.
-The same rules apply to raw string literals, except the contents are
-uninterpreted UTF-8.
 <pre>
 `abc`
 `\n`
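The byte-versus-character distinction in the String literals text can be checked directly by comparing string lengths and contents. A minimal sketch, assuming a current Go compiler. The helper name `byteLen` is hypothetical.

```go
package main

import "fmt"

// byteLen returns the number of bytes in s.
// (Hypothetical helper, introduced only for this illustration.)
func byteLen(s string) int {
	return len(s)
}

func main() {
	// \xFF and \377 each produce one byte; \u00FF produces the two-byte
	// UTF-8 encoding of U+00FF.
	fmt.Println(byteLen("\xff"), byteLen("\377"), byteLen("\u00FF")) // 1 1 2
	fmt.Println("\u00FF" == "\xc3\xbf")                              // true

	// Raw string literals do not interpret backslashes.
	fmt.Println(byteLen(`\n`), `\n` == "\\n") // 2 true

	// Input text and explicit UTF-8 bytes give the same string.
	fmt.Println("日本語" == "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e") // true
}
```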
@@ -426,61 +517,38 @@ uninterpreted UTF-8.
 "\xff\u00FF"
 </pre>
+<p>
 These examples all represent the same string:
+</p>
 <pre>
-"日本語"  // UTF-8 input text
+"日本語"                                 // UTF-8 input text
-`日本語`  // UTF-8 input text as a raw literal
+`日本語`                                 // UTF-8 input text as a raw literal
-"\u65e5\u672c\u8a9e"  // The explicit Unicode code points
+"\u65e5\u672c\u8a9e"                    // The explicit Unicode code points
-"\U000065e5\U0000672c\U00008a9e"  // The explicit Unicode code points
+"\U000065e5\U0000672c\U00008a9e"        // The explicit Unicode code points
 "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"  // The explicit UTF-8 bytes
 </pre>
-Adjacent strings separated only by whitespace (including comments)
-are concatenated into a single string. The following two lines
-represent the same string:
+<p>
+Adjacent string literals separated only by the empty string, white
+space, or comments are concatenated into a single string literal.
+</p>
+<pre class="grammar">
+StringLit              = string_lit { string_lit } .
+</pre>
 <pre>
 "Alea iacta est."
 "Alea " /* The die */ `iacta est` /* is cast */ "."
 </pre>
-The language does not canonicalize Unicode text or evaluate combining
-forms.  The text of source code is passed uninterpreted.
 <p>
 If the source code represents a character as two code points, such as
 a combining form involving an accent and a letter, the result will be
 an error if placed in a character literal (it is not a single code
 point), and will appear as two code points if placed in a string
 literal.
+</p>
+<hr/>
-<h3>Operators and delimitors</h3>
-The following special character sequences serve as operators or delimitors:
-<pre>
-+    &amp;     +=    &amp;=     &amp;&amp;    ==    !=    (    )
--    |     -=    |=     ||    <     <=    [    ]
-*    ^     *=    ^=     <-    >     >=    {    }
-/    <<    /=    <<=    ++    =     :=    ,    ;
-%    >>    %=    >>=    --    !     ...   .    :
-</pre>
-<h3>Reserved words</h3>
-The following words are reserved and must not be used as identifiers:
-<pre>
-break        default      func         interface    select
-case         defer        go           map          struct
-chan         else         goto         package      switch
-const        fallthrough  if           range        type
-continue     for          import       return       var
-</pre>
-<hr>
 <h2>Declarations and scope rules</h2>
@@ -488,7 +556,7 @@ A declaration ``binds'' an identifier to a language entity (such as
 a package, constant, type, struct field, variable, parameter, result,
 function, method) and specifies properties of that entity such as its type.
-<pre>
+<pre class="grammar">
 Declaration = ConstDecl | TypeDecl | VarDecl | FunctionDecl | MethodDecl .
 </pre>
@@ -535,30 +603,33 @@ same identifier declared in an outer block.
 <h3>Predeclared identifiers</h3>
+<p>
 The following identifiers are predeclared:
+</p>
+<p>
 All basic types:
+</p>
-<pre>
+<pre class="grammar">
 bool, byte, uint8, uint16, uint32, uint64, int8, int16, int32, int64,
 float32, float64, string
 </pre>
 A set of platform-specific convenience types:
-<pre>
+<pre class="grammar">
 uint, int, float, uintptr
 </pre>
 The predeclared constants:
-<pre>
+<pre class="grammar">
 true, false, iota, nil
 </pre>
 The predeclared functions (note: this list is likely to change):
-<pre>
+<pre class="grammar">
 cap(), convert(), len(), make(), new(), panic(), panicln(), print(), println(), typeof(), ...
 </pre>
@@ -584,7 +655,7 @@ are never exported, but non-global fields/methods may be exported.
 A constant declaration binds an identifier to the value of a constant
 expression (§Constant expressions).
-<pre>
+<pre class="grammar">
 ConstDecl = "const" ( ConstSpec | "(" [ ConstSpecList ] ")" ) .
 ConstSpecList = ConstSpec { ";" ConstSpec } [ ";" ] .
 ConstSpec = IdentifierList [ CompleteType ] [ "=" ExpressionList ] .
@@ -753,7 +824,7 @@ const (
 A type declaration specifies a new type and binds an identifier to it.
 The identifier is called the ``type name''; it denotes the type.
-<pre>
+<pre class="grammar">
 TypeDecl = "type" ( TypeSpec | "(" [ TypeSpecList ] ")" ) .
 TypeSpecList = TypeSpec { ";" TypeSpec } [ ";" ] .
 TypeSpec = identifier Type .
@@ -791,7 +862,7 @@ The variable type must be a complete type (§Types).
 In some forms of declaration the type of the initial value defines the type
 of the variable.
-<pre>
+<pre class="grammar">
 VarDecl = "var" ( VarSpec | "(" [ VarSpecList ] ")" ) .
 VarSpecList = VarSpec { ";" VarSpec } [ ";" ] .
 VarSpec = IdentifierList ( CompleteType [ "=" ExpressionList ] | "=" ExpressionList ) .
@@ -827,13 +898,13 @@ var f = 3.1415  // f has float type
 The syntax
-<pre>
+<pre class="grammar">
 SimpleVarDecl = IdentifierList ":=" ExpressionList .
 </pre>
 is shorthand for
-<pre>
+<pre class="grammar">
 "var" IdentifierList = ExpressionList .
 </pre>
@@ -846,7 +917,7 @@ ch := new(chan int);
 Also, in some contexts such as "if", "for", or "switch" statements,
 this construct can be used to declare local temporary variables.
-<hr>
+<hr/>
 <h2>Types</h2>
@@ -857,8 +928,8 @@ A type may be specified by a type name (§Type declarations) or a type literal.
 A type literal is a syntactic construct that explicitly specifies the
 composition of a new type in terms of other (already declared) types.
-<pre>
+<pre class="grammar">
-Type = TypeName | TypeLit .
+Type = TypeName | TypeLit | "(" Type ")" .
 TypeName = QualifiedIdent.
 TypeLit =
 	ArrayType | StructType | PointerType | FunctionType | InterfaceType |
@@ -881,7 +952,7 @@ type of a pointer type, may be incomplete). Incomplete types are subject to usag
 restrictions; for instance the type of a variable must be complete where the
 variable is declared.
-<pre>
+<pre class="grammar">
 CompleteType = Type .
 </pre>
@@ -912,7 +983,7 @@ and strings.
 The following list enumerates all platform-independent numeric types:
-<pre>
+<pre class="grammar">
 byte     same as uint8 (for convenience)
 uint8    the set of all unsigned  8-bit integers (0 to 255)
@@ -944,7 +1015,7 @@ its corresponding unsigned type without loss).
 Additionally, Go declares a set of platform-specific numeric types for
 convenience:
-<pre>
+<pre class="grammar">
 uint     at least 32 bits, at most the size of the largest uint type
 int      at least 32 bits, at most the size of the largest int type
 float    at least 32 bits, at most the size of the largest float type
@@ -1006,7 +1077,7 @@ same type, called the element type. The element type must be a complete type
 negative. The elements of an array are designated by indices
 which are integers from 0 through the length - 1.
-<pre>
+<pre class="grammar">
 ArrayType = "[" ArrayLength "]" ElementType .
 ArrayLength = Expression .
 ElementType = CompleteType .
@@ -1046,7 +1117,7 @@ an identifier and type for each field. Within a struct type no field
 identifier may be declared twice and all field types must be complete
 types (§Types).
-<pre>
+<pre class="grammar">
 StructType = "struct" [ "{" [ FieldDeclList ] "}" ] .
 FieldDeclList = FieldDecl { ";" FieldDecl } [ ";" ] .
 FieldDecl = (IdentifierList CompleteType | [ "*" ] TypeName) [ Tag ] .
@@ -1134,7 +1205,7 @@ equal type only.
 A pointer type denotes the set of all pointers to variables of a given
 type, called the ``base type'' of the pointer, and the value "nil".
-<pre>
+<pre class="grammar">
 PointerType = "*" BaseType .
 BaseType = Type .
 </pre>
@@ -1178,7 +1249,7 @@ Pointer arithmetic of any kind is not permitted.
 A function type denotes the set of all functions with the same parameter
 and result types, and the value "nil".
-<pre>
+<pre class="grammar">
 FunctionType = "func" Signature .
 Signature = "(" [ ParameterList ] ")" [ Result ] .
 ParameterList = ParameterDecl { "," ParameterDecl } .
@@ -1236,7 +1307,7 @@ Type interfaces may be specified explicitly by interface types.
 An interface type denotes the set of all types that implement at least
 the set of methods specified by the interface type, and the value "nil".
-<pre>
+<pre class="grammar">
 InterfaceType = "interface" [ "{" [ MethodSpecList ] "}" ] .
 MethodSpecList = MethodSpec { ";" MethodSpec } [ ";" ] .
 MethodSpec = IdentifierList Signature | TypeName .
@@ -1344,7 +1415,7 @@ The number of elements of a slice is called its length; it is never negative.
 The elements of a slice are designated by indices which are
 integers from 0 through the length - 1.
-<pre>
+<pre class="grammar">
 SliceType = "[" "]" ElementType .
 </pre>
@@ -1436,7 +1507,7 @@ each be of a specific complete type (§Types) called the key and value type,
 respectively. The number of entries in a map is called its length; it is never
 negative.
-<pre>
+<pre class="grammar">
 MapType = "map" "[" KeyType "]" ValueType .
 KeyType = CompleteType .
 ValueType = CompleteType .
@@ -1491,7 +1562,7 @@ A channel provides a mechanism for two concurrently executing functions
 to synchronize execution and exchange values of a specified type. This
 type must be a complete type (§Types). <font color=red>(TODO could it be incomplete?)</font>
-<pre>
+<pre class="grammar">
 ChannelType = Channel | SendChannel | RecvChannel .
 Channel = "chan" ValueType .
 SendChannel = "chan" "&lt;-" ValueType .
@@ -1544,7 +1615,7 @@ the same ValueType. They are equal if both values were created by the same
 Types may be ``different'', ``structurally equal'', or ``identical''.
 Go is a type-safe language; generally different types cannot be mixed
 in binary operations, and values cannot be assigned to variables of different
-types. However, values may be assigned to variables of structually
+types. However, values may be assigned to variables of structurally
 equal types. Finally, type guards succeed only if the dynamic type
 is identical to or implements the type tested against (§Type guards).
 <p>
@@ -1659,7 +1730,7 @@ struct { a, b *T5 } and struct { a, b *T5 }
 As an example, "T0" and "T1" are equal but not identical because they have
 different declarations.
-<hr>
+<hr/>
 <h2>Expressions</h2>
@@ -1688,7 +1759,7 @@ should be ideal number, because for arrays, it is a constant.
 Operands denote the elementary values in an expression.
-<pre>
+<pre class="grammar">
 Operand  = Literal | QualifiedIdent | "(" Expression ")" .
 Literal  = BasicLit | CompositeLit | FunctionLit .
 BasicLit = int_lit | float_lit | char_lit | StringLit .
@@ -1713,7 +1784,7 @@ A qualified identifier is an identifier qualified by a package name.
 TODO(gri) expand this section.
 </font>
-<pre>
+<pre class="grammar">
 QualifiedIdent = { PackageName "." } identifier .
 PackageName = identifier .
 </pre>
@@ -1725,7 +1796,7 @@ Literals for composite data structures consist of the type of the value
 followed by a braced expression list for array, slice, and structure literals,
 or a list of expression pairs for map literals.
-<pre>
+<pre class="grammar">
 CompositeLit = LiteralType "(" [ ( ExpressionList | ExprPairList ) [ "," ] ] ")" .
 LiteralType = Type | "[" "..." "]" ElementType .
 ExprPairList = ExprPair { "," ExprPair } .
@@ -1798,7 +1869,7 @@ A function literal represents an anonymous function. It consists of a
 specification of the function type and the function body. The parameter
 and result types of the function type must all be complete types (§Types).
-<pre>
+<pre class="grammar">
 FunctionLit = "func" Signature Block .
 Block = "{" [ StatementList ] "}" .
 </pre>
@@ -1825,7 +1896,7 @@ as they are accessible in any way.
 <h3>Primary expressions</h3>
-<pre>
+<pre class="grammar">
 PrimaryExpr =
 	Operand |
 	PrimaryExpr Selector |
@@ -2175,7 +2246,7 @@ in f_extra.
 Operators combine operands into expressions.
-<pre>
+<pre class="grammar">
 Expression = UnaryExpr | Expression binaryOp UnaryExpr .
 UnaryExpr = PrimaryExpr | unary_op UnaryExpr .
@@ -2210,7 +2281,7 @@ The operand types in binary operations must be equal, with the following excepti
 Unary operators have the highest precedence. They are evaluated from
 right to left. Note that "++" and "--" are outside the unary operator
-hierachy (they are statements) and they apply to the operand on the left.
+hierarchy (they are statements) and they apply to the operand on the left.
 Specifically, "*p++" means "(*p)++" in Go (as opposed to "*(p++)" in C).
 <p>
 There are six precedence levels for binary operators:
@@ -2219,7 +2290,7 @@ operators, comparison operators, communication operators,
 "&amp;&amp;" (logical and), and finally "||" (logical or) with the
 lowest precedence:
-<pre>
+<pre class="grammar">
 Precedence    Operator
    6             *  /  %  &lt;&lt;  >>  &amp;
    5             +  -  |  ^
@@ -2251,7 +2322,7 @@ type as the first operand. The four standard arithmetic operators ("+", "-",
 "*", "/") apply to both integer and floating point types, while "+" also applies
 to strings and arrays; all other arithmetic operators apply to integer types only.
-<pre>
+<pre class="grammar">
 +    sum             integers, floats, strings, arrays
 -    difference      integers, floats
 *    product         integers, floats
@@ -2317,7 +2388,7 @@ Specifically, "x << 1" is the same as "x*2"; and "x >> 1" is the same as
 For integer operands, the unary operators "+", "-", and "^" are defined as
 follows:
-<pre>
+<pre class="grammar">
 +x                          is 0 + x
 -x    negation              is 0 - x
 ^x    bitwise complement    is m ^ x  with m = "all bits set to 1"
@@ -2347,7 +2418,7 @@ boolean values, pointer, interface, and channel types. Slice and
 map types only support testing for equality against the predeclared value
 "nil".
-<pre>
+<pre class="grammar">
 ==    equal
 !=    not equal
 <     less
@@ -2372,7 +2443,7 @@ and §Channel types, respectively.
 Logical operators apply to boolean operands and yield a boolean result.
 The right operand is evaluated conditionally.
-<pre>
+<pre class="grammar">
 &amp;&amp;    conditional and    p &amp;&amp; q  is  "if p then q else false"
 ||    conditional or     p || q  is  "if p then true else q"
 !     not                !p      is  "not p"
@@ -2580,13 +2651,13 @@ TODO: Complete this list as needed.
 <p>
 Constant expressions can be evaluated at compile time.
-<hr>
+<hr/>
 <h2>Statements</h2>
 Statements control execution.
-<pre>
+<pre class="grammar">
 Statement =
 	Declaration | LabelDecl | EmptyStat |
 	SimpleStat | GoStat | ReturnStat | BreakStat | ContinueStat | GotoStat |
@@ -2601,7 +2672,7 @@ SimpleStat =
 Statements in a statement list are separated by semicolons, which can be
 omitted in some cases as expressed by the OptSemicolon production.
-<pre>
+<pre class="grammar">
 StatementList = Statement { OptSemicolon Statement } .
 </pre>
@@ -2623,14 +2694,14 @@ is an empty statement, a statement list can always be ``terminated'' with a semi
 The empty statement does nothing.
-<pre>
+<pre class="grammar">
 EmptyStat = .
 </pre>
 <h3>Expression statements</h3>
-<pre>
+<pre class="grammar">
 ExpressionStat = Expression .
 </pre>
@@ -2648,14 +2719,14 @@ TODO: specify restrictions. 6g only appears to allow calls here.
 The "++" and "--" statements increment or decrement their operands
 by the (ideal) constant value 1.
-<pre>
+<pre class="grammar">
 IncDecStat = Expression ( "++" | "--" ) .
 </pre>
 The following assignment statements (§Assignments) are semantically
 equivalent:
-<pre>
+<pre class="grammar">
 IncDec statement    Assignment
 x++                 x += 1
 x--                 x -= 1
@@ -2669,11 +2740,9 @@ For instance, "x++" cannot be used as an operand in an expression.
 <h3>Assignments</h3>
-<pre>
+<pre class="grammar">
 Assignment = ExpressionList assign_op ExpressionList .
-</pre>
-<pre>
 assign_op = [ add_op | mul_op ] "=" .
 </pre>
@@ -2742,7 +2811,7 @@ and the "else" branch. If Expression evaluates to true,
 the "if" branch is executed. Otherwise the "else" branch is executed if present.
 If Condition is omitted, it is equivalent to true.
-<pre>
+<pre class="grammar">
 IfStat = "if" [ [ SimpleStat ] ";" ] [ Expression ] Block [ "else" Statement ] .
 </pre>
@@ -2792,7 +2861,7 @@ without the surrounding Block:
 Switches provide multi-way execution.
-<pre>
+<pre class="grammar">
 SwitchStat = "switch" [ [ SimpleStat ] ";" ] [ Expression ] "{" { CaseClause } "}" .
 CaseClause = SwitchCase ":" [ StatementList ] .
 SwitchCase = "case" ExpressionList | "default" .
@@ -2858,7 +2927,7 @@ case x == 4: f3();
 A for statement specifies repeated execution of a block. The iteration is
 controlled by a condition, a for clause, or a range clause.
-<pre>
+<pre class="grammar">
 ForStat = "for" [ Condition | ForClause | RangeClause ] Block .
 Condition = Expression .
 </pre>
@@ -2879,7 +2948,7 @@ additionally it may specify an init and post statement, such as an assignment,
 an increment or decrement statement. The init statement may also be a (simple)
 variable declaration; no variables can be declared in the post statement.
-<pre>
+<pre class="grammar">
 ForClause = [ InitStat ] ";" [ Condition ] ";" [ PostStat ] .
 InitStat = SimpleStat .
 PostStat = SimpleStat .
@@ -2917,7 +2986,7 @@ of iteration variables - and then executes the block. Iteration terminates
 when all entries have been processed, or if the for statement is terminated
 early, for instance by a break or return statement.
-<pre>
+<pre class="grammar">
 RangeClause = IdentifierList ( "=" | ":=" ) "range" Expression .
 </pre>
@@ -2970,7 +3039,7 @@ A go statement starts the execution of a function as an independent
 concurrent thread of control within the same address space. The expression
 must be a function or method call.
-<pre>
+<pre class="grammar">
 GoStat = "go" Expression .
 </pre>
@@ -2989,7 +3058,7 @@ A select statement chooses which of a set of possible communications
 will proceed.  It looks similar to a switch statement but with the
 cases all referring to communication operations.
-<pre>
+<pre class="grammar">
 SelectStat = "select" "{" { CommClause } "}" .
 CommClause = CommCase ":" [ StatementList ] .
 CommCase = "case" ( SendExpr | RecvExpr) | "default" .
@@ -3067,7 +3136,7 @@ TODO: Make semantics more precise.
 A return statement terminates execution of the containing function
 and optionally provides a result value or values to the caller.
-<pre>
+<pre class="grammar">
 ReturnStat = "return" [ ExpressionList ] .
 </pre>
@@ -3111,7 +3180,7 @@ func complex_f2() (re float, im float) {
 Within a for, switch, or select statement, a break statement terminates
 execution of the innermost such statement.
-<pre>
+<pre class="grammar">
 BreakStat = "break" [ identifier ].
 </pre>
@@ -3133,7 +3202,7 @@ L: for i < n {
 Within a for loop a continue statement begins the next iteration of the
 loop at the post statement.
-<pre>
+<pre class="grammar">
 ContinueStat = "continue" [ identifier ].
 </pre>
@@ -3144,7 +3213,7 @@ The optional identifier is analogous to that of a break statement.
 A label declaration serves as the target of a goto, break or continue statement.
-<pre>
+<pre class="grammar">
 LabelDecl = identifier ":" .
 </pre>
@@ -3159,7 +3228,7 @@ Error:
 A goto statement transfers control to the corresponding label statement.
-<pre>
+<pre class="grammar">
 GotoStat = "goto" identifier .
 </pre>
@@ -3187,7 +3256,7 @@ next case clause in a switch statement (§Switch statements). It may only
 be used in a switch statement, and only as the last statement in a case
 clause of the switch statement.
-<pre>
+<pre class="grammar">
 FallthroughStat = "fallthrough" .
 </pre>
@@ -3197,7 +3266,7 @@ FallthroughStat = "fallthrough" .
 A defer statement invokes a function whose execution is deferred to the moment
 when the surrounding function returns.
-<pre>
+<pre class="grammar">
 DeferStat = "defer" Expression .
 </pre>
@@ -3218,7 +3287,7 @@ for i := 0; i &lt;= 3; i++ {
 }
 </pre>
-<hr>
+<hr/>
 <h2>Function declarations</h2>
@@ -3227,7 +3296,7 @@ Functions contain declarations and statements.  They may be
 recursive. Except for forward declarations (see below), the parameter
 and result types of the signature must all be complete types (§Type declarations).
-<pre>
+<pre class="grammar">
 FunctionDecl = "func" identifier Signature [ Block ] .
 </pre>
@@ -3263,7 +3332,7 @@ it is declared within the scope of that type (§Type declarations). If the
 receiver value is not needed inside the method, its identifier may be omitted
 in the declaration.
-<pre>
+<pre class="grammar">
 MethodDecl = "func" Receiver identifier Signature [ Block ] .
 Receiver = "(" [ identifier ] [ "*" ] TypeName ")" .
 </pre>
@@ -3310,7 +3379,7 @@ base type and may be forward-declared.
 <h3>Length and capacity</h3>
-<pre>
+<pre class="grammar">
 Call      Argument type        Result
 len(s)    string, *string      string length (in bytes)
@@ -3345,7 +3414,7 @@ at any time the following relationship holds:
 Conversions syntactically look like function calls of the form
-<pre>
+<pre class="grammar">
 T(value)
 </pre>
@@ -3453,14 +3522,14 @@ TODO Once this has become clearer, connect new() and make() (new() may be
 explained by make() and vice versa).
 </font>
-<hr>
+<hr/>
 <h2>Packages</h2>
 A package is a package clause, optionally followed by import declarations,
 followed by a series of declarations.
-<pre>
+<pre class="grammar">
 Package = PackageClause { ImportDecl [ ";" ] } { Declaration [ ";" ] } .
 </pre>
@@ -3470,7 +3539,7 @@ purposes ($Declarations and scope rules).
 Every source file identifies the package to which it belongs.
 The file must begin with a package clause.
-<pre>
+<pre class="grammar">
 PackageClause = "package" PackageName .
 package Math
@@ -3480,7 +3549,7 @@ package Math
 A package can gain access to exported identifiers from another package
 through an import declaration:
-<pre>
+<pre class="grammar">
 ImportDecl = "import" ( ImportSpec | "(" [ ImportSpecList ] ")" ) .
 ImportSpecList = ImportSpec { ";" ImportSpec } [ ";" ] .
 ImportSpec = [ "." | PackageName ] PackageFileName .
@@ -3568,7 +3637,7 @@ func main() {
 }
 </pre>
-<hr>
+<hr/>
 <h2>Program initialization and execution</h2>
@@ -3577,7 +3646,7 @@ or "new()", and no explicit initialization is provided, the memory is
 given a default initialization.  Each element of such a value is
 set to the ``zero'' for that type: "false" for booleans, "0" for integers,
 "0.0" for floats, '''' for strings, and "nil" for pointers and interfaces.
-This intialization is done recursively, so for instance each element of an
+This initialization is done recursively, so for instance each element of an
 array of integers will be set to 0 if no other value is specified.
 <p>
 These two simple declarations are equivalent:
@@ -3640,7 +3709,7 @@ invoking main.main().
 <p>
 When main.main() returns, the program exits.
-<hr>
+<hr/>
 <h2>Systems considerations</h2>
@@ -3652,7 +3721,7 @@ system. A package using "unsafe" must be vetted manually for type safety.
 <p>
 The package "unsafe" provides (at least) the following package interface:
-<pre>
+<pre class="grammar">
 package unsafe
 const Maxalign int
@@ -3712,7 +3781,7 @@ The results of calls to "unsafe.Alignof", "unsafe.Offsetof", and
 For the arithmetic types (§Arithmetic types), a Go compiler guarantees the
 following sizes:
-<pre>
+<pre class="grammar">
 type                      size in bytes
 byte, uint8, int8         1
@@ -3737,7 +3806,15 @@ A Go compiler guarantees the following minimal alignment properties:
   unsafe.Alignof(x[0]), but at least 1.
 </ol>
-<hr>
+<hr/>
+<h2><font color=red>Differences between this doc and implementation - TODO</font></h2>
+<p>
+<font color=red>
+Current implementation accepts only ASCII digits for digits; doc says Unicode.
+<br>
+</font>
+</p>
 </div>
 </body>