Commit 18c5b488, authored Mar 02, 2008 by Robert Griesemer

Go spec starting point.

SVN=111041

parent d82b11e4

Showing 1 changed file with 1197 additions and 0 deletions

doc/go_spec (new file, mode 100644)
The Go Annotated Specification

This document supersedes all previous Go spec attempts. The intent is
to make this a reference for syntax and semantics. It is annotated
with additional information not strictly belonging into a language
spec.


Recent design decisions

A list of decisions made but for which we haven't incorporated proper
language into this spec. Keep this section small and the spec
up-to-date instead.
- multi-dimensional arrays: implementation restriction for now
- no '->', always '.'
- (*a)[i] can be sugared into: a[i]
- '.' to select package elements
- arrays are not automatically pointers, we must always say
explicitly: "*array T" if we mean a pointer to that array
- there is no pointer arithmetic in the language
- there are no unions
- packages: need to pin it all down
- tuple notation: (a, b) = (b, a);
generally: need to make this clear
- for now: no (C) 'static' variables inside functions
- exports: we write: 'export a, b, c;' (with a, b, c, etc. a list of
  exported names, possibly also: structure.field)
- the ordering of methods in interfaces is not relevant
- structs must be identical (same decl) to be the same
(Ken has different implementation: equivalent declaration is the
same; what about methods?)
- new methods can be added to a struct outside the package where the
struct is declared (need to think through all implications)
- array assignment by value
- do we need a type switch?
- write down scoping rules for statements
- semicolons: where are they needed and where are they not needed.
need a simple and consistent rule
- we have: postfix ++ and -- as statements
Guiding principles
Go is an attempt at a new systems programming language.
[gri: this needs to be expanded. some keywords below]
- small, concise, crisp
- procedural
- strongly typed
- few, orthogonal, and general concepts
- avoid repetition of declarations
- multi-threading support in the language
- garbage collected
- containers w/o templates
- compiler can be written in Go and so can its GC
- very fast compilation possible (1MLOC/s stretch goal)
- reasonably efficient (C ballpark)
- compact, predictable code (local program changes generally have local effects)
- no macros
Syntax

The syntax of Go borrows from the C tradition with respect to
statements and from the Pascal tradition with respect to declarations.
Go programs are written using a lean notation with a small set of
keywords, without filler keywords (such as 'of', 'to', etc.) or other
gratuitous syntax, and with a slight preference for expressive
keywords (e.g. 'function') over operators or other syntactic
mechanisms. Generally, "light" language features (variables, simple
control flow, etc.) are expressed using a light-weight notation (short
keywords, little syntax), while "heavy" language features use a more
heavy-weight notation (longer keywords, more syntax).

[gri: should say something about syntactic alternatives: if a
syntactic form foreseeably will lead to a style recommendation, try to
make that the syntactic form instead. For instance, Go structured
statements always require the {} braces even if there is only a single
sub-statement. Similar ideas apply elsewhere.]
Modularity, identifiers and scopes

A Go program consists of one or more files compiled separately, though
not independently. A single file or compilation unit may make
individual identifiers visible to other files by marking them as
exported; there is no "header file". The exported interface of a file
may be exposed in condensed form (without the corresponding
implementation) through tools.

A package collects types, constants, functions, and so on into a named
entity that may be imported to enable its constituents to be used in
another compilation unit. Each source file is part of exactly one
package; each package is constructed from one source file.

Within a file, all identifiers are declared explicitly (except for
general predeclared identifiers such as true and false) and thus for
each identifier in a file the corresponding declaration can be found
in that same file (usually before its use, except for the rare case of
forward declarations). Identifiers may denote program entities that
are implemented in other files. Nevertheless, such identifiers are
still declared via an import declaration in the file that is referring
to them. This explicit declaration requirement ensures that every
compilation unit can be read by itself.

The scoping of identifiers is uniform: An identifier is visible from
the point of its declaration to the end of the immediately surrounding
block, and nested identifiers shadow outer identifiers with the same
name. All identifiers are in the same namespace; i.e., no two
identifiers in the same scope may have the same name even if they
denote different language concepts (for instance, a variable vs. a
function). Uniform scoping rules make Go programs easier to read and
to understand.
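
The shadowing rule can be sketched in present-day Go. This is only an
illustration: the syntax postdates this draft, and the names x and
shadowDemo are made up for the example. Note also that present-day Go
makes package-level names visible throughout the package, so only the
block-level part of the rule is shown.

```go
package main

import "fmt"

// x is declared at the top level of the file (hypothetical name).
var x = "outer"

// shadowDemo returns the value of x seen inside a nested block and the
// value seen again after that block has ended.
func shadowDemo() (inner, after string) {
	{
		x := "inner" // shadows the outer x until the closing brace
		inner = x
	}
	after = x // the nested declaration is out of scope; outer x is visible
	return inner, after
}

func main() {
	inner, after := shadowDemo()
	fmt.Println(inner, after)
}
```

The nested declaration does not modify the outer variable; it only
hides it for the remainder of the inner block.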
Program structure

A compilation unit consists of a package specifier followed by import
declarations followed by other declarations. There are no statements
at the top level of a file.

[gri: do we have a main function? or do we treat all functions
uniformly and instead permit a program to be started by providing a
package name and a "start" function? I like the latter because it
gives a lot of flexibility and should not be hard to implement].
[r: i suggest that we define a symbol, main or Main or start or Start,
and begin execution in the single exported function of that name in
the program. the flexibility of having a choice of name is
unimportant and the corresponding need to define the name in order to
link or execute adds complexity. by default it should be trivial; we
could allow a run-time flag to override the default for gri's
flexibility.]
Typing, polymorphism, and object-orientation
Go programs are strongly typed; i.e., each program entity has a static
type known at compile time. Variables also have a dynamic type, which
is the type of the value they hold at run-time. Generally, the
dynamic and the static type of a variable are identical, except for
variables of interface type. In that case the dynamic type of the
variable is a pointer to a structure that implements the variable's
(static) interface type. There may be many different structures
implementing an interface and thus the dynamic type of such variables
is generally not known at compile time. Such variables are called
polymorphic.

Interface types are the mechanism to support an object-oriented
programming style. Different interface types are independent of each
other and no explicit hierarchy is required (such as single or
multiple inheritance explicitly specified through respective type
declarations). Interface types only define a set of functions that a
corresponding implementation must provide. Thus interface and
implementation are strictly separated.

An interface is implemented by associating functions (methods) with
structures. If a structure implements all methods of an interface, it
implements that interface and thus can be used where that interface is
required. Unless used through a variable of interface type, methods
can always be statically bound (they are not "virtual"), and incur no
runtime overhead compared to an ordinary function.

Go has no explicit notion of classes, sub-classes, or inheritance.
These concepts are trivially modeled in Go through the use of
functions, structures, associated methods, and interfaces.

Go has no explicit notion of type parameters or templates. Instead,
containers (such as stacks, lists, etc.) are implemented through the
use of abstract data types operating on interface types. [gri: there
is some automatic boxing, semi-automatic unboxing support for basic
types].
Pointers and garbage collection

Variables may be allocated automatically (when entering the scope of
the variable) or explicitly on the heap. Pointers are used to refer
to heap-allocated variables. Pointers may also be used to point to
any other variable; such a pointer is obtained by "getting the
address" of that variable. In particular, pointers may point "inside"
other variables, or to automatic variables (which are usually
allocated on the stack). Variables are automatically reclaimed when
they are no longer accessible. There is no pointer arithmetic in Go.
Functions

Functions contain declarations and statements. They may be invoked
recursively. Functions may declare nested functions, and nested
functions have access to the variables in the surrounding functions;
they are in fact closures. Functions may be anonymous and appear as
literals in expressions.
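
A minimal sketch of the closure behaviour described above, written in
present-day Go rather than the notation of this draft. The name
newCounter is hypothetical.

```go
package main

import "fmt"

// newCounter returns an anonymous function (a function literal) that
// captures the local variable n of the surrounding function.
func newCounter() func() int {
	n := 0
	return func() int {
		n++ // refers to n in the surrounding function, not a copy
		return n
	}
}

func main() {
	next := newCounter()
	fmt.Println(next(), next(), next())
}
```

Each call to newCounter creates a fresh n, so two counters obtained
from separate calls advance independently.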
Multithreading and channels

[Rob: We need something here]
Notation

The syntax is specified in green productions using Extended
Backus-Naur Form (EBNF). In particular:

''  encloses lexical symbols
|   separates alternatives
()  used for grouping
[]  specifies option (0 or 1 times)
{}  specifies repetition (0 to n times)

A production may be referred to from various places in this document
but is usually defined close to its first use. Code examples are
written in gray. Annotations are in blue, and open issues are in red.
One goal is to get rid of all red text in this document. [r: done!]
Vocabulary and representation

REWRITE THIS: BADLY EXPRESSED

Go program source is a sequence of characters. Each character is a
Unicode code point encoded in UTF-8.

A Go program is a sequence of symbols satisfying the Go syntax. A
symbol is a non-empty sequence of characters. Symbols are
identifiers, numbers, strings, operators, delimiters, and comments.
White space must not occur within symbols (except in comments, and in
the case of blanks and tabs in strings). White space is ignored
unless it is essential to separate two consecutive symbols.

White space is composed of blanks, newlines, carriage returns, and
tabs only.

A character is a Unicode code point. In particular, capital and
lower-case letters are considered as being distinct. Note that some
Unicode characters (e.g., the character ä) may be representable in
two forms, as a single code point, or as two code points. For the
Unicode standard these two encodings represent the same character, but
for Go, these two encodings correspond to two different characters.
Source encoding

The input is encoded in UTF-8. In the grammar we use the notation

utf8_char

to refer to an arbitrary Unicode code point encoded in UTF-8.
Digits and Letters

octal_digit = { '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' } .
decimal_digit = { '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' } .
hex_digit = { '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | 'a' |
              'A' | 'b' | 'B' | 'c' | 'C' | 'd' | 'D' | 'e' | 'E' | 'f' | 'F' } .
letter = 'A' | 'a' | ... 'Z' | 'z' | '_' .

For now, letters and digits are ASCII. We may expand this to allow
Unicode definitions of letters and digits.
Identifiers

An identifier is a name for a program entity such as a variable, a
type, a function, etc.

identifier = letter { letter | decimal_digit } .
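
The identifier production can be read as a small recognizer. The
sketch below is in present-day Go and assumes the ASCII-only
definition of letters and digits given in this draft; the function
names are made up for the example.

```go
package main

import "fmt"

// isLetter implements: letter = 'A' | 'a' | ... 'Z' | 'z' | '_' .
func isLetter(c byte) bool {
	return ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z') || c == '_'
}

// isDecimalDigit accepts a single character '0' through '9'.
func isDecimalDigit(c byte) bool {
	return '0' <= c && c <= '9'
}

// isIdentifier implements: identifier = letter { letter | decimal_digit } .
func isIdentifier(s string) bool {
	if len(s) == 0 || !isLetter(s[0]) {
		return false // must start with a letter
	}
	for i := 1; i < len(s); i++ {
		if !isLetter(s[i]) && !isDecimalDigit(s[i]) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"a", "_x9", "x25", "9lives", "", "ä"} {
		fmt.Printf("%q %v\n", s, isIdentifier(s))
	}
}
```

Because letters are ASCII for now, a name such as "ä" is rejected by
this recognizer: its first UTF-8 byte is not a letter.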
- need to explain scopes, visibility (elsewhere)
- need to say something about predeclared identifiers, and their
  (universe) scope (elsewhere)
Character and string literals

A RawStringLit is a string literal delimited by back quotes ``; the
first back quote encountered after the opening back quote terminates
the string.

RawStringLit = '`' { utf8_char } '`' .

`abc`
`\n`

Character and string literals are very similar to C except:
- Octal character escapes are always 3 digits (\077 not \77)
- Hexadecimal character escapes are always 2 digits (\x07 not \x7)
- Strings are UTF-8 and represent Unicode
- `` strings exist; they do not interpret backslashes

CharLit = '\'' ( UnicodeValue | ByteValue ) '\'' .
StringLit = RawStringLit | InterpretedStringLit .
InterpretedStringLit = '"' { UnicodeValue | ByteValue } '"' .
ByteValue = OctalByteValue | HexByteValue .
OctalByteValue = '\' octal_digit octal_digit octal_digit .
HexByteValue = '\' 'x' hex_digit hex_digit .
UnicodeValue = utf8_char | EscapedCharacter | LittleUValue | BigUValue .
LittleUValue = '\' 'u' hex_digit hex_digit hex_digit hex_digit .
BigUValue = '\' 'U' hex_digit hex_digit hex_digit hex_digit
                    hex_digit hex_digit hex_digit hex_digit .
EscapedCharacter = '\' ( 'a' | 'b' | 'f' | 'n' | 'r' | 't' | 'v' ) .
An OctalByteValue contains three octal digits. A HexByteValue
contains two hexadecimal digits. (Note: This differs from C but is
simpler.)

It is erroneous for an OctalByteValue to represent a value larger than 255.
(By construction, a HexByteValue cannot.)

A UnicodeValue takes one of four forms:

1. The UTF-8 encoding of a Unicode code point. Since Go source
   text is in UTF-8, this is the obvious translation from input
   text into Unicode characters.
2. The usual list of C backslash escapes: \n \t etc.
3. A `little u' value, such as \u12AB. This represents the Unicode
   code point with the corresponding hexadecimal value. It always
   has exactly 4 hexadecimal digits.
4. A `big U' value, such as '\U00101234'. This represents the
   Unicode code point with the corresponding hexadecimal value.
   It always has exactly 8 hexadecimal digits.

Some values that can be represented this way are illegal because they
are not valid Unicode code points. These include values above
0x10FFFF and surrogate halves.
A character literal is a form of unsigned integer constant. Its value
is that of the Unicode code point represented by the text between the
quotes.

'a'
'ä'
'本'
'\t'
'\0'
'\07'
'\0377'
'\x7'
'\xff'
'\u12e4'
'\U00101234'
A string literal has type 'string'. Its value is constructed by
taking the byte values formed by the successive elements of the
literal. For ByteValues, these are the literal bytes; for
UnicodeValues, these are the bytes of the UTF-8 encoding of the
corresponding Unicode code points. Note that "\u00FF" and "\xFF" are
different strings: the first contains the two-byte UTF-8 expansion of
the value 255, while the second contains a single byte of value 255.
The same rules apply to raw string literals, except the contents are
uninterpreted UTF-8.

""
"Hello, world!\n"
"日本語"
"\u65e5本\U00008a9e"
"\xff\u00FF"
These examples all represent the same string:

"日本語"  // UTF-8 input text
`日本語`  // UTF-8 input text as a raw literal
"\u65e5\u672c\u8a9e"  // The explicit Unicode code points
"\U000065e5\U0000672c\U00008a9e"  // The explicit Unicode code points
"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"  // The explicit UTF-8 bytes
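
The equivalence of the five spellings, and the difference between
"\u00FF" and "\xFF", can be checked with a short program. This is a
sketch in present-day Go, which accepts these literal forms; the names
spellings and allEqual are made up for the example.

```go
package main

import "fmt"

// spellings holds the five literals listed above.
var spellings = []string{
	"日本語",
	`日本語`,
	"\u65e5\u672c\u8a9e",
	"\U000065e5\U0000672c\U00008a9e",
	"\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e",
}

// allEqual reports whether every string in ss equals the first one.
func allEqual(ss []string) bool {
	for _, s := range ss {
		if s != ss[0] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allEqual(spellings)) // true: same bytes in every case
	fmt.Println(len(spellings[0]))   // 9: three code points, three bytes each
	fmt.Println(len("\u00FF"))       // 2: UTF-8 encoding of code point 255
	fmt.Println(len("\xFF"))         // 1: a single byte of value 255
}
```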
The language does not canonicalize Unicode text or evaluate combining
forms. The text of source code is passed uninterpreted.

If the source code represents a character as two code points, such as
a combining form involving an accent and a letter, the result will be
an error if placed in a character literal (it is not a single code
point), and will appear as two code points if placed in a string
literal.
[This simple strategy may be insufficient in the long run but is
surely fine for now.]
Numeric literals

Integer literals take the usual C form, except for the absence of the
'U', 'L' etc. suffixes, and represent integer constants. (Character
literals are also integer constants.) Similarly, floating point
literals are also C-like, without suffixes and decimal only.

An integer constant represents an abstract integer value of arbitrary
precision. Only when an integer constant (or arithmetic expression
formed from integer constants) is assigned to a variable (or other
l-value) is it required to fit into a particular size - that of the
type of the variable. In other words, integer constants and
arithmetic upon them are not subject to overflow; only assignment of
integer constants (and constant expressions) to an l-value can cause
overflow. It is an error if the value of the constant or expression
cannot be represented correctly in the range of the type of the
l-value.
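
As an illustration of constants that overflow only on assignment, the
following sketch uses present-day Go, whose untyped constants behave
in this way. The names huge, small and fits are made up for the
example.

```go
package main

import "fmt"

// huge does not fit in any fixed-size integer type, yet it is a legal
// constant: constant arithmetic is exact.
const huge = 1 << 100

// small is derived from huge by exact constant arithmetic; its value is 4.
const small = huge >> 98

// fits assigns a constant expression to a typed variable. Only at this
// point does the value have to be representable in the variable's type.
func fits() int8 {
	var v int8 = small // accepted: 4 is in the range of int8
	// var w int8 = huge // rejected at compile time: value does not fit
	return v
}

func main() {
	fmt.Println(fits())
}
```

The intermediate value 1 << 100 never has to fit anywhere; the range
check applies only to the value that is finally assigned.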
Floating point literals also represent an abstract, ideal floating
point value that is constrained only upon assignment.
[r: what do we need to say here? trickier because of truncation of
fractions.]

IntLit = [ '+' | '-' ] UnsignedIntLit .
UnsignedIntLit = DecimalIntLit | OctalIntLit | HexIntLit .
DecimalIntLit = ( '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' )
                { decimal_digit } .
OctalIntLit = '0' { octal_digit } .
HexIntLit = '0' ( 'x' | 'X' ) hex_digit { hex_digit } .
FloatLit = [ '+' | '-' ] UnsignedFloatLit .
UnsignedFloatLit = "the usual decimal-only floating point representation".
Compound Literals

THIS SECTION IS WRONG

Compound literals require some fine tuning. I think we did ok in
Sawzall but there are some loose ends. I don't like that one cannot
easily distinguish between an array and a struct. We may need to
specify a type if these literals appear in expressions, but we don't
want to specify a type if these literals appear as initializer
expressions where the variable is already typed. And we don't want to
do any implicit conversions.

CompoundLit = ArrayLit | FunctionLit | StructureLit | MapLit.
ArrayLit = '{' [ ExpressionList ] ']'.  // all elems must have "the same" type
StructureLit = '{' [ ExpressionList ] '}'.
MapLit = '{' [ PairList ] '}'.
PairList = Pair { ',' Pair }.
Pair = Expression ':' Expression.
Literals
Literal = BasicLit | CompoundLit .
BasicLit = CharLit | StringLit | IntLit | FloatLit .
Function Literals
[THESE ARE CORRECT]
FunctionLit = FunctionType Block.
// Function literal
func (a, b int, z float) bool { return a*b < int(z); }
// Method literal
func (p *T) . (a, b int, z float) bool { return a*b < int(z) + p.x; }
Operators
- incomplete
Delimiters
- incomplete
Comments
There are two forms of comments.
The first starts at '//' and ends at a newline.
The second starts at '/*' and ends at the first '*/'. It may cross
newlines. It does not nest.
Comments are treated like white space.
Common productions
IdentifierList = identifier { ',' identifier }.
ExpressionList = Expression { ',' Expression }.
QualifiedIdent = [ PackageName '.' ] identifier.
PackageName = identifier.
Types
A type specifies the set of values which variables of that type may
assume, and the operators that are applicable.
Except for variables of interface types, the static type of a variable
(i.e. the type the variable is declared with) is the same as the
dynamic type of the variable (i.e. the type of the variable at
run-time). Variables of interface types may hold variables of
different dynamic types, but their dynamic types must be compatible
with the static interface type. At any given instant during run-time,
a variable has exactly one dynamic type. A type declaration
associates an identifier with a type.
Array and struct types are called structured types, all other types
are called unstructured. A structured type cannot contain itself.
[gri: this needs to be formulated much more precisely].
Type = TypeName | ArrayType | ChannelType | InterfaceType |
FunctionType | MapType | StructType | PointerType .
TypeName = QualifiedIdent.
[gri: To make the types specifications more precise we need to
introduce some general concepts such as what it means to 'contain'
another type, to be 'equal' to another type, etc. Furthermore, we are
imprecise as we sometimes use the word type, sometimes just the type
name (int), or the structure (array) to denote different things (types
and variables). We should explain more precisely. Finally, there is
a difference between equality of types and assignment compatibility -
or isn't there?]
Basic types

Go defines a number of basic types which are referred to by their
predeclared type names. There are signed and unsigned integer types,
and floating point types:

bool     the truth values true and false

uint8    the set of all unsigned 8bit integers
uint16   the set of all unsigned 16bit integers
uint32   the set of all unsigned 32bit integers
uint64   the set of all unsigned 64bit integers

byte     same as uint8

int8     the set of all signed 8bit integers, in 2's complement
int16    the set of all signed 16bit integers, in 2's complement
int32    the set of all signed 32bit integers, in 2's complement
int64    the set of all signed 64bit integers, in 2's complement

float32  the set of all valid IEEE-754 32bit floating point numbers
float64  the set of all valid IEEE-754 64bit floating point numbers
float80  the set of all valid IEEE-754 80bit floating point numbers

double   same as float64

Additionally, Go declares 3 basic types, uint, int, and float, which
are platform-specific. The bit width of these types corresponds to
the "natural bit width" for the respective types for the given
platform (e.g. int is usually the same as int32 on a 32bit
architecture, or int64 on a 64bit architecture). These types are by
definition platform-specific and should be used with the appropriate
caution.

[gri: do we specify minimal sizes for uint, int, float? e.g. int is at
least int32?]
[gri: do we say something about the correspondence of sizeof(*T) and
sizeof(int)? Are they the same?]
[r: do we want int128 and uint128?.]
Built-in types

Besides the basic types there is a set of built-in types: string, and
chan, with maybe more to follow.


Type string

The string type represents the set of string values (strings).
A string behaves like an array of bytes, with the following properties:

- They are immutable: after creation, it is not possible to change the
  contents of a string
- No internal pointers: it is illegal to create a pointer to an inner
  element of a string
- They can be indexed: given string s1, s1[i] is a byte value
- They can be concatenated: given strings s1 and s2, s1 + s2 is a value
  combining the elements of s1 and s2 in sequence
- Known length: the length of a string s1 can be obtained by the
  function/operator len(s1). [r: is it a builtin? do we make it a
  method? etc. this is a placeholder]. The length of a string is the
  number of bytes within. Unlike in C, there is no terminal NUL byte.
- Creation 1: a string can be created from an integer value by a conversion
    string('x') yields "x"
- Creation 2: a string can be created from an array of integer values
  (maybe just array of bytes) by a conversion
    a [3]byte; a[0] = 'a'; a[1] = 'b'; a[2] = 'c'; string(a) == "abc";

The language has string literals as discussed above. The type of a
string literal is 'string'.
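
The listed string properties can be tried out in present-day Go, with
two caveats that are not part of this draft: the conversion from an
integer is written with an explicit rune, and a byte array must be
sliced (a[:]) before conversion. The function names are made up for
the example.

```go
package main

import "fmt"

// describe indexes s1, concatenates s1 and s2, and measures the result.
func describe(s1, s2 string) (first byte, joined string, n int) {
	first = s1[0]    // indexing yields a byte value
	joined = s1 + s2 // concatenation combines the elements in sequence
	n = len(joined)  // length is counted in bytes
	return first, joined, n
}

// fromInt mirrors "Creation 1": a string made from one integer value.
func fromInt() string {
	return string(rune('x'))
}

// fromBytes mirrors "Creation 2": a string made from an array of bytes.
func fromBytes() string {
	var a [3]byte
	a[0] = 'a'
	a[1] = 'b'
	a[2] = 'c'
	return string(a[:]) // present-day Go converts a slice, not an array
}

func main() {
	first, joined, n := describe("ab", "cd")
	fmt.Println(first, joined, n)
	fmt.Println(fromInt(), fromBytes(), len("ä"))
}
```

Since the length counts bytes, len("ä") is 2 when the source spells ä
as a single code point: its UTF-8 encoding takes two bytes.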
Array types

An array is a structured type consisting of a number of elements which
are all of the same type, called the element type. The number of
elements of an array is called its length. The elements of an array
are designated by indices which are integers between 0 and the
length - 1.

THIS SECTION NEEDS WORK REGARDING STATIC AND DYNAMIC ARRAYS

An array type specifies a set of arrays with a given element type and
an optional array length. The array length must be a (compile-time)
constant expression, if present. Arrays without length specification
are called open arrays. An open array must not contain other open
arrays, and open arrays can only be used as parameter types or in a
pointer type (for instance, a struct may not contain an open array
field, but only a pointer to an open array).

[gri: Need to define when array types are the same! Also need to
define assignment compatibility]

[gri: Need to define a mechanism to get to the length of an array at
run-time. This could be a predeclared function 'length' (which may be
problematic due to the name). Alternatively, we could define an
interface for array types and say that there is a 'length()' method.
So we would write a.length() which I think is pretty clean.].
[r: if array types have an interface and a string is an array, some
stuff (but not enough) falls out nicely.]

ArrayType = 'array' { '[' ArrayLength ']' } ElementType.
ArrayLength = Expression.
ElementType = Type.

The notation

array [n][m] T

is a syntactic shortcut for

array [n] array [m] T.

(the shortcut may be applied recursively).

array uint8
array [64] struct { x, y: int32; }
array [1000][1000] float64
Channel types

ChannelType = 'channel' '(' Type '<-' Type ')' .

channel(int <- float)

- incomplete


Pointer types

- TODO: Need some intro here.

Two pointer types are the same if they are pointing to variables of
the same type.

PointerType = '*' Type.

- We do not allow pointer arithmetic of any kind.
Interface types

- TBD: This needs to be much more precise. For now we understand what
  it means.

An interface type specifies a set of methods, the "method interface"
of structs. No two methods in one interface can have the same name.

Two interfaces are the same if their set of functions is the same,
i.e., if all methods exist in both interfaces and if the function
names and signatures are the same. The order of declaration of
methods in an interface is irrelevant.

A set of interface types implicitly creates an unconnected, ordered
lattice of types. An interface type T1 is said to be smaller than or
equal to an interface type T2 (T1 <= T2) if the entire interface of T1
"is part" of T2. Thus, two interface types T1, T2 are the same if
T1 <= T2, and T2 <= T1, and thus we can write T1 == T2.

InterfaceType = 'interface' '{' { MethodDecl } '}' .
MethodDecl = identifier Signature ';' .

// An empty interface.
interface {};

// A basic file interface.
interface {
  Read(Buffer) bool;
  Write(Buffer) bool;
  Close();
}

Interface pointers can be implemented as "fat pointers"; namely a pair
(ptr, tdesc) where ptr is simply the pointer to a struct instance
implementing the interface, and tdesc is the struct's type descriptor.
Only when crossing the boundary from statically typed structs to
interfaces and vice versa, does the type descriptor come into play.
In those places, the compiler statically knows the value of the type
descriptor.
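
The "basic file interface" and the T1 <= T2 relation can be sketched
in present-day Go, where a struct implements an interface simply by
providing its methods. The types Buffer, Writer and memFile and the
function store are assumptions made for the example; they do not
appear in this draft beyond the names Buffer, Read, Write and Close.

```go
package main

import "fmt"

// Buffer is a stand-in for the Buffer type used in the interface above.
type Buffer []byte

// File mirrors the basic file interface.
type File interface {
	Read(b Buffer) bool
	Write(b Buffer) bool
	Close()
}

// Writer is "smaller than" File: its whole interface is part of File.
type Writer interface {
	Write(b Buffer) bool
}

// memFile is a hypothetical in-memory file. It never names File or
// Writer; providing the methods is what makes it implement them.
type memFile struct {
	data   []byte
	closed bool
}

func (f *memFile) Read(b Buffer) bool  { return copy(b, f.data) > 0 }
func (f *memFile) Write(b Buffer) bool { f.data = append(f.data, b...); return true }
func (f *memFile) Close()              { f.closed = true }

// store accepts any value whose type implements Writer.
func store(w Writer, s string) bool {
	return w.Write(Buffer(s))
}

func main() {
	m := &memFile{}
	var f File = m       // pointer to struct assigned to interface variable
	ok := store(f, "hi") // a File can be used where a Writer is required
	f.Close()
	fmt.Println(ok, string(m.data), m.closed)
}
```

Calls through f are dispatched via the interface value, while calls
made directly on m can be bound statically, matching the distinction
drawn in this section.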
Function types

FunctionType = 'func' Signature .
Signature = [ Receiver '.' ] Parameters [ Result ] .
Receiver = '(' identifier Type ')' .
Parameters = '(' [ ParameterList ] ')' .
ParameterList = ParameterSection { ',' ParameterSection } .
ParameterSection = [ IdentifierList ] Type .
Result = [ Type ] | '(' ParameterList ')' .

// Function types
func ()
func (a, b int, z float) bool
func (a, b int, z float) (success bool)
func (a, b int, z float) (success bool, result float)

// Method types
func (p *T) . ()
func (p *T) . (a, b int, z float) bool
func (p *T) . (a, b int, z float) (success bool)
func (p *T) . (a, b int, z float) (success bool, result float)
Map types

MapType = 'map' '(' Type '<-' Type ')' .

map(int <- string)

- incomplete
Struct types

Struct types are similar to C structs.

NEED TO DEFINE STRUCT EQUIVALENCE
Two struct types are the same if and only if they are declared by the
same struct type; i.e., struct types are compared via equivalence, and
*not* structurally. For that reason, struct types are usually given a
type name so that it is possible to refer to the same struct in
different places in a program. What about equivalence of structs
w/ respect to methods? What if methods can be added in another
package? TBD.

Each field of a struct represents a variable within the data
structure. In particular, a function field represents a function
variable, not a method.

StructType = 'struct' '{' { FieldDecl } '}' .
FieldDecl = IdentifierList Type ';' .

// An empty struct.
struct {}

// A struct with 5 fields.
struct {
  x, y int;
  u float;
  a []int;
  f func();
}

Note that a program which never uses interface types can be fully
statically typed. That is, the "usual" implementation of structs (or
classes as they are called in other languages) having an extra type
descriptor prepended in front of every single struct is not required.
Only when a pointer to a struct is assigned to an interface variable,
the type descriptor comes into play, and at that point it is
statically known at compile-time!
Package specifiers

Every source file is an element of a package, and defines which
package by the first element of every source file, which must be a
package specifier:

PackageSpecifier = 'package' PackageName .

package Math
Package import declarations

A program can access exported items from another package. It does so
by in effect declaring a local name providing access to the package,
and then using the local name as a namespace with which to address the
elements of the package.

ImportDecl = 'import' PackageName FileName .
FileName = DoubleQuotedString .
DoubleQuotedString = '"' TEXT '"' .

(DoubleQuotedString should be replaced by the correct string literal
production!)
Package import declarations must be the first statements in a file
after the package specifier.

A package import associates an identifier with a package, named by a
file. In effect, it is a declaration:

import Math "lib/Math";
import library "my/library";

After such an import, one can use the identifier (e.g. Math) to access
elements within the package:

x float = Math.sin(y);

Note that this process derives nothing explicit about the type of the
`imported' function (here Math.sin()). The import must execute to
provide this information to the compiler (or the programmer, for that
matter).
An angled-string refers to official stuff in a public place, in effect
the run-time library. A double-quoted-string refers to arbitrary
code; it is probably a local file name that needs to be discovered
using rules outside the scope of the language spec.
The file name in a package must be complete except for a suffix.
Moreover, the package name must correspond to the (basename of) the
source file name. For instance, the implementation of package Bar
must be in file Bar.go, and if it lives in directory foo we write
import Bar "foo/Bar";
to import it.
[This is a little redundant but if we allow multiple files per package
it will seem less so, and in any case the redundancy is useful and
protective.]
We assume Unix syntax for file names: / separators, no suffix for
directories. If the language is ported to other systems, the
environment must simulate these properties to avoid changing the
source code.
Declarations
- This needs to be expanded.
- We need to think about enums (or some alternative mechanism).
Declaration = (ConstDecl | VarDecl | TypeDecl | FunctionDecl |
ForwardDecl | AliasDecl) .
Const declarations
ConstDecl = 'const' ( ConstSpec | '(' ConstSpecList [ ';' ] ')' ).
ConstSpec = identifier [ Type ] '=' Expression .
ConstSpecList = ConstSpec { ';' ConstSpec }.
const pi float = 3.14159265
const e = 2.718281828
const (
one int = 1;
two = 2
)
Variable declarations
VarDecl = 'var' ( VarSpec | '(' VarSpecList [ ';' ] ')' ) | ShortVarDecl .
VarSpec = IdentifierList ( Type [ '=' ExpressionList ] | '=' ExpressionList ) .
VarSpecList = VarSpec { ';' VarSpec } .
ShortVarDecl = identifier ':=' Expression .
var i int
var u, v, w float
var k = 0
var x, y float = -1.0, -2.0
var (
i int;
u, v = 2.0, 3.0
)
If the expression list is present, it must have the same number of elements
as there are variables in the variable specification.
[ TODO: why is x := 0 not legal at the global level? ]
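
The counting rule above can be sketched in present-day Go syntax (which
differs from this draft: float64 instead of float, no semicolons inside
the parenthesized group).  The function sum is a hypothetical helper
that exists only so the declared values can be observed.

```go
package main

import "fmt"

// Package-level declarations: one initializer per declared variable.
var i int                     // no initializer: zero value
var x, y float64 = -1.0, -2.0 // two variables, two expressions
var (
	k    = 0        // type inferred from the expression
	u, v = 2.0, 3.0 // two variables, two expressions
)

// var a, b int = 1 // would not compile: two variables but only one value

// sum is a hypothetical helper that adds up everything declared above.
func sum() float64 {
	return x + y + u + v + float64(i+k)
}

func main() {
	fmt.Println(sum())
}
```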
Type declarations
  TypeDecl = 'type' ( TypeSpec | '(' TypeSpecList [ ';' ] ')' ).
  TypeSpec = identifier Type .
  TypeSpecList = TypeSpec { ';' TypeSpec }.

  type IntArray [16] int
  type (
    Point struct { x, y float };
    Polar Point
  )
Function and method declarations
  FunctionDecl = 'func' [ Receiver ] identifier Parameters [ Result ] ( ';' | Block ) .
  Block = '{' { Statement } '}' .

  func min(x int, y int) int {
    if x < y {
      return x;
    }
    return y;
  }

  func foo (a, b int, z float) bool {
    return a*b < int(z);
  }
A method is a function that also declares a receiver. The receiver is
a struct with which the function is associated. The receiver type
must denote a pointer to a struct.

  func (p *T) foo (a, b int, z float) bool {
    return a*b < int(z) + p.x;
  }

  func (p *Point) Length() float {
    return Math.sqrt(p.x * p.x + p.y * p.y);
  }

  func (p *Point) Scale(factor float) {
    p.x = p.x * factor;
    p.y = p.y * factor;
  }
The last two examples are methods of struct type Point. The variable p is
the receiver; within the body of the method it represents the value of
the receiving struct.
Note that methods are declared outside the body of the corresponding
struct.
Functions and methods can be forward declared by omitting the body:

  func foo (a, b int, z float) bool;
  func (p *T) foo (a, b int, z float) bool;
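
A runnable sketch of the Point methods in present-day Go syntax.  It
assumes float64 in place of the draft's float and the standard math
package in place of the draft's Math; otherwise it follows the examples
above.

```go
package main

import (
	"fmt"
	"math"
)

// Point mirrors the struct used in the examples above.
type Point struct{ x, y float64 }

// Length is declared outside the struct body, with a pointer receiver.
func (p *Point) Length() float64 {
	return math.Sqrt(p.x*p.x + p.y*p.y)
}

// Scale modifies the receiving struct through the pointer.
func (p *Point) Scale(factor float64) {
	p.x *= factor
	p.y *= factor
}

func main() {
	p := &Point{3, 4}
	fmt.Println(p.Length())
	p.Scale(2)
	fmt.Println(p.Length())
}
```

Because the receiver is a pointer, the change made by Scale is visible
to the caller after the method returns.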
Statements
Statement = EmptyStat | Assignment | CompoundStat | Declaration |
ExpressionStat | IncDecStat | IfStat | WhileStat | ReturnStat .
Empty statements
  EmptyStat = ';' .
Assignments
  Assignment = Designator '=' Expression .
- no automatic conversions
- values can be assigned to variables if they are of the same type, or
if they satisfy the interface type (much more precision needed here!)
Compound statements
  CompoundStat = '{' { Statement } '}' .
Expression statements
ExpressionStat = Expression .
IncDec statements
  IncDecStat = Expression ( '++' | '--' ) .
If statements
  IfStat = 'if' ( [ Expression ] '{' { IfCaseList } '}' ) |
           ( Expression '{' { Statement } '}' [ 'else' { Statement } ] ).
  IfCaseList = ( 'case' ExpressionList | 'default' ) ':' { Statement } .

  if x < y {
    return x;
  } else {
    return y;
  }

  if tag {
  case 0, 1: s1();
  case 2: s2();
  default: ;
  }

  if {
  case x < y: f1();
  case x < z: f2();
  }
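
In present-day Go the two case forms of 'if' shown above are written
with 'switch', with and without a tag expression.  The sketch below
uses that syntax; classify and firstTrue are hypothetical names, and the
string results stand in for the calls s1(), s2(), f1() and f2().

```go
package main

import "fmt"

// classify corresponds to the tagged form: if tag { case 0, 1: ... }.
func classify(tag int) string {
	switch tag {
	case 0, 1:
		return "s1"
	case 2:
		return "s2"
	default:
		return "none"
	}
}

// firstTrue corresponds to the tagless form: if { case x < y: ... }.
// A Go switch tries its cases in order and takes the first true one.
func firstTrue(x, y, z int) string {
	switch {
	case x < y:
		return "f1"
	case x < z:
		return "f2"
	}
	return "none"
}

func main() {
	fmt.Println(classify(1), classify(2), classify(7))
	fmt.Println(firstTrue(1, 2, 0), firstTrue(3, 2, 5), firstTrue(3, 2, 1))
}
```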
While statements
  WhileStat = 'while' ( [ Expression ] '{' { WhileCaseList } '}' ) |
              ( Expression '{' { Statement } '}' ).
  WhileCaseList = 'case' ExpressionList ':' { Statement } .

  while {
  case i < n: f1();
  case i < m: f2();
  }
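
This draft does not spell out the semantics of the case form of
'while'.  The sketch below ASSUMES a guarded-loop reading: on each
iteration the first case whose condition holds is executed, and the loop
ends once no case holds.  It emulates that reading in present-day Go
with 'for' and 'switch'; the function drain and its counters are
hypothetical.

```go
package main

import "fmt"

// drain emulates: while { case i < n: f1(); case i < m: f2(); }
// ASSUMPTION: the loop repeats, running the first case that holds,
// and stops once no case holds. The counters stand in for f1 and f2.
func drain(n, m int) (f1Calls, f2Calls int) {
	i := 0
loop:
	for {
		switch {
		case i < n:
			f1Calls++
			i++
		case i < m:
			f2Calls++
			i++
		default:
			break loop // no guard holds: leave the loop
		}
	}
	return f1Calls, f2Calls
}

func main() {
	fmt.Println(drain(2, 5))
}
```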
Return statements
  ReturnStat = 'return' [ ExpressionList ] .
There are two ways to return values from a function. The first is to
explicitly list the return value or values in the return statement:

  func simple_f () int {
    return 2;
  }

  func complex_f1() (re float, im float) {
    return -7.0, -4.0;
  }
The second is to provide names for the return values and assign them
explicitly in the function; the return statement will then provide no
values:

  func complex_f2() (re float, im float) {
    re = 7.0;
    im = 4.0;
    return;
  }
It is legal to name the return values in the declaration even if the
first form of return statement is used:

  func complex_f2() (re float, im float) {
    return 7.0, 4.0;
  }
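
The three variants can be placed side by side in present-day Go syntax.
The function names explicit, named and mixed are hypothetical, and
float64 stands in for the draft's float.

```go
package main

import "fmt"

// explicit lists the result values in the return statement.
func explicit() (float64, float64) {
	return -7.0, -4.0
}

// named assigns to the named results and uses a bare return.
func named() (re float64, im float64) {
	re = 7.0
	im = 4.0
	return
}

// mixed names the results but still returns explicit values.
func mixed() (re float64, im float64) {
	return 7.0, 4.0
}

func main() {
	fmt.Println(explicit())
	fmt.Println(named())
	fmt.Println(mixed())
}
```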
Expressions
  Expression = Conjunction { '||' Conjunction }.
  Conjunction = Comparison { '&&' Comparison }.
  Comparison = SimpleExpr [ relation SimpleExpr ].
  relation = '==' | '!=' | '<' | '<=' | '>' | '>='.
  SimpleExpr = Term { add_op Term }.
  add_op = '+' | '-' | '|' | '^'.
  Term = Factor { mul_op Factor }.
  mul_op = '*' | '/' | '%' | '<<' | '>>' | '&'.

The corresponding precedence hierarchy is as follows:

(5 levels of precedence is about the maximum people can keep
comfortably in their heads.  The experience with C and C++ shows that
more than that usually requires explicit manual consultation...).

[gri: I still think we should consider 0 levels of binary precedence:
All operators are on the same level, but parentheses are required when
different operators are mixed.  That would make it really easy, and
really clear.  It would also open the door for straightforward
introduction of user-defined operators, which would be rather useful.]

  Precedence    Operator
      1             ||
      2             &&
      3             ==  !=  <  <=  >  >=
      4             +  -  |  ^
      5             *  /  %  <<  >>  &
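
Present-day Go uses the same five levels for these operators, so the
table can be checked with a small program.  Each hypothetical function
below returns an unparenthesized expression together with the grouping
that the table implies; the two results agree.

```go
package main

import "fmt"

// Level 5 (*) binds tighter than level 4 (+).
func mulBeforeAdd(a, b, c int) (int, int) {
	return a + b*c, a + (b * c)
}

// Level 5 (<<) binds tighter than level 4 (+), unlike in C.
func shiftBeforeAdd(a int, s uint, c int) (int, int) {
	return a<<s + c, (a << s) + c
}

// Level 5 (&) binds tighter than level 4 (|).
func andBeforeOr(a, b, c int) (int, int) {
	return a | b&c, a | (b & c)
}

// Level 2 (&&) binds tighter than level 1 (||).
func logical(p, q, r bool) (bool, bool) {
	return p || q && r, p || (q && r)
}

func main() {
	fmt.Println(mulBeforeAdd(1, 2, 3))
	fmt.Println(shiftBeforeAdd(1, 2, 1))
	fmt.Println(andBeforeOr(4, 6, 3))
	fmt.Println(logical(true, false, false))
}
```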

For integer values, / and % satisfy the following relationship:

  (a / b) * b + a % b == a

and

  (a / b) is "truncated towards zero".
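
Present-day Go specifies the same relationship, so it can be checked
directly, including for negative operands where truncation towards zero
matters.  The helpers divMod and holds are hypothetical names.

```go
package main

import "fmt"

// divMod returns the quotient and remainder of a / b for b != 0.
func divMod(a, b int) (q, r int) {
	return a / b, a % b
}

// holds reports whether (a / b) * b + a % b == a.
func holds(a, b int) bool {
	q, r := divMod(a, b)
	return q*b+r == a
}

func main() {
	for _, pair := range [][2]int{{7, 2}, {-7, 2}, {7, -2}, {-7, -2}} {
		q, r := divMod(pair[0], pair[1])
		fmt.Println(pair[0], pair[1], q, r, holds(pair[0], pair[1]))
	}
}
```

With truncation towards zero, -7 / 2 is -3 (not -4), and the remainder
-7 % 2 is -1, taking the sign of the dividend.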

The shift operators implement arithmetic shifts for signed integers,
and logical shifts for unsigned integers.  TBD: is there any range
checking on s in x >> s, or x << s ?
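
The difference between the two kinds of right shift shows up on values
whose top bit is set.  A small sketch in present-day Go, which follows
the same rule; the function names are hypothetical and 8-bit types are
used only to keep the numbers small.  The open question about range
checking on s is not addressed here.

```go
package main

import "fmt"

// shiftSigned shifts a signed value right; the sign bit is replicated.
func shiftSigned(x int8, s uint) int8 { return x >> s }

// shiftUnsigned shifts an unsigned value right; zeros are shifted in.
func shiftUnsigned(x uint8, s uint) uint8 { return x >> s }

func main() {
	// -8 and 0xF8 have the same 8-bit pattern, 1111 1000.
	fmt.Println(shiftSigned(-8, 1), shiftUnsigned(0xF8, 1))
}
```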

[gri: We decided on a couple of issues here that we need to write down
more nicely]

- There are no implicit type conversions except for constants/literals.
  In particular, unsigned and signed integers cannot be mixed in an
  expression without explicit casting.

- Unary '^' corresponds to C '~' (bitwise negate).

- Arrays can be subscripted (a[i]) or sliced (a[i : j]).  A slice
  a[i : j] is a new array of length (j - i), consisting of the elements
  a[i], a[i + 1], ... a[j - 1].

  [gri/r: Is the slice array bounds check hard (leading to an error), or
  soft (truncating)?]

  Furthermore: Array slicing is very tricky!  Do we get a copy (a new
  array) or a new array descriptor?  This is open at this point.  There
  is a simple way out of the mess: Structured types are always passed by
  reference, and there is no value assignment for structured types.  It
  gets very complicated very quickly.
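
The three points above can be tried out in present-day Go.  The sketch
says nothing about what this draft will decide for the open questions;
it only shows where the released language ended up: a slice is a
descriptor that shares storage with the original, and out-of-range
slice bounds are a hard error.  The function names are hypothetical.

```go
package main

import "fmt"

// addMixed adds a signed and an unsigned value; the conversions must be
// written out, since s + u alone does not compile.
func addMixed(s int32, u uint32) int64 {
	return int64(s) + int64(u)
}

// complement applies unary ^, the counterpart of C's ~.
func complement(x uint8) uint8 { return ^x }

// middle returns a[i:j], which has j - i elements: a[i], ..., a[j-1].
func middle(a []int, i, j int) []int { return a[i:j] }

func main() {
	fmt.Println(addMixed(-1, 1))
	fmt.Println(complement(0x0F))

	a := []int{10, 11, 12, 13, 14}
	s := middle(a, 1, 4)
	fmt.Println(len(s), s)

	s[0] = 99 // in present-day Go the slice shares storage with a
	fmt.Println(a[1])
}
```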

[gri: Syntax below is incomplete - what about method invocation?]

  Factor = Literal | Designator | '!' Expression | '-' Expression |
           '^' Expression | '&' Expression | '(' Expression ')' | Call.
  Designator = QualifiedIdent { Selector }.
  Selector = '.' identifier | '[' Expression [ ':' Expression ] ']'.
  Call = Factor '(' ExpressionList ')'.

[gri: We need a precise definition of a constant expression]

Compilation units

The unit of compilation is a single file.  A compilation unit consists
of a package specifier followed by a list of import declarations
followed by a list of global declarations.

  CompilationUnit = { ImportDecl } { GlobalDeclaration }.
  GlobalDeclaration = Declaration.

Exports

Globally declared identifiers may be exported, thus making the
exported identifier visible outside the package.  Another package may
then import the identifier to use it.

Export directives must only appear at the global level of a
compilation unit (at least for now).  That is, one can export
compilation-unit global identifiers but not, for example, local
variables or structure fields.

Exporting an identifier makes the identifier visible externally to the
package.  If the identifier represents a type, the type structure is
exported as well.  The exported identifiers may appear later in the
source than the export directive itself, but it is an error to specify
an identifier not declared anywhere in the source file containing the
export directive.

  ExportDirective = 'export' ExportIdentifier { ',' ExportIdentifier } .
  ExportIdentifier = identifier .

  export sin, cos;

One may export variables and types, but (at least for now) not
aliases.  [r: what is needed to make aliases exportable? issue is
transitivity.]

Exporting a variable does not automatically export the type of the
variable.  For illustration, consider the program fragment:

  package P;
  export v1, v2, p;
  struct S { a int; b int; }
  var v1 S;
  var v2 S;
  var p *S;

Notice that S is not exported.  Another source file may contain:

  import P;
  alias v1 P.v1;
  alias v2 P.v2;
  alias p P.p;

This program can use v1, v2 and p but not access the fields (a and b)
of structure type S explicitly.  For instance, it could legally contain

  if p == nil { }
  if v1 == v2 { }

but not

  if v1.a == 0 { }
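
The two legal operations can be sketched in present-day Go, which marks
exported identifiers by capitalization instead of an export directive.
To stay runnable as a single file, the sketch keeps everything in one
package, so it only illustrates the whole-value comparisons; the ban on
field access would show up only across a package boundary.  All names
below are hypothetical stand-ins for those in the fragment above.

```go
package main

import "fmt"

// s stands in for the unexported struct type S of the fragment above.
type s struct{ a, b int }

// V1, V2 and P are capitalized, which is how present-day Go marks an
// identifier as exported; their type s is not exported.
var (
	V1 s
	V2 s
	P  *s
)

// sameValue compares two struct values without naming any field.
func sameValue(x, y s) bool { return x == y }

// isNil compares a pointer against nil without dereferencing it.
func isNil(p *s) bool { return p == nil }

func main() {
	fmt.Println(isNil(P), sameValue(V1, V2))
}
```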