Commit 37a09751 authored by Robert Griesemer's avatar Robert Griesemer

spec: be precise about rune/string literals and comments

See #10248 for details.

Fixes #10248.

Change-Id: I373545b2dca5d1da1c7149eb0a8f6c6dd8071a4c
Reviewed-on: https://go-review.googlesource.com/10503Reviewed-by: default avatarRuss Cox <rsc@golang.org>
Reviewed-by: default avatarRob Pike <r@golang.org>
parent b1177d39
<!--{
"Title": "The Go Programming Language Specification",
"Subtitle": "Version of June 23, 2015",
"Subtitle": "Version of July 23, 2015",
"Path": "/ref/spec"
}-->
......@@ -129,27 +129,27 @@ hex_digit = "0" … "9" | "A" … "F" | "a" … "f" .
<h3 id="Comments">Comments</h3>
<p>
There are two forms of comments:
Comments serve as program documentation. There are two forms:
</p>
<ol>
<li>
<i>Line comments</i> start with the character sequence <code>//</code>
and stop at the end of the line. A line comment acts like a newline.
and stop at the end of the line.
</li>
<li>
<i>General comments</i> start with the character sequence <code>/*</code>
and continue through the character sequence <code>*/</code>. A general
comment containing one or more newlines acts like a newline, otherwise it acts
like a space.
and stop with the first subsequent character sequence <code>*/</code>.
</li>
</ol>
<p>
Comments do not nest.
A comment cannot start inside a <a href="#Rune_literals">rune</a> or
<a href="#String_literals">string literal</a>, or inside a comment.
A general comment containing no newlines acts like a space.
Any other comment acts like a newline.
</p>
<h3 id="Tokens">Tokens</h3>
<p>
......@@ -176,11 +176,8 @@ using the following two rules:
<ol>
<li>
<p>
When the input is broken into tokens, a semicolon is automatically inserted
into the token stream at the end of a non-blank line if the line's final
token is
</p>
into the token stream immediately after a line's final token if that token is
<ul>
<li>an
<a href="#Identifiers">identifier</a>
......@@ -357,9 +354,10 @@ imaginary_lit = (decimals | float_lit) "i" .
<p>
A rune literal represents a <a href="#Constants">rune constant</a>,
an integer value identifying a Unicode code point.
A rune literal is expressed as one or more characters enclosed in single quotes.
Within the quotes, any character may appear except single
quote and newline. A single quoted character represents the Unicode value
A rune literal is expressed as one or more characters enclosed in single quotes,
as in <code>'x'</code> or <code>'\n'</code>.
Within the quotes, any character may appear except newline and unescaped single
quote. A single quoted character represents the Unicode value
of the character itself,
while multi-character sequences beginning with a backslash encode
values in various formats.
......@@ -433,6 +431,7 @@ escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `
'\xff'
'\u12e4'
'\U00101234'
'\'' // rune literal containing single quote character
'aa' // illegal: too many characters
'\xa' // illegal: too few hexadecimal digits
'\0' // illegal: too few octal digits
......@@ -449,8 +448,8 @@ obtained from concatenating a sequence of characters. There are two forms:
raw string literals and interpreted string literals.
</p>
<p>
Raw string literals are character sequences between back quotes
<code>``</code>. Within the quotes, any character is legal except
Raw string literals are character sequences between back quotes, as in
<code>`foo`</code>. Within the quotes, any character may appear except
back quote. The value of a raw string literal is the
string composed of the uninterpreted (implicitly UTF-8-encoded) characters
between the quotes;
......@@ -461,8 +460,9 @@ are discarded from the raw string value.
</p>
<p>
Interpreted string literals are character sequences between double
quotes <code>&quot;&quot;</code>. The text between the quotes,
which may not contain newlines, forms the
quotes, as in <code>&quot;bar&quot;</code>.
Within the quotes, any character may appear except newline and unescaped double quote.
The text between the quotes forms the
value of the literal, with backslash escapes interpreted as they
are in <a href="#Rune_literals">rune literals</a> (except that <code>\'</code> is illegal and
<code>\"</code> is legal), with the same restrictions.
......@@ -484,17 +484,17 @@ interpreted_string_lit = `"` { unicode_value | byte_value } `"` .
</pre>
<pre>
`abc` // same as "abc"
`abc` // same as "abc"
`\n
\n` // same as "\\n\n\\n"
\n` // same as "\\n\n\\n"
"\n"
""
"\"" // same as `"`
"Hello, world!\n"
"日本語"
"\u65e5本\U00008a9e"
"\xff\u00FF"
"\uD800" // illegal: surrogate half
"\U00110000" // illegal: invalid Unicode code point
"\uD800" // illegal: surrogate half
"\U00110000" // illegal: invalid Unicode code point
</pre>
<p>
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment