Commit 37a09751 authored by Robert Griesemer

spec: be precise about rune/string literals and comments

See #10248 for details.

Fixes #10248.

Change-Id: I373545b2dca5d1da1c7149eb0a8f6c6dd8071a4c
Reviewed-on: https://go-review.googlesource.com/10503
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
parent b1177d39
<!--{ <!--{
"Title": "The Go Programming Language Specification", "Title": "The Go Programming Language Specification",
"Subtitle": "Version of June 23, 2015", "Subtitle": "Version of July 23, 2015",
"Path": "/ref/spec" "Path": "/ref/spec"
}--> }-->
...@@ -129,27 +129,27 @@ hex_digit = "0" … "9" | "A" … "F" | "a" … "f" . ...@@ -129,27 +129,27 @@ hex_digit = "0" … "9" | "A" … "F" | "a" … "f" .
<h3 id="Comments">Comments</h3> <h3 id="Comments">Comments</h3>
<p> <p>
There are two forms of comments: Comments serve as program documentation. There are two forms:
</p> </p>
<ol> <ol>
<li> <li>
<i>Line comments</i> start with the character sequence <code>//</code> <i>Line comments</i> start with the character sequence <code>//</code>
and stop at the end of the line. A line comment acts like a newline. and stop at the end of the line.
</li> </li>
<li> <li>
<i>General comments</i> start with the character sequence <code>/*</code> <i>General comments</i> start with the character sequence <code>/*</code>
and continue through the character sequence <code>*/</code>. A general and stop with the first subsequent character sequence <code>*/</code>.
comment containing one or more newlines acts like a newline, otherwise it acts
like a space.
</li> </li>
</ol> </ol>
<p> <p>
Comments do not nest. A comment cannot start inside a <a href="#Rune_literals">rune</a> or
<a href="#String_literals">string literal</a>, or inside a comment.
A general comment containing no newlines acts like a space.
Any other comment acts like a newline.
</p> </p>
<h3 id="Tokens">Tokens</h3> <h3 id="Tokens">Tokens</h3>
<p> <p>
...@@ -176,11 +176,8 @@ using the following two rules: ...@@ -176,11 +176,8 @@ using the following two rules:
<ol> <ol>
<li> <li>
<p>
When the input is broken into tokens, a semicolon is automatically inserted When the input is broken into tokens, a semicolon is automatically inserted
into the token stream at the end of a non-blank line if the line's final into the token stream immediately after a line's final token if that token is
token is
</p>
<ul> <ul>
<li>an <li>an
<a href="#Identifiers">identifier</a> <a href="#Identifiers">identifier</a>
...@@ -357,9 +354,10 @@ imaginary_lit = (decimals | float_lit) "i" . ...@@ -357,9 +354,10 @@ imaginary_lit = (decimals | float_lit) "i" .
<p> <p>
A rune literal represents a <a href="#Constants">rune constant</a>, A rune literal represents a <a href="#Constants">rune constant</a>,
an integer value identifying a Unicode code point. an integer value identifying a Unicode code point.
A rune literal is expressed as one or more characters enclosed in single quotes. A rune literal is expressed as one or more characters enclosed in single quotes,
Within the quotes, any character may appear except single as in <code>'x'</code> or <code>'\n'</code>.
quote and newline. A single quoted character represents the Unicode value Within the quotes, any character may appear except newline and unescaped single
quote. A single quoted character represents the Unicode value
of the character itself, of the character itself,
while multi-character sequences beginning with a backslash encode while multi-character sequences beginning with a backslash encode
values in various formats. values in various formats.
...@@ -433,6 +431,7 @@ escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | ` ...@@ -433,6 +431,7 @@ escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `
'\xff' '\xff'
'\u12e4' '\u12e4'
'\U00101234' '\U00101234'
'\'' // rune literal containing single quote character
'aa' // illegal: too many characters 'aa' // illegal: too many characters
'\xa' // illegal: too few hexadecimal digits '\xa' // illegal: too few hexadecimal digits
'\0' // illegal: too few octal digits '\0' // illegal: too few octal digits
...@@ -449,8 +448,8 @@ obtained from concatenating a sequence of characters. There are two forms: ...@@ -449,8 +448,8 @@ obtained from concatenating a sequence of characters. There are two forms:
raw string literals and interpreted string literals. raw string literals and interpreted string literals.
</p> </p>
<p> <p>
Raw string literals are character sequences between back quotes Raw string literals are character sequences between back quotes, as in
<code>``</code>. Within the quotes, any character is legal except <code>`foo`</code>. Within the quotes, any character may appear except
back quote. The value of a raw string literal is the back quote. The value of a raw string literal is the
string composed of the uninterpreted (implicitly UTF-8-encoded) characters string composed of the uninterpreted (implicitly UTF-8-encoded) characters
between the quotes; between the quotes;
...@@ -461,8 +460,9 @@ are discarded from the raw string value. ...@@ -461,8 +460,9 @@ are discarded from the raw string value.
</p> </p>
<p> <p>
Interpreted string literals are character sequences between double Interpreted string literals are character sequences between double
quotes <code>&quot;&quot;</code>. The text between the quotes, quotes, as in <code>&quot;bar&quot;</code>.
which may not contain newlines, forms the Within the quotes, any character may appear except newline and unescaped double quote.
The text between the quotes forms the
value of the literal, with backslash escapes interpreted as they value of the literal, with backslash escapes interpreted as they
are in <a href="#Rune_literals">rune literals</a> (except that <code>\'</code> is illegal and are in <a href="#Rune_literals">rune literals</a> (except that <code>\'</code> is illegal and
<code>\"</code> is legal), with the same restrictions. <code>\"</code> is legal), with the same restrictions.
...@@ -488,7 +488,7 @@ interpreted_string_lit = `"` { unicode_value | byte_value } `"` . ...@@ -488,7 +488,7 @@ interpreted_string_lit = `"` { unicode_value | byte_value } `"` .
`\n `\n
\n` // same as "\\n\n\\n" \n` // same as "\\n\n\\n"
"\n" "\n"
"" "\"" // same as `"`
"Hello, world!\n" "Hello, world!\n"
"日本語" "日本語"
"\u65e5本\U00008a9e" "\u65e5本\U00008a9e"
......
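The rules this change spells out can be checked with a small Go program. The following is a minimal sketch, not part of the commit itself: the helper name `literalChecks` and the labels in its map are hypothetical and chosen only for illustration. It relies solely on behavior stated in the revised spec text above: `'\''` is a legal rune literal, `"\""` is the same string as a raw back-quoted `"`, a comment cannot start inside a string literal, and a general comment containing no newlines acts like a space.

```go
package main

import "fmt"

// literalChecks (hypothetical helper) reports, by label, whether each
// equivalence described in the revised spec wording holds.
func literalChecks() map[string]bool {
	// A general comment with no newline acts like a space, so this is 1 + 2.
	sum := 1 /* general comment on one line */ + 2

	return map[string]bool{
		// An escaped single quote is allowed inside a rune literal (U+0027 is 39).
		"escaped single quote rune": '\'' == 39,
		// An escaped double quote is allowed inside an interpreted string.
		"escaped double quote string": "\"" == `"`,
		// Raw strings are uninterpreted: backslash and n stay two characters.
		"raw string keeps backslash": `\n` == "\\n",
		// "//" inside a string literal does not start a comment.
		"no comment inside a string": len("// text") == 7,
		"comment acts like a space":  sum == 3,
	}
}

func main() {
	for label, ok := range literalChecks() {
		fmt.Printf("%-28s %v\n", label, ok)
	}
}
```

Every check should report true under a conforming compiler. The labels print in no fixed order because Go map iteration order is unspecified.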