- 11 May, 2015 18 commits
-
-
Russ Cox authored
We want typedmemmove to use the heap bitmap to determine where pointers are, instead of reinterpreting the type information. The heap bitmap is simpler to access. In general, typedmemmove will need to be able to look up the bits for any word and find valid pointer information, so fill even after the dead marker. Not filling after the dead marker was an optimization I introduced only a few days ago, when reintroducing the dead marker code. At the time I said it probably wouldn't last, and it didn't. Change-Id: I6ba01bff17ddee1ff429f454abe29867ec60606e Reviewed-on: https://go-review.googlesource.com/9885Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
The runtime deals with 1-bit pointer bitmaps and 2-bit heap bitmaps that have entries for both pointers and mark bits. Each byte in a 1-bit pointer bitmap looks like pppppppp (all pointer bits). Each byte in a 2-bit heap bitmap looks like mpmpmpmp (mark, pointer, ...). This means that when converting from 1-bit to 2-bit, as we do during malloc, we have to pick up 4 bits in pppp form and use shifts to create the mpmpmpmp form. This CL changes the 2-bit heap bitmap form to mmmmpppp, so that 4 bits picked up in 1-bit form can be used directly in the low bits of the heap bitmap byte, without expansion. This simplifies the code, and it also happens to be faster. name old mean new mean delta SetTypePtr 14.0ns × (0.98,1.09) 14.0ns × (0.98,1.08) ~ (p=0.966) SetTypePtr8 16.5ns × (0.99,1.05) 15.3ns × (0.96,1.16) -6.86% (p=0.012) SetTypePtr16 21.3ns × (0.98,1.05) 18.8ns × (0.94,1.14) -11.49% (p=0.000) SetTypePtr32 34.6ns × (0.93,1.22) 27.7ns × (0.91,1.26) -20.08% (p=0.001) SetTypePtr64 55.7ns × (0.97,1.11) 41.6ns × (0.98,1.04) -25.30% (p=0.000) SetTypePtr126 98.0ns × (1.00,1.00) 67.7ns × (0.99,1.05) -30.88% (p=0.000) SetTypePtr128 98.6ns × (1.00,1.01) 68.6ns × (0.99,1.03) -30.44% (p=0.000) SetTypePtrSlice 781ns × (0.99,1.01) 571ns × (0.99,1.04) -26.93% (p=0.000) SetTypeNode1 13.1ns × (0.99,1.01) 12.1ns × (0.99,1.01) -7.45% (p=0.000) SetTypeNode1Slice 113ns × (0.99,1.01) 94ns × (1.00,1.00) -16.35% (p=0.000) SetTypeNode8 32.7ns × (1.00,1.00) 29.8ns × (0.99,1.01) -8.97% (p=0.000) SetTypeNode8Slice 266ns × (1.00,1.00) 204ns × (1.00,1.00) -23.40% (p=0.000) SetTypeNode64 58.0ns × (0.98,1.08) 42.8ns × (1.00,1.01) -26.24% (p=0.000) SetTypeNode64Slice 1.55µs × (0.99,1.02) 0.96µs × (1.00,1.00) -37.84% (p=0.000) SetTypeNode64Dead 13.1ns × (0.99,1.01) 12.1ns × (1.00,1.00) -7.33% (p=0.000) SetTypeNode64DeadSlice 1.52µs × (1.00,1.01) 1.08µs × (1.00,1.01) -28.95% (p=0.000) SetTypeNode124 97.9ns × (1.00,1.00) 67.1ns × (1.00,1.01) -31.49% (p=0.000) SetTypeNode124Slice 2.87µs × (0.99,1.02) 1.75µs × (1.00,1.01) -39.15% (p=0.000) SetTypeNode126 98.4ns × (1.00,1.01) 68.1ns × (1.00,1.01) -30.79% (p=0.000) SetTypeNode126Slice 2.91µs × (0.99,1.01) 1.77µs × (0.99,1.01) -39.09% (p=0.000) SetTypeNode1024 732ns × (1.00,1.00) 511ns × (0.87,1.42) -30.14% (p=0.000) SetTypeNode1024Slice 23.1µs × (1.00,1.00) 13.9µs × (0.99,1.02) -39.83% (p=0.000) Change-Id: I12e3b850a4e6fa6c8146b8635ff728f3ef658819 Reviewed-on: https://go-review.googlesource.com/9828Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
Moving them up makes them properly aligned on 32-bit systems. There are some odd fields above them right now (like fixalloc and mutex maybe). Change-Id: I57851a5bbb2e7cc339712f004f99bb6c0cce0ca5 Reviewed-on: https://go-review.googlesource.com/9889Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
Most of the calls to panicindex are already marked as not returning, but these two were missed at some point. Performance changes below. name old mean new mean delta BinaryTree17 5.70s × (0.98,1.04) 5.68s × (0.97,1.04) ~ (p=0.681) Fannkuch11 4.32s × (1.00,1.00) 4.41s × (0.98,1.03) +1.98% (p=0.018) FmtFprintfEmpty 92.6ns × (0.91,1.11) 92.7ns × (0.91,1.16) ~ (p=0.969) FmtFprintfString 280ns × (0.97,1.05) 281ns × (0.96,1.08) ~ (p=0.860) FmtFprintfInt 284ns × (0.99,1.02) 288ns × (0.97,1.06) ~ (p=0.207) FmtFprintfIntInt 488ns × (0.98,1.01) 493ns × (0.97,1.04) ~ (p=0.271) FmtFprintfPrefixedInt 418ns × (0.98,1.04) 423ns × (0.97,1.04) ~ (p=0.311) FmtFprintfFloat 597ns × (1.00,1.00) 598ns × (0.99,1.01) ~ (p=0.789) FmtManyArgs 1.87µs × (0.99,1.01) 1.89µs × (0.98,1.05) ~ (p=0.158) GobDecode 14.6ms × (0.99,1.01) 14.8ms × (0.98,1.03) +1.51% (p=0.015) GobEncode 12.3ms × (0.98,1.03) 12.3ms × (0.98,1.01) ~ (p=0.474) Gzip 647ms × (1.00,1.01) 656ms × (0.99,1.05) ~ (p=0.104) Gunzip 142ms × (1.00,1.00) 142ms × (1.00,1.00) ~ (p=0.110) HTTPClientServer 89.6µs × (0.99,1.03) 91.2µs × (0.97,1.04) ~ (p=0.061) JSONEncode 31.7ms × (0.99,1.01) 32.6ms × (0.97,1.08) +2.87% (p=0.038) JSONDecode 111ms × (1.00,1.01) 114ms × (0.97,1.05) +2.47% (p=0.040) Mandelbrot200 6.01ms × (1.00,1.00) 6.11ms × (0.98,1.04) ~ (p=0.073) GoParse 6.54ms × (0.99,1.02) 6.66ms × (0.97,1.04) ~ (p=0.064) RegexpMatchEasy0_32 159ns × (0.99,1.02) 159ns × (0.99,1.00) ~ (p=0.693) RegexpMatchEasy0_1K 540ns × (0.99,1.03) 538ns × (1.00,1.01) ~ (p=0.360) RegexpMatchEasy1_32 137ns × (0.99,1.01) 138ns × (1.00,1.00) ~ (p=0.511) RegexpMatchEasy1_1K 867ns × (1.00,1.01) 869ns × (0.99,1.01) ~ (p=0.193) RegexpMatchMedium_32 252ns × (1.00,1.00) 252ns × (0.99,1.01) ~ (p=0.076) RegexpMatchMedium_1K 72.7µs × (1.00,1.00) 72.7µs × (1.00,1.00) ~ (p=0.963) RegexpMatchHard_32 3.84µs × (1.00,1.00) 3.85µs × (1.00,1.00) ~ (p=0.371) RegexpMatchHard_1K 117µs × (1.00,1.01) 118µs × (1.00,1.00) ~ (p=0.898) Revcomp 909ms × (0.98,1.03) 920ms × (0.97,1.07) ~ (p=0.368) Template 128ms × (0.99,1.01) 129ms × (0.98,1.03) +1.41% (p=0.042) TimeParse 619ns × (0.98,1.01) 619ns × (0.99,1.01) ~ (p=0.730) TimeFormat 651ns × (1.00,1.01) 661ns × (0.98,1.04) ~ (p=0.097) Change-Id: I0ec5baff41f5d282307137ce0d927e6301e4fa10 Reviewed-on: https://go-review.googlesource.com/9811Reviewed-by: David Chase <drchase@google.com>
-
Russ Cox authored
Dead code. This field is left over from Go 1.4, when we elided the fake write barrier in this case. Today, it's unused (always false). The upcoming append/slice changes handle this case again, but without needing this field. Change-Id: Ic6f160b64efdc1bbed02097ee03050f8cd0ab1b8 Reviewed-on: https://go-review.googlesource.com/9789Reviewed-by: David Chase <drchase@google.com>
-
Russ Cox authored
If you are using -h to get a stack trace at the site of the failure, Yyerror will never return. Dump the register allocation sites before calling Yyerror. Change-Id: I51266c03e06cb5084c2eaa89b367b9ed85ba286a Reviewed-on: https://go-review.googlesource.com/9788Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Dave Cheney <dave@cheney.net>
-
Russ Cox authored
The new(uint64) was moving to the stack, which may not be aligned. Change-Id: Iad070964202001b52029494d43e299fed980f939 Reviewed-on: https://go-review.googlesource.com/9787Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: David Chase <drchase@google.com>
-
Russ Cox authored
The -g mode is a debugging mode that prints instructions as they are constructed. Gbranch was just missing the print. Change-Id: I3fb45fd9bd3996ed96df5be903b9fd6bd97148b0 Reviewed-on: https://go-review.googlesource.com/9827Reviewed-by: Rick Hudson <rlh@golang.org>
-
Russ Cox authored
The write barrier shadow heap was very useful for developing the write barriers initially, but it's no longer used, clunky, and dragging the rest of the implementation down. The gccheckmark mode will find bugs due to missed barriers when they result in missed marks; wbshadow mode found the missed barriers more aggressively, but it required an entire separate copy of the heap. The gccheckmark mode requires no extra memory, making it more useful in practice. Compared to previous CL: name old mean new mean delta BinaryTree17 5.91s × (0.96,1.06) 5.72s × (0.97,1.03) -3.12% (p=0.000) Fannkuch11 4.32s × (1.00,1.00) 4.36s × (1.00,1.00) +0.91% (p=0.000) FmtFprintfEmpty 89.0ns × (0.93,1.10) 86.6ns × (0.96,1.11) ~ (p=0.077) FmtFprintfString 298ns × (0.98,1.06) 283ns × (0.99,1.04) -4.90% (p=0.000) FmtFprintfInt 286ns × (0.98,1.03) 283ns × (0.98,1.04) -1.09% (p=0.032) FmtFprintfIntInt 498ns × (0.97,1.06) 480ns × (0.99,1.02) -3.65% (p=0.000) FmtFprintfPrefixedInt 408ns × (0.98,1.02) 396ns × (0.99,1.01) -3.00% (p=0.000) FmtFprintfFloat 587ns × (0.98,1.01) 562ns × (0.99,1.01) -4.34% (p=0.000) FmtManyArgs 1.94µs × (0.99,1.02) 1.89µs × (0.99,1.01) -2.85% (p=0.000) GobDecode 15.8ms × (0.98,1.03) 15.7ms × (0.99,1.02) ~ (p=0.251) GobEncode 12.0ms × (0.96,1.09) 11.8ms × (0.98,1.03) -1.87% (p=0.024) Gzip 648ms × (0.99,1.01) 647ms × (0.99,1.01) ~ (p=0.688) Gunzip 143ms × (1.00,1.01) 143ms × (1.00,1.01) ~ (p=0.203) HTTPClientServer 90.3µs × (0.98,1.01) 89.1µs × (0.99,1.02) -1.30% (p=0.000) JSONEncode 31.6ms × (0.99,1.01) 31.7ms × (0.98,1.02) ~ (p=0.219) JSONDecode 107ms × (1.00,1.01) 111ms × (0.99,1.01) +3.58% (p=0.000) Mandelbrot200 6.03ms × (1.00,1.01) 6.01ms × (1.00,1.00) ~ (p=0.077) GoParse 6.53ms × (0.99,1.03) 6.54ms × (0.99,1.02) ~ (p=0.585) RegexpMatchEasy0_32 161ns × (1.00,1.01) 161ns × (0.98,1.05) ~ (p=0.948) RegexpMatchEasy0_1K 541ns × (0.99,1.01) 559ns × (0.98,1.01) +3.32% (p=0.000) RegexpMatchEasy1_32 138ns × (1.00,1.00) 137ns × (0.99,1.01) -0.55% (p=0.001) RegexpMatchEasy1_1K 887ns × (0.99,1.01) 878ns × (0.99,1.01) -0.98% (p=0.000) RegexpMatchMedium_32 253ns × (0.99,1.01) 252ns × (0.99,1.01) -0.39% (p=0.001) RegexpMatchMedium_1K 72.8µs × (1.00,1.00) 72.7µs × (1.00,1.00) ~ (p=0.485) RegexpMatchHard_32 3.85µs × (1.00,1.01) 3.85µs × (1.00,1.01) ~ (p=0.283) RegexpMatchHard_1K 117µs × (1.00,1.01) 117µs × (1.00,1.00) ~ (p=0.175) Revcomp 922ms × (0.97,1.08) 903ms × (0.98,1.05) -2.15% (p=0.021) Template 126ms × (0.99,1.01) 126ms × (0.99,1.01) ~ (p=0.943) TimeParse 628ns × (0.99,1.01) 634ns × (0.99,1.01) +0.92% (p=0.000) TimeFormat 668ns × (0.99,1.01) 698ns × (0.98,1.03) +4.53% (p=0.000) It's nice that the microbenchmarks are the ones helped the most, because those were the ones hurt the most by the conversion from 4-bit to 2-bit heap bitmaps. This CL brings the overall effect of that process to (compared to CL 9706 patch set 1): name old mean new mean delta BinaryTree17 5.87s × (0.94,1.09) 5.72s × (0.97,1.03) -2.57% (p=0.011) Fannkuch11 4.32s × (1.00,1.00) 4.36s × (1.00,1.00) +0.87% (p=0.000) FmtFprintfEmpty 89.1ns × (0.95,1.16) 86.6ns × (0.96,1.11) ~ (p=0.090) FmtFprintfString 283ns × (0.98,1.02) 283ns × (0.99,1.04) ~ (p=0.681) FmtFprintfInt 284ns × (0.98,1.04) 283ns × (0.98,1.04) ~ (p=0.620) FmtFprintfIntInt 486ns × (0.98,1.03) 480ns × (0.99,1.02) -1.27% (p=0.002) FmtFprintfPrefixedInt 400ns × (0.99,1.02) 396ns × (0.99,1.01) -0.84% (p=0.001) FmtFprintfFloat 566ns × (0.99,1.01) 562ns × (0.99,1.01) -0.80% (p=0.000) FmtManyArgs 1.91µs × (0.99,1.02) 1.89µs × (0.99,1.01) -1.10% (p=0.000) GobDecode 15.5ms × (0.98,1.05) 15.7ms × (0.99,1.02) +1.55% (p=0.005) GobEncode 11.9ms × (0.97,1.03) 11.8ms × (0.98,1.03) -0.97% (p=0.048) Gzip 648ms × (0.99,1.01) 647ms × (0.99,1.01) ~ (p=0.627) Gunzip 143ms × (1.00,1.00) 143ms × (1.00,1.01) ~ (p=0.482) HTTPClientServer 89.2µs × (0.99,1.02) 89.1µs × (0.99,1.02) ~ (p=0.740) JSONEncode 32.3ms × (0.97,1.06) 31.7ms × (0.98,1.02) -1.95% (p=0.002) JSONDecode 106ms × (0.99,1.01) 111ms × (0.99,1.01) +4.22% (p=0.000) Mandelbrot200 6.02ms × (1.00,1.00) 6.01ms × (1.00,1.00) ~ (p=0.417) GoParse 6.57ms × (0.97,1.06) 6.54ms × (0.99,1.02) ~ (p=0.404) RegexpMatchEasy0_32 162ns × (1.00,1.00) 161ns × (0.98,1.05) ~ (p=0.088) RegexpMatchEasy0_1K 561ns × (0.99,1.02) 559ns × (0.98,1.01) -0.47% (p=0.034) RegexpMatchEasy1_32 145ns × (0.95,1.04) 137ns × (0.99,1.01) -5.56% (p=0.000) RegexpMatchEasy1_1K 864ns × (0.99,1.04) 878ns × (0.99,1.01) +1.57% (p=0.000) RegexpMatchMedium_32 255ns × (0.99,1.04) 252ns × (0.99,1.01) -1.43% (p=0.001) RegexpMatchMedium_1K 73.9µs × (0.98,1.04) 72.7µs × (1.00,1.00) -1.55% (p=0.004) RegexpMatchHard_32 3.92µs × (0.98,1.04) 3.85µs × (1.00,1.01) -1.80% (p=0.003) RegexpMatchHard_1K 120µs × (0.98,1.04) 117µs × (1.00,1.00) -2.13% (p=0.001) Revcomp 936ms × (0.95,1.08) 903ms × (0.98,1.05) -3.58% (p=0.002) Template 130ms × (0.98,1.04) 126ms × (0.99,1.01) -2.98% (p=0.000) TimeParse 638ns × (0.98,1.05) 634ns × (0.99,1.01) ~ (p=0.198) TimeFormat 674ns × (0.99,1.01) 698ns × (0.98,1.03) +3.69% (p=0.000) Change-Id: Ia0e9b50b1d75a3c0c7556184cd966305574fe07c Reviewed-on: https://go-review.googlesource.com/9706Reviewed-by: Rick Hudson <rlh@golang.org>
-
Russ Cox authored
Reintroduce an optimization discarded during the initial conversion from 4-bit heap bitmaps to 2-bit heap bitmaps: when we reach the place in the bitmap where there are no more pointers, mark that position for the GC so that it can avoid scanning past that place. During heapBitsSetType we can also avoid initializing heap bitmap beyond that location, which gives a bit of a win compared to Go 1.4. This particular optimization (not initializing the heap bitmap) may not last: we might change typedmemmove to use the heap bitmap, in which case it would all need to be initialized. The early stop in the GC scan will stay no matter what. Compared to Go 1.4 (github.com/rsc/go, branch go14bench): name old mean new mean delta SetTypeNode64 80.7ns × (1.00,1.01) 57.4ns × (1.00,1.01) -28.83% (p=0.000) SetTypeNode64Dead 80.5ns × (1.00,1.01) 13.1ns × (0.99,1.02) -83.77% (p=0.000) SetTypeNode64Slice 2.16µs × (1.00,1.01) 1.54µs × (1.00,1.01) -28.75% (p=0.000) SetTypeNode64DeadSlice 2.16µs × (1.00,1.01) 1.52µs × (1.00,1.00) -29.74% (p=0.000) Compared to previous CL: name old mean new mean delta SetTypeNode64 56.7ns × (1.00,1.00) 57.4ns × (1.00,1.01) +1.19% (p=0.000) SetTypeNode64Dead 57.2ns × (1.00,1.00) 13.1ns × (0.99,1.02) -77.15% (p=0.000) SetTypeNode64Slice 1.56µs × (1.00,1.01) 1.54µs × (1.00,1.01) -0.89% (p=0.000) SetTypeNode64DeadSlice 1.55µs × (1.00,1.01) 1.52µs × (1.00,1.00) -2.23% (p=0.000) This is the last CL in the sequence converting from the 4-bit heap to the 2-bit heap, with all the same optimizations reenabled. Compared to before that process began (compared to CL 9701 patch set 1): name old mean new mean delta BinaryTree17 5.87s × (0.94,1.09) 5.91s × (0.96,1.06) ~ (p=0.578) Fannkuch11 4.32s × (1.00,1.00) 4.32s × (1.00,1.00) ~ (p=0.474) FmtFprintfEmpty 89.1ns × (0.95,1.16) 89.0ns × (0.93,1.10) ~ (p=0.942) FmtFprintfString 283ns × (0.98,1.02) 298ns × (0.98,1.06) +5.33% (p=0.000) FmtFprintfInt 284ns × (0.98,1.04) 286ns × (0.98,1.03) ~ (p=0.208) FmtFprintfIntInt 486ns × (0.98,1.03) 498ns × (0.97,1.06) +2.48% (p=0.000) FmtFprintfPrefixedInt 400ns × (0.99,1.02) 408ns × (0.98,1.02) +2.23% (p=0.000) FmtFprintfFloat 566ns × (0.99,1.01) 587ns × (0.98,1.01) +3.69% (p=0.000) FmtManyArgs 1.91µs × (0.99,1.02) 1.94µs × (0.99,1.02) +1.81% (p=0.000) GobDecode 15.5ms × (0.98,1.05) 15.8ms × (0.98,1.03) +1.94% (p=0.002) GobEncode 11.9ms × (0.97,1.03) 12.0ms × (0.96,1.09) ~ (p=0.263) Gzip 648ms × (0.99,1.01) 648ms × (0.99,1.01) ~ (p=0.992) Gunzip 143ms × (1.00,1.00) 143ms × (1.00,1.01) ~ (p=0.585) HTTPClientServer 89.2µs × (0.99,1.02) 90.3µs × (0.98,1.01) +1.24% (p=0.000) JSONEncode 32.3ms × (0.97,1.06) 31.6ms × (0.99,1.01) -2.29% (p=0.000) JSONDecode 106ms × (0.99,1.01) 107ms × (1.00,1.01) +0.62% (p=0.000) Mandelbrot200 6.02ms × (1.00,1.00) 6.03ms × (1.00,1.01) ~ (p=0.250) GoParse 6.57ms × (0.97,1.06) 6.53ms × (0.99,1.03) ~ (p=0.243) RegexpMatchEasy0_32 162ns × (1.00,1.00) 161ns × (1.00,1.01) -0.80% (p=0.000) RegexpMatchEasy0_1K 561ns × (0.99,1.02) 541ns × (0.99,1.01) -3.67% (p=0.000) RegexpMatchEasy1_32 145ns × (0.95,1.04) 138ns × (1.00,1.00) -5.04% (p=0.000) RegexpMatchEasy1_1K 864ns × (0.99,1.04) 887ns × (0.99,1.01) +2.57% (p=0.000) RegexpMatchMedium_32 255ns × (0.99,1.04) 253ns × (0.99,1.01) -1.05% (p=0.012) RegexpMatchMedium_1K 73.9µs × (0.98,1.04) 72.8µs × (1.00,1.00) -1.51% (p=0.005) RegexpMatchHard_32 3.92µs × (0.98,1.04) 3.85µs × (1.00,1.01) -1.88% (p=0.002) RegexpMatchHard_1K 120µs × (0.98,1.04) 117µs × (1.00,1.01) -2.02% (p=0.001) Revcomp 936ms × (0.95,1.08) 922ms × (0.97,1.08) ~ (p=0.234) Template 130ms × (0.98,1.04) 126ms × (0.99,1.01) -2.99% (p=0.000) TimeParse 638ns × (0.98,1.05) 628ns × (0.99,1.01) -1.54% (p=0.004) TimeFormat 674ns × (0.99,1.01) 668ns × (0.99,1.01) -0.80% (p=0.001) The slowdown of the first few benchmarks seems to be due to the new atomic operations for certain small size allocations. But the larger benchmarks mostly improve, probably due to the decreased memory pressure from having half as much heap bitmap. CL 9706, which removes the (never used anymore) wbshadow mode, gets back what is lost in the early microbenchmarks. Change-Id: I37423a209e8ec2a2e92538b45cac5422a6acd32d Reviewed-on: https://go-review.googlesource.com/9705Reviewed-by: Rick Hudson <rlh@golang.org>
-
Russ Cox authored
For the conversion of the heap bitmap from 4-bit to 2-bit fields, I replaced heapBitsSetType with the dumbest thing that could possibly work: two atomic operations (atomicand8+atomicor8) per 2-bit field. This CL replaces that code with a proper implementation that avoids the atomics whenever possible. Benchmarks vs base CL (before the conversion to 2-bit heap bitmap) and vs Go 1.4 below. Compared to Go 1.4, SetTypePtr (a 1-pointer allocation) is 10ns slower because a race against the concurrent GC requires the use of an atomicor8 that used to be an ordinary write. This slowdown was present even in the base CL. Compared to both Go 1.4 and base, SetTypeNode8 (a 10-word allocation) is 10ns slower because it too needs a new atomic, because with the denser representation, the byte on the end of the allocation is now shared with the object next to it; this was not true with the 4-bit representation. Excluding these two (fundamental) slowdowns due to the use of atomics, the new code is noticeably faster than both Go 1.4 and the base CL. The next CL will reintroduce the ``typeDead'' optimization. Stats are from 5 runs on a MacBookPro10,2 (late 2012 Core i5). Compared to base CL (** = new atomic) name old mean new mean delta SetTypePtr 14.1ns × (0.99,1.02) 14.7ns × (0.93,1.10) ~ (p=0.175) SetTypePtr8 18.4ns × (1.00,1.01) 18.6ns × (0.81,1.21) ~ (p=0.866) SetTypePtr16 28.7ns × (1.00,1.00) 22.4ns × (0.90,1.27) -21.88% (p=0.015) SetTypePtr32 52.3ns × (1.00,1.00) 33.8ns × (0.93,1.24) -35.37% (p=0.001) SetTypePtr64 79.2ns × (1.00,1.00) 55.1ns × (1.00,1.01) -30.43% (p=0.000) SetTypePtr126 118ns × (1.00,1.00) 100ns × (1.00,1.00) -15.97% (p=0.000) SetTypePtr128 130ns × (0.92,1.19) 98ns × (1.00,1.00) -24.36% (p=0.008) SetTypePtrSlice 726ns × (0.96,1.08) 760ns × (1.00,1.00) ~ (p=0.152) SetTypeNode1 14.1ns × (0.94,1.15) 12.0ns × (1.00,1.01) -14.60% (p=0.020) SetTypeNode1Slice 135ns × (0.96,1.07) 88ns × (1.00,1.00) -34.53% (p=0.000) SetTypeNode8 20.9ns × (1.00,1.01) 32.6ns × (1.00,1.00) +55.37% (p=0.000) ** SetTypeNode8Slice 414ns × (0.99,1.02) 244ns × (1.00,1.00) -41.09% (p=0.000) SetTypeNode64 80.0ns × (1.00,1.00) 57.4ns × (1.00,1.00) -28.23% (p=0.000) SetTypeNode64Slice 2.15µs × (1.00,1.01) 1.56µs × (1.00,1.00) -27.43% (p=0.000) SetTypeNode124 119ns × (0.99,1.00) 100ns × (1.00,1.00) -16.11% (p=0.000) SetTypeNode124Slice 3.40µs × (1.00,1.00) 2.93µs × (1.00,1.00) -13.80% (p=0.000) SetTypeNode126 120ns × (1.00,1.01) 98ns × (1.00,1.00) -18.19% (p=0.000) SetTypeNode126Slice 3.53µs × (0.98,1.08) 3.02µs × (1.00,1.00) -14.49% (p=0.002) SetTypeNode1024 726ns × (0.97,1.09) 740ns × (1.00,1.00) ~ (p=0.451) SetTypeNode1024Slice 24.9µs × (0.89,1.37) 23.1µs × (1.00,1.00) ~ (p=0.476) Compared to Go 1.4 (** = new atomic) name old mean new mean delta SetTypePtr 5.71ns × (0.89,1.19) 14.68ns × (0.93,1.10) +157.24% (p=0.000) ** SetTypePtr8 19.3ns × (0.96,1.10) 18.6ns × (0.81,1.21) ~ (p=0.638) SetTypePtr16 30.7ns × (0.99,1.03) 22.4ns × (0.90,1.27) -26.88% (p=0.005) SetTypePtr32 51.5ns × (1.00,1.00) 33.8ns × (0.93,1.24) -34.40% (p=0.001) SetTypePtr64 83.6ns × (0.94,1.12) 55.1ns × (1.00,1.01) -34.12% (p=0.001) SetTypePtr126 137ns × (0.87,1.26) 100ns × (1.00,1.00) -27.10% (p=0.028) SetTypePtrSlice 865ns × (0.80,1.23) 760ns × (1.00,1.00) ~ (p=0.243) SetTypeNode1 15.2ns × (0.88,1.12) 12.0ns × (1.00,1.01) -20.89% (p=0.014) SetTypeNode1Slice 156ns × (0.93,1.16) 88ns × (1.00,1.00) -43.57% (p=0.001) SetTypeNode8 23.8ns × (0.90,1.18) 32.6ns × (1.00,1.00) +36.76% (p=0.003) ** SetTypeNode8Slice 502ns × (0.92,1.10) 244ns × (1.00,1.00) -51.46% (p=0.000) SetTypeNode64 85.6ns × (0.94,1.11) 57.4ns × (1.00,1.00) -32.89% (p=0.001) SetTypeNode64Slice 2.36µs × (0.91,1.14) 1.56µs × (1.00,1.00) -33.96% (p=0.002) SetTypeNode124 130ns × (0.91,1.12) 100ns × (1.00,1.00) -23.49% (p=0.004) SetTypeNode124Slice 3.81µs × (0.90,1.22) 2.93µs × (1.00,1.00) -23.09% (p=0.025) There are fewer benchmarks vs Go 1.4 because unrolling directly into the heap bitmap is not yet implemented, so those would not be meaningful comparisons. These benchmarks were not present in Go 1.4 as distributed. The backport to Go 1.4 is in github.com/rsc/go's go14bench branch, commit 71d5ee5. Change-Id: I95ed05a22bf484b0fc9efad549279e766c98d2b6 Reviewed-on: https://go-review.googlesource.com/9704Reviewed-by: Rick Hudson <rlh@golang.org>
-
Russ Cox authored
Previous CLs changed the representation of the non-heap type bitmaps to be 1-bit bitmaps (pointer or not). Before this CL, the heap bitmap stored a 2-bit type for each word and a mark bit and checkmark bit for the first word of the object. (There used to be additional per-word bits.) Reduce heap bitmap to 2-bit, with 1 dedicated to pointer or not, and the other used for mark, checkmark, and "keep scanning forward to find pointers in this object." See comments for details. This CL replaces heapBitsSetType with very slow but obviously correct code. A followup CL will optimize it. (Spoiler: the new code is faster than Go 1.4 was.) Change-Id: I999577a133f3cfecacebdec9cdc3573c235c7fb9 Reviewed-on: https://go-review.googlesource.com/9703Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
The type information in reflect.Type and the GC programs is now 1 bit per word, down from 2 bits. The in-memory unrolled type bitmap representation are now 1 bit per word, down from 4 bits. The conversion from the unrolled (now 1-bit) bitmap to the heap bitmap (still 4-bit) is not optimized. A followup CL will work on that, after the heap bitmap has been converted to 2-bit. The typeDead optimization, in which a special value denotes that there are no more pointers anywhere in the object, is lost in this CL. A followup CL will bring it back in the final form of heapBitsSetType. Change-Id: If61e67950c16a293b0b516a6fd9a1c755b6d5549 Reviewed-on: https://go-review.googlesource.com/9702Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
There was an old benchmark that measured this indirectly via allocation, but I don't understand how to factor out the allocation cost when interpreting the numbers. Replace with a benchmark that only calls heapBitsSetType, that does not allocate. This was not possible when the benchmark was first written, because heapBitsSetType had not been factored out of mallocgc. Change-Id: I30f0f02362efab3465a50769398be859832e6640 Reviewed-on: https://go-review.googlesource.com/9701Reviewed-by: Austin Clements <austin@google.com>
-
Daniel Morsing authored
Since we now have stack information for code running on the systemstack, we can traceback over it. To make cpu profiles useful, add a case in gentraceback to jump over systemstack switches. Fixes #10609. Change-Id: I21f47fcc802c07c5d4a1ada56374314e388a6dc7 Reviewed-on: https://go-review.googlesource.com/9506Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Patrick Mezard authored
I have around twenty of such values on a Windows 7 development machine. regedit displays (translated): "invalid 32-bits DWORD value". Change-Id: Ib37a414ee4c85e891b0a25fed2ddad9e105f5f4e Reviewed-on: https://go-review.googlesource.com/9901Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
-
Shenghou Ma authored
The trace-viewer doesn't use the Go license, so it makes sense to include the license text into the README.md file. While we're at here, reformat existing text using real Markdown syntax. Change-Id: I13e42d3cc6a0ca7e64e3d46ad460dc0460f7ed09 Reviewed-on: https://go-review.googlesource.com/9882Reviewed-by: Rob Pike <r@golang.org>
-
Mikio Hara authored
On darwin/arm, the test sometimes fails with: Process 557 resuming --- FAIL: TestWriteTimeoutFluctuation (1.64s) timeout_test.go:706: Write took over 1s; expected 0.1s FAIL Process 557 exited with status = 1 (0x00000001) go_darwin_arm_exec: timeout running tests This change increaes timeout on iOS builders from 1s to 3s as a temporarily fix. Updates #10775. Change-Id: Ifdaf99cf5b8582c1a636a0f7d5cc66bb276efd72 Reviewed-on: https://go-review.googlesource.com/9915Reviewed-by: Minux Ma <minux@golang.org>
-
- 10 May, 2015 2 commits
-
-
Shenghou Ma authored
Thanks Dmitri Shuralyov for pointing it out. Change-Id: If9c5ac0e56d601d327b2b682ee3548037439cb83 Reviewed-on: https://go-review.googlesource.com/9881Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Patrick Mezard authored
ExpandString correctly loops on the syscall until it reaches the required buffer size but truncates it before converting it back to string. The truncation limit is increased to 2^15 bytes which is the documented maximum ExpandEnvironmentStrings output size. This fixes TestExpandString on systems where len($PATH) > 1024. Change-Id: I2a6f184eeca939121b458bcffe1a436a50f3298e Reviewed-on: https://go-review.googlesource.com/9805Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Run-TryBot: Alex Brainman <alex.brainman@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 09 May, 2015 3 commits
-
-
Shenghou Ma authored
There is no SYS_INOTIFY_INIT on linux/arm64, only SYS_INOTIFY_INIT1. Change-Id: I97f430f2c2b910fb19dce495ff1adf591b8634fc Reviewed-on: https://go-review.googlesource.com/9870 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net>
-
Rahul Chaudhry authored
Change-Id: I72df4d979212d8af74a4d2763423346eb6ba14f2 Reviewed-on: https://go-review.googlesource.com/9892Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Mikio Hara authored
Change-Id: I4ade27469d82839b4396e1a88465dddc6b31d578 Reviewed-on: https://go-review.googlesource.com/9838Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
-
- 08 May, 2015 6 commits
-
-
Didier Spezia authored
The html package uses some specific code to escape special characters. Actually, the strings.Replacer can be used instead, and is much more efficient. The converse operation is more complex but can still be slightly optimized. Credits to Ken Bloom (kabloom@google.com), who first submitted a similar patch at https://codereview.appspot.com/141930043 Added benchmarks and slightly optimized UnescapeString. benchmark old ns/op new ns/op delta BenchmarkEscape-4 118713 19825 -83.30% BenchmarkEscapeNone-4 87653 3784 -95.68% BenchmarkUnescape-4 24888 23417 -5.91% BenchmarkUnescapeNone-4 14423 157 -98.91% benchmark old allocs new allocs delta BenchmarkEscape-4 9 2 -77.78% BenchmarkEscapeNone-4 0 0 +0.00% BenchmarkUnescape-4 2 2 +0.00% BenchmarkUnescapeNone-4 0 0 +0.00% benchmark old bytes new bytes delta BenchmarkEscape-4 24800 12288 -50.45% BenchmarkEscapeNone-4 0 0 +0.00% BenchmarkUnescape-4 10240 10240 +0.00% BenchmarkUnescapeNone-4 0 0 +0.00% Fixes #8697 Change-Id: I208261ed7cbe9b3dee6317851f8c0cf15528bce4 Reviewed-on: https://go-review.googlesource.com/9808 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Rob Pike authored
Delete the colon from RUN: for examples, since it's not there for tests. Add spaces to line up RUN and PASS: lines. Before: === RUN TestCount --- PASS: TestCount (0.00s) === RUN: ExampleFields --- PASS: ExampleFields (0.00s) After: === RUN TestCount --- PASS: TestCount (0.00s) === RUN ExampleFields --- PASS: ExampleFields (0.00s) Fixes #10594. Change-Id: I189c80a5d99101ee72d8c9c3a4639c07e640cbd8 Reviewed-on: https://go-review.googlesource.com/9846Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Didier Spezia authored
Pipelines are altered by inserting sanitizers if they are not already present. The code makes the assumption that the first operands of each commands are function identifiers. This is wrong, since they can also be methods. It results in a panic with templates such as {{1|print 2|.f 3}} Adds an extra type assertion to make sure only identifiers are compared with sanitizers. Fixes #10673 Change-Id: I3eb820982675231dbfa970f197abc5ef335ce86b Reviewed-on: https://go-review.googlesource.com/9801Reviewed-by: Rob Pike <r@golang.org>
-
Brett Cannon authored
In the Slices section of Effective Go, the os package's File.Read function is used as an example. Unfortunately the function signature does not match the function's code in the example, nor the os package's documentation. This change updates the function signature to match the os package and the pre-existing function code. Change-Id: Iae9f30c898d3a1ff8d47558ca104dfb3ff07112c Reviewed-on: https://go-review.googlesource.com/9845Reviewed-by: Rob Pike <r@golang.org>
-
Ian Lance Taylor authored
This will make it possible for C++ code to #include the export header file and see the correct declarations. The preamble remains the user's responsibility. It would not be appropriate to wrap the preamble in extern "C", because it might include header files that work with both C and C++. Putting those header files in an extern "C" block would break them. Change-Id: Ifb40879d709d26596d5c80b1307a49f1bd70932a Reviewed-on: https://go-review.googlesource.com/9850Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Shenghou Ma authored
With this patch, gdb seems to be able to corretly backtrace Go process on at least linux/{arm,arm64,ppc64}. Change-Id: Ic40a2a70e71a19c4a92e4655710f38a807b67e9a Reviewed-on: https://go-review.googlesource.com/9822 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
- 07 May, 2015 11 commits
-
-
Josh Bleecher Snyder authored
Fixes #8927. Change-Id: I638cddd439dd2d4eeef5474118cfcbde0c8a5a43 Reviewed-on: https://go-review.googlesource.com/9632 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: David Chase <drchase@google.com>
-
Russ Cox authored
Change-Id: Ic8cb8b1ed8715d6d5a53ec3cac385c0e93883514 Reviewed-on: https://go-review.googlesource.com/9825Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>
-
Russ Cox authored
It was testing the mark bits on what roots pointed at, but not the remainder of the live heap, because in CL 2991 I accidentally inverted this check during refactoring. The next CL will turn it back off by default again, but I want one run on the builders with the full checkmark checks. Change-Id: Ic166458cea25c0a56e5387fc527cb166ff2e5ada Reviewed-on: https://go-review.googlesource.com/9824 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>
-
Rick Hudson authored
Currently the heap minimum is set to 4MB which prevents our ability to collect at every allocation by setting GOGC=0. This adjust the heap minimum to 4MB*GOGC/100 thus reenabling collecting at every allocation. Fixes #10681 Change-Id: I912d027dac4b14ae535597e8beefa9ac3fb8ad94 Reviewed-on: https://go-review.googlesource.com/9814Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Russ Cox <rsc@golang.org>
-
Rob Pike authored
This was added during testing but is unnecessary. Thanks to gravis on GitHub for catching it. See #10574. Change-Id: I4a8f76d237e67f5a0ea189a0f3cadddbf426778a Reviewed-on: https://go-review.googlesource.com/9841Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Rob Pike authored
The code already handled high widths but not high precisions. Also make sure it handles the harder cases of %U. Fixes #10745. Change-Id: Ib4d394d49a9941eeeaff866dc59d80483e312a98 Reviewed-on: https://go-review.googlesource.com/9769Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Ian Lance Taylor authored
When using -buildmode=c-archive or c-shared, and when installing packages that use cgo, and when those packages export some functions via //export comments, then for each such package, install a pkg.h header file that declares the functions. This permits C code to #include the header when calling the Go functions. This is a little awkward to use when there are multiple packages that export functions, as you have to "go install" your c-archive/c-shared object and then pull it out of the package directory. When compiling your C code you have to -I pkg/$GOOS_$GOARCH. I haven't thought of any more convenient approach. It's simpler when only the main package has exported functions. When using c-shared you currently have to use a _shared suffix in the -I option; it would be nice to fix that somehow. Change-Id: I5d8cf08914b7d3c2b194120c77791d2732ffd26e Reviewed-on: https://go-review.googlesource.com/9798Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
David Chase authored
Try to provide hints for common areas, either *interface were interface would have been better, and note incorrect capitalization (but don't be more ambitious than that, at least not today). Added code and test for cases ptrInterface.ExistingMethod ptrInterface.unexportedMethod ptrInterface.MissingMethod ptrInterface.withwRongcASEdMethod interface.withwRongcASEdMethod ptrStruct.withwRongcASEdMethod struct.withwRongcASEdMethod also included tests for related errors to check for unintentional changes and consistent wording. Somewhat simplified from previous versions to avoid second- guessing user errors, yet also biased to point out most-likely root cause. Fixes #10700 Change-Id: I16693e93cc8d8ca195e7742a222d640c262105b4 Reviewed-on: https://go-review.googlesource.com/9731Reviewed-by: Russ Cox <rsc@golang.org>
-
Michael Hudson-Doyle authored
Current code just checks the consistency (that the functab is correctly sorted by PC, etc) of the moduledata object that the runtime belongs to. Change to check all of them. Change-Id: I544a44c5de7445fff87d3cdb4840ff04c5e2bf75 Reviewed-on: https://go-review.googlesource.com/9773Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
John Dethridge authored
When AttrByteSize is not present for a type, we can still determine the size in two more cases: when the type is a Typedef referring to another type, and when the type is a pointer and we know the default address size. entry.go: return after setting an error if the offset is out of range. Change-Id: I63a922ca4e4ad2fc9e9be3e5b47f59fae7d0eb5c Reviewed-on: https://go-review.googlesource.com/9663Reviewed-by: Austin Clements <austin@google.com>
-
Alex Brainman authored
No code changes, but the test passes here. And TryBots are happy. Fixes #8662 maybe Change-Id: Id37380f72a951c9ad7cf96c0db153c05167e62ed Reviewed-on: https://go-review.googlesource.com/9778Reviewed-by: Minux Ma <minux@golang.org>
-