Commit 363617c7 authored by Robert Griesemer's avatar Robert Griesemer

math/big: added (internal) Float.form field for easier case distinctions

This is a fairly significant _internal_ representation change. Instead
of encoding 0, finite, infinite, and NaN values with special mantissa
and exponent values, a new (1 byte) 'form' field is used (without making
the Float struct bigger). The form field permits simpler and faster
case distinctions. As a side benefit, for zero and non-finite floats,
fewer fields need to be set. Also, the exponent range is not the full
int32 range (in the old format, infExp and nanExp were used to represent
Inf and NaN values and tests for those values sometimes didn't test
for the empty mantissa, so the range was reduced by 2 values).

The correspondence between the old and new fields is as follows.
Old representation:

x                 neg      mant         exp
---------------------------------------------------------------
+/-0              sign     empty        0
0 < |x| < +Inf    sign     mantissa     exponent
+/-Inf            sign     empty        infExp
NaN               false    empty        nanExp

New representation (- stands for ignored fields):

x                 neg      mant         exp         form
---------------------------------------------------------------
+/-0              sign     -            -           zero
0 < |x| < +Inf    sign     mantissa     exponent    finite
+/-Inf            sign     -            -           inf
NaN               -        -            -           nan

Client should not be affected by this change.

Change-Id: I7e355894d602ceb23f9ec01da755fe6e0386b101
Reviewed-on: https://go-review.googlesource.com/6870Reviewed-by: default avatarAlan Donovan <adonovan@google.com>
parent 0ff7c3ea
...@@ -37,6 +37,9 @@ const maxShift = _W - 4 ...@@ -37,6 +37,9 @@ const maxShift = _W - 4
// precision argument and keeping track of when a number was truncated early // precision argument and keeping track of when a number was truncated early
// (equivalent of "sticky bit" in binary rounding). // (equivalent of "sticky bit" in binary rounding).
// TODO(gri) Along the same lines, enforce some limit to shift magnitudes
// to avoid "infinitely" long running conversions (until we run out of space).
// Init initializes x to the decimal representation of m << shift (for // Init initializes x to the decimal representation of m << shift (for
// shift >= 0), or m >> -shift (for shift < 0). // shift >= 0), or m >> -shift (for shift < 0).
func (x *decimal) init(m nat, shift int) { func (x *decimal) init(m nat, shift int) {
......
...@@ -24,7 +24,9 @@ const debugFloat = true // enable for debugging ...@@ -24,7 +24,9 @@ const debugFloat = true // enable for debugging
// //
// with 0.5 <= mantissa < 1.0, and MinExp <= exponent <= MaxExp. // with 0.5 <= mantissa < 1.0, and MinExp <= exponent <= MaxExp.
// A Float may also be zero (+0, -0), infinite (+Inf, -Inf) or // A Float may also be zero (+0, -0), infinite (+Inf, -Inf) or
// not-a-number (NaN). // not-a-number (NaN). Except for NaNs, all Floats are ordered,
// and the ordering of two Floats x and y is defined by x.Cmp(y).
// NaNs are always different from any other Float value.
// //
// Each Float value also has a precision, rounding mode, and accuracy. // Each Float value also has a precision, rounding mode, and accuracy.
// The precision is the maximum number of mantissa bits available to // The precision is the maximum number of mantissa bits available to
...@@ -57,27 +59,47 @@ type Float struct { ...@@ -57,27 +59,47 @@ type Float struct {
prec uint32 prec uint32
mode RoundingMode mode RoundingMode
acc Accuracy acc Accuracy
form form
neg bool neg bool
mant nat mant nat
exp int32 exp int32
} }
// Internal representation: The mantissa bits x.mant of a Float x are stored // Exponent and precision limits.
// in a nat slice long enough to hold up to x.prec bits; the slice may (but const (
// doesn't have to) be shorter if the mantissa contains trailing 0 bits. MaxExp = math.MaxInt32 // largest supported exponent
// Unless x is a zero, infinity, or NaN, x.mant is normalized such that the MinExp = math.MinInt32 // smallest supported exponent
// msb of x.mant == 1 (i.e., the msb is shifted all the way "to the left"). MaxPrec = math.MaxUint32 // largest (theoretically) supported precision; likely memory-limited
// Thus, if the mantissa has trailing 0 bits or x.prec is not a multiple )
// of the the Word size _W, x.mant[0] has trailing zero bits. Zero, Inf, and
// NaN values have an empty mantissa and a 0, infExp, or NanExp exponent, // Internal representation: The mantissa bits x.mant of a nonzero finite
// respectively. // Float x are stored in a nat slice long enough to hold up to x.prec bits;
// the slice may (but doesn't have to) be shorter if the mantissa contains
// trailing 0 bits. x.mant is normalized if the msb of x.mant == 1 (i.e.,
// the msb is shifted all the way "to the left"). Thus, if the mantissa has
// trailing 0 bits or x.prec is not a multiple of the the Word size _W,
// x.mant[0] has trailing zero bits. The msb of the mantissa corresponds
// to the value 0.5; the exponent x.exp shifts the binary point as needed.
//
// A zero or non-finite Float x ignores x.mant and x.exp. A NaN x ignores
// the sign x.neg.
//
// x form neg mant exp
// ----------------------------------------------------------
// ±0 zero sign - -
// 0 < |x| < +Inf finite sign mantissa exponent
// ±Inf inf sign - -
// NaN nan - - -
// A form value describes the internal representation.
type form byte
// The form value order is relevant - do not change!
const ( const (
MaxExp = math.MaxInt32 // largest supported exponent zero form = iota
MinExp = math.MinInt32 + 2 // smallest supported exponent finite
infExp = math.MinInt32 + 1 // exponent for Inf values inf
nanExp = math.MinInt32 + 0 // exponent for NaN values nan
MaxPrec = math.MaxUint32 // largest (theoretically) supported precision; likely memory-limited
) )
// RoundingMode determines how a Float value is rounded to the // RoundingMode determines how a Float value is rounded to the
...@@ -113,6 +135,13 @@ const ( ...@@ -113,6 +135,13 @@ const (
//go:generate stringer -type=Accuracy //go:generate stringer -type=Accuracy
func (x *Float) cmpZero() Accuracy {
if x.neg {
return Above
}
return Below
}
// SetPrec sets z's precision to prec and returns the (possibly) rounded // SetPrec sets z's precision to prec and returns the (possibly) rounded
// value of z. Rounding occurs according to z's rounding mode if the mantissa // value of z. Rounding occurs according to z's rounding mode if the mantissa
// cannot be represented in prec bits without loss of precision. // cannot be represented in prec bits without loss of precision.
...@@ -124,14 +153,10 @@ func (z *Float) SetPrec(prec uint) *Float { ...@@ -124,14 +153,10 @@ func (z *Float) SetPrec(prec uint) *Float {
// special case // special case
if prec == 0 { if prec == 0 {
z.prec = 0 z.prec = 0
if len(z.mant) != 0 { if z.form == finite {
// truncate and compute accuracy // truncate z to 0
z.setZero() z.acc = z.cmpZero()
acc := Below z.form = zero
if z.neg {
acc = Above
}
z.acc = acc
} }
return z return z
} }
...@@ -165,8 +190,11 @@ func (x *Float) Prec() uint { ...@@ -165,8 +190,11 @@ func (x *Float) Prec() uint {
// MinPrec returns the minimum precision required to represent x exactly // MinPrec returns the minimum precision required to represent x exactly
// (i.e., the smallest prec before x.SetPrec(prec) would start rounding x). // (i.e., the smallest prec before x.SetPrec(prec) would start rounding x).
// The result is 0 for ±0, ±Inf, and NaN. // The result is 0 if x is 0 or not finite.
func (x *Float) MinPrec() uint { func (x *Float) MinPrec() uint {
if x.form != finite {
return 0
}
return uint(len(x.mant))*_W - x.mant.trailingZeroBits() return uint(len(x.mant))*_W - x.mant.trailingZeroBits()
} }
...@@ -190,7 +218,7 @@ func (x *Float) Sign() int { ...@@ -190,7 +218,7 @@ func (x *Float) Sign() int {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
if len(x.mant) == 0 && x.exp != infExp { if x.form == zero || x.form == nan {
return 0 return 0
} }
if x.neg { if x.neg {
...@@ -219,18 +247,36 @@ func (x *Float) MantExp(mant *Float) (exp int) { ...@@ -219,18 +247,36 @@ func (x *Float) MantExp(mant *Float) (exp int) {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
if len(x.mant) != 0 { if x.form == finite {
exp = int(x.exp) exp = int(x.exp)
} }
if mant != nil { if mant != nil {
mant.Copy(x) mant.Copy(x)
if x.exp >= MinExp { if mant.form == finite {
mant.exp = 0 mant.exp = 0
} }
} }
return return
} }
// setExp sets the exponent for z.
// If e < MinExp, z becomes ±0; if e > MaxExp, z becomes ±Inf.
func (z *Float) setExp(e int64) {
if debugFloat && z.form != finite {
panic("setExp called for non-finite Float")
}
switch {
case e < MinExp:
// TODO(gri) check that accuracy is adjusted if necessary
z.form = zero // underflow
default:
z.exp = int32(e)
case e > MaxExp:
// TODO(gri) check that accuracy is adjusted if necessary
z.form = inf // overflow
}
}
// SetMantExp sets z to mant × 2**exp and and returns z. // SetMantExp sets z to mant × 2**exp and and returns z.
// The result z has the same precision and rounding mode // The result z has the same precision and rounding mode
// as mant. SetMantExp is an inverse of MantExp but does // as mant. SetMantExp is an inverse of MantExp but does
...@@ -253,7 +299,7 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float { ...@@ -253,7 +299,7 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float {
mant.validate() mant.validate()
} }
z.Copy(mant) z.Copy(mant)
if len(z.mant) == 0 { if z.form != finite {
return z return z
} }
z.setExp(int64(z.exp) + int64(exp)) z.setExp(int64(z.exp) + int64(exp))
...@@ -263,28 +309,28 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float { ...@@ -263,28 +309,28 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float {
// IsNeg reports whether x is negative. // IsNeg reports whether x is negative.
// A NaN value is not negative. // A NaN value is not negative.
func (x *Float) IsNeg() bool { func (x *Float) IsNeg() bool {
return x.neg && x.exp != nanExp return x.neg && x.form != nan
} }
// IsZero reports whether x is +0 or -0. // IsZero reports whether x is +0 or -0.
func (x *Float) IsZero() bool { func (x *Float) IsZero() bool {
return len(x.mant) == 0 && x.exp == 0 return x.form == zero
} }
// IsFinite reports whether -Inf < x < Inf. // IsFinite reports whether -Inf < x < Inf.
// A NaN value is not finite. // A NaN value is not finite.
func (x *Float) IsFinite() bool { func (x *Float) IsFinite() bool {
return len(x.mant) != 0 || x.exp == 0 return x.form <= finite
} }
// IsInf reports whether x is +Inf or -Inf. // IsInf reports whether x is +Inf or -Inf.
func (x *Float) IsInf() bool { func (x *Float) IsInf() bool {
return x.exp == infExp return x.form == inf
} }
// IsNaN reports whether x is a NaN value. // IsNaN reports whether x is a NaN value.
func (x *Float) IsNaN() bool { func (x *Float) IsNaN() bool {
return x.exp == nanExp return x.form == nan
} }
// IsInt reports whether x is an integer. // IsInt reports whether x is an integer.
...@@ -293,56 +339,32 @@ func (x *Float) IsInt() bool { ...@@ -293,56 +339,32 @@ func (x *Float) IsInt() bool {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
// pick off easy cases // special cases
if x.form != finite {
return x.form == zero
}
// x.form == finite
if x.exp <= 0 { if x.exp <= 0 {
// |x| < 1 || |x| == Inf || x is NaN return false
return len(x.mant) == 0 && x.exp == 0 // x == 0
} }
// x.exp > 0 // x.exp > 0
return x.prec <= uint32(x.exp) || x.MinPrec() <= uint(x.exp) // not enough bits for fractional mantissa return x.prec <= uint32(x.exp) || x.MinPrec() <= uint(x.exp) // not enough bits for fractional mantissa
} }
func (z *Float) setZero() {
z.mant = z.mant[:0]
z.exp = 0
}
func (z *Float) setInf() {
z.mant = z.mant[:0]
z.exp = infExp
}
// setExp sets the exponent for z.
// If e < MinExp, z becomes ±0; if e > MaxExp, z becomes ±Inf.
func (z *Float) setExp(e int64) {
switch {
case e < MinExp:
z.setZero()
default:
if len(z.mant) == 0 {
e = 0
}
z.exp = int32(e)
case e > MaxExp:
z.setInf()
}
}
// debugging support // debugging support
func (x *Float) validate() { func (x *Float) validate() {
if !debugFloat { if !debugFloat {
// avoid performance bugs // avoid performance bugs
panic("validate called but debugFloat is not set") panic("validate called but debugFloat is not set")
} }
const msb = 1 << (_W - 1) if x.form != finite {
return
}
m := len(x.mant) m := len(x.mant)
if m == 0 { if m == 0 {
// 0.0, Inf, or NaN panic("nonzero finite x with empty mantissa")
if x.exp != 0 && x.exp >= MinExp {
panic(fmt.Sprintf("empty matissa with invalid exponent %d", x.exp))
}
return
} }
const msb = 1 << (_W - 1)
if x.mant[m-1]&msb == 0 { if x.mant[m-1]&msb == 0 {
panic(fmt.Sprintf("msb not set in last word %#x of %s", x.mant[m-1], x.Format('p', 0))) panic(fmt.Sprintf("msb not set in last word %#x of %s", x.mant[m-1], x.Format('p', 0)))
} }
...@@ -362,21 +384,21 @@ func (x *Float) validate() { ...@@ -362,21 +384,21 @@ func (x *Float) validate() {
func (z *Float) round(sbit uint) { func (z *Float) round(sbit uint) {
if debugFloat { if debugFloat {
z.validate() z.validate()
if z.form > finite {
panic(fmt.Sprintf("round called for non-finite value %s", z))
}
} }
// z.form <= finite
z.acc = Exact z.acc = Exact
if z.form == zero {
// handle zero, Inf, and NaN
m := uint32(len(z.mant)) // present mantissa length in words
if m == 0 {
if z.exp == nanExp {
z.acc = Undef
}
return return
} }
// z.form == finite && len(z.mant) > 0
// m > 0 implies z.prec > 0 (checked by validate) // m > 0 implies z.prec > 0 (checked by validate)
bits := m * _W // present mantissa bits m := uint32(len(z.mant)) // present mantissa length in words
bits := m * _W // present mantissa bits
if bits <= z.prec { if bits <= z.prec {
// mantissa fits => nothing to do // mantissa fits => nothing to do
return return
...@@ -487,6 +509,8 @@ func (z *Float) round(sbit uint) { ...@@ -487,6 +509,8 @@ func (z *Float) round(sbit uint) {
// zero out trailing bits in least-significant word // zero out trailing bits in least-significant word
z.mant[0] &^= lsb - 1 z.mant[0] &^= lsb - 1
// TODO(gri) can z.mant be all 0s at this point?
// update accuracy // update accuracy
if z.acc != Exact && z.neg { if z.acc != Exact && z.neg {
z.acc ^= Below | Above z.acc ^= Below | Above
...@@ -525,10 +549,11 @@ func (z *Float) setBits64(neg bool, x uint64) *Float { ...@@ -525,10 +549,11 @@ func (z *Float) setBits64(neg bool, x uint64) *Float {
z.acc = Exact z.acc = Exact
z.neg = neg z.neg = neg
if x == 0 { if x == 0 {
z.setZero() z.form = zero
return z return z
} }
// x != 0 // x != 0
z.form = finite
s := nlz64(x) s := nlz64(x)
z.mant = z.mant.setUint64(x << s) z.mant = z.mant.setUint64(x << s)
z.exp = int32(64 - s) // always fits z.exp = int32(64 - s) // always fits
...@@ -566,20 +591,20 @@ func (z *Float) SetFloat64(x float64) *Float { ...@@ -566,20 +591,20 @@ func (z *Float) SetFloat64(x float64) *Float {
z.prec = 53 z.prec = 53
} }
if math.IsNaN(x) { if math.IsNaN(x) {
z.SetNaN() return z.SetNaN()
return z
} }
z.acc = Exact z.acc = Exact
z.neg = math.Signbit(x) // handle -0, -Inf correctly z.neg = math.Signbit(x) // handle -0, -Inf correctly
if math.IsInf(x, 0) { if x == 0 {
z.setInf() z.form = zero
return z return z
} }
if x == 0 { if math.IsInf(x, 0) {
z.setZero() z.form = inf
return z return z
} }
// normalized x != 0 // normalized x != 0
z.form = finite
fmant, exp := math.Frexp(x) // get normalized mantissa fmant, exp := math.Frexp(x) // get normalized mantissa
z.mant = z.mant.setUint64(1<<63 | math.Float64bits(fmant)<<11) z.mant = z.mant.setUint64(1<<63 | math.Float64bits(fmant)<<11)
z.exp = int32(exp) // always fits z.exp = int32(exp) // always fits
...@@ -620,11 +645,11 @@ func (z *Float) SetInt(x *Int) *Float { ...@@ -620,11 +645,11 @@ func (z *Float) SetInt(x *Int) *Float {
z.acc = Exact z.acc = Exact
z.neg = x.neg z.neg = x.neg
if len(x.abs) == 0 { if len(x.abs) == 0 {
z.mant = z.mant[:0] z.form = zero
z.exp = 0
return z return z
} }
// x != 0 // x != 0
z.form = finite
z.mant = z.mant.set(x.abs) z.mant = z.mant.set(x.abs)
fnorm(z.mant) fnorm(z.mant)
z.setExp(int64(bits)) z.setExp(int64(bits))
...@@ -655,18 +680,16 @@ func (z *Float) SetRat(x *Rat) *Float { ...@@ -655,18 +680,16 @@ func (z *Float) SetRat(x *Rat) *Float {
// z is unchanged and the result is always Exact. // z is unchanged and the result is always Exact.
func (z *Float) SetInf(sign int) *Float { func (z *Float) SetInf(sign int) *Float {
z.acc = Exact z.acc = Exact
z.form = inf
z.neg = sign < 0 z.neg = sign < 0
z.setInf()
return z return z
} }
// SetNaN sets z to a NaN value, and returns z. // SetNaN sets z to a NaN value, and returns z.
// The precision of z is unchanged and the result is always Exact. // The precision of z is unchanged and the result is always Undef.
func (z *Float) SetNaN() *Float { func (z *Float) SetNaN() *Float {
z.acc = Exact z.acc = Undef
z.neg = false z.form = nan
z.mant = z.mant[:0]
z.exp = nanExp
return z return z
} }
...@@ -685,6 +708,7 @@ func (z *Float) Set(x *Float) *Float { ...@@ -685,6 +708,7 @@ func (z *Float) Set(x *Float) *Float {
if z.prec == 0 { if z.prec == 0 {
z.prec = x.prec z.prec = x.prec
} }
z.form = x.form
z.neg = x.neg z.neg = x.neg
z.exp = x.exp z.exp = x.exp
z.mant = z.mant.set(x.mant) z.mant = z.mant.set(x.mant)
...@@ -706,6 +730,7 @@ func (z *Float) Copy(x *Float) *Float { ...@@ -706,6 +730,7 @@ func (z *Float) Copy(x *Float) *Float {
z.prec = x.prec z.prec = x.prec
z.mode = x.mode z.mode = x.mode
z.acc = x.acc z.acc = x.acc
z.form = x.form
z.neg = x.neg z.neg = x.neg
z.mant = z.mant.set(x.mant) z.mant = z.mant.set(x.mant)
z.exp = x.exp z.exp = x.exp
...@@ -739,41 +764,42 @@ func (x *Float) Uint64() (uint64, Accuracy) { ...@@ -739,41 +764,42 @@ func (x *Float) Uint64() (uint64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { if x.neg {
case 0: return 0, Above
return 0, Exact // ±0 }
case infExp: // 0 < x < +Inf
if x.neg { if x.exp <= 0 {
return 0, Above // -Inf // 0 < x < 1
return 0, Below
}
// 1 <= x < Inf
if x.exp <= 64 {
// u = trunc(x) fits into a uint64
u := high64(x.mant) >> (64 - uint32(x.exp))
if x.MinPrec() <= 64 {
return u, Exact
} }
return math.MaxUint64, Below // +Inf return u, Below // x truncated
case nanExp:
return 0, Undef // NaN
} }
panic("unreachable") // x too large
} return math.MaxUint64, Below
if x.neg { case zero:
return 0, Above return 0, Exact
}
// x > 0 case inf:
if x.exp <= 0 { if x.neg {
// 0 < x < 1 return 0, Above
return 0, Below
}
// 1 <= x
if x.exp <= 64 {
// u = trunc(x) fits into a uint64
u := high64(x.mant) >> (64 - uint32(x.exp))
if x.MinPrec() <= 64 {
return u, Exact
} }
return u, Below // x truncated return math.MaxUint64, Below
case nan:
return 0, Undef
} }
// x too large
return math.MaxUint64, Below panic("unreachable")
} }
// Int64 returns the integer resulting from truncating x towards zero. // Int64 returns the integer resulting from truncating x towards zero.
...@@ -786,54 +812,52 @@ func (x *Float) Int64() (int64, Accuracy) { ...@@ -786,54 +812,52 @@ func (x *Float) Int64() (int64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { // 0 < |x| < +Inf
case 0: acc := x.cmpZero()
return 0, Exact // ±0 if x.exp <= 0 {
case infExp: // 0 < |x| < 1
return 0, acc
}
// x.exp > 0
// 1 <= |x| < +Inf
if x.exp <= 63 {
// i = trunc(x) fits into an int64 (excluding math.MinInt64)
i := int64(high64(x.mant) >> (64 - uint32(x.exp)))
if x.neg { if x.neg {
return math.MinInt64, Above // -Inf i = -i
} }
return math.MaxInt64, Below // +Inf if x.MinPrec() <= 63 {
case nanExp: return i, Exact
return 0, Undef // NaN }
return i, acc // x truncated
} }
panic("unreachable") if x.neg {
} // check for special case x == math.MinInt64 (i.e., x == -(0.5 << 64))
if x.exp == 64 && x.MinPrec() == 1 {
acc = Exact
}
return math.MinInt64, acc
}
// x too large
return math.MaxInt64, Below
// 0 < |x| < +Inf case zero:
acc := Below return 0, Exact
if x.neg {
acc = Above
}
if x.exp <= 0 {
// 0 < |x| < 1
return 0, acc
}
// x.exp > 0
// 1 <= |x| < +Inf case inf:
if x.exp <= 63 {
// i = trunc(x) fits into an int64 (excluding math.MinInt64)
i := int64(high64(x.mant) >> (64 - uint32(x.exp)))
if x.neg { if x.neg {
i = -i return math.MinInt64, Above
}
if x.MinPrec() <= 63 {
return i, Exact
} }
return i, acc // x truncated return math.MaxInt64, Below
}
if x.neg { case nan:
// check for special case x == math.MinInt64 (i.e., x == -(0.5 << 64)) return 0, Undef
if x.exp == 64 && x.MinPrec() == 1 {
acc = Exact
}
return math.MinInt64, acc
} }
// x == +Inf
return math.MaxInt64, Below panic("unreachable")
} }
// Float64 returns the closest float64 value of x // Float64 returns the closest float64 value of x
...@@ -844,38 +868,39 @@ func (x *Float) Float64() (float64, Accuracy) { ...@@ -844,38 +868,39 @@ func (x *Float) Float64() (float64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { // 0 < |x| < +Inf
case 0: var r Float
if x.neg { r.prec = 53
var zero float64 r.Set(x)
return -zero, Exact var s uint64
} if r.neg {
return 0, Exact s = 1 << 63
case infExp:
var sign int
if x.neg {
sign = -1
}
return math.Inf(sign), Exact
case nanExp:
return math.NaN(), Undef
} }
panic("unreachable") e := uint64(1022+r.exp) & 0x7ff // TODO(gri) check for overflow
} m := high64(r.mant) >> 11 & (1<<52 - 1)
return math.Float64frombits(s | e<<52 | m), r.acc
case zero:
z := 0.0
if x.neg {
z = -z
}
return z, Exact
case inf:
sign := +1
if x.neg {
sign = -1
}
return math.Inf(sign), Exact
// 0 < |x| < +Inf case nan:
var r Float return math.NaN(), Undef
r.prec = 53
r.Set(x)
var s uint64
if r.neg {
s = 1 << 63
} }
e := uint64(1022+r.exp) & 0x7ff // TODO(gri) check for overflow
m := high64(r.mant) >> 11 & (1<<52 - 1) panic("unreachable")
return math.Float64frombits(s | e<<52 | m), r.acc
} }
// Int returns the result of truncating x towards zero; // Int returns the result of truncating x towards zero;
...@@ -889,61 +914,53 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) { ...@@ -889,61 +914,53 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) {
x.validate() x.validate()
} }
if z == nil { if z == nil && x.form <= finite {
// no need to do this for Inf and NaN
// but those are rare enough that we
// don't care
z = new(Int) z = new(Int)
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { // 0 < |x| < +Inf
case 0: acc := x.cmpZero()
return z.SetInt64(0), Exact // 0 if x.exp <= 0 {
case infExp: // 0 < |x| < 1
if x.neg { return z.SetInt64(0), acc
return nil, Above
}
return nil, Below
case nanExp:
return nil, Undef
} }
panic("unreachable") // x.exp > 0
}
// 0 < |x| < +Inf // 1 <= |x| < +Inf
acc := Below // determine minimum required precision for x
if x.neg { allBits := uint(len(x.mant)) * _W
acc = Above exp := uint(x.exp)
} if x.MinPrec() <= exp {
if x.exp <= 0 { acc = Exact
// 0 < |x| < 1 }
return z.SetInt64(0), acc // shift mantissa as needed
} if z == nil {
// x.exp > 0 z = new(Int)
}
z.neg = x.neg
switch {
case exp > allBits:
z.abs = z.abs.shl(x.mant, exp-allBits)
default:
z.abs = z.abs.set(x.mant)
case exp < allBits:
z.abs = z.abs.shr(x.mant, allBits-exp)
}
return z, acc
// 1 <= |x| < +Inf case zero:
// determine minimum required precision for x return z.SetInt64(0), Exact
allBits := uint(len(x.mant)) * _W
exp := uint(x.exp) case inf:
if x.MinPrec() <= exp { return nil, x.cmpZero()
acc = Exact
} case nan:
// shift mantissa as needed return nil, Undef
if z == nil {
z = new(Int)
}
z.neg = x.neg
switch {
case exp > allBits:
z.abs = z.abs.shl(x.mant, exp-allBits)
default:
z.abs = z.abs.set(x.mant)
case exp < allBits:
z.abs = z.abs.shr(x.mant, allBits-exp)
} }
return z, acc
panic("unreachable")
} }
// Rat returns the rational number corresponding to x; // Rat returns the rational number corresponding to x;
...@@ -956,49 +973,44 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) { ...@@ -956,49 +973,44 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) {
x.validate() x.validate()
} }
if z == nil { if z == nil && x.form <= finite {
// no need to do this for Inf and NaN
// but those are rare enough that we
// don't care
z = new(Rat) z = new(Rat)
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { // 0 < |x| < +Inf
case 0: allBits := int32(len(x.mant)) * _W
return z.SetInt64(0), Exact // 0 // build up numerator and denominator
case infExp: z.a.neg = x.neg
if x.neg { switch {
return nil, Above case x.exp > allBits:
} z.a.abs = z.a.abs.shl(x.mant, uint(x.exp-allBits))
return nil, Below z.b.abs = z.b.abs[:0] // == 1 (see Rat)
case nanExp: // z already in normal form
return nil, Undef default:
z.a.abs = z.a.abs.set(x.mant)
z.b.abs = z.b.abs[:0] // == 1 (see Rat)
// z already in normal form
case x.exp < allBits:
z.a.abs = z.a.abs.set(x.mant)
t := z.b.abs.setUint64(1)
z.b.abs = t.shl(t, uint(allBits-x.exp))
z.norm()
} }
panic("unreachable") return z, Exact
case zero:
return z.SetInt64(0), Exact
case inf:
return nil, x.cmpZero()
case nan:
return nil, Undef
} }
// 0 <= |x| < Inf panic("unreachable")
allBits := int32(len(x.mant)) * _W
// build up numerator and denominator
z.a.neg = x.neg
switch {
case x.exp > allBits:
z.a.abs = z.a.abs.shl(x.mant, uint(x.exp-allBits))
z.b.abs = z.b.abs[:0] // == 1 (see Rat)
// z already in normal form
default:
z.a.abs = z.a.abs.set(x.mant)
z.b.abs = z.b.abs[:0] // == 1 (see Rat)
// z already in normal form
case x.exp < allBits:
z.a.abs = z.a.abs.set(x.mant)
t := z.b.abs.setUint64(1)
z.b.abs = t.shl(t, uint(allBits-x.exp))
z.norm()
}
return z, Exact
} }
// Abs sets z to the (possibly rounded) value |x| (the absolute value of x) // Abs sets z to the (possibly rounded) value |x| (the absolute value of x)
...@@ -1031,7 +1043,7 @@ func (z *Float) uadd(x, y *Float) { ...@@ -1031,7 +1043,7 @@ func (z *Float) uadd(x, y *Float) {
// http://www.vinc17.net/research/papers/rnc6.pdf // http://www.vinc17.net/research/papers/rnc6.pdf
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("uadd called with 0 argument") panic("uadd called with empty mantissa")
} }
// compute exponents ex, ey for mantissa with "binary point" // compute exponents ex, ey for mantissa with "binary point"
...@@ -1070,7 +1082,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1070,7 +1082,7 @@ func (z *Float) usub(x, y *Float) {
// by special-casing, and the code will diverge. // by special-casing, and the code will diverge.
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("usub called with 0 argument") panic("usub called with empty mantissa")
} }
ex := int64(x.exp) - int64(len(x.mant))*_W ex := int64(x.exp) - int64(len(x.mant))*_W
...@@ -1094,7 +1106,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1094,7 +1106,7 @@ func (z *Float) usub(x, y *Float) {
// operands may have cancelled each other out // operands may have cancelled each other out
if len(z.mant) == 0 { if len(z.mant) == 0 {
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return return
} }
// len(z.mant) > 0 // len(z.mant) > 0
...@@ -1107,7 +1119,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1107,7 +1119,7 @@ func (z *Float) usub(x, y *Float) {
// x and y must not be 0, Inf, or NaN. // x and y must not be 0, Inf, or NaN.
func (z *Float) umul(x, y *Float) { func (z *Float) umul(x, y *Float) {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("umul called with 0 argument") panic("umul called with empty mantissa")
} }
// Note: This is doing too much work if the precision // Note: This is doing too much work if the precision
...@@ -1128,7 +1140,7 @@ func (z *Float) umul(x, y *Float) { ...@@ -1128,7 +1140,7 @@ func (z *Float) umul(x, y *Float) {
// x and y must not be 0, Inf, or NaN. // x and y must not be 0, Inf, or NaN.
func (z *Float) uquo(x, y *Float) { func (z *Float) uquo(x, y *Float) {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("uquo called with 0 argument") panic("uquo called with empty mantissa")
} }
// mantissa length in words for desired result precision + 1 // mantissa length in words for desired result precision + 1
...@@ -1176,7 +1188,7 @@ func (z *Float) uquo(x, y *Float) { ...@@ -1176,7 +1188,7 @@ func (z *Float) uquo(x, y *Float) {
// while ignoring the signs of x and y. x and y must not be 0, Inf, or NaN. // while ignoring the signs of x and y. x and y must not be 0, Inf, or NaN.
func (x *Float) ucmp(y *Float) int { func (x *Float) ucmp(y *Float) int {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("ucmp called with 0 argument") panic("ucmp called with empty mantissa")
} }
switch { switch {
...@@ -1248,14 +1260,14 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1248,14 +1260,14 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { // x == ±0 if x.form == zero {
z.Set(y) z.Set(y)
if len(z.mant) == 0 && z.exp == 0 { if z.form == zero {
z.neg = x.neg && y.neg // -0 + -0 == -0 z.neg = x.neg && y.neg // -0 + -0 == -0
} }
return z return z
...@@ -1265,6 +1277,7 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1265,6 +1277,7 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// x, y != 0 // x, y != 0
z.form = finite
z.neg = x.neg z.neg = x.neg
if x.neg == y.neg { if x.neg == y.neg {
// x + y == x + y // x + y == x + y
...@@ -1282,7 +1295,7 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1282,7 +1295,7 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// -0 is only possible for -0 + -0 // -0 is only possible for -0 + -0
if len(z.mant) == 0 { if z.form == zero {
z.neg = false z.neg = false
} }
...@@ -1303,14 +1316,14 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1303,14 +1316,14 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { // x == ±0 if x.form == zero {
z.Neg(y) z.Neg(y)
if len(z.mant) == 0 && z.exp == 0 { if z.form == zero {
z.neg = x.neg && !y.neg // -0 - 0 == -0 z.neg = x.neg && !y.neg // -0 - 0 == -0
} }
return z return z
...@@ -1320,6 +1333,7 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1320,6 +1333,7 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// x, y != 0 // x, y != 0
z.form = finite
z.neg = x.neg z.neg = x.neg
if x.neg != y.neg { if x.neg != y.neg {
// x - (-y) == x + y // x - (-y) == x + y
...@@ -1337,7 +1351,7 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1337,7 +1351,7 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// -0 is only possible for -0 - 0 // -0 is only possible for -0 - 0
if len(z.mant) == 0 { if z.form == zero {
z.neg = false z.neg = false
} }
...@@ -1360,24 +1374,25 @@ func (z *Float) Mul(x, y *Float) *Float { ...@@ -1360,24 +1374,25 @@ func (z *Float) Mul(x, y *Float) *Float {
z.neg = x.neg != y.neg z.neg = x.neg != y.neg
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
// x == ±0 || y == ±0 // x == ±0 || y == ±0
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return z return z
} }
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form == zero || y.form == zero {
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return z return z
} }
// x, y != 0 // x, y != 0
z.form = finite
z.umul(x, y) z.umul(x, y)
return z return z
} }
...@@ -1399,26 +1414,27 @@ func (z *Float) Quo(x, y *Float) *Float { ...@@ -1399,26 +1414,27 @@ func (z *Float) Quo(x, y *Float) *Float {
// special cases // special cases
z.acc = Exact z.acc = Exact
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { if x.form == zero {
if len(y.mant) == 0 { if y.form == zero {
return z.SetNaN() return z.SetNaN()
} }
z.setZero() z.form = zero
return z return z
} }
// x != 0 // x != 0
if len(y.mant) == 0 { if y.form == zero {
z.setInf() z.form = inf
return z return z
} }
} }
// x, y != 0 // x, y != 0
z.form = finite
z.uquo(x, y) z.uquo(x, y)
return z return z
} }
...@@ -1475,15 +1491,16 @@ func umax32(x, y uint32) uint32 { ...@@ -1475,15 +1491,16 @@ func umax32(x, y uint32) uint32 {
// +2 if x == +Inf // +2 if x == +Inf
// //
func (x *Float) ord() int { func (x *Float) ord() int {
m := 1 // common case var m int
if len(x.mant) == 0 { switch x.form {
m = 0 case finite:
if x.exp == infExp { m = 1
m = 2 case zero:
} return 0
if x.exp == nanExp { case inf:
panic("unimplemented") m = 2
} case nan:
panic("unimplemented")
} }
if x.neg { if x.neg {
m = -m m = -m
......
...@@ -90,15 +90,20 @@ func TestFloatZeroValue(t *testing.T) { ...@@ -90,15 +90,20 @@ func TestFloatZeroValue(t *testing.T) {
func makeFloat(s string) *Float { func makeFloat(s string) *Float {
var x Float var x Float
if s == "Inf" || s == "+Inf" {
switch s {
case "0":
return &x
case "-0":
return x.Neg(&x)
case "Inf", "+Inf":
return x.SetInf(+1) return x.SetInf(+1)
} case "-Inf":
if s == "-Inf" {
return x.SetInf(-1) return x.SetInf(-1)
} case "NaN", "-NaN":
if s == "NaN" || s == "-NaN" {
return x.SetNaN() return x.SetNaN()
} }
x.SetPrec(1000) x.SetPrec(1000)
if _, ok := x.SetString(s); !ok { if _, ok := x.SetString(s); !ok {
panic(fmt.Sprintf("%q is not a valid float", s)) panic(fmt.Sprintf("%q is not a valid float", s))
...@@ -146,13 +151,6 @@ func TestFloatSetPrec(t *testing.T) { ...@@ -146,13 +151,6 @@ func TestFloatSetPrec(t *testing.T) {
if got, acc := x.String(), x.Acc(); got != test.want || acc != test.acc { if got, acc := x.String(), x.Acc(); got != test.want || acc != test.acc {
t.Errorf("%s.SetPrec(%d) = %s (%s); want %s (%s)", test.x, test.prec, got, acc, test.want, test.acc) t.Errorf("%s.SetPrec(%d) = %s (%s); want %s (%s)", test.x, test.prec, got, acc, test.want, test.acc)
} }
// look inside x and check correct value for x.exp
if len(x.mant) == 0 {
// ±0, ±Inf, or NaN
if x.exp != 0 && x.exp > MinExp {
t.Errorf("%s.SetPrec(%d): incorrect exponent %d", test.x, test.prec, x.exp)
}
}
} }
} }
...@@ -209,7 +207,7 @@ func feq(x, y *Float) bool { ...@@ -209,7 +207,7 @@ func feq(x, y *Float) bool {
if x.IsNaN() || y.IsNaN() { if x.IsNaN() || y.IsNaN() {
return x.IsNaN() && y.IsNaN() return x.IsNaN() && y.IsNaN()
} }
return x.Cmp(y) == 0 && x.neg == y.neg return x.Cmp(y) == 0 && x.IsNeg() == y.IsNeg()
} }
func TestFloatMantExp(t *testing.T) { func TestFloatMantExp(t *testing.T) {
...@@ -261,11 +259,11 @@ func TestFloatSetMantExp(t *testing.T) { ...@@ -261,11 +259,11 @@ func TestFloatSetMantExp(t *testing.T) {
{"Inf", 1234, "+Inf"}, {"Inf", 1234, "+Inf"},
{"+Inf", -1234, "+Inf"}, {"+Inf", -1234, "+Inf"},
{"-Inf", -1234, "-Inf"}, {"-Inf", -1234, "-Inf"},
{"0", -MaxExp - 1, "0"}, {"0", MinExp, "0"},
{"0.5", -MaxExp - 1, "+0"}, // exponent underflow {"0.25", MinExp, "+0"}, // exponent underflow
{"-0.5", -MaxExp - 1, "-0"}, // exponent underflow {"-0.25", MinExp, "-0"}, // exponent underflow
{"1", MaxExp, "+Inf"}, // exponent overflow {"1", MaxExp, "+Inf"}, // exponent overflow
{"2", MaxExp - 1, "+Inf"}, // exponent overflow {"2", MaxExp - 1, "+Inf"}, // exponent overflow
{"0.75", 1, "1.5"}, {"0.75", 1, "1.5"},
{"0.5", 11, "1024"}, {"0.5", 11, "1024"},
{"-0.5", -2, "-0.125"}, {"-0.5", -2, "-0.125"},
......
...@@ -96,11 +96,13 @@ func (z *Float) Scan(r io.ByteScanner, base int) (f *Float, b int, err error) { ...@@ -96,11 +96,13 @@ func (z *Float) Scan(r io.ByteScanner, base int) (f *Float, b int, err error) {
// special-case 0 // special-case 0
if len(z.mant) == 0 { if len(z.mant) == 0 {
z.acc = Exact z.acc = Exact
z.exp = 0 z.form = zero
return return
} }
// len(z.mant) > 0 // len(z.mant) > 0
z.form = finite
// The mantissa may have a decimal point (fcount <= 0) and there // The mantissa may have a decimal point (fcount <= 0) and there
// may be a nonzero exponent exp. The decimal point amounts to a // may be a nonzero exponent exp. The decimal point amounts to a
// division by b**(-fcount). An exponent means multiplication by // division by b**(-fcount). An exponent means multiplication by
...@@ -275,9 +277,13 @@ func (x *Float) bstring(buf []byte) []byte { ...@@ -275,9 +277,13 @@ func (x *Float) bstring(buf []byte) []byte {
if x.neg { if x.neg {
buf = append(buf, '-') buf = append(buf, '-')
} }
if len(x.mant) == 0 { if x.form == zero {
return append(buf, '0') return append(buf, '0')
} }
if debugFloat && x.form != finite {
panic("non-finite float")
}
// x != 0 // x != 0
// adjust mantissa to use exactly x.prec bits // adjust mantissa to use exactly x.prec bits
...@@ -306,9 +312,13 @@ func (x *Float) pstring(buf []byte) []byte { ...@@ -306,9 +312,13 @@ func (x *Float) pstring(buf []byte) []byte {
if x.neg { if x.neg {
buf = append(buf, '-') buf = append(buf, '-')
} }
if len(x.mant) == 0 { if x.form == zero {
return append(buf, '0') return append(buf, '0')
} }
if debugFloat && x.form != finite {
panic("non-finite float")
}
// x != 0 // x != 0
// remove trailing 0 words early // remove trailing 0 words early
......
...@@ -19,11 +19,17 @@ import "strconv" ...@@ -19,11 +19,17 @@ import "strconv"
// bigFtoa formats a float for the %e, %E, %f, %g, and %G formats. // bigFtoa formats a float for the %e, %E, %f, %g, and %G formats.
func (f *Float) bigFtoa(buf []byte, fmt byte, prec int) []byte { func (f *Float) bigFtoa(buf []byte, fmt byte, prec int) []byte {
// TODO(gri) handle Inf. if debugFloat && !f.IsFinite() {
panic("non-finite float")
}
// 1) convert Float to multiprecision decimal // 1) convert Float to multiprecision decimal
var mant nat
if f.form == finite {
mant = f.mant
}
var d decimal var d decimal
d.init(f.mant, int(f.exp)-f.mant.bitLen()) d.init(mant, int(f.exp)-f.mant.bitLen())
// 2) round to desired precision // 2) round to desired precision
shortest := false shortest := false
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment