Commit 363617c7 authored by Robert Griesemer's avatar Robert Griesemer

math/big: added (internal) Float.form field for easier case distinctions

This is a fairly significant _internal_ representation change. Instead
of encoding 0, finite, infinite, and NaN values with special mantissa
and exponent values, a new (1 byte) 'form' field is used (without making
the Float struct bigger). The form field permits simpler and faster
case distinctions. As a side benefit, for zero and non-finite floats,
fewer fields need to be set. Also, the exponent range is not the full
int32 range (in the old format, infExp and nanExp were used to represent
Inf and NaN values and tests for those values sometimes didn't test
for the empty mantissa, so the range was reduced by 2 values).

The correspondence between the old and new fields is as follows.
Old representation:

x                 neg      mant         exp
---------------------------------------------------------------
+/-0              sign     empty        0
0 < |x| < +Inf    sign     mantissa     exponent
+/-Inf            sign     empty        infExp
NaN               false    empty        nanExp

New representation (- stands for ignored fields):

x                 neg      mant         exp         form
---------------------------------------------------------------
+/-0              sign     -            -           zero
0 < |x| < +Inf    sign     mantissa     exponent    finite
+/-Inf            sign     -            -           inf
NaN               -        -            -           nan

Client should not be affected by this change.

Change-Id: I7e355894d602ceb23f9ec01da755fe6e0386b101
Reviewed-on: https://go-review.googlesource.com/6870Reviewed-by: default avatarAlan Donovan <adonovan@google.com>
parent 0ff7c3ea
...@@ -37,6 +37,9 @@ const maxShift = _W - 4 ...@@ -37,6 +37,9 @@ const maxShift = _W - 4
// precision argument and keeping track of when a number was truncated early // precision argument and keeping track of when a number was truncated early
// (equivalent of "sticky bit" in binary rounding). // (equivalent of "sticky bit" in binary rounding).
// TODO(gri) Along the same lines, enforce some limit to shift magnitudes
// to avoid "infinitely" long running conversions (until we run out of space).
// Init initializes x to the decimal representation of m << shift (for // Init initializes x to the decimal representation of m << shift (for
// shift >= 0), or m >> -shift (for shift < 0). // shift >= 0), or m >> -shift (for shift < 0).
func (x *decimal) init(m nat, shift int) { func (x *decimal) init(m nat, shift int) {
......
...@@ -24,7 +24,9 @@ const debugFloat = true // enable for debugging ...@@ -24,7 +24,9 @@ const debugFloat = true // enable for debugging
// //
// with 0.5 <= mantissa < 1.0, and MinExp <= exponent <= MaxExp. // with 0.5 <= mantissa < 1.0, and MinExp <= exponent <= MaxExp.
// A Float may also be zero (+0, -0), infinite (+Inf, -Inf) or // A Float may also be zero (+0, -0), infinite (+Inf, -Inf) or
// not-a-number (NaN). // not-a-number (NaN). Except for NaNs, all Floats are ordered,
// and the ordering of two Floats x and y is defined by x.Cmp(y).
// NaNs are always different from any other Float value.
// //
// Each Float value also has a precision, rounding mode, and accuracy. // Each Float value also has a precision, rounding mode, and accuracy.
// The precision is the maximum number of mantissa bits available to // The precision is the maximum number of mantissa bits available to
...@@ -57,29 +59,49 @@ type Float struct { ...@@ -57,29 +59,49 @@ type Float struct {
prec uint32 prec uint32
mode RoundingMode mode RoundingMode
acc Accuracy acc Accuracy
form form
neg bool neg bool
mant nat mant nat
exp int32 exp int32
} }
// Internal representation: The mantissa bits x.mant of a Float x are stored // Exponent and precision limits.
// in a nat slice long enough to hold up to x.prec bits; the slice may (but
// doesn't have to) be shorter if the mantissa contains trailing 0 bits.
// Unless x is a zero, infinity, or NaN, x.mant is normalized such that the
// msb of x.mant == 1 (i.e., the msb is shifted all the way "to the left").
// Thus, if the mantissa has trailing 0 bits or x.prec is not a multiple
// of the the Word size _W, x.mant[0] has trailing zero bits. Zero, Inf, and
// NaN values have an empty mantissa and a 0, infExp, or NanExp exponent,
// respectively.
const ( const (
MaxExp = math.MaxInt32 // largest supported exponent MaxExp = math.MaxInt32 // largest supported exponent
MinExp = math.MinInt32 + 2 // smallest supported exponent MinExp = math.MinInt32 // smallest supported exponent
infExp = math.MinInt32 + 1 // exponent for Inf values
nanExp = math.MinInt32 + 0 // exponent for NaN values
MaxPrec = math.MaxUint32 // largest (theoretically) supported precision; likely memory-limited MaxPrec = math.MaxUint32 // largest (theoretically) supported precision; likely memory-limited
) )
// Internal representation: The mantissa bits x.mant of a nonzero finite
// Float x are stored in a nat slice long enough to hold up to x.prec bits;
// the slice may (but doesn't have to) be shorter if the mantissa contains
// trailing 0 bits. x.mant is normalized if the msb of x.mant == 1 (i.e.,
// the msb is shifted all the way "to the left"). Thus, if the mantissa has
// trailing 0 bits or x.prec is not a multiple of the the Word size _W,
// x.mant[0] has trailing zero bits. The msb of the mantissa corresponds
// to the value 0.5; the exponent x.exp shifts the binary point as needed.
//
// A zero or non-finite Float x ignores x.mant and x.exp. A NaN x ignores
// the sign x.neg.
//
// x form neg mant exp
// ----------------------------------------------------------
// ±0 zero sign - -
// 0 < |x| < +Inf finite sign mantissa exponent
// ±Inf inf sign - -
// NaN nan - - -
// A form value describes the internal representation.
type form byte
// The form value order is relevant - do not change!
const (
zero form = iota
finite
inf
nan
)
// RoundingMode determines how a Float value is rounded to the // RoundingMode determines how a Float value is rounded to the
// desired precision. Rounding may change the Float value; the // desired precision. Rounding may change the Float value; the
// rounding error is described by the Float's Accuracy. // rounding error is described by the Float's Accuracy.
...@@ -113,6 +135,13 @@ const ( ...@@ -113,6 +135,13 @@ const (
//go:generate stringer -type=Accuracy //go:generate stringer -type=Accuracy
func (x *Float) cmpZero() Accuracy {
if x.neg {
return Above
}
return Below
}
// SetPrec sets z's precision to prec and returns the (possibly) rounded // SetPrec sets z's precision to prec and returns the (possibly) rounded
// value of z. Rounding occurs according to z's rounding mode if the mantissa // value of z. Rounding occurs according to z's rounding mode if the mantissa
// cannot be represented in prec bits without loss of precision. // cannot be represented in prec bits without loss of precision.
...@@ -124,14 +153,10 @@ func (z *Float) SetPrec(prec uint) *Float { ...@@ -124,14 +153,10 @@ func (z *Float) SetPrec(prec uint) *Float {
// special case // special case
if prec == 0 { if prec == 0 {
z.prec = 0 z.prec = 0
if len(z.mant) != 0 { if z.form == finite {
// truncate and compute accuracy // truncate z to 0
z.setZero() z.acc = z.cmpZero()
acc := Below z.form = zero
if z.neg {
acc = Above
}
z.acc = acc
} }
return z return z
} }
...@@ -165,8 +190,11 @@ func (x *Float) Prec() uint { ...@@ -165,8 +190,11 @@ func (x *Float) Prec() uint {
// MinPrec returns the minimum precision required to represent x exactly // MinPrec returns the minimum precision required to represent x exactly
// (i.e., the smallest prec before x.SetPrec(prec) would start rounding x). // (i.e., the smallest prec before x.SetPrec(prec) would start rounding x).
// The result is 0 for ±0, ±Inf, and NaN. // The result is 0 if x is 0 or not finite.
func (x *Float) MinPrec() uint { func (x *Float) MinPrec() uint {
if x.form != finite {
return 0
}
return uint(len(x.mant))*_W - x.mant.trailingZeroBits() return uint(len(x.mant))*_W - x.mant.trailingZeroBits()
} }
...@@ -190,7 +218,7 @@ func (x *Float) Sign() int { ...@@ -190,7 +218,7 @@ func (x *Float) Sign() int {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
if len(x.mant) == 0 && x.exp != infExp { if x.form == zero || x.form == nan {
return 0 return 0
} }
if x.neg { if x.neg {
...@@ -219,18 +247,36 @@ func (x *Float) MantExp(mant *Float) (exp int) { ...@@ -219,18 +247,36 @@ func (x *Float) MantExp(mant *Float) (exp int) {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
if len(x.mant) != 0 { if x.form == finite {
exp = int(x.exp) exp = int(x.exp)
} }
if mant != nil { if mant != nil {
mant.Copy(x) mant.Copy(x)
if x.exp >= MinExp { if mant.form == finite {
mant.exp = 0 mant.exp = 0
} }
} }
return return
} }
// setExp sets the exponent for z.
// If e < MinExp, z becomes ±0; if e > MaxExp, z becomes ±Inf.
func (z *Float) setExp(e int64) {
if debugFloat && z.form != finite {
panic("setExp called for non-finite Float")
}
switch {
case e < MinExp:
// TODO(gri) check that accuracy is adjusted if necessary
z.form = zero // underflow
default:
z.exp = int32(e)
case e > MaxExp:
// TODO(gri) check that accuracy is adjusted if necessary
z.form = inf // overflow
}
}
// SetMantExp sets z to mant × 2**exp and and returns z. // SetMantExp sets z to mant × 2**exp and and returns z.
// The result z has the same precision and rounding mode // The result z has the same precision and rounding mode
// as mant. SetMantExp is an inverse of MantExp but does // as mant. SetMantExp is an inverse of MantExp but does
...@@ -253,7 +299,7 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float { ...@@ -253,7 +299,7 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float {
mant.validate() mant.validate()
} }
z.Copy(mant) z.Copy(mant)
if len(z.mant) == 0 { if z.form != finite {
return z return z
} }
z.setExp(int64(z.exp) + int64(exp)) z.setExp(int64(z.exp) + int64(exp))
...@@ -263,28 +309,28 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float { ...@@ -263,28 +309,28 @@ func (z *Float) SetMantExp(mant *Float, exp int) *Float {
// IsNeg reports whether x is negative. // IsNeg reports whether x is negative.
// A NaN value is not negative. // A NaN value is not negative.
func (x *Float) IsNeg() bool { func (x *Float) IsNeg() bool {
return x.neg && x.exp != nanExp return x.neg && x.form != nan
} }
// IsZero reports whether x is +0 or -0. // IsZero reports whether x is +0 or -0.
func (x *Float) IsZero() bool { func (x *Float) IsZero() bool {
return len(x.mant) == 0 && x.exp == 0 return x.form == zero
} }
// IsFinite reports whether -Inf < x < Inf. // IsFinite reports whether -Inf < x < Inf.
// A NaN value is not finite. // A NaN value is not finite.
func (x *Float) IsFinite() bool { func (x *Float) IsFinite() bool {
return len(x.mant) != 0 || x.exp == 0 return x.form <= finite
} }
// IsInf reports whether x is +Inf or -Inf. // IsInf reports whether x is +Inf or -Inf.
func (x *Float) IsInf() bool { func (x *Float) IsInf() bool {
return x.exp == infExp return x.form == inf
} }
// IsNaN reports whether x is a NaN value. // IsNaN reports whether x is a NaN value.
func (x *Float) IsNaN() bool { func (x *Float) IsNaN() bool {
return x.exp == nanExp return x.form == nan
} }
// IsInt reports whether x is an integer. // IsInt reports whether x is an integer.
...@@ -293,56 +339,32 @@ func (x *Float) IsInt() bool { ...@@ -293,56 +339,32 @@ func (x *Float) IsInt() bool {
if debugFloat { if debugFloat {
x.validate() x.validate()
} }
// pick off easy cases // special cases
if x.form != finite {
return x.form == zero
}
// x.form == finite
if x.exp <= 0 { if x.exp <= 0 {
// |x| < 1 || |x| == Inf || x is NaN return false
return len(x.mant) == 0 && x.exp == 0 // x == 0
} }
// x.exp > 0 // x.exp > 0
return x.prec <= uint32(x.exp) || x.MinPrec() <= uint(x.exp) // not enough bits for fractional mantissa return x.prec <= uint32(x.exp) || x.MinPrec() <= uint(x.exp) // not enough bits for fractional mantissa
} }
func (z *Float) setZero() {
z.mant = z.mant[:0]
z.exp = 0
}
func (z *Float) setInf() {
z.mant = z.mant[:0]
z.exp = infExp
}
// setExp sets the exponent for z.
// If e < MinExp, z becomes ±0; if e > MaxExp, z becomes ±Inf.
func (z *Float) setExp(e int64) {
switch {
case e < MinExp:
z.setZero()
default:
if len(z.mant) == 0 {
e = 0
}
z.exp = int32(e)
case e > MaxExp:
z.setInf()
}
}
// debugging support // debugging support
func (x *Float) validate() { func (x *Float) validate() {
if !debugFloat { if !debugFloat {
// avoid performance bugs // avoid performance bugs
panic("validate called but debugFloat is not set") panic("validate called but debugFloat is not set")
} }
const msb = 1 << (_W - 1) if x.form != finite {
return
}
m := len(x.mant) m := len(x.mant)
if m == 0 { if m == 0 {
// 0.0, Inf, or NaN panic("nonzero finite x with empty mantissa")
if x.exp != 0 && x.exp >= MinExp {
panic(fmt.Sprintf("empty matissa with invalid exponent %d", x.exp))
}
return
} }
const msb = 1 << (_W - 1)
if x.mant[m-1]&msb == 0 { if x.mant[m-1]&msb == 0 {
panic(fmt.Sprintf("msb not set in last word %#x of %s", x.mant[m-1], x.Format('p', 0))) panic(fmt.Sprintf("msb not set in last word %#x of %s", x.mant[m-1], x.Format('p', 0)))
} }
...@@ -362,20 +384,20 @@ func (x *Float) validate() { ...@@ -362,20 +384,20 @@ func (x *Float) validate() {
func (z *Float) round(sbit uint) { func (z *Float) round(sbit uint) {
if debugFloat { if debugFloat {
z.validate() z.validate()
if z.form > finite {
panic(fmt.Sprintf("round called for non-finite value %s", z))
}
} }
// z.form <= finite
z.acc = Exact z.acc = Exact
if z.form == zero {
// handle zero, Inf, and NaN
m := uint32(len(z.mant)) // present mantissa length in words
if m == 0 {
if z.exp == nanExp {
z.acc = Undef
}
return return
} }
// z.form == finite && len(z.mant) > 0
// m > 0 implies z.prec > 0 (checked by validate) // m > 0 implies z.prec > 0 (checked by validate)
m := uint32(len(z.mant)) // present mantissa length in words
bits := m * _W // present mantissa bits bits := m * _W // present mantissa bits
if bits <= z.prec { if bits <= z.prec {
// mantissa fits => nothing to do // mantissa fits => nothing to do
...@@ -487,6 +509,8 @@ func (z *Float) round(sbit uint) { ...@@ -487,6 +509,8 @@ func (z *Float) round(sbit uint) {
// zero out trailing bits in least-significant word // zero out trailing bits in least-significant word
z.mant[0] &^= lsb - 1 z.mant[0] &^= lsb - 1
// TODO(gri) can z.mant be all 0s at this point?
// update accuracy // update accuracy
if z.acc != Exact && z.neg { if z.acc != Exact && z.neg {
z.acc ^= Below | Above z.acc ^= Below | Above
...@@ -525,10 +549,11 @@ func (z *Float) setBits64(neg bool, x uint64) *Float { ...@@ -525,10 +549,11 @@ func (z *Float) setBits64(neg bool, x uint64) *Float {
z.acc = Exact z.acc = Exact
z.neg = neg z.neg = neg
if x == 0 { if x == 0 {
z.setZero() z.form = zero
return z return z
} }
// x != 0 // x != 0
z.form = finite
s := nlz64(x) s := nlz64(x)
z.mant = z.mant.setUint64(x << s) z.mant = z.mant.setUint64(x << s)
z.exp = int32(64 - s) // always fits z.exp = int32(64 - s) // always fits
...@@ -566,20 +591,20 @@ func (z *Float) SetFloat64(x float64) *Float { ...@@ -566,20 +591,20 @@ func (z *Float) SetFloat64(x float64) *Float {
z.prec = 53 z.prec = 53
} }
if math.IsNaN(x) { if math.IsNaN(x) {
z.SetNaN() return z.SetNaN()
return z
} }
z.acc = Exact z.acc = Exact
z.neg = math.Signbit(x) // handle -0, -Inf correctly z.neg = math.Signbit(x) // handle -0, -Inf correctly
if math.IsInf(x, 0) { if x == 0 {
z.setInf() z.form = zero
return z return z
} }
if x == 0 { if math.IsInf(x, 0) {
z.setZero() z.form = inf
return z return z
} }
// normalized x != 0 // normalized x != 0
z.form = finite
fmant, exp := math.Frexp(x) // get normalized mantissa fmant, exp := math.Frexp(x) // get normalized mantissa
z.mant = z.mant.setUint64(1<<63 | math.Float64bits(fmant)<<11) z.mant = z.mant.setUint64(1<<63 | math.Float64bits(fmant)<<11)
z.exp = int32(exp) // always fits z.exp = int32(exp) // always fits
...@@ -620,11 +645,11 @@ func (z *Float) SetInt(x *Int) *Float { ...@@ -620,11 +645,11 @@ func (z *Float) SetInt(x *Int) *Float {
z.acc = Exact z.acc = Exact
z.neg = x.neg z.neg = x.neg
if len(x.abs) == 0 { if len(x.abs) == 0 {
z.mant = z.mant[:0] z.form = zero
z.exp = 0
return z return z
} }
// x != 0 // x != 0
z.form = finite
z.mant = z.mant.set(x.abs) z.mant = z.mant.set(x.abs)
fnorm(z.mant) fnorm(z.mant)
z.setExp(int64(bits)) z.setExp(int64(bits))
...@@ -655,18 +680,16 @@ func (z *Float) SetRat(x *Rat) *Float { ...@@ -655,18 +680,16 @@ func (z *Float) SetRat(x *Rat) *Float {
// z is unchanged and the result is always Exact. // z is unchanged and the result is always Exact.
func (z *Float) SetInf(sign int) *Float { func (z *Float) SetInf(sign int) *Float {
z.acc = Exact z.acc = Exact
z.form = inf
z.neg = sign < 0 z.neg = sign < 0
z.setInf()
return z return z
} }
// SetNaN sets z to a NaN value, and returns z. // SetNaN sets z to a NaN value, and returns z.
// The precision of z is unchanged and the result is always Exact. // The precision of z is unchanged and the result is always Undef.
func (z *Float) SetNaN() *Float { func (z *Float) SetNaN() *Float {
z.acc = Exact z.acc = Undef
z.neg = false z.form = nan
z.mant = z.mant[:0]
z.exp = nanExp
return z return z
} }
...@@ -685,6 +708,7 @@ func (z *Float) Set(x *Float) *Float { ...@@ -685,6 +708,7 @@ func (z *Float) Set(x *Float) *Float {
if z.prec == 0 { if z.prec == 0 {
z.prec = x.prec z.prec = x.prec
} }
z.form = x.form
z.neg = x.neg z.neg = x.neg
z.exp = x.exp z.exp = x.exp
z.mant = z.mant.set(x.mant) z.mant = z.mant.set(x.mant)
...@@ -706,6 +730,7 @@ func (z *Float) Copy(x *Float) *Float { ...@@ -706,6 +730,7 @@ func (z *Float) Copy(x *Float) *Float {
z.prec = x.prec z.prec = x.prec
z.mode = x.mode z.mode = x.mode
z.acc = x.acc z.acc = x.acc
z.form = x.form
z.neg = x.neg z.neg = x.neg
z.mant = z.mant.set(x.mant) z.mant = z.mant.set(x.mant)
z.exp = x.exp z.exp = x.exp
...@@ -739,31 +764,17 @@ func (x *Float) Uint64() (uint64, Accuracy) { ...@@ -739,31 +764,17 @@ func (x *Float) Uint64() (uint64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp {
case 0:
return 0, Exact // ±0
case infExp:
if x.neg {
return 0, Above // -Inf
}
return math.MaxUint64, Below // +Inf
case nanExp:
return 0, Undef // NaN
}
panic("unreachable")
}
if x.neg { if x.neg {
return 0, Above return 0, Above
} }
// x > 0 // 0 < x < +Inf
if x.exp <= 0 { if x.exp <= 0 {
// 0 < x < 1 // 0 < x < 1
return 0, Below return 0, Below
} }
// 1 <= x // 1 <= x < Inf
if x.exp <= 64 { if x.exp <= 64 {
// u = trunc(x) fits into a uint64 // u = trunc(x) fits into a uint64
u := high64(x.mant) >> (64 - uint32(x.exp)) u := high64(x.mant) >> (64 - uint32(x.exp))
...@@ -774,6 +785,21 @@ func (x *Float) Uint64() (uint64, Accuracy) { ...@@ -774,6 +785,21 @@ func (x *Float) Uint64() (uint64, Accuracy) {
} }
// x too large // x too large
return math.MaxUint64, Below return math.MaxUint64, Below
case zero:
return 0, Exact
case inf:
if x.neg {
return 0, Above
}
return math.MaxUint64, Below
case nan:
return 0, Undef
}
panic("unreachable")
} }
// Int64 returns the integer resulting from truncating x towards zero. // Int64 returns the integer resulting from truncating x towards zero.
...@@ -786,27 +812,10 @@ func (x *Float) Int64() (int64, Accuracy) { ...@@ -786,27 +812,10 @@ func (x *Float) Int64() (int64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp {
case 0:
return 0, Exact // ±0
case infExp:
if x.neg {
return math.MinInt64, Above // -Inf
}
return math.MaxInt64, Below // +Inf
case nanExp:
return 0, Undef // NaN
}
panic("unreachable")
}
// 0 < |x| < +Inf // 0 < |x| < +Inf
acc := Below acc := x.cmpZero()
if x.neg {
acc = Above
}
if x.exp <= 0 { if x.exp <= 0 {
// 0 < |x| < 1 // 0 < |x| < 1
return 0, acc return 0, acc
...@@ -832,8 +841,23 @@ func (x *Float) Int64() (int64, Accuracy) { ...@@ -832,8 +841,23 @@ func (x *Float) Int64() (int64, Accuracy) {
} }
return math.MinInt64, acc return math.MinInt64, acc
} }
// x == +Inf // x too large
return math.MaxInt64, Below
case zero:
return 0, Exact
case inf:
if x.neg {
return math.MinInt64, Above
}
return math.MaxInt64, Below return math.MaxInt64, Below
case nan:
return 0, Undef
}
panic("unreachable")
} }
// Float64 returns the closest float64 value of x // Float64 returns the closest float64 value of x
...@@ -844,27 +868,8 @@ func (x *Float) Float64() (float64, Accuracy) { ...@@ -844,27 +868,8 @@ func (x *Float) Float64() (float64, Accuracy) {
x.validate() x.validate()
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp {
case 0:
if x.neg {
var zero float64
return -zero, Exact
}
return 0, Exact
case infExp:
var sign int
if x.neg {
sign = -1
}
return math.Inf(sign), Exact
case nanExp:
return math.NaN(), Undef
}
panic("unreachable")
}
// 0 < |x| < +Inf // 0 < |x| < +Inf
var r Float var r Float
r.prec = 53 r.prec = 53
...@@ -876,6 +881,26 @@ func (x *Float) Float64() (float64, Accuracy) { ...@@ -876,6 +881,26 @@ func (x *Float) Float64() (float64, Accuracy) {
e := uint64(1022+r.exp) & 0x7ff // TODO(gri) check for overflow e := uint64(1022+r.exp) & 0x7ff // TODO(gri) check for overflow
m := high64(r.mant) >> 11 & (1<<52 - 1) m := high64(r.mant) >> 11 & (1<<52 - 1)
return math.Float64frombits(s | e<<52 | m), r.acc return math.Float64frombits(s | e<<52 | m), r.acc
case zero:
z := 0.0
if x.neg {
z = -z
}
return z, Exact
case inf:
sign := +1
if x.neg {
sign = -1
}
return math.Inf(sign), Exact
case nan:
return math.NaN(), Undef
}
panic("unreachable")
} }
// Int returns the result of truncating x towards zero; // Int returns the result of truncating x towards zero;
...@@ -889,34 +914,14 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) { ...@@ -889,34 +914,14 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) {
x.validate() x.validate()
} }
if z == nil { if z == nil && x.form <= finite {
// no need to do this for Inf and NaN
// but those are rare enough that we
// don't care
z = new(Int) z = new(Int)
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp {
case 0:
return z.SetInt64(0), Exact // 0
case infExp:
if x.neg {
return nil, Above
}
return nil, Below
case nanExp:
return nil, Undef
}
panic("unreachable")
}
// 0 < |x| < +Inf // 0 < |x| < +Inf
acc := Below acc := x.cmpZero()
if x.neg {
acc = Above
}
if x.exp <= 0 { if x.exp <= 0 {
// 0 < |x| < 1 // 0 < |x| < 1
return z.SetInt64(0), acc return z.SetInt64(0), acc
...@@ -944,6 +949,18 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) { ...@@ -944,6 +949,18 @@ func (x *Float) Int(z *Int) (*Int, Accuracy) {
z.abs = z.abs.shr(x.mant, allBits-exp) z.abs = z.abs.shr(x.mant, allBits-exp)
} }
return z, acc return z, acc
case zero:
return z.SetInt64(0), Exact
case inf:
return nil, x.cmpZero()
case nan:
return nil, Undef
}
panic("unreachable")
} }
// Rat returns the rational number corresponding to x; // Rat returns the rational number corresponding to x;
...@@ -956,30 +973,13 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) { ...@@ -956,30 +973,13 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) {
x.validate() x.validate()
} }
if z == nil { if z == nil && x.form <= finite {
// no need to do this for Inf and NaN
// but those are rare enough that we
// don't care
z = new(Rat) z = new(Rat)
} }
// special cases switch x.form {
if len(x.mant) == 0 { case finite:
switch x.exp { // 0 < |x| < +Inf
case 0:
return z.SetInt64(0), Exact // 0
case infExp:
if x.neg {
return nil, Above
}
return nil, Below
case nanExp:
return nil, Undef
}
panic("unreachable")
}
// 0 <= |x| < Inf
allBits := int32(len(x.mant)) * _W allBits := int32(len(x.mant)) * _W
// build up numerator and denominator // build up numerator and denominator
z.a.neg = x.neg z.a.neg = x.neg
...@@ -999,6 +999,18 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) { ...@@ -999,6 +999,18 @@ func (x *Float) Rat(z *Rat) (*Rat, Accuracy) {
z.norm() z.norm()
} }
return z, Exact return z, Exact
case zero:
return z.SetInt64(0), Exact
case inf:
return nil, x.cmpZero()
case nan:
return nil, Undef
}
panic("unreachable")
} }
// Abs sets z to the (possibly rounded) value |x| (the absolute value of x) // Abs sets z to the (possibly rounded) value |x| (the absolute value of x)
...@@ -1031,7 +1043,7 @@ func (z *Float) uadd(x, y *Float) { ...@@ -1031,7 +1043,7 @@ func (z *Float) uadd(x, y *Float) {
// http://www.vinc17.net/research/papers/rnc6.pdf // http://www.vinc17.net/research/papers/rnc6.pdf
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("uadd called with 0 argument") panic("uadd called with empty mantissa")
} }
// compute exponents ex, ey for mantissa with "binary point" // compute exponents ex, ey for mantissa with "binary point"
...@@ -1070,7 +1082,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1070,7 +1082,7 @@ func (z *Float) usub(x, y *Float) {
// by special-casing, and the code will diverge. // by special-casing, and the code will diverge.
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("usub called with 0 argument") panic("usub called with empty mantissa")
} }
ex := int64(x.exp) - int64(len(x.mant))*_W ex := int64(x.exp) - int64(len(x.mant))*_W
...@@ -1094,7 +1106,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1094,7 +1106,7 @@ func (z *Float) usub(x, y *Float) {
// operands may have cancelled each other out // operands may have cancelled each other out
if len(z.mant) == 0 { if len(z.mant) == 0 {
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return return
} }
// len(z.mant) > 0 // len(z.mant) > 0
...@@ -1107,7 +1119,7 @@ func (z *Float) usub(x, y *Float) { ...@@ -1107,7 +1119,7 @@ func (z *Float) usub(x, y *Float) {
// x and y must not be 0, Inf, or NaN. // x and y must not be 0, Inf, or NaN.
func (z *Float) umul(x, y *Float) { func (z *Float) umul(x, y *Float) {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("umul called with 0 argument") panic("umul called with empty mantissa")
} }
// Note: This is doing too much work if the precision // Note: This is doing too much work if the precision
...@@ -1128,7 +1140,7 @@ func (z *Float) umul(x, y *Float) { ...@@ -1128,7 +1140,7 @@ func (z *Float) umul(x, y *Float) {
// x and y must not be 0, Inf, or NaN. // x and y must not be 0, Inf, or NaN.
func (z *Float) uquo(x, y *Float) { func (z *Float) uquo(x, y *Float) {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("uquo called with 0 argument") panic("uquo called with empty mantissa")
} }
// mantissa length in words for desired result precision + 1 // mantissa length in words for desired result precision + 1
...@@ -1176,7 +1188,7 @@ func (z *Float) uquo(x, y *Float) { ...@@ -1176,7 +1188,7 @@ func (z *Float) uquo(x, y *Float) {
// while ignoring the signs of x and y. x and y must not be 0, Inf, or NaN. // while ignoring the signs of x and y. x and y must not be 0, Inf, or NaN.
func (x *Float) ucmp(y *Float) int { func (x *Float) ucmp(y *Float) int {
if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) { if debugFloat && (len(x.mant) == 0 || len(y.mant) == 0) {
panic("ucmp called with 0 argument") panic("ucmp called with empty mantissa")
} }
switch { switch {
...@@ -1248,14 +1260,14 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1248,14 +1260,14 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { // x == ±0 if x.form == zero {
z.Set(y) z.Set(y)
if len(z.mant) == 0 && z.exp == 0 { if z.form == zero {
z.neg = x.neg && y.neg // -0 + -0 == -0 z.neg = x.neg && y.neg // -0 + -0 == -0
} }
return z return z
...@@ -1265,6 +1277,7 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1265,6 +1277,7 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// x, y != 0 // x, y != 0
z.form = finite
z.neg = x.neg z.neg = x.neg
if x.neg == y.neg { if x.neg == y.neg {
// x + y == x + y // x + y == x + y
...@@ -1282,7 +1295,7 @@ func (z *Float) Add(x, y *Float) *Float { ...@@ -1282,7 +1295,7 @@ func (z *Float) Add(x, y *Float) *Float {
} }
// -0 is only possible for -0 + -0 // -0 is only possible for -0 + -0
if len(z.mant) == 0 { if z.form == zero {
z.neg = false z.neg = false
} }
...@@ -1303,14 +1316,14 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1303,14 +1316,14 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { // x == ±0 if x.form == zero {
z.Neg(y) z.Neg(y)
if len(z.mant) == 0 && z.exp == 0 { if z.form == zero {
z.neg = x.neg && !y.neg // -0 - 0 == -0 z.neg = x.neg && !y.neg // -0 - 0 == -0
} }
return z return z
...@@ -1320,6 +1333,7 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1320,6 +1333,7 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// x, y != 0 // x, y != 0
z.form = finite
z.neg = x.neg z.neg = x.neg
if x.neg != y.neg { if x.neg != y.neg {
// x - (-y) == x + y // x - (-y) == x + y
...@@ -1337,7 +1351,7 @@ func (z *Float) Sub(x, y *Float) *Float { ...@@ -1337,7 +1351,7 @@ func (z *Float) Sub(x, y *Float) *Float {
} }
// -0 is only possible for -0 - 0 // -0 is only possible for -0 - 0
if len(z.mant) == 0 { if z.form == zero {
z.neg = false z.neg = false
} }
...@@ -1360,24 +1374,25 @@ func (z *Float) Mul(x, y *Float) *Float { ...@@ -1360,24 +1374,25 @@ func (z *Float) Mul(x, y *Float) *Float {
z.neg = x.neg != y.neg z.neg = x.neg != y.neg
// special cases // special cases
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
// x == ±0 || y == ±0 // x == ±0 || y == ±0
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return z return z
} }
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form == zero || y.form == zero {
z.acc = Exact z.acc = Exact
z.setZero() z.form = zero
return z return z
} }
// x, y != 0 // x, y != 0
z.form = finite
z.umul(x, y) z.umul(x, y)
return z return z
} }
...@@ -1399,26 +1414,27 @@ func (z *Float) Quo(x, y *Float) *Float { ...@@ -1399,26 +1414,27 @@ func (z *Float) Quo(x, y *Float) *Float {
// special cases // special cases
z.acc = Exact z.acc = Exact
if len(x.mant) == 0 || len(y.mant) == 0 { if x.form != finite || y.form != finite {
if x.exp <= infExp || y.exp <= infExp { if x.form > finite || y.form > finite {
// TODO(gri) handle Inf separately // TODO(gri) handle Inf separately
return z.SetNaN() return z.SetNaN()
} }
if len(x.mant) == 0 { if x.form == zero {
if len(y.mant) == 0 { if y.form == zero {
return z.SetNaN() return z.SetNaN()
} }
z.setZero() z.form = zero
return z return z
} }
// x != 0 // x != 0
if len(y.mant) == 0 { if y.form == zero {
z.setInf() z.form = inf
return z return z
} }
} }
// x, y != 0 // x, y != 0
z.form = finite
z.uquo(x, y) z.uquo(x, y)
return z return z
} }
...@@ -1475,16 +1491,17 @@ func umax32(x, y uint32) uint32 { ...@@ -1475,16 +1491,17 @@ func umax32(x, y uint32) uint32 {
// +2 if x == +Inf // +2 if x == +Inf
// //
func (x *Float) ord() int { func (x *Float) ord() int {
m := 1 // common case var m int
if len(x.mant) == 0 { switch x.form {
m = 0 case finite:
if x.exp == infExp { m = 1
case zero:
return 0
case inf:
m = 2 m = 2
} case nan:
if x.exp == nanExp {
panic("unimplemented") panic("unimplemented")
} }
}
if x.neg { if x.neg {
m = -m m = -m
} }
......
...@@ -90,15 +90,20 @@ func TestFloatZeroValue(t *testing.T) { ...@@ -90,15 +90,20 @@ func TestFloatZeroValue(t *testing.T) {
func makeFloat(s string) *Float { func makeFloat(s string) *Float {
var x Float var x Float
if s == "Inf" || s == "+Inf" {
switch s {
case "0":
return &x
case "-0":
return x.Neg(&x)
case "Inf", "+Inf":
return x.SetInf(+1) return x.SetInf(+1)
} case "-Inf":
if s == "-Inf" {
return x.SetInf(-1) return x.SetInf(-1)
} case "NaN", "-NaN":
if s == "NaN" || s == "-NaN" {
return x.SetNaN() return x.SetNaN()
} }
x.SetPrec(1000) x.SetPrec(1000)
if _, ok := x.SetString(s); !ok { if _, ok := x.SetString(s); !ok {
panic(fmt.Sprintf("%q is not a valid float", s)) panic(fmt.Sprintf("%q is not a valid float", s))
...@@ -146,13 +151,6 @@ func TestFloatSetPrec(t *testing.T) { ...@@ -146,13 +151,6 @@ func TestFloatSetPrec(t *testing.T) {
if got, acc := x.String(), x.Acc(); got != test.want || acc != test.acc { if got, acc := x.String(), x.Acc(); got != test.want || acc != test.acc {
t.Errorf("%s.SetPrec(%d) = %s (%s); want %s (%s)", test.x, test.prec, got, acc, test.want, test.acc) t.Errorf("%s.SetPrec(%d) = %s (%s); want %s (%s)", test.x, test.prec, got, acc, test.want, test.acc)
} }
// look inside x and check correct value for x.exp
if len(x.mant) == 0 {
// ±0, ±Inf, or NaN
if x.exp != 0 && x.exp > MinExp {
t.Errorf("%s.SetPrec(%d): incorrect exponent %d", test.x, test.prec, x.exp)
}
}
} }
} }
...@@ -209,7 +207,7 @@ func feq(x, y *Float) bool { ...@@ -209,7 +207,7 @@ func feq(x, y *Float) bool {
if x.IsNaN() || y.IsNaN() { if x.IsNaN() || y.IsNaN() {
return x.IsNaN() && y.IsNaN() return x.IsNaN() && y.IsNaN()
} }
return x.Cmp(y) == 0 && x.neg == y.neg return x.Cmp(y) == 0 && x.IsNeg() == y.IsNeg()
} }
func TestFloatMantExp(t *testing.T) { func TestFloatMantExp(t *testing.T) {
...@@ -261,9 +259,9 @@ func TestFloatSetMantExp(t *testing.T) { ...@@ -261,9 +259,9 @@ func TestFloatSetMantExp(t *testing.T) {
{"Inf", 1234, "+Inf"}, {"Inf", 1234, "+Inf"},
{"+Inf", -1234, "+Inf"}, {"+Inf", -1234, "+Inf"},
{"-Inf", -1234, "-Inf"}, {"-Inf", -1234, "-Inf"},
{"0", -MaxExp - 1, "0"}, {"0", MinExp, "0"},
{"0.5", -MaxExp - 1, "+0"}, // exponent underflow {"0.25", MinExp, "+0"}, // exponent underflow
{"-0.5", -MaxExp - 1, "-0"}, // exponent underflow {"-0.25", MinExp, "-0"}, // exponent underflow
{"1", MaxExp, "+Inf"}, // exponent overflow {"1", MaxExp, "+Inf"}, // exponent overflow
{"2", MaxExp - 1, "+Inf"}, // exponent overflow {"2", MaxExp - 1, "+Inf"}, // exponent overflow
{"0.75", 1, "1.5"}, {"0.75", 1, "1.5"},
......
...@@ -96,11 +96,13 @@ func (z *Float) Scan(r io.ByteScanner, base int) (f *Float, b int, err error) { ...@@ -96,11 +96,13 @@ func (z *Float) Scan(r io.ByteScanner, base int) (f *Float, b int, err error) {
// special-case 0 // special-case 0
if len(z.mant) == 0 { if len(z.mant) == 0 {
z.acc = Exact z.acc = Exact
z.exp = 0 z.form = zero
return return
} }
// len(z.mant) > 0 // len(z.mant) > 0
z.form = finite
// The mantissa may have a decimal point (fcount <= 0) and there // The mantissa may have a decimal point (fcount <= 0) and there
// may be a nonzero exponent exp. The decimal point amounts to a // may be a nonzero exponent exp. The decimal point amounts to a
// division by b**(-fcount). An exponent means multiplication by // division by b**(-fcount). An exponent means multiplication by
...@@ -275,9 +277,13 @@ func (x *Float) bstring(buf []byte) []byte { ...@@ -275,9 +277,13 @@ func (x *Float) bstring(buf []byte) []byte {
if x.neg { if x.neg {
buf = append(buf, '-') buf = append(buf, '-')
} }
if len(x.mant) == 0 { if x.form == zero {
return append(buf, '0') return append(buf, '0')
} }
if debugFloat && x.form != finite {
panic("non-finite float")
}
// x != 0 // x != 0
// adjust mantissa to use exactly x.prec bits // adjust mantissa to use exactly x.prec bits
...@@ -306,9 +312,13 @@ func (x *Float) pstring(buf []byte) []byte { ...@@ -306,9 +312,13 @@ func (x *Float) pstring(buf []byte) []byte {
if x.neg { if x.neg {
buf = append(buf, '-') buf = append(buf, '-')
} }
if len(x.mant) == 0 { if x.form == zero {
return append(buf, '0') return append(buf, '0')
} }
if debugFloat && x.form != finite {
panic("non-finite float")
}
// x != 0 // x != 0
// remove trailing 0 words early // remove trailing 0 words early
......
...@@ -19,11 +19,17 @@ import "strconv" ...@@ -19,11 +19,17 @@ import "strconv"
// bigFtoa formats a float for the %e, %E, %f, %g, and %G formats. // bigFtoa formats a float for the %e, %E, %f, %g, and %G formats.
func (f *Float) bigFtoa(buf []byte, fmt byte, prec int) []byte { func (f *Float) bigFtoa(buf []byte, fmt byte, prec int) []byte {
// TODO(gri) handle Inf. if debugFloat && !f.IsFinite() {
panic("non-finite float")
}
// 1) convert Float to multiprecision decimal // 1) convert Float to multiprecision decimal
var mant nat
if f.form == finite {
mant = f.mant
}
var d decimal var d decimal
d.init(f.mant, int(f.exp)-f.mant.bitLen()) d.init(mant, int(f.exp)-f.mant.bitLen())
// 2) round to desired precision // 2) round to desired precision
shortest := false shortest := false
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment