SIAM News Blog

IEEE Floating Point vs. Posits

By Anne Greenbaum

John Gustafson has proposed a replacement for IEEE floating point arithmetic [1]. This goes by the name of Universal Numbers or Unums and has now evolved into “posits.” An account of the history of unums/posits can be found here. See also the “Great Debate” between Gustafson and William Kahan, the architect of IEEE 754. A problem with all of these sources is that many ideas seem to be mixed together and it is difficult to separate one from the other. John Cook has a clear explanation of posits.  

If a computer word consists of \(n\) bits that are either on or off, then \(2^n\) numbers (or exceptions) can be represented. Which \(2^n\) numbers should one choose? Suppose a real number is written in base \(2\) as

\[\pm 1. b_1 b_2 \ldots \times 2^E . \label{IEEE} \tag1 \]

In IEEE arithmetic, one bit is used for the sign (\(0\) for \(+\) or \(1\) for \(−\)), a certain number of bits are set aside for the exponent \(E\) (\(11\) bits for double precision when \(n = 64\) and \(8\) bits for single precision when \(n = 32\)), and the remaining bits represent the fraction \(b_1b_2\)...(\(52\) bits for double precision and \(23\) bits for single precision). The exponent field contains a shifted binary representation of the actual exponent \(E\), with a \(0\) followed by all \(1\)s representing \(E = 0\) and binary numbers smaller than this representing negative exponents and those greater than this representing positive exponents. This allows for numbers with a wide range of magnitudes: about \(2.2 \times 10^{-308}\) to \(1.8 \times 10^{308}\) for double precision, about \(1.2 \times 10^{-38}\) to \(3.4 \times 10^{38}\) for single precision (additionally, special exponent bit fields are used to flag exceptions and subnormal numbers). Yet in the vast majority of computations, numbers occur in a much narrower range. Couldn’t fewer bits be used to represent exponents of moderate size numbers and the unused exponent bits be used instead for the fraction?

This, as I understand it, is the idea behind posits. In posit arithmetic, one sets the word size \(n\) and the power \(p\) in the formula below: posit (\(n\),\(p\)). The exponent \(E\) in \((1)\) is then represented in the form

\[E = e + k \cdot 2^p , \label{posit} \tag2 \]

where a variable number of bits (called “regime” bits) are used to represent \(k\) and if, after representing \(k\), there are at least \(p\) remaining bits, then \(p\) bits are used to represent \(e\). The regime field consists of identical bits, either all \(0\)s or all \(1\)s and it is terminated by an opposite bit (or by running out of bits). If there are \(m\) regime bits, then \(k = −m\) if they are all \(0\)s, or \(k = m − 1\) if they are all \(1\)s. The bits representing \(e\) are interpreted as a standard nonnegative binary number. This presents a problem when there are fewer than \(p\) bits remaining after \(k\) has been set. It means that, while \(E = k \times 2^p\) and \(E = (k+1) \times 2^{p}\) may be representable, not all integer values of \(E\) in the range \([ k \times 2^p + 1, k \times 2^p + ( 2^p - 1 )]\) will be representable and numbers whose exponent lands in this range must be rounded to a value with large relative error and, if \(k\) is positive, large absolute error as well! A simple fix, but one that limits variability still further, would be to limit the number of bits that can be used to represent \(k\) so that there are always at least \(p\) bits remaining for \(e\) and at least some fixed number remaining after that for the fraction.

Figure 1. Toy posit system with 5-bit words, p = 0. Top line shows the nonnegative numbers that can be represented in this system. Bottom line shows the nonnegative numbers that can be represented if the regime is limited to 3 bits.

For example, suppose \(n = 5\) and \(p = 0\). The nonnegative posit numbers that can be represented are pictured in the top plot of Figure 1. Like IEEE, the first posit bit is the sign bit. Posit numbers with \(k = 0\) or \(k = −1\) need only one regime bit and one terminator bit (\(10\) or \(01\), respectively), leaving two fraction bits. This means that when a real number \(x\) is in the range [\(\frac{1}{2} , 2\)] and it is rounded to the nearest posit number, the rounded number satisfies \(\textrm{round}(x)=x(1+ \epsilon)\), where \(|\epsilon| \leq \frac{1}{8}\). Posit numbers with \(k = 1\) or \(k = −2\) need two regime bits and a terminator bit (\(110\) or \(001\), respectively), leaving only one fraction bit. This means that a real number \(x\) in the range [ \(\frac{1}{4}, \frac{1}{2} )\) or \((2, 4)\) satisfies \(\textrm{round}(x) = x(1 + \epsilon )\), where \(| \epsilon | \leq \frac{1}{4}\). Numbers outside these ranges have no fraction bits, resulting in the unfortunate situation that, for instance, \( \textrm{round}(3+3)=4 \: \textrm{or} \: 8\); that is, \(\textrm{round}(6)=6(1 \pm \frac {1}{3})\)When \(p > 0\), these problems are magnified. Again, a possible fix would be to limit the size of the regime field to, say, three bits (in which case a terminator bit would not even be needed if all three bits were used for the regime). This would ensure that there was always at least one bit for the fraction, and some numbers would have two fraction bits.  The numbers that can be represented with this system are shown in the bottom plot of Figure 1.

Despite the problems with posits, the idea of being able to set the word size and exponent range seems very appealing, especially if bits not needed for the exponent of a particular number could automatically be used for the fraction. Whether such a system can be implemented efficiently in hardware remains to be seen, but there are a number of companies currently working on this.

[1] Gustafson, John L. (2015). The End of Error: Unum Computing. Boca Raton, Fla.: CRC Press.

Anne Greenbaum is Boeing Professor of Applied Math at the University of Washington. She received her Ph.D. in mathematics from the University of California Berkeley in 1981, and worked at Lawrence Livermore National Laboratory and at the Courant Institute of Mathematical Sciences before joining the University of Washington in 1997 as a professor. In 2015, she was elected as a SIAM Fellow.
blog comments powered by Disqus