http://suburbia.net/~billm/floating-point/ieee.html#standards
IEEE recommends using round-to-nearest-even. This is the default
mode used by Linux. To be compatible we should probably run on WinNT
with the option _controlfp(_RC_NEAR, _MCW_RC).
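
For platforms without _controlfp(), the same request can be sketched
with the standard <cfenv> interface. This is only an illustration under
assumptions: the helper names below are hypothetical and not part of
ROOT, and <cfenv> is a C++11 facility that is not mentioned in this
thread.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper, not part of ROOT: select IEEE round-to-nearest-even
// through the standard <cfenv> interface. Returns true on success.
bool SetRoundToNearest()
{
   return std::fesetround(FE_TONEAREST) == 0;
}

// Round x to an integral value in the current rounding mode.
double RoundInCurrentMode(double x)
{
   volatile double v = x;   // keep the compiler from folding the call
   return std::nearbyint(v);
}

// With round-to-nearest active, ties go to the even neighbour.
void CheckTiesToEven()
{
   const bool ok = SetRoundToNearest();
   assert(ok);
   (void)ok;
   assert(RoundInCurrentMode(2.5) == 2.0);
   assert(RoundInCurrentMode(3.5) == 4.0);
}
```

On Visual C++ the equivalent request is the _controlfp(_RC_NEAR, _MCW_RC)
call quoted below, declared in <float.h>.
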
Cheers, Fons.
Valery Fine wrote:
>
> Dear Fred
> On 9 Apr 98 at 23:22, you wrote:
>
> > There is a Visual C++ runtime library function called _controlfp()
> > which sets the IEEE floating-point control word. The syntax is
> > _controlfp(value, mask); the mask indicates which bits of the control
> > word you want to change. There are lots of constants defined for
> > various parts of this word. In order to set round-to-nearest mode,
> > the syntax would be:
> >
> > _controlfp(_RC_NEAR, _MCW_RC);
> >
> > The other rounding mode choices are _RC_CHOP, _RC_UP, and _RC_DOWN.
>
> Thank you very much for your clarification.
>
> I am aware of this function, but I doubt ROOT should change the
> default setting.
>
> Another question is whether ROOT should supply some "common" method
> to give the user an opportunity to change this option on his / her
> own via the TSystem class.
>
> Since ROOT is a multi-platform package, it should be understood how
> this method would be implemented for the other supported platforms too.
>
> For the time being I would advise those who find it essential
> to control the rounding "by hand" to call this function themselves
> from their own methods.
>
> On the other hand, I'd like to call your attention to what the C / C++
> docs say:
>
> Floating to Integral
> ====================
>
> When an object of floating type is converted to an integral type,
> the fractional part is truncated. No rounding takes place in the
> conversion process. Truncation means that a number like 1.3 is
> converted to 1, and -1.3 is converted to -1.
>
> THAT'S TRUE for ANY setting of the FPU control bits (it is a LAW for
> each platform / compiler implementation, and NO hardware / CPU flag
> or mask is "allowed" to change it).
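
The truncation rule quoted above can be checked with a small sketch. It
assumes the standard <cfenv> interface and a platform that defines all
four FE_* rounding macros; ToIntUnderMode is a hypothetical helper, not
a ROOT function. Strictly conforming code may also need
"#pragma STDC FENV_ACCESS ON", which not every compiler honours.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: convert x to int while the given rounding mode
// (FE_TONEAREST, FE_UPWARD, FE_DOWNWARD or FE_TOWARDZERO) is active,
// then restore the previously active mode.
int ToIntUnderMode(double x, int mode)
{
   const int saved = std::fegetround();
   std::fesetround(mode);
   volatile double v = x;   // force the conversion to happen at run time
   const int result = static_cast<int>(v);
   std::fesetround(saved);
   return result;
}

// The conversion truncates toward zero in every rounding mode.
void CheckTruncation()
{
   const int modes[] = {FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO};
   for (int mode : modes) {
      assert(ToIntUnderMode( 1.3, mode) ==  1);
      assert(ToIntUnderMode(-1.3, mode) == -1);
   }
}
```
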
>
> And just another LAW:
>
> Integral to Floating
> ====================
> When an object of integral type is converted to a floating type and
> the original value cannot be represented exactly, the result is
> either the next higher or the next lower representable value.
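
A minimal sketch of the integral-to-floating rule, assuming that float
is IEEE single precision with a 24-bit significand, so that 2^24 + 1 =
16777217 has no exact float representation. The helper names are
hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: convert an int to float at run time.
float ToFloat(int n)
{
   volatile int v = n;
   volatile float f = static_cast<float>(v);   // store forces float precision
   return f;
}

// True if f is one of the two floats adjacent to 16777217 (2^24 + 1).
bool IsNeighbourOf16777217(float f)
{
   return f == 16777216.0f || f == 16777218.0f;
}

// 2^24 converts exactly; 2^24 + 1 lands on one of its two neighbours.
void CheckIntToFloat()
{
   assert(ToFloat(16777216) == 16777216.0f);
   assert(IsNeighbourOf16777217(ToFloat(16777217)));
}
```

Which of the two neighbours is chosen is left to the implementation;
the sketch only checks that the result is one of them.
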
>
> Valery
>
> =================================================================
> Dr. Valeri Faine (Fine)
> ------------------- Phone: +1 516 344 7806
> Brookhaven National Laboratory FAX : +1 516 344 4206
> Bldg. 510A /STAR mailto:fine@bnl.gov
> Upton, New York, 11973-5000 http://nicewww.cern.ch/~fine
> USA
>
> Dr. Valery Fine Telex : 911621 dubna su
> -----------
> LCTA/Joint Inst.for Nuclear Res. Phone : +7 09621 6 40 80
> 141980 Dubna, Moscow region Fax : +7 09621 6 51 45
> Russia mailto:fine@main1.jinr.dubna.su
--
Org:    CERN, European Laboratory for Particle Physics.
Mail:   1211 Geneve 23, Switzerland     Phone: +41 22 7679248
E-Mail: Fons.Rademakers@cern.ch         Fax:   +41 22 7677910