Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Robust bisect #15

Open
ivan-pi opened this issue Aug 4, 2022 · 4 comments
Open

Robust bisect #15

ivan-pi opened this issue Aug 4, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@ivan-pi
Copy link
Contributor

ivan-pi commented Aug 4, 2022

x3 = (x1 + x2) / 2.0_wp

The excerpt below is taken from Kahaner, Moler, & Nash, Numerical Methods and Software, 1989, Prentice-Hall, Inc.:

image

According to Herbie this rearrangement seems irrelevant in FP64, however I couldn't find an option to see what happens in lower precision: https://herbie.uwplse.org/demo/ebc6120a0224e43ec4e96ae4c79d8bf1537cfdf8.1.6/graph.html#history

@ivan-pi
Copy link
Contributor Author

ivan-pi commented Aug 4, 2022

The compilers do interpret the instructions differently. Question is whether the error of ADD+MUL is lower than the error of SUB + FMA, and whether it even matters for > FP32?

(a + b)/2

gfortran -O2 -march=native

__root_MOD_bisect:
        vmovsd  xmm0, QWORD PTR [rdi]
        vaddsd  xmm0, xmm0, QWORD PTR [rsi]
        vmulsd  xmm0, xmm0, QWORD PTR .LC1[rip]
        ret
.LC1:
        .long   0
        .long   1071644672

ifx -O2 -xHost

.LCPI1_0:
        .quad   0x3fe0000000000000
root_mp_bisect_:
        vmovsd  xmm0, qword ptr [rdi]
        vaddsd  xmm0, xmm0, qword ptr [rsi]
        vmulsd  xmm0, xmm0, qword ptr [rip + .LCPI1_0]
        ret

a + 0.5*(b - a)

gfortran -O2 -march=native

__root_MOD_bisect:
        vmovsd  xmm1, QWORD PTR [rdi]
        vmovsd  xmm0, QWORD PTR [rsi]
        vsubsd  xmm0, xmm0, xmm1
        vfmadd132sd     xmm0, xmm1, QWORD PTR .LC1[rip]
        ret
.LC1:
        .long   0
        .long   1071644672

ifx -O2 -xHost

.LCPI1_0:
        .quad   0x3fe0000000000000
root_mp_bisect_:
        vmovsd  xmm1, qword ptr [rdi]
        vmovsd  xmm0, qword ptr [rsi]
        vsubsd  xmm0, xmm0, xmm1
        vfmadd132sd     xmm0, xmm1, qword ptr [rip + .LCPI1_0]
        ret

@jacobwilliams
Copy link
Owner

We can compile the lib and tests using the -D REAL32 option and see if it makes a difference?

@jacobwilliams jacobwilliams added the enhancement New feature or request label Aug 4, 2022
@ivan-pi
Copy link
Contributor Author

ivan-pi commented Aug 4, 2022

In the Boost C++ root finding library (/boost/math/tools/roots.hpp), they use

         delta = 0.5F * (guess - max);  // lucky that 0.5F has an exact binary representation...
         result = guess - delta;

in the fallback bisection step of Newton's method.

However in the actual bisection routine they do it the "naive" way:

   while(count && (0 == tol(min, max)))
   {
      T mid = (min + max) / 2;   // 2 gets correctly cast to the required precision
      T fmid = f(mid);
      ...

So it looks like they are indifferent to this issue.

In SciPy's zero methods (only double precision, e.g. https://github.com/scipy/scipy/blob/v1.9.0/scipy/optimize/Zeros/bisect.c) they use the "correction" approach consistently in all of the solvers:

    dm = xb - xa;
    solver_stats->iterations = 0;
    for (i=0; i<iter; i++) {
        solver_stats->iterations++;
        dm *= .5;         // only double supported anyways, else should be (T) 0.5;
        xm = xa + dm;

@ivan-pi
Copy link
Contributor Author

ivan-pi commented Aug 30, 2022

The following paper is dedicated entirely to bisecting an interval:

Goualard, F. (2014). How do you compute the midpoint of an interval?. ACM Transactions on Mathematical Software (TOMS), 40(2), 1-25. https://doi.org/10.1145/2493882

A preprint is available here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants