libgomp: Implementing FOR construct

9.11 Implementing FOR construct
===============================

       #pragma omp parallel for
       for (i = lb; i <= ub; i++)
         body;

   becomes

       void subfunction (void *data)
       {
         long _s0, _e0;
         while (GOMP_loop_static_next (&_s0, &_e0))
         {
           long _e1 = _e0, i;
           for (i = _s0; i < _e1; i++)
             body;
         }
         GOMP_loop_end_nowait ();
       }

       GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
       subfunction (NULL);
       GOMP_parallel_end ();

       #pragma omp for schedule(runtime)
       for (i = 0; i < n; i++)
         body;

   becomes

       {
         long i, _s0, _e0;
         if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
           do {
             long _e1 = _e0;
             for (i = _s0; i < _e1; i++)
               body;
           } while (GOMP_loop_runtime_next (&_s0, &_e0));
         GOMP_loop_end ();
       }

   Note that while it looks as though propagating a non-constant STEP is
tricky, it isn't really.  We're explicitly allowed to evaluate it as
many times as we want, and any variables involved should automatically
be handled as PRIVATE or SHARED like any other variables, so the
expression should remain evaluable in the subfunction.  We can also
pull it into a local variable if we like, but since it's supposed to
remain unchanged, doing so is optional.
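
   As a source-level illustration of the point about STEP, the sketch
below uses a loop whose increment is a loop-invariant expression over an
ordinary variable.  The names `sum_strided', `a', `n' and `stride' are
hypothetical and chosen only for this example; nothing here is taken
from libgomp itself.  The code also compiles, and gives the same result,
when OpenMP is disabled and the pragma is ignored.

```c
#include <assert.h>

/* Hypothetical example: sum every (stride * 2)-th element of a[0..n).
   The step "stride * 2" is not a compile-time constant, but it is
   loop-invariant, so it may be re-evaluated wherever the loop bounds
   are computed.  "stride" is simply a shared variable here.  */
static long
sum_strided (const long *a, long n, long stride)
{
  long sum = 0;

  assert (stride > 0);

  #pragma omp parallel for reduction(+:sum)
  for (long i = 0; i < n; i += stride * 2)
    sum += a[i];

  return sum;
}
```

   With `a' holding the values 0 through 9 and `stride' equal to 1, the
loop visits indices 0, 2, 4, 6 and 8.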

   If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be able
to get away with no work-sharing context at all, since we can simply
perform the arithmetic directly in each thread to divide up the
iterations.  That would mean we wouldn't need to call any of these
routines.
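
   The per-thread arithmetic could look roughly like the following
sketch.  `static_chunk' is a hypothetical helper, not a libgomp routine,
and the particular way it distributes leftover iterations (one extra to
each of the lowest-numbered threads) is an assumption made for
illustration, not a statement of what libgomp does.  It assumes a
positive step and a half-open iteration space [lb, ub).

```c
#include <assert.h>

/* Hypothetical helper: compute the half-open range [*start, *end) that
   thread TID out of NTHREADS executes for the loop
       for (i = lb; i < ub; i += step)
   assuming step > 0.  No shared state is needed; every thread can
   compute its own range independently.  */
static void
static_chunk (long lb, long ub, long step, int nthreads, int tid,
              long *start, long *end)
{
  assert (step > 0 && nthreads > 0 && tid >= 0 && tid < nthreads);

  /* Total number of iterations, rounding up.  */
  long n = ub > lb ? (ub - lb + step - 1) / step : 0;

  /* Every thread gets q iterations; the first r threads get one more.  */
  long q = n / nthreads;
  long r = n % nthreads;
  long first = tid * q + (tid < r ? tid : r);
  long count = q + (tid < r ? 1 : 0);

  /* *end may exceed ub when step > 1, but a loop stepping from *start
     by step still runs exactly COUNT iterations.  */
  *start = lb + first * step;
  *end = lb + (first + count) * step;
}
```

   Each thread would then run `for (i = start; i < end; i += step) body;'
over its own range, with TID and NTHREADS obtained from
omp_get_thread_num and omp_get_num_threads.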

   There are separate routines for handling loops with an ORDERED
clause.  Bookkeeping for that is non-trivial...