| \input texinfo @c -*-texinfo-*- |
| |
| @c %**start of header |
| @setfilename libgomp.info |
| @settitle GNU libgomp |
| @c %**end of header |
| |
| |
| @copying |
| Copyright @copyright{} 2006-2013 Free Software Foundation, Inc. |
| |
| Permission is granted to copy, distribute and/or modify this document |
| under the terms of the GNU Free Documentation License, Version 1.3 or |
| any later version published by the Free Software Foundation; with the |
| Invariant Sections being ``Funding Free Software'', the Front-Cover |
| texts being (a) (see below), and with the Back-Cover Texts being (b) |
| (see below). A copy of the license is included in the section entitled |
| ``GNU Free Documentation License''. |
| |
| (a) The FSF's Front-Cover Text is: |
| |
| A GNU Manual |
| |
| (b) The FSF's Back-Cover Text is: |
| |
| You have freedom to copy and modify this GNU Manual, like GNU |
| software. Copies published by the Free Software Foundation raise |
| funds for GNU development. |
| @end copying |
| |
| @ifinfo |
| @dircategory GNU Libraries |
| @direntry |
| * libgomp: (libgomp). GNU OpenMP runtime library |
| @end direntry |
| |
| This manual documents the GNU implementation of the OpenMP API for |
| multi-platform shared-memory parallel programming in C/C++ and Fortran. |
| |
| Published by the Free Software Foundation |
| 51 Franklin Street, Fifth Floor |
| Boston, MA 02110-1301 USA |
| |
| @insertcopying |
| @end ifinfo |
| |
| |
| @setchapternewpage odd |
| |
| @titlepage |
| @title The GNU OpenMP Implementation |
| @page |
| @vskip 0pt plus 1filll |
| @comment For the @value{version-GCC} Version* |
| @sp 1 |
| Published by the Free Software Foundation @* |
| 51 Franklin Street, Fifth Floor@* |
| Boston, MA 02110-1301, USA@* |
| @sp 1 |
| @insertcopying |
| @end titlepage |
| |
| @summarycontents |
| @contents |
| @page |
| |
| |
| @node Top |
| @top Introduction |
| @cindex Introduction |
| |
| This manual documents the usage of libgomp, the GNU implementation of the |
| @uref{http://www.openmp.org, OpenMP} Application Programming Interface (API) |
| for multi-platform shared-memory parallel programming in C/C++ and Fortran. |
| |
| |
| |
| @comment |
| @comment When you add a new menu item, please keep the right hand |
| @comment aligned to the same column. Do not use tabs. This provides |
| @comment better formatting. |
| @comment |
| @menu |
| * Enabling OpenMP:: How to enable OpenMP for your applications. |
| * Runtime Library Routines:: The OpenMP runtime application programming |
| interface. |
| * Environment Variables:: Influencing runtime behavior with environment |
| variables. |
| * The libgomp ABI:: Notes on the external ABI presented by libgomp. |
| * Reporting Bugs:: How to report bugs in GNU OpenMP. |
| * Copying:: GNU General Public License says |
| how you can copy and share libgomp. |
| * GNU Free Documentation License:: |
| How you can copy and share this manual. |
| * Funding:: How to help assure continued work for free |
| software. |
| * Library Index:: Index of this documentation. |
| @end menu |
| |
| |
| @c --------------------------------------------------------------------- |
| @c Enabling OpenMP |
| @c --------------------------------------------------------------------- |
| |
| @node Enabling OpenMP |
| @chapter Enabling OpenMP |
| |
| To activate the OpenMP extensions for C/C++ and Fortran, the compile-time |
| flag @command{-fopenmp} must be specified. This enables the OpenMP directive |
| @code{#pragma omp} in C/C++ and @code{!$omp} directives in free form, |
| @code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form, |
| @code{!$} conditional compilation sentinels in free form and @code{c$}, |
| @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also |
| arranges for automatic linking of the OpenMP runtime library |
| (@ref{Runtime Library Routines}). |
| |
| A complete description of all OpenMP directives accepted may be found in |
| the @uref{http://www.openmp.org, OpenMP Application Program Interface} manual, |
| version 3.1. |
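
As a minimal sketch of what the flag changes at the source level: the OpenMP specification defines the preprocessor macro `_OPENMP` when OpenMP support is enabled, so a program can guard its use of the runtime library and still build as a serial program without `-fopenmp`. The helper names `available_threads` and `report_openmp` are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>   /* runtime library interface; used only with -fopenmp */
#endif

/* Hypothetical helper: number of threads a parallel region could use,
   or 1 when the file was compiled without OpenMP support.  */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Hypothetical helper: report how this translation unit was built.  */
void report_openmp(void)
{
#ifdef _OPENMP
    printf("OpenMP enabled, _OPENMP = %d\n", _OPENMP);
#else
    printf("OpenMP disabled\n");
#endif
}
```

Compiled with `gcc -fopenmp`, the first branch of each conditional is used and the runtime library is linked automatically; compiled without the flag, the same source builds with the serial branches.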
| |
| |
| @c --------------------------------------------------------------------- |
| @c Runtime Library Routines |
| @c --------------------------------------------------------------------- |
| |
| @node Runtime Library Routines |
| @chapter Runtime Library Routines |
| |
| The runtime routines described here are defined by section 3 of the OpenMP |
| specifications in version 3.1. The routines are structured in the following |
| three parts: |
| |
| Control threads, processors and the parallel environment. |
| |
| @menu |
| * omp_get_active_level:: Number of active parallel regions |
| * omp_get_ancestor_thread_num:: Ancestor thread ID |
| * omp_get_dynamic:: Dynamic teams setting |
| * omp_get_level:: Number of parallel regions |
| * omp_get_max_active_levels:: Maximum number of active regions |
| * omp_get_max_threads:: Maximum number of threads of parallel region |
| * omp_get_nested:: Nested parallel regions |
| * omp_get_num_procs:: Number of processors online |
| * omp_get_num_threads:: Size of the active team |
| * omp_get_schedule:: Obtain the runtime scheduling method |
| * omp_get_team_size:: Number of threads in a team |
| * omp_get_thread_limit:: Maximum number of threads |
| * omp_get_thread_num:: Current thread ID |
| * omp_in_parallel:: Whether a parallel region is active |
| * omp_in_final:: Whether in final or included task region |
| * omp_set_dynamic:: Enable/disable dynamic teams |
| * omp_set_max_active_levels:: Limits the number of active parallel regions |
| * omp_set_nested:: Enable/disable nested parallel regions |
| * omp_set_num_threads:: Set upper team size limit |
| * omp_set_schedule:: Set the runtime scheduling method |
| @end menu |
| |
| Initialize, set, test, unset and destroy simple and nested locks. |
| |
| @menu |
| * omp_init_lock:: Initialize simple lock |
| * omp_set_lock:: Wait for and set simple lock |
| * omp_test_lock:: Test and set simple lock if available |
| * omp_unset_lock:: Unset simple lock |
| * omp_destroy_lock:: Destroy simple lock |
| * omp_init_nest_lock:: Initialize nested lock |
| * omp_set_nest_lock:: Wait for and set nested lock |
| * omp_test_nest_lock:: Test and set nested lock if available |
| * omp_unset_nest_lock:: Unset nested lock |
| * omp_destroy_nest_lock:: Destroy nested lock |
| @end menu |
| |
| Portable, thread-based, wall clock timer. |
| |
| @menu |
| * omp_get_wtick:: Get timer precision. |
| * omp_get_wtime:: Elapsed wall clock time. |
| @end menu |
| |
| |
| |
| @node omp_get_active_level |
| @section @code{omp_get_active_level} -- Number of active parallel regions |
| @table @asis |
| @item @emph{Description}: |
| This function returns the nesting level of the active parallel regions |
| that enclose the call. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_active_level(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_active_level()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.19. |
| @end table |
| |
| |
| |
| @node omp_get_ancestor_thread_num |
| @section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID |
| @table @asis |
| @item @emph{Description}: |
| This function returns the thread identification number for the given |
| nesting level of the current thread. For values of @var{level} outside |
| zero to @code{omp_get_level} -1 is returned; if @var{level} is |
| @code{omp_get_level} the result is identical to @code{omp_get_thread_num}. |
| |
| @item @emph{C/C++} |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)} |
| @item @tab @code{integer level} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.17. |
| @end table |
| |
| |
| |
| @node omp_get_dynamic |
| @section @code{omp_get_dynamic} -- Dynamic teams setting |
| @table @asis |
| @item @emph{Description}: |
| This function returns @code{true} if dynamic adjustment of the number of |
| threads is enabled, @code{false} otherwise. Here, @code{true} and |
| @code{false} represent their language-specific counterparts. |
| |
| The dynamic team setting may be initialized at startup by the |
| @code{OMP_DYNAMIC} environment variable or at runtime using |
| @code{omp_set_dynamic}. If undefined, dynamic adjustment is |
| disabled by default. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{logical function omp_get_dynamic()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_dynamic}, @ref{OMP_DYNAMIC} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.8. |
| @end table |
| |
| |
| |
| @node omp_get_level |
| @section @code{omp_get_level} -- Obtain the current nesting level |
| @table @asis |
| @item @emph{Description}: |
| This function returns the nesting level of the parallel regions, |
| active or not, that enclose the call. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_level(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_level()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_active_level} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.16. |
| @end table |
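
The difference between `omp_get_level` and `omp_get_active_level` can be sketched with a parallel region that is deliberately executed by a single thread. Such a region counts as an enclosing parallel region but is not active, since an active region is one executed by a team of more than one thread. The helper name below is hypothetical, and the serial fallbacks are an assumption added only so that the sketch also builds without `-fopenmp`.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
#endif

/* Hypothetical helper: number of enclosing parallel regions that are
   inactive, observed inside a region run by a team of one thread.  */
int inactive_levels_in_one_thread_region(void)
{
    int level = 0, active = 0;

    /* A team of exactly one thread: the region counts towards
       omp_get_level but not towards omp_get_active_level.  */
    #pragma omp parallel num_threads(1) shared(level, active)
    {
        level = omp_get_level();
        active = omp_get_active_level();
    }
    return level - active;
}
```

When built with OpenMP and called from the sequential part of a program, the helper is expected to return 1; in the serial fallback build the pragma is ignored and it returns 0.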
| |
| |
| |
| @node omp_get_max_active_levels |
| @section @code{omp_get_max_active_levels} -- Maximum number of active regions |
| @table @asis |
| @item @emph{Description}: |
| This function obtains the maximum allowed number of nested, active parallel regions. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_max_active_levels}, @ref{omp_get_active_level} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.15. |
| @end table |
| |
| |
| |
| @node omp_get_max_threads |
| @section @code{omp_get_max_threads} -- Maximum number of threads of parallel region |
| @table @asis |
| @item @emph{Description}: |
| Return the maximum number of threads used for the current parallel region |
| that does not use the clause @code{num_threads}. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_max_threads()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.3. |
| @end table |
| |
| |
| |
| @node omp_get_nested |
| @section @code{omp_get_nested} -- Nested parallel regions |
| @table @asis |
| @item @emph{Description}: |
| This function returns @code{true} if nested parallel regions are |
| enabled, @code{false} otherwise. Here, @code{true} and @code{false} |
| represent their language-specific counterparts. |
| |
| Nested parallel regions may be initialized at startup by the |
| @code{OMP_NESTED} environment variable or at runtime using |
| @code{omp_set_nested}. If undefined, nested parallel regions are |
| disabled by default. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_nested(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{logical function omp_get_nested()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_nested}, @ref{OMP_NESTED} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.10. |
| @end table |
| |
| |
| |
| @node omp_get_num_procs |
| @section @code{omp_get_num_procs} -- Number of processors online |
| @table @asis |
| @item @emph{Description}: |
| Returns the number of processors online. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_num_procs()} |
| @end multitable |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.5. |
| @end table |
| |
| |
| |
| @node omp_get_num_threads |
| @section @code{omp_get_num_threads} -- Size of the active team |
| @table @asis |
| @item @emph{Description}: |
| Returns the number of threads in the current team. In a sequential section of |
| the program @code{omp_get_num_threads} returns 1. |
| |
| The default team size may be initialized at startup by the |
| @code{OMP_NUM_THREADS} environment variable. At runtime, the size |
| of the current team may be set either by the @code{NUM_THREADS} |
| clause or by @code{omp_set_num_threads}. If none of the above were |
| used to define a specific value and @code{OMP_DYNAMIC} is disabled, |
| one thread per CPU online is used. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_num_threads()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.2. |
| @end table |
| |
| |
| |
| @node omp_get_schedule |
| @section @code{omp_get_schedule} -- Obtain the runtime scheduling method |
| @table @asis |
| @item @emph{Description}: |
| Obtain the runtime scheduling method. The @var{kind} argument will be |
| set to the value @code{omp_sched_static}, @code{omp_sched_dynamic}, |
| @code{omp_sched_guided} or @code{omp_sched_auto}. The second argument, |
| @var{modifier}, is set to the chunk size. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *modifier);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, modifier)} |
| @item @tab @code{integer(kind=omp_sched_kind) kind} |
| @item @tab @code{integer modifier} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_schedule}, @ref{OMP_SCHEDULE} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.12. |
| @end table |
| |
| |
| |
| @node omp_get_team_size |
| @section @code{omp_get_team_size} -- Number of threads in a team |
| @table @asis |
| @item @emph{Description}: |
| This function returns the number of threads in a thread team to which |
| either the current thread or its ancestor belongs. For values of @var{level} |
| outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero, |
| 1 is returned, and for @code{omp_get_level}, the result is identical |
| to @code{omp_get_num_threads}. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)} |
| @item @tab @code{integer level} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.18. |
| @end table |
| |
| |
| |
| @node omp_get_thread_limit |
| @section @code{omp_get_thread_limit} -- Maximum number of threads |
| @table @asis |
| @item @emph{Description}: |
| Return the maximum number of threads of the program. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.13. |
| @end table |
| |
| |
| |
| @node omp_get_thread_num |
| @section @code{omp_get_thread_num} -- Current thread ID |
| @table @asis |
| @item @emph{Description}: |
| Returns a unique thread identification number within the current team. |
| In sequential parts of the program, @code{omp_get_thread_num} |
| always returns 0. In parallel regions the return value varies |
| from 0 to @code{omp_get_num_threads}-1 inclusive. The return |
| value of the master thread of a team is always 0. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_get_thread_num()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.4. |
| @end table |
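
The numbering described above can be checked with a short sketch: every thread of a team marks the slot that belongs to its own ID, and the master thread records the team size. The helper name `team_ids_are_dense` is hypothetical, and the serial fallbacks are an assumption added only so that the sketch also builds without `-fopenmp`.

```c
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_max_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns 1 if the thread IDs seen inside one
   parallel region are exactly 0 .. team size - 1, and 0 otherwise.  */
int team_ids_are_dense(void)
{
    int max = omp_get_max_threads();   /* upper bound on the team size */
    int team = 0;
    int ok = 1;
    int *seen = calloc((size_t)max, sizeof *seen);

    if (seen == NULL)
        return 0;

    #pragma omp parallel shared(seen, team)
    {
        int id = omp_get_thread_num();
        if (id >= 0 && id < max)
            seen[id] = 1;                  /* each thread writes its own slot */
        if (id == 0)
            team = omp_get_num_threads();  /* master records the team size */
    }

    if (team < 1 || team > max)
        ok = 0;
    for (int i = 0; ok && i < team; i++)
        if (!seen[i])
            ok = 0;

    free(seen);
    return ok;
}
```

The array is sized with `omp_get_max_threads` because the region has no `num_threads` clause, so that value bounds the team size.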
| |
| |
| |
| @node omp_in_parallel |
| @section @code{omp_in_parallel} -- Whether a parallel region is active |
| @table @asis |
| @item @emph{Description}: |
| This function returns @code{true} if currently running in parallel, |
| @code{false} otherwise. Here, @code{true} and @code{false} represent |
| their language-specific counterparts. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_in_parallel(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{logical function omp_in_parallel()} |
| @end multitable |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.6. |
| @end table |
| |
| |
| @node omp_in_final |
| @section @code{omp_in_final} -- Whether in final or included task region |
| @table @asis |
| @item @emph{Description}: |
| This function returns @code{true} if currently running in a final |
| or included task region, @code{false} otherwise. Here, @code{true} |
| and @code{false} represent their language-specific counterparts. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_in_final(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{logical function omp_in_final()} |
| @end multitable |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.20. |
| @end table |
| |
| |
| @node omp_set_dynamic |
| @section @code{omp_set_dynamic} -- Enable/disable dynamic teams |
| @table @asis |
| @item @emph{Description}: |
| Enable or disable the dynamic adjustment of the number of threads |
| within a team. The function takes the language-specific equivalent |
| of @code{true} and @code{false}, where @code{true} enables dynamic |
| adjustment of team sizes and @code{false} disables it. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_dynamic(int set);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(set)} |
| @item @tab @code{logical, intent(in) :: set} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{OMP_DYNAMIC}, @ref{omp_get_dynamic} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.7. |
| @end table |
| |
| |
| |
| @node omp_set_max_active_levels |
| @section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions |
| @table @asis |
| @item @emph{Description}: |
| This function limits the maximum allowed number of nested, active |
| parallel regions. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)} |
| @item @tab @code{integer max_levels} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_max_active_levels}, @ref{omp_get_active_level} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.14. |
| @end table |
| |
| |
| |
| @node omp_set_nested |
| @section @code{omp_set_nested} -- Enable/disable nested parallel regions |
| @table @asis |
| @item @emph{Description}: |
| Enable or disable nested parallel regions, i.e., whether team members |
| are allowed to create new teams. The function takes the language-specific |
| equivalent of @code{true} and @code{false}, where @code{true} enables |
| nested parallel regions and @code{false} disables them. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_nested(int set);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_nested(set)} |
| @item @tab @code{logical, intent(in) :: set} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{OMP_NESTED}, @ref{omp_get_nested} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.9. |
| @end table |
| |
| |
| |
| @node omp_set_num_threads |
| @section @code{omp_set_num_threads} -- Set upper team size limit |
| @table @asis |
| @item @emph{Description}: |
| Specifies the number of threads used by default in subsequent parallel |
| regions, if those do not specify a @code{num_threads} clause. The |
| argument of @code{omp_set_num_threads} shall be a positive integer. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_num_threads(int n);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(n)} |
| @item @tab @code{integer, intent(in) :: n} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.1. |
| @end table |
| |
| |
| |
| @node omp_set_schedule |
| @section @code{omp_set_schedule} -- Set the runtime scheduling method |
| @table @asis |
| @item @emph{Description}: |
| Sets the runtime scheduling method. The @var{kind} argument can have the |
| value @code{omp_sched_static}, @code{omp_sched_dynamic}, |
| @code{omp_sched_guided} or @code{omp_sched_auto}. Except for |
| @code{omp_sched_auto}, the chunk size is set to the value of |
| @var{modifier} if positive, or to the default value if zero or negative. |
| For @code{omp_sched_auto} the @var{modifier} argument is ignored. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int modifier);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, modifier)} |
| @item @tab @code{integer(kind=omp_sched_kind) kind} |
| @item @tab @code{integer modifier} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_schedule}, @ref{OMP_SCHEDULE} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.2.11. |
| @end table |
| |
| |
| |
| @node omp_init_lock |
| @section @code{omp_init_lock} -- Initialize simple lock |
| @table @asis |
| @item @emph{Description}: |
| Initialize a simple lock. After initialization, the lock is in |
| an unlocked state. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_init_lock(lock)} |
| @item @tab @code{integer(omp_lock_kind), intent(out) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_destroy_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.1. |
| @end table |
| |
| |
| |
| @node omp_set_lock |
| @section @code{omp_set_lock} -- Wait for and set simple lock |
| @table @asis |
| @item @emph{Description}: |
| Before setting a simple lock, the lock variable must be initialized by |
| @code{omp_init_lock}. The calling thread is blocked until the lock |
| is available. If the lock is already held by the current thread, |
| a deadlock occurs. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_lock(lock)} |
| @item @tab @code{integer(omp_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.3. |
| @end table |
| |
| |
| |
| @node omp_test_lock |
| @section @code{omp_test_lock} -- Test and set simple lock if available |
| @table @asis |
| @item @emph{Description}: |
| Before setting a simple lock, the lock variable must be initialized by |
| @code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock} |
| does not block if the lock is not available. This function returns |
| @code{true} upon success, @code{false} otherwise. Here, @code{true} and |
| @code{false} represent their language-specific counterparts. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{logical function omp_test_lock(lock)} |
| @item @tab @code{integer(omp_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.5. |
| @end table |
| |
| |
| |
| @node omp_unset_lock |
| @section @code{omp_unset_lock} -- Unset simple lock |
| @table @asis |
| @item @emph{Description}: |
| A simple lock about to be unset must have been locked by @code{omp_set_lock} |
| or @code{omp_test_lock} before. In addition, the lock must be held by the |
| thread calling @code{omp_unset_lock}. The lock then becomes unlocked. If one |
| or more threads are waiting to set the lock, one of them is chosen to |
| acquire it. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_unset_lock(lock)} |
| @item @tab @code{integer(omp_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_lock}, @ref{omp_test_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.4. |
| @end table |
| |
| |
| |
| @node omp_destroy_lock |
| @section @code{omp_destroy_lock} -- Destroy simple lock |
| @table @asis |
| @item @emph{Description}: |
| Destroy a simple lock. In order to be destroyed, a simple lock must be |
| in the unlocked state. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(lock)} |
| @item @tab @code{integer(omp_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_init_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.2. |
| @end table |
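
The life cycle of a simple lock (initialize, set, unset, destroy) can be sketched by protecting a shared counter inside a parallel loop. The helper name `locked_count` is hypothetical, and the serial fallbacks are an assumption added only so that the sketch also builds without `-fopenmp`.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without -fopenmp.  */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Hypothetical helper: increments a shared counter n times, with every
   increment protected by a simple lock, and returns the final value.  */
long locked_count(int n)
{
    omp_lock_t lock;
    long counter = 0;

    omp_init_lock(&lock);              /* lock starts out unlocked */

    #pragma omp parallel for shared(counter, lock)
    for (int i = 0; i < n; i++) {
        omp_set_lock(&lock);           /* blocks until the lock is free */
        counter++;
        omp_unset_lock(&lock);         /* released by the owning thread */
    }

    omp_destroy_lock(&lock);           /* lock is unlocked at this point */
    return counter;
}
```

Each thread unsets the lock before it tries to set it again, which avoids the deadlock that occurs when a thread sets a simple lock it already holds.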
| |
| |
| |
| @node omp_init_nest_lock |
| @section @code{omp_init_nest_lock} -- Initialize nested lock |
| @table @asis |
| @item @emph{Description}: |
| Initialize a nested lock. After initialization, the lock is in |
| an unlocked state and the nesting count is set to zero. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(lock)} |
| @item @tab @code{integer(omp_nest_lock_kind), intent(out) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_destroy_nest_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.1. |
| @end table |
| |
| |
| @node omp_set_nest_lock |
| @section @code{omp_set_nest_lock} -- Wait for and set nested lock |
| @table @asis |
| @item @emph{Description}: |
| Before setting a nested lock, the lock variable must be initialized by |
| @code{omp_init_nest_lock}. The calling thread is blocked until the lock |
| is available. If the lock is already held by the current thread, the |
| nesting count for the lock is incremented. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(lock)} |
| @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.3. |
| @end table |
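| |
| As a minimal sketch of the behavior described above (not taken from the |
| libgomp sources), the following program lets a recursive function set the |
| same nested lock once per recursion level. The helper @code{add_up} and the |
| variable @code{total} are hypothetical names chosen for illustration; the |
| program is assumed to be compiled with @option{-fopenmp}. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| static omp_nest_lock_t lock; |
| static long total;   /* protected by lock */ |
| |
| /* Hypothetical helper: adds n + (n-1) + ... + 1 to total.  The lock is |
|    set again at every recursion level by the thread that already holds it.  */ |
| static void |
| add_up (int n) |
| @{ |
|   omp_set_nest_lock (&lock);     /* nesting count goes up by one */ |
|   if (n > 0) |
|     @{ |
|       total += n; |
|       add_up (n - 1); |
|     @} |
|   omp_unset_nest_lock (&lock);   /* nesting count goes down by one */ |
| @} |
| |
| int |
| main (void) |
| @{ |
|   omp_init_nest_lock (&lock); |
| #pragma omp parallel num_threads (4) |
|   add_up (10); |
|   omp_destroy_nest_lock (&lock); |
|   printf ("%ld\n", total); |
|   return 0; |
| @} |
| @end smallexample |
| |
| Each thread of the team adds 55 to @code{total}. A simple lock |
| (@code{omp_lock_t}) would not be suitable here, because a thread is not |
| permitted to set a simple lock that it already holds. |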
| |
| |
| |
| @node omp_test_nest_lock |
| @section @code{omp_test_nest_lock} -- Test and set nested lock if available |
| @table @asis |
| @item @emph{Description}: |
| Before setting a nested lock, the lock variable must be initialized by |
| @code{omp_init_nest_lock}. Unlike @code{omp_set_nest_lock}, |
| @code{omp_test_nest_lock} does not block if the lock is not available. |
| If the lock is available or already held by the current thread, it is set |
| and the new nesting count is returned. Otherwise, the return value equals zero. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(lock)} |
| @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| |
| @item @emph{See also}: |
| @ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.5. |
| @end table |
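| |
| The return value can be illustrated with a small single-threaded sketch, |
| assuming compilation with @option{-fopenmp}. The variable names |
| @code{first} and @code{second} are only illustrative. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   omp_nest_lock_t lock; |
|   int first, second; |
| |
|   omp_init_nest_lock (&lock); |
|   first = omp_test_nest_lock (&lock);    /* lock was free: count is 1 */ |
|   second = omp_test_nest_lock (&lock);   /* same owner: count is 2 */ |
|   printf ("%d %d\n", first, second);     /* prints "1 2" */ |
| |
|   /* Unset once per successful set before destroying the lock.  */ |
|   omp_unset_nest_lock (&lock); |
|   omp_unset_nest_lock (&lock); |
|   omp_destroy_nest_lock (&lock); |
|   return 0; |
| @} |
| @end smallexample |
| |
| A thread that receives zero does not hold the lock and must not call |
| @code{omp_unset_nest_lock} for that attempt. |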
| |
| |
| |
| @node omp_unset_nest_lock |
| @section @code{omp_unset_nest_lock} -- Unset nested lock |
| @table @asis |
| @item @emph{Description}: |
| A nested lock about to be unset must have been locked by @code{omp_set_nest_lock} |
| or @code{omp_test_nest_lock} before. In addition, the lock must be held by the |
| thread calling @code{omp_unset_nest_lock}. Each call decrements the nesting |
| count; if the count drops to zero, the lock becomes unlocked. If one or more |
| threads attempted to set the lock before, one of them is then chosen to acquire it. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(lock)} |
| @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_set_nest_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.4. |
| @end table |
| |
| |
| |
| @node omp_destroy_nest_lock |
| @section @code{omp_destroy_nest_lock} -- Destroy nested lock |
| @table @asis |
| @item @emph{Description}: |
| Destroy a nested lock. In order to be destroyed, a nested lock must be |
| in the unlocked state and its nesting count must equal zero. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *lock);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(lock)} |
| @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: lock} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_init_nest_lock} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.3.2. |
| @end table |
| |
| |
| |
| @node omp_get_wtick |
| @section @code{omp_get_wtick} -- Get timer precision |
| @table @asis |
| @item @emph{Description}: |
| Gets the timer precision, i.e., the number of seconds between two |
| successive clock ticks. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{double omp_get_wtick(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{double precision function omp_get_wtick()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_wtime} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.4.2. |
| @end table |
| |
| |
| |
| @node omp_get_wtime |
| @section @code{omp_get_wtime} -- Elapsed wall clock time |
| @table @asis |
| @item @emph{Description}: |
| Elapsed wall clock time in seconds. The time is measured per thread; no |
| guarantee can be made that two distinct threads measure the same time. |
| Time is measured from some ``time in the past'', which is an arbitrary time |
| guaranteed not to change during the execution of the program. |
| |
| @item @emph{C/C++}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Prototype}: @tab @code{double omp_get_wtime(void);} |
| @end multitable |
| |
| @item @emph{Fortran}: |
| @multitable @columnfractions .20 .80 |
| @item @emph{Interface}: @tab @code{double precision function omp_get_wtime()} |
| @end multitable |
| |
| @item @emph{See also}: |
| @ref{omp_get_wtick} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 3.4.1. |
| @end table |
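| |
| A typical use is to take the difference of two readings made by the same |
| thread. The following is a minimal sketch, assuming compilation with |
| @option{-fopenmp}; the summation loop merely stands in for the work being |
| timed. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   double start, elapsed, sum = 0.0; |
| |
|   start = omp_get_wtime ();            /* first reading */ |
|   for (int i = 1; i <= 10000000; i++)  /* placeholder workload */ |
|     sum += 1.0 / i; |
|   elapsed = omp_get_wtime () - start;  /* same thread, second reading */ |
| |
|   printf ("sum = %f\n", sum); |
|   printf ("elapsed: %f s (timer resolution: %g s)\n", |
|           elapsed, omp_get_wtick ()); |
|   return 0; |
| @} |
| @end smallexample |
| |
| Intervals shorter than the value returned by @code{omp_get_wtick} cannot |
| be measured reliably. |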
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c Environment Variables |
| @c --------------------------------------------------------------------- |
| |
| @node Environment Variables |
| @chapter Environment Variables |
| |
| The variables @env{OMP_DYNAMIC}, @env{OMP_MAX_ACTIVE_LEVELS}, |
| @env{OMP_NESTED}, @env{OMP_NUM_THREADS}, @env{OMP_PROC_BIND}, |
| @env{OMP_SCHEDULE}, @env{OMP_STACKSIZE}, @env{OMP_THREAD_LIMIT} and |
| @env{OMP_WAIT_POLICY} are defined by section 4 of the OpenMP |
| specifications in version 3.1, while @env{GOMP_CPU_AFFINITY} and |
| @env{GOMP_STACKSIZE} are GNU extensions. |
| |
| @menu |
| * OMP_DYNAMIC:: Dynamic adjustment of threads |
| * OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions |
| * OMP_NESTED:: Nested parallel regions |
| * OMP_NUM_THREADS:: Specifies the number of threads to use |
| * OMP_STACKSIZE:: Set default thread stack size |
| * OMP_SCHEDULE:: How threads are scheduled |
| * OMP_THREAD_LIMIT:: Set the maximum number of threads |
| * OMP_WAIT_POLICY:: How waiting threads are handled |
| * OMP_PROC_BIND:: Whether threads may be moved between CPUs |
| * GOMP_CPU_AFFINITY:: Bind threads to specific CPUs |
| * GOMP_STACKSIZE:: Set default thread stack size |
| @end menu |
| |
| |
| @node OMP_DYNAMIC |
| @section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Enable or disable the dynamic adjustment of the number of threads |
| within a team. The value of this environment variable shall be |
| @code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is |
| disabled by default. |
| |
| @item @emph{See also}: |
| @ref{omp_set_dynamic} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.3 |
| @end table |
| |
| |
| |
| @node OMP_MAX_ACTIVE_LEVELS |
| @section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Specifies the initial value for the maximum number of nested parallel |
| regions. The value of this variable shall be a positive integer. |
| If undefined, the number of active levels is unlimited. |
| |
| @item @emph{See also}: |
| @ref{omp_set_max_active_levels} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.8 |
| @end table |
| |
| |
| |
| @node OMP_NESTED |
| @section @env{OMP_NESTED} -- Nested parallel regions |
| @cindex Environment Variable |
| @cindex Implementation specific setting |
| @table @asis |
| @item @emph{Description}: |
| Enable or disable nested parallel regions, i.e., whether team members |
| are allowed to create new teams. The value of this environment variable |
| shall be @code{TRUE} or @code{FALSE}. If undefined, nested parallel |
| regions are disabled by default. |
| |
| @item @emph{See also}: |
| @ref{omp_set_nested} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.5 |
| @end table |
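| |
| The effect of this setting can be observed with a small test program. |
| The following is a sketch only, assuming compilation with |
| @option{-fopenmp}; the team sizes requested with @code{num_threads} are |
| arbitrary. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   printf ("nested parallelism enabled: %d\n", omp_get_nested ()); |
| |
| #pragma omp parallel num_threads (2) |
|   @{ |
|     /* Each member of the outer team tries to create a team of its own.  */ |
| #pragma omp parallel num_threads (2) |
|     @{ |
| #pragma omp critical |
|       printf ("level %d: thread %d of %d\n", omp_get_level (), |
|               omp_get_thread_num (), omp_get_num_threads ()); |
|     @} |
|   @} |
|   return 0; |
| @} |
| @end smallexample |
| |
| With @env{OMP_NESTED} set to @code{FALSE}, each inner region would be |
| expected to run with a team of one thread; with @code{TRUE}, the inner |
| teams may consist of two threads each, subject to the limits of the system. |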
| |
| |
| |
| @node OMP_NUM_THREADS |
| @section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use |
| @cindex Environment Variable |
| @cindex Implementation specific setting |
| @table @asis |
| @item @emph{Description}: |
| Specifies the default number of threads to use in parallel regions. The |
| value of this variable shall be a comma-separated list of positive integers; |
| each value specifies the number of threads to use for the corresponding |
| nesting level. If undefined, one thread per CPU is used. |
| |
| @item @emph{See also}: |
| @ref{omp_set_num_threads} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.2 |
| @end table |
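| |
| As an illustration of the list form, the sketch below reports the number |
| of threads available to a parallel region at the outermost level and |
| inside a first-level region. It assumes compilation with |
| @option{-fopenmp}; the exact numbers reported depend on the runtime and |
| the system. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   printf ("outermost level: up to %d threads\n", omp_get_max_threads ()); |
| |
| #pragma omp parallel |
|   @{ |
|     /* Only one member of the team reports, to keep the output short.  */ |
| #pragma omp single |
|     printf ("inside first level: up to %d threads\n", |
|             omp_get_max_threads ()); |
|   @} |
|   return 0; |
| @} |
| @end smallexample |
| |
| When run with @code{OMP_NUM_THREADS=4,2}, the first line would be |
| expected to report 4 and the second 2, since the second list entry |
| applies to regions nested one level deep. |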
| |
| |
| |
| @node OMP_SCHEDULE |
| @section @env{OMP_SCHEDULE} -- How threads are scheduled |
| @cindex Environment Variable |
| @cindex Implementation specific setting |
| @table @asis |
| @item @emph{Description}: |
| Specifies the schedule type and chunk size used by loops with |
| @code{schedule(runtime)}. The value of the variable shall have the form |
| @code{type[,chunk]}, where @code{type} is one of @code{static}, |
| @code{dynamic}, @code{guided} or @code{auto}. The optional @code{chunk} |
| size shall be a positive integer. If undefined, dynamic scheduling and a |
| chunk size of 1 are used. |
| |
| @item @emph{See also}: |
| @ref{omp_set_schedule} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, sections 2.5.1 and 4.1 |
| @end table |
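| |
| The following sketch queries the schedule that the runtime derived from |
| @env{OMP_SCHEDULE} and then runs a loop that uses it. It assumes |
| compilation with @option{-fopenmp}; the loop body is a placeholder. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   omp_sched_t kind; |
|   int chunk; |
|   const char *name; |
|   long sum = 0; |
| |
|   omp_get_schedule (&kind, &chunk); |
|   switch (kind) |
|     @{ |
|     case omp_sched_static:  name = "static";  break; |
|     case omp_sched_dynamic: name = "dynamic"; break; |
|     case omp_sched_guided:  name = "guided";  break; |
|     case omp_sched_auto:    name = "auto";    break; |
|     default:                name = "unknown"; break; |
|     @} |
|   printf ("run-time schedule: %s, chunk size %d\n", name, chunk); |
| |
|   /* schedule(runtime) defers the choice to the setting queried above.  */ |
| #pragma omp parallel for schedule (runtime) reduction (+:sum) |
|   for (int i = 0; i < 1000; i++) |
|     sum += i; |
|   printf ("%ld\n", sum);   /* 499500 under any schedule */ |
|   return 0; |
| @} |
| @end smallexample |
| |
| Running the program with, for example, @code{OMP_SCHEDULE="guided,4"} |
| changes how iterations are distributed but not the result of the loop. |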
| |
| |
| |
| @node OMP_STACKSIZE |
| @section @env{OMP_STACKSIZE} -- Set default thread stack size |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Set the default thread stack size in kilobytes, unless the number |
| is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which |
| case the size is, respectively, in bytes, kilobytes, megabytes |
| or gigabytes. This is different from @code{pthread_attr_setstacksize} |
| which gets the number of bytes as an argument. If the stack size cannot |
| be set due to system constraints, an error is reported and the initial |
| stack size is left unchanged. If undefined, the stack size is system |
| dependent. |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.6 |
| @end table |
| |
| |
| |
| @node OMP_THREAD_LIMIT |
| @section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Specifies the maximum number of threads to use for the whole program. The |
| value of this variable shall be a positive integer. If undefined, |
| the number of threads is not limited. |
| |
| @item @emph{See also}: |
| @ref{OMP_NUM_THREADS}, |
| @ref{omp_get_thread_limit} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.9 |
| @end table |
| |
| |
| |
| @node OMP_WAIT_POLICY |
| @section @env{OMP_WAIT_POLICY} -- How waiting threads are handled |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Specifies whether waiting threads should be active or passive. If |
| the value is @code{PASSIVE}, waiting threads should not consume CPU |
| power while waiting; the value @code{ACTIVE} specifies that |
| they should. |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.7 |
| @end table |
| |
| |
| |
| @node OMP_PROC_BIND |
| @section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Specifies whether threads may be moved between processors. If set to |
| @code{true}, OpenMP threads should not be moved; if set to @code{false}, |
| they may be moved. |
| |
| @item @emph{See also}: |
| @ref{GOMP_CPU_AFFINITY} |
| |
| @item @emph{Reference}: |
| @uref{http://www.openmp.org/, OpenMP specifications v3.1}, section 4.4 |
| @end table |
| |
| |
| |
| @node GOMP_CPU_AFFINITY |
| @section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs |
| @cindex Environment Variable |
| @table @asis |
| @item @emph{Description}: |
| Binds threads to specific CPUs. The variable should contain a space-separated |
| or comma-separated list of CPUs. This list may contain different kinds of |
| entries: either single CPU numbers in any order, a range of CPUs (M-N) |
| or a range with some stride (M-N:S). CPU numbers are zero based. For example, |
| @code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread |
| to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to |
| CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12, |
| and 14 respectively and then start assigning back from the beginning of |
| the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0. |
| |
| There is no GNU OpenMP library routine to determine whether a CPU affinity |
| specification is in effect. As a workaround, language-specific library |
| functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in |
| Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY} |
| environment variable. A defined CPU affinity on startup cannot be changed |
| or disabled during the runtime of the application. |
| |
| If this environment variable is omitted, the host system will handle the |
| assignment of threads to CPUs. |
| |
| @item @emph{See also}: |
| @ref{OMP_PROC_BIND} |
| @end table |
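| |
| The @code{getenv} workaround mentioned above can be sketched as follows. |
| Note that this only reports the text of the environment variable; it does |
| not verify that the binding was actually applied. |
| |
| @smallexample |
| #include <stdio.h> |
| #include <stdlib.h> |
| |
| int |
| main (void) |
| @{ |
|   /* getenv returns NULL if the variable is not present.  */ |
|   const char *affinity = getenv ("GOMP_CPU_AFFINITY"); |
| |
|   if (affinity != NULL) |
|     printf ("GOMP_CPU_AFFINITY is set to \"%s\"\n", affinity); |
|   else |
|     printf ("GOMP_CPU_AFFINITY is not set\n"); |
|   return 0; |
| @} |
| @end smallexample |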
| |
| |
| |
| @node GOMP_STACKSIZE |
| @section @env{GOMP_STACKSIZE} -- Set default thread stack size |
| @cindex Environment Variable |
| @cindex Implementation specific setting |
| @table @asis |
| @item @emph{Description}: |
| Set the default thread stack size in kilobytes. This is different from |
| @code{pthread_attr_setstacksize} which gets the number of bytes as an |
| argument. If the stack size cannot be set due to system constraints, an |
| error is reported and the initial stack size is left unchanged. If undefined, |
| the stack size is system dependent. |
| |
| @item @emph{See also}: |
| @ref{OMP_STACKSIZE} |
| |
| @item @emph{Reference}: |
| @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html, |
| GCC Patches Mailinglist}, |
| @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html, |
| GCC Patches Mailinglist} |
| @end table |
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c The libgomp ABI |
| @c --------------------------------------------------------------------- |
| |
| @node The libgomp ABI |
| @chapter The libgomp ABI |
| |
| The following sections present notes on the external ABI as |
| presented by libgomp. Only maintainers should need them. |
| |
| @menu |
| * Implementing MASTER construct:: |
| * Implementing CRITICAL construct:: |
| * Implementing ATOMIC construct:: |
| * Implementing FLUSH construct:: |
| * Implementing BARRIER construct:: |
| * Implementing THREADPRIVATE construct:: |
| * Implementing PRIVATE clause:: |
| * Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses:: |
| * Implementing REDUCTION clause:: |
| * Implementing PARALLEL construct:: |
| * Implementing FOR construct:: |
| * Implementing ORDERED construct:: |
| * Implementing SECTIONS construct:: |
| * Implementing SINGLE construct:: |
| @end menu |
| |
| |
| @node Implementing MASTER construct |
| @section Implementing MASTER construct |
| |
| @smallexample |
| if (omp_get_thread_num () == 0) |
| block |
| @end smallexample |
| |
| Alternatively, we generate two copies of the parallel subfunction |
| and only include this in the version run by the master thread. |
| Surely this is not worthwhile though... |
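| |
| Written out as a complete program, the first form could look like the |
| sketch below. This is hand-written user code illustrating the idea, not |
| output of the compiler; the variable @code{ran} is a hypothetical name, |
| and compilation with @option{-fopenmp} is assumed. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   int ran = 0;   /* written by thread 0 only */ |
| |
| #pragma omp parallel |
|   @{ |
|     /* Stands in for the body of a MASTER construct.  */ |
|     if (omp_get_thread_num () == 0) |
|       ran++; |
|   @} |
|   printf ("%d\n", ran);   /* prints 1 regardless of the team size */ |
|   return 0; |
| @} |
| @end smallexample |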
| |
| |
| |
| @node Implementing CRITICAL construct |
| @section Implementing CRITICAL construct |
| |
| Without a specified name, |
| |
| @smallexample |
| void GOMP_critical_start (void); |
| void GOMP_critical_end (void); |
| @end smallexample |
| |
| so that we don't get COPY relocations from libgomp to the main |
| application. |
| |
| With a specified name, use @code{omp_set_lock} and @code{omp_unset_lock}, |
| with the name being transformed into a variable declared like |
| |
| @smallexample |
| omp_lock_t gomp_critical_user_<name> __attribute__((common)) |
| @end smallexample |
| |
| Ideally the ABI would specify that all zero is a valid unlocked |
| state, and so we wouldn't need to initialize this at |
| startup. |
| |
| |
| |
| @node Implementing ATOMIC construct |
| @section Implementing ATOMIC construct |
| |
| The target should implement the @code{__sync} builtins. |
| |
| Failing that we could add |
| |
| @smallexample |
| void GOMP_atomic_enter (void) |
| void GOMP_atomic_exit (void) |
| @end smallexample |
| |
| which reuses the regular lock code, but with yet another lock |
| object private to the library. |
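| |
| As a sketch of the first approach, an atomic increment can be expressed |
| directly with one of the @code{__sync} builtins. The example is |
| hand-written for illustration; the builtin actually chosen by the compiler |
| for a given ATOMIC construct may differ. Compilation with |
| @option{-fopenmp} is assumed. |
| |
| @smallexample |
| #include <stdio.h> |
| |
| int |
| main (void) |
| @{ |
|   long counter = 0;   /* shared among all threads */ |
| |
| #pragma omp parallel for |
|   for (int i = 0; i < 1000; i++) |
|     /* Plays the role of "#pragma omp atomic" applied to "counter += 1".  */ |
|     __sync_fetch_and_add (&counter, 1); |
| |
|   printf ("%ld\n", counter);   /* prints 1000 */ |
|   return 0; |
| @} |
| @end smallexample |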
| |
| |
| |
| @node Implementing FLUSH construct |
| @section Implementing FLUSH construct |
| |
| Expands to the @code{__sync_synchronize} builtin. |
| |
| |
| |
| @node Implementing BARRIER construct |
| @section Implementing BARRIER construct |
| |
| @smallexample |
| void GOMP_barrier (void) |
| @end smallexample |
| |
| |
| @node Implementing THREADPRIVATE construct |
| @section Implementing THREADPRIVATE construct |
| |
| In @emph{most} cases we can map this directly to @code{__thread}, except |
| that OMP allows constructors for C++ objects. We can either |
| refuse to support this (how often is it used?) or we can |
| implement something akin to .ctors. |
| |
| Even more ideally, this ctor feature is handled by extensions |
| to the main pthreads library. Failing that, we can have a set |
| of entry points to register ctor functions to be called. |
| |
| |
| |
| @node Implementing PRIVATE clause |
| @section Implementing PRIVATE clause |
| |
| In association with a PARALLEL, or within the lexical extent |
| of a PARALLEL block, the variable becomes a local variable in |
| the parallel subfunction. |
| |
| In association with FOR or SECTIONS blocks, create a new |
| automatic variable within the current function. This preserves |
| the semantic of new variable creation. |
| |
| |
| |
| @node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses |
| @section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses |
| |
| This seems simple enough for PARALLEL blocks. Create a private |
| struct for communicating between the parent and subfunction. |
| In the parent, copy in values for scalar and ``small'' structs; |
| copy in addresses for other TREE_ADDRESSABLE types. In the |
| subfunction, copy the value into the local variable. |
| |
| It is not clear what to do with bare FOR or SECTIONS blocks. |
| The only thing I can figure is that we do something like: |
| |
| @smallexample |
| #pragma omp for firstprivate(x) lastprivate(y) |
| for (int i = 0; i < n; ++i) |
| body; |
| @end smallexample |
| |
| which becomes |
| |
| @smallexample |
| @{ |
| int x = x, y; |
| |
| // for stuff |
| |
| if (i == n) |
| y = y; |
| @} |
| @end smallexample |
| |
| where the "x=x" and "y=y" assignments actually have different |
| uids for the two variables, i.e. not something you could write |
| directly in C. Presumably this only makes sense if the "outer" |
| x and y are global variables. |
| |
| COPYPRIVATE would work the same way, except the structure |
| broadcast would have to happen via SINGLE machinery instead. |
| |
| |
| |
| @node Implementing REDUCTION clause |
| @section Implementing REDUCTION clause |
| |
| The private struct mentioned in the previous section should have |
| a pointer to an array of the type of the variable, indexed by the |
| thread's @var{team_id}. The thread stores its final value into the |
| array, and after the barrier, the master thread iterates over the |
| array to collect the values. |
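| |
| The scheme can be sketched in user-level code as follows. This is an |
| illustration of the idea only, not the code the compiler generates; the |
| array name @code{partial} is hypothetical, the array is allocated directly |
| rather than reached through the private struct, and the collection step is |
| done after the barrier that ends the parallel region. Compilation with |
| @option{-fopenmp} is assumed. |
| |
| @smallexample |
| #include <omp.h> |
| #include <stdio.h> |
| #include <stdlib.h> |
| |
| int |
| main (void) |
| @{ |
|   int max = omp_get_max_threads ();   /* upper bound on the team size */ |
|   long *partial = calloc (max, sizeof *partial); |
|   long sum = 0; |
| |
|   if (partial == NULL) |
|     return 1; |
| |
| #pragma omp parallel |
|   @{ |
|     long local = 0; |
| #pragma omp for nowait |
|     for (int i = 1; i <= 100; i++) |
|       local += i; |
|     /* Each thread stores its final value in the slot given by its id.  */ |
|     partial[omp_get_thread_num ()] = local; |
|   @}   /* implicit barrier at the end of the parallel region */ |
| |
|   /* The master thread collects the values; unused slots are zero.  */ |
|   for (int t = 0; t < max; t++) |
|     sum += partial[t]; |
| |
|   free (partial); |
|   printf ("%ld\n", sum);   /* prints 5050 */ |
|   return 0; |
| @} |
| @end smallexample |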
| |
| |
| @node Implementing PARALLEL construct |
| @section Implementing PARALLEL construct |
| |
| @smallexample |
| #pragma omp parallel |
| @{ |
| body; |
| @} |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| void subfunction (void *data) |
| @{ |
| use data; |
| body; |
| @} |
| |
| setup data; |
| GOMP_parallel_start (subfunction, &data, num_threads); |
| subfunction (&data); |
| GOMP_parallel_end (); |
| @end smallexample |
| |
| @smallexample |
| void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads) |
| @end smallexample |
| |
| The @var{FN} argument is the subfunction to be run in parallel. |
| |
| The @var{DATA} argument is a pointer to a structure used to |
| communicate data in and out of the subfunction, as discussed |
| above with respect to FIRSTPRIVATE et al. |
| |
| The @var{NUM_THREADS} argument is 1 if an IF clause is present |
| and false, or the value of the NUM_THREADS clause, if |
| present, or 0. |
| |
| The function needs to create the appropriate number of |
| threads and/or launch them from the dock. It needs to |
| create the team structure and assign team ids. |
| |
| @smallexample |
| void GOMP_parallel_end (void) |
| @end smallexample |
| |
| Tears down the team and returns us to the previous @code{omp_in_parallel()} state. |
| |
| |
| |
| @node Implementing FOR construct |
| @section Implementing FOR construct |
| |
| @smallexample |
| #pragma omp parallel for |
| for (i = lb; i <= ub; i++) |
| body; |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| void subfunction (void *data) |
| @{ |
| long _s0, _e0; |
| while (GOMP_loop_static_next (&_s0, &_e0)) |
| @{ |
| long _e1 = _e0, i; |
| for (i = _s0; i < _e1; i++) |
| body; |
| @} |
| GOMP_loop_end_nowait (); |
| @} |
| |
| GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0); |
| subfunction (NULL); |
| GOMP_parallel_end (); |
| @end smallexample |
| |
| @smallexample |
| #pragma omp for schedule(runtime) |
| for (i = 0; i < n; i++) |
| body; |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| @{ |
| long i, _s0, _e0; |
| if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0)) |
| do @{ |
| long _e1 = _e0; |
| for (i = _s0; i < _e1; i++) |
| body; |
| @} while (GOMP_loop_runtime_next (&_s0, &_e0)); |
| GOMP_loop_end (); |
| @} |
| @end smallexample |
| |
| Note that while it looks like there is trickiness to propagating |
| a non-constant STEP, there isn't really. We're explicitly allowed |
| to evaluate it as many times as we want, and any variables involved |
| should automatically be handled as PRIVATE or SHARED like any other |
| variables. So the expression should remain evaluable in the |
| subfunction. We can also pull it into a local variable if we like, |
| but since it is supposed to remain unchanged, we can equally leave it as is. |
| |
| If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be |
| able to get away with no work-sharing context at all, since we can |
| simply perform the arithmetic directly in each thread to divide up |
| the iterations. Which would mean that we wouldn't need to call any |
| of these routines. |
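| |
| The arithmetic in question can be sketched as below. The helper |
| @code{static_chunk} is a hypothetical name, and the partition shown |
| (near-equal contiguous blocks for a loop with step 1 and @code{ub >= lb}) |
| is only one possible choice, not necessarily the one used by libgomp. |
| |
| @smallexample |
| #include <stdio.h> |
| |
| /* Hypothetical helper: computes the half-open range [*start, *end) of |
|    iterations from [lb, ub) that thread ID out of NTHREADS executes.  */ |
| static void |
| static_chunk (long lb, long ub, int nthreads, int id, |
|               long *start, long *end) |
| @{ |
|   long n = ub - lb; |
|   long q = n / nthreads;   /* base block size */ |
|   long r = n % nthreads;   /* the first r threads get one extra */ |
| |
|   *start = lb + id * q + (id < r ? id : r); |
|   *end = *start + q + (id < r ? 1 : 0); |
| @} |
| |
| int |
| main (void) |
| @{ |
|   /* Divide 10 iterations among 3 threads: [0, 4), [4, 7), [7, 10).  */ |
|   for (int id = 0; id < 3; id++) |
|     @{ |
|       long s, e; |
|       static_chunk (0, 10, 3, id, &s, &e); |
|       printf ("thread %d: [%ld, %ld)\n", id, s, e); |
|     @} |
|   return 0; |
| @} |
| @end smallexample |
| |
| Since every thread can compute its own range from its team id and the |
| team size alone, no shared state is required. |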
| |
| There are separate routines for handling loops with an ORDERED |
| clause. Bookkeeping for that is non-trivial... |
| |
| |
| |
| @node Implementing ORDERED construct |
| @section Implementing ORDERED construct |
| |
| @smallexample |
| void GOMP_ordered_start (void) |
| void GOMP_ordered_end (void) |
| @end smallexample |
| |
| |
| |
| @node Implementing SECTIONS construct |
| @section Implementing SECTIONS construct |
| |
| A block like |
| |
| @smallexample |
| #pragma omp sections |
| @{ |
| #pragma omp section |
| stmt1; |
| #pragma omp section |
| stmt2; |
| #pragma omp section |
| stmt3; |
| @} |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ()) |
| switch (i) |
| @{ |
| case 1: |
| stmt1; |
| break; |
| case 2: |
| stmt2; |
| break; |
| case 3: |
| stmt3; |
| break; |
| @} |
| GOMP_barrier (); |
| @end smallexample |
| |
| |
| @node Implementing SINGLE construct |
| @section Implementing SINGLE construct |
| |
| A block like |
| |
| @smallexample |
| #pragma omp single |
| @{ |
| body; |
| @} |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| if (GOMP_single_start ()) |
| body; |
| GOMP_barrier (); |
| @end smallexample |
| |
| while |
| |
| @smallexample |
| #pragma omp single copyprivate(x) |
| body; |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| datap = GOMP_single_copy_start (); |
| if (datap == NULL) |
| @{ |
| body; |
| data.x = x; |
| GOMP_single_copy_end (&data); |
| @} |
| else |
| x = datap->x; |
| GOMP_barrier (); |
| @end smallexample |
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c Reporting Bugs |
| @c --------------------------------------------------------------------- |
| |
| @node Reporting Bugs |
| @chapter Reporting Bugs |
| |
| Bugs in the GNU OpenMP implementation should be reported via |
| @uref{http://gcc.gnu.org/bugzilla/, bugzilla}. For all cases, please add |
| "openmp" to the keywords field in the bug report. |
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c GNU General Public License |
| @c --------------------------------------------------------------------- |
| |
| @include gpl_v3.texi |
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c GNU Free Documentation License |
| @c --------------------------------------------------------------------- |
| |
| @include fdl.texi |
| |
| |
| |
| @c --------------------------------------------------------------------- |
| @c Funding Free Software |
| @c --------------------------------------------------------------------- |
| |
| @include funding.texi |
| |
| @c --------------------------------------------------------------------- |
| @c Index |
| @c --------------------------------------------------------------------- |
| |
| @node Library Index |
| @unnumbered Library Index |
| |
| @printindex cp |
| |
| @bye |