\section{Introduction}
In this document, we cover two topics:
\begin{enumerate}
\item the interaction of MPI with the threading and tasking constructs of
OpenMP (or similar languages);
\item the interaction of MPI with the accelerator constructs of OpenMP (or
similar languages).
\end{enumerate}
For each topic, we discuss current performance issues, best practices for the
use of current constructs, possible changes to the MPI or OpenMP standard APIs
that could alleviate these problems, and the work required to implement these
changes.
\subsection{Goals}
The main goal of this document is to propose mechanisms that would provide
efficient support for the MPI+X programming
model, where X is OpenMP; most of the discussion applies to OpenACC as well.
This support should extend to nodes with high core counts, systems with
accelerators, and systems with heterogeneous memory. The document includes
short-term recommendations for users and long-term recommendations for
implementers.
The proposed mechanisms should not break existing codes and should be
reasonably easy to implement. The recommendations for users should apply to
current state-of-the-art implementations and facilitate the transition to
future ones.
\subsection{Terminology}
\begin{description}
\item[Process:]
An instance of a computer program that is being executed. It contains the
program code, an address space, and one or more threads of execution that
execute instructions concurrently.
\item[Thread:] A sequence of programmed instructions that can be managed
independently by the OS scheduler (descheduled or rescheduled).
\item[Task:] (also known as a user-level thread or fiber) A unit of execution
that must be manually scheduled by the application or a user-level runtime.
Tasks run in the context of the threads
that schedule them. They cannot be preempted by the OS (however, the thread
executing a task can be preempted). They can yield or block (wait) on a
synchronization call.
\end{description}
POSIX defines standard APIs for processes and threads, but not for tasks.
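As a concrete illustration of this terminology, the following minimal sketch
(C with OpenMP, illustrative only) shows OS-schedulable threads created by a
parallel region and user-level tasks that execute in the context of whichever
of those threads the runtime picks; a task may yield at a task scheduling
point but is never preempted by the OS itself.
\begin{verbatim}
/* Minimal sketch of the thread/task distinction in OpenMP.
   Compile with, e.g., gcc -fopenmp tasks.c */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* The parallel region creates OS-schedulable threads. */
    #pragma omp parallel
    {
        /* One thread generates the tasks ... */
        #pragma omp single
        {
            for (int i = 0; i < 8; i++) {
                /* ... which are user-level units of work executed in
                   the context of whichever thread schedules them. */
                #pragma omp task firstprivate(i)
                {
                    printf("task %d run by thread %d\n",
                           i, omp_get_thread_num());
                    /* A task may yield so its thread can run other
                       tasks; it is not preempted by the OS. */
                    #pragma omp taskyield
                }
            }
        } /* implicit barrier: all tasks complete here */
    }
    return 0;
}
\end{verbatim}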
\dross{
\subsection{Requirements}
We assume that, as the
number of cores per node keeps increasing, it will become imperative to support
concurrent MPI communication from multiple threads, that is, to support
\texttt{MPI\_THREAD\_MULTIPLE} efficiently (a minimal usage sketch follows this
subsection). Efficient support for \texttt{MPI\_THREAD\_MULTIPLE} is addressed
by another milestone of the exascale MPI project.
An MPI implementation may use a dedicated progress thread (or several) to poll
for incoming messages and execute any MPI logic that is not directly associated
with MPI calls. Other libraries may have their own requirements for dedicated
servers. We assume that mechanisms are provided for supporting the allocation
of dedicated hardware resources to MPI or other libraries; or, more generally,
for providing Quality of Service guarantees to different subsystems.
}
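To make the threading requirement above concrete, the following minimal sketch
(C with MPI and OpenMP, illustrative only) shows how an application requests
\texttt{MPI\_THREAD\_MULTIPLE} at initialization and checks the level the
library actually provides before letting multiple threads issue MPI calls
concurrently.
\begin{verbatim}
/* Minimal sketch: request MPI_THREAD_MULTIPLE so that several
   OpenMP threads may issue MPI calls concurrently. */
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* Ask the MPI library for full multi-threaded support ... */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    /* ... and check the level it actually provides. */
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n",
                provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each OpenMP thread may now call MPI concurrently; here we only
       print the thread id to keep the sketch short. */
    #pragma omp parallel
    {
        printf("rank %d, thread %d ready for concurrent MPI calls\n",
               rank, omp_get_thread_num());
    }

    MPI_Finalize();
    return 0;
}
\end{verbatim}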