Doc: Use ++n instead of +p in charmrun examples #3780

Open · wants to merge 2 commits into main
4 changes: 2 additions & 2 deletions README.md
@@ -272,7 +272,7 @@ executable named `nqueen`.

Following the previous example, to run the program on two processors, type

$ ./charmrun +p2 ./nqueen 12 6
$ ./charmrun ++n 2 ./nqueen 12 6

This should run for a few seconds, and print out:
`There are 14200 Solutions to 12 queens. Time=0.109440 End time=0.112752`
@@ -307,7 +307,7 @@ want to run program on only one machine, for example, your laptop. This
can save you all the hassle of setting up ssh daemons.
To use this option, just type:

$ ./charmrun ++local ./nqueen 12 100 +p2
$ ./charmrun ++local ./nqueen 12 100 ++n 2

However, for best performance, you should launch one node program per processor.

4 changes: 2 additions & 2 deletions doc/ampi/02-building.rst
@@ -175,7 +175,7 @@ arguments. A typical invocation of an AMPI program ``pgm`` with

.. code-block:: bash

$ ./charmrun +p16 ./pgm +vp64
$ ./charmrun ++n 16 ./pgm +vp64

Here, the AMPI program ``pgm`` is run on 16 physical processors with 64
total virtual ranks (which will be mapped 4 per processor initially).
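
On an SMP build, the same 16 PEs can also be expressed as a few
multi-threaded processes. A minimal sketch, assuming a split into two
processes of eight worker threads each (the split is illustrative, not
taken from the manual text above):

.. code-block:: bash

   # 2 processes x 8 worker threads = 16 PEs; with +vp64 this is still
   # 4 AMPI virtual ranks per PE
   $ ./charmrun ++n 2 ++ppn 8 ./pgm +vp64
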
@@ -189,7 +189,7 @@ example:

.. code-block:: bash

$ ./charmrun +p16 ./pgm +vp128 +tcharm_stacksize 32K +balancer RefineLB
$ ./charmrun ++n 16 ./pgm +vp128 +tcharm_stacksize 32K +balancer RefineLB

Running with ampirun
~~~~~~~~~~~~~~~~~~~~
4 changes: 2 additions & 2 deletions doc/ampi/04-extensions.rst
@@ -566,15 +566,15 @@ of the AMPI program with some additional command line options.

.. code-block:: bash

$ ./charmrun ./pgm +p4 +vp4 +msgLogWrite +msgLogRank 2 +msgLogFilename "msg2.log"
$ ./charmrun ./pgm ++n 4 +vp4 +msgLogWrite +msgLogRank 2 +msgLogFilename "msg2.log"

In the above example, a parallel run with 4 worker threads and 4 AMPI
ranks will be executed, and the changes in the MPI environment of worker
thread 2 (also rank 2, starting from 0) will get logged into diskfile
"msg2.log".

Unlike the first run, the re-run is a sequential program, so it is not
invoked by charmrun (and omitting charmrun options like +p4 and +vp4),
invoked by charmrun (and omitting charmrun options like ++n 4 and +vp4),
and additional command line options are required as well.

.. code-block:: bash
34 changes: 17 additions & 17 deletions doc/ampi/05-examples.rst
@@ -31,7 +31,7 @@ MiniFE
program.

- Refer to the ``README`` file on how to run the program. For example:
``./charmrun +p4 ./miniFE.x nx=30 ny=30 nz=30 +vp32``
``./charmrun ++n 4 ./miniFE.x nx=30 ny=30 nz=30 +vp32``

MiniMD v2.0
~~~~~~~~~~~
@@ -44,7 +44,7 @@ MiniMD v2.0
execute ``make ampi`` to build the program.

- Refer to the ``README`` file on how to run the program. For example:
``./charmrun +p4 ./miniMD_ampi +vp32``
``./charmrun ++n 4 ./miniMD_ampi +vp32``

CoMD v1.1
~~~~~~~~~
@@ -72,7 +72,7 @@ MiniXYCE v1.0
``test/``.

- Example run command:
``./charmrun +p3 ./miniXyce.x +vp3 -circuit ../tests/cir1.net -t_start 1e-6 -pf params.txt``
``./charmrun ++n 3 ./miniXyce.x +vp3 -circuit ../tests/cir1.net -t_start 1e-6 -pf params.txt``

HPCCG v1.0
~~~~~~~~~~
@@ -84,7 +84,7 @@ HPCCG v1.0
AMPI compilers.

- Run with a command such as:
``./charmrun +p2 ./test_HPCCG 20 30 10 +vp16``
``./charmrun ++n 2 ./test_HPCCG 20 30 10 +vp16``

MiniAMR v1.0
~~~~~~~~~~~~
@@ -140,7 +140,7 @@ Lassen v1.0

- No changes necessary to enable AMPI virtualization. Requires some
C++11 support. Set ``AMPIDIR`` in Makefile and ``make``. Run with:
``./charmrun +p4 ./lassen_mpi +vp8 default 2 2 2 50 50 50``
``./charmrun ++n 4 ./lassen_mpi +vp8 default 2 2 2 50 50 50``

Kripke v1.1
~~~~~~~~~~~
@@ -167,7 +167,7 @@ Kripke v1.1

.. code-block:: bash

$ ./charmrun +p8 ./src/tools/kripke +vp8 --zones 64,64,64 --procs 2,2,2 --nest ZDG
$ ./charmrun ++n 8 ./src/tools/kripke +vp8 --zones 64,64,64 --procs 2,2,2 --nest ZDG

MCB v1.0.3 (2013)
~~~~~~~~~~~~~~~~~
@@ -181,7 +181,7 @@ MCB v1.0.3 (2013)

.. code-block:: bash

$ OMP_NUM_THREADS=1 ./charmrun +p4 ./../src/MCBenchmark.exe --weakScaling
$ OMP_NUM_THREADS=1 ./charmrun ++n 4 ./../src/MCBenchmark.exe --weakScaling
--distributedSource --nCores=1 --numParticles=20000 --multiSigma --nThreadCore=1 +vp16

.. _not-yet-ampi-zed-reason-1:
@@ -228,7 +228,7 @@ SNAP v1.01 (C version)
while the C version works out of the box on all platforms.

- Edit the Makefile for AMPI compiler paths and run with:
``./charmrun +p4 ./snap +vp4 --fi center_src/fin01 --fo center_src/fout01``
``./charmrun ++n 4 ./snap +vp4 --fi center_src/fin01 --fo center_src/fout01``

Sweep3D
~~~~~~~
@@ -248,7 +248,7 @@ Sweep3D

- Modify file ``input`` to set the different parameters. Refer to
file ``README`` on how to change those parameters. Run with:
``./charmrun ./sweep3d.mpi +p8 +vp16``
``./charmrun ./sweep3d.mpi ++n 8 +vp16``

PENNANT v0.8
~~~~~~~~~~~~
@@ -264,7 +264,7 @@ PENNANT v0.8

- For PENNANT-v0.8, point CC in Makefile to AMPICC and just ’make’. Run
with the provided input files, such as:
``./charmrun +p2 ./build/pennant +vp8 test/noh/noh.pnt``
``./charmrun ++n 2 ./build/pennant +vp8 test/noh/noh.pnt``

Benchmarks
----------
@@ -307,7 +307,7 @@ NAS Parallel Benchmarks (NPB 3.3)
*cg.256.C* will appear in the *CG* and ``bin/`` directories. To
run the particular benchmark, you must follow the standard
procedure of running AMPI programs:
``./charmrun ./cg.C.256 +p64 +vp256 ++nodelist nodelist``
``./charmrun ./cg.C.256 ++n 64 +vp256 ++nodelist nodelist``

NAS PB Multi-Zone Version (NPB-MZ 3.3)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -340,7 +340,7 @@ NAS PB Multi-Zone Version (NPB-MZ 3.3)
directory. In the previous example, a file *bt-mz.256.C* will be
created in the ``bin`` directory. To run the particular benchmark,
you must follow the standard procedure of running AMPI programs:
``./charmrun ./bt-mz.C.256 +p64 +vp256 ++nodelist nodelist``
``./charmrun ./bt-mz.C.256 ++n 64 +vp256 ++nodelist nodelist``

HPCG v3.0
~~~~~~~~~
@@ -352,7 +352,7 @@ HPCG v3.0
- No AMPI-ization needed. To build, modify ``setup/Make.AMPI`` for
compiler paths, do
``mkdir build && cd build && configure ../setup/Make.AMPI && make``.
To run, do ``./charmrun +p16 ./bin/xhpcg +vp64``
To run, do ``./charmrun ++n 16 ./bin/xhpcg +vp64``

Intel Parallel Research Kernels (PRK) v2.16
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -408,7 +408,7 @@ HYPRE-2.11.1
``LIBFLAGS``. Then run ``make``.

- To run the ``new_ij`` test, run:
``./charmrun +p64 ./new_ij -n 128 128 128 -P 4 4 4 -intertype 6 -tol 1e-8 -CF 0 -solver 61 -agg_nl 1 27pt -Pmx 6 -ns 4 -mu 1 -hmis -rlx 13 +vp64``
``./charmrun ++n 64 ./new_ij -n 128 128 128 -P 4 4 4 -intertype 6 -tol 1e-8 -CF 0 -solver 61 -agg_nl 1 27pt -Pmx 6 -ns 4 -mu 1 -hmis -rlx 13 +vp64``

MFEM-3.2
~~~~~~~~
@@ -440,7 +440,7 @@ MFEM-3.2
- ``make parallel MFEM_USE_MPI=YES MPICXX=~/charm/bin/ampicxx HYPRE_DIR=~/hypre-2.11.1/src/hypre METIS_DIR=~/metis-4.0.3``

- To run an example, do
``./charmrun +p4 ./ex15p -m ../data/amr-quad.mesh +vp16``. You may
``./charmrun ++n 4 ./ex15p -m ../data/amr-quad.mesh +vp16``. You may
want to add the runtime options ``-no-vis`` and ``-no-visit`` to
speed things up.

@@ -464,10 +464,10 @@ XBraid-1.1
HYPRE in their Makefiles and ``make``.

- To run an example, do
``./charmrun +p2 ./ex-02 -pgrid 1 1 8 -ml 15 -nt 128 -nx 33 33 -mi 100 +vp8 ++local``.
``./charmrun ++n 2 ./ex-02 -pgrid 1 1 8 -ml 15 -nt 128 -nx 33 33 -mi 100 +vp8 ++local``.

- To run a driver, do
``./charmrun +p4 ./drive-03 -pgrid 2 2 2 2 -nl 32 32 32 -nt 16 -ml 15 +vp16 ++local``
``./charmrun ++n 4 ./drive-03 -pgrid 2 2 2 2 -nl 32 32 32 -nt 16 -ml 15 +vp16 ++local``

Other AMPI codes
----------------
4 changes: 2 additions & 2 deletions doc/charisma/manual.rst
@@ -483,7 +483,7 @@ Turing Cluster, use the customized job launcher ``rjq`` or ``rj``).

.. code-block:: bash

$ charmrun pgm +p4
$ charmrun pgm ++n 4

Please refer to Charm++'s manual and tutorial for more details of
building and running a Charm++ program.
@@ -619,7 +619,7 @@ instance, the following command uses ``RefineLB``.

.. code-block:: bash

$ ./charmrun ./pgm +p16 +balancer RefineLB
$ ./charmrun ./pgm ++n 16 +balancer RefineLB

.. _secsparse:

61 changes: 35 additions & 26 deletions doc/charm++/manual.rst
@@ -8452,7 +8452,7 @@ mode. For example:

.. code-block:: bash

$ ./charmrun hello +p4 +restart log
$ ./charmrun hello ++n 4 +restart log

Restarting is the reverse process of checkpointing. Charm++ allows
restarting the old checkpoint on a different number of physical
@@ -8481,7 +8481,7 @@ After a failure, the system may contain fewer or more processors. Once
the failed components have been repaired, some processors may become
available again. Therefore, the user may need the flexibility to restart
on a different number of processors than in the checkpointing phase.
This is allowed by giving a different ``++n N`` option at runtime. One
This is allowable by giving a different ``++n N`` option at runtime. One
thing to note is that the new load distribution might differ from the
previous one at checkpoint time, so running a load balancer (see
Section :numref:`loadbalancing`) after restart is suggested.
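
For instance, a checkpoint written earlier into a directory ``log``
could be brought back up on eight processors with a load balancer
enabled. A minimal sketch, assuming the program name ``hello`` and the
``RefineLB`` strategy:

.. code-block:: bash

   # restart on a different processor count and rebalance the new
   # load distribution
   $ ./charmrun hello ++n 8 +restart log +balancer RefineLB
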
@@ -8618,9 +8618,9 @@ it stores them in the local disk. The checkpoint files are named
Users can pass the runtime option ``+ftc_disk`` to activate this mode. For
example:

.. code-block:: c++
.. code-block:: bash

./charmrun hello +p8 +ftc_disk
./charmrun hello ++n 8 +ftc_disk

Building Instructions
^^^^^^^^^^^^^^^^^^^^^
@@ -8629,7 +8629,7 @@ In order to have the double local-storage checkpoint/restart
functionality available, the parameter ``syncft`` must be provided at
build time:

.. code-block:: c++
.. code-block:: bash

./build charm++ netlrts-linux-x86_64 syncft

@@ -8656,7 +8656,7 @@ name:

.. code-block:: bash

$ ./charmrun hello +p8 +kill_file <file>
$ ./charmrun hello ++n 8 +kill_file <file>

An example of this usage can be found in the ``syncfttest`` targets in
``tests/charm++/jacobi3d``.
@@ -9967,7 +9967,7 @@ program

.. code-block:: bash

$ ./charmrun pgm +p1000 +balancer RandCentLB +LBDump 2 +LBDumpSteps 4 +LBDumpFile lbsim.dat
$ ./charmrun pgm ++n 1000 +balancer RandCentLB +LBDump 2 +LBDumpSteps 4 +LBDumpFile lbsim.dat

This will collect data on files lbsim.dat.2,3,4,5. We can use this data
to analyze the performance of various centralized strategies using:
@@ -11330,7 +11330,7 @@ used, and a port number to listen the shrink/expand commands:

.. code-block:: bash

$ ./charmrun +p4 ./jacobi2d 200 20 +balancer GreedyLB ++nodelist ./mynodelistfile ++server ++server-port 1234
$ ./charmrun ++n 4 ./jacobi2d 200 20 +balancer GreedyLB ++nodelist ./mynodelistfile ++server ++server-port 1234

The CCS client to send shrink/expand commands needs to specify the
hostname, port number, the old(current) number of processor and the
@@ -11988,7 +11988,7 @@ To run a Charm++ program named “pgm” on four processors, type:

.. code-block:: bash

$ charmrun pgm +p4
$ charmrun pgm ++n 4

Execution on platforms which use platform specific launchers, (i.e.,
**aprun**, **ibrun**), can proceed without charmrun, or charmrun can be
@@ -12122,7 +12122,7 @@ advanced options are available:
``++p N``
Total number of processing elements to create. In SMP mode, this
refers to worker threads (where
:math:`\texttt{n} * \texttt{ppn} = \texttt{p}`), otherwise it refers
:math:`\texttt{n} \times \texttt{ppn} = \texttt{p}`), otherwise it refers
to processes (:math:`\texttt{n} = \texttt{p}`). The default is 1. Use
of ``++p`` is discouraged in favor of ``++processPer*`` (and
``++oneWthPer*`` in SMP mode) where desirable, or ``++n`` (and
@@ -12230,7 +12230,7 @@ The remaining options cover details of process launch and connectivity:

.. code-block:: bash

$ ./charmrun +p4 ./pgm 100 2 3 ++runscript ./set_env_script
$ ./charmrun ++n 4 ./pgm 100 2 3 ++runscript ./set_env_script

In this case, ``set_env_script`` is invoked on each node. **Note:** When this
is provided, ``charmrun`` will not invoke the program directly, instead only
@@ -12400,20 +12400,29 @@ like:

$ ./charmrun ++ppn 3 +p6 +pemap 1-3,5-7 +commap 0,4 ./app <args>

This will create two logical nodes/OS processes (2 = 6 PEs/3 PEs per
node), each with three worker threads/PEs (``++ppn 3``). The worker
threads/PEs will be mapped thusly: PE 0 to core 1, PE 1 to core 2, PE 2
to core 3 and PE 4 to core 5, PE 5 to core 6, and PE 6 to core 7.
PEs/worker threads 0-2 compromise the first logical node and 3-5 are the
second logical node. Additionally, the communication threads will be
mapped to core 0, for the communication thread of the first logical
node, and to core 4, for the communication thread of the second logical
node.

Please keep in mind that ``+p`` always specifies the total number of PEs
created by Charm++, regardless of mode (the same number as returned by
``CkNumPes()``). The ``+p`` option does not include the communication
thread, there will always be exactly one of those per logical node.
``CkNumPes()``). So this will create two logical nodes/OS processes
(2 = 6 PEs/3 PEs per node), each with three worker threads/PEs
(``++ppn 3``).

We recommend using ``++n``, especially with ``++ppn``. Recall
that :math:`\texttt{n} \times \texttt{ppn} = \texttt{p}`. So the example becomes:

.. code-block:: bash

$ ./charmrun ++ppn 3 ++n 2 +pemap 1-3,5-7 +commap 0,4 ./app <args>

The worker threads/PEs will be mapped as follows (``+pemap``): PE 0 to
core 1, PE 1 to core 2, PE 2 to core 3, PE 3 to core 5, PE 4 to
core 6, and PE 5 to core 7. PEs/worker threads 0-2 comprise the first
logical node and 3-5 the second logical node. Additionally, the
communication thread of the first logical node will be mapped to
core 0, and that of the second logical node to core 4 (``+commap``).

Note that the ``+p`` option does not include the communication
thread. There will always be exactly one of those per logical node.

Multicore Options
^^^^^^^^^^^^^^^^^
@@ -12526,7 +12535,7 @@ nodes than there are hosts in the group, it will reuse hosts. Thus,

.. code-block:: bash

$ charmrun pgm ++nodegroup kale-sun +p6
$ charmrun pgm ++nodegroup kale-sun ++n 6

uses hosts (charm, dp, grace, dagger, charm, dp) respectively as nodes
(0, 1, 2, 3, 4, 5).
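
A group such as ``kale-sun`` is simply a named block in the nodelist
file. A minimal sketch, assuming the hosts from the example above and
``~/.nodelist`` as the file location:

.. code-block:: bash

   # hypothetical nodelist defining the "kale-sun" group
   $ cat > ~/.nodelist <<'EOF'
   group kale-sun
   host charm
   host dp
   host grace
   host dagger
   EOF
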
@@ -12536,7 +12545,7 @@ Thus, if one specifies

.. code-block:: bash

$ charmrun pgm +p4
$ charmrun pgm ++n 4

it will use “localhost” four times. “localhost” is a Unix trick; it
always finds a name for whatever machine you’re on.
@@ -13237,7 +13246,7 @@ of the above incantation, for various kinds of process launchers:

.. code-block:: bash

$ ./charmrun +p2 `which valgrind` --log-file=VG.out.%p --trace-children=yes ./application_name ...application arguments...
$ ./charmrun ++n 2 `which valgrind` --log-file=VG.out.%p --trace-children=yes ./application_name ...application arguments...
$ aprun -n 2 `which valgrind` --log-file=VG.out.%p --trace-children=yes ./application_name ...application arguments...

The first adaptation is to use :literal:`\`which valgrind\`` to obtain a
2 changes: 1 addition & 1 deletion doc/faq/manual.rst
@@ -204,7 +204,7 @@ following command:

.. code-block:: bash

./charmrun +p14 ./pgm ++ppn 7 +commap 0 +pemap 1-7
./charmrun ++n 2 ./pgm ++ppn 7 +commap 0 +pemap 1-7

See :ref:`sec-smpopts` of the Charm++ manual for more information.

2 changes: 1 addition & 1 deletion doc/libraries/manual.rst
@@ -36,7 +36,7 @@ client is a small Java program. A typical use of this is:

cd charm/examples/charm++/wave2d
make
./charmrun ./wave2d +p2 ++server ++server-port 1234
./charmrun ./wave2d ++n 2 ++server ++server-port 1234
~/ccs_tools/bin/liveViz localhost 1234

Use git to obtain a copy of ccs_tools (prior to using liveViz) and build