Starting the MPI, Hello World

December 9, 2015 | Article

In this article, we will discuss a basic MPI Hello World application and how to run an MPI program. If you have read a similar introductory article before, you will find this one familiar, but here we will discuss the topic in more depth.

In this article I assume you have installed MPICH or OpenMPI (or another MPI implementation).

Source Code

Create a file mpi_hello_world.c and write the following:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
   // Initialize the MPI Environment
   MPI_Init(&argc, &argv);

   // Get the number of processes
   int size;
   MPI_Comm_size( MPI_COMM_WORLD, &size );

   // Get the rank of the process
   int rank;
   MPI_Comm_rank( MPI_COMM_WORLD, &rank );

   // Get the name of the processor
   char procname[ MPI_MAX_PROCESSOR_NAME ];
   int len;
   MPI_Get_processor_name( procname, &len);

   // Print a Hello World message
   printf("Hello world from processor %s, rank %d out of %d processors\n",
          procname, rank, size);

   // Finalize the MPI environment
   MPI_Finalize();

   return 0;
}

Compile & Run

To compile:

mpicc mpi_hello_world.c -o mpi_hello_world

To Run:

mpirun -n 4 mpi_hello_world

Where “-n 4” specifies how many processes will be used.
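Assuming 4 processes are launched on a single machine whose hostname is, say, node01 (a hypothetical name), the output should look roughly like this; note that the order of the lines is nondeterministic because each process prints independently:

Hello world from processor node01, rank 1 out of 4 processors
Hello world from processor node01, rank 0 out of 4 processors
Hello world from processor node01, rank 3 out of 4 processors
Hello world from processor node01, rank 2 out of 4 processors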

Explanation

Now let’s discuss all the components we have written.

The first section is the list of included headers. We use MPI functions, which are declared in mpi.h and available once an MPI implementation is installed. Here we include two headers: mpi.h for MPI and stdio.h for standard input/output (functions like printf and scanf).

MPI provides the communication mechanism between processes. To use MPI, an initialization step must be performed first using MPI_Init(). Once MPI is initialized, processes (or threads, depending on the implementation) are spawned, and each process or thread handles a specific task.

On initialization, all of MPI’s global and internal variables are constructed. For example, a communicator is formed around all of the processes that are spawned. This global communicator is referred to as MPI_COMM_WORLD. Each process is assigned a unique rank within it.

After initialization, two common functions can be called. Almost all MPI programs will invoke these functions:

MPI_Comm_size( MPI_Comm communicator, int* size )

MPI_Comm_size returns the size of a communicator. If we use MPI_COMM_WORLD, which encloses all of the processes in the job, this should return the number of processes that were requested for the job.

MPI_Comm_rank( MPI_Comm communicator, int* rank )

MPI_Comm_rank returns the rank (id) of the calling process. Each process is assigned a unique rank, valid within a communicator. Ranks start from zero and go up to n-1 for n spawned processes.
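As an illustration (not part of the original program), the rank is what lets each process behave differently. A minimal sketch, assuming rank and size were obtained as above:

// Sketch: use the rank to assign different roles to processes.
// Assumes this runs between MPI_Init() and MPI_Finalize().
if (rank == 0) {
    printf("Rank 0 acts as the coordinator of %d processes\n", size);
} else {
    printf("Rank %d acts as a worker\n", rank);
}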

A less commonly used function in this program is this one:

MPI_Get_processor_name( char* name, int* name_length)

This function obtains the actual name of the processor (host) on which the process is executing.

At the end of the program, MPI_Finalize() should be called.

MPI_Finalize()

MPI_Finalize() cleans up the MPI environment. Once finalization is done, no more MPI calls can be made.
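To put these calls together, here is a hedged sketch (not from the original article) of a common pattern: splitting N loop iterations evenly among the processes, assuming rank and size were obtained as in the Hello World program:

// Sketch: block-partition N iterations among the MPI processes.
// Assumes MPI_Init() has been called and rank/size were obtained
// with MPI_Comm_rank() / MPI_Comm_size().
int N = 1000;                       // total amount of work (hypothetical)
int chunk = (N + size - 1) / size;  // iterations per process, rounded up
int begin = rank * chunk;
int end = (begin + chunk < N) ? (begin + chunk) : N;

for (int i = begin; i < end; i++) {
    // do_work(i);  // each process handles only its own slice
}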

Running MPI in Cluster of Nodes

The purpose of MPI is to do distributed work. Using one program, we can divide the computation or workload across several available nodes and do the work simultaneously. In that case, we have to set up a host file to run the program across the nodes.

Suppose you have four nodes on a cluster, 167.205.129.1 to 167.205.129.4, and we name them veda1 to veda4. On each machine we have to set up an authorized key file to avoid a password prompt for SSH (the communication uses SSH connections). Follow these steps:

  • Choose a node as a basis, let’s say 167.205.129.1 (veda1)
  • Log in to veda1
  • Generate a public/private key pair for SSH. Choose the passphrase you want to use for that key:
ssh-keygen -t dsa
  • For the other nodes, log in to each node and create a .ssh directory in the home directory. Make sure you set its permissions as follows:
chmod 700 /home/<username>/.ssh
  • From the basis node (veda1), copy the public key to the other nodes using the following command:
scp /home/<username>/.ssh/id_dsa.pub <username>@<node's ip>:.ssh/authorized_keys
  • Try accessing another node via SSH, using the passphrase chosen at the beginning of this section. For example, to connect to veda2:
ssh 167.205.129.2
  • So that we do not have to type the passphrase every time we log in, run the following on the basis node:
eval `ssh-agent`
ssh-add ~/.ssh/id_dsa
  • Next, write the host file. This file lists all nodes used for computation in MPI. In this case we have 4 nodes (veda1 to veda4), so our host file would be:
# List of hosts or nodes used by MPI
veda1
veda2
veda3
veda4
# -----------------------------------

Let’s call it .mpi_hostfile.

Now, to run the program across all of the nodes, do the following on the basis node:

mpirun -np <# process> --hostfile <hostfile name> <program name>

or, more concretely:

mpirun -np 20 --hostfile .mpi_hostfile mpi_hello_world

to launch 20 processes.

We can modify the host file to suit our needs. Suppose our hosts are actually dual-core machines (or have even more cores). We can get MPI to spawn processes across the individual cores of a node before moving on to the next node. To accommodate this, we alter the file to:

# List of hosts or nodes used by MPI
veda1:2
veda2:2
veda3:2
veda4:2
# -----------------------------------
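Note that the exact host file syntax depends on the MPI implementation: the hostname:2 form above is the style used by MPICH (Hydra), while Open MPI typically expects a slots keyword instead. An equivalent Open MPI style host file would be:

# List of hosts with two slots each (Open MPI style)
veda1 slots=2
veda2 slots=2
veda3 slots=2
veda4 slots=2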

