Heat Equation Problem

The heat equation

is relatively easy to solve using finite difference methods. We shall use this example to demonstrate a number of scientific computation techniques.

Mathematical statement

We wish to solve the IVBP

numerically over the interval . The analytical solution is

 

Simple numerical algorithms

Introduce a discretization of the computational domain

and use a finite difference approximation of the heat equation with forward differences in time, centered differences in space

This is called a FTCS discretization.

Natural language program

Prior to any coding in a chosen programming language it is useful to construct a natural language program. Typically this is done using comments. As the tasks to be accomplished become clear, the comments are translated into a computer language. Further details come to mind while doing this. The same procedure is followed: specify the sub-task details in natural language and then code them in a programming language. At the end of this procedure you have written a documented piece of code. All during the process you should strive to have a correct program that can be compiled and executed without error, even if it doesn't accomplish all of the final tasks. Debugging is much easier if this discipline is maintained throughout code development.

The top level natural language description for solving the heat equation problem using FTCS is:

·        Set up the discretization

·        Initialize the field variable values

·        Iterate forward in time

Fortran 90 program

Fortran 90 shall be used to code this application. The application shall be written in a sequence of stages using the top-down strategy mentioned above. In order to keep track of the changes introduced by each stage the Concurrent Version System (CVS) will be used.

Stage 1 - Program skeleton

First start with a program skeleton. The skeleton should contain just a general comment on the purpose of the program and the bare essentials needed for the code to compile:

PROGRAM mpidif2D

  ! =================================================

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! =================================================

 

  ! Set up the discretization 

 

  ! Initialize field variable values 

 

  ! Iterate forward in time

 

END 

 

We shall start a new CVS entry with the above code. First make sure that the $CVSROOT is set correctly and you are authenticated into the Kerberos system by

 

setenv CVSROOT /afs/amath/common/cvs

klog

 

on the AMATH Sun systems. The CVS command

 

cvs import –m "SCC_FTCS" SCC/ftcs 'UNC_Amath' 'start'

 

with a vendor identification of 'UNC_Amath' and a version identification of 'start'. The first version of the code typically receives a 1.1 version tag. You can retrieve this particular version later using

 

cvs checkout –r 1.1 SCC/ftcs/ftcs.f90

The code is a correct Fortran 90 code. You can compile and link it with

f90 ftcs.f90

and launch the executable file produced with

a.out

Of course the code doesn't do anything yet, but there are no errors.

Stage 2 - Data structures

A decision must be made about what data structures are required by the problem. For FTCS the requirements are simple: two arrays to hold the old and updated values of the field variables. Let us add the required lines of code. Go to the directory within your home directory where you work on programs, e.g.

~/programs

Then request the code from the CVS using

cvs checkout –r 1.1 SCC/ftcs/ftcs.f90

The latest version of the code existent in the CVS system can be retrieved with

cvs checkout SCC/ftcs

The ftcs.f90 file is retrieved from the CVS system along with some additional files that show version modifications.

Edit the file to add comments about the required data structures.

PROGRAM mpidif2D

  ! =================================================

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! =================================================

  INTEGER nx,ny

  ! Declare the field variables

  REAL, ALLOCATABLE, DIMENSION (:,:) :: qnew,qold

 

  ! Set up the discretization

  nx=40; ny=40

 

  ! Allocate space for the field variables

  ALLOCATE(qold(0:nx,0:ny),qnew(0:nx,0:ny),STAT=iError)

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qold or qnew'

    STOP

  END IF

 

  ! Initialize field variable values

 

  ! Iterate forward in time

 

  ! Release space allocated for the field variables

  DEALLOCATE(qold,qnew)

 

END

Here, the Fortran 90 dynamic allocation of memory facility is used. First, the arrays qold, qnew are declared as holding floating point values (REAL), being ALLOCATABLE and having 2 dimensions. During program execution the ALLOCATE statement is invoked to effectively reserve memeory space for the two arrays. Something might go wrong in this process (e.g., there is not enough memory available) so it is always necessary to check whether the allocation was successful. DEALLOCATE releases the space.

The code above is correct and can be compiled and executed. Something gets done this time, but is not apparent since there is no output. It is generally good policy to carry out editing of new fragments to the stage where the code is again correct, albeit incomplete. The new version can then be added to the CVS repository using

cvs commit -m 'Data_Structures' SCC/ftcs

The output from this command is

commit: Examining .

/afs/amath/common/cvs/SCC/ftcs/ftcs.f90,v  <--  ftcs.f90

revision: 1.2; previous revision: 1.1

ftcs.f90 file has been updated and a new version number generated.

A new version, called 1.2 has been added to the CVS system. Both versions are still available.

Stage 3 - Basic execution model

The program should loop to the final time using the FTCS formula to update the field variables from their initial values to the final values at tfinal. Here is the code to do this:

PROGRAM mpidif2D

  ! =================================================

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! =================================================

  INTEGER nx,ny

  REAL hx,hy,pi,pihx,pihy,pix,piy,t,delt,tfinal

  ! Declare the field variables

  REAL, ALLOCATABLE, DIMENSION (:,:) :: qnew,qold

 

  ! Set up the discretization

  nx=40; ny=40 

 

  ! Allocate space for the field variables

  ALLOCATE(qold(0:nx,0:ny),qnew(0:nx,0:ny),STAT=iError)

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qold or qnew'

    STOP

  END IF

 

  ! Initialize field variable values

  hx=1.0/nx; hy=1.0/ny; pi=4.0*atan(1.0); pihx=pi*hx; pihy=pi*hy

  DO j=0,ny

    piy=j*pihy

    DO i=0,nx

      pix=i*pihx

      qold(i,j)=SIN(pix)*SIN(piy)

      qnew(i,j)=qold(i,j)

    END DO

  END DO

 

  ! Set up time iteration

  delt=0.25*MIN(hx,hy)**2; sigmax=delt/hx**2; sigmay=delt/hy**2

  t=0.0; tfinal=1.0

  ! Iterate forward in time

  DO WHILE (t<tfinal)

 

   

    ! Prepare for next iteration

   

    t=t+delt

  END DO

 

  ! Release space allocated for the field variables

  DEALLOCATE(qold,qnew)

 

END

Notice the DO WHILE … END DO construct available in Fortran 90. Also notice the way the two loops are embedded in the initialization stage. The loop on the first array index i is interior to the one on the second array index j. Arrays are stored linearly in memory. Fortran 90 uses column-first ordering so that qold(1,1), qold(2,1), qold(3,1) are stored successively in memory. It is more efficient to access arrays column-first. Let us add this new version to the CVS system:

cvs commit -m 'Basic_Execution' SCC/ftcs

 

The output from this operation is

 

cvs commit: Examining SCC/ftcs

Checking in SCC/ftcs/ftcs.f90;

/afs/amath/common/cvs/SCC/ftcs/ftcs.f90,v  <--  ftcs.f90

new revision: 1.3; previous revision: 1.2

done

 

denoting that version 1.3 has been stored.

 

Stage 4 - Implementation of numeric algorithm

We now add instructions that actually update the field variables using the FTCS algorithm

PROGRAM mpidif2D

  ! =================================================

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! =================================================

  INTEGER nx,ny

  REAL hx,hy,pi,pihx,pihy,pix,piy,t,delt,tfinal

  ! Declare the field variables

  REAL, ALLOCATABLE, DIMENSION (:,:) :: qnew,qold

 

  ! Set up the discretization

  nx=40; ny=40 

 

  ! Allocate space for the field variables

  ALLOCATE(qold(0:nx,0:ny),qnew(0:nx,0:ny),STAT=iError)

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qold or qnew'

    STOP

  END IF

 

  ! Initialize field variable values

  hx=1.0/nx; hy=1.0/ny; pi=4.0*atan(1.0); pihx=pi*hx; pihy=pi*hy

  DO j=0,ny

    piy=j*pihy

    DO i=0,nx

      pix=i*pihx

      qold(i,j)=SIN(pix)*SIN(piy)

      qnew(i,j)=qold(i,j)

    END DO

  END DO

 

  ! Set up time iteration

  delt=0.25*MIN(hx,hy)**2; sigmax=delt/hx**2; sigmay=delt/hy**2

  t=0.0; tfinal=1.0

  ! Iterate forward in time

  DO WHILE (t<tfinal)

 

    DO j=1,ny-1

      DO i=1,nx-1

        qnew(i,j)=qold(i,j)+sigmax*(qold(i+1,j)-2*qold(i,j)+qold(i-1,j)) &

                           +sigmay*(qold(i,j+1)-2*qold(i,j)+qold(i,j-1))

      END DO

    END DO

    ! Prepare for next iteration

    qold=qnew

    t=t+delt

  END DO

 

  ! Release space allocated for the field variables

  DEALLOCATE(qold,qnew)

 

END

If you run the above program you'll probably notice that it takes more time for the code to execute since actual computation is being carried out. A very useful Fortran 90 feature is shown by the qold=qnew instruction, which transfers values for the entire array. As before this version of the code is stored in the CVS system as version 1.4.

Stage 5 - Input and Output

Of course it's not useful just to have the program run without any output and it is inconvenient to have to recompile the program if we want to change the domain size. We need the program to read input data from a file, e.g. ftcs.dat and write to another file, e.g. output.dat. Typically, we do not want to have output from all of the time steps since I/O operations are quite expensive and we would rapidly fill up disk space. We shall set a plottime after which time the field variables are saved to a file.

PROGRAM mpidif2D

  ! =================================================

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! =================================================

  INTEGER nx,ny

  REAL hx,hy,pi,pihx,pihy,pix,piy,t,delt,tfinal,tplot

  ! Declare the field variables

  REAL, ALLOCATABLE, DIMENSION (:,:) :: qnew,qold

 

  ! Set up the discretization

  OPEN(UNIT=1,FILE='ftcs.dat',STATUS='OLD')

  READ(1,*)nx,ny

  READ(1,*)tfinal,tplot

  CLOSE(1)

 

  ! Allocate space for the field variables

  ALLOCATE(qold(0:nx,0:ny),qnew(0:nx,0:ny),STAT=iError)

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qold or qnew'

    STOP

  END IF

 

  ! Initialize field variable values

  hx=1.0/nx; hy=1.0/ny; pi=4.0*atan(1.0); pihx=pi*hx; pihy=pi*hy

  DO j=0,ny

    piy=j*pihy

    DO i=0,nx

      pix=i*pihx

      qold(i,j)=SIN(pix)*SIN(piy)

      qnew(i,j)=qold(i,j)

    END DO

  END DO

 

  ! Set up time iteration

  delt=0.25*MIN(hx,hy)**2; sigmax=delt/hx**2; sigmay=delt/hy**2

  t=0.0; plottime=0.0

  ! Iterate forward in time

  DO WHILE (t<tfinal)

    IF (t>=plottime) THEN

      ! Time to output the current field

      plottime=plottime+tplot ! Set the next plot time

      PRINT *,'Plot time t=',t

      IF (t == 0.0) THEN

        OPEN(UNIT=2,FILE='output.dat',STATUS='REPLACE')

      ELSE

        OPEN(UNIT=2,FILE='output.dat',STATUS='OLD',POSITION='APPEND')

      END IF

      WRITE(2,2001)nx+1,ny+1

      2001 FORMAT('ZONE I=',i4,' J=',i4)

      DO i=0,nx

        DO j=0,ny

          WRITE(2,*)qnew(i,j)

        END DO

      END DO

      ! Close the output file.

      CLOSE(2)

    END IF

    DO j=1,ny-1

      DO i=1,nx-1

        qnew(i,j)=qold(i,j)+sigmax*(qold(i+1,j)-2*qold(i,j)+qold(i-1,j)) &

                           +sigmay*(qold(i,j+1)-2*qold(i,j)+qold(i,j-1))

      END DO

    END DO

    ! Prepare for next iteration

    qold=qnew

    t=t+delt

  END DO

 

  ! Release space allocated for the field variables

  DEALLOCATE(qold,qnew)

 

END

The particular format of output used here is the text format used by the Tecplot visualization package. Note that the output code is written so the first plot creates a new file and successive plots are appended to this file.

This is now a fully working program that satisfies our initial goal of numerically solving the heat equation. Visualizations of the solution can be examined on the showcase page. The version is commited to the CVS system as version 1.5. This has been done after program execution so the input and output files are also available.

Stage 6 - Parallelization

If we want to solve very large problems we have various strategies at our disposal:

·        use efficient algorithms

·        use faster single processor computers

·        use more than one processor

We shall discuss parallelization in more detail in the next meeting. For now try to investigate the following parallelized version of the FTCS algorithm and note any differences that arise.

PROGRAM mpidif2D

  ! Solve the 2D diffusion equation

  !         q_t = q_xx + q_yy

  ! on (x,y) in [0,1]x[0,1] with boundary conditions

  !   q(x,y,0)=sin(pi*x)*sin(pi*y)

  !   q(0,y,t)=q(1,y,t)=q(x,0,t)=q(x,1,t)=0

  ! using a FTCS discretization.

  ! The computation domain is split into vertical strips. Each strip is

  ! sent to another process to carry out the time stepping.

  !

  ! Compile with:

  !   mpif90 -o mpidif2D mpidif2D.f90 -lmpi

  ! Program requires data file mpidif2D.dat. Sample data file:

  !

  ! 64 64

  ! 0.1 0.01

  !

  ! Run with:

  !   mpirun -c 2 -v mpidif2D

  ! on a 2 processor computer or

  !   mpirun -v mpidif2D

  ! on a cluster.

  INCLUDE 'mpif.h'

  INTEGER MPIstatus(MPI_STATUS_SIZE)

  INTEGER, PARAMETER :: TAG_WRITEDATA=0, TAG_DATAWRITTEN=1

  INTEGER, PARAMETER :: TAG_DATASYNCH=2, TAG_INIT=3

  REAL, ALLOCATABLE, DIMENSION (:,:) :: qnew,qold

  REAL, ALLOCATABLE, DIMENSION (:) :: qleft,qright

  REAL qparam(4)

  DOUBLE PRECISION tIO,tcomp,tstart,tend,tplot1,tplot0

  ! Initialize the MPI system

  CALL MPI_INIT(ierr)

  IF (ierr /= MPI_SUCCESS) THEN

    PRINT *,'Unable to initialize MPI system'

    STOP

  END IF

  ! Find out how many processes are active

  CALL MPI_COMM_SIZE(MPI_COMM_WORLD,np,ierr)

  ! Find out what the ID of this particular process is.

  CALL MPI_COMM_RANK(MPI_COMM_WORLD,idproc,ierr)

  IF (idproc == 0) THEN

    ! The master process reads input data

    OPEN(UNIT=1,FILE='mpidif2D.dat',STATUS='OLD')

    READ(1,*)nx,ny

    READ(1,*)tfinal,tplot

    CLOSE(1)

    ! And then sends it to all the others

    qparam(1)=nx; qparam(2)=ny; qparam(3)=tfinal; qparam(4)=tplot

    DO n=1,np-1

      CALL MPI_SEND(qparam(1), 4, MPI_REAL, n, TAG_INIT, MPI_COMM_WORLD, ierr)

    END DO

  ELSE

    ! Slave process receives input data from master

    CALL MPI_RECV(qparam(1), 4, MPI_REAL, 0, TAG_INIT, MPI_COMM_WORLD, MPIstatus, ierr)

    nx=qparam(1); ny=qparam(2); tfinal=qparam(3); tplot=qparam(4)

  END IF

  ! Initialize. Each process allocates enough memory to hold the

  ! strip of data it's going to work on.

  ! The relationship between the local index i within a strip and the

  ! global index iglbl is

  !    iglbl=idproc*ni+i

  ! idproc is the process id running from 0 to np-1

  ni=CEILING((nx+1.0)/np)

  ALLOCATE(qold(0:ni+1,0:ny),qnew(0:ni+1,0:ny),STAT=iError)

  ! Gridlines from 1 to ni are within this processor's strip.

  ! Gridline i=0 will be received from neighbor to the left

  !              (lower process number) or left boundary condition

  ! Gridline i=ni+1 will be received from neighbor to the right

  !              (higher process number) or right boundary condition

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qold or qnew'

    STOP

  END IF

  ALLOCATE(qleft(0:ny),qright(0:ny),STAT=iError)

  IF (iError /= 0) THEN

    PRINT *,'Error allocating qleft or qright'

    STOP

  END IF

  ! Initial conditions. Since these do not depend on synchronization between

  ! the processes we can use the same code for all processes

  hx=1.0/nx; hy=1.0/ny; pi=4.0*atan(1.0)

  DO j=0,ny

    piy=pi*j*hy

    DO i=0,ni+1

      iglbl=idproc*ni+i

      pix=pi*iglbl*hx

      qold(i,j)=sin(pix)*sin(piy)

      qnew(i,j)=qold(i,j)

    END DO

  END DO

  ! Compute the time step

  delt=0.25*MIN(hx,hy)**2; sigmax=delt/hx**2; sigmay=delt/hy**2

  t=0.0; plottime=0.0

  ! Start the time iteration.

  IF (idproc == 0) THEN

    PRINT *,'Running on ',np,' processes from t=0 to t=',tfinal,' with delt=',delt

    tcomp=0.0; tIO=0.0

  END IF

  DO WHILE (t<tfinal)

    IF (t>=plottime) THEN

      ! Time to output the current field

      plottime=plottime+tplot ! Set the next plot time

      ! There is no guaranteed running order among the processes.

      ! So we must set up a master/slave relationship in order to output

      ! the data in proper order.

      IF (idproc == 0) THEN

        tplot0=MPI_WTIME()

        ! Master process. Open the output file

        PRINT *,'Plot time t=',t

        PRINT 2000,0

        2000 FORMAT('Output of data from process ',i3)

        IF (t == 0.0) THEN

          OPEN(UNIT=2,FILE='mpidif2D.out',STATUS='REPLACE')

        ELSE

          OPEN(UNIT=2,FILE='mpidif2D.out',STATUS='OLD',POSITION='APPEND')

        END IF

        WRITE(2,2001)nx+1,ny+1

        2001 FORMAT('ZONE I=',i4,' J=',i4)

        DO i=1,MIN(ni,nx+1-idproc*ni)

          DO j=0,ny

            WRITE(2,*)qnew(i,j)

          END DO

        END DO

        ! Close the output file.

        CLOSE(2)

        ! Loop over the processes telling each one to output its data

        DO n=1,np-1

          PRINT 2000,n

          qsend=n

          ! Send a request to output the data

          CALL MPI_SEND(qsend, 1, MPI_REAL, n, TAG_WRITEDATA, MPI_COMM_WORLD, ierr)

          ! Wait for a response that data has been written

          CALL MPI_RECV(qrecv, 1, MPI_REAL, n, TAG_DATAWRITTEN, MPI_COMM_WORLD, ierr)

        END DO

        tplot1=MPI_WTIME()

        tIO=tIO+(tplot1-tplot0)

      ELSE

        ! Wait for a signal from the master process

        CALL  MPI_RECV(qrecv, 1, MPI_REAL, 0, TAG_WRITEDATA, MPI_COMM_WORLD, ierr)

        ! After signal is received, output the data

        OPEN(UNIT=2,FILE='mpidif2D.out',STATUS='OLD',POSITION='APPEND')

        DO i=1,MIN(ni,nx+1-idproc*ni)

          DO j=0,ny

            WRITE(2,*)qnew(i,j)

          END DO

        END DO

        CLOSE(2)

        qsend=idproc

        CALL MPI_SEND(qsend, 1, MPI_REAL, 0, TAG_DATAWRITTEN, MPI_COMM_WORLD, ierr)

      END IF

    END IF

    IF (idproc == 0) tstart=MPI_WTIME()

    ! Carry out a time step. All BC's are initially valid

    DO j=1,ny-1

      DO i=1,MIN(ni,nx+1-idproc*ni)

        qnew(i,j)=qold(i,j)+sigmax*(qold(i+1,j)-2*qold(i,j)+qold(i-1,j)) &

                           +sigmay*(qold(i,j+1)-2*qold(i,j)+qold(i,j-1))

      END DO

    END DO

    ! Synchronize across the processes.

    !  Send right ghost cell values

    IF (idproc < np-1) THEN

      qright(:)=qnew(ni,:)

      CALL MPI_SEND(qright(0), ny+1, MPI_REAL, idproc+1, TAG_DATASYNCH, MPI_COMM_WORLD, ierr)

    END IF

    !  Receive left ghost cell values

    IF (idproc > 0) THEN

      CALL MPI_RECV(qleft(0), ny+1, MPI_REAL, idproc-1, TAG_DATASYNCH, MPI_COMM_WORLD, ierr)

      qnew(0,:)=qleft(:)

    END IF

    !  Send left ghost cell values

    IF (idproc > 0) THEN

      qleft(:)=qnew(1,:)

      CALL MPI_SEND(qleft(0), ny+1, MPI_REAL, idproc-1, TAG_DATASYNCH, MPI_COMM_WORLD, ierr)

    END IF

    !  Receive right ghost cell values

    IF (idproc < np-1) THEN

      CALL MPI_RECV(qright(0), ny+1, MPI_REAL, idproc+1, TAG_DATASYNCH, MPI_COMM_WORLD, ierr)

      qnew(ni+1,:)=qright(:)

    END IF

    IF (idproc == 0) THEN

      tend=MPI_WTIME()

      tcomp=tcomp+tend-tstart

    END IF

    ! Prepare for next iteration

    qold=qnew

    t=t+delt

  END DO

  IF (idproc == 0) THEN

    PRINT *,'Time spent in I/O      t=',tIO

    PRINT *,'Computation time       t=',tcomp

  END IF

  ! Shut down the MPI system

  CALL MPI_FINALIZE(ierr)

END