Stylish Fortran

This is not intended to be a complete beginner's guide to programming in Fortran, but rather to fill in some of the holes in the knowledge of the self-taught. I have not written a similar guide for C: those who need such a thing should not be writing in C...

The compiler
The code
Debugging

The compiler

Invoking a compiler as simply

f90 hello.f

hardly needs documenting. However, the more complicated forms might need a couple of words. The basic structure is

f90 [options] [source files] [libraries]

So one should usually write

f90 -O4 -r8 -o myprog test.f90 -lnag

rather than some other order. With some combinations of options and compilers, the options only effect source files specified further to the right, hence specifying all source files after all options is usually the best approach. (Not that one should routinely use -r8 on well-written code...)

Most, but not all, compilers optimise by default. With some one has to specify at least -O for optimisation to occur. Optimisation will increase the compile time, and make it harder to use a debugger with the resulting code, but should reduce the run-time considerably, often by a factor of about two. The compiler's man page will discuss optimisation levels in more detail.

The Code

Many people appear to take pride in writing unreadable code. There are even competitions for such tricks. However, doing so in research is silly: even you will have problems reading the code in a few months time, and no-one will be able to collaborate with you. One might add that you need a brain the size of a small planet to be able to write obfuscated code without bugs, but you might regard such statements as a challenge.

Indenting etc.

Using sensible variable names, indenting loops and other blocks, and using comments, are such simple ideas, and yet few do it.

if (a .ne. 1.0) then
  do i=1,n
    y(i)=y(i)+a*x(i)
  enddo
else
  do i=1,n
    y(i)=y(i)+x(i)
  enddo
endif

tends to be easier for humans to read that the equivalent

IF(A.NE.1.0)THEN
DO I=1,N
Y(I)=Y(I)+A*X(I)
ENDDO
ELSE
DO I=1,N
Y(I)=Y(I)+X(I)
ENDDO
ENDIF

and this block of code is trivial in length and complexity.

If one indents by too many characters, one soon ends up with an enormous left-hand margin. Too few, and the indentations are hard to follow by eye. Most importantly, one should be consistent: the compiler will not check the indentation, and indentation which does not match the code structure is very confusing. Between two and four spaces per level of indentation is probably about right.

Deprecated features

For many reasons, languages often contain syntax which is deprecated or even marked for future removal. One really should not use such features unless one understands why they are deprecated, and why they are actually appropriate in the case in question. In Fortran90 such features would include goto, common, numbered do loops, and implicit typing.

Implicit typing

In Fortran, undeclared variables are permitted, and default to being integer if the first character is between I and N, and single precision floating point otherwise. (One can change these defaults.)

Implicit typing is generally a Bad Thing.

real salary, tax
salary=15000
tax=(salary-4000)*0.23
salery=salary-tax
write(*,*) salary
end

The above code contains an obvious(?) mistake, but the compiler will produce no complaint as it is syntactically correct. Therefore one should add `implicit none' to the beginning of the code, and/or compile with whatever flags cause the compiler to disallow undeclared variables (-u for most compilers). C never permits undeclared variables.

Interface blocks

Fortran 77 compilers are rarely able to check the arguments passed to functions and procedures. Fortran 90 compilers will do so if an interface block is provided. It is advisable to use them, as they can catch many errors. When using the NAG library, interface blocks are available, and their use, as below, will highlight the error in the following code:

use nag_f77_f_chapter
real (kind(1.0d0)) :: m(3,3),d,wrk
integer i,n,ifail

m(1,1)=2 ; m(1,2)=0
m(2,1)=0 ; m(2,2)=2
i=3 ; n=2 ; ifail=0

call f03aaf(m,i,n,d,wrk,ifail)

write(*,*)'Determinant is ',d
end

Precision

It is unlikely that you will wish to use any single-precision arithmetic in your code. If the compiler has an option to promote all single precision expressions to double precision (often -r8), then adding this option should not change the result of the code at all. If it does, then there was some use of single precision somewhere.

The F77 standard defines single precision real, double precision real, and single precision complex. Any use of double precision complex is an extension to the standard, and an absolute necessity in most cases! The entirely correct declarations are real, double precision and complex, however, almost all compilers accept real*4, real*8, complex*8 and complex*16, where the number is the total number of bytes occupied by the data. As a complex number is stored as two real parts, complex*8 has the same precision as real*4.

The F90 standard is rather different, and does have double precision complex. One can specify a "kind" with a real declaration, which controls the precision. A common form is:

integer, parameter :: dp=kind(1.0d0)
real (kind=dp)     :: x,y
complex (kind=dp)  :: z

although an abbreviated form of this is

integer, parameter :: dp=kind(1d0)
real (dp)     :: x,y
complex (dp)  :: z

Note that the value of kind(1d0) is compiler-dependent, so one cannot write

real (kind=8) :: x

and expect one's code to be portable or respected. Note also that the kind values for double precision real and double precision complex are identical.

It is permitted to mix F90 and F77 style declarations in F90, although using the pure F90 style is preferred.

Debugging

Debugging is tedious, and it is best to force the compiler to do as much as possible automatically. By this stage you should be using interface blocks and trapping undeclared variables. A further trick is to try to trap out-of-bound array references, usually achieved by compiling with -C. This will make the code run significantly slower, so it is not advised for real `production' runs.

Beyond this, you might need to investigate core files. A core file is produced on certain errors, and is a snapshot of all the memory the program was using at the time the error occured. It can therefore be big: potentially the odd gigabyte. Currently the Linux machines do not produce these files by default. If a core file is likely to be large, it ought to be dumped in /scratch, not your home directory. Dumping large files across a network running at under 10MB/sec is not fun, nor is having them truncated by quota limitations.

To ensure core files are enabled (or disabled!), one can check the shell process limits:

pc0:~$ ulimit -c
0

This machine will not currently dump core files at all. To enable them, one can type:

pc0:~$ ulimit -c unlimited

If the executable was compiled with the -g flag, with a core file one can determine precisely which line caused the program to crash using a debugger. Simply typing gdb executable core followed by where and quit is often sufficient. Replace gdb by dbx90 (NAG f95 compiler) or iidb (Intel compilers).

Please send comments/corrections to mjr19.