Course Outline

 1. Preface
 2. What is Fortran?
 3. Fortran basics
 4. How to use Fortran on the department Unix computers
 5. Variables, types, and declarations
 6. Expressions and assignment
 7. Logical expressions
 8. The if statements
 9. Loops
10. Arrays
11. Subprograms
12. Arrays in subprograms
13. Common blocks
14. Data and Block Data Statements
15. File I/O
16. Simple I/O
17. Format statements
18. Numerical Software
      * BLAS
      * LAPACK
      * How to use libraries (Unix)
      * How can I find what I need?
19. Programming style
20. Debugging
21. Fortran resources on the Web

Copyright © 1995-7 by Stanford University. All rights reserved.

*****************************************************************

2. What is Fortran?

Fortran is a general purpose programming language, mainly intended for mathematical computations in, e.g., physics. Fortran is an acronym for FORmula TRANslation, and was originally capitalized as FORTRAN. However, following the current trend to only capitalize the first letter in acronyms, we will call it Fortran. Fortran was the first widely used high-level programming language. The work on Fortran started in the 1950s at IBM and there have been many versions since. By convention, a Fortran version is denoted by the last two digits of the year the standard was proposed. Thus we have

  * Fortran 66
  * Fortran 77
  * Fortran 90 (95)

The most common Fortran version today is still Fortran 77, although Fortran 90 is growing in popularity. Fortran 95 is a revised version of Fortran 90 which (as of early 1996) is expected to be approved by ANSI soon. There are also several versions of Fortran aimed at parallel computers. The most important one is High Performance Fortran (HPF), which is a de facto standard.

Users should be aware that most Fortran 77 compilers allow a superset of Fortran 77, i.e. they allow non-standard extensions. In this tutorial we will emphasize standard ANSI Fortran 77.

Why learn Fortran?

Fortran is the dominant programming language used in scientific and engineering applications. It is therefore important for physicists to be able to read and modify Fortran code. From time to time, so-called experts predict that Fortran will rapidly fade in popularity and soon become extinct. These predictions have always failed. Fortran is the most enduring computer programming language in history. One of the main reasons Fortran has survived and will survive is software inertia. Once a company has spent many man-years and perhaps millions of dollars on a software product, it is unlikely to try to translate the software to a different language. Reliable software translation is a very difficult task.

Portability

A major advantage Fortran has is that it is standardized by ANSI and ISO (see footnotes). Consequently, if your program is written in ANSI Fortran 77, using nothing outside the standard, then it will run on any computer that has a Fortran 77 compiler. Thus, Fortran programs are portable across machine platforms.

Footnotes:
ANSI = American National Standards Institute
ISO = International Organization for Standardization

3. Fortran 77 Basics

A Fortran program is just a sequence of lines of text. The text has to follow a certain structure to be a valid Fortran program. We start by looking at a simple example:

      program circle
      real r, area

c This program reads a real number r and prints
c the area of a circle with radius r.

      write (*,*) 'Give radius r:'
      read  (*,*) r
      area = 3.14159*r*r
      write (*,*) 'Area = ', area

      stop
      end

The lines that begin with a "c" are comments and have no purpose other than to make the program more readable for humans. Originally, all Fortran programs had to be written in all upper-case letters.
Most people now write lower-case since this is more legible, and so will we. You may wish to mix case, but Fortran is not case-sensitive, so "X" and "x" are the same variable.

Program organization

A Fortran program generally consists of a main program (or driver) and possibly several subprograms (procedures or subroutines). For now we will place all the statements in the main program; subprograms will be treated later. The structure of a main program is:

      program name

      declarations

      statements

      stop
      end

In this tutorial, words such as name, declarations, and statements in a syntax template like the one above should not be taken as literal text, but rather as a description of what belongs in their place.

The stop statement is optional and may seem superfluous since the program will stop when it reaches the end anyway, but it is recommended to always terminate a program with the stop statement to emphasize that the execution flow stops there.

You should note that you cannot have a variable with the same name as the program.

Column position rules

Fortran 77 is not a free-format language, but has a very strict set of rules for how the source code should be formatted. The most important rules are the column position rules:

Col. 1    : Blank, or a "c" or "*" for comments
Col. 1-5  : Statement label (optional)
Col. 6    : Continuation of previous line (optional; see below)
Col. 7-72 : Statements
Col. 73-80: Sequence number (optional, rarely used today)

Most lines in a Fortran 77 program start with 6 blanks and end at or before column 72, i.e. only the statement field is used.

Comments

A line that begins with the letter "c" or an asterisk in the first column is a comment. Comments may appear anywhere in the program. Well-written comments are crucial to program readability. Commercial Fortran codes often contain about 50% comments. You may also encounter Fortran programs that use the exclamation mark (!) for comments. This is not a standard part of Fortran 77, but is supported by several Fortran 77 compilers and is explicitly allowed in Fortran 90.
Where it is supported, the exclamation mark may appear anywhere on a line (except in positions 2-6).

Continuation

Sometimes, a statement does not fit into the 66 available columns of a single line. One can then break the statement into two or more lines, and use the continuation mark in position 6. Example:

c The next statement goes over two physical lines
      area = 3.14159265358979
     +       * r * r

Any character other than a blank or a zero can be used instead of the plus sign as a continuation character. It is considered good programming style to use either the plus sign, an ampersand, or digits (using 2 for the second line, 3 for the third, and so on).

Blank spaces

Blank spaces are ignored in Fortran 77 (except inside character strings). So if you remove all blanks in a Fortran 77 program, the program is still acceptable to a compiler but almost unreadable to humans.

4. How to use Fortran on the department Unix computers

Practical details

You will need an account on our Unix network. If you don't have one, see the department system administrator. Also, you will need to ask the department office for a key code so that you can gain access to the computer lab.

We will be using Fortran under the Unix operating system. If you have no previous experience with Unix, you will have to learn the basics on your own. However, you will be provided with a tutorial introduction to Unix for reference.

Source code, object code, compiling, and linking

A Fortran program consists of plain text that follows certain rules (syntax). This is called the source code. You need to use an editor to write (edit) the source code. A commonly used editor in Unix is "vi", but it isn't very user friendly. The "text editor" program on our Sun workstations is easy to use. (Access it by right-clicking with the mouse, going to the "programs" menu, and choosing "text editor".)

When you have written a Fortran program, you should save it in a file that has the extension .f or .for. Before you can execute the program, you must translate it into machine readable form. This is done by a special program called a compiler. The Unix command which runs the Fortran 77 compiler is f77. The output from the compilation is named a.out by default, but you can choose another name if you wish. Once the program has successfully been compiled, and an executable file such as a.out has been created, you may run the program by simply typing the name of the executable file, for example a.out. (This explanation is a bit oversimplified. Really, the compiler translates source code into object code and the linker/loader makes this into an executable file.)

Examples:

Refer back to Section 3, and write the program file circle.f. Then compile and run the program.

If you need to have several executables at the same time, it is a good idea to give them (different!) descriptive names. This can be accomplished using the -o option. For example,

      f77 circle.f -o circle.out

will compile the file circle.f and save the executable in the file circle.out (rather than the default a.out).

Please note that object files and executables take a lot of disk space, so you should delete them when you are not using them. (The remove command in Unix is rm.)

Optional Topic

In the previous examples, we have not distinguished between compiling and linking. These are two different processes but the Fortran compiler performs them both, so the user usually does not need to know about it. But in the next example we will use two source code files.

      f77 circle1.f circle2.f

This will generate three files, the two object code files circle1.o and circle2.o, plus the executable file a.out. What really happened here is that the Fortran compiler first compiled each of the source code files into object files (ending in .o) and then linked the two object files together into the executable a.out.
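
The tutorial does not show what circle1.f and circle2.f contain. As one hypothetical illustration, the main program could live in circle1.f and a function it calls in circle2.f. The function name carea and the way the program is split are assumptions made for this sketch only (functions are treated in the section on subprograms):

```fortran
c ---- hypothetical contents of circle1.f: the main program ----
      program circle
      real r, area, carea
      external carea

      write (*,*) 'Give radius r:'
      read  (*,*) r
      area = carea(r)
      write (*,*) 'Area = ', area

      stop
      end

c ---- hypothetical contents of circle2.f: a function ----
      real function carea(r)
      real r

c Returns the area of a circle with radius r.
      carea = 3.14159*r*r

      return
      end
```

Neither file is a complete program on its own: the compiler turns each one into an object file, and the linker resolves the call to carea when it combines the object files.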
You can separate these two steps by using the -c option to tell the compiler to only compile the source files:

      f77 -c circle1.f circle2.f
      f77 circle1.o circle2.o

Compiling separate files like this may be useful if there are many files and only a few of them need to be recompiled. In Unix there is a useful command called make which is usually used to handle large software packages with many source files. These packages come with a makefile and all the user has to do is to type make. Writing makefiles is a bit complicated so we will not discuss this in this tutorial.

5. Variables, types, and declarations

Variable names

Variable names in Fortran 77 consist of 1-6 characters chosen from the letters a-z and the digits 0-9. The first character must be a letter. Fortran 77 does not distinguish between upper and lower case, and nearly all Fortran 77 compilers will accept lower case. If you should ever encounter a Fortran 77 compiler that insists on upper case it is usually easy to convert the source code to all upper case.

The words which make up the Fortran language are called keywords. Strictly speaking, Fortran does not reserve them, but you should treat them as reserved words and never use them as names of variables. Some of the keywords which we have seen so far are "program", "real", "stop" and "end".

Types and declarations

Every variable should be defined in a declaration. This establishes the type of the variable. The most common declarations are:

      integer   list of variables
      real      list of variables
      double precision  list of variables
      complex   list of variables
      logical   list of variables
      character list of variables

The list of variables should consist of variable names separated by commas. Each variable should be declared exactly once. If a variable is undeclared, Fortran 77 uses a set of implicit rules to establish the type.
This means all variables starting with the letters i-n are integers and all others are real. Many old Fortran 77 programs used these implicit rules, but you should not! The probability of errors in your program grows dramatically if you do not consistently declare your variables.

Integers and floating point variables

Fortran 77 has only one type for integer variables. Integers are usually stored as 32 bit (4 byte) variables. Therefore, all integer variables should take on values in the range [-m,m], where m is approximately 2*10^9.

Fortran 77 has two different types for floating point variables, called real and double precision. While real is often adequate, some numerical calculations need very high precision and double precision should be used. Usually a real is a 4 byte variable and the double precision is 8 bytes, but this is machine dependent. Some non-standard Fortran versions use the syntax real*8 to denote 8 byte floating point variables.

The parameter statement

Some constants appear many times in a program. It is then often desirable to define them only once, in the beginning of the program. This is what the parameter statement is for. It also makes programs more readable. For example, the circle area program should rather have been written like this:

      program circle
      real r, area, pi
      parameter (pi = 3.14159)

c This program reads a real number r and prints
c the area of a circle with radius r.

      write (*,*) 'Give radius r:'
      read  (*,*) r
      area = pi*r*r
      write (*,*) 'Area = ', area

      stop
      end

The syntax of the parameter statement is

      parameter (name = constant, ... , name = constant)

The rules for the parameter statement are:

  * The name defined in the parameter statement is not a variable but rather a constant. (You cannot change its value at a later point in the program.)
  * A name can appear in at most one parameter statement.
  * The parameter statement(s) must come before the first executable statement.
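
As a minimal sketch of these rules, the following program defines two constants in a single parameter statement. The program name pendul, the variable names, and the pendulum formula are chosen here purely for illustration and do not come from the tutorial:

```fortran
      program pendul
      real pi, g
      real length, period
      parameter (pi = 3.14159, g = 9.81)

c Reads the length of a pendulum in meters and prints its
c small-angle period in seconds.
c pi and g are constants, not variables: a later assignment
c such as g = 9.8 would be rejected by the compiler.

      write (*,*) 'Give length (m):'
      read  (*,*) length
      period = 2.0*pi*sqrt(length/g)
      write (*,*) 'Period (s) = ', period

      stop
      end
```

Note that pi and g are given a type in an ordinary declaration before the parameter statement, and that the parameter statement precedes the first executable statement.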

Some good reasons to use the parameter statement are:

  * It helps reduce the number of typos.
  * It makes it easier to change a constant that appears many times in a program.
  * It increases the readability of your program.

6. Expressions and assignment

Constants

The simplest form of an expression is a constant. There are 6 types of constants, corresponding to the 6 data types. Here are some integer constants:

      1
      0
      -100
      32767
      +15

Then we have real constants:

      1.0
      -0.25
      2.0E6
      3.333E-1

The E-notation means that you should multiply the constant by 10 raised to the power following the "E". Hence, 2.0E6 is two million, while 3.333E-1 is approximately one third.

Double precision constants are used for constants that are larger than the largest real allowed, or that require high precision. The notation is the same as for real constants except the "E" is replaced by a "D". Examples:

      2.0D-1
      1D99

Here 2.0D-1 is a double precision one-fifth, while 1D99 is a one followed by 99 zeros.

Complex constants are designated by a pair of constants (integer or real), separated by a comma and enclosed in parentheses. Examples are:

      (2, -3)
      (1., 9.9E-1)

The first number denotes the real part and the second the imaginary part.

Logical constants can only have one of two values:

      .TRUE.
      .FALSE.

Note that the dots enclosing the letters are required.

Character constants are most often used as an array of characters, called a string. These consist of an arbitrary sequence of characters enclosed in apostrophes (single quotes):

      'ABC'
      'Anything goes!'
      'It is a nice day'

Strings and character constants are case sensitive. A problem arises if you want to have an apostrophe in the string itself.
In this case, you should double the apostrophe:

      'It''s a nice day'

Expressions

The simplest non-constant expressions are of the form

      operand operator operand

and an example is

      x + y

The result of an expression is itself an operand, hence we can nest expressions together like

      x + 2 * y

This raises the question of precedence: Does the last expression mean x + (2*y) or (x+2)*y? The precedence of arithmetic operators in Fortran 77 is (from highest to lowest):

      **   {exponentiation}
      *,/  {multiplication, division}
      +,-  {addition, subtraction}

All these operators are evaluated left-to-right, except the exponentiation operator **, which is evaluated right-to-left (so 2**3**2 means 2**(3**2)). If you want to change the default evaluation order, you can use parentheses. When in doubt, you may use parentheses to ensure the correct order of operation.

The above operators are all binary operators. There is also the unary operator - for negation, which has the same precedence as addition and subtraction. Hence an expression like -x+y means what you would expect, while -x**2 means -(x**2).

Extreme caution must be taken when using the division operator, which has a quite different meaning for integers and reals. If the operands are both integers, an integer division is performed, otherwise a real arithmetic division is performed. For example, 3/2 equals 1, while 3./2. equals 1.5 (note the decimal points).

Assignment

A variable assignment has the form

      variable_name = expression

The interpretation is as follows: Evaluate the right hand side and assign the resulting value to the variable on the left. The expression on the right may contain other variables, but the variable assignment does not change their value. For example,

      area = pi * r**2

does not change the value of pi or r, only area.

Type conversion

When different data types occur in the same expression, type conversion has to take place, either explicitly or implicitly. Fortran will do some type conversion implicitly.
For example,

      real x
      x = x + 1

will convert the integer one to the real number one, and has the desired effect of incrementing x by one. However, in more complicated expressions, it is good programming practice to force the necessary type conversions explicitly. For numbers, the following functions are available:

      int
      real
      dble
      ichar
      char

The first three have the obvious meaning, while ichar takes a character and converts it to an integer, and char does exactly the opposite.

Example: the following multiplies two real variables x and y using double precision, and stores the result in the double precision variable w:

      w = dble(x)*dble(y)

Note that this is different from

      w = dble(x*y)

7. Logical expressions

Logical expressions can only have the value .TRUE. or .FALSE.. A logical expression can be formed by comparing arithmetic expressions using the following relational operators:

      .LT.  meaning  <
      .LE.           <=
      .GT.           >
      .GE.           >=
      .EQ.           =
      .NE.           /=

So you cannot use symbols like < or = for comparison in Fortran 77, but you have to use the correct two-letter abbreviation enclosed by dots!

Logical expressions can be combined by the logical operators .AND. .OR. .NOT. which have the obvious meaning.

Logical variables and assignment

Truth values can be stored in logical variables. The assignment is analogous to the arithmetic assignment. Example:

      logical a, b
      a = .TRUE.
      b = a .AND. 3 .LT. 5/2

The order of precedence is important, as the last example shows. The rule is that arithmetic expressions are evaluated first, then relational operators, and finally logical operators. Hence b will be assigned .FALSE. in the example above. Among the logical operators the precedence (in the absence of parentheses) is that .NOT. is done first, then .AND., then .OR. is done last.
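
The precedence among the logical operators can be illustrated with a small sketch. The program and variable names are arbitrary; each pair of assignments computes the same value, the second one with the implied grouping written out in parentheses:

```fortran
      program logprc
      logical a, b, c, r1, r2, r3, r4

      a = .TRUE.
      b = .FALSE.
      c = .FALSE.

c .AND. is done before .OR., so r1 and r2 agree.
      r1 = a .OR. b .AND. c
      r2 = a .OR. (b .AND. c)

c .NOT. is done before .AND., so r3 and r4 agree.
      r3 = .NOT. a .AND. c
      r4 = (.NOT. a) .AND. c

      write (*,*) r1, r2, r3, r4

      stop
      end
```

With these values r1 and r2 are true, whereas (a .OR. b) .AND. c would be false. Likewise r3 and r4 are false, whereas .NOT. (a .AND. c) would be true.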

While logical variables are seldom used in Fortran, logical expressions are frequently used in conditional statements such as the if statement.

8. The if statements

An important part of any programming language is its conditional statements. The most common such statement in Fortran is the if statement, which actually has several forms. The simplest one is the logical if statement:

      if (logical expression) executable statement

This has to be written on one line. This example finds the absolute value of x:

      if (x .LT. 0) x = -x

If more than one statement should be executed inside the if, then the following syntax should be used:

      if (logical expression) then
         statements
      endif

The most general form of the if statement has the following form:

      if (logical expression) then
         statements
      elseif (logical expression) then
         statements
       :
       :
      else
         statements
      endif

The execution flow is from top to bottom. The conditional expressions are evaluated in sequence until one is found to be true. Then the associated statements are executed and the control resumes after the endif.

Nested if statements

if statements can be nested in several levels. To ensure readability, it is important to use proper indentation. Here is an example:

      if (x .GT. 0) then
         if (x .GE. y) then
            write(*,*) 'x is positive and x >= y'
         else
            write(*,*) 'x is positive but x < y'
         endif
      elseif (x .LT. 0) then
         write(*,*) 'x is negative'
      else
         write(*,*) 'x is zero'
      endif

You should avoid nesting many levels of if statements since things get hard to follow.

9. Loops

For repeated execution of similar things, loops are used. If you are familiar with other programming languages you have probably heard about for-loops, while-loops, and until-loops.
Fortran 77 has only one loop construct, called the do-loop. The do-loop corresponds to what is known as a for-loop in other languages. Other loop constructs have to be built using the if and goto statements.

do-loops

The do-loop is used for simple counting. Here is a simple example that prints the cumulative sums of the integers from 1 through n (assume n has been assigned a value elsewhere):

      integer i, n, sum

      sum = 0
      do 10 i = 1, n
         sum = sum + i
         write(*,*) 'i =', i
         write(*,*) 'sum =', sum
  10  continue

The number 10 is a statement label. Typically, there will be many loops and other statements in a single program that require a statement label. The programmer is responsible for assigning a unique number to each label in each program (or subprogram). Recall that column positions 1-5 are reserved for statement labels. The numerical values of statement labels have no significance, so any integers can be used, in any order. Typically, most programmers use consecutive multiples of 10.

The variable defined in the do-statement is incremented by 1 by default. However, you can define the step to be any number but zero. This program segment prints the even numbers between 1 and 10 in decreasing order:

      integer i

      do 20 i = 10, 1, -2
         write(*,*) 'i =', i
  20  continue

The general form of the do-loop is as follows:

      do label  var =  expr1, expr2, expr3
         statements
label continue

var is the loop variable (often called the loop index) which should be an integer. Here expr1 specifies the initial value of var, while expr2 is the terminating bound, and expr3 is the increment (step).

Note: The do-loop variable must never be changed by other statements within the loop! This will cause great confusion.

Fortran 77 also allows the loop index to be of type real, but due to round off errors it may not take on exactly the expected sequence of values.

Many Fortran 77 compilers allow do-loops to be closed by the enddo statement.
The advantage of this is that the statement label can then be omitted since it is assumed that an enddo closes the nearest previous do statement. The enddo construct is widely used, but it is not a part of ANSI Fortran 77.

It should be noted that unlike some programming languages, Fortran only evaluates the start, end, and step expressions once, before the first pass through the body of the loop. This means that the following do-loop will double a non-negative j, rather than running forever as the equivalent loop might in another language.

      integer i,j

      read (*,*) j
      do 20 i = 1, j
         j = j + 1
  20  continue
      write (*,*) j

while-loops

The most intuitive way to write a while-loop is

      while (logical expr) do
         statements
      enddo

or alternatively,

      do while (logical expr)
         statements
      enddo

The program will alternate testing the condition and executing the statements in the body as long as the condition in the while statement is true. Even though this syntax is accepted by many compilers, it is not ANSI Fortran 77. The standard-conforming way is to use if and goto:

label if (logical expr) then
         statements
         goto label
      endif

Here is an example that calculates and prints all the powers of two that are less than or equal to 100:

      integer n

      n = 1
  10  if (n .le. 100) then
         write (*,*) n
         n = 2*n
         goto 10
      endif

until-loops

If the termination criterion is at the end instead of the beginning, it is often called an until-loop. The pseudocode looks like this:

      do
         statements
      until (logical expr)

Again, this should be implemented in Fortran 77 by using if and goto:

label continue
         statements
      if (logical expr) goto label

Note that the logical expression in the latter version should be the negation of the expression given in the pseudocode!

10. Arrays

Many scientific computations use vectors and matrices. The data type Fortran uses for representing such objects is the array.
A one-dimensional array corresponds to a vector, while a two-dimensional array corresponds to a matrix. To fully understand how this works in Fortran 77, you will have to know not only the syntax for usage, but also how these objects are stored in memory in Fortran 77.

One-dimensional arrays

The simplest array is the one-dimensional array, which is just a sequence of elements stored consecutively in memory. For example, the declaration

      real a(20)

declares a as a real array of length 20. That is, a consists of 20 real numbers stored contiguously in memory. By convention, Fortran arrays are indexed from 1 and up. Thus the first number in the array is denoted by a(1) and the last by a(20). However, you may define an arbitrary index range for your arrays using the following syntax:

      real b(0:19), weird(-162:237)

Here, b is exactly like a from the previous example, except the index runs from 0 through 19, while weird is an array of length 237-(-162)+1 = 400.

The type of an array element can be any of the basic data types. Examples:

      integer i(10)
      logical aa(0:1)
      double precision x(100)

Each element of an array can be thought of as a separate variable. You reference the i'th element of array a by a(i). Here is a code segment that stores the squares of the numbers 1 through 10 in the array sq:

      integer i, sq(10)

      do 100 i = 1, 10
         sq(i) = i**2
 100  continue

A common bug in Fortran is that the program tries to access array elements that are out of bounds or undefined. Avoiding this is the responsibility of the programmer, and the Fortran compiler will generally not detect such bugs!

Two-dimensional arrays

Matrices are very important in linear algebra. Matrices are usually represented by two-dimensional arrays. For example, the declaration

      real A(3,5)

defines a two-dimensional array of 3*5=15 real numbers. It is useful to think of the first index as the row index, and the second as the column index.
Hence we get the graphical picture:

      (1,1)  (1,2)  (1,3)  (1,4)  (1,5)
      (2,1)  (2,2)  (2,3)  (2,4)  (2,5)
      (3,1)  (3,2)  (3,3)  (3,4)  (3,5)

Two-dimensional arrays may also have indices in an arbitrarily defined range. The general syntax for declarations is:

      name (low_index1 : hi_index1, low_index2 : hi_index2)

The total size of the array is then

      size = (hi_index1-low_index1+1)*(hi_index2-low_index2+1)

It is quite common in Fortran to declare arrays that are larger than the matrix we want to store. (This is because Fortran 77 does not have dynamic storage allocation.) This is perfectly legal. Example:

      real A(3,5)
      integer i,j
c
c We will only use the leading 3 by 3 part of this array
c (its first three columns).
c
      do 20 j = 1, 3
         do 10 i = 1, 3
            A(i,j) = real(i)/real(j)
  10     continue
  20  continue

The elements in the submatrix A(1:3,4:5) are undefined. Do not assume these elements are initialized to zero by the compiler (some compilers will do this, but not all).

Storage format for 2-dimensional arrays

Fortran stores higher dimensional arrays as a contiguous sequence of elements. It is important to know that 2-dimensional arrays are stored by column. So in the above example, array element (1,2) will follow element (3,1). Then follows the rest of the second column, thereafter the third column, and so on.

Consider again the example where we only use the leading 3 by 3 submatrix of the 3 by 5 array A(3,5). The 9 interesting elements will then be stored in the first nine memory locations, while the last six are not used. This works out neatly because the leading dimension is the same for both the array and the matrix we store in the array. However, frequently the leading dimension of the array will be larger than the first dimension of the matrix. Then the matrix will not be stored contiguously in memory, even if the array is contiguous. For example, suppose the declaration was A(5,3) instead.
Then there would be two "unused" memory cells between the end of one column and the beginning of the next column (again we are assuming the matrix is 3 by 3).

This may seem complicated, but actually it is quite simple when you get used to it. If you are in doubt, it can be useful to look at how the address of an array element is computed. Each array will have some memory address assigned to the beginning of the array, that is element (1,1). The address of element (i,j) is then given by

      addr[A(i,j)] = addr[A(1,1)] + (j-1)*lda + (i-1)

where lda is the leading dimension of A, i.e. the number of rows in its declaration. Note that lda is in general different from the actual matrix dimension. Many Fortran errors are caused by this, so it is very important that you understand the distinction!

Multi-dimensional arrays

Fortran 77 allows arrays of up to seven dimensions. The syntax and storage format are analogous to the two-dimensional case.

The dimension statement

There is an alternate way to declare arrays in Fortran 77. The statements

      real A, x
      dimension x(50)
      dimension A(10,20)

are equivalent to

      real A(10,20), x(50)

This dimension statement is considered old-fashioned style today.
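
The address formula given earlier in this section for two-dimensional arrays can be checked with a short sketch. It relies on the equivalence statement, which this tutorial has not introduced and which is used here only to view the storage of A as a one-dimensional array v. The program name colmaj and the names v, lda, n, and k are assumptions made for this illustration:

```fortran
      program colmaj
      integer lda, n
      parameter (lda = 5, n = 3)
      real A(lda,n), v(lda*n)
      equivalence (A(1,1), v(1))
      integer i, j, k

c Fill the whole array so that no element is left undefined.
      do 20 j = 1, n
         do 10 i = 1, lda
            A(i,j) = real(10*i + j)
  10     continue
  20  continue

c Element (i,j) of A is element (j-1)*lda + i of v.
c Only the 3 by 3 matrix stored in the 5 by 3 array is visited.
      do 40 j = 1, n
         do 30 i = 1, 3
            k = (j-1)*lda + i
            write (*,*) i, j, A(i,j), v(k)
  30     continue
  40  continue

      stop
      end
```

On each output line the last two values should be equal. The index k uses lda = 5, the declared number of rows, and not 3, the number of rows of the matrix actually stored; this is exactly the distinction stressed above.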