This page gives a brief overview of the process of compiling and running software. The first section provides a very basic example program in the C language and shows the most basic compilation commands.

The second section reviews the most common types of files that might be encountered.

The third section will walk you through downloading, unpacking, configuring, and compiling a small, open source software program.

We also have a separate page that provides more details on building software with external libraries, which are software programs designed to be included in other software.

The general procedure for compiling software is

  1. Write or obtain source code
  2. Compile source code
  3. Run program

Any time you make changes to the source code, you must recompile the executable.

The most basic example: print a message

Source code

Since software is typically designed to perform work and report the results to the user, the first example program is often how to print output (at least for software that produces text output). Here is a version of the most famous first programming example. This is an example of source code.

#include <stdio.h>

int main(void) {
    printf("Hello, world.\n");
    return 0;
}

This is not a tutorial on programming, so suffice it to say that the above example is a valid C program. Given that it is, how do you go from that text to something you can run on the computer?

It is generally recommended that you create directories for different kinds of files. What you will see here is the author’s personal scheme. You should ask others and pick something that works for you. When you are just beginning, it is probably best to pick something and use it for a while before choosing a different scheme so that you have some experience to judge from.

The file into which source code is put is called a source file. A programming project often includes hundreds of files, and it is a good idea to keep those separate from the finished, runnable programs. In addition, you probably do not want to mix the source files from two different projects. So, please create in your home directory a directory (also called a folder) called src, and inside that create another folder called hello. We put the source files for this into /sw/examples/compiling/hello so you can copy them, as shown in this example.

Please note: In examples, the $ indicates the prompt and is not to be typed if you copy and paste from this web page. This is done to more easily distinguish input from output.

$ mkdir src
$ cd src
$ cp -r /sw/examples/compiling/hello ./
$ cd hello
$ ls

Compiling the source

A compiler is the program that converts source code to an executable, which is the generic term for a file that can be run as a program. We will first use the compiler that comes standard on most Linux systems, GCC. To compile hello.c, you run the compiler command followed by the source file name.

$ gcc hello.c
$ ls
a.out  hello.c

By default, the output file, which is the runnable program, will be called a.out. You can run it using

$ ./a.out
Hello, world.

You can continue to use a.out, but most people prefer to give their programs more descriptive or meaningful names. You do this by providing an option to gcc to tell it the name of the output file, as shown next.

$ gcc -o hello hello.c
$ ls
a.out  hello  hello.c
$ ./hello
Hello, world.

Header files and libraries

Real software usually needs many files. The most commonly encountered types are header files, which typically contain declarations and definitions, and library files, which contain compiled, runnable code. In your source code, you need to tell the compiler which header files you wish included with your software.

This line

#include <stdio.h>

tells the compiler to look for an external file called stdio.h and include its definitions. This is one of the files from the C standard library, so the compiler knows where to find it.

Here is a slightly more complicated program. It prints a table of 9 values of the sine function, at evenly spaced angles from 0 to 2π.

#include <stdio.h>
#include <math.h>

int main(void) {
    int i;

    /* i runs from 0 to 8, so the angle runs from 0 to 2*pi in steps of pi/4 */
    for (i = 0; i <= 8; i++) {
        printf("This is the sin of %f  is:  %12.8f\n",
               M_PI_4 * (double)i, sin(M_PI_4 * (double)i));
    }
    return 0;
}
The file math.h contains the definition of M_PI_4, which is π divided by 4 (about 0.785398). As with stdio.h, the header file math.h is in a location already known to the compiler. Let’s try compiling this.

$ gcc -o print_sin print_sin.c
/tmp/ccYvKEH3.o: In function `main':
print_sin.c:(.text+0x34): undefined reference to `sin'
collect2: error: ld returned 1 exit status

The compiler is telling us that we are using a function, sin, that hasn’t been defined. It is defined in the standard mathematics library, which is a collection of commonly used mathematical functions. The compiler knows where that library (and many more) is kept, but it doesn’t know you want to use it until you tell it so. You do that on the compilation line with a -l option followed by the name of the library, which for the mathematics library is m. (The option is typically read as ‘minus ell’.)

$ gcc -o print_sin print_sin.c -lm
$ ./print_sin
This is the sin of 0.000000  is:    0.00000000
This is the sin of 0.785398  is:    0.70710678
This is the sin of 1.570796  is:    1.00000000
This is the sin of 2.356194  is:    0.70710678
This is the sin of 3.141593  is:    0.00000000
This is the sin of 3.926991  is:   -0.70710678
This is the sin of 4.712389  is:   -1.00000000
This is the sin of 5.497787  is:   -0.70710678
This is the sin of 6.283185  is:   -0.00000000

Specifying file locations

It is quite common to use a library whose header files or library files are in a location the compiler does not know about. You indicate those locations with options on the compiler command line.

Include files

The directory location of header files is specified using the -I /path/to/header/files option. You may be able to remember this more easily by thinking of it as the path to files to be included. More than one -I option can be included on the command line, if needed. The directories are searched in the order given, so the most important should be listed first.

Library files

The directory location of library files is similarly specified using the -L /path/to/library/files option. Remember that, as we noted above, two options are needed for libraries: one option to tell the compiler which libraries are needed (-l) and one to tell it where they are located (-L), if needed. These are sometimes called “little ell” and “big ell” options.

Here is an example compiler command using all three options. This uses a source file from /sw/examples/compiling-software called gsl_example.c, which you can copy. The program demonstrates the effect of the order in which elements of matrices are stored in memory. We also show the use of the backslash character to continue a long command on multiple lines.

$ gcc -o gsl_example gsl_example.c \
>    -I /sw/arcts/centos7/gsl/2.1/include \
>    -L /sw/arcts/centos7/gsl/2.1/lib \
>    -lgsl -lgslcblas
$ ./gsl_example
./gsl_example: error while loading shared libraries: cannot open shared object file: No such file or directory

Running a program is like compiling one in this respect: if the program uses a library that is not in a standard location, you must tell the system where to find that library. You do this by adding the library’s location to an environment variable, LD_LIBRARY_PATH, prior to running the program. The LD_LIBRARY_PATH variable can contain many locations, which are separated by colons. You usually want to put your library directory at the front, followed by the previous value of LD_LIBRARY_PATH. Here is an example of doing this for only one invocation of the new program.

$ LD_LIBRARY_PATH=/sw/arcts/centos7/gsl/2.1/lib:$LD_LIBRARY_PATH \
>    ./gsl_example
gsl BLAS, row-major:
| 2, 2 |
| 0, 0 |
gsl CBLAS, row-major:
| 2, 2 |
| 0, 0 |
gsl CBLAS, column-major:
| 2, 1 |
| 1, 0 |

For most libraries installed on the clusters and made available via modules, the LD_LIBRARY_PATH is set for you, and setting it will not be necessary if you have the module loaded. In addition, named variables are typically defined to hold the include and library paths. Please see our page on Using libraries for more information.