Matlab is capable of running parallel jobs on Flux. The most common parallel jobs are those in which the computational task can be broken into independent calculations, each of which is then carried out by a different processor.
One common example is when there are many files, each of which needs to have the same operations performed on its data. For an example of such a scenario, which is used to illustrate the parallel methods explained below, see the Spectral Processing Example web page.
Another common example is a simulation or analytic technique that relies on Monte Carlo methods, either randomly sampling from data or randomly generating data, where each iteration of the simulation can be carried out independently of the others. See the Matlab web page on benchmarking parallel jobs using parfor.
There are two different schemes for handling situations like these. The first scheme creates a pool of worker Matlabs within the current job and assigns each a portion of the work. This scheme differs slightly depending on whether the job includes more than one node. The second scheme breaks the work into separate, batch-processed, single-core jobs, which Matlab then submits for you. This is more typically used when Matlab is run on a workstation that can submit jobs remotely to the cluster.
Parallel for loops
A for loop performs some computation once for every entry in a list, where the entries are typically a sequence of positive integers.
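As a point of reference for the parallel version, here is a minimal sketch of an ordinary for loop. The variable names and the computation are made up for illustration only.

```matlab
% Square each integer from 1 to 5 with an ordinary (serial) for loop.
n = 5;
squares = zeros(1, n);    % preallocate the result vector
for i = 1:n
    squares(i) = i^2;     % one computation per entry in the list 1:n
end
disp(squares)
```

Each pass through this loop depends only on its own index, which is the property that lets a loop be run in parallel.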
A parallel for loop is one where each step in the loop is assigned to a processor, and as each processor completes its computation, it is assigned a new step until all the steps have been completed. This is done using a pool of workers, where a worker represents one processor able to do work. When you define the pool of workers, you specify a cluster profile that tells Matlab where those processors are.
Parallel for loops in Matlab are implemented using the parfor command, but are syntactically the same as an ordinary for loop, which makes this quite easy and natural to use in many contexts. The parfor command uses the currently defined pool, or, if none exists, it attempts to create one. We highly recommend that you explicitly create and delete the pool as you need it.
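The pattern of creating a pool, running a parfor loop, and deleting the pool might look like the following sketch. It assumes a Matlab release that provides parpool (R2013b or later) and uses the default profile; the pool size, the trial counts, and the Monte Carlo estimate of pi are illustrative choices, not taken from the Flux examples.

```matlab
% Illustrative example: estimate pi by Monte Carlo, one independent trial
% per loop iteration.
nWorkers = 2;      % assumed pool size; match the cores your job requested
nTrials  = 100;    % number of independent loop iterations
nPoints  = 1e5;    % random points drawn in each iteration

pool = parpool(nWorkers);          % explicitly create the pool (default profile)

estimates = zeros(1, nTrials);     % sliced output: one entry per iteration
parfor t = 1:nTrials
    xy = rand(nPoints, 2);                   % points in the unit square
    inside = sum(sum(xy.^2, 2) <= 1);        % points inside the quarter circle
    estimates(t) = 4 * inside / nPoints;     % this iteration's estimate of pi
end

delete(pool);                      % explicitly release the workers

fprintf('Mean estimate of pi: %.4f\n', mean(estimates));
```

Apart from the parfor keyword and the pool management around it, the loop body is written exactly as it would be for an ordinary for loop.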
If the workers are all to be on the same machine, then the profile used will be the local profile. For more details, see the Matlab parfor using the local profile web page, where both the image processing and the blackjack example are shown.
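A minimal sketch of opening a pool with the local profile is shown below. The cap of four workers and the loop body are assumptions made for illustration; the linked page remains the reference for the actual examples.

```matlab
% Open a pool whose workers all run on the current machine.
c = parcluster('local');            % cluster object for the local profile
nWorkers = min(4, c.NumWorkers);    % assumed cap of 4; NumWorkers is the profile's limit
pool = parpool(c, nWorkers);

results = zeros(1, 8);
parfor k = 1:8
    results(k) = sum(1:k);          % each iteration is independent of the others
end

delete(pool);                       % release the workers when done
disp(results)
```

On a cluster node, the number of workers would normally be chosen to match the number of cores requested for the job rather than a fixed cap.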
If the workers can be on multiple machines, then the profile used will be the current profile, which is all of the processors assigned to the current PBS job. For more details, see the Matlab parfor using the current profile web page.
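The following is only a sketch of how a pool might be opened with the current profile from inside a PBS job. It assumes that the PBS_NP environment variable holds the number of processors assigned to the job and that the pool may use all of them; any site-specific setup required before the profile is available is not shown, so consult the linked page for the exact procedure.

```matlab
% Assumption: PBS_NP is set by the scheduler to the job's processor count.
np = str2double(getenv('PBS_NP'));
if isnan(np)
    error('PBS_NP is not set; this sketch must run inside a PBS job.');
end

pool = parpool('current', np);   % 'current' is the profile named above

results = zeros(1, 8);
parfor k = 1:8
    results(k) = sum(1:k);       % workers may be on different nodes
end

delete(pool);                    % release the workers when done
```

The loop itself is unchanged from the single-machine case; only the profile passed to parpool differs.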
Distributed PBS jobs
In the distributed PBS job scheme, you specify a function for Matlab to execute, and Matlab then submits a PBS job for each instance of the function. For more details, see the Matlab distributed jobs web page. This scheme is used very infrequently on Flux.
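In general terms, this scheme can be sketched with the Parallel Computing Toolbox job and task functions. The example assumes that the default cluster profile has already been configured to submit to the cluster scheduler, which is not shown here; the function (sum) and its inputs are placeholders, and how tasks map onto PBS jobs depends on that profile.

```matlab
% Assumption: the default profile points at the cluster's scheduler.
c = parcluster();
job = createJob(c);                       % an independent job: a container for tasks

% One task per instance of the function; here, sum over 1:10, 1:20, 1:30, 1:40.
for k = 1:4
    createTask(job, @sum, 1, {1:(10 * k)});   % 1 output argument, one input
end

submit(job);                              % hand the tasks to the scheduler
wait(job);                                % block until every task has finished
out = fetchOutputs(job);                  % cell array with one row per task
delete(job);                              % remove the job's stored data

disp(cell2mat(out))
```

Unlike the parfor scheme, no pool is created; each task runs as its own unit of work and the results are collected afterwards.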