Parallel computing

1 Preamble

Import the Python modules needed to run the analysis.

# Imports
import os
import multiprocessing as mp
import pkg_resources
import time

import numpy as np
import pandas as pd
from tabulate import tabulate

import riskmapjnr as rmj

Increase the cache for GDAL to increase computational speed.

# GDAL
os.environ["GDAL_CACHEMAX"] = "1024"

Set the PROJ_LIB environmental variable.

os.environ["PROJ_LIB"] = "/home/ghislain/.pyenv/versions/miniconda3-latest/envs/conda-rmj/share/proj"

Create a directory to save results.

out_dir = "outputs_parallel"
rmj.make_dir(out_dir)

Load forest data.

fcc_file = pkg_resources.resource_filename("riskmapjnr", "data/fcc123_GLP.tif")
print(fcc_file)
border_file = pkg_resources.resource_filename("riskmapjnr", "data/ctry_border_GLP.gpkg")
print(border_file)
/home/ghislain/Code/riskmapjnr/riskmapjnr/data/fcc123_GLP.tif
/home/ghislain/Code/riskmapjnr/riskmapjnr/data/ctry_border_GLP.gpkg

2 Sequential computing

We set parallel argument to False in the call to makemap() function.

start_time = time.time()
results_makemap = rmj.makemap(
    fcc_file=fcc_file,
    time_interval=[10, 10],
    output_dir=out_dir,
    clean=False,
    dist_bins=np.arange(0, 1080, step=30),
    win_sizes=np.arange(5, 48, 6),
    ncat=30,
    parallel=False,
    ncpu=None,
    methods=["Equal Interval", "Equal Area"],
    csize=40,
    no_quantity_error=True,
    figsize=(6.4, 4.8),
    dpi=100,
    blk_rows=128,
    verbose=True)
sec_seq = time.time() - start_time
print('Time Taken:', time.strftime("%H:%M:%S",time.gmtime(sec_seq)))
Model calibration and validation
.. Model 0: window size = 5, slicing method = ei.
.. Model 1: window size = 5, slicing method = ea.
.. Model 2: window size = 11, slicing method = ei.
.. Model 3: window size = 11, slicing method = ea.
.. Model 4: window size = 17, slicing method = ei.
.. Model 5: window size = 17, slicing method = ea.
.. Model 6: window size = 23, slicing method = ei.
.. Model 7: window size = 23, slicing method = ea.
.. Model 8: window size = 29, slicing method = ei.
.. Model 9: window size = 29, slicing method = ea.
.. Model 10: window size = 35, slicing method = ei.
.. Model 11: window size = 35, slicing method = ea.
.. Model 12: window size = 41, slicing method = ei.
.. Model 13: window size = 41, slicing method = ea.
.. Model 14: window size = 47, slicing method = ei.
.. Model 15: window size = 47, slicing method = ea.
Deriving risk map for full historical period
Time Taken: 00:02:06

3 Parallel computing

We use parallel computing using several CPUs. We set parallel argument to True in the call to makemap() function and set ncpu to mp.cpu_count() to use the maximum number of available CPUs (here 8). When using parallel computing, one CPU is used for each window size.

ncpu = mp.cpu_count()
print(f"Number of CPUs to use: {ncpu}.")
Number of CPUs to use: 8.
start_time = time.time()
results_makemap = rmj.makemap(
    fcc_file=fcc_file,
    time_interval=[10, 10],
    output_dir=out_dir,
    clean=False,
    dist_bins=np.arange(0, 1080, step=30),
    win_sizes=np.arange(5, 48, 6),
    ncat=30,
    parallel=True,
    ncpu=ncpu,
    methods=["Equal Interval", "Equal Area"],
    csize=40,
    no_quantity_error=True,
    figsize=(6.4, 4.8),
    dpi=100,
    blk_rows=128,
    verbose=True)
sec_par = time.time() - start_time
print('Time Taken:', time.strftime("%H:%M:%S",time.gmtime(sec_par)))
Model calibration and validation
.. Model 0: window size = 5, slicing method = ei.
.. Model 2: window size = 11, slicing method = ei.
.. Model 6: window size = 23, slicing method = ei.
.. Model 4: window size = 17, slicing method = ei.
.. Model 12: window size = 41, slicing method = ei.
.. Model 8: window size = 29, slicing method = ei.
.. Model 14: window size = 47, slicing method = ei.
.. Model 10: window size = 35, slicing method = ei.
.. Model 1: window size = 5, slicing method = ea.
.. Model 3: window size = 11, slicing method = ea.
.. Model 7: window size = 23, slicing method = ea.
.. Model 15: window size = 47, slicing method = ea.
.. Model 5: window size = 17, slicing method = ea.
.. Model 13: window size = 41, slicing method = ea.
.. Model 9: window size = 29, slicing method = ea.
.. Model 11: window size = 35, slicing method = ea.
Deriving risk map for full historical period
Time Taken: 00:00:45

4 Results

Sequential computing took 02m 06s against 45s for parallel computing with 8 CPUs when considering 8 window sizes.