# Code for the paper "Efficient Combination of Rematerialization and Offloading for Training DNNs"

## Contents:
+ `rotor-offload_all.tar.gz` contains the code for the rotor framework
  in which the algorithms are implemented.
+ `simulate.py` is the main simulation script.
+ `do_run.sh` is a small script that produces the results used for the
  figures in the paper.
+ `simulate.R` produces the plots from the results.
+ `plots` contains the results and the plots.
+ `get_chains.py` and `measure_BW.py` are used to perform the
  measurements of networks and bandwidth
+ `all_chains.py` contains the measurements we obtained on our machine


## How to use
+ First, install `rotor`, preferably inside a virtual environment.
  `rotor` depends on `pytorch` and can be installed by running
  `python3 setup.py install` in the `rotor` directory.

+ Then `simulate.py` can be used to perform the experiments. Run
  without arguments, it simulates all measured networks; with
  arguments, it restricts the simulations to the selected cases.
  `simulate.py -h` prints the available options.

  The output of `simulate.py` contains one line per simulation (for a
  given network, target memory, bandwidth, and algorithm). Each line
  contains: the network description, the memory ratio, the memory
  limit, the bandwidth, the algorithm name, the duration of the
  produced sequence, the memory used, the total computation in the
  sequence, the total data transferred, and the time required to
  execute the algorithm.

+ The `do_run.sh` script runs the simulations of the paper. It takes
  one argument: the directory in which the results should be stored.
  It runs the simulations in parallel with 4 processes, and takes
  about 1 hour of computation time on a 4-core machine.

+ The plots are produced with R, and require the `ggplot2` and `plyr`
  packages. The command `Rscript simulate.R directory/` reads the data
  contained in `directory` and writes the plots shown in the paper to
  that same `directory`.
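
For further processing outside of R, the per-simulation output lines of `simulate.py` described above could be loaded into Python records. The sketch below is only an illustration and is not part of the repository. It assumes that the fields appear in the order listed above, that they are separated by whitespace, that only the network description may contain spaces, and that every field other than the network description and the algorithm name parses as a number. The names `SimulationResult` and `parse_line` are hypothetical, so check the actual output of `simulate.py` before relying on it.

```python
from dataclasses import dataclass


@dataclass
class SimulationResult:
    """One line of simulate.py output (field order is an assumption)."""
    network: str              # network description (may contain spaces)
    memory_ratio: float
    memory_limit: float
    bandwidth: float
    algorithm: str
    duration: float           # duration of the produced sequence
    memory_used: float
    total_computation: float  # total computation in the sequence
    data_transferred: float   # total data transferred
    solve_time: float         # time required to execute the algorithm


def parse_line(line: str) -> SimulationResult:
    """Parse one output line under the assumptions stated above."""
    # ASSUMPTION: whitespace-separated fields; the last nine fields are
    # single tokens, and everything before them is the network description.
    parts = line.strip().rsplit(maxsplit=9)
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}: {line!r}")
    network, ratio, limit, bandwidth, algorithm, *rest = parts
    duration, mem_used, computation, transferred, solve_time = map(float, rest)
    return SimulationResult(
        network=network,
        memory_ratio=float(ratio),
        memory_limit=float(limit),
        bandwidth=float(bandwidth),
        algorithm=algorithm,
        duration=duration,
        memory_used=mem_used,
        total_computation=computation,
        data_transferred=transferred,
        solve_time=solve_time,
    )
```

As a usage example, with a made-up line (not real output), `parse_line("resnet 18 0.5 1000 12.0 some_algo 3.5 900 3.0 120 0.25")` returns a record whose `network` is `"resnet 18"` and whose `algorithm` is `"some_algo"`. If the real output uses another separator or non-numeric placeholders, the split and the conversions need to be adapted.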