Compositional Zero-Shot Learning experiments
============================================

Code for running the compositional zero-shot learning experiments in the paper
[Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators](https://openreview.net/forum?id=m6HNNpQO8dc),
on the MIT-States dataset using a TMN-based architecture.

This repository was forked from [facebookresearch/taskmodularnets](https://github.com/facebookresearch/taskmodularnets) under the Creative Commons Attribution Non-Commercial 4.0 International license.
The upstream repository contains the official code for the following paper:<br/>
Task-Driven Modular Networks for Zero-Shot Compositional Learning<br/>
*Senthil Purushwalkam, Maximilian Nickel, Abhinav Gupta, Marc'Aurelio Ranzato*<br/>
*arXiv preprint arXiv:1905.05908*<br/>
[Webpage](http://www.cs.cmu.edu/~spurushw/projects/compositional.html) | [Paper](https://arxiv.org/abs/1905.05908) <br/>


Installation
------------

```bash
# Pick a name for the new environment
ENVNAME=taskmodularnets

# Create the Python 3.9 conda environment by name, with pip installed
# (-n creates a named environment, so that "conda activate $ENVNAME" below can find it)
conda create -n "$ENVNAME" -q python=3.9.7 pip

# Activate the environment
conda activate "$ENVNAME"

# Install dependencies
pip install -r requirements.txt

# Download data
bash utils/download_data.sh
```


Training and evaluation
-----------------------

To generate our final results (TMNx architecture, Table 3), run the following bash code.

Each job uses a single GPU and takes between 2 and 4.5 hours on a Tesla T4.
The runtime varies between jobs because it depends on how many epochs are needed to reach peak performance, which is typically around 5.
Training uses early stopping with a patience of 5 epochs to make sure the peak has been found, so each job typically trains for around 10 epochs before it exits.
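
The interaction between the peak epoch and the patience can be sketched as follows. This is only an illustration of patience-based early stopping in general, not the logic in `train_modular.py`; the `SCORES` values are made up for the example, and the variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of patience-based early stopping (assumed behaviour, not the repository's code).
PATIENCE=5

# Hypothetical per-epoch validation scores, invented for illustration; the peak is at epoch 5.
SCORES=(10 14 17 19 20 19 18 19 18 17 16 15)

BEST=-1        # best score seen so far
BEST_EPOCH=0   # epoch at which the best score was seen
EPOCH=0

for S in "${SCORES[@]}"; do
    EPOCH=$((EPOCH + 1))
    # Record a new best score and the epoch it occurred at
    if (( S > BEST )); then
        BEST=$S
        BEST_EPOCH=$EPOCH
    fi
    # Stop once PATIENCE epochs have passed without an improvement
    if (( EPOCH - BEST_EPOCH >= PATIENCE )); then
        break
    fi
done

echo "best epoch: ${BEST_EPOCH}, stopped after epoch: ${EPOCH}"
```

With a peak at epoch 5 and a patience of 5, the loop exits after epoch 10, which matches the typical job length described above.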

The bash script `run_TMNx.sh` trains the model with `train_modular.py`, then uses `test_modular.py` to evaluate the model that performed best on the validation set against the test set, for the top-k values k=1,2,3.

Note that the final results in the paper use a constant width of 96 for every activation function, which gave more similar parameter counts across models than mixing widths of 64 and 96.

```bash
# Generating main results (TMNx architecture, Table 3)
# Results in the paper are shown with a constant width of 96

LR=3e-3
LRG="$LR"
WD=1e-5

for SEED in {10..14};
do

    HWIDTH=96
    for ACTFUN in relu max_min_dup ail_and_or_dup ail_or_xnor_dup ail_and_or_xnor_dup max ail_or ail_xnor ail_or_xnor_part ail_and_or_xnor_part
    do
        run_TMNx.sh \
            "txn-${HWIDTH}-${ACTFUN}-lr,${LR}-wd,${WD}-lrg,${LRG}-${SEED}" \
            "${HWIDTH}" \
            --lr "$LR" \
            --lrg "$LRG" \
            --wd "$WD" \
            --gater_actfun "$ACTFUN" \
            --module_actfun "$ACTFUN" \
            --module_type "extra" \
            --seed "$SEED"
    done

done
```
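
Before committing GPU time, it may help to see how many jobs the loop above launches. The following dry run is a sketch that repeats the same seeds, activation functions, and hyperparameters, but only prints the run names and a rough compute estimate instead of calling `run_TMNx.sh`. The `ACTFUNS` and `N_JOBS` variables are introduced here for illustration.

```bash
#!/usr/bin/env bash
# Dry run: list the run names the TMNx loop would create, without training anything.
LR=3e-3
LRG="$LR"
WD=1e-5
HWIDTH=96
ACTFUNS=(relu max_min_dup ail_and_or_dup ail_or_xnor_dup ail_and_or_xnor_dup
         max ail_or ail_xnor ail_or_xnor_part ail_and_or_xnor_part)

N_JOBS=0
for SEED in {10..14}; do
    for ACTFUN in "${ACTFUNS[@]}"; do
        # Same run-name pattern as the first argument to run_TMNx.sh above
        echo "txn-${HWIDTH}-${ACTFUN}-lr,${LR}-wd,${WD}-lrg,${LRG}-${SEED}"
        N_JOBS=$((N_JOBS + 1))
    done
done

# Runtime bounds from the 2 to 4.5 hours per job quoted above (Tesla T4).
# Work in half-hours to stay in integer arithmetic: 2 h = 4 halves, 4.5 h = 9 halves.
MIN_HOURS=$((N_JOBS * 4 / 2))
MAX_HOURS=$((N_JOBS * 9 / 2))
echo "jobs: ${N_JOBS}"
echo "sequential GPU-hours: ${MIN_HOURS} to ${MAX_HOURS}"
```

With 5 seeds and 10 activation functions this comes to 50 jobs, or roughly 100 to 225 GPU-hours if they are run one after another on a single T4.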

To generate our preliminary results (TMN architecture, Table 8), run the following bash code.

Note that the results shown in the paper use a width of 64 or 96, depending on the activation function.

```bash
# Generating preliminary results (TMN architecture, Table 8 of the paper)
# Results in the paper are shown with a width of either 64 or 96, depending on activation function

for SEED in {0..4};
do

    HWIDTH=64
    for ACTFUN in relu max_min_dup ail_and_or_dup ail_or_xnor_dup ail_and_or_xnor_dup
    do
        run_TMN.sh \
            "tmn3-${HWIDTH}-${ACTFUN}-${SEED}" \
            "${HWIDTH}" \
            --gater_actfun "$ACTFUN" \
            --module_actfun "$ACTFUN" \
            --seed "$SEED"
    done

    HWIDTH=96
    for ACTFUN in max ail_or ail_xnor ail_or_xnor_part ail_and_or_xnor_part
    do
        run_TMN.sh \
            "tmn3-${HWIDTH}-${ACTFUN}-${SEED}" \
            "${HWIDTH}" \
            --gater_actfun "$ACTFUN" \
            --module_actfun "$ACTFUN" \
            --seed "$SEED"
    done

done
```
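
The pairing of widths and activation functions in the loop above can also be written as a lookup, which makes it easier to check which width a given run should use. This is a sketch: `width_for_actfun` is a hypothetical helper, not a function provided by the repository, and it only restates the two groups from the loop.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the hidden width used for an activation function
# in the preliminary TMN experiments, following the two groups in the loop above.
width_for_actfun() {
    case "$1" in
        relu|max_min_dup|ail_and_or_dup|ail_or_xnor_dup|ail_and_or_xnor_dup)
            echo 64 ;;
        max|ail_or|ail_xnor|ail_or_xnor_part|ail_and_or_xnor_part)
            echo 96 ;;
        *)
            # Anything not listed in the loop is rejected rather than guessed
            echo "unknown activation function: $1" >&2
            return 1 ;;
    esac
}

# Example: print the width for a few activation functions
for ACTFUN in relu max ail_or_xnor_part; do
    echo "${ACTFUN}: width $(width_for_actfun "$ACTFUN")"
done
```

A single loop over all ten activation functions could then set `HWIDTH="$(width_for_actfun "$ACTFUN")"` before building the run name, instead of keeping two separate inner loops.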


Generate tables
---------------

To generate the tables shown in the paper, run the notebooks `notebook/TMNx-results.ipynb` and `notebook/TMN-results.ipynb`.

```bash
python -m pip install matplotlib jupyterlab pandas
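# Run from the repository root; the notebooks are in the notebook/ directory.
# Each command blocks until JupyterLab is shut down.
python -m jupyterlab "notebook/TMNx-results.ipynb"
python -m jupyterlab "notebook/TMN-results.ipynb"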
```

