# Code for the NeurIPS 2022 submission: <br><br>"LAMP: Extracting Text from Gradients with Language Model Priors"
## Prerequisites
- Install Anaconda. 
- Create the conda environment:<br>
> conda env create -f environment.yml
- Enable the created environment:<br>
> conda activate lamp
- Download the GPT-2 model trained with the BERT tokenizer:<br>
> wget https://dl.fbaipublicfiles.com/text-adversarial-attack/transformer_wikitext-103.pth
- Download the fine-tuned models for Table 1 from:<br>
> https://drive.google.com/file/d/1gffraG72oSqHZlfY3vKrYaCsGjdHMmDT/view?usp=sharing
- Unzip the downloaded archive in the same directory.
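
Before running the experiments, it can help to confirm that the downloads from the steps above are in place. The following is a minimal sketch, not part of the repository: the helper `missing_prerequisites` and the list `EXPECTED_PATHS` are hypothetical names, and the paths are assumptions inferred from the `wget` file name and from the `models/...` values listed for *BERT\_PATH* below.

```python
from pathlib import Path

# Assumed layout: the wget file name from the step above, plus the
# models/... directories that this README lists as BERT_PATH values.
EXPECTED_PATHS = [
    "transformer_wikitext-103.pth",
    "models/bert-base-finetuned-cola",
    "models/bert-base-finetuned-sst2",
    "models/bert-base-finetuned-rottentomatoes",
]


def missing_prerequisites(root="."):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    # Print one line per missing file or directory; no output means all were found.
    for path in missing_prerequisites():
        print(f"missing: {path}")
```

If the archive unpacks to a different layout, adjust `EXPECTED_PATHS` accordingly.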

## Main experiments (Table 1)

### Parameters
- *DATASET* - the dataset to use. Must be one of **cola**, **sst2**, **rotten_tomatoes**.
- *BATCH\_SIZE* - the batch size to use, e.g. **1**.
- *BERT\_PATH* - the language model to attack. Must be one of **bert-base-uncased** (BERT<sub>BASE</sub>); **models/bert-base-finetuned-cola**, **models/bert-base-finetuned-sst2**, or **models/bert-base-finetuned-rottentomatoes** (the three fine-tuned BERT<sub>BASE</sub>-FT models); **huawei-noah/TinyBERT_General_4L_312D** (TinyBERT<sub>4</sub>); or **huawei-noah/TinyBERT_General_6L_768D** (TinyBERT<sub>6</sub>).

### Commands
- To run the experiment on LAMP with cosine loss:<br>
> python3 attack.py --dataset DATASET --split test --loss cos --n_inputs 100 -b BATCH_SIZE --coeff_perplexity 0.2 --coeff_reg 1 --lr 0.01 --lr_decay 0.89 --bert_path BERT_PATH
- To run the experiment on LAMP with L1+L2 loss:<br>
> python3 attack.py --dataset DATASET --split test --loss tag --n_inputs 100 -b BATCH_SIZE --coeff_perplexity 60 --coeff_reg 25 --lr 0.01 --lr_decay 0.89 --tag_factor 0.01 --bert_path BERT_PATH
- To run the experiment on TAG:<br>
> python3 attack.py --baseline --dataset DATASET --split test --loss tag --n_inputs 100 -b BATCH_SIZE --lr 0.1 --lr_decay 1 --tag_factor 0.01 --bert_path BERT_PATH
- To run the experiment on DLG:<br>
> python3 attack.py --baseline --dataset DATASET --split test --loss dlg --n_inputs 100 -b BATCH_SIZE --lr 0.1 --lr_decay 1 --bert_path BERT_PATH
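
The four commands above differ only in a handful of flags, so they can be generated from one table. The sketch below is an illustration, not part of the repository: `COMMON`, `METHODS`, and `build_command` are hypothetical names, and the flag values are copied from the commands above. It only prints the commands. The flags appear in a different order than above, which assumes that `attack.py` parses its options independently of their order.

```python
import shlex

# Flags shared by all four Table 1 commands in this README.
COMMON = ["--split", "test", "--n_inputs", "100"]

# Per-method flags, copied from the commands above.
METHODS = {
    "lamp_cos": ["--loss", "cos", "--coeff_perplexity", "0.2", "--coeff_reg", "1",
                 "--lr", "0.01", "--lr_decay", "0.89"],
    "lamp_l1l2": ["--loss", "tag", "--coeff_perplexity", "60", "--coeff_reg", "25",
                  "--lr", "0.01", "--lr_decay", "0.89", "--tag_factor", "0.01"],
    "tag": ["--baseline", "--loss", "tag", "--lr", "0.1", "--lr_decay", "1",
            "--tag_factor", "0.01"],
    "dlg": ["--baseline", "--loss", "dlg", "--lr", "0.1", "--lr_decay", "1"],
}


def build_command(method, dataset, batch_size, bert_path):
    """Return the argument list for one Table 1 run."""
    return (["python3", "attack.py", "--dataset", dataset]
            + COMMON
            + ["-b", str(batch_size)]
            + METHODS[method]
            + ["--bert_path", bert_path])


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for method in METHODS:
        print(shlex.join(build_command(method, "cola", 1, "bert-base-uncased")))
```

To execute a command rather than print it, the returned list can be passed to `subprocess.run`.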


## Ablation study (Table 3)

### Parameters
- *DATASET* - the dataset to use. Must be one of **cola**, **sst2**, **rotten_tomatoes**.

### Commands
- To run the experiment on LAMP with cosine loss:<br>
> python3 attack.py --dataset DATASET --split test --loss cos --n_inputs 100 -b 1 --coeff_perplexity 0.2 --coeff_reg 1 --lr 0.01 --lr_decay 0.89 --bert_path bert-base-uncased
- To run the experiment on LAMP with L1+L2 loss:<br>
> python3 attack.py --dataset DATASET --split test --loss tag --n_inputs 100 -b 1 --coeff_perplexity 60 --coeff_reg 25 --lr 0.01 --lr_decay 0.89 --tag_factor 0.01 --bert_path bert-base-uncased
- To run the experiment on LAMP with L2 loss:<br>
> python3 attack.py --dataset DATASET --split test --loss dlg --n_inputs 100 -b 1 --coeff_perplexity 60 --coeff_reg 25 --lr 0.01 --lr_decay 0.89 --bert_path bert-base-uncased
- To run the experiment on LAMP without perplexity:<br>
> python3 attack.py --dataset DATASET --split test --loss cos --n_inputs 100 -b 1 --coeff_perplexity 0 --coeff_reg 1 --lr 0.01 --lr_decay 0.89 --bert_path bert-base-uncased
- To run the experiment on LAMP without regularisation:<br>
> python3 attack.py --dataset DATASET --split test --loss cos --n_inputs 100 -b 1 --coeff_perplexity 0.2 --coeff_reg 0 --lr 0.01 --lr_decay 0.89 --bert_path bert-base-uncased
- To run the experiment on LAMP without discrete optimisation:<br>
> python3 attack.py --dataset DATASET --split test --loss cos --n_inputs 100 -b 1 --coeff_perplexity 0.2 --coeff_reg 1 --lr 0.01 --lr_decay 0.89 --no-use_swaps --bert_path bert-base-uncased 
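
Each ablation above changes only a few options relative to the cosine-loss configuration. The sketch below makes those differences explicit as overrides on a base configuration. It is an illustration, not part of the repository: `BASE`, `ABLATIONS`, and `ablation_argv` are hypothetical names, and all values are copied from the commands above.

```python
# Options of the LAMP cosine-loss run, used as the reference configuration.
BASE = {"loss": "cos", "coeff_perplexity": "0.2", "coeff_reg": "1",
        "lr": "0.01", "lr_decay": "0.89"}

# What each ablation changes relative to BASE (None marks a switch without a value).
ABLATIONS = {
    "cosine": {},
    "l1_l2": {"loss": "tag", "coeff_perplexity": "60", "coeff_reg": "25",
              "tag_factor": "0.01"},
    "l2": {"loss": "dlg", "coeff_perplexity": "60", "coeff_reg": "25"},
    "no_perplexity": {"coeff_perplexity": "0"},
    "no_regularisation": {"coeff_reg": "0"},
    "no_discrete_optimisation": {"no-use_swaps": None},
}


def ablation_argv(name, dataset):
    """Return the argument list for one ablation run on the given dataset."""
    options = {**BASE, **ABLATIONS[name]}
    argv = ["python3", "attack.py", "--dataset", dataset,
            "--split", "test", "--n_inputs", "100", "-b", "1"]
    for key, value in options.items():
        argv.append(f"--{key}")
        if value is not None:
            argv.append(value)
    return argv + ["--bert_path", "bert-base-uncased"]


if __name__ == "__main__":
    # Dry run: show every ablation command for one dataset.
    for name in ABLATIONS:
        print(" ".join(ablation_argv(name, "cola")))
```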


## Gradient defense (Table 4)

### Parameters
- *SIGMA* - the amount of Gaussian noise with which to defend, e.g. **0.001**.

### Commands
- To run the experiment on LAMP with cosine loss:<br>
> python3 attack.py --dataset cola --split test --loss cos --n_inputs 100 -b 1 --coeff_perplexity 0.2 --coeff_reg 1 --lr 0.01 --lr_decay 0.89 --bert_path bert-base-uncased --defense_noise SIGMA
- To run the experiment on LAMP with L1+L2 loss:<br>
> python3 attack.py --dataset cola --split test --loss tag --n_inputs 100 -b 1 --coeff_perplexity 60 --coeff_reg 25 --lr 0.01 --lr_decay 0.89 --tag_factor 0.01 --bert_path bert-base-uncased --defense_noise SIGMA
- To run the experiment on TAG:<br>
> python3 attack.py --baseline --dataset cola --split test --loss tag --n_inputs 100 -b 1 --lr 0.1 --lr_decay 1 --tag_factor 0.01 --bert_path bert-base-uncased --defense_noise SIGMA
- To run the experiment on DLG:<br>
> python3 attack.py --baseline --dataset cola --split test --loss dlg --n_inputs 100 -b 1 --lr 0.1 --lr_decay 1 --bert_path bert-base-uncased --defense_noise SIGMA
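
To illustrate what the `--defense_noise` parameter controls conceptually, the sketch below perturbs a gradient with Gaussian noise. It is a toy example under stated assumptions, not the repository's implementation: it assumes *SIGMA* is the standard deviation of zero-mean noise added independently to every gradient entry, it uses a flat Python list in place of real model gradients, and `add_gaussian_noise` is a hypothetical name.

```python
import random


def add_gaussian_noise(gradient, sigma, seed=None):
    """Return a copy of a flat gradient with N(0, sigma^2) noise added to each entry.

    Assumption: sigma is a standard deviation; the actual defense may be defined differently.
    """
    rng = random.Random(seed)
    return [g + rng.gauss(0.0, sigma) for g in gradient]


if __name__ == "__main__":
    gradient = [0.5, -0.25, 0.125]  # made-up values for illustration only
    print(add_gaussian_noise(gradient, sigma=0.001, seed=0))
```

With `sigma=0.0` the gradient is returned unchanged, which corresponds to running the attack without the defense.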


## Fine-tuning BERT with and without defended gradients
### Parameters
- *DATASET* - the dataset to use. Must be one of **cola**, **sst2**, **rotten_tomatoes**.
- *SIGMA* - the amount of Gaussian noise with which to train, e.g. **0.001**. To train without the defense, set it to **0.0**.
- *NUM_EPOCHS* - the number of epochs to train for, e.g. **2**.

### Commands

- To train your own network:<br>
> python3 train.py --dataset DATASET --batch_size 32 --noise SIGMA --num_epochs NUM_EPOCHS --save_every 100

The trained models are stored under `finetune/DATASET/noise_SIGMA/STEPS`.
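
To see which checkpoints a training run has produced, the directory pattern above can be walked directly. The following is a minimal sketch, not part of the repository: `list_checkpoints` is a hypothetical helper, and it assumes that every entry three levels below `finetune` follows the `DATASET/noise_SIGMA/STEPS` pattern. The noise and step labels are returned as strings exactly as they appear on disk, since this README does not specify how they are formatted.

```python
from pathlib import Path


def list_checkpoints(root="finetune"):
    """Return (dataset, noise_label, steps) for each root/DATASET/noise_SIGMA/STEPS entry."""
    found = []
    for path in sorted(Path(root).glob("*/noise_*/*")):
        dataset, noise_dir, steps = path.relative_to(root).parts
        # Strip the "noise_" prefix to recover the SIGMA label.
        found.append((dataset, noise_dir[len("noise_"):], steps))
    return found


if __name__ == "__main__":
    for dataset, noise, steps in list_checkpoints():
        print(f"{dataset}: noise={noise}, steps={steps}")
```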
