## Setup Docker Container
The easiest way to perform the attacks is to run the code in a Docker container. To build the Docker image, run the following command:

```bash
docker build -t nemo .
```

To create and start a Docker container, run the following command from the project's root:

```bash
docker run --rm --shm-size 16G --name my_container --gpus '"device=all"' -v "$(pwd)":/workspace -it nemo bash
```

# Localizing Memorization

## 1. Calculating Activation Statistics
To identify neurons that are memorizing specific samples, we first have to calculate the activation statistics on unmemorized samples. To do that, run the following script:

```bash
python 1_compute_activations_statistics.py
```
The script provides the following options:
```
options:
  -h, --help            show this help message and exit
  --prompts PROMPTS     The file from which the prompts are loaded to calculate the statistics (default: 'prompts/additional_laion_prompts.csv').
  --output OUTPUT       The file to which the activation statistics are written (default: 'statistics/statistics_additional_laion_prompts.pt').
  -v VERSION, --version VERSION
                        Stable Diffusion version (default: "v1-4")
  -u USER, --user USER  name initials for RTPT (default: "XX")
```

## 2. Calculate SSIM Thresholds
In addition to the activation statistics on unmemorized prompts, we need the SSIM thresholds for the neuron detection algorithm. To obtain them, we first have to calculate the pairwise SSIM between images generated with different seeds for unmemorized prompts. To calculate the pairwise SSIMs, run the following script:
```bash
python 2_compute_pairwise_ssim.py
```
The script provides the following options:
```
options:
  -h, --help            show this help message and exit
  --prompts PROMPTS     The file from which the prompts are loaded to calculate the statistics (default: 'prompts/additional_laion_prompts.csv').
  --output OUTPUT       The file to which the activation statistics are written (default: 'pairwise_ssim_per_prompt.pt').
  --seed SEED           The seed used for the SD inference (default: 1).
  --batch_size BATCH_SIZE
                        The batch size used for calculating the pairwise SSIM score (default: 45).
  -u USER, --user USER  name initials for RTPT (default: "XX")
  -v VERSION, --version VERSION
                        Stable Diffusion version (default: "v1-4")
```

Afterwards, you can calculate the thresholds by loading the output file with PyTorch and computing the mean and standard deviation of the pairwise SSIM scores.
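
The following is a minimal sketch of that computation, not the authors' code. It assumes the output file holds a tensor or a nested container (list, tuple, or dict) of SSIM scores; the actual layout of `pairwise_ssim_per_prompt.pt` is not documented here, and the helper names `flatten`, `ssim_summary`, and `load_scores` are hypothetical. How the mean and standard deviation are combined into a threshold is also not specified in this README, so the sketch only reports both values.

```python
import statistics


def flatten(values):
    """Recursively flatten tensors, dicts, lists, and tuples into a list of floats."""
    if hasattr(values, "tolist"):  # torch tensors and NumPy arrays
        values = values.tolist()
    if isinstance(values, dict):
        values = list(values.values())
    if isinstance(values, (list, tuple)):
        flat = []
        for item in values:
            flat.extend(flatten(item))
        return flat
    return [float(values)]


def ssim_summary(scores):
    """Return (mean, sample standard deviation) over all SSIM scores."""
    flat = flatten(scores)
    return statistics.fmean(flat), statistics.stdev(flat)


def load_scores(path="pairwise_ssim_per_prompt.pt"):
    """Load the file written by 2_compute_pairwise_ssim.py (requires PyTorch)."""
    import torch  # imported lazily so the helpers above work without PyTorch

    return torch.load(path, map_location="cpu")


# Real usage (assumption: the file from step 2 exists in the working directory):
#   mean, std = ssim_summary(load_scores())

# Demonstration with made-up scores, only to show the call:
mean, std = ssim_summary([[0.2, 0.4], [0.6]])
print(f"mean SSIM: {mean:.3f}, std: {std:.3f}")
```

`statistics.stdev` computes the sample standard deviation, which matches the default behaviour of `torch.std`; use `statistics.pstdev` instead if the population standard deviation is wanted.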

## 3. Detect Memorizing Neurons
To run the neuron detection algorithm, run the following script:

```bash
python 3_detect_memorized_neurons.py
```
The script provides the following options:
```
options:
  -h, --help            show this help message and exit
  -v VERSION, --version VERSION
                        Stable diffusion version (default: v1-4)
  -d DATASET, --dataset DATASET
                        Dataset of memorized prompts (default: prompts/memorized_laion_prompts.csv)
  -o OUTPUT, --output OUTPUT
                        Output file for memorization statistics (default: results/memorization_statistics.csv)
  -s SEED, --seed SEED  Random seed (default: 1)
  --samples_per_prompt SAMPLES_PER_PROMPT
                        Number of samples generated per prompt (default: 10)
  --num_inference_steps NUM_INFERENCE_STEPS
                        Number of inference steps used to generate the images. Even though only the first step is used, this has an effect on the noise prediction of the first step. (default: 50)
  -sf, --scaling_factor SCALING_FACTOR
                        Scaling factor for neuron activations (default: 0.0)
  --ssim_threshold_initial_selection SSIM_THRESHOLD_INITIAL_SELECTION
                        SSIM threshold for the initial neuron selection (default: 0.428)
  --ssim_threshold_refinement SSIM_THRESHOLD_REFINEMENT
                        SSIM threshold for the neuron refinement (default: 0.428)
  --guidance_scale GUIDANCE_SCALE
                        The guidance scale for the image generation during and after the neuron selection (default: 0)
  --theta_reduction_per_step THETA_REDUCTION_PER_STEP
                        The reduction of theta value per step during the initial neuron selection (default: 0.25)
  --min_theta MIN_THETA
                        The minimum theta value for the initial neuron selection (default: 1)
  --initial_theta INITIAL_THETA
                        The initial theta value for the initial neuron selection (default: 5)
  --initial_k INITIAL_K
                        The initial k value for the initial neuron selection (default: 0)
  -n NAME, --name NAME  RTPT user name (Default: XX)
```
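
The theta options above suggest a decreasing schedule during the initial neuron selection. The sketch below only illustrates that reading and is not the script's implementation: it assumes theta starts at `--initial_theta`, is lowered by `--theta_reduction_per_step` in each step, and never drops below `--min_theta`. The function name `theta_schedule` is hypothetical.

```python
def theta_schedule(initial_theta=5.0, reduction_per_step=0.25, min_theta=1.0):
    """Yield theta values from initial_theta down to min_theta (inclusive).

    Assumption: theta is reduced by a fixed amount per step until it would
    fall below min_theta. The defaults mirror the documented CLI defaults.
    """
    if reduction_per_step <= 0:
        raise ValueError("reduction_per_step must be positive")
    theta = initial_theta
    while theta >= min_theta:
        yield theta
        theta -= reduction_per_step


thetas = list(theta_schedule())
print(len(thetas), thetas[0], thetas[-1])  # 17 5.0 1.0
```

With the default values, this yields 17 candidate theta values, so lowering `--theta_reduction_per_step` makes the search finer but adds more steps.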

## 4. Image Generation
To calculate the result metrics, we first have to generate the original images (without blocking neurons) and the images with the identified neurons blocked. Both are generated with the following script:
```bash
python 4_generate_images.py
```

The script provides the following options:
```
options:
  -h, --help            show this help message and exit
  -f RESULT_FILE, --result_file RESULT_FILE
                        path to file with image descriptions (default: None)
  -o OUTPUT_PATH, --output OUTPUT_PATH
                        output folder for generated images (default: 'generated_images/generated_images')
  -s SEED, --seed SEED  seed for generated images (default: 0)
  -n NUM_SAMPLES, --num_samples NUM_SAMPLES
                        number of generated samples for each prompt (default: 1)
  --steps NUM_STEPS     number of denoising steps (default: 50)
  -g GUIDANCE_SCALE, --guidance_scale GUIDANCE_SCALE
                        guidance scale (default: 7)
  -u USER, --user USER  name initials for RTPT (default: "XX")
  -v VERSION, --version VERSION
                        Stable Diffusion version (default: "v1-5")
  --original_images     Generate the original images
  --initial_neurons     Block initial neurons
  --refined_neurons     Block refined neurons
```
The configuration used for the image generation is stored as a JSON file in the output folder.
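
To check which settings a set of images was generated with, the stored configuration can be read back. The sketch below is an assumption-laden example: the name of the JSON file and its keys are not documented here, so the hypothetical helper `load_generation_configs` simply loads every `.json` file found directly in the given folder.

```python
import json
from pathlib import Path


def load_generation_configs(output_dir):
    """Return {file name: parsed JSON} for every .json file directly in output_dir."""
    configs = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        configs[path.name] = json.loads(path.read_text(encoding="utf-8"))
    return configs


# Example call with the script's default output folder
# (an empty dict is returned if the folder or the files do not exist):
for name, config in load_generation_configs("generated_images/generated_images").items():
    print(name, config)
```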

# Evaluation Metrics

After generating the images, the metrics can be computed by running the scripts in `metrics`. For all metrics, provide the path to the CSV result file containing the detected neurons.
To split the results into verbatim memorization (VM) and template memorization (TM) prompts, also provide the path to the original prompt file with `-p=prompts/memorized_laion_prompts.csv`.