neural_de.transformations.diffusion package

Subpackages

Submodules

neural_de.transformations.diffusion.diffpure_config module

class neural_de.transformations.diffusion.diffpure_config.DiffPureConfig(weights_path=PosixPath('/home/runner/.neuralde/diffpure/256x256_diffusion_uncond.pt'), img_shape=(3, 256, 256), attention_resolutions=<factory>, num_classes=None, dims=2, learn_sigma=True, num_channels=256, num_head_channels=64, num_res_blocks=2, resblock_updown=True, use_fp16=True, use_scale_shift_norm=True, num_heads=4, num_heads_upsample=-1, channel_mult=None, dropout=0.0, use_new_attention_order=False, t=150, t_delta=15, use_bm=False, use_checkpoint=False, conv_resample=True, sample_step=1, rand_t=False)[source]

Bases: object

A dataclass to configure and provide parameters for the internal diffusion model of diffusion_enhancer.

Most of the parameters are exposed to allow custom use of a different pre-trained diffusion model based on the U-Net architecture and code. The ones that can be modified with the provided model are t, t_delta and sample_step.

weights_path

Path of the pre-trained weights; can be set to provide a custom weights file.

img_shape

The shape of each input image of the diffusion model (by default (3, 256, 256)). Dimensions are channel-first.

attention_resolutions

Resolution, in pixels, of the attention layers of the model.

num_classes

int (by default None). Number of classes the diffusion model is trained on.

dims

int (by default 2). Dimensionality of the images: 1D, 2D or 3D.

learn_sigma

bool (by default True). If True, the output channel number will be 6 instead of 3.

num_channels

int (by default 256). Base channel number for the layers of the diffusion model architecture.

num_head_channels

int (by default 64). Number of channels per head of the attention blocks.

num_res_blocks

int (by default 2). Number of residual blocks of the architecture.

resblock_updown

bool (by default True). Whether to apply downsampling after each residual block of the underlying U-Net architecture.

use_fp16

bool (by default True). Use 16-bit floating-point precision. If CUDA is not available, it will be set to False (fp32).

use_scale_shift_norm

bool (by default True). Whether to normalise (scale-shift) the output of each block of layers in the U-Net architecture.

num_heads

int (by default 4). Number of attention heads.

num_heads_upsample

int (by default -1). Number of heads for the upsampling attention layers.

channel_mult

tuple (by default None). Computed automatically if not provided. Depending on the resolution, multiplies the base channel number to obtain the final channel count of each residual level of the U-Net model.

dropout

float (by default 0.0). Dropout rate.

use_new_attention_order

bool (by default False). If True, the U-Net will use QKVAttention layers; if False, it will use QKVAttentionLegacy.

t

int (by default 150). Number of diffusion steps applied for each image.

t_delta

int (by default 15). Strength of the noise added before the diffusion process.

use_bm

bool (by default False). Whether to use Brownian motion during the reverse SDE integration.

use_checkpoint

bool (by default False). Whether to use gradient checkpointing during training.

conv_resample

bool (by default True). Use learned convolutions for upsampling and downsampling. If false, interpolation (nearest) will be used.

sample_step

int (by default 1). Number of times the diffusion process (noise addition + denoising) is repeated for each image.

rand_t

bool (by default False). If true, add random noise before denoising. The noise is sampled uniformly between -t_delta and +t_delta.

attention_resolutions: list[int]
channel_mult: tuple = None
conv_resample: bool = True
dims: int = 2
dropout: float = 0.0
img_shape: tuple = (3, 256, 256)
learn_sigma: bool = True
num_channels: int = 256
num_classes: int = None
num_head_channels: int = 64
num_heads: int = 4
num_heads_upsample: int = -1
num_res_blocks: int = 2
rand_t: bool = False
resblock_updown: bool = True
sample_step: int = 1
t: int = 150
t_delta: int = 15
use_bm: float = False
use_checkpoint: bool = False
use_fp16: bool = True
use_new_attention_order: bool = False
use_scale_shift_norm: bool = True
weights_path: Path = PosixPath('/home/runner/.neuralde/diffpure/256x256_diffusion_uncond.pt')
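
As an illustrative sketch (the values below are examples, not recommendations), the fields above can be overridden at construction time; with the provided weights only t, t_delta and sample_step usually need to change:

    from neural_de.transformations.diffusion.diffpure_config import DiffPureConfig

    # Illustrative values only: fewer diffusion steps and a lighter noise injection.
    config = DiffPureConfig(
        t=100,          # diffusion steps applied to each image
        t_delta=10,     # strength of the noise added before diffusion
        sample_step=2,  # repeat the noise/denoise cycle twice per image
    )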

neural_de.transformations.diffusion.diffusion_enhancer module

class neural_de.transformations.diffusion.diffusion_enhancer.DiffusionEnhancer(device=None, config=DiffPureConfig(weights_path=PosixPath('/home/runner/.neuralde/diffpure/256x256_diffusion_uncond.pt'), img_shape=(3, 256, 256), attention_resolutions=[32, 16, 8], num_classes=None, dims=2, learn_sigma=True, num_channels=256, num_head_channels=64, num_res_blocks=2, resblock_updown=True, use_fp16=True, use_scale_shift_norm=True, num_heads=4, num_heads_upsample=-1, channel_mult=None, dropout=0.0, use_new_attention_order=False, t=150, t_delta=15, use_bm=False, use_checkpoint=False, conv_resample=True, sample_step=1, rand_t=False), logger=None)[source]

Bases: BaseTransformation

The goal of this class is to purify a batch of images, in order to reduce noise and increase robustness against potential adversarial attacks contained in the images. The weights provided with this library are adapted to a 256x256 output format. Any input size is supported, but the enhancer will resize the images to 256x256.

Parameters:
  • device (Optional[DeviceObjType]) – Some steps can be computed on CPU, but a GPU is highly recommended.

  • config (Optional[DiffPureConfig]) – An instance of the DiffPureConfig class. The most important attributes are t, sample_step and t_delta. A higher t or sample_step leads to stronger denoising, at the cost of processing time. t_delta is the quantity of noise added by the method before its diffusion process: the higher it is, the higher the chances of removing adversarial attacks, at the cost of a potential loss of image quality. The other attributes of DiffPureConfig should only be modified for a custom diffusion model.
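
A minimal instantiation sketch; the parameter values are illustrative only, and passing a string device identifier is an assumption based on the device documentation of RevGuidedDiffusion below:

    import torch
    from neural_de.transformations.diffusion.diffpure_config import DiffPureConfig
    from neural_de.transformations.diffusion.diffusion_enhancer import DiffusionEnhancer

    # Stronger denoising than the defaults, at the cost of processing time.
    config = DiffPureConfig(t=200, sample_step=2)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    enhancer = DiffusionEnhancer(device=device, config=config)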

forward(x)[source]

Apply the diffusion process to a tensor of images.

Parameters:

x (Tensor) – Batch of images as a Tensor

Returns:

Tensor of images after diffusion.

transform(image_batch)[source]

"Purify" (remove noise and noise-based adversarial attacks from) a batch of input images by applying a diffusion process to the images.

The images are resized to the size supported by the diffusion model (currently 256x256): you may want to resize or enhance the resolution of the output images afterwards. If the input images do not have the same height and width, the resizing process will crop to a square image, thus losing some information.

Parameters:

image_batch (Union[ndarray, Tensor]) – Batch of images to purify (numpy array or torch.Tensor).

Returns:

The batch of purified images (numpy array).
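
A usage sketch continuing the instantiation example above; the channel-last uint8 layout of the input batch is an assumption used to illustrate the resizing behaviour, not a documented requirement:

    import numpy as np

    # Four 512x512 RGB images; transform() resizes them to 256x256 during purification.
    image_batch = np.random.randint(0, 256, size=(4, 512, 512, 3), dtype=np.uint8)
    purified = enhancer.transform(image_batch)   # numpy array of purified images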

neural_de.transformations.diffusion.rev_guided_diffusion module

class neural_de.transformations.diffusion.rev_guided_diffusion.RevGuidedDiffusion(config, device=None, logger=None)[source]

Bases: Module

Implements the rev-guided diffusion.

Parameters:
  • device (Optional[DeviceObjType]) – "cuda" or "cpu". A GPU is highly recommended, but some steps are available on CPU.

  • config (DiffPureConfig) – An instance of DiffPureConfig; typically the one provided as input to the DiffusionEnhancer class.

  • logger (Optional[Logger]) – logger.

image_editing_sample(img)[source]

Applies the rev-guided diffusion to a batch of images.

Parameters:

img (Tensor) – Tensor (batch of images)

Returns:

Tensor (batch of images)
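
A direct-use sketch; most users should go through DiffusionEnhancer instead. The availability of a CUDA device, the batch size and the [0, 1] value range of the input tensor are assumptions for illustration:

    import torch
    from neural_de.transformations.diffusion.diffpure_config import DiffPureConfig
    from neural_de.transformations.diffusion.rev_guided_diffusion import RevGuidedDiffusion

    config = DiffPureConfig()
    diffusion = RevGuidedDiffusion(config, device="cuda")

    # Two channel-first images matching config.img_shape; values assumed in [0, 1].
    img = torch.rand(2, 3, 256, 256, device="cuda")
    purified = diffusion.image_editing_sample(img)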

training: bool

neural_de.transformations.diffusion.rev_vpsde module

class neural_de.transformations.diffusion.rev_vpsde.RevVPSDE(model, beta_min=0.1, beta_max=20, N=1000, img_shape=(3, 256, 256), logger=None)[source]

Bases: Module

Constructs a Variance Preserving SDE.

Parameters:
  • model (Module) – diffusion model

  • beta_min (float) – min value of beta for normalisation

  • beta_max (float) – max value of beta for normalisation

  • N (int) – number of discretization steps of the SDE

  • img_shape (tuple) – Image dimension, channel-first.

  • logger (Optional[Logger]) – logger (logging.Logger)

f(t, x)[source]

Creates the drift function -f(x, 1-t) (using the time reversal t' = 1 - t). sdeint only supports a 2D tensor (batch_size, c*h*w).

Parameters:
  • t (Tensor) – current step

  • x (Tensor) – batch of input images

Return type:

Tensor

g(t, x)[source]

Creates the diffusion function g(1-t) (using the time reversal t' = 1 - t). sdeint only supports a 2D tensor (batch_size, c*h*w).

Parameters:
  • t (Tensor) – current step

  • x (Tensor) – batch of input images

Return type:

Tensor
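
Both f and g expose the drift/diffusion interface used by SDE integrators such as torchsde.sdeint, which is what the note about 2D tensors refers to. The sketch below illustrates that interaction; the placeholder model, the sde_type/noise_type attributes, the flattening and the solver settings are assumptions for illustration, not the library's documented internals:

    import torch
    import torchsde
    from neural_de.transformations.diffusion.rev_vpsde import RevVPSDE

    sde = RevVPSDE(model=pretrained_unet)              # pretrained_unet: placeholder for a loaded diffusion model
    sde.sde_type, sde.noise_type = "ito", "diagonal"   # assumed torchsde requirements

    x0 = torch.rand(2, 3, 256, 256)
    y0 = x0.reshape(2, -1)                             # sdeint expects (batch_size, c*h*w)
    ts = torch.linspace(0.0, 1.0, steps=2)

    ys = torchsde.sdeint(sde, y0, ts, method="euler")  # drift from f(), diffusion from g()
    result = ys[-1].reshape(2, 3, 256, 256)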

rvpsde_fn(t, x, return_type='drift')[source]

Create the drift and diffusion functions for the reverse SDE

Parameters:
  • t (Tensor) – current step

  • x (Tensor) – batch of input images

  • return_type (str) – if "drift", the drift term of the reverse SDE is returned; otherwise, only the diffusion term is returned.

training: bool

vpsde_fn(t, x)[source]

Apply the variance-preserving SDE to a batch of images.

Parameters:
  • t (Tensor) – current timestep

  • x (Tensor) – image batch

Return type:

tuple[Tensor, Tensor]
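
For reference, a textbook sketch of the variance-preserving SDE terms with a linear beta schedule, matching the beta_min/beta_max parameters above; this is the standard definition from the score-based SDE literature, not necessarily the exact implementation of vpsde_fn:

    import torch

    def vp_sde_terms(t, x, beta_min=0.1, beta_max=20.0):
        """Standard VP-SDE drift/diffusion for a linear beta schedule, with t in [0, 1]."""
        beta_t = beta_min + t * (beta_max - beta_min)   # linear schedule between beta_min and beta_max
        drift = -0.5 * beta_t[:, None] * x              # shape (batch_size, c*h*w)
        diffusion = torch.sqrt(beta_t)                  # shape (batch_size,)
        return drift, diffusion

    # Example call on a flattened batch of two images at mid-trajectory (t = 0.5).
    drift, diffusion = vp_sde_terms(torch.full((2,), 0.5), torch.rand(2, 3 * 256 * 256))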

Module contents