
Table 1 Generative AI models [20,21,22,23]

From: Generative AI in healthcare: an implementation science informed translational path on application, integration and governance

| Generative AI model | Description | Applications |
| --- | --- | --- |
| Generative adversarial networks (GANs) | GANs consist of two neural networks, a generator and a discriminator, that compete against each other. They are often used for image synthesis, super-resolution and style transfer | Image synthesis, style transfer, face ageing, data augmentation, 3D object creation |
| Variational autoencoders (VAEs) | VAEs are autoencoders that add constraints to the encoding process, causing the network to learn continuous, structured latent representations. This makes them useful for tasks such as generating new images or other data points | Image generation, anomaly detection, image denoising, exploration of latent spaces, content generation in gaming |
| Autoregressive models | These models predict the next output in a sequence based on previous outputs. They have been used extensively in language modelling tasks (such as text generation), as well as in generating music and images | Text generation (e.g., GPT models), music composition, image generation (e.g., PixelRNN), time-series forecasting |
| Flow-based models | These models leverage the change-of-variables formula to model complex distributions. They are characterised by their ability both to generate new samples and to perform efficient inference | High-quality image synthesis, speech and music modelling, density estimation, anomaly detection |
| Energy-based models (EBMs) | EBMs aim to learn an energy function that assigns low energy to data points from the data distribution and higher energy to other points. EBMs can be used for a wide range of applications, including image synthesis, denoising and inpainting | Image synthesis and restoration, pattern recognition, unsupervised and semi-supervised learning, structured prediction |
| Diffusion models | These models gradually learn to construct data by reversing a diffusion process that transforms data into a Gaussian distribution. They have shown remarkable results in generating high-quality, diverse samples | High-fidelity image generation (e.g., DALL-E 2), audio synthesis, molecular structure generation |
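Of the model families above, the autoregressive idea is the easiest to show concretely: each new element is sampled conditioned on the elements generated so far, then fed back in. The sketch below uses a character-level bigram model as a minimal stand-in for the large neural models cited in the table; the corpus, function names, and statistics are invented for illustration and are not from the paper.

```python
import random
from collections import defaultdict

# Toy corpus (illustrative only) from which bigram statistics are counted.
corpus = "generative models generate new data by modelling the data distribution"

# counts[prev][nxt] = number of times character `nxt` follows `prev`.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(prev, rng):
    """Sample the next character from the estimated P(next | prev)."""
    successors = counts[prev]
    chars = list(successors)
    weights = [successors[c] for c in chars]
    return rng.choices(chars, weights=weights, k=1)[0]

def generate(seed_char, length, rng):
    """Autoregressive generation: each sampled character becomes the
    conditioning context for the next step."""
    out = [seed_char]
    for _ in range(length - 1):
        out.append(sample_next(out[-1], rng))
    return "".join(out)

rng = random.Random(0)
print(generate("g", 40, rng))
```

GPT-style models follow the same loop, but condition on the full preceding sequence via a neural network rather than on a single previous character via counts.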