andrewdalpino
UltraZoom-2X
A fast single image super-resolution (SISR) model for upscaling images with ultra high quality. Ultra Zoom uses a two-stage "zoom in and enhance" mechanism that applies a fast deterministic upscaling algorithm to the image and then enhances it through a steerable residual pathway operating primarily in the low-resolution subspace of a deep neural network.

- Fast and scalable: Ultra Zoom's unique "zoom in and enhance" mechanism combines the speed of deterministic bicubic interpolation with the power of a deep neural network.
- Controllable enhancements: Ultra Zoom integrates channel-wise control modules directly into the architecture, allowing you to finely adjust the amount of denoising, deblurring, and deartifacting to suit your image source.
- Full RGB: Unlike many efficient SR models that operate only in the luminance domain, Ultra Zoom operates in the full RGB color domain, enhancing both luminance and chrominance for the best possible image quality.

View at full resolution for best results. More comparisons can be found here. This comparison demonstrates the strength of the enhancements (deblurring, denoising, and deartifacting) applied to the upscaled image. This comparison demonstrates the individual enhancements applied in isolation.

The following pretrained models are available on HuggingFace Hub.

| Name | Upscale | Num Channels | Encoder Layers | Parameters | Control Modules | Library Version |
|---|---|---|---|---|---|---|
| andrewdalpino/UltraZoom-2X-Ctrl | 2X | 48 | 20 | 1.8M | Yes | 0.2.x |
| andrewdalpino/UltraZoom-3X-Ctrl | 3X | 54 | 30 | 3.5M | Yes | 0.2.x |
| andrewdalpino/UltraZoom-2X | 2X | 48 | 20 | 1.8M | No | 0.1.x |
| andrewdalpino/UltraZoom-3X | 3X | 54 | 30 | 3.5M | No | 0.1.x |
| andrewdalpino/UltraZoom-4X | 4X | 96 | 40 | 14M | No | 0.1.x |

If you'd just like to load the pretrained weights and run inference, getting started is as simple as the examples below.
First, you'll need the `ultrazoom` package installed into your project. For the non-control version we'll need library version `0.1.x` to load the pretrained weights. We'll also need the `torchvision` library to do some basic image preprocessing. We recommend using a virtual environment to make package management easier. Then, load the weights from HuggingFace Hub, convert the input image to a tensor, and upscale the image. The control version of Ultra Zoom allows you to independently adjust the level of deblurring, denoising, and deartifacting applied to the upscaled image. We accomplish this by conditioning the input image on a Control Vector that gets picked up by control modules embedded into each layer of the encoder. Version `0.2.x` of the library is required for control functionality. The `ControlVector` class takes 3 arguments - `gaussianblur`, `gaussiannoise`, and `jpegcompression` corresponding to the assumed level of each type of degradation present in the input image. Their values range from 0.0 meaning no degradation is assumed present to 1.0 meaning that the maximum amount of degradation is assumed present. You'll need the code in the repository to train new models and export them for production. Project dependencies are specified in the `requirements.txt` file. You can install them with pip using the following command from the project root. We recommend using a virtual environment such as `venv` to keep package dependencies on your system tidy. Ultra Zoom is trained in two stages. The first stage focuses on building a foundation model for fine-tuning. It aims to jointly minimize the Pixel Loss with high and low frequency perceptual losses from the perspective of a pretrained VGG19 image classifier. To start training with the default settings, add your training and testing images to the `./dataset/train` and `./dataset/test` folders respectively and call the pretraining script like in the example below. 
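A setup and stage-one training sketch. The script name `pre-train.py` is a placeholder, since the card does not name the pretraining script; substitute the actual entry point from the repository.

```shell
# From the project root, install the dependencies (ideally inside a venv).
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Stage 1: pretraining with the default settings. "pre-train.py" is a
# placeholder name - use the actual pretraining script in the repo.
python pre-train.py --trainimagespath ./dataset/train --testimagespath ./dataset/test
```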
If you are looking for good training sets to start with, we recommend the `DIV2K` and/or `Flickr2K` datasets.

You can customize the upscaler model by adjusting the `numchannels`, `hiddenratio`, and `numencoderlayers` hyper-parameters like in the example below. You can also adjust the `batchsize`, `learningrate`, and `gradientaccumulationsteps` to suit your training setup. In addition, you can control various training data augmentation arguments such as the brightness, contrast, hue, and saturation jitter.

We use TensorBoard to capture and display training events such as loss and gradient norm updates. To launch the dashboard server, run the following command from the terminal. Then navigate to the dashboard using your favorite web browser.

| Argument | Default | Type | Description |
|---|---|---|---|
| --trainimagespath | "./dataset/train" | str | The path to the folder containing your training images. |
| --testimagespath | "./dataset/test" | str | The path to the folder containing your testing images. |
| --numdatasetprocesses | 8 | int | The number of CPU processes used to preprocess the dataset. |
| --targetresolution | 256 | int | The number of pixels in the height and width dimensions of the training images. |
| --upscaleratio | 2 | (1, 2, 3, 4) | The upscaling or zoom factor. |
| --mingaussianblur | 0.0 | float | The minimum amount of Gaussian blur to apply to the degraded low-resolution image. |
| --maxgaussianblur | 1.0 | float | The maximum amount of Gaussian blur to apply to the degraded low-resolution image. |
| --mingaussiannoise | 0.0 | float | The minimum amount of Gaussian noise to add to the degraded low-resolution image. |
| --maxgaussiannoise | 0.1 | float | The maximum amount of Gaussian noise to add to the degraded low-resolution image. |
| --mincompression | 0.0 | float | The minimum amount of JPEG compression to apply to the degraded low-resolution image. |
| --maxcompression | 0.8 | float | The maximum amount of JPEG compression to apply to the degraded low-resolution image. |
| --brightnessjitter | 0.1 | float | The amount of jitter applied to the brightness of the training images. |
| --contrastjitter | 0.1 | float | The amount of jitter applied to the contrast of the training images. |
| --saturationjitter | 0.1 | float | The amount of jitter applied to the saturation of the training images. |
| --huejitter | 0.1 | float | The amount of jitter applied to the hue of the training images. |
| --batchsize | 32 | int | The number of training images to pass through the network at a time. |
| --gradientaccumulationsteps | 4 | int | The number of batches to pass through the network before updating the model weights. |
| --numepochs | 100 | int | The number of epochs to train for. |
| --learningrate | 5e-4 | float | The learning rate of the AdamW optimizer. |
| --maxgradientnorm | 2.0 | float | Clip gradients above this threshold norm before stepping. |
| --numchannels | 48 | int | The number of channels within each encoder block. |
| --hiddenratio | 2 | (1, 2, 4) | The ratio of hidden channels to `numchannels` within the activation portion of each encoder block. |
| --numencoderlayers | 20 | int | The number of layers within the body of the encoder. |
| --activationcheckpointing | False | bool | Should we use activation checkpointing? This will drastically reduce memory utilization during training at the cost of recomputing the forward pass. |
| --evalinterval | 2 | int | Evaluate the model on the testing set after this many epochs. |
| --checkpointinterval | 2 | int | Save the model checkpoint to disk every this many epochs. |
| --checkpointpath | "./checkpoints/checkpoint.pt" | str | The path to the base checkpoint file on disk. |
| --resume | False | bool | Should we resume training from the last checkpoint? |
| --rundirpath | "./runs" | str | The path to the TensorBoard run directory for this training session. |
| --device | "cpu" | str | The device to run the computation on. |
| --seed | None | int | The seed for the random number generator. |

The next stage focuses on squeezing extra performance out of the model using an adversarial training framework. Step 2 of training takes the pretrained checkpoint and fine-tunes the model using feedback from an adversarial critic model. The critic is specially optimized to detect slight differences between real images and images generated by Ultra Zoom. It uses feedback from the upscaler to improve its detection rate, and in turn the upscaler uses feedback from the critic to improve its fool rate. This stage can be considered fully optimized when the critic can no longer reliably detect fake images, i.e. its F1 score is pegged near 0.5. To start fine-tuning your pretrained checkpoint, see the example below. To adjust the size of the critic model, use the `criticmodelsize` argument.

| Argument | Default | Type | Description |
|---|---|---|---|
| --basecheckpointpath | None | str | The path to the pretrained checkpoint. |
| --trainimagespath | "./dataset/train" | str | The path to the folder containing your training images. |
| --testimagespath | "./dataset/test" | str | The path to the folder containing your testing images. |
| --numdatasetprocesses | 8 | int | The number of CPU processes used to preprocess the dataset. |
| --targetresolution | 512 | int | The number of pixels in the height and width dimensions of the training images. |
| --mingaussianblur | 0.0 | float | The minimum amount of Gaussian blur to apply to the degraded low-resolution image. |
| --maxgaussianblur | 1.0 | float | The maximum amount of Gaussian blur to apply to the degraded low-resolution image. |
| --mingaussiannoise | 0.0 | float | The minimum amount of Gaussian noise to add to the degraded low-resolution image. |
| --maxgaussiannoise | 0.1 | float | The maximum amount of Gaussian noise to add to the degraded low-resolution image. |
| --mincompression | 0.0 | float | The minimum amount of JPEG compression to apply to the degraded low-resolution image. |
| --maxcompression | 0.8 | float | The maximum amount of JPEG compression to apply to the degraded low-resolution image. |
| --brightnessjitter | 0.1 | float | The amount of jitter applied to the brightness of the training images. |
| --contrastjitter | 0.1 | float | The amount of jitter applied to the contrast of the training images. |
| --saturationjitter | 0.1 | float | The amount of jitter applied to the saturation of the training images. |
| --huejitter | 0.1 | float | The amount of jitter applied to the hue of the training images. |
| --batchsize | 8 | int | The number of training images to pass through the network at a time. |
| --gradientaccumulationsteps | 16 | int | The number of batches to pass through the network before updating the model weights. |
| --upscalerlearningrate | 1e-4 | float | The learning rate of the upscaler's AdamW optimizer. |
| --upscalermaxgradientnorm | 1.0 | float | Clip upscaler gradients above this threshold norm before stepping. |
| --criticlearningrate | 5e-4 | float | The learning rate of the critic's AdamW optimizer. |
| --criticmaxgradientnorm | 5.0 | float | Clip critic gradients above this threshold norm before stepping. |
| --numepochs | 100 | int | The number of epochs to train for. |
| --criticwarmupepochs | 1 | int | Train the critic model for this many epochs before using it to train the upscaler. |
| --criticmodelsize | "small" | str | The size of the critic model. Choice of small, medium, or large. |
| --activationcheckpointing | False | bool | Should we use activation checkpointing? This will drastically reduce memory utilization during training at the cost of recomputing the forward pass. |
| --evalinterval | 2 | int | Evaluate the model on the testing set after this many epochs. |
| --checkpointinterval | 2 | int | Save the model checkpoint to disk every this many epochs. |
| --checkpointpath | "./checkpoints/checkpoint.pt" | str | The path to the base checkpoint file on disk. |
| --resume | False | bool | Should we resume training from the last checkpoint? |
| --rundirpath | "./runs" | str | The path to the TensorBoard run directory for this training session. |
| --device | "cpu" | str | The device to run the computation on. |
| --seed | None | int | The seed for the random number generator. |

You can use the provided `test-compare.py` script to generate upscaled images from the trained model at the default checkpoint like in the example below. To generate images using a different checkpoint, use the `checkpointpath` argument. You can adjust the level of enhancements applied to the image by setting the `gaussianblur`, `gaussiannoise`, and `jpegcompression` arguments. Each value is normalized such that 0 means no enhancement and 1 means full enhancement.

| Argument | Default | Type | Description |
|---|---|---|---|
| --imagepath | None | str | The path to the image file to be upscaled by the model. |
| --checkpointpath | "./checkpoints/fine-tuned.pt" | str | The path to the base checkpoint file on disk. |
| --gaussianblur | 0.5 | float | The strength of Gaussian blur removal from the image, between 0 and 1. |
| --gaussiannoise | 0.5 | float | The strength of Gaussian noise removal from the image, between 0 and 1. |
| --jpegcompression | 0.5 | float | The strength of JPEG compression artifact removal from the image, between 0 and 1. |
| --device | "cpu" | str | The device to run the computation on. |

>- J. Song, et al. Gram-GAN: Image Super-Resolution Based on Gram Matrix and Discriminator Perceptual Loss, Sensors, 2023.
>- Z. Liu, et al. A ConvNet for the 2020s, 2022.
>- A. Jolicoeur-Martineau. The Relativistic Discriminator: A Key Element Missing From Standard GAN, 2018.
>- J. Yu, et al. Wide Activation for Efficient and Accurate Image Super-Resolution, 2018.
>- J. Johnson, et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016.
>- W. Shi, et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, 2016.
>- T. Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, OpenAI, 2016.
>- T. Miyato, et al. Spectral Normalization for Generative Adversarial Networks, ICLR, 2018.
>- E. Perez, et al. FiLM: Visual Reasoning with a General Conditioning Layer, AAAI, 2018.
ESMC-300M-QAT-Protein-Function
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM Cambrian Transformer architecture, pre-trained on UniRef, MGnify, and the Joint Genome Institute's database, and fine-tuned on the AmiGO Boost protein function dataset, this model predicts the GO subgraph for a particular protein sequence, giving you insight into the molecular function, biological process, and location of the activity inside the cell.

> "The Gene Ontology (GO) is a concept hierarchy that describes the biological function of genes and gene products at different levels of abstraction (Ashburner et al., 2000). It is a good model to describe the multi-faceted nature of protein function."

> "GO is a directed acyclic graph. The nodes in this graph are functional descriptors (terms or classes) connected by relational ties between them (isa, partof, etc.). For example, terms 'protein binding activity' and 'binding activity' are related by an isa relationship; however, the edge in the graph is often reversed to point from binding towards protein binding. This graph contains three subgraphs (subontologies): Molecular Function (MF), Biological Process (BP), and Cellular Component (CC), defined by their root nodes. Biologically, each subgraph represent a different aspect of the protein's function: what it does on a molecular level (MF), which biological processes it participates in (BP) and where in the cell it is located (CC)."

The following pretrained models are available on HuggingFace Hub.

| Name | Embedding Dim. | Attn. Heads | Encoder Layers | Context Length | QAT | Total Parameters |
|---|---|---|---|---|---|---|
| andrewdalpino/ESMC-300M-Protein-Function | 960 | 15 | 30 | 2048 | None | 361M |
| andrewdalpino/ESMC-300M-QAT-Protein-Function | 960 | 15 | 30 | 2048 | int8w | 361M |
| andrewdalpino/ESMC-600M-Protein-Function | 1152 | 18 | 36 | 2048 | None | 644M |
| andrewdalpino/ESMC-600M-QAT-Protein-Function | 1152 | 18 | 36 | 2048 | int8w | 644M |

First, install the `esmcfunctionclassifier` package using pip. Then, we'll load the model weights from HuggingFace Hub and the GO graph using `obonet`, tokenize the amino acid sequence, and infer the GO subgraph. You can also output the Gene Ontology (GO) `networkx` subgraph for a given sequence like in the example below. You'll need an up-to-date Gene Ontology database, which you can import using the `obonet` package.

To quantize the model weights using int8, call the `quantizeweights()` method. Any model can be quantized, but we recommend one that has been quantization-aware trained (QAT) for the best performance. The `groupsize` argument controls the granularity at which quantization scales are computed.

The training code can be found at https://github.com/andrewdalpino/ESMC-Function-Classifier.

>- T. Hayes, et al. Simulating 500 million years of evolution with a language model, 2024.
>- M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000.
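A toy illustration of the subgraph idea using `networkx` directly. This is not the package API: the three-term graph below is a hypothetical miniature of the GO DAG, built to mirror the "binding" to "protein binding" example quoted above, where a predicted term implies all of its ancestors up to the root of the subontology.

```python
import networkx as nx

# A toy GO-like DAG: edges point from general to specific terms,
# as in the quoted example ("binding" -> "protein binding").
go = nx.DiGraph()
go.add_edges_from([
    ("molecular_function", "binding"),
    ("binding", "protein binding"),
    ("molecular_function", "catalytic activity"),
])

# The subgraph for a predicted term includes the term itself plus all of
# its ancestors up to the root of the subontology.
term = "protein binding"
nodes = nx.ancestors(go, term) | {term}
subgraph = go.subgraph(nodes)

print(sorted(subgraph.nodes))  # ['binding', 'molecular_function', 'protein binding']
```

The real GO graph loaded with `obonet` is a `networkx` graph as well, so the same ancestor and subgraph operations apply to it.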
MewZoom-V0-3X
NoPE-GPT-Small-Base
MewZoom-4X-Ctrl
NoPE-GPT-Small-Chat
ProtHash-V2-512-Tiny
ProtHash-384-Tiny
ProtHash-V2-384-Tiny
MewZoom-3X
ESM2-150M-Protein-Biological-Process
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence, giving you insight into the molecular function, biological process, and location of the activity inside the cell.

Note: This version only models the `biological process` subgraph of the Gene Ontology.

> "The Gene Ontology (GO) is a concept hierarchy that describes the biological function of genes and gene products at different levels of abstraction (Ashburner et al., 2000). It is a good model to describe the multi-faceted nature of protein function."

> "GO is a directed acyclic graph. The nodes in this graph are functional descriptors (terms or classes) connected by relational ties between them (is-a, part-of, etc.). For example, terms 'protein binding activity' and 'binding activity' are related by an is-a relationship; however, the edge in the graph is often reversed to point from binding towards protein binding. This graph contains three subgraphs (subontologies): Molecular Function (MF), Biological Process (BP), and Cellular Component (CC), defined by their root nodes. Biologically, each subgraph represents a different aspect of the protein's function: what it does on a molecular level (MF), which biological processes it participates in (BP), and where in the cell it is located (CC)."

- Vocabulary Size: 33
- Embedding Dimensions: 640
- Attention Heads: 20
- Encoder Layers: 30
- Context Length: 1026

For a basic demonstration, we can rank the GO terms for a particular sequence. For a more advanced example, see the `predict-subgraph.py` source file.

>- A. Rives, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences, 2021.
>- Z. Lin, et al. Evolutionary-scale prediction of atomic level protein structure with a language model, 2022.
>- G. A. Merino, et al. Hierarchical deep learning for predicting GO annotations by integrating protein knowledge, 2022.
>- M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000.
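The basic ranking demonstration can be sketched without the model itself. GO annotation is a multi-label problem, so each term's logit gets an independent sigmoid probability (not a softmax share) before sorting. The logits and GO term IDs below are made-up placeholders standing in for real model output.

```python
import math


def rank_go_terms(logits, term_names, top_k=3):
    """Rank GO terms by independent sigmoid probability.

    Each logit is squashed separately because a protein can carry
    many GO annotations at once (multi-label classification).
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    ranked = sorted(zip(term_names, probs), key=lambda tp: tp[1], reverse=True)
    return ranked[:top_k]


# Placeholder logits for three placeholder GO term IDs.
terms = ["GO:0005515", "GO:0003677", "GO:0016301"]
for term, prob in rank_go_terms([2.0, -1.0, 0.5], terms):
    print(f"{term}: {prob:.3f}")
```

In practice the logits come from the classifier head over the full GO term vocabulary; thresholding the ranked probabilities is then what selects the predicted subgraph.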
ESM2-35M-Protein-Molecular-Function
ESM2-150M-Protein-Molecular-Function
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence, giving you insight into the molecular function, biological process, and location of the activity inside the cell.

Note: This version only models the `molecular function` subgraph of the Gene Ontology.

> "The Gene Ontology (GO) is a concept hierarchy that describes the biological function of genes and gene products at different levels of abstraction (Ashburner et al., 2000). It is a good model to describe the multi-faceted nature of protein function."

> "GO is a directed acyclic graph. The nodes in this graph are functional descriptors (terms or classes) connected by relational ties between them (is-a, part-of, etc.). For example, terms 'protein binding activity' and 'binding activity' are related by an is-a relationship; however, the edge in the graph is often reversed to point from binding towards protein binding. This graph contains three subgraphs (subontologies): Molecular Function (MF), Biological Process (BP), and Cellular Component (CC), defined by their root nodes. Biologically, each subgraph represents a different aspect of the protein's function: what it does on a molecular level (MF), which biological processes it participates in (BP), and where in the cell it is located (CC)."

- Vocabulary Size: 33
- Embedding Dimensions: 640
- Attention Heads: 20
- Encoder Layers: 30
- Context Length: 1026

For a basic demonstration, we can rank the GO terms for a particular sequence. For a more advanced example, see the `predict-subgraph.py` source file.

>- A. Rives, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences, 2021.
>- Z. Lin, et al. Evolutionary-scale prediction of atomic level protein structure with a language model, 2022.
>- G. A. Merino, et al. Hierarchical deep learning for predicting GO annotations by integrating protein knowledge, 2022.
>- M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000.
ProtHash-512
SuperCool-4x-Medium
ESMC-600M-QAT-Protein-Function
MewZoom-V0-3X-Ctrl
SuperCool-4x-Small
ESM2-35M-Protein-Biological-Process
An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence, giving you insight into the molecular function, biological process, and location of the activity inside the cell.

Note: This version only models the `biological process` subgraph of the Gene Ontology.

> "The Gene Ontology (GO) is a concept hierarchy that describes the biological function of genes and gene products at different levels of abstraction (Ashburner et al., 2000). It is a good model to describe the multi-faceted nature of protein function."

> "GO is a directed acyclic graph. The nodes in this graph are functional descriptors (terms or classes) connected by relational ties between them (is-a, part-of, etc.). For example, terms 'protein binding activity' and 'binding activity' are related by an is-a relationship; however, the edge in the graph is often reversed to point from binding towards protein binding. This graph contains three subgraphs (subontologies): Molecular Function (MF), Biological Process (BP), and Cellular Component (CC), defined by their root nodes. Biologically, each subgraph represents a different aspect of the protein's function: what it does on a molecular level (MF), which biological processes it participates in (BP), and where in the cell it is located (CC)."

- Vocabulary Size: 33
- Embedding Dimensions: 480
- Attention Heads: 20
- Encoder Layers: 12
- Context Length: 1026

For a basic demonstration, we can rank the GO terms for a particular sequence. For a more advanced example, see the `predict-subgraph.py` source file.

>- A. Rives, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences, 2021.
>- Z. Lin, et al. Evolutionary-scale prediction of atomic level protein structure with a language model, 2022.
>- G. A. Merino, et al. Hierarchical deep learning for predicting GO annotations by integrating protein knowledge, 2022.
>- M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000.