abdiffusion

Antibody Discrete Diffusion

Model Overview

The Antibody Discrete Diffusion model generates antibody sequences, or fills in masked positions of an existing sequence, using discrete diffusion trained from an 8M-parameter ESM-style model. It iteratively generates variable heavy or light chain sequences either from scratch, given only the length of the sequence to generate, or from a user-defined template whose blank positions it fills in.

Input

The main input is a sequence containing any number of mask tokens (`<mask>`): either fully masked, to generate a sequence from scratch, or partially masked, to fill in a template.
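As a minimal sketch of how such inputs could be assembled, the helpers below build a fully masked sequence of a given length and mask selected positions of a template. The function names (`fully_masked`, `mask_positions`, `count_masks`) are hypothetical and not part of the API; only the `<mask>` token itself comes from this page.

```python
MASK = "<mask>"  # mask token as written on this page


def fully_masked(length: int) -> str:
    """Return `length` mask tokens, for generating a sequence from scratch."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return MASK * length


def mask_positions(template: str, positions) -> str:
    """Replace the residues at the given 0-based positions with mask tokens."""
    wanted = set(positions)
    out_of_range = [p for p in wanted if not 0 <= p < len(template)]
    if out_of_range:
        raise ValueError(f"positions out of range: {sorted(out_of_range)}")
    return "".join(MASK if i in wanted else aa for i, aa in enumerate(template))


def count_masks(sequence: str) -> int:
    """Count how many positions remain to be filled in."""
    return sequence.count(MASK)
```

For instance, `fully_masked(4) + "TGATH"` gives a partially masked input with four blank positions followed by a fixed five-residue suffix.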

Additional parameters let users control generation: the decoding strategy, the temperature, and the number of positions to unmask at each step. Unmasking more positions per step can speed up sampling by reducing the number of passes through the model. This value can be one of [1, 2, 4]; values above 4 tend to reduce the quality of sample generation. Temperature values range from 0.0 to 1.0, and the decoding strategy is one of the following:

Decoding strategies:
    * `random`: Positions to unmask are ordered randomly.
    * `max_prob`: Positions to unmask are ordered by the highest per-position probability.
    * `entropy`: Positions to unmask are ordered by the lowest per-position entropy.
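To make the difference between the three orderings concrete, here is an illustrative sketch that ranks masked positions from a set of per-position probability distributions. It is not the model's actual implementation: the `unmask_order` function and the toy distributions are assumptions made for illustration, and the sketch does not cover how or whether the ordering is recomputed between steps.

```python
import math
import random


def unmask_order(probs, strategy, rng=None):
    """Order masked positions for unmasking (illustrative sketch only).

    probs: dict mapping a masked position to its list of token probabilities.
    strategy: "random", "max_prob", or "entropy".
    """
    positions = list(probs)
    if strategy == "random":
        rng = rng or random.Random()
        rng.shuffle(positions)
        return positions
    if strategy == "max_prob":
        # Most confident position first: highest top-token probability.
        return sorted(positions, key=lambda p: max(probs[p]), reverse=True)
    if strategy == "entropy":
        # Least uncertain position first: lowest Shannon entropy.
        def entropy(dist):
            return -sum(q * math.log(q) for q in dist if q > 0)

        return sorted(positions, key=lambda p: entropy(probs[p]))
    raise ValueError(f"unknown strategy: {strategy!r}")


# Toy distributions over a 4-token vocabulary (made up for illustration).
toy = {
    0: [0.50, 0.50, 0.00, 0.00],  # two equally likely tokens
    1: [0.55, 0.15, 0.15, 0.15],  # higher peak, but a broader tail
    2: [0.97, 0.01, 0.01, 0.01],  # nearly certain
}
by_max_prob = unmask_order(toy, "max_prob")
by_entropy = unmask_order(toy, "entropy")
```

In this toy case the two orderings disagree on positions 0 and 1: position 1 has the higher peak probability, while position 0 has the lower entropy because its mass is split over only two tokens.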

Example

Example with Playground Submission

Using the API playground, one can experiment with the model by specifying the fields shown below in JSON format. The following example can be entered as written; the sequence has been shortened for visual clarity.

{
  "sequence": "<mask><mask><mask><mask>TGATH",
  "temperature": 0.5,
  "decoding_order_strategy": "entropy",
  "unmaskings_per_step": 1
}
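The same request body can also be assembled and checked programmatically before submission. The sketch below validates the parameters against the ranges stated on this page and serializes the result to JSON. The helper names `build_payload` and `estimated_passes` are hypothetical, and the pass estimate assumes one model pass per step with exactly `unmaskings_per_step` positions filled each time, which this page does not confirm.

```python
import json
import math

MASK = "<mask>"
ALLOWED_STRATEGIES = {"random", "max_prob", "entropy"}
ALLOWED_UNMASKINGS = {1, 2, 4}


def build_payload(sequence, temperature=0.5,
                  decoding_order_strategy="entropy", unmaskings_per_step=1):
    """Return a request body using the field names from the example above."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if decoding_order_strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown strategy: {decoding_order_strategy!r}")
    if unmaskings_per_step not in ALLOWED_UNMASKINGS:
        raise ValueError("unmaskings_per_step must be one of 1, 2, 4")
    return {
        "sequence": sequence,
        "temperature": temperature,
        "decoding_order_strategy": decoding_order_strategy,
        "unmaskings_per_step": unmaskings_per_step,
    }


def estimated_passes(payload):
    """Rough number of model passes, assuming one pass per unmasking step."""
    n_masked = payload["sequence"].count(MASK)
    return math.ceil(n_masked / payload["unmaskings_per_step"])


payload = build_payload(MASK * 4 + "TGATH")
body = json.dumps(payload, indent=2)  # text that could be pasted into the playground
```

Under that assumption, the four masked positions above would take about four passes with `unmaskings_per_step` set to 1 and about two passes with it set to 2.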

Example with Python Client

The Python client offers a convenient way to interact with the Ginkgo AI API. Find an example of how to use the client at this Google Colab.
