Commit

Update README.md
jzthree authored Sep 25, 2023
1 parent 40a3a86 commit 3ca39fa
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -6,6 +6,8 @@ This repo contains the official implementation for the paper [Dirichlet diffusio
**Dirichlet Diffusion Score Model (DDSM)** is a continuous-time diffusion framework designed specifically for modeling discrete data such as biological
sequences. We introduce a diffusion process defined in probability simplex space with stationary distribution being the Dirichlet distribution. This makes diffusion in continuous space natural for modeling discrete data. DDSM is the first approach for discrete data modeling with continuous-time stochastic differential equation (SDE) diffusion in probability simplex space.

We showed that DDSM is capable of [solving Sudoku](https://github.com/jzhoulab/ddsm/tree/main/sudoku) and [designing promoter sequences](https://github.com/jzhoulab/ddsm/tree/main/promoter_design) according to transcription initiation signals.

The JAX version of the code will be published soon.

Installation instructions
@@ -24,6 +26,15 @@ An example notebook containing code for applying a toy model to binarized MNIST

[Usage.md](USAGE.md) contains detailed information on how to use the other scripts provided in the repository.

Time dilation
-------------
Time dilation is a technique for improving diffusion sample quality that is very easy to implement. It is not specific to DDSM and can be applied to other SDE-based diffusion models as well. It simply involves introducing a factor c (c > 1) into the reverse diffusion process:
<img width="751" alt="image" src="https://github.com/jzhoulab/ddsm/assets/8333155/ebe1f91e-16a3-4aa7-9b8a-bc900191d53a">

Time dilation works by biasing sampling toward higher-density areas, which often correspond to better-quality samples. It is advisable to increase the number of reverse diffusion steps by a factor of c, but this is not always necessary.
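
As a rough illustration of the density-biasing effect, here is a minimal NumPy sketch on a toy 1D problem (not the DDSM sampler from this repository). It assumes one reading of the "c factor": the reverse drift is multiplied by c and the noise by sqrt(c), which amounts to running the reverse process c times longer at each diffusion time. The function name `reverse_sample`, the Ornstein-Uhlenbeck forward process, and the Gaussian data distribution are all assumptions chosen so that the score is known in closed form.

```python
import numpy as np

def reverse_sample(c=1.0, n_samples=20000, n_steps=500, T=5.0,
                   data_var=0.25, seed=0):
    """Euler-Maruyama reverse sampling for a toy 1D diffusion.

    Forward SDE (assumed for illustration): dx = -0.5 * x dt + dw,
    with data ~ N(0, data_var), so p_t = N(0, var_t) and the score is exact.
    Time dilation (assumed form): drift scaled by c, noise by sqrt(c).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.standard_normal(n_samples)  # start from the stationary N(0, 1)
    for i in range(n_steps):
        t = T - i * dt                                  # current diffusion time
        var_t = 1.0 - (1.0 - data_var) * np.exp(-t)     # variance of p_t
        score = -x / var_t                              # grad log p_t(x)
        drift = -0.5 * x - score                        # f - g^2 * score, g = 1
        noise = rng.standard_normal(n_samples)
        # One reverse-time step; c = 1 recovers the ordinary reverse SDE.
        x = x - c * drift * dt + np.sqrt(c * dt) * noise
    return x
```

In this toy setting, `reverse_sample(c=1.0)` approximately recovers the data variance, while `reverse_sample(c=4.0, n_steps=2000)` (steps scaled by c, as advised above) produces samples concentrated more tightly around the mode.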

Another useful trick is to introduce time dilation only in the later part of reverse diffusion sampling. This avoids biasing sampling globally (e.g., in the MNIST generation task, sampling more ones because one is the most frequent digit in MNIST) and biases sampling only locally (e.g., better digit image quality).
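
One way to express this late-stage variant is a per-step schedule for the factor. The sketch below assumes a hard switch at a threshold time; the helper name `dilation_factors` and the parameter `t_switch` are hypothetical and do not come from this repository.

```python
import numpy as np

def dilation_factors(n_steps, T, c, t_switch):
    """Per-step dilation factor for reverse sampling from t = T down to t = 0.

    Returns 1.0 (no dilation) while t > t_switch and c once t <= t_switch,
    so only the later part of reverse sampling is dilated.
    """
    dt = T / n_steps
    t = T - dt * np.arange(n_steps)  # diffusion time at the start of each step
    return np.where(t <= t_switch, c, 1.0)
```

A sampler would then use the i-th entry of this array in place of a constant c at reverse step i.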

Benchmarks
----------
The evaluation is based on comparing generated sequences and human genome promoter sequences (ground truth) on the test chromosomes.