This is a wish list for protein folding and engineering. It contains some speculation and brainstorming, and not everything here should be considered viable for now.
-
Given a 3D shape (of some nanostructure), produce a protein amino acid sequence that will consistently fold into that shape. (done as of 2023?)
-
Control over protein functional properties, such as catalytic domains and sites, as well as the ability to design and control specific conformational changes.
-
DNA data storage: faster polymerases
-
Proteins that make molecular display techniques easier (simplifying lab bench protocols) -- like mRNA display and ribosome display; easier molecular display would be very valuable for projects using directed evolution techniques.
-
Better protein-based nanopores for DNA sequencing, amino acid sequencing, and protein sensing.
-
Human-controlled DNA polymerase synthesis activity (choose each nucleotide), or an instrumented ribosome to control protein production regardless of mRNA content
-
Molecular protein lego: connect multiple lego pieces together to build large-scale protein structures. This is generally useful for modeling and nanostructures. Binding could be done by DNA addresses or other high-affinity, ligand-specific techniques, giving a stable toolbox of known protein structures and shapes for building up larger structures from small parts.
-
Protein mechanical logic: protein structures that have internal logic and state, based on mechanical motion or on catalytic reactions and interactions.
-
Generalized, fully-programmable molecular nanotechnology: programmable nanomachines and nanofactories that can produce other nanostructures to exact specifications, without uncertainty regarding protein folding.
- Long-tube protein molecular-chemistry factories (non-ribosomal peptide synthetases, or NRPS). They are apparently natural, and they have multiple points of interest inside the tube that modify a molecule as it progresses along the protein.
- gene editing proteins (see [[gene-editing]])
- enzymes for DNA synthesis
- molecular recording (e.g. in vivo DNA-based recording devices, for debugging or otherwise; lineage tracing techniques; "Of toasters and molecular ticker tapes")
- protein binding affinity (protein-protein interactions)
- catalytic activity: enhancement or reduction of catalysis
- synthetic metabolisms
- biosensors
It's probably time to update this page: there has been a lot of recent progress in machine learning for protein design.
- AlphaFold2: Highly accurate protein structure prediction with AlphaFold
- RoseTTAFold: Accurate prediction of protein structures and interactions using a three-track neural network
- RFdiffusion: Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models
- A new protein design era with protein diffusion
- A high-level programming language for generative protein design
- Codon language embeddings provide strong signals for protein engineering
- OpenFold (ref)
- De novo design of high-affinity protein binders to bioactive helical peptides
- Illuminating protein space with a programmable generative model
See https://diyhpl.us/~bryan/papers2/bio/protein-engineering/