Summarized by Daily Strand AI from peer-reviewed source
Proteins are the workhorses of biology, and reading their precise sequence of building blocks — called amino acids — is fundamental to understanding disease, drug targets, and how our cells function. But sequencing proteins, especially one molecule at a time, has remained far harder than sequencing DNA. Now, researchers have developed a clever workaround: instead of reading proteins directly, they convert the protein's sequence information into DNA, then read the DNA using well-established sequencing technology. They call this 'reverse translation,' a playful nod to the biological process cells normally use to build proteins from genetic instructions.
The technique works by borrowing and modifying a decades-old chemical process called Edman degradation, which peels amino acids off a protein chain one at a time from one end. In this new version, each amino acid that gets snipped off carries a unique DNA barcode tied to its identity and its position in the original molecule. Specialized antibody-based tools called proximity extension assays then recognize these tagged amino acids and generate short DNA signals that can be copied and amplified using a standard lab technique called PCR. The result is a library of DNA records that, when read by modern high-throughput DNA sequencers, reveals the full amino acid sequence of the original peptide — a short protein fragment — at the level of a single molecule. In tests, the method achieved complete sequence coverage across millions of DNA reads and could distinguish between normal peptides and those carrying chemical modifications that happen after the protein is made, known as post-translational modifications.
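To make the decoding step concrete, here is a minimal sketch of how barcode reads might be turned back into a peptide sequence. It is a toy model, not the researchers' actual pipeline. It assumes each DNA read has already been decoded into a hypothetical `(cycle, residue)` pair, where `cycle` is the position at which the amino acid was removed and `residue` is its one-letter code. It also assumes that a simple majority vote per position is enough to settle disagreements between reads.

```python
from collections import Counter, defaultdict


def reconstruct_peptide(reads):
    """Rebuild a peptide from decoded barcode reads.

    `reads` is an iterable of (cycle, residue) tuples. This data model is
    an assumption made for illustration: cycle is the 1-based position of
    the amino acid in the original peptide, residue is its one-letter code.
    """
    # Tally how often each residue was reported at each position.
    votes = defaultdict(Counter)
    for cycle, residue in reads:
        votes[cycle][residue] += 1

    if not votes:
        return ""

    # Take the most frequently reported residue at each position.
    # Positions with no reads at all are marked "?".
    length = max(votes)
    return "".join(
        votes[pos].most_common(1)[0][0] if pos in votes else "?"
        for pos in range(1, length + 1)
    )


# Seven reads covering a three-residue peptide, with one conflicting read
# at position 2 ("S" instead of "A").
example_reads = [
    (1, "G"), (2, "A"), (1, "G"), (3, "K"),
    (2, "A"), (2, "S"), (3, "K"),
]
print(reconstruct_peptide(example_reads))  # prints "GAK"
```

In this sketch, the position encoded in each barcode is what lets reads arrive in any order and still be placed correctly. A real analysis would also need to handle sequencing errors inside the barcodes themselves, which this example ignores.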
Being able to sequence proteins at the single-molecule level is a long-sought goal in the field of proteomics, the large-scale study of all proteins in a cell or organism. Unlike DNA sequencing, which has been revolutionized by cheap, fast, high-throughput tools, protein sequencing has lagged behind. This matters enormously for medicine, because proteins — not genes — are what actually carry out biological functions and what most drugs target. Post-translational modifications in particular play critical roles in diseases like cancer, neurodegeneration, and autoimmune conditions, and detecting them precisely could open new doors for diagnostics and drug development.
This research is an early-stage proof of concept, meaning it demonstrates the underlying approach works but has not yet been scaled to analyze complex biological samples or full proteomes. Still, by reframing protein sequencing as a DNA sequencing problem, it allows the field to ride the wave of continued improvements in DNA sequencing technology, which has already dropped in cost by orders of magnitude over the past two decades. If the approach can be developed further into a high-throughput platform, it could give researchers an unprecedented ability to catalog the full complexity of proteins in individual cells, potentially transforming how scientists study disease at its most fundamental level.