ProteinScatter: Visualizing Structurally Similar Proteins with 3Di Embeddings

Trained a GPT-like model on more than 300,000 protein 3Di sequences (from Foldseek) and visualized the resulting embeddings in a 2D scatterplot via UMAP.
Molecular Modeling with Juan Vanegas, Oregon State University (2024). Corvallis, OR
Demo
Paper
Cite
BibTeX
@misc{bertucci2024proteinscatter,
  author = {Donald Bertucci},
  title = {ProteinScatter: Visualizing Structurally Similar Proteins with 3Di Embeddings},
  booktitle = {Molecular Modeling with Juan Vanegas, Oregon State University (2024). Corvallis, OR},
  year = {2024},
  url = {https://xnought.github.io/files/protein-scatter.pdf},
}
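
As a rough illustration of the embedding step summarized in the description, the sketch below tokenizes 3Di strings and mean-pools per-token vectors into one fixed-length vector per protein. It is not the project's code. The random embedding table is a hypothetical stand-in for the trained GPT-like model, mean pooling is an assumed choice, and the two example strings are made up. The names `tokenize`, `make_embedding_table`, and `embed` are illustrative only. The 2D projection is left out; the resulting vectors are what a reducer such as UMAP would take as input.

```python
import random

# The 3Di alphabet has 20 structural states, written with the same
# letters as the 20 standard amino acids.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {ch: i for i, ch in enumerate(ALPHABET)}


def tokenize(seq):
    """Map a 3Di string to integer token ids (case-insensitive)."""
    return [TOKEN_ID[ch] for ch in seq.upper()]


def make_embedding_table(dim, seed=0):
    """Hypothetical stand-in for a trained model: one random vector per token."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in ALPHABET]


def embed(seq, table):
    """Mean-pool the per-token vectors into one fixed-length vector."""
    ids = tokenize(seq)
    if not ids:
        raise ValueError("empty sequence")
    dim = len(table[0])
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(dim)]


# Made-up 3Di strings of different lengths, for illustration only.
sequences = ["DVVLVVCC", "DPPLVVVVVD"]
table = make_embedding_table(dim=8)
vectors = [embed(s, table) for s in sequences]
```

Each entry of `vectors` has the same length regardless of sequence length, which is what makes the proteins comparable as points before any 2D projection.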