Elana P. Simon

I’m in the third year of my PhD at Stanford University, advised by James Zou, working on various projects at the intersection of machine learning and biology. Previously I worked at Reverie Labs as an ML engineer, helping design small molecule cancer drugs. As an undergrad, I studied Computer Science at Harvard and worked with Debora Marks on protein language models. I have also been quite involved with research, fundraising, and patient advocacy focused on Fibrolamellar Hepatocellular Carcinoma.

Currently I’m very excited about trying to understand what ML models are actually learning from protein sequences and structures: digging into their embeddings to find interpretable biological concepts, figuring out how they pick up these patterns during training, and seeing what biology we can uncover by reverse-engineering the models!

I also aim to write up a bunch of ML-bio deep-dives on my blog (matols), but those are pretty high-effort and thus low-frequency.

selected publications

  1. Detection of a recurrent DNAJB1-PRKACA chimeric transcript in fibrolamellar hepatocellular carcinoma
    *Joshua N Honeyman, *Elana P Simon, Nicolas Robine, and 8 more authors
    Science, 2014
  2. Protein design and variant prediction using autoregressive generative models
    Jung-Eun Shin, Adam J Riesselman, Aaron W Kollasch, and 6 more authors
    Nature Communications, Apr 2021
  3. ChemBERTa-2: Towards chemical foundation models
    *Walid Ahmad, *Elana Simon, Seyone Chithrananda, and 2 more authors
    ELLIS Machine Learning for Molecule Discovery Workshop, Dec 2021
  4. Language models for biological research: a primer
    *Elana Simon, *Kyle Swanson, and James Zou
    Nature Methods, Aug 2024
  5. InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
    Elana Simon and James Zou
    bioRxiv, Nov 2024