publications
2025
- Towards functional annotation with latent protein language model features. In ICML 2025 Generative AI and Biology (GenBio) Workshop, 2025
- Massively parallel immunopeptidome by DNA sequencing provides insights into cancer antigen presentation. Nature Genetics, 2025
2024
- Language models for biological research: a primer. Nature Methods, Aug 2024
- UniTox: Leveraging LLMs to Curate a Unified Dataset of Drug-Induced Toxicity from FDA Labels. NeurIPS Datasets and Benchmarks, Dec 2024
- InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders. bioRxiv, Nov 2024
2022
- Compounds, compositions and methods of treating disorders. US Patent App. 17/744,228, Sep 2022
2021
- Protein design and variant prediction using autoregressive generative models. Nature Communications, Apr 2021
- ChemBERTa-2: Towards chemical foundation models. ELLIS Machine Learning for Molecule Discovery Workshop, Dec 2021
2020
- The Fibrolamellar Registry: A patient-based medical registry can address medical care. Cancer Research, Dec 2020
2019
- Accelerating protein design using autoregressive generative models. bioRxiv, Sep 2019
- The fibrolamellar registry: A model for the study of rare diseases. Cancer Research, Sep 2019
2018
- Non coding RNA analysis in fibrolamellar hepatocellular carcinoma. Oncotarget, Feb 2018
2015
- Molecular analysis of the pediatric cancer fibrolamellar hepatocellular carcinoma. Cancer Research, Apr 2015
- Transcriptomic characterization of fibrolamellar hepatocellular carcinoma. Proceedings of the National Academy of Sciences, Apr 2015
2014
- Detection of a recurrent DNAJB1-PRKACA chimeric transcript in fibrolamellar hepatocellular carcinoma. Science, Apr 2014