BAGIM is an active community of Boston area scientists bringing together people from diverse fields of modeling and informatics to impact life and health sciences. BAGIM strives to create a forum for great scientific discussions covering a wide range of topics including data management, visualization, computational chemistry, drug discovery, protein structure, molecular modeling, structure-based drug design, data mining, software tools, and the sharing of goals and experiences. Our community is made up of participants from academia, government, and industry whose goal is to engage in the discussion of science involving a synthesis of theory and technology. Discussions sponsored by BAGIM are targeted to the needs and interests of informatics scientists, computational chemists, medicinal chemists, and statisticians. BAGIM also provides opportunities for networking within these disciplines as well as an arena for the dissemination of information of specific interest to the membership.

Thursday, December 22, 2022

Enric Herrero - Exploring the Exa-space with 3D-based screening methods

We are happy to announce our speaker, Enric Herrero, presenting "Exploring the Exa-space with 3D-based screening methods".

Please join us for an in-person event

Date: Thursday, March 23, 2023 at 6:00 PM

Sign-up: BAGIM Meetup

Title:  Exploring the Exa-space with 3D-based screening methods


In the last 5 years, many huge commercial libraries have appeared that include billions of molecules with reasonable synthesizability and represent a huge opportunity for the drug discovery ecosystem. Bigger libraries can increase the chances of finding interesting hits in a drug discovery project but are also a challenge to process with existing in-silico methods.

Several approaches have been proposed to deal with these chemical spaces that use simple molecule representations to contain the computational and storage cost. However, 3D representations add valuable binding information about the ligands, and representations that are heavily tied to the chemical structure tend to limit the chemical diversity of the proposed solutions.

In this talk we will discuss how to explore huge chemical spaces at the exa-scale without sacrificing the advantages of 3D-based methods. Different approaches will be presented and compared to show the advantages and disadvantages of different molecule representations and screening methods and to open a debate around the topic.

Friday, December 2, 2022

Soumya Raychaudhuri: Using genetics & genomics to define mechanisms of autoimmunity

We are happy to announce our latest speaker, Soumya Raychaudhuri, Professor of Medicine and Biomedical Informatics, Harvard Medical School, https://immunogenomics.hms.harvard.edu/

Please join us for a virtual event on

Date: April 12, 2023

Time: noon ET / 9 am PT

Sign-up: BAGIM Meetup

Title: Using genetics and genomics to define mechanisms of autoimmunity

Abstract: To be confirmed

About the speaker:

Dr. Soumya Raychaudhuri is an Associate Professor at Harvard Medical School and at Brigham and Women’s Hospital. He is also appointed as an Associate Member at the Broad Institute and a Visiting Professor in Genetics at the University of Manchester. He matriculated into the Stanford University NIH-funded MST program in 1997 after completing degrees in mathematics and biophysics at the University at Buffalo. In 2004, he completed both his medical training and his doctoral training in biomedical informatics. After completing his clinical training in Internal Medicine, he joined the rheumatology fellowship training program in 2006, and concurrently completed his postdoctoral fellowship training under Mark Daly at the Broad Institute. Since starting his own group in 2010 at Harvard Medical School and Brigham and Women’s Hospital, his lab has focused on finding and fine-mapping disease alleles and understanding their significance in immune-mediated diseases. He has focused on multiple diseases including rheumatoid arthritis, type I diabetes, and tuberculosis infection. He has worked on fine-mapping HLA loci, devising integrative statistical genetics strategies to identify causal variation by taking advantage of large-scale epigenetic data, and integrating genetic data with functional genomics and data on human immunological phenotypes.

John Irwin: Chemical screening library growth: is bigger better?

We are happy to announce our latest speaker, John Irwin, UCSF Pharmaceutical Chemistry, http://irwinlab.compbio.ucsf.edu

Please join us for a virtual event on

Date: January 11, 2023

Time: noon ET / 9 am PT

Sign-up: BAGIM Meetup

Title: Chemical screening library growth: is bigger better?

Abstract: To be confirmed

About the speaker:

John Irwin is Adjunct Associate Professor in the Department of Pharmaceutical Chemistry at the University of California San Francisco. John works on methods to discover new reagents for biological targets using target- and ligand-based methods. He applies these methods to various projects and participates in numerous collaborations.

John develops the ZINC database of commercially available compounds for virtual screening, the DUD-E database for benchmarking virtual screening methods, the DOCK Blaster family of products for web-based virtual screening, and the SEA method of comparing proteins by the ligands they bind.

Short bio: John received his Ph.D. for work with Jack Dunitz in chemical crystallography in 1991 at ETH Zurich. He worked at a startup molecular modeling company before joining Gerard Bricogne's group at the Medical Research Council's Laboratory of Molecular Biology in Cambridge, UK, developing software for new macromolecular crystallographic structure solution and refinement methods (BUSTER and SHARP). He spent a year and a half as a staff scientist at EMBL-EBI, also in Cambridge, working on the PDB cleanup project before moving to Northwestern University Medical School in 2000. He has been working on DOCK, ZINC, and SEA since 2000.

Alex Wiltschko - Digitizing olfaction for health and happiness

BAGIM is excited to welcome members back for our first in-person (CONFIRMED) event since 2020!

Date: December 14, 2022

Time: 6 pm ET

Location: Google - 150 Broadway, Cambridge, MA 02142

Sign-up: BAGIM Meetup

Alex Wiltschko will present "Digitizing olfaction for health and happiness: progress and opportunities"

Abstract: Computers can see and hear, but they cannot smell. We know living noses can smell COVID-19, people trapped in rubble, cancer, and Parkinson's disease. If we could digitize smell, people would live longer, happier lives. But how can we measure a smell? Smells are produced by molecules that waft through the air, enter our noses, and bind to sensory receptors. Potentially billions of molecules can produce a smell, so cataloging or predicting which ones produce which smells is difficult. Sensory maps can help us solve this problem. Color vision has the most familiar examples of these maps, from the color wheel we each learn in primary school to more sophisticated variants used to perform color correction in video production. While these maps have existed for centuries, useful maps for smell have been missing, because smell is a harder problem to crack: molecules vary in many more ways than photons do; data collection requires physical proximity between the smeller and smell (we don’t have good smell “cameras” and smell “monitors”); and the human eye has only three sensory receptors for color while the human nose has > 300 for odor. As a result, previous efforts to produce odor maps have failed to gain traction.

We introduce the “Principal Odor Map” (POM), which identifies the vector representation of each odorous molecule in the model’s embedding space as a single point in a high-dimensional space. The POM has the properties of a sensory map: first, pairs of perceptually similar odors correspond to two nearby points in the POM (by analogy, red is nearer to orange than to green on the color wheel). Second, the POM enables us to predict and discover new odors and the molecules that produce them. We demonstrate that the map can be used to prospectively predict the odor properties of molecules, understand these properties in terms of fundamental biology, and tackle pressing global health problems. We discuss each of these promising applications of the POM and how we test them.
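As a rough illustration of the "nearby points" property described above, the sketch below looks up the nearest neighbor of a point in a tiny made-up embedding. The names `toy_map`, `distance`, and `nearest`, the three-dimensional coordinates, and the use of Euclidean distance are all assumptions for illustration; they are not taken from the POM or from the speaker's model.

```python
import math

# Hypothetical 3-dimensional coordinates, invented for illustration only.
# A real odor map would use learned, much higher-dimensional embeddings.
toy_map = {
    "odor_a": (0.9, 0.1, 0.0),
    "odor_b": (0.8, 0.2, 0.1),
    "odor_c": (0.0, 0.1, 0.9),
}


def distance(p, q):
    """Euclidean distance between two points (an assumed choice of metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(name, embedding):
    """Return the key whose point lies closest to the point stored under `name`."""
    return min(
        (other for other in embedding if other != name),
        key=lambda other: distance(embedding[name], embedding[other]),
    )


closest_to_a = nearest("odor_a", toy_map)
print(closest_to_a)
```

In a map with the property the abstract describes, the neighbor returned this way would be expected to smell similar to the query odor.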

Digitizing Smell: Using Molecular Maps to Understand Odor – Google AI Blog (googleblog.com)

Machine Learning Highlights a Hidden Order in Scents

AI Model Links Smell Molecules With Metabolic Processes | Quanta Magazine

Prof. Charlotte Deane - From Machine learning to the physics of binding


Please join us for a BAGIM/SAGIM joint event sponsored by CCDC.

Date: December 7th, 2022

Time: 10 am PT/1 pm ET/6 pm BT

Sign-up: BAGIM Meetup

Title: From Machine learning to the physics of binding


Fueled by the success of machine learning in a wide range of domains, there is significant interest in the application of machine learning to early-stage drug discovery in areas from designing novel compounds to screening libraries of compounds against a specific target.

There has been particular interest in machine-learning based scoring functions for predicting the binding of small molecules to target proteins. The aim of these functions is to approximate the distribution which takes two molecules as input and outputs the energy of their interaction. This distribution is dependent on interactions between the atoms of the two molecules and the solvent, and only a scoring function that accounts for these interactions can accurately predict binding affinity on novel/unseen molecules.

To attempt to create a method capable of learning these interactions we built PointVS, a machine learning scoring function, which achieves state-of-the-art performance even after performing rigorous filtering of the training set. PointVS appears able to identify important binding interactions and is the first deep learning-based method for extracting important binding information from a target for molecule design.

About Speaker

Charlotte Deane is a professor of Structural Bioinformatics and former Head of the Department of Statistics at the University of Oxford. She completed her undergraduate education at University College, Oxford and went to the University of Cambridge to study structural bioinformatics. Prof. Deane worked as a Wellcome Trust Research Fellow for two years. She was recently awarded an MBE in the Queen’s Birthday Honours. Her research focuses on protein structure prediction, particularly antibodies. Her research group, the Oxford Protein Informatics Group (OPIG), created SAbDab, a database of antibody structures, and SAbPred, a web server for antibody structure prediction. Prof. Deane's research also covers immunoinformatics, biological networks, and small molecule drug discovery.

EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

Thursday, April 14, 2022 at 12:00 PM


Octavian Ganea


Geometric Deep Learning Models for Predicting 3D Structures and Interactions of Molecules


Understanding 3D structures and interactions of biological nano-machines, such as proteins or drug-like molecules, is crucial for assisting drug and therapeutics discovery. A core problem is molecular docking, i.e., determining how two proteins or a protein and a drug-molecule attach and create a molecular complex. Having access to very fast computational docking tools would enable applications such as fast virtual search for drugs inhibiting disease proteins, in silico molecular design, or rapid drug side-effect prediction. However, existing computer models follow a very time-consuming strategy of sampling a large number (e.g., millions) of molecular complex candidates, followed by scoring, ranking, and fine-tuning steps. In this talk, I will show that geometry and deep learning (DL) can significantly reduce the enormous search space associated with the docking and molecular conformation problems. I will present my recent DL architectures, EquiDock and EquiBind, that perform a direct shot prediction of the molecular complex, and GeoMol, that models molecular flexibility. I will argue that the governing laws of geometry, physics, or chemistry that naturally constrain these 3D structures should be incorporated in DL solutions in a mathematically meaningful way. I will explain our key modeling concepts such as SE(3)-equivariant graph matching networks, attention keypoint sets, optimal transport for binding pocket prediction, and torsion angle neural networks. These approaches reduce the inference runtimes of open-source or commercial software from tens of minutes or hours to a few seconds, while being competitive or better in terms of quality. Finally, I will highlight a number of exciting on-going and future efforts in the space of artificial intelligence for structural biology and chemistry.
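The abstract argues that geometric constraints should be built into the models. One simple way to see what that symmetry means: a rigid motion (rotation plus translation, the group SE(3) named above) changes atomic coordinates but leaves interatomic distances unchanged. The sketch below checks this on made-up points. It is not code from EquiBind, EquiDock, or GeoMol; the function names and coordinates are hypothetical, and the rotation is restricted to the z-axis for brevity.

```python
import math


def rigid_transform(points, angle, shift):
    """Rotate points about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


def pairwise_distances(points):
    """Distances between every unordered pair of points, in a fixed order."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


# Hypothetical 3D coordinates, invented for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, -1.0)]
moved = rigid_transform(atoms, angle=0.7, shift=(4.0, -2.0, 1.0))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)
unchanged = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
print(unchanged)
```

A model whose inputs are such distances cannot be affected by how the molecule happens to be placed in space. Equivariant architectures extend the same idea to outputs that are themselves coordinates.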


Octavian Ganea is a postdoctoral researcher at CSAIL-MIT working with Tommi Jaakkola and Regina Barzilay on deep learning solutions for drug discovery and structural biology using geometric and physical inductive biases. He is part of and contributes to the Machine Learning for Pharmaceutical Discovery and Synthesis consortium, the Abdul Latif Jameel Clinic for Machine Learning in Health, the DARPA Accelerated Molecular Discovery program, and the ELLIS society. Octavian received his PhD from ETH Zurich under the supervision of Thomas Hofmann working on non-Euclidean representation learning for graphs, hierarchical data, and natural language processing. His published research includes a spotlight at ICLR 2022, spotlights at NeurIPS 2021 and 2018, and oral talks at ICML 2018 and 2019.

Structure and ligand-based analysis: combining the best of both worlds

 Tuesday, October 26, 2021 at 12:00 PM


Structure- and ligand-based analysis - How to combine the best of two worlds for driving compound design

Save the date for this Discngine-sponsored panel discussion.

Malin Lemurell, Executive Dir., Head of Medicinal Chemistry Early CVRM BioPharmaceuticals R&D at AstraZeneca

Troy D. Smith, Senior Expert I, Data Science at Novartis Institutes for BioMedical Research (NIBR)

And the moderators: Gabriella Jonasson, 3decision product owner at Discngine

Structure-based and ligand-based methods are two general approaches typically used to derive rational insight and guidelines for both SAR analysis and compound design / prioritization. Both bring different yet complementary levels of information that are valuable to most drug discovery projects.

Combining insights from the two worlds can be tricky since each side requires very different skillsets, theoretical knowledge, and software tools.

This panel discussion will cover the current challenges and future opportunities for leveraging structure- and ligand-based data in drug discovery teams on a routine basis. We aim to define a set of good practices together and identify solutions for the current challenges.