Learning Reaction SMARTS: A Practical Guide to Reaction-Based Patterns

Introduction: In this tutorial, we’ll dive into using Reaction SMARTS to define chemical transformations. Reaction SMARTS is a powerful tool for anyone in cheminformatics or drug discovery who wants to write chemical transformations in a structured, automatable way. It’s particularly valuable for virtual synthesis, reaction prediction, and automated workflows for compound libraries. In this…

Taming the Chaos: Cleaning Data for Reliable ADMET Models

Building machine learning models for ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) predictions puts you on the front lines of drug discovery. It’s exciting work: pushing the boundaries of what’s possible and using algorithms to predict how molecules behave in the body. But the part they don’t tell you? The real battle isn’t in designing models or…

How to Build Virtual Chemical Libraries with Fragment Analogues: ChemX

Introduction: I want to introduce you to ChemX, a Python-based program I developed during a hackathon in 2019. It builds virtual chemical libraries from fragment analogues of a target molecule’s building blocks, using the RDKit library to assemble chemically similar fragments. What ChemX…

How to Highlight Molecular Substructures: Celebrating Commonalities and Differences

Introduction: Today, we will dive into molecular substructure highlighting with RDKit – a powerful technique that illuminates the hidden intricacies within molecular compounds. In this tutorial, I will be focusing on two things: If you are interested in more cheminformatics-related tutorials, check my other blog posts here. Section 1: Understanding the Power of Structure…

Nested Cross-Validation & Cross-Validation Series – Part 2A

This is Part 2A of the Nested Cross-Validation & Cross-Validation Series. I will go through a Python tutorial on implementing k-fold CV regressors using random forest (RF) from scikit-learn with the first dataset: (A) a simple cheminformatics dataset with descriptors and endpoints of interest. In Part 2B, I will cover the same Python tutorial for…