How to Build Virtual Chemical Libraries with Fragment Analogues: ChemX

Introduction: I want to introduce you to ChemX, a Python-based program I developed during a hackathon in 2019. You can use it to build virtual chemical libraries from fragment analogues of the target molecule's building blocks. Using the RDKit library, ChemX assembles chemically similar fragments to create a virtual chemical library. What ChemX…

How to Highlight Molecular Substructures: Celebrating Commonalities and Differences

Introduction: Today, we will dive into molecular substructure highlighting with RDKit, a powerful technique that illuminates the hidden intricacies within molecular compounds. In this tutorial, I will be focusing on two things: If you are interested in more cheminformatics-related tutorials, check my other blog posts here. Section 1: Understanding the Power of Structure…

How to Merge Multiple Datasets with Pandas and Python – Part 1

Today’s tutorial is on how to merge multiple datasets using the Pandas library in Python. We will add new columns based on a key column, and we will also aggregate information for the same column names from various datasets. I have made five sample datasets (A1.csv, A2.csv, A3.csv, A4.csv, A5.csv) that we will be merging.…

How to Box Plot with Python

This blog post is for readers as well as myself. In this tutorial, I will show how to make different types of boxplots including horizontal, vertical, grouped boxplots, and interactive ones. It’s not meant to be comprehensive. It’s just a collection of different styles and visualizations that I like. For the code, you will need…

Nested Cross-Validation & Cross-Validation Series – Part 2B

Please check out the previous blog posts from this series if you haven’t done so already: Part 1, on the algorithm for k-fold cross-validation, and Part 2A of the Nested Cross-Validation & Cross-Validation Series, where I went through a Python tutorial on implementing k-fold CV regressors using random forest (RF) from scikit-learn with a simple cheminformatics dataset with descriptors…