Pablo Rodríguez Belenguer obtained a degree in Pharmacy from the University of Valencia in 2012 and, in 2013, completed a Master's degree in Pharmacology and Pharmacokinetics at the same university. His professional career began in 2014 in the pharmaceutical industry, focusing primarily on lymphoproliferative syndromes. He initially worked at Janssen-Cilag and, in November 2019, joined Kite-Pharma (Gilead), where he worked with CAR-T cells as a Cell Therapist Account Manager. In July 2020, he completed a Master's degree in Data Science at MBIT School. In January 2021, he joined the PharmacoInformatics group at UPF, led by Manuel Pastor, as a predoctoral researcher, and during the same year he completed a Master's degree in Artificial Intelligence. Pablo is on track to receive his Ph.D. in computational toxicology, expected to be conferred in February 2024.
OpenTox Virtual Conference 2023
Unlocking the Black Box: A Practical Session on Explainable AI with Gradient Boosting Approaches
Explainable AI (XAI) is a critical aspect of modern machine learning and predictive modelling, as it bridges the gap between black-box models and human understanding. This presentation provides a comprehensive overview of gradient boosting methods, with a particular emphasis on the XGBoost algorithm in the context of Explainable AI.
Gradient boosting is a technique that enhances model performance by aggregating weak learners sequentially. Several gradient boosting implementations exist; given the breadth of the topic, this session focuses on XAI with eXtreme Gradient Boosting (XGBoost). XGBoost builds an ensemble of decision trees one at a time, with each new tree correcting the errors of the ensemble built so far, so that predictions improve continually. It is known for its speed, efficiency, and ability to handle missing data, incorporates regularization techniques to prevent overfitting, and is widely used across many fields for its accuracy and versatility.
With respect to XAI, XGBoost quantifies feature importance using five key metrics: "Weight," the number of times a feature is used to split the data across the ensemble; "Gain," the average improvement in the objective contributed by splits on that feature; "Cover," the average number of observations affected by those splits; "Total Gain," which sums gain over all splits employing the feature; and "Total Cover," which sums cover over all such splits. Together, these metrics let XGBoost assign feature importance scores, offering comprehensive insight into each feature's contribution and enhancing model transparency and interpretability for XAI.
To enhance understanding, we also provide a Google Colab notebook (https://colab.research.google.com/drive/1rKQR-52Fh0mxKgaXgPI1Bqq6fiybIqHJ?usp=sharing) that applies XAI with XGBoost to molecular fingerprints and physicochemical descriptors, demonstrating the approach in a hands-on fashion.