Frameworks for using Large Language Models in Toxicological Risk Assessment
Large Language Models (LLMs) (e.g., ChatGPT, Google Gemini) have the potential to increase the speed, comprehensiveness, and overall quality of toxicological risk assessments. These models can efficiently extract and collate data from documents, draft reports, and summarize studies. However, several technical limitations of such models have been reported (e.g., inconsistent outputs, prompt sensitivity, hallucinations, and lack of transparency) that must be overcome before LLMs can be used effectively in this context. To address these challenges, a framework has been developed outlining the use of LLMs to support toxicological assessments associated with exposure to chemicals or perturbation of biological targets (such as receptors). This framework addresses the limitations of LLMs and is informed by input from an industry consortium as well as case studies. An example illustrating the use of the framework to assess target biology will be presented, covering the major steps in the process from data capture to evidence interpretation.