Artificial intelligence (AI) systems, and machine learning (ML) technologies in particular, learn to accomplish tasks and identify patterns from the data they are given instead of relying on hand-written rules. This data-driven learning has helped biotechnology and pharmaceutical companies, as well as academic researchers, develop new AI technologies that improve drug design and discovery. In recent years, more efficient AI-based processes have helped reduce the cost and time of bringing a new drug to market, which average 2.6 billion USD and 12 years.
Automation and discovery have been the two major applications of AI at the industrial level. AI for automation has taken over repetitive work that is usually easy for humans, sometimes requiring no expertise at all, and has removed employees from dangerous jobs. Discovery tasks, on the other hand, are those that even experts may not be able to accomplish with high accuracy. For example, predicting which small molecule could interact with a target protein is not a trivial task without the help of AI. Such tasks need expertise, patience, and trial and error, combined with experimental effort. As human experts, we also have limited capacity to consider the huge space of possibilities in each discovery task, which can include millions of chemical compounds that are candidates for a new drug against a target protein. Here, AI can help us identify relationships and patterns that human experts do not necessarily know. For example, MatchMaker is an AI-enabled deep learning engine that predicts the polypharmacology of small molecules as the foundation for small molecule drug discovery. Cyclica developed MatchMaker in 2018, and it continually improves its use of information about small molecules and proteins to predict which small molecule can interact with which protein, drawing on drug-target interaction (DTI) data in public and private databases. MatchMaker is an example of a successful AI technology designed primarily for hit discovery in the early stages of drug discovery (Figure 1).
Figure 1. Schematic representation of the application of AI in different stages of the drug discovery process. You can find more about MatchMaker and Graph Neural Network (GNN) target ID here.
Successful AI technologies in healthcare and drug discovery are mainly built using machine learning algorithms. Let’s consider three main tasks in building a supervised ML model:
1) Processing the data and providing features and labels (or continuous values) for an ML system.
2) Training the ML model and letting it learn the relationships between features of each data point (characteristics of data points) and their labels.
3) Assessing the performance of the models using proper testing strategies, in addition to all software-related requirements like data, code and model versioning.
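The three tasks above can be sketched in a few lines of Python. This is a minimal illustration only, not how MatchMaker or any Cyclica system is built: the data is a synthetic toy set with a made-up labeling rule, the model is a plain logistic regression trained by gradient descent, and all function names (make_toy_dataset, split, train_logistic, accuracy) are hypothetical. Real drug discovery data would need domain-specific featurization of molecules and proteins, and a testing strategy chosen with far more care than a random split.

```python
import math
import random


def make_toy_dataset(n=200, seed=0):
    """Task 1 (stand-in): produce (features, label) pairs.

    Assumption: synthetic 2-feature points with a made-up labeling rule,
    used only so the example runs without any external data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        label = 1 if x1 + x2 > 0 else 0
        rows.append(([x1, x2], label))
    return rows


def split(rows, test_fraction=0.25, seed=0):
    """Hold out a random fraction of the data for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(train_rows, epochs=200, lr=0.5):
    """Task 2: learn weights relating features to labels
    (full-batch gradient descent on the logistic loss)."""
    n_features = len(train_rows[0][0])
    w, b = [0.0] * n_features, 0.0
    n = len(train_rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in train_rows:
            err = predict_proba((w, b), x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(model, rows):
    """Task 3: fraction of rows whose predicted class matches the label."""
    correct = sum((predict_proba(model, x) >= 0.5) == (y == 1) for x, y in rows)
    return correct / len(rows)


data = make_toy_dataset()
train_rows, test_rows = split(data)
model = train_logistic(train_rows)
test_accuracy = accuracy(model, test_rows)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The key design point is that the model is scored only on rows it never saw during training. The fixed seeds are a small nod to the reproducibility concerns (data, code and model versioning) mentioned in task 3.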
These tasks are necessary for building AI technologies used across drug discovery and development, from hit discovery to guiding clinical trials. Each task is complex and draws on software development, domain knowledge such as structural bioinformatics and medicinal chemistry, algorithm design, statistics, and more. It is evident that this is not a job that can be done successfully by a single expert, or by a group of experts who all share the same specialty, such as structural bioinformatics or algorithm design. The success of companies like Cyclica relies on experts from different fields collaborating to build and properly use AI technologies like MatchMaker to guide the drug discovery and design process. This collaboration also lets these companies avoid mistakes that experts in one field can make when they lack a proper understanding of other aspects of AI technology development.
Data is the core of building machine learning models and technologies, and its availability has determined the progress made in applying AI to drug discovery. Applications like hit discovery for proteins, as in MatchMaker, have succeeded by relying on millions of public and private data points available to academic and industrial teams. Other applications, such as protein design, lead optimization, translatability prediction, and guiding clinical trials in later stages of drug development, have progressed as the right data became accessible. As new data becomes available through databases, and as new technologies generate higher-quality data in larger quantities, we will see even further progress in the application of AI in drug discovery. The rest is on us: knowing how to use that data to build successful AI technologies.
Throughout our AI Drug Discovery campaign over the next month, you’ll hear more about the collaborative environment we have built at Cyclica, as well as the pitfalls of AI in drug discovery and how we have avoided them.