Abstract
Drug targets are specific molecules in biological tissues and body fluids that interact with drugs. Drug target discovery is a key component of drug discovery and is essential for the development of new drugs in areas such as cancer therapy and precision medicine. Traditional in vitro or in vivo target discovery methods are time-consuming and labor-intensive, limiting the pace of drug discovery. The development and application of emerging technologies have greatly improved the efficiency of drug discovery, shortened the cycle time, and reduced the cost. This review provides a comprehensive overview of emerging drug target discovery strategies, including computer-assisted approaches, drug affinity responsive target stability, multiomics analysis, gene editing, and nonsense-mediated mRNA degradation, and discusses the effectiveness and limitations of these approaches, as well as their application in real cases. Together, these sections give a general overview of the development of novel drug targets and disease treatment strategies and provide a theoretical basis for researchers in the pharmaceutical sciences.
Significance Statement
Target-based drug discovery has been the main approach to drug discovery in the pharmaceutical industry for the past three decades. Traditional drug target discovery methods based on in vivo or in vitro validation are time-consuming and costly, greatly limiting the development of new drugs. Therefore, the development and selection of new methods in the drug target discovery process are crucial.
I. Introduction
Drug targets are biomolecules in the body with which a drug interacts directly. The discovery of drug targets is crucial in the development of new drugs for cancer therapy and precision medicine (An and Yu, 2021; Koivisto et al., 2022). Most drugs exert their effects by interacting with target molecules in vivo, and the effectiveness of a drug largely depends on its target (Hwang et al., 2021). Therefore, the search for new drug targets has become a focus of intense competition in innovative drug research.
A good target should be safe, effective, clinically and commercially viable, and druggable (Gashaw et al., 2011; Zhao et al., 2023b). The ideal drug target should have the following characteristics: first, it should be closely related to the target disease, and its regulatory mechanism should be an important factor in the disease; second, it should have one or more sites where it can bind to other structural substances; third, it should be modifiable so that the drug can modulate it when needed to achieve a therapeutic effect; fourth, the physiological effects arising from its structure should play an essential role in a complex regulatory process; and fifth, there may be endogenous small molecules or exogenous ligands that bind to it and have known pharmacological effects (Gashaw et al., 2011).
As a critical part of drug development, target discovery and characterization begin with establishing the function of a potential therapeutic target and its role in disease (Xia et al., 2021). Target-based drug discovery has been the primary approach to drug discovery in the pharmaceutical industry for the past three decades (Barnash et al., 2017; Ap and Guo, 2019; Luo et al., 2019). Traditional drug target discovery methods based on in vitro or in vivo validation are time-consuming and costly, significantly limiting the development of new drugs (Parvathaneni et al., 2019). Therefore, selecting the appropriate method for drug target discovery is crucial. In recent years, various new technologies have been applied to drug target discovery and identification and have greatly facilitated the process. In this review article, we discuss by category a variety of new and efficient target discovery methods, including drug affinity responsive target stability (DARTS), network and machine learning-based approaches, gene editing, nonsense-mediated mRNA degradation (NMD), and multiomics, from the perspectives of both drug-centered target discovery and disease-centered target discovery. In addition, we analyze the published literature on drug target discovery in the Web of Science Core Collection database (Supplemental Fig. 1A). Through bibliometric analysis, we identified drug-target interactions, drug discovery, machine learning, and deep learning as global research hotspots and cutting-edge trends in recent years (Supplemental Fig. 1, B and C).
II. Emergence of New Drug Target Discovery Methods
Target discovery methods can be divided into two categories according to their focus: drug-centered target discovery and disease-centered target discovery. Drug-centered target discovery focuses on identifying the molecular targets of existing drugs and characterizing the mechanisms of drug-target interactions. Disease-centered target discovery focuses on the backtranslation of pathological phenotypes to animal and in vitro models, with the aim of identifying the biological molecules responsible for these phenotypes through genetic manipulation of these models. In the following section, we provide a categorical overview of different target discovery approaches, including DARTS, network and machine learning-based approaches, gene editing, NMD, and multiomics, from the perspective of both drug-centered and disease-centered target discovery.
A. Drug-Centered Target Discovery Methods
1. Drug Affinity Responsive Target Stability
Under physiological conditions, proteins are in dynamic equilibrium. When a specific ligand (e.g., a drug) binds, a thermodynamically more stable state results from the hydrophobic, hydrogen bonding, or electrostatic interactions formed between the protein and the ligand. In this state, the conformational fluctuations of the target protein are significantly reduced and its resistance to protease hydrolysis is significantly increased (Li et al., 2023). DARTS is a relatively recent approach to target discovery that exploits this effect. It monitors changes in the stability of proteins that bind biologically active small molecules by observing whether ligands in mixed solutions can protect target proteins from proteolytic degradation, thereby revealing the interactions between ligands and proteins in cells or tissues and identifying potential target proteins (Hwang et al., 2020b; Huang et al., 2021).
DARTS consists of five steps (Fig. 1A): 1) Sample preparation: cell lysates or purified proteins are used to prepare protein libraries for subsequent assays. 2) Small molecule or drug treatment: each aliquot of the protein specimen is treated with a specific small molecule or drug candidate, usually at a defined concentration, to assess its binding affinity to the target protein. 3) Protease treatment: the protein sample is divided into multiple aliquots, and each aliquot is treated with a protease, usually a nonspecific protease such as thermolysin or proteinase K, which degrades unprotected proteins in the sample. 4) Protein stability analysis: after proteolytic digestion, the treated and nontreated groups are analyzed via techniques such as SDS-PAGE or mass spectrometry, and the protein hydrolysis fragments generated from the two groups are compared. 5) Target protein identification: if a small molecule or drug binds to a protein, it stabilizes the protein and reduces its degradation, so proteins that are better preserved in the treated group are taken as candidate targets (Ren et al., 2021).
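The comparison in step 4 can be illustrated with a small calculation. The following is a minimal sketch rather than a protocol from the cited studies: it assumes that band intensities (arbitrary densitometry units) have already been measured for each protein with and without protease in both drug-treated and vehicle-treated lysates. The function names, the example proteins and intensities, and the 1.5-fold cutoff are all hypothetical.

```python
def fraction_remaining(digested, undigested):
    """Fraction of a protein band that survives protease treatment."""
    if undigested <= 0:
        raise ValueError("undigested intensity must be positive")
    return digested / undigested


def protection_ratio(drug, vehicle):
    """Ratio of surviving fractions: drug-treated over vehicle-treated.

    Each argument is a (digested, undigested) pair of band intensities.
    A ratio above 1 means the protein was degraded less in the presence
    of the drug, which is the readout DARTS looks for.
    """
    vehicle_fraction = fraction_remaining(*vehicle)
    if vehicle_fraction == 0:
        return float("inf")
    return fraction_remaining(*drug) / vehicle_fraction


def candidate_targets(bands, cutoff=1.5):
    """Return proteins whose protection ratio meets a hypothetical cutoff."""
    hits = {}
    for protein, (drug, vehicle) in bands.items():
        ratio = protection_ratio(drug, vehicle)
        if ratio >= cutoff:
            hits[protein] = ratio
    return hits


# Hypothetical densitometry values: ((digested, undigested) with drug,
#                                    (digested, undigested) with vehicle).
example_bands = {
    "protein_A": ((80.0, 100.0), (20.0, 100.0)),  # protected by the drug
    "protein_B": ((25.0, 100.0), (24.0, 100.0)),  # not protected
}
hits = candidate_targets(example_bands)
```

In this toy data set only `protein_A` passes the cutoff. Any real threshold would have to be chosen against replicate variability, and a hit of this kind would still require orthogonal validation.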
DARTS has several advantages in the drug target discovery process. It is a label-free small molecule target identification technique that can be used for both complex cell lysates and purified proteins (Hwang et al., 2020a). Compared with other target identification methods, it is relatively simple and cost-effective; it can be applied to most natural, unmodified small molecules and does not require large quantities of pure proteins. Moreover, it is not restricted to the phenotypic characterization of model organisms, and protein samples can be obtained from cell lines or tissues (Lomenick et al., 2009).
However, DARTS still has some limitations in drug discovery. On the one hand, because protein libraries are complex, nonspecific binding may occur between small-molecule drugs and proteins; on the other hand, positive results for low-abundance proteins can easily be overlooked in SDS-PAGE analysis. Moreover, incorrect manipulation may lead to false-positive results. Therefore, DARTS is usually not used alone but rather in combination with other techniques such as liquid chromatography/tandem mass spectrometry (Ferraro et al., 2022; Park et al., 2024), coimmunoprecipitation (Wang et al., 2021), and the cellular thermal shift assay (Wu et al., 2021) to validate and identify potential drug targets. In addition, further studies, such as functional assays and in vivo experiments, are required to confirm the relevance of the targets.
2. Network-Based and Machine Learning-Based Methods for Predicting Drug Targets
Target prediction based on network and machine learning methods has become essential for drug-target interaction (DTI) prediction (Jung et al., 2022). DTI prediction problems can be divided into four categories: 1) known drugs and known targets, 2) known drugs and new target candidates, 3) new drugs and known targets, and 4) new drugs and new target candidates. While the ultimate goal of network-based and machine-learning approaches is to predict interactions between new drugs and new target candidates, most approaches in the literature are limited to the first three categories (Bagherian et al., 2021).
Network-based approaches use relevant information from bioinformatics networks, such as protein-protein interaction networks, to predict drug targets (Zhang et al., 2022b). These methods usually rely on the assumption that proteins with similar interaction networks tend to have similar functions or participate in similar biological processes. The following are several commonly used network-based target discovery methods: 1) guilt by association: these methods assume that proteins interacting with known drug targets are likely to be potential drug targets themselves and identify proteins with similar function or structure to known drug targets through bioinformatics networks (Casas et al., 2019); 2) network-based inference: this method improves the accuracy of drug target prediction by combining and analyzing multiple bioinformatics networks and high-throughput data (Peng et al., 2021); 3) random walks: this method identifies the nodes most relevant to a known drug target by simulating random walks on the network (Liu et al., 2022a; Xu et al., 2022).
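To make the random walk idea concrete, the following is a minimal sketch of a random walk with restart on a toy interaction network, written in plain Python. It is an illustration under stated assumptions, not the implementation used in the cited studies: the network, the node names, and the restart probability of 0.3 are hypothetical, and the network is assumed to be undirected with no isolated nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adjacency: dict mapping each node to a list of its neighbours
               (undirected; every node needs at least one neighbour).
    seeds:     set of nodes that represent known drug targets.
    restart:   probability of jumping back to a seed at each step.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds ...
        nxt = {n: restart * start[n] for n in nodes}
        # ... and the rest is spread evenly over each node's neighbours.
        for n in nodes:
            share = (1.0 - restart) * prob[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if delta < tol:
            break
    return prob


# Hypothetical network: T1 is a known target; A-D are candidate proteins.
toy_network = {
    "T1": ["A", "D"],
    "A": ["T1", "B"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["T1"],
}
scores = random_walk_with_restart(toy_network, seeds={"T1"})
ranking = sorted((n for n in scores if n != "T1"), key=scores.get, reverse=True)
```

Here `ranking` orders the candidate proteins by their proximity to the known target. Direct neighbours score highest, which is the guilt-by-association assumption expressed as a diffusion process.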
Machine learning refers to the process by which a machine learns from a large amount of historical data through statistical algorithms and then guides other tasks by building an empirical model. It is a multidisciplinary field that studies how computers can simulate or implement human learning behaviors to acquire new knowledge, reorganize existing knowledge structures, and continuously improve their own performance (Narasimhan and Victor, 2023). Machine learning-based methods employ algorithms to learn patterns and relationships from training data to predict drug targets (Zhang et al., 2019). These methods typically use various molecular descriptors or features extracted from the chemical and biological properties of drugs and target proteins. Machine learning-based target discovery methods can be classified into three categories: 1) supervised learning: supervised learning is the machine learning task of inferring a function from labeled training data. The labeled data represent the correspondence between inputs and outputs, and the predictive model produces the corresponding outputs for the given inputs (He et al., 2023). In supervised learning, models are trained using labeled data of known drug-target interactions and can then predict new drug-target interactions (Chen et al., 2018, 2022b). Supervised learning algorithms include decision trees, random forests, support vector machines, and neural networks (Zhang and Ma, 2019). 2) Semisupervised learning: semisupervised learning is a key research problem in the field of pattern recognition and machine learning and combines supervised and unsupervised learning. It uses a large amount of unlabeled data together with labeled data to perform pattern recognition (Tanoori et al., 2021). 3) Deep learning: deep learning is the process of learning the intrinsic patterns and levels of representation of sample data, and the information gained from these learning processes can be of great help in the interpretation of data such as text, images, and sound (Deng et al., 2014). Deep learning models, especially deep neural networks, have broad application prospects in drug target prediction. These models can automatically learn complex representations from raw data to predict drug-target interactions (Bagherian et al., 2021; Gupta et al., 2021).
It is worth noting that network-based and machine-learning-based approaches are not mutually exclusive and may overlap. For example, machine learning models may include network-based features or use network-based similarity data as input for prediction. The choice of the specific approach depends on the available data, the specific problem, and the research objectives (Wu et al., 2018).
Current approaches that combine network-based and machine learning methods to identify unknown DTIs typically involve the following steps (Fig. 1B): 1) data collection and preprocessing: data on known drug-target interactions are collected from public databases or the literature, together with drug features (e.g., chemical structures, molecular descriptors) and target features (e.g., protein sequences, structural information). The collected data are preprocessed and standardized to ensure that the drug and target features are compatible. 2) Similarity analysis and feature fusion: similarity analysis is performed on the known drugs and on the known proteins, and features are compressed and fused to reduce feature dimensionality and information redundancy. 3) Prediction of unknown DTIs: unknown DTIs are predicted by trained machine learning models from the features of drugs and targets whose interactions are unknown (Chen et al., 2016; Wu et al., 2018).
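The similarity analysis and prediction steps can be sketched with a simple similarity-weighted nearest-neighbour scheme. This is a deliberately simplified stand-in for the trained models described above, not the method of the cited studies. It assumes that each drug is already encoded as a set of "on" fingerprint bits; the drug names, bit sets, and target names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_targets(new_fp, known_fps, known_dtis):
    """Similarity-weighted interaction scores for a query drug.

    new_fp:     set of fingerprint bits for the query drug.
    known_fps:  dict drug -> set of fingerprint bits.
    known_dtis: dict drug -> set of targets known to interact with it.
    Returns a dict target -> score in [0, 1].
    """
    sims = {drug: tanimoto(new_fp, fp) for drug, fp in known_fps.items()}
    total = sum(sims.values())
    targets = set().union(*known_dtis.values())
    if total == 0:
        return {t: 0.0 for t in targets}
    # A target scores highly when the drugs that hit it resemble the query.
    return {
        t: sum(s for drug, s in sims.items() if t in known_dtis.get(drug, set())) / total
        for t in targets
    }


# Hypothetical fingerprints and known interactions.
known_fps = {
    "drug_1": {1, 2, 3, 4},
    "drug_2": {1, 2, 3, 5},
    "drug_3": {7, 8, 9},
}
known_dtis = {
    "drug_1": {"target_X"},
    "drug_2": {"target_X", "target_Y"},
    "drug_3": {"target_Z"},
}
dti_scores = predict_targets({1, 2, 3, 4, 5}, known_fps, known_dtis)
```

In this example the query drug resembles `drug_1` and `drug_2`, so `target_X` receives the highest score and `target_Z` receives none. A practical pipeline would add target-side similarity, feature fusion, and a model trained and evaluated on held-out interactions.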
In recent years, computational methods for DTI prediction, which avoid the problems of costly and time-consuming large-scale in vitro and in vivo experiments, have attracted increasing interest (Zhang et al., 2023b). However, these methods also have their own limitations; e.g., the large amount of prediction data and the inaccuracy of the evaluation function usually lead to inefficient molecular docking. In addition, although several traditional machine learning models such as random forests and support vector machines have been applied in practice and greatly improve prediction efficiency, their performance and effectiveness remain unsatisfactory (Azlim Khan and Ahamed Hassain Malim, 2023). The introduction of deep learning models has largely addressed this problem, and the continuous improvement of the three modules of target feature extraction, drug feature extraction, and prediction, informed by various practical cases, has greatly promoted the development of drug-target interaction prediction (Li et al., 2022a).
B. Disease-Centered Target Discovery Methods
1. Gene Targeting Approaches Applied to Drug Target Discovery
Targeted therapy refers to intervention against key signaling molecules of a disease through specific drugs or therapeutic methods, enabling doctors to act precisely on specific molecular targets of the disease and effectively improving the precision and efficacy of treatment (Bedard et al., 2020). However, targeted therapy presupposes a known therapeutic target, which makes the discovery of such targets very important.
Currently, there are a number of drug target discovery technologies based on gene targeting. For example, CRISPR-Cas9 technology selectively disrupts or modifies specific genes to understand their roles in disease pathways, enabling systematic studies of gene function (Ravichandran and Maddalo, 2023); RNA interference technology can be used to transiently silence genes, enabling researchers to study their effects on cell function and disease pathways (Hu et al., 2020); zinc-finger nucleases can be used to edit specific genes for functional studies and to understand their role in disease (Swarthout et al., 2011); and transcription activator-like effector nucleases can be used to precisely edit genes, enabling researchers to create specific mutations for functional studies (Cao et al., 2018). Peptide self-assembly nanoplatforms based on “bottom-up” assembly methods have also been widely used for precise targeting and targeted therapy (Wang et al., 2023).
Among them, CRISPR-Cas-based functional screening is regarded as the most commonly used gene targeting technology in drug target discovery (Xu et al., 2021). The principle is that the guide RNA (gRNA) generated by CRISPR transcription directs the Cas nuclease to the target sequence, which the nuclease then cleaves (Manghwar et al., 2019). In the following, we take CRISPR library screening as an example to introduce its application in the process of drug target discovery.
CRISPRko, CRISPRi, and CRISPRa are the three most commonly used libraries in the drug target discovery process (Chan et al., 2022). In CRISPRko library screening, a single-guide RNA directs Cas9 to cut the DNA double strand at the target site, triggering the nonhomologous end-joining repair mechanism to achieve gene knockout. CRISPRko is commonly used to detect the loss of fitness in the population, such as reduced viability, drug sensitivity, cell proliferation, and incapacity to migrate (Shalem et al., 2014). The CRISPRi system reversibly knocks down target genes without disrupting the genomic sequence and is primarily used in loss-of-function screens (Wang et al., 2022b). CRISPRa activates gene expression by targeting the promoter region of the corresponding locus, and positive selection is performed for analysis (Ding et al., 2022).
The current application of CRISPR/Cas9 library screening for drug target discovery generally involves the following steps (Fig. 1C): 1) Selection of CRISPR libraries and design and construction of a gRNA library: a CRISPR library is chosen that targets the full range of genes involved in a disease or related biological process, and gRNAs are designed and assembled into a library after evaluating their efficiency, specificity, and coverage. 2) Delivery of CRISPR libraries, selection, and drug treatment: an appropriate delivery method for the cell type, such as lentiviral transduction or electroporation, is selected to introduce the gRNAs into the target cells. Each cell should receive only one gRNA to avoid potential intracellular interference or competition. The cells are then treated with the drug of choice. Selective pressure or specific screening conditions render the perturbed cells sensitive or tolerant to drug treatment. Potential drug target genes are identified by monitoring cell viability, proliferation, other phenotypic changes, and the cellular response to the drug. 3) Polymerase chain reaction (PCR) library creation, single-cell sequencing, and analysis: PCR libraries are constructed, and genomic DNA is extracted from each single cell via single-cell lysis. Single-cell sequencing, such as single-cell whole genome sequencing or single-cell RNA sequencing, is performed to obtain each cell's gRNA sequences and genomic or transcriptomic information. The data are analyzed to determine gRNA-cell associations and to assess the functional impact of the perturbed genes on the screening phenotype. Downstream bioinformatics analyses such as cluster analysis, differential gene expression analysis, or pathway analysis are then performed to identify potential drug targets or pathways involved in the screening phenotypes (Bailey et al., 2009; Kurata et al., 2018; Chan et al., 2022).
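The analysis step can be illustrated with a simplified calculation. The sketch below assumes a pooled readout in which a read count per gRNA is already available for a control and a drug-treated population, which is a simplification of the single-cell workflow described above. The gRNA names, gene names, counts, and pseudocount are hypothetical, and the median is only one of several possible gene-level summaries.

```python
import math
from statistics import median


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Per-gRNA log2 fold change after scaling each sample to reads per million."""
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    lfc = {}
    for grna in control:
        c = control[grna] / control_total * 1e6 + pseudocount
        t = treated.get(grna, 0) / treated_total * 1e6 + pseudocount
        lfc[grna] = math.log2(t / c)
    return lfc


def gene_scores(lfc, grna_to_gene):
    """Median log2 fold change of all gRNAs targeting the same gene."""
    per_gene = {}
    for grna, value in lfc.items():
        per_gene.setdefault(grna_to_gene[grna], []).append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical read counts for four gRNAs targeting two genes.
control_counts = {"g1": 500, "g2": 500, "g3": 500, "g4": 500}
treated_counts = {"g1": 50, "g2": 60, "g3": 900, "g4": 990}
grna_to_gene = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}

screen_scores = gene_scores(log2_fold_changes(control_counts, treated_counts), grna_to_gene)
most_depleted_first = sorted(screen_scores, key=screen_scores.get)
```

A negative score means that cells carrying gRNAs against that gene were depleted under drug treatment, so perturbing the gene appears to sensitize cells; a positive score suggests tolerance. Real screens use replicate-aware statistics rather than a bare median.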
CRISPR/Cas9 library screening accomplishes the identification and validation of potential drug targets by directly interfering with the expression of a normal gene to assess the effect of that gene on disease-related phenotypes or drug responses. This approach contributes to understanding the molecular basis of disease and provides opportunities for developing new therapies (Chan et al., 2022).
However, the practical application of these methods has fallen short of expectations. Targets can be identified through technological analysis, but it is difficult to develop drugs tailored to the way these targets are structurally and functionally regulated. For example, the Ras gene is mutated in up to a quarter of all cancers. However, to date, no highly effective inhibitors have been identified that directly target Ras signaling (Lu et al., 2023). In addition, although targeted therapies are far less toxic than conventional drugs because cancer cells are more dependent on the target than are normal cells, they still have significant side effects. Among the most common side effects are diarrhea and liver problems such as hepatitis and elevated liver enzymes (Dy et al., 2023). Not all side effects are bad signs, and the occurrence of side effects with certain targeted therapies portends a better prognosis. Moreover, similar to conventional drug therapy, targeted therapies are prone to resistance (Gao et al., 2023; Jin et al., 2023). Therefore, finding combination strategies that give targeted therapies longer-lasting efficacy has become a priority in the field of biomedical research.
2. Posttranscriptional Regulation-Based Discovery of Drug Targets
NMD is an important pathway for posttranscriptional gene regulation that recognizes and degrades mRNA molecules containing premature termination codons (PTCs), prevents the production and accumulation of harmful proteins in the cell, and plays a crucial role in maintaining quality control of gene expression (Maquat, 2004; Karousis and Mühlemann, 2019; Supek et al., 2021). Although NMD is primarily involved in mRNA quality control, studies have shown that it can also be used as a strategy for disease therapeutic target discovery (Pastor et al., 2010; Tan et al., 2022).
There are currently two main methods for measuring intracellular NMD activity (Fig. 1D). The first is exogenous: a pair of plasmid reporter genes, one containing a PTC and the other lacking it, is delivered to cultured cells, and transcript analysis of the reporter genes is used to determine intracellular NMD activity. If the transcript level of the PTC-containing reporter is reduced compared with that of the PTC-lacking reporter, then the cells have strong NMD activity; if the two transcript levels are similar, then the cells have poor NMD activity (Zheng et al., 2013; Ghiasi et al., 2024). The second method is endogenous and measures the individual abundance of NMD-sensitive isoforms and their non-NMD counterparts by quantitative real-time PCR. NMD isoforms are first examined for differential expression between treatment and control conditions. If NMD isoforms increase, the increase could result from NMD repression, transcriptional activation, or alternative splicing in favor of NMD isoforms. These three conditions can be distinguished by examining the expression of the non-NMD counterparts: NMD repression, transcriptional activation, and alternative splicing regulation result in no change, upregulation, and downregulation of the NMD-insensitive isoforms, respectively. Conversely, if NMD isoforms decrease while their non-NMD counterparts do not change, enhanced cellular NMD activity is the likely cause (Li et al., 2017).
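The interpretation rules for both assays can be written down as a short decision procedure. This is a minimal sketch of the logic described above, not code from the cited studies. It assumes that isoform changes are expressed as log2 fold changes (treatment versus control) from quantitative PCR; the tolerance of 0.2 for "no change" and the function names are hypothetical.

```python
def reporter_ratio(ptc_plus_level, ptc_minus_level):
    """Reporter assay readout: PTC-containing over PTC-lacking transcript level.

    A ratio well below 1 indicates strong NMD activity; a ratio near 1
    indicates poor NMD activity.
    """
    if ptc_minus_level <= 0:
        raise ValueError("PTC-lacking reporter level must be positive")
    return ptc_plus_level / ptc_minus_level


def interpret_isoform_changes(nmd_change, non_nmd_change, tolerance=0.2):
    """Explain a change in an NMD-sensitive isoform using its non-NMD counterpart."""
    def direction(value):
        if value > tolerance:
            return "up"
        if value < -tolerance:
            return "down"
        return "unchanged"

    nmd, non_nmd = direction(nmd_change), direction(non_nmd_change)
    if nmd == "up":
        return {
            "unchanged": "NMD repression",
            "up": "transcriptional activation",
            "down": "alternative splicing favoring NMD isoform",
        }[non_nmd]
    if nmd == "down" and non_nmd == "unchanged":
        return "enhanced NMD activity"
    return "undetermined"


# Hypothetical measurements.
ratio = reporter_ratio(20.0, 100.0)
calls = [
    interpret_isoform_changes(1.0, 0.0),
    interpret_isoform_changes(1.0, 1.1),
    interpret_isoform_changes(1.0, -0.9),
    interpret_isoform_changes(-1.0, 0.05),
]
```

Combinations that the text does not cover, such as an unchanged NMD isoform, are reported as "undetermined" rather than forced into one of the named categories.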
NMD selectively degrades PTC-containing transcripts through cellular quality control mechanisms, providing a new strategy for drug target discovery. By inducing transcript-specific NMD and monitoring the resulting phenotypic changes, researchers can determine the efficacy of a drug to be tested or identify genes that are altered in disease due to mRNA degradation (Li et al., 2019; Nogueira et al., 2021). This approach provides a potential avenue for the discovery and development of new drug targets.
Although NMD plays an important role in maintaining cellular homeostasis, there are limitations to using NMD-based approaches for disease drug target discovery. First, while inhibition of NMD may stabilize mRNAs with premature termination codons, it may also lead to the translation of truncated proteins, which may have deleterious effects (Campbell et al., 2023; Kolakada et al., 2023; Li et al., 2023). Second, NMD sensitivity may vary between cell types and tissues. Targets identified in one cell type may not be effective in another, making it difficult to find broad-spectrum therapeutic targets. For example, NMD control of variants is prominent in human embryonic kidney 293 cells but not in human vascular smooth muscle cells (Dedman et al., 2011). Notably, NMD inhibition may produce off-target effects, leading to the production of nontarget transcripts with adverse consequences (Frischmeyer and Dietz, 1999). Therefore, finding drugs that specifically modulate NMD without affecting normal cells is currently a major challenge. In addition, NMD interacts with multiple cellular pathways, including mRNA surveillance, translation, and RNA processing (Embree et al., 2022). The complex interactions with these pathways make it challenging to dissect the specific contribution of NMD to the disease process. Furthermore, despite the progress in NMD-related research in recent years, the number of NMD-related drug targets in actual clinical applications has not increased significantly. These observations indicate that there are still challenges in translating NMD-related basic experimental research results into clinical applications.
Although NMD is a potentially significant mechanism in the drug target discovery process, using it for drug target discovery still faces various challenges. Addressing these limitations is critical to successfully exploiting NMD-based target discovery mechanisms.
3. Multiomics Approach for Drug Target Discovery
Multiomics research explores the interactions between genes, proteins, metabolites, and other substances in an organism. This approach has now become one of the most important strategies in the drug target discovery process. The development of many multiomics technologies [e.g., proteomics, genomics, metabolomics, transcriptomics, and phenomics (Fig. 1E)] has opened up more possibilities to search for biomarkers that are highly relevant to diseases to aid in the early diagnosis, treatment, and prognosis of diseases, as well as for discovering potential drug targets (Li et al., 2021; Zhang et al., 2021a).
This section describes the multiomics techniques involved in the drug target discovery process. The first is proteomics. Proteins are the primary target type of most drugs, with 50% of pharmaceuticals targeting proteins. Therefore, proteomics is vital in drug target discovery. As early as 1998, Müllner et al. suggested that proteomics is one of the most potent techniques for drug target discovery (Müllner et al., 1998). Proteomics involves studying three aspects of proteins, namely, their expression levels, modifications, and interactions. Analysis of protein expression patterns, posttranslational modifications, and interactions between proteins can help identify disease-specific biomarkers, reveal underlying disease mechanisms, and identify essential proteins targeted for therapeutic intervention (Meissner et al., 2022).
Genomics studies the complete set of genes and their functions in an organism. Correlation analysis of genomic data allows the identification of abnormalities in the expression of relevant genes that contribute to disease development. Genomic studies help to identify target genes associated with specific diseases and guide the development of targeted therapies. In addition, genomics correlation analysis provides insights into the genetic basis of drug response and resistance, enabling personalized medical approaches (Haley and Roudnicky, 2020).
Transcriptomics involves studying all RNA transcripts within a cell or tissue, and the analysis of transcriptome data can reveal the expression levels of relevant genes and alternative splicing events (Liu et al., 2024). In addition, transcriptome data obtained and analyzed by techniques such as RNA sequencing can identify differentially expressed genes associated with diseases, which helps to discover disease-specific pathways and potential drug targets for a more comprehensive understanding of disease mechanisms (Spaethling and Eberwine, 2013; Wu et al., 2022c).
Metabolomics focuses on the study of small molecule metabolites involved in cellular metabolism. The application of metabolomics analyses in drug target discovery has provided insight into metabolic changes associated with disease and drug response and has helped to identify disease-specific metabolic pathways and potential drug targets. In addition, monitoring changes in metabolite levels can help to assess drug efficacy and toxicity (Rabinowitz et al., 2011; Cuperlovic-Culf and Culf, 2016; Garana and Graham, 2022).
Phenomics comprehensively analyzes an organism’s phenotypic characteristics, including physiological, morphological, and behavioral features. Phenotyping can be performed using techniques such as high-throughput screening and fluorescence imaging. Studying phenotypic changes caused by genetic or environmental disturbances helps to explain the functional consequences of pharmacological interventions. In addition, joint application with other omics approaches provides a holistic view of disease mechanisms (Wang et al., 2020b). By integrating these multiomics techniques, researchers can discover molecular changes at different levels (genes, transcripts, proteins, metabolites) and their relationships with disease (Paananen and Fortino, 2020), contributing to the discovery of disease-specific biomarkers and the validation of potential drug targets for personalized therapeutic strategies.
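One simple way to picture integration across omics layers is to count how many layers independently flag the same gene. The sketch below is purely illustrative and is not a method described in the cited studies. It assumes that each layer has already been reduced to a per-gene z-score for disease versus control (for metabolomics, this presumes that metabolite changes have been mapped to genes); the gene names, scores, and threshold of 2.0 are hypothetical.

```python
def prioritize_candidates(layer_scores, threshold=2.0):
    """Rank genes by how many omics layers show a change beyond the threshold.

    layer_scores: dict layer -> dict gene -> z-score (disease vs control).
    Ties are broken by the mean absolute z-score across supporting layers,
    then alphabetically by gene name.
    Returns a list of (gene, number_of_supporting_layers).
    """
    support = {}
    for scores in layer_scores.values():
        for gene, z in scores.items():
            if abs(z) >= threshold:
                support.setdefault(gene, []).append(abs(z))
    ranked = sorted(
        support.items(),
        key=lambda item: (-len(item[1]), -sum(item[1]) / len(item[1]), item[0]),
    )
    return [(gene, len(zs)) for gene, zs in ranked]


# Hypothetical per-layer z-scores.
layers = {
    "transcriptomics": {"GENE_A": 3.1, "GENE_B": 2.4, "GENE_C": 0.3},
    "proteomics": {"GENE_A": 2.6, "GENE_B": 0.9, "GENE_C": -2.2},
    "metabolomics": {"GENE_A": -2.8, "GENE_B": 0.1},
}
candidates = prioritize_candidates(layers)
```

Genes supported by several layers rise to the top of `candidates`. Counting layers ignores the direction of change and the dependence between layers, so a ranking of this kind is a starting point for experimental verification rather than evidence in itself.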
Multiomics approaches provide valuable insights into disease mechanisms and potential drug targets, but they also face multiple challenges in the actual drug target discovery process. First, the interpretation of large-scale multiomics data requires sophisticated bioinformatics tools to extract meaningful biological information from large data sets (Jeong and Yoon, 2023). Second, the disease process changes over time. Multiomic data at a single time point may not fully capture dynamic changes at the molecular level, leaving researchers with an incomplete understanding of disease progression and potential targets (Huang et al., 2011). Furthermore, although multiomic data can reveal disease-associated molecular changes, the functional annotation of these changes is challenging (Rauthan et al., 2023). Despite these limitations, multiomics methods remain a powerful tool in the drug target discovery process. Combining multiomics-based prediction with subsequent experimental verification can significantly improve the reliability of drug target discovery.
III. Applications in the Discovery of Drug Targets for the Treatment of Human Diseases
A. Practical Application of Drug-Centered Target Discovery Methods
The design and screening of novel drugs presuppose known targets, which makes target screening a critical part of the drug development process.
DARTS was first proposed by Lomenick et al. in 2009, and the method is now widely used for drug target discovery (Lomenick et al., 2009). In mammalian systems, using DARTS, Shi et al. showed that 5-aza-2′-deoxycytidine could enhance antitumor immunity in colorectal peritoneal metastases by targeting ABC A9-mediated cholesterol accumulation in macrophages (Shi et al., 2022). Yu et al. found that dictamnine could target and inhibit c-Met activity and downregulate the phosphatidylinositol-3-kinase/AKT/mammalian target of rapamycin and mitogen-activated protein kinase signaling pathways to inhibit lung cancer cell proliferation (Yu et al., 2022). In addition, DARTS has been used to identify therapeutic target proteins in colorectal cancer (Derry et al., 2014), hepatocellular carcinoma (An et al., 2022), and osteosarcoma (Zhu et al., 2021). Furthermore, DARTS has also been applied to target discovery in diseases other than cancer, such as ischemia/reperfusion injury (Wang et al., 2021) and osteoarthritis (He et al., 2023).
Network-based and machine-learning approaches were among the first methods for drug target discovery. Their application provides a powerful means to identify biomarkers (Gong et al., 2019; Li et al., 2020b; Wu et al., 2022b). Network-based machine-learning approaches take advantage of the observation that genes with similar phenotypic roles tend to colocate in specific regions of protein-protein interaction networks (Rawls et al., 2020). This tendency has been used for target and biomarker discovery and characterization. For example, many potential drug targets and pathways have been identified in COVID-19-related studies based on bioinformatics, network pharmacology, and machine-learning approaches (Auwul et al., 2021; Zhang et al., 2022a). Breast cancer has become the most common tumor in women, accounting for 16.72% of all new cancer cases (Liu et al., 2023). Cattelani and Fortino improved two NSGA2 algorithms to make biomarker discovery in breast cancer more accurate (Cattelani and Fortino, 2022). A comprehensive bioinformatics study revealed that COL1A1, COL1A2, COL3A1, COL5A1, FN1, and SPARC are therapeutic targets associated with poor prognosis in gastric cancer (Ucaryilmaz Metin and Ozcan, 2022). In addition, ulcerative colitis targets were identified by a novel network biology approach through the modular triad (Voitalov et al., 2022).
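The colocation principle described above can be illustrated with a minimal guilt-by-association sketch. It is not the method of any study cited here: the interaction edges, the seed disease genes, and the helper names `build_adjacency` and `neighbor_score` are hypothetical placeholders chosen only to show how candidates close to known disease genes in a protein-protein interaction network might be prioritized.

```python
from collections import defaultdict

# Toy protein-protein interaction edges (hypothetical, for illustration only).
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
    ("D", "E"), ("E", "F"), ("F", "G"),
]
# Hypothetical genes already linked to the disease of interest.
KNOWN_DISEASE_GENES = {"A", "B"}


def build_adjacency(edges):
    """Return an undirected adjacency map: node -> set of neighbors."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def neighbor_score(adjacency, seeds):
    """Score each non-seed node by the fraction of its neighbors that are seeds."""
    scores = {}
    for node, neighbors in adjacency.items():
        if node in seeds:
            continue
        scores[node] = len(neighbors & seeds) / len(neighbors)
    return scores


adjacency = build_adjacency(EDGES)
scores = neighbor_score(adjacency, KNOWN_DISEASE_GENES)
# Rank candidates: highest score first, ties broken alphabetically.
ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
for gene, score in ranked:
    print(f"{gene}\t{score:.2f}")
```

In this toy network, node C ranks first because two of its three interaction partners are seed genes. Published network approaches typically replace this direct-neighbor score with diffusion or module-based measures on curated interactomes.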
B. Practical Application of Disease-Centered Target Discovery Methods
Gene editing technology is one of the most essential tools for drug target discovery, and its most important application is CRISPR/Cas9 library screening. Three scenarios can be distinguished according to the type of library chosen. 1) CRISPR knockout (KO) libraries have been used in drug target discovery studies in many cancers, such as colorectal cancer (Ringel et al., 2020), pancreatic ductal adenocarcinoma (Steinhart et al., 2017; Ubhi et al., 2024), esophageal squamous cell carcinoma (Xu et al., 2023), hepatocellular carcinoma (Bao et al., 2021), B-cell acute lymphoblastic leukemia (Han et al., 2017; Ramos et al., 2023), and breast cancer (Guarducci et al., 2024). 2) CRISPR interference (CRISPRi) libraries are also used extensively for drug target discovery. An efficient C-G to G-C base editor was developed using CRISPRi screening, improving target accuracy (Koblan et al., 2021). One study identified NSD2 as a target for treating lung adenocarcinoma through CRISPR interference in mouse models (Sengupta et al., 2021). Through a genome-wide CRISPRi screen, Vest et al. identified potential targets, such as lysosomes, for treating neurodegenerative diseases (Vest et al., 2022). In addition, Wang et al. identified potential therapeutic targets in adenocarcinoma through a genome-wide CRISPRi screen (Wang et al., 2022b). 3) CRISPR activation (CRISPRa) activates gene expression by targeting the promoter region of the corresponding locus. Only a single-guide RNA library designed against the targets is needed, and gene transcription is activated endogenously at the proximal promoter (Chan et al., 2022). Using a combined CRISPRi/a-based chemical genetic screen, Jost et al. showed that rigosertib, an investigational drug for high-risk myelodysplastic syndrome, is a microtubule destabilizer (Jost et al., 2017). In addition to the use of a single library, combinations of different libraries are also commonly used for drug target identification. For example, a combinatorial screen of CRISPRi and CRISPRa revealed a strong correlation between p97 and CB-5083, confirming that p97 is a target of CB-5083 (Anderson et al., 2015).
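As a rough illustration of how a pooled CRISPR library screen is read out, the following sketch computes a per-gene log2 fold change from single-guide RNA read counts. It is a simplified assumption-laden example, not the pipeline of any cited study: the counts, gene names, pseudocount, and the function `gene_log2_fold_changes` are hypothetical, and dedicated screen-analysis tools apply more careful normalization and statistics.

```python
import math
from statistics import median

# Hypothetical read counts per sgRNA: (gene, control_count, treated_count).
SGRNA_COUNTS = [
    ("GENE1", 500, 60), ("GENE1", 400, 50), ("GENE1", 450, 45),
    ("GENE2", 300, 310), ("GENE2", 350, 330), ("GENE2", 320, 300),
    ("GENE3", 200, 420), ("GENE3", 250, 480), ("GENE3", 220, 500),
]
PSEUDOCOUNT = 1.0  # assumed value; avoids division by zero for dropped-out guides


def gene_log2_fold_changes(rows, pseudocount=PSEUDOCOUNT):
    """Return gene -> median log2(treated/control) across its sgRNAs.

    Counts are first scaled to reads per million so that the two samples
    are comparable despite different sequencing depths.
    """
    control_total = sum(control for _, control, _ in rows)
    treated_total = sum(treated for _, _, treated in rows)
    per_gene = {}
    for gene, control, treated in rows:
        control_norm = control / control_total * 1e6
        treated_norm = treated / treated_total * 1e6
        lfc = math.log2((treated_norm + pseudocount) / (control_norm + pseudocount))
        per_gene.setdefault(gene, []).append(lfc)
    return {gene: median(values) for gene, values in per_gene.items()}


lfc_by_gene = gene_log2_fold_changes(SGRNA_COUNTS)
# Most depleted genes first: candidates whose loss sensitizes cells to treatment.
for gene, lfc in sorted(lfc_by_gene.items(), key=lambda item: item[1]):
    print(f"{gene}\t{lfc:+.2f}")
```

Taking the median across guides limits the influence of a single inefficient or off-target sgRNA. In such a toy library the normalization is dominated by the hits themselves, which real genome-wide libraries avoid through their size and control guides.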
NMD is an mRNA quality surveillance pathway present in all eukaryotes and was first discovered in human cells and yeast in 1979 (Karousis and Mühlemann, 2019). It limits the translation of aberrant proteins, but it can also have deleterious effects in the context of specific genetic mutations. NMD is known mainly for its role in mRNA quality control, but a large number of recent studies have found that it can also serve as a strategy for drug target discovery and screening (Tan et al., 2022; Nagar et al., 2023). The most typical case is a 2007 study showing that PTC124 (Ataluren) can target genetic diseases caused by nonsense mutations (Welch et al., 2007); subsequently, Ataluren was developed and approved for the treatment of Duchenne muscular dystrophy (Namgoong and Bertoni, 2016).
Furthermore, PTC124 has been found to inhibit head and neck squamous cell carcinoma cell proliferation by affecting nonsense mutations in two tumor suppressor genes, NOTCH1 and FAT1 (Wu et al., 2022a). Mutant UPF1 was found in pancreatic ductal adenocarcinoma, resulting in overactivated NMD that is closely associated with elevated asparagine synthetase expression levels. Targeted knockdown of asparagine synthetase may therefore improve antitumor efficacy in pancreatic ductal adenocarcinoma (Hu et al., 2022). In addition, Shi et al. found that PIWIL1 interacts with the UPF1, UPF2, and SMG1 complexes and that PIWIL1 acts in a piRNA-independent manner through the NMD mechanism to promote gastric carcinogenesis (Shi et al., 2020). One study found potential targets by targeting the key NMD factor UPF1 to bind the fragile X syndrome protein FMRP (Kurosaki et al., 2021). The NMD regulator SMG1 has been shown to be a candidate therapeutic target for multiple myeloma (Leeksma et al., 2023). Moreover, relevant studies have shown that the NMD factor UPF2 is a key target for the treatment of early embryonic lethality (Chousal et al., 2022). Research on neurodegenerative diseases has also indicated that defects in nonsense-mediated mRNA degradation are therapeutic targets in these diseases (Zuniga et al., 2023). The mechanisms by which NMD promotes or inhibits the onset of disease are shown in Fig. 2.
Finally, we summarize the application of multiomics techniques in drug target discovery. Second-generation sequencing technologies and mass spectrometry multiomics assays have significantly advanced clinical oncology, yielding potential therapeutic targets and biomarkers that help to individualize tumor treatment and significantly improve outcomes in a wide range of common and rare solid tumors (Blay et al., 2020; Fedorov et al., 2022). Seamlessly integrating multiomics data with precision therapy remains a challenge for clinical practice (Vidova and Spacil, 2017; Ball et al., 2020; Fedorov et al., 2022; Holbrook-Smith et al., 2022; Mitchell et al., 2023). Multiomics technologies are now widely used in human disease target discovery. For example, one study used shotgun proteomic label-free quantification and parallel reaction monitoring mass spectrometry to discover changes in pancreatic cancer cell proteins and potential drug targets (Li et al., 2022b). Proteomic studies of triple-negative breast cancer identified NAE1 and AKT1/FASN as potential drug targets for the iP-1 and iP-2 subtypes (Gong et al., 2022). Raffel et al. identified target proteins in chemoresistant leukemia stem cells by combining proteomic and transcriptomic analyses (Raffel et al., 2020). Using high-throughput metabolomics, Holbrook-Smith et al. predicted drug-target relationships for 86 eukaryotic proteins, and the associated validation results demonstrated the feasibility of this approach (Holbrook-Smith et al., 2022).
In summary, there is a need for different approaches to the discovery of drug targets for the treatment of human diseases. By applying these different approaches, researchers can identify and validate potential drug targets for different diseases, leading to the development of novel therapies and improved treatment options.
Table 1 summarizes the applications of DARTS, network-based and machine learning methods, CRISPR library screening, NMD, and multiomics in the human drug target discovery process.
C. Multimethod Combinations in the Search for Therapeutic Drug Targets for Human Diseases
Drug target discovery is a critical step in the drug discovery process. In actual cases of target discovery, the combined application of multiple methods is more common than the use of a single method, and it has clear advantages. First, through the combined use of multiple methods, researchers can gain a more comprehensive understanding of the disease occurrence process, influencing mechanisms, and expression patterns of related genes. This facilitates subsequent in-depth research and analysis of drug targets (Yang et al., 2023). Second, when different methods identify the same target, mutual verification between methods helps to improve the reliability of the results, which is also very important for subsequent in-depth research (Pun et al., 2023).
Currently, multimethod combinations are widely used in research related to human drug target discovery. One example is the use of multiomics technology in combination with CRISPR/Cas9 screening. Ruan et al. redefined the regulatory network of pluripotency in embryonic stem cells based on CRISPR screening and integrated analyses of multiomics data (Ruan et al., 2023). Vujovic et al. demonstrated the universal dependence of RNA-binding proteins as leukemia stem cell regulators by integrating multiomics with in vivo CRISPR-Cas9 screening and highlighted their potential as therapeutic targets for acute myeloid leukemia (Vujovic et al., 2023). In addition, regarding the combined application of multiomics technologies and network and machine-learning approaches, Fang et al. identified candidate targets for drug repurposing in Alzheimer's disease using a network-based artificial intelligence framework that combines multiomics data with the human protein-protein interactome network (Fang et al., 2022). Voitalov et al. presented a multiomics network biology approach for prioritizing protein targets for ulcerative colitis treatment (Voitalov et al., 2022). Allesøe et al. developed multiomics variational autoencoders based on deep learning for the discovery of pharmacogenomic associations in type 2 diabetes (Allesøe et al., 2023). Moreover, multiomics techniques have been combined with DARTS. For example, Hwang et al. used analysis of proteomics data dependency to identify target proteins via DARTS. When combined with liquid chromatography/tandem mass spectrometry, DARTS can identify proteins that bind to drug molecules, leading to conformational changes in the target protein (Hwang et al., 2020b). These examples show that a combination of methods is indispensable. Such synergistic approaches bring together different scientific techniques to address the inherent complexity of biological systems and improve our understanding of disease mechanisms.
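The idea of mutual verification between methods can be sketched as a simple consensus step: a candidate is retained only if several independent approaches nominate it. The candidate lists, the target labels T1 to T6, the threshold, and the function `consensus_targets` below are hypothetical and serve only to illustrate the principle.

```python
from collections import Counter

# Hypothetical candidate target lists; gene names are placeholders.
CANDIDATES_BY_METHOD = {
    "multiomics": {"T1", "T2", "T3", "T4"},
    "crispr_screen": {"T2", "T3", "T5"},
    "darts": {"T3", "T6"},
}


def consensus_targets(candidates_by_method, min_methods=2):
    """Return (target, n_supporting_methods) pairs supported by >= min_methods.

    Results are sorted by decreasing support, then alphabetically.
    """
    support = Counter()
    for candidates in candidates_by_method.values():
        support.update(candidates)  # each method contributes one vote per target
    kept = [(target, n) for target, n in support.items() if n >= min_methods]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


consensus = consensus_targets(CANDIDATES_BY_METHOD)
for target, n_methods in consensus:
    print(f"{target}\tsupported by {n_methods} methods")
```

Here T3 is nominated by all three approaches and T2 by two, so both pass the assumed threshold of two methods. Counting votes treats every method as equally reliable, which is a simplification; a weighted scheme may be more appropriate when methods differ in false-positive rates.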
The combined use of multiple methods has obvious advantages in the drug target discovery process, but it also faces some challenges. The first concerns data processing. Using multiple methods means analyzing multiple sets of data, and cross-validation of the results requires each data set to be standardized and unified, a process that can itself introduce errors. Other issues concern experimental design and sample selection. Different methods may have different requirements for sample type, quantity, and condition, so standardizing the experimental design and sample selection for each method is crucial to reducing errors and ensuring the reliability of results (Ji et al., 2023; Mousavian et al., 2023).
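One common way to place data sets from different methods on a comparable scale before cross-validation is z-score standardization. The sketch below assumes this choice; the data set names, values, and the helper `zscore` are hypothetical, and other normalization schemes may suit particular data types better.

```python
from statistics import mean, pstdev

# Hypothetical measurements for the same targets on very different scales.
DATASETS = {
    "proteomics_intensity": {"T1": 1200.0, "T2": 3400.0, "T3": 800.0},
    "crispr_log2fc": {"T1": -0.2, "T2": -2.5, "T3": 0.1},
}


def zscore(values_by_key):
    """Rescale values to zero mean and unit (population) standard deviation."""
    values = list(values_by_key.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no variation to standardize.
        return {key: 0.0 for key in values_by_key}
    return {key: (value - mu) / sigma for key, value in values_by_key.items()}


standardized = {name: zscore(values) for name, values in DATASETS.items()}
for name, values in standardized.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

After standardization, each data set has mean 0 and standard deviation 1, so a target that is extreme in several data sets can be recognized regardless of the original units.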
IV. Conclusion and Discussion
Target discovery for therapeutic drugs is the key to new drug development and a prerequisite for precision medicine. By advancing target discovery techniques, researchers can accelerate drug discovery, improve therapeutic outcomes, and move closer to the goal of precision medicine (Koivisto et al., 2022).
At present, most experimental research related to target discovery uses a combination of multiple methods to discover and confirm targets. Joint analysis of multiple methods and multiple data types makes a more comprehensive understanding of disease mechanisms possible (Rodrigues and Bernardes, 2020; Li et al., 2022c; Wang et al., 2022a). For example, the continuous advancement of related technologies such as high-throughput sequencing, omics technology, and CRISPR-based library screening has greatly changed the form of drug target discovery (Liu et al., 2020; Kumar et al., 2024). In addition, emerging drug target discovery strategies such as DARTS and NMD have injected new impetus into drug target discovery.
In addition to the combinatorial application of methods, drug target discovery also relies on large databases. These databases typically contain information on compounds, biological targets, structures, and toxicology (Ala et al., 2024). For example, large public databases such as PubChem, ChEMBL, and PDB have provided researchers in pharmaceutical-related fields with a wealth of valuable information on drug development (Konc and Janežič, 2022; Ala et al., 2024). Moreover, the interoperability of bioinformatics and computational methods has become indispensable in large-scale data analysis, compound design, target identification, and toxicity prediction. Their shared use will further accelerate pharmaceutical discovery (Avilés-Alía et al., 2024). Furthermore, drug repurposing (“new use of old drugs”) based on computer-assisted and experimental methods can reduce costs, save time, and increase the success rate of R&D, and it will be an important development direction in the future (Pushpakom et al., 2019; Cartas-Cejudo et al., 2024).
It is also important to note that understanding the context of relevant disease pathways is necessary to study therapeutic targets for the disease. Disease pathways are often composed of multiple key regulatory elements. The study of disease regulatory pathways enables researchers to comprehensively understand the mechanisms of diseases, identify interrelated molecular events, feedback loops, and crosstalk between pathways, providing a clear framework for understanding how different molecules (e.g., genes, proteins, and metabolites) interact to regulate biological disease processes (Ogishima et al., 2013). By studying targets in these pathways, researchers can identify major regulatory factors that play a key role in the disease process, providing more possibilities for subsequent disease treatment (Zheng et al., 2021). In addition, by studying disease pathways, researchers can identify upstream regulators that control target activity, providing insights into how the target is regulated and how interventions in this pathway affect disease development (Torrence et al., 2021). Furthermore, disease-modulating pathways often include potential biomarkers relevant to disease diagnosis, prognosis, and treatment response. Simultaneous investigation of targets and pathways enhances the identification and validation of relevant biomarkers and lays the foundation for the development of targeted and personalized therapeutic interventions (Chang et al., 2020).
Currently, despite significant progress in the application of multiple methods in drug target discovery, the overall productivity of the pharmaceutical industry has not improved significantly (Ringel et al., 2020; Scannell and Bosley, 2016). This problem can be attributed to various challenges in the drug development process. First, although technologies such as modern computational methods, bioinformatics, and big data analysis can identify candidate targets for diseases from large amounts of data, biological systems are extremely complex, and discovering targets often requires an in-depth understanding of their roles and interactions in those systems. Therefore, in many cases, verification of target effectiveness still requires traditional experiments. For example, techniques such as immunohistochemistry, real-time PCR, and Northern blotting can be used to detect gene expression levels in different tissues or cell types (Wang et al., 2020a; Chen et al., 2023). Techniques such as Western blotting can be used to analyze intracellular protein expression. SDS-PAGE can be used for the separation and preliminary identification of proteins (Lyu et al., 2020; Zhao et al., 2021, 2023a; Chen et al., 2023). Column chromatography and gel filtration can be used to purify specific proteins from cells or tissues (Duong-Ly and Gabelli, 2014; Nguyen et al., 2023). Techniques such as yeast two-hybrid assays and immunoprecipitation are used to study interactions between proteins and to reveal the protein networks involved in specific biological processes (Zhao et al., 2021, 2023a). In addition, once protein interactions have been verified, bioinformatics tools are needed to perform functional enrichment analyses of genes or proteins and to identify the biological processes and pathways in which they are involved (Lyu et al., 2020; Chen et al., 2023; Zhang et al., 2023a).
Furthermore, the biological functions of genes or proteins need to be validated using cell lines and animal models, for example, through gene knockout, overexpression, or mutation to study their effects on physiological and pathological processes (Chang et al., 2023; Xie et al., 2024). However, even when therapeutic targets for a disease are identified through conventional experiments, the complexity of the disease, coupled with a limited understanding of the underlying molecular mechanisms, often leads to instability of the newly discovered targets (Shoshan and Linder, 2008). Second, these challenges are exacerbated by rising development costs and high attrition rates during clinical trials.
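The functional enrichment analysis mentioned above is often based on a one-sided hypergeometric (Fisher-type) test, which asks whether a pathway is overrepresented among the candidate genes. The following is a minimal sketch under that assumption; the function name `hypergeom_enrichment_p` and the numbers in the example are hypothetical, and real analyses additionally correct for testing many pathways.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, sample_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population   : number of annotated background genes
    pathway_size : background genes belonging to the pathway
    sample_size  : number of candidate genes drawn (without replacement)
    overlap      : candidate genes that fall in the pathway
    """
    max_overlap = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 50 candidate genes, 5 of them in a 100-gene pathway,
# against a background of 20,000 annotated genes.
p_value = hypergeom_enrichment_p(20000, 100, 50, 5)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value indicates that the overlap is larger than expected by chance given the pathway and candidate list sizes. The exact integer arithmetic of `math.comb` keeps the sketch free of external dependencies at the cost of speed for very large gene sets.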
In the future, to truly address these multifaceted challenges, an integrated approach must be adopted that fully combines traditional experiments, high-throughput techniques, computational modeling, and artificial intelligence (Rifaioglu et al., 2019; Nayarisseri, 2020; Giri and Ianevski, 2022; You et al., 2022) to improve the accuracy and efficiency of target discovery and verification. Furthermore, as technology evolves, drug target discovery methods are expected to become more sophisticated, efficient, and impactful, transforming drug development, driving innovation, and paving the way for developing more effective targeted therapies for human diseases.
Data Availability
Not applicable.
Authorship Contributions
Participated in research design: Das, Chen, J. Wu.
Contributed new reagents or analytic tools: Jia, Yang, Chen, J. Wu.
Performed data analysis: Jia, Yang, Das.
Wrote or contributed to the writing of the manuscript: Jia, Yang, Y.-K. Wu, Li.
Footnotes
- Received August 21, 2023.
- Revision received May 28, 2024.
- Accepted May 31, 2024.
This work was supported by the National Key Research and Development Program of China (2022YFD1700200), the Science and Technology Plan Project of Guizhou Province (Qiankehezhicheng [2024] the general 083), the Guizhou Provincial Basic Research Program (Natural Science)-ZK[2023]-099, the National Natural Science Foundation of China (3201452), the Program of Introducing Talent to Chinese Universities (111 Program, D20023), the Frontiers Science Center for Asymmetric Synthesis and Medicinal Molecules, Department of Education, Guizhou Province (Qianjiaohe KY (2020)004), the Guizhou Science and Technology Cooperation Foundation (ZK(2021)140), and the Central Government Guides Local Science and Technology Development Fund Projects (Qiankehezhongyindi (2023) 001).
All authors declare no conflict of interest.
1 Z.-C.J. and X.Y. contributed equally to this work.
This article has supplemental material available at pharmrev.aspetjournals.org.
Abbreviations
- DARTS: drug affinity responsive target stability
- DTI: drug-target interaction
- gRNA: guide RNA
- NMD: nonsense-mediated mRNA degradation
- PCR: polymerase chain reaction
- PTC: premature termination codon