Introduction to Drug Discovery

Drug discovery is a form of research, but it should not be confused with academic research. Most academic research investigates a subject for the sake of understanding, without a fixed end goal. Drug discovery, by contrast, has a focused target: to discover a new compound that could become a drug sold on the market. It is called research because it is risky and involves considerable cost and time. Drug discovery is also very different from drug development. Whereas drug discovery involves finding a molecule with therapeutic properties, drug development involves building the knowledge, information and processes around that compound so that it can be marketed as a pharmaceutical product.

However focused it may seem, the time and cost involved in discovering a compound with potential therapeutic properties are large.

This is illustrated by the fact that, out of roughly every hundred drug discovery projects, only 2-3 compounds enter the pharmaceutical market as drugs.
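To make the attrition figure concrete, the short Python sketch below turns "2-3 out of 100" into a success rate and into the number of projects needed per marketed drug. The function name `projects_per_marketed_drug` is hypothetical and chosen only for this illustration; the inputs are the rough figures quoted above, not measured data.

```python
def projects_per_marketed_drug(projects: int, marketed: int) -> float:
    """Average number of discovery projects behind each marketed drug."""
    if marketed <= 0:
        raise ValueError("marketed must be positive")
    return projects / marketed


PROJECTS = 100  # "every hundred drug discovery projects" from the text

for marketed in (2, 3):  # the quoted range of 2-3 marketed compounds
    rate = marketed / PROJECTS
    needed = projects_per_marketed_drug(PROJECTS, marketed)
    print(f"{marketed} of {PROJECTS}: success rate {rate:.0%}, "
          f"about {needed:.0f} projects per marketed drug")
```

Under these figures, somewhere between 33 and 50 projects are run for every compound that reaches the market.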

Drug discovery consists of the following phases:

  • Target Identification
  • Target Validation
  • Lead Generation
  • Lead Optimization
  • Candidate Selection

Target Validation is a smaller bottleneck in the discovery process, because targets for complex diseases such as cancer are hard to validate. Lead Optimization poses greater problems because it is a labor-intensive task.

Overall, computers have played a vital role in target identification and lead generation, but they are yet to be fully utilized in the rate-limiting phases (Validation and Optimization).

Let us look at each of these phases in detail.

Phases of Drug Discovery


Target Identification

The process of target identification has its roots in research conducted in academia. Every few months there are publications on newly identified proteins, receptors and genes and their relationship to diseases in humans. In addition, the mapping of the human genome by the Human Genome Project (HGP) has been completed, and as a result the number of potential targets, proteins and receptors that could be considered to influence human disease continues to rise.

Target identification involves picking a target based on studies or findings that link it to the disease, and developing a hypothesis of how this target influences the disease in question. The target could be a protein, a receptor or even a gene.

For example, the gastrointestinal tract produces hormones called incretins when we eat food. These hormones cause the body to release insulin. When it was first made, this observation created a pharmaceutical target for diabetes. The hypothesis linking the target to the disease was that adding an incretin hormone such as glucagon-like peptide 1 (GLP-1), or a related compound, might cause more insulin to be released and thus promote better control of glucose in diabetes.


Target Validation

Once the target is identified and a hypothesis generated, the target needs to be validated.

Target validation means gathering additional evidence that, if a molecule is found that affects the target (either blocking or triggering it), the outcome will be beneficial to the patient. It also means justifying that this molecule-target activity would be preferred over other methods or approaches, if any. This is often done with a disease model. These models are generally animals such as mice in which genes have been either deleted or added to reproduce the disease being studied.

However, some diseases, such as hypertension and diabetes, can be reproduced easily, which allows disease models to be developed easily, while others, such as cancer, cannot.

Target validation is therefore simpler for well-understood diseases that can be modeled, and difficult for rare diseases about which little is known and which cannot be reproduced for testing. In the latter case, the time and money lost by continuing research without validating the target is enormous and may result in additional spending of roughly $50 million for the study. This makes target validation a bottleneck in the drug discovery process.


Lead Generation

The term lead generation refers to the creation of molecules through chemistry. These molecules could be candidates for prospective drugs. The process involves assay development, screening and early medicinal chemistry, which gives a compound its first test of efficacy.

Lead generation has its roots in what is called the Structure-Activity Relationship (SAR). In simple terms, a chemist synthesizes a new molecule with a structure designed from knowledge of the target and gives it to a biologist to test. The biologist tests the compound on a disease model, observes its effect and gives feedback to the chemist, who may then tweak the molecule's structure until the biologist is satisfied with the outcome. This leads to the compound that is the best candidate for the disease against which it is tested and that could become a prospective drug.

This process has been accelerated by using assays that mimic the disease process and screening thousands of compounds against these assays, using a method called High Throughput Screening (HTS), to find the right-fit molecule. Some methods of Ultra High Throughput Screening aim to perform a SAR all at once by screening millions of molecules against more than one assay. However, screening a single compound against a single assay still gives the best results.

As a result, once a good assay is developed, scientists can screen libraries of a million compounds against it to find the leads required.
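The screening idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `screen_library`, `toy_assay`, the compound IDs, the single numeric descriptor per compound and the 80% hit threshold are all hypothetical, and a real HTS campaign involves plate layouts, controls and statistics that are not modeled here.

```python
from typing import Callable, Dict, List


def screen_library(
    library: Dict[str, float],
    assay: Callable[[float], float],
    hit_threshold: float,
) -> List[str]:
    """Run every compound through one assay and return the IDs of the hits."""
    hits = []
    for compound_id, descriptor in library.items():
        readout = assay(descriptor)
        if readout >= hit_threshold:
            hits.append(compound_id)
    return hits


def toy_assay(descriptor: float) -> float:
    """Hypothetical stand-in for an assay: percent inhibition that peaks
    when the descriptor equals 5.0 and falls off linearly around it."""
    return max(0.0, 100.0 - 20.0 * abs(descriptor - 5.0))


# Hypothetical four-compound library: ID -> a single made-up descriptor.
library = {"cmpd-001": 1.0, "cmpd-002": 4.5, "cmpd-003": 5.2, "cmpd-004": 9.0}

hits = screen_library(library, toy_assay, hit_threshold=80.0)
print(hits)
```

The same loop applies whether the library holds four compounds or a million; only the assay and the threshold decide which compounds move forward as leads.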

The area of lead generation is growing continuously. Because the compounds available in libraries are not generating enough high-value leads, scientists are adopting a method called X-ray crystallography. In this method the target protein is imaged to analyze its binding capabilities, so that compounds with the right structure, and a library of more specific leads, can be created strategically. For example, the first HIV protease inhibitors relied on crystallography studies of the protease structure, which allowed medicinal chemists to create a compound with the right structure to be effective against the protease.

In theory molecular diversity is said to be 10^30, and even if we screen a million compounds we still cover an extremely small portion of this diversity. Lead generation is thus a science that has hardly been explored, and the more we explore it, the more new things we are going to learn.
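To see how small that portion is, the following Python sketch does the arithmetic, reading the diversity figure above as 10^30 and taking a library of one million compounds. The screening rate of one million compounds per day is an assumption added purely for illustration.

```python
from fractions import Fraction

CHEMICAL_SPACE = 10**30  # theoretical molecular diversity quoted in the text
LIBRARY_SIZE = 10**6     # one million screened compounds

# Exact fraction of the theoretical space covered by one library.
coverage = Fraction(LIBRARY_SIZE, CHEMICAL_SPACE)
print(f"Fraction of chemical space screened: 1 in {coverage.denominator:.0e}")

# Assumption for illustration only: one full library screened per day.
days_to_cover_everything = CHEMICAL_SPACE // LIBRARY_SIZE
print(f"Days needed at one million compounds per day: {days_to_cover_everything:.0e}")
```

A single million-compound screen therefore samples one part in 10^24 of the theoretical space.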


Lead Optimization

Once a lead has been generated, it has to be optimized so that it is more effective, tolerable and soluble. In other words, a compound that is effective against a disease model may still not be a good compound for use in humans.

For example, if the compound has low solubility, it may not dissolve in the stomach or intestine. On the other hand, it may dissolve and be absorbed too quickly, so that it has to be taken at very short intervals. Either characteristic would make the compound unacceptable as a treatment in humans.

It is therefore necessary to test the compound in animal systems that are similar to humans to check its characteristics. Sometimes the structure of the compound also needs to be tweaked to give it the characteristics it lacks while retaining its efficacy against the disease in question.

This is a labor-intensive task, which makes Lead Optimization a relative bottleneck in the research and development process.


Candidate Selection

After the leads that passed screening have been optimized, some compounds are found worthy of further development. One of these compounds is selected as the candidate, while another 1-2 compounds are kept as backups in case the lead compound fails, which in many cases is a reasonable possibility.
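The selection rule described above (one candidate plus one or two backups) can be sketched as a simple ranking. This is a hedged illustration: the `Compound` class, the single overall `score` and the compound names are hypothetical, and real candidate selection weighs many criteria that a single number cannot capture.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Compound:
    name: str
    score: float  # hypothetical overall score after optimization; higher is better


def select_candidates(
    compounds: List[Compound], n_backups: int = 2
) -> Tuple[Compound, List[Compound]]:
    """Return the top-ranked compound and up to n_backups runners-up."""
    if not compounds:
        raise ValueError("no optimized compounds to choose from")
    ranked = sorted(compounds, key=lambda c: c.score, reverse=True)
    return ranked[0], ranked[1:1 + n_backups]


# Made-up optimized leads for illustration.
optimized = [
    Compound("lead-A", 0.72),
    Compound("lead-B", 0.91),
    Compound("lead-C", 0.85),
    Compound("lead-D", 0.40),
]

candidate, backups = select_candidates(optimized)
print(candidate.name, [b.name for b in backups])
```

Keeping the backups as an ordered list means that if the selected candidate fails, the next compound to advance is already known.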

In some cases leads fail one after the other. This results in a re-assessment of the biology involved and of the chemistry needed to achieve success.

For example, Eli Lilly's antifolate program for cancer led to the launch of “Alimta” in 2004, after the first two compounds that were potential leads in the program had failed. It became the only compound from that program to enter the market.


Patents

Patents may be filed at any time during the discovery process.

Patents can cover different portions of discovery, such as:

  • Biology of the disease process
  • Targets and the structure of the proteins involved
  • Disease models being used for a discovery
  • Chemical classes and specific structures found to be therapeutically active
  • Product development
  • Manufacturing

Of these, the most valuable are patents on new compounds or structures found to be therapeutically active against the disease, also called New Chemical Entities (NCEs).

It takes at least two years to receive a patent from the date of application. During this time the application is reviewed for originality and kept confidential. However, the grant of a patent does not mean that the NCE can be developed.

For each NCE, more than one patent might be required, such as a patent on its ability to act on the target protein and a patent on the biological target itself and how it affects the disease. If company “A” already holds one of these patents, company “B”, which files for the other, will be granted it but will not be able to develop the NCE until it buys out the patent owned by company “A” or pays a royalty on it. Such conflicts result in alliances within the pharmaceutical industry.

However, another company may create a different compound that acts on a different target through a different mechanism but treats the same disease. This results in competition within the pharmaceutical industry.
