Proteins perform many important functions in the body, and influencing their behaviour helps to control the spread of disease. Since a protein's function is directed by the molecules with which it binds, the ultimate goal of drug design is to produce molecules that selectively inhibit or activate particular proteins and hence modulate their function.
Identifying small molecules that modulate protein-protein interactions is a major challenge of drug discovery, to the extent that such targets have been labelled "undruggable". Protein-protein interactions involve large flat surfaces, usually burying more than 1100 Å². This differs significantly from traditional drug targets, which contain small-molecule binding cavities. However, a minor fraction of the interface residues within protein-protein interactions accounts for most of the free energy of binding. Such "hot spots" appear to be common to protein-protein interactions, and tend to be clustered together at the centre of the interface. If the hot spots consist of short continuous binding domains, then the discovery of mimetics is feasible, as exemplified by the small-molecule integrin inhibitors. However, most protein-protein interfaces consist of noncontinuous binding epitopes, and little is known about the common structure (if any) of protein hot spots and protein recognition surfaces in general.
Molecular recognition is a surface phenomenon, and this surface is dictated by both the chemistry and the topology of the exposed functional groups. Thus, protein recognition is dictated by which of the 20 naturally occurring amino acids are found on the binding surface and by how their side chains are topologically arranged. Drug design techniques are focussed on the discovery of suitable scaffolds and functional group attachments that present the required electrostatic and steric surface for binding to a target and inducing a functional response.
To focus molecular design towards the discovery of molecules that modulate protein-protein interactions, we have developed algorithms and tools to cluster protein contact surfaces and have identified common side chain positions of proteins involved in molecular recognition events. These populated shapes are found at the centre of the interface, suggesting that they may describe the geometries of protein hot spots. Consequently, using these common motifs in drug design should lead to scaffolds that mimic the shape of functional regions of protein surfaces, and appending suitable functionality to these scaffolds should yield molecules that mimic the function of proteins.
To achieve this, we have developed a Population Based Incremental Learning algorithm for efficient conformational searching of potential scaffolds. This algorithm performs strongly in a number of respects, including its ability to calculate the global minimum energy conformation of large flexible molecules and, more importantly, to determine in a short period a large number of low energy conformations of rigid "drug-like" molecules. This allows the creation of virtual libraries of molecules, which are then "filtered" through the common motifs to identify appropriate scaffolds. Finally, this process is illustrated by the presentation of scaffolds that match common protein surfaces.
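For readers unfamiliar with Population Based Incremental Learning, the following is a minimal sketch of the generic binary form of the technique applied to a toy torsional search. It is not the implementation described here: the bit encoding of torsion angles, the energy function `toy_energy`, and all parameter values are assumptions chosen purely for illustration. A real conformational search would replace `toy_energy` with a force-field evaluation of the conformer.

```python
import math
import random

BITS_PER_TORSION = 4  # assumed resolution: 16 bins of 22.5 degrees each


def decode(bits, n_torsions):
    """Map a flat list of 0/1 values to torsion angles in degrees."""
    angles = []
    for i in range(n_torsions):
        chunk = bits[i * BITS_PER_TORSION:(i + 1) * BITS_PER_TORSION]
        index = int("".join(map(str, chunk)), 2)
        angles.append(index * 360.0 / 2 ** BITS_PER_TORSION)
    return angles


def toy_energy(angles):
    """Hypothetical potential: each torsion contributes 0 at 180 degrees
    and at most 2 at 0 degrees. Stands in for a real force field."""
    return sum(1.0 + math.cos(math.radians(a)) for a in angles)


def pbil(energy, n_torsions, pop_size=50, generations=200,
         learning_rate=0.1, mutation_prob=0.02, mutation_shift=0.05, seed=0):
    """Generic binary PBIL: evolve a probability vector instead of a population."""
    rng = random.Random(seed)
    n_bits = n_torsions * BITS_PER_TORSION
    prob = [0.5] * n_bits  # start with no preference for 0 or 1
    best_bits, best_energy = None, float("inf")

    for _ in range(generations):
        # Sample a population of bit strings from the probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        gen_best = min(population, key=lambda b: energy(decode(b, n_torsions)))
        gen_energy = energy(decode(gen_best, n_torsions))
        if gen_energy < best_energy:
            best_bits, best_energy = gen_best, gen_energy

        for i in range(n_bits):
            # Shift each probability towards the best sample of this generation.
            prob[i] = prob[i] * (1.0 - learning_rate) + gen_best[i] * learning_rate
            # Occasionally perturb a probability to preserve diversity.
            if rng.random() < mutation_prob:
                prob[i] = (prob[i] * (1.0 - mutation_shift)
                           + rng.randint(0, 1) * mutation_shift)

    return decode(best_bits, n_torsions), best_energy


if __name__ == "__main__":
    angles, e = pbil(toy_energy, n_torsions=4)
    print("best torsions (degrees):", angles)
    print("best energy:", e)
```

The sketch keeps only the single best conformer. To collect many low energy conformations, as described above, one could additionally store every sampled conformer whose energy falls below a chosen threshold. Plain binary encoding is used for brevity; a Gray code would be a reasonable alternative, since it avoids large bit changes between adjacent angle bins.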