Tag: modeling

Scaling further GPCR summits

February 5th, 2010

There’s a nice review on GPCRs and their continuing challenges in the British Journal of Pharmacology this month. The authors focus on both structural and functional challenges in the characterization of this most important class of signaling proteins. As is well known, drugs targeting GPCRs generate the highest revenue among all drugs. And given their basic roles in signal transduction, GPCRs are also clearly very important from an academic standpoint. Yet there is a wall of obstacles confronting us.

For starters, there are the well-known problems with crystallization that plague all membrane proteins, GPCRs included. So far only four GPCRs (rhodopsin, the beta1 and beta2 adrenergic receptors, and the A2a adenosine receptor) have been crystallized, and the publication of each structure was considered a breakthrough. As the review mentions, the proteins are unstable outside the membrane, and the conditions for stabilization and crystallization are frequently incompatible; for instance, stabilization is often effected by long-chain detergents, while the opposite holds for crystallization. To circumvent these problems, clever strategies had to be adopted, along with immense trial and error and hard work. The rhodopsin and adrenergic receptors were crystallized with the help of point mutations and special techniques; in one case an antibody was tethered to the protein and in another a fusion protein was attached to stabilize the domain.

It’s when we enter the dense jungle of GPCR biology that crystallization problems almost start sounding trivial. GPCRs respond to a variety of ligands including well-known biogenic amines (like adrenaline and serotonin), peptides, proteins and nucleotides. Where it starts to become complex is in the kind of response these ligands elicit, which could be full agonism, partial agonism, inverse agonism or full antagonism.

What structural features distinguish these different responses from one another? This is a key question in GPCR biology. And ligands are not simply agonists or antagonists; the same ligand can act in different ways on the same GPCR, activating different pathways. The case of partial agonists is especially interesting, and more structures of receptors bound to partial agonists would be quite valuable.

The traditional model of receptor activation assumes two dominant states, inactive and active. Agonists stabilize the active state, neutral antagonists bind both states equally well and so leave the balance between them unchanged, and inverse agonists stabilize the inactive state. But, as the authors say, the traditional model is slowly undergoing a revision:

“The concept of a receptor existing in a simple pair of active and inactive states (R and R*) is no longer sufficient to explain the observations of pharmacology. Agonists vary considerably in their efficacy and how this relates to the bound conformational states is unclear. A partial agonist with 50% efficacy could fully activate 50% of the receptors or could activate 100% of the receptor by 50%. Alternatively, a partial agonist might stabilize a different form of the receptor to a full agonist state and this different conformation might activate the G protein with a lower efficiency. The study of rhodopsin suggests that activation of the receptor involves the release of key structural constraints within the E/DRY and NPxxY regions. Energy provided by agonist binding must be sufficient to break these constraints and stabilize the new active conformation. In the case of rhodopsin, whether this transition is complete or partial depends on the chemical nature of the ligand (Fritze et al., 2003). The retinal analogue 9-demethyl-retinal is a partial agonist of rhodopsin which only poorly activates G protein in response to light. Spin-labeling studies (Knierim et al., 2008) suggest that in the presence of this ligand, only a small proportion of receptors are in the active conformation equivalent to all-trans-retinal. However, this can also result in a new state that is not formed with the full agonist. Therefore, rhodopsin studies suggest that partial agonism may result in either a reduced number of fully active receptors or conformations which are not capable of fully engaging the signal transduction process. Structures of other GPCRs in complex with partial agonists are required to determine their effects on conformation.”
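To see what the simple R/R* picture can and cannot express, here is a minimal sketch of a textbook two-state model. It is not taken from the review: the function name, the parameter names (k_d, alpha, l_const) and all the numbers are my own illustrative assumptions. The ligand binds the inactive state R with dissociation constant k_d and the active state R* alpha times more tightly, and l_const is the basal ratio [R*]/[R].

```python
def fraction_active(conc, k_d, alpha, l_const):
    """Fraction of receptors in the active state R* in a two-state model.

    conc     ligand concentration (same units as k_d)
    k_d      dissociation constant of the ligand for the inactive state R
    alpha    how much more tightly the ligand binds R* than R
    l_const  basal equilibrium constant [R*]/[R] with no ligand bound
    """
    c = conc / k_d
    active = l_const * (1.0 + alpha * c)   # free R* plus ligand-bound R*
    inactive = 1.0 + c                     # free R plus ligand-bound R
    return active / (active + inactive)


# Hypothetical ligands at a saturating concentration, basal [R*]/[R] = 0.1
basal = fraction_active(0.0, 1.0, 1.0, 0.1)
for label, alpha in [("full agonist", 1000.0), ("partial agonist", 10.0),
                     ("neutral antagonist", 1.0), ("inverse agonist", 0.1)]:
    print(label, round(fraction_active(1e9, 1.0, alpha, 0.1), 3))
```

In this sketch a partial agonist can only mean a smaller fraction of receptors sitting in one and the same R* state. A distinct, less efficient conformation of the kind the quote describes simply has no place in the model, which is one way of seeing why it is being revised.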

An example makes the hideous complexity clear. The mu-opioid receptor is activated by several ligands including morphine, etorphine and fentanyl. However, morphine acts only as a partial agonist in effecting a phosphorylation endpoint whereas the other two act as full agonists. But it gets more interesting. While morphine effects phosphorylation of the kinase ERK through activation of PKC (protein kinase C), etorphine also activates ERK but by activation of beta-arrestin. Thus the same endpoint can be effected through different pathways. And it doesn’t even stop there. Morphine causes the phosphorylated ERK to stay in the cytoplasm while etorphine causes the ERK to translocate to the nucleus. Not done yet; in addition, morphine can reverse its role and act as a full agonist on the adenylyl cyclase pathway.

Thus, the same ligand adopts different roles when activating different pathways. To begin with it’s not even clear which pathway is activated under what circumstances. And the problem is only accentuated by the participation of different G proteins in inducing different responses.

Another dense layer of complexity is added by the fact that GPCRs have been found to dimerize and oligomerize. Crystallography can mislead here, since there are several documented reports of dimers that turned out to be artifacts of the crystallization conditions.

Apart from the stated problems, there are even more differences in further downstream signaling and receptor internalization induced by oligomerization. It’s clearly a jungle out there. No wonder the design of drugs targeting GPCRs needs a measure of faith. For instance consider the various drugs targeting CNS proteins. CNS drug discovery has long been considered a black box for a good reason. Once a drug enters the brain, one can imagine it not only targeting a diverse subset of GPCRs (and even other classes of proteins) but, given the above complexities, also acting separately as agonist and antagonist at the various receptors. We clearly have a long way to go before we can prospectively design a CNS drug that will do all this on cue.

It would be a tall order trying to explain all these differences simply through structural modifications induced by the ligands. Yet whatever signal is eventually transmitted to the G proteins must begin with a crucial structural movement. It seems that elucidating the differences in helix and loop movements induced by partial and full agonists, inverse agonists and antagonists is a tantalizing part of the GPCR puzzle.

Since crystal structure data on GPCRs are scarce, modeling approaches, especially those based on homology modeling, have proved fruitful. Earlier attempts were all based on the single rhodopsin template. Since then the higher-resolution adrenergic and adenosine receptor structures have provided significant insight. But here again caveats abound. Modeling the helices is relatively easy since all GPCRs share the same highly conserved 7TM helix topology, but modeling the fine differences between helices that lead to structural changes upon ligand binding is harder. And most difficult and important of all is modeling the extracellular loops which actually bind the ligands. Subtle changes in loop movement, salt-bridge breakage, hydrophobic effects and the interaction of loops with helices are difficult to model. Often a change in the conformation of a single residue can be enough to throw the modeling off balance. Nonetheless, the paucity of structural data means that modeling, when done right, will continue to be valuable. In the absence of structural data, computational ligand-based approaches, which search for compounds similar to known ligands, could also be useful.
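As a rough illustration of what a ligand-based similarity search does, here is a minimal sketch that ranks compounds by the Tanimoto coefficient between fingerprints. The fingerprints are hypothetical sets of integers standing in for structural features; a real workflow would generate them with a cheminformatics toolkit, and the compound names and function names below are made up for the example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: each integer marks a structural feature that is present.
known_ligand = {1, 4, 7, 9, 12}
library = {
    "candidate_a": {1, 4, 7, 9, 15},
    "candidate_b": {2, 3, 5},
    "candidate_c": {1, 4, 12, 20, 21, 22},
}
ranked = rank_by_similarity(known_ligand, library)
for name, score in ranked:
    print(name, round(score, 3))
```

Note that nothing here knows about the receptor at all; the ranking rests entirely on the assumption that similar-looking molecules behave similarly, which is exactly the kind of assumption that partial and pathway-selective agonism can break.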

We have made a lot of progress in understanding the structure and function of these key proteins. But investigations seem to have unearthed more questions than answers. That is always good for science, since it gives us more choice fodder for contemplation.

Congreve, M., & Marshall, F. (2009). The impact of GPCR structures on pharmacology and structure-based drug design. British Journal of Pharmacology. DOI: 10.1111/j.1476-5381.2009.00476.x

Zheng, H., Loh, H., & Law, P. (2010). Agonist-selective signaling of G protein-coupled receptor: Mechanisms and implications. IUBMB Life. DOI: 10.1002/iub.293

More model perils; parametrize this

November 30th, 2009

Now here’s a very interesting review article that puts in perspective some of the pitfalls of models that I have mentioned on these pages. The article is by Jack Dunitz and his long-time colleague Angelo Gavezzotti. Dunitz is in my opinion one of the finest chemists and technical writers of the last half century and I have learnt a lot from his articles. Two that are on my “top 10” list are his article showing the entropic gain accrued by displacing water molecules in crystals and proteins (a maximum of 2 kcal/mol for strongly bound water) and his paper demonstrating that organic fluorine rarely, if ever, forms hydrogen bonds.

In any case, in this article he talks about an area in which he is the world’s acknowledged expert: organic crystal structures. Understanding and predicting (the horror!) crystal structures essentially boils down to understanding the forces that make molecules stick to each other. Dunitz and Gavezzotti describe theoretical and historical attempts to model forces between molecules, and many of their statements about the inherent limitations of modeling these forces rang as loudly in my mind as the bell in Sainte-Mère-Église during the Battle of Normandy.

Dunitz has a lot to say about atom-atom potentials, which are the most popular framework for modeling inter- and intramolecular interactions. Basically such potentials assume simple functional forms that model the attractive and repulsive interactions between nuclei, which are treated as rigid balls. This is also of course the fundamental approximation in molecular mechanics and force fields. The interactions are basically Coulombic interactions (relatively simple to model) and more complicated dispersion interactions which are essentially quantum mechanical in nature. The real and continuing challenge is to model these weak dispersive interactions.
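To make “simple functional forms” concrete, here is a minimal sketch of one common choice: a Coulomb term plus a 12-6 Lennard-Jones term summed over atom pairs. This is my own illustration and not the specific potential discussed in the review. The single epsilon and sigma shared by every pair, the function names and the numbers are all simplifying assumptions.

```python
import math

COULOMB_KCAL = 332.06  # converts e^2/angstrom to kcal/mol


def pair_energy(r, q_i, q_j, epsilon, sigma):
    """Interaction energy (kcal/mol) of two atoms a distance r (angstrom) apart."""
    coulomb = COULOMB_KCAL * q_i * q_j / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)  # r^-12 repulsion, r^-6 dispersion
    return coulomb + lennard_jones


def total_energy(mol_a, mol_b, epsilon, sigma):
    """Sum pair energies over all atom pairs between two rigid molecules.

    Each atom is a tuple (x, y, z, charge); one epsilon/sigma is used for
    every pair, which a real force field would not do.
    """
    total = 0.0
    for xa, ya, za, qa in mol_a:
        for xb, yb, zb, qb in mol_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            total += pair_energy(r, qa, qb, epsilon, sigma)
    return total


# Two hypothetical neutral atoms at the bottom of the Lennard-Jones well
r_min = 2 ** (1 / 6) * 3.4
print(pair_energy(r_min, 0.0, 0.0, 0.1, 3.4))
```

Every number that goes into such a function (the charges, epsilon, sigma) is a fitted parameter, which is precisely where the trouble discussed next begins.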

But the problem is fuzzy. As Dunitz says, atom-atom potentials are popular mainly because they are simple in form and easy to calculate. However, they have scant, if any, connection to “reality”. This point cannot be stressed enough. As this blog has noted several times before, we use models because they work, not because they are real. The coefficients in the functional forms of the atom-atom potentials are essentially varied to minimize the potential energy of the system and there are several ways to skin this cat. For instance, atomic point charges are rather arbitrary (and definitely not “real”) and can be calculated and assigned by a variety of theoretical approaches. In the end, nobody knows if the final values or even the functional forms have much to do with the real forces inside crystals. It’s all a question of parameterization which gives you the answer, and while parameterization may seem like a magic wand which may give you anything that you want, that’s precisely the problem with it…that it may give you anything that you want without reproducing the underlying reality. Overfitting is also a constant headache and in my opinion one of the biggest problems with any modeling, whether in chemistry, quantitative finance or atmospheric science. More on that later.

An accurate treatment of intermolecular forces will have to take electron delocalization into consideration. The part which is the hardest to deal with is the part close to the bottom of the famous van der Waals energy curve, where there is an extremely delicate balance between repulsion and attraction. Naturally one thinks of quantum mechanics to handle such fine details. A host of sophisticated methods have been developed to calculate molecular energies and forces. But those who think QM will take them to heaven may be mistaken; it may in fact take them to hell.

Let’s start with the basics. In any QM calculation one uses a certain theoretical framework and a certain basis set to represent atomic and molecular orbitals. One then adds terms to the basis set to improve accuracy. Consider Hartree-Fock theory. As Dunitz says, it is essentially useless for dealing with electron delocalization because it does not take electron correlation into account, no matter how large a basis set you use. More sophisticated methods have names like “Moller-Plesset perturbation theory with second order corrections” (MP2) but these may greatly overestimate the interaction energy, and more importantly the calculations become hideously computer intensive for anything more than the simplest molecules.

True, there are “model systems” like the benzene dimer (which has been productively beaten to death) for which extremely high levels of theory have been developed that approach experimental accuracy within a hairsbreadth. But firstly, model systems are just that, model systems; the benzene dimer is not exactly a molecular arrangement which real-life chemists deal with all the time. Secondly, a practical chemist would rather have an accuracy of 1 kcal/mol for a large system than an accuracy of 0.1 kcal/mol for a simple system like the benzene dimer. Thus, while MP2 and other methods may give you unprecedented accuracy for some model systems, they are usually too expensive for most systems of biological interest and therefore not very useful there.

DFT still seems to be one of the best techniques around to deal with intermolecular forces. But “classical” DFT suffers from a well-known inability to treat dispersion. “Parameterized DFT” in which an inverse sixth power term is added to the basic equations can work well and promises to be a very useful addition to the theoretical chemist’s arsenal. More parameterization though.
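For the curious, the “inverse sixth power term” has roughly the following shape. This is a sketch in the general style of the empirical dispersion corrections that are usually associated with Grimme’s work, not a formula quoted from the review. The function name is mine, and the defaults for s6 and d as well as the example numbers are placeholders rather than fitted values.

```python
import math


def damped_dispersion(r, c6, r_vdw, s6=1.0, d=20.0):
    """Pairwise dispersion correction: -s6 * C6 / r^6, switched off at short range.

    r      interatomic distance
    c6     dispersion coefficient for the atom pair
    r_vdw  sum of the van der Waals radii of the pair (where the damping kicks in)
    s6, d  global scaling factor and damping steepness (illustrative defaults)
    """
    damping = 1.0 / (1.0 + math.exp(-d * (r / r_vdw - 1.0)))
    return -s6 * c6 / r ** 6 * damping


# Hypothetical pair: the correction is half-damped exactly at r = r_vdw
print(damped_dispersion(3.0, 100.0, 3.0))
```

The damping function is there because the bare -C6/r^6 term blows up at short distances, where the density functional already describes the interaction. Both the switching distance and the scaling factor are, once again, parameters.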

And yet, as Dunitz points out, problems remain. Even if one can accurately calculate the interaction energy of the benzene dimer, it is not really possible to know how much of it comes from dispersion and how much of it comes from higher order terms. Atom-atom potentials are happiest calculating interaction energies at large distances, where the Coulomb term is pretty much the only one which survives, but at small interatomic distances which are the distances most of interest for the chemist and the crystallographer, a complex dance between attraction and repulsion, monopoles, dipoles and multipoles and overlapping electron clouds manifests itself. The devil himself would have a hard time calculating interactions in these regions.

The theoretical physicist turned Wall Street quant Emanuel Derman (author of the excellent book “My Life as a Quant: Reflections on Physics and Finance”) says that one of the problems with the financial modelers on Wall Street is that they suffer from “physics envy”. Just like in physics, they want to discover three laws that govern 99% of the financial world. More predictably, as Derman says, they end up discovering 99 laws that seem to govern 3% of the financial world with varying error margins. I would go a step further and say that even physics is accurate only in the limit of ideal cases, and this deviation from absolute accuracy distinctly shows in theoretical chemistry. Just consider that the Schrödinger equation can be solved exactly only for the hydrogen atom, which is where chemistry only begins. Anything more complicated than that, and even the most austere physicist cannot help but approximate, parametrize, and constantly struggle with errors and noise. As much as the theoretical physicist would like to tout the platonic purity of his theories, their practical applications would without exception involve much approximation. There is a reason why that pinnacle of twentieth century physics is called the Standard Model.

I would say that computational modelers in virtually every field from finance to climate change to biology and chemistry suffer from what Freeman Dyson has called “technical arrogance”. We have made enormous progress in understanding complex systems in the last fifty years and yet when it comes to modeling the stock market, the climate or protein folding, we seem to think that we know it all. But we don’t. Far from it. Until we do all we can do is parametrize, and try to avoid the fallacy of equating our models with reality.

That’s right Dorothy. Everything is a model. Let’s start with the benzene dimer.

Dunitz, J., & Gavezzotti, A. (2009). How molecules stick together in organic crystals: weak intermolecular interactions. Chemical Society Reviews, 38 (9). DOI: 10.1039/b822963p

The model zoo

October 19th, 2009

So I am back from the eCheminfo meeting at Bryn Mawr College. For those having the inclination (both computational chemists and experimentalists), I would strongly recommend the meeting for the small group and consequent close interaction. The campus with its neo-gothic architecture and verdant lawns provides a charming environment.

Whenever I go to most of these meetings I am usually left with a slightly unsatisfied feeling at the end of many talks. Most computational models to describe proteins and protein-ligand interactions are patchwork models based on several approximations. Often one finds several quite different methods (force fields, QSAR, quantum mechanics, docking, similarity based searching) giving similar answers to a given problem. The choice of method is usually made on the basis of availability and computational power and past successes, rather than some sound judgement allowing one to choose that particular method over all others. And as usual it depends on what question you are trying to ask.

But in such cases, I am always left with two questions. Firstly, if several methods give similar answers (and sometimes if no method gives the right answer), then which is the “correct” method? And secondly, because there is no one method that gives the right answer, one cannot escape the feeling at the end of a presentation that the results could have been obtained by chance. Sadly, it is not always possible to actually calculate the probability that a result was obtained by chance. An example is our own recently published work on the design of a kinase inhibitor; docking was remarkably successful in this endeavor, and yet it’s hard to pinpoint why it worked. Similarly, a professor might use some complex model combining neural networks and machine learning and get results agreeing with experiment, and yet by that time the model may have become so abstract and complex that one would have trouble understanding any of its connections to reality (that is partly what happened to financial derivatives models, when their creators themselves stopped understanding why they really worked, but I am digressing…).
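In the simplest situation the chance question does have an answer. If a method picks a handful of compounds from a library with a known number of actives, the probability of doing at least that well by picking at random follows the hypergeometric distribution. The sketch below assumes exactly that idealized setting; the function name and the library numbers are hypothetical and are not taken from our kinase work.

```python
from math import comb


def prob_at_least(hits, picked, actives, library_size):
    """Probability of finding at least `hits` actives among `picked` compounds
    drawn at random from a library that contains `actives` true actives."""
    total = comb(library_size, picked)
    upper = min(picked, actives)
    favourable = sum(
        comb(actives, k) * comb(library_size - actives, picked - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical screen: 1000 compounds, 10 actives, top 20 selected, 3 actives found
print(prob_at_least(3, 20, 10, 1000))
```

The catch, of course, is that this null model needs the number of actives in the library and assumes every compound is equally likely to be picked, and in a prospective design project neither is usually known.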

However, I remind myself in the end about something that is always easy to forget; models are emphatically not supposed to be “correct” from the point of view of modeling “reality”, no matter what kind of fond hopes their creators may have. The only way in which it is possible to gauge the “correctness” of a model is by comparing it to experiment. If several models agree with experiment, then it may be meaningless to really argue about which one is the right one. There are metrics suggested by people to discriminate between such similar models, for instance employing that time-honored principle of Occam’s Razor where a model with fewer parameters might be better. Yet in practice such philosophical distinctions are hard to apply and the details can be tricky.
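Here is a toy illustration of why the model with fewer parameters can be the better bet. It is a sketch on made-up data, not a general recipe for model selection: five hypothetical measurements of a true trend y = x carry an alternating error of 0.5, and a two-parameter straight line is compared with a five-parameter polynomial that passes through every point.

```python
def fit_line(xs, ys):
    """Least-squares straight line (2 parameters); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpolate(xs, ys, x):
    """Lagrange polynomial through every point (as many parameters as data points)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical "experiment": true trend y = x, measured with alternating +/-0.5 error
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5, 0.5, 2.5, 2.5, 4.5]

slope, intercept = fit_line(train_x, train_y)
line_at_5 = slope * 5.0 + intercept                # prediction for an unseen point
poly_at_5 = interpolate(train_x, train_y, 5.0)
print(line_at_5, poly_at_5)
```

The polynomial agrees with “experiment” perfectly on the five training points, yet at the unseen point x = 5 (true value 5) it predicts about 20.5, while the humble line predicts about 5.1. Agreement with the data you fitted to is cheap; it is the unseen point that discriminates between the models.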

Ultimately, while models can work well on certain systems, I can never escape the nagging feeling that we are somehow “missing reality”. Divorcing models from reality, irrespective of whether they are supposed to represent reality or not, can have ugly consequences, and I think all these models are in danger of falling into a hole on specific problems; adding too many parameters to comply with experimental data can easily lead to overfitting for instance. But to be honest, at this point what we are trying to model is so complex (the forces dictating protein folding or protein-ligand interactions only get more and more convoluted like Alice’s rabbit hole) that this is probably the best we can do. Even ab initio quantum mechanics involves acute parameter fitting and approximations in modeling the real behavior of biochemical systems. The romantic platonists like me will probably have to wait, perhaps forever.
