George M. von Furstenberg and Joseph P. Daniels

The authors offer a yardstick for measuring the credibility of commitments made by industrial country leaders at annual G-7 summits. Their main purpose is to provide an objective basis for holding political leaders accountable. Measured by this yardstick, how well have the leaders done at past summits? The answer is, perhaps not nearly as badly as cynics would have us believe, but then again, not nearly as well as they should.

For government announcements to reduce the costs of desired economic adjustments through concerted action, credibility is everything. But how do policy announcements come to merit belief? Governments speak with many tongues at several levels, and separate branches and independent agencies may diffuse authority. However, the preconditions for credibility could hardly be better than when heads of governments exchange promises in the glare of publicity at the annual economic summits.

According to game theorists, in repeated games that leave a record on which reputations can be built, players learn to collaborate. They can do so quite effectively even when not bound together by any particular esprit de corps. Yet, as our detailed investigation shows, the more than 200 quantifiable commitments in economic declarations issued at the first 15 summits (1975-89) of the Group of Seven (G-7) do not appear to have been very credible in most areas.

The Metric. With the rational-expectations revolution having put "credibility" on everyone's lips, it is surprising that there have been no attempts to develop a standard metric for systematic appraisals of performance. For this reason we had to prepare our own. Although we have experimented with both linear and nonlinear scoring functions, the linear scores are easiest to deal with. The range and scale of these scores are set by analogy with belief functions, where complete belief, at one extreme, is given an index value of 1 and complete disbelief, at the other, an index value of -1. Intermediate values would indicate some weighted combination of beliefs in a proposition and its opposite, with 0 indicating equal belief in both.

This type of metric can be used to score the degree of compliance, non-compliance, or contravention of economic undertakings. The scores compare change achieved with change promised. Dividing one change by the other gives a dimensionless number so that scores can be aggregated. Individual scores measure the degree of compliance, with values ranging from 1 for full (or overfull, since more change is not necessarily better) conformance to -1 for "achieving" the opposite (or worse than the opposite) of the change promised. Taken together, the scores show the average degree of compliance by the G-7 countries. Scores by the types of economic variables addressed in summit commitments are also distinguished in the table.
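The linear score described above can be sketched in a few lines of Python. This is only an illustration, assuming the score is the ratio of change achieved to change promised, capped at 1 and floored at -1; the function and parameter names are hypothetical, and the authors' actual scoring procedure may treat details differently.

```python
def compliance_score(baseline: float, promised: float, achieved: float) -> float:
    """Linear compliance score: change achieved divided by change promised.

    Hypothetical helper for illustration only. The ratio is capped at 1
    (overfull compliance earns no extra credit) and floored at -1
    (the opposite of the promise, or worse).
    """
    promised_change = promised - baseline
    if promised_change == 0:
        # A promise of no change gives no denominator; this sketch
        # simply refuses to score it.
        raise ValueError("promised change must be nonzero")
    ratio = (achieved - baseline) / promised_change
    return max(-1.0, min(1.0, ratio))


# A promise to raise growth from 3 percent to 5 percent:
print(compliance_score(3.0, 5.0, 4.0))  # 0.5, halfway there
print(compliance_score(3.0, 5.0, 2.0))  # -0.5, moved the wrong way
print(compliance_score(3.0, 5.0, 6.0))  # 1.0, overfull compliance is capped
print(compliance_score(3.0, 5.0, 0.0))  # -1.0, worse than the opposite is floored
```

Because each score is a ratio of two changes in the same units, scores for growth, inflation, or energy targets can be averaged together without further conversion.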

What the Record Shows. The number of macroeconomic, trade, or energy-policy undertakings that were sufficiently specific to be scored ranged from a high of 26 at the first Bonn summit (Bonn I, 1978) to a low of 4 at Tokyo II (1986). The simple average scores per summit also ranged very widely, from a high of 0.86 at Venice II (1987) to a low of -0.45 for the Toronto summit the very next year. More interesting, perhaps, is the record of commitment fidelity by country. Here the United Kingdom and Canada have taken the high road, while France and Japan stuck to the bottom. However, little should be made of small differences. The entries in the last column of the table show how large the standard errors of the mean scores would be expected to be if the commitments, on average, had no predictive significance at all.

The hypothesis of zero average credibility can be tested by comparing the simple average in the first column of numbers with 2 times the standard error in the last column (the critical multiple rises to 2.5 if the number of undertakings, N, being averaged in the group is small, say 7). If the average is greater than the appropriate multiple of the standard error, the commitments meant something of what they said, on average.
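As a rough illustration of this test, the following Python sketch recomputes the last-column standard error as (N-1) to the power -0.5 and compares it with the average score. The function names are hypothetical, and the cutoff for a "small" group (N of 7 or fewer) is an assumption made here, since the text gives 7 only as an example. The averages and counts are taken from the article's table.

```python
def null_standard_error(n: int) -> float:
    """Standard error of the mean score when the SD of scores is taken as 1."""
    if n < 2:
        raise ValueError("need at least two undertakings")
    return (n - 1) ** -0.5


def critical_multiple(n: int, small_n: int = 7) -> float:
    # Assumption: groups with N <= small_n use the stricter multiple of 2.5.
    return 2.5 if n <= small_n else 2.0


def exceeds_threshold(average: float, n: int) -> bool:
    """True if the average score is above the critical multiple of the standard error."""
    return average > critical_multiple(n) * null_standard_error(n)


# (average score, N) for a few rows of the table
rows = {
    "All Undertakings": (0.317, 209),
    "United States": (0.286, 33),
    "International Trade": (0.758, 7),
    "Energy": (0.668, 25),
}
for name, (average, n) in rows.items():
    threshold = critical_multiple(n) * null_standard_error(n)
    verdict = "passes" if exceeds_threshold(average, n) else "fails"
    print(f"{name}: average {average:.3f} vs threshold {threshold:.3f} -> {verdict}")
```

On these figures the full set of 209 undertakings and the energy group clear the threshold, while the United States row and the small international-trade group do not, despite the latter's high average.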

By this test, all 209 commitments together clearly had some predictive significance, or at least pointed in the right direction. Whether this is because of some success at forecasting impending changes or due to genuine policy effort facilitated or inspired by the summit process we dare not say. Only mustering such effort at the summit could, of course, make the "news" that might change anyone's forecast and give the summits an informational edge. But regardless of whether they involve such news, summit undertakings or prognostications are far from completely fulfilled. They are more like 30 percent proof (first line of table), and weaker still if the generally quite reliable commitments in the area of energy are excluded (second-to-last line).

Mercifully for harmony in international relations, the average scores by country, while never conclusively different from 0 individually, were at least all positive. There is one large negative score by subject area: the few commitments to greater exchange-rate stability that were rated sufficiently concrete have a score of -0.7. But there were only two such commitments, so the finding lacks significance. International trade restraint (mostly by Japan) and energy commitments get high positive scores. The largest group of commitments relates to another, though not unrelated, nominal stability objective. Over 40 percent of all undertakings promised to reduce inflation in the year ahead, but little was generally accomplished. Hence average scores improve considerably if these undertakings are excluded (last line of the table).

There is more apparent success with commitments to raise real growth. When the outlook was still soured by the 1982-83 recession, no commitments in this category were made. Still, it is obviously not true that the heads of government exchange mainly "safe" promises at the summit. Rather, they appear to be willing to take considerable reputational risks even with commitments that are most telling and hence damaging to ignore. Indeed, the entries in a lower couplet of lines in the table show that when heads of government sign off on the use of (mostly fiscal) direct policy measures, they do not score any higher than on all other commitments.

How G-7 Promises Have Been Kept
(Undertakings of 15 Summits, 1975-89)

Score:                      Average     SD     N  (N-1)^-0.5*
All Undertakings              0.317  0.688   209        0.069

By country
United States                 0.286  0.707    33        0.177
Japan                         0.269  0.649    29        0.189
Germany                       0.340  0.739    24        0.209
France                        0.239  0.613    24        0.209
United Kingdom                0.411  0.757    21        0.224
Italy                         0.281  0.680    27        0.196
Canada                        0.391  0.662    26        0.200

By function and controllability
1. Real GNP Growth            0.432  0.625    18        0.243
2. Demand Composition         0.268  0.825     7        0.408
3. International Trade        0.758  0.358     7        0.408
4. Fiscal Adjustments         0.263  0.692    40        0.160
5. Interest Rate              0.266  0.547    21        0.224
6. Inflation Rate             0.222  0.727    84        0.110
7. Foreign Exchange Rate     -0.720  0.281     2            1
8. Aid and Schedules          0.265  0.388     5          0.5
9. Energy                     0.668  0.558    25        0.204

Direct Policy Measures        0.278  0.635    10        0.333
All Others                    0.319  0.691   199        0.071

All Except Energy             0.269  0.690   184        0.074
All Except Inflation          0.381  0.653   125        0.090

* This is the standard deviation (SD) of the average score under the joint null hypothesis that the population value of the SD of scores is 1 because summit ambition and effect are both 0.

Why the Low Scores? Perhaps the reason for the low average scores is that politicians will invest in credibility to a degree that balances the political attractiveness of making bold promises against the risk of outcomes falling short. The time horizon of politicians, particularly those who expect to be reelected only one more time, may be quite limited. Knowing this, the public may believe them only a little when they promise a lot. This, however, does not mean that, given a reputation for wide exaggeration and more miss than hit, politicians would be better off should they all suddenly cease to exaggerate their ability and willingness to influence events. Except when the situation leaves no hope of any quick remedy or turnaround at all, it may well be politically safer for heads of government to appear actively engaged than to plead incapacity or reveal a lack of concern. They may then prefer to address the major concerns of their citizens at the summit, at least rhetorically, in a manner that leads to concrete undertakings documenting their commitment, even if the chances of following through or getting lucky are low. The hearts and minds of voters, not credibility with each other, are what government leaders principally are fighting for.

Everybody Knows Different. Judgments about compliance and the "success" of summits differ widely. At one extreme, Peter Hajnal, in The Seven Power Summit: Documents from the Summit of Industrialized Countries 1975-1989 (Millwood, NY: Kraus International Publications, 1989) has concluded that "(d)espite a recurrent desire for deliberative rather than decision making summits, and a persistent cynicism about the willingness or ability of the leaders to keep their summit commitments once the gathering is over, the summits generally do arrive at a consensus and make decisions that stick." An editorial in a leading German newspaper (Frankfurter Allgemeine Zeitung, June 22, 1988) came to the exact opposite conclusion around the same time. It charged that attempts at summit coordination of economic and exchange-rate policies rarely succeed, and that the obligations commonly adopted are either not observed or generally followed by disastrous consequences. Similarly, the London Economist (April 14, 1990, p. 16), in an editorial entitled "The G7 Charade," asked the reader to consider "macroeconomic coordination: currencies, interest rates, budget deficits and all that" and found the record, except within the European Monetary System, "unconvincing."

Obviously the standards of journalistic judgments are unclear, but discourse among international economists is filled with equally peremptory judgments. Few writers in this field are as careful and nuanced as Wendy Dobson in her Economic Policy Coordination: Requiem or Prologue? (Washington: Institute for International Economics, 1991), a work that nevertheless shows that a common standard of reference is very much needed for the different verdicts. We have attempted to provide a unifying factual basis that could inform these verdicts.

What is In It? The quantified appraisals described here address only the issue of compliance, and not whether the original commitments were well advised and should have been adhered to even once circumstances had changed. Limiting ourselves to establishing the degree of compliance with declared undertakings has the advantage that scores can be assigned to any possible future outcome once pledges have been exchanged at the summit. If a country promised to raise its growth rate, say, from 3 percent this year to 5 percent next, we know what score realizing growth of 2 percent (-0.5), 4 percent (0.5), or whatever it may turn out to be next year, will fetch. Presumably the summit participants also know all or more of the risks and contingencies ahead of time. If they make reasonably precise and unqualified commitments nonetheless, it is fair to let performance reflect on the credibility of their commitments. Extenuating circumstances that may develop have no bearing on the scoring, giving it some objective validity.

With an unconditional scoring scheme in place, scores can be derived as soon as the first data are in. Thus accountability - crucial for the functioning of political delegation, international cooperation, and the organization of principal-agent relationships - may be enhanced. Agreed practical verification procedures may encourage compliance and discourage false or adventitious promises, so that the performance scores and a better system for scoring may interact. Far from being a mechanical formality, monitoring credibility could thus provide clues as to which variables politicians like to hide from the public: what they are really not up to, not even at the summit.

Dr. von Furstenberg is Rudy professor of economics at Indiana University, with prior service at both the U.S. Council of Economic Advisers and the International Monetary Fund; Mr. Daniels, an assistant professor of economics at Marquette University, is currently completing his Ph.D. dissertation, entitled "The Meaning and Reliability of Economic Summit Undertakings: 1975-1989."

This article is based on their "Policy Undertakings by the Seven 'Summit' Countries: Ascertaining the Degree of Compliance," in Charles T. Plosser and Allan H. Meltzer, eds., Carnegie-Rochester Conference Series on Public Policy, Vol. 35, and their Economic Summit Declarations 1975-1989: Examining the Written Record of International Cooperation, Princeton Studies in International Finance, No. 72 (Princeton, N.J.: International Finance Section of Princeton University, 1992).

Source: Von Furstenberg, George M., and Joseph P. Daniels. "Can You Trust G-7 Promises?" International Economic Insights 3 (September/October 1992): 24-27. ©Institute for International Economics. Reproduced by permission of Institute for International Economics.

