Inconsistent evidence for price substitution between butter and margarine: a shallow review

Apr. 13

Authors: Samara Mendez, Jacob Peacock, and Ben Stevenson

Please note that this report was updated on April 27, 2023.

Executive summary

One prominent strategy for reducing animal product usage is to decrease the prices of plant-based analogs for meat, dairy and eggs. Cross-price elasticities measure how prices of analogs affect sales of animal products, and vice versa.
Previously, we found cross-price elasticities that sometimes indicated decreased plant-based milk prices could cause increased consumption of dairy milk. To see whether this result replicated, we studied elasticities of butter and margarine.
We synthesize 51 cross-price elasticities from 18 demand studies of butter and margarine.
1. We expected butter and margarine to be substitutes. Instead, we observed wide variation in estimates, complementarity, and even opposite signs of two elasticities for the same pair of products in the same setting (study, time, country, etc.).
2. Margarine was a substitute for butter in two thirds of estimates, while butter was a substitute for margarine in about half of estimates.
The results are likely explained by some combination of methodological issues and complex consumer behavior.
1. Estimating elasticities from observational data is very difficult, and there is some evidence of methodological issues in the literature. Consumer behavior may be highly context specific.
2. Our best highly uncertain guess is that complex consumer behavior contributes slightly more of the observed variation in results, with methodological issues accounting for the remaining slight minority of variation.
Price substitution between plant-based analogs and animal products is not a certainty.
1. It is possible that decreasing plant-based analog prices causes harmful increases in animal product usage in some contexts.
2. Research aiming to test behavioral theories that might explain our results and to validate observational estimates of elasticities against experimental methods may help clarify consumer behavior.

Abstract

Previously, we found unexpected results in a synthesis of cross-price elasticities between plant-based and animal-based milk: the estimates display a large amount of variation (including contradictory economic interpretations) within studies, time periods, geography, and even within pairs of products in a single study. We further examine this problem in a brief review of the demand estimation literature and relevant meta-analyses. We next synthesize cross-price elasticities for a new pair of animal- and plant-based analog products—butter and margarine—to see whether we find similarly unexpected results in a different context. We find that margarine is a substitute for butter in two thirds of estimates, while butter is a substitute for margarine in about half of the estimates; this lack of consensus is similar to the unexpected results we found in milk. We examine methodological issues and complex consumer behavior as potential explanations, and we discuss the implications of each explanation. If these results indeed reflect consumer behavior, they suggest that reducing prices of plant-based analogs may sometimes cause increases, rather than decreases, in consumption of corresponding animal products.

Introduction

Consumer demand theory is a cornerstone of economic research, aiming to explain the way consumers decide how much of a particular good or bundle of goods to purchase given their income. The theory generates individual demand curves of different goods for a single consumer as well as market demand curves for a specific good or service, which are aggregations of the individual curves of all consumers. Demand curves are characterized by a series of parameters useful to understand how consumers’ preferences respond to external and internal shocks. For example, own-price elasticity of demand¹ indicates how consumers’ purchases of one good respond to changes in the good’s own price; cross-price elasticities indicate responses to changes in the prices of other products. In general, we interpret elasticities as the ratio of two percentage changes. For example, if the cross-price elasticity of a gallon of whole milk with respect to soy milk is 1.5, this indicates that if the price of soy milk decreases by 1%, the number of gallons of whole milk demanded will decrease by 1.5%. Furthermore, the magnitudes and signs of elasticities have economic interpretations: a negative cross-price elasticity indicates a complementary relationship between a pair of two products, while a positive elasticity indicates a substitute relationship.
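To make the arithmetic of this interpretation concrete, here is a minimal Python sketch. The function names are ours and purely illustrative. The calculation uses the simple linear approximation (percentage change in quantity equals elasticity times percentage change in price), which is reasonable only for small price changes.

```python
def predicted_quantity_change(cross_price_elasticity, price_change_pct):
    """Approximate % change in quantity demanded of good A
    when the price of good B changes by price_change_pct percent."""
    return cross_price_elasticity * price_change_pct


def classify_relationship(cross_price_elasticity):
    """Economic interpretation of the sign of a cross-price elasticity."""
    if cross_price_elasticity > 0:
        return "substitute"
    if cross_price_elasticity < 0:
        return "complement"
    return "independent"


# Whole milk demand with respect to soy milk price, as in the example above:
# an elasticity of 1.5 and a 1% decrease in the price of soy milk.
print(predicted_quantity_change(1.5, -1.0))  # -1.5
print(classify_relationship(1.5))            # substitute
```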

Empirical estimates of own- and cross-price elasticities have a wide range of uses, from forecasting market demand to supporting cases of antitrust litigation. Cross-price elasticities are especially important for brand owners and businesses to understand how to penetrate new markets and to set prices of new products. Demand elasticities are also used ex ante to analyze the effect of public policies. In these contexts, own- and cross-price elasticities are used to predict consumers’ response to market and policy shocks. For example, they are used to understand the effects of so-called sin taxes, designed to reduce the purchase of unhealthy products, on the consumption of soda, alcohol, cigarettes, fat, etc. (Allcott et al., 2019). Similarly, the elasticities of the demand for energy such as gasoline (West & Williams, 2007) and carbon (Brännlund & Nordström, 2004) can be used to set environmental policies. Cross-price elasticities are especially useful for policymakers to understand the effects of policies across related categories, as, for example, the introduction of a tax on sugar can increase the consumption of alcohol (Quirmbach et al., 2018).

Our review of cross-price elasticities of plant-based milk (for example, soy and almond milk) and animal-based (dairy) milk showed substantial unexpected variation in estimates across studies and contexts (Mendez & Peacock, 2021). We collected elasticity estimates and study characteristics from studies on plant-based and dairy milk, expecting to see something of a consensus of estimates that indicated plant-based milks to be substitutes for dairy milks, and vice versa. However, we observed unexpected variation and a high rate of unexpected signs, including complementarity and even instances of opposite signs on the two elasticities for a pair of products. For example, soy milk seemed to act as a complement for whole milk and a substitute for skim milk; and almond milk seemed to be a substitute for dairy milk, but dairy milk seemed to be a complement for almond milk.

Economic theory of course does not require that products exhibit the same relationships in all situations, nor even that elasticities be symmetric within a pair of products, as we discuss in more detail below. However, the situations described by theory are fairly unintuitive, and the milk study results defied a simple narrative of the products’ relationships. We looked for explanations for the milk results among the studies’ methodologies and product contexts but did not identify strong patterns. We were unable to conduct meta-analysis as most of the studies did not report standard errors,² a common problem in the cross-price elasticity literature (Auer & Papies, 2020).

To understand these unexpected results, we conducted a brief investigation of the elasticity literature, which confirmed that other studies have encountered similar variation in results. Auer and Papies (2020) present a meta-analysis of 7,264 cross-price elasticities. Among the authors’ conclusions, they observe that estimated magnitudes have decreased over time, referring to the year of data collection. Other meta-analyses of food demand estimates confirm the difficulty in estimating cross-price elasticities: Chen et al. (2016) study food demand in China and note that “deriving reliable estimates of cross-price elasticities is difficult” compared to own-price and income elasticities. Cornelsen et al. (2015) note that cross-price elasticities are less often studied than own-price or income elasticities in the public health literature. Chen et al. (2016) identify product aggregation level to be associated with elasticity magnitudes, which is also discussed by Okrent and Alston’s (2011) overview and empirical testing of several different demand theory models. The conclusions from these studies do not provide a unified high-level reason for the variation in cross-price elasticity results, although due to time constraints, we may have overlooked a relevant meta-analysis or potential explanation. Product aggregation, discussed by two studies, is the closest we found to a consensus around determining factors of variation; notably, we observed inconsistent levels of product aggregation used in milk studies we reviewed, but we were unable to statistically test any correlation between aggregation and elasticity effect size.

The unexpected results from our milk study and the subsequent literature review prompted us to investigate whether another pair of plant-based and animal-based analog products displayed similar seemingly contradictory results. We chose butter and margarine³ for their long market histories, availability of cross-price elasticity estimates, and the relative homogeneity of margarine as a product category. While plant-based milks are distinguished by their primary ingredient (for example, soy, almond, and coconut), margarine has fewer such distinctions and thus more consistent aggregation, potentially eliminating one source of variation across studies. Finally, we hoped to test the generality of a commonly held theory of price substitution for plant-based analogs, whereby decreases in the prices of plant-based dairy and milk analogs might cause decreased animal product usage. Following Mendez and Peacock (2021), we conducted a similar review and synthesis of cross-price elasticities for butter and margarine.

Methods

Search

We performed a non-systematic literature review of cross-price elasticities between butter and margarine. Our sole inclusion criterion was the presence of at least one butter and margarine cross-price elasticity estimate (in other words, either the elasticity of butter demand with respect to margarine price, or the elasticity of margarine demand with respect to butter price). We did not have inclusion criteria pertaining to geography, dates of data collection, or date of publication.

We searched Google Scholar and EconPapers (Research Papers in Economics, 2023) using the following four search strings:

“cross price elasticity butter margarine”
“cross price” OR “cross-price” “elasticities” “butter”
“butter” AND “margarine” AND “elasticity” OR “elastic” OR “inelastic” OR “demand”
"change" OR "shift" AND "demand" AND "butter" AND "margarine"

We read each study’s title and, where the study seemed potentially relevant, we read the abstract, excerpts of the study, or the full study, to confirm whether to include the study in our analysis. As a non-systematic review, we worked through search results until reading two full pages of irrelevant results, at which point we assumed further results would also be irrelevant. We checked the bibliographies of the included studies for citations of more potentially relevant studies. We explored other strategies to expand our literature review, including concatenating the above search strings with the names of countries that had experienced abnormally large changes in demand for dairy products, but these strategies were not fruitful.

Best guesses of estimate summary statistics

Before extracting the cross-price elasticity data, we specified our best guesses about the summary statistics of the data we’d collect. We made these forecasts of the results to increase the transparency of our reasoning and research, in line with the emerging practices in the social sciences of collecting expert predictions of research findings (DellaVigna et al., 2019; DellaVigna & Vivalt, 2022).

Our guesses were informed by summary statistics from our study of milk, the number of butter-margarine studies collected in our search,⁴ and summary statistics from Auer and Papies (2020). From both studies, we anticipated wide variation in results; from the milk study, we anticipated that estimates might conflict (in terms of economic interpretation) with either our naive expectations or with other estimates across time and space. We did not predict different results for butter and margarine cross-price elasticities, because we did not find enough information to feel confident in distinguishing our predictions between the two products. We constructed a lower and upper bound for a prediction interval that we guessed, with 80% confidence, would include each summary statistic for the butter and margarine elasticity estimates (calculated separately): mean, minimum, maximum, and the percent of positive estimates. The prediction intervals are interpreted as “if a new sample of estimates, with the same sample size, could be repeatedly drawn, 80% of the time the sample statistic would be in this prediction interval.” Table 1 displays these guesses.

Table 1: Best guesses of cross-price elasticity summary statistics

	Statistic	Lower	Upper
Samara	Mean	0.02	0.48
	Min	-1.10	-0.30
	Max	0.80	1.40
	Percent positive	0.65	0.75

Jacob	Mean	0.05	0.50
	Min	-2.00	0.00
	Max	0.60	1.90
	Percent positive	0.62	0.86

Data collection & analysis

All data cleaning, plotting, and analysis were performed using the statistical programming language R v4.2.2 (R Core Team, 2020), and all materials are available in the project repository https://osf.io/3smev/. For each study and where available, we extracted:

uncompensated and compensated⁵ cross-price elasticities of butter demand with respect to margarine, and of margarine demand with respect to butter, and
uncompensated and compensated own-price elasticities of butter and margarine demand, respectively.⁶

When a study provided multiple eligible estimates of cross-price elasticities, we included the estimates covering the longest duration, or derived from the preferred model specification, as characterized by the study’s author(s). The latter criterion applied only to Al-Zand and Hassan (1977).

For each elasticity estimate, we also coded, where available:
- statistical information
  - any reported estimated standard error
  - any reported t-statistic⁷
  - any stated claims of statistical significance
- information about data collection
  - data source (who collected the data?)
  - observed country or region
  - observed time period
  - frequency of data sampling (for example, weekly, monthly, yearly, etc.)
- methodological information
  - estimation method
  - underlying economic demand model
  - whether the analysis was pre-registered (determined by searching American Economic Association (2012) and Open Science Framework (2023) project registries, as well as searching the reports for “register” and “registration”)

We plotted each elasticity point estimate and its confidence interval, when available. Finally, we calculated means, minima, maxima, standard deviations, and percentage positive for the two sets of cross-price elasticity estimates.
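The summary step can be sketched in a few lines. The report's analysis was done in R and is available in the project repository. The Python below is only an illustrative stand-in, and it makes two assumptions that the text does not confirm: the standard deviation is the sample standard deviation, and 95% confidence intervals use a normal approximation (estimate ± 1.96 × standard error). The input values are made up.

```python
import statistics


def confidence_interval_95(estimate, standard_error):
    """Normal-approximation 95% confidence interval (an assumption here)."""
    half_width = 1.96 * standard_error
    return (estimate - half_width, estimate + half_width)


def summarize(estimates):
    """Unweighted summary statistics for a list of elasticity point estimates."""
    return {
        "mean": statistics.mean(estimates),
        "min": min(estimates),
        "max": max(estimates),
        "sd": statistics.stdev(estimates),  # sample standard deviation (assumed)
        "pct_positive": sum(e > 0 for e in estimates) / len(estimates),
        "n": len(estimates),
    }


# Made-up values for illustration only, not estimates from the reviewed studies.
hypothetical_estimates = [0.5, -0.25, 0.75, 0.0]
summary = summarize(hypothetical_estimates)
print(summary)
print(confidence_interval_95(0.5, 0.25))
```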

Results

We extracted 51 cross-price elasticity estimates from 18 studies (Al-Zand & Hassan, 1977; Breuer, 2007; Chang & Kinnucan, 1990, 1993; Coelho et al., 2010; Davis et al., 2010, 2011; Goddard & Amuah, 1989; Gould et al., 1991; Heien & Wessells, 1988; Henson & Traill, 1994; Huang, 1985; Lin & Lee, 2010; Malla et al., 2022; Pitts & Herlihy, 1982; Veeman & Peng, 1997; Vertessen, 1981; Yang & Pickford, 2010).⁸ Studies were conducted on several continents, and publication dates ranged from the 1970s to the 2020s. Figure 1 illustrates the variation in cross-price elasticity estimates, and Table 2 provides summary statistics of the estimates.

Similar to the results found for milk, we observe a lack of consensus over whether butter and margarine are complements or substitutes. Specifically, margarine estimates indicate substitution for butter nearly two thirds of the time (65% of estimates), while butter estimates were roughly equally split between substitution (52%) and complementarity (48%) for margarine. We also observe opposite signs of elasticity estimates for the same pair of products within countries, within similar investigation periods, and even within studies (Al-Zand & Hassan, 1977; Chang & Kinnucan, 1993; Pitts & Herlihy, 1982; Yang & Pickford, 2010). The unweighted average of margarine estimates did suggest substitution (0.23); however, the standard deviation of this estimate was itself much higher (0.43).

Figure 1: Plot of cross-price elasticity estimates

Point estimates of 51 cross-price elasticities and their 95% confidence intervals, where available. The vertical axis separates cross-price elasticities for butter and margarine; points within each cluster are jittered vertically only to improve readability. The dotted vertical line at zero separates estimates representing substitutes (right-hand side) and complements (left-hand side). Data supporting this figure can be found at https://osf.io/3smev/.

Table 2: Summary statistics of cross-price elasticity estimates

Product demanded	Price-changing product	Mean	Minimum	Maximum	Percent positive	Standard deviation	Number of estimates
Margarine	Butter	0.23	-0.68	1.28	65%	0.43	
Butter	Margarine	-0.02	-0.86	0.37	52%	0.25	

Summary statistics for the 51 cross-price elasticity estimates extracted from 18 studies examining butter and margarine.

A cursory examination of plots grouped by additional study characteristics such as country, publication date, sample size, or statistical significance did not reveal any obvious patterns in the data.⁹ The lack of consensus around economic interpretation and the opposite signs of elasticity estimates for the same pair of products were found across all of these dimensions. Since our more detailed investigation of whether study characteristics explained variance in cross-price elasticities for milk was largely fruitless, we limited our exploration in this study. Notably, any statistical analysis would be limited by the lack of standard errors reported alongside the cross-price elasticity estimates in the original studies. When reported, the standard errors are often somewhat large—larger than observed for milk cross-price elasticities—and have a mix of statistical significance levels. Thus, it could be the case that the variation in elasticity magnitudes is partially caused by sampling error, although some examples in this review are still not sufficiently explained by sampling error.

Our best guesses for the mean, minimum, maximum, and percent positive, shown in Table 1, were moderately successful in predicting the observed values shown in Table 2. Since we gave 80% confidence prediction intervals, we expect 13 of 16 guesses to be correct, while we observed 10 of 16 guesses to be correct:

Correct: Our prediction intervals encompassed the actual values in all cases of margarine elasticity with respect to butter price changes, and the minimum estimate of butter elasticity with respect to margarine price changes.
Incorrect: The mean, maximum, and percent positive for butter elasticity with respect to margarine price changes each fell below both of our prediction intervals.
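The tally above can be reproduced directly from the two tables. The sketch below is our own illustration and not code from the project repository. It applies each forecaster's Table 1 interval to both sets of estimates, since we did not make separate predictions for the two products. The percent positive values are the rounded figures reported in the Results text, so the match on Samara's lower bound (0.65) depends on that rounding.

```python
# 80% prediction intervals from Table 1.
intervals = {
    "Samara": {"mean": (0.02, 0.48), "min": (-1.10, -0.30),
               "max": (0.80, 1.40), "pct_positive": (0.65, 0.75)},
    "Jacob": {"mean": (0.05, 0.50), "min": (-2.00, 0.00),
              "max": (0.60, 1.90), "pct_positive": (0.62, 0.86)},
}

# Observed values from Table 2 and the Results text (rounded).
observed = {
    "margarine demand, butter price": {
        "mean": 0.23, "min": -0.68, "max": 1.28, "pct_positive": 0.65},
    "butter demand, margarine price": {
        "mean": -0.02, "min": -0.86, "max": 0.37, "pct_positive": 0.52},
}


def count_hits(intervals, observed):
    """Count how many observed statistics fall inside the guessed intervals."""
    hits = total = 0
    for bounds in intervals.values():
        for stats in observed.values():
            for name, value in stats.items():
                lower, upper = bounds[name]
                hits += lower <= value <= upper  # inclusive bounds
                total += 1
    return hits, total


hits, total = count_hits(intervals, observed)
print(hits, total)   # 10 16
print(0.8 * total)   # expected hits at 80% coverage: 12.8, or about 13
```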

Discussion

The implications of our results depend on the underlying causes of variation in elasticity estimates. If we think the results are inaccurate due to methodological issues causing mismeasurement of true behavior, we might conclude that more accurate estimation methods are needed for cross-price elasticity estimates to be informative. If, on the other hand, the results are accurate, the measurements must reflect behavior that is far more complicated than expected. For price interventions to reduce demand for animal products, we would then need to characterize which particular contexts cause which behavior and tailor interventions accordingly. Either way, taking butter and margarine cross-price elasticities at face value may lead to unintended harm by increasing demand for animal products. We examine some evidence for each potential explanation (methodological issues and complicated behavior) to inform our best guess about how much each may explain the observed results.

Methodological issues

Our results suggest that the estimated elasticities for the same pair of products have contradictory economic interpretations within a country, within a certain time period, and even within a pair of products from the same study. Consumer demand theory has made attempts to account for unintuitive behaviors like those seemingly observed in the milk and margarine contexts. However, it might be the case that true behavior aligns with the standard economic theory around substitute products, and the variation in estimates arises because the methods poorly capture actual behavior.

This hypothesis is supported in general by the fact that causal inference based on observational data is challenging, due to the many avenues of potential bias especially from unobserved or confounding factors (Sterne et al., 2021). Price endogeneity and other factors may make causality especially difficult to determine in the context of demand estimation. That said, econometric methods to account for confounding have improved over the course of demand estimation history. For example, Auer and Papies (2020) find models accounting for price endogeneity estimate cross-price elasticities closer to zero than those that overlook endogeneity, and they conclude that estimates from newer studies may be less subject to bias than those from older studies. However, it is not clear that these modeling improvements have been sufficiently validated, and the difficulty in determining causality remains.

The potential for bias introduced by researchers’ prior assumptions also supports the methodological issues hypothesis. Margarine was created specifically as a substitute for butter and has long been viewed by economists as a substitute good (Dupré, 1999). Therefore, prior assumptions about butter and margarine might introduce several types of bias at different stages in the demand modeling process:

publication bias
p-hacking, which is a collection of techniques used to achieve “better” p-values or even certain effect sizes (some examples include changing variable definitions, specification searching, or imposing sample restrictions)
choice of demand model during study conceptualization

Publication bias, which might result from authors not publishing results too far outside of some expected range, receives the most attention in the meta-analyses we reviewed. Chen et al. (2016) explicitly consider this concern in their regression analysis and find that publication in peer-reviewed journals is associated with larger (more positive) cross-price elasticities. Auer and Papies (2020) do not formally test for publication bias, and they make the counterclaim that publication bias driven by cross-price elasticities may be minimal because those estimates are usually not used to “sell” a paper, as they are rarely mentioned in the papers’ abstracts. However, the distribution of cross-price elasticities observed in their meta-analysis is consistent with publication bias. In particular, for the subset of elasticities that represent putative substitute goods,¹⁰ we observe a notable discontinuity around zero (Figure 2), with a large cluster of very small but positive elasticities. Of course this is not conclusive evidence, but publication bias offers one feasible explanation for the discontinuity.

Figure 2: Frequency distribution of cross-price elasticities for substitutes

Frequency distribution of 6,921 cross-price elasticity estimates from 115 studies. There is a large discontinuity around zero, with the majority of estimates greater than zero. These elasticities are coded as substitutes in the meta-analysis, according to the classification of each pair of products by the original studies’ authors. Reproduced from Auer and Papies (2020).

Other feasible explanations for the discontinuity further support the methodological issues hypothesis. The discontinuity could be consistent with p-hacking. While none of the meta-analyses we reviewed discuss p-hacking, the recent credibility revolution has shown the prevalence of such methods in quantitative studies using quasi-experimental methods or observational data (Angrist & Pischke, 2010; Brodeur et al., 2018; Vivalt, 2019). Pre-registration of analysis plans can mitigate p-hacking, but pre-registration is still uncommon for observational studies (Williams et al., 2010). To our knowledge, none of the butter-margarine studies in this analysis were pre-registered, so we cannot rule out this source of bias in the estimates.

Alternatively, the discontinuity might be caused by the authors’ choice of which demand model to estimate. Each demand model comes with its own set of underlying theoretical assumptions about consumer behavior, and an author could introduce bias by choosing a demand model according to their prior expectations about the relationships in question. For example, certain types of discrete choice models build demand equations on the assumption that an observed purchase represents the consumer’s preferred choice out of several possible products, implying that the products under comparison are substitutes and not complements (Feng et al., 2018). Discrete choice models are not used in the 18 butter-margarine studies in this analysis, but they are commonly used in demand estimation overall. Auer and Papies (2020) include attraction choice models, a type of discrete choice model, in their meta-analysis, which might explain the discontinuity in Figure 2.

The lack of reported standard errors supports the methodological issues hypothesis, even though it does not explicitly indicate bias from poor estimation methods. This oversight hinders researchers in employing traditional methods for detecting publication bias, replicating the findings, or doing meta-analysis. Further, reporting standard errors is seen as standard research practice in economics, and the oversight makes us suspect deeper issues with the estimation methods, especially in recent publications that have access to modern estimation software. Auer and Papies’ (2020) argument that cross-price elasticities are not the selling point of a paper may explain why they are reported in less detail compared to other demand parameters; regardless, without standard errors, we are less able to evaluate the methods.

What are the implications if cross-price elasticity estimation methods have issues, and we ignore those issues or assume the results are “good enough” estimates? Our butter and margarine example illustrates a high rate of potentially incorrect signs on the cross-price elasticities. Basing policy or intervention decisions on the opposite economic interpretation than the true product relationship (in other words, on supposed substitution instead of true complementarity) may have serious consequences. For example, if inaccurately measured estimates suggest that plant-based and animal-based beef are substitutes, but consumers actually purchase them as complements, an initiative to reduce plant-based beef prices might inadvertently cause demand for animal-based beef to increase instead of fall. However, there is much uncertainty in the butter and margarine estimates we’ve presented: some wide confidence intervals, a relatively high rate of statistical non-significance (higher than our results in milk), a high rate of unreported standard errors, potential publication bias or p-hacking, and likely residual confounding. As such, the accuracy of the results remains unclear, and, if methodological issues predominate, the results might provide no clear evidence on butter and margarine cross-price elasticities. That said, the results might preclude large and consistent cross-price elasticities (either complementary or substitute) as, if this were the case, such a strong signal might be expected to persist through the methodological noise.

Complicated behavior

Suppose, on the other hand, that cross-price elasticity estimation methods really are accurately capturing consumer behavior. If we take these results at face value, they imply behavior may be more complex than we might expect for products like butter and margarine. How do we explain this behavior?

Economists have put forward different theoretical explanations for unexpected elasticity signs, attempting to reconcile elasticity estimates that either display opposite signs within pairs of products or over time, or indicate complementarity when products are widely believed to be substitutes (or vice versa). While a full review of consumer demand theory is outside the scope of this report, we discuss a few studies that attempt to characterize types of behavior that could generate the complex results we observe.

De Jaegher (2009) explores conditions that lead to asymmetric substitutability, in which Good A is a substitute for Good B while Good B is a complement for Good A. In particular, one good must be a luxury (defined by an income elasticity, which measures how purchases change when consumers’ incomes change, greater than 1) while the other is a necessity; and one good must be own-price elastic (defined by an own-price elasticity greater than 1 in absolute value) while the other is own-price inelastic. Physical aspects of the product pairs may lead to these conditions and help us build intuition around this situation. For example, consumers may gain utility from consuming two product characteristics, one characteristic possessed by both goods and the other possessed by only one good, instead of the products themselves. The author gives the example of making calls inside and outside the home using mobile phones (necessity for both in-home and out-of-home calls) and fixed landlines (luxury for in-home calls):

If the price of mobile phone services increases, less mobile phone services and less fixed phone services are consumed. This is because fixed phones cannot substitute for mobile phones, as the former cannot be used to call outside of home. However, if the price of fixed phone services increases, more mobile phone services may be bought. This is because mobile phones will then also be used to call at home, a job that was otherwise done by fixed phones. (p. 854)

The conditions for asymmetric substitutability are relatively straightforward, but they are also quite restrictive and therefore may not apply widely. Testing these conditions is outside the scope of this report, although it may be possible to show that asymmetric substitutability does not hold using the own-price elasticities that we collected. Either way, the presence of testable conditions with which future research might determine true behavior in each context is promising for better-informed policy decisions.
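As a rough illustration of how such a screen might look, the sketch below encodes only the two conditions as summarized in this report, not De Jaegher's full derivation. It treats a good as a luxury when its income elasticity exceeds 1, and as own-price elastic when its own-price elasticity exceeds 1 in absolute value (the usual convention). The function name and the input values are hypothetical.

```python
def meets_stated_conditions(income_a, own_price_a, income_b, own_price_b):
    """Return True if exactly one good is a luxury and exactly one good is
    own-price elastic, the two conditions as summarized in the text."""
    luxury_a, luxury_b = income_a > 1, income_b > 1
    elastic_a, elastic_b = abs(own_price_a) > 1, abs(own_price_b) > 1
    one_luxury_one_necessity = luxury_a != luxury_b
    one_elastic_one_inelastic = elastic_a != elastic_b
    return one_luxury_one_necessity and one_elastic_one_inelastic


# Hypothetical elasticities, for illustration only.
print(meets_stated_conditions(1.4, -1.3, 0.6, -0.4))  # True
print(meets_stated_conditions(0.7, -0.5, 0.6, -0.4))  # False
```

Failing this screen would suggest that asymmetric substitutability, as characterized here, is an unlikely explanation for a given pair of estimates. Passing it would not by itself establish that the explanation holds.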

Pitts and Herlihy (1982) explain unexpected complementarity of margarine and butter with the hypothesis that consumers maintain a constant expenditure for yellow fats. When consumers strive to maintain a specific expenditure budget, unexpected complementarity may arise under two conditions: both goods are own-price inelastic, and the price difference between the two products is large. As the price of one good increases, its quantity demanded decreases, but due to inelasticity, expenditure on that good also increases. This reduces the available expenditure for the other good. In turn, consumption of the second good also decreases, which indicates complementarity even though the two goods are generally considered substitutes. The authors also note this hypothesis may lead to pairs of elasticities with opposite signs if one good is own-price inelastic while the other is own-price elastic. As above, testable conditions may help to characterize each new product context to better inform policy decisions.
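The constant-expenditure mechanism can be illustrated with a toy calculation. This is our own simplified sketch, not Pitts and Herlihy's model: it assumes a constant-elasticity demand curve for the first good, a fixed total budget, and a second good that simply absorbs whatever budget remains at an unchanged price. All parameter values are hypothetical, and the sketch does not model the condition on the price difference between the two goods.

```python
def cross_price_elasticity_under_fixed_budget(own_price_elasticity,
                                              budget=30.0, p1=1.0, p2=2.0,
                                              scale=10.0, price_rise=0.01):
    """Elasticity of good 2's quantity with respect to good 1's price when
    total expenditure on the two goods is held constant."""

    def q1(price):
        # Constant-elasticity demand for good 1 (an assumed functional form).
        return scale * price ** own_price_elasticity

    def q2(price_of_good_1):
        # Good 2 absorbs the budget left over after spending on good 1.
        return (budget - price_of_good_1 * q1(price_of_good_1)) / p2

    before = q2(p1)
    after = q2(p1 * (1 + price_rise))
    pct_change_q2 = (after - before) / before
    return pct_change_q2 / price_rise


# Good 1 own-price inelastic: spending on it rises, so good 2 falls.
print(cross_price_elasticity_under_fixed_budget(-0.5))  # negative
# Good 1 own-price elastic: spending on it falls, so good 2 rises.
print(cross_price_elasticity_under_fixed_budget(-1.5))  # positive
```

Under these assumptions, the sign of the second good's response depends only on whether the first good is own-price inelastic or elastic, which is how a pair with one inelastic and one elastic good could produce cross-price elasticities of opposite sign.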

Chang and Kinnucan’s (1993) empirical results show opposite signs on the respective butter and margarine cross-price estimates. They discuss a characteristics theory similar to De Jaegher’s (2009) intuition, combined with consumer heterogeneity, to explain these results, although they do not investigate the theoretical conditions that may give rise to this situation. They note that relative differences in usage patterns of consumers with heterogeneous preferences for certain characteristics like spreadability and taste might explain the opposite signs. They cite survey data to show that Canadian consumers in the 1990s used margarine at higher rates than butter and might more easily move away from butter than vice versa:

[B]utter users are more likely to use margarine than margarine users are to use butter (Goldfarb Consultants). The same data indicate a much larger portion of the Canadian population are nonusers of butter (23%) than are nonusers of margarine (13%). Frequent users of butter make up 35 percent of the population compared to 66 percent for margarine. (p. 273)

More indirect explanations for complex consumer behavior may come from outside of standard economic theory based on expected utility, which postulates that individuals are symmetrically affected by losses and gains. Recent studies have found that the magnitude of consumers’ response to price changes can be asymmetric depending on the price direction; that is, the response is larger in absolute value for price increases than for price decreases (Biondi et al., 2020). This is because individuals can have loss aversion, a stronger aversion to losses than the benefit experienced from gains of similar magnitude (Kahneman & Tversky, 2012), and their responses will depend on the reference point, generally their current price level. This asymmetry could explain the variation we observe in the butter-margarine results, as the elasticity estimates may be picking up an asymmetric underlying consumer response to price increases or decreases in the data. Indeed, Biondi et al. (2020) develop a demand system augmented to include different elasticities for price increases and decreases, so researchers using this system could empirically test for the presence of asymmetric price response.

This brief and non-comprehensive review of the literature shows that consumer demand theory proposes some feasible explanations for empirical results of unexpected complementarity and opposite elasticity signs. The existence of a few feasible explanations makes the complicated behavior hypothesis more credible. These explanations often depend on very specific conditions, which provide testable hypotheses and might allow researchers to determine what type of consumer behavior to expect in a given context.

What are the implications if the estimated elasticities are correct and consumer behavior is indeed very complicated? Designing an intervention around one product to elicit a certain consumption response in another product might have a wide range of unexpected effects, especially if implementation coincides with other external changes to the market. For example, let us generalize our margarine results. Suppose consumers view margarine as a substitute for butter (they buy less margarine when butter prices decrease), but see butter as a complement for margarine (they buy more butter when margarine prices decrease). If a promotional sale for a plant-based butter brand were to coincide with an unrelated decrease in butter prices, the expected increase in butter consumption due to the decrease in its own price might be amplified by the decrease in margarine prices. As a result, the sale would cause a harmful increase in butter consumption instead of the intended decrease.
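The scenario above can be made concrete with a first-order approximation in which the percent change in butter quantity is the sum of each elasticity multiplied by the corresponding percent price change. This is a minimal sketch: the function name, the elasticities (-0.5 own-price, -0.2 cross-price), and the 10% price decreases are hypothetical values chosen for illustration, not estimates from the studies we reviewed.

```python
def butter_quantity_change(own_elasticity, cross_elasticity,
                           butter_price_change, margarine_price_change):
    """First-order approximation of the percent change in butter quantity.

    Price changes are in percent (e.g. -10 for a 10% decrease). The linear
    form is only a local approximation and ignores income and other effects.
    """
    return (own_elasticity * butter_price_change
            + cross_elasticity * margarine_price_change)


# Hypothetical elasticities: butter is own-price inelastic and behaves as a
# complement with respect to margarine prices (negative cross-price elasticity).
OWN = -0.5
CROSS_COMPLEMENT = -0.2

# A 10% margarine sale on its own, with butter prices unchanged.
sale_only = butter_quantity_change(OWN, CROSS_COMPLEMENT, 0.0, -10.0)

# The same sale coinciding with an unrelated 10% butter price decrease.
sale_and_butter_drop = butter_quantity_change(OWN, CROSS_COMPLEMENT, -10.0, -10.0)

print(sale_only, sale_and_butter_drop)
```

With these illustrative numbers, the sale alone raises butter consumption by about 2%, and the coinciding butter price decrease raises the total to about 7%. Flipping the sign of the cross-price elasticity to +0.2 (substitution) would instead make the sale alone reduce butter consumption by about 2%.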

Conclusions

As in our previous work on milk, we found unexpectedly large variation in a synthesis of margarine and butter cross-price elasticity estimates, which ranged from substitutes to complements. Variation persisted within studies, time periods, geography, and even within pairs of products in a single study. We find that margarine is a substitute for butter in two thirds of estimates, while butter is a substitute for margarine in about half of the estimates. After reviewing discussions of this problem in the meta-analytic and demand estimation literature, we examine two potential explanations—methodological issues in estimation or complex consumer behavior—and discuss the implications of each.

Our best guess based on the evidence presented here, which is not comprehensive, is that slightly more of the variation in butter and margarine cross-price elasticities is due to complex behavior, with the remainder due to methodological issues. To arrive at this conclusion, we start from a prior assumption that methodological issues and complex behavior each explain 50% of the variation in cross-price elasticity estimates, and consider how each piece of evidence presented above subjectively updates us toward one particular explanation.

Evidence in favor of the complex behavior hypothesis updates us toward the behavioral explanation:

We found several feasible theoretical explanations for unexpected signs and opposite signs within product pairs in different situations, in a relatively short non-comprehensive literature search.
There is meta-analytic evidence of correlations between effect size and product context characteristics, which implies that time, place, and product might impact effect size.

Evidence against the complex behavior hypothesis and for the methodological issues hypothesis updates us toward the methodological explanation:

There is a high risk of confounding bias in estimates from observational studies as well as a risk of publication bias and p-hacking.
There is meta-analytic evidence of correlations between effect size and methodological characteristics.
The restrictive nature of some of the conditions in potential behavioral explanations implies that these conditions might only rarely hold.
Non-reporting of standard errors creates difficulty in evaluating methods and reduces our trust in methodological rigor.

Evidence against the methodological issues hypothesis updates us back toward the behavioral explanation:

We did not find a consensus in the meta-analyses that specific estimation strategies were inappropriate in all contexts. That is, no one is using a completely discredited method.
Estimation methods have improved over time, but we do not see a corresponding convergence in newer results, as we might expect if methodological issues were the main cause of variation.
There is some uncertainty in estimates, where standard errors are reported.

Further research might aim to better understand the relative contributions of methodological issues and complex behavior to the variance in cross-price elasticity estimates. In particular, we have identified several potential theoretical explanations for the complex behavior suggested by our data which could be readily tested. More such mechanisms might be discovered with further review of the literature. If behavior proves to be as complicated as our data suggest, significant research efforts would be needed to characterize the diverse cross-price elasticities posited in different contexts. However, given the unexplained variance currently exhibited in the literature on both margarine and milk, simply generating further estimates of cross-price elasticities using existing approaches seems unlikely to be informative, especially for animal advocacy researchers. Instead, future research might aim to validate cross-price elasticities estimated from observational data against experimental and quasi-experimental estimates of the same quantity.

Overall, this work demonstrates that price interventions on margarine could cause unintended increases in butter consumption. Increased consumption could then result in concomitant harms to animal welfare, the environment, and human health. In concert with our work on plant-based and dairy milk, these results cast doubt on the general theory that plant-based analogs will behave as price substitutes for their animal-based counterparts. Although work on the cross-price elasticities of plant- and animal-based meat is nascent, estimates so far have already shown unexpected complementarity (Zhao et al., 2022). While it is tempting to assert that displacement of animal-based products by plant-based analogs at lower prices is an economic certainty, the empirical evidence so far does not strongly support this notion.

Acknowledgments

This research is a project of Rethink Priorities. It was written by Samara Mendez, Jacob Peacock, and Ben Stevenson. SM and JP jointly devised this project. BS and SM collected data, and SM and JP conducted analysis and wrote the report. Thanks to David Reinstein and Simone Angioloni for helpful feedback, and Adam Papineau for copyediting.

Revision history

April 27, 2023

Updated the discussion of variation explained by each explanation in the Executive Summary and the Conclusions sections.
Removed the result extracted from Traill and Henson (1994), which was incorrectly extracted as a demand elasticity (it is actually the coefficient of a price regression). Revised the text, Figure 1, Table 2, and all data and materials in the project repository https://osf.io/3smev/ to reflect this revision. Removing this estimate changed the details of our results, but did not affect the substance of our conclusions.

References

AEA. (2012, 2023). American Economic Association Randomized Controlled Trial Registry. AEA RCT Registry. https://www.socialscienceregistry.org/

Allcott, H., Lockwood, B. B., & Taubinsky, D. (2019). Regressive Sin Taxes, with an Application to the Optimal Soda Tax. The Quarterly Journal of Economics, 134(3), 1557–1626. https://doi.org/10.1093/qje/qjz017

Al-Zand, O. A., & Hassan, Z. A. (1977). The Demand for Fats and Oils in Canada. Canadian Journal of Agricultural Economics/Revue Canadienne d’agroeconomie, 25(2), 14–25. https://doi.org/10.1111/j.1744-7976.1977.tb02873.x

Angrist, J. D., & Pischke, J.-S. (2010). The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics. The Journal of Economic Perspectives, 24(2), 3–30. https://doi.org/10.1257/jep.24.2.3

Auer, J., & Papies, D. (2020). Cross-price elasticities and their determinants: A meta-analysis and new empirical generalizations. Journal of the Academy of Marketing Science, 48(3), 584–605. https://doi.org/10.1007/s11747-019-00642-0

Bentley, J., & Ash, M. (2016, July 5). Butter and Margarine Availability Over the Last Century [USDA ERS Findings]. https://perma.cc/B77L-LWLG

Biondi, B., Cornelsen, L., Mazzocchi, M., & Smith, R. (2020). Between preferences and references: Asymmetric price elasticities and the simulation of fiscal policies. Journal of Economic Behavior & Organization, 180, 108–128. https://doi.org/10.1016/j.jebo.2020.09.016

Brännlund, R., & Nordström, J. (2004). Carbon tax simulations using a household demand model. European Economic Review, 48(1), 211–233. https://doi.org/10.1016/S0014-2921(02)00263-5

Breuer, C. C. (2007). Cost-of-living indexes for Germany [Working Paper]. https://perma.cc/3NFC-XMEW

Brodeur, A., Cook, N., & Heyes, A. G. (2018). Methods Matter: P-Hacking and Causal Inference in Economics (SSRN Scholarly Paper No. 3249910). https://doi.org/10.2139/ssrn.3249910

Center for Open Science. (2023). OSF Registries. Open Science Framework. https://osf.io/registries

Chang, H.-S., & Kinnucan, H. (1990). Advertising and Structural Change in the Demand for Butter in Canada. Canadian Journal of Agricultural Economics/Revue Canadienne d’agroeconomie, 38(2), 295–308. https://doi.org/10.1111/j.1744-7976.1990.tb03465.x

Chang, H.-S., & Kinnucan, H. W. (1993). Blend Bans and Butter Demand. Review of Agricultural Economics, 15(2), 269–278. https://doi.org/10.2307/1349447

Chen, D., Abler, D., Zhou, D., Yu, X., & Thompson, W. (2016). A Meta-analysis of Food Demand Elasticities for China. Applied Economic Perspectives and Policy, 38(1), 50–72. https://doi.org/10.1093/aepp/ppv006

Coelho, A. B., Aguiar, D. R. D. de, & Eales, J. S. (2010). Food demand in Brazil: An application of Shonkwiler & Yen Two-Step estimation method. Estudos Econômicos (São Paulo), 40, 186–211. https://doi.org/10.1590/S0101-41612010000100007

Cornelsen, L., Green, R., Turner, R., Dangour, A. D., Shankar, B., Mazzocchi, M., & Smith, R. D. (2015). What Happens to Patterns of Food Consumption when Food Prices Change? Evidence from A Systematic Review and Meta-Analysis of Food Price Elasticities Globally. Health Economics, 24(12), 1548–1559. https://doi.org/10.1002/hec.3107

Davis, C. G., Dong, D., Blayney, D. P., & Owens, A. (2010). An Analysis of U.S. Household Dairy Demand (Technical Bulletin Number 1928; Economic Research Service, p. 28). United States Department of Agriculture. https://perma.cc/B5UK-G668

Davis, C. G., Yen, S. T., Dong, D., & Blayney, D. P. (2011). Assessing economic and demographic factors that influence United States dairy demand. Journal of Dairy Science, 94(7), 3715–3723. https://doi.org/10.3168/jds.2010-4062

De Jaegher, K. (2009). Asymmetric Substitutability: Theory and Some Applications. Economic Inquiry, 47(4), 838–855. https://doi.org/10.1111/j.1465-7295.2008.00158.x

DellaVigna, S., Pope, D., & Vivalt, E. (2019). Predict science to improve science. Science, 366(6464), 428–429. https://doi.org/10.1126/science.aaz1704

DellaVigna, S., & Vivalt, E. (2022). Social Science Prediction Platform. Social Science Prediction Platform. https://socialscienceprediction.org/ForecastingGuide

Dupré, R. (1999). “If It’s Yellow, It Must Be Butter”: Margarine Regulation in North America Since 1886. The Journal of Economic History, 59(2), 353–371. https://doi.org/10.1017/S0022050700022865

Feng, G., Li, X., & Wang, Z. (2018). On substitutability and complementarity in discrete choice models. Operations Research Letters, 46(1), 141–146. https://doi.org/10.1016/j.orl.2017.11.016

Goddard, E. W., & Amuah, A. K. (1989). The Demand for Canadian Fats and Oils: A Case Study of Advertising Effectiveness. American Journal of Agricultural Economics, 71(3), 741–749. https://doi.org/10.2307/1242030

Gould, B. W., Cox, T. L., & Perali, F. (1991). Demand for Food Fats and Oils: The Role of Demographic Variables and Government Donations. American Journal of Agricultural Economics, 73(1), 212–221. https://doi.org/10.2307/1242897

Heien, D. M., & Wessells, C. R. (1988). The Demand for Dairy Products: Structure, Prediction, and Decomposition. American Journal of Agricultural Economics, 70(2), 219–228. https://doi.org/10.2307/1242060

Henson, S., & Traill, B. (1994). Nutrition information and the demand for yellow fats. Department of Agricultural Economics and Management, University of Reading.

Huang, K. S. (Ed.). (1985). U.S. Demand for Food: A Complete System of Price and Income Effects. https://doi.org/10.22004/ag.econ.157014

Kahneman, D., & Tversky, A. (2012). Prospect Theory: An Analysis of Decision Under Risk. In Handbook of the Fundamentals of Financial Decision Making: Vol. 4 (pp. 99–127). World Scientific. https://doi.org/10.1142/9789814417358_0006

Lin, C.-T., & Lee, J.-Y. (2010, January). Influences of Labeling Policy and Media Coverage On the Demand for Butter and Margarine. The Agricultural & Applied Economics Association 2010 AAEA, CAES, & WAEA Joint Annual Meeting, Denver. https://doi.org/10.22004/ag.econ.61161

Malla, S., Klein, K. K., & Presseau, T. (2022). Has the Demand for Fats and Meats in the United States been Affected by the Health Claim on Risk of Coronary Heart Disease Issued by the Food and Drug Administration? Athens Journal of Business & Economics, 8(2), 97–118. https://doi.org/10.30958/ajbe.8-2-1

Mas-Colell, A., Green, J. R., & Whinston, M. D. (1995). Microeconomic Theory (First). Oxford University Press.

Mendez, S., & Peacock, J. (2021). Milking It: Exploring the impact of plant-based milk in the US (Report E019R02). The Humane League Labs. https://doi.org/10.31219/osf.io/j89qm

Okrent, A. M., & Alston, J. M. (2011). Demand for Food in the United States. A Review of Literature, Evaluation of Previous Estimates, and Presentation of New Estimates of Demand. Giannini Foundation Monograph, 48, 137. https://perma.cc/RZY5-EERQ

Pitts, E., & Herlihy, P. (1982). Perverse Substitution Relationships in Demand Studies: The Example of Butter and Margarine. Journal of Agricultural Economics, 33(1), 37–46. https://doi.org/10.1111/j.1477-9552.1982.tb00710.x

Quirmbach, D., Cornelsen, L., Jebb, S. A., Marteau, T., & Smith, R. (2018). Effect of increasing the price of sugar-sweetened beverages on alcoholic beverage purchases: An economic analysis of sales data. J Epidemiol Community Health, 72(4), 324–330. https://doi.org/10.1136/jech-2017-209791

R Core Team. (2020). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. http://www.R-project.org/

Research Papers in Economics. (2023, March 26). EconPapers. https://econpapers.repec.org/

Snodgrass, K. (1930). Margarine as a butter substitute (No. 4; Fats and Oils Studies of the Food Research Institute). Food Research Institute. https://hdl.handle.net/2027/uc1.$b232781

Sterne, J. A., Hernán, M. A., McAleenan, A., Reeves, B. C., & Higgins, J. P. (2021). Chapter 25: Assessing risk of bias in a non-randomized study. In Cochrane Handbook for Systematic Reviews of Interventions (Version 6.2). Cochrane. https://perma.cc/CUB3-Z9VY

Traill, W. B., & Henson, S. (1994). Price Transmission in the United Kingdom Yellow Fats Market in the Presence of Imperfect Competition. Journal of Agricultural Economics, 45(1), 123–131. https://doi.org/10.1111/j.1477-9552.1994.tb00383.x

UCLA Advanced Research Computing. (n.d.). How can I estimate the standard error of transformed regression parameters in R using the delta method? Statistical Methods and Data Analytics. https://perma.cc/BNW2-VU7Z

Veeman, M. M., & Peng, Y. (1997). Canadian Dairy Demand (Project Report 97-03, No. 940503; RURAL ECONOMY). Alberta Agricultural Research Institute. http://dx.doi.org/10.22004/ag.econ.24037

Vertessen, J. (1981). Influence of butter and margarine prices on the demand for butter in Belgium. European Review of Agricultural Economics, 8(1), 99–109. https://doi.org/10.1093/erae/8.1.99

Vivalt, E. (2019). Specification Searching and Significance Inflation Across Time, Methods and Disciplines. Oxford Bulletin of Economics and Statistics, 81(4), 797–816. https://doi.org/10.1111/obes.12289

West, S. E., & Williams, R. C. (2007). Optimal taxation and cross-price effects on labor supply: Estimates of the optimal gas tax. Journal of Public Economics, 91(3), 593–617. https://doi.org/10.1016/j.jpubeco.2006.08.007

Williams, R. J., Tse, T., Harlan, W. R., & Zarin, D. A. (2010). Registration of observational studies: Is it time? CMAJ: Canadian Medical Association Journal, 182(15), 1638–1642. https://doi.org/10.1503/cmaj.092252

Yang, Q. G., & Pickford, M. (2010). Are Butter and Margarine Close Substitutes? New Evidence From New Zealand (SSRN Scholarly Paper No. 3683322). https://doi.org/10.2139/ssrn.3683322

Zhao, S., Wang, L., Hu, W., & Zheng, Y. (2022). Meet the meatless: Demand for new generation plant‐based meat alternatives. Applied Economic Perspectives and Policy, aepp.13232. https://doi.org/10.1002/aepp.13232

Notes

In this report, we are referring to demand elasticities, unless otherwise mentioned. ↩
Many of the studies we gathered use demand models that construct cross-price elasticity estimates indirectly via transformations of the coefficients in the main regression model. The authors of these studies report standard errors for the coefficients of the regression model, but sometimes do not report standard errors for the transformed estimates, which must be obtained as approximations, for example by bootstrapping or the delta method (UCLA Advanced Research Computing, n.d.). ↩
Margarine is often, but not exclusively, plant-based. Most margarines contain a majority of plant-based ingredients, but early forms of margarine were produced from non-milk animal fats and current forms may include butter or buttermilk as flavors. For more of the history of margarine’s development, see Snodgrass (1930), Dupré (1999), and Bentley and Ash (2016). ↩
We performed this prediction exercise after identifying the 18 studies to include in the analysis, but before extracting any of the elasticity results from these studies. ↩
Uncompensated elasticities represent the change in consumption in response to both the change in price (the substitution effect) and the subsequent induced change in real income due to a higher price (the income effect). In contrast, compensated elasticities imagine “compensating” consumer income so that the change in consumption is only due to the isolated substitution effect (Mas-Colell et al., 1995). Compensated elasticities are theoretically identical for a pair of products whether, for example, margarine or butter is the quantity good in question, whereas pairs of uncompensated elasticities may not be identical due to differences in the size of the substitution effect and the income effect between the two goods (Mas-Colell et al., 1995). ↩
Although own-price elasticities are not the focus of this report, these were collected for comparative purposes with the previous milk research (Mendez & Peacock, 2021). ↩
Two studies indicate statistical significance by reporting t-statistics instead of the underlying standard errors. ↩
We were unable to locate the full text of Henson and Traill (1994), but the elasticity results are reported in Traill and Henson (1994). ↩
These figures can be found in the /report/images subdirectory of the project repository at https://osf.io/3smev/. ↩
Auer and Papies’ (2020) method for coding cross-price elasticities as representing substitutes or complements is as follows: “We controlled for whether authors of a given study explicitly label the relation between brand pairs as either complements or substitutes. If the authors do not provide any information, we treated all products in one category as substitutes, and across-category relations as complements.” ↩
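As an illustration of the delta method mentioned in the note on standard errors above, the following minimal sketch approximates the standard error of an elasticity computed from regression coefficients. Everything in it is hypothetical: the transformation (a share-based formula of the general kind used in some demand systems), the coefficient estimates, the budget shares, and the covariance matrix are invented for illustration and are not taken from any study in this review. The shares are treated as fixed constants, which is itself a simplifying assumption.

```python
import math


def delta_method_se(transform, estimates, covariance, step=1e-6):
    """Approximate the standard error of transform(estimates).

    Uses the first-order delta method: variance ~ g' * Sigma * g, where g is
    the gradient of the transformation, computed here by central differences.
    """
    n = len(estimates)
    gradient = []
    for i in range(n):
        up = list(estimates)
        down = list(estimates)
        up[i] += step
        down[i] -= step
        gradient.append((transform(up) - transform(down)) / (2 * step))
    variance = sum(gradient[i] * covariance[i][j] * gradient[j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)


# Hypothetical budget shares, treated as known constants.
SHARE_I, SHARE_J = 0.4, 0.2


def cross_price_elasticity(coefficients):
    """Illustrative transformation of two regression coefficients."""
    gamma, beta = coefficients
    return gamma / SHARE_I - beta * SHARE_J / SHARE_I


# Hypothetical coefficient estimates and their covariance matrix.
coefficients = [0.05, -0.10]
covariance = [[0.0004, 0.0001],
              [0.0001, 0.0009]]

elasticity = cross_price_elasticity(coefficients)
standard_error = delta_method_se(cross_price_elasticity, coefficients, covariance)
print(elasticity, standard_error)
```

The numerical gradient keeps the sketch general, so any smooth transformation can be passed in. For a linear transformation like this one, the delta-method variance is exact rather than approximate.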
