Empirically grounding analytics (EGA) research in the Journal of Operations Management

Journal of Operations Management (2023)

Abstract
Empirically grounding analytics (EGA) is an area of research that emerges at the intersection of empirical and analytical research. By “empirically grounding,” we mean both the empirical justification of model assumptions and parameters and the empirical assessment of model results and insights. EGA is a critical but largely missing aspect of operations management (OM) research. Spearman and Hopp (2021, p. 805) stated that “since empirical testing and refutation of operations models is not an accepted practice in the IE/OM research community, we are unlikely to leverage these to their full potential.” They named several “examples of overly simplistic building blocks leading to questionable representations of complex systems” (p. 805) and suggested that research using analytical tools like closed queuing network models and the Poisson model of demand processes could incorporate empirical experiments to improve understanding of where they do and do not fit reality, highlighting “the importance of making empirical tests of modeling assumptions, both to ensure the validity of the model for its proposed purpose and to identify opportunities for improving or extending our modeling capabilities. The fact that very few IE/OM papers make such empirical tests is an obstacle to progress in our field” (p. 808). They concluded that “Editors should push authors to compare mathematical models with empirical data. Showing that a result holds in one case but not another adds nuance and practicality to research results. It also provides stimulus for research progress” (p. 814). These arguments recall Little's (1970) observation that many potentially useful analytical models are not widely adopted in practice. Thus, EGA research can help to close two major gaps: (1) between the empirical and analytical subdivisions in the OM field and (2) between scholarly output and practical relevance.

As a journal focused on empirical research, the Journal of Operations Management (JOM) seeks to encourage EGA submissions and publications, but doing so requires our community of authors, reviewers, and editors to share an understanding of the expectations. While such contributions have been encouraged for some time in the verbiage on the JOM website, a more formal effort to draw out examples of EGA research was driven by an editorial call (Browning & de Treville, 2018), and we have since held many discussions, panels, webinars, and workshops to continue to develop and communicate the expectations. This editorial represents another step in that development.

In a general sense, an EGA paper combines mathematical, stochastic, and/or economic modeling insights with empirical data. Modeling captures non-linearities and elements of distributions and allows these parameters to be incorporated into decision making, whereas empirical research transforms observations into knowledge. Analytical models are evaluated in terms of their results and insights, which might prompt further extensions to or modifications of the model, including new or different inputs and recalibrations. Most modeling papers stop there because the primary contribution is the analytical model. Although some realism is required, it falls short of empirical grounding, and a gap is often left between the model's insights and what implementation in practice will entail.
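To illustrate the kind of empirical test of modeling assumptions that Spearman and Hopp call for, consider a minimal, hypothetical sketch in Python (the demand counts below are simulated placeholders, not data from any study discussed in this editorial) that checks whether a demand series is consistent with a Poisson demand process by examining its variance-to-mean ratio:

    import numpy as np
    from scipy import stats

    # Placeholder daily demand counts; an EGA study would use observed demand.
    rng = np.random.default_rng(0)
    demand = rng.negative_binomial(n=5, p=0.5, size=200)  # deliberately overdispersed

    n = demand.size
    dispersion = demand.var(ddof=1) / demand.mean()  # roughly 1 under a Poisson model

    # Index-of-dispersion test: under the Poisson assumption, (n - 1) * dispersion
    # is approximately chi-square distributed with n - 1 degrees of freedom.
    statistic = (n - 1) * dispersion
    p_value = stats.chi2.sf(statistic, df=n - 1)

    print(f"variance-to-mean ratio = {dispersion:.2f}, p-value for overdispersion = {p_value:.4g}")

A ratio well above 1 with a small p-value suggests that the Poisson assumption understates demand variability, which is precisely the kind of empirical signal that can motivate revising or extending a model.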
Filling the gap between a model's insights and their implementation by empirically grounding an analytic model creates knowledge by linking analytical insights to what has been observed using empirical methods (such as case studies, action research, field experiments, interviews, analysis of secondary data, etc.) to establish a theoretically and empirically relevant research question. Moreover, since analytical models tend to make many simplifying assumptions, EGA research can help tease out where these assumptions are valid and where they excessively bias results.

Figure 1 situates two kinds of EGA research relative to traditional analytical models. Typically, publications with analytical models focus on the center of the figure: the model details and the insights derived from it. The left side of the figure refers to the empirical grounding of the model, that is, whether there is empirical evidence to justify the model's assumptions, parameters, and specific calibrations. The right side of the figure refers to empirical evidence of the impact of the model, that is, whether the model fits the problem situation, can be used in real time, and provides useful output. The concerns expressed above by Spearman and Hopp stem from the expectation that a single paper will present both the model and the empirical testing. This expectation leads to the situation in which empirical testing serves only to demonstrate the model in action, rather than preparing the way for the insights encapsulated in the model to be deployed in practice. Given the lack of openness (among some) to publishing further empirical testing, the model may be accepted by the research community based on its analytical strength—but the first question anyone from practice will ask is, “Who else has used this, and what were the results?”

JOM is interested in papers that address questions related to both empirical sides of the development and use of analytical models—their grounding and their impact—that is, either side of Figure 1. Are data available for model parameters? How well do the results work in a variety of real situations? Are the results practically implementable? Are they useful to practitioners? Will managers actually use them? Figure 1 thus highlights important but often undervalued elements encountered in empirically grounding insights from analytical models. Both sides of Figure 1 require a significant amount of empirical research—and it is empirical work on either side of Figure 1 that is the primary contribution of an EGA paper in JOM. It is usually expecting too much of a single paper for it to address both sides of Figure 1 sufficiently.

On the left side of the figure, analytical models are linked to data and observations of reality: Their assumptions, parameters, and calibration should bear resemblance to a real situation. Here, an empirical contribution focuses on the empirical discovery of a new regularity (a new assumption) that leads to the development or revision of analytical models to exploit that newfound regularity. Contributions on the left side of Figure 1 represent the “heavy lifting” of empirically grounding models, transforming mathematical insights into a form that permits measurement and application, and making existing mathematical and modeling insights available to address an observed problem. Finding, collecting, preparing, and analyzing data requires a substantial amount of work—especially when it is impossible to obtain data from the company or situation on which a model was developed.
Key parameter values may be unobservable and require estimation from available data. Also, the assumptions that made the model tractable may not hold in key contexts: Empirical research needs to address this tradeoff between parsimony and accuracy. At JOM we want the value of such research to be recognized.

Contributions on the right side of Figure 1 assess an existing model's performance in real contexts and address emerging issues. Experiments, field tests, and intervention-based research methods are likely candidates for this type of EGA research. These contributions typically build on the empirical insights from the left side of Figure 1 and the insights/results of prior analytical models, but they add the new knowledge created when the effect on decision making of the nonlinearities captured by analytical models is observed empirically. We classify these contributions as EGA as well, although one could also consider them as “analytically grounded empirics.” Engaging in either side of Figure 1 will trigger an improvement process in which the model is revised based on new assumptions or the availability of new data, and/or its effectiveness (usefulness) and efficiency are increased in the real-world context. This will require toggling back and forth between inductive reasoning to capture the new empirical evidence, deductive reasoning through the analytical model, and abductive reasoning (Josephson & Josephson, 1994) to reconcile the emerging insights and empirical regularities. The surprising and unexpected results that trigger the abduction logic indicate that both the model and its empirical grounding matter to creating actionable knowledge. Creating space for abduction is one of the reasons why successful EGA contributions are more likely to come from the sides than the center of Figure 1. Again, JOM encourages papers that tackle either side of Figure 1 and empirically motivate a significant revision to existing models (see examples below).

The empirical grounding described above is often replaced in the modeling literature (where the focus is the model formulation and insights) by either stylized assumptions (explicit simplifications that still capture key elements of the problem situation) or artificial (simulated) data to assess the model performance. Table 1 identifies four types of modeling efforts, depending on the source of assumptions and data for assessing model performance (cf. the left and right sides of Figure 1), together with the key contribution of each type of study (the italicized terms in each cell). The upper-left quadrant (a stylized model tested with artificial data) is common where an analytical insight is a paper's primary contribution. Empirical grounding can take the form of either moving to empirical data applied in an actual situation (lower-left quadrant) or observing areas in practice where the model requires extension (upper-right quadrant).

[Table 1 cell entries, shown here only as labels: analytical insight; parsimonious causal insights; explore implications; test of value; quality of approximation; solutions/designs; performance improvement.]

An effective EGA process will encourage moving across the quadrants in Table 1: Progress made in any quadrant can open new doors in adjacent quadrants. As we gain fluency in managing the EGA research process, it will become easier to take analytical insights into the field, transforming them into effective interventions through a multi-stage process that links analytical and empirical publication outlets.
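To make the parsimony-versus-accuracy tradeoff and the “quality of approximation” question from Table 1 concrete, here is a minimal, hypothetical Python sketch (with simulated placeholder demand and assumed cost parameters, not data from any study cited here) comparing a newsvendor decision under a stylized normal-demand assumption with one grounded directly in the empirical demand distribution:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    # Placeholder demand history; an EGA study would use observed demand.
    demand = rng.lognormal(mean=4.0, sigma=0.6, size=250)

    cu, co = 8.0, 2.0                # assumed underage and overage costs per unit
    critical_ratio = cu / (cu + co)  # newsvendor critical fractile

    # Stylized assumption: normal demand, parameters fitted to the sample.
    q_stylized = stats.norm.ppf(critical_ratio, loc=demand.mean(), scale=demand.std(ddof=1))

    # Empirically grounded alternative: the empirical quantile of the sample.
    q_empirical = np.quantile(demand, critical_ratio)

    def avg_mismatch_cost(q, sample):
        """Average newsvendor mismatch cost of ordering q, evaluated against observed demand."""
        return np.mean(cu * np.maximum(sample - q, 0) + co * np.maximum(q - sample, 0))

    print(f"order quantity: stylized {q_stylized:.1f} vs. empirical {q_empirical:.1f}")
    print(f"mismatch cost:  stylized {avg_mismatch_cost(q_stylized, demand):.1f} "
          f"vs. empirical {avg_mismatch_cost(q_empirical, demand):.1f}")

If the two quantities and costs are close, the stylized assumption is a harmless simplification; if they diverge, the empirical grounding has located a place where the model's assumptions bias the decision.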
All else being equal, JOM is more interested in empirical studies that assess the effectiveness and usefulness of a model (despite its simplifying assumptions), that is, the right side of Figure 1. However, it should be noted that items in the lower half of the table—referring to analytical models that are fit to empirical data but provide no insight into how implementation of the model increased knowledge, understanding, or improvements to the model—typically do not qualify as EGA even though the research is carried out in a real context. The contribution from EGA papers (in JOM) must be foremost empirical—even if some of the insights arise from the analytical model—but the use of the data must translate into model improvements that strengthen the results derived from the model. This strategy, however, should not be confused with intervention-based research (IBR), where the outcome of the intervention is to improve existing theories or develop new theoretical insights as a result of the engagement with the problem situation (Chandrasekaran et al., 2020; Oliva, 2019).

Tables 2 and 3 summarize aspects of several EGA papers that we will discuss further in this section. We begin with some example papers (in Table 2) that fit best on the left side of Figure 1, followed by papers (in Table 3) that fit best on the right side. Most of these papers have been published in JOM and exemplify the new space that we are seeking to develop, in which empirical work is done to improve the usability of a model.

Serrano et al. (2018) began from the observation that financial risk propagates upstream in supply chains, the so-called “contagion effect,” similar to what has been observed with the upstream propagation of order variability. They combined models that incorporate the policies and constraints that drive agent behavior to reproduce the observed propagation behavior in terms of the values and dynamics of the variance of payments, and they grounded the model in findings from the empirical finance literature to identify factors that drive such propagation and to explore what payment-variability propagation would look like when encountering rare but important states of nature.

Chuang et al. (2022) began from the recognition that retail managers allocate auditing effort (i.e., inspections) for reducing inventory record inaccuracy (IRI) to groups of SKUs rather than to individual SKUs, as most models in the literature assume. This paper proposed the use of survival analysis to transform available data into an estimate of the data degradation rate in a group of SKUs and developed a simple model to allocate auditing effort optimally across groups of SKUs. The paper further grounds the model by empirically estimating the unobservable cost of IRI and testing the model with data from the research site, where it outperformed current managerial practices and the prevalent industry method for allocating auditing effort.
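As a rough illustration of how survival analysis can turn audit records into a degradation-rate estimate, the sketch below fits a simple exponential (constant-hazard) model with right-censoring to made-up audit times; it is a hypothetical simplification, not the estimator used by Chuang et al.:

    import numpy as np

    # Hypothetical audit data for one group of SKUs (placeholder values):
    # days until a record was found inaccurate, and a flag for right-censoring
    # (1 = inaccuracy observed, 0 = record still accurate at the last audit).
    days = np.array([12.0, 30.0, 45.0, 45.0, 60.0, 75.0, 90.0, 90.0, 90.0, 120.0])
    observed = np.array([1, 1, 1, 0, 1, 1, 0, 1, 0, 0])

    # For an exponential survival model, the maximum-likelihood degradation rate is
    # (number of observed inaccuracies) / (total record-days at risk), which handles censoring.
    rate = observed.sum() / days.sum()

    print(f"estimated degradation rate = {rate:.4f} per record-day "
          f"(mean time to inaccuracy ~ {1.0 / rate:.0f} days)")

Such an estimate, computed per SKU group, is the kind of empirically grounded input that can then drive a model for allocating auditing effort across groups.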
Abbey et al. (2015) began from the observation that customers differ in their preference for new or remanufactured products and wondered whether this observed heterogeneity might affect pricing decisions. They demonstrated the existence of these two customer groups empirically and then extended a pricing-demand model to incorporate them. When customers are assumed to be homogeneous in their preference for new or remanufactured products, pricing-demand models indicate that the price for new products should decrease when remanufactured products are added to the mix. The extended model, however, indicates that the price for new products should increase when remanufactured products are added, with this price increase reducing cannibalization and increasing profit.

Gray et al. (2017) began from the observation that companies offshored and then reshored production. They examined nine offshoring cases, six of which were subsequently reshored. The reshoring cases were motivated by emergent problems such as intellectual-property protection, quality, and/or logistics. The authors observed that the offshoring decision was made on the basis of per-unit landed cost, with other factors not considered even though information was available. The reshoring decision, however, was based on much richer data. They modeled the decision process using system dynamics, with loops grounded in observations from the cases. The model reproduced the observed behavior and also suggested a risk of oscillation: Although the reshoring decision brought richer information into the decision process, there was no evidence that offshoring decisions were updated to consider more than per-unit landed cost.

Craig et al. (2021) began from the observation that the majority of chargebacks used by retailers to penalize supplier fulfillment errors arose from execution errors rather than product-related factors. The mechanisms in place tended to prioritize simplicity over accuracy and might not take into consideration the type of error encountered. Also, errors were experienced differently by retailers and suppliers, and the authors argued that such differences should be considered in designing the chargeback system. For example, an error that is inexpensive for a supplier to fix and costly for the retailer should perhaps be treated differently than one that is difficult and expensive for a supplier to fix and relatively inexpensive for the retailer. The researchers extracted data concerning fulfillment errors, chargebacks, and penalties from supplier compliance manuals. They used the economic order quantity (EOQ) model as an example of a traditional OM tradeoff model to provide insight into how the optimal decision is affected by the way the error rate changes with the order size. The researchers used numerical experiments to demonstrate, first, the potential magnitude of the cost of execution errors and, second, the potential for the chargeback mechanisms in common use to substantially under- or overcharge for execution-quality cost. Finally, the authors presented a game showing the importance of considering supplier and retailer power and margin in designing chargeback penalties and coordinating efforts to improve execution quality.

de Treville, Petty, et al. (2014) began from the observation that 10% of the deals assessed as worth evaluating at a venture capital (VC) firm were not evaluated, with the reason logged as a lack of capacity: Had more capacity been available, the VC firm would have evaluated all deals that passed its initial assessment. How would the cost of the additional capacity needed compare to the expected benefit from evaluating more deals? Exploring this question began with the development of an analytical model that estimates how the expected value of the best deal grows with the number of evaluations for a given tail index of the distribution of deal values.
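The following is a minimal, hypothetical Monte Carlo sketch in Python of the underlying idea (it is not the model from de Treville, Petty, et al.): it shows how the expected value of the best of n evaluated deals grows with n when deal values have a light (exponential) tail versus a heavy (Pareto) tail:

    import numpy as np

    rng = np.random.default_rng(42)

    def expected_best(sampler, n_evals, reps=20000):
        """Monte Carlo estimate of the expected maximum of n_evals i.i.d. deal values."""
        draws = sampler((reps, n_evals))
        return draws.max(axis=1).mean()

    # Light tail (tail index 0): exponential deal values.
    light = lambda size: rng.exponential(scale=1.0, size=size)
    # Heavy tail (tail index 0.5): Pareto deal values with shape parameter alpha = 2.
    heavy = lambda size: rng.pareto(a=2.0, size=size) + 1.0

    for n in (10, 50, 100, 200):
        print(f"n = {n:3d}: expected best deal, light tail ~ {expected_best(light, n):5.2f}; "
              f"heavy tail ~ {expected_best(heavy, n):6.2f}")

With a light tail the expected best deal grows only logarithmically in the number of evaluations, whereas with a heavy tail it continues to grow much faster, which is why the tail index drives the value of adding evaluation capacity.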
The insight was that the value of the best deal grows little from increased evaluations if deal values are described by a distribution whose tail decays exponentially (tail index of 0, covering distributions up to the exponential distribution). If, however, the tail decays polynomially (tail index > 0), then the value of the best deal is expected to increase significantly with an increase in evaluations. It was not possible to observe the tail weight of the deal values considered by the VC firm, so the authors used the tail weight of VC-backed initial public offerings—substantially above 0—to get a rough idea of whether positive tail weights should be considered. The value of the increase in the best deal under various tail weights was then compared to a rough estimate of the cost of increasing VC capacity.

de Treville, Schürhoff, et al. (2014) presented a tool that used real-options theory to calculate the value of postponing the decision about what to produce under demand risk. This work began from the realization—captured in a Harvard Business Review article (de Treville & Trigeorgis, 2010)—that the value of reducing the time between production commitment and observation of demand could be conceptualized as a real option, with this insight qualitatively explored in a company. Model development consisted of combining models from quantitative finance with the newsvendor model. The analytical model proposed—known as the Cost-Differential Frontier—was applied to two cases to illustrate how incorporation of these real options would affect the decisions companies made about investment in lead-time reduction. Although this article is at the edge of EGA, at the time it was generally agreed that the contribution was more the model than the empirical insights, and it was not considered to fit well at JOM.

Gaur et al. (2007) began from the realization in Fisher and Raman (1996) that developing the ability to transform the dispersion of forecaster estimates into an estimate of the standard deviation of demand would be useful in the absence of historical demand data. Fisher and Raman (1996) simultaneously developed an analytic model to better understand supply-demand mismatches and sought to better match production to demand in a company. The model, while elegant, was too computationally demanding and also not sufficiently flexible to answer the needs of the company project. The authors then combined analytical tools from OM into a model to allocate production between suppliers who varied in responsiveness. This required an estimate of the standard deviation of demand for the product, but no historical demand data was available. The authors thus created a heuristic to transform the dispersion of forecaster estimates into a rough estimate of the standard deviation of demand that was sufficient to allow them to try out the concept at the company and obtain initial indications that the approach would reduce supply-demand mismatches. Whereas the initial research project fits best in the center of Figure 1, with the model as the primary contribution, Gaur et al.'s (2007) empirical testing of the insight that emerged from practice about the use of forecaster estimates to estimate the standard deviation of demand takes us to the right-hand side of Figure 1.
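As a hypothetical illustration of the idea behind such a heuristic (not Gaur et al.'s actual estimator), the dispersion among forecasters' point estimates can be scaled into a rough demand standard deviation, with the scaling factor assumed to have been calibrated on past products for which both quantities were observable:

    import numpy as np

    # Hypothetical point forecasts from five forecasters for a new product (placeholder values).
    forecasts = np.array([820.0, 900.0, 1050.0, 1100.0, 1250.0])

    mean_forecast = forecasts.mean()
    forecast_dispersion = forecasts.std(ddof=1)

    # Assumed calibration factor relating forecaster dispersion to demand standard deviation,
    # estimated from past products where both forecasts and realized demand were recorded.
    k = 1.75
    sigma_demand = k * forecast_dispersion

    print(f"point forecast ~ {mean_forecast:.0f} units, "
          f"estimated demand standard deviation ~ {sigma_demand:.0f} units")

The empirical question is whether such an estimate is accurate enough to support production-allocation decisions in the absence of historical demand data.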
In the previous section, we noted that the Cost-Differential Frontier tool developed in de Treville, Schürhoff, et al. (2014) involved practical application that was enough to give an idea of how the tool worked but fell short of the empirical grounding that we are describing here. In de Treville, Bicer, et al. (2014), the Cost-Differential Frontier was taken into three companies (Nissan Europe, GSK Vaccines, and Nestlé Switzerland) to explore insights from applying the tool and to map out the path to implementation. This analysis of the supply chain through the lens of the Cost-Differential Frontier led Nissan Europe to realize that the productivity gains from their efforts to freeze the production schedule several weeks in advance were considerably smaller than the cost of the supply-demand mismatches caused by the increased time between when the decision was made about which car to assemble and when actual demand was known. The Cost-Differential Frontier needed to be extended to incorporate the fact that 80% of the sales for GSK Vaccines came from a tender structure: The production commitment needed to be made 10 months before delivery, but GSK Vaccines only learned 2 months before delivery whether they had won the tender. Incorporating this demand risk into the Cost-Differential Frontier made clear to company decision makers that there was high value in reducing the decision lead time enough to learn whether the tender had been won. Nestlé Switzerland was experiencing a high level of excess finished-goods inventory for a product with low demand variability and a long shelf life. Examination of the excess inventory through the lens of the Cost-Differential Frontier—which required extending the tool to incorporate bullwhip-like demand risk—made it possible to quantify the effect of bullwhip-like demand risk on the valuation of decision lead time. Historical demand data was used to show that the bullwhip-like effects were sufficient to cause the observed supply-demand mismatches.

Biçer et al. (2018) extended the Cost-Differential Frontier to include demand risk in the form of jumps. They then applied the extended model to data from Reebok (Parsons, 2004; see also Graves & Parsons, 2005). Was the increased model complexity from incorporating possible jumps warranted by the improvement in supply-demand matching? The rather counterintuitive result was that the simple form of the Cost-Differential Frontier gave a reasonably good answer when a jump was expected to reduce median demand. If, however, a jump was expected to increase median demand, then the simple version of the Cost-Differential Frontier risked undervaluing lead time. This article combined a model extension with empirical grounding to provide a simple decision rule for when jumps should be incorporated in decision making.
Saunders et al. (2021) began from a supply-demand mismatch for a consumer-goods product that appeared to fit Fisher's (1997) description of a functional product. The initial objective was to determine whether the extended (low-cost) supply chain for the product was the cause of the mismatches. They used the Cost-Differential Frontier to estimate the expected mismatch cost, with the result that lead time was not particularly valuable for this product and that an “efficient” supply chain was appropriate. The authors then built a simulation model to capture the full decision process in use. When the simulation output did not match company data, it emerged that because the supply chain was considered to be efficient rather than responsive, no efforts were made to update orders. Soon after the initial order was placed with the distant supplier, a key supplier made a second order. The majority of the mismatch disappeared if the initial order was immediately updated based on the key supplier's update, which was feasible given the supplier's lead time. The contribution was the realization that a small dose of flexibility in an otherwise inflexible supply chain could make an important difference in supply-demand mismatches, and that designating a supply chain as purely efficient could change behavior in a way that caused a dramatic increase in mismatches.

EGA papers in JOM must have an empirical focus. The analytical insights to be explored empirically are likely to emerge from a model, or models, that have already been published elsewhere. The JOM paper would be evaluated primarily in terms of its empirical contribution rather than its modeling insights. To make this clear in a manuscript, we often advise authors to summarize the model in an appendix (citing its original publication, of course). When the development of an analytical model takes center stage in a paper, that is a sign that it is probably not a good fit for JOM (because the focus of the paper is on the center of Figure 1 rather than on either side of it).

How much empirical grounding is enough? No paper will ever be able to do this completely; it is a matter of degree. Whether the degree is sufficient is a question of warrant (Ketokivi & Mantere, 2021), and whether it is significant is largely subjective (more on this below). How much does the grounding add new insight or change the understanding? A manuscript must provide sufficient warrant for its claims of appropriate grounding and the significance of the new insights, often by showing how and why a model's assumptions, calibrations, factors, and/or results should be significantly different. It is incumbent upon authors to convince reviewers that the grounding is sufficient and leads to something significant. The requisite empirical grounding can be achieved by a variety of methods, both qualitative and quantitative. Model parameterization should similarly be grounded in empirical data, and the assumptions that the model makes must be empirically reasonable. As with all research published in JOM, authors must seek a sense of generality, not just focus on a single instance of a problem. We encourage authors to make use of publicly available data in generating empirical insights from the application of the analytical model, while noting that reviewers are not always accustomed to this use of publicly available data: Authors should be prepared to carefully explain what they are doing and why their data set provides warrant for empirical grounding.

The other, usual expectations of a JOM paper also apply. For one, the paper should contribute to OM theory. This contribution distinguishes a JOM EGA paper from an article published in a journal such as the INFORMS Journal of Applied Analytics (formerly called Interfaces), wherein articles are oriented toward practitioners and designed to illustrate the use of analytical models in practice. An EGA contribution in JOM brings new knowledge and understanding, occupying a different space than practitioner-oriented usage guides and mere examples of model deployment and application.
As with other types of papers in JOM, the paper's contribution must also be sufficiently significant rather than marginal. This criterion is admittedly subjective, with each reviewer bringing their own perspective on the size of a paper's contribution. As a general OM journal, JOM expects contributions to be generalizable rather than applicable only to niche areas. Other author guidelines apply, including the 40-page maximum manuscript length.

JOM is announcing an open call for papers for a special issue on EGA. This call will mention further example papers from other journals. We expect this special issue to provide opportunities to develop and exhibit what JOM expects from EGA papers.
Keywords: analytics, operations, management, research, journal