Home/Magazine Archive/August 2007 (Vol. 50, No. 8)/Designing Large-Scale Supply Chain Linear Programs.../Full Text

Communications of the ACM
## Designing Large-Scale Supply Chain Linear Programs in Spreadsheets

A supply chain is a network of facilities for activities such as raw materials procurement, finished goods manufacturing, and distribution of products to customers. Because of competitive pressures, many organizations have expanded their supply chains to include international sites for more economical production and/or increased sales. The result has been increasingly complex supply chain management problems.

One approach to dealing with such complexities is the use of a linear program (LP), a type of optimization model for applications such as minimizing costs or maximizing profits. Unfortunately, in the past LPs were explained in algebraic terms, which was confusing to many non-specialists. Recently, however, the awareness of LPs by non-technical senior managers has greatly increased because they can be explained more intuitively using spreadsheets. Given the advent of spreadsheets for LPs, more managers are approving the use of the now-familiar linear programming approach for minimizing supply chain costs. As a result, supply chain LP models have become one of the most common LPs being solved.

Table 1 shows some of the organizations that have used spreadsheet-based optimization models [3]; see [46] for descriptions of other applications of LP for supply chain management. Because of this increasing use, the ability to design an optimization model within a spreadsheet is important. However, spreadsheets have inherent limitations when dealing with large-scale models, and a direct approach based on standard procedures recommended in textbooks is generally not successful.

In this article, we discuss some common difficulties encountered during the development of practical spreadsheet optimization models for supply chain management and describe efficient procedures for overcoming these problems. Our approach uses Microsoft Excel's standard features extensively, but augments the spreadsheet with customized Visual Basic for Applications (the VBA programming language within Excel). Our specific contributions include:

- A discussion of the limits of previous "best" techniques for spreadsheet supply chain linear programming and the presentation of new techniques for successfully implementing larger supply chain spreadsheet LPs than were previously thought possible;
- The use of filters for extracting applicable data when different modifications of a basic LP must be solved;
- Examples of the use of VBA to increase the efficiency of a spreadsheet supply chain LP;
- An example of the trade-off between easy-to-develop and slow-to-runa concept that is well-known in procedural programming but novel in the spreadsheet context; and
- An example of an important large-scale supply chain optimization problem that was implemented using LP in a spreadsheet.

This work was motivated by our recent development of a large-scale multi-period supply chain LP for Nu-kote International, and we will use that model as a basis for illustration.^{1} Nu-kote is the world's largest independent remanufacturer of cartridges for printers, copiers, and fax machines. Nu-kote managers who would be using the LP model specifically requested that we develop it in Excel, but our initial attempts at an Excel-based LP were not successful due to the sheer size of the problem. However, using the techniques described in this article we succeeded in developing an LP that is being successfully used to support strategic decisions in a global supply chain, and that has resulted in savings of approximately $1 million in its first year of use.

In a supply chain LP, changing cells (also called decision variables) typically represent shipments of raw materials from suppliers to manufacturing sites and finished goods from manufacturing sites to customers. Constraints often prevent the total shipped out of each manufacturing site from exceeding its capacity and require the total shipped into each customer site to equal that site's demand. Table 2 shows the structure of our supply chain LP for Nu-kote. In addition to the static structure depicted in Table 2, we used weekly time periods to keep track of when the individual shipments originate and terminate. Changing cells referring to weekly shipments that are possible given the shipping lead times are said to be feasible. A spreadsheet LP requires passing constraints to a solution "engine," such as Excel's built in Solver or one of the upgraded versions available [2, 7]. Ranges containing changing cells and the total cost function must also be passed to the Solver, which returns the optimal values for the changing cells.

The size of a supply chain LP is normally given by the numbers of variables and constraints, but the size of a spreadsheet LP can also be expressed in terms of the number of rows and columns or by the file size in megabytes. A large-scale model is generally considered to have tens of thousands of rows. Our Nu-kote LP had approximately 68,500 variables and 1,400 constraints and was solved for each of 86 different cartridges. The spreadsheet had more than 30,000 rows, 229 columns, and was over 50MB in size.^{2} For complex models on this scale, common events such as the recalculation of cell formulas can be extremely time-consuming, and have even caused our Excel LPs to crash.^{3}

**Efficient Specification of Changing Cells.** When developing a spreadsheet optimization model for supply chain management, it is convenient to use rectangular ranges for changing cells, since row and column sums represent flow in and flow out in a natural manner. For example, Figure 1 shows our initial design of a supply chain LP for shipments from manufacturing sites to customers in weeks 178. Manufacturing sites are China and Thailand and are shown in B7:C162, and customers are A and B, as shown in D5:FC6. Changing cells in D7:FC162 specify the shipments from China and Thailand to customers A and B in each week. Thus, the rectangular range consists of four quadrantsone for each manufacturing site/customer combination. The row sums in column FD give the total flow (shipments) out of the corresponding manufacturing sites in each week, and the column sums in row 163 give the total flow into the customer sites in each week. These sums are parts of constraints that the total amounts shipped out of manufacturing sites cannot exceed the capacity in each week, and totals shipped into customer sites must equal the demand in each week.

Unfortunately, the inherent limitations in this approach make it unusable for large-scale applications. Note from D6:FC6 in Figure 1 that 278 = 156 columns are used just to specify the flows of finished goods to demand locations for various weeks. Since Excel 2003 has only 256 columns in each worksheet, this space-intensive approach severely restricts the number of columns available in this worksheet for other parts of the model. Obviously, problems with more than 256 options cannot be modeled using a single rectangular range.^{4} Note also that only the unshaded changing cells in D7:FC162 are needed because of shipping lead times between pairs of sites. For example, China to customer A takes seven weeks, so K7 (week 1 from China to Customer A for week 8) is the first feasible changing cell. The unneeded cells greatly slow the solution process for large-scale models. However, when storing a rectangular range in Excel, this wasted space is unavoidable.

The feasible (non-shaded) changing cells in Figure 1 consist of approximately 300 separate ranges^{5}K7:CC7, CJ7:FC7, L8:CC8, CK8:FC8, M9:CC9, and so forth. Passing ranges of changing cells to Solver manually requires selecting each individual range in turn and then clicking an "Add" button. This is far too time-consuming and error prone, and it is even difficult using a programming language such as VBA. For these reasons it is much more efficient to use columns for the changing cells, coefficients, labels, and formulas. This allows the changing cells to be passed to Solver as a single range of cells. The use of the columns structure for changing cells can be seen for the case of raw materials shipments in the left half of Figure 2a (cells A4:K30820)

**Facilitating Multiple Runs of LP Models.** Typically, a supply chain spreadsheet LP is solved for various proposed manufacturing and shipping lead times (due to premium payments for expedited manufacturing and air freight) and for proposed new suppliers and manufacturing sites. The flexibility to make such adjustments easily is an important characteristic of any supply chain model (see Fischer et al. [1] for a general discussion of the benefits of development flexibility).

Since the feasible changing cells depend on the specific lead times and sites included in the model, all potential cells for labels, coefficients, formulas, and changing cells must be included for any possible set of lead times and for all vendors and manufacturing sites. An Excel filter can then be used to extract only the allowable cells for the actual lead times and sites specified by the user in the model; this is shown in Figure 2a.

This approach works well for small and even moderately large problems. Unfortunately, it can lead to implementation failure for large-scale models. The resulting Excel file contains columns for both the original data (for any set of possible lead times and allowable sites) and the filtered data (which is feasible for the specified lead times and allowable sites). In our large-scale supply chain application for Nu-kote, the resulting Excel file size, in combination with the inefficiencies discussed next, dramatically slowed Solver performance and even caused Excel crashes. Projected solution time for the LP model (if Excel did not crash) was greater than 27 hours.

Model size can be greatly reduced by foregoing the convenience of duplicate data storage (the original data for any lead times and sites, and the filtered data for the user-specified lead times and sites). To accomplish this, VBA within Excel can be used to generate labels, coefficients, changing cells, and formulas only for shipments that are feasible for the given lead times. This is seen by noting the equivalence of the right side of Figure 2a (filtered data) with the right side of Figure 2b (output from VBA). With this automated approach, only the data needed is generated, and the wasted storage of duplicate data is eliminated.

With this automated approach, only the data needed is generated, and the wasted storage of duplicate data is eliminated.

Representative VBA code for generating these types of columns is shown in Figure 2b. It loops over all sources and destinations, looking up the shipping time between each pair. It then uses two "Do While" nested loops over all possible weeks for sourcing the material (SourceWeek) and manufacturing it (ManufactureWeek). Whenever the manufacturing week is at least as large as the source week plus the shipping lead time, concise output is written to columns A and B one row a time. VBA then simply copies the existing cost formulas in C7:F7, which look up various costs, down to all rows for which labels have been created.

It has been considered best practice to use Excel's SUMIF worksheet function in supply chain LPs to implement capacity, demand, and flow conservation constraints. This function adds the cells in one range whenever cells in another range meet the given criteria. For example, a manufacturing capacity constraint can be written using this function to sum all changing cells that refer to shipments into the given site. However, this approach is not workable for large-scale models. Our initial supply chain LP for Nu-kote had SUMIF functions in more than 1,000 cells, and each of these compared and summed the values in nearly 60,000 other cells. This massive recalculation at each of numerous iterations of the solution algorithm was too time-consuming, even for Frontline Systems' state-of-the-art Large-Scale LP solver with the premium solver platform.

To overcome the problem of excessive recalculation time with the SUMIF worksheet functions, each SUMIF can be replaced with the direct equivalent sum of the specific cells. Figure 3a shows the new formulas for the week 10 China capacity constraint in cell J16. This completely eliminates the problems with Solver crashing due to excessive recalculation time. However, this formula involves the exact specification of multiple single-cell ranges in formulas in more than 1,000 cells. Each formula is unique, so none can be "copied and pasted" to other cells. This is far too error-prone and time-consuming to perform manually, but a VBA procedure within Excel can be used to produce these formulas.

Figure 3b shows the VBA code for producing the customized formula =G9 + G79 + G148 + ... in cell J16 of Figure 3a. This function uses only five lines of code to look at each cell in the range B7:B26823 (the actual range is passed to the function). The function concatenates a "+" and the reference to the corresponding cell in column G each time a match is found to the given criteria (MyCriteria = "10 China"). This same function is used to create the unique Excel formulas in the rest of column J in Figure 3a.

Obviously, this customized approach is more complex than simply using SUMIFs. Thus, as is common in other programming environments, there is a trade-off between programming ease and model efficiency when developing spreadsheet-based supply chain LPs. We recommend using the SUMIF approach initially. If the runtime is excessive, then the VBA-driven approach described previously should be used.

We have outlined approaches to developing a large-scale supply chain LP using Microsoft Excel. The techniques described here have been proven effective in the creation of a valuable application for Nu-kote, International. The efficiency of these approaches allows developers to create a complete supply chain LP model, including all of the input data, in a single Excel workbook. Given the familiarity of many managers with the Excel software package, this exclusive use of Excel can significantly increase users' comfort level with, and therefore use of, advanced LPs for supply chain management. Illustrative of this situation is the fact that our original LP model was intended for use with only a few major product types, but these efficiency improvements prompted management to expand its use to support decisions regarding the entire line of 86 products. Total solution time for all 86 LPs was only 4 hours and 15 minutes, with no Excel crashing. This is down from the estimated 27 hours (if Excel did not crash) before our enhancements.

1. Fischer, G., Giaccardi, E., Ye, Y., Sutcliffe, A.G., and Mehandjiev, N. Meta-Design: A manifesto for end-user development. *Commun. ACM 47*, 9 (Sept. 2004), 3337.

2. Frontline Systems, Inc.; www.solver.com/.

3. Fylstra, D., president, Frontline Systems, Inc. Private communication, 2005.

4. LeBlanc, L., Hill, J., Greenwell, G., and Czesnat, A. Nu-kote's spreadsheet linear programming models for optimizing transportation. *Interfaces 34*, 2 (Feb. 2004), 139146.

5. Prior, R. et al. Menlo Worldwide Forwarding optimizes its network routing. *Interfaces 34*, 1 (Jan. 2004), 2638.

6. Tyagi, R., Kalish, P., Akbay, K., and Munshaw, G. GE Plastics optimizes the two-echelon global fulfillment network at its high performance polymers division. *Interfaces 34*, 5 (May 2005), 359366.

^{1}Nu-kote's proprietary information has been disguised.

^{2}Frontline Systems [2] provides several solver engines capable of handling problems of this size. for the Nu-kote implementation, we chose Frontline's Large Scale SQP Solver.

^{3}This is evidenced by "Microsoft Excel has encountered an error and must close."

^{4}Even when the column limit is increased, as in Excel 2007, rectangular ranges still result in inefficient use of space in the spreadsheet, and they can cause other issues, such as the difficulties in specifying changing cells as described here.

Figure 1. Inefficient supply chain LP using rectangular ranges for changing cells.

Figure 2a. Storing data for any possible lead times and later filtering for feasible rows.

Figure 2b. Input data, VBA output, and VBA code to generate data only for applicable lead times.

Figure 3a. Excel's SUMIF worksheet function vs. the direct sum equivalent.

Figure 3b. The VBA code for producing the customized formula =G9 + G79 + G148 + ... in cell J16 of

**©2007 ACM 0001-0782/07/0800 $5.00**

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2007 ACM, Inc.

No entries found