
Communications of the ACM

Viewpoint

Integrating Management Science Into the HPC Research Ecosystem



High performance computing (HPC) refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve problems in science, engineering, or business. HPC is usually realized by means of computer clusters or supercomputers. Interestingly, the June 2019 edition of the list of TOP500 supercomputer sites marks a milestone in its history because, for the first time, all 500 systems deliver a petaflop or more. But looking at the development of single core performance reveals that it has stopped growing due to heat dissipation and energy consumption issues. As a result, substantial performance growth has started to come only from parallelism, which, in turn, means that sequential programs will not run faster on successive generations of hardware.

Many academic disciplines have been using HPC for research. For example, HPC has become important in systems medicine for Alzheimer's research, in biophysics for HIV-1 antiviral drug development, in earth system sciences for weather simulations, in materials science for discovering new materials for solar cells and LEDs, and in astronomy for exploring the universe. Usage statistics of academic supercomputing centers, which have started offering HPC as a commodity good for research, show a high diversity of scientific disciplines that make use of HPC in order to address unresolved scientific problems with computational resources that were unavailable in the past. These statistics also indicate which scientific disciplines have hardly used HPC to solve their research problems. Undoubtedly, several of these other research areas will hardly benefit from HPC, for example, because their research is not computationally intensive. But there are also areas where using massive computational resources can help solve scientific problems. One of these areas is management science (MS), including its strong link to economics.


Management Science Benefits from HPC

Management science refers to any application of science to the study of management problems. Originally synonymous with operations research, MS has become much broader in terms of problems studied and methods applied, including those related to econometrics, mathematical modeling, optimization, data mining and data analytics, engineering, and economics. Its multidisciplinary and often quantitative nature makes it a promising scientific field for HPC. The potential of applying HPC to MS includes the improvement of efficiency ("solving a problem faster"), effectiveness ("solving a problem of larger size and/or with enhanced quality"), and robustness ("solving a problem in a way that makes the solution robust against changes").

Although MS very rarely appears in HPC usage statistics, the potential of HPC for solving problems in MS is being tapped in several of its subfields. A first example is the fraud detection services in finance provided by Bertelsmann Arvato Financial Solutions.8 Machine learning approaches are integrated into the real-time analysis of transaction data. These approaches are based on developing self-learning analytical models from past fraud cases in order to recognize new fraudulent cases early. From a technological perspective, the Hadoop framework and the Microsoft Azure cloud infrastructure are used.

A second example is the cloud-enabled software PathWise (provided by Aon), which uses HPC to manage financial guarantee risk embedded in life insurance retirement products.2 Applying HPC capabilities reduces the time required to evaluate policies from hours and days to minutes through a variety of approaches, including the parallelization of Monte Carlo simulations.




A third example of benefiting from HPC when solving problems in MS is the parallelization of algorithms for solving discrete optimization problems in operations research. In Rauchecker and Schryen,9 the authors parallelize a branch-and-price algorithm for solving the "unrelated parallel machine scheduling problem with sequence-dependent setup times." Their computational experiments conducted on a large university cluster used MPI (Message Passing Interface) to connect 960 computing nodes. Results on efficiency show their parallelization approach can even lead to superlinear speedup.
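To make the terminology concrete: speedup is conventionally measured as the runtime of the sequential program divided by the runtime of the parallel program, and efficiency as speedup divided by the number of processors. Speedup is called superlinear when it exceeds the number of processors. One common explanation in tree-search algorithms is that parallel exploration may find strong bounds earlier and therefore prune more of the search tree. The following Python sketch only illustrates these definitions; the function names and the timings are made up for illustration and are not taken from Rauchecker and Schryen.9

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Ratio of sequential runtime to parallel runtime (same time unit)."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("runtimes must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, processors: int) -> float:
    """Speedup per processor; 1.0 corresponds to ideal linear speedup."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(t_sequential, t_parallel) / processors


def is_superlinear(t_sequential: float, t_parallel: float, processors: int) -> bool:
    """True when the measured speedup exceeds the number of processors."""
    return speedup(t_sequential, t_parallel) > processors


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to illustrate the definitions.
    t_seq, t_par, p = 120.0, 12.0, 8
    print(f"speedup    = {speedup(t_seq, t_par):.2f}")
    print(f"efficiency = {efficiency(t_seq, t_par, p):.2f}")
    print(f"superlinear: {is_superlinear(t_seq, t_par, p)}")
```

An efficiency above 1.0 is the signature of superlinear speedup; in measured experiments it is worth checking that the sequential baseline is itself well tuned before drawing that conclusion.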

Further examples can be found in economics, where dimensional decomposition for dynamic stochastic economic models has been implemented on a supercomputer3 and equilibria in heterogeneous agent macro models have been solved on HPC platforms.7 In data analytics, HPC has been applied not only to detect financial fraud but also to solve problems in social network analysis (for example, Zhang et al.11) and in global health management (for example, Juric et al.6). Overall, the applicability of HPC to problems occurring in MS shows a large methodological diversity, including methods from machine learning and artificial intelligence, simulation, and optimization, and it is not difficult to identify many computational problems in MS that may be solved through HPC. However, what has turned out to be difficult is bringing HPC to MS and fostering the position of MS in the HPC research ecosystem. This deployment essentially targets exploiting technical HPC capabilities for solving managerial problems through raising awareness of this potential in the MS community; implementing HPC-related education for MS researchers; and providing software development support with frameworks, libraries, programming languages, and so forth, which focus on solving specific types of problems occurring in MS.


Raising Awareness of HPC Benefits for Management Science

Despite promising applications of HPC in different areas of MS, its deployment in MS is far from being an established and well-known research approach. This subordinate role of HPC in MS is reflected in different phenomena, which, at the same time, point to opportunities for raising awareness of HPC benefits in MS and for informing researchers that HPC is not identical to "High Performance Technical Computing."

In the MS community, only very rarely are (special issues of) journals, workshops, or conference tracks dedicated to HPC-based research. A few examples exist (for example, Schryen et al.10), but many more such efforts are needed to identify and communicate the potential that HPC brings for MS and to foster corresponding research. In addition, MS departments may profit from introducing HPC to master's and Ph.D. students and young scientists by offering HPC courses and HPC summer/winter schools, in close cooperation with HPC sites of universities. Such courses and schools are of particular benefit when MS students and researchers can bring their own problems, algorithmic blueprints, or codes, and learn how to think, design, and implement in parallel. In short, we need a more thorough computational and HPC-oriented education of MS students, who are the MS scientists of tomorrow.

HPC sites at universities and research institutions today generally focus on their current "power users," who are scholars from the natural sciences, engineering disciplines, and medical sciences, among others. Often, these sites expect users to have a clear understanding of how HPC works technically (for example, shared vs. distributed memory), which parallel programming paradigms exist (threads, processes, and so forth), which libraries and APIs are state-of-the-art (OpenMP, MPI, CUDA, and others), and how parallel programming should be done (taking care of data races, deadlocks, and so forth). Unfortunately, MS researchers often do not have (or need) this deep knowledge to understand their own research field. This gap between expected and existing knowledge of HPC ultimately prevents MS researchers from tapping the potential that HPC might bring to their research. HPC sites should contribute to closing this gap by providing educational courses dedicated to MS. It might also be helpful to bridge the gap by establishing and financing (jointly with MS departments) positions of HPC-MS engineers.

Finally, funding programs dedicated to computational and HPC research in MS are likely to foster the awareness of HPC benefits for MS and the attractiveness of HPC for MS researchers.


Scalability of Parallel Applications in Management Science Can Differ Fundamentally

Depending on the specific type of MS problem to be solved, the parallelization of algorithms may scale very differently with the number of parallel processing units. Therefore, it is important to thoroughly inform MS researchers about issues of efficiency and scalability so they can assess what to expect when solving their particular problem types with HPC.

Some research problems in MS involve solving specific instances of an optimization problem. In such cases, often a fixed-size model (constant total workload, variable execution time) occurs and strong scaling applies: according to Amdahl's Law,1 the speedup factor that can be achieved from parallelization is upper-bounded by 1 divided by the serial fraction of code, which is always larger than zero in practice. Due to the execution time required for coordination, this upper bound is usually not achieved, and speedup values even start dropping when a particular number of parallel processing units (referred to as "processors") is exceeded. Even when ignoring all coordination efforts, a serial fraction of code amounting to 20%, for example, would limit the maximum speedup to a factor of 5 regardless of the number of processors. Consequently, MS researchers must be informed about the speedup, efficiency, and scalability that can be expected and about their determining factors.
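The 20% example can be checked numerically. In its usual form, Amdahl's Law bounds the speedup on p processors by 1 / (s + (1 - s) / p), where s is the serial fraction; as p grows, this approaches 1 / s. The following Python sketch evaluates that bound while ignoring coordination overhead, as in the example above. The function name and the chosen processor counts are illustrative assumptions.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup for a fixed-size workload (Amdahl's Law).

    Coordination and communication overhead are ignored, so real
    speedups will typically be lower than the value returned here.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Serial fraction of 20%, as in the example in the text.
    for p in (1, 4, 16, 64, 1024):
        bound = amdahl_speedup(0.2, p)
        print(f"{p:>5} processors: speedup <= {bound:.2f}")
```

With a serial fraction of 0.2, four processors already give a bound of 2.5, and no processor count can push the bound to 5 or beyond, which is why reducing the serial fraction usually matters more than adding hardware.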




Other problems in MS, often occurring in data analytics and in a real-time decision-making context, follow a scaled-size model (variable total workload, constant workload per processor, constant execution time). Then, according to Gustafson's Law,5 weak scaling applies, which means speedup can increase (almost) linearly with the number of processors (even when coordination efforts are considered).
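For comparison with the fixed-size case, Gustafson's Law is commonly written as a scaled speedup of s + (1 - s) * p, where s is the serial share of the execution time measured on the parallel system and p is the number of processors. The following Python sketch evaluates this expression; the function name and the 20% serial share are illustrative assumptions, and coordination overhead is again ignored.

```python
def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled speedup when the workload grows with the processor count.

    serial_fraction is the serial share of the runtime observed on the
    parallel system, following Gustafson's formulation.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return serial_fraction + (1.0 - serial_fraction) * processors


if __name__ == "__main__":
    # Same illustrative serial share of 20% as in the fixed-size example.
    for p in (1, 4, 16, 64, 1024):
        scaled = gustafson_speedup(0.2, p)
        print(f"{p:>5} processors: scaled speedup = {scaled:.1f}")
```

Unlike the fixed-size bound, this expression keeps growing linearly in p with slope 1 - s, which reflects that each additional processor brings its own additional share of the workload.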


Applications Need Not Primarily Be Rewritten but Rethought by (Re-)design

While some sequential applications can be parallelized straightforwardly, limiting parallelization efforts to the implementation phase is myopic. As noted in Fuller and Millett,4 "attempts to extract parallelism from the serial implementations are unproductive exercises and likely to be misleading if they cause one to conclude that the original problem has an inherently sequential nature." Thus, it is important to support MS researchers in designing parallel algorithms and to release them from parallel implementation and technical issues. While this approach is useful for scholars of all scientific disciplines, it is of particular importance in fields where researchers are not used to programming-intensive tasks, as is often the case in the MS community.

Several frameworks applicable to MS have already been suggested (for example, Apache Hadoop, Ubiquity Generator by ZIB, Branch-Cut-Price framework in COIN-OR) or are under development (for example, the PASC project "Framework for computing equilibria in heterogeneous agent macro models"); however, we should strengthen our efforts to develop IT artifacts that support MS researchers in parallel design and parallel implementation. In particular, high-level languages at the application level rather than multipurpose parallel languages at the programming level would need to be provided. The availability and usability of such application languages would allow MS researchers to focus on parallel design issues and release them from writing parallel code at the programming level, which would be generated automatically by application language compilers. Such approaches are appropriate for deploying HPC in MS at a large scale. In conclusion, now is an auspicious time for integrating MS into the HPC research ecosystem, and the MS community can look forward to the promising developments to come.


References

1. Amdahl, G.M. Validity of the single-processor approach to achieving large scale computing capabilities. In AFIPS Conference Proceedings, Vol. 30, AFIPS Press, Reston, VA, (1967), 483–485.

2. Aon. PathWise. 2019; https://www.aon.com/reinsurance/PathWise-(1)/default.jsp

3. Eftekhari, A., Scheidegger, S., and Schenk, O. Parallelized dimensional decomposition for large-scale dynamic stochastic economic models. In Proceedings of the Platform for Advanced Scientific Computing Conference. (2017).

4. Fuller, S.H. and Millett, L.I. The Future of Computing Performance: Game Over or Next Level? National Academy Press. (2011).

5. Gustafson, J.L. Reevaluating Amdahl's Law. Commun. ACM 31, 5 (May 1988), 532–533.

6. Juric, R., Kim, I., Panneerselvam, H., and Tesanovic, I. Analysis of ZIKA virus tweets: Could Hadoop platform help in global health management? In Proceedings of the 50th Hawaii International Conference on System Sciences. (2017).

7. Kübler, F., Scheidegger, S., and Schenk, O. Computing Equilibria in Heterogeneous Agent Macro Models on Contemporary HPC Platforms (2017); http://www.pasc-ch.org/projects/2017-2020/computing-equilibria-in-heterogeneous-agent-macro-models/

8. Microsoft. Big data: Improving fraud recognition with Microsoft Azure. (2017); https://customers.microsoft.com/en-us/story/arvato-azure-powerbigermany-media-inovex

9. Rauchecker, G. and Schryen, G. Using High Performance Computing for unrelated parallel machine scheduling with sequence-dependent setup times: Development and computational evaluation of a parallel branch-and-price algorithm. Computers and Operations Research 104 (2019), 338–357.

10. Schryen, G., Kliewer, N., and Fink, A. Call for Papers Issue 1/2020—High Performance Business Computing. Business and Information Systems Engineering 60, 5 (2018), 439–440.

11. Zhang, K., Bhattacharyya, S., and Ram, S. Large-scale network analysis for online social brand advertising. MIS Quarterly 40, 4 (2016), 849–868.


Author

Guido Schryen (guido.schryen@upb.de) is a full professor of Management Information Systems and Operations Research at Paderborn University, Germany.


Copyright held by author.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2020 ACM, Inc.


 
