8 May 1998
Source: Hardcopy (221 pages) from the Office of Public Affairs, Bureau of Export Administration
Dr. Seymour E. Goodman
Stanford University/University of Arizona
Dr. Peter Wolcott
University of Nebraska at Omaha
Dr. Patrick Homer
University of Arizona Sierra Vista
April 25, 1998
Chapter 1: Introduction ... 13
  The Basic Premises Behind Export Control Thresholds
Chapter 2: Industry Trends Affecting Export Control ... 18
  Overview of Key Industry Trends
Chapter 3: HPC Controllability and Export Control Thresholds ... 46
  Kinds of Controllability
Chapter 4: Applications of National Security Interest ... 67
  Key application findings
Chapter 5: Policy Issues, Options and Implications ... 176
  Policy Options for a Control Threshold
APPENDIX A. Applications of National Security Importance ... 191
APPENDIX B. People and Organizations ... 212
APPENDIX C: Glossary ... 218
[Footer every page; omitted hereafter]
Goodman, Wolcott, Homer
Discussion Paper 4/25/1998
Figure 1 Computing platform categories (4Q 1997)
Figure 2 Minimum, actual, and maximum computing power available for F-22 design
Figure 3 Establishing a range for a viable control threshold
Figure 4 Performance of microprocessors in volume production
Figure 5 Microprocessor feature sizes
Figure 6 High, low, and average transistor counts for RISC microprocessors, compared with Moore's Law
Figure 7 High, low, average clock frequency for RISC microprocessors
Figure 8 Instructions issued per clock cycle
Figure 9 Industry projections of microprocessor performance
Figure 10 Percentage of Top500 systems based on commercial microprocessors/proprietary processors. Source: [Dongarra, et al, Top500 Lists]
Figure 11 Price/Performance of HPC Systems
Figure 12 Memory hierarchy and interconnects
Figure 13 Interconnect Latency and Bandwidth
Figure 14 Sampling of platforms with low controllability (4Q 1997)
Figure 15 Sampling of platforms with moderate controllability (4Q 1997)
Figure 16 Configuration, Attainable, and Maximum Performance of Non-scalable Systems
Figure 17 Configuration, Attainable, and Maximum Performance for Scalable Systems
Figure 18 Minimum, maximum, and end-user attainable performance of selected
Figure 19 Threshold of uncontrollable (attainable) performance
Figure 20 End-user attainable performance of categories of HPC systems
Figure 21 CTP's of sample upperbound applications
Figure 22 Conversion path for vector to parallel code
Figure 23 Grid examples
Figure 24 Weather applications
Figure 25 CTP for current CFD applications
Figure 26 Radar reflection from various shapes
Figure 27 One dimensional parallelization
Figure 28 Data processing in sensor-based applications
Figure 29 Synthetic aperture radar
Figure 30 Origin2000 performance on RT_STAP benchmarks Source: 
Figure 31 Communications between nodes and primary router
Figure 32 Communication patterns between clusters
Figure 33 Contribution of HPC to nuclear weapons development with, and without, test data
Figure 34 Application histogram
Figure 35 Distribution of applications examined in this study
Table 1 Multimedia support in major microprocessor families
Table 2 Memory latency
Table 3 Current HPC Systems and Trends
Table 4 HPC chassis
Table 5 Categories of dependence on vendors
Table 6 Impact of installed base on controllability (circa 4Q 1997)
Table 7 Examples of platform controllability (4Q 1997)
Table 8 Selected HPC systems from Tier 3 countries
Table 9 Categories of computing platforms (4Q 1997)
Table 10 DoD Computational Technology Areas (CTA)
Table 11 Weather prediction models used by Fleet Numerical MOC
Table 12 Global ocean model projected deployment 
Table 13 Weather applications
Table 14 Current and projected resolution of weather forecasting models used by FNMOC
Table 15 Approximation stages in CFD solutions
Table 16 Results of improvements to tiltrotor simulation
Table 17 Results of porting two CFD applications from vector to parallel
Table 18 Evolution of submarine CFD applications
Table 19 Performance of parafoil CFD applications
Table 20 CFD applications
Table 21 Computational chemistry applications
Table 22 Structural mechanics applications
Table 23 Computational electromagnetic applications
Table 24 Submarine design applications
Table 25 Mission parameters for X-band SAR sensors. Sources: [87-89]
Table 26 SIR-C/X-SAR processing times on the Intel Paragon Source: 
Table 27 Signal Processing Systems sold by Mercury Computer Systems
Table 28 Synthetic forces experiments
Table 29 Computing requirements for first-principles simulations. Source: 
Table 30 Nuclear weapons related applications. Sources: [115,116]
Table 31 Applications between 5,000 and 10,000 Mtops
Table 32 Applications between 10,000 and 15,000 Mtops
Table 33 Applications between 15,000 and 20,000 Mtops
Table 34 Categories of computing platforms (4Q 1997)
Table 35 Trends in Lower Bound of Controllability (Mtops)
Table 36 Selected sample of national security applications and the rising lower
bound of controllability
If high-performance computing (HPC) export control policy is to be effective, three basic premises must hold:
1. There exist problems of great national security importance that require high-performance computing for their solution, and these problems cannot be solved, or can only be solved in severely degraded forms, without such computing assets.
2. There exist countries of national security concern to the United States that have both the scientific and military wherewithal to pursue these or similar applications.
3. There are features of high-performance computers that permit effective forms of control.
This study applies and extends the methodology established in Building on the Basics. Its objective has been to study trends in HPC technologies and their application to problems of national security importance in order to answer two principal questions: (a) whether or not a control threshold exists that could be part of a viable export control policy, and (b) what the range of possible thresholds might be.
Industry Trends Impacting Export Control
HPC industry trends having the strongest impact on the export control regime include:
1. Developments that increase the performance of HPC products within given market/price niches, and
2. Developments that enhance scalability and, more generally, the ability to apply the computing power of multiple smaller systems to the solution of a single computational problem.
Some of the most significant developments are advances in microprocessors, interconnects, and system architectures.
Microprocessor performance will continue to improve dramatically through the end of the century. In 1997 nearly all microprocessor developers had volume products above 500 Mtops. In 1998, the first microprocessors to exceed 1500 Mtops will be in volume production. By 1999, some microprocessors will exceed 2000 Mtops; in the year 2000, processors of nearly 7,000 Mtops will reach volume production. Industry projections are that in 2001 microprocessors of 7-10 thousand Mtops will ship. Improvements in performance will come from a combination of more functional units, multiple central processing units on a chip, on-chip graphics processors, and increased clock frequency. Industry feels that such improvements can be made without significant technological breakthroughs this century.
In multiprocessor systems, actual performance is strongly influenced by the quality of the interconnect that moves data among processors and memory subsystems. Traditionally, interconnects could be grouped into two categories: proprietary high-performance interconnects used within individual vendor products, and industry standard interconnects such as local area networks. The two categories represent very different qualities, measured in bandwidth and latency. In recent years, a new class of interconnect has emerged represented by products from Myricom, Digital Equipment Corporation (DEC), Essential Communications, Dolphin Interconnect Solutions, Inc., and others. These "clustering interconnects" offer much higher bandwidth and lower latency than local area networks. While they can be used to integrate a number of individual systems into a single configuration that can perform useful work on many applications, they still have shortcomings compared to proprietary high-performance interconnects. These shortcomings may include lower bandwidth, higher latency, greater performance degradation in large configurations, or immature system software environments.
The implication for the export control regime is that commercially available clustering interconnects, while useful, should not be considered equal substitutes for the high-performance, proprietary interconnects used within most high-performance computing systems today.
The dominant trend in overall system architecture in recent years has been towards either a distributed shared memory system or a hierarchical modular system. Vendors today are pursuing one strategy or the other, or both simultaneously. In distributed shared memory systems, memory is physically distributed, but logically shared. A consequence is that memory access time may not be uniform. In hierarchical modular systems, multiprocessor nodes have memory that is logically and physically shared, while between nodes a distributed memory, message passing paradigm is used. It is unlikely that dramatically different architectures will be used within the next 3-5 years. The implication for the export control regime is that the difference between large and small configuration systems, from the perspective of user applications and systems management, is decreasing. Users of small configurations will be better positioned to test and improve their applications without the need to use the larger systems. The larger systems will only be needed for those runs that require greater resources.
Controllability of HPC Platforms and Performance
Building on the Basics asserted that there were computational performance levels that could be attained so easily that control thresholds set below these levels would be ineffective. The principal factors influencing the so-called lower-bound of controllability are:
1. the performance of systems available from foreign sources not supporting U.S. export control policies;
2. the performance of computing platforms that have qualities (size, price, numbers installed, vendor distribution channels, age, dependence on vendor support) that make them difficult to monitor; and
3. the scalability of platforms.
Under current export control policy, licensing decisions are made largely on the basis of the performance of the specific configuration being sold. When systems are extensively and easily scalable without vendor support, end-users may acquire, completely legitimately, multiple small configurations lying below the control threshold and then, on their own, recombine CPUs and memory to create a single configuration with a performance above the control threshold. For example, some rack-based systems sold in 1997-1998 may be sold in configurations of less than 1000 Mtops, and scaled by competent end-users to over 15,000 Mtops by adding CPU, memory, and I/O boards.
An alternative to current practice is to consider the end-user attainable performance of a system when making licensing decisions, rather than the performance of a specific configuration. The end-user attainable performance is defined as the performance of the largest configuration of an individual, tightly coupled system an end-user could assemble without vendor support, using only the hardware and software provided with lesser configurations. For example, an end-user purchasing a desk-side (mid-range) server with some empty CPU and memory slots could easily add boards and increase the system's performance. Upgrading beyond the number of boards that could fit within the desk-side chassis would require additional hardware (a rack) and usually vendor expertise or software not provided with the original configuration. In this case, the end-user attainable performance would be the performance of a full desk-side chassis. In contrast, the end-user attainable performance of traditional supercomputers such as the Cray vector-pipelined systems is precisely the performance of the configuration installed; end-users are not able to upgrade these systems alone.
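The distinction between configured and end-user attainable performance can be illustrated with a minimal sketch. All figures below (slot counts, per-board ratings, function names) are hypothetical, chosen for demonstration rather than taken from the report's data:

```python
# Illustrative sketch of "end-user attainable performance".
# All figures here are hypothetical, not drawn from the report's data.

def attainable_mtops(installed_boards, chassis_slots, mtops_per_board,
                     user_upgradable=True):
    """Performance (Mtops) of the largest configuration an end-user
    could assemble without vendor support.

    For user-upgradable systems (e.g. a deskside server with empty
    slots), this is the fully populated chassis; for systems the
    end-user cannot upgrade alone (e.g. traditional vector-pipelined
    supercomputers), it is the configuration as shipped.
    """
    if user_upgradable:
        return chassis_slots * mtops_per_board
    return installed_boards * mtops_per_board

# A deskside server shipped half-populated: the licensed configuration
# is 4 x 300 = 1200 Mtops, but the end-user can fill all 8 slots.
assert attainable_mtops(4, 8, 300) == 2400

# A vector machine: attainable performance equals installed performance.
assert attainable_mtops(4, 8, 300, user_upgradable=False) == 1200
```

Under the report's proposal, the license decision would key on the 2400-Mtops figure rather than the 1200-Mtops shipped configuration.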
The concept of end-user attainable performance provides policy makers with a more precise means to distinguish between systems with different controllability characteristics. The following table illustrates a few controllability qualities of major categories of computing systems available in 4Q 1997.
Type                          Units installed    Price                    End-user attainable performance
Multi-rack HPC systems        100s               $750K-10s of millions    20K+ Mtops
High-end rack servers         1,000s             $85K-1 million           7K-20K Mtops
High-end deskside servers     1,000s             $90K-600K                7K-11K Mtops
Mid-range deskside servers    10,000s            $30K-250K                800-4600 Mtops
UNIX/RISC workstations        100,000s           $10K-25K                 300-2000 Mtops
Windows NT/Intel servers      100,000s           $3K-25K                  200-800 Mtops
Laptops, uni-processor PCs    10s of millions    $1K-5K                   200-350 Mtops
Figure 1 Computing platform categories (4Q 1997)
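Read against a control threshold, the platform categories above suggest a simple licensing screen: compare the threshold to each category's attainable ceiling rather than to shipped configurations. A toy sketch (the ceilings are the table's upper figures; the 7,000-Mtops threshold is purely illustrative, not a recommendation of this report):

```python
# Toy licensing screen over the 4Q 1997 platform categories.
# Ceilings (Mtops) come from the table above; the threshold value
# used in the example is illustrative only.

CATEGORIES = {
    "Multi-rack HPC systems": 20_000,   # listed as 20K+; lower bound used
    "High-end rack servers": 20_000,
    "High-end deskside servers": 11_000,
    "Mid-range deskside servers": 4_600,
    "UNIX/RISC workstations": 2_000,
    "Windows NT/Intel servers": 800,
    "Laptops, uni-processor PCs": 350,
}

def needs_license(category, threshold_mtops):
    """True if the category's end-user attainable ceiling exceeds
    the control threshold."""
    return CATEGORIES[category] > threshold_mtops

# With a hypothetical 7,000-Mtops threshold, only the top three
# categories would require licensing scrutiny.
screened = [c for c in CATEGORIES if needs_license(c, 7_000)]
assert screened == ["Multi-rack HPC systems",
                    "High-end rack servers",
                    "High-end deskside servers"]
```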
In establishing a lower bound of controllability, we factored in a time lag needed for a given product's market to mature. In the case of rack-based systems, we conservatively estimated this time lag to be about two years. Smaller systems are sold more quickly and in higher volumes; we have used a one-year time lag for mid-range servers. Given these time lags, we estimate that the end-user attainable performance of rack-based systems whose markets have matured will lie between 15,000 and 30,000 Mtops in the year 2000, depending on the vendor. The end-user attainable performance of mid-range, deskside systems will reach approximately 6,500 Mtops that same year. The latter figure will rise dramatically in late 2000 or 2001 as tens or hundreds of thousands of mid-range servers in the 15,000+ Mtops range are shipped. Configurations of four or eight CPUs, each measuring 4,000-7,000 Mtops, will constitute the "sweet spot" of the mid-range market.
National Security Applications
A major finding of this study is that there is no lack of computationally demanding applications of national security concern, nor is there likely to be in the foreseeable future. The computational requirements of moving to larger problem sizes, finer resolutions, multi-disciplinary problems, etc., create demands for compute cycles and memory that are, for all practical purposes, insatiable. The first basic premise is, and will continue to be, satisfied.
The major change in applications over the last several years has been the extent to which practitioners have used parallel computing platforms not only in research settings, but also in production environments. The combination of mature parallel hardware/software platforms from vendors, platform independent application programming interfaces like the Message Passing Interface (MPI), and industry trends towards microprocessor-based systems have prompted practitioners to make the transition from parallel vector-pipelined platforms to massively parallel platforms for most high-end applications.
The methodology used for this report and its predecessor requires the establishment of an "upper bound" for the control threshold that lies at or above the lower bound, but below the performance requirements of key applications of national security concern, or clusters of national security applications. This study has cataloged in detail an extensive number of national security applications. The national security community must decide which of these have the greatest significance for the nation's security. Are there performance levels at which there is a relatively greater density of national security applications? There appear to be, although these tend to be found around the performance levels of "workhorse" computing systems widely used in the national security community. We have observed application clusters at:
Countries of National Security Concern
The second basic premise of export control policy states that there exist countries of national security concern with the scientific and military wherewithal to pursue computationally demanding applications of national security importance. Capability depends not only on having (1) the necessary hardware and systems software, but also (2) the application codes, valid test data, and correct input data, and (3) the expertise necessary to use the codes and interpret the results correctly. The second and third are often more of a limiting factor than the first. There exist a few clear examples of foreign countries having the expertise necessary to pursue particular applications successfully. Nuclear weapons development and stockpile stewardship is one, although the computational performance necessary for weapons development, given the necessary test data, is at or below that of today's UNIX/RISC workstations. Military-grade weather forecasting is another. A critical question, which we have been unable to pursue satisfactorily in this study, is which countries are able to use HPC productively to pursue which applications. It does not appear that the U.S. government is systematically gathering such intelligence.
Export Control at the Turn of the Century
This study has concluded that the export control regime can remain viable for the next several years, and it offers policy makers a number of alternatives for establishing thresholds and licensing practices that balance national security interests against the realities of HPC technologies and markets. Nevertheless, there are a number of trends that will make the regime less successful in achieving its objectives than it has been in the past. In the future, the probability will increase that individual restricted end-use organizations will be able to successfully acquire or construct a computing system to satisfy a particular application need. A number of factors contribute to this "leakage."
First, if policy makers do decide that systems with installed bases in the thousands are controllable, it is inevitable that individual units will find their way to restricted destinations. This should not necessarily be viewed as a failure of the regime or the associated governmental and industry participants. Rather, it is a characteristic of today's technologies and international patterns of trade.
Second, industry is working intensively towards the goal of seamless scalability, enhanced systems management, single-system image, and high efficiency across a broad range of performance levels. Systems with these qualities make it possible for users to develop and test software on small configurations yet run it on large configurations. An inability to gain access to a large configuration may limit a user's ability to solve certain kinds of problems, but will not usually inhibit their ability to develop the necessary software.
Third, clustered systems are improving both in the quality of their interconnects and supporting software. Foreign users are able to cluster desktop or desk-side systems into configurations that perform useful work on some applications. Such systems are not the equivalent of vendor-supplied, fully integrated systems. However, because it is difficult to prevent the construction of clustered systems, the control regime will leak.
Nevertheless, even an imperfect export control regime offers a number of benefits to U.S. national security interests. First, licensing requirements at appropriate levels force vendors and government agencies to pay close attention to who the end-users are and what kinds of applications they are pursuing. Second, the licensing process provides government with an opportunity to review and increase its knowledge about end-users brought to its attention. The government should improve its understanding of end-users of concern so that it can make better decisions regarding those end-users. Finally, while covert acquisition of computers is easier today than in the past, users without legitimate access to vendor support are at a disadvantage, especially for operational or mission-critical applications.
Outstanding Issues and Concerns
Periodic Reviews. This study documents the state of HPC technologies and applications during 1997-early 1998, and makes some conservative predictions of trends over the next 2-5 years. The pace of change in this industry continues unabated. The future viability of the export control policy will depend on its keeping abreast of change and adapting in an appropriate and timely manner. When based on accurate, timely data and an open analytic framework, policy revisions become much sounder, more verifiable, and more defensible. There is no substitute for periodic review and modification of the policy. While annual reviews may not be feasible given policy review cycles, the policy should be reviewed at least every two years.
The use of end-user attainable performance in licensing. The use of end-user attainable performance in licensing decisions is a departure from past practice. It is a more conservative approach to licensing in that it assumes a worst-case scenario, that end-users will increase the performance of a configuration they obtain to the extent they can. By the same token, however, it reduces or eliminates a very problematic element of export control enforcement: ensuring that end-users do not increase their configurations beyond the level for which the license had been granted. If U.S. policy makers do not adopt the use of end-user attainable performance, then the burden of ensuring post-shipment compliance will remain on the shoulders of HPC vendors and U.S. government enforcement bodies. If they do, then post-shipment upgrades without the knowledge of U.S. vendors or the U.S. Government should not be a concern, having been taken into account when the license was granted.
Applications of national security importance. The current study has surveyed a substantial number of applications of national security importance to determine whether or not there are applications that can and should be protected using export controls on high-performance computing. While the study has enumerated a number of applications that may be protected, it has not answered the question of which applications are of greatest national security importance and should be protected. This question can only be answered by the national security community, and it is important that it be answered. If an application area lacks a constituency willing to defend it in the public arena, it is difficult to argue that it should be a factor in setting export control policy.
During the Cold War, when the world's superpowers were engaged in an extensive arms race and building competing spheres of influence, it was relatively easy to make the argument that certain applications relying on high performance computing were critical to the nation's security. Because of changes in the geopolitical landscape, the nature of threats to U.S. national security, and the HPC technologies and markets, the argument appears to be more difficult to make today than in the past. We have found few voices in the applications community who feel that export control on HPC hardware is vital to the protection of their application. Constituencies for the nuclear and cryptographic applications exist, although they are not unanimous in their support of the policy. An absence of constituencies in other application areas who strongly support HPC hardware export controls may reflect an erosion of the basic premises underlying the policy. If this is the case, it should be taken into account; where such constituencies exist, they should enter into the discussion.
Goodman, S. E., P. Wolcott, and G. Burkhart, Building on the Basics: An Examination of High-Performance Computing Export Control Policy in the 1990s, Center for International Security and Arms Control, Stanford University, Stanford, CA, 1995.
This study is a successor to Building on the Basics: High-Performance Computing Export Control in the 1990s  (published in a slightly updated form in ). That study established a framework and methodology for deriving an export control threshold by taking into account both applications of national security concern, and the technological and market characteristics of a rapidly changing high-performance computing (HPC) industry. One objective of the study was to establish a process for updating the policy that would be transparent, objective, defensible, and repeatable. The current study, undertaken two years after the first, applies the methodology and framework of the first study to determine (a) whether or not a control threshold exists that could be part of a viable export control policy, and (b) what the range of possible thresholds might be.
In addition to recommending that the process be repeated regularly, the earlier study recommended a much more comprehensive analysis of applications of national security concern than was possible in 1995. Consequently, this study provides greatly enhanced coverage of national security applications, their computational nature and requirements, and the manner in which such applications are pursued given the changes in the HPC industry.
This introduction provides a brief review of the framework developed in Building on the Basics. Chapter 2 analyzes key trends in the HPC industry from the perspective of those elements of significance to the export control regime. Chapter 3 provides an expanded analysis of the concept of the lower bound of controllability and establishes a set of options for policy-makers in establishing the lower bound of a range of viable control thresholds. Chapter 4 provides extensive coverage of a broad spectrum of national security applications to give policy-makers insight into the upper bound for a control threshold. Chapter 5 integrates chapters 3 and 4 into a concrete set of policy options and implications.
The Basic Premises Behind Export Control Thresholds
The HPC export control policy has been successful in part because it has been based on three premises that were largely true for the duration of the Cold War:
1. There are problems of great national security importance that require high-performance computing for their solution, and these problems cannot be solved, or can only be solved in severely degraded forms, without such computing assets.
2. There are countries of national security concern that have both the scientific and military wherewithal to pursue these or similar applications.
3. There are features of high-performance computers that permit effective forms of control.
If the first two premises do not hold, there is no justification for the policy. If the third premise does not hold, an effective export control policy cannot be implemented, regardless of its desirability.
While a strong case can be made that all three premises held during the Cold War, there have been significant changes that impact this policy. In particular, following the dissolution of the Soviet Union, threats to national security have become smaller, but more numerous; there have been dramatic advances in computing technologies; and the use of HPC within the U.S. national security community has expanded.
If the premises are still valid, it should be possible to derive a control threshold in a way that is explicit, justifiable, and repeatable. If the premises are not valid, then the analysis should clearly illustrate why no effective control policy based on the premises is possible.
Deriving a Control Threshold
The first premise postulates that there exist applications with high minimum computational requirements. In other words, if the minimum computational resources (especially, but not exclusively, performance) are not available, the application cannot be performed satisfactorily. To establish the performance requirements, we asked applications practitioners to identify the computer configuration that they would need to carry out the application. The composite theoretical performance (CTP) of such a configuration was used to quantify the Mtops required by this application.1
In some cases, the configuration used was more powerful than was necessary to do the application satisfactorily. Figure 2 shows the performance of the minimum acceptable configuration and the configuration actually used for the F-22 aircraft design application.
1 The Composite Theoretical Performance is measured in millions of theoretical operations per second (Mtops). Mtops ratings consider both floating-point and non-floating-point operations, and account for variations in word length, numbers of processors, and whether the system is based on a shared memory or distributed memory paradigm.
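The footnote above describes CTP only qualitatively; the official formula is not reproduced in this report. As a rough illustration of the idea of aggregating per-processor ratings with a discount for coupling (the 0.75 coefficient and the function itself are assumptions for demonstration, not the EAR's actual computation), one might sketch:

```python
# Illustrative only: a simplified aggregation in the spirit of CTP,
# NOT the official EAR formula. The 0.75 coupling coefficient applied
# to processors beyond the first is an assumed value for demonstration.

def composite_mtops(per_proc_mtops, n_procs, coupling=0.75):
    """Sum per-processor ratings, discounting processors beyond the
    first by a coupling coefficient."""
    if n_procs < 1:
        return 0.0
    return per_proc_mtops * (1 + coupling * (n_procs - 1))

# Four 500-Mtops processors: 500 * (1 + 0.75 * 3) = 1625 Mtops.
assert composite_mtops(500, 4) == 1625.0
```

The point of such a rating, as the footnote notes, is that the aggregate figure depends on processor count and memory organization, not on floating-point peak performance alone.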
Figure 2 Minimum, actual, and maximum computing power available for F-22 design (CTP, in Mtops): minimum acceptable system, IBM 3090/250 (189); actual design performed on a Cray YMP/2 (958); most powerful system available in 1987-1990, Cray YMP/8 (3,708).
The third premise requires that systems above the minimum performance level of a particular application have characteristics that permit their export and use to be closely monitored, controlled, and when necessary, denied. If there exist systems that cannot be controlled whose performance exceeds the minimum necessary to carry out the application, then the U.S. government will be unable to control that application solely by trying to deny the necessary computer power.
In order for the control regime to be justifiable, there must be some applications that satisfy both the first and third premises.
Over time, the computational performance of the most powerful uncontrollable system(s) rises. As it rises, it overtakes the minimum computing requirements of individual applications. If the minimum and actual performance levels for particular applications are plotted over time, the dynamic may be illustrated as shown in Figure 3. For illustration purposes, this figure uses only hypothetical data.
Figure 3 Establishing a range for a viable control threshold
Under current practice, a control threshold is established at a particular point in time and remains in effect until revised through policy changes. The set of viable thresholds--those that satisfy the three basic premises--must at the same time lie between two bounds: the 'lower bound' is determined by the level of the most powerful uncontrollable systems, while the 'upper bound' is determined by those national security applications whose minimum performance requirements lie above the lower bound. In Figure 3, applications N, P, and R are those that can be protected by export control of HPC hardware.
The selection of a specific control threshold further takes into account the nature of the computer market for systems whose performance, measured in Mtops, lies within the range between the lower and upper bounds. Ideally, a control threshold would be established below a point where there were numerous or particularly significant applications of national security concern, but above the performance level of systems enjoying large markets.
The following chapters supply data for the model. Chapter 2 discusses industry trends that impact the export control policy. Chapter 3 discusses the determination of a lower bound. Chapter 4 discusses the computational requirements of a substantial number of applications of national security concern. Chapter 5 integrates the results of the previous chapters into a discussion of policy options and implications.
Goodman, S., P. Wolcott, and G. Burkhart, Executive Briefing: An Examination of High-Performance Computing Export Control Policy in the 1990s, IEEE Computer Society Press, Los Alamitos, 1996.