Imagination Engines, Inc., Home of the Creativity Machine


The Big Bang of Machine Intelligence!

    The simple, elegant, and inevitable path to human level machine intelligence and beyond, the Creativity Machine Paradigm, US Patent 5,659,666 and all subsequent foreign and divisional filings.




The Warhead Design Creativity Machine

Stephen L. Thaler, Ph.D.

President and CEO, Imagination Engines, Inc.

(Adapted from Weapon Systems Technology Information Analysis Center (WSTIAC), Volume 3, Number 1, December, 2001)

1.0   Introduction

Artificial neural networks (ANNs) were victimized by some very bad press in the mid-1980s, as some overly zealous researchers promised more than they could possibly deliver. Since that time, these unique computational devices, inspired by the very benchmark of intelligence, the human brain, have become significantly more advanced in their modeling capabilities, autonomy, and ease of use.

Perhaps the most significant development in this branch of AI has been the discovery of a powerful neural network paradigm that enables ANNs to perform beyond their customary role as pattern detectors, allowing them to function as novel pattern generators that 'invent' seminal concepts and plans of action. The machine intelligence enabled by this neural net paradigm shift, called a "Creativity Machine" (Thaler, 1995A, 1996A, 1997B; Yam, 1995B; Holmes, 1996B; Brown, 1997A), allows computers to automatically generate new and useful ideas via the pattern-based computing scheme utilized by the brain. In contrast to the older schools of artificial intelligence, such as expert systems, where sundry rules and atoms of knowledge must be laboriously assembled and entered into a computational system, such Creativity Machines capitalize upon the extraordinary ability of ANNs to simply 'watch' a conceptual space and implicitly absorb the fundamental entities and heuristics therein. Instead of the extremely inefficient and computationally costly 'hit and miss' approach used by genetic algorithms to discover the most robust concepts, the Creativity Machine (CM) follows the approach of the human cortex, not that of blind evolution. Intelligently, rather than randomly, this advanced neural system arrives at highly optimized concepts, including specially tailored and adaptive ordnance designs.

1.1 The Current Neural Network Paradigm - The Multilayer Perceptron

ANNs may be very succinctly described as computer codes that autonomously write themselves as they are exposed to representative input and output patterns. During this process of autonomous code writing, a system of computer-simulated on-off switches, called computational neurons, effectively wires itself together using what I speak of tongue-in-cheek as a 'mathematical spanking.' In practice, such spankings are achieved through a series of weight-update equations, built from partial derivatives of the network's error, that constitute a process called back-propagation (Rumelhart, 1986). Using this methodology, a series of input patterns is fed through successive layers of neurons and their interconnecting weights, at first producing erroneous output patterns. The network's errors, represented by the difference between actual and desired outputs, then propagate from the output end to the input end of the ANN via the back-propagation equations, iteratively correcting the numerical values of the weights connecting the neurons. After a sufficient number of back-propagation cycles, the connection weights, tantamount to the expansion coefficients in a traditional statistical fit, produce a model that connects the known input-output pairs. When such ANNs consist of three or more layers of computational neurons they are called multilayer perceptrons (MLP), and may be used to fit arbitrarily complex and often nonlinear functional relationships in very high dimensional spaces. The predictive accuracy of such models is contingent upon several factors, including the network architecture (i.e., the number of neurons in each of the successive layers and the functional form taken by each neuron, usually sigmoid) and a variety of parameters intrinsic to the back-propagation algorithm. The reliability of the MLP model fit also shares many of the limitations of traditional statistical modeling techniques, wherein the quality and uniform sampling of data patterns are critical.
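The training cycle just described can be sketched in a few lines. What follows is a minimal illustration only, assuming a single hidden layer, sigmoid neurons, and a toy logical-OR data set; the function name `train_mlp` and all parameter values are hypothetical choices, not taken from the original work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_mlp(patterns, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer perceptron by back-propagation.

    patterns: list of (inputs, targets) pairs of lists with values in [0, 1].
    Returns (predict, history); history holds the RMS error of each epoch.
    """
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    # Each row holds the weights into one neuron; the last entry is its bias.
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w2]
        return h, y

    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in patterns:
            h, y = forward(x)
            # Output error scaled by the sigmoid slope y * (1 - y).
            d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, t)]
            # Propagate the error backward through the output weights.
            d_hid = [hj * (1 - hj) * sum(d_out[k] * w2[k][j] for k in range(n_out))
                     for j, hj in enumerate(h)]
            for k in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    w2[k][j] -= lr * d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= lr * d_hid[j] * v
            sq_err += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
        history.append(math.sqrt(sq_err / (len(patterns) * n_out)))
    return (lambda x: forward(x)[1]), history

# Toy exemplars (logical OR), standing in for real input-output pairs.
patterns = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
predict, history = train_mlp(patterns)
print(f"RMS error: first epoch {history[0]:.3f}, last epoch {history[-1]:.3f}")
```

The two weight tables play the role of the 'expansion coefficients' mentioned above, and the error is pushed from the output layer back toward the input layer on every presentation.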

Close examination of a trained MLP reveals that the system of computational neurons, along with their connection weights, has assumed the form of 'logic circuits,' both discrete and fuzzy, that embody the many rules or heuristics underlying the input-output transformation. After all, the network is developing an accurate model, so it is compelled to 'grow' an effective internal logic that is tantamount to a theory. Therefore, with sufficient dissection of the trained MLP, we uncover the critical neural circuitry that combines the various features of the input patterns to faithfully produce the individual features of the output patterns.

In Figure 1, we show the weight architecture of a multilayer perceptron that accurately maps the design parameters of a potential warhead to its fragmentation field. It has been trained by supplying a variety of known warhead design specifications (i.e., length, width, casing thickness, explosive type and weight, geometry, striation pattern, etc.), and the associated fragmentation patterns (i.e., Z-data tables that have been generated from arena testing). Training input patterns consist of these design specifications and a polar sampling window Δθ, centered at θ. Associated training output patterns consist of the initial fragment velocity at the polar angle, the major fragment weight groups involved, and the numerical count of fragments within these respective weight groups. Once trained, we may enter as network inputs some set of design parameters for the warhead and a polar sampling width Δθ, and then progressively ramp the value of θ to generate the fragmentation field of some real or hypothetical warhead design.

Figure 1. A multilayer perceptron (top) has trained upon multiple examples of warhead design specifications and corresponding arena test data (i.e., fragment velocities and fragment distributions). Applying some warhead design specification to the inputs of this network, we may set the polar angle input, θ, and the angular collection window, Δθ, to calculate the resulting initial fragment velocity, weight groups, and weight group counts within that polar sampling window. (Note: in arena tests, the fragment distribution captured by wall board spanning Δθ is assessed through manual fragment counts.)
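The angular sweep described above amounts to a simple loop over sampling windows. The sketch below assumes contiguous windows of equal width whose centers run from 0 to 180 degrees; `sweep_field` and `placeholder_model` are hypothetical names, and the placeholder is an arbitrary function with no physical meaning, standing in for a trained network.

```python
def sweep_field(model, design, d_theta=10.0):
    """Query `model` at successive polar angles between 0 and 180 degrees.

    model:  callable taking design + [theta, d_theta] and returning a list
            of outputs for that sampling window.
    Returns a list of (theta, outputs) rows, one per window.
    """
    rows = []
    theta = d_theta / 2.0                      # center of the first window
    while theta < 180.0:
        rows.append((theta, model(list(design) + [theta, d_theta])))
        theta += d_theta
    return rows

# Hypothetical stand-in for a trained network: NOT a physical model.
def placeholder_model(inputs):
    *design, theta, d_theta = inputs
    return [sum(design) * d_theta / 180.0, theta]

table = sweep_field(placeholder_model, [0.2, 0.5, 0.3], d_theta=30.0)
for theta, outputs in table:
    print(f"theta={theta:5.1f}  outputs={outputs}")
```

With a 30-degree window the loop produces six rows, centered at 15, 45, 75, 105, 135, and 165 degrees.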

In other words, such an MLP trained on arena test data may serve as a Z-data table generator for new candidate warhead designs, or if embedded within a warhead, intelligently readapt the warhead inbound toward the target. Currently, using the available arena test data assembled from the Joint Munitions Effectiveness Manual (JMEM), root-mean-square training error is estimated at approximately 5%, reflecting the underlying experimental error in counting fragments following an arena test. Further, such Z-data generators have achieved 8% root-mean-square error in predicting the fragmentation field for warhead designs that have been isolated from the original training examples, to provide a test set to gauge the network's overall predictive power.
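As a rough illustration of how such figures are obtained, the sketch below computes a plain root-mean-square error and generates leave-one-out splits, in which each design is withheld in turn from training. How the quoted percentages were normalized is not stated in the source, so no normalization is attempted here; the function names are hypothetical.

```python
import math

def rms_error(predicted, observed):
    """Root-mean-square difference between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))

def leave_one_out(examples):
    """Yield (training_set, held_out_example) pairs, one per example."""
    for i, held_out in enumerate(examples):
        yield examples[:i] + examples[i + 1:], held_out

# Made-up numbers, purely to exercise the two helpers.
example_error = rms_error([0.5, 0.25], [0.5, 0.75])
splits = list(leave_one_out(["A", "B", "C"]))
print(example_error, splits[1])
```

Training error is measured on the patterns the network has seen; the generalization figure comes from the held-out design in each split.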

When contrasted with the very complex and costly first principles calculations performed using hydrodynamic and chemical thermodynamic theories, we see the MLP as a very practical and economical alternative. In effect, the MLP is autonomously devising its own warhead fragmentation theory!

1.2 The Future Neural Network Paradigm - The Creativity Machine

Although very powerful, the MLP approach to warhead design is limited, especially when we begin to consider how best to find a warhead design that will generate the needed beam spray. It would be very convenient, for instance, to use the MLP in reverse, applying some sought fragmentation field at the network's outputs and then observing, at the inputs, the warhead design required to achieve that fragmentation pattern. However, MLPs, by definition, are not bi-directional and only predict through forward propagation. Nor would it help to train a new MLP to map fragmentation patterns to design specifications, for the following reason: the mapping from design specification to beam spray is intrinsically many-to-one. That is, there are potentially multiple warhead designs that may achieve the same fragmentation field. In a computational system that allows only one output pattern for any given input pattern, the design specification appearing at the outputs of the proposed net would effectively represent an average of several design solutions, which may itself be an incorrect one.
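A one-dimensional toy case makes the averaging problem concrete. The sketch assumes a forward map y = x², chosen only because two different inputs share each output; it is not drawn from the original work.

```python
from collections import defaultdict

# Toy forward map: two different "designs" x produce the same "field" y.
def forward(x):
    return x * x

designs = [-2.0, -1.0, 1.0, 2.0]

# A single-valued inverse model that minimizes squared error can do no better
# than predict, for each y, the mean of all x values that produced it.
by_field = defaultdict(list)
for x in designs:
    by_field[forward(x)].append(x)
inverse = {y: sum(xs) / len(xs) for y, xs in by_field.items()}

for y, x_hat in sorted(inverse.items()):
    print(f"target y={y}: inverse predicts x={x_hat}, which yields y={forward(x_hat)}")
```

For both targets the averaged answer is x = 0, which reproduces neither of them: the mean of two valid solutions is not itself a solution.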

One very impractical approach to finding the required warhead design would be to apply appropriately scaled random numbers, representing potential warhead design parameters, to the inputs of the network shown in Figure 1, until closely producing the necessary fragmentation field. The overwhelming problem with such a stochastic approach is that the vast majority of these potential designs, represented by random patterns, would be unphysical or nonsensical. For instance, shell casings may not be large enough to accommodate the prescribed weight of explosive, or the structural portion of the warhead may be far too heavy compared to its yield. These and other myriad constraints must be readily captured and exercised when feeding a stream of potential warhead concepts to an MLP that maps design specifications to fragmentation field.

To build an effective generator of new warhead designs we employ a special kind of neural network called an "imagination engine." To build such an inventive net for this particular problem, we train an MLP to produce an input-output mapping whose output patterns represent known warhead designs[1]. Important to note is that as a result of such training, the net has automatically absorbed all of the implicit constraint equations (i.e., heuristics) that dictate how one design parameter, such as explosive weight, varies with all of the other design parameters. Typically, these constraint relations are captured within the final layer of weights of the MLP.

The effect that enables such a pre-trained MLP to generate a stream of new, potential warhead designs is one that I published (Thaler, 1995A) in a variety of scientific publications in the early to mid 90s. This extremely important neural network phenomenon is described as follows: If we administer transient disturbances (i.e., small algebraic variations) to the connection weights that precede the final weight layers of the trained MLP, the network tends to rapidly output a sequence of plausible output patterns, in this case representing potential warhead designs. The observation that each of these output patterns represents a physically realistic design owes itself to the fact that each disturbance dealt to a connection weight exercises the absorbed and implicit constraint equations. In effect, the network is 'dreaming' new warhead designs. The underlying driver behind these emerging concepts is what I call the "virtual input effect." (The network is hallucinating outputs, since inputs, tantamount to external sensory stimuli, are not being applied to the network.)

The most direct approach to building such imagination engines is through the use of what is called an auto-associative network. In training such a network, known warhead designs are shown to it as both the input and the output patterns. Following training, when any known warhead design specification is applied to the inputs of this net, it is 'deconstructed' through the preliminary weight layer(s), and then 'reconstructed' at the MLP outputs, using the various constraint rules contained within the output weight layer. As in the general case of MLPs, the application of mild disturbances to the preliminary layer of connection weights in the auto-associative MLP produces plausible warhead designs at the net's output layer.
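The auto-associative scheme can be sketched on toy data. The example below assumes three abstract features in [0, 1] obeying a made-up rule c = (a + b) / 2, a 3-2-3 sigmoid network, and multiplicative noise on the preliminary layer only; every name and number is a hypothetical choice, and the code is not IEI's implementation.

```python
import math
import random

rng = random.Random(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy "designs": three features obeying the hidden rule c = (a + b) / 2.
data = []
for _ in range(20):
    a, b = rng.random(), rng.random()
    data.append([a, b, (a + b) / 2])

N_IN, N_HID, LR = 3, 2, 0.5
# Last entry of each row is the bias weight.
enc = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]  # preliminary layer
dec = [[rng.uniform(-1, 1) for _ in range(N_HID + 1)] for _ in range(N_IN)]  # output layer

def forward(x, enc_w):
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in enc_w]
    y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in dec]
    return h, y

# Auto-associative training: each pattern serves as its own target.
for _ in range(500):
    for x in data:
        h, y = forward(x, enc)
        d_out = [(yk - xk) * yk * (1 - yk) for yk, xk in zip(y, x)]
        d_hid = [hj * (1 - hj) * sum(d_out[k] * dec[k][j] for k in range(N_IN))
                 for j, hj in enumerate(h)]
        for k in range(N_IN):
            for j, v in enumerate(h + [1.0]):
                dec[k][j] -= LR * d_out[k] * v
        for j in range(N_HID):
            for i, v in enumerate(x + [1.0]):
                enc[j][i] -= LR * d_hid[j] * v

def dream(seed_pattern, n, frac=0.1):
    """Return n outputs, each under a fresh transient perturbation of `enc`."""
    outputs = []
    for _ in range(n):
        noisy = [[w * (1 + frac * rng.uniform(-1, 1)) for w in row] for row in enc]
        outputs.append(forward(seed_pattern, noisy)[1])   # `enc` is left untouched
    return outputs

candidates = dream(data[0], n=5)
for a, b, c in candidates:
    print(f"a={a:.3f} b={b:.3f} c={c:.3f}  rule residual={abs(c - (a + b) / 2):.3f}")
```

Because only the preliminary layer is disturbed, every output still passes through the trained output layer; the printed residual shows how closely each generated pattern respects the toy rule in a given run.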

To make this effect more relevant, imagine that through years of study, warhead designers have established analytically represented equations of a general warhead, explicitly representing how any design parameter depends upon all others:

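$$
\begin{aligned}
\text{casing diameter} &= F_1(\text{casing length},\ \text{explosive type},\ \text{explosive weight},\ \ldots,\ \text{fusing scheme}) \\
\text{casing length} &= F_2(\text{casing diameter},\ \text{explosive type},\ \text{explosive weight},\ \ldots,\ \text{fusing scheme}) \\
\text{explosive type} &= F_3(\text{casing length},\ \text{casing diameter},\ \text{explosive weight},\ \ldots,\ \text{fusing scheme}) \\
&\ \ \vdots \\
\text{fusing scheme} &= F_N(\text{casing length},\ \text{casing diameter},\ \text{explosive type},\ \ldots,\ \text{casing striation})
\end{aligned}
\tag{1}
$$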


Then, implementing this model on a digital computer, random variations can be administered to any arbitrary design parameter, as the computer algorithm automatically adjusts all others according to these master warhead equations (1). The truly amazing fact is that the imagination engine rapidly absorbs such equations from raw warhead design data, in a matter of seconds or minutes, and then effectively generates myriad design variations by exercising all of these constraint equations in parallel as the network's connection weights are perturbed.

In addition to experimenting with different values of design parameters, any small numerical perturbations delivered to the imagination engine's connection weights effectively soften or break the implicit rules absorbed therein. To this end, IEI has developed techniques wherein perturbations transiently 'hop' among the connection weights of the imagination engine, rapidly experimenting with rule softening. One simple parameter, the magnitude of the mean synaptic perturbation (expressed as the fractional perturbation to each connection weight), is adjusted according to how drastically we intend to bend the usual design rules. The new warhead concepts are then relayed to an MLP of the type shown in Figure 1, which maps each of these could-be designs to the resulting predicted fragmentation field. The governing algorithm then accumulates a wealth of potential design specifications that will provide a desired beam spray pattern.
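One way to picture a transient, hopping perturbation is the sketch below. It assumes that a small random subset of weights is rescaled on each pass and restored afterward, with `magnitude` bounding the fractional change to each chosen weight; IEI's actual techniques are not described in the source, so the function name and every detail here are assumptions.

```python
import random
from contextlib import contextmanager

@contextmanager
def hopping_perturbation(weights, magnitude, n_sites, rng):
    """Transiently scale `n_sites` randomly chosen weights, then restore them.

    weights:   flat list of connection weights, modified in place.
    magnitude: fractional perturbation; each chosen weight w becomes
               w * (1 + magnitude * u), with u drawn uniformly from [-1, 1].
    """
    sites = rng.sample(range(len(weights)), n_sites)
    saved = {i: weights[i] for i in sites}
    try:
        for i in sites:
            weights[i] *= 1 + magnitude * rng.uniform(-1, 1)
        yield sites
    finally:
        for i, w in saved.items():          # the disturbance is transient
            weights[i] = w

rng = random.Random(0)
weights = [0.8, -1.2, 0.4, 2.0, -0.5, 1.1]   # made-up values
original = list(weights)
changed_counts = []
for _ in range(4):                           # each pass hops to new sites
    with hopping_perturbation(weights, magnitude=0.05, n_sites=2, rng=rng) as sites:
        # A real system would run the network here and collect its output.
        changed_counts.append(sum(w != o for w, o in zip(weights, original)))
```

Raising `magnitude` corresponds to bending the absorbed rules more drastically; keeping it small yields only mild variations.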

We belatedly note that whereas any MLP could have been used to produce output patterns representing known warhead designs, the auto-associative net is most expedient, for the following reason: We need not seek out associated data that would be used to create a hetero-associative mapping, wherein a warhead inventory code, for instance, is mapped to a design specification. In contrast, all that we require are numerous examples of known warhead design specifications. Furthermore, the auto-associative version of the imagination engine is extremely useful for studying variations on the prototypical designs offered through the warhead training exemplars.

To achieve such design variations, we may apply some known warhead design specification to the imagination engine's inputs, while simultaneously administering small, transient perturbations to connection weights in the preliminary weight layer. The result is that the stream of output patterns, each a new potential warhead, represents small design modifications to individual features of the warhead, as the other design features change in a self-consistent way to generally preserve the absorbed design heuristics.

2.0   The Warhead Design Creativity Machine

A simple Warhead Design Creativity Machine that has already been tested[2] is shown in Figure 2. There, we show the internally perturbed auto-associative imagination engine whose inputs are momentarily set to established warhead designs. The synaptic perturbations drive the network to produce plausible variations on these prototypical warheads. These potential warhead designs are then relayed to the lower critic network, which calculates an associated fragmentation pattern. The overall search process terminates when the combined networks discover, to within some predetermined root-mean-square error, the sought fragmentation pattern. The associated design specification now represents the sought warhead that propels the desired number of fragments into the required polar angles.
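The generate-and-critique loop described above can be sketched as follows. The candidate stream and the critic are stand-ins chosen for illustration (a fixed list and an arbitrary arithmetic function); `creativity_search`, `toy_critic`, and the tolerance value are hypothetical.

```python
import math

def rms(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def creativity_search(candidates, critic, target, tolerance, max_trials=10000):
    """Return (design, predicted, error) for the first candidate whose
    critic-predicted pattern lies within `tolerance` RMS of `target`,
    or None if the stream or the trial budget runs out first.
    """
    for trial, design in enumerate(candidates):
        if trial >= max_trials:
            break
        predicted = critic(design)
        error = rms(predicted, target)
        if error <= tolerance:
            return design, predicted, error
    return None

# Hypothetical stand-ins: a fixed stream of candidates and a toy critic.
def toy_critic(design):
    return [design[0] + design[1], design[0] * design[1]]

stream = iter([[0.1, 0.9], [0.4, 0.6], [0.5, 0.5], [0.7, 0.3]])
result = creativity_search(stream, toy_critic, target=[1.0, 0.25], tolerance=0.005)
print(result)
```

In a full system the stream would come from the perturbed imagination engine and the critic would be the trained MLP of Figure 1; here the third candidate is the first to fall within tolerance.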


Figure 2. A Simple Warhead Design Creativity Machine. Starbursts within the imagination engine produce a series of plausible warhead designs, while the lower, critic net calculates the associated fragmentation pattern. This process continues until the sought fragmentation field is generated. θ may then be varied, at set Δθ, to explore the full angular dependence of the fragmentation field.

We note that in this initial effort, with only a limited sampling of warhead design specifications, it was only possible to consider a subset of possible design parameters that included length, diameter, case thickness, weight, explosive type, explosive weight, and the nature of fusing (nose or tail initiation). Rather than solve for the design specification required to give the full fragment distribution, we solved for a design specification projecting the required fragmentation distribution into some polar angle. Then, we were able to explore the full polar dependence of a promising design by maintaining the candidate design specification at the imagination engine's inputs and ramping the value of θ from 0 to 180 degrees. This process is shown in Figure 3, depicting the system's user interface, wherein we modify the characteristics of an exemplary fragment distribution and then allow the Creativity Machine to solve for the required design specification.


Figure 3. User Interface for Warhead Design Creativity Machine. Data point indicated by red arrow is modified to alter the fragmentation pattern. The underlying neural networks then solve for a warhead design specification potentially delivering this fragmentation field. Thereafter, the system may calculate comparative plots showing the variation of fragmentation pattern as the various design parameters are varied (right).

3.0   Rule Extraction from the Warhead Design Creativity Machine

Since the critic network in Figure 2 has formed a highly accurate mapping between warhead design specifications and resulting fragment field, the necessary 'logic circuits' have developed within the critic net that reflect the underlying physics of warhead detonation. Using IEI's proprietary weight pruning schemes, we may strip away all but the relevant connections to visually display these important relationships. In Figure 4, for example, is a selectively skeletonized critic network that reveals an important connection trace linking the Boolean choice of tail initiation (yes or no) to the resulting fragment spread. From this trace we may readily discern that choosing tail initiation enhances the lower fragment weight groups, while opting out of tail fusing (i.e., nose initiation) augments the higher weight groups.
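IEI's pruning schemes are proprietary and not described here, so the sketch below swaps in plain magnitude pruning to convey the general idea: weak connections are zeroed, and the surviving routes from one input to the outputs are listed with their net sign. The weights are made up and the function names are hypothetical.

```python
def skeletonize(weights, threshold):
    """Zero every connection whose magnitude falls below `threshold`."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

def trace_paths(w_in_hid, w_hid_out, input_index):
    """List surviving (hidden, output, sign) routes leaving one input.

    w_in_hid[j][i] connects input i to hidden unit j;
    w_hid_out[k][j] connects hidden unit j to output k.
    sign is +1 for a net excitatory route and -1 for a net inhibitory one.
    """
    paths = []
    for j, row in enumerate(w_in_hid):
        if row[input_index] == 0.0:
            continue
        for k, out_row in enumerate(w_hid_out):
            if out_row[j] != 0.0:
                sign = 1 if row[input_index] * out_row[j] > 0 else -1
                paths.append((j, k, sign))
    return paths

# Made-up weights for a 2-input, 2-hidden, 3-output net (illustrative only).
w1 = [[1.9, 0.05], [-0.02, 1.4]]
w2 = [[2.1, 0.1], [-1.7, 0.03], [0.04, 0.9]]
paths = trace_paths(skeletonize(w1, 0.5), skeletonize(w2, 0.5), input_index=0)
print(paths)   # [(0, 0, 1), (0, 1, -1)]
```

Read this way, input 0 excites output 0 and inhibits output 1 through hidden unit 0, which is the kind of relationship a skeletonized connection trace makes visible.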

In a similar manner, we may skeletonize other connection traces and then selectively interrogate them to discern highly valuable interrelationships that may be of immense use to weapons designers. More advanced rule extraction techniques that have been pioneered by IEI are capable of converting a neural network into a semantic network, wherein logical rules governing the relationship between individual design features and the resulting fragmentation characteristics are automatically depicted through diagrams incorporating natural language (English).


Figure 4. Rule Extraction from the Underlying Neural Networks. Here the critic network of the Warhead Design Creativity Machine has pruned itself of unnecessary connection weights to reveal how the presence or absence of tail initiation toggles the distribution of weight groups. The upper link contains an excitatory connection (red), while the lower link (green) is inhibitory. Therefore the upper link labeled "tail initiation = true" turns on the weight groups 4, 5, 6, and 7, while the lower trace marked "tail initiation = false" turns on weight groups 9, 10, and 11. The color legend shows the relative strengths of the weights involved in these connection traces.

4.0   Planned Improvements in the Warhead Design Creativity Machine

The foremost problem faced in training the Warhead Design Creativity Machine has been a shortage of available arena test data. In this first proof-of-principle experiment, only seven distinct and unclassified warhead designs were considered, combining JMEM derived design specifications with the results of corresponding arena tests carried out at the AFRL Munitions Directorate at Eglin AFB. The primary limitation of the overall system currently is the lower critic network pictured in Figure 2, which has demonstrated an approximate 8% root-mean-square prediction error for warhead designs swapped out of the original training set and then submitted to the network to sample generalization capacity. Because of this current limitation, the accuracy of the Creativity Machine may be lacking, especially as the imagination engine generates designs far removed from those used as training exemplars. For this reason, trial runs of this CM were conducted at low perturbation levels to produce warhead designs that were only slight variations on those seen by the imagination engine through training.

Based upon past Creativity Machine projects there is no doubt that as additional training exemplars are supplied, the ability of this architecture to produce reliable new designs will vastly improve. Likewise, as observed in past CM efforts, there will inevitably be a bootstrapping phase wherein new recommended designs are actually built and tested. Iteratively, through cumulative mistakes, Creativity Machines, just like people, become progressively more reliable in producing new and useful concepts.

5.0   Automation of the Warhead Design Creativity Machine

Recent advances at IEI have led to a totally new type of MLP known as a "Self-Training Artificial Neural Network Object" or "STANNO" (Thaler, 1996C, 1998). To make a very long story short, these neural network class templates contain their own integrated training algorithm. The very useful result is that multiple, self-training networks may be instantiated and combined into cascade structures, such as the Creativity Machine shown in Figure 2. Each of these STANNOs may then train in situ, without the necessity of removing and individually training each upon its associated exemplar database. Since most of the subtleties ordinarily dealt with by neural network practitioners are handled automatically within the STANNO, the overall Creativity Machine may be trained by those with little or no expertise in the area. This autonomy is extremely valuable to those working with classified, need-to-know data. Appropriately cleared personnel, without extensive knowledge of neural networks, may now train and exercise Creativity Machines through simple menu selections and mouse clicks.
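The idea of a network object that carries its own training rule can be illustrated very loosely as below. This is not the STANNO itself, whose internals are not given in the source; it is a single linear unit with a built-in delta rule, and the class name, learning rate, and toy mapping are all assumptions.

```python
class SelfTrainingUnit:
    """A linear unit that carries its own training rule (the delta rule)."""

    def __init__(self, n_inputs, learning_rate=0.1):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * v for w, v in zip(self.weights, x))

    def observe(self, x, target):
        """Present one exemplar; the unit adjusts itself and returns its error."""
        error = target - self.predict(x)
        for i, v in enumerate(x):
            self.weights[i] += self.learning_rate * error * v
        self.bias += self.learning_rate * error
        return error

# Two units cascaded and trained in place, with no external training code.
first, second = SelfTrainingUnit(2), SelfTrainingUnit(1)
exemplars = [([0, 0], 1.0), ([1, 0], 3.0), ([0, 1], 0.0), ([1, 1], 2.0)]  # y = 1 + 2a - b
for _ in range(500):
    for x, y in exemplars:
        first.observe(x, y)                          # stage 1 learns x -> y
        second.observe([first.predict(x)], 2 * y)    # stage 2 learns y -> 2y

print(first.predict([1, 1]), second.predict([first.predict([1, 0])]))
```

The calling code only presents exemplars; all weight adjustment happens inside the objects, which is what allows them to be wired into a cascade and trained where they sit.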

The autonomy of the STANNO has also contributed to the development of large neural networks that may be implemented as client-server applications. Thus, some centralized neural network may be collaboratively trained by multiple users around the country via secure TCP/IP connections. In this manner multiple services could be supplying arena test data to a master neural network that is able to absorb the implicit rules relating warhead designs to their resulting potencies. Through the client applications, users could independently query this central network for Z-data tables of hypothetical warhead designs. Likewise, the component networks of a warhead design Creativity Machine could be trained and interrogated in the same client-server scheme.

6.0   Planned Refinements to the Warhead Design Creativity Machine

The next obvious step in improving the Warhead Design Creativity Machine is to enlarge the critic network to calculate the full fragmentation field (i.e., 0 < θ < 180 degrees) at once. To accomplish this task, much more arena test data will be required than was utilized in the prototype system, perhaps resorting to physics-based supercomputer simulations of arena tests. Once completed though, Z-data files generated for hypothetical weapons designs may be output to existing ray trace programs such as MEVA, to calculate blast effects upon various types of target structures (e.g., buildings and bunkers).

Additional critic modules may be added that monitor for other kinds of effects, including collateral damage to personnel, structures, and materials. Furthermore, additional Creativity Machine levels may be added that generate new potential chemical explosives that may be incorporated into the candidate warhead designs. Such an ancillary explosive materials Creativity Machine could be especially critical as new potential explosive materials, unaccompanied by thermodynamic data, are synthesized. Considering that extant codes such as JAGUAR rely upon chemical free energies of formation, this explosives discovery system would offer an immediate alternative to this highly successful code, perhaps absorbing that algorithm, and then refining itself through exposure to the behavior of new explosive materials and their highly irreversible thermodynamics.

7.0   The Future of Creativity Machines

Implicit in the discussion of this warhead discovery system was the message that the Creativity Machine paradigm is a perfectly general methodology that may be applied to any conceptual space or area of human endeavor. Since these systems are totally autonomous in their formation, and because no explicit heuristics need be gleaned from their respective conceptual spaces, they may be functional and producing new concepts and courses of action in very little time. For this reason, Creativity Machines, serving as viable alternatives to and improvements over genetic algorithms, are currently providing solutions within a variety of problem areas including materials discovery, personal hygiene products, synthetic music, robotics, communication satellite scheduling, as well as weapons design. Therefore, the sky is the limit as far as future applications of this very fundamental neural network paradigm are concerned, in both the civilian and military sectors, as it offers "AI's best bet" (Bushnell, 2001) to a fully creative brand of artificial intelligence.

8.0   References

Rumelhart, D. E. (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, MIT Press, Cambridge, MA, 328-330.

Thaler, S. L. (1995A) "Virtual Input Phenomena" Within the Death of a Simple Pattern Associator, Neural Networks, 8(1), 55-65.

Yam, P. (1995B) As They Lay Dying ... Near the end, artificial neural networks become creative, Scientific American, May, 1995.

Thaler, S. L. (1996A) Neural Networks That Create and Discover, PC AI, May/June 1996.

Holmes, R. (1996B) The Creativity Machine, New Scientist, 20 January 1996.

Thaler, S. L. (1996C) Self-Training Artificial Neural Networks, PC AI, Nov/Dec 1996.

Brown, A. (1997A) Computers that create: No hallucination, Aerospace America, January 1997.

Thaler, S. L. (1997B) Device for the autonomous generation of useful information, issued 08/19/1997 and divisional patents.

Thaler, S. L. (1998) Non-algorithmically implemented artificial neural networks and components thereof, issued 12/01/1998 and divisional patents.

Bushnell, D. M. (2001). Future Strategic Issues/Future War.

[1] Note that for now, I am noncommittal about the nature of this network's input patterns.

[2] Contract F08630-99-C-0090 P00002 with AFRL/MNF, Eglin Air Force Base, FL.

© 1997-2020, Imagination Engines, Inc. | Creativity Machine®, Imagination Engines®, Imagitron®, and DataBots® are registered trademarks of Imagination Engines, Inc.

1550 Wall Street, Ste. 300, St. Charles, MO 63303 • (636) 724-9000