© 1997, Stephen L. Thaler, sthaler@imagination-engines.com, Imagination Engines, Inc., St. Louis, MO, USA, http://www.imagination-engines.com/, Presented to the Mind II Conference, Dublin, Ireland, 1997
Creativity is perhaps the most celebrated of human capacities, embraced by the human potential movement and revered in the same light as other "folk" attributes such as

In taking this purely physical tack we must realize that, just like swinging doors and molecules, the brain is a dynamical system endowed with various degrees of freedom. For a door, the single degree of freedom describing its state is the rotation angle about its hinge axis. For a polyatomic molecule, the degrees of freedom include the many allowed vibrational and rotational modes into which all conceivable motions of constituent atoms may be resolved. For the brain, the allowed states are its neural activation patterns, each one of which represents some memory, sensation, or idea. We note that, just as in the simpler physical systems, the total number of possible degrees of brain freedom is finite and attributable largely to the existence of electrochemical constraints (i.e., long-term potentiation) that bar the arbitrary activation of any given neuron or group of neurons within this sizable collective system.

Recognizing that any of these
dynamical systems can only evolve in a manner dictated by internal constraints,
we anticipate that the introduction of any random perturbation to these systems
will drive them only through their allowed degrees of freedom. Therefore, the
door will only move through its hinge axis when subjected to random jarring. A
water molecule will respond to impact with other molecules by executing only its
allowed translational, vibrational, and rotational modes. Likewise, when bathed
in internal chaos, the cortex can only move through its allowed manifold of
electrochemical states, each encompassing some idea or notion, whether mundane
or profound. The result is what we commonly term

Obviously, as long as the systems of electrochemical constraints in the brain remain intact, there can be no more seminal thought than the turnover of preexisting memories. One avenue toward creating new and unique activation patterns, and hence original ideas, is to destroy electrochemical constraint relations within cortical networks. Theoretically, exercising this single option opens the door not only to an intriguing model of creative cognition but also to a powerful computational paradigm.
Rumelhart
and colleagues (1986) emphasized the
utility of parallel distributed processing systems as constraint satisfaction
networks in their pioneering work. Using "hand-wired" competitive
networks exposed to various room schemata, they were able to demonstrate a
primitive brand of creativity in which novel, yet plausible furniture
combinations were predicted. Using the well-known principle of

Recently I have demonstrated (Thaler, 1995, 1996a, b, c)
that a trained artificial neural network supplied no inputs whatsoever, and
driven by stochastic perturbations to its internal architecture may generate
self-consistent schemata related to the conceptual space embodied within its
training exemplars. In short, the network is perceiving something when in fact
there are no presented environmental inputs. Accordingly I have coined the term
"virtual input effect" to describe the phenomenon. To connect this with
Rumelhart’s work, if we were to train a simple auto-associative feedforward
net on numerous examples of room schemata (hence bypassing the tedious Bayesian
statistics used to construct his net), setting the inputs of the network to
values of zero and then randomly perturbing the connection weights from their
trained values, we would observe a progression of network activations
corresponding to plausible furniture schemes. The difference in operating
procedure from Rumelhart’s work is significant, representing the distinction
between

When supplied no external inputs,
the production of meaningful activations by the network relies upon a different
brand of vector completion than is normally discussed. Rather than fill in
incomplete or corrupted input patterns, the net attempts to complete internal,
noise-induced activation patterns within the net’s encryption layers.
Therefore, any local or temporary damage to the network’s mapping is
interpreted by downstream layers as some "familiar" activation pattern
normally encountered upon application of a training exemplar to the network’s
inputs (Thaler, 1995). Because of the many combinatorial
possibilities in perturbing connection weights within a network, we arrive at a
means for generating proportionately more novel schemata than is possible with
input perturbations alone. Furthermore, because the connection traces within a
trained neural network generally correspond to the rules binding the underlying
conceptual space together, such stochastic perturbation schemes serve to soften
these rules, in turn allowing a gradual departure from the known space of
possibilities. The result is a strictly neurological search engine whose
internal noise level may be parametrically increased to achieve progressively
more novel concepts. I call such a chaotic network an imagination engine (IE).

By attaching to the IE a critic
network (termed an
The practicality and successes (see Table 1 for a few examples) of the Creativity Machine paradigm stem from the fact that all networks involved are trained by example. Therefore, as long as historical data exist within any conceptual space, backpropagation or any other neural network learning paradigm may be used to rapidly train the required Creativity Machine networks. This ease of construction has allowed the building of a wide variety of Creativity Machines focused on diverse knowledge spaces, ranging from music composition, to ultrahard materials discovery, to the invention of personal hygiene products. Common to the operation of most
Creativity Machines built to date is a perturbation scheme in which small
disturbances stochastically "hop" among the connection weights of the
network. To parametrize the internal chaos within the IE, the governing
algorithm parcels out
Therefore, during operation the
Creativity Machine may be run under a whole range of operating conditions
governed by the parameters

Obviously the ideal regime for
Creativity Machine operation lies somewhere between the Neo-Darwinian and
Neo-Lamarckian search regimes. To achieve the necessary level of internal
perturbation, the parameters
representing the mean rate of perturbation for any connection weight in the IE and the primary controlling parameter behind the imagination engine. Analytically, the choice of mean
perturbation
We find in general that the most fertile cavitation regime corresponds to mean connection weight perturbations near 0.06. At lower perturbation levels the IE largely revisits training exemplars and their generalizations. At progressively higher levels of connection weight perturbation, the IE produces less constrained and hence more nonsensical possibilities (i.e., noise).

Realizing that the connection
weights of a neural network implicitly contain the rules and schema that bind
together any given conceptual space, the perturbation scheme embodied within
cavitation effectively experiments with these rules by softening them either
individually or in parallel while the AAC judges the utility of the resulting
concepts. A mean connection weight perturbation of approximately 0.06 appears to
be a universal amount by which to soften these internal rules without producing
nonsensical or known concepts. Symbolically representing the constraint
relations within any given neural network as the unit sphere, coherent concepts
that embody most of the useful ideas emerging from an IE fall within a thin
membrane surrounding the
Qualitatively, the Creativity Machine constitutes a compelling model of how novel and mundane concepts may both nucleate within any parallel distributed system. Accordingly, it represents a strongly competitive functional metaphor for how the similarly connectionist brain creates. To search for a closer equivalence between the two systems, it is necessary to establish quantifiable observables and then compare them. Having reduced the description to the most canonical level, we sidestep any top-down subjective description that is generally embellished by human cortical networks.

Within the quantitative dynamical system analogy, physics routinely recruits conjugate observables, such as position and momentum or energy and time, to describe the evolution of almost everything. Assuming no special physical status for the brain, its behavior may be adequately described by similar quantities, represented in phase spaces whose axes correspond to these conjugate quantities. In the most rigorous of portrayals, cortical activation patterns would be described by a huge multidimensional space whose axes would correspond to the on–off status of each of the roughly 100 billion cortical neurons, along with a similarly immense space of rates of change of each of these neuron activations. In essence, this description represents a dichotomy, distinguishing what is being thought (i.e., the exact activation pattern in the former subspace) from how those activations are evolving in time (the latter subspace).

Presently surrendering all hope of understanding exactly what is being thought, we may readily monitor and model the temporal pattern of cognitive turnover. Accordingly, the first clue of a temporal link between the Creativity Machine and human cognition comes when an audible tone is attached to each Creativity Machine discovery. Listening to the resulting stream of alarms, one detects a clustered distribution, with discoveries generally clumped together.
Run at high noise levels, the stream attains the rhythm or prosody (Kosslyn and Koenig, 1992) of human speech, sounding much like a garbled conversation. To quantitatively examine the suggested correspondence to the human cognitive rhythm, we first calculate temporal distributions experimentally for both the Creativity Machine and cognitive streams for human test subjects. The measured temporal behaviors of both neurobiological and computational neural systems are shown to be identical, with both turnover rates derivable from the theory of fractal Brownian motion (fBm).
Intuitively we are well aware of the fact that the temporal distribution of thought shows similar clustering behavior over different temporal regimes. For instance, the musical output of a great composer may show a clustering over time, consisting of lull periods of inactivity peppered with spasms of creative turnover over months or years (Jamison, 1994, in the context of manic-depressive illness). Within the course of a single day, that composer’s musical output may display similar surges and lulls. Likewise, in speaking we tend to produce a grouping of words as some main theme or idea appears to us followed by a noticeable lag as a new train of thought emerges. Similar clustering then appears at the level of sentences and individual words within those sentences. Therefore, to arrive at some convention for measuring temporal distribution, we not only require some means to measure the probability that any thought will accompany any other thought within a given time frame, but we also require some measure of any time-scale invariance involved. The natural way of approaching this problem is in the context of fractal theory where we are accustomed to examining spatial invariance (i.e., the coastline of Great Britain at various levels of magnification, where a satellite view is statistically indistinguishable from a view from several feet).
Consider for instance the generic
temporal stream of events pictured at the top of Figure 5 where we see a
distribution of equally separated events occurring at regular intervals.
Randomly moving statistical sampling boxes of different durations t over the
distribution we will find that the average number of captured events scales as t

In the Mandelbrot analysis, P(m, t) is defined as the probability of statistically measuring m points within a sampling time of duration t. A computer code may calculate this quantity by "dropping" sampling boxes of progressively larger time frames t onto the resulting distribution and then counting the number of bracketed events. For each sampling box of time t, the algorithm may perform multiple random samplings of the distribution. P(m, t) is then normalized such that
$$\sum_{m=1}^{N} P(m, t) = 1$$
for all t, where N is the total number of points within the sampled system. The distribution P(m, t) is then used to define the mass moments
$$M_q(t) = \sum_{m=1}^{N} m^{q} \, P(m, t)$$
where q assumes positive integer values. The fractal dimension, D, is then estimated from the logarithmic derivatives,
$$D = \frac{1}{q} \, \frac{\partial \ln M_q(t)}{\partial \ln t}$$
or from linear plots as shown in Figure 3. Generally, if the same fractal dimension D
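The box-sampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the helper names (`mass_moment`, `fractal_dimension`), the choice to center each sampling box on a randomly chosen event, the number of samples, and the least-squares fit are all introduced here for illustration. It assumes the usual mass-moment form, M_q(t) = Σ m^q P(m, t), with D taken as the slope of ln M_q(t) against ln t, divided by q.

```python
import bisect
import math
import random


def mass_moment(events, t, q, samples, rng):
    """Estimate M_q(t) by dropping boxes of duration t onto the event stream.

    Assumption: each box is centered on a randomly chosen event, so every
    sample captures at least one point (m runs from 1 to N).
    """
    total = 0.0
    for _ in range(samples):
        center = rng.choice(events)
        lo = bisect.bisect_left(events, center - t / 2)
        hi = bisect.bisect_right(events, center + t / 2)
        total += (hi - lo) ** q  # m**q for this sample
    return total / samples


def fractal_dimension(events, box_sizes, q=1, samples=2000, seed=0):
    """Estimate D as the log-log slope of M_q(t) versus t, divided by q."""
    rng = random.Random(seed)
    events = sorted(events)
    xs = [math.log(t) for t in box_sizes]
    ys = [math.log(mass_moment(events, t, q, samples, rng)) for t in box_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of ln M_q against ln t.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope / q


if __name__ == "__main__":
    regular_events = list(range(1000))  # equally spaced events
    print(fractal_dimension(regular_events, [4, 8, 16, 32, 64]))
```

For equally spaced events the estimate comes out near 1, while a clustered stream (for example, event times laid out as a Cantor-like set) yields a noticeably smaller value.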
Twelve volunteers contacted by telephone were asked to name 20 items as quickly as possible for each of the series of topics listed in Table 2, while their responses were digitally recorded. Test subjects thereby tacitly assumed that the objective of the experiment was to note speed, rather than the sought distribution of their thought stream. The desired effect was to minimize the latent period between idea formation and concept articulation, so as to approximate as closely as possible the arrival times of consecutive thoughts. The resulting digital strip-chart recording (as exemplified in Figure 7) was then used to quantify the cognitive event stream by noting the start of each word or phrase on a millisecond time scale. Stuttering, which was rare within this study, was considered the leading edge of any voiced concept. Subsequent analysis of this event stream, by the methods of Mandelbrot analysis, yielded both a total observation time Δt and a fractal dimension D. The combination of the fractal characteristics along with the total time scale required to complete the cognitive task constituted a complete statistical, temporal description of the cognitive event stream. Because the calculated fractal dimension is intrinsically time-scale independent and the measured total time Δt is explicitly time-scale dependent, the two parameters form a complementary set of temporal observables.
In retrospect, it is worth noting that within the context of these cognitive experiments, the resulting fractal calculations appeared independent of any foreknowledge by the test subject of the intent of the experiment. This observation may be testament to the inability of human cognitive faculties to store a large number of thoughts while simultaneously counterfeiting a bogus pattern of articulation. Only in well-rehearsed cases or in reading from written lists could test subjects attain arbitrary speech rhythms.
An auto-associative network was trained by the standard methods of backpropagation (Rumelhart, Hinton, and Williams, 1986) by exposing it to identical binary input–output vectors. The network contained four layers of 10 processing units each, all fully connected between layers, for a total of 300 connection weights. Input and output training exemplars for the network consisted of generic binary memories, ranging from (0, 0, 0, 0, 0, 0, 0, 0, 0, 0) to (1, 1, 1, 1, 1, 1, 1, 1, 1, 1). Once trained, this net was embedded within C code that cyclically supplied perturbations of fixed magnitude s to n randomly chosen connection weights, as described above and depicted in Figure 2.

To normalize the output of this
network to that of the cognitive study, the perturbation time constant dt
was adjusted to a value of 300 msec to correspond to the fastest enunciation
rate of ideas (i.e., the recall of 20 numbers, invariably in ordinal fashion).
Therefore, the maximum rate at which the network could produce its own
activation turnover was adjusted to correspond to the fastest possible cognitive
rate measured experimentally. So normalized, the net was run at the optimal mean
connection weight perturbation of 0.057, yielding a cavitation rate of r = 0.057/0.3 sec = 0.19 sec⁻¹.

Following each distinct redistribution of perturbations among the N weights, fixed inputs of 1/2 were fed through the network with the controlling algorithm noting whether a transition (i.e., 50% change in any output vector component) had occurred. These transitions then constituted the activation turnover of the net. Simulation halted after 20 such transitions, at which time the algorithm applied fractal dimensional analysis to the recorded network output transitions using the same algorithm to determine Mandelbrot measures as was used in the cognitive study.
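The simulation loop just described can be sketched as follows. This is an illustrative reconstruction under several assumptions, not the original C code: random weights stand in for the trained auto-associative net, each perturbation is given a random sign, a "50% change" is read as an absolute change of at least 0.5 in some output component, and each transition is judged against the output recorded at the previous transition. The names `cavitation_rate` and `run_cavitation` are hypothetical.

```python
import math
import random

LAYER_SIZES = [10, 10, 10, 10]  # four layers of 10 units: 3 x 100 = 300 weights


def cavitation_rate(n, s, total_weights=300, dt=0.3):
    """Mean perturbation per weight per unit time, r = n*s / (N*dt)."""
    return n * s / (total_weights * dt)


def forward(weights, inputs):
    """Propagate inputs through fully connected sigmoid layers."""
    activations = inputs
    for layer in weights:
        activations = [
            1.0 / (1.0 + math.exp(-sum(w * a for w, a in zip(row, activations))))
            for row in layer
        ]
    return activations


def run_cavitation(n, s, dt=0.3, transitions_wanted=20, seed=0, max_cycles=100_000):
    """Return the times (in seconds) of the first output transitions."""
    rng = random.Random(seed)
    # Assumption: random weights replace the trained network of the study.
    base = [
        [[rng.uniform(-1.0, 1.0) for _ in range(LAYER_SIZES[k])]
         for _ in range(LAYER_SIZES[k + 1])]
        for k in range(len(LAYER_SIZES) - 1)
    ]
    addresses = [(k, i, j) for k, layer in enumerate(base)
                 for i, row in enumerate(layer) for j in range(len(row))]
    inputs = [0.5] * LAYER_SIZES[0]  # fixed inputs of 1/2
    reference = forward(base, inputs)
    times = []
    for cycle in range(1, max_cycles + 1):
        perturbed = [[row[:] for row in layer] for layer in base]
        # Redistribute n perturbations of magnitude s (random sign assumed).
        for k, i, j in rng.sample(addresses, n):
            perturbed[k][i][j] += rng.choice((-s, s))
        output = forward(perturbed, inputs)
        # Transition: some output component moved by at least 0.5.
        if any(abs(new - old) >= 0.5 for new, old in zip(output, reference)):
            times.append(cycle * dt)
            reference = output
            if len(times) == transitions_wanted:
                break
    return times


if __name__ == "__main__":
    print(cavitation_rate(2, 9.0))      # n*s/N = 0.06 at dt = 0.3 s
    print(run_cavitation(n=2, s=9.0))   # transition times in seconds
```

The returned transition times can be treated as an event stream and analyzed in the same way as the word-onset times from the cognitive study.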
Having amassed roughly 100 cognitive
experiments and a similar number of experiments on IE activation turnover, a
number of curve fitting experiments were carried out to investigate whether any
simple pattern existed between the calculated fractal dimension and measured
time scales for either data base. For both sets of experiments it was found that
D

Empirically, we find that cognition and the chaotic ANN both obey the same law of event turnover, given by
where Δt is expressed in seconds and the proportionality constant represents the mean between the cognitive and ANN results of Figure 8 and all previous studies (Thaler, 1996b). Recasting Equation 4 into exponential form, we obtain the relation
indicating that for both the sampled cognition and the Creativity Machine, there exists a trade-off between the time scale required for a set number of distinct transitions and the inherent clustering of events within the resulting transition sequence.
In the case of human speech, this result is intuitively familiar: a speaker who is familiar with his or her presentation material tends to speak in a relatively linear fashion. In contrast, with more ad lib delivery the speaker’s articulated words tend to be more clustered. In this sense, the relationship embodied in Equation 5 is a quantitative expression of this all too familiar phenomenon of hesitancy. In the plot for the cognitive study, some of the specific tasks were overlaid to display their relative positions along this curve. Within this plot at low fractal dimension (to the left) we find the fairly demanding cognitive tasks of inventing nonsense words beginning with the letter "r" or ways of entering a house. At high fractal dimension (to the right) we find the more rote tasks such as naming 20 numbers or recalling various foods. Likewise, within the temporal characterization of the Creativity Machine output we find a comparable dichotomy between data points falling on the rightmost and leftmost extremes of the plot, where ns/N was maintained constant and equal to 0.06. Leftmost points occurring at a low fractal dimension represent extremes in either s or n (i.e., n = 2 and s = 9 or n = 9 and s = 2). Alternatively, the rightmost points on the plot at the high fractal dimension correspond to intermediate values of both s and n (i.e., n = 4 and s = 4.5).

To illustrate this relationship
between fractal dimension of the network’s output stream and its internal
perturbations, I have trained a small feedthrough net on the results of 100
computer experiments to map both n and s
to the resulting fractal dimension, D

Therefore, literally equating the two processes of human cognition and cavitation within the Creativity Machine paradigm, simple tasks such as counting and rote memory recall seem to involve the distribution of many intermediate strength perturbations among many different connection weights of the system. By contrast, more inventive cognitive forays appear to involve the spontaneous appearance of large perturbations localized to just a few connection weights. Admittedly, it will be harder for the cortical or synthetic network to perform internal vector completion on a large, local spike in connection weight perturbation than on smaller distributed disturbances. For a dramatic internal perturbation, the network cannot easily fall into an attractor basin representing a known training exemplar (i.e., a memory). Such a large perturbation can only disrupt and transform the local attractor basin structure to create new and unique attractors. It is these newly formed attractor basins derived from the established memories that now constitute corrupted memories that may or may not be of utilitarian value to connected associative networks.
We are accordingly drawn to the conclusion that what common parlance calls memories and ideas are derivatives of one another that fall at opposite extremes of network perturbation. In short, inventive thought may be no more than internally generated confabulations nucleated upon large synaptic noise spikes and deemed valuable or interesting by the surrounding cortical networks.
In view of the striking result that both cognitive and Creativity Machine conceptual streams obey the same empirically determined law, it may be worthwhile to develop an ab initio theory, at least for the computational model, to account for the simple, yet all-encompassing relationship contained in Equation 5. Viewed from the perspective of an average neuron embedded within the cavitating IE, the mean activation is essentially performing a random walk, occasionally and sporadically passing through the "all or nothing" activation threshold that toggles the computational neuron into its on or off state. This random walk may be imagined to take place in a series of infinitesimally small steps, each of which is independent of the previous one, thus qualifying the system for analysis via the theory of fBm (Voss, 1988). A general result from the theory of
fractal Brownian motion states that if some time-dependent function F
where H is in the range 0<H<1
and k is a unitary dimensional constant. The fractal dimension, D, of the
resulting functional trace, F
Furthermore, the intersection of
this fractal curve with the time axis generates a set of points, known as the
"zeroset," with a fractal dimension D
where the unitary constant k has the
dimensions of sec

Similarly, if the chaotic net input to the representative mean neuron also performs a random walk, in a series of small independent steps, then the RMS variation in that net input, Δnet, varies as
where
Δt is the observation time and D
Assuming that the average activation for all computational neurons is effectively ½ (i.e., the computational neurons may only activate within the range from 0 to 1), and noting that over any half cycle portrayed in Figure 2 the average change in net input is ns/N within a time frame dt/2, the average net input transition rate for a mean neuron is given by ½ × 2ns/Ndt = ns/Ndt. Substituting this value for Δnet/Δt in Equation 10, we recover the empirically obtained functional form of Equation 4.
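The random-walk picture behind this derivation can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions, not part of the original analysis: it uses an ordinary Gaussian random walk (the H = 1/2 case, for which the zeroset is known to have dimension 1 − H = 1/2) as a stand-in for a mean neuron's net input, and records the steps at which the walk crosses a threshold. The function names and the use of the coefficient of variation as a clustering indicator are choices made here for illustration.

```python
import random
import statistics


def threshold_crossings(steps, threshold=0.0, seed=0):
    """Return the step indices at which a Gaussian random walk crosses threshold."""
    rng = random.Random(seed)
    position = 0.0
    crossings = []
    for i in range(1, steps + 1):
        new_position = position + rng.gauss(0.0, 1.0)
        # A crossing occurs when the walk moves from one side of the
        # threshold to the other between consecutive steps.
        if (position - threshold) * (new_position - threshold) < 0:
            crossings.append(i)
        position = new_position
    return crossings


def interval_cv(crossings):
    """Coefficient of variation of the waiting times between crossings.

    Returns 0.0 when there are too few crossings to form two intervals.
    """
    gaps = [b - a for a, b in zip(crossings, crossings[1:])]
    if len(gaps) < 2:
        return 0.0
    return statistics.pstdev(gaps) / statistics.mean(gaps)


if __name__ == "__main__":
    crossings = threshold_crossings(100_000)
    print(len(crossings), interval_cv(crossings))
```

Evenly spaced events would give a coefficient of variation of zero. Values well above 1, which are typical for such a walk, indicate bursts of closely spaced crossings separated by long lulls, the same qualitative clustering described for the event streams above.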
Hence for a fixed level of internal
network perturbation (ns/Ndt
= constant) the product of the mean neuron’s output zeroset dimension and the
logarithm of the observation time should likewise be constant. We note that the
right side of Equation 11 is related to the logarithm of the total number of
network transitions, N

Exponentiating both sides of Equation 11 and omitting the dimensional constant k leads to
reproducing Equation 5, with
substitution of the optimal cavitation rate ns/Ndt
= 0.19 sec⁻¹
Returning to the equivalent
cognitive result, empirically discovered through the cumulative plot of Figure
8, we must conclude that the cognitive event streams sampled within this study
display the signature of fractal Brownian motion by their adherence to Equation
12. This result strongly suggests a dominant mechanism behind all forms of
cognition, namely the stochastic perturbation of synaptic connections between
biological neurons (or in an equivalent circuit sense, perturbations within the
neuron itself). Furthermore, the quantitative applicability of Equation 12 to
human cognition, as well as the universality of this result for all artificial
neural network structures, indicates that similar levels of network
perturbation, r
The rhythm of concept generation within both human test subjects and the Creativity Machine has been shown to be identical for all data gathered within this study. These results substantiate earlier investigations (Thaler, 1996b) probing the temporal behavior of diverse cavitating neural networks spanning a wide range of sizes, complexities, and connectivities. The remainder of the comparison with human cognition, the singularity and significance of Creativity Machine discoveries and inventions, will always be open to debate, as is the case with any human innovator who must battle against a variety of societal forces (i.e., consensus opinion) and competitive pressures no matter how inherently valuable his or her concepts may be. Nevertheless, the temporal and fractal equivalence between Creativity Machine concept generation and human cognition is striking, strongly suggesting that stream of consciousness, both mundane and novel, follows the same empirically discoverable laws. The fact that the prosody of cortical concept formation shows the signature of fractal Brownian motion strongly suggests that stochastic, and perhaps chaotic, phenomena within biological neural networks are at the heart of all cognition. Since the all-neural Creativity Machine demonstrates identical time evolution with human cognition, and because it is capable of producing both incremental and paradigm shift thinking, we may consider this canonical system to be a potential functional model of human cognition.

By analogy with the computationally transparent Creativity Machine, rote memory recall appears to be the result of a relatively uniform distribution of small perturbations spread across many connection weights. Alternatively, novel concept formation is tied to the sporadic appearance of relatively large and localized perturbations.
Because of their significant effect on the attractor landscape of the network, these larger perturbations may readily alter, merge, and separate specific attractor basins, representing distinct memories and concepts, into modified or perhaps hybridized notions. Further, because the connection weights within an artificial neural network constitute the statistical rules and correlations that bind together any conceptual space, we may view the process of cavitation as a stochastic experiment within the net in which each of the underlying rules and conventions is randomly softened, singly or in parallel, to produce derivative concepts beyond those experienced within network training. Viewed in the context of hopping perturbations, weight disruptions may sporadically congregate on specific connection weight traces constituting what we would normally consider symbolically representable rules. When such disruptions occur, the singled-out conventions are modified, for better or for worse, as judged by the critic network’s response to the emerging concepts. Improvement in the search efficiency of such a system comes in the ability of the policing network to identify and selectively soften those connection traces cumulatively learned by that critic to be essential to the emergence of useful concepts.

In spite of its simplicity, Equation
12 may broadly describe the gamut of human cognition with the perturbation rate
of ns/Ndt
= r

Placed on the same continuum of perturbation, all cognition may be viewed as acts of creativity. Even at the lowest levels of synaptic disruption one idea is supplanted by another in a display of low-level originality, as in everyday stream of consciousness, conversation, or movement planning. The noblest invention, scientific discovery, or artistic inspiration lies at the opposite extreme of this spectrum, where the Creativity Machine model implicates large localized perturbations as the nucleating events. Within either of these regimes, consciousness itself may be no more than the spontaneous invention of significance by associative cortical networks for the endless noise-driven activations of their brethren. Our search for an objective truth regarding the basis of creativity and consciousness alike may be blinded by the capacity of such networks to overwhelm and distract us with multiple drafts (see Dennett, 1991) of the actual underlying processes. This dynamical systems approach is an attempt to circumvent an inevitable philosophical cul-de-sac.
Dennett, D.C. (1991). Consciousness Explained, Little Brown, Boston.
Jamison, K.R. (1994). Manic Depressive Illness and Creativity,
Mandelbrot, B.B. and van Ness, J.W. (1982).
Thaler, S.L. (1996a). Neural Nets That Create and Discover,
U.S. Patent 5,659,666, Device for the Autonomous Generation of Useful Information, Issued 8/19/97

© 1997-2012, Imagination Engines, Inc. | Creativity Machine®, Imagination
Engines®, and DataBots® are registered trademarks of Imagination Engines, Inc.