- Science with Personality 
 

MileGu.com 


 
 

Epsilon Machines - Occam’s mathematical Razor

 

Suppose you had a black box, and all you could measure from it are  streams of data, which for the sake of simplicity, are streams of bits. 

10100010010…
00101100100…

We  would like to construct a theory that explains its behaviour. Indeed, this is essentially what science is about - the construction of theories to explain phenomena we observe. 

Of course, there’s an infinite number of different constructions that explain output. We can explain the ways the stars move by saying individual angels moved them each day, but it wouldn’t be a good explanation since you’re left with the question of why the angels are moving them that way. Essentially, good explanations should compress information.

So, our basic goal is too

 Construct the simplist machine that accurately captures the patterns within the data.

In this article, I’ll outline one possible approach, known as epsilon-machines, developed by C. Shalizi and J. Crutchfield.

Machines that Generate Strings - A Simple Example:

Essentially, every machine can be thought of as a physical system that can be in one of N possible physical states. The dynamics of the system can then be defined as transitions between different states.  Since we want a system that can produce bitstrings, we’re going to add another condition. At every discrete timestep, the system is going to emit a bit, which is determined by what dynamics is happening at that timestep.

For example, one simple system as a two-state system, whose states we label A and B that oscillates between the A and B, emiting a 1 or a 0 depending on its current state. Such a system, when you run it, will clearly always give bitstrings of the form

..1010101010101…

What we want to do is the reverse. We have the bitstring above… and we want to reverse engineer the two state machine.

Any physical system can exists in a number of different states. Thus, the first step for us is to introduce bare minimum number of physical states that allow us to explain the output.

We begin with a finite state machine, where all the information about the future dynamics and hence output of the machine is determined by its current state. Any extra information would be redundant. Thus, the idea is to take a string of bits up to a certain point (the history), and ask what information is required to predict the next bit.

Therefore, we group the set of all possible output histories  into equivalence classes, such that replacing one history with another history within the same class does not alter does not affect our prediction of its future evolution. We refer to these classes as  Causal States

In our example, for instance, we observe that all strings alternate between 0 and 1. So, any string the ends with a 0 will have identical future evolution (the same also applies to 1). Thus, we can define a machine with two causal states

S_1=\{1,01,101,\ldots\}
S_2=\{0,10,010,\ldots\} 


Epsilon Machines

 

The next step  is to model of dynamics. That is, at each discrete time step, our system jumps from one causal state, to another, and in process, emits a bit. We model these transitions by transition matrices, where T^{s}_{i,j} represents the probability of transitioning from S_i to S_j while emitting the bit s. Thus, this simple example, features two matrices, T^{(0)} and T^{(1)}, with values as listed in the figure above. 

The Formalities:

With the intuition afforded by out example, we can define an epsilon machine more generally by the ordered pair \{\mathbf{S},\mathbf{T}\}, where \mathbf{S} is the collection of causal states, and \mathbf{T} the collection of matrices the define how the states transform.

In out simple example, the string we analysed was deterministic, such that 0 always succeeded a 1 and vice versa. In general, e-machines describe stochastic processes. For example, the completely random string can be described by a trivial e-machine with a single causal state, with T^0=T^1=1/2.

Features of E-Machines:

Some useful properties of these machines I list without proof:

  • The complexity of a e-machine equates the the Entropy of its eternal states, i.e. \sum p_i log p_i where p_i is the probablity it is in causal state S_i. This is easy to compute.
  • E-machines are the simplest possible representations of a process. Complexity, as measured by \sum p_i \log p_i is minimal when compared to any other state machines that predicts the dynamics of the process.

Note this this more general definition takes into account non-deterministic emissions… the measurement results can contain inherent randomness. 

A Few Personal Queries:

Epsilon machines are motivated by Occam’s Razor, the idea the the best explanation of a given phenomena is the simplest explanation. In the case of Epsilon machines, their simplicity was judged on the basis of the entropy of the causal state distribution.  Intuitively, this equates to the amount of information contained the knowledge of which causal state your epsilon machine is in.

However, is this an appropraite measure of complexity for Occam’s Razor? There exists many different epsilon machines with the same Causal State distribution, with different transition  matrices and hence model different physical processes. Thus, a measure of the simplicity of the model should necessarily include the information required to describe these transition matrices.

Could an alternative construction, with a more complex state space, have a simply transition structure?

References:

 Date Posted: 10 Mar 2009 @ 03 59 AM
Last Modified: 18 Mar 2009 @ 07 51 AM
Posted By: Mile Gu
E-mailPermalink
 

Responses to this post » (3 Total)

 
  1. Occam’s Mathematical Razor | MileGu.com said...
    4:17 am - March 12th, 2009

    [...] Epsilon Machines: Formalising Occam’s Razor [...]

  2. Presentation: Occam’s Razor | MileGu.com said...
    9:21 am - March 18th, 2009

    [...] Comments Occam’s Mathematical Razor | MileGu.com on Epsilon Machines - Occam’s mathematical RazorMile Gu on Of Go (Weiqi), intuition, and Artificial IntelligenceBob Hearn on Of Go (Weiqi), [...]

  3. air jordan retro 13 for sale said...
    3:28 am - August 18th, 2010

    E-machines are the simplest possible representations of a process. Complexity, ntuitively, this equates to the amount of information contained the knowledge of which causal state your epsilon machine is in.

 

Leave A Comment ...

 

 XHTML:
You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>
\/ More Options ...
Change Theme...
  • Users » 78
  • Posts/Pages » 37
  • Comments » 39
Change Theme...
  • VoidVoid « Default
  • LifeLife
  • EarthEarth
  • WindWind
  • WaterWater
  • FireFire
  • LiteLight
  • No Child Pages.