Information Physics: The Next Frontier
Kevin H. Knuth, Departments of Physics and Informatics, University at Albany
19th Century
George Boole (1815 – 1874)
George Boole was the inventor of Boolean logic. In 1854 he published "An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and Probabilities."
Since that time, HUNDREDS OF THOUSANDS of papers have been published on this topic.
20th Century
Claude E. Shannon (1916 – 2001)
Claude Shannon realized that Boolean logic could be used to optimize arrays of electromechanical relays used in telephone switching systems. In 1948, he developed information theory, which enabled him to quantify the effectiveness of a communication channel using the information entropy (Shannon entropy).
Richard T. Cox (1898 – 1991)
In 1946, Cox showed that Bayesian probability theory is the unique generalization of Boolean algebra to degrees of plausibility. It is the first example of such a generalization of an algebra to a calculus.
Edwin T. Jaynes (1922 – 1998)
Around 1957, Jaynes realized that the connection between Shannon entropy and thermodynamic entropy is not a mere analogy, but is due to the fact that they derive from similar underlying ideas.
21st Century
Symmetry: Equivalence Relations among Partitioned Sets → Group Theory
Order: Order Relations among Ordered Sets → Lattice Theory
Order
In the beginning…
This caveman finds it easy to order these rocks in terms of how heavy they are to lift. He uses a binary weight comparison ("heavier than") to order his rocks.
Sophisticated Rock Hunting
Totally ordered elements form a CHAIN, like the integers ordered by "is less than or equal to".
Isomorphisms
The chain of rocks ordered by "heavier than" is isomorphic to a chain of integers (1, 2, 3, 4, 5) under "is less than or equal to".
Incomparable elements form an ANTICHAIN.
Partitions exhibit both chain-like and antichain-like properties.
Two Posets with Integers
The same integers form different posets under different orderings. Under "divides", the integers 1 through 9 form a partial order (1 at the bottom; 2, 3, 5, 7 above it; then 4, 6, 9; then 8). Under "is less than or equal to", they form a chain.
The Powerset of {a, b, c}
P = ( { ∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c} }, ⊆ )
Ordered by ⊆ ("is a subset of"), its Hasse diagram has {a, b, c} at the top; {a, b}, {a, c}, {b, c} below; then {a}, {b}, {c}; and ∅ at the bottom.
Upper Bound
An upper bound of {b} and {c} is any element above both; in the powerset of {a, b, c}, the upper bounds of {b} and {c} are {b, c} and {a, b, c}.
Join
{b} ∨ {c} = {b, c}
The join of two elements is their least upper bound.
Meet
{a, b} ∧ {a, c} = {a}
The meet of two elements is their greatest lower bound.
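Since the join in the powerset lattice is set union and the meet is set intersection, these definitions are easy to check directly. A minimal sketch in Python (illustrative only; frozensets stand in for lattice elements):

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

L = powerset({'a', 'b', 'c'})
assert len(L) == 8  # 2^3 elements

# Join = least upper bound = union; meet = greatest lower bound = intersection.
join = frozenset({'b'}) | frozenset({'c'})            # {b} v {c}
meet = frozenset({'a', 'b'}) & frozenset({'a', 'c'})  # {a,b} ^ {a,c}

assert join == frozenset({'b', 'c'})
assert meet == frozenset({'a'})

# The join really is the LEAST upper bound: every upper bound contains it.
uppers = [u for u in L if u >= frozenset({'b'}) and u >= frozenset({'c'})]
assert all(join <= u for u in uppers)
```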
Posets versus Lattices
Lattices are posets in which every pair of elements x and y has a unique join x ∨ y (least upper bound) and a unique meet x ∧ y (greatest lower bound).
The Lattice Identities
L1. x ∧ x = x,  x ∨ x = x  (Idempotency)
L2. x ∧ y = y ∧ x,  x ∨ y = y ∨ x  (Commutativity)
L3. x ∧ (y ∧ z) = (x ∧ y) ∧ z,  x ∨ (y ∨ z) = (x ∨ y) ∨ z  (Associativity)
L4. x ∧ (x ∨ y) = x ∨ (x ∧ y) = x  (Absorption)
If x ≤ y, the meet and join follow the Consistency Relations:
C1. x ∧ y = x  (x is the greatest lower bound of x and y)
C2. x ∨ y = y  (y is the least upper bound of x and y)
Lattices are Algebras
Each lattice links a structural viewpoint (the ordering relation) to an operational viewpoint (the meet and join operations):
Integers, "is less than or equal to":  a ≤ b  ⇔  min(a, b) = a  ⇔  max(a, b) = b
Positive integers, "divides":  a | b  ⇔  gcd(a, b) = a  ⇔  lcm(a, b) = b
Sets, "is a subset of":  a ⊆ b  ⇔  a ∩ b = a  ⇔  a ∪ b = b
Assertions, "implies":  a → b  ⇔  a ∧ b = a  ⇔  a ∨ b = b
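These structural/operational equivalences can be spot-checked numerically. A quick Python sketch (the lcm helper is defined locally for portability; the values are illustrative):

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

# Integers under "is less than or equal to": meet = min, join = max.
a, b = 4, 6
assert (a <= b) and min(a, b) == a and max(a, b) == b

# Positive integers under "divides": meet = gcd, join = lcm.
a, b = 4, 12
assert (b % a == 0) and gcd(a, b) == a and lcm(a, b) == b

# Sets under "is a subset of": meet = intersection, join = union.
A, B = {1, 2}, {1, 2, 3}
assert (A <= B) and A & B == A and A | B == B
```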
Quantification Via Valuations
From an Algebra to a Calculus
Quantification
To quantify the partial order is to assign real numbers to the elements of the lattice. Any quantification must be consistent with the lattice structure; otherwise, information about the partial order is lost.
Local Consistency
Any general rule must hold for special cases, so we look at special cases to constrain the general rule and enforce local consistency.
The value assigned to the join x ∨ y must be determined by the values assigned to x and y:
f(x ∨ y) = S[f(x), f(y)]
where f : x ∈ L → ℝ and S is an unknown function to be determined.
Associativity of Join
Write the same element two different ways:
x ∨ (y ∨ z) = (x ∨ y) ∨ z
which implies
S[f(x), S[f(y), f(z)]] = S[S[f(x), f(y)], f(z)]
Note that the unknown function S is nested in two distinct ways, which reflects associativity.
Associativity Equation
S[f(x), S[f(y), f(z)]] = S[S[f(x), f(y)], f(z)]
The general solution (Aczél 1966) is
F(S[f(x), f(y)]) = F(f(x)) + F(f(y))
where F is an arbitrary invertible function. Defining v(x) = F(f(x)), we have straightforward summation:
v(x ∨ y) = v(x) + v(y)
This is a derivation of the summation axiom of MEASURE THEORY!
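The shape of the general solution can be sketched numerically: any S of the form S(u, v) = F⁻¹(F(u) + F(v)), with F invertible, satisfies the associativity equation. A quick check in Python with two illustrative choices of F:

```python
import math

def make_S(F, Finv):
    """Build S(u, v) = F^{-1}(F(u) + F(v)) from an invertible F."""
    return lambda u, v: Finv(F(u) + F(v))

for F, Finv in [(math.log, math.exp),                      # S becomes multiplication
                (lambda t: t ** 3, lambda t: t ** (1 / 3))]:
    S = make_S(F, Finv)
    x, y, z = 1.5, 2.0, 3.0
    # S[x, S[y, z]] = S[S[x, y], z]
    assert math.isclose(S(x, S(y, z)), S(S(x, y), z))
```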
VALUATION
v : x ∈ L → ℝ
If y ≥ x then v(y) ≥ v(x)
v(x ∨ y) = v(x) + v(y)
General Case
Now let x and y share a common part x ∧ y, and let z be the remainder of y, so that x and z are disjoint. Then
v(y) = v(x ∧ y) + v(z)
v(x ∨ y) = v(x) + v(z)
Eliminating v(z) gives
v(x ∨ y) = v(x) + v(y) − v(x ∧ y)
Sum Rule
v(x ∨ y) = v(x) + v(y) − v(x ∧ y)
v(x) + v(y) = v(x ∨ y) + v(x ∧ y)   (symmetric, self-dual form)
Instances of the sum rule:
p(x ∨ y | i) = p(x | i) + p(y | i) − p(x ∧ y | i)
I(X; Y) = H(X) + H(Y) − H(X, Y)
max(x, y) = x + y − min(x, y)
log lcm(x, y) = log x + log y − log gcd(x, y)
χ = V − E + F   (Euler characteristic)
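These instances of v(x ∨ y) = v(x) + v(y) − v(x ∧ y) can be verified numerically. A short Python spot-check (the values are arbitrary illustrations):

```python
from math import gcd, log, isclose

def lcm(a, b):
    return a * b // gcd(a, b)

x, y = 12, 18

# Chain of integers: join = max, meet = min, valuation = identity.
assert max(x, y) == x + y - min(x, y)

# Positive integers under "divides": join = lcm, meet = gcd, valuation = log.
assert isclose(log(lcm(x, y)), log(x) + log(y) - log(gcd(x, y)))

# Sets: join = union, meet = intersection, valuation = cardinality
# (inclusion-exclusion).
A, B = {1, 2, 3}, {3, 4}
assert len(A | B) == len(A) + len(B) - len(A & B)
```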
Lattice Products
The direct (Cartesian) product of two spaces yields a product lattice.
Direct Product Rule
The lattice product is also associative:
A × (B × C) = (A × B) × C
After the sum rule, the only freedom left is a rescaling, giving
v((a, b)) = v(a) v(b)
which is again summation (after taking the logarithm).
Constraints on Valuations
Sum Rule: v(x ∨ y) = v(x) + v(y) − v(x ∧ y)
Direct Product Rule: v((a, b)) = v(a) v(b)
Quantification Via Bi-Valuations
Context and Bi-Valuations
BI-VALUATION: w : x, i ∈ L → ℝ
Valuation v(x), or vi(x): the context i is implicit.
Bi-valuation w(x | i): the context i is explicit; w measures x with respect to the context i.
Bi-valuations generalize lattice inclusion to degrees of inclusion.
Inherited Constraints
Inherited from valuations:
Sum Rule: w(x | i) + w(y | i) = w(x ∨ y | i) + w(x ∧ y | i)
Direct Product Rule: w((a, b) | (i, j)) = w(a | i) w(b | j)
Associativity of Context
For a chain a ≤ b ≤ c, associativity of context gives the Chain Rule:
w(a | c) = w(a | b) w(b | c)
Extending the Chain Rule
w(x | x) + w(y | x) = w(x ∨ y | x) + w(x ∧ y | x)
Since x ≤ x and x ≤ x ∨ y, we have w(x | x) = 1 and w(x ∨ y | x) = 1, so
w(y | x) = w(x ∧ y | x)
Extending the Chain Rule
Applying the chain rule along the chain x ≥ x ∧ y ≥ x ∧ y ∧ z:
w(x ∧ y ∧ z | x) = w(x ∧ y | x) w(x ∧ y ∧ z | x ∧ y)
and rewriting with w(y | x) = w(x ∧ y | x) yields the Product Rule:
w(y ∧ z | x) = w(y | x) w(z | x ∧ y)
Constraint Equations
Sum Rule: w(x | i) + w(y | i) = w(x ∨ y | i) + w(x ∧ y | i)
Direct Product Rule: w((a, b) | (i, j)) = w(a | i) w(b | j)
Product Rule: w(y ∧ z | x) = w(y | x) w(z | x ∧ y)
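One concrete bi-valuation satisfying these constraints is w(x | i) = |x ∩ i| / |i| on subsets of a finite set of states; this is an illustrative choice, not the only one. A Python sketch checking the sum and product rules for it:

```python
from fractions import Fraction

def w(x, i):
    """Degree to which context i includes statement x: |x ∩ i| / |i|."""
    return Fraction(len(x & i), len(i))

U = frozenset(range(12))  # the top statement (all states)
x = frozenset({0, 1, 2, 3})
y = frozenset({2, 3, 4, 5})
z = frozenset({3, 5, 7})

# Sum Rule: w(x|i) + w(y|i) = w(x v y | i) + w(x ^ y | i)
assert w(x, U) + w(y, U) == w(x | y, U) + w(x & y, U)

# Product Rule: w(y ^ z | x) = w(y | x) w(z | x ^ y)
assert w(y & z, x) == w(y, x) * w(z, x & y)
```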
Probability Theory
States
apple, banana, cherry
the states of a piece of fruit picked from my grocery basket
Statements (States of Knowledge)
The statements about a piece of fruit form the powerset of its states {a, b, c}, ordered by subset inclusion: {a, b, c} at the top; then {a, b}, {a, c}, {b, c}; then {a}, {b}, {c}.
Statements describe potential states.
Implication
In the lattice of statements about a piece of fruit, the ordering encodes implication: each statement implies every statement above it (e.g. {a} implies {a, b}, which implies {a, b, c}).
Inference
Quantify the degree to which knowing that the system is in one of the three states {a, b, c} implies knowing that it is in some other set of states. Inference works backwards, down the lattice of statements.
Inference: Change of Notation
Moving from set notation to logic notation:
{a, b, c} becomes a ∨ b ∨ c
{a, b}, {a, c}, {b, c} become a ∨ b, a ∨ c, b ∨ c
{a}, {b}, {c} become a, b, c
Inference
The bi-valuation p(c | a ∨ b ∨ c) quantifies the degree to which knowing that the system is in one of the three states a ∨ b ∨ c implies knowing that it is in the state c.
Constraint Equations
Sum Rule: p(x | i) + p(y | i) = p(x ∨ y | i) + p(x ∧ y | i)
Direct Product Rule: p((a, b) | (i, j)) = p(a | i) p(b | j)
Product Rule: p(y ∧ z | x) = p(y | x) p(z | x ∧ y)
Commutativity
Commutativity of the meet, x ∧ y = y ∧ x, leads to Bayes' Theorem:
p(y | i) p(x | y ∧ i) = p(x | i) p(y | x ∧ i)
so that
p(x | y ∧ i) = p(x | i) p(y | x ∧ i) / p(y | i)
or, with the context i left implicit,
p(x | y) = p(x | i) p(y | x) / p(y | i)
Bayes' Theorem involves a change of context.
Bayesian Probability Theory
Sum Rule: p(x | i) + p(y | i) = p(x ∨ y | i) + p(x ∧ y | i)
Direct Product Rule: p((a, b) | (i, j)) = p(a | i) p(b | j)
Product Rule: p(y ∧ z | x) = p(y | x) p(z | x ∧ y)
Bayes' Theorem: p(x | y) = p(x | i) p(y | x) / p(y | i)
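As a toy illustration of Bayes' Theorem as a change of context, consider hypothetical numbers for the fruit example (x = "it is an apple", y = "it is red", i = prior information; all values are made up):

```python
p_x_given_i = 0.3    # p(x | i): prior probability of an apple
p_y_given_xi = 0.9   # p(y | x ^ i): probability it is red, given an apple
p_y_given_i = 0.5    # p(y | i): probability it is red

# Bayes' Theorem: p(x | y ^ i) = p(x | i) p(y | x ^ i) / p(y | i)
p_x_given_yi = p_x_given_i * p_y_given_xi / p_y_given_i
print(round(p_x_given_yi, 2))  # 0.54
```

Observing that the fruit is red raises the probability of an apple from 0.3 to 0.54: the context has changed from i to y ∧ i.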
Inquiry (Information Theory)
Three Spaces
States: the N states of the system (a, b, c for the fruit).
Statements (sets of states, i.e. potential states): the powerset 2^N, with a ∨ b ∨ c at the top, then a ∨ b, a ∨ c, b ∨ c, then a, b, c. Statements are the answers to questions.
Questions (sets of statements, i.e. potential statements): the free distributive lattice FD(N).
The statement and question spaces are related by exp and log maps.
Inquiry Space
Questions are sets of statements, forming the Free Distributive Lattice.
Relevance
Relevance decreases as we move from the Central Issue, "Is it an Apple, Banana, or Cherry?", to partial questions such as "Is it an Apple or Cherry, or is it a Banana or Cherry?" and "Is it an Apple?", whose answers resolve less.
Relevance and Entropy
The relevance d(I | Q) of a question is quantified by an entropy. The binary question "Is it an Apple?" has relevance H(pa, pb∨c), with leading term − pa log2 pa, while the Central Issue has
H(I) = − pa log2 pa − pb log2 pb − pc log2 pc
Inquiry
Sum Rule: d(X | I) + d(Y | I) = d(X ∨ Y | I) + d(X ∧ Y | I)
Direct Product Rule: d((A, B) | (I, J)) = d(A | I) d(B | J)
Product Rule: d(Y ∧ Z | X) = d(Y | X) d(Z | X ∧ Y)
Bayes' Theorem analogue: d(X | Y) = d(X | I) d(Y | X) / d(Y | I)
Inquiry and Information Theory
Sum Rule: d(X | I) + d(Y | I) = d(X ∨ Y | I) + d(X ∧ Y | I)
Rearranged: d(X ∨ Y | I) = d(X | I) + d(Y | I) − d(X ∧ Y | I)
Compare: I(X; Y) = H(X) + H(Y) − H(X, Y)
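The identity I(X; Y) = H(X) + H(Y) − H(X, Y) can be checked on any joint distribution. A Python sketch with a small, hypothetical joint table for two binary variables:

```python
from math import log2

# Hypothetical joint distribution p(x, y).
joint = {(0, 0): 0.4, (0, 1): 0.1,
         (1, 0): 0.2, (1, 1): 0.3}

def H(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: prob}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Marginals over x and y.
px, py = {}, {}
for (xv, yv), p in joint.items():
    px[xv] = px.get(xv, 0.0) + p
    py[yv] = py.get(yv, 0.0) + p

# Mutual information via the sum-rule form.
I = H(px) + H(py) - H(joint)
print(round(I, 4))
```

Here I ≥ 0, with equality exactly when x and y are independent.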
MaxEnt
One way to assign priors is to note that we have chosen the atomic hypotheses for a reason… they are most relevant. We can then assign probabilities so that the relevance of the Central Issue is maximized!
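With no constraints beyond normalization, maximizing the relevance (entropy) of the Central Issue over the atomic hypotheses yields the uniform assignment. A small brute-force sketch over three-outcome distributions (the grid resolution is an arbitrary choice):

```python
from math import log2

def H(ps):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in ps if p > 0)

# Search a grid of distributions (pa, pb, pc) over three atomic hypotheses.
steps = 50
best, best_H = None, -1.0
for i in range(steps + 1):
    for j in range(steps + 1 - i):
        pa, pb = i / steps, j / steps
        pc = 1.0 - pa - pb
        h = H((pa, pb, pc))
        if h > best_H:
            best, best_H = (pa, pb, pc), h

print(best)    # near the uniform distribution (1/3, 1/3, 1/3)
print(best_H)  # near log2(3)
```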
Quantum Theory
Measurement Sequences
Quantify a quantum mechanical measurement sequence [m1, m2, m3] with a pair of real numbers. Sequences combine in parallel and in series.
Quantification Via Distinguished Sets
Quantification of a Poset
Consider a poset comprising an enormous number of elements.
Distinguished Chains
We distinguish one or more subsets of elements and use them for quantification. The method discussed here relies on identifying chains.
Simplicity
Chains will be quantified so that they look simple.
Projection of an Element onto a Chain
The projection of an event x onto a chain P is given by the least event px on the chain that can be informed about x.
Quantifying a Projection
Projections can be quantified by selecting particular elements on the chain and assigning a numeric value to each event on the chain (e.g. labeling the chain P with 1 through 9, so that the projection of x is px = 5).
Two Observer Chains
An event x projects to px on chain P and to qx on chain Q.
Observers must be Coordinated
Chains are coordinated by carefully selecting which events along the chain are used for quantification. Events are selected so that successive events on one chain project to successive events on the other.
Quantification with Pairs
Event x can be quantified by a pair of numbers derived from the labels of the events on each chain that are first informed about x. The pair (px, qx) quantifies the event x. Technically, this pair represents the direct product of the measures on the two chains.
Intervals
We consider two events and quantify the interval between them. This results in a pair of differences, i.e. two degrees of freedom, since the origin of the labels on each chain is arbitrary.
Quantification
Δp = p2 − p1
Δq = q2 − q1
(Δp, Δq) = (p2 − p1, q2 − q1)
Two Fundamental Configurations
Symmetric (chain-like): the two events project onto P and Q in the same order.
Antisymmetric (antichain-like): the two events project onto P and Q in opposite orders.
Decomposition
With Δp = p2 − p1 and Δq = q2 − q1, the pair decomposes as
(Δp, Δq) = ( (Δp + Δq)/2 , (Δp + Δq)/2 ) + ( (Δp − Δq)/2 , −(Δp − Δq)/2 )
a symmetric part plus an antisymmetric part.
Coordinates
We can define coordinates by
Δt = (Δp + Δq)/2,   Δx = (Δp − Δq)/2
Then the decomposition
(Δp, Δq) = ( (Δp + Δq)/2 , (Δp + Δq)/2 ) + ( (Δp − Δq)/2 , −(Δp − Δq)/2 )
can be rewritten as
(Δt + Δx, Δt − Δx) = (Δt, Δt) + (Δx, −Δx)
Measuring Intervals
Given an interval quantified by a pair
(Δt + Δx, Δt − Δx) = (Δt, Δt) + (Δx, −Δx)
there are two scalar measures:
Pair Component Sum: (Δt + Δx) + (Δt − Δx) = 2Δt
Scalar Interval: (Δt + Δx)(Δt − Δx) = Δt² − Δx²
which is the MINKOWSKI METRIC: Δs² = Δt² − Δx²
Special Relativity (see my poster)
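The coordinate change and both scalar measures can be checked with a few lines of arithmetic. A Python sketch with hypothetical chain labels for two events:

```python
# Hypothetical chain labels of two events on observer chains P and Q.
p1, p2 = 3.0, 7.0
q1, q2 = 5.0, 6.0

dp, dq = p2 - p1, q2 - q1    # (Δp, Δq) = (4.0, 1.0)

# Coordinates: Δt = (Δp + Δq)/2, Δx = (Δp − Δq)/2
dt = (dp + dq) / 2
dx = (dp - dq) / 2

# Pair component sum: (Δt + Δx) + (Δt − Δx) = 2Δt
component_sum = (dt + dx) + (dt - dx)
assert component_sum == 2 * dt == dp + dq

# Scalar interval: (Δt + Δx)(Δt − Δx) = Δt² − Δx² (the Minkowski metric),
# which in chain labels is just Δp · Δq.
interval = (dt + dx) * (dt - dx)
assert interval == dt ** 2 - dx ** 2 == dp * dq
print(dt, dx, interval)  # 2.5 1.5 4.0
```

Note that since Δt + Δx = Δp and Δt − Δx = Δq, the scalar interval is simply the product of the chain-label differences.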
CONCLUSIONS
Order remains an untapped resource in physics.
Quantification of partially ordered sets leads to constraint equations, which play a significant role in determining many physical laws.
Inspired by the ideas of Cox and Jaynes, we derive: measure theory, probability theory, information theory, quantum mechanics, special relativity.
Information Physics views the laws of physics as arising from our descriptions of the universe, not the universe itself.
Special thanks to: Newshaw Bahreyni, Ariel Caticha, Seth Chaiken, Keith Earle, Adom Giffin, Philip Goyal, Carlos Rodriguez, John Skilling