The principle of gauge invariance

The first time I ever heard the term "gauge invariance", I didn't know what to make of it. It didn't help that Hungarian physics literature uses the term "mértékinvariancia", which, literally translated, means something like "invariance of measure". But to be honest, the English phrase doesn't appear to be much more meaningful either.

It wasn't until I came across an excellent informal introduction to the concept in Aitchison and Hey's book, Gauge Theories in Particle Physics, that I finally began to understand what this concept was all about. And it didn't take very long to realize just how powerful a concept it really is!

The idea is simple and has its roots in the well-known (non-gauge) invariances in classical physics. For instance, everyone knows that the equations of physics remain invariant under, say, a translation of the coordinate system. This is just a fancy way of saying that there are no absolute positions; what matters is not where an object is in absolute terms, but where it is relative to other objects, i.e., coordinate differences.

So when you have an equation of physics involving the $x$ coordinates of two objects, $x_1$ and $x_2$, the equation may contain their difference but not the absolute coordinates themselves. An equation of the form $F(x_1)=0$ won't be physical, whereas an equation like $F(x_2-x_1)=0$ might be.

Or, even for a single object, you may have a physical equation that contains the time derivative of its $x$ coordinate. Since $d(x+C)/dt$ is the same as $dx/dt$ for any constant $C$, such equations will remain invariant under a translation.

Translations, rotations, time translations, Lorentz boosts... these transformations are all geometrical. But is it only geometrical transformations under which physical equations remain invariant? Of course not. While much of physics is about geometry (some hope that one day, all of it will be), physics also deals with non-geometrical quantities. Take, for instance, voltage. It is another one of those "relative" quantities: what matters is not the absolute voltage but the relative potential between two conductors. In other words, adding a constant to all voltages should leave the equations of physics invariant. This may, at first, look to be at odds with equations like $R=U/I$, but only until you realize that in this case, $U$ is really the potential difference, $U_2-U_1$, between two points in a circuit. Adding the same constant to both $U_2$ and $U_1$ leaves this difference unchanged, so Ohm's law remains valid.

The transformations under which physical laws remain unchanged share one characteristic: they are all global. The value that characterizes the transformation is the same everywhere in spacetime.

What if this is not so? What if we use a transformation that is characterized by a value that is different everywhere (e.g., a smooth function of spacetime coordinates, as opposed to a constant)?

At first sight, the idea may appear like madness. Such an arbitrary transformation must surely destroy the validity of any physics equation.

Then again... there's another way of looking at this. Okay, so our physics equation was mangled by that strange transformation. But is there any way to unmangle it?

Surprisingly, the answer is yes. Even more surprisingly, we find that when the equations are unmangled, i.e., changed to accommodate our strange transformation, the new components that appear in them will correspond to known physical forces!

By far the simplest gauge theory is electromagnetism. And by far the simplest way to present electromagnetism as a gauge theory is through the non-relativistic Schrödinger equation of a particle moving in empty space:

\[i\hbar\frac{\partial\psi}{\partial t}=\frac{-\hbar^2}{2m}\nabla^2\psi.\]

Although the equation contains the wave function $\psi$, we know that the actual probability of finding a particle in some state is determined by $|\psi|^2$. In other words, the phase of the complex function $\psi$ can be changed without altering the outcome of physical experiments. That is, all physical experiments will produce the same result if we perform the following substitution:

\[\psi\rightarrow e^{ip(x,t)}\psi,\]

where $p(x,t)$ is an arbitrary smooth function of space and time coordinates.
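As a quick numerical illustration of why the phase drops out of $|\psi|^2$, here is a minimal sketch in plain Python. It only looks at the value of $\psi$ and of $p$ at a handful of sample points; the names `apply_phase` and `samples` are made up for this example.

```python
import cmath
import math
import random

random.seed(0)  # fixed seed so the run is repeatable


def apply_phase(psi_value, p_value):
    """Multiply a complex amplitude by the phase factor e^{ip}."""
    return cmath.exp(1j * p_value) * psi_value


samples = []
for _ in range(5):
    # A random complex amplitude and a random real phase at one point.
    psi_value = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    p_value = random.uniform(-math.pi, math.pi)
    before = abs(psi_value) ** 2
    after = abs(apply_phase(psi_value, p_value)) ** 2
    samples.append((before, after))

# True when |psi|^2 agrees before and after the phase change
# (up to floating-point rounding) at every sample point.
phase_invariant = all(math.isclose(b, a) for b, a in samples)
print(phase_invariant)
```

The check is pointwise, so it says nothing yet about derivatives of $\psi$, which is where the trouble starts.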

Let's see what happens to the Schrödinger equation though when we apply this transformation. First, the left-hand side:

\[\frac{\partial\psi'}{\partial t}=\frac{\partial e^{ip(x,t)}}{\partial t}\psi+e^{ip(x,t)}\frac{\partial\psi}{\partial t}=e^{ip(x,t)}\left[i\frac{\partial p(x,t)}{\partial t}\psi+\frac{\partial\psi}{\partial t}\right].\]

Next, the right-hand side, which is a bit more difficult to tackle, but hey, it's just straightforward algebraic manipulation:

\begin{align}
\nabla^2\psi'&=\nabla\{\nabla[e^{ip(x,t)}\psi]\}=\nabla\{\nabla[e^{ip(x,t)}]\psi+e^{ip(x,t)}\nabla\psi\}\\
&=\nabla[e^{ip(x,t)}i\nabla p(x,t)\psi+e^{ip(x,t)}\nabla\psi]=\nabla\{e^{ip(x,t)}[i\nabla p(x,t)\psi+\nabla\psi]\}\\
&=\nabla[e^{ip(x,t)}][i\nabla p(x,t)\psi+\nabla\psi]+e^{ip(x,t)}\nabla[i\nabla p(x,t)\psi+\nabla\psi]\\
&=e^{ip(x,t)}i\nabla p(x,t)[i\nabla p(x,t)\psi+\nabla\psi]+e^{ip(x,t)}[i\nabla^2p(x,t)\psi+i\nabla p(x,t)\nabla\psi+\nabla^2\psi]\\
&=e^{ip(x,t)}\{i\nabla p(x,t)[i\nabla p(x,t)\psi+\nabla\psi]+i\nabla^2p(x,t)\psi+i\nabla p(x,t)\nabla\psi+\nabla^2\psi\}\\
&=e^{ip(x,t)}\{\nabla^2\psi+2i\nabla p(x,t)\nabla\psi-[\nabla p(x,t)]^2\psi+i\nabla^2p(x,t)\psi\}\\
&=e^{ip(x,t)}\{[\nabla+i\nabla p(x,t)]^2\psi\}.
\end{align}

On both sides of the equation, we now have an extra factor $e^{ip(x,t)}$, which we can safely drop, resulting in the following equation:

\[i\hbar\left[i\frac{\partial p(x,t)}{\partial t}\psi+\frac{\partial\psi}{\partial t}\right]=\frac{-\hbar^2}{2m}[\nabla+i\nabla p(x,t)]^2\psi,\]

or

\[i\hbar\frac{\partial\psi}{\partial t}=\frac{-\hbar^2}{2m}\left\{[\nabla+i\nabla p(x,t)]^2-\frac{2m}{\hbar}\frac{\partial p(x,t)}{\partial t}\right\}\psi.\]

Whatever it is, it is definitely not the Schrödinger equation of a particle in empty space. In other words, we can conclude that the Schrödinger equation is not invariant under the gauge transformation $\psi\rightarrow e^{ip(x,t)}\psi$.

Now of course that is not exactly surprising. We have, after all, mangled the wave function beyond recognition by changing its complex phase by an arbitrary amount at each point of spacetime.
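The algebra above is easy to get wrong by a sign or a factor, so here is a sketch of how it could be double-checked symbolically. It assumes SymPy is available and, to keep things short, works in one space dimension (so $\nabla$ becomes $\partial/\partial x$). The helper names `free_residual` and `shifted` are invented for this example.

```python
import sympy as sp

x, t = sp.symbols("x t", real=True)
hbar, m = sp.symbols("hbar m", positive=True)
p = sp.Function("p")(x, t)      # arbitrary smooth phase function
psi = sp.Function("psi")(x, t)  # wave function
phase = sp.exp(sp.I * p)


def free_residual(f):
    """i*hbar*f_t + hbar^2/(2m)*f_xx; zero when f solves the free equation."""
    return sp.I * hbar * sp.diff(f, t) + hbar**2 / (2 * m) * sp.diff(f, x, 2)


def shifted(f):
    """Apply (d/dx + i*dp/dx) to f."""
    return sp.diff(f, x) + sp.I * sp.diff(p, x) * f


# Residual of the transformed equation derived in the text, written for psi.
derived_residual = sp.I * hbar * sp.diff(psi, t) + hbar**2 / (2 * m) * (
    shifted(shifted(psi)) - 2 * m / hbar * sp.diff(p, t) * psi
)

# Substituting e^{ip}*psi into the free equation should give e^{ip} times
# the derived equation, so this difference should vanish.
difference = sp.simplify(
    sp.expand(free_residual(phase * psi) - phase * derived_residual)
)

# The terms that spoil invariance: what is left after dropping e^{ip}
# and subtracting the original free equation.
extra_terms = sp.simplify(
    sp.expand(free_residual(phase * psi) / phase - free_residual(psi))
)

print(difference)   # vanishes if the derivation is right
print(extra_terms)  # the leftover terms, all involving derivatives of p
```

Every leftover term contains a derivative of $p$, which is why a constant (global) phase causes no trouble.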

But what if we start with a Schrödinger equation that already includes components that look like the ones we ended up with? For instance:

\[i\hbar\frac{\partial\psi}{\partial t}=\frac{-\hbar^2}{2m}[(\nabla+i{\bf\mathrm{A}})^2+V]\psi.\]

Starting with this equation, when we perform the gauge transformation $\psi\rightarrow e^{ip(x,t)}\psi$ we end up with the following substitutions:

\begin{align}{\bf\mathrm{A}}&\rightarrow{\bf\mathrm{A}}+\nabla p(x,t),\\
V&\rightarrow V-\frac{2m}{\hbar}\frac{\partial p(x,t)}{\partial t},\end{align}

after which the equation has exactly the same form as before, and so remains valid.
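The same kind of symbolic check can be applied to the equation with the extra terms. The sketch below again assumes SymPy and one space dimension, with `A` and `V` as arbitrary functions of $x$ and $t$; the helper `residual` is a name made up for this example.

```python
import sympy as sp

x, t = sp.symbols("x t", real=True)
hbar, m = sp.symbols("hbar m", positive=True)
p = sp.Function("p")(x, t)
psi = sp.Function("psi")(x, t)
A = sp.Function("A")(x, t)  # one-dimensional stand-in for the vector quantity
V = sp.Function("V")(x, t)


def residual(f, A_field, V_field):
    """i*hbar*f_t + hbar^2/(2m)*[(d/dx + i*A)^2 + V] f; zero for a solution."""

    def cov(g):
        return sp.diff(g, x) + sp.I * A_field * g

    return sp.I * hbar * sp.diff(f, t) + hbar**2 / (2 * m) * (
        cov(cov(f)) + V_field * f
    )


phase = sp.exp(sp.I * p)

# The substitutions quoted in the text.
A_new = A + sp.diff(p, x)
V_new = V - 2 * m / hbar * sp.diff(p, t)

# Changing the phase of psi should be the same as leaving psi alone and
# shifting A and V, up to the overall factor e^{ip}.
covariance_gap = sp.simplify(
    sp.expand(residual(phase * psi, A, V) - phase * residual(psi, A_new, V_new))
)
print(covariance_gap)  # vanishes if the substitutions are right
```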

What we have here is a vector and a scalar quantity, and a pair of transformation laws that supposedly do not alter the validity of our physics. But we already have just such a set of quantities in physics, in electromagnetism. If $\vec{A}$ were the electromagnetic vector potential and $V$ were the scalar potential (up to constant factors involving the particle's charge, $\hbar$ and $m$), the transformation rules above would leave the measurable physical quantities, namely the magnetic field $\vec{B}=\nabla\times\vec{A}$ and the electric field $\vec{E}=-\nabla V-\partial\vec{A}/\partial t$, invariant. This suggests that the modified equation above, the one containing $\vec{A}$ and $V$, is nothing less than the Schrödinger equation of a charged particle moving in an electromagnetic field characterized by $\vec{A}$ and $V$. Which it indeed is, as can be confirmed through experiment.
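The invariance of the fields can also be checked symbolically. This sketch assumes SymPy and uses conventional units, in which the potentials transform as $\vec{A}\rightarrow\vec{A}+\nabla\chi$ and $V\rightarrow V-\partial\chi/\partial t$. The gauge function `chi` is a name introduced for this example; it plays the role of $p(x,t)$ rescaled by constant factors. The `grad` and `curl` helpers are written out by hand rather than taken from a library.

```python
import sympy as sp

x, y, z, t = sp.symbols("x y z t", real=True)
coords = (x, y, z)


def grad(f):
    """Gradient of a scalar function as a 3x1 column."""
    return sp.Matrix([sp.diff(f, c) for c in coords])


def curl(F):
    """Curl of a 3x1 column of component functions."""
    return sp.Matrix([
        sp.diff(F[2], y) - sp.diff(F[1], z),
        sp.diff(F[0], z) - sp.diff(F[2], x),
        sp.diff(F[1], x) - sp.diff(F[0], y),
    ])


def fields(A_field, V_field):
    """Return (E, B) with E = -grad V - dA/dt and B = curl A."""
    E = -grad(V_field) - sp.diff(A_field, t)
    B = curl(A_field)
    return E, B


# Arbitrary potentials and an arbitrary gauge function (chi is hypothetical).
A = sp.Matrix([sp.Function(f"A_{c}")(x, y, z, t) for c in "xyz"])
V = sp.Function("V")(x, y, z, t)
chi = sp.Function("chi")(x, y, z, t)

E_old, B_old = fields(A, V)
E_new, B_new = fields(A + grad(chi), V - sp.diff(chi, t))

delta_E = (E_new - E_old).applyfunc(sp.simplify)
delta_B = (B_new - B_old).applyfunc(sp.simplify)
print(delta_E.T, delta_B.T)  # both are zero vectors if the fields are invariant
```

The cancellation relies only on the fact that mixed partial derivatives of a smooth function commute: the curl of a gradient vanishes, and $\nabla\,\partial\chi/\partial t$ cancels $\partial(\nabla\chi)/\partial t$.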

This is nothing short of remarkable. We have, after all, made no a priori assumptions about electromagnetism. We started off with the Schrödinger equation of a particle moving in empty space, observed that the probability of finding a particle in a state does not depend on the complex phase of its wave function, and made an attempt to incorporate this invariance into the equation itself. We were successful, and in the process, we managed to recover a vectorial and a scalar quantity that transform just like the potentials of Maxwell's electromagnetism: in a sense, we "invented" electromagnetism!

Pure magic, if you ask me. But this really is just the beginning. The transformations represented by $e^{ip(x,t)}$, that is, rotations in a plane, form an Abelian group; these transformations are commutative. However, when the wave function is not a complex-valued function but something more intricate (such as a pair of quaternions, the simplest solution of the Dirac equation) the gauge transformation can become something more intricate as well. When the gauge transformation is not commutative, the force field that corresponds to it will contain the commutator of the gauge transformation, which will make the field self-interacting, and things get really interesting... But no, this is not why the photon is massless while other interactions are carried by massive particles. That was my naïve initial thought when I first read about the fact that a non-commutative gauge transformation results in a field that is self-interacting, but, as usual, reality tends to be more complex than one's naïve first impressions of it. What we get into here is Yang-Mills theory, and I am quite a long way away from being able to say anything meaningful about it just yet!


References

Aitchison, I. J. R. & Hey, A. J. G., Gauge Theories in Particle Physics, Institute of Physics Publishing, 1996