The problem of doing physics in curved spacetime can be stated simply. Most of physics concerns itself with differential equations. But ordinary derivative operators work only in the linear coordinate system of flat spacetime. So the question is whether a meaningful analog of the ordinary derivative operator can be invented that works in curved spacetime.
The answer comes in the form of a logical progression, which begins with generalizing a vector transformation law from ordinary rectangular coordinate systems to curvilinear ones; deriving the concept of covariance and contravariance and the notion of using infinitesimals as basis vectors; and then using these tools to derive a meaningful way to compare vectors in tangent spaces attached to different points of a spacetime. The result of this effort is the Christoffel-symbol, expressing the difference in the action of two derivative operators; armed with the Christoffel-symbol we can compute the difference between the derivative at a point in curved space and the ordinary derivative in a Euclidean tangent space at that point. The non-commutativity of this new derivative operator then offers a natural way to characterize curvature through the Riemann curvature tensor.
Vector Transformation Law
What is a vector? To the physicist, it is a quantity that has a magnitude and a direction. What a vector is not is a set of n numbers. A set of n numbers, at best, is the representation of a vector in a particular coordinate system, and may change as we change coordinate systems.
The first step in the analysis is to examine how, in a straight-line coordinate system, vector components are transformed under a linear transformation. As is well known, so long as the origin remains in place the transformation can be fully described by a matrix:
x' = A · x
or
$$x'^i = \sum_j A^i{}_j\, x^j \tag{1}$$
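As a minimal sketch of the matrix transformation law above, the following Python snippet applies a matrix to the components of a vector. The rotation angle and the sample vector are hypothetical values chosen only for illustration; a rotation is used because it keeps the origin in place and leaves the magnitude unchanged, which shows that the components change while the vector itself does not.

```python
import math

def transform(A, x):
    """Return the components x'^i = sum_j A[i][j] * x[j]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical example: rotate the coordinate axes by 30 degrees.
angle = math.radians(30.0)
A = [[math.cos(angle), math.sin(angle)],
     [-math.sin(angle), math.cos(angle)]]

x = [3.0, 4.0]
x_prime = transform(A, x)

# The components differ between the two systems, the magnitude does not.
length = math.hypot(*x)
length_prime = math.hypot(*x_prime)
print(x_prime, length, length_prime)
```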
Partial Derivatives
What we cannot do is to apply this method to anything other than straight-line coordinates in flat spacetime. Even something as simple as a polar coordinate system cannot be described this way. And, of course, we're in even deeper trouble in curved spacetime, where no straight-line coordinate system exists at all.
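Now is the time to notice a simple identity. Since
$$x'^i = \sum_j A^i{}_j\, x^j, \tag{2}$$
the partial derivative of $x'^i$ with respect to $x^j$ is
$$\frac{\partial x'^i}{\partial x^j} = A^i{}_j. \tag{3}$$
Of course when we stop using a straight-line coordinate system (even if we stay in a flat spacetime, e.g., when switching to polar coordinates) we can no longer use a constant transformation matrix with coefficients $A^i{}_j$. The partial derivatives in (3) still exist, however, and they can take over the role of the matrix coefficients when transforming the components $v^i$ of a vector at a given point:
$$v'^i = \sum_j \frac{\partial x'^i}{\partial x^j}\, v^j. \tag{4}$$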
One way to look at this is to notice that the matrix coefficients now vary from one point to the next, as opposed to being the same constant matrix across all space.
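To illustrate how the coefficients vary from point to point, here is a small sketch that estimates the matrix of partial derivatives numerically for the change from polar coordinates (r, θ) to Cartesian coordinates. The function names `to_cartesian` and `jacobian`, the step size, and the two sample points are assumptions made for this example.

```python
import math

def to_cartesian(p):
    """Map polar coordinates (r, theta) to Cartesian coordinates (x, y)."""
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(f, p, step=1e-6):
    """Approximate J[i][j] = d f_i / d p_j by central differences."""
    n = len(p)
    m = len(f(p))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        plus = list(p)
        minus = list(p)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = f(plus), f(minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return J

# The same coordinate change yields different coefficients at different points.
J_a = jacobian(to_cartesian, [1.0, 0.0])
J_b = jacobian(to_cartesian, [2.0, math.pi / 2])
print(J_a)
print(J_b)
```

At the first point the matrix is close to the identity; at the second it is close to [[0, -2], [1, 0]], so no single constant matrix describes the transformation.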
Tensors, Covariant and Contravariant Quantities
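In physics, one often deals with quantities that relate one vector to another, and are expressed in matrix form:
y = T · x
When we switch to another coordinate system, this is how this equation changes:
y' = T' · x'
How would we express T' in terms of T? First, since x = A⁻¹ · x', we can write:
y = T · A⁻¹ · x'
Next, we multiply on the left by A:
A · y = A · T · A⁻¹ · x'
Now that we have an expression with A · y on the left-hand side, we can replace it with y':
y' = A · (T · A⁻¹ · x')
Or, since matrix multiplication is associative:
y' = (A · T · A⁻¹) · x'   (5)
This is the expression we sought, expressing y' as a function of x'. The transformation laws are therefore:
x => x' = A · x
y => y' = A · y
T => T' = (A · T · A⁻¹)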
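The transformation law (5) can be checked numerically. The sketch below uses hypothetical 2×2 matrices A and T and a sample vector x (none of them from the original text) and verifies that transforming T as A · T · A⁻¹ gives the same y' as transforming y directly.

```python
def matmul(P, Q):
    """Product of two matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def matvec(P, v):
    """Product of a matrix and a column vector."""
    return [sum(p * c for p, c in zip(row, v)) for row in P]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical sample data.
A = [[2.0, 1.0], [1.0, 1.0]]
T = [[0.0, 3.0], [-1.0, 2.0]]
x = [1.0, 2.0]

y = matvec(T, x)                 # y = T . x in the original system
x_prime = matvec(A, x)           # x' = A . x
y_prime = matvec(A, y)           # y' = A . y
T_prime = matmul(matmul(A, T), inverse_2x2(A))   # T' = A . T . A^-1
y_check = matvec(T_prime, x_prime)               # should equal y'
print(T_prime, y_prime, y_check)
```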
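A quantity like a vector that transforms according to the equation x' = A · x is called a contravariant quantity. A quantity that transforms according to the equation x' = x · A⁻¹ is called a covariant quantity.
Tensors are often of a mixed type. The tensor T above is one example. Generally, a tensor whose transformation involves p factors of A and q factors of A⁻¹ is called a tensor of type (p, q); the tensor T above is of type (1, 1).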
All tensors in fact form vector spaces; basic identities such as a · (X + Y) = a · X + a · Y hold for tensors of the same type.
It is no accident that we expressed a linear relationship between vectors x and y above using a matrix T. Generally, the following statement is true: any linear relationship between a tensor X of type (p, q) and a tensor Y of type (r, s) can be expressed using a tensor T of type (r + q, s + p).
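What are the components of the inverse matrix A⁻¹? Using exactly the same argument used to derive (3), the components of A⁻¹ are the partial derivatives of the original coordinates with respect to the new ones:
$$(A^{-1})^i{}_j = \frac{\partial x^i}{\partial x'^j}. \tag{6}$$
The transformation law for a tensorial quantity T of type (p, q) is therefore:
$$T'^{\,i_1 \ldots i_p}{}_{j_1 \ldots j_q} = \sum_{k_1, \ldots, k_p} \sum_{l_1, \ldots, l_q} \frac{\partial x'^{i_1}}{\partial x^{k_1}} \cdots \frac{\partial x'^{i_p}}{\partial x^{k_p}}\, \frac{\partial x^{l_1}}{\partial x'^{j_1}} \cdots \frac{\partial x^{l_q}}{\partial x'^{j_q}}\, T^{k_1 \ldots k_p}{}_{l_1 \ldots l_q}. \tag{7}$$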
Good thing we have shorthand notation for this!
Abstract Index Notation
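The abstract index notation was developed to make tensor equations simpler. First, one must realize that many tensor operations are independent of the basis. For instance, the outer product of two contravariant vectors (upper indices are used to indicate a contravariant quantity):
$$T^{ab} = x^a y^b$$
appears, at first sight, to be dependent on the basis in which the components $x^a$ and $y^b$ are expressed. However, transforming each factor as a contravariant vector shows that the product obeys the transformation law of a tensor of type (2, 0):
$$T'^{ab} = x'^a y'^b = \sum_{c,d} \frac{\partial x'^a}{\partial x^c} \frac{\partial x'^b}{\partial x^d}\, x^c y^d = \sum_{c,d} \frac{\partial x'^a}{\partial x^c} \frac{\partial x'^b}{\partial x^d}\, T^{cd},$$
and thus it is a tensorial quantity. Indeed, in the above expressions we don't really care what the coordinate system is in which the vectors or tensors are expressed in component form: the indices a and b are "abstract", all we need to know is whether two indices are the same, and whether two quantities are expressed using the same or different coordinates.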
Yet another quantity that obeys the tensor transformation law is the inner product, or contraction. Multiplying the components of a covariant vector (indicated by a subscripted index) and a contravariant vector and summing the result yields a scalar quantity that is independent of the coordinate system:
$$s = \sum_a x_a\, y^a,$$
as can be verified through direct calculation.
To avoid having to use the summation sign too often, the Einstein summation convention is used. When an index appears in an expression twice, as a contravariant (upper) and covariant (lower) index, it is assumed that the expression will be summed over this index:
$$x_a\, y^a \equiv \sum_a x_a\, y^a.$$
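The "direct calculation" mentioned above can be sketched in a few lines. The example assumes that a contravariant vector transforms with a matrix A and a covariant vector with its inverse (as a row vector multiplied from the right); the matrix and the component values are hypothetical.

```python
# Hypothetical transformation matrix and its inverse (A has determinant 1).
A = [[2.0, 1.0], [1.0, 1.0]]
A_inv = [[1.0, -1.0], [-1.0, 2.0]]

y_upper = [3.0, -2.0]   # contravariant components y^a
x_lower = [0.5, 4.0]    # covariant components x_a

# Contravariant law: y'^i = sum_j A[i][j] * y^j
y_upper_new = [sum(A[i][j] * y_upper[j] for j in range(2)) for i in range(2)]
# Covariant law: x'_j = sum_i x_i * A_inv[i][j]
x_lower_new = [sum(x_lower[i] * A_inv[i][j] for i in range(2)) for j in range(2)]

# The contraction x_a y^a is the same number in both coordinate systems.
s = sum(x_lower[a] * y_upper[a] for a in range(2))
s_new = sum(x_lower_new[a] * y_upper_new[a] for a in range(2))
print(s, s_new)
```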
Infinitesimal Quantities as Basis Vectors
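At this point, it is important to notice a curious coincidence. There is another type of a "quantity" that transforms according to equation (6), this quantity being the operator $\partial/\partial x^i$. Given a real-valued smooth function f its partial derivatives in the new coordinate system can be expressed as a function of its partial derivatives in the original system as a covariant quantity:
$$\frac{\partial f}{\partial x'^i} = \sum_j \frac{\partial x^j}{\partial x'^i}\, \frac{\partial f}{\partial x^j}. \tag{8}$$
Similarly, an "infinitesimal displacement" $dx^i$ transforms as a contravariant quantity:
$$dx'^i = \sum_j \frac{\partial x'^i}{\partial x^j}\, dx^j. \tag{9}$$
Because of this, the quantities $\partial/\partial x^i$ and $dx^i$ are often used as basis vectors.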
The whole point of this exercise, of course, is to develop a mechanism to deal with non straight-lined coordinate systems. Such coordinate systems exist in flat spacetime; one example is the polar coordinate system. What equation (7) tells us is how we can transform between arbitrary coordinate systems.
All this happened in flat space so far. But the mechanism that is described here is also applicable to converting between coordinate systems in curved space.
The Metric
Thanks to the work of a Greek chap named Pythagoras who lived some 2500 years ago, the distance between two points is easy to compute in a Cartesian coordinate system. This is not so in a non straight-lined coordinate system. However, by examining how the infinitesimal squared distance behaves under coordinate transformations, a meaningful new quantity can be derived.
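In a Cartesian coordinate system with coordinates $x'^j$, the infinitesimal squared distance is computed using the Pythagorean formula:
$$ds^2 = \sum_j \left(dx'^j\right)^2.$$
Under a change of coordinate systems, this quantity will transform as follows:
$$ds^2 = \sum_j \left(\sum_h \frac{\partial x'^j}{\partial x^h}\, dx^h\right) \left(\sum_k \frac{\partial x'^j}{\partial x^k}\, dx^k\right).$$
By introducing the quantities
$$g_{hk} = \sum_j \frac{\partial x'^j}{\partial x^h}\, \frac{\partial x'^j}{\partial x^k}, \tag{10}$$
we can rewrite this formula as:
$$ds^2 = g_{hk}\, dx^h\, dx^k.$$
Notice that:
$$g_{hk} = g_{kh}.$$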
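If we take the infinitesimal squared distance, $ds^2 = g_{hk}\, dx^h\, dx^k$, we can write it as $dx_k\, dx^k$ by introducing $dx_k = g_{hk}\, dx^h$. This is called a "lowering of the index". What we basically find with the help of $g_{hk}$ is the covariant counterpart of a contravariant quantity. Similarly, the inverse of $g_{hk}$, denoted $g^{hk}$, can be used for raising an index. The quantities $g_{hk}$ are the components of the metric tensor, or metric for short.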
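As a worked illustration of this section, the sketch below builds the metric of polar coordinates (r, θ) from the partial derivatives of the Cartesian coordinates, assuming the defining sum of products of partial derivatives given in (10), and then lowers the index of a small displacement. The function names and sample values are assumptions made for this example.

```python
import math

def polar_jacobian(r, theta):
    """J[j][h] = d x'^j / d x^h for x' = (r cos(theta), r sin(theta))."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def metric_from_jacobian(J):
    """g[h][k] = sum_j J[j][h] * J[j][k]."""
    n = len(J[0])
    return [[sum(J[j][h] * J[j][k] for j in range(len(J)))
             for k in range(n)]
            for h in range(n)]

r, theta = 2.0, 0.7
g = metric_from_jacobian(polar_jacobian(r, theta))   # expected: diag(1, r^2)

# Lowering the index of a displacement: dx_k = sum_h g[h][k] * dx^h
dx_upper = [0.01, 0.02]
dx_lower = [sum(g[h][k] * dx_upper[h] for h in range(2)) for k in range(2)]

# Squared distance ds^2 = dx_k dx^k, here dr^2 + r^2 dtheta^2
ds2 = sum(dx_lower[k] * dx_upper[k] for k in range(2))
print(g, ds2)
```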
Christoffel Symbol
In a Euclidean space, it is possible to construct a parallel vector field, i.e., to assign the same vector to each point in space. Because of this, it also makes sense to talk about how a vector field changes from one point to the next; we can parallel-transport a vector from the first point to the second, and compare it with the value of the vector field there. Thus, the notion of a differential operator on a vector field is born.
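The components of a parallel vector field are obviously constant relative to a rectangular coordinate system $x'^j$:
$$dX'^j = 0.$$
We can also express this change in a curvilinear coordinate system $x^h$. Since $X'^j = \dfrac{\partial x'^j}{\partial x^h}\, X^h$ (with summation over repeated curvilinear indices implied), we get:
$$dX'^j = \frac{\partial^2 x'^j}{\partial x^h\, \partial x^k}\, X^h\, dx^k + \frac{\partial x'^j}{\partial x^h}\, dX^h. \tag{11}$$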
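With the help of the metric, it is possible to eliminate the second derivatives from this equation. First, we need to differentiate the defining equation (10) with respect to $x^h$, $x^k$ and $x^l$, cyclically permuting the indices:
$$\frac{\partial g_{kl}}{\partial x^h} = \sum_j \left( \frac{\partial^2 x'^j}{\partial x^h\, \partial x^k}\, \frac{\partial x'^j}{\partial x^l} + \frac{\partial x'^j}{\partial x^k}\, \frac{\partial^2 x'^j}{\partial x^h\, \partial x^l} \right)$$
$$\frac{\partial g_{lh}}{\partial x^k} = \sum_j \left( \frac{\partial^2 x'^j}{\partial x^k\, \partial x^l}\, \frac{\partial x'^j}{\partial x^h} + \frac{\partial x'^j}{\partial x^l}\, \frac{\partial^2 x'^j}{\partial x^h\, \partial x^k} \right)$$
$$\frac{\partial g_{hk}}{\partial x^l} = \sum_j \left( \frac{\partial^2 x'^j}{\partial x^h\, \partial x^l}\, \frac{\partial x'^j}{\partial x^k} + \frac{\partial x'^j}{\partial x^h}\, \frac{\partial^2 x'^j}{\partial x^k\, \partial x^l} \right)$$
Adding the first and second of these equations, subtracting the third, and dividing by two we obtain:
$$\sum_j \frac{\partial x'^j}{\partial x^l}\, \frac{\partial^2 x'^j}{\partial x^h\, \partial x^k} = \frac{1}{2} \left( \frac{\partial g_{kl}}{\partial x^h} + \frac{\partial g_{lh}}{\partial x^k} - \frac{\partial g_{hk}}{\partial x^l} \right). \tag{12}$$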
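We can now multiply (11) with $\partial x'^j / \partial x^l$ and sum over j, to get:
$$\sum_j \frac{\partial x'^j}{\partial x^l}\, dX'^j = \sum_j \frac{\partial x'^j}{\partial x^l}\, \frac{\partial^2 x'^j}{\partial x^h\, \partial x^k}\, X^h\, dx^k + g_{lh}\, dX^h.$$
Substituting (12) we obtain the result we sought:
$$\sum_j \frac{\partial x'^j}{\partial x^l}\, dX'^j = g_{lh}\, dX^h + \frac{1}{2} \left( \frac{\partial g_{kl}}{\partial x^h} + \frac{\partial g_{lh}}{\partial x^k} - \frac{\partial g_{hk}}{\partial x^l} \right) X^h\, dx^k. \tag{13}$$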
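For a parallel vector field, (13) should be zero of course. The right hand side of this equation can then be made more meaningful if we multiply it by the inverse of $g_{lh}$, i.e., by $g^{il}$, and sum over l:
$$dX^i + \frac{1}{2}\, g^{il} \left( \frac{\partial g_{kl}}{\partial x^h} + \frac{\partial g_{lh}}{\partial x^k} - \frac{\partial g_{hk}}{\partial x^l} \right) X^h\, dx^k = 0.$$
Denoting $\frac{1}{2}\, g^{il} \left( \frac{\partial g_{kl}}{\partial x^h} + \frac{\partial g_{lh}}{\partial x^k} - \frac{\partial g_{hk}}{\partial x^l} \right)$ with $\Gamma^i_{hk}$, we get the following equation:
$$dX^i = -\Gamma^i_{hk}\, X^h\, dx^k. \tag{14}$$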
The quantities $\Gamma^i_{hk}$ are called the Christoffel-symbols and they essentially express the difference between the ordinary differential operator of a rectilinear coordinate system and that of a curvilinear coordinate system. When the coordinate system is rectilinear and we perform a parallel transport, $dX^i$ is also zero. In a curvilinear coordinate system, we define parallel transport by equation (14); consequently, equation (14) also defines what can thereafter serve as a derivative operator.
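To make the definition concrete, here is a minimal numerical sketch that computes Christoffel-symbols from a metric, assuming the standard expression Γ^i_hk = ½ g^il (∂g_kl/∂x^h + ∂g_lh/∂x^k − ∂g_hk/∂x^l). It uses the polar-coordinate metric diag(1, r²) as a test case; the function names, the finite-difference step, and the restriction to two dimensions are assumptions made for this example.

```python
def polar_metric(p):
    """Metric of the flat plane in polar coordinates p = (r, theta)."""
    r, _theta = p
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def metric_derivatives(metric, p, step=1e-5):
    """dg[l][h][k] approximates d g_hk / d x^l by central differences."""
    n = len(p)
    dg = []
    for l in range(n):
        plus = list(p)
        minus = list(p)
        plus[l] += step
        minus[l] -= step
        g_plus, g_minus = metric(plus), metric(minus)
        dg.append([[(g_plus[h][k] - g_minus[h][k]) / (2 * step)
                    for k in range(n)] for h in range(n)])
    return dg

def christoffel(metric, p):
    """gamma[i][h][k] = Gamma^i_hk at the point p (two dimensions only)."""
    n = len(p)
    g_inv = inverse_2x2(metric(p))
    dg = metric_derivatives(metric, p)
    return [[[0.5 * sum(g_inv[i][l] * (dg[h][k][l] + dg[k][l][h] - dg[l][h][k])
                        for l in range(n))
              for k in range(n)] for h in range(n)] for i in range(n)]

gamma = christoffel(polar_metric, [2.0, 0.3])
# For polar coordinates the non-zero symbols should be
# Gamma^r_theta,theta = -r and Gamma^theta_r,theta = Gamma^theta_theta,r = 1/r.
print(gamma[0][1][1], gamma[1][0][1], gamma[1][1][0])
```

Even though the plane is flat, the symbols do not vanish in polar coordinates, which reflects the fact that they describe the coordinate system as much as the space.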
When the space is intrinsically curved, there exists no rectilinear coordinate system of course, and straightforward differentiation is not possible. But, as any self-respecting physicist knows (much to the distress of self-respecting mathematicians) in a small enough neighborhood, everything is linear. What the Christoffel-symbols express, in this case, is essentially the differential operator of curved coordinates relative to the rectilinear coordinates of a Euclidean tangent space at a particular point. The main significance of the Christoffel-symbols lies in the fact that they provide a practical method for differentiation in curved space, by expressing the differential operator as a function of ordinary differential operators in a Euclidean tangent space.
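Using equation (14), we can compute the differential of a contravariant vector field with respect to $x^k$:
$$D_k X^i = \frac{\partial X^i}{\partial x^k} + \Gamma^i_{hk}\, X^h.$$
If we were to repeat the same computation for a covariant vector field, we'd get a slightly different formula:
$$D_k X_i = \frac{\partial X_i}{\partial x^k} - \Gamma^h_{ik}\, X_h.$$
This method can be generalized to tensors of arbitrary type. For each contravariant index, one term will be added; for each covariant index, one term must be subtracted. Spelling it out, here's what we get (in each term, the summation index h takes the place of $i_\alpha$ or $j_\beta$):
$$D_k T^{i_1 \ldots i_p}{}_{j_1 \ldots j_q} = \frac{\partial T^{i_1 \ldots i_p}{}_{j_1 \ldots j_q}}{\partial x^k} + \sum_{\alpha=1}^{p} \Gamma^{i_\alpha}_{hk}\, T^{i_1 \ldots h \ldots i_p}{}_{j_1 \ldots j_q} - \sum_{\beta=1}^{q} \Gamma^{h}_{j_\beta k}\, T^{i_1 \ldots i_p}{}_{j_1 \ldots h \ldots j_q}.$$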
Yuck. But, you get the idea.
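A quick sanity check of the contravariant formula, sketched under stated assumptions: the formula is taken to be D_k X^i = ∂X^i/∂x^k + Γ^i_hk X^h, the coordinates are polar coordinates in the flat plane, and the vector field is the constant Cartesian unit vector along x, whose polar components are (cos θ, −sin θ / r). Since that field is parallel, every component of its derivative should vanish even though its polar components vary from point to point. The function names and the step size are hypothetical.

```python
import math

def christoffel_polar(r):
    """gamma[i][h][k] for polar coordinates, index 0 = r, index 1 = theta."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -r
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r
    return gamma

def field(p):
    """Polar components of the constant Cartesian unit vector along x."""
    r, theta = p
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(field, p, step=1e-6):
    """D[k][i] = D_k X^i = dX^i/dx^k + Gamma^i_hk X^h at the point p."""
    n = len(p)
    gamma = christoffel_polar(p[0])
    X = field(p)
    D = [[0.0] * n for _ in range(n)]
    for k in range(n):
        plus = list(p)
        minus = list(p)
        plus[k] += step
        minus[k] -= step
        X_plus, X_minus = field(plus), field(minus)
        for i in range(n):
            partial = (X_plus[i] - X_minus[i]) / (2 * step)
            D[k][i] = partial + sum(gamma[i][h][k] * X[h] for h in range(n))
    return D

D = covariant_derivative(field, [2.0, 0.8])
print(D)   # all four entries should be zero up to rounding error
```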
Are we justified in calling $D_k$ a differential operator? Yes, for multiple reasons. First of all, $D_k$ has the common algebraic properties of a differential operator, namely linearity and the Leibniz rule. Furthermore, it can be shown that with respect to the metric, the choice of $D_k$ is unique: it is the only torsion-free differential operator that preserves the inner product of two vectors as they are parallel-transported along any curve, i.e., $D_k\, g_{ij} = 0$.
Commutators
It is interesting to take a look at the commutator of the derivative operator. In an ordinary rectilinear coordinate system the order in which differential operators are applied doesn't matter. In a curvilinear system, it does:
$$(D_j D_k - D_k D_j)\, f = \left( \Gamma^h_{jk} - \Gamma^h_{kj} \right) \frac{\partial f}{\partial x^h}. \tag{15}$$
The commutator of the derivative operator with respect to a scalar field, i.e., $(D_j D_k - D_k D_j)\, f$, is called torsion. Often, the torsion is assumed to vanish for physically meaningful spacetimes, though this does not necessarily have to be the case.
Before proceeding any further, it ought to be mentioned that although it looks like one, $\Gamma^i_{hk}$ is not really a tensor, as it depends on the choice of a coordinate system. Another way of looking at it is that it characterizes the differential operator in curved space relative to a particular coordinate system; if we used another coordinate system, we would derive a different $\Gamma^i_{hk}$.
Curvature
We now have the rudiments of the apparatus needed to express a fundamental geometric characteristic of curved space: its curvature. Intuitively, the notion is about the failure of a vector to remain parallel with itself as it is translated along a closed loop and returns to the origin. This can be seen easily as you imagine a north-pointing tangent vector at the Earth's equator. First, parallel transport this vector eastward ninety degrees. It'll still point towards the North Pole of course. Next, parallel transport the vector all the way to the North Pole. Finally, parallel transport it back to the point of origin, at which time you'll notice that the vector no longer points north; instead, it is now pointing west, along the equator.

Of course we don't have to go all the way to the North Pole. We can simply cover a small rectangular area, going east first, then north, then back west and then south again, to return to the point of origin:
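The small rectangular loop can be simulated numerically. The sketch below is an illustration under several assumptions: the sphere has unit radius, the coordinates are the colatitude θ and the longitude φ, the transport rule is dX^i = −Γ^i_hk X^h dx^k, and the Christoffel-symbols of the sphere are taken from the standard result Γ^θ_φφ = −sin θ cos θ, Γ^φ_θφ = Γ^φ_φθ = cot θ. The corner values, step count, and function names are hypothetical. The integration uses a simple midpoint rule.

```python
import math

THETA, PHI = 0, 1   # index 0: colatitude theta, index 1: longitude phi

def christoffel_sphere(theta):
    """gamma[i][h][k] of the unit sphere at colatitude theta."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[THETA][PHI][PHI] = -math.sin(theta) * math.cos(theta)
    gamma[PHI][THETA][PHI] = gamma[PHI][PHI][THETA] = math.cos(theta) / math.sin(theta)
    return gamma

def rate(theta, X, velocity):
    """dX^i/dt = -Gamma^i_hk X^h dx^k/dt along the path."""
    gamma = christoffel_sphere(theta)
    return [-sum(gamma[i][h][k] * X[h] * velocity[k]
                 for h in range(2) for k in range(2))
            for i in range(2)]

def transport(X, start, end, steps=2000):
    """Parallel-transport X along the coordinate line from start to end."""
    velocity = [end[THETA] - start[THETA], end[PHI] - start[PHI]]
    dt = 1.0 / steps
    X = list(X)
    for n in range(steps):
        theta = start[THETA] + velocity[THETA] * n * dt
        k1 = rate(theta, X, velocity)
        X_mid = [X[i] + 0.5 * dt * k1[i] for i in range(2)]
        k2 = rate(theta + 0.5 * dt * velocity[THETA], X_mid, velocity)
        X = [X[i] + dt * k2[i] for i in range(2)]
    return X

def orthonormal(X, theta):
    """Components of X relative to unit vectors along theta and phi."""
    return [X[THETA], math.sin(theta) * X[PHI]]

south, north, d_phi = 1.2, 0.9, 0.3   # colatitudes of the two parallels, width
corners = [(south, 0.0), (south, d_phi), (north, d_phi), (north, 0.0), (south, 0.0)]

X_start = [-1.0, 0.0]   # unit vector pointing north (decreasing colatitude)
X = X_start
for a, b in zip(corners, corners[1:]):   # east, north, west, south
    X = transport(X, a, b)

u = orthonormal(X_start, south)
v = orthonormal(X, south)
angle = abs(math.atan2(u[0] * v[1] - u[1] * v[0], u[0] * v[0] + u[1] * v[1]))
area = d_phi * (math.cos(north) - math.cos(south))
print(angle, area)
```

On the unit sphere the angle by which the vector has turned should agree with the area enclosed by the loop, which is what the two printed numbers compare.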

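Changes to a vector V along the path are described by the derivative operator: along the eastward leg by $D_1 V$, and along the northward leg by $D_2 V$. The path back runs along the same two directions, but in the opposite sense. Which means that the overall difference to a vector as it returns to the point of origin along the path is proportional to $D_1 D_2 V - D_2 D_1 V$.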
But this is none other than the commutator of the differential operator! This time, however, it has to be computed not for a scalar field, as we did in the previous section, but for a vector field.
In flat space, the commutator vanishes: a vector returns to the point of origin unchanged.
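Computing the commutator over a vector field is straightforward, though tedious:
$$(D_j D_k - D_k D_j)\, X^i = R^i{}_{hjk}\, X^h + \left( \Gamma^m_{jk} - \Gamma^m_{kj} \right) D_m X^i.$$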
The result consists of two parts, one of which we've already encountered: it is the torsion (15). The other part contains a tensor of type (1,3), which is called the Riemann curvature tensor:
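$$R^i{}_{hjk} = \frac{\partial \Gamma^i_{hk}}{\partial x^j} - \frac{\partial \Gamma^i_{hj}}{\partial x^k} + \Gamma^i_{mj}\, \Gamma^m_{hk} - \Gamma^i_{mk}\, \Gamma^m_{hj}. \tag{16}$$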
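As a sketch of how the curvature tensor follows from the Christoffel-symbols, the snippet below evaluates it numerically for the unit sphere. It assumes the index convention R^i_hjk = ∂Γ^i_hk/∂x^j − ∂Γ^i_hj/∂x^k + Γ^i_mj Γ^m_hk − Γ^i_mk Γ^m_hj and the standard Christoffel-symbols of the sphere in colatitude and longitude coordinates; other texts order the indices differently. The function names and the finite-difference step are hypothetical.

```python
import math

def christoffel_sphere(p):
    """gamma[i][h][k] of the unit sphere, index 0 = theta, index 1 = phi."""
    theta, _phi = p
    gamma = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def riemann(christoffel, p, step=1e-5):
    """R[i][h][j][k] = R^i_hjk at the point p, derivatives by central differences."""
    n = len(p)
    gamma = christoffel(p)
    d_gamma = []   # d_gamma[j][i][h][k] = d Gamma^i_hk / d x^j
    for j in range(n):
        plus = list(p)
        minus = list(p)
        plus[j] += step
        minus[j] -= step
        g_plus, g_minus = christoffel(plus), christoffel(minus)
        d_gamma.append([[[(g_plus[i][h][k] - g_minus[i][h][k]) / (2 * step)
                          for k in range(n)] for h in range(n)] for i in range(n)])
    return [[[[d_gamma[j][i][h][k] - d_gamma[k][i][h][j]
               + sum(gamma[i][m][j] * gamma[m][h][k]
                     - gamma[i][m][k] * gamma[m][h][j] for m in range(n))
               for k in range(n)] for j in range(n)]
             for h in range(n)] for i in range(n)]

theta = 1.0
R = riemann(christoffel_sphere, [theta, 0.0])
# For the unit sphere, R^theta_phi,theta,phi should equal sin^2(theta).
print(R[0][1][0][1], math.sin(theta) ** 2)
```

Running the same function with the Christoffel-symbols of polar coordinates in the flat plane would be expected to give zero for every component, since the plane has no intrinsic curvature.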
When the torsion vanishes, i.e., when the spacetime is torsion-free, the curvature tensor (16) completely characterizes its intrinsic geometry.
Conclusion
At this point, we can conclude that three different quantities, the metric (10), the Christoffel-symbol (14), and the Riemann curvature tensor (16), all express the same thing: the geometry of a spacetime as seen by denizens of that spacetime (i.e., the intrinsic curvature). If you have the metric, you can determine the Christoffel-symbol for the preferred derivative operator. With the Christoffel-symbol at hand, you can compute the curvature tensor. If all you have is the curvature tensor, you can derive the metric.
In practice, knowing the Christoffel-symbol is the most important, since it is these symbols that give a straightforward prescription for differentiation. With the derivative operator thus "tamed", one can proceed and try to express ordinary physical equations, for instance Maxwell's equations of the electromagnetic field, in curved spacetime and start doing some real physics with it!
Lovelock, David & Rund, Hanno, Tensors, Differential Forms, and Variational Principles, Dover Publications, 1989
Wald, Robert M., General Relativity, The University of Chicago Press, 1984