Physics Notes: On Tensors and their Matrix Representations

Here's something that caused me undue confusion over the years as I was learning about tensors.

Many textbooks, for instance, will tell you that the metric tensor of special relativity takes a form like

$$
\eta_{\mu\nu} = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

and of course we all know what they mean, but this is technically incorrect. Why? Well, let us try and multiply a contravariant vector by this metric tensor. What should we get? Why, it's $\eta_{\mu\nu}v^\nu = v_\mu$, i.e., a covariant vector. OK, so let's do this in matrix form, where contravariant vectors are represented as column vectors:

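$$
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} -t \\ x \\ y \\ z \end{pmatrix}
$$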

Something is wrong here. Instead of a row vector representing a covariant vector, we got another column vector. How can this be?
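
The mismatch is easy to reproduce mechanically. Here is a minimal sketch in plain Python; the helper `matmul` and the sample components 2, 3, 5, 7 are my own illustrative choices, not anything from a particular library. A column vector is stored as a list of one-element rows, and the ordinary matrix product hands back an object of exactly the same shape:

```python
# Minkowski metric written (naively) as an ordinary 4x4 matrix.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def matmul(a, b):
    """Ordinary matrix product of two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Contravariant vector (t, x, y, z) as a 4x1 column; the values are arbitrary.
v_column = [[2], [3], [5], [7]]

result = matmul(ETA, v_column)
print(result)                       # [[-2], [3], [5], [7]]
print(len(result), len(result[0]))  # 4 1  (still a column, not a row)
```

The components come out right, but the shape does not: 4 rows and 1 column is a column vector again.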

Let's take a step backward. What, exactly, does a 4×4 matrix represent? Why, it has 4 rows and 4 columns, so it turns a column vector into another column vector, meaning that it is a tensor with one covariant and one contravariant index. I.e., what we wrote above makes no sense. What would make sense is this:

$$
\eta_\mu{}^\nu = \begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

but of course this doesn't work very well either, since we know that the "mixed index metric tensor" is really just the identity (e.g., $\eta_{\mu\nu}\eta^{\nu\xi} = \delta_\mu{}^\xi$). So how could we write the metric tensor correctly in matrix form? Well... it has two covariant indices. Meaning it has... a row of four row vectors? Let's try:

$$
\eta_{\mu\nu} = \begin{bmatrix}
(-1\;\;0\;\;0\;\;0) & (0\;\;1\;\;0\;\;0) & (0\;\;0\;\;1\;\;0) & (0\;\;0\;\;0\;\;1)
\end{bmatrix}
$$

For it to make any sense, we should be able to multiply a contravariant vector by it, with the metric on the left, and get a covariant vector. Let's give it a try:

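$$
\begin{bmatrix}
(-1\;\;0\;\;0\;\;0) & (0\;\;1\;\;0\;\;0) & (0\;\;0\;\;1\;\;0) & (0\;\;0\;\;0\;\;1)
\end{bmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
= \; ?
$$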

Can this multiplication be carried out? We need to tweak the usual rules of matrix multiplication a little, but it still kind of makes sense. Let's take the first element of the object on the left, which is the row matrix (–1 0 0 0), and multiply it by the column vector on the right in the usual row-times-column fashion; we get –t. Repeat this procedure for the second, third, and fourth elements: we get x, y, and z, respectively. Since the four elements on the left were arranged in a row, we arrange the result in a row: [–t x y z]. This looks a great deal more like a covariant vector!
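
The tweaked multiplication rule can be sketched in a few lines of plain Python. The function name `lower_index`, the storage convention (a column is a list of one-element lists, a row is a flat list), and the sample values are assumptions I made for the illustration:

```python
# The metric as a row of four row vectors (two covariant indices).
ETA_ROWS = [
    (-1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
]


def lower_index(eta_rows, v_column):
    """Contract each inner row with the column vector.

    The inner rows sit side by side in a row, so the four numbers
    that come out are also arranged in a row (a flat list here).
    """
    return [
        sum(e * c[0] for e, c in zip(inner_row, v_column))
        for inner_row in eta_rows
    ]


# Contravariant vector (t, x, y, z) as a column; the values are arbitrary.
v_column = [[2], [3], [5], [7]]

v_lower = lower_index(ETA_ROWS, v_column)
print(v_lower)  # [-2, 3, 5, 7], i.e. the row [-t x y z]
```

Each inner row eats the whole column and leaves a single number, so what survives is the outer row structure, which is just what a covariant vector should look like.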

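Better yet, this way we can in fact represent tensors of higher valence. For instance, the elements of the Christoffel symbol $\Gamma^{\xi}_{\mu\nu}$ in some coordinate representation could be arranged as

$$
\begin{bmatrix}
(\Gamma^{0}_{00}\;\Gamma^{0}_{01}\;\Gamma^{0}_{02}\;\Gamma^{0}_{03}) & (\Gamma^{0}_{10}\;\Gamma^{0}_{11}\;\Gamma^{0}_{12}\;\Gamma^{0}_{13}) & (\Gamma^{0}_{20}\;\Gamma^{0}_{21}\;\Gamma^{0}_{22}\;\Gamma^{0}_{23}) & (\Gamma^{0}_{30}\;\Gamma^{0}_{31}\;\Gamma^{0}_{32}\;\Gamma^{0}_{33}) \\
(\Gamma^{1}_{00}\;\Gamma^{1}_{01}\;\Gamma^{1}_{02}\;\Gamma^{1}_{03}) & (\Gamma^{1}_{10}\;\Gamma^{1}_{11}\;\Gamma^{1}_{12}\;\Gamma^{1}_{13}) & (\Gamma^{1}_{20}\;\Gamma^{1}_{21}\;\Gamma^{1}_{22}\;\Gamma^{1}_{23}) & (\Gamma^{1}_{30}\;\Gamma^{1}_{31}\;\Gamma^{1}_{32}\;\Gamma^{1}_{33}) \\
(\Gamma^{2}_{00}\;\Gamma^{2}_{01}\;\Gamma^{2}_{02}\;\Gamma^{2}_{03}) & (\Gamma^{2}_{10}\;\Gamma^{2}_{11}\;\Gamma^{2}_{12}\;\Gamma^{2}_{13}) & (\Gamma^{2}_{20}\;\Gamma^{2}_{21}\;\Gamma^{2}_{22}\;\Gamma^{2}_{23}) & (\Gamma^{2}_{30}\;\Gamma^{2}_{31}\;\Gamma^{2}_{32}\;\Gamma^{2}_{33}) \\
(\Gamma^{3}_{00}\;\Gamma^{3}_{01}\;\Gamma^{3}_{02}\;\Gamma^{3}_{03}) & (\Gamma^{3}_{10}\;\Gamma^{3}_{11}\;\Gamma^{3}_{12}\;\Gamma^{3}_{13}) & (\Gamma^{3}_{20}\;\Gamma^{3}_{21}\;\Gamma^{3}_{22}\;\Gamma^{3}_{23}) & (\Gamma^{3}_{30}\;\Gamma^{3}_{31}\;\Gamma^{3}_{32}\;\Gamma^{3}_{33})
\end{bmatrix}
$$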
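
To make the bookkeeping explicit, here is a small sketch that builds the arrangement above as a nested structure of labels. It assumes that ξ is the one contravariant index (so it runs down the page) and that μ and ν are the covariant ones (so they run across, μ picking the group and ν the position inside it). The function name and the label format `G^xi_mu nu` are purely illustrative:

```python
def christoffel_layout(symbol="G", dim=4):
    """Return layout[xi][mu][nu], a label for each Christoffel component.

    Outer list  : the contravariant index xi (one printed line per value).
    Middle list : the first covariant index mu (one bracketed group per value).
    Inner tuple : the second covariant index nu (entries inside a group).
    """
    return [
        [
            tuple(f"{symbol}^{xi}_{mu}{nu}" for nu in range(dim))
            for mu in range(dim)
        ]
        for xi in range(dim)
    ]


layout = christoffel_layout()

# Print one line per value of xi, with the four row vectors side by side.
for line in layout:
    print("  ".join("(" + " ".join(group) + ")" for group in line))
```

Reading off `layout[1][2][3]` gives the label for the component with upper index 1 and lower indices 2, 3, which sits on the second line, in the third group, in the fourth slot.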

Or, when there are multiple contravariant indices, we get stacked columns. For instance:

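$$
\eta^{\mu\nu} = \begin{bmatrix}
\begin{pmatrix} -1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \\[6pt]
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\[6pt]
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \\[6pt]
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\end{bmatrix}
$$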
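
By analogy with the row-of-rows case, a column of columns should turn a row (covariant) vector back into a column (contravariant) vector. The sketch below is my own extrapolation of the same tweaked rule, not something the textbooks spell out; the helper names, the storage convention (rows as flat lists, columns as lists of one-element lists), and the sample values are all assumptions:

```python
# Covariant metric: a row of four row vectors.
ETA_LOWER = [(-1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# Contravariant metric: a column of four column vectors. In Minkowski
# coordinates it has the same components as the covariant metric.
ETA_UPPER = [
    [[-1], [0], [0], [0]],
    [[0], [1], [0], [0]],
    [[0], [0], [1], [0]],
    [[0], [0], [0], [1]],
]


def lower_index(eta_rows, v_column):
    """Row of rows times a column: the result is a row (flat list)."""
    return [sum(e * c[0] for e, c in zip(row, v_column)) for row in eta_rows]


def raise_index(eta_columns, w_row):
    """Column of columns times a row: the result is a column."""
    return [[sum(c[0] * w for c, w in zip(col, w_row))] for col in eta_columns]


v_column = [[2], [3], [5], [7]]          # arbitrary contravariant vector
w_row = lower_index(ETA_LOWER, v_column)
back = raise_index(ETA_UPPER, w_row)

print(w_row)  # [-2, 3, 5, 7]
print(back)   # [[2], [3], [5], [7]]
```

Lowering and then raising returns the original column, which is the component-level statement that the mixed-index metric is just the identity.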

Of course it is not very convenient to write matrices like this, which is why we resort to the incorrect representation that is seen so often. Nevertheless, it is helpful to keep in mind what those representations really mean, in order to avoid confusion.