Journal of Generalized Lie Theory and Applications
Matrix Lie Groups: An Introduction

This article presents basic notions of Lie theory in the context of matrix groups, with the goals of minimizing the required mathematical background and maximizing accessibility. It is structured with exercises that enhance the text and make the notes suitable for (part of) an introductory course at the upper undergraduate or early graduate level. Indeed, the notes were originally written as part of an introductory course on geometric control theory.


Introduction
Lie theory, the theory of Lie groups, Lie algebras, and their applications, is a fundamental part of mathematics that touches on a broad spectrum of fields, including geometry (classical, differential, and algebraic), ordinary and partial differential equations, group, ring, and algebra theory, complex and harmonic analysis, number theory, and physics (classical, quantum, and relativistic). It typically relies upon an array of substantial tools such as topology, differentiable manifolds and differential geometry, covering spaces, advanced linear algebra, measure theory, and group theory, to name a few. However, we will considerably simplify the approach to Lie theory by restricting our attention to the most important class of examples, namely those Lie groups that can be concretely realized as (multiplicative) groups of matrices.
Lie theory began in the late nineteenth century, primarily through the work of the Norwegian mathematician Sophus Lie, who called Lie groups "continuous groups," in contrast to the usually finite permutation groups that had been principally studied up to that point. An early major success of the theory was to provide a viewpoint for a systematic understanding of the newer geometries, such as hyperbolic, elliptic, and projective, that had arisen earlier in the century. This led Felix Klein in his Erlanger Programme to propose that geometry should be understood as the study of quantities or properties left invariant under an appropriate group of geometric transformations. In the early twentieth century Lie theory was widely incorporated into modern physics, beginning with Einstein's introduction of the Lorentz transformations as a basic feature of special relativity. Since these early beginnings, research in Lie theory has burgeoned and now spans a vast literature.
The essential feature of Lie theory is that one may associate with any Lie group G a Lie algebra $\mathfrak{g}$. The Lie algebra $\mathfrak{g}$ is a vector space equipped with a bilinear non-associative anti-commutative product, called the Lie bracket or commutator and usually denoted $[\cdot,\cdot]$. The crucial and rather surprising fact is that a Lie group is almost completely determined by its Lie algebra $\mathfrak{g}$. There is also a basic bridge between the two structures given by the exponential map $\exp : \mathfrak{g} \to G$. For many purposes, structural questions or problems concerning the highly complicated nonlinear structure G can be translated and reformulated via the exponential map in the Lie algebra $\mathfrak{g}$, where they often lend themselves to study via the tools of linear algebra (in short, nonlinear problems can often be linearized). This procedure is a major source of the power of Lie theory.

The General Linear Group
Let V be a finite dimensional vector space equipped with a complete norm $\|\cdot\|$ over the field $\mathbb{F}$, where $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$. (Actually, since the space V is finite dimensional, the norm must be equivalent to the usual euclidean norm, and hence complete.) Let End(V) denote the algebra of linear self-maps on V, and let GL(V) denote the general linear group, the group (under composition) of invertible self-maps. If $V = \mathbb{F}^n$, then End(V) may be identified with $M_n(\mathbb{F})$, the n × n matrices, and $GL(V) = GL_n(\mathbb{F})$, the matrices of nonvanishing determinant.
We endow End(V) with the usual operator norm, a complete norm defined by $\|A\| = \sup\{\|Av\| : \|v\| = 1\}$.

Absolute convergence of the exponential series $\exp(A) = \sum_{n=0}^{\infty} A^n/n!$ allows us to rearrange terms and to carry out various algebraic operations and the process of differentiation termwise. We henceforth allow ourselves the freedom to carry out such manipulations without the tedium of a rather standard detailed verification.
Exercise 3.1: (i) Show that the exponential image of a block diagonal matrix with diagonal blocks $A_1, \ldots, A_m$ is a block diagonal matrix with diagonal blocks $\exp(A_1), \ldots, \exp(A_m)$. In particular, to compute the exponential image of a diagonal matrix, simply apply the usual exponential map to the diagonal elements.
(ii) Suppose that A is similar to a diagonal matrix, $A = PDP^{-1}$. Show that $\exp(A) = P \exp(D) P^{-1}$.
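Both parts of the exercise can be checked numerically. The following is a minimal sketch in Python, assuming a truncated power series as a stand-in for exp; the helper names matmul, expm, and max_diff and the sample matrices D and P are illustrative choices, not part of the text.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]              # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Part (i): for a diagonal matrix, exponentiate the diagonal entries.
D = [[1.0, 0.0], [0.0, -2.0]]
exp_D = [[math.exp(1.0), 0.0], [0.0, math.exp(-2.0)]]
err_diagonal = max_diff(expm(D), exp_D)

# Part (ii): for A = P D P^{-1}, compare exp(A) with P exp(D) P^{-1}.
P = [[1.0, 1.0], [0.0, 1.0]]
P_inv = [[1.0, -1.0], [0.0, 1.0]]
A = matmul(matmul(P, D), P_inv)
err_similar = max_diff(expm(A), matmul(matmul(P, exp_D), P_inv))
```

Both error values should be at the level of floating-point rounding, since the truncated series converges very quickly for matrices of this size.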
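Proof: Computing termwise and rearranging, we have for $\|A\| < r$ that $\|\exp(A) - I\| = \|\sum_{n=1}^{\infty} A^n/n!\| \le \sum_{n=1}^{\infty} \|A\|^n/n! = e^{\|A\|} - 1 < s$, where $s = e^r - 1$. In particular, for $r = \ln 2$ we have $s = 1$.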

One-parameter Groups
A one-parameter subgroup of a topological group G is a continuous homomorphism $\alpha : \mathbb{R} \to G$ from the additive group of real numbers into G.
Proof: Since sA and tA commute for any $s, t \in \mathbb{R}$, we have from Proposition 2 that $t \mapsto \exp(tA)$ is a homomorphism from the additive reals to End V under multiplication. It is continuous, indeed analytic, since scalar multiplication and exp are. The last assertion follows from the homomorphism property and assures the image lies in GL(V).

Proof: Since exp(tA) defines a one-parameter subgroup, $(\exp(A/2))^2 = \exp(A/2)\exp(A/2) = \exp(A/2 + A/2) = \exp(A)$.
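The homomorphism property and the square-root identity can be illustrated numerically. This is a minimal sketch, assuming a truncated series for exp; the helpers matmul, expm, scale, and max_diff and the sample generator A are hypothetical choices made for the illustration.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def scale(t, A):
    """Scalar multiple t * A."""
    return [[t * x for x in row] for row in A]

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Sample generator: exp(tA) is the rotation of the plane by the angle t.
A = [[0.0, -1.0], [1.0, 0.0]]
s, t = 0.7, -1.9

# Homomorphism property: exp((s + t)A) = exp(sA) exp(tA).
err_homomorphism = max_diff(expm(scale(s + t, A)),
                            matmul(expm(scale(s, A)), expm(scale(t, A))))

# Square root: exp(A/2)^2 = exp(A).
half = expm(scale(0.5, A))
err_square_root = max_diff(matmul(half, half), expm(A))

# Closed form of this particular one-parameter group.
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
err_rotation = max_diff(expm(scale(t, A)), rotation)
```

All three errors should be at rounding level; the rotation generator is used only because its one-parameter group has a familiar closed form.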
(i) If a subgroup of $\mathbb{R}$ contains a sequence of nonzero numbers $\{a_n\}$ converging to 0, then the subgroup is dense.
Proof: (i) Let $t \in \mathbb{R}$ and let ε > 0. Pick $a_n$ such that $|a_n| < \varepsilon$. Pick an integer k such that $|t/a_n - k| < 1$ (for example, pick k to be the floor of $t/a_n$). Then multiplying by $|a_n|$ yields $|t - k a_n| < |a_n| < \varepsilon$. Since $k a_n$ must be in the subgroup, its density follows.
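The approximation step in this proof is constructive and can be traced with exact rational arithmetic. The sketch below is only an illustration: the function name nearby_multiple, the target t, and the sample sequence a_n = (-1)^n / 2^n are assumptions chosen for the example.

```python
from fractions import Fraction
import math

def nearby_multiple(t, a):
    """Return k*a with k = floor(t/a), so that |t - k*a| < |a|."""
    k = math.floor(t / a)
    return k * a

t = Fraction(355, 113)        # point to approximate
eps = Fraction(1, 1000)       # required accuracy

# Hypothetical sequence of nonzero subgroup elements converging to 0.
sequence = [Fraction((-1) ** n, 2 ** n) for n in range(1, 30)]

# Pick the first element with |a_n| < eps, then the multiple k*a_n near t.
a = next(x for x in sequence if abs(x) < eps)
approx = nearby_multiple(t, a)
```

Because Fraction arithmetic is exact, the inequality |t - k a_n| < |a_n| < ε from the proof holds here without any rounding caveats.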
The preceding theorem establishes that a merely continuous one-parameter subgroup must be analytic. This is a very special case of Hilbert's fifth problem, which asked whether a locally euclidean topological group was actually an analytic manifold with an analytic multiplication. This problem was solved positively some fifty years later in the 1950s by Gleason, Montgomery, and Zippin.

Remark 9: The element A ∈ End V is called the infinitesimal generator of the one-parameter group $t \mapsto \exp(tA)$. We conclude from the preceding theorem and exercise that there is a one-to-one correspondence between one-parameter subgroups and their infinitesimal generators.

Curves in End V
In this section we consider basic properties of differentiable curves in End V. Let I be an open interval and let $A(\cdot) : I \to$ End V be a curve. We say that A is $C^r$ if each of the coordinate functions $A_{ij}(t)$ is $C^r$. The derivative $\dot{A}(t)$ exists iff the derivative $\dot{A}_{ij}(t)$ of each coordinate function exists, and in this case $\dot{A}(t)$ is the linear operator with coordinate functions $\dot{A}_{ij}(t)$.

Items (1) and (2) in the following list of basic properties for operator valued functions are immediate consequences of the preceding characterization, and item (5) is a special case of the general chain rule.
is, since it is the composition with the inversion function, which is analytic, hence $C^r$ for all r.

(Hint: Differentiate the equation $A(t) \cdot A^{-1}(t) = I$ and solve for $D_t(A^{-1}(t))$.)
We can also define the integral $\int_a^b A(t)\,dt$ by taking the coordinate integrals $\int_a^b A_{ij}(t)\,dt$. The following are basic properties of the integral that follow from the real case by working coordinatewise.
We consider curves given by power series. Since for an operator A, $|a_{ij}| \le \|A\|$ for each entry $a_{ij}$ (exercise), absolute convergence of such a series in the operator norm implies the absolute convergence of each of its coordinate series.
(ii) Use termwise differentiation to show $D_t(\exp(tA)) = A \exp(tA)$.
(iii) Show that $X(t) = \exp(tA) X_0$ satisfies the differential equation on End V given by $\dot{X}(t) = A X(t)$, $X(0) = X_0$.
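The claim in the last item can be tested numerically by comparing a finite-difference derivative of $X(t) = \exp(tA)X_0$ with $AX(t)$. This is a rough sketch under stated assumptions: exp is approximated by a truncated series, the derivative by a central difference, and the names matmul, expm, scale, max_diff and the sample data A, X0 are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def scale(t, A):
    """Scalar multiple t * A."""
    return [[t * x for x in row] for row in A]

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[0.0, 1.0], [-2.0, -3.0]]     # sample coefficient matrix
X0 = [[1.0, 2.0], [0.0, 1.0]]      # sample initial condition

def X(t):
    """Candidate solution X(t) = exp(tA) X0."""
    return matmul(expm(scale(t, A)), X0)

# Central difference approximation of the derivative of X at time t.
t, h = 0.8, 1e-5
plus, minus = X(t + h), X(t - h)
derivative = [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
              for rp, rm in zip(plus, minus)]

residual = max_diff(derivative, matmul(A, X(t)))   # size of X'(t) - A X(t)
initial_error = max_diff(X(0.0), X0)               # X(0) should equal X0
```

The residual is limited by the finite-difference step rather than by the series, so it is small but not at rounding level.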

The Baker-Campbell-Hausdorff Formalism
It is a useful fact that the derivative of the multiplication map at the identity I of End V is the addition map.

If we define $\log(I + A) = \sum_{n=1}^{\infty} (-1)^{n+1} A^n/n$, then just as for real numbers this series converges absolutely for $\|A\| < 1$. Further, since exp(log A) = A holds in the case of real numbers, it holds in the algebra of formal power series, and hence in the linear operator or matrix case. Indeed, one can conclude that exp is one-to-one on $B_{\ln 2}(0)$, carries it into $B_1(I)$, and has inverse given by the preceding logarithmic series, all this without appeal to the Inverse Function Theorem.
The local diffeomorphic property of the exponential function allows one to pull back the multiplication in GL(V) locally to a neighbourhood of 0 in End V. One chooses two points A, B in a sufficiently small neighbourhood of 0, forms the product $\exp(A) \cdot \exp(B)$, and takes the log of this product: $A * B := \log(\exp(A) \cdot \exp(B))$. This Baker-Campbell-Hausdorff multiplication is defined on any $B_r(0)$ small enough so that $\exp(B_r(0)) \cdot \exp(B_r(0))$ is contained in the domain of the log function; such exist by the local diffeomorphism property and the continuity of multiplication.

Now there is a beautiful formula, called the Baker-Campbell-Hausdorff formula, that gives A * B as a power series in A and B with the higher powers being given by higher order Lie brackets or commutators, where the (first-order) commutator or Lie bracket is given by [A,B] := AB − BA. The Baker-Campbell-Hausdorff power series is obtained by manipulating the power series for $\log(\exp(x) \cdot \exp(y))$ in two noncommuting variables x, y in such a way that it is rewritten so that all powers are commutators of some order. To develop this whole formula would take us too far afield from our goals, but we do derive the first and second order terms, which suffice for many purposes.
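The first and second order terms of the series are the standard ones, $A * B = A + B + \tfrac{1}{2}[A,B] + \cdots$, and their quality can be probed numerically. The sketch below assumes truncated series for exp and log and uses hypothetical helpers (matmul, expm, logm, bch_errors) with small sample matrices; it is an illustration, not the derivation of the text.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def logm(M, terms=60):
    """Approximate log(M) = sum_k (-1)^(k+1) (M - I)^k / k for ||M - I|| < 1."""
    n = len(M)
    X = [[M[i][j] - float(i == j) for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        power = matmul(power, X)                   # (M - I)^k
        sign = 1.0 if k % 2 == 1 else -1.0
        total = [[total[i][j] + sign * power[i][j] / k for j in range(n)]
                 for i in range(n)]
    return total

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def bch_errors(eps):
    """Errors of the first and second order approximations of A * B."""
    A = [[0.0, eps], [0.0, 0.0]]
    B = [[0.0, 0.0], [eps, 0.0]]
    product = logm(matmul(expm(A), expm(B)))       # A * B
    AB, BA = matmul(A, B), matmul(B, A)
    first = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
    second = [[first[i][j] + 0.5 * (AB[i][j] - BA[i][j]) for j in range(2)]
              for i in range(2)]
    return max_diff(product, first), max_diff(product, second)

err_first, err_second = bch_errors(0.1)
_, err_second_half = bch_errors(0.05)
```

Halving the size of A and B should reduce the second order error by a factor of roughly 8, which is consistent with a remainder of third order.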
(Note that the second equality is just the definition of the derivative, where the norm on End V × End V is the sum norm.) This gives (i).
(ii) Using (i), we obtain the following string: as A, B → 0, so the right-hand sum is less than or equal to $2(\|A\| + \|B\|)$, where the second inequality and last equality follow by applying appropriate parts of Proposition 12. Proposition 12 also ensures that follows directly from equation (5). Finally, by equation (3) and Proposition 12(ii), which goes to 0 as A, B → 0.

The Trotter and Commutator Formulas
In the following sections we show that one can associate with each closed subgroup of GL(V) a Lie subalgebra of End V, that is, a subspace closed under Lie bracket. The exponential map carries this Lie algebra into the matrix group and, using properties of the exponential map, one can frequently transfer structural questions about the Lie group to the Lie algebra, where they often can be treated using methods of linear algebra. In this section we look at some of the basic properties of the exponential map that give rise to these strong connections between a matrix group and its Lie algebra.

(ii) The first equality follows directly by applying the exponential function to (i): $\exp(A + B) = \exp(\lim_n n(A_n * B_n)) = \lim_n \exp(n(A_n * B_n)) = \lim_n (\exp(A_n * B_n))^n = \lim_n (\exp A_n \exp B_n)^n$, where the last equality follows from the fact that exp is a local isomorphism from the BCH-multiplication to operator multiplication, and the penultimate equality from the fact that $\exp(nA) = \exp(A)^n$, since exp restricted to $\mathbb{R}A$ is a one-parameter group. The second equality in part (ii) of the theorem follows from the first by setting $A_n = A/n$, $B_n = B/n$.
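The special case $A_n = A/n$, $B_n = B/n$ gives the Trotter product formula $\exp(A+B) = \lim_n (\exp(A/n)\exp(B/n))^n$, whose convergence can be observed numerically. The following sketch assumes a truncated series for exp; the helper names and the noncommuting sample matrices A and B are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def scale(t, A):
    """Scalar multiple t * A."""
    return [[t * x for x in row] for row in A]

def matpow(M, n):
    """n-th power of M by repeated multiplication."""
    result = [[float(i == j) for j in range(len(M))] for i in range(len(M))]
    for _ in range(n):
        result = matmul(result, M)
    return result

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Two sample matrices that do not commute.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
target = expm([[A[i][j] + B[i][j] for j in range(2)] for i in range(2)])

def trotter_error(n):
    """Distance between (exp(A/n) exp(B/n))^n and exp(A + B)."""
    step = matmul(expm(scale(1.0 / n, A)), expm(scale(1.0 / n, B)))
    return max_diff(matpow(step, n), target)

errors = {n: trotter_error(n) for n in (1, 10, 100, 1000)}
```

For n = 1 the error is large, reflecting that exp(A)exp(B) differs from exp(A+B) when A and B do not commute; the error then shrinks roughly in proportion to 1/n.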
The exponential image of the Lie bracket (commutator) can be calculated from products of group commutators. To see the second equality in item (i), observe first that on a BCH-neighbourhood where the exponential map is injective,

Applying this equality to the given sequences, we obtain
Now if we show that the two terms in the second expression approach 0 as n→∞, then the first expression approaches 0, and thus the two limits in (i) will be equal. We observe first that by the Trotter Product Formula

The Lie Algebra of a Matrix Group
In this section we set up the fundamental machinery of Lie theory, namely we show how to assign to each matrix group a (uniquely determined) Lie algebra and an exponential map from the Lie algebra to the matrix group that connects the two together. We begin by defining the notions and giving some examples.
By a matrix group we mean a closed subgroup of GL(V), where V is a finite dimensional vector space.

(1) The general linear group GL(V). If $V = \mathbb{F}^n$, then we write the group of n × n invertible matrices as $GL_n(\mathbb{F})$.
(3) Let V be a real (resp. complex) Hilbert space equipped with an inner product $\langle \cdot, \cdot \rangle$. The orthogonal group (resp. unitary group) consists of all transformations preserving the inner product, i.e., all $g \in GL(V)$ such that $\langle gx, gy \rangle = \langle x, y \rangle$ for all $x, y \in V$. If $V = \mathbb{R}^n$ (resp. $\mathbb{C}^n$) equipped with the usual inner product, then the orthogonal group $O_n$ (resp. unitary group $U_n$) consists of all $g \in GL(V)$ such that $g^t = g^{-1}$ (resp. $g^* = g^{-1}$).
(4) Let $V = \mathbb{R}^n \oplus \mathbb{R}^n$, equipped with the symplectic form Q. The real symplectic group is the subgroup of GL(V) preserving Q.

It follows directly from the preceding exercise that any subspace of End V that is closed with respect to the Lie bracket operation is a Lie subalgebra.
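As a concrete instance, the skew-symmetric matrices form a subspace of $M_n(\mathbb{R})$ closed under the Lie bracket, and their exponentials are orthogonal; both are standard facts that are not derived in the text above. The sketch below checks them numerically for sample 3 × 3 matrices, assuming a truncated series for exp and hypothetical helper names.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """Approximate exp(A) by a partial sum of the series sum_k A^k / k!."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def max_diff(X, Y):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Two sample skew-symmetric matrices (A^t = -A, B^t = -B).
A = [[0.0, -1.0, 0.5], [1.0, 0.0, -2.0], [-0.5, 2.0, 0.0]]
B = [[0.0, 0.3, 0.0], [-0.3, 0.0, 1.0], [0.0, -1.0, 0.0]]

# The bracket [A, B] = AB - BA is again skew-symmetric.
AB, BA = matmul(A, B), matmul(B, A)
bracket = [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]
negated = [[-x for x in row] for row in bracket]
err_skew = max_diff(transpose(bracket), negated)

# g = exp(A) satisfies g^t g = I, i.e. g lies in the orthogonal group.
g = expm(A)
identity = [[float(i == j) for j in range(3)] for i in range(3)]
err_orthogonal = max_diff(matmul(transpose(g), g), identity)
```

Both errors should be at rounding level, which is consistent with the skew-symmetric matrices being mapped into $O_3$ by the exponential map.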
We define a matrix semigroup S to be a closed multiplicative subsemigroup of GL(V) that contains the identity element. We define the tangent set of S by $L(S) = \{A \in \mathrm{End}\, V : \exp(tA) \in S \text{ for all } t \ge 0\}$. We define a wedge in End V to be a closed subset containing 0 that is closed under addition and scalar multiplication by nonnegative scalars.

Proposition 17. If S is a matrix semigroup, then L(S) is a wedge.
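Proof. Since $I = \exp(t \cdot 0)$ for all t ≥ 0 and I ∈ S, we conclude that 0 ∈ L(S). If A ∈ L(S), then exp(tA) ∈ S for all t ≥ 0, and thus exp(rtA) ∈ S for all r, t ≥ 0. It follows that rA ∈ L(S) for r ≥ 0. If A, B ∈ L(S) and t ≥ 0, then by the Trotter product formula $\exp(t(A + B)) = \lim_n (\exp(tA/n)\exp(tB/n))^n$, which lies in S since S is a closed subsemigroup; hence A + B ∈ L(S). Finally, L(S) is closed, since $A_n \to A$ with $A_n \in L(S)$ implies $\exp(tA) = \lim_n \exp(tA_n) \in S$ for all t ≥ 0.

Exercise 8.4. Show for a matrix group G (which is a matrix semigroup, in particular) that $\mathfrak{g} = L(G)$.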

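Lemma 19. Suppose that G is a matrix group and $\{A_n\}$ is a sequence in End V such that $A_n \to 0$ and $\exp(A_n) \in G$ for all n. If $s_n A_n$ has a cluster point for some sequence of real numbers $s_n$, then the cluster point belongs to $\mathfrak{g}$.

Proof. Let B be a cluster point of $s_n A_n$. Passing to a subsequence, we may assume that $s_n A_n$ converges to B. Let $t \in \mathbb{R}$ and for each n pick an integer $m_n$ such that $|m_n - t s_n| < 1$. Then $\|m_n A_n - tB\| \le \|(m_n - t s_n) A_n\| + \|t(s_n A_n - B)\| \le \|A_n\| + |t| \, \|s_n A_n - B\| \to 0$, which implies $m_n A_n \to tB$. Since $\exp(m_n A_n) = (\exp A_n)^{m_n} \in G$ for each n and G is closed, we conclude that the limit of this sequence, exp(tB), is in G. Since t was arbitrary, we see that $B \in \mathfrak{g}$.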
We come now to a crucial and central result.

Proof. Let $B_r(0)$ be a BCH-neighborhood around 0 in End V, which maps homeomorphically under exp to an open neighborhood $\exp(B_r(0))$ of I in GL(V) with inverse log. Assume that $\exp(B_r(0) \cap \mathfrak{g})$ does not contain a neighborhood of I in G. Then there exists a sequence $g_n$ contained in G but missing $\exp(B_r(0) \cap \mathfrak{g})$ that converges to I. Since $\exp(B_r(0))$ is an open neighborhood of I, we may assume without loss of generality that the sequence is contained in this open set. Hence $A_n = \log g_n$ is defined for each n, and $A_n \to 0$. Note that $A_n \in B_r(0)$, but $A_n \notin \mathfrak{g}$, for each n, since otherwise $\exp(A_n) = g_n \in \exp(\mathfrak{g} \cap B_r(0))$.
Let W be a complementary subspace to $\mathfrak{g}$ in End V and consider the restriction of the BCH-multiplication $\mu(A, B) = A * B$ to $(\mathfrak{g} \cap B_r(0)) \times (W \cap B_r(0))$. By the proof of Proposition 12, the derivative $d\mu_{(0,0)}$ of µ at (0, 0) is addition, and so the derivative of the restriction of µ to $(\mathfrak{g} \cap B_r(0)) \times (W \cap B_r(0))$ is the addition map $+ : \mathfrak{g} \times W \to$ End V. Since $\mathfrak{g}$ and W are complementary subspaces, this map is an isomorphism of vector spaces. Thus by the Inverse Function Theorem there exists an open ball $B_s(0)$, 0 < s ≤ r, such that µ restricted to $(\mathfrak{g} \cap B_s(0)) \times (W \cap B_s(0))$ is a diffeomorphism onto an open neighborhood Q of 0 in End V. Since $A_n \in Q$ for large n, we have $A_n = B_n * C_n$ (uniquely) for $B_n \in \mathfrak{g} \cap B_s(0)$ and $C_n \in W \cap B_s(0)$. Since the restriction of µ is a homeomorphism and 0 * 0 = 0, we have $(B_n, C_n) \to (0, 0)$, i.e., $B_n \to 0$ and $C_n \to 0$.
Note that $C_n \ne 0$, since otherwise $A_n = B_n \in \mathfrak{g}$. By compactness of the unit sphere in End V, we have that $C_n/\|C_n\|$ clusters to some $C \in W$ with $\|C\| = 1$. Furthermore, $g_n = \exp(A_n) = \exp(B_n * C_n) = \exp(B_n)\exp(C_n)$, so that $\exp(C_n) = (\exp B_n)^{-1} g_n \in G$. It follows from Lemma 19 that $C \in \mathfrak{g}$. But this is impossible since $\mathfrak{g} \cap W = \{0\}$ and $C \ne 0$. We conclude that $\exp(B_r(0) \cap \mathfrak{g})$ does contain some neighborhood N of I in G.

We sketch here how the theory of matrix groups as manifolds develops from what we have already done. Recall that a manifold is a topological space M, which we will assume to be metrizable, that has a covering of open sets each of which is homeomorphic to an open subset of euclidean space. Any family of such homeomorphisms from any open cover of M is called an atlas, and the members of the atlas are called charts. The preceding theorem allows us to introduce charts on a matrix group G in a very natural way. Let U be an open set around 0 in $\mathfrak{g}$ contained in a BCH-neighborhood such that W = exp U is an open neighborhood of I in G. Let $\lambda_g : G \to G$ be the left translation map, i.e., $\lambda_g(h) = gh$. We define an atlas of charts on G by taking all open sets $g^{-1}N$, where N is an open subset of G such that $I \in N \subseteq W$, and defining the chart to be $\log \circ \lambda_g : g^{-1}N \to \mathfrak{g}$ (to view these as euclidean charts, we identify $\mathfrak{g}$ with some $\mathbb{R}^n$ via identifying some basis of $\mathfrak{g}$ with the standard basis of $\mathbb{R}^n$). One can check directly, using the fact that multiplication of matrices is polynomial, that for two such charts φ and ψ, the composition $\varphi \circ \psi^{-1}$, where defined, is smooth, indeed analytic. This gives rise to a differentiable structure on G, making it a smooth (analytic) manifold. The multiplication and inversion on G, when appropriately composed with charts, are analytic functions, and thus one obtains an analytic group, a group on an analytic manifold with analytic group operations.
This is the unique analytic structure on the group making it a smooth manifold so that the exponential map is also smooth.

The Lie Algebra Functor
We consider the category of matrix groups to be the category with objects matrix groups and morphisms continuous (group) homomorphisms, and the category of Lie algebras to be the category with objects subalgebras of some End V and morphisms linear maps that preserve the Lie bracket. The next result shows that the assignment to a matrix group of its Lie algebra is functorial.
Proposition 21. Let α : G → H be a continuous homomorphism between matrix groups. Then there exists a unique Lie algebra homomorphism $d\alpha : \mathfrak{g} \to \mathfrak{h}$ such that the square formed by α, dα, and the two exponential maps commutes, that is, $\alpha \circ \exp = \exp \circ\, d\alpha$.

Proof. Let $A \in \mathfrak{g}$. Then the map β(t) := α(exp(tA)) is a one-parameter subgroup of H. Hence it has a unique infinitesimal generator $\tilde{A} \in \mathfrak{h}$. Define $d\alpha(A) = \tilde{A}$. We show that dα is a Lie algebra homomorphism.
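Let $A, B \in \mathfrak{g}$. Then, by the Trotter product formula and the continuity of α, $\alpha(\exp(t(A + B))) = \alpha(\lim_n (\exp(tA/n)\exp(tB/n))^n) = \lim_n (\exp(t\tilde{A}/n)\exp(t\tilde{B}/n))^n = \exp(t(\tilde{A} + \tilde{B}))$. This shows that $d\alpha(A + B) = \tilde{A} + \tilde{B} = d\alpha(A) + d\alpha(B)$, and thus dα is linear (homogeneity, dα(rA) = r dα(A), is immediate from the definition). In an analogous way, using the commutator formula, one shows that dα preserves the commutator.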
For t = 1, $\alpha(\exp A) = \exp(\tilde{A}) = \exp(d\alpha(A))$. Thus $\alpha \circ \exp = \exp \circ\, d\alpha$. This shows the square commutes. If $\gamma : \mathfrak{g} \to \mathfrak{h}$ is another Lie algebra homomorphism that also makes the square commute, then for $A \in \mathfrak{g}$ and all $t \in \mathbb{R}$,