Note: These are notes live-tex’d from a graduate course in Lie Algebras taught by Brian Boe at the University of Georgia in Fall 2022. As such, any errors or inaccuracies are almost certainly my own.
Last updated: 2022-12-27
Main goal: understand semisimple finite-dimensional Lie algebras over \({\mathbf{C}}\). These are extremely well-understood, but there are open problems in infinite-dimensional representations, representations over other fields, and Lie superalgebras.
Recall that an associative algebra is a ring with the structure of a \(k{\hbox{-}}\)vector space, and algebra in general means a not-necessarily-associative algebra. Given any associative algebra \(A\), one can define a new bilinear product \begin{align*} [{-}, {-}]: A\otimes_k A &\to A \\ a\otimes b &\mapsto ab-ba \end{align*} called the commutator bracket. This yields a new algebra \(A_L\), which is an example of a Lie algebra.
An \(L\in {}_{{ \mathbf{F} }}{\mathsf{Mod}}\) equipped with an operation \([{-}, {-}]: L\times L\to L\) (called the bracket) is a Lie algebra if
Check that \([ab]\mathrel{\vcenter{:}}= ab-ba\) satisfies the Jacobi identity.
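A sketch of the check, using associativity of \(A\): expanding,
\begin{align*}
[a[bc]] = a(bc-cb) - (bc-cb)a = abc - acb - bca + cba
,\end{align*}
and similarly for the cyclic permutations \([b[ca]]\) and \([c[ab]]\). In the sum of all three expansions, each of the twelve monomials appears twice with opposite signs, so \([a[bc]] + [b[ca]] + [c[ab]] = 0\).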
Expanding \([x+y, x+y] = 0\) yields \([xy] = -[yx]\). Note that this is equivalent to axiom 2 when \(\operatorname{ch}{ \mathbf{F} }\neq 2\) (given axiom 1).
The Jacobi identity can be rewritten as \([x[yz]] = [[xy]z] + [y[xz]]\), where the second term is an error term measuring the failure of associativity. Note that this is essentially the Leibniz rule.
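In detail, by the Jacobi identity and anticommutativity (applied twice):
\begin{align*}
[x[yz]] = -[y[zx]] - [z[xy]] = [y[xz]] + [[xy]z]
.\end{align*}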
A Lie algebra \(L\in\mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\) is abelian if \([xy]=0\) for all \(x,y\in L\).
A morphism in \(\mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\) is a morphism \(\phi\in {}_{{ \mathbf{F} }}{\mathsf{Mod}}(L, L')\) satisfying \(\phi( [xy] ) = [ \phi(x) \phi(y) ]\).
Check that if \(\phi\) has an inverse in \({}_{{ \mathbf{F} }}{\mathsf{Mod}}\), then \(\phi\) automatically has an inverse in \(\mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\).
A vector subspace \(K\leq L\) is a Lie subalgebra if \([xy]\in K\) for all \(x,y\in K\).
For \(V\in {}_{{ \mathbf{F} }}{\mathsf{Mod}}\), the endomorphism algebra \(A\mathrel{\vcenter{:}}={ \operatorname{End} }_{ \mathbf{F} }(V)\) is an associative algebra over \({ \mathbf{F} }\). Thus it can be made into a Lie algebra \({\mathfrak{gl}}(V) \mathrel{\vcenter{:}}= A_L\) by defining \([xy] = xy-yx\) as above.
Any subalgebra \(K\leq {\mathfrak{gl}}(V)\) is a linear Lie algebra.
After picking a basis for \(V\), there is a noncanonical isomorphism \({ \operatorname{End} }_{ \mathbf{F} }(V) \cong \operatorname{Mat}_{n\times n}({ \mathbf{F} })\) where \(n\mathrel{\vcenter{:}}=\dim_{ \mathbf{F} }V\). The resulting Lie algebra is \({\mathfrak{gl}}_n({ \mathbf{F} }) \mathrel{\vcenter{:}}=\operatorname{Mat}_{n\times n}({ \mathbf{F} })_L\).
By Ado-Iwasawa, any finite-dimensional Lie algebra is isomorphic to some linear Lie algebra.
The upper triangular matrices form a subalgebra \({\mathfrak{t}}_n({ \mathbf{F} }) \leq {\mathfrak{gl}}_n({ \mathbf{F} })\).1 This is sometimes called the Borel and denoted \({\mathfrak{b}}\). There is also a subalgebra \({\mathfrak{n}}_n({ \mathbf{F} })\) of strictly upper triangular matrices. The diagonal matrices form a maximal torus/Cartan subalgebra \({\mathfrak{h}}_n({ \mathbf{F} })\) which is abelian.
These can be viewed as the matrices of a nondegenerate bilinear form: writing \(N\) for the size of the matrices, the matrices act on \(V \mathrel{\vcenter{:}}={ \mathbf{F} }^N\) by a bilinear form \(f: V\times V\to { \mathbf{F} }\) given by \(f(v, w) = v^t s w\). The form will be symmetric for \({\mathfrak{so}}\) and skew-symmetric for \({\mathfrak{sp}}\). The equation \(sx=-x^ts\) is a version of preserving the bilinear form \(s\). Note that these are the Lie algebras of the Lie groups \(G = {\operatorname{SO}}_{2n+1}({ \mathbf{F} }), {\operatorname{Sp}}_{2n}({ \mathbf{F} }), {\operatorname{SO}}_{2n}({ \mathbf{F} })\) defined by the condition \(f(gv, gw) = f(v, w)\) for all \(v,w\in { \mathbf{F} }^N\) where \(G = \left\{{g\in \operatorname{GL}_N({ \mathbf{F} }) {~\mathrel{\Big\vert}~}f(gv, gw) = f(v, w)}\right\}\). This is equivalent to the condition that \(f(gv, w) = f(v, g^{-1}w)\).
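For concreteness, using the standard choice of \(s\) for \({\mathfrak{sp}}_{2n}\) (following Humphreys' conventions), \(s = { \begin{bmatrix} {0} & {I_n} \\ {-I_n} & {0} \end{bmatrix} }\): when \(n=1\) and \(x = { \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} }\),
\begin{align*}
sx = { \begin{bmatrix} {c} & {d} \\ {-a} & {-b} \end{bmatrix} },
\qquad
-x^t s = { \begin{bmatrix} {c} & {-a} \\ {d} & {-b} \end{bmatrix} }
,\end{align*}
so \(sx = -x^ts\) iff \(d = -a\) (with \(b,c\) free), recovering \({\mathfrak{sp}}_2({ \mathbf{F} }) = {\mathfrak{sl}}_2({ \mathbf{F} })\).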
Philosophy: \(G\to {\mathfrak{g}}\) sends products to sums.
Check that the definitions of \({\operatorname{SO}}_n({ \mathbf{F} }), {\operatorname{Sp}}_n({ \mathbf{F} })\) yield Lie algebras.
Let \(A\in \mathsf{Alg}_{/ {{ \mathbf{F} }}}\), not necessarily associative (e.g. a Lie algebra). An \({ \mathbf{F} }{\hbox{-}}\)derivation is a morphism \(D: A \to A\) such that \(D(ab) = D(a)b + aD(b)\). Equipped with the commutator bracket, this defines a Lie algebra \(\mathop{\mathrm{Der}}_{ \mathbf{F} }(A) \leq {\mathfrak{gl}}_{ \mathbf{F} }(A)\).2
If \(D, D'\) are derivations, then the composition \(D\circ D'\) is not generally a derivation.
If \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\), for \(x\in L\) fixed define the adjoint operator \begin{align*} { \operatorname{ad}}_x: L &\to L \\ y &\mapsto [x, y] .\end{align*} Note that \({ \operatorname{ad}}_x\in \mathop{\mathrm{Der}}_{ \mathbf{F} }(L)\) by the Jacobi identity. Any derivation of this form is an inner derivation, and all other derivations are outer derivations.
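Unwinding this: the Leibniz form of the Jacobi identity gives
\begin{align*}
{ \operatorname{ad}}_x([yz]) = [x[yz]] = [[xy]z] + [y[xz]] = [ { \operatorname{ad}}_x(y), z] + [y, { \operatorname{ad}}_x(z)]
.\end{align*}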
Given \(x\in K\leq L\) with \(K\) a Lie subalgebra, we’ll want to distinguish \({ \operatorname{ad}}_L x\) and \({ \operatorname{ad}}_K x\) (which may differ). Note that \({\mathfrak{gl}}_n({ \mathbf{F} }) \geq {\mathfrak{b}}= {\mathfrak{h}}\oplus {\mathfrak{n}}\), where \({\mathfrak{b}}\) are upper triangular, \({\mathfrak{h}}\) diagonal, and \({\mathfrak{n}}\) strictly upper triangular matrices. If \(x\in {\mathfrak{h}}\) then \({ \operatorname{ad}}_{\mathfrak{h}}x = 0\) since \({\mathfrak{h}}\) is abelian, but \({ \operatorname{ad}}_{\mathfrak{b}}x \neq 0\) in general, since \([x, {\mathfrak{n}}]\neq 0\).
Some notes:
Is \({\mathfrak{h}}{~\trianglelefteq~}{\mathfrak{b}}\)?
A Lie algebra \(L\) is simple if \(L\neq 0\), \(\operatorname{Id}(L) = \left\{{0, L}\right\}\), and \([L, L] \neq 0\). Note that the condition \([LL] \neq 0\) only rules out the 1-dimensional Lie algebra: if \(\operatorname{Id}(L) = \left\{{0, L}\right\}\) but \([L, L] = 0\), then any subspace \(0 < K < L\) would satisfy \(K{~\trianglelefteq~}L\) (since \([L,K] = 0\)), so \(\dim L = 1\).
Let \(L = {\mathfrak{sl}}_2({\mathbf{C}})\), so \(\operatorname{tr}(x) = 0\). This has standard basis \begin{align*} x = { \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} }, \qquad y = { \begin{bmatrix} {0} & {0} \\ {1} & {0} \end{bmatrix} },\qquad h = { \begin{bmatrix} {1} & {0} \\ {0} & {-1} \end{bmatrix} }. \\ [xy]=h,\quad [hx] = 2x,\quad [hy] = -2y .\end{align*}
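As a quick check of the first relation:
\begin{align*}
[xy] = xy - yx = { \begin{bmatrix} {1} & {0} \\ {0} & {0} \end{bmatrix} } - { \begin{bmatrix} {0} & {0} \\ {0} & {1} \end{bmatrix} } = h
,\end{align*}
and \([hx] = 2x, [hy] = -2y\) follow similarly.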
Prove that \({\mathfrak{sl}}_2({\mathbf{C}})\) is simple.
Show that for \(K\leq L\), the normalizer \(N_L(K)\) is the largest subalgebra of \(L\) in which \(K\) is an ideal.
Show that \({\mathfrak{h}}\subseteq {\mathfrak{g}}\mathrel{\vcenter{:}}={\mathfrak{sl}}_n({\mathbf{C}})\) is a self-normalizing subalgebra of \({\mathfrak{g}}\).
Hint: use \([h, e_{ij}] = (h_i - h_j) e_{ij}\) where \(h = \operatorname{diag}(h_1,\cdots, h_n)\). The standard basis is \({\mathfrak{h}}= \left\langle{e_{11} - e_{22}, e_{22} - e_{33}, \cdots, e_{n-1, n-1} - e_{n,n} }\right\rangle\).
What is \(\dim {\mathfrak{sl}}_3({\mathbf{C}})\)? What is the basis for \({\mathfrak{g}}\) and \({\mathfrak{h}}\)?
Notes:
Let \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\), then \(\mathop{\mathrm{Aut}}(L)\) is the group of isomorphisms \(L { \, \xrightarrow{\sim}\, }L\). Some important examples: if \(L\) is linear and \(g\in \operatorname{GL}(V)\), if \(gLg^{-1}= L\) then \(x\mapsto gxg^{-1}\) is an automorphism. This holds for example if \(L = {\mathfrak{gl}}_n({ \mathbf{F} })\) or \({\mathfrak{sl}}_n({ \mathbf{F} })\). Assume \(\operatorname{ch}{ \mathbf{F} }= 0\) and let \(x\in L\) with \({ \operatorname{ad}}x\) nilpotent, say \(( { \operatorname{ad}}x)^k=0\). Then the power series \(e^{ { \operatorname{ad}}x} = \sum_{n\geq 0} ( { \operatorname{ad}}x)^n/n!\) is a polynomial, since all terms with \(n\geq k\) vanish.
\(\exp( { \operatorname{ad}}x)\in \mathop{\mathrm{Aut}}(L)\) is an automorphism. More generally, \(e^\delta\in \mathop{\mathrm{Aut}}(L)\) for \(\delta\) any nilpotent derivation.
\begin{align*} \delta^n(xy) = \sum_{i=0}^n {n\choose i} \delta^{n-i}(x) \delta^{i}(y) .\end{align*}
One can prove this by induction. Then check that \(\exp(\delta)(x)\exp(\delta)(y) = \exp(\delta)(xy)\) using this formula, and writing \(\exp(\delta) = 1+\eta\) there is an inverse \(1-\eta +\eta^2 - \cdots \pm \eta^{k-1}\). Automorphisms of the form \(\exp(\delta)\) for \(\delta\) a nilpotent derivation are called inner automorphisms, and all others are outer automorphisms.
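For a concrete example: in \({\mathfrak{sl}}_2({\mathbf{C}})\) with the ordered basis \(\{x,h,y\}\) from before, \({ \operatorname{ad}}_x(x) = 0\), \({ \operatorname{ad}}_x(h) = -2x\), and \({ \operatorname{ad}}_x(y) = h\), so
\begin{align*}
{ \operatorname{ad}}_x = { \begin{bmatrix} {0} & {-2} & {0} \\ {0} & {0} & {1} \\ {0} & {0} & {0} \end{bmatrix} },
\qquad
( { \operatorname{ad}}_x)^3 = 0,
\qquad
e^{ { \operatorname{ad}}x} = I + { \operatorname{ad}}_x + {( { \operatorname{ad}}_x)^2 \over 2} = { \begin{bmatrix} {1} & {-2} & {-1} \\ {0} & {1} & {1} \\ {0} & {0} & {1} \end{bmatrix} }
,\end{align*}
and one can verify directly that this matrix preserves the bracket relations.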
Recall that if \(L \subseteq {\mathfrak{g}}\) is any subset, the derived algebra \([LL]\) is the span of \([xy]\) for \(x,y\in L\). This is the analog of needing to take products of commutators to generate a commutator subgroup for groups. Define the derived series of \(L\) as \begin{align*} L^{(0)} = L, \quad L^{(1)} = [LL], \quad L^{(2)} = [L^{(1)}, L^{(1)}], \quad \cdots, \quad L^{(i+1)} = [L^{(i)} L^{(i)}] .\end{align*}
These are all ideals.
By induction on \(i\) – it STS that \([x[ab]] \in L^{(i)}\) for \(a,b \in L^{(i-1)}\) and \(x\in L\). Use the Jacobi identity and the induction hypothesis that \(L^{(i-1)} {~\trianglelefteq~}L\): \begin{align*} [x,[ab]] = [[xa]b] + [a[xb]] \in L^{(i-1)} + L^{(i-1)} \subseteq L^{(i)} .\end{align*}
If \(L^{(n)} = 0\) for some \(n\geq 0\) then \(L\) is called solvable.
Note that
Let \({\mathfrak{b}}\mathrel{\vcenter{:}}={\mathfrak{b}}_n({ \mathbf{F} })\) be upper triangular matrices, show that \({\mathfrak{b}}\) is solvable.
Use that \([{\mathfrak{b}}{\mathfrak{b}}] = {\mathfrak{n}}\) is strictly upper triangular since diagonals cancel. More generally, bracketing two matrices that each vanish on their first \(k\) diagonals yields a matrix vanishing on its first \(2k\) diagonals, so the \(i\)th term of the derived series vanishes on roughly its first \(2^i\) diagonals and the series terminates.
Let \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\), then
Prove these.
Every \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}^{{\mathrm{fd}}}\) has a unique maximal solvable ideal, the sum of all solvable ideals, called the radical of \(L\), denoted \(\mathop{\mathrm{Rad}}(L)\). \(L\) is semisimple if \(\mathop{\mathrm{Rad}}(L) = 0\).
Prove that any simple algebra is semisimple, and in general \(L/\mathop{\mathrm{Rad}}(L)\) is semisimple (if nonzero).
Assume \({\mathfrak{sl}}_n({\mathbf{C}})\) is simple, then \(R\mathrel{\vcenter{:}}=\mathop{\mathrm{Rad}}({\mathfrak{gl}}_n({\mathbf{C}})) = Z({\mathfrak{g}}) \supseteq{\mathbf{C}}\operatorname{id}_n\) for \({\mathfrak{g}}\mathrel{\vcenter{:}}={\mathfrak{gl}}_n({\mathbf{C}})\).
\(\supseteq\): The center is always a solvable ideal, since it is an abelian ideal, and the radical is the sum of all solvable ideals.
\(\subseteq\): Suppose \(Z\subsetneq R\) is proper; then there is a non-scalar matrix \(x\in R\). Write \(x = aI_n + y\) for \(a = {\mathrm{tr}}(x)/n\) and \(0\neq y\in {\mathfrak{sl}}_n({\mathbf{C}})\) traceless. Consider the ideal \(I \mathrel{\vcenter{:}}=\left\langle{x}\right\rangle {~\trianglelefteq~}{\mathfrak{gl}}_n({\mathbf{C}})\) generated by \(x\), i.e. the span of \(x\), the brackets \([zx]\) for \(z\in {\mathfrak{g}}\), and their iterated brackets containing \(x\), e.g. \([z_1[z_2x]]\); note \(I \subseteq R\) since \(x\in R\) and \(R\) is an ideal, so \(I\) is solvable. Note also that \([zx]=[zy]\) since \(aI_n\) is central. Since \({\mathfrak{sl}}_n({\mathbf{C}})\) is simple, \(\left\langle{y}\right\rangle_{{\mathfrak{sl}}_n({\mathbf{C}})} = {\mathfrak{sl}}_n({\mathbf{C}})\), and thus \({\mathfrak{sl}}_n({\mathbf{C}}) \subseteq I\). Then \(x, y\in I\) gives \(aI_n = x-y\in I\), and if \(a\neq 0\) this yields \(I_n\in I\), so \({\mathbf{C}}\cdot I_n \subseteq I\), forcing \(I = {\mathfrak{g}}\) since every matrix in \({\mathfrak{gl}}_n({\mathbf{C}})\) is a scalar multiple of the identity plus a traceless matrix. Either way \(I\) contains \({\mathfrak{sl}}_n({\mathbf{C}})\), contradicting solvability of \(I\): by simplicity \({\mathfrak{sl}}_n({\mathbf{C}})^{(1)} = [{\mathfrak{sl}}_n {\mathfrak{sl}}_n] = {\mathfrak{sl}}_n({\mathbf{C}})\), so the derived series of \({\mathfrak{sl}}_n({\mathbf{C}})\) never terminates, while subalgebras of solvable algebras must be solvable. \(\contradiction\)
The descending/lower central series of \(L\) is defined as \begin{align*} L^0 = L, \quad L^1 = [LL], \quad \cdots, \quad L^i = [L, L^{i-1}] .\end{align*} \(L\) is nilpotent if \(L^n=0\) for some \(n\).
Check that \(L^i {~\trianglelefteq~}L\).
Show that \(L\) nilpotent is equivalent to there existing a finite \(n\) such that for any set of elements \(\left\{{x_i}\right\}_{i=1}^n\), \begin{align*} ( { \operatorname{ad}}_{x_1} \circ { \operatorname{ad}}_{x_2} \circ \cdots \circ { \operatorname{ad}}_{x_n})(y) = 0 \qquad \forall y\in L .\end{align*}
Recall \(L\) is nilpotent if \(L^{n} = 0\) for some \(n\geq 0\) (the descending central series) where \(L^{i+1} = [LL^{i}]\). Equivalently, \(\prod_{i\leq n} { \operatorname{ad}}_{x_i} =0\) for any \(\left\{{x_i}\right\}_{i\leq n} \subseteq L\). Note that \(L^{i} \supseteq L^{(i)}\) by induction on \(i\) – these coincide for \(i=0,1\), and one can check \begin{align*} L^{(i+1)} = [L^{(i)} L^{(i)}] \subseteq [L L^{i}] = L^{i+1} .\end{align*}
\({\mathfrak{b}}_n\) is solvable but not nilpotent, since \({\mathfrak{b}}_n^1 = {\mathfrak{n}}_n\) and then \({\mathfrak{b}}_n^2 = [{\mathfrak{b}}_n {\mathfrak{n}}_n] = {\mathfrak{n}}_n\), so the lower central series stabilizes at \({\mathfrak{n}}_n\) and never reaches zero.
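Concretely in \({\mathfrak{b}}_2\): for \(h = \operatorname{diag}(h_1, h_2)\),
\begin{align*}
[h, e_{12}] = (h_1 - h_2)e_{12}
,\end{align*}
so \([{\mathfrak{b}}_2 {\mathfrak{n}}_2] = {\mathfrak{n}}_2\).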
\({\mathfrak{n}}_n\) is nilpotent, since the number of diagonals with zeros adds when taking brackets \([LL^{i}]\).
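For instance, in \({\mathfrak{n}}_3 = \left\langle{e_{12}, e_{13}, e_{23}}\right\rangle\): \([e_{12}, e_{23}] = e_{13}\) and all other brackets of basis elements vanish, so
\begin{align*}
{\mathfrak{n}}_3^1 = \left\langle{e_{13}}\right\rangle, \qquad {\mathfrak{n}}_3^2 = [{\mathfrak{n}}_3, \left\langle{e_{13}}\right\rangle] = 0
.\end{align*}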
\({\mathfrak{h}}\) is also nilpotent, since any abelian algebra is nilpotent.
Let \(L\in\mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\), then
Show that if \(L/I\) and \(I{~\trianglelefteq~}L\) are nilpotent, then \(L\) need not be nilpotent.
Distinguish \({ \operatorname{End} }(L)\) whose algebra structure is given by associative multiplication and \({\mathfrak{gl}}(L)\) with the bracket multiplication.
An element \(x\in L\) is ad-nilpotent if \({ \operatorname{ad}}_x \in { \operatorname{End} }(L)\) is a nilpotent endomorphism.
If \(L\) is nilpotent then \(x\in L\) is ad-nilpotent by taking \(x_i = x\) for all \(i\). It turns out that the converse is true:
If all \(x\in L\) are ad-nilpotent, then \(L\) is nilpotent.
To be covered in an upcoming section.
Let \(x\in {\mathfrak{gl}}(V)\) be a nilpotent linear transformation for \(V\) finite-dimensional. Then \begin{align*} { \operatorname{ad}}_x: {\mathfrak{gl}}(V)&\to {\mathfrak{gl}}(V) \\ y &\mapsto x\circ y - y\circ x \end{align*} is a nilpotent operator.
Let \(\lambda_x, \rho_x \in { \operatorname{End} }({\mathfrak{gl}}(V))\) be left and right multiplication by \(x\), which are commuting nilpotent operators. The binomial theorem shows that if \(D_1, D_2\) are any two commuting nilpotent endomorphisms of a vector space, then \(D_1\pm D_2\) is again nilpotent. But then one can write \({ \operatorname{ad}}_x = \lambda_x - \rho_x\).
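In detail: if \(D_1^m = D_2^m = 0\) and \(D_1 D_2 = D_2 D_1\), then
\begin{align*}
(D_1 \pm D_2)^{2m-1} = \sum_{i=0}^{2m-1} {2m-1\choose i} (\pm 1)^i D_1^{2m-1-i} D_2^{i} = 0
,\end{align*}
since in each term either \(2m-1-i\geq m\) or \(i \geq m\).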
If \(x\in {\mathfrak{gl}}(V)\) is nilpotent then so is \({ \operatorname{ad}}_x\). Conversely, if all \({ \operatorname{ad}}_x\) for \(x\in L \leq {\mathfrak{gl}}(V)\) are nilpotent operators then \(L\) is nilpotent by Engel’s theorem.
The converse of the above lemma is not necessarily true: \(x\) being ad-nilpotent does not imply that \(x\) is nilpotent. As a counterexample, take \(x=I_n\in {\mathfrak{gl}}_n({\mathbf{C}})\), then \({ \operatorname{ad}}_x = 0\) but \(x^k=x\) for any \(k\geq 1\).
The following is related to the classical linear algebra theorem that commuting operators admit a simultaneous eigenvector:
Let \(L\) be a Lie subalgebra of \({\mathfrak{gl}}(V)\) for \(V\) finite-dimensional. If \(L\) consists of nilpotent endomorphisms, then there exists a nonzero \(v\in V\) such that \(Lv=0\).
Proceed by induction on \(n = \dim L\) (assuming the statement for all vector spaces \(V\)), where the \(n=1\) case is clear – the characteristic polynomial of a nilpotent operator is \(f(t) = t^{\dim V}\), whose only root is \(t=0\), and every field contains zero. Once one has an eigenvalue, there is at least one eigenvector.
For \(n > 1\), suppose \(K \leq L\) is a proper Lie subalgebra. By hypothesis, \(K\) consists of nilpotent elements in \({ \operatorname{End} }(V)\), so apply the previous lemma to see that \({ \operatorname{ad}}(K) \subseteq { \operatorname{End} }(L)\) acts by nilpotent endomorphisms of \(L\) since they are restrictions to \(L\) of nilpotent endomorphisms of \({\mathfrak{gl}}(V)\). Since \([KK] \subseteq K\), we can view \({ \operatorname{ad}}(K) \subseteq { \operatorname{End} }(L/K)\) where \(L/K\) is a vector space.
By the IH applied to \({ \operatorname{ad}}(K) \leq {\mathfrak{gl}}(L/K)\), which has dimension at most \(\dim K < \dim L\), one can find a nonzero \(x+K \in L/K\) such that \({ \operatorname{ad}}(K)(x+K)=0\). Hence one can find an \(x\in L\setminus K\) such that for all \(y\in K\) one has \([yx] \in K\), so \(x\in N_L(K)\setminus K\). Thus \(K \subsetneq N_L(K)\) is a proper containment.
To be continued.
Recall: we were proving that if \(L \leq {\mathfrak{gl}}(V)\) with \(V\) finite dimensional and \(L\) consists of nilpotent endomorphisms, then there exists a common eigenvector \(v\), so \(Lv = 0\).
We’re inducting on \(\dim L\) (over all \(V\)). Assuming \(\dim L > 1\), we showed that proper subalgebras are strictly contained in their normalizers: \begin{align*} K \lneq L \implies K \subsetneq N_L(K) .\end{align*} Let \(K\) be a maximal proper subalgebra of \(L\), then \(N_L(K) = L\) by maximality and thus \(K\) is a proper ideal of \(L\). Then \(L/K\) is a Lie algebra of some dimension, which must be 1 – otherwise the preimage in \(L\) under \(L\twoheadrightarrow L/K\) would be a subalgebra of \(L\) properly between \(K\) and \(L\). Thus \(K\) is a codimension 1 ideal in \(L\). Choosing any \(z\in L\setminus K\) yields a decomposition \(L = K \bigoplus { \mathbf{F} }z\) as vector spaces. Let \(W \mathrel{\vcenter{:}}=\left\{{v\in V{~\mathrel{\Big\vert}~}Kv=0}\right\}\), then \(W\neq 0\) by the IH.
\(W\) is an \(L{\hbox{-}}\)stable subspace.
To see this, let \(x\in L, y\in K, w\in W\). A useful trick: \begin{align*} y.(x.w) = x.(y.w) - [xy].w = 0 ,\end{align*} since \(y.w = 0\) makes the first term zero, and \([xy]\in K {~\trianglelefteq~}L\) so \([xy].w = 0\) as well.
Since \(z\curvearrowright W\) nilpotently, choose an eigenvector \(v\) for \(z\) in \(W\) for the eigenvalue zero. Then \(z.v=0\), so \(Lv=0\).
If all elements of a Lie algebra \(L\) are ad-nilpotent, then \(L\) is nilpotent as an algebra.
Induct on \(\dim L\). Note that \({ \operatorname{ad}}(L) \leq {\mathfrak{gl}}(L)\) consists of nilpotent endomorphisms. Use the theorem to pick \(x\in L\) such that \({ \operatorname{ad}}(L).x = 0\), i.e. \([L, x] = 0\), i.e. \(x\in Z(L)\), and thus \(Z(L)\) is nonzero. Now \(\dim L/Z(L) < \dim L\), and a fortiori its elements are still ad-nilpotent, so \(L/Z(L)\) is nilpotent. By proposition 3.2b, \(L\) is nilpotent.7
Let \(0\neq L \leq {\mathfrak{gl}}(V)\) with \(\dim V < \infty\) be a Lie algebra of nilpotent endomorphisms (as in the theorem).8 Then \(V\) has a basis in which the matrices of \(L\) are all strictly upper triangular.
Induct on \(\dim V\). Use the theorem to pick a nonzero \(v_1\) with \(Lv_1=0\). Consider \(W\mathrel{\vcenter{:}}= V/{ \mathbf{F} }v_1\), and view \(L \subseteq { \operatorname{End} }(V)\) as a subspace of \({ \operatorname{End} }(W)\) – these are still nilpotent endomorphisms. By the IH, \(W\) has a basis \(\left\{{\overline{v}_i}\right\}_{2\leq i \leq n}\) with respect to which the matrices of \(L\) (viewed as a subspace of \({ \operatorname{End} }(W)\)) are strictly upper triangular. Let \(\left\{{v_i}\right\}_{2\leq i \leq n} \subseteq V\) be preimages of the \(\overline{v}_i\); then \(\left\{{v_1, v_2, \cdots, v_n}\right\}\) is a basis of \(V\) with the desired properties. This results in a matrix of the following form:
From now on, assume \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\) is algebraically closed and \(\operatorname{ch}{ \mathbf{F} }= 0\).
Let \(L\neq 0\) be a solvable Lie subalgebra of \({\mathfrak{gl}}(V)\) with \(\dim V < \infty\). Then \(V\) contains a common eigenvector for all of \(L\).
Induct on \(\dim L\). If \(\dim L = 1\), then \(L\) is spanned by a single linear operator \(x\), and over an algebraically closed field \(x\) has at least one eigenvector. For \(\dim L > 1\), take the following strategy:
Off we go!
Step 1: Since \(L\) is solvable, we have \([LL]\) properly contained in \(L\). In \(L/[LL]\) choose any codimension 1 subspace – it is an ideal, which lifts to a codimension 1 ideal \(K \subset L\).
Step 2: Since subalgebras of solvable algebras are again solvable, \(K\) is solvable. By the IH, pick a common nonzero eigenvector \(v\) for \(K\). There exists a linear map \(\lambda: K\to { \mathbf{F} }\) such that \(x.v = \lambda(x) v\) for all \(x\in K\). Let \(W \mathrel{\vcenter{:}}=\left\{{v\in V {~\mathrel{\Big\vert}~}y.v = \lambda(y) v\,\,\forall y\in K}\right\}\), which is nonzero.
Step 3: Note \(L.W \subseteq W\). Let \(w\in W, x\in L, y\in K\); we WTS \(y.(x.w) = \lambda(y)x.w\). Write \begin{align*} y.(x.w) &= x.(y.w) - [xy].w \\ &= \lambda(y)(x.w) - \lambda([xy])w ,\end{align*} where the second line follows since \([xy]\in K\). We then need \(\lambda([xy]) = 0\) for all \(x\in L\) and \(y\in K\). Since \(\dim V < \infty\), choose \(n\) minimal such that \(\left\{{w, x.w, x^2.w,\cdots, x^n.w}\right\}\) is linearly dependent. Set \(W_i \mathrel{\vcenter{:}}=\mathop{\mathrm{span}}_{ \mathbf{F} }\left\{{w, x.w, \cdots, x^{i-1}.w}\right\}\), so \(W_0 = 0, W_1 = \mathop{\mathrm{span}}_{ \mathbf{F} }\left\{{w}\right\}\), and so on, noting that
For all \(y\in K\),
\begin{align*} y.x^i.w = \lambda(y) x^i.w \operatorname{mod}W_i .\end{align*}
To be continued!
Recall \(\dim L, \dim V < \infty\), \({ \mathbf{F} }\) is algebraically closed, and \(\operatorname{ch}{ \mathbf{F} }= 0\). For \(L \leq {\mathfrak{gl}}(V)\) solvable, we want a common eigenvector \(v\in V\) for \(L\). Steps for the proof:
Step 3: Fix \(x\in L, w\in W\) and \(n\) minimal such that \(\left\{{x^i w}\right\}_{i\leq n}\) is linearly dependent. For \(i\geq 0\) set \(W_i = { \mathbf{F} }\left\langle{w, xw, \cdots, x^{i-1}w}\right\rangle\). Then \(\dim W_n = n, W_n = W_{n+i}\) for \(i\geq 0\), and \(xW_n \subseteq W_n\).
For all \(y\in K\), \begin{align*} yx^i .w = \lambda(y) x^i w \operatorname{mod}W_i .\end{align*}
This is proved by induction on \(i\), where \(i=0\) follows from how \(W\) is defined. For \(i\geq 1\), use the commuting trick: \begin{align*} yx^i . w &= yxx^{i-1}w \\ &= (xy - [xy]) x^{i-1} w \\ &= x(y x^{i-1} w) - [xy]x^{i-1}w \\ &\equiv \lambda(y) x^i w - \lambda([xy])x^{i-1} w \operatorname{mod}W_{i-1} \\ &\equiv \lambda(y) x^i w - \lambda([xy])x^{i-1} w \operatorname{mod}W_{i} \qquad \text{since } W_{i-1} \leq W_i \\ &\equiv \lambda(y) x^i w \operatorname{mod}W_i .\end{align*}
Given this claim, for \(i=n\) this says that the matrix of any \(y\in K\) with respect to the basis \(\left\{{x^iw}\right\}_{0\leq i \leq n-1}\) of \(W_n\) is upper triangular with diagonal entries all equal to \(\lambda(y)\). Thus \({ \left.{{\operatorname{tr}(y)}} \right|_{{W_n}} } = n \lambda(y)\), and so \([xy]\curvearrowright W_n\) with trace \(n \lambda([xy])\). On the other hand, \(x,y\) both act on \(W_n\) (e.g. by the formula in the claim for \(yx^i.w\)) and so \begin{align*} { \left.{{[xy]}} \right|_{{W_n}} } = { \left.{{xy}} \right|_{{W_n}} } - { \left.{{yx}} \right|_{{W_n}} } ,\end{align*} thus \({ \left.{{ \operatorname{tr}([xy])}} \right|_{{W_n}} } = 0\). Since \({ \mathbf{F} }\) has characteristic zero, \(n \lambda([xy]) = 0 \implies \lambda([xy]) = 0\).
Step 4: By step 1, \(L = K \oplus { \mathbf{F} }z\) for some \(z\in L\setminus K\). Viewing \(z: W\to W\) and using \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\), \(z\) has an eigenvector \(v\in W\). Since \(v\in W\), it is also a common eigenvector for \(K\) and thus an eigenvector for \(L\) by additivity.
Let \(L\leq {\mathfrak{gl}}(V)\) be a solvable subalgebra, then \(L\) stabilizes some flag in \(V\). In particular, there exists a basis for \(V\) with respect to which the matrices in \(L\) are all upper triangular.
Recall that for \(V \in{ \mathsf{Vect}}_{/ {{ \mathbf{F} }}}\), a complete flag is an element of \begin{align*} \operatorname{Fl}(V) \mathrel{\vcenter{:}}=\left\{{ 0 = V^0 \subsetneq V^1 \subsetneq \cdots \subsetneq V^n = V {~\mathrel{\Big\vert}~}\dim V^i = i}\right\} .\end{align*} A subalgebra \(L\) stabilizes a flag if \(LV^i \subseteq V^i\) for all \(i\), which implies there is a compatible basis (got by extending one vector at a time from a basis for \(V^1\)) for which \(L\) acts by upper triangular matrices.
Use the theorem and induct on \(n=\dim V\) as in Engel’s theorem – take \(V^1\) spanned by a common eigenvector for \(L\); since \(L\) stabilizes \(V^1\), one gets an action \(L\curvearrowright V/V^{1}\) on a space of smaller dimension. Then lift the resulting flag through the quotient.
Let \(L\) be a solvable Lie algebra, then there exists a chain of ideals \begin{align*} 0 = L_0 \subsetneq L_1 \subsetneq \cdots \subsetneq L_n = L \end{align*} such that \(\dim L_i = i\).
Consider \({ \operatorname{ad}}L \leq {\mathfrak{gl}}(L)\). Apply Lie’s theorem: \(( { \operatorname{ad}}L)L_i \subseteq L_i \iff [LL_i] \subseteq L_i\), making \(L_i{~\trianglelefteq~}L\) an ideal.
Let \(L\) be solvable, then \(x\in [LL]\implies { \operatorname{ad}}_L x\) is nilpotent. Hence \([LL]\) is nilpotent by Lie’s theorem.
Find a flag of ideals using the corollary above and let \(\left\{{x_1,\cdots, x_n}\right\}\) be a compatible basis. Then the matrices \(\left\{{ { \operatorname{ad}}_x{~\mathrel{\Big\vert}~}x\in L}\right\}\) are all upper triangular. If \(x\in [LL]\), without loss of generality \(x = [yz]\) for some \(y,z\in L\). Then \begin{align*} { \operatorname{ad}}_x = [ { \operatorname{ad}}_y { \operatorname{ad}}_z] = { \operatorname{ad}}_y { \operatorname{ad}}_z - { \operatorname{ad}}_z { \operatorname{ad}}_y \end{align*} will be strictly upper triangular (since these are upper triangular and the commutator cancels diagonals) and hence nilpotent.
We’ll come back to 4.2 next time. For this section, assume \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\) and \(\operatorname{ch}{ \mathbf{F} }= 0\). Cartan’s criterion for a semisimple \(L\) (i.e. \(\mathop{\mathrm{Rad}}(L) = 0\)) involves the Killing form, a certain nondegenerate bilinear form on \(L\). Recall that if \(L\) is solvable then \([LL]\) is nilpotent, or equivalently every \(x\in [LL]\) is ad-nilpotent.
Let \(A \subseteq B\) be subspaces of \({\mathfrak{gl}}(V)\) (really \({ \operatorname{End} }(V)\) as a vector space) with \(V\) finite-dimensional. Let \begin{align*} M\mathrel{\vcenter{:}}=\left\{{w\in {\mathfrak{gl}}(V) {~\mathrel{\Big\vert}~}[wB] \subseteq A}\right\} \end{align*} and suppose some \(w\in M\) satisfies \(\operatorname{tr}(wz) = 0\) for all \(z\in M\). Then \(w\) is nilpotent.
Later!
A bilinear form is a map \begin{align*} \beta({-}, {-}): L\times L\to { \mathbf{F} } ,\end{align*} which is symmetric if \(\beta(x,y) = \beta(y,x)\) and associative if \(\beta([xy], z) = \beta(x, [yz])\) for all \(x,y,z\in L\). The radical of \(\beta\) is \begin{align*} \mathop{\mathrm{Rad}}(\beta) \mathrel{\vcenter{:}}=\left\{{w\in V{~\mathrel{\Big\vert}~}\beta(w, V) = 0}\right\} ,\end{align*} and \(\beta\) is nondegenerate if \(\mathop{\mathrm{Rad}}(\beta) = 0\).
For \(L = {\mathfrak{gl}}(V)\), take \(\beta(x,y)\mathrel{\vcenter{:}}=\operatorname{tr}(xy)\). One can check this is symmetric, bilinear, and associative – associativity follows from the following: \begin{align*} [xy]z &= xyz-yxz\\ x[yz] &= xyz - xzy .\end{align*} Then note that \(y(xz)\) and \((xz)y\) have the same trace, since \(\operatorname{tr}(AB) = \operatorname{tr}(BA)\).
If \(\beta\) is associative, then \(\mathop{\mathrm{Rad}}(\beta) {~\trianglelefteq~}L\).
Let \(z\in \mathop{\mathrm{Rad}}(\beta)\) and \(x,y\in L\). To see if \([zx]\in \mathop{\mathrm{Rad}}(\beta)\), check \begin{align*} \beta([zx], y) = \beta(z, [xy]) = 0 \end{align*} since \(z\in \mathop{\mathrm{Rad}}(\beta)\). Thus \([zx] \in \mathop{\mathrm{Rad}}(\beta)\).
Let \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\) of arbitrary characteristic and \(V\in{ \mathsf{Vect}}_{/ {{ \mathbf{F} }}}^{\mathrm{fd}}\) with \(x\in { \operatorname{End} }_{ \mathbf{F} }(V)\). The JCF of \(x\) is of the form \(D+N\) where \(D\) is diagonal, \(N\) is nilpotent, and \(D, N\) commute. Recall \(x\) is semisimple (diagonalizable) iff the minimal polynomial of \(x\) has distinct roots.
If \(x\in { \operatorname{End} }(V)\),
There is a decomposition \(x = x_s + x_n\) where \(x_s\) is semisimple and \(x_n\) is nilpotent. This is unique subject to the condition that \(x_s, x_n\) commute.
There are polynomials \(p(T), q(T)\) without constant terms with \(x_s = p(x), x_n = q(x)\). In particular, \(x_s, x_n\) commute with any endomorphism which commutes with \(x\).
Let \(x\in {\mathfrak{gl}}(V)\) with Jordan decomposition \(x = x_s + x_n\). Then \({ \operatorname{ad}}_x = { \operatorname{ad}}_{x_s} + { \operatorname{ad}}_{x_n}\) is the Jordan decomposition of \({ \operatorname{ad}}_x\) in \({ \operatorname{End} }({ \operatorname{End} }(V))\).
If \(x\in {\mathfrak{gl}}(V)\) is semisimple then so is \({ \operatorname{ad}}_x\), since the eigenvalues for \({ \operatorname{ad}}_x\) are differences of eigenvalues of \(x\). I.e. if \(\left\{{v_1,\cdots, v_n}\right\}\) is an eigenbasis for \(V\) with \(x.v_i = a_i v_i\), then \([x, e_{ij}] = (a_i - a_j) e_{ij}\), so \(\left\{{e_{ij}}\right\}\) is an eigenbasis for \({ \operatorname{ad}}_x\). If \(x\) is nilpotent then \({ \operatorname{ad}}_x\) is nilpotent, since \({ \operatorname{ad}}_x(y) = \lambda_x(y) - \rho_x(y)\) where \(\lambda, \rho\) are left/right multiplication, and sums of commuting nilpotents are nilpotent. One can check \([ { \operatorname{ad}}_{x_s} { \operatorname{ad}}_{x_n}] = { \operatorname{ad}}_{[x_s x_n]} = 0\) since \(x_s, x_n\) commute.
One can show that if \(L\) is semisimple then \({ \operatorname{ad}}(L) = \mathop{\mathrm{Der}}(L)\), which is used to show that if \(L\) is an arbitrary Lie algebra then one has
This gives a notion of semisimple and nilpotent parts for elements of Lie algebras not presented as subalgebras of some \({\mathfrak{gl}}(V)\).
Let \(U\in \mathsf{Alg}_{/ {{ \mathbf{F} }}}^{\mathrm{fd}}\), then \(\mathop{\mathrm{Der}}(U)\) is closed under taking semisimple and nilpotent parts.
Let \(\delta\in \mathop{\mathrm{Der}}(U)\) and write \(\delta = \sigma + v\) for the Jordan decomposition of \(\delta\) in \({ \operatorname{End} }(U)\). It STS \(\sigma\) is a derivation, so for \(a\in { \mathbf{F} }\) define \begin{align*} U_a \mathrel{\vcenter{:}}=\left\{{x\in U {~\mathrel{\Big\vert}~}(\delta - a)^k x = 0 \,\,\text{for some } k}\right\} .\end{align*} Note \(U = \bigoplus _{a\in \Lambda} U_a\) where \(\Lambda\) is the set of eigenvalues of \(\delta\), which are also the eigenvalues of \(\sigma\) – this is because \(\sigma, v\) are commuting operators with \(v\) nilpotent, so the eigenvalues of \(\delta\) are sums of eigenvalues of \(\sigma\) and \(v\), and the latter are all zero.
For any \(a,b\in { \mathbf{F} }\), \(U_a U_b \subseteq U_{a+b}\).
Assuming this, it STS \(\sigma(xy) = \sigma(x)y + x \sigma(y)\) when \(x\in U_a, y\in U_b\) where \(a,b\) are eigenvalues. Using that eigenvalues of \(\delta\) are also eigenvalues of \(\sigma\), since \(xy\in U_{a+b}\) by the claim, \(\sigma(xy) = (a+b)xy\) and thus \begin{align*} \sigma(x)y + x \sigma(y) = axy + xby = (a+b)xy .\end{align*} So \(\sigma\in \mathop{\mathrm{Der}}(U)\).
A sub-claim: \begin{align*} (\delta - (a+b) \cdot 1)^n (xy) = \sum_{0\leq i\leq n} {n\choose i} \left( (\delta - a\cdot 1)^{n-i}x \right) \left( (\delta- b \cdot 1)^i y \right) .\end{align*} Given this, take \(x\in U_a, y\in U_b\) and \(n\) large enough that in each term either \((\delta - a\cdot 1)^{n-i}x = 0\) or \((\delta - b\cdot 1)^i y = 0\); then \(xy\in U_{a+b}\), proving the claim.
For the rest of the course, \(V\) is a vector space of finite dimension. Goal: get a criterion for semisimplicity.
Let \(L\leq {\mathfrak{gl}}(V)\) be a linear Lie algebra and suppose \(\operatorname{tr}(xz)=0\) for all \(x\in [LL]\) and \(z\in L\). Then \(L\) is solvable.
Let \(A \subseteq B\) be subspaces of \({ \operatorname{End} }(V) = {\mathfrak{gl}}(V)\) and define \begin{align*} M = \left\{{w\in {\mathfrak{gl}}(V) {~\mathrel{\Big\vert}~}[w, B] \subseteq A}\right\} .\end{align*} Suppose that \(w\in M\) satisfies \(\operatorname{tr}(wz) = 0\) for all \(z\in M\). Then \(w\) is nilpotent.
To show \(L\) is solvable, it STS that \([LL]\) is nilpotent, since nilpotency of \([LL]\) implies solvability of \([LL]\) and hence of \(L\). By Engel’s theorem, it STS to show each \(w\in [LL]\) is ad-nilpotent. Since \(L \leq {\mathfrak{gl}}(V)\), it STS to show each \(w\in [LL]\) is a nilpotent endomorphism. As in the setup of the lemma, set \(B = L, A = [LL]\), then \begin{align*} M \mathrel{\vcenter{:}}=\left\{{z\in {\mathfrak{gl}}(V) {~\mathrel{\Big\vert}~}[zL] \subseteq [LL] }\right\} \supseteq L \supseteq[LL] .\end{align*} Let \(w\in [LL] \subseteq M\); we know \(\operatorname{tr}(wz) = 0\) for all \(z\in L\), but we need this for all \(z\in M\). So let \(z\in M\) be arbitrary; by linearity of the trace it STS to check \(\operatorname{tr}(wz) = 0\) for generators \(w = [xy]\) of \([LL]\), where \(x,y\in L\). We thus WTS \(\operatorname{tr}([xy]z) = 0\): \begin{align*} \operatorname{tr}([xy]z) &= \operatorname{tr}(x [yz] ) \\ &=\operatorname{tr}([yz] x) = 0 ,\end{align*} since \([yz] = -[zy] \in [ML] \subseteq [LL]\) by the definition of \(M\), and \(\operatorname{tr}([LL]\cdot L) = 0\) by assumption. By the lemma, \(w\) is nilpotent.
Let \(L\in \mathsf{Lie} \mathsf{Alg}\) with \(\operatorname{tr}( { \operatorname{ad}}_x { \operatorname{ad}}_y) = 0\) for all \(x \in [LL]\) and \(y\in L\). Then \(L\) is solvable.
Use \({ \operatorname{ad}}: L\to {\mathfrak{gl}}(L)\), a morphism of Lie algebras. Its image is solvable by Cartan’s criterion above, and \(\ker { \operatorname{ad}}= Z(L)\) is abelian and hence a solvable ideal.9 Therefore \(L\) is solvable.
Let \(w = s + n\) be the Jordan-Chevalley decomposition of \(w\). Choose a basis for \(V\) such that this is the JCF of \(w\), i.e. \(s = \operatorname{diag}(a_1,\cdots, a_n)\) and \(n\) is strictly upper triangular. Idea: show \(s=0\) by showing \(A\mathrel{\vcenter{:}}={\mathbf{Q}}\left\langle{a_1,\cdots, a_n}\right\rangle = 0\), by showing \(A {}^{ \vee }= 0\), i.e. any \({\mathbf{Q}}{\hbox{-}}\)linear functional \(f: A\to {\mathbf{Q}}\) is zero. Note that if \(\sum a_i f(a_i) = 0\) then \begin{align*} 0 = f\left(\sum a_i f(a_i)\right) = \sum f(a_i)^2 \implies f(a_i) = 0 \,\,\forall i ,\end{align*} so it STS to show \(\sum a_i f(a_i) = 0\). Let \(y = \operatorname{diag}( f(a_1), \cdots, f(a_n) )\); then \({ \operatorname{ad}}_y\) is a polynomial (explicitly constructed using Lagrange interpolation) in \({ \operatorname{ad}}_s\) without a constant term (see the exercise below). Since \({ \operatorname{ad}}_s\) is in turn a polynomial in \({ \operatorname{ad}}_w\) with zero constant term, and since \({ \operatorname{ad}}_w: B\to A\), we have \({ \operatorname{ad}}_s(B) \subseteq A\) and the same is thus true for \({ \operatorname{ad}}_y\). So \(y\in M\), and applying the trace condition in the lemma with \(z\mathrel{\vcenter{:}}= y\) we get \begin{align*} 0 = \operatorname{tr}(wy) = \sum a_i f(a_i) ,\end{align*} noting that \(w - s = n\) is strictly upper triangular and \(y\) is diagonal, so \(\operatorname{tr}(wy) = \operatorname{tr}(sy)\). So \(s=0\) and \(w=n\) is nilpotent.
Show \({ \operatorname{ad}}_y\) is a polynomial in \({ \operatorname{ad}}_s\).
Recall that \(\mathop{\mathrm{Rad}}L\) is the unique maximal (not necessarily proper) solvable ideal of \(L\). This exists, e.g. because sums of solvable ideals are solvable. Note that \(L\) is semisimple iff \(\mathop{\mathrm{Rad}}L = 0\).
Let \(L\in \mathsf{Lie} \mathsf{Alg}^{\mathrm{fd}}\) and define the Killing form \begin{align*} \kappa: L\times L &\to { \mathbf{F} }\\ \kappa(x, y) &= \operatorname{tr}( { \operatorname{ad}}_x \circ { \operatorname{ad}}_y) .\end{align*} This is an associative10 bilinear form on \(L\).
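Associativity follows from \({ \operatorname{ad}}_{[xy]} = [ { \operatorname{ad}}_x { \operatorname{ad}}_y]\) together with the associativity of the trace form on \({\mathfrak{gl}}(L)\) checked in the example above:
\begin{align*}
\kappa([xy], z) = \operatorname{tr}( [ { \operatorname{ad}}_x { \operatorname{ad}}_y] { \operatorname{ad}}_z ) = \operatorname{tr}( { \operatorname{ad}}_x [ { \operatorname{ad}}_y { \operatorname{ad}}_z] ) = \kappa(x, [yz])
.\end{align*}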
Let \(L = {\mathbf{C}}\left\langle{x, y}\right\rangle\) with \([xy] = x\). In this ordered basis, \begin{align*} { \operatorname{ad}}_x = { \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} } \qquad { \operatorname{ad}}_y = { \begin{bmatrix} {-1} & {0} \\ {0} & {0} \end{bmatrix} } ,\end{align*} and one can check \(\kappa(x,x) = \kappa(x, y) = \kappa(y, x) = 0\) and \(\kappa(y,y) = 1\). Moreover \(\mathop{\mathrm{Rad}}\kappa = {\mathbf{C}}\left\langle{x}\right\rangle\).
See the text for \(\kappa\) defined on \({\mathfrak{sl}}_2\).
Let \(I {~\trianglelefteq~}L\). If \(\kappa\) is the Killing form of \(L\) and \(\kappa_I\) that of \(I\), then \begin{align*} \kappa_I = { \left.{{\kappa}} \right|_{{I\times I}} } .\end{align*}
Let \(x\in I\), then \({ \operatorname{ad}}_x(L) \subseteq I\) since \(I\) is an ideal. Choosing a basis for \(I\) and extending it to a basis for \(L\), the matrix of \({ \operatorname{ad}}_x\) has block form \({ \begin{bmatrix} {A} & {B} \\ {0} & {0} \end{bmatrix} }\), where \(A\) is the matrix of \({ \operatorname{ad}}_{I, x} \mathrel{\vcenter{:}}={ \left.{{ { \operatorname{ad}}_x}} \right|_{{I}} }\).
So if \(x,y\in I\), we have \begin{align*} \kappa(x,y) &= \operatorname{tr}( { \operatorname{ad}}_x \circ { \operatorname{ad}}_y) \\ &= \operatorname{tr}( { \operatorname{ad}}_{I, x} \circ { \operatorname{ad}}_{I, y}) \\ &= \kappa_I(x, y) .\end{align*}
For the rest of the course: \(k = { \overline{k} }\) and \(\operatorname{ch}k = 0\). Theorem from last time: \(L\) is semisimple iff its Killing form \(\kappa(x, y) \mathrel{\vcenter{:}}=\operatorname{tr}( { \operatorname{ad}}_x { \operatorname{ad}}_y)\) is nondegenerate.
Let \(S = \mathop{\mathrm{Rad}}(\kappa) {~\trianglelefteq~}L\), which is easy to check using “invariance” (associativity) of the form. Given \(s,s'\in S\), the restricted form satisfies \(\kappa_S(s, s') = \operatorname{tr}( { \operatorname{ad}}_{S, s} { \operatorname{ad}}_{S, s'}) = \operatorname{tr}( { \operatorname{ad}}_{L, s} { \operatorname{ad}}_{L, s'})\) by the previous lemma, and this equals \(\kappa(s, s') = 0\) since \(s\in \mathop{\mathrm{Rad}}(\kappa)\). In particular, we can take \(s\in [SS]\), so by (the corollary of) Cartan’s criterion for solvable Lie algebras, \(S\) is solvable as a Lie algebra and thus solvable as an ideal in \(L\).
\(\implies\): Since \(\mathop{\mathrm{Rad}}(L)\) is the sum of all solvable ideals, we have \(S \subseteq \mathop{\mathrm{Rad}}(L)\), but since \(L\) is semisimple \(\mathop{\mathrm{Rad}}(L) = 0\) and thus \(S=0\).
\(\impliedby\): Assume \(S=0\). If \(I {~\trianglelefteq~}L\) is a nonzero solvable ideal, then \(I^{(n)} = 0\) for some minimal \(n\geq 1\), and \(I^{(n-1)} \neq 0\) is a nonzero abelian ideal – since we want to show \(\mathop{\mathrm{Rad}}(L) = 0\), we don’t want this to happen! Thus it STS to show every abelian ideal is contained in \(S\).
So let \(I {~\trianglelefteq~}L\) be an abelian ideal, \(x\in I\), \(y\in L\). Set \(A_{xy} \mathrel{\vcenter{:}}= { \operatorname{ad}}_x { \operatorname{ad}}_y\) and consider \begin{align*} A_{xy}^2 = ( { \operatorname{ad}}_x { \operatorname{ad}}_y)^2: L \xrightarrow{ { \operatorname{ad}}_y} L \xrightarrow{ { \operatorname{ad}}_x} I \xrightarrow{ { \operatorname{ad}}_y} I \xrightarrow{ { \operatorname{ad}}_x} 0 ,\end{align*} which is zero since \([II] =0\). Thus \(A_{xy}\) is a nilpotent endomorphism, and nilpotent endomorphisms are always traceless, so \(0 = \operatorname{tr}( { \operatorname{ad}}_x { \operatorname{ad}}_y) = \kappa(x, y)\) for all \(y\in L\), and so \(x\in S\). Thus \(I \subseteq S\).
\(\mathop{\mathrm{Rad}}(\kappa) \subseteq \mathop{\mathrm{Rad}}(L)\) always, but the reverse containment is not always true – see exercise 5.4.
Let \(L_i\in \mathsf{Lie} \mathsf{Alg}_{/ {k}}\), then their direct sum is the product \(L_1 \times L_2\) with bracket \begin{align*} [x_1 \oplus x_2, y_1 \oplus y_2] \mathrel{\vcenter{:}}=[x_1 y_1] \oplus [x_2 y_2] .\end{align*}
In particular, \([L_1, L_2] = 0\), and thus any ideal \(I_1 {~\trianglelefteq~}L_1\) yields an ideal \(I_1 \oplus 0 {~\trianglelefteq~}L_1 \oplus L_2\). Moreover, if \(L = \bigoplus I_i\) is a vector space direct sum of ideals of \(L\), this is automatically a Lie algebra direct sum since \([I_i, I_j] \subseteq I_i \cap I_j = 0\) for all \(i\neq j\).
This is not true for subalgebras! Also, in this theory, one should be careful about whether direct sums are as vector spaces or (in the stronger sense) as Lie algebras.
Let \(L\) be a finite-dimensional semisimple Lie algebra. Then there exist ideals \(I_1, \cdots, I_n\) of \(L\) such that \(L = \bigoplus_j I_j\) with each \(I_j\) simple as a Lie algebra. Moreover, every simple ideal of \(L\) is one of the \(I_j\).
Let \(I {~\trianglelefteq~}L\) and define \begin{align*} I^\perp \mathrel{\vcenter{:}}=\left\{{x\in L {~\mathrel{\Big\vert}~}\kappa(x, I) = 0 }\right\} ,\end{align*} the orthogonal complement of \(I\) with respect to \(\kappa\). This is an ideal by the associativity of \(\kappa\). Set \(J\mathrel{\vcenter{:}}= I \cap I^\perp {~\trianglelefteq~}L\); then \(\kappa([JJ], J) = 0\), so by Cartan’s criterion \(J\) is a solvable ideal, and since \(L\) is semisimple \(J = 0\).
From the Erdmann-Wildon lemma in the appendix (posted on ELC, lemma 16.11), \(\dim L = \dim I + \dim I^\perp\) and \(L = I \oplus I^\perp\), so now induct on \(\dim L\) to get the decomposition when \(L\) is not simple. These summands are semisimple ideals, since solvable ideals in \(I, I^\perp\) remain solvable in \(L\). Finally let \(I{~\trianglelefteq~}L\) be simple; then \([I, L] \subseteq I\) is an ideal (in both \(L\) and \(I\)), which is nonzero since \(Z(L) = 0\). Since \(I\) is simple, this forces \([I, L] = I\). Writing \(L = \bigoplus I_i\) as a sum of simple ideals, we have \begin{align*} I = [I, L] = [I, \bigoplus I_i] = \bigoplus [I, I_i] ,\end{align*} and by simplicity only one term can be nonzero, so \(I = [I, I_j]\) for some \(j\). Since \(I_j\) is an ideal, \([I, I_j] \subseteq I_j\), and by simplicity of \(I_j\) we have \(I = I_j\).
Let \(L\) be semisimple, then \(L = [LL]\) and all ideals and homomorphic images (but not subalgebras) are again semisimple. Moreover, every ideal of \(L\) is a sum of simple ideals \(I_j\) of \(L\).
Take the canonical decomposition \(L = \bigoplus I_i\) and check \begin{align*} [L, L] = [\bigoplus I_i, \bigoplus I_j] = \bigoplus [I_i, I_i] = \bigoplus I_i ,\end{align*} where in the last step we’ve used that the \(I_i\) are simple, hence non-abelian, so \([I_i, I_i] = I_i\). Let \(J {~\trianglelefteq~}L\) and write \(L = J \bigoplus J^\perp\), both of which are semisimple as Lie algebras. In particular, if \(\phi: L\to L'\), set \(J \mathrel{\vcenter{:}}=\ker \phi {~\trianglelefteq~}L\). Then \(\operatorname{im}\phi = L/J \cong J^\perp\) as Lie algebras, using the orthogonal decomposition, so \(\operatorname{im}\phi\) is semisimple. Finally if \(J {~\trianglelefteq~}L\) then \(L = J \oplus J^\perp\) with \(J\) semisimple, so by the previous theorem \(J\) decomposes as \(J = \oplus K_i\) with \(K_i\) (simple) ideals in \(J\) – but these are (simple) ideals in \(L\) as well since the sum is direct. Thus the \(K_i\) are a subset of the \(I_j\), since these are the only simple ideals of \(L\).
Question from last time: does \(L\) always factor as \(\mathop{\mathrm{Rad}}(L) \oplus L_{{\mathrm{ss}}}\) with \(L_{{\mathrm{ss}}}\) semisimple? Not always, instead there is a semidirect product decomposition \(L = \mathop{\mathrm{Rad}}(L) \rtimes{\mathfrak{s}}\) where \({\mathfrak{s}}\) is the Levi subalgebra. Consider \(L = {\mathfrak{gl}}_n\), then \(\mathop{\mathrm{Rad}}(L) \neq {\mathfrak{h}}\) since \([h, e_{ij}] = (h_i - h_j)e_{ij}\), so in fact this forces \(\mathop{\mathrm{Rad}}(L) = {\mathbf{C}}I_n = Z(L)\) with complementary subalgebra \({\mathfrak{s}}= {\mathfrak{sl}}_n\). Note that \({\mathfrak{gl}}_n = {\mathbf{C}}I_n \oplus {\mathfrak{sl}}_n\) where \({\mathfrak{sl}}_n = [L L]\) is a direct sum, and \({\mathfrak{gl}}_n\) is reductive.
If \(L\) is semisimple then \({ \operatorname{ad}}(L) = \mathop{\mathrm{Der}}L\).
We know \({ \operatorname{ad}}(L) \leq \mathop{\mathrm{Der}}L\) is a subalgebra, and \(L\) semisimple implies \(0 = Z(L) = \ker { \operatorname{ad}}\), so \({ \operatorname{ad}}: L { \, \xrightarrow{\sim}\, } { \operatorname{ad}}(L)\) is an isomorphism and \({ \operatorname{ad}}(L)\) is semisimple. The Killing form of a semisimple Lie algebra is always nondegenerate, so \(\kappa_{ { \operatorname{ad}}(L)}\) is nondegenerate, while \(\kappa_{\mathop{\mathrm{Der}}L}\) may be degenerate. Recall that \({ \operatorname{ad}}(L) {~\trianglelefteq~}\mathop{\mathrm{Der}}L\), so \([\mathop{\mathrm{Der}}L, { \operatorname{ad}}(L)] \subseteq { \operatorname{ad}}(L)\). Define \({ \operatorname{ad}}(L)^\perp {~\trianglelefteq~}\mathop{\mathrm{Der}}L\) to be the orthogonal complement in \(\mathop{\mathrm{Der}}(L)\) with respect to \(\kappa_{\mathop{\mathrm{Der}}L}\), which is an ideal by the associative property.
Claim: \({ \operatorname{ad}}(L) \cap { \operatorname{ad}}(L)^\perp = 0\). This follows readily from the fact that \(\kappa_{ { \operatorname{ad}}(L)}\) is nondegenerate: any element of the intersection pairs to zero with all of \({ \operatorname{ad}}(L)\), hence lies in \(\mathop{\mathrm{Rad}}(\kappa_{ { \operatorname{ad}}(L)}) = 0\).
Note that \({ \operatorname{ad}}(L), { \operatorname{ad}}(L)^\perp\) are both ideals, so \([ { \operatorname{ad}}(L), { \operatorname{ad}}(L)^\perp] \subseteq { \operatorname{ad}}(L) \cap { \operatorname{ad}}(L)^\perp = 0\). Let \(\delta \in { \operatorname{ad}}(L)^\perp\) and \(x\in L\), then \(0 = [\delta, { \operatorname{ad}}_x] = { \operatorname{ad}}_{ \delta(x) }\) where the last equality follows from an earlier exercise. Since \({ \operatorname{ad}}\) is injective, \(\delta(x) = 0\) and so \(\delta = 0\), thus \({ \operatorname{ad}}(L)^\perp = 0\). So we have \(\mathop{\mathrm{Rad}}\kappa_{\mathop{\mathrm{Der}}L} \subseteq { \operatorname{ad}}(L)^\perp = 0\) since any derivation orthogonal to all derivations is in particular orthogonal to inner derivations, and thus \(\kappa_{\mathop{\mathrm{Der}}L}\) is nondegenerate. Finally, we can write \(\mathop{\mathrm{Der}}L = { \operatorname{ad}}(L) \oplus { \operatorname{ad}}(L)^\perp = { \operatorname{ad}}(L) \oplus 0 = { \operatorname{ad}}(L)\).
Earlier: if \(A\in \mathsf{Alg}_{/ {{ \mathbf{F} }}}^{\mathrm{fd}}\), not necessarily associative, \(\mathop{\mathrm{Der}}A\) contains the semisimple and nilpotent parts of all of its elements. Applying this to \(A = L\in \mathsf{Lie} \mathsf{Alg}\) yields \(L \cong { \operatorname{ad}}(L) = \mathop{\mathrm{Der}}L\) and \({ \operatorname{ad}}_x = s + n\) with \(s, n \in \mathop{\mathrm{Der}}L = { \operatorname{ad}}(L)\), so write \(s = { \operatorname{ad}}_{x_s}\) and \(n = { \operatorname{ad}}_{x_n}\); then \({ \operatorname{ad}}_x = { \operatorname{ad}}_{x_s} + { \operatorname{ad}}_{x_n} = { \operatorname{ad}}_{x_s + x_n}\), so \(x = x_s + x_n\) by injectivity of \({ \operatorname{ad}}\), yielding a definition for the semisimple and nilpotent parts of \(x\). If \(L \leq {\mathfrak{gl}}(V)\), it turns out that these coincide with the usual decomposition – this is proved in section 6.4.
Let \(L \in \mathsf{Lie} \mathsf{Alg}_{/ {{\mathbf{C}}}}^{\mathrm{fd}}\), then a representation of \(L\) on \(V\) is a homomorphism of Lie algebras \(\phi: L \to {\mathfrak{gl}}(V)\). A \(V\in{ \mathsf{Vect}}_{/ {{\mathbf{C}}}}\) with an action of \(L\), i.e. an operation \begin{align*} L\times V &\to V \\ (x, v) &\mapsto x.v ,\end{align*} is an \(L{\hbox{-}}\)module iff for all \(a,b\in {\mathbf{C}}, x,y\in L, v,w\in V\),
An \(L{\hbox{-}}\)module \(V\) is equivalent to a representation of \(L\) on \(V\). If \(\phi: L \to {\mathfrak{gl}}(V)\) is a representation, define \(x.v \mathrel{\vcenter{:}}=\phi(x)v \mathrel{\vcenter{:}}=\phi(x)(v)\). Conversely, for \(V\in {}_{L}{\mathsf{Mod}}\) define \(\phi: L\to {\mathfrak{gl}}(V)\) by \(\phi(x)(v) \mathrel{\vcenter{:}}= x.v\).
\(L\in {}_{L}{\mathsf{Mod}}\) using \({ \operatorname{ad}}\), this yields the adjoint representation.
A morphism of \(L{\hbox{-}}\)modules is a linear map \(\psi: V\to W\) such that \(\psi(x.v) = x.\psi(v)\) for all \(x\in L, v\in V\). It is an isomorphism as an \(L{\hbox{-}}\)module iff it is an isomorphism of the underlying vector spaces.11 In this case we say \(V, W\) are equivalent representations.
Let \(L = {\mathbf{C}}x\) for \(x\neq 0\), then
What is a representation of \(L\) on \(V\)? This amounts to picking an element of \({ \operatorname{End} }(V)\).
When are two \(L{\hbox{-}}\)modules on \(V\) equivalent? This happens iff the two linear transformations are conjugate by an element of \(\operatorname{GL}(V)\).
Thus representations of \(L\) on \(V\) are classified by Jordan canonical forms when \(V\) is finite dimensional.
For \(V\in {}_{L}{\mathsf{Mod}}\), a subspace \(W \subseteq V\) is a submodule iff it is an invariant subspace, so \(x.w\in W\) for all \(x\in L, w\in W\). \(V\) is irreducible or simple if \(V\) has exactly two invariant subspaces, \(V\) and \(0\).
Note that this rules out \(0\) as being a simple module.
For \(W\leq V \in {}_{L}{\mathsf{Mod}}\) a submodule, the quotient module \(V/W\) has underlying vector space \(V/W\) with action \(x.(v+W) \mathrel{\vcenter{:}}=(x.v) + W\). This is well-defined precisely when \(W\) is a submodule.
\(I{~\trianglelefteq~}L \iff { \operatorname{ad}}(I) \leq { \operatorname{ad}}(L)\), i.e. ideals correspond to submodules under the adjoint representation. However, irreducible submodules (minimal ideals) need not be simple as Lie algebras, e.g. a 1-dimensional abelian ideal.
Note that all of the algebras \({\mathfrak{g}}\) we’ve considered naturally act on column vectors in some \({ \mathbf{F} }^n\) – this is the natural representation of \({\mathfrak{g}}\).
Letting \({\mathfrak{b}}_n\) be the upper triangular matrices in \({\mathfrak{gl}}_n\), this acts on \({ \mathbf{F} }^n\). Taking a standard basis \({ \mathbf{F} }^n = V \mathrel{\vcenter{:}}=\left\langle{e_1,\cdots, e_n}\right\rangle_{ \mathbf{F} }\), one gets submodules \(V_i = \left\langle{e_1,\cdots, e_i}\right\rangle_{ \mathbf{F} }\) which correspond to upper triangular blocks got by truncating the first \(i\) columns of the matrix. This yields a submodule precisely because the lower-left block is zero.
Let \(\phi: L\to {\mathfrak{gl}}(V)\) be a representation, noting that \({ \operatorname{End} }(V)\) is an associative algebra. We can consider the associative unital algebra \(A\) generated by the image \(\phi(L)\). Note that the structure of \(V\) as an \(L{\hbox{-}}\)module is the same as its \(A{\hbox{-}}\)module structure, so we can apply theorems/results from the representation theory of rings and algebras to study Lie algebra representations, e.g. the Jordan-Holder theorem and Schur’s lemma.
Given \(V, W\in {}_{L}{\mathsf{Mod}}\), their vector space direct sum admits an \(L{\hbox{-}}\)module structure using \(x.(v, w) \mathrel{\vcenter{:}}=(x.v, x.w)\), which we’ll write as \(x.(v+w) \mathrel{\vcenter{:}}= xv + xw\).
\(V\in {}_{L}{\mathsf{Mod}}\) is completely reducible iff \(V\) is a direct sum of irreducible \(L{\hbox{-}}\)modules. Equivalently, for each \(W\leq V\) there is a complementary submodule \(W'\) such that \(V = W \oplus W'\).
“Not irreducible” is strictly weaker than “completely reducible,” since a submodule may not admit an invariant complement – for example, the flag in the first example above.
The natural representation of \({\mathfrak{h}}_n\) is completely reducible, decomposing as \(V_1 \oplus V_2 \oplus \cdots \oplus V_n\) where \(V_i = { \mathbf{F} }e_i\).
A module \(V\) is indecomposable iff \(V\neq W \oplus W'\) for proper submodules \(W, W' \leq V\). This is weaker than irreducibility.
Consider the natural representation \(V\) for \(L \mathrel{\vcenter{:}}={\mathfrak{b}}_n\). Every nonzero submodule of \(V\) must contain \(e_1\), so \(V\) is indecomposable if \(n\geq 1\).
Recall that the socle of \(V\) is the (direct) sum of all of its irreducible submodules. If \(\mathop{\mathrm{Soc}}(V)\) is simple (so one irreducible) then \(V\) is indecomposable, since every summand must contain this simple submodule “at the bottom.” For \(L = {\mathfrak{b}}_n\), note that \(\mathop{\mathrm{Soc}}(V) = { \mathbf{F} }e_1\).
For the remainder of chapters 6 and 7, we assume all modules are finite-dimensional over \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\).
Let \(L\in\mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}^{\mathrm{fd}}\) and \(V\in {}_{L}{\mathsf{Mod}}^{\mathrm{fd}}\); then there exists a composition series, a sequence of submodules \(0 = V_0 \subseteq V_1 \subseteq \cdots \subseteq V_n = V\) such that each composition factor (sometimes called a section) \(V_i/V_{i-1}\) is irreducible/simple. Moreover, any two composition series admit the same composition factors with the same multiplicities, up to rearrangement and isomorphism.
If \(V = W \oplus W'\) with \(W, W'\) simple, there are two composition series:
These aren’t equal, since they’re representations on different coset spaces, but are isomorphic.
If \(\phi: L\to {\mathfrak{gl}}(V)\) is an irreducible representation, then \({ \operatorname{End} }_L(V)\cong { \mathbf{F} }\).
If \(V\) is irreducible then every \(f\in {}_{L}{\mathsf{Mod}}(V, V)\) is either zero or an isomorphism since \(f(V) \leq V\) is a submodule. Thus \({ \operatorname{End} }_L(V)\) is a division algebra over \({ \mathbf{F} }\), but the only such algebra is \({ \mathbf{F} }\) since \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\).
Letting \(f \in { \operatorname{End} }_L(V)\), it has an eigenvalue \(\lambda \in { \mathbf{F} }\), again since \({ \mathbf{F} }= \overline{{ \mathbf{F} }}\). Then \(f - \lambda I \in { \operatorname{End} }_L(V)\) has a nontrivial kernel, the \(\lambda{\hbox{-}}\)eigenspace, so it is not an isomorphism. Thus \(f - \lambda I = 0 \implies f = \lambda I\).
Schur’s lemma is not always true for Lie superalgebras.
The trivial \(L{\hbox{-}}\)module is \({ \mathbf{F} }\in {}_{L}{\mathsf{Mod}}\) equipped with the zero map \(\varphi: L\to {\mathfrak{gl}}({ \mathbf{F} })\) where \(x.1 \mathrel{\vcenter{:}}= 0\) for all \(x\in L\). Note that this is irreducible, and any two such 1-dimensional trivial modules are isomorphic by sending a basis \(\left\{{e_1}\right\}\) to \(1\in { \mathbf{F} }\).
More generally, a \(V \in {}_{L}{\mathsf{Mod}}\) is trivial iff \(x.v = 0\) for all \(x\in L, v\in V\); any such \(V\) is completely reducible, being a direct sum of copies of the 1-dimensional trivial module above.
Let \(V, W\in {}_{L}{\mathsf{Mod}}\), then the following are all \(L{\hbox{-}}\)modules:

- the dual \(V {}^{ \vee }= \mathop{\mathrm{Hom}}_{ \mathbf{F} }(V, { \mathbf{F} })\), with \((x.f)(v) \mathrel{\vcenter{:}}= -f(x.v)\);
- the tensor product \(V \otimes_{ \mathbf{F} }W\), with \(x.(v\otimes w) \mathrel{\vcenter{:}}= x.v\otimes w + v\otimes x.w\);
- \(\mathop{\mathrm{Hom}}_{ \mathbf{F} }(V, W)\), with \((x.f)(v) \mathrel{\vcenter{:}}= x.f(v) - f(x.v)\).
These structures come from the Hopf algebra structure on the universal associative algebra \(U({\mathfrak{g}})\), called the universal enveloping algebra. Note that we also have \begin{align*} \mathop{\mathrm{Hom}}_{\mathbf{C}}(V, W)\underset{ {}_{L}{\mathsf{Mod}}}{\cong} V {}^{ \vee }\otimes_{ \mathbf{F} }W .\end{align*}
Last time: \(L\) semisimple over \({\mathbf{C}}\) implies \(\kappa(x,y)\mathrel{\vcenter{:}}=\operatorname{tr}( { \operatorname{ad}}_x { \operatorname{ad}}_y)\) is nondegenerate. More generally, for any faithful representation \(\phi: L\to {\mathfrak{gl}}(V)\) we can define a symmetric bilinear form \(\beta_\phi\) on \(L\) by \(\beta_\phi(x, y) = \operatorname{tr}(\phi(x) \phi(y))\); note that \(\beta_{ { \operatorname{ad}}} = \kappa\). Using Cartan’s criterion, one shows \(\mathop{\mathrm{Rad}}(\beta_\phi)\) is a solvable ideal of \(L\), hence \(\mathop{\mathrm{Rad}}(\beta_\phi) = 0\) and \(\beta_\phi\) is nondegenerate. This defines an isomorphism \(L { \, \xrightarrow{\sim}\, }L {}^{ \vee }\) by \(x\mapsto \beta(x, {-})\), so given a basis \({\mathcal{B}}\mathrel{\vcenter{:}}=\left\{{x_i}\right\}_{i\leq n}\) for \(L\) there is a unique dual basis \({\mathcal{B}}' = \left\{{y_i}\right\}_{i\leq n}\) for \(L\) such that \(\beta(x_i, y_j) = \delta_{ij}\). Note that the \(y_i \in L\) are dual to the basis \(\beta(x_i, {-}) \in L {}^{ \vee }\).
For \(L\mathrel{\vcenter{:}}={\mathfrak{sl}}_2({\mathbf{C}})\), the matrix of \(\kappa\) is given by \({ \begin{bmatrix} {0} & {0} & {4} \\ {0} & {8} & {0} \\ {4} & {0} & {0} \end{bmatrix} }\) with respect to the ordered basis \({\mathcal{B}}=\left\{{x,h,y}\right\}\).14 Thus \({\mathcal{B}}' = \left\{{{1\over 4}y, {1\over 8}h, {1\over 4}x}\right\}\).
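As a sanity check of one entry: \([hx] = 2x\), \([hh] = 0\), \([hy] = -2y\), so \({ \operatorname{ad}}_h = \operatorname{diag}(2, 0, -2)\) in this ordered basis and
\begin{align*}
\kappa(h,h) = \operatorname{tr}( { \operatorname{ad}}_h^2 ) = 4 + 0 + 4 = 8
.\end{align*}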
Now let \(\phi: L\to {\mathfrak{gl}}(V)\) be a faithful irreducible representation. Fix a basis \({\mathcal{B}}= \left\{{x_i}\right\}\) of \(L\) with dual basis \({\mathcal{B}}' = \left\{{y_i}\right\}\) with respect to \(\beta_\phi\) as above, and define the Casimir element \begin{align*} c_\phi = c_\phi(\beta) \mathrel{\vcenter{:}}=\sum_{i\leq n} \phi(x_i) \circ \phi(y_i) \in { \operatorname{End} }_{\mathbf{C}}(V) .\end{align*}
One can show that \(c_\phi\) commutes with \(\phi(L)\). Supposing \(\phi\) is irreducible, \({ \operatorname{End} }_L(V) = {\mathbf{C}}\) by Schur’s lemma, so \(c_\phi\) acts on \(V\) as a scalar. Which scalar follows from \begin{align*} \operatorname{tr}(c_ \varphi) = \sum_{i\leq n} \operatorname{tr}( \varphi(x_i) \varphi(y_i) ) = \sum_{i\leq n} \beta(x_i, y_i) = n = \dim L \implies c_\phi = {\dim L \over \dim V} \operatorname{id}_V ,\end{align*} since a scalar operator on \(V\) has trace equal to the scalar times \(\dim V\). In particular, \(c_\phi\) is independent of the choice of \({\mathcal{B}}\). This will be used to prove Weyl’s theorem, one of the main theorems of semisimple Lie theory over \({\mathbf{C}}\). If \(L\) is semisimple and \(\phi\) is not faithful, replace \(L\) by \(L/\ker \phi\). Since \(\ker \phi {~\trianglelefteq~}L\) and \(L\) is semisimple, \(\ker \phi\) is a direct sum of certain simple ideals of \(L\) and the quotient is isomorphic to the sum of the remaining ideals. This yields a representation \(\overline{\phi}: L/\ker \varphi \to {\mathfrak{gl}}(V)\) which is faithful and can be used to define a Casimir operator.
Let \(L = {\mathfrak{sl}}_2({\mathbf{C}})\) and let \(V = {\mathbf{C}}^2\) be the natural representation, so \(\phi: L\to {\mathfrak{gl}}(V)\) is the identity. Fix \({\mathcal{B}}= \left\{{x,h,y}\right\}\), then \(\beta(u, v) = \operatorname{tr}( u v )\) since \(\phi(u) = u\) and \(\phi(v) = v\). We get the following products (lower-triangular entries omitted, since only the traces \(\beta(u,v) = \beta(v,u)\) are needed):
\(uv\) | \({ \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} }\) | \({ \begin{bmatrix} {1} & {0} \\ {0} & {-1} \end{bmatrix} }\) | \({ \begin{bmatrix} {0} & {0} \\ {1} & {0} \end{bmatrix} }\) |
---|---|---|---|
\({ \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} }\) | \(0\) | \({ \begin{bmatrix} {0} & {-1} \\ {0} & {0} \end{bmatrix} }\) | \({ \begin{bmatrix} {1} & {0} \\ {0} & {0} \end{bmatrix} }\) |
\({ \begin{bmatrix} {1} & {0} \\ {0} & {-1} \end{bmatrix} }\) | \({ \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} }\) | \(I\) | \({ \begin{bmatrix} {0} & {0} \\ {-1} & {0} \end{bmatrix} }\) |
\({ \begin{bmatrix} {0} & {0} \\ {1} & {0} \end{bmatrix} }\) | \({ \begin{bmatrix} {0} & {0} \\ {0} & {1} \end{bmatrix} }\) | \({ \begin{bmatrix} {0} & {0} \\ {1} & {0} \end{bmatrix} }\) | \(0\) |
Thus \(\beta = { \begin{bmatrix} {0} & {0} & {1} \\ {0} & {2} & {0} \\ {1} & {0} & {0} \end{bmatrix} }\), and \({\mathcal{B}}' = \left\{{y, {1\over 2}h, x}\right\}\), so \begin{align*} c_\phi = xy + {1\over 2}h^2 + yx = { \begin{bmatrix} {1} & {0} \\ {0} & {0} \end{bmatrix} } + {1\over 2}I + { \begin{bmatrix} {0} & {0} \\ {0} & {1} \end{bmatrix} } = {3\over 2}I = {\dim {\mathfrak{sl}}_2({\mathbf{C}}) \over \dim {\mathbf{C}}^2} I .\end{align*}
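Continuing the snippet above (a sketch, reusing `basis`, `x`, `h`, `y`), the trace form and the Casimir of the natural representation can be checked directly:

```python
# Trace form beta(u,v) = tr(uv) of the natural representation
beta = np.array([[np.trace(u @ v) for v in basis] for u in basis])
print(beta)  # [[0,0,1],[0,2,0],[1,0,0]], so B' = {y, h/2, x}

# Casimir element: c = x*y + (1/2)h^2 + y*x
c = x @ y + 0.5 * (h @ h) + y @ x
print(c)     # (3/2)*I, i.e. (dim sl2)/(dim C^2) times the identity
```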
Let \(\phi: L\to {\mathfrak{gl}}(V)\) be a representation of a semisimple Lie algebra, then \(\phi(L) \subseteq {\mathfrak{sl}}(V)\). In particular, \(L\) acts trivially on any 1-dimensional \(L{\hbox{-}}\)module since a \(1\times 1\) traceless matrix is zero. The proof follows from \(L = [LL]\).
Arbitrary reductive Lie algebras can have nontrivial 1-dimensional representations.
Let \(\phi: L\to {\mathfrak{gl}}(V)\) be a finite-dimensional representation of a finite-dimensional semisimple Lie algebra over \({\mathbf{C}}\). Then \(\phi\) is completely reducible.
This is not true in characteristic \(p\), nor for infinite-dimensional representations in characteristic 0.
Replace \(L\) by \(L/\ker \phi\) if necessary to assume \(\phi\) is faithful, since these yield the same module structures. Define a Casimir operator \(c_\phi\) as before, and recall that complete reducibility of \(V\) is equivalent to every \(L{\hbox{-}}\)submodule \(W\leq V\) admitting a complementary submodule \(W''\) such that \(V = W \oplus W''\). We proceed by induction on \(\dim V\), where the dimension 1 case is clear.
Case I: \(\operatorname{codim}_V W = 1\), i.e. \(\dim (V/W) = 1\). Take the SES \(W \hookrightarrow V \twoheadrightarrow V/W\).
Case 1: Suppose \(W' \leq W\) is a proper nonzero \(L{\hbox{-}}\)submodule. Schematic:
Using the 2nd isomorphism theorem, there is a SES \(W/W' \hookrightarrow V/W' \twoheadrightarrow V/W\). Since \(\dim W' > 0\), \(\dim V/W' < \dim V\), so by the IH there is a 1-dimensional complement to \(W/W'\) in \(V/W'\). This complement lifts to a submodule \(\tilde W \leq V\) with \(W' \leq \tilde W\) and \(\dim \tilde W/W' = 1\), and moreover \(V/W' = W/W' \oplus \tilde W/W'\). Take the SES \(W' \hookrightarrow\tilde W \twoheadrightarrow\tilde W/W'\) with \(\dim \tilde W < \dim V\). Apply the IH again to get a submodule \(X \leq \tilde W \leq V\) with \(\tilde W = W' \oplus X\). We’ll continue by showing \(V = W \oplus X\).
Recall: we’re proving Weyl’s theorem, i.e. every finite-dimensional representation of a semisimple Lie algebra over \({\mathbf{C}}\) is completely reducible. Strategy: show every \(W \leq_L V\) has a complement \(W'' \leq_L V\) such that \(V = W \oplus W''\); induct on \(\dim V\).
Case I: \(\dim V/W = 1\).
Case 1: \(W\) is reducible. We got \(0 < W' < W < V\) (proper submodules), represented schematically by a triangle. We showed \(V/W' \cong W/W' \oplus \tilde W/W'\), which gives

1. \(\tilde W \cap W = W'\), and
2. \(V = W + \tilde W\).
Thus replacing \(\tilde W\) by \(W' \oplus X\) in the second point yields \(V = W + \tilde W = W + W' +X = W + X\); we want to prove this sum is direct. Since \(X\) is contained in \(\tilde W\), we can write \begin{align*} X \cap W &= (X \cap\tilde W) \cap W \\ &= X \cap(\tilde W \cap W) \\ &= X \cap W' \qquad\text{by 1}\\ &= 0 \qquad \text{since } \tilde W = W' \oplus X ,\end{align*} so \(V = W \oplus X\).
Case 2: \(W\) is irreducible. Let \(c_\phi\) be the Casimir element of \(\phi\), and note that \(c_\phi(W) \subseteq W\) since \(c_\phi\) is built out of endomorphisms in \(\phi(L)\) sending \(W\) to \(W\) (since \(W\) is a submodule). In fact, \(\phi(L)(V) \subseteq W\) since \(V/W\) is a 1-dimensional representation of the semisimple Lie algebra \(L\), hence trivial. Thus \(c_\phi(V) \subseteq W\), and so \(\ker c_\phi \neq 0\) since \(W < V\) is proper. Note also that \(c_\phi\) commutes with everything in \(\phi(L)\) on \(V\), and so defines a morphism \(c_\phi \in {}_{L}{\mathsf{Mod}}(V, V)\) and \(\ker c_\phi \leq_L V\). On the other hand, \(c_\phi\) induces an element of \({ \operatorname{End} }_{L}(W)\), and since \(W\) is irreducible, \({ \left.{{c_\phi}} \right|_{{W}} } = \lambda \operatorname{id}_W\) for some scalar \(\lambda\). This can’t be zero, since \(\lambda \dim W = \operatorname{tr}({ \left.{{c_\phi}} \right|_{{W}} }) = \operatorname{tr}(c_\phi) = \dim L > 0\), so \(\ker c_\phi \cap W = 0\). Since \(\operatorname{codim}_V W = 1\), i.e. \(\dim W = \dim V - 1\), we must have \(\dim \ker c_\phi = 1\) and we have a direct sum decomposition \(V = W \oplus \ker c_\phi\).
Use of the Casimir element in basic Lie theory: producing a complement to an irreducible submodule.
Case II: Suppose \(0 < W < V\) with \(W\) any nonzero proper \(L{\hbox{-}}\)submodule; there is a SES \(W \hookrightarrow V \twoheadrightarrow V/W\). Consider \(H \mathrel{\vcenter{:}}=\hom_{\mathbf{C}}(V, W)\), then \(H \in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}\) by \((x.f)(v) \mathrel{\vcenter{:}}= x.(f(v)) - f(x.v)\) for \(f\in H, x\in L, v\in V\). Let \({\mathcal{V}}\mathrel{\vcenter{:}}=\left\{{f \in H {~\mathrel{\Big\vert}~}{ \left.{{f}} \right|_{{W}} } = \alpha \operatorname{id}_W \,\text{ for some } \alpha \in {\mathbf{C}}}\right\} \subseteq H\). For \(f\in {\mathcal{V}}\) and \(w\in W\), we have \begin{align*} (x.f)(w) = x.f(w) - f(x.w) = \alpha x.w - \alpha x.w = 0 .\end{align*} So let \({\mathcal{W}}\mathrel{\vcenter{:}}=\left\{{f\in {\mathcal{V}}{~\mathrel{\Big\vert}~}f(W) = 0}\right\} \subseteq {\mathcal{V}}\), then we’ve shown that \(L.{\mathcal{V}}\subseteq {\mathcal{W}}\). Now roughly, the complement is completely determined by the scalar. Rigorously, since \(\dim {\mathcal{V}}/{\mathcal{W}}= 1\), any \(f\in {\mathcal{V}}\) is determined modulo \({\mathcal{W}}\) by the scalar \(\alpha\) with \({ \left.{{f}} \right|_{{W}} } = \alpha \operatorname{id}_W\): we have \(f-\alpha \chi_W \in {\mathcal{W}}\), where \(\chi_W\) is any extension of \(\operatorname{id}_W\) to \(V\), e.g. obtained by extending a basis of \(W\) to \(V\) and having \(\chi_W\) act by zero on the new basis elements.
Now \({\mathcal{W}}\hookrightarrow{\mathcal{V}}\twoheadrightarrow{\mathcal{V}}/{\mathcal{W}}\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}\) with \(\operatorname{codim}_{\mathcal{V}}{\mathcal{W}}= 1\). By Case I, \({\mathcal{V}}= {\mathcal{W}}\oplus {\mathcal{W}}''\) for some complementary \(L{\hbox{-}}\)submodule \({\mathcal{W}}''\). Let \(f: V\to W\) span \({\mathcal{W}}''\), then \({ \left.{{f}} \right|_{{W}} }\) is a nonzero scalar – a scalar since it’s in \({\mathcal{V}}\), and nonzero since it’s in the complement of \({\mathcal{W}}\). By rescaling, we can assume the scalar is 1, so \(\operatorname{im}f = W\) and by rank-nullity \(\dim \ker f = \dim V - \dim W\). Thus \(\ker f\) has the right dimension to be the desired complement. It is an \(L{\hbox{-}}\)submodule: \(L.f \subseteq {\mathcal{W}}'' \cap{\mathcal{W}}= 0\), since \({\mathcal{W}}''\) is an \(L{\hbox{-}}\)submodule and \(L.f \subseteq L.{\mathcal{V}}\subseteq {\mathcal{W}}\). Noting that \(x.f = 0\) means \(x.(f(v)) = f(x.v)\) for all \(v\), this makes \(f\) an \({\mathsf{L}{\hbox{-}}\mathsf{Mod}}\) morphism. Thus \(W'' \mathrel{\vcenter{:}}=\ker f \leq_L V\), and \(W \cap W'' = 0\) since \({ \left.{{f}} \right|_{{W}} } = \operatorname{id}_W\). Since the dimensions add up correctly, we get \(V = W \oplus W''\).
Let \(L \leq {\mathfrak{gl}}(V)\) be a subalgebra with \(L\) semisimple and finite-dimensional. Given \(x\in L\), there are two decompositions: the usual JCF \(x = s + n\), and the abstract decomposition \({ \operatorname{ad}}_x = { \operatorname{ad}}_{x_s} + { \operatorname{ad}}_{x_n}\). \(L\) contains the semisimple and nilpotent parts of all of its elements, and in particular the two above decompositions coincide.
The proof is technical, but here’s a sketch:
If \(L \in \mathsf{Lie} \mathsf{Alg}_{/ {{\mathbf{C}}}}^{{\mathrm{ss}}}\) (not necessarily linear) and \(\phi: L \to {\mathfrak{gl}}(V)\) is a finite-dimensional representation, writing \(x=s+n\) for the abstract Jordan decomposition, \(\phi(x) = \phi(s) + \phi(n)\) is the usual JCF of \(\phi(x)\in {\mathfrak{gl}}(V)\).
Consider \({ \operatorname{ad}}_{\phi(L)}\phi(s)\) and \({ \operatorname{ad}}_{\phi(L)}\phi(n)\), which are respectively semisimple (it acts diagonalizably, decomposing the space into a direct sum of eigenspaces) and nilpotent, and which commute, yielding the abstract Jordan decomposition of \({ \operatorname{ad}}_{\phi(x)}\). Now apply the theorem.
If \(L\in \mathsf{Lie} \mathsf{Alg}\) and \(s\) is a semisimple element, then \(\phi(s)\) is semisimple in any finite-dimensional representation \(\phi\) of \(L\). In particular, taking the natural representation, this yields a semisimple operator. For the same reason, \({ \operatorname{ad}}_s\) is semisimple. Similar statements hold for nilpotent elements.
Let \(L \mathrel{\vcenter{:}}={\mathfrak{sl}}_2({\mathbf{C}})\), and recall the standard basis \(\left\{{x,h,y}\right\}\) with relations \([hx] = 2x\), \([hy] = -2y\), \([xy] = h\).
Goal: classify \({\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\). By Weyl’s theorem, they’re all completely reducible, so it’s enough to describe the simple objects.
Note that \(L \leq {\mathfrak{gl}}_2({\mathbf{C}}) = {\mathfrak{gl}}({\mathbf{C}}^2)\), and since \(h\) is semisimple, \(\phi(h)\) acts semisimply on any finite-dimensional representation \(V\) with \(\phi: L\to {\mathfrak{gl}}(V)\). I.e. \(\phi(h)\) acts diagonally on \(V\). Thus \(V = \bigoplus _{ \lambda} V_{ \lambda}\) which are eigenspaces for the \(\phi(h)\) action, where \begin{align*} V_\lambda \mathrel{\vcenter{:}}=\left\{{v\in V {~\mathrel{\Big\vert}~}h.v = \lambda v, \lambda\in {\mathbf{C}}}\right\} .\end{align*} If \(V_\lambda \neq 0\) we say \(\lambda\) is a weight of \(h\) in \(V\) and \(V_\lambda\) is the corresponding weight space.
If \(v\in V_ \lambda\), then \(x.v \in V_ {\lambda+2}\) and \(y.v \in V_{ \lambda- 2}\).
Prove this using the commutation relations.
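For the first claim, the check is one line, using \([hx] = 2x\) (the second claim is identical with \(y\) and \([hy]=-2y\)): for \(v\in V_\lambda\), \begin{align*} h.(x.v) = [hx].v + x.(h.v) = 2x.v + \lambda x.v = (\lambda + 2)\, x.v .\end{align*}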
Note that if \(V\) is finite-dimensional then there can not be infinitely many nonzero \(V_\lambda\), so there exists a \(\lambda\in {\mathbf{C}}\) such that \(V_{ \lambda} \neq 0\) but \(V_{ \lambda+ 2} = 0\). We call \(\lambda\) a highest weight (h.w.) of \(V\) (which will turn out to be unique) and any nonzero \(v\in V_{\lambda}\) a highest weight vector.
Let \(V \in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{{\mathrm{fd}}, {\mathrm{irr}}}\) and let \(v_0\in V_{ \lambda}\) be a h.w. vector. Set \(v_{-1} = 0\) and \(v_i \mathrel{\vcenter{:}}={1\over i!}y^i. v_0\) for \(i\geq 0\); then for \(i\geq 0\),
In parts:
By the lemma and induction on \(i\).
Clear!
Follows from \(i x.v_i = x.(y.v_{i-1}) = y.(x.v_{i-1}) + [xy].v_{i-1}\) and induction on \(i\).
Some useful facts:
The nonzero \(v_i\) are linearly independent since they are eigenvectors of \(h\) with different eigenvalues – this is a linear algebra fact.
The subspace of \(V\) spanned by the nonzero \(v_i\) is an \(L{\hbox{-}}\)submodule, but since \(V\) is irreducible the \(v_i\) must form a basis for \(V\).
Since \(V\) is finite-dimensional, there must be a smallest \(m\geq 0\) such that \(v_m \neq 0\) but \(v_{m+1} = 0\), and thus \(v_{m+k} = 0\) for all \(k\). Thus \(\dim_{\mathbf{C}}V = m+1\) with basis \(\left\{{v_0, v_1, \cdots, v_m}\right\}\).
Since \(v_{m+1} = 0\), we have \(0 = x.v_{m+1} = ( \lambda- m) v_m\) where \(v_m\neq 0\), so \(\lambda = m \in {\mathbf{Z}}_{\geq 0}\). Thus its highest weight is a non-negative integer, equal to 1 less than the dimension. We’ll reserve \(\lambda\) to denote a highest weight and \(\mu\) an arbitrary weight.
Thus the weights of \(V\) are \(\left\{{m, m-2, \cdots, \star, \cdots, -m+2, -m}\right\}\) where \(\star = 0\) or \(1\) depending on if \(m\) is even or odd respectively, each occurring with multiplicity one (using that \(\dim V_{\mu} = 1\) if \(\mu\) is a weight of \(V\)).
Let \(V \in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{{\mathrm{fd}}, {\mathrm{irr}}}\) for \(L\mathrel{\vcenter{:}}={\mathfrak{sl}}_2({\mathbf{C}})\), then
Relative to \(h\), \(V\) is a direct sum of weight spaces \(V_\mu\) for \(\mu \in \left\{{m, m-2,\cdots, -m+2, -m}\right\}\) where \(m+1=\dim V\) and each weight space is 1-dimensional.
\(V\) has a unique (up to nonzero scalar multiples) highest weight vector whose weight (the highest weight of \(V\)) is \(m\in {\mathbf{Z}}_{\geq 0}\)
The action \(L\curvearrowright V\) is given explicitly as in the lemma if the basis is chosen in a prescribed fashion. In particular, there exists a unique finite-dimensional irreducible \({\mathfrak{sl}}_2{\hbox{-}}\)module of dimension \(m+1\) up to isomorphism.
Let \(V \in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{{\mathrm{fd}}}\) (not necessarily irreducible), then the eigenvalues of \(h\curvearrowright V\) are all integers, and each occurs along with its negative with the same multiplicity. Moreover, in a decomposition of \(V\) into a direct sum of irreducibles, the number of simple summands is \(\dim V_0 + \dim V_1\).
Existence of irreducible highest weight modules of highest weight \(m \geq 0\):
The formula in the lemma can be used to construct an irreducible representation of \(L\) having highest weight \(\lambda= m\) for any \(m\in {\mathbf{Z}}_{\geq 0}\), which is unique up to isomorphism and denoted \(L(m)\) (or \(V(m)\) in Humphreys) which has dimension \(m+1\). In fact, the formulas can be used to define an infinite-dimensional representation of \(L\) with highest weight \(\lambda\) for any \(\lambda\in {\mathbf{C}}\), denoted \(M( \lambda)\) (a Verma module) – we just don’t decree that \(v_{m+1} = 0\), yielding a basis \(\left\{{v_0, v_1,\cdots}\right\}\). This yields a decomposition into 1-dimensional weight spaces \(M( \lambda) = \bigoplus _{i=0}^\infty M_{ \lambda-2i}\) where \(M_{ \lambda-2i} = \left\langle{v_i}\right\rangle_{\mathbf{C}}\).
Recall that the relations from last time can produce an infinite-dimensional module with basis \(\left\{{v_0,v_1,\cdots}\right\}\). Note that if \(m\in {\mathbf{Z}}_{\geq 0}\), then \(x.v_{m+1} = ( \lambda- m ) v_m = 0\). This says that one can’t raise \(v_{m+1}\) back to \(v_m\), so \(\left\{{v_{m+1},v_{m+2} \cdots}\right\}\) spans a submodule isomorphic to \(M(-m-2)\). Quotienting yields \(L(m) \mathrel{\vcenter{:}}= M(m) / M(-m-2)\), also called \(V(m)\), spanned by \(\left\{{v_0, \cdots, v_m}\right\}\). Note that \(M(-m-2)\) and \(L(m)\) are irreducible.
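These formulas are easy to verify by machine; the following sketch (with a hypothetical helper `irrep`, using the standard action formulas as in the lemma) builds the matrices of \(x, h, y\) on \(L(m)\) and checks the \({\mathfrak{sl}}_2\) relations:

```python
import numpy as np

def irrep(m):
    """Matrices of x, h, y on L(m) in the basis v_0, ..., v_m, where
    h.v_i = (m-2i)v_i, y.v_i = (i+1)v_{i+1}, x.v_i = (m-i+1)v_{i-1}."""
    n = m + 1
    H = np.diag([float(m - 2*i) for i in range(n)])
    X, Y = np.zeros((n, n)), np.zeros((n, n))
    for i in range(n - 1):
        Y[i + 1, i] = i + 1   # y.v_i = (i+1) v_{i+1}
        X[i, i + 1] = m - i   # x.v_{i+1} = (m-i) v_i
    return X, H, Y

X, H, Y = irrep(3)
assert np.allclose(H @ X - X @ H, 2 * X)    # [hx] = 2x
assert np.allclose(H @ Y - Y @ H, -2 * Y)   # [hy] = -2y
assert np.allclose(X @ Y - Y @ X, H)        # [xy] = h
```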
Let \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{\mathbf{C}}}}^{{\mathrm{fd}}, {\mathrm{ss}}}\) for this chapter.
Let \(x = x_s + x_n\) be the abstract Jordan decomposition of \(x\in L\). If \(x=x_n\) for every \(x\in L\), then every element of \(L\) is ad-nilpotent, so \(L\) is nilpotent by Engel’s theorem, contradicting semisimplicity. Thus there exists some \(x\in L\) with \(x_s\neq 0\).
A toral subalgebra is any nonzero subalgebra spanned by semisimple elements.
The name comes from the algebraic torus \(({\mathbf{C}}^{\times})^n\), whose Lie algebra is \({\mathbf{C}}^n\), thought of as diagonal matrices.
Let \(H\) be a maximal toral subalgebra of \(L\). Lemma: any toral subalgebra \(T \subseteq L\) is abelian.
Let \(T \leq L\) be toral and let \(x\in T\) be a basis element. Since \(x\) is semisimple, it STS \({ \operatorname{ad}}_{T, x} = 0\). Semisimplicity of \(x\) implies \({ \operatorname{ad}}_{L, x}\) is diagonalizable, so we want to show \({ \operatorname{ad}}_{T, x}\) has no nonzero eigenvalues. Suppose that there exists a nonzero \(y\in T\) such that \({ \operatorname{ad}}_{T, x}(y) = \lambda y\) for \(\lambda \neq 0\). Then \({ \operatorname{ad}}_{T, y}(x) = [yx] = -[xy] = - { \operatorname{ad}}_{T, x}(y) = - \lambda y\neq 0\), and since \({ \operatorname{ad}}_{T, y}(y) = [yy] = 0\), \(y\) is an eigenvector of \({ \operatorname{ad}}_{T, y}\) with eigenvalue zero. Since \({ \operatorname{ad}}_{T, y}\) is also diagonalizable and \(x\in T\), write \(x\) as a linear combination of eigenvectors for it, say \(x = \sum a_i v_i\). Then \({ \operatorname{ad}}_{T, y}(x) = \sum \lambda_i a_i v_i\) and the terms with \(\lambda_i = 0\) vanish, so \({ \operatorname{ad}}_{T, y}(x)\) is a sum of eigenvectors with nonzero eigenvalues. But \({ \operatorname{ad}}_{T, y}(x) = -\lambda y\) is a nonzero eigenvector with eigenvalue zero. \(\contradiction\)
If \(L = {\mathfrak{sl}}_n({\mathbf{C}})\) then define \(H\) to be the set of diagonal matrices in \(L\). Then \(H\) is toral and in fact maximal: if \(H' \mathrel{\vcenter{:}}= H \oplus {\mathbf{C}}z\) for some \(z\in L\setminus H\) were a larger toral subalgebra, one can find an \(h\in H\) such that \([hz] \neq 0\), making \(H'\) nonabelian, but toral subalgebras must be abelian.
Recall that a commuting family of diagonalizable operators on a finite-dimensional vector space can be simultaneously diagonalized. Letting \(H \leq L\) be maximal toral, this applies to \({ \operatorname{ad}}_L(H)\), and thus there is a basis in which all operators in \({ \operatorname{ad}}_L(H)\) are diagonal. Set \(L_{ \alpha} \mathrel{\vcenter{:}}=\left\{{x\in L {~\mathrel{\Big\vert}~}[hx] = \alpha(h) x\,\,\forall h\in H}\right\}\) where \(\alpha: H\to {\mathbf{C}}\) is linear and thus an element of \(H {}^{ \vee }\). Note that \(L_0 = C_L(H)\), and the set \(\Phi\mathrel{\vcenter{:}}=\left\{{\alpha\in H {}^{ \vee }{~\mathrel{\Big\vert}~}\alpha\neq 0, L_\alpha\neq 0}\right\}\) is called the roots of \(H\) in \(L\), and \(L_\alpha\) is called a root space. Note that \(L_0\) is not considered a root space. This induces a root space decomposition \begin{align*} L = C_L(H) \oplus \bigoplus_{ \alpha\in \Phi } L_\alpha .\end{align*}
Note that for classical algebras, we’ll show \(C_L(H) = H\) and corresponds to the standard bases given early in the book.
Type \(A_n\) yields \({\mathfrak{sl}}_{n+1}({\mathbf{C}})\) and \(\dim H = n\) for \(H\) defined to be the diagonal traceless matrices. Define \({\varepsilon}_i\in H {}^{ \vee }\) as \({\varepsilon}_i \operatorname{diag}(a_1,\cdots, a_{n+1}) \mathrel{\vcenter{:}}= a_i\), then \(\Phi \mathrel{\vcenter{:}}=\left\{{{\varepsilon}_i - {\varepsilon}_j {~\mathrel{\Big\vert}~}1\leq i\neq j\leq n+1}\right\}\) and \(L_{{\varepsilon}_i - {\varepsilon}_j} = {\mathbf{C}}e_{ij}\). Why: \begin{align*} [h, e_{ij}] = \left[ \sum a_k e_{kk}, e_{ij}\right] = a_i e_{ii} e_{ij} - a_j e_{ij} e_{jj} = (a_i - a_j) e_{ij} \mathrel{\vcenter{:}}=({\varepsilon}_i -{\varepsilon}_j)(h) e_{ij} .\end{align*}
Recall that a toral subalgebra is a nonzero subalgebra all of whose elements are semisimple – these exist because any semisimple Lie algebra contains a nonzero semisimple element, and one can take its span. Let \(H\) be a fixed maximal toral subalgebra, then we have a root space decomposition \begin{align*} L = C_L(H) \oplus \bigoplus _{\alpha\in \Phi \subseteq H {}^{ \vee }} L_\alpha, \qquad L_\alpha \mathrel{\vcenter{:}}=\left\{{ x\in L {~\mathrel{\Big\vert}~}[hx] = \alpha(h) x \,\forall h\in H}\right\} .\end{align*} Let \(L\) be semisimple and finite dimensional over \({\mathbf{C}}\) from now on.
(1) \([L_\alpha, L_\beta] \subseteq L_{\alpha+\beta}\): follows from the Jacobi identity.
(2) If \(x\in L_\alpha\) with \(\alpha\neq 0\), then \({ \operatorname{ad}}_x\) is nilpotent: follows from (1), that \(\dim L < \infty\), and the root space decomposition. This is because \({ \operatorname{ad}}_x^k L_{\beta} \subseteq L_{\beta + k\alpha}\), and for each \(\beta\) only finitely many of the \(L_{\beta + k\alpha}\) are nonzero.
(3) If \(\alpha + \beta\neq 0\), then \(\kappa(L_\alpha, L_\beta) = 0\): there exists \(h\in H\) such that \((\alpha + \beta)(h) \neq 0\). For \(x\in L_\alpha, y\in L_ \beta\), \begin{align*} \alpha(h) \kappa(x,y) &= \kappa([hx], y) \\ &= -\kappa([xh], y) \\ &= -\kappa(x, [hy]) \\ &= -\beta(h)\kappa(x,y) \\ &\implies ( \alpha + \beta)(h) \kappa(x,y) = 0 \\ &\implies \kappa(x,y)=0 \text{ since } (\alpha + \beta)(h) \neq 0 .\end{align*}
\({ \left.{{\kappa}} \right|_{{L_0}} }\) is nondegenerate, since \(L_0 \perp L_{\alpha}\) for all \(\alpha\in \Phi\), but \(\kappa\) is nondegenerate. Moreover, if \(L_{ \alpha} \neq 0\) then \(L_{- \alpha}\neq 0\) by (3) and nondegeneracy.
Let \(H \leq L\) be maximal toral, then \(H = C_L(H)\).
Skipped, about 1 page of dense text broken into 7 steps. Uses the last corollary along with Engel’s theorem.
If \(L\) is a classical Lie algebra over \({\mathbf{C}}\), we choose \(H\) to be the diagonal matrices in \(L\), and if \(x\in L\setminus H\) is non-diagonal, then there exists an \(h\in H\) such that \([hx]\neq 0\). Note that toral implies abelian and nonabelian implies nontoral, thus there is no abelian subalgebra of \(L\) properly containing \(H\) – adding any nontoral element at all to \(H\) makes it nonabelian. This same argument shows \(C_L(H) = H\) since nothing else commutes with \(H\). This implies that \(L = H \oplus \bigoplus_{ \alpha \in \Phi} L_\alpha\).
\({ \left.{{\kappa}} \right|_{{H}} }\) is nondegenerate.
As a result, \(\kappa\) induces an isomorphism \(H { \, \xrightarrow{\sim}\, }H {}^{ \vee }\) by \(h\mapsto \kappa(h, {-})\) and \(H {}^{ \vee } { \, \xrightarrow{\sim}\, }H\) by \(\phi\mapsto t_\phi\), the unique element such that \(\kappa(t_\phi, {-}) = \phi({-})\). In particular, given \(\alpha\in \Phi \subset H {}^{ \vee }\) there is some \(t_\alpha\in H\). The next 3 sections are about properties of \(\Phi\):
If it does not span, choose \(h \in H\setminus\left\{{0}\right\}\) with \(\alpha(h) = 0\) for all \(\alpha\in \Phi\). Then \([h, L_ \alpha] = 0\) for all \(\alpha\), but \([h H] = 0\) since \(H\) is abelian. Using the root space decomposition, \([h L] =0\) and so \(h\in Z(L) = 0\) since \(L\) is semisimple. \(\contradiction\)
Follows from proposition 8.2 and \(\kappa(L_ \alpha, L_ \beta) = 0\) when \(\beta\neq \alpha\).
Let \(h\in H\), then \begin{align*} \kappa(h, [xy]) &= \kappa([hx], y) \\ &= \alpha(h) \kappa(x, y)\\ &= \kappa(t_\alpha, h) \kappa(x, y) \\ &= \kappa( \kappa(x, y) t_ \alpha, h) \\ &= \kappa( h, \kappa(x, y) t_ \alpha) \\ &\implies \kappa(h, [xy] - \kappa(x,y)t_\alpha) = 0 \\ &\implies [xy] = \kappa (x,y) t_ \alpha ,\end{align*} where we’ve used that \([xy]\in H\) and \(\kappa\) is nondegenerate on \(H\) and \([L_{ \alpha}, L_{ - \alpha}] \subseteq L_0 = H\).
Suppose \(\alpha(t_\alpha) = \kappa(t_{ \alpha}, t_{ \alpha}) = 0\), then for \(x\in L_{\alpha}, y\in L_{ - \alpha}\), we have \([t_ \alpha, x] = \alpha(t_ \alpha)x = 0\) and similarly \([t_ \alpha, y] = 0\). As before, find \(x\in L_{ \alpha}, y\in L_{ - \alpha}\) with \(\kappa(x,y)\neq 0\) and scale one so that \(\kappa(x, y) = 1\). Then by (c), \([x, y] = t_ \alpha\), so combining this with the previous formula yields that \(S \mathrel{\vcenter{:}}=\left\langle{x, y, t_ \alpha}\right\rangle\) is a 3-dimensional solvable subalgebra.15 Taking \({ \operatorname{ad}}: L\hookrightarrow{\mathfrak{gl}}(L)\), which is injective by semisimplicity, and similarly \({ \left.{{ { \operatorname{ad}}}} \right|_{{S}} }: S\hookrightarrow { \operatorname{ad}}(S) \leq {\mathfrak{gl}}(L)\). We’ll use Lie’s theorem to show everything here is a commutator of upper triangular, thus strictly upper triangular, thus nilpotent and reach a contradiction.
Recall the proposition from last time:
Part e: We have \(\alpha(t_ \alpha ) = \kappa(t_ \alpha, t_ \alpha)\), so suppose this is zero. Pick \(x\in L_{ \alpha}, y\in L_{ - \alpha}\) such that \(\kappa(x, y) = 1\), then
Set \(S \mathrel{\vcenter{:}}={\mathfrak{sl}}( \alpha) \mathrel{\vcenter{:}}={\mathbf{C}}\left\langle{x,y,t_ \alpha}\right\rangle\) and restrict \({ \operatorname{ad}}: L\hookrightarrow{\mathfrak{gl}}(L)\) to \(S\). Then \({ \operatorname{ad}}(S) \cong S\) by injectivity, and this is a solvable linear subalgebra of \({\mathfrak{gl}}(L)\). Apply Lie’s theorem to choose a basis for \(L\) such that the matrices for \({ \operatorname{ad}}(S)\) are upper triangular. Then use that \({ \operatorname{ad}}_L([SS]) = [ { \operatorname{ad}}_L(S) { \operatorname{ad}}_L(S)]\), which is strictly upper triangular and thus nilpotent. In particular, \({ \operatorname{ad}}_L (t_ \alpha)\) is nilpotent, but since \(t_\alpha\in H\) which is semisimple, so \({ \operatorname{ad}}_L( t_ \alpha)\) is semisimple. The only things that are semisimple and nilpotent are zero, so \({ \operatorname{ad}}_L( t_ \alpha) = 0 \implies t_\alpha = 0\). This contradicts that \(\alpha\in H {}^{ \vee }\setminus\left\{{0}\right\}\). \(\contradiction\)
Part f: Given \(x_ \alpha\in L_ \alpha\setminus\left\{{0}\right\}\), choose \(y_ \alpha \in L_{ - \alpha}\) and rescale it so that \begin{align*} \kappa(x_ \alpha, y _{\alpha} ) = {2\over \kappa(t_ \alpha, t_ \alpha)} .\end{align*} Set \(h_ \alpha \mathrel{\vcenter{:}}={2t_ \alpha\over \kappa(t_ \alpha, t_ \alpha) }\), then by (c), \([x_ \alpha, y_ \alpha] = \kappa( x_ \alpha, y_ \alpha) t_ \alpha = h_ \alpha\). So \begin{align*} [ h_ \alpha, x_ \alpha] = {2\over \alpha(t_ \alpha) }[ t_ \alpha, x_ \alpha] = {2\over \alpha(t_ \alpha)} \alpha(t_ \alpha) x_ \alpha = 2x_ \alpha ,\end{align*} and similarly \([h_ \alpha, y_ \alpha] = -2 y_ \alpha\). Now the span \(\left\langle{x_ \alpha, h_ \alpha, y_ \alpha}\right\rangle \leq L\) is a subalgebra with the same multiplication table as \({\mathfrak{sl}}_2({\mathbf{C}})\), so \(S \cong {\mathfrak{sl}}_2({\mathbf{C}})\).
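For a concrete instance (a sketch, not from lecture): take \(L = {\mathfrak{sl}}_3({\mathbf{C}})\) and \(\alpha = {\varepsilon}_1 - {\varepsilon}_2\), using \(\kappa(u,v) = 6\operatorname{tr}(uv)\) (checked in an example below). Then \(t_\alpha\) and \(h_\alpha\) can be computed directly:

```python
import numpy as np

kappa = lambda u, v: 6 * np.trace(u @ v)   # Killing form on sl3

# alpha = eps_1 - eps_2, so alpha(h) = h_11 - h_22 = tr(a h) for a below
a = np.diag([1., -1., 0.])
t_alpha = a / 6   # then kappa(t_alpha, h) = tr(a h) = alpha(h) for h in H
h_alpha = 2 * t_alpha / kappa(t_alpha, t_alpha)
print(np.diag(h_alpha))   # [ 1., -1., 0.], with alpha(h_alpha) = 2
```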
Part g: We have \(h_{ - \alpha} = {2 t_{ - \alpha} \over \kappa( t_ { - \alpha}, t_{- \alpha} ) } = - h_ \alpha\), since \(t_{-\alpha} = -t_{\alpha}\). This follows from the fact that \(H {}^{ \vee } { \, \xrightarrow{\sim}\, }H\) sends \(\alpha\mapsto t_\alpha\) and \(-\alpha\mapsto t_{- \alpha}\), but by linearity \(-\alpha\mapsto -t_{ \alpha}\).
\(L\) is generated as a Lie algebra by the root spaces \(\left\{{L_ \alpha{~\mathrel{\Big\vert}~}\alpha\in \Phi}\right\}\).
It STS \(H \subseteq \left\langle{\left\{{L_\alpha}\right\}_{\alpha\in \Phi}}\right\rangle\). Given \(\alpha\in \Phi\), \begin{align*} \exists x_\alpha\in L_ \alpha, y_ \alpha\in L_{- \alpha} \quad\text{such that}\quad \left\langle{x_ \alpha, y_ \alpha, h_ \alpha\mathrel{\vcenter{:}}=[x_ \alpha, y_ \alpha] }\right\rangle \cong {\mathfrak{sl}}_2({\mathbf{C}}) .\end{align*} Note that \(h_\alpha\in {\mathbf{C}}^{\times}t_ \alpha\), where \(t_\alpha\) corresponds to \(\alpha\in H {}^{ \vee }\) via \(\kappa\). By (a), \(\Phi\) spans \(H {}^{ \vee }\), so \(\left\{{t_\alpha}\right\}_{\alpha\in \Phi}\) spans \(H\); since each \(h_\alpha = [x_\alpha, y_\alpha] \in [L_\alpha, L_{-\alpha}]\), this puts \(H\) inside the subalgebra generated by the root spaces.
Any \(\alpha\in \Phi\) yields \({\mathfrak{sl}}(\alpha) \cong {\mathfrak{sl}}_2({\mathbf{C}})\), and in fact the generators are entirely determined by the choice of \(x_\alpha\). View \(L\in {}_{{\mathfrak{sl}}(\alpha)}{\mathsf{Mod}}\) via \({ \operatorname{ad}}\).
If \(M \leq L \in {}_{{\mathfrak{sl}}(\alpha)}{\mathsf{Mod}}\) then all eigenvalues of \(h_\alpha\curvearrowright M\) are integers.
Apply Weyl’s theorem to decompose \(M\) into a finite direct sum of irreducibles in \({}_{ {\mathfrak{sl}}_2({\mathbf{C}}) }{\mathsf{Mod}}\). The weights of \(h_\alpha\) are of the form \(m, m-2,\cdots, -m+2, -m\in {\mathbf{Z}}\).16
Let \(M = H + {\mathfrak{sl}}(\alpha) \leq L \in {}_{{\mathfrak{sl}}( \alpha)}{\mathsf{Mod}}\), which one can check is actually a submodule since bracketing either lands in \({\mathfrak{sl}}(\alpha)\) or kills elements. What does Weyl’s theorem say about this submodule? Note that \(H \cap {\mathfrak{sl}}(\alpha) = {\mathbf{C}}h_\alpha\), so the sum is not direct. Set \(K \mathrel{\vcenter{:}}=\ker\alpha \subseteq H\), so \(\operatorname{codim}_H K = 1\) by rank-nullity. Since \(h_\alpha \not\in K\), we get \(M = K \oplus {\mathfrak{sl}}(\alpha)\). Moreover \({\mathfrak{sl}}(\alpha)\curvearrowright K\) by zero, since bracketing against \(K\) acts by \(\alpha\), which vanishes on \(K\). So \(K\cong {\mathbf{C}}{ {}^{ \scriptscriptstyle\oplus^{\dim H - 1} } }\) decomposes into trivial modules.
Let \(\beta\in \Phi\cup\left\{{0}\right\}\) and define \(M \mathrel{\vcenter{:}}=\bigoplus _{c\in {\mathbf{C}}} L_{\beta + c\alpha}\); then \(M \leq L \in {}_{{\mathfrak{sl}}( \alpha)}{\mathsf{Mod}}\). It will turn out that \(L_{\beta+ c \alpha} \neq 0 \iff c\in [-r, q] \cap {\mathbf{Z}}\) with \(r, q\in {\mathbf{Z}}_{\geq 0}\).
Let \(\alpha\in \Phi\), then the root spaces \(\dim L_{\pm \alpha} = 1\), and the only multiples of \(\alpha\) which are in \(\Phi\) are \(\pm \alpha\).
Note \(L_\alpha\) can only pair nondegenerately with \(L_{- \alpha}\) under the Killing form. Set \begin{align*} M \mathrel{\vcenter{:}}=\bigoplus _{c\in {\mathbf{C}}} L_{c \alpha} = H \oplus \bigoplus _{c \alpha\in \Phi} L_{c \alpha} .\end{align*} By Weyl’s theorem, this decomposes into irreducibles. This allows us to take a complement of the decomposition from before to write \(M = K \oplus {\mathfrak{sl}}(\alpha) \oplus W\), and we WTS \(W = 0\), since \(W\) contains all of the \(L_{c \alpha}\) with \(c\neq \pm 1\). Since \(H \subseteq K \oplus {\mathfrak{sl}}( \alpha)\), we have \(W \cap H = 0\). If \(c \alpha\) is a root of \(L\), then \(h_ \alpha\) has \((c \alpha)(h_ \alpha) = 2c\) as an eigenvalue, which must be an integer by a previous lemma. So \(c\in {\mathbf{Z}}\) or \(c\in {\mathbf{Z}}+ {1\over 2}\).
Suppose \(W\neq 0\), and let \(V(s)\) (or \(L(s)\) in modern notation) be an irreducible \({\mathfrak{sl}}( \alpha){\hbox{-}}\)submodule of \(W\) for \(s\in {\mathbf{Z}}_{\geq 0}\). If \(s\) is even, \(V(s)\) contains an eigenvector \(w\) for \(h_ \alpha\) of eigenvalue zero, obtained by applying \(y_\alpha\) to a highest weight vector \(s/2\) times. We can then write \(w = \sum_{c\in {\mathbf{C}}} v_{c \alpha}\) with \(v_{ c\alpha } \in L_{ c \alpha}\), and by finiteness of direct sums we have \(v_{c \alpha} = 0\) for almost every \(c\in {\mathbf{C}}\). Then \begin{align*} 0 &= [h_ \alpha, w] \\ &= \sum_{c\in {\mathbf{C}}} [h_ \alpha, v_{ c \alpha} ] \\ &= \sum_{c\in {\mathbf{C}}} (c\alpha)( h _{\alpha} )v_{c \alpha} \\ &= \sum 2c v_{c \alpha} \\ &\implies v_{c \alpha} = 0 \text{ when } c\neq 0 ,\end{align*} forcing \(w\in H\), the zero eigenspace. But \(w\in W\), so \(w\in W \cap H = 0\). \(\contradiction\)
\(\alpha\in \Phi\implies \dim L_{\pm \alpha} = 1\), and \(\alpha, \lambda \alpha\in \Phi\implies \lambda = \pm 1\).
Consider \(M \mathrel{\vcenter{:}}=\bigoplus _{ c\in {\mathbf{C}}} L_{c \alpha} \leq L\in {}_{{\mathfrak{sl}}( \alpha)}{\mathsf{Mod}}\). Write \({\mathfrak{sl}}(\alpha) = \left\langle{ x_ \alpha, h_ \alpha, y_ \alpha}\right\rangle\); we decomposed \(M = K \oplus {\mathfrak{sl}}(\alpha) \oplus W\) where \(K = \ker \alpha \leq H\) and \(W \cap H = 0\). WTS: \(W = 0\). So far, we’ve shown that if \(L(s) \subseteq W\) for \(s\in {\mathbf{Z}}_{\geq 0}\) (which guarantees finite dimensionality), then \(s\) can’t be even – otherwise it has a weight zero eigenvector, forcing it to be in \(H\), but \(W \cap H = 0\).
Aside: \(\alpha\in \Phi \implies 2\alpha\not\in \Phi\), since \(L_{2\alpha}\) would have \(h_\alpha\)-weight \((2\alpha)(h_ \alpha) = 2\alpha(h_ \alpha) = 4\), but weights in irreducible modules have the same parity as the highest weight and no such weight exists in \(M\) (only \(0, \pm 2\) in \(K \oplus {\mathfrak{sl}}(\alpha)\) and only odd weights in \(W\)). Suppose \(L(s) \subseteq W\) and \(s\geq 1\) is odd. Then \(L(s)\) has a weight vector for \(h_ \alpha\) of weight 1. This must come from \(c=1/2\), since \(({1\over 2}) \alpha (h_ \alpha) = ({1\over 2}) 2 = 1\), so this vector is in \(L_{\alpha/2}\). However, by the aside applied to \(\alpha/2\): if \(\alpha/2\) were a root, then \(2({\alpha\over 2}) = \alpha\) would not be, a contradiction; so \(\alpha/2\not\in\Phi\).
Thus \(W\) can’t contain any irreducible summands of odd or even highest weight, so \(W = 0\). Note also that \(L_{\pm\alpha}\not\subset K \oplus W\), forcing \(L_{\pm\alpha} \subseteq {\mathfrak{sl}}( \alpha)\), so \(L_{ \alpha} = \left\langle{x_ \alpha}\right\rangle\) and \(L_{- \alpha} = \left\langle{y _{\alpha} }\right\rangle\).
Let \(\alpha, \beta\in \Phi\) with \(\beta\neq \pm \alpha\) and consider \(\beta + k \alpha\) for \(k\in {\mathbf{Z}}\).
Consider \begin{align*} M \mathrel{\vcenter{:}}=\bigoplus _{k\in {\mathbf{Z}}} L_{ \beta + k \alpha} \leq L \quad\in {}_{{\mathfrak{sl}}( \alpha)}{\mathsf{Mod}} .\end{align*}
\(\beta(h _{\alpha} )\) is the eigenvalue of \(h_ \alpha\) acting on \(L_ \beta\). By the lemma, \(\beta(h_ \alpha)\in {\mathbf{Z}}\).
By the previous proposition, \(\dim L_{ \beta+ k \alpha} = 1\) if nonzero, and the weight of \(h_\alpha\) acting on it is \(\beta( h _{\alpha} ) + 2k\), all different for distinct \(k\). By \({\mathfrak{sl}}_2{\hbox{-}}\)representation theory, the number of irreducible summands is the sum of the dimensions of the zero and one weight spaces; here that sum is 1, so \(M\) is a single irreducible \({\mathfrak{sl}}(\alpha){\hbox{-}}\)module. So write \(M \cong L(d)\) for some \(d\in {\mathbf{Z}}_{\geq 0}\), then \(h_ \alpha\curvearrowright M\) with eigenvalues \(\left\{{d,d-2,\cdots, -d+2, -d}\right\}\). But \(h_ \alpha\curvearrowright M\) with eigenvalues \(\beta( h_ \alpha) + 2k\) for those \(k\in {\mathbf{Z}}\) with \(L_{\beta + k \alpha}\neq 0\). Since the first list is an unbroken string of integers of the same parity, the \(k\) that appear must also form an unbroken string. Define \(r\) and \(q\) by setting \(d = \beta(h_\alpha) + 2q\) and \(-d =\beta( h_ \alpha ) - 2r\) to obtain the interval \([-r, q]\). Adding these yields \(0 = 2\beta( h_ \alpha) + 2q-2r\), so \(r-q = \beta(h_ \alpha)\).
Let \(M\cong L(d) \in {}_{{\mathfrak{sl}}(\alpha)}{\mathsf{Mod}}\) and \(x_\beta \in L_ \beta\setminus\left\{{0}\right\}\subseteq M\) with \(x_ \alpha\in L_{ \alpha}\). If \([x_ \alpha x_ \beta] = 0\) then \(x_ \beta\) is a maximal \({\mathfrak{sl}}(\alpha){\hbox{-}}\)vector in \(L(d)\) and thus \(d = \beta(h_ \alpha)\). But \(\alpha + \beta\in \Phi \implies \beta(h_ \alpha) + 2\) is a weight in \(M\) bigger than \(d\), a contradiction. Thus \(\alpha + \beta\in \Phi \implies [x_ \alpha x_ \beta] \neq 0\). Since this bracket is nonzero and \(\dim L_{ \alpha + \beta} = 1\), we get \([L_ \alpha L_ \beta] = L_{ \alpha + \beta}\).
Use that \(q\geq 0, r\geq 0\) to write \(-r \leq -r+q \leq q\). Then \begin{align*} \beta - \beta(h_ \alpha) \alpha = \beta - (r-q) \alpha = \beta + (-r+q) \alpha\mathrel{\vcenter{:}}=\beta + \ell\alpha \end{align*} where \(\ell\in [-r, q]\). Thus \(\beta + \ell\alpha\in \Phi\), since the root string is unbroken by (b).
Is it true that \(\bigoplus_{k\in {\mathbf{Z}}} L_{\beta+ k \alpha} = \bigoplus _{c\in {\mathbf{C}}} L_{\beta + c\alpha}\)? The issue is that \(c\in {\mathbf{Z}}+ {1\over 2}\) is still possible.
Recall that \(\kappa\) restricts to a nondegenerate bilinear form on \(H\), inducing \(H {}^{ \vee } { \, \xrightarrow{\sim}\, }H\) via \(\phi\mapsto t_\phi\) where \(\kappa(t_\phi, {-}) = \phi({-})\). Transfer this to a nondegenerate symmetric bilinear form on \(H {}^{ \vee }\) by \((\lambda, \mu) \mathrel{\vcenter{:}}=\kappa(t_\lambda, t_\mu)\). By prop 8.3 we know \(H {}^{ \vee }\) is spanned by \(\Phi\), so choose a \({\mathbf{C}}{\hbox{-}}\)basis \(\left\{{ \alpha_1,\cdots, \alpha_n}\right\} \subseteq \Phi\). Given \(\beta\in\Phi\), write \(\beta = \sum c_i \alpha_i\) with \(c_i\in {\mathbf{C}}\).
\(c_i \in {\mathbf{Q}}\) for all \(i\)!
Setup:
Decompose \(L = H \oplus \bigoplus _{ \alpha\in \Phi} L_{ \alpha}\)
Use the isomorphism \begin{align*} H & { \, \xrightarrow{\sim}\, }H {}^{ \vee }\\ t_{ \varphi} &\mapsto \varphi \end{align*} to define \((\lambda, \mu) \mathrel{\vcenter{:}}=\kappa(t_ \lambda, t_ \mu)\) on \(H {}^{ \vee }\).
Choose a basis \(\left\{{ \alpha_i}\right\} \subseteq \Phi \subseteq H {}^{ \vee }\)
For any \(\beta \in \Phi\), write \(\beta= \sum c_i \alpha_i\) with \(c_i\in {\mathbf{C}}\). Then
\begin{align*} c_i\in {\mathbf{Q}} .\end{align*}
Write \(( \beta, \alpha_j) = \sum c_i (\alpha_i, \alpha_j)\) and multiply both sides by \({2\over (\alpha_j, \alpha_j)}\): \begin{align*} {2 (\beta, \alpha_j) \over (\alpha_j, \alpha_j) } = \sum c_i {2 (\alpha_i, \alpha_j) \over (\alpha_j, \alpha_j) } ,\end{align*} where the LHS is in \({\mathbf{Z}}\), as is each \(2( \alpha_i, \alpha_j) \over (\alpha_j, \alpha_j)\). On the other hand \begin{align*} {2 (\beta, \alpha_j) \over (\alpha_j, \alpha_j) } = {2 \kappa(t_ \beta, t_{\alpha_j} ) \over \kappa(t_{ \alpha_j}, t_{ \alpha_j} ) } = \kappa(t_ \beta, h_{\alpha_j} ) = \beta(h_{ \alpha_j}) \end{align*} using that \(( \alpha_j, \alpha_j) = \kappa( t_{ \alpha_j}, t_{ \alpha_j} )\neq 0\) from before.17 Since \(\left\{{ \alpha_i}\right\}\) is a basis for \(H {}^{ \vee }\) and \(({-}, {-})\) is nondegenerate, the matrix \([ ( \alpha_i, \alpha _j) ]_{1\leq i, j\leq n}\) is invertible. Thus so is \(\left[ 2 ( \alpha_i, \alpha_j) \over (\alpha_j, \alpha_j ) \right]_{1\leq i,j \leq n}\), since it’s obtained by scaling each column by a nonzero scalar, and one can solve for the \(c_i\) by inverting it. Since this is an invertible integer matrix and the LHS entries are integers, the only denominators introduced come from its determinant, a nonzero integer, so the \(c_i\) lie in \({\mathbf{Q}}\).
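As a toy illustration (type \(A_2\) assumed for the example: both simple roots the same length and \(\beta = \alpha_1 + \alpha_2\)), the matrix \(\left[{2(\alpha_i, \alpha_j) \over (\alpha_j, \alpha_j)}\right]\) is integral and invertible, and solving the system recovers rational coefficients:

```python
import numpy as np

A = np.array([[2., -1.], [-1., 2.]])  # [2(a_i,a_j)/(a_j,a_j)] for A2
b = np.array([1., 1.])                # the integers 2(beta,a_j)/(a_j,a_j)
print(np.linalg.solve(A, b))          # [1. 1.]: the coefficients c_i
```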
Given \(\lambda, \mu \in H {}^{ \vee }\) then \begin{align*} (\lambda, \mu) = \kappa(t_ \lambda, t_\mu) = \operatorname{Trace}( { \operatorname{ad}}_{t_ \lambda} \circ { \operatorname{ad}}_{t_\mu} ) = \sum_{ \alpha\in \Phi} \alpha(t_ \lambda) \cdot \alpha(t_\mu) ,\end{align*} using that both ads are diagonal in this basis, so their product is given by the products of their diagonal entries. One can write this as \(\sum_{ \alpha\in \Phi} \kappa(t_ \alpha, t_ \lambda) \kappa(t_ \alpha, t_\mu)\), so we get a formula \begin{align*} ( \lambda, \mu ) = \sum_{ \alpha\in \Phi} ( \alpha, \lambda) (\alpha, \mu), \qquad (\lambda, \lambda) = \sum_{ \alpha\in \Phi} (\alpha, \lambda)^2 .\end{align*} Setting \(\lambda = \beta\) and dividing by \((\beta, \beta)^2\) yields \begin{align*} {1\over (\beta, \beta)} = \sum_{ \alpha\in \Phi} {(\alpha, \beta)^2 \over (\beta, \beta)^2} \in {1\over 4}{\mathbf{Z}} ,\end{align*} since \((\alpha, \beta)\over (\beta, \beta)\in {1\over 2} {\mathbf{Z}}\). So \((\beta, \beta)\in {\mathbf{Q}}\) and thus \((\alpha, \beta)\in {\mathbf{Q}}\) for all \(\alpha, \beta\in \Phi\). It follows that the pairings \((\lambda, \mu)\) on the \({\mathbf{Q}}{\hbox{-}}\)subspace \({\mathbb{E}}_{\mathbf{Q}}\) of \(H {}^{ \vee }\) spanned by \(\left\{{ \alpha_i}\right\}\) are all rational.
\(({-}, {-})\) on \({\mathbb{E}}_{\mathbf{Q}}\) is still nondegenerate:
If \(\lambda\in {\mathbb{E}}_{\mathbf{Q}}\) satisfies \(( \lambda, \mu) = 0\) for all \(\mu\in {\mathbb{E}}_{\mathbf{Q}}\), then \(( \lambda, \alpha_i) = 0\) for all \(i\), which implies \((\lambda, \nu) = 0\) for all \(\nu\in H {}^{ \vee }\), so \(\lambda= 0\).
Similarly, \((\lambda, \lambda) = \sum_{ \alpha\in \Phi \subseteq {\mathbb{E}}_{\mathbf{Q}}} ( \alpha, \lambda)^2\) is a sum of squares of rational numbers, and is thus non-negative. Since \(( \lambda, \lambda) = 0 \iff \lambda= 0\), the form on \({\mathbb{E}}_{\mathbf{Q}}\) is positive definite. Write \({\mathbb{E}}\mathrel{\vcenter{:}}={\mathbb{E}}_{\mathbf{Q}}\otimes_{\mathbf{Q}}{\mathbf{R}}= {\mathbf{R}}\left\{{\alpha_i}\right\}\), then \(({-}, {-})\) extends in the obvious way to an \({\mathbf{R}}{\hbox{-}}\)valued positive definite bilinear form on \({\mathbb{E}}\), making it a real inner product space.
Let \(L, H, \Phi, {\mathbb{E}}_{/ {{\mathbf{R}}}}\) be as above, then
Thus \(\Phi\) satisfies the axioms of a root system in \({\mathbb{E}}\).
Recall that for \({\mathfrak{sl}}_3({\mathbf{C}})\), \(\kappa(x,y) = 6 \operatorname{Trace}(xy)\). Taking the standard basis \(\left\{{v_i}\right\} \mathrel{\vcenter{:}}=\left\{{x_i, h_i, y_i \mathrel{\vcenter{:}}= x_i^t}\right\}\), the matrix \(\operatorname{Trace}(v_i v_j)\) is of the form \begin{align*} { \begin{bmatrix} {0} & {0} & {I} \\ {0} & {A} & {0} \\ {I} & {0} & {0} \end{bmatrix} }\qquad A \mathrel{\vcenter{:}}={ \begin{bmatrix} {2} & {-1} \\ {-1} & {2} \end{bmatrix} } .\end{align*} This is far from the matrix of an inner product, but the middle block corresponds to the form restricted to \(H\), which is positive definite. One can quickly check this is positive definite by checking positivity of the upper-left \(k\times k\) minors, which here yields \(\operatorname{det}(2) = 2, \operatorname{det}A = 4-1 = 3\).
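One can confirm \(\kappa = 6\operatorname{Trace}\) on all of \({\mathfrak{sl}}_3\) numerically; this sketch expands each bracket in the flattened basis via least squares (exact here, since brackets stay inside \({\mathfrak{sl}}_3\)):

```python
import numpy as np

def E(i, j, n=3):
    m = np.zeros((n, n)); m[i, j] = 1.0; return m

xs = [E(0, 1), E(1, 2), E(0, 2)]
hs = [E(0, 0) - E(1, 1), E(1, 1) - E(2, 2)]
basis = xs + hs + [x.T for x in xs]   # 8 = dim sl3 elements

B = np.column_stack([b.flatten() for b in basis])

def ad(v):
    # Matrix of ad_v: expand each bracket [v, b] in the basis
    cols = [np.linalg.lstsq(B, (v @ b - b @ v).flatten(), rcond=None)[0]
            for b in basis]
    return np.column_stack(cols)

for u in basis:
    for v in basis:
        assert np.isclose(np.trace(ad(u) @ ad(v)), 6 * np.trace(u @ v))
```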
Let \({\mathbb{E}}\) be a fixed real finite-dimensional Euclidean space with inner product \((\alpha, \beta)\); we consider property (c) from the previous theorem: \begin{align*} \beta - {2( \beta, \alpha) \over (\alpha, \alpha)}\alpha \in \Phi \qquad\forall \alpha, \beta\in \Phi .\end{align*}
A reflection in \({\mathbb{E}}\) is an invertible linear map on an \(n{\hbox{-}}\)dimensional Euclidean space that fixes pointwise a hyperplane \(P\) (of dimension \(n-1\)) and sends any vector \(v\perp P\) to \(-v\):
If \(\sigma\) is a reflection sending \(\alpha\mapsto - \alpha\), then \begin{align*} \sigma_\alpha(\beta) = \beta - {2( \beta, \alpha) \over (\alpha, \alpha)} \alpha \qquad \forall \beta\in {\mathbb{E}} .\end{align*} One can check that \(\sigma_\alpha^2 = \operatorname{id}\). Some notes on notation:
Recall the formula \begin{align*} s_\alpha( \lambda) = \lambda- (\lambda, \alpha {}^{ \vee })\alpha, \qquad \alpha {}^{ \vee }\mathrel{\vcenter{:}}={2\alpha\over (\alpha, \alpha)}, \alpha\neq 0 ,\end{align*} which is a reflection through the hyperplane \(P_\alpha\mathrel{\vcenter{:}}=\alpha\perp\):
Let \(\Phi \subseteq {\mathbb{E}}\) be a set that spans \({\mathbb{E}}\), and suppose all of the reflections \(s_\alpha\) for \(\alpha \in \Phi\) leave \(\Phi\) invariant. If \(\sigma\in \operatorname{GL}({\mathbb{E}})\) leaves \(\Phi\) invariant, fixes a hyperplane \(P\) pointwise, and sends some \(\alpha\in \Phi\setminus\left\{{0}\right\}\) to \(-\alpha\), then \(\sigma = s_\alpha\) and \(P = P_\alpha\).
Let \(\tau = \sigma s_ \alpha =\sigma s_{ \alpha}^{-1}\in \operatorname{GL}({\mathbb{E}})\), noting that every \(s_\alpha\) is order 2. Then \(\tau( \Phi) = \Phi\) and \(\tau( \alpha) = \alpha\), so \(\tau\) acts as the identity on the subspace \({\mathbf{R}}\alpha\) and the quotient space \({\mathbb{E}}/{\mathbf{R}}\alpha\) since there are two decompositions \({\mathbb{E}}= P_ \alpha \oplus {\mathbf{R}}\alpha = P \oplus {\mathbf{R}}\alpha\) using \(s_\alpha\) and \(\sigma\) respectively. So \(\tau - \operatorname{id}\) acts as zero on \({\mathbb{E}}/{\mathbf{R}}\alpha\), and so maps \({\mathbb{E}}\) into \({\mathbf{R}}\alpha\) and \({\mathbf{R}}\alpha\) to zero, so \((\tau - \operatorname{id})^2 = 0\) on \({\mathbb{E}}\) and its minimal polynomial \(m_\tau(t)\) divides \(f(t) \mathrel{\vcenter{:}}=(t-1)^2\).
Note that \(\Phi\) is finite, so the vectors \(\beta, \tau \beta, \tau^2 \beta, \tau^3 \beta,\cdots\) can not all be distinct. Since \(\tau\) is invertible we can assume \(\tau^k \beta = \beta\) for some particular \(k\). Taking the least common multiple of all such \(k\) yields a uniform \(k\) that works for all \(\beta\) simultaneously, so \(\tau^k \beta = \beta\) for all \(\beta \in \Phi\). Since \({\mathbf{R}}\Phi = {\mathbb{E}}, \tau^k\) acts as \(\operatorname{id}\) on all of \({\mathbb{E}}\), so \(\tau^k - 1 = 0\) and so \(m_\tau(t) \divides t^k - 1\) for some \(k\). Therefore \(m_\tau(t) \divides \gcd( (t-1)^2, t^k-1 ) = t-1\), forcing \(\tau = \operatorname{id}\) and \(\sigma = s_ \alpha\) and \(P = P_\alpha\).
A subset \(\Phi \subseteq {\mathbb{E}}\) of a real Euclidean space is a root system iff
Notably, \(\beta - s_\alpha(\beta) = (\beta, \alpha {}^{ \vee })\alpha\) is an integer multiple of \(\alpha\):
The Weyl group \(W\) associated to a root system \(\Phi\) is the subgroup \(\left\langle{s_\alpha, \alpha\in \Phi}\right\rangle \leq \operatorname{GL}({\mathbb{E}})\).
Note that \({\sharp}W < \infty\): \(W\) permutes \(\Phi\) by (R3), so there is an injective group morphism \(W \hookrightarrow\mathrm{Perm}(\Phi)\), which is a finite group – this is injective because if \(w\curvearrowright\Phi\) as \(\operatorname{id}\), since \({\mathbf{R}}\Phi = {\mathbb{E}}\), by linearity \(w\curvearrowright{\mathbb{E}}\) by \(\operatorname{id}\) and \(w=\operatorname{id}\). Recalling that \(s_ \alpha( \lambda) = \lambda- (\lambda, \alpha {}^{ \vee }) \alpha\), we have \((s_ \alpha(\lambda), s_ \alpha(\mu)) = ( \lambda, \mu)\) for all \(\lambda, \mu \in {\mathbb{E}}\). So in fact \(W \leq {\operatorname{O}}({\mathbb{E}}) \leq \operatorname{GL}({\mathbb{E}})\), which have determinant \(\pm 1\) – in particular, \(\operatorname{det}s_\alpha = -1\) since it can be written as a block matrix \(\operatorname{diag}(1, 1, \cdots, 1, -1)\) by choosing a basis for \(P_\alpha\) and extending it by \(\alpha\).
Note that one can classify finite subgroups of \({\operatorname{SO}}_n\).
Let \(\Phi = \left\{{ {\varepsilon}_i - {\varepsilon}_j {~\mathrel{\Big\vert}~}1\leq i,j \leq n+1, i\neq j}\right\}\) be a root system of type \(A_n\) where \(\left\{{{\varepsilon}_i}\right\}\) form the standard basis of \({\mathbf{R}}^{n+1}\) with the standard inner product, so \(({\varepsilon}_i, {\varepsilon}_j) = \delta_{ij}\). One can compute \begin{align*} s_{{\varepsilon}_i - {\varepsilon}_j}({\varepsilon}_k) = {\varepsilon}_k - {2 ({\varepsilon}_k, {\varepsilon}_i - {\varepsilon}_j) \over ({\varepsilon}_i - {\varepsilon}_j, {\varepsilon}_i - {\varepsilon}_j)}({\varepsilon}_i - {\varepsilon}_j) = {\varepsilon}_k - ({\varepsilon}_k, {\varepsilon}_i - {\varepsilon}_j)({\varepsilon}_i - {\varepsilon}_j) = \begin{cases} {\varepsilon}_j & k=i \\ {\varepsilon}_i & k=j \\ {\varepsilon}_k & \text{otherwise}. \end{cases} = {\varepsilon}_{(ij).k} \end{align*} where \((ij) \in S_{n+1}\) is a transposition, acting as a function on the index \(k\). Thus there is a well-defined group morphism \begin{align*} W &\to S_{n+1} \\ s_{{\varepsilon}_i - {\varepsilon}_j} &\mapsto (ij) .\end{align*} This is injective since \(w\) acting by the identity on every \({\varepsilon}_k\) implies acting by the identity on all of \({\mathbb{E}}\) by linearity, and surjective since transpositions generate \(S_{n+1}\). So \(W\cong S_{n+1}\), and \(A_n\) corresponds to \({\mathfrak{sl}}_{n+1}({\mathbf{C}})\) using that \begin{align*} [h, e_{ij}] = (h_i - h_j) e_{ij} = ({\varepsilon}_i - {\varepsilon}_j)(h) e_{ij} .\end{align*} In \(G = {\operatorname{SL}}_n({\mathbf{C}})\) one can define \(N_G(T)/C_G(T)\) for \(T\) a maximal torus.
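A quick computational check of \(W \cong S_3\) for \(A_2\) (a sketch; the closure loop is naive but fine for six elements):

```python
import numpy as np

eps = np.eye(3)
roots = [eps[i] - eps[j] for i in range(3) for j in range(3) if i != j]

def s(alpha, v):  # the reflection s_alpha
    return v - 2 * np.dot(v, alpha) / np.dot(alpha, alpha) * alpha

def mat(alpha):   # matrix of s_alpha in the standard basis
    return np.column_stack([s(alpha, e) for e in eps])

gens, group, frontier = [mat(r) for r in roots], [], [np.eye(3)]
while frontier:
    g = frontier.pop()
    if not any(np.allclose(g, h) for h in group):
        group.append(g)
        frontier += [m @ g for m in gens]
print(len(group))  # 6 = |S_3|
```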
What are the Weyl groups of other classical types?
Let \(\Phi \subseteq {\mathbb{E}}\) be a root system. If \(\sigma\in \operatorname{GL}({\mathbb{E}})\) leaves \(\Phi\) invariant, then for all \(\alpha\in \Phi\), \begin{align*} \sigma s_{ \alpha} \sigma^{-1} = s_{ \sigma( \alpha)}, \qquad (\beta, \alpha {}^{ \vee }) = (\sigma( \beta), \sigma(\alpha) {}^{ \vee }) \,\,\forall \alpha, \beta\in \Phi .\end{align*} Thus conjugating a reflection yields another reflection.
Note that \(\sigma s_ \alpha \sigma^{-1}\) sends \(\sigma( \alpha)\) to its negative and fixes pointwise the hyperplane \(\sigma(P_\alpha)\). If \(\beta \in \Phi\) then \(s_{ \alpha}( \beta) \in \Phi\), so \(\sigma s_ \alpha ( \beta) \in \Phi\) and \begin{align*} (\sigma s_ \alpha \sigma^{-1}) ( \sigma( \beta)) = \sigma s_ \alpha(\beta) \in \sigma\Phi ,\end{align*} so \(\sigma s_ \alpha \sigma^{-1}\) leaves invariant the set \(\left\{{ \sigma( \beta) {~\mathrel{\Big\vert}~}\beta\in \Phi}\right\} = \Phi\). By the previous lemma, it must equal \(s_{ \sigma( \alpha)}\), and so \begin{align*} ( \sigma( \beta), \sigma( \alpha) {}^{ \vee }) = (\beta, \alpha {}^{ \vee }) \end{align*} by applying both sides to \(\sigma(\beta)\).
This does not imply that \((\sigma( \beta), \sigma( \alpha) ) = (\beta, \alpha)\)! With the duals/checks, this bracket involves a ratio, which is preserved, but the individual round brackets are not.
Let \(\Phi \subseteq {\mathbb{E}}\) be a root system with Weyl group \(W\). If \(\sigma\in \operatorname{GL}({\mathbb{E}})\) leaves \(\Phi\) invariant then \begin{align*} \sigma s_{\alpha} \sigma^{-1}= s_{ \sigma( \alpha)} \qquad\forall \alpha\in \Phi \end{align*} and \begin{align*} ( \beta, \alpha {}^{ \vee }) = ( \sigma(\beta), \sigma(\alpha) {}^{ \vee }) \qquad \forall \alpha, \beta \in \Phi .\end{align*}
\begin{align*} ( \sigma( \beta), \sigma( \alpha) ) \neq (\beta, \alpha) ,\end{align*} i.e. the \(({-}) {}^{ \vee }\) is important here since it involves a ratio. Without the ratio, one can easily scale to make these not equal.
Two root systems \(\Phi \subseteq {\mathbb{E}}, \Phi' \subseteq {\mathbb{E}}'\) are isomorphic iff there exists an isomorphism \(\phi: {\mathbb{E}}\to {\mathbb{E}}'\) of vector spaces with \(\phi(\Phi) = \Phi'\) such that \begin{align*} (\varphi( \beta), \varphi(\alpha) {}^{ \vee }) = (\beta, \alpha {}^{ \vee }) \mathrel{\vcenter{:}}={2 (\beta, \alpha) \over (\alpha, \alpha)} \qquad\forall \alpha, \beta \in \Phi .\end{align*}
One can scale a root system to get an isomorphism:
Note that if \(\phi: \Phi { \, \xrightarrow{\sim}\, }\Phi'\) is an isomorphism, then \begin{align*} \varphi(s_{ \alpha}( \beta)) = s_{ \varphi( \alpha)}( \varphi(\beta)) \qquad \forall \alpha, \beta\in \Phi \implies \varphi \circ s_{ \alpha} \circ \varphi^{-1}= s_{ \varphi( \alpha)} .\end{align*} So \(\phi\) induces an isomorphism of Weyl groups \begin{align*} W & { \, \xrightarrow{\sim}\, }W' \\ s_{\alpha} &\mapsto s_{ \varphi( \alpha)} .\end{align*}
By the lemma, an automorphism of \(\Phi\) is the same as an automorphism of \({\mathbb{E}}\) leaving \(\Phi\) invariant. In particular, \(W\hookrightarrow\mathop{\mathrm{Aut}}( \Phi)\).
If \(\Phi \subseteq {\mathbb{E}}\) is a root system then the dual root system is \begin{align*} \Phi {}^{ \vee }\mathrel{\vcenter{:}}=\left\{{ \alpha {}^{ \vee }{~\mathrel{\Big\vert}~}\alpha\in \Phi}\right\}, \qquad \alpha {}^{ \vee }\mathrel{\vcenter{:}}={2\alpha\over (\alpha, \alpha)} .\end{align*}
Show that \(\Phi {}^{ \vee }\) is again a root system in \({\mathbb{E}}\).
One can show \(W( \Phi) = W( \Phi {}^{ \vee })\) and \({\left\langle {\lambda},~{ \alpha {}^{ \vee }} \right\rangle} \alpha {}^{ \vee }= {\left\langle { \lambda},~{ \alpha} \right\rangle} \alpha\) for all \(\alpha\in \Phi, \lambda\in {\mathbb{E}}\), so \(s_{\alpha {}^{ \vee }} = s_{\alpha}\) as linear maps on \({\mathbb{E}}\).
Let \(\Phi \subseteq {\mathbb{E}}\) be a root system, then \(\ell \mathrel{\vcenter{:}}=\dim_{\mathbf{R}}{\mathbb{E}}\) is the rank of \(\Phi\).
Rank 1 root systems are given by a choice of \(\alpha\in {\mathbf{R}}\setminus\left\{{0}\right\}\), yielding \(\Phi = \left\{{\pm\alpha}\right\}\):
Recall \({2( \beta, \alpha) \over (\alpha, \alpha)} \in {\mathbf{Z}}\), and from linear algebra, \({\left\langle {v},~{w} \right\rangle} = {\left\lVert {v} \right\rVert} \cdot {\left\lVert {w} \right\rVert} \cos( \theta)\) and \({\left\lVert {\alpha} \right\rVert}^2 = ( \alpha, \alpha)\). We can thus write \begin{align*} {\left\langle { \beta},~{ \alpha} \right\rangle} = {2( \beta, \alpha) \over (\alpha, \alpha)} = 2{{\left\lVert {\beta} \right\rVert}\over {\left\lVert {\alpha} \right\rVert}} \cos (\theta), \qquad {\left\langle {\alpha },~{\beta} \right\rangle}= 2{{\left\lVert {\alpha} \right\rVert}\over {\left\lVert {\beta} \right\rVert}} \cos( \theta) ,\end{align*} and so \begin{align*} L_{ \alpha, \beta} \mathrel{\vcenter{:}}={\left\langle {\alpha },~{\beta} \right\rangle}{\left\langle {\beta },~{\alpha } \right\rangle}= 4\cos^2( \theta) ,\end{align*} noting that \({\left\langle {\alpha },~{\beta} \right\rangle}\) and \({\left\langle {\beta },~{\alpha} \right\rangle}\) are integers of the same sign. If positive, \(\theta\) is acute; if negative, obtuse. This massively restricts what the angles can be, since \(0 \leq \cos^2( \theta) \leq 1\).
First, an easy case: suppose \(L_{ \alpha, \beta} = 4\), so \(\cos^2( \theta) = 1\implies \cos( \theta) = \pm 1\implies \theta= 0, \pi\), forcing \(\beta\in {\mathbf{R}}\alpha\) and thus \(\beta = \pm\alpha\).
So assume \(\beta\neq \pm \alpha\), and without loss of generality \({\left\lVert {\beta} \right\rVert}\geq {\left\lVert {\alpha} \right\rVert}\), or equivalently \({\left\langle {\alpha },~{\beta } \right\rangle}\leq {\left\langle {\beta },~{\alpha} \right\rangle}\). Note that if \({\left\langle {\alpha },~{\beta} \right\rangle}\neq 0\) then \begin{align*} { {\left\langle {\beta },~{\alpha} \right\rangle}\over {\left\langle {\alpha },~{\beta} \right\rangle}} = {{\left\lVert { \beta} \right\rVert}^2 \over {\left\lVert { \alpha} \right\rVert}^2} .\end{align*}
The other possibilities are as follows:
\({\left\langle {\alpha},~{\beta} \right\rangle}\) | \({\left\langle {\beta},~{\alpha} \right\rangle}\) | \(\theta\) | \({\left\lVert {\beta} \right\rVert}^2/{\left\lVert {\alpha} \right\rVert}^2\) |
---|---|---|---|
0 | 0 | \(\pi/2\) | Undetermined |
1 | 1 | \(\pi/3\) | 1 |
-1 | -1 | \(2\pi/3\) | 1 |
1 | 2 | \(\pi/4\) | 2 |
-1 | -2 | \(3\pi/4\) | 2 |
1 | 3 | \(\pi/6\) | 3 |
-1 | -3 | \(5\pi/6\) | 3 |
Cases for the norm ratios:
These are the only three irreducible rank 2 root systems.
Let \(\alpha, \beta\in\Phi\) span distinct lines in \({\mathbb{E}}\). Then

1. If \((\alpha, \beta) > 0\), then \(\alpha - \beta \in \Phi\).
2. If \((\alpha, \beta) < 0\), then \(\alpha + \beta \in \Phi\).
Note that (2) follows from (1) by replacing \(\beta\) with \(-\beta\). Assume \((\alpha, \beta) > 0\), then by the chart \({\left\langle {\alpha },~{\beta } \right\rangle}=1\) or \({\left\langle {\beta },~{\alpha } \right\rangle}= 1\). In the former case, \begin{align*} \Phi\ni s_{ \beta}( \alpha) = \alpha - {\left\langle {\alpha },~{\beta } \right\rangle}\beta = \alpha- \beta .\end{align*} In the latter, \begin{align*} s_{ \alpha}(\beta) = \beta- \alpha \in \Phi\implies - (\beta- \alpha) = \alpha- \beta\in \Phi .\end{align*}
Suppose \(\operatorname{rank}( \Phi) = 2\). Letting \(\alpha\in \Phi\) be a root of shortest length, since \({\mathbf{R}}\Phi = {\mathbb{E}}\) there is some \(\beta \in \Phi\) not equal to \(\pm \alpha\). Without loss of generality assume \(\angle_{\alpha, \beta}\) is obtuse by replacing \(\beta\) with \(-\beta\) if necessary:
Also choose \(\beta\) such that \(\angle_{ \alpha, \beta}\) is maximal.
Case 0: If \(\theta = \pi/2\), one gets \({\mathbf{A}}_1\times {\mathbf{A}}_1\):
We’ll continue this next time.
If \(\beta\neq \pm \alpha\),
Rank 2 root systems: let \(\alpha\) be a root of shortest length, and \(\beta\) a root with angle \(\theta\) between \(\alpha,\beta\) with \(\theta \geq \pi/2\) as large as possible.
If \(\theta = \pi/2\): \(A_1 \times A_1\).
If \(\theta = 2\pi/3\): \(A_2\)
One can check \({\left\langle {\alpha },~{\beta} \right\rangle}= 2(-1/2) = -1\) and \({\left\langle {\alpha + \beta},~{ \beta} \right\rangle} = {\left\langle {\alpha },~{\beta } \right\rangle}+ {\left\langle {\beta },~{\beta } \right\rangle}= -1 + 2 = 1\).
One can check, using linearity of \({\left\langle {{-}},~{{-}} \right\rangle}\) in the first variable, that
Note that in each case one can see the root strings, defined as \begin{align*} R_\beta \mathrel{\vcenter{:}}=\left\{{\beta+ k \alpha {~\mathrel{\Big\vert}~}k\in {\mathbf{Z}}}\right\} \cap\Phi .\end{align*} Let \(r,q\in {\mathbf{Z}}_{\geq 0}\) be maximal such that \(\beta-r \alpha, \beta + q \alpha\in \Phi\). The claim is that every such root string is unbroken.
Suppose not; then there is some \(k\) with \(-r < k < q\) with \(\beta + k \alpha \not\in \Phi\). One can then find a maximal \(p\) and minimal \(s\) with \(p < s\), \(\beta+p \alpha \in \Phi\) but \(\beta + (p+1) \alpha \not\in \Phi\), and similarly \(\beta + (s-1)\alpha\not\in\Phi\) but \(\beta + s \alpha\in \Phi\). By a previous lemma, \((\beta+ p \alpha, \alpha) \geq 0\) (otherwise \(\beta + (p+1)\alpha\) would be a root) and similarly \((\beta+ s \alpha, \alpha) \leq 0\). Combining these, \begin{align*} p( \alpha, \alpha) \geq s (\alpha, \alpha) \implies p \geq s \text{ since } (\alpha, \alpha) > 0 \qquad\contradiction .\end{align*}
The picture:
So \(s_\alpha\) reverses the root string, since it sends the line containing the root string to itself but reflects through \(P_\alpha\). One can compute \begin{align*} \beta - r \alpha &= s_ \alpha(\beta + q \alpha) \\ &= (\beta+ q \alpha) - {\left\langle {\beta+ q \alpha},~{ \alpha} \right\rangle} \alpha \\ &= (\beta+ q \alpha) - {\left\langle { \beta},~{\alpha} \right\rangle}\alpha - 2q \alpha \\ &= \beta - \qty{{\left\langle {\beta },~{\alpha } \right\rangle}+ q} \alpha ,\end{align*} so \(r = {\left\langle {\beta },~{\alpha} \right\rangle} + q\), i.e. \(r-q = {\left\langle {\beta },~{\alpha } \right\rangle}\), which equals \(\beta(h_ \alpha)\) for a semisimple Lie algebra.
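These root string facts can be checked mechanically on a small example; the following sketch verifies \(r - q = {\left\langle {\beta},~{\alpha} \right\rangle}\) over all pairs \(\beta \neq \pm\alpha\) in the \(A_2\) model:

```python
import numpy as np

eps = np.eye(3)
roots = [eps[i] - eps[j] for i in range(3) for j in range(3) if i != j]
in_phi = lambda v: any(np.allclose(v, r) for r in roots)
pair = lambda b, a: 2 * np.dot(b, a) / np.dot(a, a)  # <b, a>

for a in roots:
    for b in roots:
        if not (np.allclose(b, a) or np.allclose(b, -a)):
            r = q = 0
            while in_phi(b - (r + 1) * a): r += 1
            while in_phi(b + (q + 1) * a): q += 1
            assert np.isclose(r - q, pair(b, a))
```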
Recall that \({\left\lvert {{\left\langle {\beta },~{\alpha} \right\rangle}} \right\rvert} \leq 3\). Choose \(\beta\) in its \(\alpha{\hbox{-}}\)root string such that \(\beta-\alpha\) is not a root, i.e. \(\beta\) is at the left end of the string and \(r=0\):
Then \(q = -{\left\langle {\beta },~{\alpha} \right\rangle}\), so the root string contains at most 4 roots (for \(\Phi\) of any rank).
A subset \(\Delta \subseteq \Phi\) is a base (or more modernly a set of simple roots) if
Today: finding bases for root systems. It’s not obvious they always exist, but e.g. in the previous rank 2 examples, \(\alpha,\beta\) formed a base.
Given a base \(\Delta \subseteq \Phi\), the height of a root \(\beta = \sum_{\alpha\in \Delta} k_ \alpha \alpha\) is \begin{align*} \operatorname{ht}(\beta) \mathrel{\vcenter{:}}=\sum k_ \alpha .\end{align*} If all \(k_\alpha \geq 0\), we say \(\beta\) is positive and write \(\beta\in \Phi^+\). Similarly, \(\beta\) is negative iff \(k_\alpha \leq 0\) for all \(\alpha\), and we write \(\beta\in \Phi^-\). This decomposes a root system into \(\Phi = \Phi^+ {\textstyle\coprod}\Phi^-\), and moreover \(-\Phi^+ = \Phi^-\).
A choice of \(\Delta\) determines a partial order on \(\Phi\) which extends to \({\mathbb{E}}\), where \(\lambda\geq \mu \iff \lambda- \mu\) is a non-negative integer linear combination of elements of \(\Delta\).
If \(\Delta \subseteq \Phi\) is a base and \(\alpha, \beta\in \Delta\), then \begin{align*} \alpha\neq \beta\implies ( \alpha, \beta) \leq 0 \text{ and } \alpha- \beta\not\in \Phi .\end{align*}
We have \(\alpha\neq \pm \beta\) since \(\Delta\) is a linearly independent set, and \(\alpha - \beta\not\in\Phi\) since a root that is a difference of two distinct simple roots would have coefficients of mixed signs. If \(( \alpha, \beta) > 0\), then \(\alpha- \beta \in \Phi\) by a previous lemma. \(\contradiction\)
An element \(\gamma\in {\mathbb{E}}\) is regular iff \(\gamma \in {\mathbb{E}}\setminus\bigcup_{\alpha\in \Phi} P_ \alpha\) where \(P_ \alpha= \alpha^\perp\), otherwise \(\gamma\) is singular.
Regular vectors exist.
The basic fact used is that over an infinite field, no vector space is the union of a finite number of proper subspaces. Note that this is a union, not a sum!
Given a regular vector \(\gamma\in {\mathbb{E}}\), define \begin{align*} \Phi^+(\gamma) = \left\{{ \alpha\in \Phi {~\mathrel{\Big\vert}~}(\alpha, \gamma) > 0 }\right\} ,\end{align*} the roots on the positive side of the hyperplane \(\alpha^\perp\):
This decomposes \(\Phi = \Phi^+(\gamma) {\textstyle\coprod}\Phi^-(\gamma)\) where \(\Phi^-(\gamma) \mathrel{\vcenter{:}}=- \Phi^+(\gamma)\). Note that \(\gamma\) lies on the positive side of \(\alpha^\perp\) for every \(\alpha\in \Phi^+(\gamma)\).
A root \(\beta\in \Phi^+(\gamma)\) is decomposable iff \(\beta = \beta_1 + \beta_2\) for some \(\beta_i \in \Phi^+( \gamma)\). Otherwise \(\beta\) is indecomposable.
There exists a base for \(\Phi\).
Let \(\gamma\in {\mathbb{E}}\) be regular. Then the set \(\Delta(\gamma)\) of all indecomposable roots in \(\Phi^+( \gamma)\) is a base for \(\Phi\). Moreover, any base for \(\Phi\) arises in this way.
The proof: if not every element of \(\Phi^+(\gamma)\) is a nonnegative integer combination of \(\Delta(\gamma)\), pick \(\beta\in \Phi^+(\gamma)\) which cannot be written this way, chosen such that \((\beta, \gamma)\) is minimal (possible by finiteness). Since \(\beta\not\in \Delta( \gamma)\), it is decomposable as \(\beta = \beta_1 + \beta_2\) with \(\beta_i \in \Phi^+(\gamma)\). Now \((\beta, \gamma) = \sum (\beta_i, \gamma)\) is a sum of positive numbers, so \((\beta_i, \gamma) < (\beta, \gamma)\) for \(i=1,2\). By minimality, \(\beta_i\in {\mathbf{Z}}_{\geq 0 } \Delta(\gamma)\), but then by adding them we get \(\beta\in {\mathbf{Z}}_{\geq 0 } \Delta(\gamma)\). \(\contradiction\)
Claim: if \(\alpha, \beta\in \Delta( \gamma)\) with \(\alpha\neq \beta\) then \((\alpha, \beta) \leq 0\). Note \(\alpha\neq - \beta\) since \((\alpha, \gamma), (\beta, \gamma) > 0\). By lemma 9.4, if \(( \alpha, \beta) > 0\) then \(\alpha - \beta\in \Phi\) is a root. Then one of \(\alpha- \beta, \beta- \alpha\in \Phi^+( \gamma)\). In the first case, \(\beta + (\alpha- \beta ) = \alpha\), decomposing \(\alpha\). In the second, \(\alpha + (\beta- \alpha) = \beta\), again a contradiction.
Claim: \(\Delta\mathrel{\vcenter{:}}=\Delta( \gamma)\) is linearly independent. Suppose \(\sum_{ \alpha\in \Delta} r_ \alpha \alpha = 0\) for some \(r_ \alpha \in {\mathbf{R}}\). Separate the terms with positive coefficients (\(\alpha\in \Delta'\)) from the remaining ones (\(\alpha\in \Delta''\)) to write \({\varepsilon}\mathrel{\vcenter{:}}=\sum_{ \alpha\in \Delta'}r_ \alpha \alpha = \sum_{ \beta\in \Delta''} t_\beta \beta\) where now \(r_ \alpha, t_\beta > 0\). Use the two expressions for \({\varepsilon}\) to write \begin{align*} ({\varepsilon}, {\varepsilon}) = \sum _{ \alpha\in \Delta', \beta\in \Delta''} r_ \alpha t_ \beta (\alpha, \beta) \leq 0 ,\end{align*} since \(r_ \alpha t_ \beta >0\) and \((\alpha, \beta) \leq 0\). So \({\varepsilon}= 0\), since \(({-}, {-})\) is an inner product. Now write \(0 = (\gamma, {\varepsilon}) = \sum_{ \alpha\in \Delta'} r_\alpha (\alpha, \gamma)\) where each \(r_ \alpha > 0\) and \((\alpha, \gamma) > 0\), so it must be the case that \(\Delta' = \emptyset\). Similarly \(\Delta'' = \emptyset\), so \(r_\alpha = 0\) for all \(\alpha\in \Delta\).
Claim: \(\Delta( \gamma)\) is a base for \(\Phi\). Since \(\Phi = \Phi^+(\gamma){\textstyle\coprod}\Phi^-( \gamma)\), we have (B2) by step 1. This also implies \(\Delta( \gamma)\) is a basis for \({\mathbb{E}}\): we have linear independence by step 3, and \({\mathbf{Z}}\Delta ( \gamma) \supseteq\Phi\) with \({\mathbf{R}}\Phi = {\mathbb{E}}\), so \(\Delta(\gamma)\) spans.
Claim: every base of \(\Phi\) is \(\Delta( \gamma)\) for some regular \(\gamma\). Given \(\Delta\), choose \(\gamma\in {\mathbb{E}}\) such that \((\gamma, \alpha) > 0\) for all \(\alpha\in \Delta\). Then \(\gamma\) is regular by B2. Moreover \(\Phi^+ \subseteq \Phi^+( \gamma)\) and similarly \(\Phi^- \subseteq \Phi^-( \gamma)\), and taking disjoint unions yields \(\Phi\) for both the inner and outer sets, forcing them to be equal, i.e. \(\Phi^{\pm} = \Phi^{\pm}( \gamma)\). One can check that \(\Delta \subseteq \Delta( \gamma)\) using \(\Phi^+ = \Phi^+( \gamma)\) and linear independence of \(\Delta\) – but both sets are bases for \({\mathbb{E}}\) and thus have the same cardinality \(\ell = \dim {\mathbb{E}}\), making them equal.
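The proof is constructive, and small examples are easy to compute. Here is a sketch (not from the lecture) carrying out the construction for \(B_2\) realized in \({\mathbf{R}}^2\): pick a regular \(\gamma\), form \(\Phi^+(\gamma)\), and extract the indecomposable elements.

```python
# A sketch of Delta(gamma) for B2 in R^2: the indecomposable elements of
# Phi^+(gamma) form a base (here: one short and one long simple root).
from itertools import product

Phi = [(1,0), (-1,0), (0,1), (0,-1), (1,1), (1,-1), (-1,1), (-1,-1)]
gamma = (2, 1)  # regular: not orthogonal to any root

dot = lambda u, v: u[0]*v[0] + u[1]*v[1]
add = lambda u, v: (u[0]+v[0], u[1]+v[1])

Phi_plus = [a for a in Phi if dot(a, gamma) > 0]
decomposable = {add(b1, b2) for b1, b2 in product(Phi_plus, repeat=2)}
Delta = [b for b in Phi_plus if b not in decomposable]
print(Phi_plus)  # [(1, 0), (0, 1), (1, 1), (1, -1)]
print(Delta)     # [(0, 1), (1, -1)]
```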
From a previous discussion: given a rank \(n\) root system \(\Phi\) with \(n\geq 2\), is \({\mathbf{R}}\left\langle{ \alpha, \beta}\right\rangle \cap\Phi\) always a rank 2 root system? The answer is yes! This follows readily from just checking the axioms directly.
For a regular \(\gamma \in {\mathbb{E}}\setminus\bigcup_{\alpha\in \Phi} P_ \alpha\), define \(\Phi^+(\gamma) \mathrel{\vcenter{:}}=\left\{{ \beta\in \Phi {~\mathrel{\Big\vert}~}(\beta, \gamma) > 0}\right\}\) and let \(\Delta( \gamma)\) be the indecomposable elements of \(\Phi^+( \gamma)\):
\(\Delta( \gamma)\) is a base for \(\Phi\), and every base is of this form.
The connected components of \({\mathbb{E}}\setminus\bigcup_{ \alpha\in \Phi} P_ \alpha\) are called the (open) Weyl chambers of \({\mathbb{E}}\). Each regular \(\gamma\) belongs to some Weyl chamber, which we’ll denote \(C(\gamma)\).
Note that \(C(\gamma) = C(\gamma') \iff \gamma, \gamma'\) are on the same side of every root hyperplane \(P_\alpha\) for \(\alpha\in \Phi\), which happens iff \(\Phi^+( \gamma) = \Phi^+(\gamma') \iff \Delta( \gamma) = \Delta(\gamma')\), so there is a bijection \begin{align*} \left\{{\substack{ \text{Weyl chambers} }}\right\} &\rightleftharpoons \left\{{\substack{ \text{Bases for $\Phi$} }}\right\} .\end{align*} Note also that \(W\) sends one Weyl chamber to another: any \(s_ \alpha\) permutes the connected components of \({\mathbb{E}}\setminus\bigcup_{ \beta\in \Phi} P_{\beta}\), so if \(\gamma\) is regular and \(\sigma\in W\), then \(\gamma' \mathrel{\vcenter{:}}=\sigma( \gamma)\) is again regular and \(\sigma(C(\gamma)) = C(\gamma')\). \(W\) also acts on bases for \(\Phi\): if \(\Delta\) is a base for \(\Phi\), then \(\sigma( \Delta)\) is still a basis for \({\mathbb{E}}\) since \(\sigma\) is an invertible linear transformation. Since \(\sigma( \Phi) = \Phi\) by the axioms, any root \(\alpha'\in \Phi\) is of the form \(\sigma(\beta)\) for some \(\beta\in \Phi\), but writing \(\beta = \sum _{\alpha\in\Delta} k_ \alpha \alpha\) with all \(k_\alpha\) the same sign, \(\sigma( \beta) = \sum_{ \alpha\in \Delta} k_ \alpha \sigma( \alpha)\) is a linear combination of elements in \(\sigma(\Delta)\) with coefficients of the same sign.
The actions of \(W\) on chambers and bases are compatible: if \(\Delta = \Delta( \gamma)\) then \(\sigma( \Delta) = \Delta( \sigma( \gamma))\), since \(\sigma(\Phi^+( \gamma)) = \Phi^+( \sigma( \gamma))\) since \(W \leq {\operatorname{O}}({\mathbb{E}})\) and thus \(( \sigma \gamma, \sigma \alpha) = (\gamma, \alpha)\).
Fix a base \(\Delta \subset \Phi\), which decomposes \(\Phi = \Phi^+ {\textstyle\coprod}\Phi^-\). If \(\beta\in \Phi^+ \setminus\Delta\), then \(\beta- \alpha\in \Phi^+\) for some \(\alpha\in\Delta\).
If \(( \beta, \alpha)\leq 0\) for all \(\alpha\in \Delta\), then the proof of theorem 10.1 would show \(\beta = 0\) by taking \({\varepsilon}\mathrel{\vcenter{:}}=\beta\) in step 3. \(\contradiction\) So there is some \(\alpha\in \Delta\) with \(( \beta, \alpha) > 0\), and clearly \(\beta \neq \pm \alpha\) since \(\beta\in\Phi^+\setminus\Delta\). By lemma 9.4, \(\beta- \alpha\in \Phi\). Why is this positive? Since \(\beta\not\in\Delta\), some simple root other than \(\alpha\) has a strictly positive coefficient in \(\beta\), and this coefficient is unchanged in \(\beta- \alpha\). Since every root has coefficients of a single sign, all coefficients of \(\beta-\alpha\) are non-negative and \(\beta-\alpha\in \Phi^+\).
Each \(\beta\in \Phi^+\) can be written as \(\alpha_1 + \cdots + \alpha_k\) where the \(\alpha_i\in \Delta\) are not necessarily distinct, such that each truncated sum \(\alpha_1 + \cdots + \alpha_i\) for \(1\leq i \leq k\) is a positive root. One proves this by induction on the height of \(\beta\).
Let \(\alpha \in \Delta\), then \(s_ \alpha\) permutes \(\Phi^+ \setminus\left\{{ \alpha }\right\}\).
Let \(\beta \in \Phi^+\setminus\left\{{ \alpha }\right\}\) and write \(\beta = \sum _{\gamma\in \Delta} k_ \gamma \gamma\) with \(k_ \gamma \in {\mathbf{Z}}_{\geq 0}\); since \(\beta\neq \alpha\), we have \(k_ \gamma > 0\) for some \(\gamma \neq \alpha\). By the formula \(s_ \alpha( \beta) = \beta- {\left\langle {\beta },~{\alpha } \right\rangle}\alpha\), the image \(s_\alpha(\beta)\) still has coefficient \(k_ \gamma\) for \(\gamma\). Thus \(s_ \alpha( \beta) \in \Phi^+\), and \(s_ \alpha( \beta)\neq \alpha\) since the only root sent to \(\alpha\) is \(-\alpha\) (as \(s_ \alpha(- \alpha) = \alpha\) and \(s_\alpha\) is bijective). So \(s_\alpha(\beta)\in \Phi^+\setminus\left\{{ \alpha }\right\}\), and \(s_\alpha\) permutes this set since it is invertible.
Let \begin{align*} \rho \mathrel{\vcenter{:}}={1\over 2}\sum_{ \beta\in \Phi^+} \beta \quad\in {\mathbb{E}} .\end{align*} Then \(s_\alpha( \rho) = \rho- \alpha\) for all \(\alpha\in \Delta\); equivalently, \({\left\langle {\rho},~{\alpha} \right\rangle} = 1\) for every simple \(\alpha\).
Note that Humphreys uses \(\delta\), but nobody uses this notation.
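A quick numerical check (a sketch, continuing the hypothetical \(B_2\) example above) that \(s_\alpha(\rho) = \rho - \alpha\):

```python
# B2 check (sketch): s_alpha(rho) = rho - alpha for a simple root alpha.
Phi_plus = [(1,0), (0,1), (1,1), (1,-1)]
rho = (sum(b[0] for b in Phi_plus)/2, sum(b[1] for b in Phi_plus)/2)  # (1.5, 0.5)
alpha = (0, 1)

dot = lambda u, v: u[0]*v[0] + u[1]*v[1]
c = 2 * dot(rho, alpha) / dot(alpha, alpha)  # <rho, alpha>, should be 1
s_rho = (rho[0] - c*alpha[0], rho[1] - c*alpha[1])
print(s_rho, (rho[0] - alpha[0], rho[1] - alpha[1]))  # (1.5, -0.5) twice
```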
Let \(\alpha_1,\cdots, \alpha_t \in \Delta\) be not necessarily distinct simple roots, and write \(s_i \mathrel{\vcenter{:}}= s_{\alpha_i}\). If \(s_1 \cdots s_{t-1}(\alpha_t) < 0\), then for some \(1\leq u < t\) one has \begin{align*} s_1\cdots s_t = s_1\cdots s_{u-1} s_{u+1} \cdots s_{t-1} ,\end{align*} so one can delete \(s_u\) and \(s_t\) to get a shorter product of reflections.
For \(0\leq i\leq t-1\), let \(\beta_i \mathrel{\vcenter{:}}= s_{i+1} \cdots s_{t-1}( \alpha_t)\), so in particular \(\beta_{t-1} = \alpha_t\) (the empty product). Since \(\beta_0 \mathrel{\vcenter{:}}= s_1\cdots s_{t-1}( \alpha_t ) < 0\) and \(\beta_{t-1} = \alpha_t > 0\), there must be a smallest index \(u\) such that \(\beta_u > 0\). Note that \(u\geq 1\) since \(\beta_0\) is negative. Then \begin{align*} s_u( \beta_u) &= s_u s_{u+1} \cdots s_{t-1}( \alpha_t) \\ &= \beta_{u-1} \\ & < 0 \end{align*} by choice of \(u\). Since \(\beta_u > 0\) and \(s_{\alpha_u}( \beta_u) < 0\), and by lemma B \(s_{\alpha_u}\) permutes the positive roots other than \(\alpha_u\), we must have \(\beta_u = \alpha_u\). By lemma 9.2, write \begin{align*} s_{\alpha_u} = s_{\beta_u} = s_{ s_{u+1}\cdots s_{t-1}( \alpha_t ) } = (s_{u+1} \cdots s_{t-1}) s_{\alpha_t} (s_{u+1}\cdots s_{t-1})^{-1} .\end{align*} Multiply both sides on the left by \((s_1\cdots s_u)\) and on the right by \((s_{u+1}\cdots s_{t-1})\) to obtain \begin{align*} (s_1 \cdots s_{u-1})(s_{u+1}\cdots s_{t-1}) = (s_1\cdots s_u)(s_{u+1}\cdots s_t), \qquad s_t \mathrel{\vcenter{:}}= s_{\alpha_t} .\end{align*}
If \(\sigma = s_1\cdots s_t\) is an expression for \(\sigma\in W\) in terms of simple reflections (which we don’t yet know exists, but it does) with \(t\) minimal, then \(\sigma( \alpha_t) < 0\).
Fix a base for \(\Phi\).
(\(W\) acts transitively on the set of Weyl chambers) If \(\gamma\in {\mathbb{E}}\) is regular (not on a root hyperplane), there exists \(\sigma\in W\) such that \(( \sigma(\gamma), \alpha)> 0\) for all \(\alpha\in \Delta\), i.e. \(\sigma( \gamma) \in {\mathcal{C}}(\Delta)\), the dominant Weyl chamber relative to \(\Delta\).
(\(W\) acts transitively on bases) If \(\Delta'\) is another base for \(\Phi\), then there exists \(\sigma\in W\) such that \(\sigma( \Delta') = \Delta\), so \(W\) acts transitively on bases.
(Every \(W{\hbox{-}}\)orbit in \(\Phi\) contains a simple root) If \(\beta\in \Phi\) then there exists a \(\sigma\in W\) such that \(\sigma( \beta)\in \Delta\).
(\(W\) is generated by simple reflections) \(W = \left\langle{s_ \alpha {~\mathrel{\Big\vert}~}\alpha\in \Delta}\right\rangle\) is generated by the simple reflections.
(Stabilizers are trivial) If \(\sigma( \Delta) = \Delta\) for some \(\sigma\in W\), then \(\sigma = 1\).
Part c: Set \(W' \mathrel{\vcenter{:}}=\left\langle{s_ \alpha{~\mathrel{\Big\vert}~}\alpha\in \Delta}\right\rangle\); we’ll prove (c) with \(W\) replaced by \(W'\), which is a priori smaller, making the statement stronger. First suppose \(\beta\in \Phi^+\) and consider \(W' \beta \cap\Phi^+\). This is nonempty since it contains \(\beta\) and is a finite set, so choose \(\gamma\) in it of minimal height. Claim: \(\operatorname{ht}(\gamma) = 1\), making \(\gamma\) simple. If not, supposing \(\operatorname{ht}( \gamma) > 1\), write \(\gamma = \sum_{ \alpha\in \Delta} k_ \alpha \alpha\) with \(k_ \alpha \geq 0\). Since \(\gamma\neq 0\), we have \((\gamma, \gamma) > 0\), so substitute to yield \begin{align*} 0 < (\gamma, \gamma) = (\gamma, \sum_{\alpha \in \Delta} k_ \alpha \alpha) = \sum_{\alpha\in \Delta} k_ \alpha (\gamma, \alpha) ,\end{align*} so \((\gamma, \alpha)>0\) for some \(\alpha \in \Delta\) with \(k_\alpha > 0\). Then \(s_{ \alpha} \gamma = \gamma - {\left\langle {\gamma },~{\alpha } \right\rangle}\alpha\) where \({\left\langle {\gamma },~{\alpha } \right\rangle}> 0\), and \(s_\alpha\gamma\in\Phi^+\) by lemma B since \(\operatorname{ht}(\gamma) > 1\) forces \(\gamma\neq\alpha\). This contradicts minimality, since \(s_\alpha\gamma\in W'\beta\cap\Phi^+\) has smaller height. Note that if \(\beta\in \Phi^-\) then \(-\beta\in \Phi^+\) and there exists a \(\sigma\in W'\) such that \(\sigma( - \beta) = \alpha\in \Delta\). So \(\sigma( \beta) = - \alpha\), and \(s_ \alpha \sigma( \beta) = s_ \alpha( - \alpha) = \alpha \in \Delta\).
Part d: Given \(\beta\), pick \(\sigma\in W'\) such that \(\sigma^{-1}( \beta) = \alpha\in \Delta\). Then \begin{align*} s_ \beta = s_{ \sigma( \alpha)} = \sigma s_{ \alpha} \sigma^{-1}\in W' ,\end{align*} so \(W \leq W' \leq W\), making \(W = W'\).
Parts a and b: Recall \(\rho = {1\over 2}\sum _{ \beta\in \Phi^+} \beta\), and choose \(\sigma\in W\) such that \(( \sigma(\gamma), \rho)\) is maximal (picking from a finite set). Given \(\alpha\in \Delta\), we have \(s_\alpha \sigma\in W\), and so \begin{align*} ( \sigma(\gamma), \rho) &\geq ( s_\alpha\sigma(\gamma), \rho) \\ &= (\sigma(\gamma), s_ \alpha \rho) \\ &= (\sigma(\gamma), \rho - \alpha) \\ &= (\sigma(\gamma), \rho ) - ( \sigma( \gamma), \alpha) ,\end{align*} and so \(( \sigma( \gamma), \alpha)\geq 0\) for all \(\alpha\in \Delta\). Importantly, \(\gamma\) is regular, so this inequality is strict for all \(\alpha\in \Delta\). So \(W\) acts transitively on the Weyl chambers, and consequently on simple systems (i.e. bases for \(\Phi\)) by the discussion at the end of \(\S 10.1\).2
Part e: Suppose \(\sigma( \Delta) = \Delta\) and \(\sigma \neq 1\), and write \(\sigma = \prod_{1\leq i \leq t} s_i\) with \(s_i \mathrel{\vcenter{:}}= s_{\alpha_i}\) for \(\alpha_i \in \Delta\) with \(t \geq 1\) minimal. Since \(\sigma( \Delta) = \Delta\) and \(\alpha_t \in \Delta\), we have \(\sigma( \alpha_t) > 0\). But \(\prod_{1\leq i\leq t} s_i(\alpha_t) = \prod_{1\leq i \leq t-1}s_i (-\alpha_t)\), so \(\prod_{1\leq i\leq t-1} s_i(\alpha_t) < 0\). This fulfills the deletion condition, so \(\sigma = s_1\cdots \widehat{s_u}\cdots \widehat{s_t}\) for some \(u\), which is a shorter expression. \(\contradiction\)
In type \(A_n\), \({\sharp}W(A_n) = (n+1)!\) (the symmetric group \(S_{n+1}\)), and since bases biject with Weyl chambers, on which \(W\) acts simply transitively, there are \((n+1)!\) choices of base.
Let \(\Delta \subseteq \Phi\) be a base and write \(\sigma\in W\) as \(\sigma = \prod_{1\leq i \leq t} s_{\alpha_i}\) with \(\alpha_i\in \Delta\) and \(t\) minimal. We say this is a reduced expression for \(\sigma\) and say \(t\) is the length of \(\sigma\), denoted \(\ell( \sigma)\). By definition, \(\ell(1) = 0\).
Since \(W \leq \operatorname{GL}({\mathbb{E}})\), there is a map \(\operatorname{det}: W\to \operatorname{GL}_1({\mathbf{R}}) = {\mathbf{R}}^{\times}\). The determinant of a reflection is \(-1\) by writing it in a basis adapted to the fixed hyperplane, and so \(\operatorname{det}\sigma = (-1)^{\ell( \sigma)}\) and in fact \(\operatorname{det}: W\to \left\{{\pm 1}\right\}\). Thus \(\ell( \sigma \sigma') \equiv \ell( \sigma) + \ell( \sigma')\operatorname{mod}2\).
Note also that if \(\sigma' = s_ \alpha\) for \(\alpha\) simple, then \(\ell( \sigma s_{ \alpha}) = \ell( \sigma) \pm 1\). The proof: \(\ell( \sigma s_ \alpha)\leq \ell( \sigma) + 1\) and \(\ell(\sigma) = \ell( (\sigma s_\alpha)s_\alpha)\leq \ell( \sigma s_\alpha) + 1\), so the lengths differ by at most 1, and \(\operatorname{det}( \sigma s_ \alpha) = - \operatorname{det}\sigma\) rules out equality.
Reduced expressions are not unique: for \(A_2\), one has \(s_ \alpha s_ \beta s_ \alpha = s_ \beta s_ \alpha s_ \beta\), and these two reflections do not commute.
Some temporary notation for this section: for \(\sigma\in W\), set \begin{align*} n( \sigma) \mathrel{\vcenter{:}}={\sharp}( \Phi^- \cap\sigma(\Phi^+)) ,\end{align*} the number of positive roots that \(\sigma\) sends to negative roots.
For all \(\sigma\in W\), \begin{align*} n( \sigma) = \ell( \sigma) .\end{align*}
Induct on \(\ell(\sigma)\): if zero, then \(\sigma = 1\) and \(n(1) = 0\) since it fixes all positive roots. If \(\ell( \sigma ) = 1\) then \(\sigma = s_{ \alpha}\) for some simple \(\alpha\), and we know from the last section that \(\sigma\) permutes \(\Phi^+\setminus\left\{{ \alpha }\right\}\) and \(\sigma( \alpha) = - \alpha\), so \(n( \sigma) = 1\).
We’re proving \(\ell( \sigma) = n(\sigma) \mathrel{\vcenter{:}}={\sharp}( \Phi^- \cap\sigma(\Phi^+))\) by induction on \(\ell(\sigma)\), where we already checked the zero case. Assume the result for all \(\tau \in W\) with \(\ell( \tau) < \ell( \sigma)\). Write \(\sigma = s_1\cdots s_t\) with \(s_i \mathrel{\vcenter{:}}= s_{ \alpha_i}, \alpha_i\in \Delta\) reduced. Set \(\tau \mathrel{\vcenter{:}}=\sigma s_t = s_1\cdots s_{t-1}\), which is again reduced with \(\ell(\tau) = \ell( \sigma) - 1\). By the deletion condition and minimality of \(t\), \(s_1 \cdots s_{t-1}( \alpha_t) > 0\), so \(\sigma(\alpha_t) = s_1\cdots s_{t-1}s_t (\alpha_t) = s_1 \cdots s_{t-1}(- \alpha_t) < 0\). Thus \(n(\tau) = n( \sigma) - 1\), since \(s_t\) permutes \(\Phi^+\setminus\left\{{ \alpha_t }\right\}\), so \begin{align*} \ell( \sigma) - 1 = \ell( \tau) = n( \tau) = n( \sigma) -1 \implies \ell( \sigma) = n( \sigma) .\end{align*}
This is useful for finding reduced expressions, or at least their length: just compute how many positive roots change sign under \(\sigma\). Using the deletion condition and lemma A, it’s clear that any expression for \(\sigma\) as a product of simple reflections can be converted into a reduced expression by deleting pairs of simple reflections, and this terminates after finitely many steps.
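In type \(A_{n-1}\), where \(W = S_n\) and the simple reflections are the adjacent transpositions, this says \(\ell(\sigma)\) is the number of inversions of \(\sigma\). A small sketch (not from the lecture) comparing the two counts:

```python
# Type A_{n-1}, W = S_n (a sketch): n(sigma) = number of inversions, and a
# reduced word can be read off from bubble sort, whose adjacent swaps are
# simple reflections.
def inversions(perm):
    """n(sigma): pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))

def bubble_sort_length(perm):
    """l(sigma): number of adjacent swaps bubble sort uses to sort perm."""
    p, count = list(perm), 0
    for _ in range(len(p)):
        for i in range(len(p) - 1):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                count += 1
    return count

sigma = (3, 1, 4, 2)  # one-line notation
print(inversions(sigma), bubble_sort_length(sigma))  # 3 3
```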
Recall that the open Weyl chambers are the connected components of the complement of the root hyperplanes. The closure of any Weyl chamber is a fundamental domain for the action \(W\curvearrowright{\mathbb{E}}\).
A root system \(\Phi \subseteq {\mathbb{E}}\) is irreducible if it cannot be partitioned into mutually orthogonal nonempty subsets. Otherwise, \(\Phi\) is reducible.
Let \(\Delta \subseteq \Phi\) be a simple system. Then \(\Phi\) is irreducible iff \(\Delta\) is irreducible, i.e. \(\Delta\) cannot be partitioned into nonempty orthogonal subsets.
\(\Phi\) reducible implies \(\Delta\) reducible: write \(\Phi = \Phi_1 {\textstyle\coprod}\Phi_2\) where \(( \Phi_1, \Phi_2) = 0\); this induces a similar partition of \(\Delta\), which is nontrivial unless (say) \(\Delta \subseteq \Phi_1\). In that case \((\Delta, \Phi_2) = 0 \implies ({\mathbb{E}}, \Phi_2) = 0 \implies \Phi_2 = \emptyset\) using nondegeneracy of the bilinear form. \(\contradiction\)
Now \(\Delta\) reducible implies \(\Phi\) reducible: write \(\Delta =\Delta_1 {\textstyle\coprod}\Delta_2\) with \((\Delta_1, \Delta_2) = 0\). Let \(\Phi_i\) be the roots which are \(W{\hbox{-}}\)conjugate to an element of \(\Delta_i\). Then elements of \(\Phi_i\) are obtained from \(\Delta_i\) by adding and subtracting only elements of \(\Delta_i\), so \((\Phi_1, \Phi_2) = 0\), and \(\Phi = \Phi_1 \cup\Phi_2\) by a previous lemma that every \(\beta\in \Phi\) is conjugate to some \(\alpha\in \Delta\).
Let \(\Phi \supseteq\Delta\) be irreducible. Relative to the partial order \(\leq\) on roots, there is a unique maximal root \(\tilde \alpha\). In particular, if \(\beta\in \Phi\) and \(\beta\neq \tilde \alpha\), then \(\operatorname{ht}( \beta) < \operatorname{ht}( \tilde \alpha)\) and \((\tilde \alpha, \alpha) \geq 0\) for all \(\alpha\in \Delta\). Moreover, one can write \(\tilde \alpha = \sum _{\alpha\in \Delta} k_\alpha \alpha\) with all \(k_\alpha > 0\), i.e. it is a sum in which every simple root appears.
Existence: Let \(\tilde \alpha\) be any maximal root in the ordering. Given \(\alpha \in \Delta\), \((\tilde \alpha, \alpha) \geq 0\) – otherwise \(s_ \alpha(\tilde \alpha)= \tilde \alpha-{\left\langle { \tilde \alpha},~{\alpha} \right\rangle} \alpha > \tilde\alpha\), contradicting maximality. \(\contradiction\) Write \(\tilde \alpha = \sum_{\alpha\in \Delta} k_ \alpha \alpha\) with \(k_ \alpha \in {\mathbf{Z}}_{\geq 0}\), where it’s easy to see these are all non-negative. Suppose some \(k_\gamma = 0\); then \((\tilde \alpha, \gamma)\leq 0\) – otherwise \(s_\gamma( \tilde \alpha) = \tilde \alpha - {\left\langle {\tilde \alpha},~{\gamma} \right\rangle} \gamma\) has both positive and negative coefficients, which is not possible. Since also \((\tilde \alpha, \gamma) \geq 0\) from above, we must have \(( \tilde \alpha, \gamma) = 0\). So write \begin{align*} 0 = (\tilde \alpha, \gamma) = \sum_{ \alpha\in \Delta} k_ \alpha( \alpha, \gamma) \leq 0 ,\end{align*} so \(( \alpha, \gamma)= 0\) whenever \(k_\alpha \neq 0\), otherwise this expression would be strictly \(< 0\). Thus \(\Delta = \Delta_1 {\textstyle\coprod}\Delta_2\) where \(\Delta_1 = \left\{{\alpha\in \Delta {~\mathrel{\Big\vert}~}k_ \alpha\neq 0}\right\}\) and \(\Delta_2 = \left\{{ \alpha\in \Delta{~\mathrel{\Big\vert}~}k_ \alpha = 0 }\right\}\). This is an orthogonal decomposition of \(\Delta\), since any \(\gamma \in \Delta_2\) is orthogonal to any \(\alpha\in \Delta_1\). Note that \(\Delta_1\neq \emptyset\) since \(\tilde\alpha \neq 0\), and \(\Delta_2\neq \emptyset\) would contradict irreducibility, so \(\Delta_2\) must be empty. So no such \(\gamma\) exists.
Uniqueness: let \(\tilde \alpha\) be a maximal root as above and let \(\tilde \alpha'\) be another such root. Then \((\tilde \alpha, \tilde \alpha') = \sum_{\alpha\in \Delta} k_ \alpha (\alpha, \tilde \alpha')\) with \(k_ \alpha > 0\) and \((\alpha, \tilde \alpha') \geq 0\). In fact \((\tilde \alpha, \tilde \alpha ') > 0\): since \(\Delta\) is a basis for \({\mathbb{E}}\), anything orthogonal to all of \(\Delta\) is zero by nondegeneracy of the form, and \(\tilde \alpha' \neq 0\). By Lemma 9.4, either \(\tilde \alpha, \tilde \alpha '\) are proportional (the case excluded from the hypotheses of the lemma), in which case they are equal since both are positive, or \(a \mathrel{\vcenter{:}}=\tilde \alpha - \tilde \alpha' \in \Phi\) is a root. In the latter case, \(a > 0 \implies \tilde \alpha > \tilde \alpha'\) or \(a< 0 \implies \tilde \alpha < \tilde \alpha'\), both contradicting maximality.
If \(\beta = \sum_{\alpha\in \Delta} m_ \alpha \alpha \in \Phi^+\), then \(m_ \alpha \leq k_ \alpha\) for all \(\alpha\) since \(\beta \leq \tilde\alpha\).
If \(\Phi\) is irreducible then \(W\) acts irreducibly on \({\mathbb{E}}\) (so there are no \(W{\hbox{-}}\)invariant subspaces). In particular, the \(W{\hbox{-}}\)orbit of a root spans \({\mathbb{E}}\).
Omitted.
If \(\Phi\) is irreducible, then at most two root lengths occur, denoted long and short roots.
Omitted.
\(B_2\) has 4 long roots and 4 short roots, since they fit in a square:
Similarly \(G_2\) has long and short roots, fitting into a star of David.
If \(\Phi\) is irreducible then the maximal root \(\tilde \alpha\) is a long root.
There is also a unique maximal short root.
Omitted.
Fix \(\Delta \subseteq \Phi\) a rank \(\ell\) root system with Weyl group \(W\). Let \(\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\) and then the matrix \(A\) where \(A_{ij} = {\left\langle {\alpha_i},~{\alpha_j} \right\rangle} = {2( \alpha_i, \alpha_j) \over ( \alpha_j, \alpha_j)}\) is the Cartan matrix of \(\Phi\). Note that changing the ordering of \(\Delta\) permutes the rows and columns of \(A\), but beyond this, \(A\) does not depend on the choice of \(\Delta\), since bases are permuted by \(W\) and \(W\) preserves the inner products and thus the ratios defining the Cartan integers \(A_{ij}\). Moreover \(A\in \operatorname{GL}_\ell({\mathbf{Z}})\) since the inner product is nondegenerate and \(\Delta\) is a basis for \({\mathbb{E}}\).
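Continuing the hypothetical \(B_2\) example from earlier (a sketch, not from the lecture), the Cartan matrix can be computed directly from a base:

```python
# Cartan matrix A[i][j] = <alpha_i, alpha_j> = 2(alpha_i, alpha_j)/(alpha_j, alpha_j)
# for the B2 base found earlier (a sketch).
Delta = [(0, 1), (1, -1)]
dot = lambda u, v: u[0]*v[0] + u[1]*v[1]
A = [[2 * dot(ai, aj) // dot(aj, aj) for aj in Delta] for ai in Delta]
print(A)  # [[2, -1], [-2, 2]]
```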
Note that the diagonal entries are always 2. Some classical types:
The Cartan matrix \(A\) determines the root system \(\Phi\) up to isomorphism: if \(\Phi' \subseteq {\mathbb{E}}'\) is another root system with base \(\Delta' = \left\{{ { \alpha'}_{1}, \cdots, { \alpha'}_{\ell} }\right\}\) with \(A'_{ij} = A_{ij}\) for all \(i, j\), then the bijection \(\alpha_i \mapsto \alpha_i'\) extends to a vector space isomorphism \(\varphi: {\mathbb{E}} { \, \xrightarrow{\sim}\, }{\mathbb{E}}'\) sending \(\Phi\) to \(\Phi'\) which preserves all Cartan integers, i.e. \({\left\langle {\varphi(\alpha)},~{\varphi( \beta)} \right\rangle} = {\left\langle {\alpha },~{\beta} \right\rangle}\) for all \(\alpha, \beta \in \Phi\) – an isomorphism of root systems, though not necessarily an isometry. Since \(\Delta, \Delta'\) are bases of \({\mathbb{E}}, {\mathbb{E}}'\), setting \(\varphi(\alpha_i) \mathrel{\vcenter{:}}=\alpha_i'\) and extending linearly gives a vector space isomorphism. If \(\alpha, \beta\in \Delta\) are simple, then \begin{align*} s_{\varphi( \alpha)}( \varphi( \beta)) &= \varphi( \beta)- {\left\langle {\beta'},~{\alpha'} \right\rangle}\varphi( \alpha) \\ &= \varphi( \beta)-{\left\langle { \beta},~{ \alpha} \right\rangle} \varphi( \alpha) \\ &= \varphi(\beta- {\left\langle {\beta },~{\alpha } \right\rangle}\alpha) \\ &= \varphi( s_ \alpha( \beta)) ,\end{align*} so this diagram commutes since these maps agree on the simple roots, which form a basis:
Since \(W, W'\) are generated by reflections and \(s_{ \varphi( \alpha)} = \varphi\circ s_ \alpha \circ \varphi^{-1}\) for \(\alpha\in \Delta\), there is an isomorphism \begin{align*} W & { \, \xrightarrow{\sim}\, }W' \\ s_ \alpha &\mapsto s_{ \varphi( \alpha)} = \varphi s_ \alpha \varphi^{-1} \quad \forall \alpha \in \Delta .\end{align*} If \(\beta \in \Phi\), then \(\beta = w( \alpha)\) for some \(\alpha\in \Delta\) and \(w\in W\) by theorem 10.3C. Thus \(\varphi( \beta) = ( \varphi \circ w \circ \varphi^{-1})( \varphi( \alpha))\in \Phi'\) since \(\varphi\circ w \circ \varphi^{-1}\in W'\). Thus \(\varphi( \Phi) = \Phi'\). Using lemma 9.2, \(s_{\varphi(\beta)} = \varphi s_ \beta \varphi^{-1}\), so \(\varphi\) preserves all of the Cartan integers \({\left\langle {\beta },~{\gamma} \right\rangle}\) for all \(\gamma, \beta\in \Phi\).
Read the last paragraph of \(\S 11.1\) which gives an algorithm for constructing \(\Phi^+\) from \(\Delta\) and \(A\).
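For reference, here is a sketch of that algorithm (my own reconstruction, not the book’s code): positive roots are built up height by height, using only the Cartan matrix, since for \(\beta = \sum_i k_i \alpha_i\) one has \({\left\langle {\beta},~{\alpha_j} \right\rangle} = \sum_i k_i A_{ij}\), and \(\beta + \alpha_j \in \Phi\) iff \(q = r - {\left\langle {\beta},~{\alpha_j} \right\rangle} > 0\) for the \(\alpha_j\)-string through \(\beta\):

```python
# Sketch of the algorithm from the end of section 11.1: generate Phi^+ from
# the Cartan matrix A alone. Roots are stored as coefficient tuples over Delta.
def positive_roots(A):
    ell = len(A)
    simples = [tuple(int(i == j) for j in range(ell)) for i in range(ell)]
    roots, frontier = set(simples), list(simples)
    while frontier:  # frontier = the roots of the current height
        new = []
        for beta in frontier:
            for j in range(ell):
                # r = number of backward steps beta - alpha_j, beta - 2 alpha_j, ...
                r, back = 0, tuple(beta[i] - (i == j) for i in range(ell))
                while back in roots:
                    r, back = r + 1, tuple(back[i] - (i == j) for i in range(ell))
                pairing = sum(beta[i] * A[i][j] for i in range(ell))  # <beta, alpha_j>
                if r - pairing > 0:  # q > 0, so beta + alpha_j is a root
                    up = tuple(beta[i] + (i == j) for i in range(ell))
                    if up not in roots:
                        roots.add(up)
                        new.append(up)
        frontier = new
    return roots

A2 = [[2, -1], [-1, 2]]
print(sorted(positive_roots(A2)))  # [(0, 1), (1, 0), (1, 1)]
```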
If \(\alpha\neq \beta\in \Phi^+\) then \({\left\langle {\beta },~{\alpha} \right\rangle}{\left\langle {\alpha },~{\beta } \right\rangle}= 0,1,2,3\) from the table several sections ago. Fix \(\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\), then the Coxeter graph \(\Gamma\) of \(\Phi\) is the graph with \(\ell\) vertices \(1,\cdots, \ell\) with vertices \(i, j\) connected by \({\left\langle {\alpha_i},~{ \alpha_j} \right\rangle} {\left\langle {\alpha_j},~{\alpha_i} \right\rangle}\) edges.
Recall that the table was
| \({\left\langle {\alpha },~{\beta} \right\rangle}\) | \({\left\langle {\beta },~{\alpha} \right\rangle}\) |
|---|---|
| 0 | 0 |
| -1 | -1 |
| -1 | -2 |
| -1 | -3 |
Here \(\alpha\) is the shorter root, although without loss of generality in the first two rows we can rescale so that \({\left\lVert {\alpha} \right\rVert}= {\left\lVert {\beta} \right\rVert}\). The graphs for some classical types:
If \(\Phi\) has roots all of the same length, the Coxeter graph determines the Cartan integers since \(A_{ij} = 0, 1\) for \(i\neq j\). If \(i \to j\) is a subgraph of \(\Gamma\) then \({\left\langle { \alpha_i},~{ \alpha_j} \right\rangle} = {\left\langle { \alpha_j},~{\alpha_i} \right\rangle} = -1\), so \(\alpha_i, \alpha_j\) have the same length. However, if there are roots of multiple lengths, taking the product to determine the number of edges loses information about which root is longer.
The Dynkin diagram of \(\Phi\) is the Coxeter graph \(\Gamma\) where for each multiple edge, there is an arrow pointing from the longer root to the shorter root.
In rank 2:
We also have the following diagram for \(F_4\):
Note that \(\Phi\) is irreducible iff \(\Delta\) cannot be partitioned into two proper nonempty orthogonal subsets iff the Coxeter graph is connected. In general, if \(\Gamma\) has \(t\) connected components, let \(\Delta = \coprod_{1\leq i\leq t} \Delta_i\) be the corresponding orthogonal partition of simple roots. Let \({\mathbb{E}}_i = \mathop{\mathrm{span}}_{\mathbf{R}}\Delta_i\), then \({\mathbb{E}}= \bigoplus_{1\leq i\leq t}{\mathbb{E}}_i\) is an orthogonal direct sum decomposition into \(W{\hbox{-}}\)invariant subspaces, which follows from the reflection formula. Writing \(\Phi_i = ({\mathbf{Z}}\Delta_i) \cap\Phi\), one has \(\Phi = \coprod_{1\leq i\leq t} \Phi_i\) since each root is \(W{\hbox{-}}\)conjugate to a simple root and \({\mathbf{Z}}\Delta_i\) is \(W{\hbox{-}}\)invariant and each \(\Phi_i \subseteq {\mathbb{E}}_i\) is itself a root system. Thus it’s enough to classify irreducible root systems.
Classifying root systems: \(\Delta \subseteq \Phi \subseteq {\mathbb{E}}\) a base yields a decomposition
If \(\Phi\) is an irreducible root system of rank \(\ell\), then its Dynkin diagram is one of the following:
Types ADE are called simply laced since they have no multiple edges.
Idea: classify possible connected Coxeter graphs, ignoring relative root lengths. If \(\alpha, \beta\) are simple roots, note that for any \(c > 0\), \begin{align*} {\left\langle { c\alpha },~{ \beta } \right\rangle} {\left\langle { \beta },~{ c\alpha } \right\rangle} = {2(c \alpha, \beta)\over (\beta, \beta)} {2( \beta, c \alpha) \over (c \alpha, c \alpha)} = {4( \alpha, \beta)^2 \over ( \alpha, \alpha)( \beta, \beta)} \in \left\{{0,1,2,3}\right\} ,\end{align*} so \(\alpha \mapsto c\alpha\) leaves this number invariant and we can assume all simple roots are unit vectors.
Let \({\mathbb{E}}\) be a finite-dimensional Euclidean space. A subset \(A =\left\{{{ {\varepsilon}_1, \cdots, {\varepsilon}_{n} } }\right\}\subseteq {\mathbb{E}}\) of linearly independent unit vectors satisfying
is called admissible.
Any base for a root system where each vector is normalized is admissible.
To such an \(A\) we associate a graph \(\Gamma\) as before with vertices \(1,\cdots, n\) where \(i,j\) are joined by \(4({\varepsilon}_i, {\varepsilon}_j)^2\) edges. We’ll determine all connected graphs \(\Gamma\) that can occur, since these include all connected Coxeter graphs.
An easy 10 steps:
Proof: Set \({\varepsilon}\mathrel{\vcenter{:}}=\sum_{i=1}^n {\varepsilon}_i\), which is nonzero by linear independence. Then \(0 < ({\varepsilon}, {\varepsilon}) = n + \sum_{i<j} 2 ({\varepsilon}_i, {\varepsilon}_j)\). If \(i< j\) are joined, so that \(({\varepsilon}_i, {\varepsilon}_j)\neq 0\), then \(4({\varepsilon}_i, {\varepsilon}_j)^2 \in \left\{{1,2,3}\right\}\) and so \(2({\varepsilon}_i, {\varepsilon}_j)\leq -1\). Since the total is positive, there can be at most \(n-1\) joined pairs.
Proof: A cycle would be a subgraph corresponding to an admissible subset \(A' \subseteq A\) with \(m\) vertices and at least \(m\) edges, coming from \(m\) joined pairs. This contradicts (2).
Let \({\varepsilon}\in A\) be a vertex, and suppose \({ \eta _1, \cdots, \eta _{k} }\) are the vertices connected to \({\varepsilon}\):
By (3), no two \(\eta_i, \eta_j\) are connected, so \(( \eta_i, \eta_j) = 0\) for all \(i\neq j\). Applying Gram-Schmidt to \(\left\{{{ \eta _1, \cdots, \eta _{k} } , {\varepsilon}}\right\}\) only involves modifying \({\varepsilon}\), since \(\left\{{{ \eta _1, \cdots, \eta _{k} }}\right\}\) is already orthonormal. Call the new vector \(\eta_0\) and let \(\left\{{{ \eta _0, \cdots, \eta _{k} }}\right\}\) be the resulting orthonormal set. One can then write \({\varepsilon}= \sum_{i=0}^k ({\varepsilon}, \eta_i)\eta_i\), where \(({\varepsilon}, \eta_0)\neq 0\) by linear independence. Then \(1 = ({\varepsilon}, {\varepsilon}) = \sum_{i=0}^k ({\varepsilon}, \eta_i)^2\), but \(({\varepsilon}, \eta_0)^2 > 0\), so \(\sum_{i=1}^k ({\varepsilon}, \eta_i)^2 < 1\) and thus \(\sum_{i=1}^k 4 ({\varepsilon}, \eta_i)^2 < 4\). But this sum is the number of edges incident to \({\varepsilon}\) in \(\Gamma\).
The only connected graph which contains a triple edge is the Coxeter graph of \(G_2\) by (4), since the triple edge forces each vertex to already have 3 incident edges.
Let \(\left\{{{\varepsilon}_1,\cdots, {\varepsilon}_k}\right\} \subseteq A\) have a simple chain \(\cdot \to \cdot \to \cdots \to \cdot\) as a subgraph. If \(A' \mathrel{\vcenter{:}}=\qty{A\setminus\left\{{ { {\varepsilon}_1, \cdots, {\varepsilon}_{k} } }\right\}} \cup\left\{{{\varepsilon}}\right\}\) where \({\varepsilon}\mathrel{\vcenter{:}}=\sum_{i=1}^k {\varepsilon}_i\), then \(A'\) is admissible. The corresponding graph \(\Gamma'\) is obtained by shrinking the chain to a point, where any edge that was incident to a vertex in the chain is now incident to \({\varepsilon}\), with the same multiplicity.
Proof: number the vertices in the chain \(1,\cdots, k\). Linear independence of \(A'\) is clear. Note \(4({\varepsilon}_i, {\varepsilon}_{i+1})^2 = 1\implies 2({\varepsilon}_i, {\varepsilon}_{i+1}) = -1 \implies ({\varepsilon}, {\varepsilon}) = k + 2 \sum_{i< j} ({\varepsilon}_i, {\varepsilon}_j) = k + (-1)(k-1) = 1\). Any \(\eta\in A\setminus\left\{{ { {\varepsilon}_1, \cdots, {\varepsilon}_{k} } }\right\}\) is connected to at most one of \({ {\varepsilon}_1, \cdots, {\varepsilon}_{k} }\), since otherwise this would form a cycle, so \((\eta, {\varepsilon}) = (\eta, {\varepsilon}_i)\) for a single \(i\). So \(4(\eta, {\varepsilon})^2 = 4(\eta, {\varepsilon}_i)^2 \in \left\{{0,1,2,3}\right\}\) and \((\eta, {\varepsilon}) = (\eta, {\varepsilon}_i) \leq 0\), which verifies all of the admissibility criteria.
Proof: collapsing the chain in the middle produces a vertex with 4 incident edges.
Missed first 15m!
The only connected \(\Gamma\) graphs of the second type in Step 8 are either \(F_4\) or \(B_\ell = C_\ell\).
Compute \begin{align*} ({\varepsilon}, {\varepsilon}) &= \sum_{i=1}^p i^2 - \sum_{i=1}^{p-1} i(i+1) \\ &= p^2 - \sum_{i=1}^{p-1} i \\ &= p^2 - {p(p-1)\over 2} \\ &= {p(p+1)\over 2} ,\end{align*} and similarly \((\eta, \eta) = {q(q+1)\over 2}\). Note \(4({\varepsilon}_p, \eta_q)^2 = 2\), so \(({\varepsilon},\eta)^2 = p^2 q^2 ({\varepsilon}_p, \eta_q)^2 = {p^2q^2\over 2}\). By Cauchy-Schwarz, \(({\varepsilon}, \eta)^2< ({\varepsilon}, {\varepsilon}) (\eta, \eta)\), where the inequality is strict since \(\eta, {\varepsilon}\) are linearly independent. Then check \begin{align*} {p^2 q^2\over 2} &< {p(p+1)\over 2} \cdot {q(q+1)\over 2} \\ {pq\over 2} &< {p+1\over 2}\cdot {q+1\over 2} \\ 2pq &< pq + p + q + 1 ,\end{align*} and so combining these yields \(pq-p-q+1 < 2\) and thus \begin{align*} (p-1)(q-1) < 2 .\end{align*} Since \(p\geq q\geq 1\), this yields two possible cases:
The only connected \(\Gamma\) of type (d) are \(D_\ell, E_6, E_7, E_8\).
Set \({\varepsilon}\mathrel{\vcenter{:}}=\sum i{\varepsilon}_i\), \(\eta \mathrel{\vcenter{:}}=\sum i\eta_i\), and \(\zeta = \sum i \zeta_i\). Note that \({\varepsilon}, \eta, \zeta\) are mutually orthogonal by inspecting the graph, and \(\psi\) is not in their span. Let \(\theta_1\) (resp. \(\theta_2, \theta_3\)) be the angles between \({\varepsilon}\) (resp. \(\eta, \zeta\)) and \(\psi\). Since \({\varepsilon},\eta,\zeta\) are linearly independent, the idea is to apply Gram-Schmidt to \(\left\{{{\varepsilon},\eta,\zeta,\psi}\right\}\) without normalizing. The first 3 are already orthogonal, so we get a new orthogonal basis \(\left\{{\psi_1 \mathrel{\vcenter{:}}={\varepsilon}, \psi_2\mathrel{\vcenter{:}}=\eta, \psi_3\mathrel{\vcenter{:}}=\zeta, \psi_0}\right\}\) where \((\psi_0, \psi) \neq 0\). We can expand \(\psi\) in this basis to write \(\psi = \sum_{i=0}^3 \qty{\psi, {\psi_i\over {\left\lVert {\psi_i} \right\rVert}}} {\psi_i \over {\left\lVert {\psi_i} \right\rVert}}\). Note that \((\psi, \psi) = 1\), and the \(i=0\) term contributes \(\qty{\psi, {\psi_0\over {\left\lVert {\psi_0} \right\rVert}}}^2 > 0\); consequently \(\sum_{i=1}^3 \qty{\psi, {\psi_i \over{\left\lVert {\psi_i} \right\rVert}}}^2 < 1 \implies \sum_{i=1}^3 \cos^2(\theta_i) < 1\). So \begin{align*} \cos^2(\theta_1) + \cos^2( \theta_2) + \cos^2( \theta_3) < 1 .\end{align*}
As in Step (9), \(({\varepsilon}, {\varepsilon}) = {p(p-1)\over 2}\) and similarly for \(\eta,\zeta\), and so \begin{align*} \cos^2( \theta_1) = {({\varepsilon}, \psi)^2 \over ({\varepsilon}, {\varepsilon}) (\psi, \psi)} &= {(p-1)^2 ({\varepsilon}_{p-1}, \psi )^2 \over {p(p-1)\over 2} \cdot 1 } \\ &= {p-1\over p}\cdot {1/4\over 1/2} \\ &= {1\over 2}\qty{1-{1\over p}} ,\end{align*} where we’ve used that \(4({\varepsilon}_{p-1},\psi )^2 = 1\). Similarly (and summarizing), \begin{align*} \cos^2(\theta_1) &= {1\over 2}\qty{1-{1\over p}} \\ \cos^2(\theta_2) &= {1\over 2}\qty{1-{1\over q}} \\ \cos^2(\theta_3) &= {1\over 2}\qty{1-{1\over r}} \\ \\ &\implies {1\over 2}\qty{ 1 - {1\over p} + 1 - {1\over q} + 1 - {1\over r}} < 1 \\ &\implies p^{-1}+ q^{-1}+r^{-1}> 1 .\end{align*} and since \(p\geq q\geq r\geq 2 \implies p^{-1}\leq q^{-1}\leq r^{-1}\leq 2^{-1}\), we have \({3\over r} > 1\) by replacing \(p,q\) with \(r\) above. So \(r < 3\), forcing \(r=2\), and there is only one “top leg” in the graph for (d) above.
We also have \begin{align*} {2\over q} \geq {1\over p} + {1\over q} > {1\over 2}, \qquad (\star) .\end{align*} so \(q<4\) forces \(q=2,3\).
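The finitely many solutions of these inequalities can be enumerated mechanically; here is a small sketch (not from the lecture) listing all \((p,q,r)\) with \(p\geq q\geq r\geq 2\) and \(p^{-1}+q^{-1}+r^{-1}>1\):

```python
# Enumerate branch-leg data (p, q, r) with 1/p + 1/q + 1/r > 1, p >= q >= r >= 2.
# The family (p, 2, 2) gives D_{p+2}; the sporadic (3,3,2), (4,3,2), (5,3,2)
# give E_6, E_7, E_8. Only p is unbounded, so p < 8 suffices to see the pattern.
from fractions import Fraction

sols = [(p, q, r)
        for p in range(2, 8)
        for q in range(2, p + 1)
        for r in range(2, q + 1)
        if Fraction(1, p) + Fraction(1, q) + Fraction(1, r) > 1]
print(sols)
# [(2,2,2), (3,2,2), (3,3,2), (4,2,2), (4,3,2), (5,2,2), (5,3,2), (6,2,2), (7,2,2)]
```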
Note that the diagrams we’ve constructed are the only possible Coxeter graphs of a root system, since normalizing any set of simple roots yields an admissible set. This proves one direction of a correspondence, but what are all possible Dynkin diagrams? Note that types \(B_\ell, C_\ell\) have the same underlying Coxeter graph, and only differ by directions on the multi-edges.
Does every connected Dynkin diagram correspond to an irreducible root system?
Yes: types \(A,B,C,D\) can be constructed from root systems in classical Lie algebras, and the corresponding Dynkin diagrams can be constructed directly. The 5 exceptional types must be constructed directly.
Does each irreducible root system occur as the root system of some semisimple Lie algebra over \({\mathbf{C}}\)?
The answer is of course: yes!
Next time: starting Ch. V.
Let \({ \mathbf{F} }\) be an arbitrary field, not necessarily characteristic zero, and let \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\) be an arbitrary Lie algebra, not necessarily finite-dimensional. Recall that the tensor algebra \(T(V)\) is the \({\mathbf{Z}}_{\geq 0}\) graded unital algebra where \({\mathsf{gr}\,}_n T(V) = T^n(V) \mathrel{\vcenter{:}}= V{ {}^{ \scriptstyle\otimes_{{ \mathbf{F} }}^{n} } }\) where \(T^0(V) \mathrel{\vcenter{:}}={ \mathbf{F} }\). Note \(T(V) = \bigoplus _{n\geq 0} T^n(V)\). If \(V\) has a basis \(\left\{{x_k}\right\}_{k\in K}\) then \(T(V) \cong { \mathbf{F} }\left\langle{x_k {~\mathrel{\Big\vert}~}k\in K}\right\rangle\), a polynomial ring in the noncommuting variables \(x_k\). Degree \(n\) monomials in this correspond to pure tensors with \(n\) components in \(T(V)\).
There is an \({ \mathbf{F} }{\hbox{-}}\)linear map \(V \xrightarrow{i} T(V)\), and \(T(V)\) satisfies a universal property: given any linear map \(\phi\in {}_{{ \mathbf{F} }}{\mathsf{Mod}}(V, A)\) where \(A\) has the structure of an associative algebra, there exists a unique \(\psi\in \mathsf{Assoc} \mathsf{Alg}_{/ {{ \mathbf{F} }}}(T(V), A)\) making the diagram commute:
In fact, one can explicitly write \(\psi\) as \(\psi(x_{k_1}\otimes\cdots \otimes x_{k_n}) = \phi(x_{k_1})\cdots \phi(x_{k_n})\) using the multiplication in \(A\).
The symmetric algebra and exterior algebra are defined as \begin{align*} S(V) \mathrel{\vcenter{:}}= T(V)/\left\langle{x\otimes y -y\otimes x {~\mathrel{\Big\vert}~}x,y\in V}\right\rangle, { {\bigwedge}^{\scriptscriptstyle \bullet}} (V) \mathrel{\vcenter{:}}= T(V)/\left\langle{x\otimes y + y\otimes x {~\mathrel{\Big\vert}~}x,y\in V}\right\rangle .\end{align*}
Let \(L\in \mathsf{Lie} \mathsf{Alg}_{/ {{ \mathbf{F} }}}\) with basis \(\left\{{x_k}\right\}_{k\in K}\). A universal enveloping algebra for \(L\) is a pair \((U, i)\) where \(U\) is a unital associative \({ \mathbf{F} }{\hbox{-}}\)algebra and \(i: L \to U_L\) (where \(U_L\) is \(U\) equipped with the commutator bracket multiplication) is a morphism of Lie algebras, i.e. \begin{align*} i([xy]) = i(x) i(y) - i(y) i(x) = [i(x) i(y) ] \quad \forall x,y\in L .\end{align*} It satisfies a universal property: for any unital associative algebra \(A\) receiving a Lie algebra morphism \(j: L\to A_L\), there is a unique \(\phi\) in the following:
Uniqueness follows from the usual proof for universal objects. Existence: let \begin{align*} U(L) \mathrel{\vcenter{:}}= T(L) / J, \qquad J \mathrel{\vcenter{:}}=\left\langle{x\otimes y - y\otimes x - [xy] {~\mathrel{\Big\vert}~}x,y\in L}\right\rangle .\end{align*} Warning: \(J\) is a two-sided ideal, but is not homogeneous!
One can form the required map:
This satisfies \(\psi(x\otimes y - y\otimes x - [xy]) = j(x) j(y) - j(y)j(x) - j([xy]) = 0\) using the properties of \(j\). \(\phi\) is unique because \(U(L)\) is generated by 1 and \(\operatorname{im}i\), since \(T(L)\) is generated by 1 and the image of \(L = T^1(L)\).
If \(L\) is abelian, \(U(L) = S(L)\) is the symmetric algebra.
Note that \(J \subseteq \bigoplus _{n\geq 1} T^n(L)\) so \({ \mathbf{F} }= T^0(L)\) maps isomorphically into \(U(L)\) under \(\pi\). So \({ \mathbf{F} }\hookrightarrow U(L)\), meaning \(U(L) \neq 0\), although we don’t yet know if \(L\) injects into \(U(L)\).
Let \(L\) be a Lie algebra with basis \(\left\{{x_k}\right\}_{k\in K}\), and filter \(T(L)\) by \(T_m \mathrel{\vcenter{:}}=\bigoplus _{i\leq m} T^i (L)\). Then \(T_m\) is the span of words of length at most \(m\) in the basis elements \(x_k\). Note \(T_m \cdot T_n \subseteq T_{m+n}\), and the projection \(\pi: T(L) \twoheadrightarrow U(L)\) induces an increasing filtration \(U_0 \subseteq U_1 \subseteq \cdots\) of \(U(L)\). Let \(G^m \mathrel{\vcenter{:}}= U_m / U_{m-1}\) be the \(m\)th graded piece. The product on \(U(L)\) induces a well-defined product \(G^m \times G^n \to G^{m+n}\), since \(U_m \cdot U_n \subseteq U_{m+n}\) and changing representatives changes the product only by elements of \(U_{m-1}\cdot U_n + U_m \cdot U_{n-1} \subseteq U_{m+n-1}\). Extending this bilinearly to \(\bigoplus _{m\geq 0} G^m\) forms the associated graded algebra of \(U(L)\).
Note that this construction generally works for any filtered algebra where the multiplication is compatible with the filtration.
Let \(L \mathrel{\vcenter{:}}={\mathfrak{sl}}_2({\mathbf{C}})\) with ordered basis \(\left\{{x,h,y}\right\}\). Then \(y\otimes h\otimes x\in T^3(L)\) – denote the image of this monomial in \(U(L)\) by \(yhx \in U_3\). We can reorder this: \begin{align*} yhx &= hyx + [yh]x \\ &= hyx + 2yx \qquad \in U_3 + U_2 ,\end{align*} so in \(G^3\) we have \(yhx=hyx\). This is a general feature: reordering introduces error terms of lower degree, which are quotiented out. Continuing, \begin{align*} hyx + 2yx &= hxy + h[yx] + 2xy + 2[yx] \\ &= hxy - h^2 + 2xy - 2h \\ &= xhy + [hx]y - h^2 + 2xy - 2h \\ &= xhy + 2xy - h^2 + 2xy - 2h \\ &= xhy + 4xy - h^2 - 2h \qquad \in U_3 + U_2 + U_2 + U_2 .\end{align*}
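This reordering procedure is entirely mechanical, so here is a sketch (not from the lecture) automating it: words in the noncommuting symbols \(x,h,y\) are straightened into the order \(x < h < y\) using the relations \(hx = xh + 2x\), \(yx = xy - h\), \(yh = hy + 2y\):

```python
# Straightening in U(sl2) (a sketch): elements are dicts {word: coefficient},
# words are strings over "xhy"; out-of-order adjacent pairs ab are rewritten
# via ab = ba + [a, b] until every word is in PBW order x < h < y.
from collections import defaultdict

ORDER = {"x": 0, "h": 1, "y": 2}
BRACKET = {("h", "x"): {"x": 2}, ("y", "x"): {"h": -1}, ("y", "h"): {"y": 2}}

def straighten(element):
    elt, dirty = defaultdict(int, element), True
    while dirty:
        dirty = False
        for word, c in list(elt.items()):
            for i in range(len(word) - 1):
                a, b = word[i], word[i + 1]
                if ORDER[a] > ORDER[b]:
                    del elt[word]
                    elt[word[:i] + b + a + word[i + 2:]] += c      # swapped word
                    for z, k in BRACKET[(a, b)].items():           # bracket term,
                        elt[word[:i] + z + word[i + 2:]] += c * k  # lower degree
                    dirty = True
                    break
            if dirty:
                break
    return {w: c for w, c in elt.items() if c != 0}

print(straighten({"yhx": 1}))
# {'hh': -1, 'xy': 4, 'h': -2, 'xhy': 1}, i.e. xhy + 4xy - h^2 - 2h as above
```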
Clarification from last time: for \(L\in \mathsf{Lie} {}_{{ \mathbf{F} }} \mathsf{Alg}\) over \({ \mathbf{F} }\) an arbitrary field:
Then \(i\) is a Lie algebra morphism since \(i([xy]) = i(x)i(y) - i(y)i(x) = [i(x) i(y)]\). We know \begin{align*} 0 &= \pi(x\otimes y - y\otimes x - [xy]) \\ &= \pi(i_0(x) i_0(y) - i_0(y) i_0(x) - i_0([xy])) \\ &= i(x) i(y) - i(y) i(x) - i([xy]) .\end{align*}
Recall that we filtered \(0 \subseteq U_1 \subseteq \cdots \subseteq U(L)\) and defined the associated graded \(G^m = U_m / U_{m-1}\) and \(G(L) \mathrel{\vcenter{:}}=\bigoplus _{m\geq 0} G^m\), and we saw by example that \(yhx = hyx = xhy\) in \(G^3({\mathfrak{sl}}_2)\). There is a projection map \(T^m(L) \to U(L)\) whose image is contained in \(U_m\), so there is a composite map \begin{align*} T^m(L) \to U_m \to U_m/U_{m-1} = G^m .\end{align*} Since \(T(L) = \bigoplus _{m\geq 0} T^m(L)\), these can be combined into an algebra morphism \(T(L) \to G(L)\). It’s not hard to check that this factors through \(S(L) \mathrel{\vcenter{:}}= T(L)/\left\langle{x\otimes y - y\otimes x{~\mathrel{\Big\vert}~}x,y\in L}\right\rangle\) since \(x\otimes y = y\otimes x + [xy]\) and the \([xy]\) term is in lower degree. So this induces \(w: S(L) \to G(L)\), and the PBW theorem states that this is an isomorphism of graded algebras.
Let \(\left\{{x_k}\right\}_{k\in K}\) be an ordered basis for \(L\), then the collection of monomials \(x_{k_1}\cdots x_{k_m}\) for \(m\geq 0\) where \(k_1\leq \cdots \leq k_m\) is a basis for \(U(L)\).
The collection of such monomials of length exactly \(m\) forms a basis for \(S^m(L)\), and via \(w\), a basis for \(G^m(L)\). In particular these monomials form a linearly independent set in \(U_m/U_{m-1}\), since passing to a quotient can only introduce new linear dependencies, and hence they are linearly independent in \(U_m\) and \(U(L)\). By induction on \(m\), \(U_{m-1}\) has a basis consisting of all ordered monomials of length \(\leq m-1\). We can then get a basis of \(U_m\) by adjoining to this basis of \(U_{m-1}\) any preimages in \(U_m\) of basis elements for the quotient \(U_m/U_{m-1}\). So a basis for \(U_m\) is all ordered monomials of length \(\leq m\). Since \(U(L) = \cup_{m\geq 0} U_m\), taking the union of bases over all \(m\) yields the result.
The canonical map \(i: L\to U(L)\) is injective.
This follows from taking \(m=1\) in the previous corollary.
Let \(H\leq L\) be a Lie subalgebra, choose an ordered basis \(\left\{{h_1, h_2,\cdots}\right\}\) for \(H\), and extend it to an ordered basis \(\left\{{h_1,h_2,\cdots, x_1,x_2,\cdots}\right\}\) for \(L\). Then the injection \(H\hookrightarrow L\) induces an injective morphism \(U(H) \hookrightarrow U(L)\). Moreover, \(U(L)\in {}_{U(H)}{\mathsf{Mod}}^{\mathrm{free}}\) with a basis of ordered monomials \(x_{k_1}x_{k_2}\cdots x_{k_m}\) for \(m\geq 0\).
This follows directly from corollary C.
We’ll skip 17.4, which proves the PBW theorem. The hard part: linear independence, which is done by constructing a representation of \(U(L)\) in another algebra.
Let \(L \in \mathsf{Lie} {}_{{ \mathbf{F} }} \mathsf{Alg}\) which is generated as a Lie algebra (so allowing commutators) by a subset \(X \subseteq L\).18 We say \(L\) is free on \(X\) and write \(L = L(X)\) if for any set map \(\phi: X\to M\) with \(M\in \mathsf{Lie} {}_{{ \mathbf{F} }} \mathsf{Alg}\) there exists an extension:
Existence:
One checks that \(\tilde \phi\) restricts to a Lie algebra morphism \(\tilde \phi: L(X) \to U(M)\) whose image is the Lie subalgebra of \(U(M)\) generated by \(M\) – but this subalgebra is precisely \(M\), since e.g. \(U(M)\ni x\otimes y - y\otimes x = [xy]\in M\). Thus we can view \(\tilde \phi\) as a map \(\tilde \phi: L(X)\to M\).
One can check that \(U(L(X)) = T(V(X))\).
Recall that the free Lie algebra of a set \(X\), \(L(X)\) satisfies a universal property:
Given an arbitrary \(L\in {\mathsf{Lie}{\hbox{-}} \mathsf{Alg}}\), fix a set \(X\) of generators for \(L\) and form \(L(X)\); then there is a Lie algebra morphism \(\pi: L(X) \twoheadrightarrow L\), which is surjective since \(X\) generates \(L\). Defining \(R\mathrel{\vcenter{:}}=\ker \pi\), one has \(L \cong L(X)/R\), so \(R\) is called the ideal of relations.
Let \(L \in {\mathsf{Lie}{\hbox{-}} \mathsf{Alg}}^{{\mathrm{fd}}, {\mathrm{ss}}}_{{\mathbf{C}}}\), let \(H \subseteq L\) be a maximal toral subalgebra, and \(\Phi\) its root system. Fix a base \(\Delta = \left\{{ \alpha_1, \cdots, \alpha_\ell}\right\} \subseteq \Phi\). Recall \begin{align*} {\left\langle { \alpha_j},~{ \alpha_i} \right\rangle} \mathrel{\vcenter{:}}={2 (\alpha_j, \alpha_i) \over (\alpha_i, \alpha_i)} = \alpha_j(h_i),\qquad h_i \mathrel{\vcenter{:}}= h_{\alpha_i} = {2 \alpha_i \over (\alpha_i, \alpha_i)} .\end{align*} The root strings are of the form \(\beta - r \alpha, \cdots, \beta, \cdots, \beta+ q \alpha\) where \(r-q = \beta(h_ \alpha)\). For any \(i\) we can fix a standard \({\mathfrak{sl}}_2\) triple \(\left\{{x_i, h_i, y_i}\right\}\) such that \(x_i\in L_{ \alpha_i}, y_i \in L_{- \alpha_i}, h_i = [x_i y_i]\).
\(L\) is generated as a Lie algebra by the \(3\ell\) generators \(X\mathrel{\vcenter{:}}=\left\{{x_i, h_i, y_i {~\mathrel{\Big\vert}~}1\leq i\leq \ell}\right\}\) subject to at least the following relations:
Recall that differences of simple roots are never roots, since the coefficients have mixed signs. Since \(\alpha_i - \alpha_j \not\in \Phi\), we have \([x_i y_j] = 0\) for \(i\neq j\) since it would have to be in \(L_{\alpha_i - \alpha_j}\). Consider the \(\alpha_i\) root string through \(\alpha_j\): we have \(r=0\) from above, and the string is \begin{align*} \alpha_j, \alpha_j+ \alpha_i, \cdots, \alpha_j - {\left\langle {\alpha_j},~{\alpha_i} \right\rangle} \alpha_i \end{align*} since \(L_\beta = 0\) for \(\beta \mathrel{\vcenter{:}}=\alpha_j + \qty{ 1 - {\left\langle {\alpha_j},~{ \alpha_i} \right\rangle}} \alpha_i\). The relations for \(S_{ij}^\pm\) follow similarly.
Note that these relations are all described in a way that only involves the Cartan matrix of \(\Phi\), noting that changing bases only permutes its rows and columns.
These five relations form a complete set of defining relations for \(L\), i.e. \(L \cong L(X)/R\) where \(R\) is the ideal generated by the Serre relations above. Moreover, given a root system \(\Phi\) and a Cartan matrix, one can define a Lie algebra using these generators and relations that is finite-dimensional, simple, and has root system \(\Phi\).
Fix an irreducible root system \(\Phi\) of rank \(\ell\) with Cartan matrix \(A\). Let \(\widehat{L} \mathrel{\vcenter{:}}= L(\widehat{X})\) where \(\widehat{X} \mathrel{\vcenter{:}}=\left\{{\widehat{x}_i, \widehat{h}_i, \widehat{y}_i {~\mathrel{\Big\vert}~}1\leq i\leq \ell}\right\}\). Let \(\widehat{K} {~\trianglelefteq~}\widehat{L}\) be the 2-sided ideal generated by the relations S1, S2, S3. Let \(L_0 \mathrel{\vcenter{:}}=\widehat{L}/\widehat{K}\) and write \(\pi\) for the quotient map \(\widehat{L}\to L_0\) – note that \(L_0\) turns out to be infinite-dimensional when \(\ell\geq 2\), although at this point it’s not even clear that \(L_0\neq 0\). We’ll study \(L_0\) by defining a representation of it, modeled on the adjoint action of \(L_0\) on the subalgebra generated by the \(y_i\).
Recall that a representation of \(M\in {\mathsf{Lie}{\hbox{-}} \mathsf{Alg}}\) is a morphism \(\phi\in {\mathsf{Lie}{\hbox{-}} \mathsf{Alg}}(M, {\mathfrak{gl}}(V))\) for \(V\in {}_{ { \mathbf{F} }}{\mathsf{Mod}}\). This yields a diagram
Conversely, given an algebra morphism \(\tilde \phi: U(M) \to { \operatorname{End} }(V)\), restricting \(\tilde \phi\) to \(M \subseteq U(M)\) gives a Lie algebra morphism \(\phi: M\to {\mathfrak{gl}}(V)\). Thus representations of \(M\) (using \({\mathfrak{gl}}(V)\)) correspond to associative algebra representations of \(U(M)\) (using \({ \operatorname{End} }(V)\)). Since \(U(\widehat{L}) = T(V(\widehat{X}))\), using the various universal properties, having a representation \(V\) of \(\widehat{L}\) is equivalent to having a set map \(\widehat{X}\to { \operatorname{End} }(V)\), i.e. elements of \(\widehat{X}\) should act linearly on \(V\).
Let \(V\) be the tensor algebra on a vector space with basis \(\left\{{v_1,\cdots, v_\ell}\right\}\), thinking of each \(v_i\) being associated to \(\widehat{y}_i\). Write \(v_1 v_2\cdots v_t \mathrel{\vcenter{:}}= v_1 \otimes v_2\otimes\cdots\otimes v_t\), and define elements of \({ \operatorname{End} }(V)\) by
Last time: constructing a semisimple Lie algebra that has a given root system. Setup:
\(\widehat{K} \subseteq \ker \widehat{\phi}\), so \(\widehat{\phi}\) induces a representation \(\phi: L_0 \to {\mathfrak{gl}}(V)\) of \(L_0\) on \(V\)
Straightforward but tedious checking of all relations, e.g. \begin{align*} \widehat{\phi}(\widehat{h}_i) \circ \widehat{\phi}(\widehat{x}_j) - \widehat{\phi}(\widehat{x}_j )\widehat{\phi}(\widehat{h}_i) = {\left\langle { \alpha_j},~{\alpha_i} \right\rangle} \widehat{\phi}(\widehat{x}_j) .\end{align*}
In \(L_0\), the \(h_i\) form a basis for an \(\ell{\hbox{-}}\)dimensional abelian subalgebra \(H\) of \(L_0\), and moreover \(L_0 = Y \oplus H \oplus X\) where \(Y, X\) are the subalgebras generated by the \(y_i\) and the \(x_i\) respectively.
Steps 1 and 2: Claim: \(\pi(\widehat{H}) = H\) is \(\ell{\hbox{-}}\)dimensional.
Clearly the \(\widehat{h}_i\) span an \(\ell{\hbox{-}}\)dimensional subspace \(\widehat{H}\) of \(\widehat{L}\), so we need to show that \(\pi\) restricts to an isomorphism \(\pi: \widehat{H} { \, \xrightarrow{\sim}\, }H\). Suppose \(\widehat{h} \mathrel{\vcenter{:}}=\sum_{j=1}^\ell c_j \widehat{h}_j \in \ker \pi\), so \(\widehat{\phi }(\widehat{h}) = 0\). Thus \begin{align*} 0 = \widehat{h} \cdot v_i = \sum_{j} c_j \widehat{h}_j \cdot v_i = - \qty{\sum_j c_j {\left\langle { \alpha_i},~{\alpha_j} \right\rangle}} v_i = -\qty{ \sum_j a_{ij} c_j } v_i \qquad \forall i ,\end{align*} so \(A\mathbf{c} = 0\) where \(A\) is the Cartan matrix, and so \(\mathbf{c} = \mathbf{0}\) since \(A\) is invertible (since it was essentially a Gram matrix).
Step 3: Now consider \(\sum { \mathbf{F} }x_i + \sum { \mathbf{F} }h_i + \sum { \mathbf{F} }y_i \xrightarrow{\pi} L_0\); the claim is that this maps isomorphically onto its image in \(L_0\). S2 and S3 show that for each \(i\), \({ \mathbf{F} }x_i + { \mathbf{F} }h_i + { \mathbf{F} }y_i\) is a homomorphic image of \({\mathfrak{sl}}_2\), which is simple if \(\operatorname{ch}{ \mathbf{F} }\neq 2\). Note \(\pi( \widehat{h}_i) = h_i \neq 0\) in \(L_0\) by (1), so this subspace of \(L_0\) is isomorphic to \({\mathfrak{sl}}_2({ \mathbf{F} })\). In particular \(\left\{{x_i, h_i, y_i}\right\}\) is linearly independent in \(L_0\) for each fixed \(i\). Supposing \(0 = \sum_{j=1}^\ell (a_j x_j + b_j h_j + c_j y_j)\), apply \({ \operatorname{ad}}_{L_0, h_i}\) for each \(i\) to obtain \begin{align*} 0 = \sum_{j=1}^\ell \qty{ a_j{\left\langle {\alpha_j},~{\alpha_i} \right\rangle} x_j + b_j 0 - c_j {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} y_j } = \sum_{j=1}^\ell {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} (a_j x_j - c_j y_j) ,\end{align*} and by invertibility of \(A\) we have \(a_j x_j - c_j y_j = 0\) for each \(j\). So \(a_j = c_j = 0\) for all \(j\), and then \(\sum b_j h_j = 0\) implies \(b_j = 0\) for all \(j\) by (1).
Step 4: \(H = \sum_{j=1}^\ell { \mathbf{F} }h_j\) is an \(\ell{\hbox{-}}\)dimensional abelian subalgebra of \(L_0\) by (1) and S1.
Step 5: Write \([x_{i_1}\cdots x_{i_t}] \mathrel{\vcenter{:}}=[x_{i_1} [ x_{i_2} [ \cdots [x_{i_{t-1}} x_{i_t}] \cdots ]] \in X\) for an iterated bracket, taken by convention to be bracketing from the right. We have \begin{align*} { \operatorname{ad}}_{L_0, h_j}([x_{i_1} \cdots x_{i_t}] ) = \qty{ {\left\langle { \alpha_{i_1} },~{ \alpha_j } \right\rangle} + \cdots + {\left\langle { \alpha_{i_t} },~{ \alpha_j } \right\rangle} } [x_{i_1} \cdots x_{i_t}] \qquad t\geq 1 ,\end{align*} and similarly for \([y_{i_1}\cdots y_{i_t}]\).
Step 6: For \(t\geq 2\), \([y_j [ x_{i_1} \cdots x_{i_t} ] ] \in X\), and similarly with the roles of \(x_i, y_i\) reversed. This follows from the fact that \({ \operatorname{ad}}_{L_0, y_j}\) acts by derivations, and using S2 and S3.
Step 7: It follows from steps 4 through 6 that \(Y+H+X\) is a subalgebra of \(L_0\). One shows that \([[ x_{i_1} \cdots x_{i_s}], [y_{j_1} \cdots y_{j_t} ]] \in Y + H + X\), which comes down to the Jacobi identity and induction on \(s+t\). E.g. \begin{align*} [ [x_1 x_2], [y_3 y_4] ] = [x_1[ x_2 [y_3 y_4 ] ] ] - [x_2 [x_1 [y_3 y_4]]] \in [x_1, { \mathbf{F} }y_3 + { \mathbf{F} }y_4] + \cdots \in H + \cdots ,\end{align*} which lands in \(H\) since there are as many \(x_i\) as \(y_i\), whereas if there are more \(x_i\) than \(y_i\) this lands in \(X\), and so on. Since \(Y+H+X\) is a subalgebra that contains the generators \(x_i, h_i, y_i\) of \(L_0\), it must be equal to \(L_0\).
Step 8: The decomposition \(L_0 = X + H + Y\) is a direct sum decomposition of \(L_0\) into submodules for the adjoint action of \(H\). Use the computation in the previous step to see that every element of \(X\) is a linear combination of elements \([x_{i_1}\cdots x_{i_t}]\), and similarly for \(Y\). These are eigenvectors for the action \({ \operatorname{ad}}_H \curvearrowright L_0\) by (5), and the eigenvalues on \(X\) have the form \(\lambda = \sum_{i=1}^\ell c_i \alpha_i\) with \(c_i \in {\mathbf{Z}}_{\geq 0}\). Such a \(\lambda\) is referred to as a weight, and \(c_i\) is the number of times \(i\) appears as an index in \(i_1,\cdots, i_t\). So every weight space \(X_\lambda\) is finite-dimensional, and the weights of \(Y\) are the \(-\lambda\). Since the weights appearing in \(X, H, Y\) are all different, their intersections must be trivial and the sum is direct.
\(L_0 = Y \oplus H \oplus X\) is known as the triangular decomposition: by analogy with \({\mathfrak{gl}}_n\), the \(x_i\) sit on the superdiagonal and bracket to upper-triangular elements, and the \(y_i\) are their transposes.
Progress so far: we start with the data of an irreducible root system \(\Phi \supseteq\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\) and Cartan matrix \(A = ({\left\langle { \alpha_i},~{\alpha_j} \right\rangle})\) and Weyl group \(W\). We set \(L_0 \mathrel{\vcenter{:}}={\left\langle{x_i, y_i, h_i {~\mathrel{\Big\vert}~}1\leq i\leq \ell}\right\rangle \over \left\langle{\text{S1, S2, S3}}\right\rangle} = Y \oplus H \oplus X\). Letting \(h\in H\) act by \({ \operatorname{ad}}_h\), we get weight spaces \((L_0)_{ \lambda} \mathrel{\vcenter{:}}=\left\{{v\in L_0 {~\mathrel{\Big\vert}~}[hv] = \lambda(h) v \, \forall h\in H}\right\}\).
For \(i\neq j\), set \begin{align*} y_{ij} \mathrel{\vcenter{:}}=( { \operatorname{ad}}_{y_i})^{- {\left\langle {\alpha_j},~{ \alpha_i} \right\rangle} + 1}(y_j) ,\end{align*} and similarly for \(x_{ij}\). Recall \(\alpha_i(h_j) \mathrel{\vcenter{:}}={\left\langle { \alpha_i },~{ \alpha_j } \right\rangle}\).
\begin{align*} { \operatorname{ad}}_{x_k}(y_{ij}) = 0 \qquad \forall i\neq j, \forall k .\end{align*}
Case 1: \(k\neq i\).
In this case, \([x_k y_i] = 0\) and thus \begin{align*} ( { \operatorname{ad}}_{x_k}) ( { \operatorname{ad}}_{y_i})^{- {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} +1 }(y_j) = ( { \operatorname{ad}}_{y_i})^{- {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} +1 } ( { \operatorname{ad}}_{x_k})(y_j) .\end{align*} If \(k\neq j\) then \(( { \operatorname{ad}}_{x_k})(y_j) = 0\) and we're done. If \(k = j\), the right-hand side becomes \(( { \operatorname{ad}}_{y_i})^{m+1}(h_j)\) with \(m \mathrel{\vcenter{:}}= - {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle}\), which also vanishes: if \(m = 0\) then \([y_i h_j] = {\left\langle { \alpha_i },~{ \alpha_j } \right\rangle} y_i = 0\), and if \(m\geq 1\) then two applications of \({ \operatorname{ad}}_{y_i}\) already kill \(h_j\), since \([y_i [y_i h_j]]\) is a multiple of \([y_i y_i] = 0\).
Case 2: \(k=i\).
In this case, we saw that for any fixed \(i\), \(\left\{{x_i, h_i, y_i}\right\}\) spans a standard \({\mathfrak{sl}}_2\) triple in \(L_0\), so consider the \({\mathfrak{sl}}_2{\hbox{-}}\)submodule \(Y_j \leq L_0\) generated by \(y_j\). Since \(i\neq j\), we know \([x_i y_j] = 0\), so \(y_j\) is a maximal vector for \(Y_j\) of weight \(m \mathrel{\vcenter{:}}=- {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle}\).
One can show by induction on \(t\) that the following formula holds: \begin{align*} ( { \operatorname{ad}}_{x_i}) ( { \operatorname{ad}}_{y_i})^t (y_j) = t (m-t+1) ( { \operatorname{ad}}_{y_i})^{t-1} (y_j) \qquad t\geq 1 .\end{align*} So in particular \(( { \operatorname{ad}}_{x_i}) ( { \operatorname{ad}}_{y_i})^{m+1}(y_j) = 0\); since \(( { \operatorname{ad}}_{y_i})^{m+1}(y_j) = y_{ij}\), this says \({ \operatorname{ad}}_{x_i}(y_{ij}) = 0\).
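The base case \(t = 1\) is a direct computation using the Jacobi identity, S3, and \([x_i y_j] = 0\) for \(i\neq j\): \begin{align*} ( { \operatorname{ad}}_{x_i}) ( { \operatorname{ad}}_{y_i})(y_j) = [[x_i y_i] y_j] + [y_i [x_i y_j]] = [h_i y_j] = - {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} y_j = 1\cdot (m - 1 + 1)\, y_j .\end{align*}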
An endomorphism \(x\in { \operatorname{End} }(V)\) is locally nilpotent if for all \(v\in V\) there exists some \(n\) depending on \(v\) such that \(x^n \cdot v = 0\). If \(x\) is locally nilpotent, then define the exponential as \begin{align*} \exp(x) = \sum_{k\geq 0} {1\over k!} x^k = 1 + x + {1\over 2}x^2 + \cdots \qquad \in { \operatorname{End} }(V) ,\end{align*} which is in fact an automorphism of \(V\) since its inverse is \(\exp(-x)\).
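For instance, in \({\mathfrak{sl}}_2({\mathbf{C}})\) with standard basis \(\left\{{x, h, y}\right\}\) where \([xy] = h, [hx] = 2x, [hy] = -2y\), the operator \({ \operatorname{ad}}_x\) is (globally) nilpotent with \(( { \operatorname{ad}}_x)^3 = 0\), and \begin{align*} \exp( { \operatorname{ad}}_x)(y) = y + [xy] + {1\over 2}[x[xy]] = y + h + {1\over 2}(-2x) = y + h - x .\end{align*}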
Suppose \({ \operatorname{ad}}_{x_i}, { \operatorname{ad}}_{y_i}\) are locally nilpotent on \(L_0\) and define \begin{align*} \tau_i \mathrel{\vcenter{:}}=\exp( { \operatorname{ad}}_{x_i})\circ \exp( { \operatorname{ad}}_{-y_i}) \circ \exp( { \operatorname{ad}}_{x_i}) .\end{align*} Then \(\tau_i((L_0)_\lambda) = (L_0)_{s_i (\lambda)}\) where \(s_i \mathrel{\vcenter{:}}= s_{ \alpha_i} \in W\) for \(\alpha_i\in \Phi\). Here \(\lambda\in H {}^{ \vee }\cong {\mathbf{C}}\left\langle{ \alpha_1, \cdots, \alpha_\ell}\right\rangle\) since \(H = {\mathbf{C}}\left\langle{h_1,\cdots, h_{\ell}}\right\rangle\), using that \(A\) is invertible. We use the formula \(s_{\alpha_i}(\alpha_j) = \alpha_j - {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} \alpha_i\), extended linearly to \(H {}^{ \vee }\) as done previously.
Omitted. See \(\S 14.3\) and \(\S 2.3\) for a very similar calculation.
The Lie algebra \(L\) generated by the \(3\ell\) elements \(\left\{{x_i, h_i, y_i}\right\}_{1\leq i\leq \ell}\) subject to relations S1-S3 and the remaining two relations \(S_{ij}^{\pm}\) (which hold in any finite-dimensional semisimple Lie algebra) is a finite-dimensional semisimple Lie algebra with maximal torus spanned by \(\left\{{h_i}\right\}_{1\leq i\leq \ell}\) and with corresponding root system \(\Phi\).
By definition, \(L \mathrel{\vcenter{:}}= L_0/K\) where \(K{~\trianglelefteq~}L_0\) is generated by the elements \(x_{ij}, y_{ij}\) where \(i\neq j\). Recall that \(X, Y\leq L_0\) are the subalgebras generated by the \(x_i\) and \(y_i\) respectively, so let \(I\) (resp. \(J\)) be the ideal in \(X\) (resp. \(Y\)) generated by the \(x_{ij}\) (resp. \(y_{ij}\)) for \(i\neq j\). Clearly \(I, J \subseteq K\).
\begin{align*} I, J {~\trianglelefteq~}L_0 .\end{align*}
We’ll prove this for \(J\), and \(I\) is similar. Note \(J{~\trianglelefteq~}Y\) and write \(J = \left\langle{y_{ij} {~\mathrel{\Big\vert}~}i\neq j}\right\rangle\). Fix \(1\leq k\leq \ell\), then \(( { \operatorname{ad}}_{y_k}) (y_{ij}) \in J\) by definition. Recall \(y_{ij} = ( { \operatorname{ad}}_{y_i})^{- {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle} +1 }(y_j)\). Note \(( { \operatorname{ad}}_{h_k})(y_{ij}) = c_{ijk} y_{ij}\) for some constant \(c_{ijk} \in {\mathbf{Z}}\), and \(( { \operatorname{ad}}_{x_k})(y_{ij}) = 0\) by lemma A above. Since \(x_k, h_k, y_k\) generate \(L_0\), we have \([L_0, y_{ij}] \subseteq J\). Using the Jacobi identity and that \({ \operatorname{ad}}_z\) is a Lie algebra derivation for \(z\in L_0\), it follows that \([L_0, J] \subseteq J\).
This essentially follows from \([h_k, Y] \subseteq Y\) and \([x_k, Y] \subseteq H + Y\); bracketing these against \(y_{ij}\) lands in \(J\).
\begin{align*} K = I + J .\end{align*}
We have \(I+J \subseteq K\), but \(I+J {~\trianglelefteq~}L_0\) by claim 1 and it contains the generators of \(K\) – since \(K\) is the smallest such ideal, \(K \subseteq I+J\).
We have a decomposition \(L_0 = Y \oplus H \oplus X\) as modules under \({ \operatorname{ad}}_H\), and \(K = J \oplus 0 \oplus I\). Taking the quotient yields \(L \mathrel{\vcenter{:}}= L_0/K = Y/J \oplus H \oplus X/I \mathrel{\vcenter{:}}= N^- \oplus H \oplus N^+\).
As in the proof last time, \(\left\{{x_i, h_i, y_i}\right\} \subseteq L\) spans a copy of \({\mathfrak{sl}}_2\). We deduce that \(\sum_{1\leq i \leq \ell} { \mathbf{F} }x_i + { \mathbf{F} }h_i + { \mathbf{F} }y_i \subseteq L_0\) maps isomorphically into \(L\), so we can identify \(x_i, h_i, y_i\) with their images in \(L\), which are still linearly independent and still generate \(L\) as a Lie algebra.
For \(\lambda\in H {}^{ \vee }\), set \(L_ \lambda\mathrel{\vcenter{:}}=\left\{{z\in L {~\mathrel{\Big\vert}~}[hz] = \lambda(h) z\, \forall h\in H}\right\}\) and write \(\lambda > 0 \iff \lambda \in {\mathbf{Z}}_{\geq 0} \Delta\) and similarly define \(\lambda < 0\). View \(\alpha_i \in H {}^{ \vee }\), extended linearly as before. Note \(H = L_0\) (the zero weight space), \(N^+ = \sum_{\lambda> 0} L_{\lambda}\), \(N^- = \sum_{\lambda<0} L_ \lambda\), and thus \begin{align*} L = N^- \oplus H \oplus N^+ ,\end{align*} which is a direct sum since the eigenvalues in different parts are distinct.
Recall that we have \begin{align*} L = N^- \oplus H \oplus N^+ \mathrel{\vcenter{:}}= Y/\left\langle{(S_{ij}^-)}\right\rangle \oplus {\mathbf{C}}\left\langle{h_1,\cdots, h_\ell}\right\rangle \oplus X/\left\langle{ (S_{ij}^+) }\right\rangle .\end{align*}
For \(1\leq i\leq \ell\), note \({ \operatorname{ad}}_{L, x_i}\) (and similarly \({ \operatorname{ad}}_{L, y_i}\)) is locally nilpotent on \(L\). Let \(M \subseteq L\) be the subspace of elements on which \({ \operatorname{ad}}_{x_i}\) acts nilpotently. By the Leibniz rule, \(( { \operatorname{ad}}_{x_i})^{m+n}([uv]) = 0\) when \(( { \operatorname{ad}}_{x_i})^m(v) = 0\) and \(( { \operatorname{ad}}_{x_i})^n(u) = 0\), so \(M \leq L\) is a Lie subalgebra. By the Serre relations, \(( { \operatorname{ad}}_{x_i})^2(h_j) = 0\) and \(( { \operatorname{ad}}_{x_i})^3(y_j) = 0\), so the generators of \(L\) are in \(M\) and thus \(L = M\).
Defining \(\tau_i \mathrel{\vcenter{:}}=\exp( { \operatorname{ad}}_{x_i}) \circ \exp( { \operatorname{ad}}_{-y_i}) \circ \exp( { \operatorname{ad}}_{x_i}) \in \mathop{\mathrm{Aut}}(L)\), by lemma (B) we have \(\tau_i(L_ \lambda) = L_{s_i \lambda}\) where \(s_i \mathrel{\vcenter{:}}= s_{\alpha_i}\) and \(\lambda\in H {}^{ \vee }\).
Let \(\lambda, \mu \in H {}^{ \vee }\) and suppose \(w \lambda = \mu\); we want to show \(\dim L_ \lambda = \dim L_{ \mu}\). Note \(W\) is generated by simple reflections \(s_i\), so it STS this when \(w=s_i\), whence it follows from lemma (B).
Clearly \(\dim (L_0)_{\alpha_i} = 1\) since it’s spanned by \(x_i\), and \(\dim (L_0)_{k \alpha_i} = 0\) for \(k\neq 0, \pm 1\), so \(\dim L_{\alpha_i} \leq 1\) and \(\dim L_{k \alpha_i} = 0\) for \(k\neq 0, \pm 1\). Since \(x_i\in L_{\alpha_i}\) has a nonzero image in \(L\), \(\dim L_{\alpha_i} = 1\).
If \(\beta\in \Phi\), conjugate it to a simple root using \(\beta = w \alpha_i\) with \(w\in W, \alpha_i \in \Delta\). By step 8, \(\dim L_{ \beta} = 1\) and \(L_{k \beta} = 0\) for \(k\neq 0, \pm 1\).
Suppose \(L_ \lambda \neq 0\) where \(\lambda\neq 0\). Then \(\lambda\in {\mathbf{Z}}_{\geq 0} \Delta\) or \(\lambda\in {\mathbf{Z}}_{\leq 0} \Delta\), i.e. all coefficients have the same sign. Suppose \(\lambda\not\in \Phi\); then \(\lambda\in {\mathbf{Z}}\Phi\) by (10), and exercise 10.10 yields some \(w\in W\) such that \(w \lambda\in {\mathbf{Z}}\Delta\) has both positive and negative coefficients. Thus \(w \lambda\) cannot be a weight, and by step 8, \(0 = \dim L_{w \lambda} = \dim L_ \lambda\).
Writing \(L = N^- \oplus H \oplus N^+\) with \(H = L_0, N^+ = \sum_{ \lambda > 0} L_{ \lambda} = \sum_{\beta\in \Phi^+} L_{ \beta}\) and \(N^- = \sum_{ \lambda < 0} L_{ \lambda} = \sum_{ \beta\in \Phi^-} L_ \beta\), by step 10 we can conclude \(\dim L = \ell + {\sharp}\Phi < \infty\). This shows that \(H\) is toral, i.e. its elements are ad-semisimple.
We have that \(L\) is a finite-dimensional Lie algebra. To show semisimplicity, we need to know \(L\) has no nonzero solvable ideals, and as before it’s ETS \(L\) has no nonzero abelian ideals. Suppose \(A \subseteq L\) is an abelian ideal; we WTS \(A = 0\). Since \([H,A] \subseteq A\) and \(H\curvearrowright L\) diagonally, \(H\curvearrowright A\) diagonally as well and thus \begin{align*} A = (A \cap H) \oplus \bigoplus _{\alpha\in \Phi} (A \cap L_{ \alpha}) .\end{align*} If \(A \cap L_{\alpha} \neq 0\) then \(A \cap L_{ \alpha} = L_{\alpha}\), which is 1-dimensional.
Note: the argument in Humphreys here may not be quite right, so we have to do something different.
Now \(\exists w\in W\) such that \(w \alpha = \alpha_i \in \Delta\) as in step 8, so write \(w = s_{i_1}\cdots s_{i_t}\) and set \(\tau \mathrel{\vcenter{:}}=\tau_{i_1}\cdots \tau_{i_t}\), so that \(\tau(L_ \alpha) = L_{w \alpha} = L_{\alpha_i}\). Set \(A' \mathrel{\vcenter{:}}=\tau(A)\); then \(A'{~\trianglelefteq~}L\) is again an abelian ideal and \(A' \cap L_{\alpha_i} = L_{\alpha_i}\).
So we can replace \(\alpha\) by a simple root \(\alpha_i\) and \(A\) by \(A'\). Then \(x_i\in L_{\alpha_i} \subseteq A' {~\trianglelefteq~}L\), so \(A'\ni -[y_i, x_i] = h_i\); but then \([h_i, x_i] = 2x_i \neq 0\) contradicts that \(A'\) is abelian. \(\contradiction\)
In the remaining case \(A \subseteq H\), take \(A' \mathrel{\vcenter{:}}= A\). Note \([A', L_{\alpha_j}] \subseteq A' \subseteq H\) and \([A', L_{\alpha_j}] \subseteq L_{\alpha_j}\), so \([A', L_{\alpha_j}] = 0\); but \(H\curvearrowright L_{\alpha_j}\) with eigenvalue \(\alpha_j\), and thus \(A' \subseteq \bigcap_{j=1}^\ell \ker \alpha_j = 0\) since the \(\alpha_j\) span \(H {}^{ \vee }\). So \(A' = 0\) and \(L\) is semisimple.
Since \(L = H \oplus \bigoplus _{\alpha\in \Phi} L_{ \alpha}\), it’s easy to check that \(C_L(H) = H\) by considering what happens when bracketing against any nonzero element in \(\bigoplus L_ \alpha\). Thus \(H\) is a maximal toral subalgebra with corresponding root system \(\Phi\).
Next: part VI on representation theory, although we’ll first cover \(\S 13\) on weights, especially \(\S 13.1, \S 13.2\). Goal: Weyl’s character formula.
Let \({\mathbb{E}}\supseteq\Phi \supseteq\Delta\) with Weyl group \(W\). An element \(\lambda\in {\mathbb{E}}\) is an integral weight if \({\left\langle {\lambda },~{\beta } \right\rangle}= (\lambda, \beta {}^{ \vee })\in {\mathbf{Z}}\) for all \(\beta \in \Phi\), where \(\beta {}^{ \vee }\mathrel{\vcenter{:}}={2\beta \over (\beta, \beta)}\). We write the set of all weights as \(\Lambda\), and write \(\Lambda_r \mathrel{\vcenter{:}}={\mathbf{Z}}\Phi\) for the root lattice.
Recall \(\Delta {}^{ \vee }\mathrel{\vcenter{:}}=\left\{{\alpha {}^{ \vee }{~\mathrel{\Big\vert}~}\alpha\in \Delta}\right\}\) is a base for \(\Phi {}^{ \vee }= \left\{{ \beta {}^{ \vee }{~\mathrel{\Big\vert}~}\beta \in \Phi }\right\}\), and so \begin{align*} \lambda \in \Lambda\iff (\lambda, \alpha {}^{ \vee }) = {\left\langle {\lambda },~{\alpha } \right\rangle}\in {\mathbf{Z}}\forall \alpha \in \Delta .\end{align*}
A weight \(\lambda\in \Lambda\) is dominant iff \({\left\langle {\lambda },~{\alpha } \right\rangle}\geq 0\) for all \(\alpha\in \Delta\), and we denote the set of all such dominant weights \(\Lambda^+\). The weight \(\lambda\) is strongly dominant if \({\left\langle {\lambda },~{\alpha } \right\rangle}> 0\) for all \(\alpha \in \Delta\). Writing \(\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\), let \(\left\{{ \lambda_i }\right\}_{1\leq i \leq \ell}\) be the dual basis for \({\mathbb{E}}\) relative to \({\left\langle {{-}},~{{-}} \right\rangle}\), so \({\left\langle { \lambda_i},~{\alpha_j} \right\rangle} = \delta_{ij}\). The \(\lambda_i\) are referred to as the fundamental dominant weights, written \(\lambda_i = \omega_i = \varpi_i\).
If \(\lambda \in \Lambda\) then one can write \(\lambda = \sum_{i=1}^\ell m_i \lambda_i\) where \(m_i \mathrel{\vcenter{:}}={\left\langle { \lambda},~{\alpha_i} \right\rangle}\), so \(\Lambda\) is a \({\mathbf{Z}}{\hbox{-}}\)lattice with lattice basis \(\left\{{\lambda_i}\right\}_{1\leq i\leq \ell}\) containing the root lattice as a sublattice, and in fact \(\Lambda_r = {\mathbf{Z}}\Delta\). Writing the Cartan matrix as \(A = ({\left\langle { \alpha_i},~{\alpha_j} \right\rangle})\), we have \(\alpha_i = \sum_{j=1}^\ell {\left\langle {\alpha_i},~{\alpha_j} \right\rangle} \lambda_j\) coming from the \(i\)th row of \(A\). So this matrix expresses how to write the simple roots in terms of the fundamental dominant weights, and inverting it allows writing the fundamental weights in terms of the simple roots.
The entries of \(A^{-1}\) are all nonnegative rational numbers, so each fundamental dominant weight is a nonnegative rational linear combination of simple roots.
For \(A_3\) one has \(A = { \begin{bmatrix} {2} & {-1} & {0} \\ {-1} & {2} & {-1} \\ {0} & {-1} & {2} \end{bmatrix} }\), so \begin{align*} \alpha_1 &= 2 \lambda_1 - \lambda_2 \\ \alpha_2 &= - \lambda_1 + 2 \lambda_2 - \lambda_3 \\ \alpha_3 &= - \lambda_2 + 2 \lambda_3 .\end{align*}
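Inverting \(A\) recovers the fundamental dominant weights in terms of the simple roots: one can check directly that \begin{align*} A^{-1} = {1\over 4} { \begin{bmatrix} {3} & {2} & {1} \\ {2} & {4} & {2} \\ {1} & {2} & {3} \end{bmatrix} } \implies \lambda_1 = {1\over 4}\qty{3 \alpha_1 + 2 \alpha_2 + \alpha_3}, \quad \lambda_2 = {1\over 2}\qty{ \alpha_1 + 2 \alpha_2 + \alpha_3}, \quad \lambda_3 = {1\over 4}\qty{ \alpha_1 + 2 \alpha_2 + 3 \alpha_3} ,\end{align*} illustrating the remark above: every entry of \(A^{-1}\) is a nonnegative rational.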
The quotient \(\Lambda/ \Lambda_r\) is called the fundamental group of \(\Phi\), and the index \(f\mathrel{\vcenter{:}}=[\Lambda: \Lambda_r]\) is called its index of connection.
The index is generally small: \(f = \ell + 1\) in type \(A_\ell\), \(f = 2\) in types \(B_\ell, C_\ell, E_7\), \(f = 4\) in type \(D_\ell\), \(f = 3\) in type \(E_6\), and \(f = 1\) in types \(E_8, F_4, G_2\).
Note that \begin{align*} s_i \lambda_j = \lambda_j - {\left\langle { \lambda_j},~{\alpha_i} \right\rangle} \alpha_i = \lambda_j - \delta_{ij} \alpha_i ,\end{align*} so \(\Lambda\) is invariant under \(W\). In fact, any sublattice of \(\Lambda\) containing \(\Lambda_r\) is \(W{\hbox{-}}\)invariant, since \(s_i \mu = \mu - {\left\langle {\mu},~{\alpha_i} \right\rangle} \alpha_i \in \mu + \Lambda_r\) for any \(\mu\in \Lambda\).
Each integral weight is \(W{\hbox{-}}\)conjugate to exactly one dominant weight. If \(\lambda\) is dominant, then \(w \lambda\leq \lambda\) for all \(w\in W\), and if \(\lambda\) is strongly dominant then \(w \lambda= \lambda\iff w=1\).
Most of this follows from Theorem 10.3, exercise 10.14, lemma 10.3B, and corollary 10.2C, along with induction on \(\ell(w)\). We’ll omit the details.
The ordering \(\leq\) on \(\Lambda\) is not well-behaved with respect to dominant weights, i.e. one can have \(\mu \leq \lambda\) with \(\mu\in \Lambda^+\) dominant but \(\lambda\not\in \Lambda^+\) not dominant.
Let \(\Phi\) be indecomposable of rank 2 (e.g. type \(A_2\)) with simple roots \(\alpha, \beta\); then \(0\in \Lambda^+\) is dominant, but \(0 < \alpha \in \Delta\) is not dominant: \((\alpha,\beta) < 0 \implies {\left\langle {\alpha },~{\beta} \right\rangle}< 0\).
Let \(\lambda \in \Lambda^+\) be dominant, then the number of dominant \(\mu \in \Lambda^+\) with \(\mu\leq \lambda\) is finite.
Let \(\lambda, \mu\in \Lambda^+\) with \(\mu \leq \lambda\), and write \(\lambda - \mu\) as a nonnegative integer linear combination of simple roots. Since \(\lambda + \mu\) is dominant and \(\lambda - \mu \in {\mathbf{Z}}_{\geq 0}\Delta\), \begin{align*} 0 \leq (\lambda+ \mu, \lambda- \mu ) = (\lambda, \lambda ) - (\mu, \mu) = {\left\lVert { \lambda} \right\rVert}^2 - {\left\lVert {\mu} \right\rVert}^2 ,\end{align*} so \(\mu\) lies in the compact set of vectors of length at most \({\left\lVert {\lambda} \right\rVert}\) and also in the discrete set \(\Lambda^+\). The intersection of a compact set and a discrete set is always finite.
\begin{align*} \rho \mathrel{\vcenter{:}}={1\over 2} \sum_{\alpha\in \Phi^+} \alpha .\end{align*}
This section shows \(\rho = \sum_{i = 1}^\ell \lambda_i\) and \({\left\lVert {\lambda+ \rho} \right\rVert}^2 \geq{\left\lVert {w \lambda+ \rho} \right\rVert}^2\) when \(\lambda\) is the unique dominant weight in the orbit \(W \lambda\).
This section will be used later to analyze the set of weights of a finite-dimensional module for a semisimple Lie algebra over \({\mathbf{C}}\).
Let \(L\) be finite-dimensional semisimple over \({\mathbf{C}}\) containing a maximal toral subalgebra \(H\). This corresponds to \(\Phi \supseteq\Delta\) with Weyl group \(W\) and \(\Phi \subseteq {\mathbb{E}}= {\mathbf{R}}\Phi\).
Let \(V\) be a finite-dimensional \(L{\hbox{-}}\)module. By corollary 6.4, \(H\curvearrowright V\) semisimply (diagonally) and we can simultaneously diagonalize to get a decomposition \begin{align*} V= \bigoplus _{\lambda\in H {}^{ \vee }} V_{\lambda}, \qquad V_{ \lambda} \mathrel{\vcenter{:}}=\left\{{v\in V {~\mathrel{\Big\vert}~}h.v = \lambda(h) v\,\,\forall h\in H}\right\} .\end{align*} If \(V_{\lambda}\neq 0\) then \(\lambda\) is a weight.
If \(\phi = { \operatorname{ad}}\) and \(V=L\), then \(L = H \oplus \bigoplus _{\alpha\in \Phi} L_{\alpha}\) where \(H = L_0\).
If \(\dim V = \infty\), \(V_{ \lambda}\) still makes sense, but \(V\) may no longer decompose as a direct sum of its weight spaces. E.g. take \(V = {\mathcal{U}}(L)\) and the left regular representation given by left-multiplication in the algebra \({\mathcal{U}}(L) \curvearrowright{\mathcal{U}}(L)\). This restricts to \(L \subseteq {\mathcal{U}}(L)\) acting on \({\mathcal{U}}(L)\), the regular action of \(L\) on \({\mathcal{U}}(L)\). Note that there are no eigenvectors: writing elements in a PBW basis with monomials \(\prod h_i^{n_i} \cdot \prod_{\alpha\in \Phi} x_{\alpha}^{n_ \alpha}\), left multiplication strictly increases monomial degrees, and thus there are no eigenspaces. So \(V_ \lambda= 0\) for all \(\lambda\), i.e. there are no weight spaces at all.
Let \(L\in \mathsf{Lie} \mathsf{Alg}^{{\mathrm{fd}}, {\mathrm{ss}}}\) containing \(H\) with \(\Phi, \Delta, W\) as usual. Recall that \(V\in {}_{L}{\mathsf{Mod}}^{\mathrm{fd}} \implies V = \bigoplus _{\lambda\in H {}^{ \vee }} V_{ \lambda}\) where \(V_{ \lambda} \mathrel{\vcenter{:}}=\left\{{v\in V{~\mathrel{\Big\vert}~}h.v = \lambda(h)v\, \forall h\in H}\right\}\), which we call a weight space when \(V_{ \lambda}\neq 0\). Note that if \(V\) is any representation of \(L\), even infinite-dimensional, \(V' \mathrel{\vcenter{:}}=\bigoplus _{ \lambda\in H {}^{ \vee }} V_{ \lambda} \leq V\) is always an \(L{\hbox{-}}\)submodule. The sum is still direct since the terms correspond to eigenspaces with distinct eigenvalues. Note that if \(h\in H, x\in L_{\alpha}, v\in V_{ \lambda}\), then \begin{align*} h.(x.v) &= x.(h.v) + [hx].v \\ &= \lambda(h) x.v + \alpha(h) x.v \\ &= (\lambda+ \alpha)(h) x.v ,\end{align*} so \(L_{ \alpha} V_{ \lambda} \subseteq V_{ \lambda+ \alpha}\).
Let \(V\in {}_{L}{\mathsf{Mod}}\), then
A maximal vector in an \(L{\hbox{-}}\)module \(V\) is a nonzero weight vector \(v\in V_{ \lambda}\) such that \(L_ \alpha.v = 0\) for all positive roots \(\alpha \in \Phi^+\). Equivalently, \(L_ \alpha .v = 0\) for all \(\alpha\in \Delta\).
A highest weight vector is a nonzero \(v\in V_{ \lambda}\) where \(\lambda\) is maximal among all weights of \(V\) with respect to the ordering \(\leq\) corresponding to the choice of \(\Delta\).
If \(v\) is a highest weight vector then \(v\) is necessarily a maximal vector, since \(\lambda+ \alpha > \lambda\), but the converse is not necessarily true.
I.e., the weight of a maximal vector need not be a highest weight.
In \(\S 18\), \(L\) is constructed using the Serre relations to get \(L_0 \twoheadrightarrow L\), where \(L\) involved \((S_{ij}^{\pm})\) and \(L_0\) involved S1-S3. Recall \(( { \operatorname{ad}}_{y_i})^{m+1}(y_j) = y_{ij}\) where \(m = - {\left\langle { \alpha_j },~{ \alpha_i } \right\rangle}\). Since \(x_k . y_{ij} = 0\), \(y_{ij}\) is a maximal vector in \(L_0\) viewed as a module over itself via \({ \operatorname{ad}}\), but it is not a highest weight vector, since \(\mathrm{wt}(y_{ij}) = -(m+1) \alpha_i - \alpha_j < - \alpha_j = \mathrm{wt}(y_j)\) and so its weight is not maximal.
View \(L\in {}_{L}{\mathsf{Mod}}\) via \({ \operatorname{ad}}_L\), then \(\S 10.4\) shows that there is a unique highest root \(\tilde \alpha\) satisfying \(\tilde \alpha \geq \alpha\) for all \(\alpha\in \Phi\). Any nonzero \(v\in L_{\tilde \alpha}\) is a highest weight vector for the adjoint representation.
A Borel subalgebra of \(L\) is a maximal solvable subalgebra \(B\leq L\).
\(B \mathrel{\vcenter{:}}= H \oplus \bigoplus _{\alpha\in \Phi^+} L_ \alpha\) is a Borel subalgebra of \(L\).
If \(\alpha, \beta\in \Phi^+\) then \([L_{ \alpha}, L_{ \beta}] = L_{\alpha + \beta}\) where \(\alpha + \beta\in \Phi^+\) (if this is still a root), so \(B \leq L\) is a subalgebra. One has \(B^{(i)} \mathrel{\vcenter{:}}=[B^{(i-1)}, B^{(i-1)}] \subseteq \sum_{\operatorname{ht}( \beta) \geq 2^{i-1} } L_{\beta}\), since bracketing elements of \(H\) together vanishes (\(H\) is abelian) and bracketing height 1 roots yields height 2, bracketing height 2 yields height 4, and so on. Thus \(B\) is a solvable subalgebra, since heights are uniformly bounded above by a finite number. To see that it's maximal, note that any subalgebra \(B' \leq L\) properly containing \(B\) must also contain some \(L_{ - \alpha}\) for some \(\alpha\in \Phi^+\). But then \(B' \supseteq L_{ - \alpha} \oplus [ L_{ - \alpha }, L_{ \alpha}] \oplus L_{ \alpha} \cong {\mathfrak{sl}}_2({\mathbf{C}})\), which is not solvable, so \(B'\) cannot be solvable.
Let \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\), then \(V\in {}_{B}{\mathsf{Mod}}\) by restriction and by Lie’s theorem \(V\) must have a common eigenvector \(v\) for the action of \(B\). Since \(B\supseteq H\), \(v\) is a weight vector, \([B, B] = \bigoplus _{ \alpha\in \Phi^+} L_ \alpha\) acts by commutators of operators acting by scalars, which commute, and thus this acts by zero on \(v\) and makes \(v\) a maximal vector in \(V\). So any finite dimensional \(L{\hbox{-}}\)module as a maximal vector.
A module \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}\), possibly infinite dimensional, is a highest weight module if there exists a \(\lambda \in H {}^{ \vee }\) and a nonzero vector \(v^+ \in V_{ \lambda}\) such that \(V\) is generated as an \(L{\hbox{-}}\)module by \(v^+\), i.e. \(U(L).v^+ = V\).
Let \(x_ \alpha \in L_ \alpha, y_ \alpha\in L_{- \alpha}, h_{\alpha} = [x_ \alpha, y_ \alpha]\) be a fixed standard \({\mathfrak{sl}}_2\) triple in \(L\).
Let \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}\) be a highest weight module with maximal vector \(v^+ \in V_{ \lambda}\). Write \(\Phi^+ = \left\{{ {\beta }_{1}, \cdots, {\beta }_{m} }\right\}\), \(\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\), then
\(V\) is spanned by the vectors \(y_{\beta_1}^{i_1}\cdots y_{ \beta_m}^{i_m} .v^+\) for \(i_j\in {\mathbf{Z}}_{\geq 0}\). In particular, \(V = \bigoplus _{ \mu\in H {}^{ \vee }} V_ \mu\).
The weights of \(V\) are of the form \(\mu = \lambda- \sum_{i=1}^\ell k_i \alpha_i\) with \(k_i\in {\mathbf{Z}}_{ \geq 0}\), and all weights \(\mu\) satisfy \(\mu \leq \lambda\).
For each \(\mu\in H {}^{ \vee }\), \(\dim V_{ \mu} < \infty\), and for the highest weight \(\lambda\), one has \(\dim V_{ \lambda} = 1\) spanned by \(v^+\).
Each \(L{\hbox{-}}\)submodule \(W\) of \(V\) is a direct sum of its weight spaces.
\(V\) is indecomposable in \({\mathsf{L}{\hbox{-}}\mathsf{Mod}}\) with a unique maximal proper submodule and a corresponding unique irreducible quotient.
Every nonzero homomorphic image of \(V\) is also a highest weight module of the same highest weight.
Use the PBW theorem: extend a basis of \(B\leq L\) to a basis of \(L\) in which the \(B\) basis elements come second. Writing \(L = N^- \oplus B\), one can decompose \(U(L) = U(N^-) \otimes_{\mathbf{C}}U(B)\) and get \(U(L). v^+ = U(N^-)U(B).v^+ = U(N^-)U(H \oplus N^+).v^+ = U(N^-).v^+\), since \(v^+\) is a weight vector annihilated by \(N^+\).
Writing the \(\beta\) in terms of \(\alpha\) yields this expression.
Clear.
We’ll finish the rest next time.
Last time: if \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\) then \(V = L( \lambda)\) for some dominant weight \(\lambda\in \Lambda^+\), yielding a necessary condition for finite-dimensionality. Today: a sufficient condition.
Write \(\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\) and set \(x_i \mathrel{\vcenter{:}}= x_{ \alpha_i}, y_i \mathrel{\vcenter{:}}= y_{\alpha_i}\). For \(k\geq 0\) and \(1\leq i,j\leq \ell\), the following relations hold in \(U(L)\): \begin{align*} \text{(a)}\qquad & [x_j, y_i^{k+1}] = 0 \qquad (j\neq i), \\ \text{(b)}\qquad & [h_j, y_i^{k+1}] = -(k+1)\, \alpha_i(h_j)\, y_i^{k+1}, \\ \text{(c)}\qquad & [x_i, y_i^{k+1}] = (k+1)\, y_i^k \qty{h_i - k\cdot 1} .\end{align*}
Use that \({ \operatorname{ad}}\) acts by derivations: \begin{align*} [x_i, y_i^{k+1}] &= x_i y_i^{k+1} - y_i^{k+1} x_i \\ &= x_i y_i y_i^k - y_i x_i y_i^k + y_i x_i y_i^k - y_i y_i^k x_i \\ &= [x_i y_i] y_i^k + y_i[x_i y_i^k] \\ &= h_i y_i^k + y_i [x_i y_i^k] \\ &= \qty{ y_i^k h_i - k \alpha_i (h_i) y_i^k } + y_i \qty{k y_i^{k-1} (h_i - (k-1)\cdot 1) } \qquad\text{by (b) and induction} \\ &= y_i^k h_i - 2k y_i^k + ky_i^k (h_i - (k-1)\cdot 1) \\ &= (k+1)y_i^k h_i - (2k + k(k-1)) y_i^k \\ &= (k+1) y_i^k h_i - k(k+1) y_i^k \\ &= (k+1)y_i^k (h_i - k\cdot 1) .\end{align*}
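As a sanity check, the \(k=1\) case of (c) can be verified directly from \([x_i y_i] = h_i\) and \(h_i y_i = y_i h_i - 2 y_i\): \begin{align*} [x_i, y_i^2] = [x_i y_i] y_i + y_i [x_i y_i] = h_i y_i + y_i h_i = 2 y_i h_i - 2 y_i = 2 y_i (h_i - 1\cdot 1) ,\end{align*} agreeing with \((k+1) y_i^k (h_i - k\cdot 1)\) at \(k=1\).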
Given \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}\), let \(\Pi(V) \mathrel{\vcenter{:}}=\left\{{ \lambda\in H {}^{ \vee }{~\mathrel{\Big\vert}~}V_ \lambda\neq 0}\right\}\) be the set of weights. If \(\lambda\in \Lambda^+\) is a dominant weight, then \(V \mathrel{\vcenter{:}}= L(\lambda)\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{irr}}\) is finite-dimensional and \(\Pi(V)\) is permuted by \(W\) with \(\dim V_ \mu = \dim V_ { \sigma \mu}\) for all \(\sigma\in W\).
The main work is showing the last part involving equality of dimensions. It STS this for a simple reflection \(s_i \mathrel{\vcenter{:}}= s_{\alpha_i}\), since \(\sigma\) is a product of such reflections. Let \(\phi: L\to {\mathfrak{gl}}(V)\) be the representation associated to \(V\); the strategy is to show that \(\phi(x_i)\) and \(\phi(y_i)\) are locally nilpotent endomorphisms of \(V\). Let \(v^+ \in V_ \lambda\setminus\left\{{0}\right\}\) be a fixed maximal vector and set \(m_i \mathrel{\vcenter{:}}=\lambda(h_i)\), so \(h_i.v^+ = m_i v^+\).
Set \(w \mathrel{\vcenter{:}}= y_i^{m_i+1}. v^+\); the claim is that \(w=0\). Supposing not, we’ll show \(w\) is a maximal vector of weight not equal to \(\lambda\), and thus not a scalar multiple of \(v^+\). We have \({\operatorname{wt}}( w) = \lambda- (m_i +1) \alpha_i < \lambda\) (a strict inequality). If \(j\neq i\) then \(x_j.w = x_j y_i^{m_i + 1}. v^+ = y_i^{m_i + 1} x_j. v^+\) by part (a) of the lemma above, and this is zero since \(v^+\) is a highest weight vector and thus maximal (recalling that these are distinct notions). Otherwise, by part (c), \begin{align*} x_i. w &= x_i y_i^{m_i + 1}. v^+ \\ &= y_i^{m_i+1} x_i. v^+ + (m_i + 1) y_i^{m_i} (h_i - m_i\cdot 1).v^+ \\ &= 0 + (m_i+1)y_i^{m_i}(m_i - m_i) v^+ = 0 .\end{align*} So \(w\) is a maximal vector of weight distinct from \(\lambda\), contradicting corollary 20.2 since it would generate a proper submodule. \(\contradiction\)
Let \(S_i = \left\langle{x_i, y_i, h_i}\right\rangle \cong {\mathfrak{sl}}_2\); the claim is that \(v^+, y_i. v^+, \cdots, y_i^{m_i}. v^+\) span a nonzero finite-dimensional \(S_i{\hbox{-}}\)submodule of \(V\). The span is closed under \(h_i\) since all of these are eigenvectors for \(h_i\); it is closed under \(y_i\), since \(y_i\) sends each spanning vector to the next and sends \(y_i^{m_i}. v^+\) to \(y_i^{m_i+1}.v^+ = 0\) by step 1; and it is closed under \(x_i\) by part (c) of the lemma, since \(x_i\) sends each spanning vector to a multiple of the previous one.
The sum of two finite-dimensional \(S_i{\hbox{-}}\)submodules of \(V\) is again a finite-dimensional \(S_i{\hbox{-}}\)submodule, so let \(V'\) be the sum of all finite-dimensional \(S_i{\hbox{-}}\)submodules of \(V\) (which is not obviously finite-dimensional, since we don’t yet know if \(V\) is finite-dimensional). The claim is that \(V'\) is a nonzero \(L{\hbox{-}}\)submodule of \(V\). Let \(w\in V'\); then \(w\) lies in a finite sum, so there exists a finite-dimensional \(S_i{\hbox{-}}\)submodule \(W\) of \(V\) with \(w\in W\). Construct \(U \mathrel{\vcenter{:}}= W + \sum_{ \alpha \in \Phi} x_ \alpha . W\) where \(x_ \alpha \mathrel{\vcenter{:}}= y_{- \alpha}\) if \(\alpha\in \Phi^-\), which is a finite-dimensional vector subspace of \(V\). Check that \begin{align*} h_i (x_ \beta . W) &= x_ \beta(h_i.W) + [h_i x_ \beta].W \subseteq x_ \beta.W \subseteq U \\ x_i (x_ \beta.W) &= x_ \beta(x_i.W) + [x_i x_ \beta] .W \subseteq x_ \beta .W + x _{ \beta + \alpha_i}.W \subseteq U \\ y_i(x_ \beta. W) &= x_{ \beta}(y_i.W) + [y_i x_ \beta].W \subseteq x_ \beta. W + x_{ \beta - \alpha_i}.W \subseteq U ,\end{align*} interpreting \(x_{ \beta\pm \alpha_i}.W\) as \(h_i.W \subseteq W\) when \(\beta \pm \alpha_i = 0\) and as \(0\) when \(\beta\pm \alpha_i\not\in \Phi\cup\left\{{0}\right\}\). So \(U\) is a finite-dimensional \(S_i{\hbox{-}}\)submodule of \(V\), and thus \(U \subseteq V'\). So if \(w\in V'\) then \(x_ \alpha .w \in V'\) for all \(\alpha\in \Phi\), and \(V'\) is stable under a set of generators for \(L\), making \(V' \leq V\) an \(L{\hbox{-}}\)submodule. Since \(V'\neq 0\) (it contains the span of \(v^+\) by step 2), it must be all of \(V\) since \(V\) is irreducible.
Given an arbitrary \(v\in V\), apply the argument for \(w\) in step 3 to show that there exists a finite-dimensional \(S_i{\hbox{-}}\)submodule \(W \subseteq V\) with \(v\in W\). The elements \(x_i, y_i\) act nilpotently on any finite-dimensional \(S_i{\hbox{-}}\)module, and so in particular they act nilpotently on \(v\) and we get local nilpotence.
Now \(\tau_i \mathrel{\vcenter{:}}= e^{\phi(x_i)} \circ e^{\phi(-y_i)} \circ e^{\phi(x_i)}\) is well-defined. As seen before, \(\tau_i(V_ \mu) = V_{s_i \mu}\), i.e. \(\tau_i\) is an automorphism that behaves like \(s_i\), and so \(\dim V_ \mu = \dim V_{ \sigma \mu}\) for all \(\mu\in \Pi(V)\) and \(\sigma\in W\). Now any \(\mu\in \Pi(V)\) is conjugate under \(W\) to a unique dominant weight \(\mu^+\), and by (4) \(\mu^+\in \Pi(V)\); since \(V = L(\lambda)\) only has weights \(\leq \lambda\), we have \(\mu^+ \leq \lambda\). Note \(\lambda \in \Lambda^+\) is dominant, so by 13.2B there are only finitely many such weights \(\mu^+\). There are then only finitely many \(W{\hbox{-}}\)conjugates of the finitely many possibilities for \(\mu^+\), so \({\sharp}\Pi(V) < \infty\). By the general theory of highest weight modules, all weight spaces \(V_{\mu} \leq L( \lambda)\) are finite-dimensional. Since \(\dim V_\mu < \infty\) for all \(\mu\in\Pi(V)\) and \({\sharp}\Pi(V) < \infty\), we have \(\dim V < \infty\).
Skipping the next two sections \(\S 21.3\) on weight strings and weight diagrams, and \(\S 21.4\) on generators and relations for \(L( \lambda)\) for \(\lambda\in \Lambda^+\) dominant.
Setup: \(L\in {\mathsf{Lie}{\hbox{-}} \mathsf{Alg}}^{{\mathrm{fd}}, {\mathrm{ss}}}_{/ {{\mathbf{C}}}}\) containing \(H\) a maximal toral subalgebra, \(\Phi \supseteq\Phi^+ = \left\{{ {\beta }_{1}, \cdots, {\beta }_{m} }\right\} \supseteq\Delta = \left\{{ {\alpha }_{1}, \cdots, {\alpha }_{\ell} }\right\}\) with Weyl group \(W\), and we have \(x_i \in L_{ \alpha_i}, y_i\in L_{- \alpha_i}, h_i = [ x_i y_i]\). For \(\beta\in \Phi^+\), we also wrote \(x_ \beta\in L_ \beta, y_ \beta\in L_{- \beta}\). There is also a Borel \(B = H \oplus \bigoplus _{\beta > 0} L_ \beta\) with \(H \subseteq B \subseteq L\).
We saw that if \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\) is irreducible then \(V = \bigoplus _{\mu\in H {}^{ \vee }} V_ \mu\) and \(V\) is a highest weight module of highest weight \(\lambda\) for some \(\lambda \in \Lambda^+ = \left\{{\nu \in H {}^{ \vee }{~\mathrel{\Big\vert}~}\nu(h_i) \in {\mathbf{Z}}_{\geq 0},\, 1\leq i\leq \ell}\right\}\). Writing \(M( \lambda)\) for the Verma module \({\mathcal{U}}(L) \otimes_{{\mathcal{U}}(B)} {\mathbf{C}}_{ \lambda}\), there is a unique irreducible quotient \(M( \lambda) \twoheadrightarrow L( \lambda)\) with highest weight \(\lambda\). It turns out that \(L ( \lambda)\) is finite-dimensional if \(\lambda\in \Lambda^+\).
How can we understand \(L( \lambda)\) for \(\lambda\in \Lambda^+\) better? What is \(\dim L( \lambda)\)? What is \(\dim L( \lambda)_\mu\) (i.e. the dimensions of weight spaces)?
Let \(\Lambda \subseteq H {}^{ \vee }\) be the lattice of integral weights. Note that \(\lambda(h_\beta) \in {\mathbf{Z}}\,\, \forall \beta\in \Phi \iff \lambda(h_i) \in {\mathbf{Z}}\) for all \(1\leq i\leq \ell\). Since \(\Lambda\in {}_{{\mathbf{Z}}}{\mathsf{Mod}}\) there is a group algebra \({\mathbf{Z}}[ \Lambda]\), the free \({\mathbf{Z}}{\hbox{-}}\)module with basis \(e( \lambda)\) for \(\lambda\in \Lambda\), also written \(e_ \lambda\) or \(e^{ \lambda}\). This has a ring structure given by linearly extending \(e( \lambda)\cdot e( \mu) \mathrel{\vcenter{:}}= e( \lambda+ \mu)\), so \begin{align*} \qty{ \sum a_ \lambda e( \lambda) } \cdot \qty{\sum b_ \mu e( \mu) } = \sum c_ \sigma e( \sigma), \qquad c_ \sigma\mathrel{\vcenter{:}}=\sum_{ \lambda+ \mu = \sigma} a_ \lambda b _{\mu} .\end{align*} Note that if \(V\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\) then \(V = \bigoplus _{ \mu\in H {}^{ \vee }} V_ \mu\), where \(V_{ \mu}\neq 0\implies \mu \in \Lambda\). In this case we can define the formal character \begin{align*} \mathrm{ch}_V \mathrel{\vcenter{:}}=\sum_{ \mu\in \Lambda} (\dim V_ \mu) e( \mu) \in {\mathbf{Z}}[ \Lambda ] .\end{align*}
Let \(V, W\in {\mathsf{L}{\hbox{-}}\mathsf{Mod}}^{\mathrm{fd}}\), then \begin{align*} \mathrm{ch}_{V\otimes W} = \operatorname{ch}_V \cdot \operatorname{ch}_W .\end{align*}
Take dimensions in the formula \((V\otimes W)_ \sigma = \sum_{ \lambda+ \mu = \sigma} V_ \lambda\otimes W_ \mu\).
For \(\lambda\in \Lambda^+\), we have \(L( \lambda)\) (noting Humphreys uses \(V( \lambda)\)), and we write \begin{align*} \operatorname{ch}_ \lambda\mathrel{\vcenter{:}}=\operatorname{ch}_{L( \lambda)} = \sum_{\mu\in \Lambda} m_ \lambda(\mu) e( \mu) \end{align*} where \(m_{\lambda}( \mu) \mathrel{\vcenter{:}}=\dim L( \lambda)_ \mu\in {\mathbf{Z}}_{\geq 0}\).
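For example, for \(L = {\mathfrak{sl}}_2({\mathbf{C}})\) and \(\lambda = m \lambda_1\) with \(m\in {\mathbf{Z}}_{\geq 0}\): by the classification of irreducible \({\mathfrak{sl}}_2{\hbox{-}}\)modules (§7), the weights of \(L( \lambda)\) are \(m \lambda_1, (m-2) \lambda_1, \cdots, -m \lambda_1\), each with multiplicity 1, so \begin{align*} \operatorname{ch}_ \lambda = \sum_{k=0}^{m} e( \lambda- k \alpha) = e(m \lambda_1) + e( (m-2) \lambda_1) + \cdots + e(-m \lambda_1) ,\end{align*} using \(\alpha = 2 \lambda_1\).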
We now involve the Weyl group to make more progress: let \(W\) be the Weyl group of \((L, H)\), then \(W\curvearrowright{\mathbf{Z}}[ \Lambda]\) by \(w . e( \mu) \mathrel{\vcenter{:}}= e(w \mu)\) for \(w\in W, \mu\in \Lambda\). So \(W\to \mathop{\mathrm{Aut}}_{\mathsf{Ring}}({\mathbf{Z}}[ \Lambda])\), and recalling that \(\dim L( \lambda)_ \mu = \dim L(\lambda)_{ w \mu}\), we have \begin{align*} w. \operatorname{ch}_{\lambda} = \sum_\mu m_ \lambda( \mu) e(w \mu) = \sum_\mu m_ \lambda(w \mu) e(w \mu) = \operatorname{ch}_{\lambda} ,\end{align*} so \(\operatorname{ch}_ \lambda\) are \(W{\hbox{-}}\)invariant elements of the group algebra.
Let \(f\in {\mathbf{Z}}[ \Lambda]^W\) be a \(W{\hbox{-}}\)invariant element, then \(f\) can be written uniquely as a \({\mathbf{Z}}{\hbox{-}}\)linear combination of the \(\operatorname{ch}_ \lambda\) for \(\lambda \in \Lambda^+\), i.e. these form a \({\mathbf{Z}}{\hbox{-}}\)basis: \begin{align*} {\mathbf{Z}}[ \Lambda]^W = \left\langle{\operatorname{ch}_ \lambda{~\mathrel{\Big\vert}~}\lambda\in \Lambda^+}\right\rangle_{\mathbf{Z}} .\end{align*}
Recall every \(\lambda\in \Lambda\) is \(W{\hbox{-}}\)conjugate to a unique dominant weight. Since \(f\) is \(W{\hbox{-}}\)invariant, we can write it as \begin{align*} f = \sum_{ \lambda\in \Lambda^+} c( \lambda) \qty{\sum_{w\in W} e( w \lambda)} \end{align*} where \(c( \lambda)\in {\mathbf{Z}}\) and almost all are zero. Given \(\lambda\in \Lambda^+\) with \(c( \lambda)\neq 0\), a previous lemma (13.2B) shows that \({\sharp}\left\{{ \mu\in \Lambda^+ {~\mathrel{\Big\vert}~}\mu\leq \lambda}\right\} < \infty\) by a compactness argument. Set \begin{align*} M_f \mathrel{\vcenter{:}}=\bigcup_{c( \lambda) \neq 0, \lambda\in \Lambda^+} \left\{{ \mu\in \Lambda^+ {~\mathrel{\Big\vert}~}\mu\leq \lambda}\right\} ,\end{align*} all of the possible dominant weights that could appear; then \({\sharp}M_f < \infty\) since this is a finite union of finite sets. Choose \(\lambda\in \Lambda^+\) maximal with respect to the property that \(c( \lambda)\neq 0\), and set \(f' \mathrel{\vcenter{:}}= f- c( \lambda) \operatorname{ch}_ \lambda\). Note that \(f'\) is again \(W{\hbox{-}}\)invariant, since \(f\) and \(\operatorname{ch}_ \lambda\) are both \(W{\hbox{-}}\)invariant, and \(M_{ \operatorname{ch}_ \lambda} \subseteq M_f\). However \(\lambda \not\in M_{f'}\) by maximality, since we’ve subtracted \(c( \lambda) e(\lambda)\) off, so \({\sharp}M_{f'} < {\sharp}M_f\). Inducting on \({\sharp}M_f\), \(f'\) is a \({\mathbf{Z}}{\hbox{-}}\)linear combination of \(\operatorname{ch}_{\lambda'}\) for \(\lambda' \in \Lambda^+\), and thus so is \(f\). One checks the base case \(L(0) \cong {\mathbf{C}}\), where everything acts with weight zero. Uniqueness is relegated to exercise 22.8.
Is there an explicit formula for \(\operatorname{ch}_ \lambda\) for \(\lambda \in \Lambda^+\)? An intermediate goal will be to understand characters of Verma modules \(\operatorname{ch}M( \lambda)\) – note that this isn’t quite well-defined yet, since this is an infinite-dimensional module and thus the character has infinitely many terms and is not an element in \({\mathbf{Z}}[ \Lambda]\).
Let \({\mathcal{Z}}(L) \subseteq {\mathcal{U}}(L)\) be the center of \({\mathcal{U}}(L)\), not to be confused with \(Z(L) \subseteq L\) which is zero since \(L\) is semisimple (since \(Z(L)\) is an abelian ideal). How does \({\mathcal{Z}}(L) \curvearrowright M( \lambda)\)?
Note \(M( \lambda) = {\mathcal{U}}(L) \otimes_{{\mathcal{U}}(B)} {\mathbf{C}}_{ \lambda} \cong {\mathcal{U}}(N^-) \otimes_{\mathbf{C}}{\mathbf{C}}_ \lambda\) and write \(v^+\) for a nonzero highest weight vector of \(M( \lambda)\). Let \(z\in {\mathcal{Z}}(L)\) and \(h\in H\), then \begin{align*} h.(z.v^+) = z.(h.v^+) = z.( \lambda(h) v^+) = \lambda(h)\, z.v^+ ,\end{align*} and \(x_ \alpha.(z.v^+) = z.(x_ \alpha.v^+) = 0\) for all \(\alpha\in \Phi^+\), so \(z.v^+\) is a maximal vector in \(M( \lambda)\) of weight \(\lambda\). Since \(\dim M( \lambda)_ \lambda = 1\), there exists \(\chi_ \lambda(z)\in {\mathbf{C}}\) such that \begin{align*} z.v^+ = \chi_ \lambda( z) v^+ .\end{align*} Thus there is an algebra morphism \begin{align*} \chi_ \lambda: {\mathcal{Z}}(L) \to {\mathbf{C}} .\end{align*}
Pick a PBW basis for the Verma module \(M(\lambda)\), then \begin{align*} z. y_{\beta_1}^{i_1}\cdots y_{\beta_m }^{i_m} v^+ = y_{\beta_1}^{i_1}\cdots y_{\beta_m }^{i_m} z. v^+ = \chi_ \lambda(z) y_{\beta_1}^{i_1}\cdots y_{\beta_m }^{i_m} v^+ ,\end{align*} so \(z.m = \chi_ \lambda(z)m\) for all \(m\in M( \lambda)\), and thus \({\mathcal{Z}}(L)\curvearrowright M(\lambda)\) by the character \(\chi_ \lambda\). Consequently, \({\mathcal{Z}}(L)\) acts by \(\chi_ \lambda\) on any subquotient of \(M( \lambda)\).
When is \(\chi_ \lambda= \chi_ \mu\) for two integral weights \(\lambda, \mu\in \Lambda\)?
Recall: \({\mathcal{Z}}(L) \mathrel{\vcenter{:}}= Z({\mathcal{U}}(L))\) acts by a character \(\chi_ \lambda: {\mathcal{Z}}(L) \to {\mathbf{C}}\) on \(M( \lambda)\) and thus on any of its subquotients. For \(\lambda, \mu\in \Lambda \subseteq H {}^{ \vee }\), when is \(\chi_ \lambda = \chi_ \mu\)?
Two weights \(\lambda, \mu\in H {}^{ \vee }\) are linked iff \(\exists w\in W\) such that \(\mu + \rho = w ( \lambda+ \rho)\), where \(\rho \mathrel{\vcenter{:}}={1\over 2}\sum_{ \beta\in \Phi^+} \beta\). In this case, write \(\mu \sim \lambda\) for this equivalence relation, i.e. \(\mu = w( \lambda+ \rho) - \rho\). We’ll write \(w\cdot \lambda \mathrel{\vcenter{:}}= w( \lambda+ \rho) - \rho\) and call this the dot action of \(W\) on \(H {}^{ \vee }\).
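For instance, in rank 1 with \(\Phi^+ = \left\{{ \alpha}\right\}\) and \(\rho = {1\over 2} \alpha\): \begin{align*} s_ \alpha\cdot \lambda = s_ \alpha( \lambda+ \rho) - \rho = -( \lambda+ \rho) - \rho = - \lambda- \alpha ,\end{align*} so e.g. the weights \(0\) and \(- \alpha\) are linked.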
This defines an action of a group on a set, but it is not a linear action.
Let \(\lambda\in \Lambda, \alpha\in \Delta\), suppose \(m\mathrel{\vcenter{:}}=\lambda(h_ \alpha) = {\left\langle {\lambda },~{\alpha} \right\rangle}\in {\mathbf{Z}}_{\geq 0}\), and let \(v^+\) be a highest weight vector of \(M( \lambda)\). Then \(w\mathrel{\vcenter{:}}= y_ \alpha^{m+1}\cdot v^+\) is a maximal vector in \(M( \lambda)\) of weight \(\lambda - (m+1 ) \alpha\).
The proof that \(w\) is a maximal vector is step 1 in theorem 21.2, which showed \(\dim L( \lambda) < \infty\) (using lemma 21.2 and commutator relations in \({\mathcal{U}}(L)\)). Then check that \begin{align*} { \operatorname{weight} }(w) = \lambda - (m+1) \alpha = \lambda- \qty{ {\left\langle {\lambda },~{\alpha } \right\rangle}+ 1} \alpha .\end{align*} In fact, for any \(\lambda\in H {}^{ \vee }, \alpha\in \Delta\) we can define \begin{align*} \mu \mathrel{\vcenter{:}}=\lambda- \qty{ {\left\langle {\lambda },~{\alpha } \right\rangle}+ 1} \alpha ,\end{align*} so in our case \({ \operatorname{weight} }(w) = \mu\). Note that \begin{align*} \mu &= \lambda- {\left\langle {\lambda},~{\alpha } \right\rangle}\alpha- \alpha \\ &= s_ \alpha( \lambda) + (s_ \alpha( \rho) - \rho) \\ &= s_ \alpha(\lambda+ \rho) - \rho\\ &= s_ \alpha \cdot \lambda .\end{align*} Now \(w\) generates a highest weight module \(W\leq M( \lambda)\) of highest weight \(\mu = s_ \alpha \cdot \lambda\). Note that \(M( \mu)\) is the universal highest weight module with highest weight \(\mu\), i.e. \(\exists! M(\mu) \twoheadrightarrow W\). This yields a \(B{\hbox{-}}\)module morphism \begin{align*} {\mathbf{C}}_ \mu &\to W \\ 1 &\mapsto w ,\end{align*} which yields an \(L{\hbox{-}}\)module morphism \({\mathcal{U}}(L) \otimes_{{\mathcal{U}}(B)} {\mathbf{C}}_ \mu \twoheadrightarrow W\). So \(W\) is a nonzero quotient of \(M( \mu)\) and \({\mathcal{Z}}(L)\curvearrowright W\) by \(\chi_ \mu\). On the other hand \(W\leq M( \lambda)\) and so \({\mathcal{Z}}(L)\curvearrowright W\) by \(\chi_ \lambda\), yielding \(\chi_ \lambda= \chi_ \mu\).
So we conclude that if \(\mu = s_ \alpha \cdot \lambda\) with \({\left\langle {\lambda },~{\alpha} \right\rangle}\in {\mathbf{Z}}_{\geq 0}\), then \(\chi_ \mu = \chi_ \lambda\).
Let \(\lambda\in \Lambda, \alpha \in \Delta, \mu = s_ \alpha \cdot \lambda\). Then \(\chi_ \mu = \chi_ \lambda\).
\(\mu = s_ \alpha \cdot \lambda = \lambda- (m+1) \alpha\) where \(m \mathrel{\vcenter{:}}={\left\langle {\lambda },~{\alpha} \right\rangle}\). If \(m\geq 0\), this is the lemma. If \(m = -1\) then \(\mu = \lambda\) and there is nothing to prove. If \(m\leq -2\), then \({\left\langle {\mu },~{\alpha} \right\rangle} = -m-2 \geq 0\) and \(s_ \alpha\cdot \mu = \lambda\), so the lemma applies with the roles of \(\lambda\) and \(\mu\) reversed.
Let \(\lambda, \mu\in \Lambda\), then if \(\mu\sim \lambda\) then \(\chi_ \lambda=\chi_ \mu\).
Say \(\mu = w\cdot \lambda\) for \(w\in W\), then write \(w = s_{i_1}\cdots s_{i_t}\) and use induction on \(t\), where the base case is the previous corollary.
If \(\lambda, \mu\) satisfy \(\chi_ \lambda= \chi_ \mu\), then \(\lambda\sim \mu\).
Goal: find \(\operatorname{ch}_ \lambda\mathrel{\vcenter{:}}=\operatorname{ch}_{L( \lambda)} \mathrel{\vcenter{:}}=\sum_{ \mu\in \Lambda} (\dim L( \lambda)_\mu ) e( \mu) \in {\mathbf{Z}}[ \Lambda]\) for \(\lambda\in \Lambda^+\).
View \({\mathbf{Z}}[ \Lambda]\) as finitely-supported \({\mathbf{Z}}{\hbox{-}}\)valued functions on \(\Lambda\) with elements \(f = \sum_{ \lambda\in \Lambda} a_ \lambda e( \lambda)\) regarded as functions \(f(\mu) \mathrel{\vcenter{:}}= a_\mu\). Thus \(e( \lambda)( \mu) = \delta_{\lambda= \mu }\). The point of this maneuver: Verma modules are infinite-dimensional, but \({\mathbf{Z}}[ \Lambda]\) only handles finite sums. For \(f,g\in \mathop{\mathrm{Hom}}_{\mathbf{Z}}( \Lambda, {\mathbf{Z}})\), define \((f\ast g) ( \sigma) = \sum_{ \lambda+ \mu = \sigma} f( \lambda) g( \mu)\). Consider the set \({\mathcal{X}}\) of functions \(\mathop{\mathrm{Hom}}_{\mathbf{C}}(H {}^{ \vee }, {\mathbf{C}})\) whose support is contained in a finite union of sets of the form \(\lambda_{ \leq} \mathrel{\vcenter{:}}=\left\{{ \lambda - \sum_{ \beta\in \Phi^+} k_ \beta \beta{~\mathrel{\Big\vert}~}k_ \beta\in {\mathbf{Z}}_{\geq 0}}\right\}\). One can show \({\mathcal{X}}\) is closed under convolution, and \({\mathcal{X}}\) becomes a commutative associative algebra containing \({\mathbf{Z}}[ \Lambda]\) as a subring. Note that \(\mathop{\mathrm{supp}}(f\ast g) \subseteq ( \lambda+ \mu)_{\leq }\) when \(\mathop{\mathrm{supp}} f \subseteq \lambda_{\leq}\) and \(\mathop{\mathrm{supp}} g \subseteq \mu_{\leq}\).
Write \(e( \lambda)\) as \(e_{ \lambda}\) for \(\lambda \in H {}^{ \vee }\), regarded as a function \(e_ \lambda:H {}^{ \vee }\to {\mathbf{C}}\) where \(e_ \lambda( \mu) = \delta_{ \lambda = \mu}\) is the characteristic function of \(\lambda\), and note that \(e_ \lambda \ast e_ \mu = e_ { \lambda+ \mu}\). Let \(p = \operatorname{ch}_{M(0)}: H {}^{ \vee }\to {\mathbf{C}}\), then \begin{align*} M(0) = {\mathcal{U}}(L) \otimes_{{\mathcal{U}}(B)} {\mathbf{C}}_0 \cong {\mathcal{U}}(N^-) \otimes_{\mathbf{C}}{\mathbf{C}}_0 \quad\in {}_{H}{\mathsf{Mod}}, {}_{N^-}{\mathsf{Mod}} .\end{align*} By PBW, \({\mathcal{U}}(N^-)\) has a basis \(y_{\beta_1}^{i_1}\cdots y_{ \beta_m}^{i_m}\) for \(i_j\in {\mathbf{Z}}_{\geq 0}\) where \(\Phi^+ = \left\{{ {\beta }_{1}, \cdots, {\beta }_{m} }\right\}\). The weights of \(M(0)\) are \(\mu = - \sum_{j=1}^m i_j \beta_j\) and \(\mu\in 0_{\leq}\). Note \(\operatorname{ch}_{M(0)}( \mu) = \dim M(0)_ \mu\), and so \(\operatorname{ch}_{M(0)}\in {\mathcal{X}}\) and thus \(p\in {\mathcal{X}}\).
Last time: \begin{align*} {\mathcal{X}}\mathrel{\vcenter{:}}=\left\{{f\in \mathop{\mathrm{Hom}}_{\mathbf{C}}(H {}^{ \vee }, {\mathbf{C}}) {~\mathrel{\Big\vert}~}\mathop{\mathrm{supp}}f \subseteq \bigcup_{i=1}^n (\lambda_i - {\mathbf{Z}}_{\geq 0} \Phi^+) }\right\} ,\end{align*} which is a commutative associative unital algebra under convolution, where \(e_{ \lambda}(\mu) = \delta_{ \lambda \mu}\) for \(\mu\in H {}^{ \vee }\) and \(e_ \lambda\ast e_ \mu = e_{ \lambda+ \mu}\) with \(e_0 = 1\). We have \(\operatorname{ch}_{M(0)}\), which records weights \(\mu = -\sum_{j=1}^m i_j \beta_j\) with \(i_j\in {\mathbf{Z}}_{\geq 0}\), and \begin{align*} \dim M(0)_\mu \mathrel{\vcenter{:}}=\operatorname{ch}_{M(0)}(\mu) = {\sharp}\left\{{\mathbf{i} \in {\mathbf{Z}}_{\geq 0}^m {~\mathrel{\Big\vert}~}\sum i_j \beta_j = -\mu }\right\} \mathrel{\vcenter{:}}= p(\mu) ,\end{align*} which is the Kostant function; \(\mu\mapsto p(-\mu)\) is the Kostant partition function, which records the number of ways of writing \(\mu\) as a \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)linear combination of positive roots. We’ll regard finding such a count as a known or easy problem, since this can be done by enumeration.
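For example, in type \(A_2\) with \(\Phi^+ = \left\{{ \alpha_1, \alpha_2, \alpha_1 + \alpha_2}\right\}\), the only ways to write \(\alpha_1 + \alpha_2\) as a \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)combination of positive roots are \(\alpha_1 + \alpha_2\) itself and \((\alpha_1) + (\alpha_2)\), so \begin{align*} p(-( \alpha_1 + \alpha_2)) = 2, \qquad p(- \alpha_1) = p(- \alpha_2) = p(0) = 1 ,\end{align*} and \(p( \mu) = 0\) whenever \(-\mu\) is not a \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)combination of positive roots.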
Define the Weyl function \(q\mathrel{\vcenter{:}}=\prod_{\alpha \in \Phi^+} (e_{\alpha\over 2} - e_{-{\alpha \over 2}})\), where the product is convolution. For \(\alpha\in \Phi^+\), define \(f_ \alpha:H {}^{ \vee }\to {\mathbf{Z}}\) by \begin{align*} f_ \alpha( \lambda) \mathrel{\vcenter{:}}= \begin{cases} 1 & \lambda = -k \alpha \text{ for some } k\in {\mathbf{Z}}_{\geq 0} \\ 0 & \text{otherwise}. \end{cases} \end{align*} We can regard this as an infinite sum \(f_ \alpha = e_0 + e_{ - \alpha} + e_{ -2 \alpha} + \cdots = \sum_{k\geq 0} e_{ -k \alpha}\).
Part a: The coefficient of \(e_\mu\) in \(\prod_{j=1}^m f_{ \beta_j}\) is the convolution \begin{align*} \qty{ \prod_{j=1}^m f_{ \beta_j} }( \mu) = \sum_{\substack{i_1,\cdots, i_m \in {\mathbf{Z}}_{\geq 0} \\ -\sum i_j \beta_j = \mu }} f_{\beta_1}(-i_1 \beta_1) \cdots f_{ \beta_m }(-i_m \beta_m) = p(\mu) .\end{align*}
Part b: \begin{align*} (e_0 - e_{- \alpha}) \ast(e_0 + e_{- \alpha} + e_{-2 \alpha} + \cdots) = e_0 + e_{ - \alpha} - e_{ - \alpha}+ e_{ -2 \alpha}- e_{ -2 \alpha} +\cdots = e_0 ,\end{align*} noting the telescoping. This can be checked rigorously by regarding these as functions instead of series.
Part c: Recall \(\rho = \sum_{ \alpha\in \Phi^+} {1\over 2}\alpha\), so \(e_\rho = \prod_{ \alpha\in \Phi^+} e_{\alpha\over 2}\). Thus the RHS is \begin{align*} \prod_{ \alpha\in \Phi^+} \qty{e_{\alpha\over 2} \ast(e_0 - e_{- \alpha}) } = \prod_{\alpha\in \Phi^+}(e_{\alpha\over 2} - e_{- \alpha\over 2} ) \mathrel{\vcenter{:}}= q .\end{align*} Note that \(q\neq 0\) since \(q(\rho) = 1\).
Let \(w\in W\), recalling that \(w.e_{\alpha} = e_{w \alpha}\), \begin{align*} wq = (-1)^{\ell(w)} q .\end{align*}
ETS for \(\alpha \in \Delta\). Recall \(s_ \alpha\) permutes \(\Phi^+ \setminus\left\{{ \alpha }\right\}\) and \(s_ \alpha(\alpha) = - \alpha\), so \(s_ \alpha q\) permutes the factors \((e_{ \beta\over 2} - e_{-{\beta\over 2}})\) for \(\beta\neq \alpha\) and negates \(e_{\alpha\over 2} - e_{- \alpha\over 2}\).
\(q\) is invertible: \begin{align*} q\ast p \ast e_{-\rho} = e_0 \quad \implies \quad q^{-1}= p \ast e_{-\rho} .\end{align*}
Use lemmas A, B, C: \begin{align*} q\ast p \ast e_{-\rho} &= e_\rho \ast\qty{ \prod_{ \alpha\in \Phi^+} (e_0 - e_{ - \alpha} ) } \ast p \ast e_{ - \rho} \qquad \text{by C} \\ &= \qty{ \prod_{ \alpha\in \Phi^+} (e_0 - e_{ - \alpha} ) } \ast p \\ &= \prod_{ \alpha\in \Phi^+} \qty{ (e_0 - e_{ - \alpha}) \ast f_{ \alpha} } \qquad \text{by A}\\ &= \prod_{ \alpha\in \Phi^+} e_0 \qquad \text{by B}\\ &= e_0 .\end{align*}
For \(\lambda, \mu\in H {}^{ \vee }\), \begin{align*} \operatorname{ch}_{M( \lambda)}( \mu) = p( \mu - \lambda) = (p \ast e_ \lambda)( \mu) \implies \operatorname{ch}_{M( \lambda)} = p \ast e_{ \lambda} .\end{align*}
\(M( \lambda)\) has basis \(y_{\beta_1}^{i_1}\cdots y_{ \beta_m}^{i_m} \cdot v^+\) where \(v^+\) is a highest weight vector of weight \(\lambda\). Note that \begin{align*} \mu = \lambda- \sum_{j=1}^m i_j \beta_j \iff \mu- \lambda= - \sum_{j=1}^m i_j \beta_j ,\end{align*} and \(\dim M(\lambda)_ \mu = p( \mu- \lambda)\). Now check \((p\ast e_ \lambda)( \mu) = p( \mu - \lambda) e_ \lambda(\lambda) = p( \mu- \lambda)\).
\begin{align*} q \ast\operatorname{ch}_{M( \lambda)} = e_{ \lambda+ \rho} .\end{align*}
\begin{align*} \text{LHS} \overset{D}{=} q \ast p \ast e_ \lambda \overset{C}{=} e_ \rho \ast e_ \lambda = e_{ \lambda+ \rho} .\end{align*}
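As a sanity check in rank 1, where \(\rho = {1\over 2} \alpha\) and \(\operatorname{ch}_{M( \lambda)} = p \ast e_ \lambda = \sum_{k\geq 0} e_{ \lambda- k \alpha}\) by (D), the sum telescopes: \begin{align*} \qty{ e_{\alpha\over 2} - e_{-{\alpha\over 2}} } \ast\sum_{k\geq 0} e_{ \lambda- k \alpha} = \sum_{k\geq 0} e_{ \lambda+ {\alpha\over 2} - k \alpha} - \sum_{k\geq 0} e_{ \lambda- {\alpha\over 2} - k \alpha} = e_{ \lambda+ {\alpha\over 2}} = e_{ \lambda+ \rho} .\end{align*}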
Recall that characters of Verma modules are essentially known. For \(\lambda\in \Lambda^+\), we have \(\operatorname{ch}_{ \lambda} \mathrel{\vcenter{:}}=\operatorname{ch}_{L( \lambda)}\), recalling that \(L( \lambda)\) is a finite-dimensional irreducible representation. Goal: express this as a finite \({\mathbf{Z}}{\hbox{-}}\)linear combination of certain \(\operatorname{ch}_{M( \lambda)}\).
Fix \(\lambda\in H {}^{ \vee }\) and let \({\mathcal{M}}_ \lambda\) be the collection of all \(L{\hbox{-}}\)modules \(V\) satisfying the following: (1) \(V = \bigoplus_{\mu\in H {}^{ \vee }} V_\mu\) with \(\dim V_ \mu < \infty\) for all \(\mu\); (2) \({\mathcal{Z}}(L)\) acts on \(V\) by the character \(\chi_ \lambda\); (3) the weights of \(V\) lie in a finite union of sets of the form \(\lambda_i - {\mathbf{Z}}_{\geq 0} \Phi^+\).
Note that any highest weight module of highest weight \(\lambda\) is in \({\mathcal{M}}_ \lambda\), and this is closed under submodules, quotients, and finite direct sums. The Harish-Chandra theorem implies that \begin{align*} {\mathcal{M}}_ \lambda = {\mathcal{M}}_{\mu} \iff \lambda\sim \mu\iff \mu = w. \lambda .\end{align*}
If \(0\neq V \in {\mathcal{M}}_ \lambda\) then \(V\) has a maximal vector.
By (3), the weights of \(V\) lie in a finite number of cones \(\lambda_i - {\mathbf{Z}}_{\geq 0} \Phi^+\). So if \(\mu\) is a weight of \(V\) and \(\alpha\in \Phi^+\), then \(\mu + k \alpha\) is not a weight of \(V\) for \(k\gg 0\). Iterating this over the finitely many positive roots, there exists a weight \(\mu\) such that \(\mu + \alpha\) is not a weight for any \(\alpha\in \Phi^+\). Then any \(0\neq v\in V_ \mu\) is a maximal vector.
For \(\lambda \in H {}^{ \vee }\), set \begin{align*} \Theta( \lambda) \mathrel{\vcenter{:}}=\left\{{ \mu\in H {}^{ \vee }{~\mathrel{\Big\vert}~}\mu\sim \lambda\, \text{ and } \, \mu\leq \lambda}\right\} ,\end{align*} which by the Harish-Chandra theorem is a subset of \(W\cdot \lambda\) which is a finite set.
The following tends to hold in any setting with “standard” modules, e.g. quantum groups or superalgebras:
Let \(\lambda\in H {}^{ \vee }\), then: (a) \(M( \lambda)\) has a composition series; (b) each composition factor of \(M( \lambda)\) is of the form \(L( \mu)\) for some \(\mu\in \Theta( \lambda)\); (c) \([M( \lambda): L( \lambda)] = 1\).
By induction on the number of maximal vectors (up to scalar multiples). If \(M( \lambda)\) is irreducible then it’s an irreducible highest weight module, and such a module is determined up to isomorphism by its highest weight, so \(M( \lambda) = L( \lambda)\) and we’re done. Otherwise \(M( \lambda)\) has a nonzero proper submodule \(V\), and \(V\in {\mathcal{M}}_ \lambda\) by closure under submodules. By the lemma, \(V\) has a maximal vector of some weight \(\mu\), which must be strictly less than \(\lambda\) (else \(V\) would contain \(v^+\) and equal \(M( \lambda)\)), i.e. \(\mu < \lambda\). As before, \(\chi_{ \mu} = \chi_ \lambda\) and thus \(\mu\in \Theta( \lambda)\). Consider \(V\) and \(M( \lambda)/ V\) – each lies in \({\mathcal{M}}_ \lambda\) and either has fewer weights linked to \(\lambda\) than \(M( \lambda)\) has, or else has exactly the same set of weights linked to \(\lambda\), just with smaller multiplicities. By induction each of these has a composition series, and these can be pieced together into a series for \(M( \lambda)\) since they fit into a SES.
Last time: \({\mathcal{M}}_ \lambda\) defined as a certain collection of \(L{\hbox{-}}\)modules for \(\lambda\in H {}^{ \vee }\), and we defined \(\Theta( \lambda) \mathrel{\vcenter{:}}=\left\{{ \mu\in H {}^{ \vee }{~\mathrel{\Big\vert}~}\mu \sim \lambda,\,\, \mu\leq \lambda}\right\} \subseteq W\cdot \lambda\). Proposition from last time: \(M( \lambda)\) has a composition series, each composition factor is some \(L( \mu)\) with \(\mu\in \Theta( \lambda)\), and \([M( \lambda): L( \lambda)] = 1\).
Note that any character of \(M\) is the sum of the characters of its composition factors.
Part b: Each composition factor of \(M( \lambda)\) is in \({\mathcal{M}}_ \lambda\), hence by the lemma has a maximal vector. Since it’s irreducible, it is a highest weight module \(L( \mu)\) for some \(\mu\in \theta( \lambda)\).
Part c: \([M( \lambda) : L( \lambda)] = 1\) since \(\dim M( \lambda)_ \lambda = 1\) and all other weights are strictly less than \(\lambda\).
Order \(\Theta( \lambda) = \left\{{ \mu_1, \cdots, \mu_t}\right\}\) such that \(\mu_i\leq \mu_j \implies i\leq j\). In particular, \(\mu_t = \lambda\). By the proposition, \(\operatorname{ch}_{M( \mu_j)}\) is a \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)linear combination of the \(\operatorname{ch}_{L(\mu_i)}\) with \(i\leq j\), and the coefficient of \(\operatorname{ch}_{L(\mu_j)}\) is 1. Recording the multiplicities \([M( \mu_j): L( \mu_i)]\) in a matrix, we get the following:
This is an upper triangular unipotent matrix, and thus invertible.
Let \(\lambda \in H {}^{ \vee }\), then \begin{align*} \operatorname{ch}_{L( \lambda)} = \sum_{ \mu \in \theta( \lambda)} c( \mu) \operatorname{ch}_{M( \mu)}, \qquad c( \mu)\in {\mathbf{Z}}, \, c( \lambda) = 1 .\end{align*}
Assume \(\lambda\in \Lambda^+\), and recall \(q \ast\operatorname{ch}_{M( \mu)} = e_{ \mu+ \rho}\). Convolving the theorem above with \(q\) yields \begin{align*} q \ast\operatorname{ch}_ \lambda = \sum_{ \mu\in \Theta( \lambda)} c( \mu) e_{ \mu+ \rho} \qquad \star_1 .\end{align*}
Fixing \(w\in W\), we have \begin{align*} \sum_{ \mu\in \Theta( \lambda) } c( \mu) e_{w (\mu+ \rho)} &= w( q \ast\operatorname{ch}_ \lambda) \\ &= wq \ast w \operatorname{ch}_{ \lambda} \\ &= (-1)^{\ell(w)} q \ast\operatorname{ch}_ \lambda \qquad \text{since $\operatorname{ch}_ \lambda$ is $W{\hbox{-}}$invariant} \\ &= (-1)^{\ell(w)} \sum_ \mu c( \mu) e_{ \mu+ \rho} .\end{align*} Since \(\lambda\in \Lambda^+\), \(W\) acts simply transitively on \(\Theta( \lambda) + \rho \mathrel{\vcenter{:}}=\left\{{\nu+\rho {~\mathrel{\Big\vert}~}\nu\in \Theta( \lambda) }\right\}\). Note \(\mu \sim \lambda\iff \mu+ \rho = w( \lambda+ \rho)\) for some \(w\in W\), which is unique since \(\lambda+ \rho\) is strongly/strictly dominant, and lemma 13.2A shows its stabilizer is trivial: \({\operatorname{Stab}}_W( \lambda+ \rho) = \left\{{1}\right\}\). The equation \(\mu + \rho = w( \lambda+ \rho)\) implies \(\mu + \rho \leq \lambda+ \rho\), since applying \(W\) to a dominant element goes down in the partial order. Thus \(\mu\in \Theta(\lambda)\), and \(\Theta( \lambda)\) consists of precisely those \(\mu\) satisfying this equation, so \begin{align*} \Theta( \lambda) = W\cdot \lambda .\end{align*}
Continuing the computation, take \(\mu\) with \(\mu + \rho = w( \lambda+ \rho)\) and compare the coefficient of \(e_{w( \lambda+ \rho)}\) on both sides: \begin{align*} c( \lambda) e_{w( \lambda+ \rho)} = (-1)^{\ell(w)} c( \mu) e_{ \mu+ \rho}\implies c(\mu) = (-1)^{\ell(w)} c( \lambda) = (-1)^{\ell(w)} ,\end{align*} using \(c( \lambda) = 1\).
Substituting this into \(\star_1\) yields \begin{align*} q \ast\operatorname{ch}_ \lambda = \sum_{w\in W} (-1)^{\ell(w)} e_{w ( \lambda+ \rho)} \qquad \star_2 ,\end{align*} so, using \(q^{-1} = p \ast e_{-\rho}\), \begin{align*} \operatorname{ch}_ \lambda &= p \ast e_{ - \rho} \ast q \ast\operatorname{ch}_ \lambda\\ &= p \ast e_{- \rho} \ast\sum_{w\in W} (-1)^{\ell(w)} e_{w (\lambda+ \rho)} \\ &= \sum_{w\in W} (-1)^{\ell(w)} p \ast e_{w( \lambda+ \rho) - \rho} \\ &= \sum_{w\in W} (-1)^{\ell(w)} p \ast e_{w\cdot \lambda} .\end{align*}
This yields the following:
For \(\lambda\in \Lambda^+\) dominant, the weight multiplicities in \(L( \lambda)\) are given by \begin{align*} \dim L( \lambda)_\mu \mathrel{\vcenter{:}}= m_ \lambda( \mu) = \sum_{w\in W} (-1)^{\ell(w)} p( \mu+ \rho - w( \lambda+ \rho)) = \sum_{w\in W} (-1)^{\ell(w)} p( \mu - w\cdot \lambda) .\end{align*}
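For example, take \(L = {\mathfrak{sl}}_2({\mathbf{C}})\) and \(\lambda = m \lambda_1\), so \(W = \left\{{1, s_ \alpha}\right\}\), \(s_ \alpha\cdot \lambda = - \lambda- \alpha\), and \(p(\nu) = 1\) if \(\nu\in -{\mathbf{Z}}_{\geq 0} \alpha\) and \(0\) otherwise. The formula reads \begin{align*} m_ \lambda( \mu) = p( \mu- \lambda) - p( \mu+ \lambda+ \alpha) .\end{align*} For \(\mu = \lambda- k \alpha\) with \(k\in {\mathbf{Z}}_{\geq 0}\), the first term is 1 and the second is \(p( (m+1-k) \alpha)\) (using \(2 \lambda = m \alpha\)), which is 0 for \(k\leq m\) and 1 for \(k\geq m+1\). So \(m_ \lambda( \mu) = 1\) exactly for \(\mu = \lambda, \lambda- \alpha, \cdots, - \lambda\), recovering the expected \({\mathfrak{sl}}_2\) weight multiplicities.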
\begin{align*} q = \sum_{w\in W} (-1)^{\ell(w)} e_{w \rho} .\end{align*}
Take \(\lambda = 0\) in \(\star_2\), and use that \(\operatorname{ch}_0 = e_0\) where \(L(0) \cong {\mathbf{C}}\).
Let \(\lambda \in \Lambda^+\), then \begin{align*} \qty{ \sum_{w\in W} (-1)^{ \ell(w)} e_{w\rho} } \ast\operatorname{ch}_{L( \lambda)} = \sum_{w\in W} (-1)^{\ell(w)} e_{w( \lambda+ \rho)} .\end{align*}
Apply \(\star_2\) and the lemma.
\begin{align*} \dim L( \lambda) = { \prod_{ \alpha\in \Phi^+} {\left\langle { \lambda+ \rho},~{ \alpha} \right\rangle} \over\prod_{\alpha\in \Phi^+} {\left\langle {\rho },~{\alpha } \right\rangle}} = \sum_{ \mu \in \Pi( \lambda)} m_ \lambda( \mu) .\end{align*}
Last time:
\begin{align*} \dim L( \lambda) = { \prod_{ \alpha\in \Phi^+} {\left\langle { \lambda+ \rho},~{ \alpha} \right\rangle} \over \prod_{ \alpha\in \Phi^+} {\left\langle {\rho },~{\alpha } \right\rangle}} ,\end{align*} which is a quotient of two integers.
Show that \(W\) always has an equal number of even and odd elements, so \(\sum_{w\in W} (-1)^{\ell(w)} = 0\).
Note \(\operatorname{ch}_{ \lambda} = \sum _{\mu} \dim L( \lambda)_ \mu e_ \mu\in {\mathbf{Z}}[ \Lambda]\), and \(\dim L( \lambda) = \sum_{ \mu\in \Lambda} m_ \lambda( \mu)\). Viewing \(\operatorname{ch}_ \lambda: \Lambda\to {\mathbf{Z}}\) as a restriction of a function \(H {}^{ \vee }\to {\mathbf{C}}\), \(\dim L( \lambda)\) is the sum of all values of \(\operatorname{ch}_ \lambda\). Work in the \({\mathbf{C}}{\hbox{-}}\)subalgebra \({\mathcal{X}}_0\) of \({\mathcal{X}}\) generated by the characteristic functions \(S \mathrel{\vcenter{:}}=\left\{{e_ \mu{~\mathrel{\Big\vert}~}\mu\in \Lambda}\right\}\); this equals the span of \(S\) since \(e_{ \mu} \ast e_{\nu} = e_{ \mu+ \nu}\). We have a map \begin{align*} v: {\mathcal{X}}_0 &\to {\mathbf{C}}\\ f &\mapsto \sum_{ \mu\in \Lambda} f( \mu) ,\end{align*} which makes sense since \({\sharp}\mathop{\mathrm{supp}}f < \infty\). This function sums the values we’re after, so the goal is to compute \(v( \operatorname{ch}_ \lambda)\). By the exercise, attempting to apply this directly to the numerator and denominator yields \(0/0\), and we get around this by using a variant of L’Hopital’s rule. Define \({\partial}_ \alpha (e_ \mu) = ( \mu, \alpha)e_ \mu\), extended linearly to \({\mathcal{X}}_0\). In the basis \(S\) this operator is diagonal, and this is a derivation relative to convolution: \begin{align*} {\partial}_ \alpha \qty{ e_ \mu \ast e_{ \nu} } &= {\partial}_ \alpha( e_{ \mu+ \nu} ) \\ &= ( \mu+ \nu, \alpha)e_{ \mu+ \nu} \\ &= \qty{ (\mu, \alpha) e_ \mu} \ast e_ \nu + e_ \mu \ast\qty{ (\nu, \alpha) e_ \nu} \\ &= ({\partial}_ \alpha e_ \mu) \ast e_ \nu + e_ \mu \ast({\partial}_{\alpha} e _{\nu}) .\end{align*} Moreover they commute, i.e. \({\partial}_ \alpha {\partial}_ \beta = {\partial}_{\beta} {\partial}_{\alpha}\). Set \({\partial}\mathrel{\vcenter{:}}=\prod_{ \alpha\in \Phi^+} {\partial}_ \alpha\) where the product here is composition, and view \({\partial}\) as an \(m\)th order differential operator, where \(m \mathrel{\vcenter{:}}= {\sharp}\Phi^+\). Write \(\omega( \lambda+ \rho)\mathrel{\vcenter{:}}=\sum_{w\in W} (-1)^{\ell(w)} e_{w (\lambda+ \rho)}\) for \(\lambda\in \Lambda^+\), so \(q = \omega ( \rho)\). Rewriting the WCF we have \begin{align*} \omega( \rho) \ast\operatorname{ch}_ \lambda = \omega( \lambda+ \rho) \qquad \star_1 ,\end{align*} and \begin{align*} \prod_{ \alpha\in \Phi^+}\qty{e_{ \alpha\over 2} - e_{- {\alpha\over 2}}} \ast\operatorname{ch}_ \lambda = \omega( \lambda+ \rho) .\end{align*} We now try to apply \({\partial}\) to both sides, followed by \(v\). Note that \(v\) is multiplicative for convolution, and if any two factors of \({\partial}\) hit the same factor on the LHS, then some factor \(e_{\alpha\over 2} - e_{-{\alpha\over 2}}\) is left undifferentiated; since \(v(e_{\alpha\over 2} - e_{-{ \alpha\over 2}}) = 0\), such terms vanish. So the total result will be zero unless all of the factors of \({\partial}\) are applied to the \(q\) factor in the LHS.
So apply \(v\circ {\partial}\) to \(\star_1\) to get \begin{align*} v( {\partial}\omega( \rho)) v( \operatorname{ch}_ \lambda) = v( {\partial}\omega( \lambda+ \rho)) .\end{align*} We can compute \begin{align*} v( {\partial}\omega ( \rho)) = v\qty{ {\partial}\sum_{w\in W} (-1)^{\ell(w)} e_{ w \rho} } = \sum_{w\in W} (-1)^{\ell(w)} v({\partial}( e_{ w \rho} ) ) .\end{align*} We have \begin{align*} v ( {\partial}( e_{ w \rho})) &= v\qty{ \qty{ \prod_{\alpha\in \Phi^+} {\partial}_\alpha } e_{ w \rho} } \\ &= v\qty{\prod_{ \alpha\in \Phi^+} (w \rho, \alpha) e_{ w \rho} } \\ &= \prod_{ \alpha\in \Phi^+} ( w \rho, \alpha) \\ &= \prod_{ \alpha\in \Phi^+} ( \rho, w^{-1}\alpha) \qquad \star_2 .\end{align*} Note that \(w^{-1}\Phi^+\) is a permutation of \(\Phi^+\), just potentially with some signs changed – in fact, exactly \(n(w^{-1})\) signs change, where \(n(w^{-1})\) is the number of positive roots sent to negative roots, and \(n(w^{-1}) = \ell(w^{-1}) = \ell(w)\). Thus \(\star_2\) is equal to \begin{align*} (-1)^{\ell(w)} \prod_{ \alpha\in \Phi^+}( \rho, \alpha) .\end{align*} Continuing the computation of the LHS, we have \begin{align*} v( {\partial}\omega( \rho)) &= \sum_{w\in W} (-1)^{\ell(w)} (-1)^{\ell(w)} \prod_{ \alpha\in \Phi^+}( \rho, \alpha) \\ &= {\sharp}W \prod_{ \alpha\in \Phi^+}( \rho, \alpha) .\end{align*} Similarly for the RHS, \begin{align*} v({\partial}\omega( \lambda+ \rho)) = {\sharp}W \prod_{ \alpha\in \Phi^+} (\lambda+ \rho, \alpha) .\end{align*}
Taking the quotient yields \begin{align*} \dim L(\lambda) = {{\sharp}W \prod_{ \alpha\in \Phi^+} (\lambda+ \rho, \alpha) \over {\sharp}W \prod_{ \alpha\in \Phi^+}( \rho, \alpha) } = {\prod_{ \alpha\in \Phi^+} (\lambda+ \rho, \alpha) \over \prod_{ \alpha\in \Phi^+}( \rho, \alpha) } .\end{align*} Multiplying the numerator and denominator by \(\prod_{ \alpha\in \Phi^+} {2\over (\alpha, \alpha)}\) yields \begin{align*} \prod_{ \alpha\in \Phi^+} {\left\langle { \lambda+ \rho},~{\alpha} \right\rangle} \over \prod_{ \alpha\in \Phi^+} {\left\langle {\rho},~{\alpha} \right\rangle} .\end{align*}
If \(\alpha\in \Phi^+\), using that the simple coroots \(\alpha_i {}^{ \vee }\) form a basis, one can write \(\alpha {}^{ \vee }= \sum_{i=1}^\ell c_i^{ \alpha} \alpha_i {}^{ \vee }\) for some \(c_{i}^{\alpha} \in {\mathbf{Z}}_{\geq 0}\) and \(\lambda = \sum_{i=1}^\ell m_i \lambda_i\) for \(m_i \in {\mathbf{Z}}_{ \geq 0}\). Using that \(( \rho, \alpha_i {}^{ \vee }) = {\left\langle { \rho},~{ \alpha_i} \right\rangle} = 1\), one can rewrite the dimension formula in terms of the integers \(c_i^{\alpha}\) and \(m_i\).
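For a concrete instance, the following sketch evaluates the dimension formula for \({\mathfrak{sl}}_3\), where the three positive coroots pair with \(\lambda = m_1\lambda_1 + m_2\lambda_2\) as \(m_1\), \(m_2\), and \(m_1+m_2\), and with \(\rho\) as \(1, 1, 2\) (the function name is ad hoc):

```python
# Weyl dimension formula for sl_3 in terms of the fundamental weight
# coordinates (m1, m2) of a dominant weight lambda.
def dim_sl3(m1: int, m2: int) -> int:
    lam = [m1, m2, m1 + m2]   # <lambda, alpha^vee> over the positive roots
    rho = [1, 1, 2]           # <rho, alpha^vee> over the positive roots
    num, den = 1, 1
    for a, b in zip(lam, rho):
        num *= a + b
        den *= b
    return num // den

print(dim_sl3(1, 0))  # 3, the standard representation
print(dim_sl3(0, 1))  # 3, its dual
print(dim_sl3(1, 1))  # 8, the adjoint representation
print(dim_sl3(2, 0))  # 6, Sym^2 of the standard representation
```

This recovers the closed form \(\dim L(\lambda) = {1\over 2}(m_1+1)(m_2+1)(m_1+m_2+2)\) for type \(A_2\).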
Where you could go after studying semisimple finite-dimensional Lie algebras over \({\mathbf{C}}\):
Infinite-dimensional representations of such algebras, e.g. the Verma modules \(M( \lambda)\). One has a SES \(K( \lambda) \hookrightarrow M( \lambda) \twoheadrightarrow L( \lambda)\), which doesn’t split since \(M( \lambda)\) is indecomposable.
Category \({\mathcal{O}}\), expressing characters of simples in terms of characters of Vermas.
Parabolic versions of Verma modules: we’ve looked at modules induced from \(B = H + N\), but one could look at parabolics \(P = U + \sum L_i\).
Coxeter groups, i.e. groups generated by reflections, including Weyl groups. These can be infinite; which ones are finite?
Quantize Coxeter groups to get Hecke algebras, which are algebras over \({\mathbf{C}}[q, q^{-1}]\). See Humphreys.
Representations of Lie groups over \({\mathbf{R}}\), semisimple algebraic groups, representations of finite groups of Lie type (see the classification of finite simple groups, e.g. algebraic groups over finite fields).
Characteristic \(p\) representation theory, which is much more difficult.
Infinite-dimensional Lie algebras over \({\mathbf{C}}\), e.g. affine/Kac-Moody algebras using the Serre relations on generalized Cartan matrices. See also current algebras, loop algebras.
Quantum groups (quantized enveloping algebras), closely tied to modular representation theory.
\begin{align*} x=\left(\begin{array}{ll}0 & 1 \\ 0 & 0\end{array}\right), \quad h=\left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right), \quad y=\left(\begin{array}{ll}0 & 0 \\ 1 & 0\end{array}\right), \\ {[x, y]=h, \quad[h, x]=2 x, \quad[y, h]=2 y} .\end{align*}

- \({\mathfrak{sl}}_n({ \mathbf{F} })\) has dimension \(n^2-1\) and corresponds to type \(A_{n-1}\).
Let \(L\) be the real vector space \(\mathbf{R}^{3}\). Define \([x y]=x \times y\) (cross product of vectors) for \(x, y \in L\), and verify that \(L\) is a Lie algebra. Write down the structure constants relative to the usual basis of \(\mathbf{R}^{3}\).
It suffices to check the 3 axioms given in class that define a Lie algebra:
L1 (Bilinearity): This can be quickly seen from the formula \begin{align*} a\times b = {\left\lVert {a} \right\rVert}\cdot {\left\lVert {b} \right\rVert}\sin(\theta_{ab}) \widehat{n}_{ab} \end{align*} where \(\widehat{n}_{ab}\) is the vector orthogonal to both \(a\) and \(b\) given by the right-hand rule. Additivity in each slot is the usual distributivity of the cross product, which can be checked directly in coordinates; compatibility with scalars follows readily from a direct computation: \begin{align*} (ra)\times(tb) &= {\left\lVert {ra} \right\rVert} \cdot {\left\lVert {tb} \right\rVert} \sin(\theta_{ra, tb}) \widehat{n}_{ra, tb} \\ &= (rt) {\left\lVert {a} \right\rVert} \cdot {\left\lVert {b} \right\rVert} \sin(\theta_{a, b}) \widehat{n}_{a, b} \\ &= (rt)\qty{a\times b} ,\end{align*} where we’ve used the fact that the angle between \(a\) and \(b\) is the same as the angle between any of their scalar multiples, as is their normal.
L2: that \(a\times a = 0\) readily follows from the same formula, since \(\sin( \theta_{a, a}) = \sin(0) = 0\).
L3 (The Jacobi identity): This is most easily seen from the “BAC - CAB” formula, \begin{align*} a\times (b\times c) = b{\left\langle {a},~{c} \right\rangle} - c{\left\langle {a},~{b} \right\rangle} .\end{align*} We proceed by expanding the Jacobi expression: \begin{align*} a\times(b\times c) + c\times (a\times b) + b\times (c\times a) &= {\color{blue} b{\left\langle {a},~{c} \right\rangle} } - {\color{red} c{\left\langle {a},~{b} \right\rangle} }\\ &\quad + {\color{green} a {\left\langle { c },~{ b } \right\rangle} } - {\color{blue} b {\left\langle { c },~{ a } \right\rangle} } \\ &\quad + {\color{red} c {\left\langle { a },~{ b } \right\rangle} } - {\color{green} a {\left\langle { b },~{ c } \right\rangle} } \\ &= 0 .\end{align*}
For the structure constants, let \(\left\{{e_1, e_2, e_3}\right\}\) be the standard Euclidean basis for \({\mathbf{R}}^3\); we can then write \(e_i\times e_j = \sum_{k=1}^3 c_{ij}^k e_k\) and we would like to determine the \(c_{ij}^k\). One can compute the following multiplication table:
\(e_i\times e_j\) | \(e_1\) | \(e_2\) | \(e_3\) |
---|---|---|---|
\(e_1\) | \(0\) | \(e_3\) | \(-e_2\) |
\(e_2\) | \(-e_3\) | \(0\) | \(e_1\) |
\(e_3\) | \(e_2\) | \(-e_1\) | \(0\) |
Thus the structure constants are given by the antisymmetric Levi-Civita symbol, \begin{align*} c_{ij}^k = {\varepsilon}_{ijk} \mathrel{\vcenter{:}}= \begin{cases} 0 & \text{if any index $i,j,k$ is repeated} \\ \operatorname{sgn}\sigma_{ijk} & \text{otherwise}, \end{cases} \end{align*} where \(\sigma_{ijk} \in S_3\) is the permutation sending \((1,2,3)\) to \((i, j, k)\) and \(\operatorname{sgn}\sigma\) is the sign homomorphism.
An example to demonstrate how the Levi-Civita symbol works:
\begin{align*} e_1\times e_2 = c_{12}^1 e_1 + c_{12}^2 e_2 + c_{12}^3 e_3 = 0 e_1 + 0 e_2 + 1e_3 \end{align*} since the first two terms have a repeated index and \begin{align*} c_{12}^3 = {\varepsilon}_{123} = \operatorname{sgn}\sigma_{123} = \operatorname{sgn}(\operatorname{id}) = 1 ,\end{align*} using that \(\operatorname{sgn}\sigma = (-1)^m\) where \(m\) is the number of transpositions in \(\sigma\).
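As a numerical spot check of the axioms and the structure constants, one can test the cross product on random vectors; a small sketch using numpy:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))

# L2: [a, a] = 0.
assert np.allclose(np.cross(a, a), 0)

# L3: the Jacobi identity, in the grouping used above.
jac = (np.cross(a, np.cross(b, c))
       + np.cross(c, np.cross(a, b))
       + np.cross(b, np.cross(c, a)))
assert np.allclose(jac, 0)

# Structure constants: e_i x e_j should match the Levi-Civita symbol.
e = np.eye(3)
for i, j in permutations(range(3), 2):
    print(f"e{i+1} x e{j+1} =", np.cross(e[i], e[j]))
```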
Let \(x \in {\mathfrak{gl}}_n({ \mathbf{F} })\) have \(n\) distinct eigenvalues \(a_{1}, \ldots, a_{n}\) in \({ \mathbf{F} }\). Prove that the eigenvalues of \({ \operatorname{ad}}_x\) are precisely the \(n^{2}\) scalars \(a_{i}-a_{j}\) for \(1 \leq i, j \leq n\), which of course need not be distinct.
For a fixed \(n\), let \(e_{ij} \in {\mathfrak{gl}}_n({ \mathbf{F} })\) be the matrix with a 1 in the \((i, j)\) position and zeros elsewhere. We will use the following fact: \begin{align*} e_{ij} e_{kl} = \delta_{jk} e_{il} ,\end{align*} where \(\delta_{jk} = 1 \iff j=k\), which implies that \begin{align*} [e_{ij} e_{kl} ] = e_{ij}e_{kl} - e_{kl}e_{ij} = \delta_{jk} e_{il} - \delta_{li}e_{kj} .\end{align*} Suppose without loss of generality that \(x\) is diagonal and of the form \(x = \operatorname{diag}(a_1, a_2, \cdots, a_n)\); this is possible since a matrix with \(n\) distinct eigenvalues is diagonalizable, and conjugating \(x\) changes \({ \operatorname{ad}}_x\) by a conjugation in \({ \operatorname{End} }({\mathfrak{gl}}_n({ \mathbf{F} }))\), which does not affect eigenvalues. Then the eigenvectors of \(x\) are precisely the \(e_{ij}\), since a direct check via matrix multiplication shows \(xe_{ij} = a_i e_{ij}\).
We claim that every \(e_{ij}\) is an eigenvector of \({ \operatorname{ad}}_x\) with eigenvalue \(a_i - a_j\). Noting that the \(e_{ij}\) are also left eigenvectors satisfying \(e_{ij}x = a_j e_{ij}\), one readily computes \begin{align*} { \operatorname{ad}}_x e_{ij} \mathrel{\vcenter{:}}=[x, e_{ij}] = xe_{ij} - e_{ij} x = a_i e_{ij} - a_j e_{ij} = (a_i - a_j)e_{ij} ,\end{align*} yielding the \(n^2\) eigenvalues \(a_i - a_j\). Since \({ \operatorname{ad}}_x\) expanded in the basis \(\left\{{e_{ij}}\right\}_{1\leq i, j \leq n}\) is an \(n^2\times n^2\) matrix, this exhausts all possible eigenvalues.
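This can also be confirmed numerically: identifying \({\mathfrak{gl}}_n\) with \({ \mathbf{F} }^{n^2}\) via a row-major flattening, the operator \(M \mapsto xM - Mx\) becomes the matrix \(x \otimes I - I \otimes x^t\). A sketch with an arbitrary choice of distinct eigenvalues:

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])            # distinct eigenvalues of x
x = np.diag(a)
n = len(a)

# ad_x as an n^2 x n^2 matrix acting on row-major-flattened matrices.
ad_x = np.kron(x, np.eye(n)) - np.kron(np.eye(n), x.T)

eigs = np.sort(np.linalg.eigvals(ad_x).real)
expected = np.sort([ai - aj for ai in a for aj in a])
assert np.allclose(eigs, expected)        # the n^2 differences a_i - a_j
```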
When \(\operatorname{ch}{ \mathbf{F} }=0\), show that each classical algebra \(L=\mathrm{A}_{\ell}, \mathrm{B}_{\ell}, \mathrm{C}_{\ell}\), or \(\mathrm{D}_{\ell}\) is equal to \([L L]\). (This shows again that each algebra consists of trace 0 matrices.)
We will check this for type \(A_n\), corresponding to \(L \mathrel{\vcenter{:}}={\mathfrak{sl}}_{n+1}\). Since \([LL] \subseteq L\), it suffices to show \(L \subseteq [LL]\), and we can further reduce to writing every basis element of \(L\) as a commutator in \([LL]\). Note that \(L\) has a standard basis given by the matrices \(e_{ij}\) for \(i\neq j\), together with the diagonal matrices \(h_i \mathrel{\vcenter{:}}= e_{ii} - e_{i+1,i+1}\).
Considering the equation \([e_{ij} e_{kl} ] = \delta_{jk} e_{il} - \delta_{li}e_{kj}\), one can choose \(j=k\) to preserve the first term and \(l\neq i\) to kill the second term. So letting \(t, i, j\) be arbitrary with \(i\neq j\), we have \begin{align*} [e_{it} e_{tj}] = \delta_{tt} e_{ij} - \delta_{ij}e_{tt} = e_{ij} ,\end{align*} yielding all of the \(x_i\) and \(y_i\). But in fact we are done, using the fact that \(h_i = [x_i y_i]\).
Verify that the commutator of two derivations of an \({ \mathbf{F} }{\hbox{-}}\)algebra is again a derivation, whereas the ordinary product need not be.
We want to show that \([\mathop{\mathrm{Der}}(L) \mathop{\mathrm{Der}}(L)] \subseteq \mathop{\mathrm{Der}}(L)\), so let \(f,g\in \mathop{\mathrm{Der}}(L)\). The result follows from a direct computation; letting \(D \mathrel{\vcenter{:}}=[fg]\), we have \begin{align*} D(ab) = [fg](ab) &= (fg-gf)(ab) \\ &= fg(ab) - gf(ab) \\ &= f\qty{g(a)b + ag(b) } - g\qty{ f(a)b + af(b)} \\ &= f\qty{g(a)b} + f\qty{ag(b)} - g\qty{f(a)b} - g\qty{af(b)} \\ &= { {\color{blue} (fg)(a)b } + {\color{red} g(a)f(b)} } \\ &\quad + { {\color{red} f(a)g(b) } + {\color{green} a (fg)(b)} } \\ &\quad - { {\color{blue} (gf)(a) b } + {\color{red} f(a)g(b)} } \\ &\quad - { {\color{red} g(a)f(b) } + {\color{green} a(gf)(b)} } \\ &= {\color{blue} [fg](a) b} + {\color{green} a [fg](b) } \\ &= D(a)b + aD(b) .\end{align*}
To see that ordinary products of derivations need not be derivations, consider the operators \(D_x \mathrel{\vcenter{:}}={\frac{\partial }{\partial x}\,}, D_y = {\frac{\partial }{\partial y}\,}\) acting on the \({\mathbf{R}}{\hbox{-}}\)algebra \({\mathbf{R}}[x,y]\). Take \(f(x,y) = x+y\) and \(g(x,y) = xy\), so that the product of polynomials is \(fg = x^2 y+ xy^2\). Note that the commutator \([D_x D_y] = 0\) is trivially a derivation since mixed partials commute; however, the composite \(D_xD_y\) is not. Here \(D_xD_y(f) = 0\) and \(D_xD_y(g) = 1\), so if \(D_xD_y\) were a derivation we would have \(D_xD_y(fg) = (D_xD_y f)g + f(D_xD_y g) = x + y\), but directly \begin{align*} D_x D_y (fg) = D_x \qty{x^2 + 2xy} = 2x + 2y \neq x+y .\end{align*}
Let \(L\) be a Lie algebra over an algebraically closed field \({ \mathbf{F} }\) and let \(x \in L\). Prove that the subspace of \(L\) spanned by the eigenvectors of \({ \operatorname{ad}}_x\) is a subalgebra.
Let \(E_x \subseteq L\) be the subspace spanned by eigenvectors of \({ \operatorname{ad}}_x\); it suffices to show \([E_x E_x] \subseteq E_x\). Letting \(y_i \in E_x\), we have \({ \operatorname{ad}}_x(y_i) = \lambda_i y_i\) for some scalars \(\lambda_i \in { \mathbf{F} }\), and we want to show \({ \operatorname{ad}}_x([y_1 y_2]) = \lambda_{12} [y_1 y_2]\) for some scalar \(\lambda_{12}\). Note that the Jacobi identity is equivalent to \({ \operatorname{ad}}\) acting as a derivation with respect to the bracket, i.e. \begin{align*} { \operatorname{ad}}_x([yz]) = [ { \operatorname{ad}}_x(y) z] + [y { \operatorname{ad}}_x(z)] \iff [x[yz]] = [[xy]z] + [y[xz]] .\end{align*} The result then follows from a direct computation: \begin{align*} { \operatorname{ad}}_x([y_1y_2]) &= [[xy_1]y_2] + [y_1 [xy_2]] \\ &= [ \lambda_1 y_1 y_2] + [y_1 \lambda_2 y_2] \\ &= (\lambda_1 + \lambda_2)[y_1 y_2] .\end{align*}
Prove that the set of all inner derivations \({ \operatorname{ad}}_x, x \in L\), is an ideal of \(\mathop{\mathrm{Der}}L\).
It suffices to show \([\mathop{\mathrm{Der}}(L) \mathop{\mathrm{Inn}}(L)] \subseteq \mathop{\mathrm{Inn}}(L)\), so let \(f\in \mathop{\mathrm{Der}}(L)\) and \({ \operatorname{ad}}_x \in \mathop{\mathrm{Inn}}(L)\). The result follows from the following check: \begin{align*} [f { \operatorname{ad}}_x](l) &= (f\circ { \operatorname{ad}}_x)(l) - ( { \operatorname{ad}}_x \circ f)(l) \\ &= f([xl]) - [x f(l)] \\ &= [f(x) l] + [x f(l)] - [x f(l)] \\ &= [f(x) l] \\ &= { \operatorname{ad}}_{f(x)}(l), \qquad \text{and } { \operatorname{ad}}_{f(x)} \in \mathop{\mathrm{Inn}}(L) .\end{align*}
Show that \(\mathfrak{s l}_n( { \mathbf{F} })\) is precisely the derived algebra of \(\mathfrak{g l}_n( { \mathbf{F} })\) (cf. Exercise 1.9).
We want to show \({\mathfrak{gl}}_n({ \mathbf{F} })^{(1)} \mathrel{\vcenter{:}}=[{\mathfrak{gl}}_n({ \mathbf{F} }) {\mathfrak{gl}}_n({ \mathbf{F} })] = {\mathfrak{sl}}_n({ \mathbf{F} })\).
\(\subseteq\): This is immediate from the fact that for any matrices \(A\) and \(B\), \begin{align*} {\mathrm{tr}}([AB]) = {\mathrm{tr}}(AB -BA) = {\mathrm{tr}}(AB) - {\mathrm{tr}}(BA) = {\mathrm{tr}}(AB) - {\mathrm{tr}}(AB) = 0 .\end{align*}
\(\supseteq\): From a previous exercise, we know that \([{\mathfrak{sl}}_n({ \mathbf{F} }) {\mathfrak{sl}}_n({ \mathbf{F} })] = {\mathfrak{sl}}_n({ \mathbf{F} })\), and since \({\mathfrak{sl}}_n({ \mathbf{F} }) \subseteq {\mathfrak{gl}}_n({ \mathbf{F} })\) we have \begin{align*} {\mathfrak{sl}}_n({ \mathbf{F} }) = {\mathfrak{sl}}_n({ \mathbf{F} })^{(1)} \subseteq {\mathfrak{gl}}_n({ \mathbf{F} })^{(1)} .\end{align*}
Suppose \(\operatorname{dim} L=3\) and \(L=[L L]\). Prove that \(L\) must be simple. Observe first that any homomorphic image of \(L\) also equals its derived algebra. Recover the simplicity of \(\mathfrak{s l}_2( { \mathbf{F} })\) when \(\operatorname{ch}{ \mathbf{F} }\neq 2\).
Let \(I{~\trianglelefteq~}L\) be a nonzero proper ideal, then \(\dim L/I < \dim L\) forces \(\dim L/I = 1,2\). Since \(L\twoheadrightarrow L/I\), the latter is the homomorphic image of a Lie algebra and thus \((L/I)^{(1)} = L/I\) by the hint. Note that in particular, \(L/I\) is not abelian. We proceed by cases: if \(\dim L/I = 1\), then \(L/I\) is automatically abelian, a contradiction. If \(\dim L/I = 2\), then the derived algebra of any 2-dimensional Lie algebra is at most 1-dimensional (it is spanned by the single bracket of the two basis elements), so \((L/I)^{(1)} \neq L/I\), again a contradiction.

So no such nonzero proper ideals \(I\) can exist, forcing \(L\) to be simple.
Applying this to \(L \mathrel{\vcenter{:}}={\mathfrak{sl}}_2({ \mathbf{F} })\), we have \(\dim_{ \mathbf{F} }{\mathfrak{sl}}_2({ \mathbf{F} }) = 2^2-1 = 3\), and from a previous exercise we know \({\mathfrak{sl}}_2({ \mathbf{F} })^{(1)} = {\mathfrak{sl}}_2({ \mathbf{F} })\), so the above argument applies and shows simplicity.
Let \(\sigma\) be the automorphism of \(\mathfrak{s l}_2({ \mathbf{F} })\) defined in (2.3). Verify that
Note that this automorphism is defined as \begin{align*} \sigma = \exp( { \operatorname{ad}}_x)\circ \exp( { \operatorname{ad}}_{-y}) \circ \exp( { \operatorname{ad}}_x) .\end{align*}
We recall that \(\exp { \operatorname{ad}}_x(y) \mathrel{\vcenter{:}}=\sum_{n\geq 0}{1\over n!} { \operatorname{ad}}_x^n(y)\), where the exponent denotes an \(n{\hbox{-}}\)fold composition of operators. To compute these power series, first note that \({ \operatorname{ad}}_t(t) = 0\) for \(t=x,y,h\) by axiom L2, so \begin{align*} (\exp { \operatorname{ad}}_t)(t) = 1(t) + { \operatorname{ad}}_t(t) + {1\over 2} { \operatorname{ad}}_t^2(t) + \cdots = 1(t) = t \end{align*} where \(1\) denotes the identity operator. It is worth noting that if \({ \operatorname{ad}}_t^n(t') = 0\) for some \(n\) and some fixed \(t,t'\), then it is also zero for all higher \(n\) since each successive term involves bracketing with the previous term: \begin{align*} { \operatorname{ad}}^{n+1}_t(t') = [t\, { \operatorname{ad}}_t^n(t')] = [t\, 0] = 0 .\end{align*}
We first compute some individual nontrivial terms that will appear in \(\sigma\). The first order terms are given by standard formulas, which we collect into a multiplication table for the bracket:
\(x\) | \(h\) | \(y\) | |
---|---|---|---|
\(x\) | \(0\) | \(-2x\) | \(h\) |
\(h\) | \(2x\) | \(0\) | \(-2y\) |
\(y\) | \(-h\) | \(2y\) | \(0\) |
We can thus read off the following:
For reference, we compute and collect higher order terms:
Finally, we can compute the individual terms of \(\sigma\):
\begin{align*} (\exp { \operatorname{ad}}_x)(x) &= x \\ \\ (\exp { \operatorname{ad}}_x)(h) &= 1(h) + { \operatorname{ad}}_x(h) \\ &= h + (-2x) \\ &= h-2x \\ \\ (\exp { \operatorname{ad}}_x)(y) &= 1(y) + { \operatorname{ad}}_x(y) + {1\over 2} { \operatorname{ad}}_x^2(y) \\ &= y + h + {1\over 2}(-2x) \\ &= y+h-x \\ \\ (\exp { \operatorname{ad}}_{-y})(x) &= 1(x) + { \operatorname{ad}}_{-y}(x) + {1\over 2} { \operatorname{ad}}^2_{-y}(x) \\ &= x + h +{1\over 2}(-2y) \\ &= x+h-y \\ \\ (\exp { \operatorname{ad}}_{-y})(h) &= 1(h) + { \operatorname{ad}}_{-y}(h) \\ &= h - 2y \\ \\ (\exp { \operatorname{ad}}_{-y})(y) &= y ,\end{align*} and assembling everything together yields
\begin{align*} \sigma(x) &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y} \circ \exp { \operatorname{ad}}_x)(x) \\ &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y})(x) \\ &= (\exp { \operatorname{ad}}_x)(x+h-y) \\ &= (x) + (h-2x) - (y+h-x) \\ &= -y \\ \\ \sigma(y) &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y} \circ \exp { \operatorname{ad}}_x)(y) \\ &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y} )(y+h-x) \\ &= \exp { \operatorname{ad}}_x\qty{(y) + (h-2y) - (x+h-y) } \\ &= \exp { \operatorname{ad}}_x\qty{-x} \\ &= -x \\ \\ \sigma(h) &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y} \circ \exp { \operatorname{ad}}_x)(h) \\ &= (\exp { \operatorname{ad}}_x \circ \exp { \operatorname{ad}}_{-y} )(h-2x) \\ &= (\exp { \operatorname{ad}}_x )( (h-2y) - 2(x+h-y) ) \\ &= (\exp { \operatorname{ad}}_x )(-2x -h ) \\ &= -2(x) - (h-2x) \\ &= -h .\end{align*}
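Since each \({ \operatorname{ad}}\) operator here is nilpotent of order 3, the exponentials are the finite sums \(1 + N + N^2/2\), and the computation above can be double-checked numerically. A minimal sketch in the ordered basis \((x, h, y)\), with the \({ \operatorname{ad}}\) matrices read off from the bracket table:

```python
import numpy as np

# ad_x and ad_y on sl_2 in the ordered basis (x, h, y): column j holds the
# coordinates of the bracket of the generator with the j-th basis element.
ad_x = np.array([[0., -2., 0.],
                 [0.,  0., 1.],
                 [0.,  0., 0.]])
ad_y = np.array([[ 0., 0., 0.],
                 [-1., 0., 0.],
                 [ 0., 2., 0.]])

def exp_nilpotent(N):
    # exp(N) = I + N + N^2/2, exact here since N^3 = 0.
    return np.eye(3) + N + N @ N / 2

sigma = exp_nilpotent(ad_x) @ exp_nilpotent(-ad_y) @ exp_nilpotent(ad_x)
print(sigma)
# Columns give sigma(x) = -y, sigma(h) = -h, sigma(y) = -x.
```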
Let \(I\) be an ideal of \(L\). Then each member of the derived series or descending central series of \(I\) is also an ideal of \(L\).
To recall definitions:
For the derived series, inductively suppose \(I \mathrel{\vcenter{:}}= I^{(i)}\) is an ideal, so \([L I] \subseteq I\). We then want to show \(I^{(i+1)} \mathrel{\vcenter{:}}=[I, I]\) is an ideal, so \([L, [I, I] ] \subseteq [I, I]\). Letting \(l\in L\), and \(i,j\in I\), one can use the Jacobi identity, antisymmetry of the bracket, and the fact that \([L, I] \subseteq I\) to write \begin{align*} [L, [I, I]] &\ni [l[ij]] \\ &= [[li]j] - [ [lj] i] \\ &\in [[L,I], I] + [[L,I], I] \\ &= [[L,I], I] \subseteq [I,I] .\end{align*}
Similarly, for the lower central series, inductively suppose \(I\mathrel{\vcenter{:}}= I^i\) is an ideal, so \([L, I] \subseteq I\); we want to show \([L, [L, I]] \subseteq [L, I]\). Again using the Jacobi identity and antisymmetry, we have \begin{align*} [L, [L, I]] &\ni [l_1, [l_2, i]] \\ &= [[i, l_1], l_2] - [[l_2, l_1], i] \\ &\in [[I,L], L] + [ [L, L], I] \\ &\subseteq [I, L] + [L, I] \subseteq [L, I] .\end{align*}
Prove that \(L\) is solvable (resp. nilpotent) if and only if \({ \operatorname{ad}}(L)\) is solvable (resp. nilpotent).
\(\implies\): By the propositions in Section 3.1 (resp. 3.2), the homomorphic image of any solvable (resp. nilpotent) Lie algebra is again solvable (resp. nilpotent).
\(\impliedby\): There is an exact sequence \begin{align*} 0 \to Z(L) \to L \xrightarrow{ { \operatorname{ad}}} { \operatorname{ad}}(L) \to 0 ,\end{align*} exhibiting \({ \operatorname{ad}}(L)\cong L/Z(L)\). Thus if \({ \operatorname{ad}}(L)\) is solvable, noting that centers are always solvable, we can use the fact that the 2-out-of-3 property for short exact sequences holds for solvability. Moreover, by the proposition in Section 3.2, if \(L/Z(L)\) is nilpotent then \(L\) is nilpotent.
Prove that the sum of two nilpotent ideals of a Lie algebra \(L\) is again a nilpotent ideal. Therefore, \(L\) possesses a unique maximal nilpotent ideal. Determine this ideal for the nonabelian 2-dimensional algebra \({ \mathbf{F} }x + { \mathbf{F} }y\) where \([xy]=x\), and the 3-dimensional algebra \({ \mathbf{F} }x + { \mathbf{F} }y + { \mathbf{F} }z\) where
To see that sums of nilpotent ideals are nilpotent, suppose \(I^n = J^m = 0\) are nilpotent ideals. Then \((I+J)^{m+n} \subseteq I^n + J^m = 0\) by collecting terms and using the absorbing property of ideals: each bracket word appearing in \((I+J)^{m+n}\) contains enough letters from \(I\) or enough letters from \(J\) by pigeonhole. One can now construct a maximal nilpotent ideal in \(L\) by defining \(M\) as the sum of all nilpotent ideals in \(L\). That this is unique is clear, since \(M\) is nilpotent, so if \(M'\) is another maximal nilpotent ideal then \(M \subseteq M'\) and \(M' \subseteq M\).
Consider the 2-dimensional algebra \(L \mathrel{\vcenter{:}}={ \mathbf{F} }x + { \mathbf{F} }y\) where \([xy] = x\) and let \(I\) be the maximal nilpotent ideal. Note that \(L\) is not nilpotent since \(L^k = { \mathbf{F} }x\) for all \(k\geq 1\): \(L^1 = { \mathbf{F} }x\) and \([L, { \mathbf{F} }x] = { \mathbf{F} }x\), since all brackets are either zero or scalar multiples of \(x\). However, this also shows that the subalgebra \({ \mathbf{F} }x\) is an ideal, and is in fact a nilpotent ideal since \([{ \mathbf{F} }x, { \mathbf{F} }x] = 0\). Although \({ \mathbf{F} }y\) is a nilpotent subalgebra, it is not an ideal since \([L, { \mathbf{F} }y] = { \mathbf{F} }x \not\subseteq { \mathbf{F} }y\). So \(I\) is at least 1-dimensional, since it contains \({ \mathbf{F} }x\), and at most 1-dimensional, since it is not all of \(L\) (as \(L\) is not nilpotent), forcing \(I = { \mathbf{F} }x\).
Consider now the 3-dimensional algebra \(L \mathrel{\vcenter{:}}={ \mathbf{F} }x + { \mathbf{F} }y + { \mathbf{F} }z\) with the multiplication table given in the problem statement above. Note that \(L\) is not nilpotent, since \(L^1 = { \mathbf{F} }y + { \mathbf{F} }z = L^k\) for all \(k\geq 2\). This follows from considering \([L, { \mathbf{F} }y + { \mathbf{F} }z]\), where choosing \(x\in L\) in the first slot and \(y\) or \(z\) in the second slot hits all generators of \({ \mathbf{F} }y + { \mathbf{F} }z\); however, no element brackets to \(x\). So similarly to the previous algebra, \(J \mathrel{\vcenter{:}}={ \mathbf{F} }y + { \mathbf{F} }z\) is an ideal, and it is nilpotent since all brackets between \(y\) and \(z\) vanish. By similar dimensional considerations, \(J\) must equal the maximal nilpotent ideal.
Let \(L\) be a Lie algebra, \(K\) an ideal of \(L\) such that \(L / K\) is nilpotent and such that \({ \left.{{ { \operatorname{ad}}_x}} \right|_{{K}} }\) is nilpotent for all \(x \in L\). Prove that \(L\) is nilpotent.
Let \(x\in L\). Since \(M \mathrel{\vcenter{:}}= L/K\) is a nilpotent algebra, every element of it is ad-nilpotent, so \({ \operatorname{ad}}_{\bar x}^a = 0\) on \(L/K\) for some \(a\), i.e. \({ \operatorname{ad}}_x^a(L) \subseteq K\). By assumption \({ \left.{{ { \operatorname{ad}}_x}} \right|_{{K}} }\) is nilpotent, say \({ \operatorname{ad}}_x^b(K) = 0\), and so \({ \operatorname{ad}}_x^{a+b}(L) = 0\). Thus every \(x\in L\) is ad-nilpotent, and Engel’s theorem forces \(L\) to be nilpotent.
Let \(L= {\mathfrak{sl}}(V)\). Use Lie’s Theorem to prove that \(\operatorname{Rad} L=Z(L)\); conclude that \(L\) is semisimple.
Hint: observe that \(\operatorname{Rad} L\) lies in each maximal solvable subalgebra \(B\) of \(L\). Select a basis of \(V\) so that \(B=L \cap {\mathfrak{t}}_n({ \mathbf{F} })\), and notice that \(B^t\) is also a maximal solvable subalgebra of \(L\). Conclude that \(\operatorname{Rad} L \subseteq L \cap {\mathfrak{d}}_n({ \mathbf{F} })\) (diagonal matrices), then that \(\operatorname{Rad} L=Z(L)\).
Let \(R = \mathop{\mathrm{Rad}}(L)\) be the radical (maximal solvable ideal) of \(L\). Using the hint, if \(S \leq L\) is a maximal solvable subalgebra then it must contain \(R\): since \(R\) is a solvable ideal, \(R + S\) is again a solvable subalgebra containing \(S\), so \(R + S = S\) by maximality. By (a corollary of) Lie’s theorem, \(S\) stabilizes a flag, and thus there is a basis with respect to which all elements of \(S\) (and thus \(R\)) are upper triangular, so \(S \subseteq {\mathfrak{b}}\). Taking the transpose of every element of \(S\) yields another maximal solvable subalgebra \(S^t\), which is lower triangular in this basis, so \(S^t \subseteq {\mathfrak{b}}^-\). Since \(x\mapsto -x^t\) is an automorphism of \(L\), it preserves the radical, giving \(R = R^t \subseteq S^t\). Thus \(R \subseteq {\mathfrak{b}}\cap{\mathfrak{b}}^- = {\mathfrak{h}}\), which consists of just diagonal matrices.
We have \(Z(L) \subseteq R\) since centers are solvable, and the claim is that \(R \subseteq {\mathfrak{h}}\implies R \subseteq Z(L)\). It suffices to show that \(R\) consists of scalar matrices, since it is well-known that \(Z({\mathfrak{gl}}_n({ \mathbf{F} }))\) consists of precisely scalar matrices, and this contains \(Z(L)\) since \(L \leq {\mathfrak{gl}}_n({ \mathbf{F} })\) is a subalgebra. This follows by letting \(\ell = \sum a_i e_{i,i}\) be an element of \(\mathop{\mathrm{Rad}}(L)\) and considering bracketing elements of \({\mathfrak{sl}}_n({ \mathbf{F} })\) against it. Bracketing elementary matrices \(e_{i, j}\) with \(i\neq j\) yields \begin{align*} [e_{i,j}, \ell] = a_j e_{i, j} - a_i e_{i, j} ,\end{align*} which must be an element of \(\mathop{\mathrm{Rad}}(L)\) and thus diagonal, which forces \(a_j = a_i\) for all \(i, j\).
To conclude that \(L\) is semisimple, note that a scalar traceless matrix is necessarily zero, and so \(Z({\mathfrak{sl}}(V)) = 0\). This suffices since \(\mathop{\mathrm{Rad}}(L) = 0 \iff L\) is semisimple.
Consider the \(p \times p\) matrices: \begin{align*} x=\begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \end{bmatrix} ,\qquad y = \operatorname{diag}(0,1,2,3,\cdots,p-1) .\end{align*}
Check that \([x, y]=x\), hence that \(x\) and \(y\) span a two dimensional solvable subalgebra \(L\) of \(\mathfrak{g l}(p, F)\). Verify that \(x, y\) have no common eigenvector.
Note that left multiplication by \(x\) cycles the rows of a matrix up by one position (cyclically), and similarly right multiplication by \(x\) cycles the columns to the right. Thus \begin{align*} xy - yx &= \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \cdots & 0 & p-1 \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix} - \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ \vdots & \cdots & 0 & 2 & 0 \\ 0 & 0 & \cdots & 0 & 3 \\ p-1 & 0 & \cdots & 0 & 0 \end{bmatrix} \\ &= \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ -(p-1) & 0 & 0 & 0 & 0 \end{bmatrix} = x, \qquad\text{since } -(p-1) \equiv 1 \pmod{p} .\end{align*} Thus \(L \mathrel{\vcenter{:}}={ \mathbf{F} }x + { \mathbf{F} }y\) is a two-dimensional solvable subalgebra, since \(L^{(1)} = { \mathbf{F} }x\) and so \(L^{(2)} = 0\).
Moreover, note that every basis vector \(e_i\) is an eigenvector for \(y\) since \(y(e_i) = i e_i\), while no basis vector is an eigenvector for \(x\), since \(x\) cyclically permutes the basis: \(x(e_i) = e_{i-1}\) with indices taken mod \(p\). Since the eigenvalues of \(y\) are distinct, every eigenvector of \(y\) is a scalar multiple of some \(e_i\), so \(x\) and \(y\) have no common eigenvector.
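These computations are easy to replicate numerically; a small sketch over the integers, reducing mod \(p\):

```python
import numpy as np

p = 5
# x cycles the basis vectors; rolling the rows of the identity up by one
# gives ones on the superdiagonal and a single one in the bottom-left corner.
x = np.roll(np.eye(p, dtype=int), -1, axis=0)
y = np.diag(np.arange(p))

# [x, y] = x holds mod p, as computed above.
assert (((x @ y - y @ x) % p) == (x % p)).all()

# y has distinct eigenvalues 0, ..., p-1 with eigenvectors e_i, but x
# permutes the e_i cyclically, so none of them is an eigenvector of x.
print(x @ np.eye(p, dtype=int)[0])  # x e_0 = e_{p-1}
```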
For arbitrary \(p\), construct a counterexample to Corollary C as follows: Start with \(L \subset {\mathfrak{gl}}_p({ \mathbf{F} })\) as in Exercise 3. Form the vector space direct sum \(M=L \oplus { \mathbf{F} }^p\), and make \(M\) a Lie algebra by decreeing that \({ \mathbf{F} }^p\) is abelian, while \(L\) has its usual product and acts on \({ \mathbf{F} }^p\) in the given way.
Verify that \(M\) is solvable, but that its derived algebra \(\left(={ \mathbf{F} }x+ { \mathbf{F} }^p\right)\) fails to be nilpotent.
For pairs \(A_1 \oplus v_1\) and \(A_2 \oplus v_2\) in \(M\), we’ll interpret the given definition of the bracket as \begin{align*} [A_1 \oplus v_1, A_2 \oplus v_2] \mathrel{\vcenter{:}}=[A_1, A_2] \oplus (A_1(v_2) - A_2(v_1)) ,\end{align*} where \(A_i(v_j)\) denotes evaluating an endomorphism \(A\in {\mathfrak{gl}}_p({ \mathbf{F} })\) on a vector \(v\in { \mathbf{F} }^p\). We also define \(L = { \mathbf{F} }x + { \mathbf{F} }y\) with \(x\) and \(y\) the given matrices in the previous problem, and note that \(L\) is solvable with derived series \begin{align*} L = { \mathbf{F} }x \oplus { \mathbf{F} }y \supseteq L^{(1)} = { \mathbf{F} }x \supseteq L^{(2)} = 0 .\end{align*}
Consider the derived series of \(M\): by inspecting the above definition, we have \begin{align*} M^{(1)} \subseteq L^{(1)} \oplus { \mathbf{F} }^p = { \mathbf{F} }x \oplus { \mathbf{F} }^p .\end{align*} Moreover, we have \begin{align*} M^{(2)} \subseteq L^{(2)} \oplus { \mathbf{F} }^p = 0 \oplus { \mathbf{F} }^p ,\end{align*} which follows from considering the bracket of two elements of \(M^{(1)}\): setting \(w_{ij} \mathrel{\vcenter{:}}= A_i(v_j) - A_j(v_i)\), we have \begin{align*} [ [A_1, A_2] \oplus w_{1,2}, \,\, [A_3, A_4] \oplus w_{3, 4} ] = [ [A_1, A_2], [A_3, A_4] ] \oplus \qty{ [A_1, A_2](w_{3, 4}) - [A_3, A_4](w_{1, 2}) } .\end{align*} We can then see that \(M^{(3)} = 0\), since for any \(w_i \in { \mathbf{F} }^p\), \begin{align*} [0 \oplus w_1, \, 0 \oplus w_2] = 0 \oplus \qty{ 0(w_2)-0(w_1) } = 0 \oplus 0 ,\end{align*} and so \(M\) is solvable.
Now consider its derived subalgebra \(M^{(1)} = { \mathbf{F} }x \oplus { \mathbf{F} }^p\). If this were nilpotent, every element would be ad-nilpotent, but let \(v = {\left[ {1,1,\cdots, 1} \right]}\) and consider \({ \operatorname{ad}}_{x \oplus 0}\). We have \begin{align*} { \operatorname{ad}}_{x \oplus 0}(0 \oplus v) = [x \oplus 0, 0 \oplus v] = 0 \oplus xv = 0 \oplus v ,\end{align*} where we’ve used that \(x\) acts on the left on vectors by cycling the entries. Thus \({ \operatorname{ad}}_{x \oplus 0}^n (0 \oplus v) = 0 \oplus v\) for all \(n\geq 1\) and \(x \oplus 0 \in M^{(1)}\) is not ad-nilpotent.
Prove that if \(L\) is nilpotent then the Killing form of \(L\) is identically zero.
Note that if \(L\) is nilpotent then its lower central series terminates, say \(L^n = 0\), and \({ \operatorname{ad}}_z(L^k) \subseteq L^{k+1}\) for any \(z\in L\) by definition of the series.

The claim is that for any \(x,y\in L\), the composite \({ \operatorname{ad}}_x \circ { \operatorname{ad}}_y\) is a nilpotent endomorphism of \(L\), from which it follows immediately that \(\beta\) is identically zero, since nilpotent endomorphisms are traceless and \(\beta(x,y) \mathrel{\vcenter{:}}=\operatorname{Trace}( { \operatorname{ad}}_x \circ { \operatorname{ad}}_y)\).

To see the claim, iterate the containment above: \begin{align*} \qty{ { \operatorname{ad}}_x \circ { \operatorname{ad}}_y}^m (L) \subseteq L^{2m} = 0 \qquad\text{whenever } 2m \geq n ,\end{align*} so \({ \operatorname{ad}}_x { \operatorname{ad}}_y\) is nilpotent and hence traceless, yielding \(\beta(x, y) = 0\). (Note that \(\operatorname{Trace}( { \operatorname{ad}}_{[xy]}) = 0\) holds in any Lie algebra, since the trace of any commutator vanishes, so that alone is not enough to conclude.)
Relative to the standard basis of \({\mathfrak{sl}}_3({ \mathbf{F} })\), compute \(\operatorname{det}\kappa\). What primes divide it?
Hint: use 6.7, which says \(\kappa_{{\mathfrak{gl}}_n}(x, y) = 2n \operatorname{Trace}(xy)\).
We have the following standard basis: \begin{align*} x_1=\left[\begin{array}{ccc} \cdot & 1 & \cdot \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{array}\right] \quad &x_2=\left[\begin{array}{ccc} \cdot & \cdot & 1 \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & . \end{array}\right] &x_3=\left[\begin{array}{ccc} \cdot & \cdot & \cdot \\ \cdot & \cdot & 1 \\ \cdot & \cdot & \cdot \end{array}\right]\\ h_1=\left[\begin{array}{ccc} 1 & \cdot & \cdot \\ \cdot & -1 & \cdot \\ \cdot & \cdot & \cdot \end{array}\right] \quad &h_2=\left[\begin{array}{ccc} \cdot & \cdot & \cdot \\ \cdot & 1 & \cdot \\ \cdot & \cdot & -1 \end{array}\right] &\\ y_1=\left[\begin{array}{ccc} \cdot & \cdot & \cdot \\ 1 & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{array}\right] \quad &y_2=\left[\begin{array}{lll} \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \\ 1 & \cdot & \cdot \end{array}\right] &y_3=\left[\begin{array}{ccc} \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \\ \cdot & 1 & \cdot \end{array}\right] .\end{align*} For notational convenience, let \(\left\{{v_1,\cdots, v_8}\right\}\) denote this ordered basis.
Direct computations show
Let \(E_{ij}\) denote the elementary \(8\times 8\) matrices with a 1 in the \((i, j)\) position. We then have, for example, \begin{align*} { \operatorname{ad}}_{x_1} &= E_{2,3} - 2 E_{1, 4} + E_{1, 5} + E_{4, 6} - E_{8, 7} \\ &= \left(\begin{array}{rrrrrrrr} \cdot & \cdot & \cdot & -2 & 1 & \cdot & \cdot & \cdot \\ \cdot & \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & 1 & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & -1 & \cdot \end{array}\right) ,\end{align*} where column \(j\) records the coordinates of \([x_1, v_j]\) in the basis \(\left\{{v_i}\right\}\).
The remaining computations can be readily automated on a computer, yielding the following matrices for the remaining \({ \operatorname{ad}}_{v_i}\):
\({ \operatorname{ad}}_{x_1} = 0 + 0 + E_{2,3} -2 E_{1, 4} + E_{1, 5} + E_{4, 6} - E_{8, 7} + 0\)
\({ \operatorname{ad}}_{x_2} = 0 + 0 + 0 -E_{2, 4} -E_{2, 5} - E_{3, 6} + (E_{4, 7} +E_{5, 7}) + E_{1, 8}\)
\({ \operatorname{ad}}_{x_3} = -E_{2,1} + 0 + 0 + E_{3, 4} -2 E_{3, 5} + 0 + E_{6, 7} + E_{5, 8}\)
\({ \operatorname{ad}}_{h_1} = 2E_{1,1} + E_{2, 2} - E_{3, 3} + 0 + 0 -2 E_{6, 6} - E_{7,7} + E_{8, 8}\)
\({ \operatorname{ad}}_{h_2} = -E_{1, 1} + E_{2,2} +2 E_{3,3} + 0 + 0 + E_{6,6} - E_{7,7} -2 E_{8,8}\)
\({ \operatorname{ad}}_{y_1} = -E_{4, 1} + E_{3, 2} + 0 + 2E_{6, 4} - E_{6, 5} + 0 + 0 - E_{7, 8}\)
\({ \operatorname{ad}}_{y_2} = E_{8,1} - (E_{4, 2} + E_{5, 2}) - E_{6, 3} + E_{7, 4} + E_{7, 5} +0 +0 + 0\)
\({ \operatorname{ad}}_{y_3} = 0 - E_{1, 2} - E_{5, 3} - E_{8, 4} +2 E_{8, 5} + E_{7, 6} + 0 + 0\)
Now forming the matrix \((\beta)_{ij} \mathrel{\vcenter{:}}=\operatorname{Trace}( { \operatorname{ad}}_{v_i} { \operatorname{ad}}_{v_j})\) yields \begin{align*} \beta = \left(\begin{array}{rrrrrrrr} \cdot & \cdot & \cdot & \cdot & \cdot & 6 & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 6 & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 6 \\ \cdot & \cdot & \cdot & 12 & -6 & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & -6 & 12 & \cdot & \cdot & \cdot \\ 6 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & 6 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & 6 & \cdot & \cdot & \cdot & \cdot & \cdot \end{array}\right) ,\end{align*} whence \(\operatorname{det}(\beta) = -(6\cdot 6\cdot 6)^2(12^2-36) = - 2^8 3^9\), so the primes dividing it are 2 and 3.
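The whole computation is a few lines in numpy; a minimal sketch, assuming the ordered basis \(\left\{{x_1, x_2, x_3, h_1, h_2, y_1, y_2, y_3}\right\}\) above (the helper names are ad hoc):

```python
import numpy as np

def E(i, j, n=3):
    """Elementary matrix e_{ij}, 1-indexed."""
    m = np.zeros((n, n))
    m[i - 1, j - 1] = 1
    return m

basis = [E(1, 2), E(1, 3), E(2, 3),             # x1, x2, x3
         E(1, 1) - E(2, 2), E(2, 2) - E(3, 3),  # h1, h2
         E(2, 1), E(3, 1), E(3, 2)]             # y1, y2, y3

def coords(m):
    """Coordinates of a traceless 3x3 matrix in the ordered basis above."""
    return np.array([m[0, 1], m[0, 2], m[1, 2], m[0, 0], -m[2, 2],
                     m[1, 0], m[2, 0], m[2, 1]])

def ad(v):
    """Matrix of ad_v: column j holds the coordinates of [v, basis[j]]."""
    return np.column_stack([coords(v @ w - w @ v) for w in basis])

beta = np.array([[np.trace(ad(v) @ ad(w)) for w in basis] for v in basis])
print(np.round(beta).astype(int))
print(round(np.linalg.det(beta)))   # -5038848 = -2^8 * 3^9
```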
Using the standard basis for \(L=\mathfrak{s l}_2({ \mathbf{F} })\), write down the Casimir element of the adjoint representation of \(L\) (cf. Exercise 5.5). Do the same thing for the usual (3-dimensional) representation of \(\mathfrak{s l}_3({ \mathbf{F} })\), first computing dual bases relative to the trace form.
A computation shows that in the basis \(\left\{{e_i}\right\} \mathrel{\vcenter{:}}=\left\{{x,h,y}\right\}\), the Killing form is represented by \begin{align*} \beta = { \begin{bmatrix} {0} & {0} & {4} \\ {0} & {8} & {0} \\ {4} & {0} & {0} \end{bmatrix} } \implies \beta^{-T} = { \begin{bmatrix} {0} & {0} & {1\over 4} \\ {0} & {1\over 8} & {0} \\ {1\over 4} & {0} & {0} \end{bmatrix} } ,\end{align*} yielding the dual basis \(\left\{{e_i {}^{ \vee }}\right\}\) read from the columns of \(\beta^{-T}\): \(x {}^{ \vee }= y/4\), \(h {}^{ \vee }= h/8\), and \(y {}^{ \vee }= x/4\).
Thus letting \(\phi = { \operatorname{ad}}\), we have \begin{align*} c_\phi &= \sum \phi(e_i)\phi(e_i {}^{ \vee }) \\ &= { \operatorname{ad}}(x) { \operatorname{ad}}(x {}^{ \vee }) + { \operatorname{ad}}(h) { \operatorname{ad}}(h {}^{ \vee }) + { \operatorname{ad}}(y) { \operatorname{ad}}(y {}^{ \vee }) \\ &= { \operatorname{ad}}(x) { \operatorname{ad}}(y/4) + { \operatorname{ad}}(h) { \operatorname{ad}}(h/8) + { \operatorname{ad}}(y) { \operatorname{ad}}(x/4) \\ &= {1\over 4} { \operatorname{ad}}_x { \operatorname{ad}}_y + {1\over 8} { \operatorname{ad}}_h^2 + {1\over 4} { \operatorname{ad}}_y { \operatorname{ad}}_x .\end{align*}
For \({\mathfrak{sl}}_3\), first take the ordered basis \(\left\{{v_1,\cdots, v_8}\right\} = \left\{{x_1, x_2, x_3, h_1, h_2, y_1, y_2, y_3}\right\}\) as in the previous problem. So we form the matrix \((\beta)_{ij} \mathrel{\vcenter{:}}=\operatorname{Trace}(v_i v_j)\) by computing various products and traces on a computer to obtain \begin{align*} \beta = \left(\begin{array}{rrrrrrrr} \cdot & \cdot & \cdot & \cdot & \cdot & 1 & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 1 & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 1 \\ \cdot & \cdot & \cdot & 2 & -1 & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & -1 & 2 & \cdot & \cdot & \cdot \\ 1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot \end{array}\right) \implies \beta^{-T} = \left(\begin{array}{rrrrrrrr} \cdot & \cdot & \cdot & \cdot & \cdot & 1 & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 1 & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & 1 \\ \cdot & \cdot & \cdot & \frac{2}{3} & \frac{1}{3} & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \frac{1}{3} & \frac{2}{3} & \cdot & \cdot & \cdot \\ 1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & 1 & \cdot & \cdot & \cdot & \cdot & \cdot \end{array}\right) ,\end{align*} which yields the dual basis \(x_i {}^{ \vee }= y_i\), \(y_i {}^{ \vee }= x_i\), \(h_1 {}^{ \vee }= \frac{2}{3} h_1 + \frac{1}{3} h_2\), and \(h_2 {}^{ \vee }= \frac{1}{3} h_1 + \frac{2}{3} h_2\).
We can thus compute the Casimir element of the standard representation \(\phi\) on a computer as \begin{align*} c_{\phi} &= \sum_i \phi(x_i)\phi(x_i {}^{ \vee }) + \phi(h_1)\phi(h_1 {}^{ \vee }) + \phi(h_2)\phi(h_2 {}^{ \vee }) + \sum_i \phi(y_i)\phi(y_i {}^{ \vee }) \\ &= \sum_i \qty{x_i y_i + y_i x_i} + h_1 h_1 {}^{ \vee }+ h_2 h_2 {}^{ \vee }\\ &= 2I + {2\over 3}I = {8\over 3}I .\end{align*}
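A direct check of the last equality (a sketch, with the dual basis hard-coded from the computation above):

```python
import numpy as np

def E(i, j):
    m = np.zeros((3, 3))
    m[i - 1, j - 1] = 1
    return m

xs = [E(1, 2), E(1, 3), E(2, 3)]
ys = [E(2, 1), E(3, 1), E(3, 2)]          # x_i^vee = y_i and y_i^vee = x_i
h1, h2 = E(1, 1) - E(2, 2), E(2, 2) - E(3, 3)
h1v, h2v = (2 * h1 + h2) / 3, (h1 + 2 * h2) / 3

c = sum(x @ y + y @ x for x, y in zip(xs, ys)) + h1 @ h1v + h2 @ h2v
print(c)   # (8/3) times the identity matrix
```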
If \(L\) is solvable, every irreducible representation of \(L\) is one dimensional.
Let \(\phi: L\to {\mathfrak{gl}}(V)\) be an irreducible representation of \(L\). By Lie’s theorem, \(L\) stabilizes a flag in \(V\), say \(F^\bullet = \qty{F^1 \subset \cdots \subset F^n = V}\) where \(F^k = \left\langle{v_1,\cdots, v_k}\right\rangle\) for some basis \(\left\{{v_i}\right\}_{i\leq n}\). Since \(\phi\) is irreducible, the only \(L{\hbox{-}}\)invariant subspaces of \(V\) are \(0\) and \(V\) itself. However, each \(F^k\) is an \(L{\hbox{-}}\)invariant subspace, which forces \(n=1\) and \(F^1 = V\). Thus \(V\) is 1-dimensional.
A Lie algebra \(L\) for which \(\operatorname{Rad} L=Z(L)\) is called reductive.
Part 1: If \({ \operatorname{ad}}(L) \neq 0\), as hinted, we can attempt to apply Weyl’s theorem to the representation \(\phi: { \operatorname{ad}}(L)\to {\mathfrak{gl}}(L)\): if we can show \({ \operatorname{ad}}(L)\) is semisimple, then \(\phi\) (and thus \(L\)) will be a completely reducible \({ \operatorname{ad}}(L){\hbox{-}}\)module. Assume \(L\) is reductive, so \(\ker( { \operatorname{ad}}) = Z(L) = \mathop{\mathrm{Rad}}(L)\), and by the first isomorphism theorem \({ \operatorname{ad}}(L) \cong L/\mathop{\mathrm{Rad}}(L)\). We can now use the fact stated in Humphreys on page 11 that for an arbitrary Lie algebra \(L\), the quotient \(L/\mathop{\mathrm{Rad}}(L)\) is semisimple. This follows from the fact that \(\mathop{\mathrm{Rad}}(L/\mathop{\mathrm{Rad}}(L)) = 0\): the preimage in \(L\) of a nonzero solvable ideal of \(L/\mathop{\mathrm{Rad}}(L)\) would be a solvable ideal of \(L\) strictly containing \(\mathop{\mathrm{Rad}}(L)\), contradicting maximality of \(\mathop{\mathrm{Rad}}(L)\). Thus \({ \operatorname{ad}}(L)\) is semisimple, and Weyl’s theorem implies \(L\) is a completely reducible \({ \operatorname{ad}}(L){\hbox{-}}\)module.
To show that \(L = Z(L) \oplus [LL]\): by complete reducibility, the \({ \operatorname{ad}}(L){\hbox{-}}\)invariant submodule \(Z(L) \leq L\) has an \({ \operatorname{ad}}(L){\hbox{-}}\)invariant complement \(W\), so \(L = Z(L) \oplus W\) with \([L, W] \subseteq W\). Since brackets with the central factor vanish, \([LL] = [WW] \subseteq W\). On the other hand, \(W \cong L/Z(L)\) as Lie algebras, which is semisimple by the above, and a semisimple Lie algebra equals its own derived algebra (it is a direct sum of simple ideals, each of which equals its own derived algebra). Thus \(W = [WW] \subseteq [LL] \subseteq W\), forcing \(W = [LL]\) and \(L = Z(L) \oplus [LL]\).
Finally, to see that \([LL]\) is semisimple, note that the above decomposition identifies \([LL] = W \cong L/Z(L) = L/\mathop{\mathrm{Rad}}(L)\), and \(\mathop{\mathrm{Rad}}(L/\mathop{\mathrm{Rad}}(L)) = 0\), so \(\mathop{\mathrm{Rad}}([LL]) = 0\).
Part 2: Omitted for time.
Part 3: Omitted for time.
Part 4: Omitted for time.
Let \(L\) be a simple Lie algebra. Let \(\beta(x, y)\) and \(\gamma(x, y)\) be two symmetric associative bilinear forms on \(L\). If \(\beta, \gamma\) are nondegenerate, prove that \(\beta\) and \(\gamma\) are proportional.
Hint: Use Schur’s Lemma.
The strategy will be to define an irreducible \(L{\hbox{-}}\)module \(V\) and use the two bilinear forms to produce an element of \({ \operatorname{End} }_L(V)\), which will be 1-dimensional by Schur’s lemma.
The representation we’ll take will be \(\phi \mathrel{\vcenter{:}}= { \operatorname{ad}}: L\to {\mathfrak{gl}}(L)\), and since \(L\) is simple, \(\ker { \operatorname{ad}}= 0\) since otherwise it would yield a nontrivial ideal of \(L\). Since this is a faithful representation, we will identify \(L\) with its image \(V \mathrel{\vcenter{:}}= { \operatorname{ad}}(L) \subseteq {\mathfrak{gl}}(L)\) and regard \(V\) as an \(L{\hbox{-}}\)module.
As a matter of notation, let \(\beta_x(y) \mathrel{\vcenter{:}}=\beta(x, y)\) and similarly \(\gamma_x(y) \mathrel{\vcenter{:}}=\gamma(x, y)\), so that \(\beta_x, \gamma_x\) can be regarded as linear functionals on \(V\) and thus elements of \(V {}^{ \vee }\). This gives an \({ \mathbf{F} }{\hbox{-}}\)linear map \begin{align*} \Phi_1: V &\to V {}^{ \vee }\\ x &\mapsto \beta_x ,\end{align*} which we claim is an \(L{\hbox{-}}\)module morphism.
Assuming this for the moment, note that by the general theory of bilinear forms on vector spaces, since \(\beta\) and \(\gamma\) are nondegenerate, the assignments \(x\mapsto \beta_x\) and \(x\mapsto \gamma_x\) induce vector space isomorphisms \(V { \, \xrightarrow{\sim}\, }V {}^{ \vee }\). Accordingly, for any linear functional \(f\in V {}^{ \vee }\), there is a unique element \(z(f) \in V\) such that \(f(v) = \gamma(z(f), v)\). So define a map using the representing element for \(\gamma\): \begin{align*} \Phi_2: V {}^{ \vee }&\to V \\ f &\mapsto z(f) ,\end{align*} which we claim is also an \(L{\hbox{-}}\)module morphism.
We can now define their composite \begin{align*} \Phi \mathrel{\vcenter{:}}=\Phi_2 \circ \Phi_1: V &\to V \\ x &\mapsto z(\beta_x) ,\end{align*} which sends an element \(x\in V\) to the element \(z = z(\beta_x) \in V\) such that \(\beta_x({-}) = \gamma_z({-})\) as functionals. An additional claim is that \(\Phi\) commutes with the image \(V \mathrel{\vcenter{:}}= { \operatorname{ad}}(L) \subseteq {\mathfrak{gl}}(L)\). Given this, by Schur’s lemma we have \(\Phi\in { \operatorname{End} }_L(V) = { \mathbf{F} }\) (where we’ve used that a composition of morphisms is again a morphism) and so \(\Phi = \lambda \operatorname{id}_L\) for some scalar \(\lambda\in { \mathbf{F} }\).
To see why this implies the result, we have equalities of functionals \begin{align*} \beta(x, {-}) &= \beta_x({-}) \\ &= \gamma_{z(\beta_x) }({-}) \\ &= \gamma( z(\beta_x), {-})\\ &= \gamma( \Phi(x), {-}) \\ &= \gamma(\lambda x, {-}) \\ &= \lambda\gamma(x, {-}) ,\end{align*}
and since this holds for all \(x\) we have \(\beta({-}, {-}) = \lambda \gamma({-}, {-})\) as desired.
\(\Phi_1\) is an \(L{\hbox{-}}\)module morphism.
We recall that a morphism of \(L{\hbox{-}}\)modules \(\phi: V\to W\) is an \({ \mathbf{F} }{\hbox{-}}\)linear map satisfying \begin{align*} \phi(\ell .\mathbf{x}) = \ell.\phi(\mathbf{x}) \qquad\forall \ell\in L,\,\forall \mathbf{x}\in V .\end{align*} In our case, the left-hand side is \begin{align*} \Phi_1(\ell . \mathbf{x}) \mathrel{\vcenter{:}}=\Phi_1( { \operatorname{ad}}_\ell(\mathbf{x}) ) = \Phi_1([\ell,\mathbf{x}]) = \beta_{[\ell, \mathbf{x}]} = \beta( [\ell, \mathbf{x}], {-}) .\end{align*} and the right-hand side is \begin{align*} \ell.\Phi_1(\mathbf{x}) \mathrel{\vcenter{:}}=\ell.\beta_{\mathbf{x}} \mathrel{\vcenter{:}}=(y\mapsto -\beta_{\mathbf{x}}( \ell. y)) \mathrel{\vcenter{:}}=(\mathbf{y}\mapsto -\beta_{\mathbf{x}}( [\ell, \mathbf{y}] )) = -\beta(\mathbf{x}, [\ell, {-}]) .\end{align*} By anticommutativity of the bracket, along with \({ \mathbf{F} }{\hbox{-}}\)linearity and associativity of \(\beta\), we have \begin{align*} \beta([\ell, \mathbf{x}], \mathbf{y}) = -\beta([\mathbf{x}, \ell], \mathbf{y}) = -\beta(\mathbf{x}, [\ell, \mathbf{y}]) \qquad \forall \mathbf{y}\in V \end{align*} and so the above two sides do indeed coincide.
\(\Phi_2\) is an \(L{\hbox{-}}\)module morphism.
Omitted for time, proceeds similarly.
\(\Phi\) commutes with \({ \operatorname{ad}}(L)\).
Letting \(x\in L\), we want to show that \(\Phi \circ { \operatorname{ad}}_x = { \operatorname{ad}}_x \circ \Phi \in {\mathfrak{gl}}(L)\), i.e. that these two endomorphisms of \(L\) commute. Fixing \(\ell\in L\), the LHS expands to \begin{align*} \Phi( { \operatorname{ad}}_x(\ell)) = z(\beta_{ { \operatorname{ad}}_x(\ell) }) = z(\beta_{[x\ell]}) ,\end{align*} while the RHS is \begin{align*} { \operatorname{ad}}_x(\Phi(\ell)) = { \operatorname{ad}}_x(z(\beta_\ell)) = [x, z(\beta_\ell)] .\end{align*} Recalling that \(z(\beta_t)\) is defined to be the unique element of \(L\) satisfying \(\beta(t, {-}) = \gamma(z(\beta_t), {-})\), for the above two to be equal it suffices to show that \begin{align*} \beta([x, \ell], {-}) = \gamma( [x, z(\beta_\ell)], {-}) \end{align*} as linear functionals. Starting with the RHS of this expression, we have \begin{align*} \gamma( [ x, z(\beta_\ell) ], {-}) &= -\gamma( [z(\beta_\ell), x], {-}) \quad\text{by antisymmetry}\\ &= -\gamma(z(\beta_\ell), [x, {-}]) \quad\text{by associativity of }\gamma \\ &= -\beta(\ell, [x, {-}]) \quad\text{by definition of } z(\beta_\ell) \\ &= -\beta([\ell, x], {-}) \\ &= \beta([x, \ell], {-}) .\end{align*}
It will be seen later on that \(\mathfrak{sl}_n({ \mathbf{F} })\) is actually simple. Assuming this and using Exercise 6, prove that the Killing form \(\kappa\) on \(\mathfrak{s l}_n({ \mathbf{F} })\) is related to the ordinary trace form by \(\kappa(x, y)=2 n \operatorname{Tr}(x y)\).
By the previous exercise, the trace pairing \((x,y)\mapsto \operatorname{Trace}(xy)\) is related to the Killing form by \(\kappa(x,y) = \lambda \operatorname{Trace}(xy)\) for some \(\lambda\); here we’ve used the fact that since \({\mathfrak{sl}}_n({ \mathbf{F} })\) is simple, \(\mathop{\mathrm{Rad}}(\operatorname{Trace}) = 0\) and thus the trace pairing is nondegenerate. Since the scalar only depends on the bilinear forms and not on any particular inputs, it suffices to compute both forms on any pair \((x, y)\) where they don’t vanish, and in fact we can take \(x=y\). For \({\mathfrak{sl}}_n\), we can take advantage of the fact that in the standard basis, \({ \operatorname{ad}}_{h_i}\) will be diagonal for any standard generator \(h_i\in {\mathfrak{h}}\), making \(\operatorname{Trace}( { \operatorname{ad}}_{h_i}^2)\) easier to compute for general \(n\).
Take the standard \(h_{1} \mathrel{\vcenter{:}}= e_{11} - e_{22}\), and consider the matrix of \({ \operatorname{ad}}_{h_1}\) in the ordered basis \(\left\{{x_1,\cdots, x_k, h_1,\cdots, h_{n-1}, y_1,\cdots, y_k }\right\}\) which has \(k + (n-1) + k = n^2-1\) elements where \(k= (n^2-n)/2\). We’ll first compute the Killing form with respect to this basis. In order to compute the various \([h_1, v_i]\), we recall the formula \([e_{ij}, e_{kl}] = \delta_{jk} e_{il} - \delta_{li} e_{kj}\). Applying this to \(h_{1}\) yields \begin{align*} [h_{1}, e_{ij}] = [e_{11} - e_{22}, e_{ij}] = [e_{11}, e_{ij}] - [e_{22}, e_{ij}] = (\delta_{1i} e_{1j} - \delta_{1j} e_{i1}) - (\delta_{2i} e_{2j} - \delta_{2j} e_{i2}) .\end{align*} We proceed to check all of the possibilities for the results as \(i, j\) vary with \(i\neq j\) using the following schematic: \begin{align*} \left[\begin{array}{ c | c | c } \cdot & a & R_1 \, \cdots \\ \hline b & \cdot & R_2\, \cdots \\ \hline \overset{C_1}{\vdots} & \overset{C_2}{\vdots} & M \end{array}\right] .\end{align*}
The possible cases are:
Thus the matrix of \({ \operatorname{ad}}_{h_1}\) has \(4(n-2)\) entries equal to \(\pm 1\) and a \(2, -2\) on the diagonal, so \({ \operatorname{ad}}_{h_1}^2\) has \(4(n-2)\) ones and \(4, 4\) on the diagonal, yielding \begin{align*} \operatorname{Trace}( { \operatorname{ad}}_{h_1}^2) = 4(n-2) + 2(4) = 4n .\end{align*} On the other hand, computing the standard trace form yields \begin{align*} \operatorname{Trace}(h_1^2) = \operatorname{Trace}(\operatorname{diag}(1,1,0,\cdots,0)) = 2 ,\end{align*} and so \begin{align*} \operatorname{Trace}( { \operatorname{ad}}_{h_1}^2) = 4n = 2n \cdot 2 = 2n\cdot \operatorname{Trace}(h_1^2) \implies \lambda = 2n .\end{align*}
\(M=\mathfrak{sl}(3, { \mathbf{F} })\) contains a copy of \(L\mathrel{\vcenter{:}}={\mathfrak{sl}}(2, { \mathbf{F} })\) in its upper left-hand \(2 \times 2\) position. Write \(M\) as direct sum of irreducible \(L\)-submodules (\(M\) viewed as \(L\) module via the adjoint representation): \begin{align*} V(0) \oplus V(1) \oplus V(1) \oplus V(2) .\end{align*}
Noting that \(\dim V(n) = n+1\), so that the proposed summands have total dimension \(1 + 2 + 2 + 3 = 8 = \dim_{ \mathbf{F} }{\mathfrak{sl}}_3({ \mathbf{F} })\),
it suffices to find distinct highest weight elements of weights \(0,1,1,2\) and take the irreducible submodules they generate. As long as the spanning vectors coming from the various \(V(n)\) are all distinct, they will span \(M\) as a vector space by the above dimension count and individually span the desired submodules.
Taking the standard basis \(\left\{{v_1,\cdots, v_8}\right\} \mathrel{\vcenter{:}}=\left\{{x_1, x_2, x_3, h_1, h_2, y_1, y_2, y_3}\right\}\) for \({\mathfrak{sl}}_3({ \mathbf{F} })\) with \(y_i = x_i^t\), note that the image of the inclusion \({\mathfrak{sl}}_2({ \mathbf{F} }) \hookrightarrow{\mathfrak{sl}}_3({ \mathbf{F} })\) can be identified with the span of \(\left\{{w_1,w_2,w_3}\right\} \mathrel{\vcenter{:}}=\left\{{x_1, h_1, y_1}\right\}\) and it suffices to consider how these \(3\times 3\) matrices act.
Since any highest weight vector must be annihilated by the \(x_1{\hbox{-}}\)action, to find potential highest weight vectors one can compute the matrix of \({ \operatorname{ad}}_{x_1}\) in the above basis and look for zero columns: \begin{align*} { \operatorname{ad}}_{x_1} = \left(\begin{array}{rrrrrrrr} 0 & 0 & 0 & -2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \end{array}\right) .\end{align*} Thus \(\left\{{v_1 = x_1, v_2 = x_2, v_8 = y_3}\right\}\) are the only options for highest weight vectors of nonzero weight, since \({ \operatorname{ad}}_{x_1}\) acts nontrivially on the remaining basis elements.
Computing the matrix of \({ \operatorname{ad}}_{h_1}\), one can read off the weights of each: \begin{align*} { \operatorname{ad}}_{h_1} = \left(\begin{array}{rrrrrrrr} 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{array}\right) .\end{align*}
Thus the candidates for highest-weight vectors are:

- \(x_1\), of weight \(2\), generating the \(V(2)\);
- \(x_2\), of weight \(1\), generating one copy of \(V(1)\);
- \(y_3\), of weight \(1\), generating the other copy of \(V(1)\);
- a weight-zero vector in \(\operatorname{span}\left\{{h_1, h_2}\right\}\) annihilated by \(x_1\), generating the \(V(0)\).
We can now repeatedly apply the \(y_1{\hbox{-}}\)action to obtain the other vectors in each irreducible module.
For \(V(2)\): starting from the maximal vector \(x_1\) of weight 2 and applying the \(y_1{\hbox{-}}\)action, we obtain \(y_1.x_1 = -h_1\) and \(y_1.(-h_1) = -2y_1\), so \(V(2) = \operatorname{span}\left\{{x_1, h_1, y_1}\right\}\).
Since \(h_1\) appears in this submodule, the maximal vector for \(V(0)\) must be chosen from the weight-zero space independently of \(h_1\); note that \(h_2\) alone does not work, since \([x_1, h_2] = x_1 \neq 0\), but the combination \(h_1 + 2h_2\) is annihilated by \(x_1\). Continuing with \(V(1)\): starting from \(x_2\), we have \(y_1.x_2 = x_3\) and \(y_1.x_3 = 0\), so \(V(1) = \operatorname{span}\left\{{x_2, x_3}\right\}\).
For the other \(V(1)\): starting from \(y_3\), we have \(y_1.y_3 = -y_2\) and \(y_1.y_2 = 0\), so this copy is \(V(1) = \operatorname{span}\left\{{y_3, y_2}\right\}\).
For \(V(0)\): the weight-zero vector \(h_1 + 2h_2\) satisfies \(x_1.(h_1 + 2h_2) = -2x_1 + 2x_1 = 0\) and \(y_1.(h_1 + 2h_2) = 2y_1 - 2y_1 = 0\), so \(V(0) = \operatorname{span}\left\{{h_1 + 2h_2}\right\}\).
Collecting these vectors yields \(\left\{{x_1, h_1, y_1, x_2, x_3, y_3, y_2, h_1 + 2h_2}\right\}\), a linearly independent set of \(8 = \dim {\mathfrak{sl}}_3({ \mathbf{F} })\) vectors with no redundancy, yielding the desired direct sum decomposition.
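As a sanity check (not part of the original solution), the following Python sketch verifies that the four maximal vectors above are annihilated by \(x_1\) and have the stated weights, under the basis ordering fixed earlier:

```python
# Verify ad_{x1} kills the four claimed maximal vectors and that ad_{h1}
# assigns them the weights 2, 1, 1, 0. Assumes x1 = e12, x2 = e13, x3 = e23.
import numpy as np

def e(i, j):
    m = np.zeros((3, 3)); m[i - 1, j - 1] = 1.0
    return m

x1, x2, x3 = e(1, 2), e(1, 3), e(2, 3)
h1, h2 = e(1, 1) - e(2, 2), e(2, 2) - e(3, 3)
y1, y2, y3 = x1.T, x2.T, x3.T

def bracket(a, b):
    return a @ b - b @ a

maximal = [x1, x2, y3, h1 + 2 * h2]   # claimed maximal vectors
weights = [2, 1, 1, 0]
for v, wt in zip(maximal, weights):
    assert np.allclose(bracket(x1, v), 0)        # killed by the x1-action
    assert np.allclose(bracket(h1, v), wt * v)   # h1-eigenvector of weight wt
print("all four maximal vectors check out")
```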
Suppose char \({ \mathbf{F} }=p>0, L=\mathfrak{s l}(2, { \mathbf{F} })\). Prove that the representation \(V(m)\) of \(L\) constructed as in Exercise 3 or 4 is irreducible so long as the highest weight \(m\) is strictly less than \(p\), but reducible when \(m=p\).
Note: this corresponds to the formulas in Lemma 7.2 parts (a) through (c). Alternatively, one can let \(L\curvearrowright{ \mathbf{F} }^2\) in the usual way, extend to \(L\curvearrowright{ \mathbf{F} }[x, y]\) by derivations so that \(l.(fg) = (l.f)g + f(l.g)\), and take the subspace of homogeneous degree \(m\) polynomials \(\left\langle{x^m, x^{m-1}y, \cdots, y^m}\right\rangle\) to get an irreducible module of highest weight \(m\).
The representation \(V(m)\) in Lemma 7.2 is defined by the following three equations, where \(v_0 \in V_{ m}\) is a highest weight vector and \(v_k \mathrel{\vcenter{:}}= y^k v_0/k!\):
Supposing \(m< p\), the vectors \(\left\{{v_0, v_1,\cdots, v_m}\right\}\) still span an irreducible \(L{\hbox{-}}\)module: the coefficients \(k+1\) and \(m-k+1\) appearing in the formulas above are nonzero mod \(p\) for \(0\leq k\leq m-1\), so the \(x{\hbox{-}}\) and \(y{\hbox{-}}\)actions still move between all of the \(v_k\), and thus any nonzero \(L{\hbox{-}}\)submodule contains all of them, just as in the characteristic zero case.
However, if \(m=p\), then note that \(y.v_{m-1} = (m-1+1) v_m = 0 v_m = 0\) and consider the set \(\left\{{v_0, \cdots, v_{m-1}}\right\}\). This spans an \(m{\hbox{-}}\)dimensional subspace of \(V\), and the equations above show it is invariant under the \(L{\hbox{-}}\)action, so it yields an \(m{\hbox{-}}\)dimensional submodule of \(V(m)\). Since \(\dim_{ \mathbf{F} }V(m) = m+1\), this is a nontrivial proper submodule, so \(V(m)\) is reducible.
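The following Python sketch (my own check; it assumes the module structure is given exactly by the Lemma 7.2 formulas, realized as matrices with entries reduced mod \(p\)) tests whether \(\operatorname{span}\left\{{v_0, \cdots, v_{m-1}}\right\}\) is invariant, i.e. whether the proper submodule exhibited above actually exists:

```python
# Realize x, y, h as (m+1)x(m+1) matrices over F_p acting on v_0, ..., v_m
# via h.v_k = (m-2k) v_k, x.v_k = (m-k+1) v_{k-1}, y.v_k = (k+1) v_{k+1},
# and test invariance of span{v_0, ..., v_{m-1}}.
import numpy as np

def xyh(m, p):
    n = m + 1
    x = np.zeros((n, n), dtype=int)
    y = np.zeros((n, n), dtype=int)
    h = np.zeros((n, n), dtype=int)
    for k in range(n):
        h[k, k] = (m - 2 * k) % p
        if k > 0:
            x[k - 1, k] = (m - k + 1) % p
        if k < m:
            y[k + 1, k] = (k + 1) % p
    return x, y, h

p = 5
for m in [3, 4, 5]:
    x, y, h = xyh(m, p)
    # The span is invariant iff the last row of x, y, h vanishes on the
    # first m columns; only y can fail this, via y[m, m-1] = m mod p.
    invariant = all(M[m, :m].sum() % p == 0 for M in (x, y, h))
    print(f"m={m}, p={p}: proper submodule spanned by v_0..v_{{m-1}}? {invariant}")
```

Running this prints `False, False, True` for \(m = 3, 4, 5\), matching the claim that \(V(m)\) is irreducible for \(m < p\) but acquires a proper submodule when \(m = p\).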
Decompose the tensor product of the two \(L\)-modules \(V(3), V(7)\) into the sum of irreducible submodules: \(V(4) \oplus V(6) \oplus V(8) \oplus V(10)\). Try to develop a general formula for the decomposition of \(V(m) \otimes V(n)\).
By a theorem from class, we know the weight space decomposition of any irreducible finite-dimensional \({\mathfrak{sl}}_2({\mathbf{C}}){\hbox{-}}\)module \(V\) takes the following form: \begin{align*} V = V_{-m} \oplus V_{-m+2} \oplus \cdots \oplus V_{m-2} \oplus V_m ,\end{align*} where \(m\) is the highest weight and each weight space \(V_{\mu}\) is 1-dimensional. In particular, since \(V(m)\) is a highest-weight module of highest weight \(m\), we can write \begin{align*} V(3) &= \hspace{5.6em} V_{-3} \oplus V_{-1} \oplus V_1 \oplus V_3 \\ V(7) &= V_{-7} \oplus V_{-5} \oplus V_{-3} \oplus V_{-1} \oplus V_1 \oplus V_3 \oplus V_{5} \oplus V_{7} ,\end{align*} and tensoring these together yields a module with weights between \(-3 -7 = -10\) and \(3+7 = 10\): \begin{align*} V(3) \otimes V(7) &= V_{-10} \oplus V_{-8}{ {}^{ \scriptscriptstyle\oplus^{2} } } \oplus V_{-6}{ {}^{ \scriptscriptstyle\oplus^{3} } } \oplus V_{-4}{ {}^{ \scriptscriptstyle\oplus^{4} } } \oplus V_{-2}{ {}^{ \scriptscriptstyle\oplus^{4} } } \\ \qquad& \oplus V_0{ {}^{ \scriptscriptstyle\oplus^{4} } } \\ \qquad& \oplus V_2{ {}^{ \scriptscriptstyle\oplus^{4} } } \oplus V_4{ {}^{ \scriptscriptstyle\oplus^{4} } } \oplus V_6{ {}^{ \scriptscriptstyle\oplus^{3} } } \oplus V_8{ {}^{ \scriptscriptstyle\oplus^{2} } } \oplus V_{10} .\end{align*} This can be more easily parsed by considering formal characters: \begin{align*} \operatorname{ch}(V(3)) &= e^{-3} + e^{-1} + e^{1} + e^3 \\ \operatorname{ch}(V(7)) &= e^{-7} + e^{-5} + e^{-3} + e^{-1} + e^1 + e^3 + e^5 + e^7 \\ \operatorname{ch}(V(3) \otimes V(7)) &= \operatorname{ch}(V(3))\cdot \operatorname{ch}(V(7)) \\ \\ &= (e^{-10} + e^{10}) + 2(e^{-8} + e^{8}) + 3( e^{-6} + e^6) \\ &\qquad + 4( e^{-4} + e^4) + 4(e^{-2} + e^2) +4 \\ \\ &= (e^{-10} + e^{10}) + 2(e^{-8} + e^{8}) + 3( e^{-6} + e^6) \\ &\qquad + 4\operatorname{ch}(V(4)) ,\end{align*} noting that \(\operatorname{ch}(V(4)) = e^{-4} + e^{-2} + 1 + e^{2} + e^{4}\) and collecting terms.
To see that \(V(3) \otimes V(7)\) decomposes as \(V(4) \oplus V(6) \oplus V(8) \oplus V(10)\) one can check for equality of characters to see that the various weight spaces and multiplicities match up: \begin{align*} \operatorname{ch}(V(4) \oplus V(6) \oplus V(8) \oplus V(10)) &= \operatorname{ch}(V(4)) + \operatorname{ch}(V(6)) + \operatorname{ch}(V(8)) + \operatorname{ch}(V(10)) \\ \\ &= \qty{e^{-4} + \cdots + e^4} + \qty{e^{-6} + \cdots + e^6} \\ &\quad +\qty{e^{-8} + \cdots + e^8} + \qty{e^{-10} + \cdots + e^{10}} \\ \\ &= 2\operatorname{ch}(V(4)) + (e^{-6} + e^6) \\ &\,\, + \operatorname{ch}(V(4)) + (e^{-6} + e^6) + (e^{-8} + e^8) \\ &\,\,+ \operatorname{ch}(V(4)) + (e^{-6} + e^6) + (e^{-8} + e^8) + (e^{-10} + e^{10})\\ \\ &= 4\operatorname{ch}(V(4)) + 3(e^{-6} + e^6) \\ &\,\, + 2(e^{-8} + e^8) + (e^{-10} + e^{10}) ,\end{align*} which is equal to \(\operatorname{ch}(V(3) \otimes V(7))\) from above.
More generally, for two such modules \(V, W\) we can write \begin{align*} V\otimes_{ \mathbf{F} }W = \bigoplus _{\lambda \in{\mathfrak{h}} {}^{ \vee }} \bigoplus _{\mu_1 + \mu_2 = \lambda} V_{\mu_1} \otimes_{ \mathbf{F} }W_{\mu_2} ,\end{align*} where we’ve used the following observation about the weight of \({\mathfrak{h}}\) acting on a tensor product of weight spaces: supposing \(v\in V_{\mu_1}\) and \(w\in W_{\mu_2}\), \begin{align*} h.(v\otimes w) &= (hv)\otimes w + v\otimes(hw) \\ &= (\mu_1 v)\otimes w + v\otimes(\mu_2 w) \\ &= (\mu_1 v)\otimes w + (\mu_2 v)\otimes w \\ &= (\mu_1 + \mu_2)(v\otimes w) ,\end{align*} and so \(v\otimes w \in V_{\mu_1 + \mu_2}\).
Taking \(V(m_1), V(m_2)\) with \(m_1 \geq m_2\) then yields the general (Clebsch–Gordan) formula: \begin{align*} V(m_1) \otimes_{ \mathbf{F} }V(m_2) = \bigoplus _{n=-m_1-m_2}^{m_1+m_2} \bigoplus_{a + b = n} V_a \otimes_{ \mathbf{F} }V_b = \bigoplus_{k=0}^{m_2} V(m_1 + m_2 - 2k) = V(m_1 - m_2) \oplus V(m_1 - m_2 + 2) \oplus \cdots \oplus V(m_1 + m_2) ,\end{align*} where the last equality follows by matching weight multiplicities from the top weight \(m_1 + m_2\) downward; note that the summands step by 2.
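To automate this, here is a short Python sketch (my own, not part of the original solution) that multiplies formal characters as weight-multiplicity tables and strips off irreducibles from the top weight down, reproducing the decomposition above:

```python
# Characters as Counters {weight: multiplicity}; decompose a tensor product
# by repeatedly removing ch V(top) for the largest remaining weight.
from collections import Counter

def ch(m):
    # weights of V(m): -m, -m+2, ..., m, each with multiplicity one
    return Counter(range(-m, m + 1, 2))

def tensor(c1, c2):
    out = Counter()
    for w1, mult1 in c1.items():
        for w2, mult2 in c2.items():
            out[w1 + w2] += mult1 * mult2
    return out

def decompose(c):
    c = Counter(c)
    parts = []
    while any(c.values()):
        top = max(w for w, mult in c.items() if mult > 0)
        parts.append(top)
        c.subtract(ch(top))   # strip off one copy of ch V(top)
    return sorted(parts)

print(decompose(tensor(ch(3), ch(7))))  # [4, 6, 8, 10]
print(decompose(tensor(ch(2), ch(2))))  # [0, 2, 4]
```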
Prove that every three dimensional semisimple Lie algebra has the same root system as \(\mathfrak{s l}(2, { \mathbf{F} })\), hence is isomorphic to \(\mathfrak{s l}(2, { \mathbf{F} })\).
There is a formula for the dimension of \(L\) in terms of the rank of \(\Phi\) and its cardinality, which is more carefully explained in the solution below for problem 8.10: \begin{align*} \dim {\mathfrak{g}}= \operatorname{rank}\Phi + {\sharp}\Phi .\end{align*} Thus if \(\dim L = 3\) then the only possibility is that \(\operatorname{rank}\Phi = 1\) and \({\sharp}\Phi = 2\), using that \(\operatorname{rank}\Phi \leq {\sharp}\Phi\) and that \({\sharp}\Phi\) is always even since each \(\alpha\in \Phi\) can be paired with \(-\alpha\in \Phi\). In particular, the root system \(\Phi\) of \(L\) must have rank 1, and there is a unique root system of rank 1 (up to equivalence) which corresponds to \(A_1\) and \({\mathfrak{sl}}_2({ \mathbf{F} })\).
By the remark in Humphreys at the end of 8.5, there is a 1-to-1 correspondence between pairs \((L, H)\) with \(L\) a semisimple Lie algebra and \(H\) a maximal toral subalgebra and pairs \((\Phi, {\mathbb{E}})\) with \(\Phi\) a root system and \({\mathbb{E}}\supseteq\Phi\) its associated Euclidean space. Using this classification, we conclude that \(L\cong {\mathfrak{sl}}_2({ \mathbf{F} })\).
Prove that no four, five or seven dimensional semisimple Lie algebras exist.
We can first write \begin{align*} {\mathfrak{g}}= {\mathfrak{n}}^- \oplus {\mathfrak{h}}\oplus {\mathfrak{n}}^+,\qquad {\mathfrak{n}}^+ \mathrel{\vcenter{:}}=\bigoplus _{\alpha\in \Phi^+} {\mathfrak{g}}_{ \alpha},\quad {\mathfrak{n}}^- \mathrel{\vcenter{:}}=\bigoplus _{\alpha\in \Phi^+} {\mathfrak{g}}_{ -\alpha} .\end{align*} Writing \(N\mathrel{\vcenter{:}}={\mathfrak{n}}^+ \oplus {\mathfrak{n}}^- = \bigoplus _{\alpha\in \Phi} {\mathfrak{g}}_{\alpha}\), we note that \(\dim_{ \mathbf{F} }{\mathfrak{g}}_{ \alpha} = 1\) for all \(\alpha\in \Phi\). Thus \(\dim_{ \mathbf{F} }N = {\sharp}\Phi\) and \begin{align*} \dim_{ \mathbf{F} }{\mathfrak{g}}= \dim_{ \mathbf{F} }{\mathfrak{h}}+ {\sharp}\Phi .\end{align*} We can also use the fact that \(\dim_{ \mathbf{F} }{\mathfrak{h}}= \operatorname{rank}\Phi \mathrel{\vcenter{:}}=\dim_{\mathbf{R}}{\mathbf{R}}\Phi\), the dimension of the Euclidean space spanned by \(\Phi\), and so we have a general formula \begin{align*} \dim_{ \mathbf{F} }{\mathfrak{g}}= \operatorname{rank}\Phi + {\sharp}\Phi ,\end{align*} which we’ll write as \(d=r+f\).
We can observe that \(f\geq 2r\) since if \({\mathcal{B}}\mathrel{\vcenter{:}}=\left\{{ \alpha_1, \cdots, \alpha_r}\right\}\) is a basis for \(\Phi\), no \(-\alpha_i\) is in \({\mathcal{B}}\) but \(\left\{{\pm \alpha_1, \cdots, \pm \alpha_r}\right\} \subseteq \Phi\) by the axiomatics of a root system. Thus \begin{align*} \dim_{ \mathbf{F} }{\mathfrak{g}}= r+f \geq r + 2r = 3r .\end{align*}
We can now examine the cases for which \(d = r+f = 4,5,7\):

- \(d=4\): if \(r=1\) then \(f=3\) is odd, which is impossible; if \(r\geq 2\) then \(d \geq 3r \geq 6 > 4\).
- \(d=5\): if \(r=1\) then \(f=4\), which is impossible since the only rank 1 root system is \(A_1\), which has \({\sharp}\Phi = 2\); if \(r\geq 2\) then \(d\geq 6 > 5\).
- \(d=7\): if \(r=1\) then \(f=6\neq 2\), again impossible in rank 1; if \(r=2\) then \(f=5\) is odd; and if \(r\geq 3\) then \(d\geq 9 > 7\).

Since every case yields a contradiction, no semisimple Lie algebras of dimension 4, 5, or 7 exist.
Prove that \(\Phi {}^{ \vee }\) is a root system in \(E\), whose Weyl group is naturally isomorphic to \(\mathcal{W}\); show also that \(\left\langle\alpha {}^{ \vee }, \beta {}^{ \vee }\right\rangle=\langle\beta, \alpha\rangle\), and draw a picture of \(\Phi {}^{ \vee }\) in the cases \(A_1, A_2, B_2, G_2\).
We recall and introduce some notation: \begin{align*} {\left\lVert { \alpha} \right\rVert}^2 &\mathrel{\vcenter{:}}=(\alpha, \alpha) \\ \\ {\left\langle { \beta},~{ \alpha} \right\rangle} &\mathrel{\vcenter{:}}={2 (\beta, \alpha)\over {\left\lVert { \alpha} \right\rVert}^2} \\ &= {2 (\beta, \alpha) \over ( \alpha, \alpha)} \\ \\ s_\alpha( \beta) &= \beta - {2 (\beta, \alpha) \over {\left\lVert {\alpha} \right\rVert}^2 } \alpha \\ &= \beta - {2 (\beta, \alpha) \over (\alpha, \alpha) } \alpha \\ \\ \alpha {}^{ \vee }&\mathrel{\vcenter{:}}={2 \over {\left\lVert {\alpha} \right\rVert}^2} \alpha = {2\over (\alpha, \alpha)} \alpha .\end{align*}
\begin{align*} {\left\langle {\alpha {}^{ \vee }},~{\beta {}^{ \vee }} \right\rangle} = {\left\langle {\beta },~{\alpha} \right\rangle} .\end{align*}
This is a computation: \begin{align*} {\left\langle { \alpha {}^{ \vee }},~{ \beta {}^{ \vee }} \right\rangle} &= {2 (\alpha {}^{ \vee }, \beta {}^{ \vee }) \over {\left\lVert {\beta {}^{ \vee }} \right\rVert}^2 } \\ &= {2 (\alpha {}^{ \vee }, \beta {}^{ \vee }) \over (\beta {}^{ \vee }, \beta {}^{ \vee }) } \\ &= {2\qty{ {2\alpha\over {\left\lVert {\alpha} \right\rVert}^2}, {2 \beta\over {\left\lVert {\beta} \right\rVert}^2} } \over \qty{{2 \beta\over {\left\lVert {\beta} \right\rVert}^2}, {2 \beta\over {\left\lVert {\beta} \right\rVert}^2} } } \\ &= {2^3 {\left\lVert {\beta} \right\rVert}^4 (\alpha, \beta)\over 2^2 {\left\lVert {\alpha} \right\rVert}^2 {\left\lVert {\beta} \right\rVert}^2 (\beta, \beta)} \\ &= {2^3 {\left\lVert {\beta} \right\rVert}^4 (\alpha, \beta)\over 2^2 (\alpha, \alpha) {\left\lVert {\beta} \right\rVert}^2 {\left\lVert {\beta} \right\rVert}^2} \\ &= {2( \alpha, \beta) \over (\alpha, \alpha)} \\ &= {\left\langle {\beta},~{ \alpha} \right\rangle} .\end{align*}
\(\Phi {}^{ \vee }\) is a root system.
The axioms can be checked individually:
\(R1\): there is a bijection of sets \begin{align*} \Phi & { \, \xrightarrow{\sim}\, }\Phi {}^{ \vee }\\ \alpha &\mapsto \alpha {}^{ \vee } ,\end{align*} thus \({\sharp}\Phi {}^{ \vee }= {\sharp}\Phi < \infty\). To see that \({\mathbf{R}}\Phi {}^{ \vee }= {\mathbb{E}}\), for \(\mathbf{v}\in {\mathbb{E}}\), use the fact that \({\mathbf{R}}\Phi = {\mathbb{E}}\) to write \(\mathbf{v} = \sum_{ \alpha\in \Phi} c_\alpha \alpha\), then \begin{align*} \mathbf{v} &= \sum_{ \alpha\in \Phi} c_ \alpha \alpha \\ &= \sum_{ \alpha\in \Phi} c_{\alpha} {{\left\lVert { \alpha} \right\rVert}^2\over 2}\cdot {2\over {\left\lVert { \alpha} \right\rVert}^2} \alpha \\ &\mathrel{\vcenter{:}}=\sum_{ \alpha\in \Phi} c_{\alpha} {{\left\lVert { \alpha} \right\rVert}^2\over 2} \alpha {}^{ \vee }\\ &= \sum_{\alpha {}^{ \vee }\in \Phi {}^{ \vee }} d_{\alpha {}^{ \vee }} \alpha {}^{ \vee }, \qquad d_{ \alpha {}^{ \vee }} \mathrel{\vcenter{:}}={1\over 2}c_ \alpha {\left\lVert { \alpha} \right\rVert}^2 ,\end{align*} so \(\mathbf{v}\in {\mathbf{R}}\Phi {}^{ \vee }\). Finally, \(\mathbf{0}\not\in \Phi {}^{ \vee }\): for \(\alpha\in\Phi\) we have \(\alpha\neq \mathbf{0}\) and \(2/(\alpha, \alpha)\neq 0\), so \({2\over (\alpha, \alpha)} \alpha \neq \mathbf{0}\).
\(R2\): It suffices to show that the only multiples of \(\alpha {}^{ \vee }\) in \(\Phi {}^{ \vee }\) are \(\pm\alpha {}^{ \vee }\). So suppose \(\lambda \alpha {}^{ \vee }= \beta {}^{ \vee }\in \Phi {}^{ \vee }\), then \begin{align*} \lambda {2\over {\left\lVert { \alpha} \right\rVert}^2} \alpha = {2\over {\left\lVert { \beta} \right\rVert}^2} \beta \implies \beta = \lambda{{\left\lVert {\beta} \right\rVert}^2 \over {\left\lVert { \alpha} \right\rVert}^2} \alpha \mathrel{\vcenter{:}}=\lambda' \alpha ,\end{align*} and since \(\Phi\) satisfies \(R2\), we have \(\lambda' = \pm 1\) and \(\beta = \pm\alpha\). In either case \({\left\lVert {\beta} \right\rVert} = {\left\lVert {\alpha} \right\rVert}\), so \begin{align*} \pm 1 = \lambda' = \lambda {{\left\lVert { \beta} \right\rVert}^2 \over {\left\lVert { \alpha} \right\rVert}^2} = \lambda ,\end{align*} and thus \(\beta {}^{ \vee }= \pm \alpha {}^{ \vee }\).
Continuing:
\(R3\): It suffices to show that if \(\alpha {}^{ \vee }, \beta {}^{ \vee }\in \Phi {}^{ \vee }\) then \(s_{\alpha {}^{ \vee }}(\beta {}^{ \vee }) = \gamma {}^{ \vee }\) for some \(\gamma {}^{ \vee }\in \Phi {}^{ \vee }\). This follows from a computation: \begin{align*} s_{ \alpha {}^{ \vee }}(\beta {}^{ \vee }) &= \beta {}^{ \vee }- {\left\langle {\beta {}^{ \vee }},~{ \alpha {}^{ \vee }} \right\rangle} \alpha {}^{ \vee }\\ &= \beta {}^{ \vee }- {\left\langle { \alpha},~{ \beta} \right\rangle} \alpha {}^{ \vee }\\ &= {2\beta\over {\left\lVert { \beta} \right\rVert}^2 }- {\left\langle { \alpha},~{ \beta} \right\rangle} {2 \alpha\over {\left\lVert { \alpha} \right\rVert}^2 } \\ &= {2\beta\over {\left\lVert { \beta} \right\rVert}^2 }- {2 (\alpha, \beta) \over (\beta, \beta) } {2 \alpha\over {\left\lVert { \alpha} \right\rVert}^2 } \\ &= {2\beta\over {\left\lVert { \beta} \right\rVert}^2 }- {2 (\alpha, \beta) \over {\left\lVert {\beta} \right\rVert}^2 } {2 \alpha\over {\left\lVert { \alpha} \right\rVert}^2 } \\ &= {2\over {\left\lVert {\beta} \right\rVert}^2} \qty{ \beta - {2 (\alpha, \beta) \over {\left\lVert {\alpha} \right\rVert}^2 } \alpha } \\ &= {2\over ( \beta, \beta)} \qty{ \beta - {2 (\beta, \alpha) \over {\left\lVert {\alpha} \right\rVert}^2 } \alpha } \\ &= {2\over ( \beta, \beta)} \sigma_{ \alpha}(\beta) \\ &= {2\over ( \sigma_{ \alpha}(\beta), \sigma_{ \alpha}(\beta) )} \sigma_{ \alpha}(\beta) \\ &\mathrel{\vcenter{:}}=(\sigma_ \alpha( \beta)) {}^{ \vee } ,\end{align*} where we’ve used that \(\sigma_{\alpha}\) is an isometry with respect to the symmetric bilinear form \(({-}, {-})\).
\(R4\): This follows directly from the formula proved in the claim at the beginning: \begin{align*} {\left\langle { \alpha {}^{ \vee }},~{\beta {}^{ \vee }} \right\rangle} = {\left\langle { \beta},~{ \alpha} \right\rangle}\in {\mathbf{Z}} ,\end{align*} since \(\alpha, \beta\in \Phi\) and \(\Phi\) satisfies \(R4\).
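As a numeric cross-check (my own sketch, using the \(B_2\) realization with short roots \(\pm e_i\) and long roots \(\pm e_1 \pm e_2\); \(R1\) and \(R2\) are immediate from the bijection), one can verify \(R3\), \(R4\), and the pairing identity for \(\Phi {}^{ \vee }\) directly:

```python
# Check that the duals of the B2 roots are reflection-stable (R3), have
# integral pairings (R4), and satisfy <a_dual, b_dual> = <b, a>.
import itertools
import numpy as np

def coroot(a):
    return 2 * a / np.dot(a, a)

def pairing(b, a):  # <b, a> = 2(b,a)/(a,a)
    return 2 * np.dot(b, a) / np.dot(a, a)

def reflect(b, a):  # s_a(b) = b - <b,a> a
    return b - pairing(b, a) * a

Phi = [np.array(v, dtype=float) for v in
       [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (1, -1), (-1, 1)]]
dual = [coroot(a) for a in Phi]

def contains(S, v):
    return any(np.allclose(v, w) for w in S)

for a, b in itertools.product(dual, repeat=2):
    assert contains(dual, reflect(b, a))   # R3: dual closed under reflections
    val = pairing(b, a)
    assert np.isclose(val, round(val))     # R4: integrality
for a, b in itertools.product(Phi, repeat=2):
    assert np.isclose(pairing(coroot(a), coroot(b)), pairing(b, a))
print("duals of B2 satisfy R3, R4, and the pairing identity")
```

Incidentally, listing `dual` shows that the long and short roots swap, i.e. the dual of \(B_2\) is \(C_2\).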
There is an isomorphism of groups \({\mathcal{W}}(\Phi) { \, \xrightarrow{\sim}\, }{\mathcal{W}}(\Phi {}^{ \vee })\).
There is a map of Weyl groups \begin{align*} \tilde \psi: {\mathcal{W}}(\Phi) & { \, \xrightarrow{\sim}\, }{\mathcal{W}}(\Phi {}^{ \vee }) \\ s_ \alpha &\mapsto s_{\alpha {}^{ \vee }} .\end{align*} Since \(\alpha {}^{ \vee }\) is a positive scalar multiple of \(\alpha\), the reflections \(s_\alpha\) and \(s_{\alpha {}^{ \vee }}\) fix the same hyperplane and negate the same line, hence are the same transformation of \({\mathbb{E}}\). Thus \({\mathcal{W}}(\Phi)\) and \({\mathcal{W}}(\Phi {}^{ \vee })\) coincide as subgroups of \(\operatorname{GL}({\mathbb{E}})\), and \(\tilde\psi\) is well-defined on generators with inverse \(s_{\alpha {}^{ \vee }} \mapsto s_{ \alpha}\). Since it is also a group morphism, this yields an isomorphism of groups.
The following are pictures of \(\Phi {}^{ \vee }\) in the stated special cases:
In Table 1, show that the order of \(\sigma_\alpha \sigma_\beta\) in \(\mathcal{W}\) is (respectively) \(2,3,4,6\) when \(\theta=\pi / 2, \pi / 3\) (or \(2 \pi / 3\) ), \(\pi / 4\) (or \(3 \pi / 4\) ), \(\pi / 6\) (or \(5 \pi / 6\) ).
Note that \(\sigma_\alpha \sigma_\beta=\) rotation through \(2 \theta\).
Given the hint, this is immediate: if \(s_\alpha s_\beta = R_{2\theta}\) is a rigid rotation through an angle of \(2\theta\), then it’s clear that \begin{align*} R_{2 \cdot {\pi \over 2}}^2 = R_{2 \cdot {\pi \over 3}}^3 = R_{2\cdot {\pi \over 4}}^4 = R_{2\cdot {\pi \over 6}}^6 = \operatorname{id} ,\end{align*} since these are all rotations through an angle of \(2\pi\).
To prove the hint, note that in any basis, a reflection has determinant \(-1\), since it fixes an \((n-1){\hbox{-}}\)dimensional subspace (the hyperplane \(H_\alpha\) of reflection) and negates its 1-dimensional complement (generated by the normal to \(H_\alpha\)). On the other hand, \(\operatorname{det}(s_\alpha s_\beta) = (-1)^2 = 1\), and \(s_\alpha s_\beta\) is an isometry that fixes only the intersection \(H_\alpha \cap H_\beta = \left\{{\mathbf{0}}\right\}\), so it must be a rotation.
To see that this is a rotation through an angle of exactly \(2\theta\), consider applying \(s_\beta\circ s_\alpha\) to a point \(P\), letting \(H_\alpha, H_\beta\) be the corresponding hyperplanes. We then have the following geometric situation:
We then have \(\theta_1 + \theta_2 = \theta\), noting that the angle between \(\alpha\) and \(\beta\) is equal to the angle between the hyperplanes \(H_\alpha, H_\beta\). The total angle measure between \(P\) and \(s_\beta(s_\alpha(P))\) is then \(2\theta_1 + 2\theta_2 = 2\theta\).
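The hint can also be checked numerically; the following Python sketch (mine, not part of the exercise) builds the two reflections for each angle \(\theta\) from Table 1 and confirms that \(s_\beta s_\alpha\) is rotation through \(\pm 2\theta\) of the expected order:

```python
# For each theta, build the reflections through the hyperplanes orthogonal
# to alpha and beta, check s_beta s_alpha = rotation by ±2 theta, and check
# its order is 2, 3, 4, 6 respectively.
import numpy as np

def reflection(a):  # reflection through the hyperplane orthogonal to a
    a = a / np.linalg.norm(a)
    return np.eye(2) - 2 * np.outer(a, a)

def rotation(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

for theta, order in [(np.pi / 2, 2), (np.pi / 3, 3), (np.pi / 4, 4), (np.pi / 6, 6)]:
    alpha = np.array([1.0, 0.0])
    beta = np.array([np.cos(theta), np.sin(theta)])
    prod = reflection(beta) @ reflection(alpha)
    assert np.allclose(prod, rotation(2 * theta)) or np.allclose(prod, rotation(-2 * theta))
    assert np.allclose(np.linalg.matrix_power(prod, order), np.eye(2))
    print(f"theta = pi/{round(np.pi / theta)}: s_beta s_alpha has order {order}")
```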
Prove that the respective Weyl groups of \(A_1 \times A_1, A_2, B_2, G_2\) are dihedral of order \(4,6,8,12\). If \(\Phi\) is any root system of rank 2 , prove that its Weyl group must be one of these.
In light of the fact that \begin{align*} D_{2n} = \left\langle{s, r {~\mathrel{\Big\vert}~}r^n = s^2 = 1, srs^{-1}= r^{-1}}\right\rangle \end{align*} where \(r\) is a rotation and \(s\) is a reflection, for the remainder of this problem, let \(s \mathrel{\vcenter{:}}= s_\alpha\) and \(r \mathrel{\vcenter{:}}= s_\alpha s_\beta\) after choosing roots \(\alpha\) and \(\beta\).
\(A_1\times A_1\): we have \(\Phi(A_1 \times A_1) = \left\{{\pm e_1, \pm e_2}\right\}\), and setting \(\alpha = e_1, \beta = e_2\) yields \(\theta = \pi/2\). We have \begin{align*} {\mathcal{W}}(A_1\times A_1) = \left\{{\operatorname{id}, s_{\alpha}, s_{\beta}, s_\alpha s_\beta}\right\} \end{align*} where \(s_{\alpha}^2 = s_{\beta}^2 = 1\), \(s_\alpha s_\beta\) is rotation through \(2\theta = \pi\) radians, and \((s_\alpha s_\beta)^2 = \operatorname{id}\). Setting \(s= s_\alpha, r= s_\alpha s_\beta\) yields \(s^2 = r^2 = \operatorname{id}\) and \(srs^{-1} = r = r^{-1}\), which are the defining relations for \(D_4\) (here the Klein four-group).
\(A_2\): there is an inscribed triangle in the regular hexagon formed by the convex hull of the roots (see the dotted triangle below), and the reflections \(s_\alpha\) about the hyperplanes \(H_\alpha\) restrict to precisely the symmetries of this triangle, yielding \(D_{6}\). Alternatively, choose a simple system \(\Delta = \left\{{\alpha= e_1, \beta= -e_1 - e_2}\right\}\); then \(s \mathrel{\vcenter{:}}= s_\alpha\) and \(r\mathrel{\vcenter{:}}= s_\alpha s_\beta\) generate the Weyl group, and since \(s^2 = 1\) and \(r^3 = 1\) (here \(\theta = 2\pi/3\), so \(r\) is a rotation of order 3), these satisfy the relations of \(D_6\).
\(B_2\): there is similarly a square on which the hyperplane reflections act, highlighted with dotted lines here: Since the \(s_\alpha\) act faithfully as the symmetries of a square, we have \({\mathcal{W}}(B_2)\cong D_{8}\). Alternatively, take \(\alpha = e_1\) and \(\beta = -e_1 + e_2\) and set \(s = s_\alpha, r = s_\alpha s_\beta\). Then \({\mathcal{W}}(B_2) = \left\langle{s, r}\right\rangle\), and since \(s^2 = r^4 = e\) (here \(\theta = 3\pi/4\), so \(r\) is a rotation of order 4) and they satisfy the proper commutation relation, this yields precisely the relations for \(D_{2n}, n=4\).
\(G_2\): In this case, the convex hull of the short roots forms a hexagon, on which the hyperplane reflections precisely restrict to symmetries:
This yields \({\mathcal{W}}(G_2)\cong D_{12}\). Alternatively, take \(\alpha = e_1\) and \(\beta\) the long root in quadrant II, and set \(s = s_\alpha, r= s_\alpha s_\beta\); then \(s^2 = r^6 = 1\) (here \(\theta = 5\pi/6\), so \(r\) is a rotation of order 6) and again the commutation relations for \(D_{2n}, n=6\) are satisfied.
Finally, for any root system \(\Phi\) of rank 2, we will have \({\mathcal{W}}(\Phi) = \left\langle{ s \mathrel{\vcenter{:}}= s_ \alpha, r\mathrel{\vcenter{:}}= s_ \alpha s _{\beta} }\right\rangle\) for a choice of simple roots \(\alpha, \beta\). Because \(\theta\) is restricted to one of the angles in Table 1 in Humphreys \(\S 9.4\), i.e. the angles discussed in problem 9.3 above, the order of \(s\) is always 2 and the order of \(r\) is one of \(2,3,4,6\), so the group has order one of \(4,6,8,12\). Since \(srs^{-1}=srs = s_\alpha s_\alpha s_\beta s_\alpha = s_\beta s_\alpha = r^{-1}\) in all cases, this always yields a dihedral group.
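One can also recover these four orders by brute force; the following sketch (my own, with reflections realized as \(2\times 2\) matrices) generates \(\left\langle{s_\alpha, s_\beta}\right\rangle\) for each angle and counts elements:

```python
# Generate <s_alpha, s_beta> as 2x2 matrices by breadth-first closure and
# count elements, recovering the dihedral orders 4, 6, 8, 12.
import numpy as np

def reflection(a):
    a = a / np.linalg.norm(a)
    return np.eye(2) - 2 * np.outer(a, a)

def group_order(theta):
    gens = [reflection(np.array([1.0, 0.0])),
            reflection(np.array([np.cos(theta), np.sin(theta)]))]
    elts, frontier = [np.eye(2)], [np.eye(2)]
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = s @ g
                if not any(np.allclose(h, e) for e in elts):
                    elts.append(h)
                    new.append(h)
        frontier = new
    return len(elts)

for name, theta in [("A1 x A1", np.pi / 2), ("A2", 2 * np.pi / 3),
                    ("B2", 3 * np.pi / 4), ("G2", 5 * np.pi / 6)]:
    print(name, group_order(theta))  # 4, 6, 8, 12
```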
Let \(\Phi {}^{ \vee }\) be the dual system of \(\Phi, \Delta {}^{ \vee }=\left\{\alpha {}^{ \vee }\mathrel{\Big|}\alpha \in \Delta\right\}\). Prove that \(\Delta {}^{ \vee }\) is a base of \(\Phi {}^{ \vee }\).
Compare Weyl chambers of \(\Phi\) and \(\Phi {}^{ \vee }\).
Suppose that \(\Delta\) is a base of \(\Phi\). We can use the fact that bases are in bijective correspondence with Weyl chambers via the correspondence \begin{align*} \Delta \to \mathrm{WC}(\Delta) \mathrel{\vcenter{:}}=\left\{{v\in {\mathbb{E}}{~\mathrel{\Big\vert}~}(v, \delta) > 0 \,\, \forall \delta\in \Delta}\right\} ,\end{align*} sending \(\Delta\) to all of the vectors making an acute angle with all simple vectors \(\delta \in \Delta\), or equivalently the intersection of the positive half-spaces formed by the hyperplanes \(H_{\delta}\) for \(\delta\in \Delta\).
The claim is that \(\mathrm{WC}(\Delta {}^{ \vee }) = \mathrm{WC}(\Delta)\), i.e. the Weyl chamber is preserved under taking duals. This follows from the fact that if \(v\in \mathrm{WC}(\Delta)\), then \((v,\delta)> 0\) for all \(\delta\in \Delta\). Letting \(\delta {}^{ \vee }\in \Delta {}^{ \vee }\), we have \begin{align*} (v,\delta {}^{ \vee }) = \qty{v, {2 \over (\delta, \delta)}\delta } = {2 (v, \delta) \over (\delta,\delta)} > 0 ,\end{align*} using that \((v, \delta) > 0\) and \((\delta, \delta) > 0\). Since this works for every \(\delta {}^{ \vee }\in \Delta {}^{ \vee }\), this yields \(v\in \mathrm{WC}(\Delta {}^{ \vee })\), and a similar argument shows the reverse containment. So \(\Delta {}^{ \vee }\) corresponds to a fundamental Weyl chamber and thus a base.
Prove that there is a unique element \(\sigma\) in \(\mathcal{W}\) sending \(\Phi^{+}\)to \(\Phi^{-}\)(relative to \(\Delta\) ). Prove that any reduced expression for \(\sigma\) must involve all \(\sigma_\alpha(\alpha \in \Delta)\). Discuss \(\ell(\sigma)\).
The existence and uniqueness of such an element follows directly from the fact that \(W\) acts simply transitively on the set of bases: since \(\Delta\) and \(-\Delta\) are both bases, there is a unique \(w_0\in W\) such that \(w_0(\Delta) = - \Delta\), and consequently \(w_0(\Phi^+) = \Phi^-\). Since \(\ell(\sigma) = n(\sigma) \leq {\sharp}\Phi^+\) for any \(\sigma \in W\), where \(n(\sigma)\) denotes the number of positive roots sent to negative roots, and \(n(w_0) = {\sharp}\Phi^+\) by construction, \(w_0\) must be the longest element in \(W\), i.e. \(\ell(w_0)\) is maximal.
Any reduced expression for \(w_0\) must involve all \(s_\alpha\): if not, say \(s_\alpha\) doesn't occur in some reduced expression for \(w_0\), then \(w_0\) lies in the subgroup generated by \(\left\{{s_\beta {~\mathrel{\Big\vert}~}\beta \in \Delta\setminus\left\{{\alpha}\right\}}\right\}\). Each such \(s_\beta\) sends \(\alpha\) to \(\alpha\) plus an integer combination of the simple roots \(\beta\neq \alpha\), so any product of them sends \(\alpha\) to a root whose \(\alpha{\hbox{-}}\)coefficient is still 1; such a root is necessarily positive. But then \(w_0(\alpha) \in \Phi^+\), contradicting \(w_0(\Phi^+) = \Phi^-\).
Finally, we have \(\ell(w_0) = n(w_0) = {\sharp}\Phi^+\).
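As a concrete check in type \(B_2\) (my own sketch, with \(\Delta = \left\{{e_1, e_2 - e_1}\right\}\)), brute-forcing words in the simple reflections confirms that only words of length \(4 = {\sharp}\Phi^+\) send \(\Phi^+\) to \(\Phi^-\), and that each such word uses both simple reflections:

```python
# Brute-force words in the simple reflections of B2 and find exactly which
# ones send every positive root to a negative root.
from functools import reduce
from itertools import product
import numpy as np

def refl(a):
    a = np.array(a, dtype=float)
    a = a / np.linalg.norm(a)
    return np.eye(2) - 2 * np.outer(a, a)

s = [refl((1, 0)), refl((-1, 1))]   # simple reflections of B2
pos = [np.array(v, dtype=float) for v in [(1, 0), (0, 1), (1, 1), (-1, 1)]]

def sends_all_negative(g):
    return all(any(np.allclose(g @ b, -p) for p in pos) for b in pos)

found = [w for n in range(1, 5) for w in product(range(2), repeat=n)
         if sends_all_negative(reduce(np.matmul, (s[i] for i in w)))]
print(found)  # [(0, 1, 0, 1), (1, 0, 1, 0)]: length 4 = #Phi^+, both s_i used
```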
Use the algorithm of (11.1) to write down all roots for \(G_2\).
Do the same for \(C_3\): \begin{align*} \left(\begin{array}{rrr}2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -2 & 2\end{array}\right) \end{align*}
Note that it suffices to find all positive roots, since \(\Phi = \Phi^+ {\textstyle\coprod}\Phi^-\) once a simple system \(\Delta\) is chosen. Since \({\sharp}\Phi(G_2) = 12\), it thus suffices to find 6 positive roots. For \(G_2\), the Dynkin diagram indicates one long and one short root, so let \(\alpha\) be short and \(\beta\) be long. In this system we have \begin{align*} {\left\langle {\alpha },~{\alpha } \right\rangle}= {\left\langle {\beta },~{\beta } \right\rangle}&= 2 \\ {\left\langle {\alpha },~{\beta } \right\rangle}&= -1 \\ {\left\langle {\beta },~{\alpha } \right\rangle}&= -3 .\end{align*}
The \(\beta\) root string through \(\alpha\): since \(\operatorname{ht}(\alpha) = 1\) and \(\beta-\alpha\not\in \Phi\), we have \(r=0\). Since \(q = -{\left\langle {\alpha },~{\beta } \right\rangle}= -(-1) = 1\), we obtain the string \(\alpha, \alpha + \beta\).
The \(\alpha\) root string through \(\beta\): since \(\operatorname{ht}( \beta) = 1\) and \(\alpha -\beta \not\in \Phi\), we have \(r=0\) again. Here \(q = - {\left\langle {\beta },~{\alpha } \right\rangle}= - (-3) = 3\), so we obtain \(\beta, \beta+ \alpha, \beta + 2 \alpha, \beta+ 3 \alpha\).
We know that the \(\alpha\) root strings through any of the above roots will yield nothing new.
The \(\beta\) root strings through \(\alpha + \beta, \beta + 2\alpha\) turn out to yield no new roots.
The \(\beta\) root string through \(\beta + 3 \alpha\): since \((\beta + 3\alpha) - \beta = 3 \alpha\not\in\Phi\), using that only \(\pm \alpha\in \Phi\), we have \(r=0\). Since \begin{align*} r-q = {\left\langle {\beta + 3 \alpha},~{ \beta} \right\rangle} = {\left\langle {\beta },~{\beta } \right\rangle}+ 3 {\left\langle {\alpha },~{\beta } \right\rangle}= 2 + 3(-1) =-1 ,\end{align*} we have \(q=1\) and obtain \(\beta+ 3 \alpha, 2\beta + 3 \alpha\).
Combining these yields 6 positive roots: \begin{align*} \Phi^+(G_2) = \left\{{ \alpha, \alpha+ \beta, \beta, \beta+ 2 \alpha, \beta+ 3 \alpha, 2 \beta +3 \alpha}\right\} .\end{align*}
For \(C_3\), there are \(2\cdot 3^2 = 18\) total roots and thus 9 positive roots to find. Let \(\alpha, \beta, \gamma\) be the three ordered simple roots, then the Cartan matrix specifies \begin{align*} {\left\langle {\alpha },~{\alpha } \right\rangle}= {\left\langle {\beta },~{\beta } \right\rangle}= {\left\langle {\gamma },~{\gamma } \right\rangle}&= 2 \\ {\left\langle {\beta },~{\alpha } \right\rangle}= {\left\langle {\alpha },~{\beta } \right\rangle}&= -1 \\ {\left\langle {\alpha },~{\gamma } \right\rangle}= {\left\langle {\gamma },~{\alpha } \right\rangle}&= 0 \\ {\left\langle {\beta },~{\gamma } \right\rangle}&= -1 \\ {\left\langle {\gamma },~{\beta } \right\rangle}&= -2 .\end{align*}
This yields 9 positive roots: \begin{align*} \Phi^+(C_3) = \left\{{\alpha, \beta, \gamma, \alpha + \beta, \gamma + \beta, \gamma+ 2 \beta, \alpha+ \beta+ \gamma, \alpha+ 2 \beta + \gamma, 2\alpha + 2 \beta + \gamma}\right\} .\end{align*}
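Here is a Python sketch of the algorithm of (11.1) as I understand it (my own implementation; roots are stored as tuples of coefficients in the simple roots, and the Cartan matrix convention is \(C_{ij} = {\left\langle {\alpha_i},~{\alpha_j} \right\rangle}\)); it reproduces both lists of positive roots above:

```python
# Grow the set of positive roots height by height: extend a known root beta
# by a simple root alpha_i exactly when q = r - <beta, alpha_i> is positive,
# where r is read off from the roots already found.
def positive_roots(cartan):
    n = len(cartan)
    simples = [tuple(int(j == i) for j in range(n)) for i in range(n)]
    roots = set(simples)
    layer = list(simples)
    while layer:
        next_layer = []
        for beta in layer:
            for i in range(n):
                pair = sum(beta[j] * cartan[j][i] for j in range(n))
                r, b = 0, list(beta)
                b[i] -= 1
                while tuple(b) in roots:   # how far the alpha_i-string extends down
                    r += 1
                    b[i] -= 1
                if r - pair > 0:           # q > 0, so beta + alpha_i is a root
                    gamma = list(beta)
                    gamma[i] += 1
                    gamma = tuple(gamma)
                    if gamma not in roots:
                        roots.add(gamma)
                        next_layer.append(gamma)
        layer = next_layer
    return sorted(roots, key=sum)

G2 = [[2, -1], [-3, 2]]
C3 = [[2, -1, 0], [-1, 2, -1], [0, -2, 2]]
print(positive_roots(G2))  # 6 roots: (1,0),(0,1),(1,1),(2,1),(3,1),(3,2)
print(positive_roots(C3))  # 9 roots, the highest being (2,2,1)
```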
Do one type that is not \(A_n\).
You get something interesting if you take the commutator bracket of two upper triangular matrices.↩︎
The usual product somehow involves “second-order terms,” while the commutator product cancels higher order terms to give something first-order.↩︎
One should check that this is well-defined.↩︎
Use the third isomorphism theorem.↩︎
Lift a series for the quotient, which is eventually in \(Z(L)\) since it was zero in the quotient, and then bracketing with \(Z(L)\) terminates.↩︎
If \(L^{n-1} \neq 0\) and \(L^n = 0\), then \([LL^{n-1}] = L^n = 0\) and thus \(L^{n-1} \subseteq Z(L)\).↩︎
Note that for arbitrary SESs, the 2-out-of-3 property does not hold for nilpotency, but for the special cases of a quotient by the center it does.↩︎
Note that the assumption is not that \(L\) is a nilpotent algebra, but rather the stronger assumption on endomorphisms.↩︎
The derived series terminates immediately for an abelian algebra.↩︎
Associative means \(f([xy], z) = f(x, [yz])\), sometimes called invariant.↩︎
It turns out that the inverse map of vector spaces \(\psi^{-1}: W\to V\) is again a morphism of \(L{\hbox{-}}\)modules.↩︎
Note that groups would act on each factor separately, and this is more like a derivative.↩︎
One might expect an inverse from group theory, which differentiates to a minus sign.↩︎
See Humphreys p.22.↩︎
Note that this don’t actually exist! We’re in the middle of a contradiction.↩︎
This fails for infinite dimensional modules, e.g. Verma modules. The highest weight can be any complex number.↩︎
More generally, \begin{align*} {2 (\lambda, \alpha)\over (\alpha, \alpha) } = \lambda(h_ \alpha) \qquad\forall \alpha\in \Phi .\end{align*} ↩︎
Note that \({\mathfrak{sl}}_2\) has a basis \(\left\{{x,h,y}\right\}\) but is freely generated by \(x,y\) since \(h=[xy]\).↩︎
\(W_L\) is \(W\) made into a Lie algebra via \([xy] = xy-yx\).↩︎
If \(x\) is not diagonal, one can use that \(x\) is diagonalizable over \({ \mathbf{F} }\) since \(x\) has distinct eigenvalues in \({ \mathbf{F} }\). So one can reduce to the diagonal case by a change-of-basis of \({ \mathbf{F} }^n\) that diagonalizes \(x\).↩︎
Corollary C states that if \(L\) is solvable then every \(x\in L^{(1)}\) is ad-nilpotent, and thus \(L^{(1)}\) is nilpotent.↩︎
Examples: \(L\) abelian, \(L\) semisimple, \(L=\mathfrak{g l}_n({ \mathbf{F} })\).↩︎
If ad \(L \neq 0\), use Weyl’s Theorem.↩︎