Note: These are notes live-tex’d from a graduate course in moduli theory taught by Valery Alexeev at the University of Georgia in Fall 2020. As such, any errors or inaccuracies are almost certainly my own.
Last updated: 2022-12-27
Some examples of moduli spaces:
Consider \(X=E\) an elliptic curve, which can be defined as:
Recall that the Picard group is \begin{align*} \operatorname{Pic}(X) \mathrel{\vcenter{:}}=\left\{{\text{Invertible } {\mathcal{O}}_X{\hbox{-}}\text{sheaves}}\right\}/\sim \cong \left\{{\text{Line bundles over } X}\right\}/\sim .\end{align*} For \(X\) a curve, there is a surjective homomorphism \(\operatorname{Pic}(X) \xrightarrow{\deg} {\mathbf{Z}}\to 0\) with \(\operatorname{Pic}^0(X) \mathrel{\vcenter{:}}=\ker \deg\). A priori \(\operatorname{Pic}^0(X)\) is just a group, but in fact it has the structure of a variety: there exists a Jacobian variety \(\operatorname{Jac}(X)\) such that \(\operatorname{Pic}^0(X) \cong \operatorname{Jac}(X)(k)\), the \(k{\hbox{-}}\)points of the Jacobian. Thus \(\operatorname{Jac}(X)\) is a moduli space of invertible sheaves of degree zero.
For \(X=E\) an elliptic curve, \(\operatorname{Jac}(E) \cong E\).
There are distinct varieties with the same \(k{\hbox{-}}\)points: take for example the cuspidal curve \(X = V(y^2-x^3)\) and \({\mathbf{A}}^1\) – there is a map \begin{align*} {\mathbf{A}}^1 &\to X \\ t &\mapsto {\left[ {t^2, t^3} \right]} .\end{align*} with inverse \(t=y/x\):
Note that these have the same \(k{\hbox{-}}\)points over any field \(k\). Thus we need to consider not just objects, but families of objects.
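The bijection on points can be sanity-checked by brute force. The following is a minimal sketch over the finite fields \({\mathbf{F}}_p\); the choice of primes and the function names are illustrative assumptions, not part of the lecture.

```python
def cusp_points(p):
    """Affine F_p-points of the cuspidal cubic y^2 = x^3."""
    return {(x, y) for x in range(p) for y in range(p)
            if (y * y - x ** 3) % p == 0}


def param_image(p):
    """Image of A^1(F_p) under the parameterization t -> (t^2, t^3)."""
    return {(t * t % p, t ** 3 % p) for t in range(p)}


for p in (2, 3, 5, 7, 11, 13):
    # Every point of the cusp is hit, and distinct t give distinct points,
    # so the cusp has exactly p points, the same as the affine line.
    assert cusp_points(p) == param_image(p)
    assert len(param_image(p)) == p
```

The map is a bijection on points but not an isomorphism of varieties, since the inverse \(t = y/x\) is not a regular function at the origin.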
The moduli space of elliptic curves is \begin{align*} {\mathcal{M}}_{1} = \left\{{\text{Elliptic curves over }{ \overline{k} }}\right\}_{/ {\cong}} .\end{align*} As an algebraic variety, \({\mathcal{M}}_1 \cong {\mathbf{A}}^1_j\) (the \(j{\hbox{-}}\)line), coming from taking the \(j{\hbox{-}}\)invariant, which for a curve in Weierstrass form \(y^2 = x^3 + ax + b\) is \begin{align*} j(X) = j(a, b) = 1728 \cdot {4a^3\over 4a^3 + 27b^2} .\end{align*}
Then if \(X\to S\) is a family of genus 1 algebraic curves, there exists a unique map \(S\to {\mathbf{A}}^1_j\) where \(s\in S\) maps to \(j(X_s)\). How would you prove this? See Hartshorne’s treatment using the Weierstrass \(\wp{\hbox{-}}\)function. Alternatively, factor the cubic to get the Legendre form \(y^2=x(x-1)(x-\lambda)\) for \(\lambda\not\in\left\{{0, 1}\right\}\) and quotient by \(S_3\) acting by permuting \(\left\{{0,1,\lambda}\right\}\). One can then form \(M_1 = ({\mathbf{A}}^1_\lambda\setminus\left\{{0, 1}\right\})/S_3\) and construct \(j(\lambda)\) invariant under this action. Note that when \(X = \operatorname{Spec}R\) is affine and \(G\) is finite, there is an isomorphism \((\operatorname{Spec}R)/G \cong \operatorname{Spec}R^G\) with the GIT quotient. If \(X\) is not affine but \(G\) is finite, one can still patch together quotients locally.
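Both descriptions of \(j\) can be checked numerically. The sketch below assumes the standard normalizations \(j(a,b) = 1728\cdot 4a^3/(4a^3+27b^2)\) for \(y^2 = x^3+ax+b\) and \(j(\lambda) = 256(\lambda^2-\lambda+1)^3/(\lambda^2(\lambda-1)^2)\) for the Legendre form; these formulas and all function names are assumptions supplied for illustration.

```python
from fractions import Fraction


def j_weierstrass(a, b):
    """j-invariant of y^2 = x^3 + a x + b (assumed normalization)."""
    a, b = Fraction(a), Fraction(b)
    disc = 4 * a**3 + 27 * b**2
    if disc == 0:
        raise ValueError("singular cubic: 4a^3 + 27b^2 = 0")
    return 1728 * 4 * a**3 / disc


def rescale(a, b, u):
    """The change of coordinates (x, y) -> (u^2 x, u^3 y) acts on (a, b) like this."""
    return a * u**4, b * u**6


def j_legendre(lam):
    """j-invariant of y^2 = x(x-1)(x-lam) (assumed normalization)."""
    lam = Fraction(lam)
    return 256 * (lam**2 - lam + 1)**3 / (lam**2 * (lam - 1)**2)


def s3_orbit(lam):
    """The orbit of lam under the S_3-action on the lambda-line."""
    lam = Fraction(lam)
    return {lam, 1 / lam, 1 - lam, 1 / (1 - lam), lam / (lam - 1), (lam - 1) / lam}


# j depends only on the isomorphism class: it is unchanged by rescaling ...
assert j_weierstrass(*rescale(1, 1, Fraction(2))) == j_weierstrass(1, 1)
# ... and it is constant on S_3-orbits of lambda.
assert len({j_legendre(mu) for mu in s3_orbit(3)}) == 1
```

A generic orbit has 6 elements, while special values such as \(\lambda = 2\) have smaller orbits, reflecting curves with extra automorphisms.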
Moduli of sheaves or vector bundles (locally free \({\mathcal{O}}_X{\hbox{-}}\)modules of finite rank) on a fixed base variety \(X\), e.g. a curve. One might fix invariants like a rank \(r\), degree \(d\), etc. in order to impose a finiteness/boundedness condition on the moduli space. For \(X = {\mathbf{P}}^1\), a vector bundle \(F\to X\) decomposes as \(F = \bigoplus _{i=1}^r {\mathcal{O}}(d_i)\) where \(\deg F = \sum d_i\) by Grothendieck’s theorem. Since some \(d_i\) can be negative, for fixed \(r\) and \(d\) the moduli still come in a countably infinite set. To impose boundedness one can additionally add stability conditions such as semistability, which here ensures only finitely many degrees appear and the existence of a moduli space \({\mathcal{M}}_{r, d}(X)\). To construct it, twist by a large integer and take global sections to get \(H^0(X; F(n))\) for \(n\gg 0\). Understanding \(\bigoplus_{n\geq 0} H^0(X; F(n))\) as a module over \(R = \bigoplus _{n\geq 0} H^0(X; {\mathcal{O}}(n))\) allows one to reconstruct \(F\). Thus one can construct \({\mathcal{M}}_{r, d}(X) = ? / \operatorname{PGL}_N\), where the quotient corresponds to forgetting the choice of a basis for \(H^0\). Here we remove some “unstable” locus before taking the quotient; note that points correspond to orbits, except that some orbits become identified.
This is an “easy” moduli problem, since vector bundles are somehow linear. See Ramanujan, ?, Mumford, 40+ years ago.
Let \({\mathcal{M}}_2\) be the moduli of curves \(C\) with \(g(C) = 2\). All such curves are hyperelliptic, so similar to the \(g=1\) theory. In the \(g=1\) case, curves can be realized as ramified covers of \({\mathbf{P}}^1\):
In the \(g=2\) case, they can similarly be realized as 2-to-1 maps ramified at 6 points:
One can realize \({\mathbf{A}}^3\supseteq U \mathrel{\vcenter{:}}=\left\{{{\left[ { \lambda_1, \lambda_2, \lambda_3} \right]} {~\mathrel{\Big\vert}~}\lambda_i\neq 0,1,\infty, \lambda_i \neq \lambda_j}\right\}\) and \(M_2 = U/S_6\).
For \(g=3\), a smooth plane curve of degree \(d\) has genus \(g=(1/2)(d-1)(d-2)\) by the adjunction formula, so \(g=3\) corresponds to \(d=4\) and one obtains
Non-hyperelliptic: degree 4 curves in \({\mathbf{P}}^2\) (the generic case), or
Hyperelliptic: 2-to-1 covers of \({\mathbf{P}}^1\) ramified at 8 points.
There is no analog of the Weierstrass equation for degree 4 polynomials, so write \(f_4(x_1, x_2, x_3) = \sum a_m x^m\) where \(x^m \mathrel{\vcenter{:}}= x_1^{m_1}x_2^{m_2}x_3^{m_3}\). How many such polynomials are there? Count points in the triangle:
This yields \(5+4+3+2+1 = 15\) such monomials, and one can write \begin{align*} ({\mathbf{A}}^{15}\setminus\left\{{0}\right\})/ k^{\times}= {\mathbf{P}}^{14} \supseteq U = {\mathbf{P}}^{14}\setminus\Delta \end{align*} where \(\Delta\) is the discriminant locus. Here \(U\) is an affine variety, since the complement of a hypersurface in projective space is affine. Then form \(U/\operatorname{PGL}_3\), noting that \(\dim \operatorname{PGL}_3 = 3^2-1 = 8\), so \(\dim U/\operatorname{PGL}_3 = 14-8 = 6\).
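The count of monomials and the resulting dimension can be double-checked by direct enumeration. This is a small sketch; the helper name `monomials` is an assumption introduced here.

```python
from itertools import product
from math import comb


def monomials(degree, nvars):
    """Exponent vectors (m_1, ..., m_nvars) with m_1 + ... + m_nvars = degree."""
    return [m for m in product(range(degree + 1), repeat=nvars)
            if sum(m) == degree]


quartics = monomials(4, 3)           # monomials x1^m1 x2^m2 x3^m3 of degree 4
dim_proj = len(quartics) - 1         # dimension of the projective space of quartics
dim_pgl3 = 3**2 - 1                  # dim PGL_3
dim_quotient = dim_proj - dim_pgl3   # expected dimension of U / PGL_3

assert len(quartics) == comb(4 + 2, 2) == 15
assert dim_quotient == 3 * 3 - 3     # agrees with 3g - 3 for g = 3
```

The agreement with \(3g-3 = 6\) is consistent with plane quartics accounting for the generic genus 3 curve.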
Ways of forming moduli spaces:
These rarely produce compact/complete spaces, so we’ll discuss compactification. Why compactify? Computing things, projectivizing, intersection theory. See Baily-Borel and toroidal compactifications.
A note on Hodge theory: an elliptic curve over \({\mathbf{C}}\) can be written as \(E = {\mathbf{C}}/\left\langle{1, \tau}\right\rangle\) with \(\Im(\tau) > 0\) (so \(\tau\in {\mathbb{H}}\)), and one can form \({\mathcal{M}}_1 = {\mathbb{H}}/{\operatorname{SL}}_2({\mathbf{Z}})\). This is Hodge theory: \(\tau\) is a period, and we quotient a bounded symmetric domain by an arithmetic group. Similarly, for PPAVs one can write \({\mathcal{A}}_g = H_g/{\mathsf{Sp}}_{2g}({\mathbf{Z}})\), and for K3 surfaces one has \(F_{2d} = {\mathbb{D}}_{2d}/\Gamma_{2d}\) where \(\omega_X \in {\mathbb{D}}_{2d}\). One can determine things like Jacobians using Torelli theorems.
Todo: how much do you know, and what are you trying to get out of the course?
See Mukai’s book
A question from me: is every curve a branched cover of \({\mathbf{P}}^1\) over some number of points? Consider maps \(f: X\to {\mathbf{P}}^n\) where \(f(X) \not\subset {\mathbf{P}}^{n-1}\), i.e. the image is not contained in a hyperplane. These biject with basepoint free linear systems: let \({\mathcal{F}}\) be an invertible sheaf, then a linear system is any linear subspace \(V \subseteq H^0(X; {\mathcal{F}})\). Writing \(V = \left\langle{f_i}\right\rangle_{i=0}^n\), the bijection sends \(V\) to the map \(p\mapsto {\left[ {f_0(p): \cdots : f_n(p)} \right]} \in {\mathbf{P}}^{n}\). Since \({\mathcal{F}}\) is invertible, locally \({ \left.{{{\mathcal{F}}}} \right|_{{U}} } \cong {\mathcal{O}}_U\), so this map is well-defined precisely when not all the \(f_i(p)\) are zero, which is precisely the basepoint-free condition. A map \(f:X\to {\mathbf{P}}^1\) thus corresponds to two sections which don’t simultaneously vanish. If \(X\) is projective, it admits a very ample line bundle \({\mathcal{L}}\), for which the base locus of \(H^0({\mathcal{L}})\) is empty. One can now project away from any point outside of the curve to get a regular map factoring the projective embedding:
The projection:
One can continue to project until reaching \({\mathbf{P}}^1\).
The gonality of a curve \(X\) is the minimal degree of a map \(X\to {\mathbf{P}}^1\), where degree is the size of a generic fiber. Here a cover may have a ramification locus upstairs and a branch locus downstairs, which are small in the sense that they are algebraic subsets.
Recall \(X\) is hyperelliptic if it admits a 2-to-1 map \(X\to {\mathbf{P}}^1\), so has gonality 2. Gonality 1 curves are isomorphic to \({\mathbf{P}}^1\), and gonality 3 are trigonal.
Plan for the course:
Let \(X\) be a genus \(g\) smooth projective curve. Over \({\mathbf{C}}\), projective implies compact, and a non-projective smooth curve is a compact Riemann surface with finitely many punctures. More generally, over \(k={ \overline{k} }\) smooth means that \(\dim {\mathbf{T}}_{X, x} = \dim X\) at every point \(x\in X\). Note that \(X^{{\mathrm{sing}}} \subseteq X\) is an algebraic and thus closed subset, so curves have finitely many singularities (nodes, cusps, etc). There is only one topological type of smooth projective curve of genus \(g\), but there are distinct algebraic and conformal structures (which turn out to be equivalent notions for curves).
One can show \({\mathcal{M}}_g({\mathbf{C}})\) is an orbifold of dimension \(3g-3\), i.e. locally a quotient \(M/G\) of a manifold by a finite group. Similarly \({\mathcal{M}}_g\) is a quasiprojective algebraic variety of dimension \(3g-3\) with only quotient singularities. Mumford was the first to ask questions about its geometry, e.g. is it rational?
A variety \(X\in \mathsf{Alg}{\mathsf{Var}}_{/ {k}}\) is rational if \(X\overset{\sim}{\dashrightarrow}{\mathbf{P}}^n\), so there is a common open subset \(X\supseteq U \subset {\mathbf{P}}^n\). Equivalently, there is an isomorphism of fields of rational functions \begin{align*} k(X) \cong k({\mathbf{P}}^n)\cong k({\mathbf{A}}^n) ,\end{align*} where the latter consists of quotients of polynomials. One can take \(n=\dim X\), since if \(N > n\) one can factor a dominant map \({\mathbf{P}}^N \to X\) through a hyperplane \({\mathbf{P}}^{N-1}\to X\) which is still dominant.
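As a concrete instance of rationality, the conic \(V(x^2+y^2-1)\) is birational to \({\mathbf{A}}^1\) via projection from the point \((-1, 0)\). The example and the function names below are illustrative assumptions rather than something taken from the lecture; the sketch checks the two rational maps against each other over \({\mathbf{Q}}\).

```python
from fractions import Fraction


def circle_point(t):
    """Rational parameterization A^1 -> V(x^2 + y^2 - 1), t -> (x, y)."""
    t = Fraction(t)
    d = 1 + t * t            # never zero over Q
    return (1 - t * t) / d, 2 * t / d


def slope(x, y):
    """Rational inverse, defined away from the point (-1, 0)."""
    return y / (1 + x)


for t in range(-5, 6):
    x, y = circle_point(t)
    assert x * x + y * y == 1    # the image lies on the conic
    assert slope(x, y) == t      # the two maps are mutually inverse
```

On function fields this says \(k(x, y)/\left\langle{x^2+y^2-1}\right\rangle \cong k(t)\), with \(t = y/(1+x)\).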
\(X\) is unirational if there is a dominant rational map \(f: {\mathbf{P}}^n \dashrightarrow X\), so a map defined on an open subset whose image is dense. Equivalently, \(X\) admits a rational parameterization by coordinates \(x_1, \cdots, x_n\).
In this case, there is a degree \(d\) finite extension \(k(x_1,\cdots, x_n)\) over the pullback of \(k(X)\).
Is the converse true? I.e. if there is a finite extension \(k(x_1,\cdots, x_n)\) over \(k(X)\), is it true that \(k(X) = k(y_1,\cdots, y_n)\)? So does unirational imply rational?
Lüroth proved this in dimension 1, and as a consequence of the classification of surfaces, the Italian school showed this in dimension 2. See the Castelnuovo criterion, which shows that a smooth projective surface \(X\) is rational iff \(q \mathrel{\vcenter{:}}= h^1(X; {\mathcal{O}}_X) = 0\) (i.e. \(X\) is regular) and \(p_2 \mathrel{\vcenter{:}}= h^0(X; 2K_X) = 0\).
This is false in dimension 3. 3-4 counterexamples were given in the 70s/80s: the first due to Iskovskih-Manin, a second due to Clemens-Griffiths, and a later one due to Artin-Mumford.
Show that if \(k \subsetneq K \subseteq k(x)\), then \(K\) is monogenic over \(k\) (generated by a single element).
\({\mathcal{M}_g}\) is rational for \(g=2\).
Note that \(3(g-1)=3\), and a genus 2 curve is a double cover \(X\to {\mathbf{P}}^1\) branched over 6 points \(\left\{{0,1,\infty, \lambda_1, \lambda_2, \lambda_3}\right\}\). This yields a dominant map \({\mathbf{A}}^3\dashrightarrow {\mathcal{M}}_2\) which is finite-to-1 of degree \(6!\) and defined up to the action of \(S_6\). It is not defined if points collide, which corresponds to collapsing cycles in \(X\). Here we can write \(X = V(y^2 - x(x-1)(x - \lambda_1) (x- \lambda_2)(x - \lambda_3)) \subseteq {\mathbf{A}}^2_{/ {{\mathbf{C}}}}\). If any \(\lambda_i = \lambda_j\) for \(i\neq j\), one obtains a singularity locally modeled on the node \(y^2=x^2\), which is the following over \({\mathbf{R}}\):
Over \({\mathbf{C}}\), this is two lines intersecting in a single point. We can thus write \begin{align*} {\mathcal{M}}_2 = \left( {\mathbf{A}}^3\setminus\left\{{ \lambda_i \in \left\{{0, 1}\right\} \text{ or } \lambda_i = \lambda_j {~\mathrel{\Big\vert}~}i\neq j }\right\} \right) / S_6 ,\end{align*} which is rational and unirational.
\({\mathcal{M}}_3\) is rational and unirational.
We need to show that a genus 3 curve can be parameterized by 6 parameters. A generic genus 3 curve is planar of degree 4, which suffices: planar curves are given by polynomials \(f_4(x_1, x_2, x_3) = \sum a_n x^n\), and the coefficients \(a_n\) are the parameters.
One of the classical Italian algebraic geometers (either Severi or Castelnuovo) “proved” the false statement that \({\mathcal{M}_g}\) is unirational for all \(g\). In fact this is only true for \(g\leq 9\). The idea is good though: any curve \(X\hookrightarrow{\mathbf{P}}^n\) can be projected to a plane curve \(X \to {\mathbf{P}}^2\) with only finitely many nodes. The coordinates for the nodes can serve as parameters. Having a curve pass through given points is a linear condition, as is saying it is singular at a point (by computing partial derivatives). Being a node is not a linear condition: instead, it is a quadratic algebraic condition coming from the vanishing of a \(2\times 2\) determinant. It’s also not clear that the singularity conditions imposed at different points are all independent, since singularities at some points can force singularities at others. Mumford proved that \({\mathcal{M}_g}\) is not unirational for \(g\geq 24\), and is in fact general type, which is far from unirational.
Next time: more general introduction, stable curves, a bit about Hodge theory, then starting Mukai’s book.
Goal: showing \({\mathcal{M}_g}\) exists as a quasiprojective complex variety, and can in fact be defined over any field \(k\) or even over \({\mathbf{Z}}\). Here quasiprojective over \({\mathbf{Z}}\) means \({\mathcal{M}_g} = X\setminus Z\), where \(X \subseteq {\mathbf{P}}^n_{/ {{\mathbf{Z}}}}\) is a closed subset given as \(X = V(f_i)\) for homogeneous integral polynomials \(f_i\) and \(Z = V(f_i, g_i)\) is a closed subset of \(X\). Note that \({\mathcal{M}_g}({ \overline{k} }) = \left\{{\text{smooth projective curves of genus } g}\right\}{_{\scriptstyle / \sim} } = X({ \overline{k} }) \setminus Z({ \overline{k} }) \subseteq {\mathbf{P}}^n_{/ {{ \overline{k} }}}\), so points of \({\mathcal{M}_g}\) satisfy the equations \(f_i\) but not all of the \(g_i\). Anytime objects have nontrivial automorphisms, one only gets a coarse moduli space instead of a fine moduli space, which we’ll later describe. Families \({\mathcal{X}}\to S\) yield maps \(S\to {\mathcal{M}_g}\) over \(\operatorname{Spec}{ \overline{k} }\), and this will be a bijection when \({\mathcal{M}_g}\) is a fine moduli space and \({\mathcal{X}}\) is the pullback of a universal family \({\mathcal{E}}\to{\mathcal{M}_g}\). Since we only have a coarse moduli space, a family yields a map to \({\mathcal{M}_g}\), but these are not in bijection.
We’ll want projective varieties in order to do intersection theory. The most fundamental compactification: the Deligne-Mumford compactification \(\overline{{\mathcal{M}_g}}\) of \({\mathcal{M}_g}\), i.e. the moduli of stable curves of genus \(g\). This is a projective moduli space containing \({\mathcal{M}_g}\) as an open dense subset, and is obtained by adding degenerate curves “at infinity.”
Consider \(x_0 x_2 = t^n x_1^2\) in \({\mathbf{P}}^2_{x_0, x_1, x_2}\) and take the 1-parameter degeneration \(t\to 0\). This is smooth for \(t\neq 0\), since this is a full rank conic. In affine coordinates this is \(xy=t^n\), which degenerates to the simple node (double point) \(xy=0\). Part of this degeneration data can be recovered from a tropical curve, which is a metric graph whose points are singularities and lengths correspond to the \(n\) in \(t^n\):
A stable curve of genus \(g\) is a connected reduced (possibly reducible) projective curve \(C\) such that
Note that \(g=h^0(\omega_X) = h^1({\mathcal{O}}_X)\).
Writing a multi-component curve as \(X = \bigcup X_i\), the numerical condition requires that for every \(X_i\cong {\mathbf{P}}^1\), one has \({\left\lvert {X_i \cap(X\setminus X_i)} \right\rvert} \geq 3\), and for all \(X_i\) of the following form (or \(X_i\cong E\) an elliptic curve), \({\left\lvert {X_i \cap(X\setminus X_i)} \right\rvert} \geq 1\):
This is equivalent to \({\sharp}\mathop{\mathrm{Aut}}X < \infty\). For \(g\geq 2\) and \(C_g\) smooth of genus \(g\), one automatically has \({\sharp}\mathop{\mathrm{Aut}}C_g < \infty\), while for \(g=1\) one has \(\dim \mathop{\mathrm{Aut}}C_g = 1\), so at least 1 point must be fixed. For \(g=0\), note \(\mathop{\mathrm{Aut}}{\mathbf{P}}^1 = \operatorname{PGL}_2\) which has dimension 3, so fixing at least 3 points cuts this down to a finite automorphism group.
The dualizing sheaf \(\omega_X\) is invertible if \(X\) has only nodes. The adjunction formula yields a twist \({ \left.{{\omega_X}} \right|_{{X_i}} } = \omega_{X_i}(X\setminus X_i)\). Then \(\omega_X\) is ample iff \(\deg{ \left.{{\omega_X}} \right|_{{X_i}} } > 0\) for every component \(X_i\). One can compute \(\deg \omega_{X_i}(X\setminus X_i) = 2g_i - 2 + {\left\lvert {X_i \cap(X\setminus X_i)} \right\rvert}\), hence the lower bound on the number of intersection points.
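The degree formula turns the numerical condition into a check that is easy to automate. The sketch below assumes every component \(X_i\) is smooth (no self-nodes) and encodes a curve as a list of pairs \((g_i, {\left\lvert {X_i \cap(X\setminus X_i)} \right\rvert})\); this encoding and the function names are assumptions made for illustration.

```python
def degree_of_omega_on(genus, attaching_points):
    """deg of omega_X restricted to a smooth component: 2 g_i - 2 + #(X_i meet rest)."""
    return 2 * genus - 2 + attaching_points


def satisfies_numerical_condition(components):
    """components: list of (genus, number of points where it meets the rest)."""
    return all(degree_of_omega_on(g, k) > 0 for g, k in components)


# Two elliptic curves glued at a point: a stable curve of genus 2.
assert satisfies_numerical_condition([(1, 1), (1, 1)])
# Inserting a P^1 bridge meeting the rest in only 2 points breaks stability.
assert not satisfies_numerical_condition([(1, 1), (0, 2), (1, 1)])
# A P^1 meeting three elliptic tails is fine: 2*0 - 2 + 3 = 1 > 0.
assert satisfies_numerical_condition([(0, 3), (1, 1), (1, 1), (1, 1)])
```

A rational component meeting the rest in 2 points has degree 0, which matches the fact that it retains a positive-dimensional group of automorphisms fixing those 2 points.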
Without the numerical condition, the limit is not unique.
To see this, take a trivial family over \({\mathbf{P}}^1\), so a surface, and blow up a point on the central fiber. This yields a multi-component curve, which we allow, and we can continue blowing up such points:
If \(\omega_{X_t}\) is ample for all \(t\), then \(\omega_{{\mathcal{X}}/S}\) is relatively ample, which implies \({\mathcal{X}}/S\) is the canonical model. One can contract \((-1)\) curves to get a minimal model, and \((-2)\) curves to get canonical models. See degenerations of elliptic curves to wheels of copies of \({\mathbf{P}}^1\):
See Kodaira’s elliptic fibers – classified by extended Dynkin diagrams \(\tilde A_n, \tilde D_n, \tilde E_n\), and special types \(\tilde A_i^*\) for \(i=0,1,2\):
For \(V\) a projective variety, a stable map is a map \(f:C\to V\) satisfying
So for example, we can ignore vertical curves:
One can define a moduli space of stable curves passing through \(n\) marked points, \(\overline{{\mathcal{M}}}_{g, n}(V)\):
Defined to formulate Gromov-Witten invariants. Motivated by physics, but originally non-algebraic and used almost complex structures. The second condition yields unique limits since it will yield a relative canonical model, which exist and are unique. This moduli space can be generalized to higher dimensions, see KSBA compactifications.
Constructing \({\mathcal{M}_g}\) and \(\overline{{\mathcal{M}_g}}\):
Step 1: Parameterize embedded curves \(C_g \hookrightarrow{\mathbf{P}}^N\) by picking a basis of the linear system \({\left\lvert {2K_X} \right\rvert}\), where \(N = 2(2g-2) - (g-1) - 1 = 3g-4\) and \(\deg C_g = 2(2g-2)\). Use either the Chow variety \(\mathsf{Ch}_{d, N}\), parameterizing cycles/subvarieties of \({\mathbf{P}}^N\) with degree \(d\), or the Hilbert scheme \(\operatorname{Hilb}_h\) parameterizing closed subschemes \(X \hookrightarrow{\mathbf{P}}^N\) with a fixed Hilbert polynomial \(h\). The latter may not yield reduced curves, but closed subschemes are easier than varieties since they are just defined by equations.
Step 2: Divide by \(\operatorname{PGL}_{N+1}\), using GIT (next week) to produce a space \(X/G\) whose points (ideally) correspond to \(G{\hbox{-}}\)orbits.
Goal: understanding quotients of varieties by general group actions, a basic notion for moduli. The easiest case: finite groups \(G\curvearrowright X\in{\mathsf{Aff}} \mathsf{Alg}{\mathsf{Var}}_{/ {k}}\) for \(k={ \overline{k} }\).
Think of \(X \subseteq {\mathbf{A}}^n_{/ {k}}\) for some \(n\), with coordinates \({\left[ {a_1,\cdots, a_n} \right]}\), so \(X = V(f_1,\cdots, f_m)\). Note \({\mathcal{O}}_{{\mathbf{A}}^n} = k[x_1, \cdots, x_{n}]\), and regular functions on \(X\) are restricted polynomials, so we get a sequence \begin{align*} R = k[X] \leftarrow k[x_1, \cdots, x_{n}]\leftarrow I = \sqrt{\left\langle{f_1,\cdots, f_m}\right\rangle} ,\end{align*} so \(R\in {}_{k} \mathsf{Alg}^{\mathrm{fg}}\) without nilpotents, and in fact varieties biject with such algebras. If \(G\curvearrowright R\) for any ring \(R\), one can take invariants \begin{align*} R^G \mathrel{\vcenter{:}}=\left\{{r\in R {~\mathrel{\Big\vert}~}g^*(r) = r\,\,\forall g\in G}\right\} \end{align*} which is a subring and a \(k{\hbox{-}}\)subalgebra of \(R\). Here \(g^*\) is defined in terms of pullbacks of functions:
If \(\operatorname{ch}(k) \nmid {\sharp}G\) then \(R^G\in {}_{k} \mathsf{Alg}^{{\mathrm{fg}}}\).
There is a \(k{\hbox{-}}\)linear averaging map \begin{align*} S: R &\to R^G \\ r &\mapsto {1\over {\sharp}G} \sum_{g\in G} g^*(r) ,\end{align*} noting that \(S\) is not a ring morphism.
Let \(a\in R\) and consider \(p_a(x) \mathrel{\vcenter{:}}=\prod_{g\in G} (x - g^*(a) )\), a monic polynomial of degree \(n = {\sharp}G\) whose coefficients are, up to sign, the elementary symmetric polynomials in the \(g^*(a)\) and thus lie in the subring \(R^G\). Since \(p_a(a) = 0 = a^n + \cdots\), \(a^n\) is a linear combination of \(1,a,\cdots, a^{n-1}\) with coefficients in \(R^G\) given by these symmetric polynomials. So if \(\left\{{a_1,\cdots, a_m}\right\}\) generate \(R\) as a \(k{\hbox{-}}\)algebra, then \(R^G\) is generated by the coefficients of the \(p_{a_i}\) together with the images of monomials \(S(a_1^{k_1}\cdots a_m^{k_m})\) with \(0 \leq k_i < n\). Indeed, if \(b\in R^G\), on one hand \(b = S(b)\), and on the other hand \(b = \sum c_k a_I^{k_I}\) with all exponents less than \(n\), so \(S(b) = \sum c_k S(a_I^{k_I})\) since \(S\) is \(R^G{\hbox{-}}\)linear. Here the \(c_k\) are in the subring generated by elementary symmetric polynomials in the \(g^*(a_i)\).
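The properties of the averaging map used in this argument can be tested on a toy case. The sketch below takes \(G = S_2\) swapping the variables of \({\mathbf{Q}}[x, y]\), with polynomials stored as dictionaries from exponent pairs to coefficients; this encoding and all helper names are assumptions made for illustration.

```python
from fractions import Fraction


def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {m: c for m, c in p.items() if c != 0}


def add(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return clean(out)


def mul(p, q):
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return clean(out)


def swap(p):
    """Pullback along the nontrivial element of S_2: exchange x and y."""
    return {(j, i): c for (i, j), c in p.items()}


def average(p):
    """The averaging map S(r) = (1/#G) * sum over g of g^*(r)."""
    return clean({m: Fraction(1, 2) * c for m, c in add(p, swap(p)).items()})


x = {(1, 0): Fraction(1)}
y = {(0, 1): Fraction(1)}
b = add(x, y)  # an invariant element

assert average(average(x)) == average(x)              # S is a projection onto R^G
assert mul(average(x), average(y)) != average(mul(x, y))  # S is not a ring morphism
assert average(mul(b, x)) == mul(b, average(x))        # but S is R^G-linear
```

The last property is the one that lets \(S\) be applied term by term to \(b = \sum c_k a_I^{k_I}\).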
There is another basis for elementary symmetric polynomials given by Newton sums. Recall
The Newton sums are
and one can inductively show that one can be written in terms of the other.
The advantage is that the averaging operator commutes with sums: the Newton sums of the \(g^*(a_i)\) are \(\sum_{g\in G} g^*(a_i)^{k} = {\sharp}G \cdot S(a_i^{k})\), so the \(c_k\) lie in the subring generated by the \(S(a_i^{k_i})\).
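The inductive passage between the two families of symmetric functions is given by Newton's identities, \(k e_k = \sum_{i=1}^{k} (-1)^{i-1} e_{k-i} p_i\), where \(e_k\) denotes the elementary symmetric polynomials and \(p_i\) the power (Newton) sums. The following sketch recovers the \(e_k\) from the \(p_i\) and compares against a direct computation; the notation and function names are assumptions made for illustration, and the division by \(k\) is where characteristic zero (or \(n!\) invertible) is needed.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def elementary(values, k):
    """e_k(values): sum of all products of k distinct entries."""
    return sum(prod(c) for c in combinations(values, k))


def power_sum(values, k):
    """p_k(values): sum of k-th powers."""
    return sum(v**k for v in values)


def elementary_from_power_sums(p, n):
    """Given p[1..n] (p[0] is unused), return [e_0, ..., e_n] via Newton's identities."""
    e = [Fraction(1)]
    for k in range(1, n + 1):
        total = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        e.append(Fraction(total) / k)  # requires k to be invertible
    return e


values = [2, 3, 5, 7]
n = len(values)
p = [0] + [power_sum(values, k) for k in range(1, n + 1)]
assert elementary_from_power_sums(p, n) == [elementary(values, k) for k in range(n + 1)]
```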
Assume \(G\) is finite and acts on \(X\in{\mathsf{Aff}} \mathsf{Alg}{\mathsf{Var}}_{/ {k}}\). There is a bijection \begin{align*} \left\{{G{\hbox{-}}\text{orbits on } X}\right\} \rightleftharpoons\left\{{\text{Points of an affine variety $Y$ with } k[Y] = k[X]^G}\right\} .\end{align*} Writing \(X = \operatorname{mSpec}R\) (since we’re working with varieties over a field), one can write \(Y = \operatorname{mSpec}(R^G)\). There is a quotient map \(\pi: X\to Y\) which is universal with respect to \(G{\hbox{-}}\)invariant maps \(\psi: X\to Z\) with \(Z\) affine. This gives a geometric and a categorical quotient.
Since \(R^G \hookrightarrow R\) we obtain a morphism \(X \xrightarrow{\pi} Y\) of varieties and a pullback \(k[Y] \xrightarrow{\pi^*} k[X]\). Given \(\varphi\in k[Y]\), the pullback \(\pi^*( \varphi)\) is constant on \(G{\hbox{-}}\)orbits. Given two orbits \(O_1, O_2\), one can find an invariant function which is zero on \(O_1\) and one on \(O_2\). Any finite subset of a variety is closed. Delete a point from \(O_2\) to get a proper containment of sets \(O_1 \cup(O_2\setminus\left\{{ p }\right\}) \subset O_1 \cup O_2\) which are both closed in \(X\). This corresponds to a proper containment of ideals, so pick a function vanishing on the former but not the latter and average. Thus the regular invariant functions separate orbits, and the images of the \(O_i\) in \(Y\) are distinct, making \(X\to Y\) a geometric quotient.
For the universal property, any \(G{\hbox{-}}\)invariant map \(X\to Z\) defines a ring morphism \(k[Z]\to R\), and \(G{\hbox{-}}\)invariance factors this as \(k[Z]\to R^G \hookrightarrow R\), thus factoring \(X\to Y\to Z\).
The right classes of groups to take: geometrically reductive and linearly reductive.
Over \({\mathbf{C}}\) these coincide, and are e.g. \(\operatorname{GL}_n({\mathbf{C}})\) (nontrivial center, nontrivial \(\pi_1\)), the classical semisimples
along with \(({\mathbf{C}}^{\times})^n\), and their products and extensions, and the exceptional groups \(E_6, E_7, E_8, F_4, G_2\). Here linearly reductive means any finite-dimensional representation decomposes into a sum of irreducible representations.
The most useful for moduli: \(\operatorname{GL}_n, \operatorname{PGL}_n, {\operatorname{SL}}_n\). Note that \(\operatorname{PGL}\) and \({\operatorname{SL}}\) are almost the same, up to a finite group.
For \(\operatorname{ch}(k) = p\), the only linearly reductive group on this list is \({\mathbf{G}}_m^n\), while “geometrically reductive” includes all of these groups. Over \({\mathbf{Z}}\), the split versions \(\operatorname{GL}_n({\mathbf{Z}}), {\operatorname{Sp}}_n({\mathbf{Z}})\), etc still work.
Nonsplit groups are e.g. those not isomorphic to \(\operatorname{GL}_n(k)\) but which become isomorphic over \({ \overline{k} }\). Example: compare \({\mathbf{G}}_m\) over \({\mathbf{R}}\) and \(S^1 = \operatorname{Spec}{\mathbf{R}}[x,y]/\left\langle{x^2+y^2-1}\right\rangle\); these only become isomorphic over \({\mathbf{C}}\).
Let \(S_n\curvearrowright k[x_1, \cdots, x_{n}]\) by permuting variables, then \begin{align*} k[x_1, \cdots, x_{n}]^{S_n} = k[\sigma_1, \cdots, \sigma_n] = k[N_1, \cdots, N_n] ,\end{align*} generated by elementary symmetric functions or Newton polynomials.
Suppose \(G\curvearrowright{\mathbf{C}}[x_1,\cdots, x_n]\) with \(G\) finite and generated by pseudo-reflections. Then the invariants are again a polynomial ring: \begin{align*} {\mathbf{C}}[x_1,\cdots, x_n] ^G \cong {\mathbf{C}}[z_1,\cdots, z_n] .\end{align*}
More generally, a root lattice \(\Lambda\) (e.g. for a Coxeter group) gives rise to a Weyl group \(W(\Lambda)\), and one can consider \(W{\hbox{-}}\)invariant functions. For example, \(W(A_n) = S_{n+1}\). For a torus, invariant functions are characters. For a Lie algebra \({\mathfrak{g}}\), one can show that the \(W{\hbox{-}}\)invariants of symmetric functions on the torus, \(S({\mathfrak{h}})^W\), forms a polynomial algebra. The generators are referred to as the fundamental weights.
Coming up next: groups of multiplicative type, infinite groups, and generalizing the above theorem by removing some problematic subsets.
Last time: for \(G\curvearrowright R \supseteq R^G\) for \(G\in{\mathsf{Fin}}{\mathsf{Grp}}\), the Shephard-Todd(-Chevalley) theorem states that if \(G\curvearrowright{\mathbf{A}}^n\) linearly and \(G\) is generated by pseudoreflections then \(k[x_1, \cdots, x_{n}]^G\) is again a polynomial ring. Consider now \(G\curvearrowright R\) for \(G\) finite abelian and \(\operatorname{ch}k = 0\). This yields a grading \(R = \bigoplus _{\chi \in \widehat{G}} R_\chi\) where \(\widehat{G} = \mathop{\mathrm{Hom}}(G, {\mathbf{C}}^{\times}) = \mathop{\mathrm{Hom}}(G, {\mathbf{Q}}/{\mathbf{Z}})\) and \(R_\chi R_{\chi'} \subseteq R_{\chi + \chi'}\). Note that if \(G \cong \bigoplus {\mathbf{Z}}/n_i {\mathbf{Z}}\) then \(\widehat{G} \cong \bigoplus \mu_{n_i}\), which is non-canonically isomorphic to \(\bigoplus {\mathbf{Z}}/n_i{\mathbf{Z}}\). Recall that pseudoreflections have eigenvalues \(\left\{{1, 1,\cdots, 1, \alpha\neq 1}\right\}\).
Let \(C_2 \curvearrowright{\mathbf{A}}^2_{/ {{\mathbf{C}}}}\) by \((x,y)\mapsto (-x, -y)\). What are the invariants \(k[x,y]^{C_2}\)? Check that \(p(x,y) = \sum a_{ij} x^i y^j \mapsto \sum (-1)^{i+j} a_{ij} x^i y^j\), which equals \(p(x,y)\) precisely when \(i+j\) is even for every monomial appearing in \(p\). Write \(k[x,y] = \bigoplus _{i+j\equiv_2 0} k x^iy^j \oplus \bigoplus _{i+j \equiv_2 1} k x^i y^j \mathrel{\vcenter{:}}= R_0 \oplus R_1\) and note \(\widehat{G} \cong \mu_2 \cong C_2\) and \(?\). Also note that for \(r \in R_\chi\) we have \(g.r = \chi(g) r\). We can write \begin{align*} k[x,y]^{C_2} = R_0 = k[x^2, xy, y^2] = k[u,v,w]/\left\langle{uw-v^2}\right\rangle ,\end{align*} which is a singular cone \(V(uw-v^2) \subseteq {\mathbf{A}}^3\):
The Shephard-Todd theorem does not apply here since the action is given by \({ \begin{bmatrix} {-1} & {0} \\ {0} & {-1} \end{bmatrix} }\), which is not a pseudoreflection.
Take \(C_2\curvearrowright{\mathbf{A}}^2\) by \((x,y)\mapsto (x,-y)\), which is a reflection; then \(k[x,y]^{C_2} = k[x, y^2]\) is again a polynomial ring.
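The description of the invariant ring for \((x,y)\mapsto(-x,-y)\) can be checked monomial by monomial. The sketch below is illustrative only: the dictionary encoding of polynomials and the function names are assumptions, with \(u = x^2\), \(v = xy\), \(w = y^2\) as in the presentation of the cone.

```python
def act(poly):
    """Pull back a polynomial {(i, j): coeff} along (x, y) -> (-x, -y)."""
    return {(i, j): (-1) ** (i + j) * c for (i, j), c in poly.items()}


def isotypic_parts(poly):
    """Split a polynomial into its R_0 (invariant) and R_1 (anti-invariant) parts."""
    r0 = {m: c for m, c in poly.items() if sum(m) % 2 == 0}
    r1 = {m: c for m, c in poly.items() if sum(m) % 2 == 1}
    return r0, r1


def uvw_exponents(i, j):
    """Write an invariant monomial x^i y^j as u^a v^b w^c; returns (a, b, c)."""
    if (i + j) % 2:
        raise ValueError("x^i y^j is not invariant when i + j is odd")
    b = i % 2  # one factor of v = xy is needed exactly when i and j are both odd
    return (i - b) // 2, b, (j - b) // 2


# Every invariant monomial up to degree 8 is a monomial in u, v, w.
for i in range(9):
    for j in range(9 - i):
        if (i + j) % 2 == 0:
            a, b, c = uvw_exponents(i, j)
            assert (2 * a + b, b + 2 * c) == (i, j)

p = {(2, 0): 3, (1, 1): -1, (1, 0): 5, (0, 3): 2}
r0, r1 = isotypic_parts(p)
assert act(r0) == r0 and act(r1) == {m: -c for m, c in r1.items()}
```

The relation \(uw = v^2\) appears because \(x^2y^2\) can be written both as \(u w\) and as \(v^2\).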
Note that in general, \({\mathbf{A}}^n/G = \operatorname{mSpec}k[x_1, \cdots, x_{n}]^G\) has quotient singularities. Three types of varieties we work with in AG:
Upshot: we can think of projective varieties not as covered by affines, but rather as a “spectrum” of a single graded ring. Given a subset \(Z = V(f_1,\cdots, f_m) \subseteq {\mathbf{P}}^n_{/ {k}}\) cut out by homogeneous polynomials of degree \(d_i\) in the homogeneous degree 1 coordinates \(x_0,\cdots, x_n\), one can take the affine cone \(C(Z) \subseteq {\mathbf{A}}^{n+1}\). A linear action of \(G\curvearrowright{\mathbf{P}}^n_{/ {k}}\) preserving \(Z\) restricts to \(G\curvearrowright Z\), where linear means that \(g.{\left[ {x_0:\cdots:x_n} \right]} = M_g{\left[ {x_0: \cdots: x_n} \right]}\) for a matrix \(M_g\). Not every action is of this form: take \(G={{\mathbf{C}}^{\times}}\curvearrowright{\mathbf{P}}^1\) by \(\lambda {\left[ {x_0: x_1} \right]} = {\left[ {x_0: \lambda x_1} \right]}\). This is linear; to make a nonlinear action, glue the fixed points \(\left\{{0}\right\}\) and \(\left\{{\infty}\right\}\) to get a rational nodal curve:
Note that \(\operatorname{Pic}(C) = {\mathbf{Z}}\oplus {{\mathbf{C}}^{\times}}\).
For a linear action by a finite group \(G\), writing \(Z = \operatorname{mProj}R\) with \(R = k[x_0, \cdots, x_{n}]/\sqrt{\left\langle{f_i}\right\rangle}\), one has \(Z/G = \operatorname{mProj}R^G\). Such actions can be lifted from \(Z\) to \({ \left.{{{\mathcal{O}}_{{\mathbf{P}}^n}(1) }} \right|_{{Z}} } = {\mathcal{O}}_Z(1)\).
A variety \(G\in{\mathsf{Var}}_{/ {k}}\) is a group variety if it admits morphisms
These are required to satisfy some axioms.
Encoding associativity:
Encoding \(1a=a\):
Encoding \(aa^{-1}= 1\):
Suppose that \(G = \operatorname{Spec}R\) is affine, then there are dual notions:
The additive group \({\mathbf{G}}_a = \operatorname{Spec}k[x]\), whose underlying variety is \({\mathbf{A}}^1\). In coordinates, the group law is written additively as
Write \(z=x+y\), then on the ring side we have
Comultiplication: \begin{align*} \mu^*: k[z] &\to k[x]\otimes_k k[y] \cong k[x, y] \\ z &\mapsto x\otimes 1 + 1\otimes y \mapsto x+y .\end{align*}
Counit: \(e^*: k[x] \to k\) where \(x\mapsto 0\)
Coinverse: \(i^*: k[x] \to k[x]\) where \(x\mapsto -x\).
The multiplicative group \({\mathbf{G}}_m = \operatorname{Spec}k[x, x^{-1}]\), whose underlying variety is \({\mathbf{A}}^1\setminus\left\{{0}\right\}\).
The group law is:
For rings:
Comultiplication: \begin{align*} \mu^*: k[z, z^{-1}] &\to k[x,x^{-1}]\otimes_k k[y,y^{-1}] \cong k[x^{\pm 1}, y^{\pm 1}] \\ z &\mapsto x\otimes y \mapsto xy .\end{align*}
Counit: \(e^*: k[x^{\pm 1} ] \to k\) where \(x\mapsto 1\).
Coinverse: \(i^*: k[x^{\pm 1}] \to k[x^{\pm 1}]\) where \(x\mapsto x^{-1}\).
Roots of unity \(\mu_n = \operatorname{Spec}k[x]/\left\langle{x^n-1}\right\rangle\). Note that there is a closed embedding \(\mu_n \hookrightarrow{\mathbf{G}}_m\) since there is a surjection \(k[x, x^{-1}] \twoheadrightarrow k[x]/\left\langle{x^n-1}\right\rangle\). Note that in \(\operatorname{ch}k = p\), this yields a scheme that is not a variety since it is not reduced: one has \(\mu_p = \operatorname{Spec}k[x]/\left\langle{x^p-1}\right\rangle = \operatorname{Spec}k[x]/\left\langle{(x-1)^p}\right\rangle\) which contains nilpotents. This is the first example of a group scheme which is not a group variety.
The group operations agree with those on \({\mathbf{G}}_m\), e.g. comultiplication is \begin{align*} \mu^*: k[z]/\left\langle{z^n-1}\right\rangle \to k[x]/\left\langle{x^n-1}\right\rangle\otimes_k k[y]/\left\langle{y^n-1}\right\rangle \cong k[x,y]/\left\langle{x^n-1, y^n-1}\right\rangle .\end{align*} One can similarly define \(\alpha_p \mathrel{\vcenter{:}}=\ker\operatorname{Frob}= \operatorname{Spec}k[x]/\left\langle{x^p}\right\rangle \hookrightarrow{\mathbf{G}}_a\).
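The identity \(x^p - 1 = (x-1)^p\) in characteristic \(p\) behind the non-reducedness of \(\mu_p\) comes down to \(p\) dividing the middle binomial coefficients. A small sketch, with function names chosen here for illustration:

```python
from math import comb


def x_minus_1_to_the(n, modulus):
    """Coefficients of (x - 1)^n reduced mod `modulus`, constant term first."""
    return [comb(n, k) * (-1) ** (n - k) % modulus for k in range(n + 1)]


def x_to_the_n_minus_1(n, modulus):
    """Coefficients of x^n - 1 reduced mod `modulus`, constant term first."""
    return [(-1) % modulus] + [0] * (n - 1) + [1]


for p in (2, 3, 5, 7, 11):
    # In characteristic p the two polynomials coincide, so x - 1 is nilpotent in k[x]/(x^p - 1).
    assert x_minus_1_to_the(p, p) == x_to_the_n_minus_1(p, p)

# For a non-prime exponent the middle coefficients need not vanish.
assert x_minus_1_to_the(4, 4) != x_to_the_n_minus_1(4, 4)
```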
Upcoming: more group varieties and schemes, especially \(\operatorname{GL}_n, {\operatorname{SL}}_n\), and their actions/coactions.
Last time: group varieties. Most of today will work over \({\mathbf{C}}, k\neq { \overline{k} }\), or \({\mathbf{Z}}\). There is a correspondence:
| Affine varieties | Rings and \(k{\hbox{-}}\)algebras |
|---|---|
| Group varieties/schemes | Hopf algebras |
| \(\mu: G\times G\to G\) | \(\mu^*: R\to R\otimes_k R\) |
| \(e: {\operatorname{pt}}\to G\) | \(e^*: R\to k\) |
| \(i: G\to G\) | \(i^*: R\to R\) |
Recall:
If \(M\) is finitely presented (e.g. finitely generated over a Noetherian ring \(R\)), there is a generator/relation exact sequence \(R{ {}^{ \scriptscriptstyle\oplus^{m} } } \to R{ {}^{ \scriptscriptstyle\oplus^{n} } }\twoheadrightarrow M\). Tensoring with any \(N\in {}_{R}{\mathsf{Mod}}\) yields \begin{align*} N { {}^{ \scriptscriptstyle\oplus^{m} } } \to N{ {}^{ \scriptscriptstyle\oplus^{n} } } \twoheadrightarrow M\otimes_R N .\end{align*} In particular, this works for base change \({}_{R}{\mathsf{Mod}}\to {}_{S}{\mathsf{Mod}}\) – the new module is generated as a module by the same generators but new scalars.
Consider \({\mathbf{C}}\otimes_{\mathbf{R}}{\mathbf{C}}\), which has a ring structure. Write \({\mathbf{C}}= {\mathbf{R}}[x]/\left\langle{x^2+1}\right\rangle\), then the base change is \begin{align*} {\mathbf{C}}\otimes_{\mathbf{R}}{\mathbf{C}}\cong {\mathbf{C}}[x]/\left\langle{x^2+1}\right\rangle \cong {\mathbf{C}}[x]/\left\langle{x-i}\right\rangle \oplus {\mathbf{C}}[x]/\left\langle{x+i}\right\rangle \cong {\mathbf{C}}\oplus {\mathbf{C}} ,\end{align*} which is a ring with zero divisors and nontrivial idempotents since \((1, 0)^2 = (1^2, 0^2) = (1, 0)\) and \((1,0)\cdot(0,1) = (0,0)\).
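The idempotents can also be written directly in terms of tensors. The following is a worked check added for illustration, where the names \(e_\pm\) are introduced here. Since \((i\otimes i)^2 = i^2\otimes i^2 = 1\otimes 1\), set \(e_{\pm} \mathrel{\vcenter{:}}=\frac{1}{2}\left(1\otimes 1 \pm i\otimes i\right)\). Then \begin{align*} e_{\pm}^2 &= \frac{1}{4}\left(1\otimes 1 \pm 2\, i\otimes i + (i\otimes i)^2\right) = \frac{1}{2}\left(1\otimes 1 \pm i\otimes i\right) = e_{\pm} \\ e_+ e_- &= \frac{1}{4}\left(1\otimes 1 - (i\otimes i)^2\right) = 0 \\ e_+ + e_- &= 1\otimes 1 ,\end{align*} so \(e_+, e_-\) are orthogonal idempotents splitting \({\mathbf{C}}\otimes_{\mathbf{R}}{\mathbf{C}}\) into two factors.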
For tensor products: same generators, same relations, extend scalars.
Recall:
An example of using the tensor product slogan: identify the map \begin{align*} { k[z] \over \left\langle{z^n-1}\right\rangle} \to {k[x] \over \left\langle{x^n-1}\right\rangle} \otimes_k {k[y]\over \left\langle{y^n-1}\right\rangle } \cong {k[x,y]\over \left\langle{x^n-1, y^n-1}\right\rangle } ,\end{align*} by realizing it as \(z\mapsto x\otimes y\mapsto xy\) and checking that it is well-defined, i.e. that the relation \(z^n=1\) is respected: \((xy)^n = x^n y^n = 1\).
If \(G\) is an arbitrary finite group it can be made into an affine algebraic group variety over \(k\). Give the underlying set of \(G = {\textstyle\coprod}_{g\in G}\left\{{{\operatorname{pt}}}\right\}\) the discrete topology to get an algebraic variety. To get the algebraic group structure, note that any map of finite sets is algebraic. Define a ring \(R \mathrel{\vcenter{:}}=\bigoplus _{g\in G} k e_g\), where \(e_g\) is the indicator function of \(g\), and a comultiplication as follows: note that \begin{align*} R\otimes_k R \cong \bigoplus _{(a, b) \in G\times G } k e_{a,b} \end{align*} where the \(e_{a, b} = e_a\otimes e_b\) just tracks which summand we’re in. So define \begin{align*} R &\to R\otimes_k R \\ e_g &\mapsto \sum_{ab=g} e_{a, b} .\end{align*}
Note that we could have let \({\operatorname{pt}}= \operatorname{Spec}k\). E.g. \(C_p \neq \mu_p\), but they are Cartier dual. However, the ring \(k[x]/\left\langle{(x-1)^p}\right\rangle\) is much easier to understand than the \(R\otimes_k R\) from above, even for very small groups like \(C_2\).
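For instance, for \(G = C_2 = \left\{{1, \sigma}\right\}\) the structure can be written out in full. This is a worked illustration; the counit and coinverse below are the maps dual to the identity element and to inversion, which the lecture does not spell out for finite groups, so treat them as an assumption of this sketch. Here \(R = k e_1 \oplus k e_\sigma\) with \(e_g e_h = \delta_{gh} e_g\), and \begin{align*} \mu^*(e_1) &= e_{1,1} + e_{\sigma, \sigma} & \mu^*(e_\sigma) &= e_{1,\sigma} + e_{\sigma, 1} \\ e^*(e_1) &= 1 & e^*(e_\sigma) &= 0 \\ i^*(e_1) &= e_1 & i^*(e_\sigma) &= e_{\sigma^{-1}} = e_\sigma .\end{align*} One can check the counit axiom directly: \((e^*\otimes \operatorname{id})\mu^*(e_\sigma) = e^*(e_1)e_\sigma + e^*(e_\sigma) e_1 = e_\sigma\).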
Recall \(\operatorname{GL}_n \subseteq {\mathbf{A}}^{n^2}\) is the open subset which is the complement of \(V(\operatorname{det})\), so a principal open subset. The ring is \(k[x_{ij}, 1/\operatorname{det}]\) for \(1\leq i,j\leq n\), which is obtained by localizing at the determinant. Thus we can embed it as a closed subset in \({\mathbf{A}}^{n^2+1}\) using \(V(y\operatorname{det}(x_{ij}) - 1)\), i.e. introducing a new free variable \(y\) for \(1/\operatorname{det}\), which forces \(\operatorname{det}\) to be nonzero.
An affine algebraic group is a closed subgroup \(G\) of \(\operatorname{GL}_n\), and the coordinate ring \(R_G\) is a quotient of \(R_{\operatorname{GL}_n}\).
For \({\operatorname{SL}}_n\), the ring is \(k[x_{ij}]/\left\langle{\operatorname{det}- 1}\right\rangle\), and define \(\operatorname{PGL}_n\) as \({\operatorname{SL}}_n/\mu_n\) or \(\operatorname{GL}_n/{\mathbf{G}}_m\). Although it’s not obvious, these are affine – for \(\operatorname{PGL}_n\), the ring is the \(\mu_n\) invariants of the coordinate ring of \({\operatorname{SL}}_n\), so one gets the subring of \(R_{{\operatorname{SL}}_n}\) spanned by monomials in the \(x_{ij}\) whose degrees are multiples of \(n\).
An action of an algebraic group \(G\) on a variety (or scheme) \(X\) is a map \(G \underset{\scriptscriptstyle {k} }{\times}X \xrightarrow{a} X\) satisfying the usual axioms encoded in commuting diagrams:
Note that one can now reverse these diagrams to get the coaction on rings \(a^*: A\to R\otimes_k A\).
Let \(\mu_n \curvearrowright{\mathbf{A}}^2\) by \(\xi.(x,y) \mathrel{\vcenter{:}}=(\xi x, \xi^k y)\) (here the exponent \(k\) is a fixed integer, not the base field), then the coaction is \begin{align*} k[x,y] &\to {k[x,y,\xi] \over \left\langle{\xi^n-1}\right\rangle } = {k[\xi] \over \left\langle{\xi^n-1}\right\rangle}[x, y] \\ x &\mapsto \xi x \\ y &\mapsto \xi^k y .\end{align*}
Check that this satisfies the axioms for a coaction.
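One way to carry out this check is sketched below, under the convention (assumed here) that the coaction is written \(a^*(x) = \xi\otimes x\), \(a^*(y) = \xi^k\otimes y\) and that \(\mu^*(\xi) = \xi\otimes \xi\), \(e^*(\xi) = 1\) on \(\mu_n\). There are no relations between \(x\) and \(y\), so \(a^*\) is a well-defined ring map, and on generators: \begin{align*} (\mu^*\otimes \operatorname{id})\, a^*(x) &= \xi\otimes \xi\otimes x = (\operatorname{id}\otimes a^*)\, a^*(x) \\ (\mu^*\otimes \operatorname{id})\, a^*(y) &= \xi^k\otimes \xi^k\otimes y = (\operatorname{id}\otimes a^*)\, a^*(y) \\ (e^*\otimes \operatorname{id})\, a^*(x) &= 1\cdot x = x, \qquad (e^*\otimes \operatorname{id})\, a^*(y) = 1^k\cdot y = y .\end{align*} The first two lines are the dual of \(g.(h.p) = (gh).p\) and the last is the dual of \(1.p = p\).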
Let \(G\in {\mathsf{Grp}}{\mathsf{Var}}_{/ {k}}, V\in { \mathsf{Vect}}_{/ {k}}\), then a linear coaction is a \(k{\hbox{-}}\)linear map \(V \xrightarrow{a^*} R\otimes_k V\) satisfying the duals of the axioms above.
If \(V = kx \oplus ky\) then \(V\to R\otimes_k V = Rx \oplus Ry\).
There is a coaction on \(A = \operatorname{Sym}^* V = k \oplus V \oplus \operatorname{Sym}^2 V \oplus \cdots\), where \(V = \left\langle{x,y}\right\rangle, \operatorname{Sym}^2 V = \left\langle{x^2, xy, y^2}\right\rangle\). This is the same as a linear action \(G\curvearrowright{\mathbf{A}}^N\) on an affine space, here with \(N = \dim V = 2\).
Last time: coactions on vector spaces \(a^*: V\to R\otimes_k V\) where \(R = k[G]\) is the ring of regular functions on an algebraic group \(G\). Thinking of \(\operatorname{Sym}^*V\) as the ring of regular functions on \(V {}^{ \vee }\cong k^n \cong {\mathbf{A}}^n_{/ {k}} = \operatorname{Spec}\operatorname{Sym}^*V\), we get a map \(\operatorname{Sym}^*(V) \to R\otimes_k \operatorname{Sym}^*(V)\).
A vector \(v\in V\) is invariant if \(a^*(v) = 1\otimes v\).
Every algebraic coaction is locally finite, i.e. every \(v\in V\) is contained in a finite-dimensional invariant vector subspace.
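Write \(a^*(v) = \sum a_i \otimes v_i\) as a finite sum with the \(a_i\in R\) linearly independent. Check that \(v\in \left\langle{v_i}\right\rangle\) using the counit axiom, and that \(\left\langle{v_i}\right\rangle\) is invariant using \(a.(b.v) = (ab).v\).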
Let \(A\) be a finitely-generated abelian group and let \(G\mathrel{\vcenter{:}}=\widehat{A}\) be its Cartier dual. Then \(R_G = k[A] \mathrel{\vcenter{:}}=\left\{{\sum c_a e^a {~\mathrel{\Big\vert}~}c_a\in k, e^{a}e^{b} = e^{a+b}}\right\}\) is a commutative ring and in fact a finitely-generated algebra. For \(k\) a general ring, this yields a scheme, and in fact it has the structure of a group scheme: \begin{align*} \mu^*: R_G &\to R_G \otimes_k R_G \\ e^a &\mapsto e^a\otimes e^a \\ \\ u^*: R_G &\to k \\ e^a &\mapsto 1 \\ \\ i^*: R_G &\to R_G \\ e^a &\mapsto e^{-a} .\end{align*} For example, \(A = {\mathbf{Z}}\) recovers \({\mathbf{G}}_m\) with \(e^1 = x\) and \(\mu^*(x) = x\otimes x\). Note that \(A \cong {\mathbf{Z}}^r \oplus \bigoplus_i {\mathbf{Z}}/n_i{\mathbf{Z}}\), so all diagonalizable groups are of the form \(\widehat{A} = {\mathbf{G}}_m^r \times \prod_i \mu_{n_i}\).
An algebraic coaction \(\widehat{A}\curvearrowright V\) yields a grading \(V = \bigoplus _{a\in A} V_a\). Thus a \({\mathbf{G}}_m\) action is a \({\mathbf{Z}}{\hbox{-}}\)grading, and a \(\mu_n\) action is a \(C_n{\hbox{-}}\)grading. This works for \(k\) any ring.
Write the coaction as \(V \xrightarrow{a^*} V\otimes k[A]\), \(v\mapsto \sum_{a\in A} v_a\otimes e^a\). This is a finite sum, so there are only finitely many nonzero \(v_a\) appearing in this sum. We need to show
For the first, compose \(V \xrightarrow{a^*} V\otimes R \xrightarrow{\operatorname{id}\otimes u^*} V\) by \(v\mapsto \sum v_a \otimes e^a \mapsto \sum v_a\), and this must equal \(v\) by the counit axiom. For the second, first use \(g(hv)\), i.e. apply \(a^*\) twice, to get \(v\mapsto \sum_a v_a\otimes e^a \mapsto \sum_{a,b} (v_a)_b \otimes e^a \otimes e^b\), and then use \((gh)v\), i.e. apply \(a^*\) followed by \(\mu^*(e^a) = e^a\otimes e^a\), to get \(v\mapsto \sum_a v_a\otimes e^a \mapsto \sum_{a} v_a \otimes e^a \otimes e^a\). These must be equal, so the coefficients must be equal: \((v_a)_b = v_a\) if \(a = b\) and \((v_a)_b = 0\) otherwise.
Check this – show that the last equality is equivalent to being a direct sum.
Let \(G\in \mathsf{Alg}{\mathsf{Grp}}{\mathsf{Var}}_{/ {k}}\) and suppose \(G\curvearrowright V\) is a \(G{\hbox{-}}\)representation, i.e. a coaction \(V\to R\otimes V\). Define the invariant subspace \(V^G \mathrel{\vcenter{:}}=\left\{{v\in V{~\mathrel{\Big\vert}~}a^*(v) = 1\otimes v}\right\} \subseteq V\); \(G\) is linearly reductive iff every surjection \(V\twoheadrightarrow W\) of \(G{\hbox{-}}\)representations induces a surjection \(V^G \twoheadrightarrow W^G\).
One can equivalently require \(V,W\) to be arbitrary or just finite-dimensional.
If \(G\) is a finite group variety and \(\operatorname{ch}k \nmid {\sharp}G\), then \(G\) is linearly reductive.
Use the Reynolds operator \(V \xrightarrow[]{R} { \mathrel{\mkern-16mu}\rightarrow }\, V^G\), which is a retraction of the inclusion \(i: V^G \hookrightarrow V\), so \(R \circ i = \operatorname{id}_{V^G}\), where \(R(v) = ({\sharp}G)^{-1}\sum_{g\in G} g(v)\).
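The remaining step can be spelled out as follows. This is a sketch in which \(R_V\) denotes the Reynolds operator on \(V\), a notation introduced here to avoid a clash with the coordinate ring \(R\). Given an equivariant surjection \(\phi: V\twoheadrightarrow W\) and \(w\in W^G\), pick any \(v\in V\) with \(\phi(v) = w\). Then \(R_V(v)\in V^G\) and \begin{align*} \phi(R_V(v)) = ({\sharp}G)^{-1}\sum_{g\in G} \phi(g.v) = ({\sharp}G)^{-1}\sum_{g\in G} g.\phi(v) = ({\sharp}G)^{-1}\sum_{g\in G} g.w = w ,\end{align*} so \(V^G\to W^G\) is surjective. The hypothesis on the characteristic is exactly what makes \(({\sharp}G)^{-1}\) exist in \(k\).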
Any diagonalizable group is linearly reductive.
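Writing \(V = \bigoplus _{a\in A} V_a\), one has \(V^G = V_0\). An equivariant surjection \(V\twoheadrightarrow W\) preserves the gradings, so projecting onto the \(a=0\) summands yields a surjection \(V_0\twoheadrightarrow W_0\).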
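Over a field of characteristic \(p\), the only linearly reductive groups are built from finite groups of order prime to \(p\) and diagonalizable groups: by a theorem of Nagata, a smooth \(G\) is linearly reductive iff its identity component \(G^0\) is a torus and \(p\nmid {\sharp}(G/G^0)\).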
Over \({\mathbf{C}}\), the linearly reductive groups are precisely
This includes \(\operatorname{GL}_n = {\operatorname{SL}}_n \times {{\mathbf{C}}^{\times}}/ H\) and \(\operatorname{PGL}_n = {\operatorname{SL}}_n/\mu_n\).
Later: for linearly reductive groups, the invariants of a finitely generated algebra are again finitely generated. Note that invariants can be finitely-generated even when the group is not linearly reductive.
Suppose \(G\) is a linearly reductive group and \(G\curvearrowright R\) a finitely generated ring (or \(k{\hbox{-}}\)algebra). Then the subring of invariants \(R^G\) is finitely generated.
See proof in Mukai, due to Hilbert.
Suppose \(G\curvearrowright S \mathrel{\vcenter{:}}= k[x_0, \cdots, x_n]\) linearly and preserves the grading. Then \(S^G\) is finitely generated.
Since \(G\) preserves polynomials of degree \(d\), the ring \(S^G\) is graded and decomposes as \(S^G = \bigoplus _{e\geq 0} S^G \cap S_e\). Let \(S_+^G\) be the elements of positive degree and write \(S^G = k \oplus S_+^G\). Writing \(J = \left\langle{S_+^G}\right\rangle {~\trianglelefteq~}S\) for the ideal it generates, since \(S\) is Noetherian we can write \(J = \left\langle{f_1,\cdots, f_N}\right\rangle\). The \(f_i\) can be chosen to be homogeneous elements of \(S_+^G\): choose any homogeneous invariant of positive degree, stop if it generates \(J\), and otherwise continue by picking a homogeneous \(f_i \in S_+^G\) outside the ideal generated so far. This constructs an ascending chain of ideals, which must stabilize.
\begin{align*} S^G\in {}_{k} \mathsf{Alg} .\end{align*}
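Take \(f_i\) such that \(\deg(f_i) > 0\). There is a surjective morphism \(S{ {}^{ \scriptscriptstyle\oplus^{N} } } \twoheadrightarrow J\) of \(S{\hbox{-}}\)modules given by \((h_i)\mapsto \sum h_i f_i\), which is \(G{\hbox{-}}\)equivariant since the \(f_i\) are invariant. Since \(G\) is linearly reductive and \(J^G \subseteq S^G\), taking invariants yields a surjection \((S^G){ {}^{ \scriptscriptstyle\oplus^{N} } } \twoheadrightarrow J^G\). If \(f\in J^G\) is homogeneous of positive degree then write \(f = \sum_{i=1}^N h_i f_i\) with \(h_i\in S^G\) homogeneous and \(\deg h_i = \deg f - \deg f_i < \deg f\). Finish by induction on \(\deg f\): each \(h_i\) is a polynomial in the \(f_j\), hence so is \(f\).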
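A small example of the objects in this argument (an illustration added here, assuming \(\operatorname{ch}k \neq 2\)): let \(G = C_2\) act on \(S = k[x,y]\) by \((x,y)\mapsto (-x,-y)\). A monomial of degree \(e\) is scaled by \((-1)^e\), so \begin{align*} S^G = \bigoplus _{e \text{ even}} S_e, \qquad J = \left\langle{S^G_+}\right\rangle = \left\langle{x^2, xy, y^2}\right\rangle, \qquad (f_1, f_2, f_3) = (x^2, xy, y^2) .\end{align*} The induction step looks like \(x^3y = h_1 f_1\) with \(h_1 = xy\in S^G\) of degree \(2 < 4\), and \(h_1 = f_2\) is already a generator. The conclusion is \(S^G = k[x^2, xy, y^2]\).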
This yields a surjection \(S = k[x_0,\cdots, x_n] \twoheadrightarrow R\); we want a surjection \(S^G \twoheadrightarrow R^G\).
Let \(R \mathrel{\vcenter{:}}= k[a_i]\), then \(\exists V\in{ \mathsf{Vect}}_{/ {k}}^{\mathrm{fd}}\) where \(V \subseteq R\) is \(G{\hbox{-}}\)invariant and contains all of the \(a_i\).
The action is locally finite, so each \(a_i\) lies in a finite-dimensional subspace \(V_i\) with action \(V_i \to V_i \otimes k[G]\). Set \(V\mathrel{\vcenter{:}}=\sum_i V_i\).
Writing \(X = \operatorname{mSpec}R\) for \(R = k[X]\), a surjection \(k[x_0,\cdots, x_n] \twoheadrightarrow R\) corresponds to an inclusion \(X\to {\mathbf{A}}^{\dim V}\) where \(G\curvearrowright{\mathbf{A}}^{\dim V}\) linearly. This corresponds to \(G\) acting linearly on \(k[x_0,\cdots, x_n]\) and \(R\).
?
Linearly reductive groups:
A group \(G\) is geometrically reductive iff for all \(G\curvearrowright V\) linearly and for every nonzero invariant vector \(w\in V\), there exists a \(G{\hbox{-}}\)invariant homogeneous polynomial \(h\) such that \(h(w) \neq 0\) and \(\deg h > 0\).
Linearly reductive corresponds to \(\deg h = 1\): evaluating at \(w\) gives a surjection \(V {}^{ \vee }\twoheadrightarrow k = k^G\) of representations, which yields a surjection \((V {}^{ \vee })^G \twoheadrightarrow k = k^G\), so not every invariant linear function vanishes at \(w\). For geometrically reductive groups, finite generation of invariants is still true, although the proof takes much more work. See
Note that over \(\operatorname{ch}k = p\), the groups \({\operatorname{SL}}_n, \operatorname{PGL}_n\) are geometrically reductive. In characteristic zero, a nontrivial fact is that linearly reductive is equivalent to geometrically reductive.
\({\mathbf{G}}_a\) is not linearly reductive. Produce a \({\mathbf{G}}_a{\hbox{-}}\)equivariant \(V\twoheadrightarrow W\) such that \(V^G\not\twoheadrightarrow W^G\). Take \({\mathbf{C}}^2\to {\mathbf{C}}\) to be the projection \((x,y)\mapsto y\), with the actions given by horizontal shifts \(\lambda.(x,y) = (x+ \lambda y, y)\) on \({\mathbf{C}}^2\) and the trivial action \(\lambda. y = y\) on \({\mathbf{C}}\), for \(\lambda \in {\mathbf{C}}\). Then \(V^G = \left\{{(x, 0)}\right\}\) maps to \(0\), while \(W^G = {\mathbf{C}}\).
This can’t happen if the action is multiplicative. Let \({\mathbf{G}}_m \curvearrowright V = \bigoplus _{n\in {\mathbf{Z}}} V_{n}\), where \(\lambda\in {\mathbf{G}}_m\) acts on \(w_n\in V_n\) by \(\lambda.w_n = \lambda^n w_n\).
Although \({\mathbf{G}}_a\) is not linearly reductive, if \({\mathbf{G}}_a\curvearrowright R\) then \(R^{{\mathbf{G}}_a}\) is still finitely generated.
The proof uses a trick of reducing to an \({\operatorname{SL}}_2\) action where \(R^{{\mathbf{G}}_a} \cong R^{{\operatorname{SL}}_2}\).
Invariants \(R^G\) for various \(G\):
Nagata generalizes this to \({\mathbf{P}}^n\).
Let \(G\mathrel{\vcenter{:}}={\mathbf{G}}_a \curvearrowright k[x_1, \cdots, x_{n}]= \operatorname{Sym}^*V\) for \(V\mathrel{\vcenter{:}}=\left\langle{x_1,\cdots, x_n}\right\rangle_k\), then \(k[x_1, \cdots, x_{n}]^G\) is finitely-generated (despite \(G\) not being linearly reductive).
Useful fact: in characteristic zero, Lie groups are closely connected to Lie algebras. For \(G \leq \operatorname{GL}_n({\mathbf{C}})\) a closed subgroup, its Lie algebra is \({\mathfrak{g}}\mathrel{\vcenter{:}}={\mathbf{T}}_e G\), which has underlying vector space \({\mathbf{C}}^{\dim G}\) and bracket satisfying \([AB] = -[BA]\) and the Jacobi identity. Understanding this tangent space: think of matrices \(I + {\varepsilon}A\) where \({\varepsilon}A\) is small, and do operations discarding \({ \mathsf{O}}({\varepsilon}^2)\) terms. Equivalently, work over \({\mathbf{C}}[{\varepsilon}]/\left\langle{{\varepsilon}^2}\right\rangle\).
Lie group | Lie algebra |
---|---|
\(\operatorname{GL}_n({\mathbf{C}})\) | \(\operatorname{Mat}_{n\times n}({\mathbf{C}})\) |
\({\operatorname{SL}}_n({\mathbf{C}})\) | \({\mathfrak{sl}}_n({\mathbf{C}}) = \left\{{M{~\mathrel{\Big\vert}~}\operatorname{tr}(M) = 0}\right\}\) |
\({\operatorname{SO}}_n({\mathbf{C}}) = \left\{{M{~\mathrel{\Big\vert}~}MM^t = I }\right\}\) | \({\mathfrak{so}}_n({\mathbf{C}}) = \left\{{M{~\mathrel{\Big\vert}~}M + M^t = 0}\right\}\) |
To work out what \({\mathfrak{g}}\) should be for \({\operatorname{SL}}_n({\mathbf{C}})\), linearize the \(\operatorname{det}= 1\) condition: \begin{align*} \operatorname{det}(I + {\varepsilon}A) = \operatorname{det}{ \begin{bmatrix} {1 + {\varepsilon}a_{11} } & {\cdots} & { {\varepsilon}a_{1n} } \\ {\vdots} & {\ddots} & {\vdots} \\ { {\varepsilon}a_{n1} } & {\cdots} & {1 + {\varepsilon}a_{nn} } \end{bmatrix} } = 1 + {\varepsilon}\sum a_{ii} = 1 + {\varepsilon}\operatorname{tr}(A) ,\end{align*} working modulo \({\varepsilon}^2\), so the condition becomes \(\operatorname{tr}(A) = 0\). For \({\operatorname{SO}}_n\), work out \((I+ {\varepsilon}M)(I+{\varepsilon}M)^t = I\).
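Carrying out the \({\operatorname{SO}}_n\) computation in the same way (a worked version of the exercise, again discarding \({\varepsilon}^2\) terms): \begin{align*} (I + {\varepsilon}M)(I + {\varepsilon}M)^t = I + {\varepsilon}(M + M^t) + {\varepsilon}^2 MM^t = I + {\varepsilon}(M + M^t) ,\end{align*} so the condition \((I+{\varepsilon}M)(I+{\varepsilon}M)^t = I\) holds iff \(M + M^t = 0\), matching the table entry for \({\mathfrak{so}}_n({\mathbf{C}})\).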
There is a way to go back: \({\mathfrak{g}}\xrightarrow{\exp} G\). This is almost a bijection, but can fail: e.g. in semisimple cases, \({\operatorname{SL}}_n({\mathbf{C}}), \operatorname{PGL}_n({\mathbf{C}}) = {{\operatorname{SL}}_n({\mathbf{C}})\over \mu_n} \mapsto {\mathfrak{sl}}_n({\mathbf{C}})\) both have the same Lie algebra. Note that \(\mu_n \subseteq Z({\operatorname{SL}}_n({\mathbf{C}}))\) is central, and more generally if \(G' = G/H\) for \(H \subseteq Z(G)\) a finite central subgroup, then \({\mathbf{T}}_I G \cong {\mathbf{T}}_I G'\).
The other issue: consider \(G = ({\mathbf{C}}^{\times})^n\), then \({\mathfrak{g}}= {\mathbf{C}}^n\) with \([AB] = 0\).
In characteristic zero, if \(G\) is connected and \(G\curvearrowright R\mathrel{\vcenter{:}}=\operatorname{Sym}^*V\), then \({\mathfrak{g}}\curvearrowright R\) and \(R^G = R^{{\mathfrak{g}}}\).
Recalling \({\mathfrak{sl}}_n = \left\{{\operatorname{tr}(A) = 0}\right\}\) and \({\mathfrak{so}}_n = \left\{{A + A^t=0}\right\}\), one can define \(e^A \mathrel{\vcenter{:}}=\sum_{n\geq 0} A^n/n!\); then e.g. \(\operatorname{tr}(A) = 0 \implies \operatorname{det}(e^A) = 1\). Note that one needs characteristic zero here to make sense of terms like \(1/n!\)
\({\mathbf{G}}_a\curvearrowright V\) is equivalent to an infinitesimal action, or equivalently a nilpotent map \(f:V\to V\). E.g. for \(\lambda \in {\mathbf{G}}_a\), define \(\lambda.v \mathrel{\vcenter{:}}=\exp(\lambda f)(v)\).
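For instance (a worked illustration, with the particular \(f\) chosen here), take \(V = {\mathbf{C}}^2\) and the nilpotent map \(f\) with \(f^2 = 0\) below. The exponential series terminates after two terms: \begin{align*} f = { \begin{bmatrix} {0} & {1} \\ {0} & {0} \end{bmatrix} }, \qquad \exp(\lambda f) = I + \lambda f = { \begin{bmatrix} {1} & {\lambda} \\ {0} & {1} \end{bmatrix} }, \qquad \lambda.(x, y) = (x + \lambda y, y) .\end{align*} This is the horizontal shift action of \({\mathbf{G}}_a\) on \({\mathbf{C}}^2\), and \(\exp((\lambda + \mu) f) = \exp(\lambda f)\exp(\mu f)\) confirms that it is an action of the additive group.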
Recall \({\mathfrak{sl}}_2({\mathbf{C}}) = {\mathbf{C}}\left\langle{e,f,h}\right\rangle\), and an action \({\mathfrak{sl}}_2\curvearrowright V\) is equivalent to a choice of 3 operators \(e,f,h\in { \operatorname{End} }_k(V)\). Writing \(V = \bigoplus_{m\in {\mathbf{Z}}_{\geq 0}} V_m\) as a sum of weight spaces for \(h\), where for \(v_i\in V_m\) one has relations
One can write the irreducible representations as \(U_m \mathrel{\vcenter{:}}=\left\{{p_m(x, y) }\right\}\), polynomials of degree \(m\) where the \(U_m\) can appear in the \(V_m\) with multiplicity. Letting \(f:V\to V\) be nilpotent, so \(f^N = 0\), over \({\mathbf{C}}\) one gets a JCF where an \(m\times m\) block has ones on the superdiagonal, yielding a chain \(v_{-1} = 0, v_1'\to v_2'\to\cdots\to v_{m-1}'\to v_m = 0\). Rescaling \(v_i \mathrel{\vcenter{:}}= v_i'/(m-i)!\) yields the above relations and proves the theorem.
Nagata produced an action \({\mathbf{G}}_a^N\curvearrowright V\) such that \(S^G\) is not finitely-generated, where \(S = \operatorname{Sym}^*V \cong {\mathbf{C}}[x_1,\cdots, x_n]\) and \(N = 16\).
Mukai did this for \(N=3\). The \(N=2\) case is open.
We’ll sketch a proof of Mukai’s result. Define \begin{align*} {\mathbf{C}}^n &\curvearrowright k[x_1,\cdots, x_n, y_1,\cdots, y_n] \\ {\left[ { t_1, \cdots, t_n} \right]} &\mapsto \left( x_i\mapsto x_i, \quad y_i\mapsto y_i + t_i x_i \right) .\end{align*} Let \(G\mathrel{\vcenter{:}}={\mathbf{C}}^r \leq {\mathbf{C}}^n\) be some vector subspace, so \(G\cong {\mathbf{G}}_a^r\). It turns out that \(S^G\) is the total Cox ring of \(X \mathrel{\vcenter{:}}=\operatorname{Bl}_{p_1,\cdots, p_n} {\mathbf{P}}^{r-1}\), which is generally defined as \begin{align*} \operatorname{Cox}(X) \mathrel{\vcenter{:}}=\bigoplus _{L\in \operatorname{Pic}(X) } H^0(X; L) .\end{align*}
Taking \(r=3\) yields \(X\mathrel{\vcenter{:}}=\operatorname{Bl}_{p_1,\cdots, p_n} {\mathbf{P}}^2\). Note that \(\operatorname{Pic}(X) = {\mathbf{Z}}^{n+1}\), since any \(D\in \operatorname{Pic}(X)\) can be written as \(D = a_0 [H] + \sum a_i E_i\) where \([H] \in \operatorname{Pic}({\mathbf{P}}^2)\) is the hyperplane (line) class and \(E_i\) are the exceptional curves.
The support of \(\operatorname{Cox}(X)\) is \begin{align*} \mathop{\mathrm{supp}}\operatorname{Cox}(X) = \left\{{L \in \operatorname{Pic}(X) {~\mathrel{\Big\vert}~}H^0(X; L)\neq 0}\right\} = \mathrm{Eff}(X) \subseteq {\mathbf{Z}}^{n+1} ,\end{align*} which forms a monoid/semigroup.
If \(\operatorname{Cox}(X)\) is finitely-generated over \({\mathbf{C}}\) then \(\mathrm{Eff}(X)\) is a finitely-generated semigroup.
Thus the strategy is to find points, blow up, and show \(\operatorname{Eff}(X)\) is not finitely-generated. Note that the \(E_i\) are effective \((-1){\hbox{-}}\)curves, i.e. \(E_i^2 = -1\).
A curve \(E\) on \(X\) is exceptional iff \(E\) is irreducible with \(E^2 < 0\). Any exceptional curve is a primitive generator of \(\operatorname{Eff}(X)\).
This follows since \(E \neq A + B\) for two effective curves – if so, write \(0 > E^2 = A^2 + B^2 + 2AB\). Force \(AB\) to be positive by moving \(A\) or \(B\), forcing \(A^2\) or \(B^2\) to be negative.
Producing the example: blow up 9 points on an elliptic curve. Take two cubics \(C_1, C_2\) in \({\mathbf{P}}^2\), intersecting at 9 points, and blow them up. This yields a pencil of curves, and in fact an elliptic fibration with \(C_1, C_2\) in the fibers. The exceptional curves yield sections \(E_i\). The Mordell-Weil group yields sections, and the differences between points yields elements of infinite order:
Continuing real geometric invariant theory. Setup: let \(G \curvearrowright X\) be a linearly reductive group (not necessarily finite) acting on an affine variety \(X= \operatorname{mSpec}R\), e.g. \(R = {\mathbf{C}}[x]\). We have a subring \(R^G \hookrightarrow R\), so \(\operatorname{mSpec}R\to \operatorname{mSpec}R^G\) and we define \(X { \mathbin{/\mkern-6mu/}}G\mathrel{\vcenter{:}}=\operatorname{mSpec}R^G\) to be the affine quotient.
Let \(X = {\mathbf{A}}^1\) so \(R = {\mathbf{C}}[x]\) and \(G = {\mathbf{G}}_m \cong {\mathbf{C}}^{\times}\) with action \(\lambda.x \mathrel{\vcenter{:}}=\lambda^d x\) for \(d\in {\mathbf{Z}}\setminus\left\{{0}\right\}\), a weight \(d\) action. Note that \(\lambda.\sum c_i x^i = \sum c_i \lambda^{di} x^i\), whose coefficients differ from the original ones for any \(i>0\), so \(R^G = {\mathbf{C}}\). Thus \({\mathbf{A}}^1\to {\mathbf{A}}^1{ \mathbin{/\mkern-6mu/}}G \cong \operatorname{mSpec}{\mathbf{C}}\cong {\operatorname{pt}}\). The two orbits are \(\left\{{0}\right\}, {\mathbf{A}}^1\setminus\left\{{0}\right\}\), which both map to the same point. Note that the closure of the second orbit \({\mathbf{A}}^1\setminus\left\{{0}\right\}\) contains the other orbit \(\left\{{0}\right\}\).
Let \({\mathbf{G}}_m\curvearrowright{\mathbf{A}}^2\) by \(\lambda.(x,y) = (\lambda^{d_1} x, \lambda^{d_2} y)\) for two nonzero weights \(d_1, d_2\). The result depends on the relative signs:
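As a sketch of the computation behind this (worked out here rather than taken from the lecture), the action on monomials determines the invariants: \begin{align*} \lambda.(x^a y^b) = \lambda^{a d_1 + b d_2} x^a y^b \implies {\mathbf{C}}[x,y]^{{\mathbf{G}}_m} = \operatorname{span}_{\mathbf{C}}\left\{{x^a y^b {~\mathrel{\Big\vert}~}a, b\geq 0,\, a d_1 + b d_2 = 0}\right\} .\end{align*} If \(d_1, d_2\) have the same sign, only \(a = b = 0\) works, so the invariants are \({\mathbf{C}}\) and the quotient is a point. If \(d_1 = 1, d_2 = -1\), the condition is \(a = b\), so the invariants are \({\mathbf{C}}[xy]\) and the quotient is \({\mathbf{A}}^1\).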
Let \(G\) be linearly reductive and \(X\) affine. Then
Let \({\mathrm{Orb}}_1, {\mathrm{Orb}}_2 \subseteq X\) be \(G{\hbox{-}}\)orbits whose closures do not intersect. Then \(\pi({\mathrm{Orb}}_1)\neq \pi({\mathrm{Orb}}_2)\).
Note that if \(Y \subseteq X{ \mathbin{/\mkern-6mu/}}G\) is a hypersurface \(V(f)\) with \(f\in R^G\), it pulls back to a \(G{\hbox{-}}\)invariant hypersurface \(V(\pi^* f) \subseteq X\). Any point in \(X{ \mathbin{/\mkern-6mu/}}G\) is an intersection \(\cap V(f_i)\), which pulls back to an intersection of hypersurfaces. Thus assuming the closures of \({\mathrm{Orb}}_1, {\mathrm{Orb}}_2\) are disjoint, it suffices to find a \(G{\hbox{-}}\)invariant function that separates the points \(\pi({\mathrm{Orb}}_1)\) and \(\pi({\mathrm{Orb}}_2)\). Also note that if the closures of two orbits intersect, they are in the same fiber and thus map to the same point – the claim is that this is the only way this can happen.
The closed sets \({ \operatorname{cl}}{\mathrm{Orb}}_i \subseteq X\) are \(G{\hbox{-}}\)invariant, so they correspond to \(G{\hbox{-}}\)invariant ideals \(a_i {~\trianglelefteq~}R\), and \(V(a_1 + a_2) = V(a_1) \cap V(a_2) = { \operatorname{cl}}{\mathrm{Orb}}_1 \cap{ \operatorname{cl}}{\mathrm{Orb}}_2 = \emptyset\) by assumption. By the Nullstellensatz, \(a_1 + a_2 = \left\langle{1}\right\rangle = R\), so there is a surjective map \(a_1 \oplus a_2 \twoheadrightarrow R\) of \(G{\hbox{-}}\)representations. Since \(G\) is linearly reductive, \(a_1^G \oplus a_2^G \twoheadrightarrow R^G\), so one can find \(G{\hbox{-}}\)invariant functions \(f_i\in a_i^G\) with \(f_1 + f_2 = 1\). This yields \({ \left.{{f_1}} \right|_{{{ \operatorname{cl}}{\mathrm{Orb}}_1}} } \equiv 0\) and \({ \left.{{f_1}} \right|_{{{ \operatorname{cl}}{\mathrm{Orb}}_2}} } \equiv 1\), since \(f_2\) vanishes on \({ \operatorname{cl}}{\mathrm{Orb}}_2\) and they sum to 1.
To see that this implies (3), consider a fiber:
Missed the verbal argument here.
To see (1), note that \(R^G = {\mathbf{C}}[f_1,\cdots, f_n]\) is finitely-generated as a ring by \(G{\hbox{-}}\)invariant functions \(f_i\) since \(G\) is linearly reductive, which yields \(X\to X{ \mathbin{/\mkern-6mu/}}G \hookrightarrow{\mathbf{A}}^n\). Take affine coordinates for \(p = (a_1,\cdots, a_n) \in X{ \mathbin{/\mkern-6mu/}}G\). There is a surjection \({\mathbf{C}}[x_1,\cdots, x_n] \to R^G\) by \(x_i\mapsto f_i\), and similarly a surjection given by \(x_i\mapsto f_i - a_i\). The claim is that the ideal of \(R\) generated by the \(f_i - a_i\) is not all of \(R\). It is then contained in some maximal ideal \({\mathfrak{m}}\in \operatorname{mSpec}R\), and \({\mathfrak{m}}\mapsto {\mathfrak{m}}\cap R^G = {\mathfrak{m}}_p \mathrel{\vcenter{:}}=\left\langle{f_1-a_1,\cdots, f_n-a_n}\right\rangle \in \operatorname{mSpec}R^G\), corresponding to \(p\).
In other words, take \(\left\langle{f_i - a_i}\right\rangle \subseteq R^G\), and the claim is that \(\left\langle{f_i - a_i}\right\rangle R\neq R\), or equivalently \(V(f_i - a_i)\neq \emptyset\) in \(X\). This ideal is everything exactly when \(R{ {}^{ \scriptscriptstyle\oplus^{n} } } \to R\) surjects by \((t_i)\mapsto \sum t_i(f_i - a_i)\), but then linear reductivity would give a surjection \((R^G){ {}^{ \scriptscriptstyle\oplus^{n} } }\twoheadrightarrow R^G\), i.e. \(\left\langle{f_i - a_i}\right\rangle = R^G\), contradicting the fact that this is the maximal ideal of \(p\).
Proving that \(\pi\) is a closed map: let \(a \subseteq R\) be the ideal of a closed subset \(Z \subseteq X\), and consider \(\pi(Z) \subseteq X{ \mathbin{/\mkern-6mu/}}G\). Note that a map \(X\to Y\) corresponds to a ring map \(\phi: S\to R\), and the ideal \(a\) corresponds to \(\phi^{-1}(a) \subseteq S\). For a subring, this is intersection. Define a map \((t_i)\mapsto \sum t_i g_i\) where \(R{ {}^{ \scriptscriptstyle\oplus^{n} } }\twoheadrightarrow a = \left\langle{g_1,\cdots, g_n}\right\rangle\). Then \((R^G){ {}^{ \scriptscriptstyle\oplus^{n} } } \twoheadrightarrow a^G\). Consider \(X \to X{ \mathbin{/\mkern-6mu/}}G\) by \(Z \xrightarrow{\pi} { \operatorname{cl}}\pi(Z)\).
To be continued, use property 1 but for a subset.
How this fails for groups that are not linearly reductive: let \(G\mathrel{\vcenter{:}}={\mathbf{G}}_a\curvearrowright{\mathbf{C}}^2\) by the shearing action \(a.(x,y) = (x, y+ax)\). Note that \({\mathbf{C}}[x,y]^G = {\mathbf{C}}[x]\). For \(x\neq 0\), the orbits are vertical lines, and for \(x=0\) every point of the vertical line is fixed, so the orbits are single points. Note that \(V(xy-c)\) for \(c\neq 0\) is closed, but its image under the projection to \({\mathbf{A}}^1\) misses the origin and so is not closed.
Things we quotient by: affine varieties are essentially rings. Recall that projective varieties have affine cones: regard homogeneous equations as usual equations. For quasiprojective varieties, take the projective closure to get a projective variety. However, there are also arbitrary varieties, which are perhaps not as useful. GIT mostly deals with affine or projective varieties, but note that Mumford’s book sets up the general case.
Setup: \(X\in {\mathsf{Aff}}{\mathsf{Var}}_{/ {k}}\) corresponding to \(R\in {}_{k} \mathsf{Alg}\), and \(G\curvearrowright X\) a linearly reductive group corresponding to a coaction \(G\curvearrowright R\). Take affine quotients \(X{ \mathbin{/\mkern-6mu/}}G \mathrel{\vcenter{:}}=\operatorname{Spec}R^G\) which receives a map \(X \xrightarrow{\pi} X{ \mathbin{/\mkern-6mu/}}G\).
In this setup,
For \({\mathbf{C}}^{\times}\curvearrowright{\mathbf{A}}^2\) by \(\lambda.(x,y) \mathrel{\vcenter{:}}=(\lambda x, \lambda^{-1}y)\), one gets the following:
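In coordinates, this example can be sketched as follows (a worked description added here). The invariant ring is \({\mathbf{C}}[xy]\), so the quotient map is \(\pi(x,y) = xy\) with target \({\mathbf{A}}^1\), and the fibers decompose into orbits as \begin{align*} \pi^{-1}(c) &= V(xy - c) && \text{a single closed orbit with trivial stabilizers, for } c\neq 0 \\ \pi^{-1}(0) &= \left\{{(x, 0) {~\mathrel{\Big\vert}~}x\neq 0}\right\} {\textstyle\coprod}\left\{{(0, y) {~\mathrel{\Big\vert}~}y\neq 0}\right\} {\textstyle\coprod}\left\{{(0,0)}\right\} && \text{three orbits} .\end{align*} In the fiber over \(0\), only the origin is a closed orbit, and it lies in the closure of both punctured axes, so all three orbits are identified in the quotient.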
Let \(X \subseteq {\mathbf{A}}^n\) be a closed subset defining an affine variety with ideal \(I(X)\), and let \(Z \subseteq X\) be closed and \(G{\hbox{-}}\)invariant with \(a \mathrel{\vcenter{:}}= I(Z)\). Then \(k[x_1, \cdots, x_{n}]\twoheadrightarrow R \twoheadrightarrow R/a\) and \(a\) is \(G{\hbox{-}}\)invariant, so \(R^G \twoheadrightarrow(R/a)^G\) by linear reductivity. Since \(a\hookrightarrow R\), there is a map \(a^G\to R^G\) and \(a^G = a \cap R^G\), so \((R/a)^G \cong R^G/(a\cap R^G)\). So \(Z{ \mathbin{/\mkern-6mu/}}G \subseteq X{ \mathbin{/\mkern-6mu/}}G\) is a closed subset, and the claim is \({ \operatorname{cl}}\pi(Z) = Z{ \mathbin{/\mkern-6mu/}}G\). Since \(Z\to Z{ \mathbin{/\mkern-6mu/}}G\) is surjective by property (1) applied to \(Z\), in fact \(\pi(Z) = Z{ \mathbin{/\mkern-6mu/}}G\). Thus \(\pi(Z)\) is closed.
\(\pi\) is a submersion, i.e. \(X{ \mathbin{/\mkern-6mu/}}G\) carries the quotient topology: if \(S \subseteq X{ \mathbin{/\mkern-6mu/}}G\) and \(\pi^{-1}(S)\) is open, then \(S\) is open.
Consider \(X \twoheadrightarrow X{ \mathbin{/\mkern-6mu/}}G = S {\textstyle\coprod}S^c\), then \(\pi^{-1}(S {\textstyle\coprod}S^c) = \pi^{-1}(S) {\textstyle\coprod}\pi^{-1}(S^c)\). If \(\pi^{-1}(S)\) is open then \(\pi^{-1}(S^c)\) is closed, so \(S^c = \pi(\pi^{-1}(S^c))\) is closed since \(\pi\) is surjective and closed, and thus \(S\) is open.
A point \(x\in X\) is stable if
Define \(X^s\) to be the set of stable points. There is a further open subset \(X^{\mathrm{ss}}\supseteq X^s\) of semistable points, and \(X\setminus X^{\mathrm{ss}}\) are unstable points.
Note that one can show that if \(R\) is integrally closed then so is \(R^G\), so \(\operatorname{Spec}R^G\) is normal and thus nonsingular in codimension 1. In general, GIT quotients will be singular – but note that taking the stack quotient will yield a smooth stack if \(X\) is smooth.
If \(G.x\) is in a closure-equivalence class with more than 1 orbit then \(x\) is not stable.
Say \(G.x\) is closed and lies in the closure of some other orbit \(G.y\). Then \(\dim G.x < \dim G.y \leq \dim G\), so \(\dim {\operatorname{Stab}}_x = \dim G - \dim G.x > 0\), and in particular \({\operatorname{Stab}}_x\) is not finite.
Let \({\mathbf{C}}^{\times}\curvearrowright{\mathbf{A}}^1\) by the trivial action \(\lambda .x = x\). All orbits are single points and thus closed, but all stabilizers are \({\mathbf{C}}^{\times}\), which is not finite. So no point is stable by the above definition.
Let \(Z \mathrel{\vcenter{:}}=\left\{{x\in X {~\mathrel{\Big\vert}~}\dim {\operatorname{Stab}}_x > 0}\right\} \subseteq X\) and \(\pi: X\to X{ \mathbin{/\mkern-6mu/}}G\), then
Note that \(X^s\) may be empty and \(Z\) may be the entire space. Moreover, if \({\operatorname{Stab}}_x\) is a 0-dimensional algebraic variety, it has finitely many points. By contrast, infinite discrete subgroups such as \({\operatorname{SL}}_n({\mathbf{Z}}) \subseteq {\operatorname{SL}}_n({\mathbf{C}})\) or \({\mathbf{Z}}\subseteq {\mathbf{C}}\) are not Zariski closed, and thus are not algebraic subgroups.
For (1), use \begin{align*} \phi: G\times X &\to X\times X \\ (g, x) &\mapsto (gx, x) \end{align*} and consider \(\phi^{-1}(\Delta)\), which corresponds to stabilizers. Since affine/projective/quasiprojective varieties are separated, \(\Delta\) is closed (it can just be defined by equations by embedding into a large \({\mathbf{A}}^N\)), and hence so is \(\phi^{-1}(\Delta)\). Then there is a map \(\phi^{-1}(\Delta) \to X\) whose fiber over \(x\) is \({\operatorname{Stab}}_x\). This is surjective since \((1,x)\mapsto x\). Now use the general fact that if \(Y\twoheadrightarrow X\) then the set of \(x\in X\) where the fiber dimension jumps is closed.
For (2), note that (1) implies \(X^s\) is open.
\(X = X^s \iff {\operatorname{Stab}}_x\) is finite for all \(x\in X\).
Preview of the projective case: let \(G\curvearrowright X \subseteq {\mathbf{P}}^n\) with coordinates \({\left[ {x_0: \cdots : x_n} \right]}\). Look at the affine cone \(CX \subseteq {\mathbf{A}}^{n+1}\) with coordinates \({\left[ {x_0, \cdots, x_n} \right]}\), so if \(p\in CX\) then \(\lambda p \in CX\) for any \(\lambda \in k\). Note that \(CX\) doesn’t immediately have a \(G{\hbox{-}}\)action, so we need to lift the previous action to some \(G\curvearrowright CX\) called the linearization (a lift to the corresponding line bundle). This may not be unique if \(G\) has characters. Unstable points will be those with orbits whose closure contains zero, which will correspond to nonexistent points in the quotient, so we’ll have to throw these out. Mumford gives numerical criteria to compute them.
Types of varieties:
Being closed in the Zariski topology implies closed in the classical topology, and these are compact in the classical topology. Recall that proper maps are separated, of finite type, and universally closed – think of proper as essentially projective.
For \(k= \overline{k}\), there is a bijection between \({\mathsf{Aff}}{\mathsf{Var}}_{/ {k}}\) and \(R\in {}_{k} \mathsf{Alg}^{\mathrm{fg}}\) with no nilpotents, so there is a surjection \(\phi: k[x_1, \cdots, x_{n}]\twoheadrightarrow R\) with \(I \mathrel{\vcenter{:}}=\ker \phi\) and \(I = \sqrt{I}\). The map sends \(X\) to \(k[X] \mathrel{\vcenter{:}}= k[x_1, \cdots, x_{n}]/I\), the ring of regular functions on \(X\). If \(k = { \overline{k} }\) then \(\operatorname{mSpec}R\) consists of the ideals \(m = \left\langle{ x_1-a_1, \cdots, x_n-a_n}\right\rangle \in \operatorname{mSpec}k[x_1, \cdots, x_{n}]\) containing \(I\), corresponding to points \({\left[ {a_1,\cdots, a_n} \right]} \in X \subseteq {\mathbf{A}}^{n}_{/ {k}}\). Similarly \({\mathsf{Aff}}{\mathsf{Sch}}\) bijects with commutative associative unital rings.
Projective varieties correspond to \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)graded rings \(R\) over \(k={ \overline{k} }\), so \(R = \oplus _{d\geq 0} R_d\) is finitely-generated without nilpotents. The map sends \(R\) to its projective spectrum \(\operatorname{mProj}R\). For arbitrary \({\mathbf{Z}}_{\geq 0}{\hbox{-}}\)graded commutative associative unital rings, one similarly defines \(\mathop{\mathrm{Proj}}R\). If \(R = k[R_1]\) is generated by degree 1 elements, then there is an embedding \(\operatorname{mProj}R \hookrightarrow{\mathbf{P}}^n\), but an arbitrary projective variety doesn’t necessarily come with such an embedding.
For this, take the Veronese subring \(R^{(e)} \mathrel{\vcenter{:}}=\bigoplus _{d\geq 0} R_{de}\) for \(e > 0\). This corresponds to the Veronese embedding \({\mathbf{P}}^n \hookrightarrow{\mathbf{P}}^N\) for some large \(N\), which is defined by \begin{align*} V_e: {\mathbf{P}}^n &\to {\mathbf{P}}^N \\ {\left[ {x_0: \cdots : x_n} \right]} &\mapsto {\left[ {x_0^e: x_{0}^{e-1}x_1: \cdots} \right]} \end{align*} where you send a point to all monomials in the coordinates of degree \(e\). Here \(N+1 = {n+e\choose e}\) is the number of such monomials. Note that e.g. \({x_0^{e-1} x_1 \over x_0^e} = {x_1\over x_0}\), so one can recover the former ratios from the latter. The condition \(R = k[R_1]\) is needed to guarantee \(V_e\) is an embedding.
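As a sanity check on the monomial count, here is a minimal Python sketch that enumerates the degree \(e\) monomials in \(x_0, \cdots, x_n\) and evaluates the Veronese map at a point. The function names `veronese_exponents` and `veronese_map` are hypothetical, and the ordering of the monomials is just whatever `itertools` produces, not a convention from the lecture.

```python
from itertools import combinations_with_replacement
from math import comb

def veronese_exponents(n, e):
    """Exponent vectors (a_0, ..., a_n) of all degree-e monomials in x_0..x_n."""
    exps = []
    for choice in combinations_with_replacement(range(n + 1), e):
        a = [0] * (n + 1)
        for i in choice:
            a[i] += 1
        exps.append(tuple(a))
    return exps

def veronese_map(point, e):
    """Values of all degree-e monomials at the point [x_0 : ... : x_n]."""
    n = len(point) - 1
    image = []
    for a in veronese_exponents(n, e):
        value = 1
        for x, k in zip(point, a):
            value *= x ** k
        image.append(value)
    return image

# For n = 2, e = 2 the target is P^5, so there should be N + 1 = 6 monomials.
print(len(veronese_exponents(2, 2)), comb(2 + 2, 2))
print(veronese_map([1, 2, 3], 2))
```

The count agrees with \({n+e\choose e}\), and dividing any coordinate of the image by the first one recovers ratios such as \(x_1/x_0\) whenever \(x_0 \neq 0\).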
\begin{align*} \operatorname{mProj}R = \operatorname{mProj}R^{(e)} .\end{align*}
There exists an \(e_0 \geq 1\) such that \(R^{(e)}\) is generated in degree 1 whenever \(e\) is a multiple of \(e_0\).
Let \(X \subseteq {\mathbf{P}}^n\); we want to extract a graded ring. Start with \({\mathbf{P}}^n\) corresponding to \(k[x_0, \cdots, x_{n}]\ni p_d(x_0,\cdots, x_n)\), forms of degree \(d\). Then \(\left\{{p_d = 0}\right\} \subseteq {\mathbf{P}}^n\) is well-defined, and these sets define the Zariski topology on \({\mathbf{P}}^n\). Moreover any ratio \(p_d/q_d\) of forms of the same degree is a well-defined regular function on the open subset \(\left\{{q_d\neq 0}\right\} \subseteq {\mathbf{P}}^n\).
There are several ways to produce the ring \(R\):
Constructing the correspondence \(\mathop{\mathrm{Proj}}R \rightleftharpoons R = \bigoplus _{d\geq 0} R_d\). One can safely assume the rings are finitely-generated; the general construction goes exactly the same way. Define \(\mathop{\mathrm{Proj}}R\) to be the set of homogeneous prime ideals \(p \subseteq R\) which do not contain a certain ideal: considering \(k[x_0, \cdots, x_{n}]\), note that \(\left\langle{x_0,\cdots, x_n}\right\rangle\) does not define a point of \({\mathbf{P}}^n\), so define \(R_+ \mathrel{\vcenter{:}}=\bigoplus _{d\geq 1} R_d\) to be the irrelevant ideal and exclude the primes containing it. One can define fundamental closed subsets: for \({\mathbf{P}}^n\) these are of the form \(\left\{{f_d = 0}\right\}\), so generalize to \(f_d\in R_d\) and define \(Z(f_d) \mathrel{\vcenter{:}}=\left\{{ [p] {~\mathrel{\Big\vert}~}f_d([p]) = 0\in R/p}\right\}\). Note that \(f \equiv 0 \operatorname{mod}p\) in \(R\) iff \(f\in p\). General closed sets are intersections \(\bigcap_\alpha Z(f_\alpha)\), and the fundamental open sets are \(D(f_d) = \left\{{f_d\neq 0}\right\}\).
If \(R\in {}_{k} \mathsf{Alg}^{\mathrm{fg}}\) is generated in degree 1, then \(I \hookrightarrow k[x_0, \cdots, x_{n}]\twoheadrightarrow R\) and \(X \subseteq {\mathbf{P}}^n\) corresponds to \(Z(I) \mathrel{\vcenter{:}}=\bigcap_{f\in I} Z(f)\).
Sections of \({\mathcal{O}}\) are locally of the form \(f_k/g_k\), and for \({\mathcal{O}}(d)\) of the form \(f_{d+k}/g_k\). It remains to define local sections of the following on \(\mathop{\mathrm{Proj}}R\):
\({\mathcal{O}}\): \(f_k/g_k\)
\({\mathcal{O}}(1)\): \(f_{k+1}/ g_k\)
\({\mathcal{O}}(d)\): \(f_{d+k}/g_k \in R \left[ { \scriptstyle { {g}^{-1}} } \right] _{\deg = d}\), where \({f\over g^n} \sim {f'\over g^m} \iff g^N(fg^m - f'g^n) = 0\) for some \(N\).
Consider \(\mu_2 \curvearrowright k[x,y]\) by \((x,y)\mapsto (x, -y)\). Then \(k[x, y]^{\mu_2} = k[x, y^2]\), which is a graded subring of \(k[x,y]\) which is not generated in degree 1. Note that \(\mathop{\mathrm{Proj}}k[x,y]^{\mu_2} = {\mathbf{P}}^1(1, 2)\) is a weighted projective space, which turns out to be isomorphic to \({\mathbf{P}}^1\). Moreover \(0 = {\left[ {0: 1} \right]}\) and \(\infty = {\left[ {1: 0} \right]}\) have nontrivial stabilizers, while the action is free elsewhere, and remembering the stabilizers yields a quotient stack.
Consider the moduli of elliptic curves \((E, 0)\) over \({\mathbf{C}}\). Realize \((E, 0)\hookrightarrow{\mathbf{P}}^2\) as a cubic curve by its Weierstrass equation:
Regrade the first equation to total homogeneous degree \(6\) by setting \(\deg y = 3, \deg x = 2, \deg A = 4, \deg B = 6\) and rescale \begin{align*} (x,y,A,B) \mapsto (\lambda^2 x, \lambda^3 y, \lambda^4 A, \lambda^6 B) .\end{align*} This makes it unique up to rescaling, so the moduli of such equations is \({\mathbf{P}}(4, 6)\) corresponding to \(A\) and \(B\), which is the \(j{\hbox{-}}\)line \({\mathbf{P}}^1_j\). Every point has a stabilizer of size at least 2 since 2 divides both 4 and 6, which comes from the involution \(z\mapsto -z\). Two special points have larger stabilizers, corresponding to the two lattices with automorphism groups \(C_4\) and \(C_6\):
The former is \({\mathbf{C}}/\left\langle{1, i}\right\rangle\) which has the extra automorphism \(z\mapsto iz\), which has CM. The latter is \({\mathbf{C}}/\left\langle{1, \zeta_3}\right\rangle\) which has \(z\mapsto \zeta_3 z\).
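The rescaling invariance can be checked numerically. The sketch below assumes the standard normalization \(j = 1728\cdot 4A^3/(4A^3 + 27B^2)\) for \(y^2 = x^3 + Ax + B\), which is not the formula written (with a question mark) earlier in these notes; the helper names `j_invariant` and `rescale` are hypothetical.

```python
from fractions import Fraction

def j_invariant(A, B):
    """j-invariant of y^2 = x^3 + A x + B, assuming the usual 1728 normalization."""
    A, B = Fraction(A), Fraction(B)
    disc = 4 * A**3 + 27 * B**2
    if disc == 0:
        raise ValueError("singular cubic: 4A^3 + 27B^2 = 0")
    return 1728 * 4 * A**3 / disc

def rescale(A, B, lam):
    """The weighted action (A, B) -> (lam^4 A, lam^6 B) from the regrading."""
    lam = Fraction(lam)
    return lam**4 * A, lam**6 * B

# j is constant along orbits of the weighted rescaling.
print(j_invariant(1, 1), j_invariant(*rescale(1, 1, 3)))
# The two special points: B = 0 and A = 0.
print(j_invariant(1, 0), j_invariant(0, 1))
```

With this normalization the point \(B = 0\) has \(j = 1728\) and the point \(A = 0\) has \(j = 0\); these are the two points of \({\mathbf{P}}(4,6)\) where the stabilizer jumps.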
Let \(G = {\mathbf{G}}_m\curvearrowright{\mathbf{A}}^{n+1}\) by \(\lambda. {\left[ {x_0,\cdots, x_n} \right]} = {\left[ { \lambda x_0, \cdots, \lambda x_n} \right]}\). What are the stable points?
Note that the orbits are either \(\left\{{0}\right\}\) or punctured lines \(L\setminus\left\{{0}\right\}\). The former has stabilizer \({\mathbf{G}}_m\) and the latter orbits are not closed (their closures contain \(0\)), so there are no stable points. The affine quotient is \(\operatorname{mSpec}k[x_0, \cdots, x_{n}]^G \cong \operatorname{mSpec}k\), a point. However, the projective quotient will be \({\mathbf{A}}^{n+1} { \mathbin{/\mkern-6mu/}}_{\mathop{\mathrm{proj}}} {\mathbf{G}}_m \cong {\mathbf{P}}^{n} = {{\mathbf{A}}^{n+1} \setminus\left\{{0}\right\}\over {\mathbf{G}}_m}\), not a point.
Write \begin{align*} R = k[x_0, \cdots, x_{n}]= \bigoplus_d R_d = k \oplus \left\langle{x_0, \cdots, x_n}\right\rangle_k[1] \oplus \left\langle{x_0^2, x_0x_1,\cdots}\right\rangle[2] \oplus \cdots .\end{align*} We’ll say \(R_d\) are semi-invariants of degree \(d\), where \(\lambda.f = \lambda^d f\). More generally, for a character \(\chi: G\to {\mathbf{G}}_m\) given by \(\lambda\mapsto \chi( \lambda)\), for \(\lambda \in G\) we can act by \(\lambda.f = \chi( \lambda) f\).
Given a character \(\chi: G\to {\mathbf{G}}_m\) with \(G\curvearrowright R\) a ring, define the \(k{\hbox{-}}\)vector space of semi-invariants \begin{align*} R_\chi^G = \left\{{f {~\mathrel{\Big\vert}~}\lambda .f = \chi( \lambda) f}\right\} \subseteq R .\end{align*} Say this action is of ray type with respect to \(\chi\) iff for all \(d < 0\), the semi-invariants \(R_{d\chi}^G\) vanish and \(R_0 = k\). We can then define the projective quotient in the direction of a character as \begin{align*} X { \mathbin{/\mkern-6mu/}}_\chi G \mathrel{\vcenter{:}}=\operatorname{mProj}\oplus _{d\in {\mathbf{Z}}} R^G_{d\chi} .\end{align*}
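Note that \(R^G = R_{\mathop{\mathrm{triv}}}\), and if \(f\in R^G_{\chi_1}\) and \(g\in R^G_{\chi_2}\) then \(\lambda.(fg) = (\chi_1 + \chi_2)( \lambda) fg\), writing the group of characters additively.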
Let \(G = {\mathbf{G}}_m\) and take \(\chi = \operatorname{id}_{{\mathbf{G}}_m}\), then \(\bigoplus _{d\in {\mathbf{Z}}} R^G_{d\chi} = R\) and \({\mathbf{A}}^{n+1}{ \mathbin{/\mkern-6mu/}}_\chi {\mathbf{G}}_m = {\mathbf{P}}^n\).
Constructing Proj: write \(\bigoplus _{d\in {\mathbf{Z}}} R^G_{d\chi} = k[f_1,\cdots, f_n]\) with each \(f_i\) homogeneous of some degree \(d_i\), i.e. \(f_i \in R^G_{d_i\chi }\). Note that ratios \(\prod f_i^{m_i}/\prod f_i^{n_i}\) of total degree zero yield regular functions on \(D(\prod f_i^{n_i}) = V(\prod f_i^{n_i})^c\).
Let \(X = \operatorname{Mat}_{n\times n}({\mathbf{C}}) \cong {\mathbf{A}}^{n^2} \in {\mathsf{Aff}}{\mathsf{Var}}_{/ {{\mathbf{C}}}}\), and let \(G \mathrel{\vcenter{:}}=\operatorname{GL}_n({\mathbf{C}}) \curvearrowright X\) by \(g.A = g^{-1}A g\). Recall \(G = \operatorname{mSpec}{\mathbf{C}}[a_{ij}, b] / \left\langle{b\operatorname{det}(a_{ij}) - 1 }\right\rangle\), so the action is algebraic since \(g^{-1}= g^{\operatorname{adj}}/\operatorname{det}(g)\). Considering that characters \(\chi: \operatorname{GL}_n\to {\mathbf{G}}_m\) must be multiplicative, it turns out that every character is a power of \(\operatorname{det}: \operatorname{GL}_n\to {\mathbf{G}}_m\). What is \(\operatorname{Mat}_{n\times n}({\mathbf{C}}){ \mathbin{/\mkern-6mu/}}_{\operatorname{det}} \operatorname{GL}_n({\mathbf{C}})\)? Identify \(R = {\mathbf{C}}[c_{ij}]\) as the coordinate ring of \(X\), and recall \begin{align*} {\mathrm{charpoly}}(g^{-1}Ag, x) = {\mathrm{charpoly}}(A, x) = \operatorname{det}(A-xI) = (-1)^n(x^n - \operatorname{Trace}(A)x^{n-1} + \cdots \pm \operatorname{det}(A)) .\end{align*} Note that the affine quotient is \(\operatorname{mSpec}{\mathbf{C}}[\operatorname{Trace}(A),\cdots, \operatorname{det}(A)]\).
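To see the invariance of the characteristic polynomial concretely, here is a small Python sketch for \(n = 2\), where the two coefficients are just the trace and determinant. It uses exact rational arithmetic; the helpers `matmul`, `inverse`, and `conjugate` are hypothetical names written for this illustration and only handle \(2\times 2\) matrices.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    return M[0][0] + M[1][1]

def inverse(g):
    """Inverse via the adjugate: g^{-1} = adj(g) / det(g)."""
    d = Fraction(det(g))
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]

def conjugate(g, A):
    """The action g.A = g^{-1} A g."""
    return matmul(matmul(inverse(g), A), g)

A = [[1, 2], [3, 4]]
g = [[2, 1], [1, 1]]
B = conjugate(g, A)
print(trace(A), det(A), trace(B), det(B))
```

Both matrices have the same trace and determinant, so they map to the same point of the affine quotient \(\operatorname{mSpec}{\mathbf{C}}[\operatorname{Trace}, \operatorname{det}]\).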
Take \(G = \operatorname{GL}_2({\mathbf{C}}) \curvearrowright V_d = k[x,y]_d\cong {\mathbf{C}}^{d+1}\) where \begin{align*} { \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} } .{ \begin{bmatrix} {x} \\ {y} \end{bmatrix} } \mathrel{\vcenter{:}}={ \begin{bmatrix} {ax+by} \\ {cx+dy} \end{bmatrix} } .\end{align*} The ring is \(R = \operatorname{Sym}^*V_d\). Let \(\chi: G\to {\mathbf{G}}_m\) be the character \(\operatorname{det}: \operatorname{GL}_2\to {\mathbf{C}}^{\times}\).
Find polynomials such that \(A.p(x,y) = \operatorname{det}(A)^N p(x,y)\) for some power \(N\).
This is Mukai’s POV, an alternative POV is described in Mumford’s GIT. Start with \(G\curvearrowright Y = \operatorname{mProj}R\) and \(L\in \operatorname{Pic}(Y)\) ample.
Basic examples:
Note
Consider \(\operatorname{PGL}_{n+1} \curvearrowright{\mathbf{P}}^n\), corresponding to \({{\operatorname{SL}}_{n+1} \over \mu_{n+1}} = {\operatorname{GL}_{n+1}\over {\mathbf{G}}_m}\curvearrowright{{\mathbf{A}}^{n+1}\setminus\left\{{0}\right\}\over {\mathbf{G}}_m}\). The linearization would be an action \(\operatorname{PGL}_{n+1}\curvearrowright{\mathbf{A}}^{n+1} = \left\{{{\left[ {x_0, \cdots, x_n} \right]}}\right\}\) lifting this, which does not exist. However, there are natural actions \(\operatorname{GL}_{n+1}\curvearrowright{\mathbf{A}}^{n+1}\) and thus \({\operatorname{SL}}_{n+1} \curvearrowright{\mathbf{A}}^{n+1}\) by restriction. Thus \({\mathcal{O}}_{{\mathbf{P}}^n}(1)\) is not linearisable for the \(\operatorname{PGL}_{n+1}\) action, but is for \({\operatorname{SL}}_{n+1}\). One solution: work with \({\operatorname{SL}}_{n+1}\), which has trivial \(\pi_1\), while \(\operatorname{PGL}_{n+1}\) has nontrivial \(\pi_1\). Mumford’s solution: the power \({\mathcal{O}}_{{\mathbf{P}}^n}(n+1)\) is linearisable for \(\operatorname{PGL}_{n+1}\), since \(\mu_{n+1}\) acts trivially. This amounts to replacing \(L\) by \(L^{n+1}\) and \(R(X, L)\) by the subring \(R(X, L^{n+1})\) of elements whose degrees are divisible by \(n+1\), but their Projs are equal. The difference here is that \({\operatorname{SL}}_n\) has no nontrivial characters and every character of \(\operatorname{GL}_n\) is a power of \(\operatorname{det}\). Mukai’s approach is easier when there are many characters, e.g. when \(G\) is a torus with characters \(\mathop{\mathrm{Hom}}(T, {\mathbf{G}}_m)\cong {\mathbf{Z}}^n \mathrel{\vcenter{:}}= M\).
Setup:
We want to convert this to an action \(G\curvearrowright R(X, L) \mathrel{\vcenter{:}}=\bigoplus _{d\geq 0} H^0(X, L^d)\), the ring of homogeneous forms on \(X\). If \(X \subseteq {\mathbf{P}}^n\) then \(R(X, L) = k[x_0, \cdots, x_{n}]/ I(X)\) – at least modulo several beginning terms. There is a SES which can be twisted by \(d\): \begin{align*} I(X) \hookrightarrow{\mathcal{O}}_{{\mathbf{P}}^n} \twoheadrightarrow{\mathcal{O}}_X \leadsto I(X)(d) \hookrightarrow{\mathcal{O}}_{{\mathbf{P}}^n}(d) \twoheadrightarrow{\mathcal{O}}_X(d) .\end{align*}
This yields a map \begin{align*} H^0({\mathbf{P}}^n; {\mathcal{O}}_{{\mathbf{P}}^n}(d)) = \left\langle{x_0^d,\cdots}\right\rangle \xrightarrow{f_d} H^0(X; {\mathcal{O}}_X(d)) \to H^1({\mathbf{P}}^n; I(X)(d) ) .\end{align*} By Serre vanishing, for \(d > d_0 \gg 0\) the \(H^1\) term vanishes, so \(f_d\) is surjective in this range.
Conversely, \(\mathop{\mathrm{Proj}}R(X, L) = X\) with \({\mathcal{O}}(1) = L\), so we can freely pass between \(X\) and \(R(X, L)\). Given \(G\curvearrowright R(X, L)\) we can take invariants to get the graded ring \(R(X, L)^G\) and take its \(\mathop{\mathrm{Proj}}\) to obtain the GIT quotient \(Y = X{ \mathbin{/\mkern-6mu/}}G \mathrel{\vcenter{:}}=\mathop{\mathrm{Proj}}R(X, L)^G\). Passing from \(G\curvearrowright X\) to \(G\curvearrowright R(X, L)\) is called linearization, i.e. lifting an action on \(X\) to an action on (sections of) \(L\). Recall that \(L\) is the sheaf of regular sections of a line bundle \({\mathbb{L}}\xrightarrow{\pi} X\):
Note that two different linearizations of \(G\curvearrowright X\) differ by a character \(\chi\in \mathop{\mathrm{Hom}}_{ \mathsf{Alg}{\mathsf{Grp}}}(G, {\mathbf{G}}_m)\). Let \(a_1, a_2\) be two linearizations of \(G\curvearrowright X\) and consider the two actions on fibers:
Then \(a_1 \circ a_2^{-1}: G\curvearrowright{\mathcal{O}}_X\), i.e. \(G\curvearrowright H^0(X; {\mathcal{O}}_X^{\times}) = {\mathbf{G}}_m\). This induces \(G\curvearrowright({\mathbf{A}}^1, 0)\) fixing zero, so we get a map \(G\to \mathop{\mathrm{Aut}}({\mathbf{A}}^1, 0) = {\mathbf{G}}_m\), yielding a character of \(G\).
Let \(G = {\mathbf{C}}^{\times}= {\mathbf{G}}_m\) and \(X = {\mathbf{P}}^3\) with homogeneous coordinates \(x_0,\cdots, x_3\). Then the cone \(CX\) has affine coordinates \(x_0, \cdots, x_3\), \begin{align*} R(X, L) = k[x_0,\cdots, x_3] = k \oplus \left\langle{x_0, \cdots, x_3}\right\rangle \oplus \cdots ,\end{align*} and \(L = {\mathcal{O}}(1)\). If \(G\curvearrowright X\) then \({\mathbf{G}}_m = G\curvearrowright CX = {\mathbf{A}}^4 = \operatorname{Spec}k[x_0,\cdots,x_3]\). This corresponds to a \({\mathbf{Z}}{\hbox{-}}\)grading, inducing weights \(w_i \mathrel{\vcenter{:}}=\mathrm{weight}(x_i) \in {\mathbf{Z}}\). For \(\lambda\in {\mathbf{C}}^{\times}\) the action is \(\lambda. x_i = \lambda^{w_i} x_i\).
Note that for \(G\curvearrowright{\mathbf{P}}^n\), the regular functions are locally \(p_d(x)/q_d(x)\), ratios of polynomials of the same degree. Write \({\mathbf{P}}^3 = {\mathbf{A}}^3 \cup \cdots \cup {\mathbf{A}}^3\) as a union of the standard affine charts; these are \(G{\hbox{-}}\)invariant subspaces since e.g. \(\left\{{x_0=0}\right\}\) is an invariant closed set. So it suffices to specify an action on each subspace. This induces \(\mathrm{weight}(x_i/x_0) = u_i \mathrel{\vcenter{:}}= w_i - w_0\), and more generally \(\mathrm{weight}(x_i/x_j) = w_i - w_j\). A linearization amounts to determining the \(w_i\) such that \(\lambda .x_i = \lambda^{w_i} x_i\). Any two such linearizations differ by addition of an integer, i.e. \(w_i' = w_i + b\). One can check that \(\mathop{\mathrm{Hom}}_{ \mathsf{Alg}{\mathsf{Grp}}}({\mathbf{G}}_m, {\mathbf{G}}_m) \cong {\mathbf{Z}}\) where \(\lambda\mapsto \lambda^b\) for each \(b\in {\mathbf{Z}}\). So the linearizations are a \({\mathbf{Z}}{\hbox{-}}\)torsor.
What are the quotients?
Take the toric polytope for \({\mathbf{P}}^3\): the standard simplex in \({\mathbf{R}}^3\) (a tetrahedron).
Note that \begin{align*} k[x_0,\cdots,x_3] = k \oplus \left\langle{x_0,\cdots, x_3}\right\rangle \oplus \left\langle{x_0^2, x_0 x_1,\cdots}\right\rangle \oplus \cdots \cong k^1 \oplus k^{3+1\choose 1} \oplus k^{3+2\choose 2} \oplus \cdots .\end{align*} More generally one has \begin{align*} H^0({\mathbf{P}}^3; {\mathcal{O}}(d)) = \oplus _{m\in M \cap dP} {\mathbf{C}}x^m ,\end{align*} so e.g. for \(d=2\) one has
To visualize the graded ring \(k[x_0, \cdots, x_3]\):
Generally, one should assign an integer to each lattice point of \(dP\) satisfying \(\mathrm{weight}(x_i x_j) = w_i + w_j\). Since \({\mathbf{G}}_m\curvearrowright R\) corresponds to a \({\mathbf{Z}}{\hbox{-}}\)grading by weight, yielding \(R = \oplus _{w\in {\mathbf{Z}}} R_w\), the invariants are \(R^G = R_0\), the 0th graded piece. So one can consider the fiber over zero in the weight map:
As one changes linearization, one shifts this picture and the corresponding slice, and if the slice is empty, \(R_0 = {\mathbf{C}}\), and \(\mathop{\mathrm{Proj}}R_0\) will be empty.
One can write \(R = \bigoplus_{d\geq 0} \bigoplus _{m\in M \cap dP} {\mathbf{C}}x^m\) and \(R_0 = \bigoplus_{d\geq 0} \bigoplus _{m\in M \cap dQ} {\mathbf{C}}x^m\) where \(Q = \pi^{-1}(0)\) is the slice above weight zero.
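The weight-zero slice can be computed by brute force. The following Python sketch lists the degree \(d\) monomials of weight zero for a \({\mathbf{G}}_m\)-action on \({\mathbf{P}}^3\) and shows how the slice moves when the linearization is shifted by \(b\). The weights \((1, 1, -1, -1)\) are a hypothetical choice made for illustration, not necessarily the ones used in lecture, and the function names are likewise made up.

```python
from itertools import combinations_with_replacement

def monomials(num_vars, d):
    """Exponent vectors of all degree-d monomials in num_vars variables."""
    exps = []
    for choice in combinations_with_replacement(range(num_vars), d):
        a = [0] * num_vars
        for i in choice:
            a[i] += 1
        exps.append(tuple(a))
    return exps

def invariant_monomials(weights, d):
    """Degree-d monomials x^m whose weight sum(m_i * w_i) is zero."""
    return [m for m in monomials(len(weights), d)
            if sum(mi * wi for mi, wi in zip(m, weights)) == 0]

def shift(weights, b):
    """Change of linearization w_i -> w_i + b."""
    return tuple(w + b for w in weights)

w = (1, 1, -1, -1)
for b in (0, 1, 2):
    wb = shift(w, b)
    print(b, wb, [len(invariant_monomials(wb, d)) for d in (1, 2)])
```

For \(b = 0\) there are no invariants in degree 1 and exactly four in degree 2, namely \(x_0x_2, x_0x_3, x_1x_2, x_1x_3\), the vertices of a 4-gon. For \(b = 1\) the slice degenerates to the edge spanned by \(x_2, x_3\), and for \(b = 2\) every weight is positive, so the slice is empty.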
This corresponds to the cone on a 4-gon in the dilated cone picture for the graded ring. Note that if the polytope \(1P\) (here the simplex) in the \(h = 1\) slice does not have integral lattice points, it won’t provide a set of generators. We have \(R_0 = S[Q]\) where \(Q \subseteq {\mathbf{R}}^3\) and the vertices of \(Q\) are in \({\mathbf{Q}}^3\). This is a semigroup algebra, so we take its proj to get \(X{ \mathbin{/\mkern-6mu/}}G = \mathop{\mathrm{Proj}}S[Q]\). We can replace \(Q\) with any multiple \(kQ\) to get \(X{ \mathbin{/\mkern-6mu/}}G = \mathop{\mathrm{Proj}}S[kQ]\), the Veronese subring. Note that \(S[kQ]_{\deg = d} = S[Q]_{\deg = kd}\), and if you work with ratios of total degree zero these coincide, and corresponds to replacing \({\mathcal{O}}(1)\) with \({\mathcal{O}}(k)\).
Any rational polytope \(Q\) yields a projective toric variety \(Y_Q\) with an action \(({\mathbf{C}}^{\times})^2\curvearrowright Y_Q\), where \(Y_Q\) is normal, the torus acts with finitely many orbits, and \(T\mathrel{\vcenter{:}}=({\mathbf{C}}^{\times})^2 \hookrightarrow Y_Q\) as an open orbit. This \(Y_Q\) is a compactification of the torus \(T\) by adding \(T{\hbox{-}}\)orbits, and faces of \(Q\) correspond to orbits:
One can ask how the polytope changes under various rational linearizations \(L \mapsto L^k\), corresponding to different GIT quotients. Here one gets 4-gons, two 3-gons, and the empty set, corresponding to 3 chambers (and empty chambers) with two walls describing the combinatorial types of the fibers.
For \({\mathbf{C}}^{\times}\curvearrowright{\mathbf{P}}^4\) one can vary to obtain the following 3D cross-sections:
Mumford’s approach for e.g. \(G = {\operatorname{SL}}_n\): \(G\curvearrowright R = R(X, L) = \bigoplus _{d\geq 0} H^0(X; L^d)\) where e.g. \(L = {\mathcal{O}}(1)\) for \(X \subseteq {\mathbf{P}}^n\). This yields \(R^G\) a graded ring and \(X{ \mathbin{/\mkern-6mu/}}G = \mathop{\mathrm{Proj}}R^G\). Setting \(Y = CX = \operatorname{Spec}R\), we can consider \(Y{ \mathbin{/\mkern-6mu/}}G = \operatorname{Spec}R^G\). We have \(Y \supseteq Y^s = \left\{{y\in Y{~\mathrel{\Big\vert}~}G.y \text{ is closed}, G_y \text{ is finite}}\right\}\), the open subset of stable points. For each equivalence class of orbits under the orbit-closure equivalence, there is a unique closed orbit.
One can show that unstable points are a closed condition, so \(Y^s\) is open in \(Y^{\mathrm{ss}}\).
Mukai’s approach: for \({\mathbf{G}}= \operatorname{GL}_n\) define \(Y{ \mathbin{/\mkern-6mu/}}{\mathbf{G}}= X{ \mathbin{/\mkern-6mu/}}G\).
Consider \(V_{d, n}\) the space of degree \(d\) forms on \({\mathbf{C}}^{n+1}\), which is isomorphic to \({\mathbf{C}}^N\) where \(N = {d+n\choose n}\). We can also consider \({\mathbf{P}}V_{d, n} = \left\{{p_d(x_0,\cdots, x_n) = 0 \text{ homogeneous}}\right\}\) the space of hypersurfaces of degree \(d\) in \({\mathbf{P}}^n\), up to the action of \(\operatorname{PGL}_{n+1} = \mathop{\mathrm{Aut}}({\mathbf{P}}^n)\). We have \({\mathbf{P}}V_{d, n}{ \mathbin{/\mkern-6mu/}}\operatorname{PGL}_{n+1} = {\mathbf{P}}V_{d, n} { \mathbin{/\mkern-6mu/}}{\operatorname{SL}}_{n+1} = V_{d, n}{ \mathbin{/\mkern-6mu/}}\operatorname{GL}_{n+1}\), and \({\operatorname{SL}}_{n+1}\curvearrowright{\mathbf{A}}(V_{d, n}) = S(V_{d, n})\), so we want to describe the ring \(S(V)^{ {\operatorname{SL}}_{n+1} }\) since its proj is \({\mathbf{P}}V_{d, n} { \mathbin{/\mkern-6mu/}}\operatorname{PGL}_{n+1}\). Recall \(S(V) = \bigoplus_{k\geq 0} \operatorname{Sym}^k(V)\). Describing this ring is equivalent to describing the variety and yields a solution to moduli problems involving hypersurfaces.
Consider sextic curves in \({\mathbf{P}}^2\), given by polynomials \(p_6(x_0, x_1, x_2)\). Note that smooth curves are stable since their orbits are closed.
He lists the unique closed semistable orbits. One is a cube of a quadratic, one is three quadrics tangent at two points, one is a double line with a smooth quartic:
The case of interest: a moduli space \({\mathcal{M}_g}\) parameterizing smooth curves. One chooses a linear system \(\phi_{{\left\lvert {2K_C} \right\rvert}}: C\hookrightarrow{\mathbf{P}}^n\), then one shows that the smooth curves \(\left\{{C\hookrightarrow{\mathbf{P}}^n}\right\} \subseteq \operatorname{Hilb}, {\operatorname{CH}}\) are stable, but one also picks up singular stable curves and quotienting yields a compactification \(\overline{{\mathcal{M}_g}}\).
See Sylvester, a first good American mathematician! He proved some important theorems in invariant theory.
The easiest case: \(\left\{{f(x, y) = 0}\right\} \subseteq {\mathbf{P}}^1\) with coordinates \(x,y\). There is an action of \({\operatorname{SL}}_2\) given by \({ \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} } { \begin{bmatrix} {x} \\ {y} \end{bmatrix} } = { \begin{bmatrix} {ax+by} \\ {cx+dy} \end{bmatrix} }\). Write \(V_{d, 1} = \left\langle{x^d, x^{d-1}y,\cdots}\right\rangle\), on which \({\operatorname{SL}}_2\) acts by variable substitution. The algebra is \(R = S(V_{d, 1})\), and we want to find \(R^G = S(V_{d, 1})^{{\operatorname{SL}}_2}\). This is a moduli problem for configurations of \(d\) points (counted with multiplicity) on \({\mathbf{P}}^1\), up to reparameterizing by \(\operatorname{PGL}_2\). It turns out \(f\) is
Note that for \(d\) odd, \(X^s = X^{\mathrm{ss}}\).
Setup: \(X \subseteq {\mathbf{P}}^n\) with \(G\curvearrowright X, {\mathbf{P}}^n\), \(G\) linearly reductive, which is linearized so that \(G\curvearrowright{\mathbf{A}}^{n+1}\) acting on projective coordinates is a linear action. Thus each \(g\in G\) induces \(\rho_g\) which is linear in the coordinates \(x_0, x_1,\cdots, x_n\). We have
Define \(X^u = X\setminus X^{\mathrm{ss}}\) to be the unstable points; our main problem is to describe \(X^u, X^s, X^{\mathrm{ss}}\).
For \(p\in X\), \(p\in X^{\mathrm{ss}}\iff\) the following holds: Let \(\lambda: {\mathbf{G}}_m\to G\) be a nontrivial 1-parameter subgroup, and note the action \({\mathbf{G}}_m\curvearrowright{\mathbf{A}}^{n+1}\) corresponds to a grading and thus some system of linear coordinates \(x_0,\cdots, x_n\) with weights \(w_0, \cdots, w_n \in {\mathbf{Z}}\) where \(t.x_i = t^{w_i} x_i\). Then for all such \(\lambda\), there should exist a coordinate \(x_i\) with \(x_i(p) \neq 0\) and \(w_i \leq 0\).
Similarly, \(p\in X^s \iff\) for all such \(\lambda\) there exists a coordinate \(x_i\) with \(x_i(p) \neq 0\) and \(w_i < 0\) strictly negative.
Note that replacing \(\lambda\) with \(\lambda^{-1}\), one can replace the above conditions with \(w_i \geq 0\) and \(w_i > 0\) respectively. Most papers on GIT start with this theorem, and finding the unstable locus is a computation.
\(p\in X^u \iff\) there exists a 1-parameter subgroup \(\lambda: {\mathbf{G}}_m\to G\) such that \(w_i > 0\) for all \(i\) with \(x_i(p) \neq 0\).
Consider binary degree \(d\) forms, corresponding to degree \(d\) cycles/subschemes in \({\mathbf{P}}^1\). Each point corresponds to a homogeneous polynomial \(f_d(x,y)\) of degree \(d\). Recall that \(V_d = \left\langle{x^d, x^{d-1}y,\cdots, y^d}\right\rangle\), the irrep of \({\operatorname{SL}}_2\) where \({\operatorname{SL}}_2\curvearrowright\left\langle{x,y}\right\rangle\) by matrix multiplication: \begin{align*} { \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} } { \begin{bmatrix} {x} \\ {y} \end{bmatrix} } \mathrel{\vcenter{:}}={ \begin{bmatrix} {ax+by} \\ {cx+dy} \end{bmatrix} } .\end{align*}
We have a 1-parameter subgroup corresponding to \(\operatorname{diag}(t, t^{-1}) \curvearrowright[x,y] = [tx, t^{-1}y]\) which gives \(x\) weight \(1\) and \(y\) weight \(-1\). Call this \(\lambda^\text{std}\).
All 1-parameter subgroups of \({\operatorname{SL}}_2\) are powers \((\lambda^\text{std})^n\) for some \(n\), up to a linear change of coordinates for \(x,y\).
The general theory: if \(G\) is semisimple (e.g. \(G={\operatorname{SL}}_n\)) then \(G\supseteq T\) a maximal torus, and any two such are conjugate. For \({\operatorname{SL}}_n\) these tori are conjugate to the diagonal matrices \(M\) with \(\operatorname{det}M = 1\). Moreover every 1-parameter subgroup is contained in a maximal torus. Powers can be ignored here, since they correspond to multiplying weights by a positive integer. By the theorem, a point is unstable iff the monomials that appear in the binary form are all of negative weight for some choice of coordinates \(x,y\). For \(d=3\), we have
So \(f\) is unstable iff \(f(x,y) = axy^2 + by^3 = y^2(ax + by)\), i.e. in some coordinates \(y^2 \mid f\), so \(f\) has a double root.
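The weight bookkeeping for binary forms is easy to automate. Here is a minimal Python sketch, assuming the convention that \(x\) has weight \(1\) and \(y\) has weight \(-1\) under \(\operatorname{diag}(t, t^{-1})\), so that \(x^i y^{d-i}\) has weight \(2i - d\). The function names are hypothetical, and the check only tests this one 1-parameter subgroup in the given coordinates: a form may still be unstable after a change of coordinates.

```python
def monomial_weight(i, d):
    """Weight of x^i y^(d-i) under diag(t, 1/t), with wt(x) = 1 and wt(y) = -1."""
    return i - (d - i)

def destabilized_by_std(x_exponents, d):
    """True if every monomial x^i y^(d-i) appearing in the form has negative weight.

    x_exponents is the set of exponents i whose coefficient is nonzero.
    """
    if not x_exponents:
        raise ValueError("the zero form does not define a point of P(V_d)")
    return all(monomial_weight(i, d) < 0 for i in x_exponents)

# Binary cubics: weights of y^3, x y^2, x^2 y, x^3.
print([monomial_weight(i, 3) for i in range(4)])
print(destabilized_by_std({0, 1}, 3))     # a x y^2 + b y^3
print(destabilized_by_std({0, 1, 2}, 3))  # also involves x^2 y
```

For \(d = 3\) the only monomials of negative weight are \(xy^2\) and \(y^3\), which is the statement that an unstable cubic is divisible by \(y^2\) in suitable coordinates.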
A binary form of degree \(d\) is
Note that for odd \(d\), stable = semistable, and for even \(d\) they are different.
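For reference, the standard root-multiplicity criterion for binary forms can be sketched as follows. This states the usual classical result as an assumption (stable iff every root has multiplicity \(< d/2\), semistable iff every root has multiplicity \(\leq d/2\)); the function name `classify_binary_form` is hypothetical.

```python
def classify_binary_form(multiplicities):
    """Classify a binary form by the multiplicities of its roots on P^1.

    Assumes the classical criterion: with d the degree and m the largest
    root multiplicity, stable iff m < d/2 and semistable iff m <= d/2.
    """
    if not multiplicities or min(multiplicities) < 1:
        raise ValueError("multiplicities must be positive integers")
    d = sum(multiplicities)
    m = max(multiplicities)
    if 2 * m < d:
        return "stable"
    if 2 * m == d:
        return "strictly semistable"
    return "unstable"

print(classify_binary_form([1, 1, 1]))  # cubic with distinct roots
print(classify_binary_form([2, 1]))     # cubic with a double root
print(classify_binary_form([2, 1, 1]))  # quartic with one double root
print(classify_binary_form([3, 1]))     # quartic with a triple root
```

Since \(2m = d\) has no solutions when \(d\) is odd, the strictly semistable case never occurs there, which matches the note above.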
For \(d=4\), consider the double cover \(z^2 = f(x,y)\):
So
Thus the \(j{\hbox{-}}\)line \({\mathbf{A}}^1\) corresponds to smooth/stable curves, and compactifies to \({\mathbf{P}}^1 = {\mathbf{P}}(2,3)\) by adding nodal curves.
For \({\operatorname{SL}}_3\curvearrowright X\) for \(X\) the space of cubic curves in \({\mathbf{P}}^2\), we have several possibilities for curves \(f_3(x,y,z) = 0\):
We have \(f_3\in {\mathbf{P}}^{{3+2\choose 2} - 1} = {\mathbf{P}}^9\), with 10 coordinates: \begin{align*} x^3,y^3,z^3, x^2 y, x^2 z, y^2 x, y^2 z, z^2 x, z^2 y, xyz .\end{align*} Each curve \(f = 0\) corresponds to a closed subscheme of \({\mathbf{P}}^2\) whose ideal is \(\left\langle{f}\right\rangle\). There is an action of \({\operatorname{SL}}_3 \curvearrowright{\left[ {x,y,z} \right]}\) on coordinates, and a maximal torus \(T = \operatorname{diag}(t_1, t_2, t_1^{-1}t_2^{-1})\). Choosing this torus in a diagonal form is equivalent to choosing a coordinate system. One can then look at \({\mathbf{G}}_m \hookrightarrow T\) and consider its action to define weights. We get the following triangle of monomials:
Take this and project to get weights:
This gives \(w(x) = (1,0), w(y) = (0, 1), w(z) = (-1, -1)\). Those with the right weights are \(x^2 z, xz^2, z^3, yz^2\), all containing a factor of \(z\). So any polynomial of the form \(f(x,y,z) = ax^2z + bxz^2 + cz^3 + dyz^2 = z(ax^2 + bxz + cz^2 + dyz)\) is unstable.
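One way to reproduce the list of monomials above is to pick a 1-parameter subgroup and compute weights directly. The sketch below uses \(\operatorname{diag}(t, t^2, t^{-3})\), i.e. weights \((1, 2, -3)\) on \((x, y, z)\); this particular choice is an assumption made here because it singles out exactly those four monomials, and it is not necessarily the projection drawn in lecture.

```python
def cubic_monomials():
    """Exponent triples (i, j, k) with i + j + k = 3, for x^i y^j z^k."""
    return [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]

def weight(exponent, one_ps):
    """Weight of x^i y^j z^k under diag(t^r0, t^r1, t^r2), one_ps = (r0, r1, r2)."""
    return sum(e * r for e, r in zip(exponent, one_ps))

def name(exponent):
    """Human-readable monomial, e.g. (2, 0, 1) -> 'x^2 z'."""
    parts = []
    for var, e in zip("xyz", exponent):
        if e == 1:
            parts.append(var)
        elif e > 1:
            parts.append(f"{var}^{e}")
    return " ".join(parts)

one_ps = (1, 2, -3)  # hypothetical choice; the entries sum to zero, so it lands in SL_3
negative = [m for m in cubic_monomials() if weight(m, one_ps) < 0]
print(sorted(name(m) for m in negative))
print(weight((1, 1, 1), one_ps))  # xyz sits at the center with weight 0
```

Rotating the line through the center point \(xyz\) corresponds to varying `one_ps`, and the monomials on one side of the line are the ones returned in `negative`.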
Thus the following are unstable:
The game: take a line through the center point \(xyz\), rotate, take monomials on the positive side, and check for instability, since we need \(w(xyz) = 0\). It turns out that smooth cubics are stable, simple nodes are semistable, and anything worse than \(xyz = 0\) is unstable.
Hilbert-Mumford criterion: \(G\curvearrowright X \subseteq {\mathbf{P}}^n\) projective, which is linearized to \(G\curvearrowright{\mathbf{A}}^{n+1}\) acting on coordinates \(x_0',\cdots, x_n'\). For a point \(p\in X\) corresponding to \(\tilde p\in {\mathbf{A}}^{n+1}\), is it stable, semistable, etc? Note
Equivalently,
If \(\tilde p = {\left[ {x_0,\cdots, x_n} \right]} \in {\mathbf{A}}^{n+1}\), then \(t.x_i = t^{w_i} x_i\). So case (3) above corresponds to \(\lim_{t\to 0} t.\tilde p = \mathbf{0}\), so the origin is in the closure of \({\mathbf{G}}_m.\tilde p \subseteq G.\tilde p\). In case (2), \({\mathbf{G}}_m.\tilde p \subseteq {\mathbf{A}}^{n+1}\setminus\left\{{0}\right\}\), and to compute the closure one considers \(\lim_{t\to 0, \infty} {\left[ {t^{w_0} x_0, \cdots, t^{w_n} x_n} \right]} \not\in {\mathbf{A}}^{n+1}\). Note that by the valuative criterion, any map from a smooth curve to a proper variety can be extended to its compactification, so \({{\mathbf{C}}^{\times}}\to {\mathbf{A}}^{n+1} \subseteq {\mathbf{P}}^{n+1}\) extends to \({\mathbf{P}}^1\to {\mathbf{P}}^{n+1}\) uniquely, which is where this limit is computed. If some \(w_i = 0\), split into cases:
So \(\lim_{t\to 0} {\left[ {\cdots, t^0 x_i, \cdots} \right]} = {\left[ {\cdots, x_i, \cdots} \right]}\neq \mathbf{0}\).
The condition on 1-parameter subgroups is necessary. For sufficiency, the claim is that it’s enough to look at 1-parameter subgroups, so consider \(\overline{G.\tilde p}\). Embed \(G.\tilde p \subseteq {\mathbf{A}}^{n+1} \subseteq {\mathbf{P}}^{n+1}\) to embed \(\overline{G.\tilde p} \subseteq {\mathbf{P}}^{n+1}\). Assuming \(0\in \overline{G.\tilde p}\), one can find a curve \(C\) lying entirely in the orbit whose closure contains \(0\), for example by slicing \(\overline{G.\tilde p}\) by hyperplanes to reduce dimension by 1 (using the principal ideal theorem) and picking resulting irreducible components arbitrarily. So one gets the following:
Take \(G = {\operatorname{SL}}_{n+1}(K) \supseteq{\operatorname{SL}}_{n+1}(R)\) with \(K = {\mathbf{C}}(x_0,\cdots, x_n) \supseteq(R, {\mathfrak{m}})\) where \(K = \operatorname{ff}(R), R = {\mathbf{C}}[x_0,\cdots, x_n]_{\prod x_i}\). Note \({\operatorname{SL}}_{n+1}(R) \to {\operatorname{SL}}_{n+1}(R/{\mathfrak{m}}) = {\operatorname{SL}}_{n+1}({\mathbf{C}})\), and taking the \(LDR\) decomposition of a matrix \(M\) finishes the proof. See Mumford/Mukai.
Discuss this next Thursday in class.
Some applications:
Then \(({\mathbf{P}}^N)^s/ {\operatorname{SL}}_{n+1}\) contains the nonsingular hypersurfaces, and is contained in \(({\mathbf{P}}^N)^{\mathrm{ss}}{ \mathbin{/\mkern-6mu/}}{\operatorname{SL}}_{n+1}\), a projective variety which adds new semistable but not stable hypersurfaces at the boundary.
Last time we looked at \(n=1, d\) arbitrary and \((n, d) = (2, 3)\). For next time, consider \((n, d) = (2, 4), (3,3)\).
Note that if \(X_d, X_d' \subseteq {\mathbf{P}}^n\) with \(X_d \cong X_d'\), then there exists a \(g\in \operatorname{PGL}_{n+1}\) with \(g.X_d = X_d'\) for \(n\geq 4\): since \(\operatorname{Pic}X_d = {\mathbf{Z}}[H]\) by Lefschetz, the linear system \(\phi_{{\left\lvert {H} \right\rvert}}: X_d \hookrightarrow{\mathbf{P}}^n\) defines an embedding, and \(H^0(X_d; {\mathcal{O}}(H)), H^0(X_d'; {\mathcal{O}}(H))\) differ only by choosing a basis of sections.
Every \(p\in {\mathbf{P}}^n\) with \(p={\left[ {a_0:\cdots :a_n} \right]}\) has a dual \(H_p \in ({\mathbf{P}}^n) {}^{ \vee }\) where \(H_p = V(\ell)\) for \(\ell\) the linear form \(\sum a_i x_i\). For any \(d\) points \(p_i\), taking the product \(f_d \mathrel{\vcenter{:}}=\prod \ell_i\) yields….something.
The Chow variety \({\operatorname{Ch}}(d, k, {\mathbf{P}}^n)\) parameterizes cycles \(X\mathrel{\vcenter{:}}=\sum n_i X_i\) with \(X_i \subseteq {\mathbf{P}}^n\) each of dimension \(k\) where \(\deg X = \sum n_i \deg X_i = d\). Let \(X^k \subseteq {\mathbf{P}}^n \supseteq P^{n-k-1}\); a generic such linear subspace won’t intersect \(X^k\). They are parameterized by \({\mathbb{Gr}}(n-k-1, n) ={\operatorname{Gr}}(n-k, n+1)\) which contains a hypersurface \((X) = \left\{{P^{n-k-1} {~\mathrel{\Big\vert}~}P^{n-k-1} \cap X\neq \emptyset}\right\}\). Since this is a codimension 1 condition, it’s given by an equation \(\left\{{F_X = 0}\right\}\). This is the Chow form of \(X\), which replaces the many equations of \(X\) with a single equation.
What this looks like for hypersurfaces: \(X_d \subseteq {\mathbf{P}}^n \supseteq P^0\), which is a point. These are parameterized by \({\mathbb{Gr}}(0, {\mathbf{P}}^n) = {\operatorname{Gr}}(1, n+1) = {\mathbf{P}}^{n}\). The equation of the Chow form recovers the equation of \(X\). This recovers the point-hyperplane correspondence from before for \({\mathbf{P}}^n\).
Note that \(\operatorname{Pic}G \cong {\mathbf{Z}}\) for any Grassmannian \(G\), generated by \({\mathcal{O}}_G(1)\), and the equation \(F_X\) of the hypersurface \(\left\{{F_X = 0}\right\}\) lives in \(\operatorname{Sym}^d H^0(G; {\mathcal{O}}_G(1))\).
Some major applications of GIT:
Moduli of sheaves: \(\operatorname{Pic}\) and \(\operatorname{Jac}\) as varieties/schemes, moduli of (semi)stable vector bundles (Narasimhan, Seshadri, Mumford), and more generally moduli of semistable coherent sheaves (Maruyama, Simpson). See Gieseker for moduli of vector bundles over surfaces. In this situation, GIT works very well – this is a “linear” problem.
Moduli of varieties: stable curves (Mumford), some surfaces (Gieseker). GIT works less well here, since this is “nonlinear.”
We’ll proceed to look at the first case, moduli of sheaves. Note that for quasicoherent sheaves, one instead needs to pass to pro-objects in coherent sheaves.
Let \(X\in \mathop{\mathrm{Proj}}{ {\mathsf{Var}}_{/k}}\) where \(k\) is not necessarily algebraically closed. We can define the abstract group \(\operatorname{Pic}(X)\) of invertible (i.e. locally free of rank 1) sheaves on \(X\) modulo isomorphism. There is also a Picard scheme \(\operatorname{Pic}(X) = \operatorname{Jac}(X)\) which is the fine moduli space of invertible sheaves of fixed degree or fixed Hilbert polynomial, which has the structure of a scheme over \(k\) – if \(\operatorname{ch}k = 0\) then it is an algebraic variety, but may have nilpotents in positive characteristic. For appropriate choices, this can be made into a group scheme/variety.
Let \(C\) be a smooth projective genus \(g\) curve over \(k={\mathbf{C}}\). The degree map provides a SES \begin{align*} \operatorname{Pic}^0(C) = \ker \deg = \operatorname{Jac}(C) \hookrightarrow\operatorname{Pic}C \twoheadrightarrow{\mathbf{Z}} .\end{align*} One can realize \(\operatorname{Pic}^0(C) \cong {\mathbf{C}}^g/{\mathbf{Z}}^{2g}\), giving it the structure of a projective algebraic variety and a complex manifold. Note that a random choice of lattice \(L \cong {\mathbf{Z}}^{2g}\) will yield a Kähler manifold, but potentially not an algebraic variety unless \(L\) satisfies strict numerical conditions (which it does for \(\operatorname{Pic}^0\)).
Families of invertible sheaves will correspond to moduli functors \begin{align*} \underline{M}: ({\mathsf{Sch}}^{{\mathrm{ft}}}_{\operatorname{Spec}{\mathbf{C}}})^{\operatorname{op}}&\to {\mathsf{Set}}\\ S &\mapsto \left\{{ \text{Invertible sheaves $F$ on $X \underset{\scriptscriptstyle {\operatorname{Spec}{\mathbf{C}}} }{\times} S$} }\right\} / \cong .\end{align*} Such an \(F\) should be thought of as a family of invertible sheaves on \(X\) parameterized by \(S\), i.e. for every \(s\in S\) there is a sheaf \(F_s \mathrel{\vcenter{:}}={ \left.{{F}} \right|_{{X_s}} }\) where \(X_s\) is the fiber over \(s\):
For each \(f: S\to T\) we obtain a map \(\underline{M}(T) \to \underline{M}(S)\): the morphism \(\operatorname{id}\times f: X\times S \to X\times T\) induces the pullback \(F\mapsto (\operatorname{id}\times f)^* F\).
We also require that each \(F\in \underline{M}(S)\) is equipped with a rigidification: a fixed trivialization \({ \left.{{F}} \right|_{{p\times S}} }\cong{\mathcal{O}}_S\):
This kills automorphisms and gives a fine moduli space. Without this, one could twist by anything coming from the base, so one could alternatively define \begin{align*} \underline{M}'(S) = { \left\{{\text{$F$ on $X\times S$ }}\right\} \over F\to F\otimes\pi^* L}, \qquad L\in \operatorname{Pic}(S) \end{align*} This coincides with the previous notion when \(L\) has a section.
For \({\mathcal{F}}\in {\mathsf{Coh}}(X)\), define the Hilbert polynomial \begin{align*} p_{\mathcal{F}}(n) \mathrel{\vcenter{:}}=\chi(X, {\mathcal{F}}(n)) = \sum_{i\geq 0} (-1)^i h^i(X; {\mathcal{F}}(n)) ,\end{align*} noting that by Serre vanishing, for \(n \gg 0\), \(h^{i > 0}(X; {\mathcal{F}}(n)) = 0\) and hence \(p_{\mathcal{F}}(n) = h^0(X; {\mathcal{F}}(n))\).
If \(L\) is a line bundle on a curve \(C\), by RR we have \begin{align*} \chi(L(n)) = \deg L(n) + 1-g = nd + \deg(L) + 1 - g \end{align*} where \(d \mathrel{\vcenter{:}}=\deg {\mathcal{O}}_X(1)\) (which is very ample). Thus \(\deg (L)\) defines the Hilbert polynomial \(p_L(n)\) uniquely, and we often write \(\underline{\operatorname{Pic}}^d_{X/{\mathbf{C}}}\). More generally, if \({\mathcal{F}}\) is a rank \(r\) locally free sheaf on a curve \(C\), one obtains \begin{align*} \chi({\mathcal{F}}(n)) = nrd + \deg{\mathcal{F}}+ r(1-g) .\end{align*}
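Note that the leading term involves \(nd\), not \(n+d\): twisting a line bundle by \({\mathcal{O}}_X(n)\) adds \(n\deg {\mathcal{O}}_X(1) = nd\) to its degree.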
For \(S\) connected, the Hilbert polynomials \(p_{F_s}(n)\) are the same for all \(s\in S\).
For any field \(k\), not necessarily algebraically closed, of any characteristic, and for all projective varieties \(X\) over \(k\), the rigidified functor \(\underline{\operatorname{Pic}}_{X/k, h(n)}\) is represented by a scheme \(\operatorname{Pic}_{X/k, h(n)}\), i.e. \begin{align*} \underline{\operatorname{Pic}}_{X/k, h(n)}(S) { \, \xrightarrow{\sim}\, }\mathop{\mathrm{Hom}}_{{\mathsf{Sch}}}(S, \operatorname{Pic}_{X/k, h(n) }) .\end{align*} Moreover there exists a universal invertible sheaf \({\mathcal{U}}\) over \(\operatorname{Pic}_{X/k, h(n)}\) and the sheaves \(F\) on \(X \underset{\scriptscriptstyle {k} }{\times} S\) are all pullbacks:
Note that \(\operatorname{Pic}^0_{X/k}\) is a group variety and the other components are torsors over it. Since \([{\mathcal{O}}_X]\in \operatorname{Pic}_{X/k}\), one can compute \(\dim {\mathbf{T}}_{[{\mathcal{O}}_X]} \operatorname{Pic}_{X/k} = h^1({\mathcal{O}}_X)\), which is \(\dim \operatorname{Pic}_{X/k}\) if \(\operatorname{Pic}_{X/k}\) is reduced. Reducedness is automatic in characteristic zero, and the hypothesis is necessary since \(\operatorname{Spec}k[{\varepsilon}]/{\varepsilon}^2\) has dimension 0 but tangent space dimension 1.
Adapting this moduli problem to vector bundles: take the functor sending \(S\) to sheaves \(F\) on \(X \underset{\scriptscriptstyle {\operatorname{Spec}{\mathbf{C}}} }{\times} S\) which are flat over \(S\), noting that there is no way to rigidify in this case. Without any additional conditions, this leads to something horribly infinite. Consider \(X = {\mathbf{P}}^1\) and take \(F = {\mathcal{O}}(k) \oplus {\mathcal{O}}(-k)\), so \(\deg F = 0\) and \(\operatorname{rank}F = 2\), where \(k\in {\mathbf{Z}}\). This is an unbounded family, parameterized by an infinite discrete set \({\mathbf{Z}}_{\geq 0}\), so we need to restrict to nice vector bundles to exclude this case.
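The unboundedness can be seen numerically. The following is a minimal sketch, assuming only the standard values \(h^0({\mathbf{P}}^1; {\mathcal{O}}(m)) = \max(m+1, 0)\) and Serre duality on \({\mathbf{P}}^1\); the function names are made up for illustration. Every \(F_k = {\mathcal{O}}(k)\oplus{\mathcal{O}}(-k)\) has the same Hilbert polynomial \(2n+2\), while \(h^0(F_k)\) grows with \(k\), so no single twist works uniformly for the whole family.

```python
# Sketch: Hilbert polynomial versus h^0 for F_k = O(k) + O(-k) on P^1.
# All function names here are illustrative, not from the lecture.

def h0_P1(m):
    """h^0(P^1, O(m)): equals m + 1 for m >= 0, and 0 otherwise."""
    return max(m + 1, 0)

def h1_P1(m):
    """h^1(P^1, O(m)) = h^0(P^1, O(-m - 2)) by Serre duality."""
    return h0_P1(-m - 2)

def euler_char(degrees, n=0):
    """chi(F(n)) for a split bundle F = sum of O(d) over d in degrees."""
    return sum(h0_P1(d + n) - h1_P1(d + n) for d in degrees)

def h0(degrees, n=0):
    """h^0(F(n)) for the same split bundle."""
    return sum(h0_P1(d + n) for d in degrees)

for k in range(5):
    F = [k, -k]
    # Same values of chi(F(n)) for every k, but h^0(F) depends on k.
    print(k, [euler_char(F, n) for n in range(4)], h0(F))
```

The Euler characteristics agree for all \(k\), which is why fixing the Hilbert polynomial alone does not cut the family down to something bounded.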
If \(C\) is a curve and \(F\) is a vector bundle (a locally free sheaf of rank \(r\)), then \(F\) is stable (resp. semistable) if for any nonzero proper vector sub-bundle \(E\leq F\) there is an inequality \begin{align*} {\deg E\over \operatorname{rank}E} < {\deg F \over \operatorname{rank}F},\qquad \text{resp. } {\deg E\over \operatorname{rank}E} \leq {\deg F \over \operatorname{rank}F} .\end{align*} These quantities are called slopes, and this is sometimes referred to as slope stability.
There is a moduli space \(\left\{{\text{semistable sheaves}}\right\}/S{\hbox{-}}\text{equivalence} \supseteq\left\{{\text{stable sheaves}}\right\}/\cong\).
Let \(X\) be a projective variety equipped with \({\mathcal{O}}_X(1)\) and \(F\) a pure coherent sheaf, i.e. \(\mathop{\mathrm{supp}}F\) is pure-dimensional (equidimensional) and there does not exist a subsheaf \(0\neq G\leq F\) with \(\dim \mathop{\mathrm{supp}}G < \dim \mathop{\mathrm{supp}}F\). Then stability (resp. semistability) is the condition that for every nonzero proper subsheaf \(E\leq F\) and all \(n \gg 0\), \begin{align*} { p_E(n) \over \operatorname{rank}E} < {p_F(n) \over \operatorname{rank}F},\qquad\text{resp. } { p_E(n) \over \operatorname{rank}E} \leq {p_F(n) \over \operatorname{rank}F} ,\end{align*} i.e. the normalized Hilbert polynomials (dividing by the leading coefficients) satisfy this inequality.
Note that this definition still works for \(X\) a scheme, potentially non-reduced with many components. This is sometimes referred to as Seshadri stability.
One interesting case: \(X\) a curve but not irreducible. The moduli of invertible sheaves is already nontrivial, since subsheaves may only be defined on some irreducible components and thus not be invertible. Here \({\mathcal{O}}_X(1)\) may have different degrees on different components; as long as they are positive, \({\mathcal{O}}_X(1)\) is ample, and different polarizations yield different Jacobians and balancing these leads to interesting combinatorics.
Today: semistable sheaves on a projective variety \(X \subseteq {\mathbf{P}}^N\), where \({\mathcal{O}}_X(1)\) is the pullback of \({\mathcal{O}}_{{\mathbf{P}}^N}(1)\). Let \({\mathcal{F}}\in {\mathsf{Coh}}(X)\), e.g. a vector bundle (locally free of rank \(r\)) or a line bundle (vector bundle with \(r=1\)). Note \(X\) is covered by affine varieties \(\operatorname{Spec}R_i\) corresponding to rings \(R_i\), and on affine varieties, coherent sheaves correspond to finitely generated modules: on \(\operatorname{Spec}R\) one has \({\mathcal{F}}= \widetilde{M}\) for a finitely generated \(R{\hbox{-}}\)module \(M\).
For vector bundles, \(M\cong R^r\). By Serre vanishing, \begin{align*} H^{> 0}(X; {\mathcal{F}}(n)) = 0\,\, n \gg 0, \qquad \dim_k H^0(X; {\mathcal{F}}(n)) < \infty\,\,\forall n ,\end{align*} and Grothendieck vanishing yields \(H^{> \dim X}(X; {\mathcal{F}}) = 0\) for any \({\mathcal{F}}\in {\mathsf{QCoh}}(X)\). By Hirzebruch-Riemann-Roch, the Hilbert function \begin{align*} h_{\mathcal{F}}(n) \mathrel{\vcenter{:}}=\chi({\mathcal{F}}(n)) \mathrel{\vcenter{:}}=\sum (-1)^i h^i({\mathcal{F}}(n)) = h^0({\mathcal{F}}(n)),\,\, n\gg 0 \end{align*} is a polynomial in \(n\). This is proved by induction on \(\dim X\): writing \(Y \mathrel{\vcenter{:}}= X \cap H\) for a generic hyperplane \(H\), one gets \({\mathcal{F}}(-1)\hookrightarrow{\mathcal{F}}\twoheadrightarrow{ \left.{{{\mathcal{F}}}} \right|_{{Y}} }\) and hence \(h_{\mathcal{F}}(n) - h_{\mathcal{F}}(n-1) = h_{{\mathcal{F}}|_Y}(n)\). Thus it suffices to know \(h_{{\mathcal{F}}|_Y}\) is a polynomial, since the LHS is the discrete derivative of \(h_{\mathcal{F}}\), and \(\dim Y < \dim X\).
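As a sanity check of the discrete-derivative argument, here is a small sketch for the special case \({\mathcal{F}}= {\mathcal{O}}_{{\mathbf{P}}^N}\) and \(Y\) a hyperplane, so \(Y \cong {\mathbf{P}}^{N-1}\). It assumes the standard formula \(h^0({\mathbf{P}}^N; {\mathcal{O}}(n)) = \binom{n+N}{N}\) for \(n\geq 0\); the function names are hypothetical.

```python
from math import comb

def hilbert_PN(N, n):
    """h^0(P^N, O(n)) = C(n + N, N) for n >= 0: a polynomial in n of degree N."""
    return comb(n + N, N)

def discrete_derivative(N, n):
    """h(n) - h(n - 1) for the Hilbert function of O on P^N (n >= 1)."""
    return hilbert_PN(N, n) - hilbert_PN(N, n - 1)

# The discrete derivative on P^N is the Hilbert function of a hyperplane P^(N-1).
for N in range(1, 4):
    for n in range(1, 6):
        assert discrete_derivative(N, n) == hilbert_PN(N - 1, n)

print([hilbert_PN(2, n) for n in range(5)])  # values of (n+1)(n+2)/2
```

This is just Pascal's rule, but it matches the shape of the induction: a function whose discrete derivative is a polynomial of degree \(N-1\) is itself a polynomial of degree \(N\).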
We define the reduced Hilbert polynomial as \(\overline{h}_{\mathcal{F}}(n) \mathrel{\vcenter{:}}= h_{\mathcal{F}}(n)/h_n\) where \(h_n\) is the leading coefficient of \(h_{\mathcal{F}}(n)\).
Let \(X\) be a smooth curve of genus \(g\) and \({\mathcal{F}}\in {\mathsf{Coh}}(X)\) be locally free of rank \(r\). Then Riemann-Roch yields \begin{align*} \chi({\mathcal{F}}) = \deg({\mathcal{F}}) +r (1-g) = \deg({\mathcal{F}}) + \chi({\mathcal{O}}_X{ {}^{ \scriptscriptstyle\oplus^{r} } }) ,\end{align*} using that \(h^0({\mathcal{O}}_X) = 1, h^1({\mathcal{O}}_X) = g \implies \chi({\mathcal{O}}_X) = 1-g\). Twist by \(n\) to obtain \begin{align*} \chi({\mathcal{F}}(n)) = \deg({\mathcal{F}}) + nr \deg({\mathcal{O}}_X(1)) + r(1-g) = h_{\mathcal{F}}(n) ,\end{align*} which, after dividing by the leading coefficient \(r\deg {\mathcal{O}}_X(1)\), yields \begin{align*} \overline{h}_{\mathcal{F}}(n) &= n + {\deg {\mathcal{F}}\over \operatorname{rank}{\mathcal{F}}}{1\over \deg({\mathcal{O}}_X(1))} + { r(1-g)\over r\deg {\mathcal{O}}_X(1) } \\ &\mathrel{\vcenter{:}}= n + \mu({\mathcal{F}}) c_1 + c_2 \end{align*} where \(\mu({\mathcal{F}})\) is the slope and the \(c_i\) are constants that do not depend on \({\mathcal{F}}\).
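The reduction from reduced Hilbert polynomials to slopes can be checked with exact arithmetic. This is a minimal sketch under the Riemann-Roch formula \(\chi({\mathcal{F}}(n)) = nrd + \deg {\mathcal{F}}+ r(1-g)\) with \(d = \deg {\mathcal{O}}_X(1)\); the function names and the sample numbers are made up for illustration.

```python
from fractions import Fraction

def hilbert_poly(n, rank, degree, genus, d):
    """chi(F(n)) = n*r*d + deg F + r*(1 - g), where d = deg O_X(1)."""
    return n * rank * d + degree + rank * (1 - genus)

def reduced_hilbert_poly(n, rank, degree, genus, d):
    """Divide by the leading coefficient r*d of the Hilbert polynomial."""
    return Fraction(hilbert_poly(n, rank, degree, genus, d), rank * d)

def slope(rank, degree):
    return Fraction(degree, rank)

# Hypothetical data: a curve of genus 2 with deg O_X(1) = 3,
# a bundle F of (rank, degree) = (2, 1) and a sub-line-bundle E of degree 0.
g, d = 2, 3
E, F = (1, 0), (2, 1)
for n in range(5):
    diff = reduced_hilbert_poly(n, *F, g, d) - reduced_hilbert_poly(n, *E, g, d)
    # The difference is independent of n and equals (mu(F) - mu(E)) / d.
    assert diff == (slope(*F) - slope(*E)) / d
```

Since the difference of reduced Hilbert polynomials is a positive multiple of the difference of slopes, comparing one is the same as comparing the other.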
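A sheaf \({\mathcal{F}}\in {\mathsf{Coh}}(X)\) is Hilbert stable (resp. semistable) iff any nonzero proper subsheaf \({\mathcal{E}}\leq {\mathcal{F}}\) satisfies (a) \(\dim \mathop{\mathrm{supp}}{\mathcal{E}}= \dim \mathop{\mathrm{supp}}{\mathcal{F}}\), and (b) \(\overline{h}_{\mathcal{E}}(n) < \overline{h}_{\mathcal{F}}(n)\) (resp. \(\overline{h}_{\mathcal{E}}(n) \leq \overline{h}_{\mathcal{F}}(n)\)) for \(n \gg 0\).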
Note that condition (a) is automatic for locally free sheaves, since \({\mathcal{F}}_x = {\mathcal{O}}_X{ {}^{ \scriptscriptstyle\oplus^{r} } }\) for every \(x\) and thus \(\mathop{\mathrm{supp}}{\mathcal{F}}= X\). An example where this won’t hold: let \(i: p\hookrightarrow X\) for \(X\) a curve and take the skyscraper sheaf \(i_* {\mathcal{O}}_p\). More generally, for a closed embedding \(i: Z\hookrightarrow X\) one gets \(\mathop{\mathrm{supp}}(i_* {\mathcal{O}}_Z) \subseteq Z \subseteq X\). If \(X\) is a smooth curve, then any \({\mathcal{F}}\in {\mathsf{Coh}}(X)\) decomposes as \({\mathcal{F}}= {\mathcal{F}}_{\operatorname{tors}}\oplus {\mathcal{F}}_{{\mathrm{free}}}\) where \({\mathcal{F}}_{\operatorname{tors}}\) is a torsion sheaf which is a sum of skyscrapers supported at points and \({\mathcal{F}}_{\mathrm{free}}\) is locally free. The Hilbert polynomial will be constant on \({\mathcal{F}}_{\operatorname{tors}}\).
Hilbert stability for smooth curves \(X\): (a) holds iff \({\mathcal{F}}\) is torsionfree. If this holds, then since \(0\neq {\mathcal{E}}\subseteq {\mathcal{F}}\) with \({\mathcal{F}}\) torsionfree, \({\mathcal{E}}\) is locally free. Then condition (b) is equivalent to \(\mu({\mathcal{E}}) < \mu({\mathcal{F}})\), respectively \(\mu({\mathcal{E}}) \leq \mu({\mathcal{F}})\). Thus Hilbert stability for curves is slope stability.
For \(X={\mathbf{P}}^1\), Grothendieck splitting yields \({\mathcal{F}}= \bigoplus_{i=1}^r {\mathcal{O}}(n_i)\), so \(\mu({\mathcal{F}}) = {\deg{\mathcal{F}}\over\operatorname{rank}{\mathcal{F}}} = {\sum_{i=1}^r n_i \over r}\). Then \({\mathcal{F}}\) is stable iff \(r=1\), and semistable iff \(n_1 = \cdots = n_r\). E.g. \({\mathcal{F}}= {\mathcal{O}}\oplus {\mathcal{O}}(1)\) has slope \(1/2\) but contains \({\mathcal{O}}(1)\) of slope \(1\), so it is not semistable.
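The \({\mathbf{P}}^1\) case is simple enough to encode directly. The sketch below represents a bundle by its list of splitting degrees \(n_i\), which is an assumption that relies on Grothendieck splitting, and tests semistability by comparing the largest summand degree with the slope; the function names are illustrative.

```python
from fractions import Fraction

def slope(degrees):
    """Slope of O(n_1) + ... + O(n_r): total degree over rank."""
    return Fraction(sum(degrees), len(degrees))

def is_semistable_on_P1(degrees):
    # The summand O(max n_i) is a subbundle of slope max n_i, so semistability
    # forces max n_i <= slope, which happens exactly when all n_i are equal.
    return max(degrees) <= slope(degrees)

def is_stable_on_P1(degrees):
    # For r >= 2 a summand has slope >= the slope of the bundle, so only
    # line bundles are stable.
    return len(degrees) == 1

print(slope([0, 1]), is_semistable_on_P1([0, 1]))   # O + O(1)
print(slope([2, 2, 2]), is_semistable_on_P1([2, 2, 2]))
```

On a general curve no such shortcut exists, since one has to test all subbundles rather than just the summands of a splitting.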
Let \(X\) be a smooth elliptic curve. By a theorem of Atiyah, for each rank \(r\) there is a unique indecomposable vector bundle \({\mathcal{F}}\) of degree zero with a nonzero global section, and it is semistable. When \(r=1\), one can take \({\mathcal{F}}= {\mathcal{O}}_X\). For \(r=2\), \({\mathcal{O}}_X \oplus {\mathcal{O}}_X\) has slope zero but has subsheaf \({\mathcal{O}}_X\) of slope zero, so this is only semistable, and it is decomposable. Instead, \({\mathcal{F}}\) is the sheaf fitting into a non-split extension \({\mathcal{O}}_X \hookrightarrow{\mathcal{F}}\twoheadrightarrow{\mathcal{O}}_X\), which is again only semistable (not stable) since it contains \({\mathcal{O}}_X\). Such extensions \(A\to{\mathcal{F}}\to B\) are classified by \(\operatorname{Ext}^1(B, A) \cong H^1(X; A\otimes B^{-1})\), which here is \(H^1(X; {\mathcal{O}}_X)\), of dimension \(g = 1\). Note that \(\deg {\mathcal{F}}= \deg {\mathcal{O}}_X + \deg {\mathcal{O}}_X = 0+0=0\).
For \(g\geq 2\), there is a moduli space of stable and semistable vector bundles of fixed rank \(r\) and degree \(d\). This recovers \(\operatorname{Jac}\) for \(r=1\), and thus is a noncommutative generalization of \(\operatorname{Pic}\). If \(d=r(g-1)\) then one can define a theta divisor, and around 20 years ago there was an analog of the Riemann-Roch formula which computed its sections.
These moduli spaces exist, and the proof is by GIT: one reduces to an action of \({\operatorname{SL}}_n\) on a projective variety and applies the Hilbert-Mumford numerical criterion.
Two basic notions to discuss: \(S{\hbox{-}}\)equivalence on semistable sheaves, and the Harder-Narasimhan filtration.
The Harder-Narasimhan filtration: if \({\mathcal{F}}\) is semistable, do nothing; otherwise pick a maximal destabilizing subsheaf \({\mathcal{F}}_1 \subseteq {\mathcal{F}}_0\mathrel{\vcenter{:}}={\mathcal{F}}\) (i.e. a subsheaf of largest slope or reduced Hilbert polynomial). Continue this to obtain a decreasing filtration \(0 = {\mathcal{F}}_k \subseteq \cdots \subseteq {\mathcal{F}}_1 \subseteq {\mathcal{F}}_0 = {\mathcal{F}}\). Define the associated graded sheaf \({\mathsf{gr}\,}{\mathcal{F}}\mathrel{\vcenter{:}}=\bigoplus {\mathcal{F}}_i/{\mathcal{F}}_{i+1}\).
The result: the associated graded pieces \({\mathcal{F}}_i/{\mathcal{F}}_{i+1}\) are semistable with increasing slopes. E.g. take \(0 \subseteq {\mathcal{O}}(1) \subseteq {\mathcal{O}}\oplus {\mathcal{O}}(1)\) where \(\mu({\mathcal{O}}(1)) = 1\) and \(\mu({\mathcal{O}})=0\). If \({\mathcal{F}}\) is semistable, one instead filters by subsheaves which all have the same slope as \({\mathcal{F}}\) (a Jordan-Hölder filtration), and the pieces \({\mathcal{F}}_i / {\mathcal{F}}_{i+1}\) are stable.
Say \({\mathcal{F}}\sim_S {\mathcal{F}}'\) iff \({\mathsf{gr}\,}{\mathcal{F}}\cong {\mathsf{gr}\,}{\mathcal{F}}'\).
If \(X\) is an elliptic curve with \({\mathcal{F}}' \mathrel{\vcenter{:}}={\mathcal{O}}_X \oplus {\mathcal{O}}_X\), take \({\mathcal{O}}_X \hookrightarrow{\mathcal{F}}_\alpha \twoheadrightarrow{\mathcal{O}}_X\) corresponding to \(\alpha\in H^1({\mathcal{O}}_X) = {\mathbf{C}}\). There is a filtration \(0 \subseteq {\mathcal{O}}_X \subseteq {\mathcal{F}}_\alpha\) with graded pieces \({\mathcal{O}}_X, {\mathcal{O}}_X\), so \({\mathsf{gr}\,}{\mathcal{F}}_\alpha \cong {\mathcal{O}}_X\oplus {\mathcal{O}}_X\) and \({\mathcal{F}}_\alpha \sim_S {\mathcal{F}}'\) for every \(\alpha\).
This theory becomes complicated for singular or reducible varieties. Let \(X\) be two intersecting copies of \({\mathbf{P}}^1\), so \(X = X_1 \cup X_2\) with \(X_i= {\mathbf{P}}^1\):
Then \(d = \deg {\mathcal{O}}_X(1) = \deg {\mathcal{O}}_{X_1}(1) + \deg {\mathcal{O}}_{X_2}(1) = d_1 + d_2\). Define a multidegree \(\mathbf{d} = (d_1, d_2)\) and multirank \(\mathbf{r} = (r_1, r_2)\). Condition (a) requires \(\not\exists {\mathcal{E}}\subseteq {\mathcal{F}}\) with \(\mathop{\mathrm{supp}}{\mathcal{E}}= {\operatorname{pt}}\) (0-dimensional support). We have \begin{align*} 0 \to {\mathcal{O}}_{X_2}(-p) = \ker f \hookrightarrow{\mathcal{O}}_X \xrightarrow[]{f} { \mathrel{\mkern-16mu}\rightarrow }\, {\mathcal{O}}_{X_1}\to 0 ,\end{align*} where the kernel consists of functions on \(X\) which restrict to zero on \(X_1\), and hence vanish at \(p\). Such subsheaves are not locally free, despite \({\mathcal{O}}_X\) being locally free.
One can compute
which is no longer just degree over rank, and is called the Seshadri slope and generalizes slopes to curves with multiple irreducible components. This is interesting even in the case \(\mathbf{r}({\mathcal{F}}) = (1,1)\), since semistability is now nontrivial (whereas previously we used that line bundles have no sub-bundles). For simple singularities like nodes, there are numerical conditions on multidegrees to guarantee (semi)stability. Note that there are infinitely many degree zero sheaves, since \(\deg {\mathcal{F}}= \deg { \left.{{{\mathcal{F}}}} \right|_{{X_1}} } + \deg{ \left.{{{\mathcal{F}}}} \right|_{{X_2}} }\) which can be taken to be \(n\) and \(-n\) – however, there are only finitely many (semi)stable multidegrees.
Coming up: setting up the GIT problem matches up the notions of (semi)stability, and \(S{\hbox{-}}\)equivalence becomes orbit-closure equivalence.
See this very interesting paper posted today! https://arxiv.org/pdf/2211.07061.pdf
Next goal: constructing moduli spaces of stable sheaves and how to reduce it to GIT, after Seshadri, Narasimhan, Mumford (on curves), Gieseker (surfaces), Maruyama (higher dimensional varieties), Simpson (completed for higher-dimensional varieties). We’ll follow the treatment in Simpson’s 1994 paper, “Moduli of representations of fundamental groups…”.
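Setup: let \(X \subseteq {\mathbf{P}}^N\) be a projective variety with \({\mathcal{O}}_X(1)\) and \({\mathcal{E}}\in {\mathsf{Coh}}(X)\) with Hilbert polynomial \(p({\mathcal{E}}, n) = \chi({\mathcal{E}}(n))\). One can easily prove by induction that \(p\) is in fact a polynomial, and it turns out to have terms of the form \(p({\mathcal{E}}, n) = r {n^d\over d!} + a {n^{d-1}\over (d-1)!} + \cdots\). We define the dimension \(d({\mathcal{E}}) \mathrel{\vcenter{:}}= d = \dim \mathop{\mathrm{supp}}{\mathcal{E}}\), the rank \(r({\mathcal{E}}) \mathrel{\vcenter{:}}= r\), the slope \(\mu({\mathcal{E}}) \mathrel{\vcenter{:}}= a/r\), and the normalized Hilbert polynomial \(\overline{p}({\mathcal{E}}, n) \mathrel{\vcenter{:}}= p({\mathcal{E}}, n)/r({\mathcal{E}})\).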
We say \({\mathcal{E}}\) is pure dimensional iff it has no subsheaves of strictly smaller support, i.e. for all nonzero \({\mathcal{F}}\leq {\mathcal{E}}\), one has \(d({\mathcal{F}}) = d({\mathcal{E}})\). On affine schemes, this is Serre’s condition \(S_1\), and this says there are no embedded components (corresponding to embedded primes; take the primary decomposition).
For a pure sheaf on a curve with many irreducible components, there are no subsheaves supported only at a single point. If \(X = X_1\cup X_2\), \({\mathcal{O}}_X\) has the subsheaf \({\mathcal{O}}_{X_1}\cdot I_{X \setminus X_1}\). For a nodal curve, this yields \({\mathcal{O}}_{X_1}(-p)\), regular functions on \(X_1\) that vanish at \(p\):
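We say \({\mathcal{E}}\) is \(p{\hbox{-}}\)(semi)stable or Hilbert (semi)stable iff \({\mathcal{E}}\) is pure dimensional and for every nonzero proper subsheaf \({\mathcal{F}}\leq {\mathcal{E}}\) one has \(\overline{p}({\mathcal{F}}, n) < \overline{p}({\mathcal{E}}, n)\) (resp. \(\overline{p}({\mathcal{F}}, n) \leq \overline{p}({\mathcal{E}}, n)\)) for \(n\gg 0\), where \(\overline{p}\) denotes the Hilbert polynomial divided by its leading coefficient.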
For any Hilbert polynomial \(P(n)\), there exists a moduli space \(M({\mathcal{O}}_X, P)\) of semistable sheaves on \(X\) with \(p({\mathcal{E}}, n) = P(n)\), which contains a stable locus \(M({\mathcal{O}}_X, P)^\mathrm{st}\). This gives a bijection on points: \begin{align*} M({\mathcal{O}}_X, P) &\rightleftharpoons\left\{{\text{semistable } {\mathcal{E}}\in {\mathsf{Sh}}(X),\, p({\mathcal{E}}, n) = P(n) }\right\}{_{\scriptstyle / \sim} }\qquad \text{under ${\mathsf{gr}\,}{\hbox{-}}$equivalence} \\ M({\mathcal{O}}_X, P)^\mathrm{st}&\rightleftharpoons\left\{{\text{stable } {\mathcal{E}}\in {\mathsf{Sh}}(X),\, p({\mathcal{E}}, n) = P(n) }\right\} .\end{align*}
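This will essentially be a quotient by \({\operatorname{SL}}(V)\) for some \(V\). Let \({\mathcal{E}}\in {\mathsf{Coh}}(X)\); then Serre’s theorems yield an \(n_0\) such that for all \(n\geq n_0\), the twist \({\mathcal{E}}(n)\) is globally generated and \(H^{>0}(X; {\mathcal{E}}(n)) = 0\), so that \(h^0({\mathcal{E}}(n)) = p({\mathcal{E}}, n)\).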
Thus there is a surjection \begin{align*} 0 \to K \mathrel{\vcenter{:}}=\ker f \to H^0({\mathcal{E}}(n)) \otimes{\mathcal{O}}_X \xrightarrow[]{f} { \mathrel{\mkern-16mu}\rightarrow }\, {\mathcal{E}}(n) \to 0 .\end{align*} Note that a global section \(s\in {{\Gamma}\qty{{\mathcal{F}}} }\) is equivalent to a morphism \begin{align*} {\mathcal{O}}_X &\to {\mathcal{F}}\\ 1 &\mapsto s .\end{align*} Untwisting this surjection yields \begin{align*} 0 \to K(-n) \to H^0({\mathcal{E}}(n))\otimes{\mathcal{O}}_X(-n) \xrightarrow[]{\tilde f} { \mathrel{\mkern-16mu}\rightarrow }\, {\mathcal{E}}\to 0 .\end{align*}
Let \(V\in {}_{k}{\mathsf{Mod}}^{\mathrm{fd}}\) and \({\mathcal{W}}\in {\mathsf{Sh}}(X)\) (e.g. \({\mathcal{W}}={\mathcal{O}}_X\)), and define \(\operatorname{Hilb}(V\otimes{\mathcal{W}}, P)\) to be the moduli space of quotients \(V\otimes{\mathcal{W}}\to {\mathcal{E}}\to 0\) with \(p({\mathcal{E}}, n) = P(n)\), i.e. the scheme of quotient sheaves of \(V\otimes{\mathcal{W}}\). More generally, one can define \(\operatorname{Hilb}({\mathcal{G}}, P) = \operatorname{Quot}({\mathcal{G}}, P)\) to be the scheme of quotients \({\mathcal{G}}\to {\mathcal{E}}\to 0\) with \(p({\mathcal{E}}, n) = P(n)\).
\(\operatorname{Quot}({\mathcal{G}}, P)\) exists as a scheme and admits a universal family, yielding a fine moduli space. Moreover, one can embed it into some Grassmannian, yielding \(\operatorname{Quot}({\mathcal{G}}, P) \hookrightarrow{\operatorname{Gr}}_{r, N}\).
Note that if \(V\) is a vector space, every dimension \(r\) subspace yields a codimension \(r\) quotient, so \({\operatorname{Gr}}_{r, N}\) also parameterizes quotients, and we choose quotients as they are better behaved from a commutative algebraic POV.
Let \({\mathcal{G}}\) be fixed and consider quotients \({\mathcal{G}}\to {\mathcal{E}}\to 0\) as \({\mathcal{E}}\) varies. Take the sheaf kernel to obtain \begin{align*} 0\to K\to {\mathcal{G}}\to{\mathcal{E}}\to 0 ,\end{align*} then twist by \({\mathcal{O}}_X(n)\) for \(n\gg 0\) and take global sections to obtain \begin{align*} 0\to H^0(K(n)) \to H^0({\mathcal{G}}(n)) \to H^0( {\mathcal{E}}(n) ) \to 0 \end{align*} using Serre vanishing for \(H^1(K(n))\). This is a SES of vector spaces \(0\to V\to k^N \to U\to 0\) for some \(N\), and thus we get a point in the Grassmannian. We know \(\dim U = h^0({\mathcal{E}}(n)) = p({\mathcal{E}}, n)\) (the value of the Hilbert polynomial at \(n\), not its degree), and as we vary the quotients, \(p({\mathcal{E}}, n)\) does not vary. Note that one needs to show that the family of such quotients is bounded so that \(n\) can be chosen uniformly for all \({\mathcal{E}}\), which we’ll not prove here.
Conversely, given a quotient \(H^0({\mathcal{G}}(n))\twoheadrightarrow U\), can we produce a sheaf? Take the kernel of vector spaces to get a SES \begin{align*} 0\to K\to H^0({\mathcal{G}}(n)) \to U\to 0 ,\end{align*} where sections of \(K\) generate a subsheaf of \({\mathcal{G}}(n)\), say \({\mathcal{K}}(n)\mathrel{\vcenter{:}}= K\cdot {\mathcal{G}}(n) \leq {\mathcal{G}}(n)\). Untwisting yields \(0\to {\mathcal{K}}\xhookrightarrow{f} {\mathcal{G}}\to \operatorname{coker}f \to 0\). Again, once \(P\) is fixed, \(n\) can be chosen uniformly. This yields the embedding \(\operatorname{Quot}({\mathcal{G}}, P) \hookrightarrow{\operatorname{Gr}}_{r, N}\) as a closed subscheme, since it turns out that the locus of quotients \(U\) arising from sheaves is defined by polynomial equations and thus algebraic conditions.
Note that this is a closed subscheme, which is easier to handle than a closed subvariety: e.g. any equations define an ideal \(I\) and \(V(I)\) is a closed subscheme, say of \({\mathbf{A}}^n\), whereas it is a subvariety iff \(I = \sqrt I\). This is equivalent to asking if \(V(I)\) is reduced.
Note that \(V\mathrel{\vcenter{:}}= H^0({\mathcal{E}}(n))\) is fixed in the proof, and fixing this is equivalent to choosing a basis of \(H^0({\mathcal{E}}(n))\), so \(\operatorname{Quot}({\mathcal{G}}, P)\) encodes a choice of basis. To forget this choice, we need to quotient by change of basis, and we’ll have \(M({\mathcal{O}}_X, P) \mathrel{\vcenter{:}}=\operatorname{Quot}(V\otimes{\mathcal{O}}_X(-n), P) { \mathbin{/\mkern-6mu/}}{\operatorname{SL}}(V)\).
Note that the Grassmannian has a Plucker embedding into \({\mathbf{P}}^N\) for some large \(N\). We have \({\operatorname{SL}}(V)\curvearrowright\operatorname{Quot}({\mathcal{G}}, P)\), so we can apply the Hilbert-Mumford numerical criterion to the induced action \({\operatorname{SL}}(V)\curvearrowright{\mathbf{P}}^N\) – doing this precisely yields the (semi)stability criterion \(\overline{p}({\mathcal{F}}, n) \leq \overline{p}({\mathcal{E}}, n)\). The hard part will be boundedness – e.g. consider \(X\) a curve and the sheaves \({\mathcal{F}}= {\mathcal{O}}_X(n) \oplus {\mathcal{O}}_X(-n)\), which all have the same Hilbert polynomial and thus yield an unbounded family. Starting with semistable sheaves yields a bounded family. Maruyama handled boundedness for low dimensions and Simpson proved it for the remaining dimensions, so we’ll generally skip the boundedness issues when choosing \(n\).
Next time: the Plucker embedding, seeing what points look like under the embedding, and seeing the polynomial criterion drop out of the calculation. Later: moduli of varieties.
Today: a sketch of a proof of existence of a moduli space of semistable sheaves. Setup: let \(X\in\mathop{\mathrm{Proj}}{\mathsf{Var}}\) or \(X\in {\mathsf{Sch}}\), fix a Hilbert polynomial \(P(n)\), and fix \({\mathcal{E}}\in {\mathsf{Coh}}(X)\) with \(p({\mathcal{E}}, n) = P(n)\); we want to construct the moduli space \(M(X, P(n))\). Using that \({\mathcal{E}}(n)\) is globally generated for \(n > n_0 \gg 0\), there is a surjection \(H^0(X; {\mathcal{E}}(n)) \otimes{\mathcal{O}}_X \twoheadrightarrow{\mathcal{E}}(n)\) and thus a surjection \(H^0(X; {\mathcal{E}}(n) )\otimes{\mathcal{O}}_X(-n) \twoheadrightarrow{\mathcal{E}}\). Note \(V \mathrel{\vcenter{:}}= H^0(X; {\mathcal{E}}(n))\) is a vector space of dimension \(P(n)\), and setting \({\mathcal{W}}\mathrel{\vcenter{:}}={\mathcal{O}}_X(-n)\) there is a surjection of sheaves \(V\otimes{\mathcal{W}}\twoheadrightarrow{\mathcal{E}}\) making \({\mathcal{E}}\in \operatorname{Hilb}(V\otimes{\mathcal{W}}, P(n))\). Grothendieck embeds this into \({\operatorname{Gr}}(V\otimes W, P(n))\), and more generally \(\operatorname{Quot}({\mathcal{G}}, P(n))\hookrightarrow{\operatorname{Gr}}(G, a)\). At this point, \(\operatorname{Quot}\) includes the data of a choice of basis of \(V\), so we’ll quotient by an action \({\operatorname{SL}}(V)\curvearrowright V \leadsto {\operatorname{SL}}(V)\curvearrowright\operatorname{Hilb}(V\otimes{\mathcal{W}}, P(n))\).
We have \({\operatorname{SL}}(V)\curvearrowright{\operatorname{Gr}}(V\otimes W \to U_a)\hookrightarrow{\mathbf{P}}^N\), so we need a lift \({\operatorname{SL}}(V)\curvearrowright{\mathbf{P}}^N\) with a linearization \({\operatorname{SL}}(V)\curvearrowright{\mathbf{A}}^{N+1}\). In this situation, we’ll have the Hilbert-Mumford numerical criterion to check if \([V\otimes W\twoheadrightarrow U]\) is (semi)stable. The condition will turn out to be \begin{align*} \forall H \subseteq V,\qquad {\dim H \over \dim \operatorname{im}(H \otimes W \to U)} \leq {\dim V\over \dim U} .\end{align*}
Let \(V = H^0({\mathcal{E}}(n))\), then \(V\cdot {\mathcal{E}}= {\mathcal{E}}\), i.e. \(V\) spans the stalks, and any subspace \(H \subseteq V\) defines a subsheaf \({\mathcal{F}}\mathrel{\vcenter{:}}= H\cdot {\mathcal{E}}\leq {\mathcal{E}}\). The criterion yields \(\overline{p}_{\mathcal{F}}(n) \leq \overline{p}_{\mathcal{E}}(n)\). For a single sheaf \({\mathcal{E}}\), \(n\) depends on \({\mathcal{E}}\) and this is easy, but boundedness in families is difficult in general. To define the \({\mathbf{P}}^N\) appearing in the lemma, we’ll need to discuss Grassmannians.
For a fixed \(B\), a SES \(A\hookrightarrow B\twoheadrightarrow C\in {}_{k}{\mathsf{Mod}}\) of dimensions \(a,b,c\) respectively, note \({\operatorname{Gr}}_{a, b} = {\operatorname{Gr}}_{b, c}\) where the former parameterizes subspaces and the latter quotients. There are several levels of generality in which Grassmannians can be defined:
Let \(B = k^b\); how does one parameterize subspaces? Any subspace \(A\) has a basis \(A = \left\langle{v_1,\cdots, v_a}\right\rangle\). Fixing a basis \(k^b = \left\langle{e_1,\cdots, e_b}\right\rangle\), one can form a matrix \(M_B\in \operatorname{Mat}_{a\times b}(k)\) whose rows are the \(v_i\). There is an action \(\operatorname{GL}_a\curvearrowright M_B\) by left multiplication, corresponding to changing the basis of \(A\). Recall that Plucker coordinates are the components of \((P_I)\), the \(a\times a\) minors of \(M_B\), where \(I \subseteq \left\{{1,\cdots, b}\right\}\) with \({\left\lvert {I} \right\rvert} = a\) is an index set selecting columns. There are \(b\choose a\) such minors. We can regard \((P_I) \in {\mathbf{P}}^{{b\choose a} - 1} = {\mathbf{P}}( { {\bigwedge}^{\scriptscriptstyle \bullet}} ^a B)\) where \(B = \left\langle{e_1,\cdots, e_b}\right\rangle\) and \({ {\bigwedge}^{\scriptscriptstyle \bullet}} ^a B = \left\langle{e_{i_1} \wedge\cdots \wedge e_{i_a}}\right\rangle\). Writing the \(v_i\) in the \(e_i\) basis, their Plucker coordinate is \(v_1 \wedge\cdots \wedge v_a = \sum P_I e_I \in { {\bigwedge}^{\scriptscriptstyle \bullet}} ^a A \cong k \subseteq { {\bigwedge}^{\scriptscriptstyle \bullet}} ^a B\). Changing the basis of \(A\) by \(g\in \operatorname{GL}_a\) rescales every \(P_I\) by \(\det g\), so the point of projective space is well-defined.
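The Plucker coordinates of a small example can be computed by hand or by machine. The sketch below is illustrative only: the helper names and the sample matrices are made up, columns are indexed from 0 rather than 1, and the determinant is a naive Laplace expansion that is adequate only for small matrices. It checks that a change of basis of \(A\) rescales all coordinates by the determinant, and that a 2-plane in \(k^4\) satisfies the single quadratic Plucker relation cutting out \({\operatorname{Gr}}_{2,4}\) in \({\mathbf{P}}^5\).

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def plucker(rows):
    """All a x a minors of an a x b matrix, keyed by the chosen column set I."""
    a, b = len(rows), len(rows[0])
    return {I: det([[row[j] for j in I] for row in rows])
            for I in combinations(range(b), a)}

def change_basis(g, rows):
    """Left-multiply the a x b matrix `rows` by the a x a matrix g."""
    return [[sum(g[i][l] * rows[l][j] for l in range(len(rows)))
             for j in range(len(rows[0]))] for i in range(len(g))]

M = [[1, 0, 2, 3], [0, 1, 4, 5]]   # a 2-dimensional subspace of k^4
g = [[2, 1], [1, 2]]               # an invertible change of basis, det = 3
P = plucker(M)
Q = plucker(change_basis(g, M))

# Same point of projective space: every coordinate is scaled by det(g).
assert all(Q[I] == det(g) * P[I] for I in P)
# The Plucker relation for Gr(2, 4).
assert P[0, 1] * P[2, 3] - P[0, 2] * P[1, 3] + P[0, 3] * P[1, 2] == 0
```

The matrix `M` is already in the normalized form with an identity block on the left, so its remaining entries can be read off from the minors that use exactly one of the last two columns.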
The point \((P_I)\in {\mathbf{P}}^N\) determines \(A \subseteq B\) uniquely, so \({\operatorname{Gr}}_{a, b} \hookrightarrow{\mathbf{P}}^N\).
Let \(M\) be a matrix of rank \(a\), and change basis so that \(M\) is of the form \(M = [I | M']\), where entries of \(M'\) encode some of the Plucker coordinates. For example, \(M'_{0, 0}\) is the determinant of a certain submatrix:
We can also see that \({\operatorname{Gr}}_{a, b} = \cup{\mathbf{A}}^{a(b-a)}\) where each \({\mathbf{A}}^{a(b-a)} \hookrightarrow{\mathbf{A}}^N = \left\{{P_I\neq 0}\right\}\), yielding \({\operatorname{Gr}}_{a, b} \hookrightarrow{\mathbf{P}}^N = \bigcup_I {\mathbf{A}}^N\). There is in fact a closed embedding \(\bigcup{\mathbf{A}}^{a(b-a)} \hookrightarrow\bigcup{\mathbf{A}}^N\) given by algebraic equations.
\(\dim {\operatorname{Gr}}_{a, b} = ac = a(b-a)\) when parameterizing quotients.
What does the Hilbert-Mumford criterion say in this situation? Let \(K\hookrightarrow V\otimes W\twoheadrightarrow U\), and pick a basis to get \(P_I(K) = P_I(U)\) and \(v_{1}\wedge\cdots \wedge v_{a} = \sum p_{i_1, \cdots, i_a} e_{i_1} \wedge\cdots \wedge e_{i_a}\). Letting \(\left\{{f_i\otimes g_j}\right\}\) be a basis for \(V\otimes W\), consider how to linearize the action \({\operatorname{SL}}(V)\curvearrowright V\otimes W\): pick a \({\mathbf{G}}_m \subseteq {\operatorname{SL}}(V) \subseteq {\operatorname{SL}}(V\otimes W)\) so \(t.f_i = t^{r_i} f_i\) with \({ \operatorname{weight} }(f_i) = r_i\) and \(\sum r_i = 0\). Then \({ \operatorname{weight} }(f_i \otimes g_j) = r_i\) since there is no action on \(W\). Check that \begin{align*} { \operatorname{weight} }\qty{ (f_{i_1} \otimes g_{j_1} ) \wedge(f_{i_2}\otimes g_{j_2} ) \wedge\cdots \wedge( f_{i_a} \otimes g_{j_a} ) } = \sum_{s=1}^a r_{i_s} .\end{align*} Now the subspace \(K\to V\otimes W\) or quotient \(V\otimes W\to U\) is GIT stable (resp. semistable) iff for all \(\lambda: {\mathbf{G}}_m \to {\operatorname{SL}}(V)\), there exists some \(P_I\) such that
with \(P_I(K) \neq 0\), by the numerical criterion. This translates to having a nonzero \(a\times a\) minor for any choice of basis in \(V\).
Let \(V = \left\langle{f_1,\cdots, f_n}\right\rangle\), then there are subspaces \(H_{n-1} = \left\langle{f_2,\cdots, f_n}\right\rangle, H_{n-2} = \left\langle{f_3,\cdots, f_n}\right\rangle, \cdots, H_1 = \left\langle{f_n}\right\rangle\). This corresponds to an ordered list of weights \(r_1\leq r_2\leq \cdots \leq r_n\).
Try this in dimension 2, where \(V = \left\langle{f_1, f_2}\right\rangle\) with weights \(-r, r\) resp. and write \(V\otimes W = f_1 W \oplus f_2 W\). Check \({ \operatorname{weight} }\qty{ { {\bigwedge}^{\scriptscriptstyle \bullet}} (f_i\otimes g_j)} = \sum r_i = r\).
Lemma from Simpson: a point \(\qty{ K \hookrightarrow V\otimes W \twoheadrightarrow U} \in {\operatorname{Gr}}(V\otimes W, b)\) for the \({\operatorname{SL}}(V){\hbox{-}}\)action is semistable iff for all nonzero proper subspaces \(H \leq V\), \begin{align*} { \dim H\otimes W \over \dim \operatorname{im}(H\otimes W) } \leq {\dim V\otimes W\over \dim U} \iff {\dim H\otimes W\over \dim(H\otimes W) \cap K} \geq {\dim V\otimes W \over \dim K} ,\end{align*} and stable iff these inequalities are strict. Writing \({\mathbf{V}}\mathrel{\vcenter{:}}= V\otimes W\) and \({\mathbb{H}}\mathrel{\vcenter{:}}= H\otimes W\), we’ll consider \(0\to K\to {\mathbf{V}}\supseteq{\mathbb{H}}\) and pick a 1-parameter subgroup \(\lambda: {\mathbf{G}}_m\to {\operatorname{SL}}(V) \to {\operatorname{SL}}({\mathbf{V}})\) and apply the HM criterion.
Recall the setup: \(G\curvearrowright(X, L)\) is a linearized action with \(X\) projective and \(L\in \operatorname{Pic}^{\mathrm{amp}}(X)\). Then \(x\in X\) is semistable (resp. stable) iff for all nontrivial \(\lambda: {\mathbf{G}}_m\to G\) we have \(\mu^L( \lambda, x)\geq 0\) (resp. \(\mu^L(\lambda, x) > 0\)). This is defined as follows: the limit \(\overline{x} \mathrel{\vcenter{:}}=\lim_{t\to 0} \lambda(t).x \in X\) exists since \(X\) is projective and hence proper, and since \(\overline{x}\) is fixed by \({\mathbf{G}}_m\) we have \({\mathbf{G}}_m\curvearrowright L_{\overline{x}}\), so after picking a basis we have \(\lambda(t).z = t^r z\) for \(z\in L_{\overline{x}}\), and we define \(\mu^L(\lambda, x) \mathrel{\vcenter{:}}=-r\). Replacing \(L\) by some high power, we can assume \(L = {\mathcal{O}}_X(1)\) is very ample.
If \(X\hookrightarrow{\mathbf{P}}^n\) with \(L = { \left.{{ {\mathcal{O}}_{{\mathbf{P}}^n}(1) }} \right|_{{X}} }\), then \(G\curvearrowright{\mathbf{A}}^{n+1}\) linearly on coordinates. Diagonalizing the action of \(\lambda\) yields \(\lambda(t).x_i = t^{w_i} x_i\), and we can order the weights such that \(w_0 \geq w_1 \geq \cdots \geq w_n\). Writing \(x = {\left[ {x_0,\cdots, x_n} \right]}\) and letting \(k\) be the largest index with \(x_k\neq 0\), we can compute \(\overline{x} = \lim_{t\to 0} {\left[ { t^{w_0} x_0, \cdots, t^{w_n}x_n} \right]} = \lim_{t\to 0} t^{w_k}{\left[ {\cdots, x_k, 0,\cdots, 0} \right]}\). So \(\overline{x} = {\left[ {0,0,\cdots, 0,x_{j}, \cdots, x_k, 0,\cdots, 0} \right]}\), where only the coordinates with \(w_i = w_k\) survive. Recall \(x\) is unstable iff for some \(\lambda\) and a lift \(\hat x\in {\mathbf{A}}^{n+1}\) of \(x\) we have \(\overline{\lambda({\mathbf{G}}_m).\hat x}\ni (0, \cdots, 0)\) in \({\mathbf{A}}^{n+1}\), since then \(x\) would be orbit-closure equivalent to zero which is not a point in projective space.
Note that the fiber here is \(L_{\overline{x}}^{-1}= {\mathbf{C}}\overline{x} = {\mathbf{C}}\left\langle{\overline{x}_0, \overline{x}_1,\cdots, \overline{x}_k, \cdots, \overline{x}_n}\right\rangle\) (i.e. the line corresponding to \(\overline{x}\)). Here \(\lambda({\mathbf{G}}_m)\) acts with weight \(w_k\) and \(r=-w_k\) and \(-r=w_k = \mu(\overline{x}, \lambda)\). Does this match with the criterion above? Consider \(\lim_{t\to 0} \lambda(t).x \in {\mathbf{A}}^{n+1}\): \begin{align*} \lim_{t\to 0} \lambda(t).x = \lim_{t\to 0} (0,0, \cdots, ?,\cdots, t^{w_k} x_k, 0, \cdots, 0) ,\end{align*} and the bad case is \(w_k > 0\).
Let \(X = {\mathbf{P}}^2\), then \(\lim_{t\to 0} {\left[ {t^2 : t : t^{-3}} \right]} = \lim_{t \to 0} t^{-3}{\left[ {t^5: t^4 : 1} \right]} = {\left[ {0:0:1} \right]} \in {\mathbf{P}}^2\), which is how one generally shows \({\mathbf{P}}^n\) is proper (e.g. by applying the valuative criterion and choosing a uniformizing parameter \(t\)).
If \(\dim V = n\) with weights \(w_1,\cdots, w_n\), then \(\dim V\otimes W = n \dim W\) with weights \(w_1,\cdots, w_1, w_2,\cdots, w_2,\cdots, w_n,\cdots, w_n\) where each weight occurs with multiplicity \(\dim W\). On \({\mathbf{V}}\) pick coordinates \(x_1, \cdots, x_n\) with weights \(w_1,\cdots, w_n\) so \(\lambda(t) x_i = t^{w_i} x_i\). Embed \({\operatorname{Gr}}_{k ,n} \hookrightarrow{\mathbf{P}}( \bigwedge\nolimits^k {\mathbf{V}})\) with Plucker coordinates \(p_I = x_{i_1} \wedge\cdots \wedge x_{i_k}\). Then \(\lambda(t) .p_I = t^{\sum_{i\in I} w_i} p_I\) has weight \(w_{i_1} + \cdots + w_{i_k}\). We want \(\lim \lambda(t). K = \overline{K}\). For simplicity assume \(w_1 > w_2 > \cdots > w_n\) with strict inequalities. In \({\mathbf{P}}(V)\), if the last coordinate of \(p\) is nonzero, the limit is \({\left[ {0:\cdots:0:1} \right]}\).
Today: the HM criterion and a key computation for \({\operatorname{Gr}}_{k, n}\).
Setup: \(G\curvearrowright X \subseteq {\mathbf{P}}^n\), linearized to \(G\curvearrowright{\mathbf{A}}^{n+1}\). Take a 1PS \({\mathbf{G}}_m \xrightarrow{\lambda} G\) diagonalized (potentially after a change of coordinates) to \(t.{\left[ {x_0,\cdots, x_n} \right]} = {\left[ {t^{r_0} x_0,\cdots, t^{r_n} x_n} \right]}\). Define \begin{align*} \mu^{{\mathcal{O}}(1)} ( \lambda, p) \mathrel{\vcenter{:}}=\max\left\{{-r_i {~\mathrel{\Big\vert}~}i \text{ where } x_i(p)\neq 0 }\right\} .\end{align*}
A point \(p\in {\mathbf{P}}^n\) is semistable (resp. stable) if for all nonconstant 1PSs \(\lambda: {\mathbf{G}}_m\to G\), \begin{align*} \mu(\lambda, p)\geq 0 \qquad \text{resp. } \mu( \lambda, p) > 0 .\end{align*}
On the meaning: if \(\mu(\lambda, p) = -r_0\), then \(-r_0 \geq -r_i\) for all \(i\) with \(x_i(p) \neq 0\) and thus \(r_0 \leq r_i\). The good case: \(p\) is stable, then \(r_0 < 0\). The bad case: \(p\) is unstable, then \(r_0 > 0\). So we want at least one negative weight for \(p\) to be stable, and we can generally write \begin{align*} \mu( \lambda,p) = -\min\left\{{r_i {~\mathrel{\Big\vert}~}i \text{ where } x_i(p) \neq 0}\right\} .\end{align*} For unstable points, \(\lim_{t\to 0} (t^{r_i} x_i) = \mathbf{0}\) in \({\mathbf{A}}^{n+1}\), which does not come from a projective point. Considering \(\lim_{t\to 0} t.p\) in \({\mathbf{P}}^n\), since \({\mathbf{P}}^n\) is proper the limit exists and must equal \begin{align*} \lim {\left[ {t^{r_0}x_0:\cdots:t^{r_n} x_n} \right]} = \lim {\left[ {x_0: t^{r_1-r_0}x_1:\cdots:t^{r_n-r_0}x_n} \right]} = {\left[ {x_0:\cdots:x_r:0:\cdots:0} \right]} \end{align*} where there are zeros if either \(x_i(p) = 0\) or \(r_i -r_0 >0\). In \({\mathbf{A}}^{n+1}\), consider the line \(C_{\overline{p}} = {\mathbf{C}}(x_0,\cdots, x_r, 0,\cdots, 0)\), on which \({\mathbf{G}}_m\) acts by multiplication by \(t^{r_0}\). So the weight is \(r_0 = -\mu( \lambda,p)\), and thus we can define \(\mu(\lambda,p)\) in another way: let \(\overline{p} = \lim_{t\to 0} t.p\) in \({\mathbf{P}}^n\), then \(\overline{p}\) is fixed by \({\mathbf{G}}_m\). There is an action on the fiber \({\mathbf{G}}_m\curvearrowright{\mathcal{O}}(1)\mathrel{\Big|}_{\overline{p}} = C_{\overline{p}}\) with some weight \(r_0\), so define \(\mu(\lambda,p) \mathrel{\vcenter{:}}=-r_0\).
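The definition \(\mu(\lambda, p) = \max\{-r_i \mid x_i(p)\neq 0\}\) and the description of the limit point are easy to experiment with. The following is a minimal sketch, assuming the 1PS has already been diagonalized; the function names `mu` and `limit_point` are ours and purely illustrative.

```python
def mu(weights, point):
    """Hilbert-Mumford weight max{-r_i : x_i(p) != 0} for a diagonal 1PS."""
    support = [r for r, x in zip(weights, point) if x != 0]
    if not support:
        raise ValueError("the zero vector is not a point of projective space")
    return max(-r for r in support)


def limit_point(weights, point):
    """Limit of t.p in P^n as t -> 0: only minimal-weight coordinates survive."""
    r_min = -mu(weights, point)
    return [x if r == r_min else 0 for r, x in zip(weights, point)]


# lambda(t) = diag(t^2, t, t^-3) acting on p = [1 : 1 : 1]
weights, p = [2, 1, -3], [1, 1, 1]
print(mu(weights, p))           # 3
print(limit_point(weights, p))  # [0, 0, 1]
```

This reproduces the \({\mathbf{P}}^2\) limit \({\left[ {0:0:1} \right]}\) computed earlier; dropping the last coordinate of \(p\) makes \(\mu\) negative, so that point is destabilized by this particular \(\lambda\).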
Now let \(G\curvearrowright{\operatorname{Gr}}_{k, n} = \left\{{K_k \subseteq {\mathbf{C}}^n}\right\}\), where we’re choosing to work with subspaces instead of quotient spaces. This yields a SES \(K\hookrightarrow{\mathbf{C}}^n\twoheadrightarrow V\). Note that \({\operatorname{Gr}}_{k, n} \subseteq {\mathbf{P}}(\bigwedge\nolimits^k {\mathbf{C}}^n) = {\mathbf{P}}^{N-1}\) where \(N = {n\choose k}\) by the Plucker embedding. This yields a linear action \(G\curvearrowright{\mathbf{A}}^N\). Take \(\lambda: {\mathbf{G}}_m \to G\) a 1PS, then the key computation is finding \(\mu(\lambda,[ K])\) for \([K] \in {\operatorname{Gr}}_{k, n}\).
First diagonalize \({\mathbf{G}}_m\curvearrowright{\mathbf{C}}^n\) to get \(t.\mathbf{x} =\operatorname{diag}(t^{r_1},\cdots, t^{r_n}) \mathbf{x}\). Then \(G\curvearrowright{\mathbf{C}}^N\) through Plucker coordinates \(p_I\) for \(I = \left\{{i_1,\cdots, i_k}\right\} \subseteq \left\{{1,\cdots, n}\right\}\). The weight of \(t.p_I\) is \(r(p_I) \mathrel{\vcenter{:}}=\sum_{i\in I} r_i = r_{i_1} + \cdots + r_{i_k}\), and so \begin{align*} \mu( \lambda, [K] ) = \max\left\{{-r(p_I) {~\mathrel{\Big\vert}~}p_I(K) \neq 0}\right\} .\end{align*}
Assume \(r_1 > r_2 > \cdots > r_n\), then \begin{align*} \mu( \lambda, [K] ) = -k r_n - \sum_{i=1}^{n-1} \qty{ r_i - r_{i+1} } \dim \qty{ K \cap L_{i+1,\cdots, n} } \end{align*} where \(L_{i+1,\cdots, n} \leq {\mathbf{C}}^n\) is the subspace \(\left\{{x_{i+1} = \cdots = x_n = 0}\right\}\). Take a basis of \(K \subseteq {\mathbf{C}}^n\) and represent it as the rows of a \(k\times n\) matrix. Reduce this to echelon form, but slightly reversed to emphasize the last vector:
Taking the limit of \(t.K\) in \({\mathbf{P}}^{N-1}\) yields the following:
This follows from writing \(t^{r_n} e_n + t^{r_{n-1}} c_{n-1} e_{n-1} + \cdots = t^{r_n} \qty{e_n + t^{r_{n-1} - r_n}c_{n-1} e_{n-1} + \cdots }\). Labeling the pivots \(i_1,\cdots, i_k\) from right to left, we have \begin{align*} \mu &= -r_{i_1} - \cdots - r_{i_k} \\ &= - \sum_{i=1}^n \qty{\dim (K \cap L_{i+1,\cdots, n}) - \dim( K \cap L_{i, i+1, \cdots, n} )} r_i \\ &= -r_n \qty{ \dim (K \cap L_\emptyset) - \dim(K \cap L_{ n })} ,\qquad L_n \mathrel{\vcenter{:}}=\left\{{x_n = 0}\right\}, L_{\emptyset} \mathrel{\vcenter{:}}={\mathbf{C}}^n \\ &\quad - r_{n-1}\qty{ \dim (K \cap L_{n}) - \dim (K \cap L_{n-1, n}) } \\ &\quad -\cdots ,\end{align*} where we note that e.g. \(-(\dim K \cap L_n)(r_{n-1} - r_n)\) appears, yielding the \(r_{i} - r_{i+1}\) terms in the sum.
Simpson considers \({\operatorname{SL}}(V)\curvearrowright{\operatorname{Gr}}(K\hookrightarrow V \otimes W \twoheadrightarrow U)\), and comes up with a precise formula that enforces an upper bound on \(\dim(K \cap H\otimes W)\). This follows from Mumford’s formula: write \(V\otimes W \cong \bigoplus _{i=1}^{\dim V} W\), then \({\mathbf{G}}_m\) acts on this with weights ???. Note that if \(r_i = r_{i+1}\) then \(\dim (K \cap L_{i+1,\cdots, n})\) doesn’t contribute to \(\mu\). The critical case is when \(r_1 = \cdots = r_1 > r_2 = \cdots = r_2\). Note that the \(r_i\) form a cone and \(\mu\) is a linear function on it, and it suffices to check on rays.
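As a sanity check on the key computation, one can compare the two descriptions of \(\mu(\lambda,[K])\): the maximum of \(-r(p_I)\) over nonvanishing Plucker coordinates, and the closed formula in terms of \(\dim (K\cap L_{i+1,\cdots,n})\). The sketch below assumes \(K\) is given by \(k\) linearly independent rows with rational entries and that the weights are strictly decreasing; the helper names (`det`, `rank`, `mu_plucker`, `mu_flag`) are ours.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[c])]
    return sign * result


def rank(m):
    """Rank by row reduction over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    rk = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                factor = m[r][c] / m[rk][c]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def mu_plucker(weights, K):
    """max{-r(p_I) : p_I(K) != 0} over k-element column subsets I."""
    k, n = len(K), len(weights)
    return max(
        -sum(weights[i] for i in I)
        for I in combinations(range(n), k)
        if det([[row[i] for i in I] for row in K]) != 0
    )


def mu_flag(weights, K):
    """-k r_n - sum_i (r_i - r_{i+1}) dim(K cap {x_{i+1} = ... = x_n = 0})."""
    k, n = len(K), len(weights)
    total = -k * weights[-1]
    for i in range(1, n):
        # The intersection is the kernel of projecting K onto the last n-i coordinates.
        d_i = k - rank([row[i:] for row in K])
        total -= (weights[i - 1] - weights[i]) * d_i
    return total


weights = [3, 1, -1, -3]
K = [[1, 0, 1, 0], [0, 1, 0, 0]]
print(mu_plucker(weights, K), mu_flag(weights, K))  # 0 0
```

Both computations agree on this example, which is what the telescoping argument above predicts.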
For moduli of K3s, see Viehweg on GIT for \((X, L)\) with \(X\) smooth or with canonical singularities, \(L\) ample, \(K_X\) nef – note that this doesn’t provide a compactification.
Last time: moduli of sheaves on a fixed variety, a linear case where GIT works very well by reducing to a computation on a Grassmannian.
Recall \({\mathcal{M}_g}\) is the moduli of smooth projective genus \(g\) curves, and for \(g\geq 2\) it is known that \(\dim {\mathcal{M}_g}= 3g-3\) and that \({\mathcal{M}_g}\) is locally a quotient of a smooth variety by a finite group, and is thus a smooth orbifold/stack. It is quasiprojective but not projective and not complete. One would like an inclusion \({\mathcal{M}_g}\hookrightarrow\overline{{\mathcal{M}_g}}\) into a projective variety with mild singularities such that points in \({{\partial}}\overline{{\mathcal{M}_g}}\) correspond to curves with certain singularities. This is constructed by Deligne-Mumford, and is locally \(U/G\) where \(U\) is smooth and \(G\in{\mathsf{Fin}}{\mathsf{Grp}}\). Moreover \({{\partial}}\overline{{\mathcal{M}_g}}= \bigcup_i D_i/G\) with \(D_i\) smooth and SNC, and points in \({{\partial}}\overline{{\mathcal{M}_g}}\) correspond to DM-stable curves:
Let \(C = \bigcup_i C_i\) be a connected reduced projective curve, then \(C\) is DM-stable iff
Why these three conditions are the same: recall \(\mathop{\mathrm{Aut}}{\mathbf{P}}^1 = \operatorname{PGL}_2\) which has dimension 3. Note that \(\mathop{\mathrm{Aut}}E\cong E \rtimes G\) for \(G\in {\mathsf{Fin}}{\mathsf{Grp}}\), usually \(G\cong C_2\), and \(\dim \mathop{\mathrm{Aut}}E = 1\). Consider the nodal curve \(C\), equivalent to \({\mathbf{P}}^1/ 0\sim\infty\), so \(\mathop{\mathrm{Aut}}(C) = {{\mathbf{C}}^{\times}}\rtimes C_2\) which again has dimension 1. If \(g(C) \geq 2\) then \({\sharp}\mathop{\mathrm{Aut}}(C) < \infty\). The 3 in condition b is due to the need to fix 3 points, to drop \(\dim \operatorname{PGL}_2\) from dimension 3 to zero.
If \(X\) is Gorenstein then \(\omega_X\in \operatorname{Pic}(X)\) and can be written \(\omega_X = {\mathcal{O}}(K_X)\) where \(K_X\) is defined to be the canonical class. This holds if e.g. \(X\) has hypersurface singularities. Note that \(\omega_X\) is ample iff \(\deg { \left.{{ \omega_X }} \right|_{{C_i}} } > 0\) for each \(C_i\), and by adjunction one has \begin{align*} \deg { \left.{{\omega_X}} \right|_{{C_i}} } = \deg \omega_{C_i} + {\sharp}\qty{ C_i \cap(C\setminus C_i) } = 2p_a(C_i) - 2 + {\sharp}\qty{ C_i \cap(C\setminus C_i) } .\end{align*} Thus if \(p_a(C_i) \geq 2\) this is always positive; if \(p_a(C_i) = 0\) then \(C_i = {\mathbf{P}}^1\) and positivity requires at least 3 intersection points, while if \(p_a(C_i) = 1\) it requires at least 1, and we get the curves appearing in condition b.
Consider a family \({\mathcal{C}}\to \Delta^\circ\) of smooth projective curves. By the semistable reduction theorem, after a finite base change \(\Delta'\to\Delta\) any family can be completed to \(X'\to \Delta'\) such that \(X'\) is smooth and \(X_0'\) is SNC:
Note that \(\deg { \left.{{\omega_X}} \right|_{{C_i}} } \leq 0 \iff C_i = {\mathbf{P}}^1\) and either \({\sharp}\qty{C_i \cap(C\setminus C_i)} = 1\) (degree \(-1\)) or \({\sharp}\qty{C_i \cap(C\setminus C_i)} = 2\) (degree \(0\)). We have \(C\cdot C_i = 0\), since \(C\) can be replaced with a disjoint fiber \(F\). Writing \(0 = C\cdot C_i = C_i^2 + (C-C_i)C_i\), we get \(C_i^2 = -1\) in the first case and \(C_i^2 = -2\) in the second case. In the first case, \(C_i\) can be contracted to yield \(X\to X_1\) (Castelnuovo’s lemma) with \(X_1\) smooth, so we can get rid of \(-1\) curves in stages to get a new surface with (potentially) only \(-2\) curves, which is a minimal model and is smooth. Contracting all \(-2\) curves yields the canonical model, which may be singular but has only canonical singularities. Note that \(X\) has canonical singularities iff \((X, X_0)\) has slc singularities iff \(X_0\) has slc singularities, which is a form of log adjunction for degenerating pairs.
See the BCHM paper.
So it’s clear how to degenerate in one parameter families, but how does one organize these various limits into a compactification? The idea is to construct \({\mathcal{M}_g},\overline{{\mathcal{M}_g}}\) using GIT to realize them as \(H/\operatorname{PGL}_n\) where \(H\) is a moduli of curves with additional data. The HM criterion gives stable and semistable points, and one hopes these coincide with the above notions.
What is \(H\)? Two answers: the Chow variety, or the Hilbert scheme. Start with \(C\) a smooth curve of genus \(g\geq 2\) with \(\deg K_C = 2g-2 \geq 2\) so that \(K_C\) is ample. Then \(nK_C\) is very ample for any \(n\geq 3\), and there is an embedding \(C \xhookrightarrow{{\left\lvert {nK_C} \right\rvert}} {\mathbf{P}}^N\). Such an embedding is given by a choice of basis of \(H^0(C; {\mathcal{O}}(nK_C))\), where two bases differ by a \(\operatorname{PGL}{\hbox{-}}\)action. Note that \(N = n(2g-2) - (g-1) - 1\) by Riemann-Roch.
The Chow variety \({\operatorname{CH}}(d_1, d_2, {\mathbf{P}}^N)\) parameterizes cycles of dimension \(d_1\) and degree \(d_2\) in \({\mathbf{P}}^N\). For example, \({\operatorname{CH}}(1, n(2g-2), {\mathbf{P}}^N) \hookrightarrow{\mathbf{P}}H^0({\operatorname{Gr}}, {\mathcal{O}}(k))\) for some \(k\) which sends a cycle to a certain hypersurface. Since \(\operatorname{PGL}\) acts on the latter, it acts on the former, and there is a notion of Chow stability.
For \(n\geq 4\), DM curves are Chow stable
The Hilbert scheme is preferable since Chow doesn’t have an immediate deformation theory. We can take a scheme parameterizing closed embeddings \(Z\hookrightarrow{\mathbf{P}}^N\) with a given Hilbert polynomial; recalling \(p_Z(x) = \chi({\mathcal{O}}_Z(x))\) and setting \(p_Z(x) = n(2g-2)x + (1-g)\), consider \(\operatorname{Hilb}({\mathbf{P}}^N, p_Z)\). Constructing this scheme: for \(m\gg 0\), the surjection \(H^0({\mathcal{O}}_{{\mathbf{P}}^N}(m) ) \twoheadrightarrow H^0({\mathcal{O}}_Z(m))\) defines a point \(g_m\in {\operatorname{Gr}}\) as the codomain varies. Mumford proves there exists an \(m_0\) such that \((g_{m_0}, g_{m_0+1})\) defines \(Z\) uniquely, although \(m_0\) is not canonically defined. Thus \(\operatorname{Hilb}\) embeds into a product of Grassmannians, and there is a notion of asymptotic Hilbert stability in terms of growth in \(m_0\), where one takes the leading term. One shows that for \(n\geq 4\), DM-stable curves are asymptotically Hilbert stable in this sense. This almost completely fails for surfaces.
The generalization: algebraic spaces, \(H/G\) or more generally \(H/R\) for \(R\subset H\times H\) an equivalence relation. By Artin, these exist, and they are natural to consider for non-polynomial equations like \(y = \sqrt{x^3+x+1}\). The construction of moduli spaces as algebraic spaces is easy; one then tries to prove they are projective. See KSB varieties and KSBA pairs.
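To get a feel for the numbers involved, here is a small sketch that tabulates the degree of the \(n{\hbox{-}}\)canonical embedding, the dimension \(N\) of the ambient projective space from Riemann-Roch, and the coefficients of the Hilbert polynomial \(p_Z(x) = n(2g-2)x + (1-g)\). The function name `pluricanonical_data` and the returned dictionary layout are illustrative choices, not anything from the lecture.

```python
def pluricanonical_data(g, n):
    """Numerical data for the embedding of a genus g curve by |nK_C|."""
    if g < 2 or n < 2:
        raise ValueError("expects genus g >= 2 and n >= 2")
    degree = n * (2 * g - 2)
    # Riemann-Roch with h^1(nK_C) = 0 for n >= 2: h^0 = deg - g + 1.
    h0 = degree - (g - 1)
    return {
        "degree": degree,
        "N": h0 - 1,                      # C embeds in P^N
        "hilbert_poly": (degree, 1 - g),  # p_Z(x) = degree * x + (1 - g)
    }


data = pluricanonical_data(g=2, n=3)
slope, constant = data["hilbert_poly"]
print(data["N"], slope * 1 + constant)  # 4 5
```

Note that \(p_Z(1) = N + 1\), as expected: the Hilbert polynomial at \(1\) counts the sections used for the embedding.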
Denote by \(\mu_n \leq {\operatorname{SL}}_2({\mathbf{C}})\) the subgroup generated by \(M \mathrel{\vcenter{:}}={ \begin{bmatrix} {{\varepsilon}} & {0} \\ {0} & {{\varepsilon}^{-1}} \end{bmatrix} }\) for \({\varepsilon}\) a primitive \(n\)th root of unity, and consider its action \(\mu_n\curvearrowright{\mathbf{C}}[x,y]\) restricted from the standard action \({\operatorname{SL}}_2({\mathbf{C}}) \curvearrowright{\mathbf{C}}[x,y]\). Explicitly, this can be written geometrically as \begin{align*} \mu_n &\curvearrowright{\mathbf{A}}^2 \\ M.(x, y) &= ({\varepsilon}x, {\varepsilon}^{-1}y) .\end{align*}
Write a general polynomial in \({\mathbf{C}}[x,y]\) as \(f(x,y) = \sum_{i, j \geq 0} c_{ij} x^i y^j\), then under the action of \(\mu_n\) we have \begin{align*} M.f(x,y) = \sum_{i, j \geq 0} c_{ij} ({\varepsilon}x)^i ({\varepsilon}^{-1}y)^j = \sum_{i, j \geq 0} c_{ij} {\varepsilon}^{i-j} x^i y^j .\end{align*} The polynomial \(f\) will be in the invariant subring \({\mathbf{C}}[x,y]^{\mu_n}\) if and only if \(M.f = f\), and equating coefficients in the above expression imposes the condition that for fixed \(i,j\), either \(c_{ij} = 0\) or \({\varepsilon}^{i-j} = 1\), i.e. \(n\divides i-j\).
Inspecting such polynomials, if \(n\) is even one can find \begin{align*} a(x,y) \mathrel{\vcenter{:}}=(xy)^{n\over 2}, \qquad b(x,y) \mathrel{\vcenter{:}}= xy ,\end{align*} from which the relation \(a^2 = b^n\) is readily seen to hold. If \(n\) is odd, no such invariants exist – this follows from writing \begin{align*} a(x,y) = a_{n, 0}x^n + a_{0, n}y^n, \qquad b(x,y) = b_{2,0} x^2 + b_{1,1}xy + b_{0, 2} y^2 \end{align*} and setting \(a^2-b^n = 0\), which yields \begin{align*} 0 &= 2 \, a_{0n} a_{n0 } x^{n} y^{n} + a_{n0} ^{2} x^{2 \, n} + a_{0n}^{2} y^{2 \, n} - {\left(b_{20} x^{2} + b_{11} x y + b_{02} y^{2}\right)}^{n} \\ &= 2 \, a_{0n} a_{n0 } x^{n} y^{n} + a_{n0} ^{2} x^{2 \, n} + a_{0n}^{2} y^{2 \, n} - \sum_{i+j+k=n} {n\choose i,j,k} b_{20}^i b_{11} ^j b_{02}^k \,\, x^{2i} (xy)^{j} y^{2k} \\ &= 2 \, a_{0n} a_{n0 } x^{n} y^{n} + a_{n0} ^{2} x^{2 \, n} + a_{0n}^{2} y^{2 \, n} - \sum_{i+j+k=n} {n\choose i,j,k} b_{20}^i b_{11} ^j b_{02}^k \,\, x^{2i+j} y^{2k+j} ,\end{align*} where we’ve taken a general trinomial expansion. Setting \((i,j,k) = (1,0,n-1)\) shows \(b_{20} = 0\), and similarly setting \((0,1,n-1)\) forces \(b_{11} = 0\) and \((n-1,0,1)\) forces \(b_{02} = 0\).
The isomorphism with \(D_{2n}\): Let \(BD_{4n} \mathrel{\vcenter{:}}=\left\langle{R, S}\right\rangle\) where \begin{align*} R \mathrel{\vcenter{:}}={ \begin{bmatrix} {{\varepsilon}} & {0} \\ {0} & {{\varepsilon}^{-1}} \end{bmatrix} },\quad {\varepsilon}^{2n} = 1,\qquad S = { \begin{bmatrix} {0} & {1} \\ {-1} & {0} \end{bmatrix} } .\end{align*}
To see that \(BD_{4n}\) has order exactly \(4n\), we can start listing elements.
We can also note that \(S^2 = -I = R^n\), so the sets \(\left\{{ S^2 R^k {~\mathrel{\Big\vert}~}k\geq 0}\right\}, \left\{{S^3 R^k {~\mathrel{\Big\vert}~}k\geq 0}\right\}\) are redundant and exhaust all possibilities for elements in this group, since \(S, R\) satisfy \(SR = R^{-1}S\) and \(R^n = - I\) occurs in the first subset.
To see that the image of \(BD_{4n}\) in \({\operatorname{SO}}_3({\mathbf{R}})\) is isomorphic to \(D_{2n}\), note that the subgroup \(BD_{4n}\) already lies in \({\operatorname{SU}}_2\), viewed as a subgroup of \({\operatorname{SL}}_2({\mathbf{C}})\), and so we look for a map \({\operatorname{SU}}_2 \to {\operatorname{SO}}_3({\mathbf{R}})\). For this, we can use the following isomorphism to the unit quaternions \(Q^{\times}\): \begin{align*} F_1: {\operatorname{SU}}_2 &\to Q^{\times}\\ { \begin{bmatrix} {a+bi} & {-c-di} \\ {c-di} & {a-bi} \end{bmatrix} } &\mapsto a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} .\end{align*} Unit quaternions can be mapped to rotation matrices using the following well-known formula: \begin{align*} F_2: Q^{\times}&\to {\operatorname{SO}}_3({\mathbf{R}}) \\ a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} &\mapsto \left[\begin{array}{lll} 1-2 \left(c^2+d^2\right) & 2 \left(b c-d a\right) & 2 \left(b d+c a\right) \\ 2 \left(b c+d a\right) & 1-2 \left(b^2+d^2\right) & 2 \left(c d-b a\right) \\ 2 \left(b d-c a\right) & 2 \left(c d+b a\right) & 1-2 \left(b^2+c^2\right) \end{array}\right] .\end{align*} So we can use \(\Phi \mathrel{\vcenter{:}}= F_2 \circ F_1: {\operatorname{SU}}_2\to {\operatorname{SO}}_3({\mathbf{R}})\) and investigate the image.
A computation shows that \begin{align*} \Phi(S) = F_2(-1\mathbf{j}) = { \begin{bmatrix} {-1} & {0} & {0} \\ {0} & {1} & {0} \\ {0} & {0} & {-1} \end{bmatrix} } \implies \Phi(S)^2 = I ,\end{align*} and, since \({\varepsilon}= e^{\pi i/n}\) is a primitive \(2n\)th root of unity, \begin{align*} \Phi(R) &= \Phi\qty{{ \begin{bmatrix} {a_n + i b_n} & {0} \\ {0} & {a_n - i b_n} \end{bmatrix} }}, \qquad a_n = \cos(\pi/n),\, b_n = \sin(\pi/n) \\ \\ &= F_2(a_n\mathbf{1} + b_n\mathbf{i}) \\ \\ &= { \begin{bmatrix} {1} & {0} & {0} \\ {0} & {1-2b_n^2} & {-2a_n b_n} \\ {0} & {2a_n b_n} & {1-2b_n^2} \end{bmatrix} } \\ \\ &= { \begin{bmatrix} {1} & {0} & {0} \\ {0} & {\cos(2\pi/n)} & { -\sin(2\pi/n) } \\ {0} & {\sin(2\pi/n)} & {\cos(2\pi/n)} \end{bmatrix} } \\ \\ &= \left[\begin{array}{ c | c } I & 0 \\ \hline 0 & R_{2\pi/n} \end{array}\right] ,\end{align*} where \(R_\theta \in {\operatorname{SO}}_2({\mathbf{R}})\) is the rotation by \(\theta\) matrix and we have applied several double angle formulas. In this form, we can easily check \begin{align*} \Phi(R)^n = \left[\begin{array}{ c | c } I^n & 0 \\ \hline 0 & R_{{2\pi/n}}^n \end{array}\right] = I ,\end{align*} and so \(\Phi(R)\) has order \(n\). Finally, we note the presentation \begin{align*} D_{2n} = \left\langle{r, s{~\mathrel{\Big\vert}~}r^n = s^2 =1, sr = r^{-1}s}\right\rangle ,\end{align*} and so in order to verify that the image is isomorphic to \(D_{2n}\), it suffices to check that \(r \mathrel{\vcenter{:}}=\Phi(R)\) and \(s\mathrel{\vcenter{:}}=\Phi(S)\) satisfy the same relations, since (by the same argument as in \({\operatorname{SL}}_2({\mathbf{C}})\)) they already generate a finite subgroup of \({\operatorname{SO}}_3({\mathbf{R}})\) of order \(2n\).
That this relation holds in the image follows from the fact that it holds for the original two matrices and group homomorphisms preserve relations: \begin{align*} R^{-1}S &= { \begin{bmatrix} {{\varepsilon}^{-1}} & {0} \\ {0} & {{\varepsilon}} \end{bmatrix} }\cdot { \begin{bmatrix} {0} & {1} \\ {-1} & {0} \end{bmatrix} } = { \begin{bmatrix} {0} & {{\varepsilon}^{-1}} \\ {-{\varepsilon}} & {0} \end{bmatrix} } = { \begin{bmatrix} {0} & {1} \\ {-1} & {0} \end{bmatrix} } \cdot { \begin{bmatrix} {{\varepsilon}} & {0} \\ {0} & {{\varepsilon}^{-1}} \end{bmatrix} } = SR .\end{align*}
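The order count can be double-checked by brute force. Every element of \(\left\langle{R, S}\right\rangle\) is a monomial matrix whose nonzero entries are powers of \({\varepsilon}\), so the sketch below (our own encoding, not from the notes) stores a matrix as a tag plus two exponents mod \(2n\) and closes up under right multiplication by the generators. It assumes \({\varepsilon}\) is a primitive \(2n\)th root of unity, so that \(-1 = {\varepsilon}^n\).

```python
def binary_dihedral(n):
    """Return (elements, mul, R, S) for <R, S> with R = diag(eps, eps^-1), S = [[0,1],[-1,0]].

    ("D", p, q) encodes diag(eps^p, eps^q); ("A", p, q) encodes [[0, eps^p], [eps^q, 0]].
    Exponents are taken mod 2n.
    """
    order = 2 * n

    def mul(g, h):
        kind_g, p, q = g
        kind_h, r, s = h
        if kind_g == "D":
            # A diagonal matrix scales row 1 by eps^p and row 2 by eps^q.
            return (kind_h, (p + r) % order, (q + s) % order)
        # An antidiagonal matrix swaps the rows of h before scaling them.
        kind = "A" if kind_h == "D" else "D"
        return (kind, (p + s) % order, (q + r) % order)

    R = ("D", 1 % order, (-1) % order)
    S = ("A", 0, n % order)  # bottom-left entry is -1 = eps^n
    identity = ("D", 0, 0)
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for gen in (R, S):
            h = mul(g, gen)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements, mul, R, S


for n in range(1, 7):
    elements, mul, R, S = binary_dihedral(n)
    print(n, len(elements))  # the second column is 4n
```

The same encoding makes it easy to confirm \(S^2 = R^n\) and \(SR = R^{-1}S\) symbolically rather than numerically.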
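The dihedral relations in the image can also be checked numerically. This is a sketch only: it reads \((a,b,c,d)\) off the first column \((a+bi,\, c-di)\) of a matrix in \({\operatorname{SU}}_2\), applies the rotation formula \(F_2\) above, and compares matrices up to a floating-point tolerance. The helper names are ours, and \({\varepsilon}= e^{\pi i/n}\) is assumed.

```python
import cmath
import math


def su2_to_quaternion(m):
    """F_1: read (a, b, c, d) from the first column (a + bi, c - di)."""
    (m00, _), (m10, _) = m
    return (m00.real, m00.imag, m10.real, -m10.imag)


def quaternion_to_rotation(q):
    """F_2: the rotation matrix attached to a unit quaternion a + bi + cj + dk."""
    a, b, c, d = q
    return [
        [1 - 2 * (c * c + d * d), 2 * (b * c - d * a), 2 * (b * d + c * a)],
        [2 * (b * c + d * a), 1 - 2 * (b * b + d * d), 2 * (c * d - b * a)],
        [2 * (b * d - c * a), 2 * (c * d + b * a), 1 - 2 * (b * b + c * c)],
    ]


def phi(m):
    return quaternion_to_rotation(su2_to_quaternion(m))


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))


IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def check_dihedral_relations(n):
    """True if r = Phi(R), s = Phi(S) satisfy r^n = s^2 = 1, sr = r^-1 s, with r of exact order n."""
    eps = cmath.exp(1j * math.pi / n)
    r = phi(((eps, 0), (0, eps.conjugate())))
    s = phi(((0, 1), (-1, 0)))
    r_inv = [list(col) for col in zip(*r)]  # inverse of a rotation is its transpose
    power, order_ok = IDENTITY, True
    for k in range(1, n + 1):
        power = matmul(power, r)
        # r^k should be the identity exactly when k == n
        order_ok = order_ok and (close(power, IDENTITY) == (k == n))
    return order_ok and close(matmul(s, s), IDENTITY) and close(matmul(s, r), matmul(r_inv, s))


print(all(check_dihedral_relations(n) for n in range(1, 8)))
```

A numerical check like this is no substitute for the matrix identity above, but it is a quick way to catch sign or angle mistakes in \(F_1\) and \(F_2\).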
Finding invariant polynomials: We can first check which polynomials are invariant under the \(R{\hbox{-}}\)action, where now \({\varepsilon}^{2n} = 1\): \begin{align*} R.f(x,y) = f(x,y) \implies \sum c_{ij} x^i y^j = \sum c_{ij}{\varepsilon}^{i-j}x^i y^j ,\end{align*} which implies that \(c_{ij} = 0\) unless \(i=j\) or \(2n \divides i-j\). Thus the general polynomials of degrees \(2n, 4\), and \(2n+2\) respectively satisfying these conditions are of the form \begin{align*} a(x,y) &= a_{2n, 0} x^{2n} + a_{n, n} x^n y^n + a_{0, 2n} y^{2n} \\ \\ b(x,y) &= \begin{cases} b_{4, 0} x^4 + b_{2, 2} x^2y^2 + b_{0, 4} y^4, & n = 2 \\ b_{2, 2} x^2 y^2, & n > 2, \end{cases} \\ \\ c(x,y) &= c_{2n+1, 1} x^{2n+1}y + c_{n+1, n+1}x^{n+1}y^{n+1} + c_{1, 2n+1} x y^{2n+1} .\end{align*} We can then further check which polynomials are invariant under the \(S{\hbox{-}}\)action: \begin{align*} S.f(x,y) = f(x,y) \implies \sum c_{ij} x^i y^j = \sum c_{ij}(-1)^j x^j y^i ,\end{align*} which implies that \(c_{ji} = c_{ij}\) when \(j\) is even and \(c_{ji} = -c_{ij}\) when \(j\) is odd. Incorporating these new restrictions, the general such invariant polynomials will be of the following forms: \begin{align*} a(x,y) &= \alpha_0 x^{2n} + \alpha_1 x^n y^n + \alpha_0 y^{2n} \\ \\ b(x,y) &= \begin{cases} \beta_0 x^4 + \beta_1 x^2y^2 + \beta_0 y^4, & n = 2 \\ \beta_1 x^2 y^2, & n > 2, \end{cases} \\ \\ c(x,y) &= \gamma_0 x^{2n+1}y + \gamma_1 x^{n+1}y^{n+1} -\gamma_0 x y^{2n+1} .\end{align*} Since we have freedom to change coordinates, we can assume these polynomials are monic, potentially at the cost of getting a slightly different relation than \(c^2 = ba^2 - 4b^{n+1}\). Setting \(\alpha_0 = \beta_0 = \gamma_0 = 1\), we’re left considering polynomials of the form \begin{align*} a(x,y) &= x^{2n} + \alpha_1 x^n y^n + y^{2n} \\ \\ b(x,y) &= \begin{cases} x^4 + \beta_1 x^2y^2 + y^4, & n = 2 \\ \beta_1 x^2 y^2, & n > 2, \end{cases} \\ \\ c(x,y) &= x^{2n+1}y + \gamma_1 x^{n+1}y^{n+1} - x y^{2n+1} .\end{align*}
Generalizing example 1.13 in Mukai suggests that invariants of the following forms may work, corresponding to setting \(\alpha_1 = \gamma_1 = 0\) and \(\beta_{1} = 1\): \begin{align*} a(x,y) &\mathrel{\vcenter{:}}= x^{2n} + y^{2n} \\ b(x,y) &\mathrel{\vcenter{:}}= x^2y^2 \\ c(x,y) &\mathrel{\vcenter{:}}= xy(x^{2n} - y^{2n}) .\end{align*}
One can then check directly that the desired relation holds: \begin{align*} b(x,y) a(x,y) ^2 - 4b(x,y)^{n+1} &= (xy)^2 (x^{4n} + y^{4n} + 2(xy)^{2n} ) - 4 (xy)^2 (xy)^{2n} \\ &= (xy)^2 (x^{4n} + y^{4n} - 2(xy)^{2n} ) \\ &= c(x,y)^2 .\end{align*}
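The relation \(ba^2 - 4b^{n+1} = c^2\), together with the invariance of \(a, b, c\) under both generators, can be verified mechanically for small \(n\). The sketch below represents a polynomial in \(x, y\) as a dictionary from exponent pairs \((i,j)\) to integer coefficients; this representation and the helper names are our own choices.

```python
from collections import defaultdict


def poly_mul(f, g):
    h = defaultdict(int)
    for (i, j), c in f.items():
        for (k, l), d in g.items():
            h[(i + k, j + l)] += c * d
    return {m: c for m, c in h.items() if c != 0}


def poly_sub(f, g):
    h = defaultdict(int, f)
    for m, c in g.items():
        h[m] -= c
    return {m: c for m, c in h.items() if c != 0}


def poly_pow(f, e):
    result = {(0, 0): 1}
    for _ in range(e):
        result = poly_mul(result, f)
    return result


def invariants(n):
    a = {(2 * n, 0): 1, (0, 2 * n): 1}           # x^(2n) + y^(2n)
    b = {(2, 2): 1}                               # x^2 y^2
    c = {(2 * n + 1, 1): 1, (1, 2 * n + 1): -1}   # xy (x^(2n) - y^(2n))
    return a, b, c


def is_invariant(f, n):
    """Invariance under R (needs 2n | i - j) and under S: (x, y) -> (y, -x)."""
    under_R = all((i - j) % (2 * n) == 0 for (i, j) in f)
    under_S = {(j, i): coeff * (-1) ** j for (i, j), coeff in f.items()} == f
    return under_R and under_S


def relation_holds(n):
    a, b, c = invariants(n)
    four_b = {m: 4 * coeff for m, coeff in poly_pow(b, n + 1).items()}
    return poly_sub(poly_mul(b, poly_pow(a, 2)), four_b) == poly_pow(c, 2)


print(all(relation_holds(n) for n in range(1, 7)))
```

Since the identity is a polynomial identity in \(x^{2n}, y^{2n}\) and \(xy\), checking a few values of \(n\) is only a sanity check; the displayed computation is the actual proof.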
Let \({\varepsilon}^n = 1\) and \({\varepsilon}.(x,y) \mathrel{\vcenter{:}}=({\varepsilon}x, {\varepsilon}y)\), and let \(f(x,y) = \sum c_{ij} x^i y^j\in {\mathbf{C}}[x,y]\). Then \(f\) is invariant iff \begin{align*} {\varepsilon}.f(x,y) = f(x,y) \iff \sum c_{ij} x^i y^j = \sum c_{ij} {\varepsilon}^{i+j} x^i y^j \iff n\divides i+j ,\end{align*} and so the invariant ring is \begin{align*} {\mathbf{C}}[x,y]^{\mu_n} = \bigoplus _{k\geq 0} {\mathbf{C}}[x,y]_{kn} ,\end{align*} the \(n\)th graded piece of \({\mathbf{C}}[x,y]\) along with the pieces corresponding to all higher multiples \(kn\) of \(n\). This is generated as a graded ring by the degree \(n\) monomials \(\left\langle{x^n, x^{n-1}y, \cdots, xy^{n-1}, y^n}\right\rangle\), so \begin{align*} {\mathbf{C}}[x,y]^{\mu_n} = {\mathbf{C}}[x^n, x^{n-1}y, \cdots, xy^{n-1}, y^n] .\end{align*}
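The claim that the degree \(n\) monomials generate the invariant ring amounts to saying that every monomial \(x^i y^j\) with \(n \divides i+j\) factors into monomials of degree exactly \(n\). A greedy factorization makes this explicit; the sketch below works with exponent pairs only, and the function names are illustrative.

```python
def invariant_monomials(n, max_degree):
    """Exponent pairs (i, j) with i + j <= max_degree and n | i + j."""
    return [
        (i, d - i)
        for d in range(0, max_degree + 1, n)
        for i in range(d, -1, -1)
    ]


def factor_into_generators(i, j, n):
    """Write x^i y^j (with n | i + j) as a product of degree-n monomials x^a y^(n-a)."""
    if (i + j) % n != 0:
        raise ValueError("x^i y^j is not invariant")
    factors = []
    while i + j > 0:
        a = min(i, n)  # use as many x's as possible, pad with y's
        factors.append((a, n - a))
        i, j = i - a, j - (n - a)
    return factors


print(invariant_monomials(2, 2))        # [(0, 0), (2, 0), (1, 1), (0, 2)]
print(factor_into_generators(4, 2, 3))  # [(3, 0), (1, 2)]
```

The greedy step never runs out of \(y\)'s: if \(i < n\) then \(j \geq n - i\) because \(i + j\) is a positive multiple of \(n\).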
Part 1: To fix notation, let \(R = k[M]\) and \(G = \operatorname{Spec}R\), and write the given maps as \begin{align*} m^*: R &\to R \otimes_k R \\ x^m &\mapsto x^m \otimes x^m \\ \\ i^*: R &\to R \\ x^m &\mapsto x^{-m} \\ \\ {\varepsilon}^*: R &\to k \\ x^m &\mapsto 1 .\end{align*}
Equipping \(G\) with the structure of a group scheme requires producing the following maps: \begin{align*} m: G \underset{\scriptscriptstyle {k} }{\times} G &\to G \\ \\ i: G &\to G \\ \\ {\varepsilon}: \operatorname{Spec}k &\to G ,\end{align*} which are required to fit into commutative diagrams of \(k{\hbox{-}}\)schemes, where \(s_G: G\to \operatorname{Spec}k\) is the structure morphism of \(G\) and \(\Delta: G\to G \underset{\scriptscriptstyle {k} }{\times} G\) is the diagonal morphism:
Since morphisms of affine schemes correspond bijectively to \(k{\hbox{-}}\)algebra morphisms between their global sections, if we set \(m, i, {\varepsilon}\) to be the morphisms corresponding to \(m^*, i^*, {\varepsilon}^*\) induced by the \(\operatorname{Spec}\) functor, it suffices to show the following diagrams of \(k{\hbox{-}}\)algebras commute:
Part 2: Write \(M \cong {\mathbf{Z}}^r \oplus \bigoplus _{i = 1}^\ell {\mathbf{Z}}/n_i {\mathbf{Z}}\), then \begin{align*} \operatorname{Spec}k[M] &\cong \operatorname{Spec}k\left[ {\mathbf{Z}}^r \oplus \bigoplus _{i=1}^{\ell} {\mathbf{Z}}/n_i {\mathbf{Z}}\right] \\ \\ &\cong \operatorname{Spec}\qty{ k[{\mathbf{Z}}^r] \otimes_k k[{\mathbf{Z}}/n_1 {\mathbf{Z}}] \otimes_k \cdots \otimes_k k[{\mathbf{Z}}/n_\ell {\mathbf{Z}}] } \\ \\ &\cong \operatorname{Spec}k[{\mathbf{Z}}^r] \underset{\scriptscriptstyle {k} }{\times} \operatorname{Spec}k[{\mathbf{Z}}/n_1 {\mathbf{Z}}] \underset{\scriptscriptstyle {k} }{\times} \cdots \underset{\scriptscriptstyle {k} }{\times} \operatorname{Spec}k[{\mathbf{Z}}/n_\ell {\mathbf{Z}}] \\ \\ &\cong {\mathbf{G}}_m^r \underset{\scriptscriptstyle {k} }{\times} \mu_{n_1} \underset{\scriptscriptstyle {k} }{\times} \cdots \underset{\scriptscriptstyle {k} }{\times} \mu_{n_\ell} ,\end{align*} where we’ve used that \(k[A \times B] = k[A] \otimes_k k[B]\) and \(\operatorname{Spec}(R\otimes_k S) = \operatorname{Spec}(R) \underset{\scriptscriptstyle {k} }{\times} \operatorname{Spec}(S)\).
Part 3: \(\implies\): suppose one is given such a linear coaction, we will show that it induces a direct sum decomposition of vector spaces.
Definition 3.54 in Mukai describes a coaction of \(R\) on \(V\) as a morphism \(a^*: V\to V\otimes_k R\) such that the following diagrams commute:
As in class, we can note that for any \(v\in V\), we have \(a^*(v) = \sum_{m\in M} v_m \otimes x^m\) for some components \(v_m\), and by the commutativity of the above diagram, the composition \begin{align*} v\mapsto \sum_{m\in M } v_m \otimes x^m \mapsto \sum_{m\in M} v_m \otimes 1 \mapsto \sum_{m\in M} v_m \end{align*} is equal to the identity and so \(v = \sum_{m\in M} v_m\). This yields \(V = \sum_{m\in M} V_m\) for the subsets \(V_m \mathrel{\vcenter{:}}=\left\{{w\in V {~\mathrel{\Big\vert}~}a^*(w) = w\otimes x^m}\right\}\); that each component \(v_m\) lies in \(V_m\) follows from comparing coefficients of \(x^m\) in the coassociativity condition \((a^*\otimes \operatorname{id})\circ a^* = (\operatorname{id}\otimes m^*)\circ a^*\). These are linear subspaces, because for example if \(v_{m_1}, v_{m_2}\in V_m\), then \begin{align*} a^*(v_{m_1 }+ v_{m_2}) = a^*(v_{m_1}) + a^*(v_{m_2}) = (v_{m_1} \otimes x^m) + (v_{m_2} \otimes x^m) = (v_{m_1} + v_{m_2} )\otimes x^m ,\end{align*} and so setting \(w \mathrel{\vcenter{:}}= v_{m_1} + v_{m_2}\) shows that their sum is again in \(V_m\). It remains to show that this sum of subspaces is direct.
It suffices to show that if any \(v_m\in V_m\) can be expressed as \(v_m = \sum_{n\neq m} v_n\) with \(v_n\in V_n\) then \(v_m = 0\). This shows that \(V_m \cap \sum_{n\neq m} V_n = 0\) for all \(m\), making the sum direct. To this end, note that \(a^*(v_m) = v_m \otimes x^m\) is an elementary tensor. If \(v_m = \sum_{n\neq m} v_n\), then \(a^*(v_m) = \sum_{n\neq m} v_n\otimes x^n\). Since \(a^*\) is a well-defined map, it must be the case that \begin{align*} v_m \otimes x^m = \sum_{n\neq m}v_n \otimes x^n .\end{align*} Since the \(x^n\) are linearly independent in \(k[M]\), equating components of these tensors forces \(v_m = 0\) (and \(v_n = 0\) for all \(n\neq m\)).
\(\impliedby\): suppose now that one has a decomposition \(V = \bigoplus _{m\in M} V_m\); then the naturally associated map \(v\mapsto \sum_{m\in M} v_m \otimes x^m\) yields the desired coaction.
Part 4: This follows from the same proof as in part 3 – the only new aspect is that the coaction map \(a^*: A\to A\otimes k[M]\) is now a map of \(k{\hbox{-}}\)algebras which preserves the grading on \(A\). If \(a_i\in A_i\) and \(a_j\in A_j\) with \(A = \bigoplus _{m\in M} A_m\), then \(a_i a_j \in A_{i+j}\), and \begin{align*} a^*(a_i a_j) = a^*(a_i) a^*(a_j) = (a_i \otimes x^i) (a_j \otimes x^j) = (a_i a_j) \otimes x^{i+j} .\end{align*}
Throughout this problem, we work over a fixed field \(k\) and write \({\mathbf{A}}^2 \mathrel{\vcenter{:}}=\operatorname{Spec}k[x, y]\). All tensor products are implicitly over \(k\).
First noting that we can write \({\mathbf{G}}_a = \operatorname{Spec}k[\xi]\) for an indeterminate \(\xi\), we can use the isomorphism \(R\otimes V \mathrel{\vcenter{:}}= k[x,y] \otimes k[\xi] \cong k[\xi][x,y]\) to regard elements as polynomials in the variables \(x,y\) with coefficients in \(k[\xi]\). The action \begin{align*} {\mathbf{G}}_a &\curvearrowright{\mathbf{A}}^2 \\ \xi.(x,y) &\mathrel{\vcenter{:}}=(x, \xi x+y) \end{align*} can then be written as the coaction \begin{align*} a^*: k[x,y] &\to k[\xi] \otimes k[x,y]\cong k[\xi][x,y] \\ x &\mapsto x \\ y &\mapsto \xi x + y .\end{align*}
Write \({\mathbf{G}}_m = \operatorname{Spec}k[ \lambda, \lambda^{-1}]\) and use the isomorphism \(k[ \lambda, \lambda^{-1}] \cong k[z,w]/(zw-1)\) to write \begin{align*} R\otimes V = k[x,y] \otimes k[ \lambda, \lambda^{-1}] \cong k[x,y] \otimes{ k[z,w]\over (zw-1) } \cong {k[z, w] \over (zw-1)}[x,y] ,\end{align*} so \(z\) corresponds to \(\lambda\) and \(w\) to \(\lambda^{-1}\). Then the action \begin{align*} {\mathbf{G}}_m &\curvearrowright{\mathbf{A}}^2 \\ \lambda.(x, y) &\mathrel{\vcenter{:}}=(\lambda x, \lambda^{-1}y) \end{align*} has the following corresponding coaction: \begin{align*} a^*: k[x,y] &\to {k[z,w] \over (zw-1) }[x,y] \\ x &\mapsto z x \\ y &\mapsto wy .\end{align*}
Write \(\mu_n = \operatorname{Spec}k[\xi]/(\xi^n-1)\), then \begin{align*} R\otimes V = k[x,y]\otimes{k[\xi] \over (\xi^n-1)} \cong {k[\xi]\over (\xi^n-1)}[x,y] \end{align*} and the coaction is \begin{align*} a^*: k[x,y] &\to {k[\xi]\over (\xi^n-1)}[x,y] \\ x &\mapsto \xi x \\ y &\mapsto \xi^{-1}y = \xi^{n-1}y .\end{align*}
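On points, the coaction axioms say precisely that these formulas define group actions. As a quick sketch over \(k = {\mathbf{Q}}\) (an assumption made only so that exact arithmetic is available), one can test the action laws for \({\mathbf{G}}_a\) and \({\mathbf{G}}_m\); the \(\mu_n\) case is the restriction of the \({\mathbf{G}}_m\) action to \(n\)th roots of unity. The function names are ours.

```python
from fractions import Fraction


def act_Ga(xi, p):
    """Additive group: xi.(x, y) = (x, xi*x + y)."""
    x, y = p
    return (x, xi * x + y)


def act_Gm(lam, p):
    """Multiplicative group: lam.(x, y) = (lam*x, y/lam), lam nonzero."""
    x, y = p
    return (lam * x, y / lam)


p = (Fraction(2), Fraction(-3, 5))
s, t = Fraction(1, 2), Fraction(7, 3)

# Action laws: g.(h.p) = (gh).p, with the group law + for G_a and * for G_m.
print(act_Ga(s, act_Ga(t, p)) == act_Ga(s + t, p))  # True
print(act_Gm(s, act_Gm(t, p)) == act_Gm(s * t, p))  # True
```

Using `Fraction` keeps the comparison exact, so a failed equality would indicate a genuine mistake in the formula rather than rounding error.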
We first write the geometric action as \begin{align*} S_3 &\curvearrowright{\mathbf{A}}^3 = \operatorname{Spec}k[x_1,x_2,x_3] \\ \sigma.{\left[ {x_1, x_2,x_3} \right]} &\mathrel{\vcenter{:}}={\left[ {x_{\sigma(1)}, x_{\sigma(2)}, x_{ \sigma(3)} } \right]} .\end{align*} We can then write \begin{align*} R\otimes_k V = \qty{ \bigoplus _{\sigma \in S_3} k }\otimes_k k[x_1, x_2, x_3] \cong \bigoplus _{\sigma \in S_3} k[x_1, x_2, x_3] .\end{align*} Thus the coaction is \begin{align*} k[x_1, x_2, x_3] &\to \bigoplus _{\sigma \in S_3} k[x_1, x_2, x_3] \\ x_i &\mapsto \bigoplus _{\sigma \in S_3} x_{ \sigma(i)} .\end{align*} For example, writing \(S_3 = \left\{{(), (12), (23), (13), (123), (132)}\right\}\), the map on the first coordinate is the following: \begin{align*} x_1\mapsto [x_1, x_2, x_1, x_3, x_2, x_3] .\end{align*}
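The component-by-component description of this coaction is easy to tabulate. The following sketch lists, for each \(x_i\), its image in each of the six copies of \(k[x_1,x_2,x_3]\), using the same ordering of \(S_3\) as above and the convention that a cycle \((abc)\) sends \(a\mapsto b\mapsto c\mapsto a\); the dictionary `S3` and the function `coaction` are our own names.

```python
# Permutations of {1, 2, 3} in the order (), (12), (23), (13), (123), (132).
S3 = {
    "()":    {1: 1, 2: 2, 3: 3},
    "(12)":  {1: 2, 2: 1, 3: 3},
    "(23)":  {1: 1, 2: 3, 3: 2},
    "(13)":  {1: 3, 2: 2, 3: 1},
    "(123)": {1: 2, 2: 3, 3: 1},
    "(132)": {1: 3, 2: 1, 3: 2},
}


def coaction(i):
    """Image of x_i: the tuple (x_{sigma(i)}) indexed by sigma in S_3."""
    return [f"x_{sigma[i]}" for sigma in S3.values()]


for i in (1, 2, 3):
    print(f"x_{i} ->", coaction(i))
```

For \(i = 1\) this prints the tuple \([x_1, x_2, x_1, x_3, x_2, x_3]\) from the example.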
Consider the \(\mathrm{SL}_{2}\) action on \(X=\left(\mathbb{P}^{1}\right)^{n}\) with a linearized invertible sheaf \(L=\mathcal{O}_{X}\left(d_{1}, \ldots, d_{n}\right), d_{i} \in \mathbb{N}\). Define \(w_{i}:=\frac{2 d_{i}}{\sum d_{j}}\), so that \(\sum w_{i}=2\). Prove that a point \(\left(P_{1}, \ldots, P_{n}\right) \in X^{s s}(L)\) (resp. \(\left.X^{s}(L)\right) \Longleftrightarrow\) whenever some points \(P_{i}\), \(i \in I, I \subset\{1, \ldots, n\}\), coincide, one has \(\sum_{i \in I} w_{i} \leq 1\) (resp. \(<1\) ).
Write points in this product as \begin{align*} X \mathrel{\vcenter{:}}=({\mathbf{P}}^1)^{n} = \left\{{ \mathbf{p} \mathrel{\vcenter{:}}= \begin{bmatrix} x_1 & \cdots & x_n \\ y_1 & \cdots & y_n \end{bmatrix} }\right\} ,\end{align*} corresponding to the \(n{\hbox{-}}\)tuple \(\qty{ {\left[ {x_1: y_1} \right]}, \cdots, {\left[ {x_n: y_n} \right]} }\), with \({\operatorname{SL}}_2\) action given by \begin{align*} {\operatorname{SL}}_2 &\curvearrowright X \\ { \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} } \cdot \mathbf{p} \mathrel{\vcenter{:}}= &{ \begin{bmatrix} {a} & {b} \\ {c} & {d} \end{bmatrix} } \begin{bmatrix} x_1 & \cdots & x_n \\ y_1 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} a x_1 + by_1 & \cdots & ax_n + by_n \\ cx_1 + dy_1 & \cdots & cx_n + dy_n \end{bmatrix} .\end{align*} We note that the maximal torus acts as \begin{align*} T_{{\operatorname{SL}}_2} &\curvearrowright X \\ { \begin{bmatrix} {t} & {\cdot } \\ {\cdot } & {t^{-1}} \end{bmatrix} } \cdot \mathbf{p} \mathrel{\vcenter{:}}= &{ \begin{bmatrix} {t} & {\cdot } \\ {\cdot } & {t^{-1}} \end{bmatrix} } \begin{bmatrix} x_1 & \cdots & x_n \\ y_1 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} t x_1 & \cdots & tx_n \\ t^{-1}y_1 & \cdots & t^{-1}y_n \end{bmatrix} .\end{align*} We identify \(X\) with its image (which we’ll also denote \(X\)) under the Segre-Veronese embedding \(X \to {\mathbf{P}}^N\) associated to the ample line bundle \({\mathcal{L}}\mathrel{\vcenter{:}}={\mathcal{O}}(\mathbf{d})\) where \(\mathbf{d} \mathrel{\vcenter{:}}={\left[ {d_1,\cdots, d_n} \right]} \in {\mathbf{Z}}^n\) is viewed as an integer vector.
Writing \(D\) for the convex hull of the \(d_i\) in \({\mathbf{Z}}^n\), note that every lattice point in \({\mathbf{Z}}^n \cap D\) defines a monomial, and every point \(\mathbf{p} \in X\) corresponds to a collection of lattice points \(P_{\mathbf{p} } = \left\{{ \mathbf{k} = {\left[ {k_1,\cdots, k_n} \right]} }\right\} \subseteq D \cap{\mathbf{Z}}^n\) along with a choice of coefficient \(\alpha_{\mathbf{k}}\) for each \(\mathbf{k} \in P_{\mathbf{p}}\).
The following is an example \(D\) and \(P_{\mathbf{p}}\) when \(n=3\) and \(\mathbf{d} = {\left[ {3, 5, 4} \right]}\):
The three highlighted lattice points are \(\mathbf{k}_1 = {\left[ {3,0,0} \right]}, \mathbf{k}_2 = {\left[ {0,5,0} \right]}, \mathbf{k}_3 = {\left[ {0,0,4} \right]}\), and the collection \(P_{\mathbf{p}} \mathrel{\vcenter{:}}=\left\{{\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3}\right\}\) corresponds to the polynomial \begin{align*} F(x_1, x_2, x_3) = \alpha_1 x_1^3 x_2^0 x_3^0 + \alpha_2 x_1^0 x_2^5 x_3^0 + \alpha_3 x_1^0 x_2^0 x_3^4 .\end{align*}
In our situation, lattice points will correspond to monomials \begin{align*} \mathbf{k}_{IJ} = x^I y^J \mathrel{\vcenter{:}}= x_1^{i_1} x_2^{i_2}\cdots x_n^{i_n} \,\cdot\, y_1^{j_1}y_2^{j_2}\cdots y_n^{j_n} ,\end{align*} and so each point in \(X\) will correspond to a polynomial \begin{align*} F(x_1,\cdots, x_n, y_1,\cdots, y_n) = \sum_{(I, J) \subseteq D} \alpha_{IJ} x^I y^J ,\end{align*} where \(\sum_{i\in I} i + \sum_{j\in J} j = d_i\).
Todo: this is not quite right. Since the embedding is taken separately along each \(d_m\) direction, the exponents in the \(m\)th pair of variables should just sum to \(d_m\), i.e. \(i_m + j_m = d_m\) for each \(m\).
Indexing these monomials systematically by exponent vectors \(\mathbf{k} = {\left[ {k_1, \cdots, k_n} \right]}\) with \(0\leq k_i \leq d_i\), we can write \begin{align*} F(x_1,\cdots, y_n) = \sum_{\mathbf{k}} \alpha_{\mathbf{k}} \prod_{i=1}^n x_i^{d_i - k_i} y_i^{k_i} .\end{align*} When points collide, without loss of generality (using the transitive \({\operatorname{SL}}_2{\hbox{-}}\)action) we can assume that the collision point in \({\mathbf{P}}^1\) is \({\left[ {0: 1} \right]}\), so \(p\in X\) is of the form \begin{align*} p = \begin{bmatrix} 0 & \cdots & 0 & p_{m+1} & \cdots & p_n \\ 1 & \cdots & 1 & q_{m+1} & \cdots & q_n \\ \end{bmatrix} ,\end{align*} where we’ve written \(m\) for the number of colliding points, so that \(p_i \neq 0\) for \(i > m\). We can now compute the weights of the torus action over such colliding points: \begin{align*} \lambda(t).F(x_1,\cdots, y_n) &= \sum_{\mathbf{k}} \alpha_{\mathbf{k}} \prod_{i=1}^n t^{d_i - 2k_i} x_i^{d_i-k_i} y_i^{k_i} \\ &= \sum_{\mathbf{k}} t^{w_{\mathbf{k}}} \alpha_{\mathbf{k}} \prod_{i=1}^n x_i^{d_i-k_i} y_i^{k_i}, \qquad w_{\mathbf{k}} \mathrel{\vcenter{:}}=\sum_{i=1}^n \qty{d_i - 2k_i} .\end{align*} Let \(\mathcal S\) be the collision set. Since \(x_i = 0\) at \(p\) for \(i\in \mathcal S\), a monomial is nonzero at \(p\) only if \(k_i = d_i\) for every \(i\in \mathcal S\). We now need \(\mu(p, \lambda) \geq 0\) for semistability, which here means that the monomials not vanishing at \(p\) do not all have strictly negative weight, i.e. \(\max(w_{\mathbf{k}}) \geq 0\) over such monomials. The extreme case is \(k_i = d_i\) for \(i\in \mathcal S\) and \(k_i = 0\) otherwise, and so we require \begin{align*} \sum_{i=1}^n d_i - \sum_{i\in \mathcal S} 2d_i \geq 0 \iff \sum_{i=1}^n d_i \geq \sum_{i\in \mathcal S} 2d_i \iff { \sum_{i\in \mathcal S} 2d_i \over \sum_{i=1}^n d_i} \leq 1 \iff \sum_{i\in \mathcal S} w_i \leq 1 .\end{align*} The stable case is the same computation with strict inequalities.
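The resulting criterion (\(\sum_{i\in \mathcal S} w_i \leq 1\) for every set \(\mathcal S\) of coincident points) is easy to test on examples. The Python sketch below is a hypothetical helper, not part of the proof: points of \({\mathbf{P}}^1\) are represented by arbitrary labels, with equal labels meaning coincident points, and exact arithmetic is done with `fractions.Fraction`. Since all weights are nonnegative, it suffices to check the total weight sitting at each distinct point.

```python
from fractions import Fraction


def normalized_weights(d):
    """Weights w_i = 2 d_i / sum(d_j), which sum to 2."""
    total = sum(d)
    return [Fraction(2 * di, total) for di in d]


def classify(points, d):
    """Classify a configuration of labelled points in (P^1)^n.

    points: list of hashable labels, equal labels meaning coincident points.
    d:      list of positive integers defining O(d_1, ..., d_n).
    """
    mass = {}
    for label, w in zip(points, normalized_weights(d)):
        mass[label] = mass.get(label, 0) + w
    heaviest = max(mass.values())
    if heaviest < 1:
        return "stable"
    if heaviest == 1:
        return "strictly semistable"
    return "unstable"


if __name__ == "__main__":
    print(classify(["a", "b", "c", "d"], [1, 1, 1, 1]))  # four distinct points
    print(classify(["a", "a", "b", "c"], [1, 1, 1, 1]))  # two points collide
    print(classify(["a", "a", "a", "b"], [1, 1, 1, 1]))  # three points collide
```

For example, with \(\mathbf{d} = [1,1,1,1]\) every \(w_i = 1/2\), so two colliding points give total weight exactly \(1\) and three give \(3/2\).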
Consider the \(\mathrm{SL}_{3}\) action on the set \(X=\mathbb{P}^{N}, N=\left(\begin{array}{c}3+2 \\ 2\end{array}\right)-1=9\), parameterizing cubic curves \(C \subset \mathbb{P}^{2}\), with a linearized invertible sheaf \(L=\mathcal{O}_{X}(1)\). Prove that \(C\) is semistable \(\Longleftrightarrow C\) has only ordinary double points.
We first note that every choice of cubic curve \(C\in Y_{3, 2}\) can be represented (after choosing coordinates) by a polynomial \begin{align*} F(x,y,z) = \sum_{i+j+k=3}a_{ijk}\, x^iy^jz^k = \sum_{i+j+k=3} a_{\mathbf{i}} x^i y^j z^k \qquad \mathbf{i} \mathrel{\vcenter{:}}={\left[ {i,j,k} \right]} \end{align*} and thus a choice of lattice points \(C_P\) in the corresponding weight polytope where each point is labeled with the corresponding coefficient of \(F\):
We record the fact that the point \(p \mathrel{\vcenter{:}}={\left[ {1:0:0} \right]}\) is singular iff \(a_{300} = a_{201} = a_{210} = 0\):
Moreover, \(p\) is a triple point iff additionally \(a_{102} = a_{111} = a_{120} = 0\):
Moreover, if all of the above holds except that \(a_{102}\) (the coefficient of \(xz^2\)) is nonzero, then \(p\) is a double point with only a single tangent, and thus not an ordinary double point. These facts follow from computing the gradients and Hessians which characterize these types of singularities.
We also note that if \(\lambda: {\mathbf{G}}_m \to {\operatorname{SL}}_3\) is a 1-parameter subgroup, then \(\lambda(t)\) is conjugate to \begin{align*} \tilde \lambda(t) = { \begin{bmatrix} {t^{r_1}} & {\cdot } & {\cdot } \\ {\cdot } & {t^{r_2}} & {\cdot } \\ {\cdot } & {\cdot } & {t^{r_3}} \end{bmatrix} }, \qquad \sum_{i=1}^3 r_i = 0 ,\end{align*} and thus determines a vector \(\mathbf{r} \mathrel{\vcenter{:}}={\left[ {r_1, r_2, r_3} \right]}\in {\mathbf{Z}}^3\). The action can then be written \begin{align*} \lambda(t)\cdot F(x, y, z) = \sum_{i+j+k = 3} a_{\mathbf{i}}\, t^{{\left\langle {\mathbf{r}},~{\mathbf{i}} \right\rangle} } x^i y^j z^k ,\end{align*} and so all weights are of the form \(w_{\mathbf{i}} = {\left\langle {\mathbf{r}},~{\mathbf{i}} \right\rangle} \in {\mathbf{Z}}\). We note that \(C\in Y_{3, 2}\) is unstable iff there exists a \(\lambda\) for which every weight is negative or every weight is positive, i.e. either \(w_{\mathbf{i}} < 0\) for all \(\mathbf{i} \in C_P\) with \(a_{\mathbf{i}} \neq 0\), or \(w_{\mathbf{i}} > 0\) for all such \(\mathbf{i}\). We’ll focus on the strictly positive case, since the strictly negative case follows similarly.
\(\implies\): Suppose \(C\) is unstable; we will show that, after a change of coordinates, \(p = [1:0:0]\) is either a non-ordinary double point, a triple point, or worse. Pick \(\lambda\) and its corresponding \(\mathbf{r}\) such that all weights \(w_{\mathbf{i}}\) are positive. Then in particular \begin{align*} \min \left\{{w_{\mathbf{i}} \mathrel{\vcenter{:}}={\left\langle {\mathbf{r}},~{\mathbf{i}} \right\rangle} {~\mathrel{\Big\vert}~}\mathbf{i}\in C_P}\right\} > 0 .\end{align*}
Having strictly positive weights can be phrased geometrically as \(\left\{{\mathbf{i} {~\mathrel{\Big\vert}~}\mathbf{i}\in C_P,\, a_{\mathbf{i}} \neq 0}\right\}\) being contained in the positive half-space corresponding to the hyperplane \(H_C \mathrel{\vcenter{:}}=\mathbf{r}^\perp\). Picking a maximally destabilizing \(\lambda\), without loss of generality (changing coordinates if necessary) we can arrange for the lower-left 5 monomials to receive non-positive weights:
This forces all of the shaded coefficients except for potentially \(a_{102}\) to be zero. By the earlier remarks, this forces \(p = [1:0:0]\) to be singular, and if \(a_{102} = 0\) this is a triple point. Otherwise, if \(a_{102}\neq 0\), this yields a double point which only has a single tangent, and is thus not ordinary. Contrapositively, if \(C\) has at worst ordinary double points, then \(C\) is not unstable, i.e. it is semistable.
\(\impliedby\): Suppose conversely that \(C\) has a triple point or a non-ordinary double point \(q\). Using the transitivity of the \({\operatorname{SL}}_3\) action, we can move \(q\) to \(p = [1:0:0]\) and conclude using the singularity criterion above that the following coefficients vanish:
We can now make a specific choice of \(\lambda\) that yields the following \(H_\lambda\) and gives the remaining coefficients strictly positive weights, allowing us to conclude that \(C\) is unstable:
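One way to sanity-check such a choice is to compute the pairings \({\left\langle {\mathbf{r}},~{\mathbf{i}} \right\rangle}\) directly. The Python sketch below uses \(\mathbf{r} = [-3, 1, 2]\), which is an illustrative assumption and not necessarily the subgroup intended above. It also assumes coordinates have been chosen so that the vanishing coefficients are \(a_{300}, a_{210}, a_{201}, a_{120}, a_{111}\), i.e. the single tangent direction (if any) is \(z = 0\).

```python
def cubic_exponents():
    """All exponent vectors (i, j, k) with i + j + k = 3, one per cubic monomial."""
    return [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]


def weight(r, exponent):
    """Pairing <r, i>: the weight of x^i y^j z^k under diag(t^r1, t^r2, t^r3)."""
    return sum(ri * ei for ri, ei in zip(r, exponent))


# Hypothetical one-parameter subgroup; it must satisfy r1 + r2 + r3 = 0.
r = (-3, 1, 2)

# Coefficients assumed to vanish: p = [1:0:0] is singular and the quadratic
# part at p has no y^2 or yz term (triple point, or single tangent z = 0).
vanishing = {(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (1, 1, 1)}

# Weights of the monomials whose coefficients are allowed to be nonzero.
remaining = {e: weight(r, e) for e in cubic_exponents() if e not in vanishing}

if __name__ == "__main__":
    for exponent, w in sorted(remaining.items()):
        print(exponent, w)
```

With this choice the five remaining monomials \(xz^2, y^3, y^2z, yz^2, z^3\) receive weights \(1, 3, 4, 5, 6\), all strictly positive, while the five vanishing monomials receive weights at most \(0\).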
Give an example showing that Hilbert-Mumford’s criterion of (semi)stability for \(G \curvearrowright X\) does not hold in general if \(X\) is not assumed to be projective. (In other words, produce a counterexample with a non-projective \(X\).)
Consider the following action: \begin{align*} {\mathbf{G}}_m &\curvearrowright X \mathrel{\vcenter{:}}={\mathbf{A}}^2 \\ t.{\left[ {x, y} \right]} &\mathrel{\vcenter{:}}={\left[ {tx, t^{-1}y} \right]} .\end{align*} This yields a set-theoretic orbit space \begin{align*} {\mathbf{A}}^2/{\mathbf{G}}_m &= \left\{{O_t {~\mathrel{\Big\vert}~}t\in {\mathbf{G}}_m}\right\} \cup\left\{{O_x, O_y, O_0}\right\} \\ \\ O_t &\mathrel{\vcenter{:}}=\left\{{ {\left[ {x, y} \right]} {~\mathrel{\Big\vert}~} xy = t}\right\} \\ O_x &\mathrel{\vcenter{:}}=\left\{{{\left[ {t, 0} \right]} {~\mathrel{\Big\vert}~}t\in {\mathbf{G}}_m}\right\} = {\mathbf{G}}_m . {\left[ {1, 0} \right]} \\ O_y &\mathrel{\vcenter{:}}=\left\{{{\left[ {0, t} \right]} {~\mathrel{\Big\vert}~}t\in {\mathbf{G}}_m}\right\} = {\mathbf{G}}_m . {\left[ {0, 1} \right]} \\ O_0 &\mathrel{\vcenter{:}}=\left\{{0}\right\} ,\end{align*} i.e. there is an orbit for each hyperbola \(xy=t\), the punctured \(x{\hbox{-}}\)axis, the punctured \(y{\hbox{-}}\)axis, and the origin:
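The orbit description can be checked numerically. The Python sketch below assumes the action has weights \((1, -1)\), i.e. \(t.[x, y] = [tx, t^{-1}y]\), which is what the hyperbola orbits \(xy = t\) require; the function names are hypothetical and exact rational arithmetic stands in for a general field \(k\).

```python
from fractions import Fraction


def act(t, point):
    """Apply t in G_m to (x, y), assuming weights (1, -1): (x, y) -> (t x, y / t)."""
    x, y = point
    return (t * x, y / t)


def orbit_label(point):
    """Name the orbit of (x, y) following the decomposition in the text."""
    x, y = point
    if x == 0 and y == 0:
        return "O_0"
    if y == 0:
        return "O_x"
    if x == 0:
        return "O_y"
    return f"O_t with t = {x * y}"


if __name__ == "__main__":
    p = (Fraction(2), Fraction(5))
    q = act(Fraction(3), p)
    # xy is invariant, so p and q lie on the same hyperbola.
    print(orbit_label(p), "|", orbit_label(q))
```

The function \(xy\) is invariant under this action, so the label is constant along orbits.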
We record the following facts:
Thus \(X^s = {\mathbf{A}}^2 \setminus V(xy)\) is the plane with the axes deleted, and for example \(0\in X\setminus X^s\) is an unstable point and \({\left[ {1, 0} \right]}, {\left[ {0, 1} \right]}\in X\setminus X^s\) are not stable points (and may thus either be unstable or semistable).
Noting that \(O_x \sim O_y \sim O_0\) are all orbit-closure equivalent since \(0\) is in the closure of \(O_x\) and \(O_y\), we can separate these orbits by redefining our total space to be \(X' \mathrel{\vcenter{:}}={\mathbf{A}}^2\setminus\left\{{0}\right\}\); then \(O_x, O_y\) are closed in \(X'\) and have 0-dimensional stabilizers, and thus points in those orbits become stable for the restricted action \({\mathbf{G}}_m\curvearrowright X'\).
For example, pick \(p \mathrel{\vcenter{:}}={\left[ {1, 0} \right]} \in O_x \subseteq X'\), then \(p\) is stable by construction. However, we can now check the Hilbert-Mumford numerical criterion and note that every 1-parameter subgroup \(\lambda\) acting with weights \(r_1, r_2\) satisfies \begin{align*} \lambda(t).p = {\left[ {t^{r_1} 1, t^{r_2} 0} \right]} = {\left[ {t^{r_1}1, 0} \right]} ,\end{align*} and in particular always has strictly positive or strictly negative weights, which would otherwise characterize \(p\) as an unstable point, yielding the desired counterexample.
Provide a complete VGIT (variation of GIT) analysis for the quotients \(\left(\mathbb{P}^{1}\right)^{3} / / \mathbb{G}_{m}\). The line bundle is \(L=\mathcal{O}(1,1,1)\). The \(\mathbb{G}_{m}\)-action is defined as \begin{align*} t .\left(x_{0}: x_{1}\right)=\left(x_{0}: t x_{1}\right), \quad t .\left(y_{0}: y_{1}\right)=\left(y_{0}: t y_{1}\right), \quad t .\left(z_{0}: z_{1}\right)=\left(z_{0}: t z_{1}\right) \end{align*} The linearization is a lift of this action to the action on the coordinates \(w_{i j k}=\) \(x_{i} y_{j} z_{k}\) on \(\left(\mathbb{P}^{1}\right)^{3}\) embedded into \(\mathbb{P}^{7}\) with the 8 homogeneous coordinates \(w_{i j k}\). The above equations give an action on the point \(\left(w_{i j k}\right) \in \mathbb{P}^{7}\). The linearization is a lift of this action to the point \(\left(w_{i j k}\right) \in \mathbb{A}^{8}\).
Determine the following:
The choices for \(\mathbb{Q}\)-linearizations of \(L\) (i.e. linearizations of some \(L^{d}, d \in \mathbb{N}\) ).
Chamber decomposition.
For each chamber, the quotient.
For neighboring chambers, the induced morphisms between the quotients.
For each chamber, the sets of unstable and strictly semistable points.
Todo.
Let \(X \subset \mathbb{P}^{N}\) be a singular projective curve. Suppose that \(X\) has \(n\) irreducible components \(X_{i}\) and that \(\left.\operatorname{deg} \mathcal{O}_{X}(1)\right|_{X_{i}}=\lambda_{i} \in \mathbb{N}\). Let \(F\) be a coherent sheaf on \(X\). Then on an open subset \(U_{i} \subset X_{i}\) of each irreducible component it is a locally free sheaf of rank \(r_{i}\).
The Seshadri slope of a coherent sheaf \(F\) is defined to be \begin{align*} \mu(F)=\frac{\chi(F)}{\sum \lambda_{i} r_{i}}, \quad \text { where } r_{i}=\left.\operatorname{rk} F\right|_{U_{i}} . \end{align*} By replacing \(\mathcal{O}_{X}(1)\) by a rational multiple, one can assume that \(\lambda_{i}>0, \sum \lambda_{i}=1\).
Let \(F\) be a pure-dimensional coherent sheaf on \(X\). Prove that \(F\) is Hilbert stable (resp. semistable) \(\Longleftrightarrow\) for any subsheaf \(E \subset F\) one has \(\mu(E)<\) \(\mu(F)\) (resp. \(\leq\) ). (Note in particular, that this definition depends on the polarization \(\left(\lambda_{i}\right)\), and there is a Variation of GIT here.)
Prove, however, that if \(\chi(F)=0\) then the (semi)stability condition does not depend on a polarization \(\left(\lambda_{i}\right)\).
You can use the following simple observation. If \(\pi: \widetilde{X} \rightarrow X\) is a normalization then \(\tilde{X}\) is a smooth curve, so Riemann-Roch is applicable:
\begin{align*} \chi(E)=\operatorname{deg}(E)+\operatorname{rank}(E)(1-g), \end{align*}
and the difference of Hilbert polynomials
\begin{align*} \chi(X, F(m))-\chi\left(\tilde{X},\left(\pi^{*} F\right)(m)\right) \end{align*}
is a constant.
We first recall that a sheaf \(F\in {\mathsf{Coh}}(X)\) is Hilbert stable if for every proper nonzero subsheaf \(E \subset F\), we have an inequality of reduced Hilbert polynomials \(\tilde p_E(n) < \tilde p_F(n)\), and semistability is characterized by replacing \(<\) with \(\leq\). Since \(X\) is a curve and consequently \(\dim X = 1\), the Hilbert polynomial is linear: \begin{align*} p_F(n) \mathrel{\vcenter{:}}=\chi(X; F(n)) = c_0 n ^{\dim X} + c_1 = c_0 n + c_1 .\end{align*} Dividing by the leading coefficient, we have \(\tilde p_F(n) = n + {c_1\over c_0}\) and similarly \(\tilde p_E(n) = n + {d_1\over d_0}\) for some constants \(c_i\) depending on \(F\) and \(d_i\) depending on \(E\), and so \begin{align*} \tilde p_E(n) < \tilde p_F(n) \iff {d_1\over d_0} < {c_1 \over c_0} .\end{align*} Thus it suffices to show that \({d_1\over d_0} = \mu(E)\) and \({c_1\over c_0} = \mu(F)\). We’ll proceed by computing \(p_F(n)\) in order to identify what \(c_0, c_1\) are in general.
Noting that \(X\) may be singular and thus Riemann-Roch won’t apply directly, take the normalization \(\pi: \tilde X\to X\). Let \(X = \cup_i X_i\) be the decomposition of \(X\) into irreducible components and let \(\tilde X_i\) be their lifts in the normalization, which are all smooth curves with some genera \(g_i\). Writing \(\sum_i\) for sums over the irreducible components, we now have \begin{align*} p_F(n) &\mathrel{\vcenter{:}}=\chi(X; F(n)) \\ \\ &= \chi(\tilde X, (\pi^* F)(n) ) + c \qquad \text{ for some constant } c \\ \\ &= \sum_{i} \chi(\tilde X_i, { \left.{{ (\pi^* F)(n) }} \right|_{{\tilde X_i}} } ) + c \\ \\ &= \sum_{i} \qty{\deg { \left.{{(\pi^* F)(n)}} \right|_{{\tilde X_i}} } + r_i(1-g_i) } + c .\end{align*} As an aside, we can compute the degrees inside of the sum as follows, using that \(\pi^* {\mathcal{O}}_X(1)\) has degree \(\lambda_i\) on \(\tilde X_i\): \begin{align*} \deg { \left.{{(\pi^* F)(n)}} \right|_{{\tilde X_i}} } &= \deg \qty{ { \left.{{\pi^* F}} \right|_{{\tilde X_i}} } \otimes{ \left.{{\pi^* {\mathcal{O}}_{X}(n)}} \right|_{{\tilde X_i}} } } \\ &= \deg { \left.{{\pi^* F}} \right|_{{\tilde X_i}} } + n r_i \lambda_i .\end{align*} Continuing the above calculation, we have \begin{align*} p_F(n) &= \sum_{i} \qty{ \deg { \left.{{\pi^* F}} \right|_{{\tilde X_i}} } + nr_i \lambda_i + r_i(1-g_i) } + c \\ \\ &= n \qty{ \sum_{i} r_i \lambda_i} + \qty{ \sum_{i} \qty{ \deg { \left.{{\pi^* F}} \right|_{{\tilde X_i}} } + r_i(1-g_i) } + c} \\ \\ &= n \qty{ \sum_{i} r_i \lambda_i} + \qty{ \sum_{i} \chi(\tilde X_i; { \left.{{\pi^* F}} \right|_{{\tilde X_i}} } ) + c} \\ \\ &= n \qty{ \sum_{i} r_i \lambda_i} + \chi(X; F) ,\end{align*} where the last step is the first two equalities evaluated at \(n = 0\).
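Thus \(c_0 = \sum r_i \lambda_i, c_1 = \chi(F)\), and \({c_1\over c_0} = { \chi(F) \over \sum r_i \lambda_i} = \mu(F)\). The same computation applied to a subsheaf \(E\) identifies the constant term of its reduced Hilbert polynomial with \(\mu(E)\), which gives the claimed equivalence.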
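To see how the polarization enters, here is a small Python sketch of the slope formula \(\mu = \chi / \sum \lambda_i r_i\). All numerical values are hypothetical and chosen only for illustration: they are not claimed to come from an actual sheaf, and the helper assumes \(\sum \lambda_i r_i \neq 0\).

```python
from fractions import Fraction


def seshadri_slope(chi, ranks, lambdas):
    """Seshadri slope chi / sum(lambda_i * r_i); assumes the denominator is nonzero."""
    denom = sum(Fraction(l) * r for l, r in zip(lambdas, ranks))
    return Fraction(chi) / denom


# Two hypothetical polarizations on a curve with two components.
balanced = (Fraction(1, 2), Fraction(1, 2))
skewed = (Fraction(1, 4), Fraction(3, 4))

if __name__ == "__main__":
    # Hypothetical data: F has ranks (1, 1) and chi = 3,
    # a subsheaf E has ranks (1, 0) and chi = 1.
    for lam in (balanced, skewed):
        print(seshadri_slope(1, (1, 0), lam), "vs", seshadri_slope(3, (1, 1), lam))
    # If chi(F) = 0 then mu(F) = 0 and only the sign of chi(E) matters.
    for lam in (balanced, skewed):
        print(seshadri_slope(-1, (1, 0), lam) < seshadri_slope(0, (1, 1), lam))
```

In this toy example the comparison \(\mu(E) < \mu(F)\) holds for the balanced polarization (\(2 < 3\)) and fails for the skewed one (\(4 > 3\)). When \(\chi(F) = 0\), however, \(\mu(F) = 0\) and the sign of \(\mu(E)\) is the sign of \(\chi(E)\), since the denominator is positive, so the comparison no longer depends on \((\lambda_i)\).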
\(\dim {\mathcal{M}_g}= 3g-3\) for \(g\geq 2\).↩︎
Note that open sets in the Zariski topology are large.↩︎
Note that if \(X\) is rational, this parameterization is unique.↩︎
For infinite groups, we’ll again ask if \(R^G\) is finitely generated – this will be true when \(G\) is a reductive linear algebraic group.↩︎
In fact “affine” can be removed here and \(Z\) can be replaced by an arbitrary variety.↩︎
The main difference: linearly reductive is a condition after removing a hyperplane, and geometrically reductive involves replacing a hyperplane with a higher degree hypersurface.↩︎
This is also sometimes called an immersion, i.e. any set whose preimage is closed must itself be closed.↩︎
This construction is in EGA II.↩︎
This is not an issue for line bundles, since there are no nonzero subsheaves with different supports since every subsheaf is supported on the entire variety. This is also automatic if \(X\) is irreducible, otherwise a subsheaf could be supported on different components which could have different dimensions.↩︎