# Differential Calculus

## Big Theorems / Tools:

\begin{align*} \frac{\partial}{\partial x} \int_a^x f(t) dt = f(x) \\ \\ \end{align*}

\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(x, t) dt - \int_{a(x)}^{b(x)} \frac{\partial}{\partial x} f(x, t) dt &= f(x, t) \cdot \frac{\partial}{\partial x}(t) \bigg\rvert_{t=a(x)}^{t=b(x)} \\ \\ &= f(x, b(x))\cdot b'(x) - f(x, a(x))\cdot a'(x) \\ \\ \end{align*}

If $$f(x,t) = f(t)$$ doesn’t depend on $$x$$, then $${\frac{\partial f}{\partial x}\,} = 0$$ and the second integral vanishes:

\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(t) dt &= f(b(x))\cdot b'(x) - f(a(x))\cdot a'(x) \end{align*}

Note that you can recover the original FTC by taking \begin{align*} a(x) &= c \\ b(x) &= x \\ f(x,t) &= f(t) .\end{align*}

\begin{align*} \frac{\partial}{\partial x} \int_{1}^{x} f(x, t) dt = \int_{1}^{x} \frac{\partial}{\partial x} f(x, t) dt + f(x, x) \end{align*}
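The identity above can be sanity-checked numerically; a minimal sketch (added here, not part of the original notes) using the concrete choice $$f(x,t) = xt^2$$:

```python
# f(x, t) = x * t^2, so F(x) = ∫_1^x x t^2 dt = x (x^3 - 1) / 3.
def F(x):
    return x * (x**3 - 1) / 3

# Leibniz rule: F'(x) = ∫_1^x ∂f/∂x dt + f(x, x) = (x^3 - 1)/3 + x^3
def leibniz(x):
    return (x**3 - 1) / 3 + x * x**2

# Central finite difference approximation of F'(x)
def fd(x, h=1e-5):
    return (F(x + h) - F(x - h)) / (2 * h)

print(leibniz(2.0), fd(2.0))  # both ≈ 10.333...
```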

Mean Value Theorem (and the MVT for integrals):

\begin{align*} f \text{ differentiable on } I &\implies \exists p\in I: f(b) - f(a) = f'(p)(b-a) \\ f \in C^0(I) &\implies \exists p\in I: \int_a^b f(x)~dx = f(p)(b-a) .\end{align*}

The second statement follows by applying the first to an antiderivative of $$f$$.

If

• $$f(x)$$ and $$g(x)$$ are differentiable on $$I - {\{\text{pt}\}}$$, and

\begin{align*} \lim_{x\to {\{\text{pt}\}}} f(x) = \lim_{x\to {\{\text{pt}\}}} g(x) \in \left\{{0, \pm \infty}\right\}, && \forall x \in I, g'(x) \neq 0, && \lim_{x\to{\{\text{pt}\}}} \frac{ f'(x)}{\ g'(x)} \text{ exists}, \\ \end{align*}

Then it is necessarily the case that \begin{align*} \lim _ { x \rightarrow {\{\text{pt}\}}} \frac { f ( x ) } { g ( x ) } = \lim _ { x \rightarrow {\{\text{pt}\}}} \frac { f ^ { \prime } ( x ) } { g ^ { \prime } ( x ) }. \end{align*}

Note that this covers the following indeterminate forms (the latter ones only after rewriting into a quotient, as below): \begin{align*} \frac{0}{0}, \quad \frac{\infty}{\infty}, \quad 0 \cdot \infty, \quad 0^{0}, \quad \infty^{0}, \quad 1^{\infty}, \quad \infty-\infty .\end{align*}

For $$0\cdot \infty$$, can rewrite as $${0 \over {1\over \infty}} = {0\over 0},$$ or alternatively $${\infty \over {1\over 0}} = {\infty \over \infty}.$$

For $$1^\infty, \infty^0,$$ and $$0^0$$, set \begin{align*} L \mathrel{\vcenter{:}}=\lim f^g \implies \ln L = \lim g \ln(f) \end{align*} to recover a $$0\cdot\infty$$ form in each case (e.g. $$0^0$$ gives $$0 \cdot (-\infty)$$), then proceed as above.
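A quick numeric illustration of the log trick (a sketch added here, not from the original notes) on the $$1^\infty$$ form $$\lim_{x\to\infty}(1+1/x)^x = e$$:

```python
from math import log, exp, e

# 1^∞ form: L = lim (1 + 1/x)^x.  Taking logs: ln L = lim x·ln(1 + 1/x),
# an ∞·0 form, which tends to 1, so L = e.
def ln_L(x):
    return x * log(1 + 1/x)

approx = exp(ln_L(1e8))
print(approx, e)  # both ≈ 2.71828...
```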

\begin{align*} T(a, x) &= \sum _ { n = 0 } ^ { \infty } \frac { f ^ { ( n ) } ( a ) } { n ! } ( x - a ) ^ { n } \\ &= f ( a ) + f'(a)( x - a ) + \frac { 1 } { 2 }f ^ { \prime \prime } ( a ) ( x - a ) ^ { 2 } \\ & \quad \quad + \frac { 1} { 6 } f ^ { \prime \prime \prime } ( a ) ( x - a ) ^ { 3 } + \frac{1}{24}f^{(4)}(a)(x-a)^4 + ~\cdots \end{align*} There is a bound on the error (Taylor's theorem with the Lagrange remainder): \begin{align*} {\left\lvert {f(x) - T_k(a,x)} \right\rvert} \leq \max_{\xi \text{ between } a \text{ and } x} {\left\lvert {\frac{f^{(k+1)}(\xi)}{(k+1)!}} \right\rvert} \cdot {\left\lvert {x-a} \right\rvert}^{k+1} \end{align*} where $$T_k(a, x) = \sum _ { n = 0 } ^ { k } \frac { f ^ { ( n ) } ( a ) } { n ! } ( x - a ) ^ { n }$$ is the $$k$$th truncation.
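The truncation and its error bound can be checked for $$f = e^x$$ at $$a = 0$$; a minimal sketch (added here, not part of the original notes):

```python
from math import exp, factorial

# k-th Taylor truncation of e^x at a = 0: T_k(x) = Σ_{n≤k} x^n / n!
def T(k, x):
    return sum(x**n / factorial(n) for n in range(k + 1))

x, k = 1.0, 5
err = abs(exp(x) - T(k, x))
# Lagrange bound: |error| ≤ max_{0≤ξ≤1} e^ξ / (k+1)! · |x|^{k+1} = e/720
bound = exp(x) / factorial(k + 1) * abs(x)**(k + 1)
print(err, bound)  # error ≈ 0.0016, bound ≈ 0.0038
```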

Approximating change: $$\Delta y \approx f'(x) \Delta x$$

## Tools for finding limits

How to find $$\lim_{x\to a} f(x)$$, in order of difficulty:

• Plug in: if $$f$$ is continuous, $$\lim_{x\to a} f(x) = f(a)$$.

• Check for indeterminate forms and apply L’Hopital’s Rule.

• Algebraic rules

• Squeeze theorem

• Expand in Taylor series at $$a$$

• Monotonic + bounded

• One-sided limits: $$\lim_{x\to a^-} f(x) = \lim_{\varepsilon \to 0^+} f(a-\varepsilon)$$

• Limits at zero or infinity: \begin{align*} \lim_{x\to\infty} f(x) = \lim_{u\to 0^+} f\qty{\frac{1}{u}} \text{ and } \lim_{x\to 0^+} f(x) = \lim_{x\to\infty} f\qty{1 \over x} \end{align*}

• Also useful: if $$p(x) = p_nx^n + \cdots$$ and $$q(x) = q_mx^m + \cdots$$, \begin{align*} \lim_{x\to\infty} \frac{p(x)}{q(x)} = \begin{cases} 0 & \deg p < \deg q \\ \pm\infty & \deg p > \deg q \\ \frac{p_n}{q_m} & \deg p = \deg q \end{cases} \end{align*} where the sign of $$\pm\infty$$ matches the sign of $$p_n/q_m$$.

Be careful: limits may not exist! Example: $$\lim_{x\to 0} \frac{1}{x}$$ does not exist, since the one-sided limits are $$\pm\infty$$.

## Asymptotes

• Vertical asymptotes: at values $$x=p$$ where $$\lim_{x\to p} f(x) = \pm\infty$$
• Horizontal asymptotes: lines $$y=L$$ where $$L = \lim_{x\to\pm\infty} f(x)$$ exists and is finite
• Oblique asymptotes: for rational functions, divide; the polynomial part of the quotient yields the equation of the asymptote (i.e. look at the asymptotic order or “limiting behavior”).
• Concretely:

\begin{align*} f(x) = \frac{p(x)}{q(x)} = r(x) + \frac{s(x)}{t(x)} \sim r(x) \end{align*}

## Recurrences

• Limit of a recurrence: $$x_n = f(x_{n-1}, x_{n-2}, \cdots)$$
• If the limit exists, it is a solution to $$x = f(x)$$
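Both bullets can be seen at once on a concrete recurrence; a minimal sketch (added here, not part of the original notes) for $$x_n = \sqrt{2 + x_{n-1}}$$:

```python
from math import sqrt

# Iterate x_n = sqrt(2 + x_{n-1}).  If the limit exists, it solves
# x = sqrt(2 + x), i.e. x^2 - x - 2 = 0, whose positive root is x = 2.
x = 0.0
for _ in range(60):
    x = sqrt(2 + x)
print(x)  # ≈ 2.0
```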

## Derivatives

\begin{align*} {\frac{\partial }{\partial x}\,}(f\circ g) = (f' \circ g) \cdot g' \end{align*}

\begin{align*} {\frac{\partial }{\partial x}\,} f\cdot g =f'\cdot g + g' \cdot f \end{align*}

\begin{align*} {\frac{\partial }{\partial x}\,} \frac{f(x)}{g(x)} = \frac{f'g - g'f}{g^2} \end{align*}

Mnemonic: Low d-high minus high d-low

\begin{align*} {\frac{\partial f^{-1}}{\partial x}\,}(f(x_0)) = \left( {\frac{\partial f}{\partial x}\,} \right)^{-1}(x_0) = 1/f'(x_0) \end{align*}

## Implicit Differentiation

\begin{align*} d(f(x)) = f'(x)~dx, \quad d(f(y)) = f'(y)~dy \end{align*} - Often able to solve for $${\frac{\partial y}{\partial x}\,}$$ this way.

• Obtaining derivatives of inverse functions: if $$y = f^{-1}(x)$$ then write $$f(y) = x$$ and implicitly differentiate.

General series of steps: want to know some unknown rate $$y_t$$

• Lay out known relation that involves $$y$$
• Take derivative implicitly (say w.r.t $$t$$) to obtain a relation between $$y_t$$ and other stuff.
• Isolate $$y_t = \text{ known stuff }$$

• Setup: $$l, x_t$$ and $$x(t)$$ are known for a given $$t$$, want $$y_t$$. \begin{align*} x(t)^2 + y(t)^2 = l^2 \implies 2xx_t +2yy_t = 2ll_t = 0 \end{align*} (noting that $$l$$ is constant)
• So $$y_t = -\frac{x(t)}{y(t)}x_t$$
• $$x(t)$$ is known, so obtain $$y(t) = \sqrt{l^2 - x(t)^2}$$ and solve.
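The ladder setup above can be verified numerically; a minimal sketch (added here, not part of the original notes) with $$l = 5$$, $$x(t) = 3 + t$$, so $$x_t = 1$$ and $$y(0) = 4$$:

```python
from math import sqrt

l = 5.0                            # constant length
def x(t): return 3.0 + t           # known horizontal position, x_t = 1
def y(t): return sqrt(l**2 - x(t)**2)

# Implicit differentiation: 2 x x_t + 2 y y_t = 0  =>  y_t = -(x/y) x_t
y_t = -(x(0) / y(0)) * 1.0         # = -3/4

# Finite-difference check of dy/dt at t = 0
h = 1e-6
fd = (y(h) - y(-h)) / (2 * h)
print(y_t, fd)  # both ≈ -0.75
```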

# Integral Calculus

## Average Values

\begin{align*} \mu_f = \frac{1}{b-a}\int_a^b f(t) dt \end{align*}

Apply MVT to $$F(x)$$.

## Area Between Curves

Area in polar coordinates: \begin{align*} A = \int_{\theta_1}^{\theta_2} \frac{1}{2}r^2(\theta) ~d\theta \end{align*}

## Solids of Revolution

\begin{align*} \text{Disks} && V = \int \pi r(t)^2 ~dt \\ \text{Cylindrical shells} && V = \int 2\pi r(t)h(t) ~dt .\end{align*}

## Arc Lengths

\begin{align*} L &= \int ~ds && ds = \sqrt{dx^2 + dy^2} \\ &= \int_{x_0}^{x_1}\sqrt{1 + \qty{{\frac{\partial y}{\partial x}\,}}^2}~dx \\ &= \int_{y_0}^{y_1}\sqrt{\qty{{\frac{\partial x}{\partial y}\,}}^2 + 1}~dy \end{align*}

\begin{align*} SA = \int 2 \pi r(x) ~ds \end{align*}

## Center of Mass

Given a density $$\rho(\mathbf x)$$ of an object $$R$$, the $$i$$th coordinate $$\bar x_i$$ of the center of mass is given by \begin{align*} \bar x_i = \frac {\displaystyle\int_R x_i\rho(\mathbf x) ~d\mathbf x} {\displaystyle\int_R \rho(\mathbf x)~d\mathbf x} \end{align*}

## Big List of Integration Techniques

Given $$f(x)$$, we want to find an antiderivative $$F(x) = \int f$$ satisfying $$\frac{\partial}{\partial x}F(x) = f(x)$$

• Guess and check: look for a function that differentiates to $$f$$.
• $$u{\hbox{-}}$$ substitution
• More generally, any change of variables \begin{align*} x = g(u) \implies \int_a^b f(x)~dx = \int_{g^{-1}(a)}^{g^{-1}(b)} (f\circ g)(u) ~g'(u)~du \end{align*}

### Integration by Parts:

The standard form: \begin{align*} \int u dv = uv - \int v du \end{align*}

• A more general form for repeated applications: let $$v^{-1} = \int v$$, $$v^{-2} = \int\int v$$, etc. \begin{align*} \int_a^b uv &= uv^{-1}\bigg\rvert_a^b - \int_a^b u^{1} v^{-1}\\ &= uv^{-1} - u^1v^{-2}\bigg\rvert_a^b + \int_a^b u^2v^{-2} \\ &= uv^{-1} - u^1v^{-2} + u^2v^{-3}\bigg\rvert_a^b - \int_a^b u^3v^{-3} \\ &\quad\vdots \\ \implies \int_a^b uv &= \sum_{k=1}^n (-1)^{k-1} u^{k-1}v^{-k} \bigg\rvert_a^b + (-1)^n\int_a^b u^nv^{-n} \end{align*}
• Generally useful when one term’s $$n$$th derivative is a constant.

### Shoelace Method

• Note: you can choose $$u$$ or $$v$$ equal to 1! Useful if you know the derivative of the integrand.

| Derivatives | Integrals | Signs | Result |
| --- | --- | --- | --- |
| $$u$$ | $$v$$ | NA | NA |
| $$u'$$ | $$\int v$$ | $$+$$ | $$u\int v$$ |
| $$u''$$ | $$\int\int v$$ | $$-$$ | $$-u'\int\int v$$ |
| $$\vdots$$ | $$\vdots$$ | $$\vdots$$ | $$\vdots$$ |

Fill out until one column is zero (alternate signs). Get the result column by multiplying diagonally, then sum down the column.
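The tabular scheme above can be run by hand on $$\int_0^1 x^2 e^x~dx$$; a minimal numeric check (added here, not part of the original notes):

```python
from math import exp

# Tabular IBP for ∫_0^1 x^2 e^x dx:
#   derivatives: x^2, 2x, 2, 0    integrals: e^x, e^x, e^x    signs: +, -, +
# Diagonal products give the antiderivative x^2 e^x - 2x e^x + 2 e^x.
def F(x):
    return (x**2 - 2*x + 2) * exp(x)

closed = F(1) - F(0)               # = e - 2

# Compare against a midpoint Riemann sum
n = 100000
riemann = sum(((k + 0.5)/n)**2 * exp((k + 0.5)/n) for k in range(n)) / n
print(closed, riemann)  # both ≈ 0.71828...
```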

### Differentiating under the integral

\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(x, t) dt - \int_{a(x)}^{b(x)} \frac{\partial}{\partial x} f(x, t) dt &= f(x, t)\cdot \frac{\partial}{\partial x}(t) \bigg\rvert_{t=a(x)}^{t=b(x)} \\ &= f(x, b(x))~b'(x) - f(x, a(x))~a'(x) \end{align*}

Let $$F(x)$$ be an antiderivative and compute $$F'(x)$$ using the chain rule.

• LIPET: Log, Inverse trig, Polynomial, Exponential, Trig: generally let $$u$$ be whichever one comes first.

• The ridiculous trig sub: for any integrand containing only trig terms

• Transforms any such integrand into a rational function of $$x$$

• Let $$x = 2\tan^{-1}(u)$$ (i.e. $$u = \tan\frac{x}{2}$$), so $$dx = \frac{2}{u^2+1}~du$$, $$\sin x = \frac{2u}{1+u^2}$$, and $$\cos x = \frac{1-u^2}{1+u^2}$$; then \begin{align*} \int_a^b f(x)~dx = \int_{\tan\frac{a}{2}}^{\tan\frac{b}{2}} f\qty{2\tan^{-1}u}\frac{2}{1+u^2}~du \end{align*}

Example: \begin{align*} \int \frac{1}{\sin \theta}~d\theta = \int \frac{1+u^2}{2u}\cdot\frac{2}{1+u^2}~du = \int \frac{du}{u} = \ln {\left\lvert {\tan\frac{\theta}{2}} \right\rvert} + C \end{align*}

• Trigonometric Substitution \begin{align*} \sqrt{a^2-x^2} && \Rightarrow && x = a\sin(\theta) &&dx = a\cos(\theta)~d\theta \\ \sqrt{a^2+x^2} && \Rightarrow && x = a\tan(\theta) &&dx = a\sec^2(\theta)~d\theta \\ \sqrt{x^2 - a^2} && \Rightarrow && x = a \sec(\theta) &&dx = a\sec(\theta)\tan(\theta)~d\theta \end{align*}

### Trigonometric Substitution

• Trig Formulas (power reduction) \begin{align*} \sin^2(x) &= \frac{1}{2}(1-\cos 2x) \\ \cos^2(x) &= \frac{1}{2}(1+\cos 2x) \\ \sin(x)\cos(x) &= \frac{1}{2}\sin 2x \end{align*}

• Products of trig functions

• Setup: $$\int \sin^a(x) \cos^b(x) ~dx$$
• Both $$a,b$$ even: use power reduction, e.g. $$\sin(x)\cos(x) = \frac{1}{2} \sin(2x)$$
• $$a$$ odd: $$\sin^2 = 1-\cos^2,~u=\cos(x)$$
• $$b$$ odd: $$\cos^2 = 1-\sin^2,~u=\sin(x)$$
• Setup: $$\int \tan^a(x) \sec^b(x) ~dx$$
• $$a$$ odd: $$\tan^2 = \sec^2 - 1,~ u = \sec(x)$$
• $$b$$ even: $$\sec^2 = \tan^2 + 1,~ u = \tan(x)$$

Other small but useful facts: \begin{align*} \int_0^{2\pi} \sin \theta~d\theta = \int_0^{2\pi} \cos \theta~d\theta = 0 .\end{align*}

## Optimization

• Critical points: boundary points and wherever $$f'(x) = 0$$

• Second derivative test:

• $$f''(p) > 0 \implies p$$ is a min
• $$f''(p) < 0 \implies p$$ is a max
• Inflection points of $$h$$ occur where $$h''$$ changes sign, i.e. where $$h'$$ changes from increasing to decreasing or vice versa. (Note that this is not where $$h'$$ changes sign; that gives extrema.)

• Inverse function theorem: The slope of the inverse is reciprocal of the original slope

• If two curves are tangent at a point, both their values and their first derivatives agree there. Find the $$x$$ satisfying both conditions; it can be used in the original equation.

• Fundamental theorem of Calculus: if $$F' = f$$, then \begin{align*} \int_a^b f(x) dx = F(b) - F(a) .\end{align*}

• Min/maxing: either derivatives or Lagrange multipliers!

• Distance from origin to plane: equation of a plane \begin{align*} P: ax+by+cz=d .\end{align*}

• You can always just read off the normal vector $$\mathbf{n} = (a,b,c)$$, so the plane is $$\mathbf{n}\cdot\mathbf{x} = d$$.

• Since $$\lambda \mathbf{n}$$ is normal to $$P$$ for all $$\lambda$$, solve $$\mathbf{n}\cdot\lambda \mathbf{n} = d$$, which gives $$\lambda = \frac{d}{ {\left\lVert {\mathbf{n}} \right\rVert}^2}$$; the closest point is $$\lambda\mathbf{n}$$, at distance $${\left\lvert {\lambda} \right\rvert}\,{\left\lVert {\mathbf{n}} \right\rVert} = \frac{{\left\lvert {d} \right\rvert}}{{\left\lVert {\mathbf{n}} \right\rVert}}$$.

• A plane can be constructed from a point $$\mathbf{p}$$ and a normal $$\mathbf{n}$$ by the equation $$\mathbf{n}\cdot(\mathbf{x} - \mathbf{p}) = 0$$.

• In a sine wave $$f(x) = \sin(\omega x)$$, the period is given by $$2\pi/\omega$$. If $$\omega$$ is a positive integer, the wave makes exactly $$\omega$$ full oscillations in the interval $$[0, 2\pi]$$.

• The directional derivative is the gradient dotted against a unit vector in the direction of interest

• Related rates problems can often be solved via implicit differentiation of some constraint function

• The second derivative of a parametric curve $$(x(t), y(t))$$ is not simply a ratio of second derivatives: \begin{align*} \frac{dy}{dx} = \frac{y'(t)}{x'(t)} \implies \frac{d^2y}{dx^2} = \frac{ \frac{d}{dt}\qty{\frac{y'(t)}{x'(t)}} }{x'(t)} \end{align*}

• For the love of god, remember the FTC! \begin{align*} \frac{\partial}{\partial x} \int_0^x f(y) dy = f(x) \end{align*}

• Technique for asymptotic inequalities: WTS $$f < g$$, so show $$f(x_0) < g(x_0)$$ at a point and then show $$\forall x > x_0, f'(x) < g'(x)$$. Good for big-O style problems too.


# Vector Calculus

Notation: \begin{align*} \mathbf{v}, \mathbf{a}, \cdots && \text{vectors in }{\mathbb{R}}^n \\ \mathbf{R}, \mathbf{A}, \cdots && \text{matrices} \\ \mathbf{r}(t) && \text{A parameterized curve }\mathbf{r}: {\mathbb{R}}\to {\mathbb{R}}^n \\ \\ \widehat{\mathbf{v}} && {\mathbf{v} \over {\left\lVert {\mathbf{v}} \right\rVert}} .\end{align*}

## Plane Geometry

\begin{align*} \mathbf{v} = [x, y] \in {\mathbb{R}}^2 \implies m = \frac{y}{x} .\end{align*}

\begin{align*} \mathbf{R}_\theta = \left[ \begin{array} { l l } { \cos \theta } & { - \sin \theta } \\ { \sin \theta } & { \cos \theta } \end{array} \right] \implies \mathbf{R}_{\frac{\pi}{2}} = \left[ \begin{array} { l l } { 0 } & { - 1 } \\ { 1 } & { 0 } \end{array}\right] .\end{align*}

\begin{align*} \mathbf{R}_{\frac{\pi}{2}} \mathbf{x} \mathrel{\vcenter{:}}= \mathbf{R}_{\frac{\pi}{2}} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -y \\ x \end{bmatrix} \in \mathbf{\mathbb{R}}\mathbf{x}^\perp .\end{align*} Thus if a planar line is defined by the span of $${\left[ {x, y} \right]}$$ and a slope of $$m = y/x$$, a normal vector is given by the span of $${\left[ {-y, x} \right]}$$ of slope $$-{1 \over m} = -x/y$$.

Given $$\mathbf{v}$$, the rotated vector $$\mathbf{R}_{\frac{\pi}{2}} \mathbf{v}$$ is orthogonal to $$\mathbf{v}$$, so this can be used to obtain normals and other orthogonal vectors in the plane.

There is a direct way to come up with one orthogonal vector to any given vector: \begin{align*} \mathbf{v} = [a,b,c] \implies \mathbf{y} \mathrel{\vcenter{:}}= \begin{cases} [-(b+c), a, a] & c = 0 \text{ and } a = -b \\ [c,c, -(a+b)] & \text{else} \end{cases} \in {\mathbb{R}}\mathbf{v}^\perp \end{align*} (the case split avoids the degenerate output $$\mathbf{y} = \mathbf{0}$$).

## Projections

For a subspace given by a single vector $$\mathbf{a}$$: \begin{align*} \mathrm{proj}_\mathbf{a}( \textcolor{Aquamarine}{\textbf{x}} ) = {\left\langle {\textcolor{Aquamarine}{\textbf{x}} },~{\widehat{\mathbf{a}}} \right\rangle}\mathbf{\widehat{a}} \hspace{8em} \mathrm{proj}_{\mathbf{a}}^\perp(\textcolor{Aquamarine}{\textbf{x}}) = \textcolor{Aquamarine}{\textbf{x}} - \mathrm{proj}_\mathbf{a}(\textcolor{Aquamarine}{\textbf{x}}) = \textcolor{Aquamarine}{\textbf{x}} - {\left\langle {\textcolor{Aquamarine}{\textbf{x}}},~{\widehat{\mathbf{a}}} \right\rangle}\widehat{\mathbf{a}} \end{align*}

In general, for a subspace $$\operatorname{colspace}(A) = \left\{{\mathbf{a}_1, \cdots \mathbf{a}_n}\right\}$$,

\begin{align*} \mathrm{proj}_A(\textcolor{Aquamarine}{\textbf{x}}) = \sum_{i=1}^n {\left\langle {\textcolor{Aquamarine}{\textbf{x}} },~{\widehat{\mathbf{a}}_i} \right\rangle}\widehat{\mathbf{a}}_i = A(A^T A)^{-1}A^T\textcolor{Aquamarine}{\textbf{x}} \end{align*} where the sum formula requires the $$\mathbf{a}_i$$ to be orthonormal; the matrix formula works for any basis.
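The single-vector projection can be checked directly; a minimal pure-Python sketch (added here, not part of the original notes; the helper names `dot`, `scale`, `sub`, `proj` are my own):

```python
from math import sqrt

def dot(u, v): return sum(a*b for a, b in zip(u, v))
def scale(c, v): return [c*a for a in v]
def sub(u, v): return [a - b for a, b in zip(u, v)]

def proj(x, a):
    """Project x onto span{a}: <x, a_hat> a_hat."""
    a_hat = scale(1 / sqrt(dot(a, a)), a)
    return scale(dot(x, a_hat), a_hat)

x, a = [3.0, 4.0, 0.0], [1.0, 1.0, 1.0]
p = proj(x, a)
r = sub(x, p)                 # orthogonal part proj^perp
print(p, dot(r, a))           # residual is orthogonal to a, so dot ≈ 0
```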

## Lines

\begin{align*} \text{General Equation} && Ax + By + C = 0 \\ \\ \text{Parametric Equation} && \mathbf{r}(t) = t\mathbf{x} + \mathbf{b} .\end{align*}

Characterized by an equation in inner products: \begin{align*} \mathbf{y} \in L \iff {\left\langle {\mathbf{y} - \mathbf{b}},~{\mathbf{n}} \right\rangle} = 0 \end{align*} where $$\mathbf{n}$$ is any normal to $$L$$ and $$\mathbf{b}$$ is any point on $$L$$.

Given $$\mathbf{p}_0, \mathbf{p}_1$$, take $$\mathbf{x} = \mathbf{p}_1 - \mathbf{p}_0$$ and $$\mathbf{b} = \mathbf{p}_i$$ for either $$i$$: \begin{align*} \mathbf{r}(t) = t(\mathbf{p}_1 - \mathbf{p}_0) + \mathbf{p}_0 && = t\mathbf{p}_1 + (1-t) \mathbf{p}_0 .\end{align*}

If a line $$L$$ is given by \begin{align*} \mathbf{r}(t) = t {\left[ {x_1, x_2, x_3} \right]} + {\left[ {p_1, p_2, p_3} \right]} ,\end{align*} then \begin{align*} (x, y, z) \in L \iff \frac{x-p_1}{x_1} = \frac{y-p_{2}}{x_2} = \frac{z-p_{3}}{x_3} .\end{align*}

The symmetric equation of the line through $${\left[ {2,1,-3} \right]}$$ and $${\left[ {1,4,-3} \right]}$$: the direction is $${\left[ {1,4,-3} \right]} - {\left[ {2,1,-3} \right]} = {\left[ {-1,3,0} \right]}$$, so \begin{align*} \frac{x-2}{-1}=\frac{y-1}{3}, \qquad z = -3 \end{align*} (the $$z$$ component of the direction is $$0$$, so $$z$$ is constant).

### Tangent Lines / Planes

Key idea: just need a point and a normal vector, and the gradient is normal to level sets.

For any locus $$f(\mathbf{x}) = 0$$, we have \begin{align*} \mathbf{x} \in T_f(\mathbf{p}) \implies {\left\langle {\nabla f(\mathbf{p})},~{\mathbf{x}-\mathbf{p}} \right\rangle} = 0 .\end{align*}

### Normal Lines

Key idea: the gradient is normal.

To find a normal line, you just need a single point $$\mathbf{p}$$ and a normal vector $$\mathbf{n}$$; then \begin{align*} L = \left\{{\mathbf{x} \mathrel{\Big|}\mathbf{x} = \mathbf{p} + t\mathbf{n}}\right\} .\end{align*}

## Planes

\begin{align*} \text{General Equation} && A x + B y + C z + D = 0 \\ \\ \text{Parametric Equation} &&\mathbf{y}(t,s) = t\mathbf{x}_1 + s\mathbf{x}_2 + \mathbf{b} \\ \\ .\end{align*}

Characterized by an equation in inner products: \begin{align*} \mathbf{y} \in P \iff {\left\langle {\mathbf{y} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle} = 0 \end{align*}

Determined by a point $$\mathbf{p}_0$$ and a normal vector $$\mathbf{n}$$

Given $$\mathbf{v}_0, \mathbf{v}_1$$, set $$\mathbf{n} = \mathbf{v}_0 \times\mathbf{v}_1$$.

### Finding a Normal Vector

• Normal vector to a plane
• Can read normal off of equation: $$\mathbf{n} = [a,b,c]$$
• Computing $$D$$, writing the plane as $${\left\langle {\mathbf{x}},~{\mathbf{n}} \right\rangle} = D$$:
• $$D = {\left\langle {\mathbf{p}_0},~{\mathbf{n}} \right\rangle} = p_1n_1 + p_2n_2 + p_3n_3$$
• Useful trick: once you have $$\mathbf{n}$$, you can let $$\mathbf{p}_0$$ be any point in the plane (don’t necessarily need to use the one you started with, so pick any point that’s convenient to calculate)

### Distance from origin to plane

• Given by $$D/ {\left\lVert {\mathbf{n}} \right\rVert} = {\left\langle {\mathbf{p}_0},~{\mathbf{\widehat{n}}} \right\rangle}$$. Gives a signed distance.

### Distance from point to plane

• Given a point $$\mathbf{q}$$, the signed distance is $${\left\langle {\mathbf{q} - \mathbf{p}_0},~{\mathbf{\widehat{n}}} \right\rangle}$$ for any point $$\mathbf{p}_0$$ in the plane
• Finding vectors in the plane
• Given $$P = [A, B, C] \cdot [x, y, z] = 0$$ with $$A \neq 0$$, can take $${\left[ {-\frac{B}{A},1,0} \right]}, {\left[ {-\frac{C}{A},0,1} \right]}$$

## Curves

\begin{align*} \mathbf{r}(t) = [x(t), y(t), z(t)] .\end{align*}

### Tangent line to a curve

We have an equation for the tangent vector at each point: \begin{align*} \widehat{ \mathbf{T} }(t) = \widehat{\mathbf{r}'}(t) ,\end{align*} so we can write \begin{align*} \mathbf{L}_{T}(t) = \mathbf{r}(t_0) + t \widehat{ \mathbf{T}}(t_0) \mathrel{\vcenter{:}}=\mathbf{r}(t_0) + t \widehat{\mathbf{r}'}(t_0) .\end{align*}

### Normal line to a curve

• If $$\mathbf{r}$$ has constant speed (e.g. an arc length parameterization), then $$\mathbf{r}''(t) \in {\mathbb{R}}\mathbf{r}'(t)^\perp$$, so we have an equation for a normal vector at each point: \begin{align*} \widehat{\mathbf{N}}(t) = \widehat{\mathbf{r}''}(t) .\end{align*} Thus we can write \begin{align*} \mathbf{L}_N(t) = \mathbf{r}(t_0) + t \widehat{ \mathbf{N} }(t_0) = \mathbf{r}(t_0) + t \widehat{ \mathbf{r}''} (t_0) .\end{align*} (For a general parameterization, instead take $$\widehat{\mathbf{N}} = \widehat{\mathbf{T}'}$$.)

#### Special case: planar graphs of functions

Suppose $$y = f(x)$$. Set $$g(x, y) = f(x) - y$$, then \begin{align*} \nabla g = [f_x(x), -1]\implies m = -\frac{1}{f_x(x)} \end{align*}

## Minimal Distances

Fix a point $$\mathbf{p}$$. Key idea: find a subspace and project onto it.

Key equations: projection and orthogonal projection of $$\mathbf{b}$$ onto $$\mathbf{a}$$: \begin{align*} \mathrm{proj}_\mathbf{a}(\mathbf{b}) = {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle}\mathbf{\widehat{a}} \hspace{8em} \mathrm{proj}_{\mathbf{a}}^\perp(\mathbf{b}) = \mathbf{b} - \mathrm{proj}_\mathbf{a}(\mathbf{b}) = \mathbf{b} - {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle}\widehat{\mathbf{a}} \end{align*}

### Point to plane

• Given a point $$\mathbf{p}$$ and a plane $$S = \left\{{\mathbf{x} \in {\mathbb{R}}^3 \mathrel{\Big|}n_1x + n_2y + n_3z = d}\right\}$$, let $$\mathbf{n} = [n_1, n_2, n_3]$$, find any point $$\mathbf{q} \in S$$, and project $$\mathbf{q} -\mathbf{p}$$ onto $$S^\perp= \mathrm{Span}(\mathbf{n})$$ using

\begin{align*} d = {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{q} - \mathbf{p})} \right\rVert} = {\left\lVert {{\left\langle {\mathbf{q} - \mathbf{p}},~{\widehat{\mathbf{n}}} \right\rangle} \widehat{\mathbf{n}}} \right\rVert} = {\left\lvert {{\left\langle {\mathbf{q} - \mathbf{p}},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert} .\end{align*}

• Given just two vectors $$\mathbf{u}, \mathbf{v}$$: manufacture a normal vector $$\mathbf{n} = \mathbf{u} \times \mathbf{v}$$ and continue as above.
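A minimal numeric sketch of the point-to-plane recipe (added here, not part of the original notes), using the plane $$2x + y + 2z = 6$$ and the point $$\mathbf{p} = (1,1,1)$$:

```python
from math import sqrt

def dot(u, v): return sum(a*b for a, b in zip(u, v))

# Plane S: 2x + y + 2z = 6, so n = [2, 1, 2]; q = [3, 0, 0] lies on S.
n = [2.0, 1.0, 2.0]
q = [3.0, 0.0, 0.0]
p = [1.0, 1.0, 1.0]

# d = |<q - p, n_hat>|
n_norm = sqrt(dot(n, n))
dist = abs(dot([qi - pi for qi, pi in zip(q, p)], n)) / n_norm
print(dist)  # = 1/3
```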

#### Origin to plane

Special case: if $$\mathbf{p} = \mathbf{0}$$, \begin{align*} d = {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{q})} \right\rVert} = {\left\lVert {{\left\langle {\mathbf{q}},~{\widehat{\mathbf{n}}} \right\rangle} \widehat{\mathbf{n}}} \right\rVert} = {\left\lvert {{\left\langle {\mathbf{q}},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert} .\end{align*}

### Point to line

• Given a line $$L: \mathbf{x}(t) = t\mathbf{v}$$ for some fixed $$\mathbf{v}$$, use \begin{align*} d = {\left\lVert {\mathrm{proj}_\mathbf{v}^\perp(\mathbf{p})} \right\rVert} = {\left\lVert {\mathbf{p} - {\left\langle {\mathbf{p}},~{\widehat{\mathbf{v}}} \right\rangle}\widehat{\mathbf{v} }} \right\rVert} .\end{align*}

• Given a line $$L: \mathbf{x}(t) = \mathbf{w}_0 + t\mathbf{w}$$, let $$\mathbf{v} = \mathbf{x}(1) - \mathbf{x}(0)$$ and proceed as above.

### Line to line

Given $$\mathbf{r}_1(t) = \mathbf{p}_1 + t\mathbf{v}_1$$ and $$\mathbf{r}_2(t) = \mathbf{p}_2 + t\mathbf{v}_2$$, let $$d$$ be the desired distance.

• Let $$\widehat{ \mathbf{n}} = \widehat{\mathbf{v}_1 \times \mathbf{v}_2}$$, which is orthogonal to both lines.

• Then project the vector connecting the two fixed points $$\mathbf{p}_i$$ onto this subspace and take its norm: \begin{align*} d &= {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{p}_2 - \mathbf{p}_1)} \right\rVert} \\ &= {\left\lVert {{\left\langle {\mathbf{p}_2 -\mathbf{p}_1},~{\widehat{\mathbf{n}}} \right\rangle}\widehat{\mathbf{n}}} \right\rVert} \\ &= {\left\lvert {{\left\langle {\mathbf{p}_2 - \mathbf{p}_1},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert} \\ &= \frac{{\left\lvert {{\left\langle {\mathbf{p}_2 - \mathbf{p}_1},~{\mathbf{v}_1 \times\mathbf{v}_2} \right\rangle}} \right\rvert}}{{\left\lVert {\mathbf{v}_1 \times\mathbf{v}_2} \right\rVert}} .\end{align*}
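A minimal numeric sketch (added here, not part of the original notes) for two skew lines whose distance is clearly $$1$$:

```python
from math import sqrt

def dot(u, v): return sum(a*b for a, b in zip(u, v))
def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

# r1(t) = p1 + t v1 (the x-axis),  r2(s) = p2 + s v2 (a unit z-offset line)
p1, v1 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
p2, v2 = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]

n = cross(v1, v2)               # orthogonal to both lines
dist = abs(dot([b - a for a, b in zip(p1, p2)], n)) / sqrt(dot(n, n))
print(dist)  # = 1.0, the z-offset between the lines
```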

## Surfaces

\begin{align*} S = \left\{{(x,y,z) \mathrel{\Big|}f(x,y, z) = 0}\right\} \hspace{10em} z = f(x,y) \end{align*}

### Tangent plane to a surface

• Need a point $$\mathbf{p}$$ and a normal $$\mathbf{n}$$. By cases:
• $$f(x,y, z) = 0$$
• $$\nabla f$$ is a normal vector.
• Write the tangent plane equation $${\left\langle {\mathbf{n}},~{\mathbf{x} - \mathbf{p}_0} \right\rangle} = 0$$, done.
• $$z = g(x,y)$$:
• Let $$f(x, y, z) = g(x,y) - z$$, then $$\mathbf{p} \in S \iff \mathbf{p}$$ is in a level set of $$f$$.
• $$\nabla f$$ is normal to level sets (and thus the surface), so compute $$\nabla f = [g_x, g_y, -1]$$
• Proceed as in previous case

### Surfaces of revolution

• Given $$f(x_1 ,x_2) = 0$$, can be revolved around either the $$x_1$$ or $$x_2$$ axis.
• $$f(x,y)$$ around the $$x$$ axis yields $$f(x, \pm \sqrt{y^2 + z^2})=0$$
• $$f(x,y)$$ around the $$y$$ axis yields $$f(\pm\sqrt{x^2 + z^2}, y)=0$$
• Remaining cases proceed similarly - leave the axis variable alone, replace other variable with square root involving missing axis.
• Equations of lines tangent to an intersection of surfaces $$f(x,y,z) = g(x,y,z)$$:
• Find the normal vector of each surface and take their cross product; the result is tangent to the intersection curve, e.g. $$\mathbf{v} = \nabla f \times \nabla g$$, then \begin{align*} L = \left\{{\mathbf{x}\mathrel{\Big|}\mathbf{x} = \mathbf{p} + t \mathbf{v}}\right\} \end{align*}
• Level curves:
• Given a surface $$f(x,y,z) = 0$$, the level curves are obtained by looking at e.g. $$f(x,y,c) = 0$$.

# Multivariable Calculus

Given a function $$f: {\mathbb{R}}^n \to {\mathbb{R}}$$, let $$S_k \mathrel{\vcenter{:}}=\left\{{\mathbf{p}\in {\mathbb{R}}^n ~{\text{s.t.}}~f(\mathbf{p}) = k}\right\}$$ denote the level set for $$k\in {\mathbb{R}}$$. Then \begin{align*} \nabla f(\mathbf{p}) \in S_k^\perp .\end{align*}

## Notation

\begin{align*} \mathbf{v} &= [v_1, v_2, \cdots] && \text{a vector} \\ \\ \mathbf{e}_i &= [0, 0, \cdots, \overbrace{1}^{i \text{th term}}, \cdots, 0] && \text{the } i \text{th standard basis vector} \\ \\ \phi: {\mathbb{R}}^n &\to {\mathbb{R}} && \text{a functional on } {\mathbb{R}}^n\\ \phi(x_1, x_2, \cdots) &= \cdots && \\ \\ \mathbf{F}: {\mathbb{R}}^n &\to {\mathbb{R}}^n && \text{a multivariable function}\\ \mathbf{F}(x_1,x_2,\cdots) &= [\mathbf{F}_1(x_1, x_2, \cdots), \mathbf{F}_2(x_1, x_2, \cdots), \cdots, \mathbf{F}_n(x_1, x_2, \cdots)] \end{align*}

## Partial Derivatives

For a functional $$f:{\mathbb{R}}^n\to {\mathbb{R}}$$, the partial derivative of $$f$$ with respect to $$x_i$$ is \begin{align*} {\frac{\partial f}{\partial x_i}\,}(\mathbf p) \mathrel{\vcenter{:}}=\lim_{h\to 0}\frac{f(\mathbf p + h\mathbf e_i) - f(\mathbf p)}{h} \end{align*}

\begin{align*} f: {\mathbb{R}}^2 &\to {\mathbb{R}}\\ {\frac{\partial f}{\partial x}\,}(x_0,y_0) &= \lim_{h \to 0} \frac{f(x_0+h, y_0) - f(x_0,y_0)}{h} \end{align*}

## General Derivatives

A function $$f: {\mathbb{R}}^n \to {\mathbb{R}}^m$$ is differentiable at $$\mathbf{p}$$ iff there exists a linear transformation $$D_f: {\mathbb{R}}^n \to {\mathbb{R}}^m$$ such that \begin{align*} \lim _ { \mathbf x \rightarrow \mathbf{p} } \frac { \left\| f (\mathbf x ) - f (\mathbf{p} ) - D_f (\mathbf x - \mathbf{p} ) \right\| } { \| \mathbf x - \mathbf{p} \| } = 0 .\end{align*}

$$D_f$$ is the “best linear approximation” to $$f$$.

When $$f$$ is differentiable, $$D_f$$ can be given in coordinates by
\begin{align*} (D_f)_{ij} = {\frac{\partial f_i}{\partial x_j}\,} \end{align*}

This yields the Jacobian of $$f$$: \begin{align*} D_f(\mathbf{p}) = \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \nabla f_1(\mathbf{p}) & \nabla f_2(\mathbf{p}) & \cdots & \nabla f_m(\mathbf{p}) \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex} \end{bmatrix}^T = \left[ \begin{array} { c c c c } { \frac { \partial f _ { 1 } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { 1 } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \ldots } & { \frac { \partial f _ { 1 } } { \partial x _ { n } } ( \mathbf{p} ) } \\ { \frac { \partial f _ { 2 } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { 2 } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \dots } & { \frac { \partial f _ { 2 } } { \partial x _ { n } } ( \mathbf{p} ) } \\ { \vdots } & { \vdots } & { \ddots } & { \vdots } \\ { \frac { \partial f _ { m } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { m } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \cdots } & { \frac { \partial f _ { m } } { \partial x _ { n } } ( \mathbf{p} ) } \end{array} \right]. \end{align*}

This is equivalent to

• Taking the gradient of each component $$f_i$$ of $$f$$,
• Evaluating $$\nabla f_i$$ at $$\mathbf{p}$$,
• Forming a matrix using these as the columns, and
• Transposing the resulting matrix.

For a function $$f: {\mathbb{R}}^n \to {\mathbb{R}}$$, the Hessian is a generalization of the second derivative, and is given in coordinates by \begin{align*} (H_f)_{ij} = {\frac{\partial ^2f}{\partial x_i \partial x_j}\,} \end{align*}

Explicitly, we have \begin{align*} H_f(\mathbf{p}) = \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ {\frac{\partial \nabla f}{\partial x_1}\,}(\mathbf{p}) & {\frac{\partial \nabla f}{\partial x_2}\,}(\mathbf{p}) & \cdots & {\frac{\partial \nabla f}{\partial x_n}\,}(\mathbf{p}) \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex} \end{bmatrix}^T = \left[ \begin{array} { c c c } { \frac { \partial ^ { 2 } f } { \partial x _ { 1 } \partial x _ { 1 } } ( \mathbf { p } ) } & { \dots } & { \frac { \partial ^ { 2 } f } { \partial x _ { 1 } \partial x _ { n } } ( \mathbf { p } ) } \\ { \vdots } & { \ddots } & { \vdots } \\ { \frac { \partial ^ { 2 } f } { \partial x _ { n } \partial x _ { 1 } } ( \mathbf { p } ) } & { \cdots } & { \frac { \partial ^ { 2 } f } { \partial x _ { n } \partial x _ { n } } ( \mathbf { p } ) } \end{array} \right]. \end{align*}

Mnemonic: make matrix with $$\nabla f$$ as the columns, and then differentiate variables left to right.

## The Chain Rule

Write out a tree of dependent variables; then, for each path from the output variable down to the input variable, multiply the partial derivatives along the path, and sum over all such paths.

Let subscripts denote which variables are held constant, then \begin{align*} \left({\frac{\partial z}{\partial x}\,}\right)_y &= \left({\frac{\partial z}{\partial x}\,}\right)_{u,y,v} \\ & + \left({\frac{\partial z}{\partial v}\,}\right)_{x,y,u} \left({\frac{\partial v}{\partial x}\,}\right)_y \\ & + \left({\frac{\partial z}{\partial u}\,}\right)_{x,y,v} \left({\frac{\partial u}{\partial x}\,}\right)_{v,y} \\ & + \left({\frac{\partial z}{\partial u}\,}\right)_{x,y,v} \left({\frac{\partial u}{\partial v}\,}\right)_{x,y} \left({\frac{\partial v}{\partial x}\,}\right)_y \end{align*}

## Approximation

Let $$z = f(x,y)$$, then to approximate near $$\mathbf{p}_0 = {\left[ {x_0, y_0} \right]}$$, \begin{align*} f(\textcolor{Aquamarine}{\textbf{x}}) &\approx f(\mathbf{p}_0) + \nabla f(\mathbf{p}_0) \cdot (\textcolor{Aquamarine}{\textbf{x}} - \mathbf{p}_0) \\ \implies f(x,y) &\approx f(\mathbf{p}_0) + f_x(\mathbf{p}_0)(x-x_0) + f_y(\mathbf{p}_0)(y-y_0) \\ .\end{align*}

## Optimization

### Classifying Critical Points

Critical points of $$f$$ given by points $$\mathbf{p}$$ such that the derivative vanishes: \begin{align*} \operatorname{crit}(f) = \left\{{\mathbf{p}\in {\mathbb{R}}^n ~{\text{s.t.}}~D_f({\mathbf p}) = 0}\right\} \end{align*}

1. Compute \begin{align*} {\left\lvert {H_f(\mathbf p)} \right\rvert} \mathrel{\vcenter{:}}=\left| \begin{array} { l l } { f _ { x x } } & { f _ { x y } } \\ { f _ { y x } } & { f _ { y y } } \end{array} \right| ({ \mathbf p }) \end{align*} 2. Check by cases:

• $${\left\lvert {H(\mathbf p)} \right\rvert} = 0$$: No conclusion
• $${\left\lvert {H(\mathbf p)} \right\rvert} < 0$$: Saddle point
• $${\left\lvert {H(\mathbf p)} \right\rvert} > 0$$:
• $$f_{xx}(\mathbf p) > 0 \implies$$ local min
• $$f_{xx}(\mathbf p) < 0 \implies$$ local max
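The case analysis above can be checked with finite differences; a minimal sketch (added here, not part of the original notes; `hess_det` is my own helper name):

```python
# Numeric 2x2 Hessian determinant test at a critical point.
def hess_det(f, x, y, h=1e-4):
    fxx = (f(x + h, y) - 2*f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2*f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx * fyy - fxy**2, fxx

saddle, _ = hess_det(lambda x, y: x**2 - y**2, 0, 0)   # det < 0  => saddle
minim, fxx = hess_det(lambda x, y: x**2 + y**2, 0, 0)  # det > 0, fxx > 0 => local min
print(saddle, minim, fxx)
```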

What’s really going on?

• Eigenvalues have same sign $$\iff$$ positive definite or negative definite

• Positive definite $$\implies$$ convex $$\implies$$ local min

• Negative definite $$\implies$$ concave $$\implies$$ local max

• Extrema occur on boundaries, so parameterize each boundary to obtain a function in one less variable and apply standard optimization techniques to yield critical points. Test all critical points to find extrema.
• If possible, use the constraint to reduce the equation to one dimension and optimize as in the single-variable case.

### Lagrange Multipliers

The setup: \begin{align*} \text{Optimize } f(\mathbf x) &\quad \text{subject to } g(\mathbf x) = c \\ \implies \nabla f &= \lambda \nabla g \end{align*} 1. Use this formula to obtain a system of equations in the components of $$\mathbf{x}$$ and the parameter $$\lambda$$.

2. Use $$\lambda$$ to obtain a relation involving only components of $$\mathbf{x}$$.

3. Substitute relations back into constraint to obtain a collection of critical points.

4. Evaluate $$f$$ at critical points to find max/min.
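The steps above on a tiny worked example (a sketch added here, not part of the original notes): maximize $$f(x,y) = xy$$ subject to $$x + y = 1$$.

```python
# Maximize f(x, y) = x*y subject to g(x, y) = x + y = 1.
# Step 1: ∇f = (y, x), ∇g = (1, 1), so ∇f = λ∇g gives y = λ and x = λ.
# Step 2: hence x = y.  Step 3: the constraint gives x = y = 1/2.
x_star = y_star = 0.5
f_star = x_star * y_star          # Step 4: f = 1/4 at the critical point

# Brute-force check along the constraint (x free, y = 1 - x)
best = max(x * (1 - x) for x in [i / 10000 for i in range(10001)])
print(f_star, best)  # both 0.25
```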

## Change of Variables

For a scalar field $$f: {\mathbb{R}}^n \to {\mathbb{R}}$$, an injective differentiable $$g: {\mathbb{R}}^n \to {\mathbb{R}}^n$$, and a region $$R$$, \begin{align*} \int _ { g ( R ) } f ( \mathbf { x } ) ~d V = \int _ { R } (f \circ g) ( \mathbf { x } ) \cdot {\left\lvert {D_g ( \mathbf { x })} \right\rvert} ~d V \end{align*} where $${\left\lvert {D_g} \right\rvert}$$ denotes the absolute value of the Jacobian determinant.
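A standard sanity check (a minimal sketch added here, not part of the original notes): the area of the unit disk via polar coordinates $$g(r,\theta) = (r\cos\theta, r\sin\theta)$$, whose Jacobian determinant is $$r$$.

```python
from math import pi

# Area of the unit disk: ∫∫_disk 1 dA = ∫_0^{2π} ∫_0^1 r dr dθ = π,
# where the extra factor r is |D_g| for the polar change of variables.
n = 400
dr, dth = 1.0 / n, 2 * pi / n
area = sum((i + 0.5) * dr * dr * dth    # midpoint rule in r, uniform in θ
           for i in range(n) for _ in range(n))
print(area, pi)  # ≈ 3.14159...
```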

# Vector Calculus

## Notation

$$R$$ is a region, $$S$$ is a surface, $$V$$ is a solid.

\begin{align*} \oint _ { \partial S } \mathbf { F } \cdot d \mathbf { r } = \oint _ { \partial S } [\mathbf{F}_1, \mathbf{F}_2, \mathbf{F}_3] \cdot [dx, dy, dz] = \oint_{{\partial}S} \mathbf{F}_1dx + \mathbf{F}_2dy + \mathbf{F}_3dz \end{align*}

The main vector operators \begin{align*} \nabla: ({\mathbb{R}}^n \to {\mathbb{R}}) &\to ({\mathbb{R}}^n \to {\mathbb{R}}^n) \\ \phi &\mapsto \nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \phi}{\partial x_i} ~\mathbf{e}_i \\ \\ \text{} \mathrm{div}(\mathbf{F}): ({\mathbb{R}}^n \to {\mathbb{R}}^n) &\to ({\mathbb{R}}^n \to {\mathbb{R}}) \\ \mathbf{F} &\mapsto \nabla \cdot \mathbf{F} \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \mathbf{F}_i}{\partial x_i} \\ \\ \text{} \mathrm{curl}(\mathbf{F}): ({\mathbb{R}}^3 \to {\mathbb{R}}^3) &\to ({\mathbb{R}}^3 \to {\mathbb{R}}^3) \\ \mathbf{F} &\mapsto \nabla \times\mathbf{F} \\ \\ \text{} \end{align*} Some terminology: \begin{align*} \text{Scalar Field} && \phi:&~ X \to {\mathbb{R}}\\ \text{Vector Field} && \mathbf{F}:&~ X\to {\mathbb{R}}^n\\ \text{Gradient Field} && \mathbf{F}:&~ X \to {\mathbb{R}}^n \mathrel{\Big|}\exists \phi: X\to {\mathbb{R}}\mathrel{\Big|}\nabla \phi = F \end{align*}

• The Gradient: lifts scalar fields on $${\mathbb{R}}^n$$ to vector fields on $${\mathbb{R}}^n$$
• Divergence: drops vector fields on $${\mathbb{R}}^n$$ to scalar fields on $${\mathbb{R}}^n$$
• Curl: takes vector fields on $${\mathbb{R}}^3$$ to vector fields on $${\mathbb{R}}^3$$

\begin{align*} \mathbf x \cdot \mathbf y = {\left\langle {\mathbf x},~{\mathbf y} \right\rangle} = \sum_{i=1}^n {x_i y_i} = x_1y_1 + x_2y_2 + \cdots && \text{inner/dot product} \\ {\left\lVert {\mathbf x} \right\rVert} = \sqrt{{\left\langle {\mathbf x},~{\mathbf x} \right\rangle}} = \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x_1^2 + x_2^2 + \cdots} && \text{norm} \\ \mathbf a \times\mathbf b = \mathbf{\widehat{n}} {\left\lVert {\mathbf a} \right\rVert}{\left\lVert {\mathbf b} \right\rVert}\sin\theta_{\mathbf a,\mathbf b} = \left| \begin{array}{ccc} \mathbf{\widehat{x}} & \mathbf{\widehat{y}} & \mathbf{\widehat{z}} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array}\right| && \text{cross product} \\ \\ D_\mathbf{u}(\phi) = \nabla \phi \cdot \mathbf{\widehat{u}} && \text{directional derivative} \\ \\ \nabla \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial}{\partial x_i} \mathbf{e}_i = \left[\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \cdots, \frac{\partial}{\partial x_n}\right] && \text{del operator} \\ \\ \nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \phi}{\partial x_i} ~\mathbf{e}_i = {\left[ { \frac{\partial \phi}{\partial x_1}, \frac{\partial \phi}{\partial x_2}, \cdots, \frac{\partial \phi}{\partial x_n}} \right]} && \text{gradient} \\ \\ \Delta \phi \mathrel{\vcenter{:}}=\nabla\cdot\nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial^2 \phi}{\partial x_i^2} = \frac{\partial^2 \phi}{\partial x_1^2} + \frac{\partial^2 \phi}{\partial x_2^2} + \cdots + \frac{\partial^2 \phi}{\partial x_n^2} && \text{Laplacian} \\ \\ \nabla \cdot \mathbf{F} \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \mathbf{F}_i}{\partial x_i} = \frac{\partial \mathbf{F}_1}{\partial x_1} + \frac{\partial \mathbf{F}_2}{\partial x_2} + \cdots + \frac{\partial \mathbf{F}_n}{\partial x_n} && \text{divergence} \\ \\ \nabla \times \mathbf { F } = \left| \begin{array} { c c c } { \mathbf { e }_1 } & { \mathbf { e }_2 } & { \mathbf { e }_3 } \\ { \frac { \partial } { \partial x } } & { \frac { \partial } { \partial y } } & { \frac { \partial } { \partial z } } \\ { \mathbf{F} _ { 1 } } & { \mathbf{F} _ { 2 } } & { \mathbf{F} _ { 3 } } \end{array} \right| = [\mathbf{F}_{3y} - \mathbf{F}_{2z}, \mathbf{F}_{1z}- \mathbf{F}_{3x}, \mathbf{F}_{2x} -\mathbf{F}_{1y}] && \text{curl} \\ \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot d \mathbf { S } = \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot \mathbf { n } ~dS && \text{surface integral} \end{align*}

## Big Theorems

### Stokes’ and Consequences

\begin{align*} \oint _ { \partial S } \mathbf { F } \cdot d \mathbf { r } = \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot d \mathbf { S } .\end{align*}

Note that if $$S$$ is a closed surface, so $${\partial}S = \emptyset$$, this integral vanishes.

\begin{align*} \oint _ { {\partial}R } ( L ~d x + M ~d y ) = \iint _ { R } \left( \frac { \partial M } { \partial x } - \frac { \partial L } { \partial y } \right) d x d y .\end{align*}

Recovering Green’s Theorem from Stokes’ Theorem:

Let $$\mathbf{F} = [L, M, 0]$$, then $$\nabla\times\mathbf{F} = [0, 0, \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}]$$
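As a numerical sanity check (my own example, not from the notes): for $$\mathbf F = [-y, x]$$ on the unit disk, $$M_x - L_y = 2$$, so both sides of Green's theorem equal $$2\pi$$:

```python
import numpy as np

t = np.linspace(0.0, 2 * np.pi, 10_001)
x, y = np.cos(t), np.sin(t)     # ∂R: unit circle, positively oriented
L, M = -y, x                    # F = [L, M] = [-y, x]

# left side: ∮ L dx + M dy, with dx = x'(t) dt and dy = y'(t) dt,
# via the trapezoid rule
integrand = L * (-np.sin(t)) + M * np.cos(t)
line_integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))

# right side: ∬ (M_x - L_y) dA = 2 · (area of the unit disk) = 2π
double_integral = 2 * np.pi

print(line_integral, double_integral)
```

Here the integrand is identically $$\sin^2 t + \cos^2 t = 1$$, so the line integral is exactly the circumference $$2\pi$$.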

\begin{align*} \iint_ { \partial V } \mathbf { F } \cdot d \mathbf { S } = \iiint _ { V } ( \nabla \cdot \mathbf { F } ) ~d V .\end{align*}

• $$\nabla\times(\nabla\phi) = 0$$
• $$\nabla\cdot(\nabla\times\mathbf{F}) = 0$$

### Directional Derivatives

\begin{align*} D_{\mathbf{v}} f(\mathbf{p}) \mathrel{\vcenter{:}}={\frac{d}{dt}\,} f(\mathbf{p} + t\mathbf{v}) \Big|_{t=0} .\end{align*}

Note that the directional derivative uses a normalized direction vector!

Suppose $$f:{\mathbb{R}}^n\to {\mathbb{R}}$$ and $$\mathbf{v}\in {\mathbb{R}}^n$$. Then \begin{align*} D_{\mathbf{v}}f(\mathbf{p}) = {\left\langle {\nabla f(\mathbf{p})},~{\mathbf{v}} \right\rangle} .\end{align*}

We first use the fact that we can find $$L$$, the best linear approximation to $$f$$: \begin{align*} L(\mathbf{x}) &\mathrel{\vcenter{:}}= f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{x} - \mathbf{p}) \\ \\ D_{\mathbf{v}}f(\mathbf{p}) &= D_{\mathbf{v}} L(\mathbf{p}) \\ &= \lim_{t\to 0} {L(\mathbf{p} + t\mathbf{v}) - L(\mathbf{p}) \over t}\\ &= \lim_{t\to 0} { f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{p} + t\mathbf{v} - \mathbf{p}) -\qty{f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{p} - \mathbf{p})} \over t }\\ &= \lim_{t\to 0} { D_f(\mathbf{p})(t\mathbf{v}) \over t} \\ &= D_f(\mathbf{p})\mathbf{v} \\ &\mathrel{\vcenter{:}}=\nabla f(\mathbf{p}) \cdot \mathbf{v} .\end{align*}
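A quick numerical check of $$D_{\mathbf v} f(\mathbf p) = {\left\langle {\nabla f(\mathbf{p})},~{\mathbf{v}} \right\rangle}$$, using the made-up field $$f(x,y) = x^2 + 3y$$:

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 + 3*y                 # hypothetical scalar field

def grad_f(p):
    return np.array([2*p[0], 3.0])    # ∇f = [2x, 3], computed by hand

p = np.array([1.0, 2.0])
v = np.array([1.0, 1.0])

t = 1e-6
limit_def = (f(p + t*v) - f(p)) / t   # difference quotient for d/dt f(p + tv)
formula   = grad_f(p) @ v             # ⟨∇f(p), v⟩ = 2 + 3 = 5

print(limit_def, formula)
```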

## Computing Integrals

### Changing Coordinates

#### Polar and Cylindrical Coordinates

\begin{align*} x = r\cos\theta \\ y = r\sin\theta \\ dV \mapsto r \quad dr~d\theta \end{align*}

#### Spherical Coordinates

\begin{align*} x = r\cos\theta = \rho\sin\phi\cos\theta \\ y = r\sin\theta = \rho\sin\phi\sin\theta \\ z = \rho\cos\phi \\ dV \mapsto \rho^2 \sin\phi \quad d\rho~d\phi~d\theta \end{align*}

### Line Integrals

#### Curves

• Parametrize the path $$C$$ as $$\left\{{\mathbf{r}(t): t\in[a,b]}\right\}$$, then

\begin{align*} \int_C f ~ds &\mathrel{\vcenter{:}}=\int_a^b (f\circ \mathbf{r})(t) ~{\left\lVert {\mathbf{r}'(t)} \right\rVert}~dt \\ &= \int_a^b f(x(t), y(t), z(t)) \sqrt{x_t^2 + y_t^2 + z_t^2} ~dt \end{align*}

#### Vector Fields

• If exact: \begin{align*} {\frac{\partial }{\partial y}\,} \mathbf{F_1} = {\frac{\partial }{\partial x}\,} \mathbf{F_2} \implies \int \mathbf{F_1} ~dx + \mathbf{F_2} ~dy = \phi(\mathbf{p_1}) - \phi(\mathbf{p_0}) \end{align*}

The function $$\phi$$ can be found using the same method from ODEs.

• Parametrize the path $$C$$ as $$\left\{{\mathbf{r}(t): t\in[a,b]}\right\}$$, then \begin{align*} \int_C \mathbf F \cdot d\mathbf r & \mathrel{\vcenter{:}}=\int_a^b (\mathbf F \circ \mathbf r)(t) \cdot \mathbf r'(t) ~dt \\ &= \int_a^b [\mathbf F_1(x(t), y(t), \cdots), \mathbf F_2(x(t), y(t), \cdots)]\cdot[x_t, y_t, \cdots] ~dt \\ &= \int_a^b \mathbf F_1(x(t), y(t) \cdots)x_t + \mathbf F_2(x(t), y(t), \cdots)y_t + \cdots ~dt \end{align*}

• Equivalently written:

\begin{align*} \int_a^b \mathbf F_1 ~dx + \mathbf F_2 ~dy + \cdots \mathrel{\vcenter{:}}=\int_C \mathbf F \cdot d\mathbf r \end{align*} in which case $$[dx, dy, \cdots] \mathrel{\vcenter{:}}=[x_t, y_t, \cdots] = \mathbf r'(t)$$.

• Remember to substitute dx back into the integrand!!

### Flux

\begin{align*} \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S \mathbf{F}\cdot \mathbf{\widehat{n}} ~dS .\end{align*}

### Area

Given a region $$R$$ with boundary curve $${\partial}R$$, \begin{align*} A(R) = \oint _ { {\partial}R } x ~d y = - \oint _ { {\partial}R } y ~d x = \frac { 1 } { 2 } \oint _ { {\partial}R } - y ~d x + x ~d y . \end{align*}

These all compute the area by Green's theorem: \begin{align*} \oint_{{\partial}R} x ~dy = - \oint_{{\partial}R} y ~dx \\ = \frac{1}{2} \oint_{{\partial}R} -y~dx + x~dy = \frac{1}{2} \iint_R 1 - (-1) ~dA =\iint_R 1 ~dA \end{align*}

### Surface Integrals

• For a parametrization $$\mathbf r(s,t): U \to S$$ of a surface $$S$$ and any function $$f: {\mathbb{R}}^n \to {\mathbb{R}}$$, \begin{align*} \iint _ { S } f ~dA = \iint _ { U } ( f \circ \mathbf r) ( s , t )~{\left\lVert {\mathbf n} \right\rVert} ~ds~dt \end{align*}
• Can obtain a normal vector $$\mathbf n = \mathbf r _ { s } \times \mathbf r _ { t }$$

## Other Results

$$\nabla \cdot \mathbf{F} = 0 \not \implies \exists \mathbf{G}:~ \mathbf{F} = \nabla\times \mathbf{G}$$. A counterexample:

\begin{align*} \mathbf{F}(x,y,z) =\frac{1}{(x^2+y^2+z^2)^{3/2}}[x, y, z]~,\quad S = S^2 \subset {\mathbb{R}}^3 \\ \implies \nabla \cdot \mathbf{F} = 0 \text{ away from } \mathbf{0}, \text{ but } \iint_{S^2}\mathbf{F}\cdot d\mathbf{S} = 4\pi \neq 0 \end{align*} Where by Stokes' theorem, \begin{align*} \mathbf{F} = \nabla\times\mathbf{G}\implies \iint_{S^2} \mathbf{F}\cdot d\mathbf{S} &= \iint_{S^2} (\nabla\times\mathbf{G})\cdot d\mathbf{S} \\ \\ &= \oint_{{\partial}S^2}\mathbf{G}\cdot d\mathbf{r} && \text{by Stokes}\\ &= 0 \end{align*} since $${\partial}S^2 = \emptyset$$.

Sufficient condition: if $$\mathbf{F}$$ is everywhere $$C^1$$, \begin{align*} \exists \mathbf{G}:~ \mathbf{F} = \nabla \times\mathbf{G} \iff \iint_S \mathbf{F}\cdot d\mathbf{S} = 0 \text{ for all closed surfaces }S .\end{align*}

# Linear Algebra

The underlying field will be assumed to be $${\mathbb{R}}$$ for this section.

## Notation

\begin{align*} \operatorname{Mat}(m, n) && \text{the space of all } m\times n\text{ matrices} \\ T && \text{a linear map } {\mathbb{R}}^n \to {\mathbb{R}}^m \\ A\in \operatorname{Mat}(m, n)&& \text{an } m\times n \text{ matrix representing }T \\ A^t\in \operatorname{Mat}(n, m) && \text{an } n\times m \text{ transposed matrix} \\ \mathbf{a} && \text{an } n\times 1 \text{ column vector} \\ \mathbf{a}^t && \text{a } 1\times n \text{ row vector} \\ A = {\left[ {\mathbf{a}_1, \cdots, \mathbf{a}_n} \right]} && \text{a matrix formed with } \mathbf{a}_i \text{ as the columns} \\ V, W && \text{vector spaces} \\ |V|, \dim(W) && \text{dimensions of vector spaces} \\ \det(A) && \text{the determinant of }A \\ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& \mathbf{b} \end{bmatrix} \mathrel{\vcenter{:}}={\left[ {\mathbf{a}_1, \mathbf{a}_2, \cdots \mathbf{a}_n, \mathbf{b}} \right]} && \text{augmented matrices} \\ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& B \end{bmatrix} \mathrel{\vcenter{:}}={\left[ {\mathbf{a}_1, \cdots \mathbf{a}_n, \mathbf{b}_1, \cdots, \mathbf{b}_m} \right]} && \text{block matrices}\\ \operatorname{Spec}(A) && \text{the multiset of eigenvalues of } A \\ A\mathbf{x} = \mathbf{b} && \text{a system of linear equations} \\ r\mathrel{\vcenter{:}}=\operatorname{rank}(A) && \text{the rank of }A\\ r_b = \operatorname{rank}\qty{ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& \mathbf{b} \\ \end{bmatrix} } && \text{the rank of }A\text{ augmented by }\mathbf{b} .\end{align*}

## Big Theorems

\begin{align*} {\left\lvert {\ker(A)} \right\rvert} + {\left\lvert {\operatorname{im}(A)} \right\rvert} = {\left\lvert {\operatorname{dom}(A)} \right\rvert} ,\end{align*} where $$\operatorname{nullity}(A) = {\left\lvert {\ker(A)} \right\rvert}, \operatorname{rank}(A) = {\left\lvert {\operatorname{im}(A)} \right\rvert},$$ and $${\left\lvert {\operatorname{dom}(A)} \right\rvert} = n$$ is the number of columns in the corresponding matrix.

Generalization: the following sequence is always exact: \begin{align*} 0 \to \ker(A) \hookrightarrow \operatorname{dom}(A) \xrightarrow{A} \operatorname{im}(A) \to 0 .\end{align*} Moreover, it always splits, so $$\operatorname{dom}A \cong \ker A \oplus \operatorname{im}A$$ and thus $${\left\lvert {\operatorname{dom}(A)} \right\rvert} = {\left\lvert {\ker(A)} \right\rvert} + {\left\lvert {\operatorname{im}(A)} \right\rvert}$$.

We also have \begin{align*} \dim(\operatorname{rowspace}(A)) = \dim(\operatorname{colspace}(A)) = \operatorname{rank}(A) .\end{align*}
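Numerically, the rank and nullity can be read off via `numpy` (the matrix below is a made-up example):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],    # = 2 × row 1, so the rows are dependent
              [0., 1., 1.]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank     # rank-nullity: dim ker A = n - rank A

print(rank, nullity)   # 2 1
```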

## Big List of Equivalent Properties

Let $$A$$ be an $$n\times n$$ matrix. TFAE:

• $$A$$ is invertible and has a unique inverse $$A^{-1}$$
• $$A^T$$ is invertible
• $$\det(A) \neq 0$$
• The linear system $$A\mathbf{x} = \mathbf{b}$$ has a unique solution for every $$\mathbf{b} \in {\mathbb{R}}^n$$
• The homogeneous system $$A\mathbf{x} = 0$$ has only the trivial solution $$\mathbf{x} = 0$$
• $$\operatorname{rank}(A) = n$$, i.e. $$A$$ is full rank
• $$\mathrm{nullity}(A) \mathrel{\vcenter{:}}=\dim\mathrm{nullspace}(A) = 0$$
• $$A = \prod_{i=1}^k E_i$$ for some finite $$k$$, where each $$E_i$$ is an elementary matrix
• $$A$$ is row-equivalent to the identity matrix $$I_n$$
• $$A$$ has exactly $$n$$ pivots
• The columns of $$A$$ are a basis for $${\mathbb{R}}^n$$, i.e. $$\operatorname{colspace}(A) = {\mathbb{R}}^n$$
• The rows of $$A$$ are a basis for $${\mathbb{R}}^n$$, i.e. $$\mathrm{rowspace}(A) = {\mathbb{R}}^n$$
• $$\left(\operatorname{colspace}(A)\right)^\perp= \left(\mathrm{rowspace}(A)\right)^\perp= \left\{{\mathbf{0}}\right\}$$
• Zero is not an eigenvalue of $$A$$
• The rows of $$A$$ are linearly independent

Similarly, by taking negations, TFAE:

• $$A$$ is not invertible
• $$A$$ is singular
• $$A^T$$ is not invertible
• $$\det A = 0$$
• The linear system $$A \mathbf{x} = \mathbf{b}$$ has either no solution or infinitely many solutions.
• The homogeneous system $$A \mathbf{x} = \mathbf{0}$$ has nontrivial solutions
• $$\operatorname{rank}A < n$$
• $$\dim \mathrm{nullspace}~ A > 0$$
• At least one row of $$A$$ is a linear combination of the others
• The $$RREF$$ of $$A$$ has a row of all zeros.

Reformulated in terms of the linear map $$T: {\mathbb{R}}^n \to {\mathbb{R}}^n$$ represented by $$A$$, TFAE:

• $$T^{-1}: {\mathbb{R}}^n \to {\mathbb{R}}^n$$ exists
• $$\operatorname{im}(T) = {\mathbb{R}}^n$$
• $$\ker(T) = 0$$
• $$T$$ is injective
• $$T$$ is surjective
• $$T$$ is an isomorphism
• The system $$A\mathbf{x} = 0$$ has only the trivial solution

## Vector Spaces

### Linear Transformations

It is common to want to know the range and kernel of a specific linear transformation $$T$$. $$T$$ can be given in many ways, but a general strategy for deducing these properties involves:

• Express an arbitrary vector in $$V$$ as a linear combination of its basis vectors, and set it equal to an arbitrary vector in $$W$$.

• Use the linear properties of $$T$$ to make a substitution from known transformations

• Find a restriction or relation given by the constants of the initial linear combination.

Useful fact: if $$V\leq W$$ is a subspace and $$\dim(V) \geq \dim(W)$$, then $$V=W$$.

If $$V\subseteq W$$, then $$V$$ is a subspace of $$W$$ if the following hold:

\begin{align*} (1) && \mathbf{0}\in V \\ (2) && \mathbf{a}, \mathbf{b}\in V\implies t\mathbf{a} + \mathbf{b}\in V .\end{align*}

### Linear Independence

Any set of two vectors $$\left\{{\mathbf{v}, \mathbf{w}}\right\}$$ is linearly dependent $$\iff \exists \lambda :~\mathbf{v} = \lambda \mathbf{w}$$, i.e. one is a scalar multiple of the other.

### Bases

A set $$S$$ forms a basis for a vector space $$V$$ iff

1. $$S$$ is a set of linearly independent vectors, so $$\sum \alpha_i \vec{s_i} = 0 \implies \alpha_i = 0$$ for all $$i$$.
2. $$S$$ spans $$V$$, so $$\vec{v} \in V$$ implies there exist $$\alpha_i$$ such that $$\sum \alpha_i \vec{s_i} = \vec{v}$$

In this case, we define the dimension of $$V$$ to be $${\left\lvert {S} \right\rvert}$$.

### The Inner Product

The point of this section is to show how an inner product can induce a notion of “angle”, which agrees with our intuition in Euclidean spaces such as $${\mathbb{R}}^n$$, but can be extended to much less intuitive things, like spaces of functions.

The Euclidean inner product is defined as \begin{align*} {\left\langle {\mathbf{a}},~{\mathbf{b}} \right\rangle} = \sum_{i=1}^n a_i b_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n .\end{align*}

Also sometimes written as $$\mathbf{a}^T\mathbf{b}$$ or $$\mathbf{a} \cdot \mathbf{b}$$.

Yields a norm \begin{align*} {\left\lVert {\mathbf{x}} \right\rVert} \mathrel{\vcenter{:}}=\sqrt{{\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle}} \end{align*}

which has a useful alternative formulation \begin{align*} {\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle} = {\left\lVert {\mathbf{x}} \right\rVert}^2 .\end{align*}

This leads to a notion of angle: \begin{align*} {\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} = {\left\lVert {\mathbf{x}} \right\rVert} {\left\lVert {\mathbf{y}} \right\rVert} \cos\theta_{x,y} \implies \cos \theta_{x,y} \mathrel{\vcenter{:}}=\frac{{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle}}{{\left\lVert {\mathbf{x}} \right\rVert} {\left\lVert {\mathbf{y}} \right\rVert}} = {\left\langle {\widehat{\mathbf{x}}},~{\widehat{\mathbf{y}}} \right\rangle} \end{align*} where $$\theta_{x,y}$$ denotes the angle between the vectors $$\mathbf{x}$$ and $$\mathbf{y}$$.

Since $$\cos \theta=0$$ exactly when $$\theta = \pm \frac \pi 2$$, we can declare two vectors to be orthogonal exactly in this case: \begin{align*} \mathbf{x} \in \mathbf{y}^\perp\iff {\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} = 0 .\end{align*}

Note that this makes the zero vector orthogonal to everything.

Given a subspace $$S \subseteq V$$, we define its orthogonal complement \begin{align*} S^\perp= \left\{{\mathbf{v}\in V {~\mathrel{\Big|}~}\forall \mathbf{s}\in S,~ {\left\langle {\mathbf{v}},~{\mathbf{s}} \right\rangle} = 0}\right\}. \end{align*}

Any choice of subspace $$S\subseteq V$$ yields a decomposition $$V = S \oplus S^\perp$$.

A useful formula is \begin{align*} {\left\lVert {\mathbf{x} + \mathbf{y}} \right\rVert}^2 = {\left\lVert {\mathbf{x}} \right\rVert}^2 + 2{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} + {\left\lVert {\mathbf{y}} \right\rVert}^2 .\end{align*}

When $$\mathbf{x}\in \mathbf{y}^\perp$$, this reduces to \begin{align*} {\left\lVert {\mathbf{x} + \mathbf{y}} \right\rVert}^2 = {\left\lVert {\mathbf{x}} \right\rVert}^2 + {\left\lVert {\mathbf{y}} \right\rVert}^2 .\end{align*}

The inner product satisfies the following properties:

1. Bilinearity: \begin{align*} {\left\langle {\sum_j \alpha_j \mathbf{a}_j},~{\sum_k \beta_k \mathbf{b}_k} \right\rangle} = \sum_j \sum_k \alpha_j \beta_k {\left\langle {\mathbf{a}_j},~{\mathbf{b}_k} \right\rangle}. \end{align*}

2. Symmetry: \begin{align*} {\left\langle {\mathbf{a}},~{\mathbf{b}} \right\rangle} = {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle} \end{align*}

3. Positivity: \begin{align*} \mathbf{a} \neq \mathbf{0} \implies {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} > 0 \end{align*}

4. Nondegeneracy: \begin{align*} \mathbf{a} = \mathbf{0} \iff {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} = 0 \end{align*}

### Gram-Schmidt Process

Transforming a basis $$\left\{{\mathbf{x}_i}\right\}$$ into an orthonormal basis $$\left\{{\mathbf{u}_i}\right\}$$:

\begin{align*} \mathbf{u}_1 &= N(\mathbf{x_1}) \\ \mathbf{u}_2 &= N(\mathbf{x}_2 - {\left\langle {\mathbf{x}_2},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1)\\ \mathbf{u}_3 &= N(\mathbf{x}_3 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_2} \right\rangle}\mathbf{u}_2 ) \\ \vdots & \qquad \vdots \\ \mathbf{u}_k &= N(\mathbf{x}_k - \sum_{i=1}^{k-1} {\left\langle {\mathbf{x}_k},~{\mathbf{u}_i} \right\rangle}\mathbf{u}_i) \end{align*}

where $$N$$ denotes normalizing the result.

#### In more detail

The general setup here is that we are given a basis $$\left\{{\mathbf{x}_i}\right\}_{i=1}^n$$ (not necessarily orthogonal) and we want to produce an orthonormal basis from it.

Why would we want such a thing? Recall that we often wanted to change from the standard basis $$\mathcal{E}$$ to some different basis $$\mathcal{B} = \left\{{\mathbf{b}_1, \mathbf{b}_2, \cdots}\right\}$$. We could form the change of basis matrix $$B = [\mathbf{b}_1, \mathbf{b}_2, \cdots]$$, which acts on vectors in the $$\mathcal{B}$$ basis according to \begin{align*} B[\mathbf{x}]_\mathcal{B} = [\mathbf{x}]_{\mathcal{E}} .\end{align*}

But to change from $$\mathcal{E}$$ to $$\mathcal{B}$$ requires computing $$B^{-1}$$, which acts on vectors in the standard basis according to \begin{align*} B^{-1}[\mathbf{x}]_\mathcal{E} = [\mathbf{x}]_{\mathcal{B}} .\end{align*}

If, on the other hand, the $$\mathbf{b}_i$$ are orthonormal, then $$B^{-1} = B^T$$, which is much easier to compute. We also obtain a rather simple formula for the coordinates of $$\mathbf{x}$$ with respect to $$\mathcal B$$. This follows because we can write \begin{align*} \mathbf{x} = \sum_{i=1}^n {\left\langle {\mathbf{x}},~{\mathbf{b}_i} \right\rangle} \mathbf{b}_i \mathrel{\vcenter{:}}=\sum_{i=1}^n c_i \mathbf{b}_i .\end{align*}

and we find that \begin{align*} [\mathbf{x}]_\mathcal{B} = \mathbf{c} \mathrel{\vcenter{:}}=[c_1, c_2, \cdots, c_n]^T .\end{align*}

This also allows us to simplify projection matrices. Supposing that $$Q$$ has orthonormal columns and letting $$S$$ be the column space of $$Q$$, recall that the projection onto $$S$$ is defined by \begin{align*} P_S = Q(Q^TQ)^{-1}Q^T .\end{align*}

Since $$Q$$ has orthonormal columns and thus satisfies $$Q^TQ = I$$, this simplifies to \begin{align*} P_S = QQ^T .\end{align*}

#### The Algorithm

Given a basis $$\left\{{\mathbf{x}_i}\right\}$$ (not necessarily orthogonal), we form an orthonormal basis $$\left\{{\mathbf{u}_i}\right\}$$ iteratively as follows.

First define \begin{align*} N: {\mathbb{R}}^n &\to S^{n-1} \\ \mathbf{x} &\mapsto \widehat{\mathbf{x}} \mathrel{\vcenter{:}}=\frac {\mathbf{x}} {{\left\lVert {\mathbf{x}} \right\rVert}} \end{align*}

which projects a vector onto the unit sphere in $${\mathbb{R}}^n$$ by normalizing. Then,

\begin{align*} \mathbf{u}_1 &= N(\mathbf{x_1}) \\ \mathbf{u}_2 &= N(\mathbf{x}_2 - {\left\langle {\mathbf{x}_2},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1)\\ \mathbf{u}_3 &= N(\mathbf{x}_3 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_2} \right\rangle}\mathbf{u}_2 ) \\ \vdots & \qquad \vdots \\ \mathbf{u}_k &= N(\mathbf{x}_k - \sum_{i=1}^{k-1} {\left\langle {\mathbf{x}_k},~{\mathbf{u}_i} \right\rangle}\mathbf{u}_i) \end{align*}

In words, at each stage, we take one of the original vectors $$\mathbf{x}_i$$, then subtract off its projections onto all of the $$\mathbf{u}_i$$ we’ve created up until that point. This leaves us with only the component of $$\mathbf{x}_i$$ that is orthogonal to the span of the previous $$\mathbf{u}_i$$ we already have, and we then normalize each $$\mathbf{u}_i$$ we obtain this way.
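The iteration above translates directly into code; a minimal sketch of my own, using the column-vector convention:

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X, assumed linearly independent."""
    U = []
    for xcol in X.T:
        v = xcol.astype(float)
        for u in U:
            v -= (v @ u) * u             # subtract the projection ⟨x, u⟩u
        U.append(v / np.linalg.norm(v))  # normalize: N(v)
    return np.column_stack(U)

X = np.array([[1., 1.],
              [0., 1.]])
Q = gram_schmidt(X)
print(Q.T @ Q)   # ≈ identity, so the columns are orthonormal
```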

Alternative Explanation:

Given a basis \begin{align*} S = \left\{\mathbf{v_1, v_2, \cdots v_n}\right\}, \end{align*}

the Gram-Schmidt process produces a corresponding orthogonal basis \begin{align*} S' = \left\{\mathbf{u_1, u_2, \cdots u_n}\right\} \end{align*} that spans the same vector space as $$S$$.

$$S'$$ is found using the following pattern: \begin{align*} \mathbf{u_1} &= \mathbf{v_1} \\ \mathbf{u_2} &= \mathbf{v_2} - \text{proj}_{\mathbf{u_1}} \mathbf{v_2}\\ \mathbf{u_3} &= \mathbf{v_3} - \text{proj}_{\mathbf{u_1}} \mathbf{v_3} - \text{proj}_{\mathbf{u_2}} \mathbf{v_3}\\ \end{align*}

where \begin{align*} \text{proj}_{\mathbf{u}} \mathbf{v} = (\text{scal}_{\mathbf{u}} \mathbf{v})\frac{\mathbf{u}}{\mathbf{{\left\lVert {u} \right\rVert}}} = \frac{\langle \mathbf{v,u} \rangle}{{\left\lVert {\mathbf{u}} \right\rVert}}\frac{\mathbf{u}}{\mathbf{{\left\lVert {u} \right\rVert}}} = \frac{{\left\langle {\mathbf{v}},~{\mathbf{u}} \right\rangle}}{{\left\lVert {\mathbf{u}} \right\rVert}^2}\mathbf{u} \end{align*} is the orthogonal projection of $$\mathbf{v}$$ onto $$\mathbf{u}$$.

The orthogonal set $$S'$$ can then be transformed into an orthonormal set $$S''$$ by simply dividing the vectors $$s\in S'$$ by their magnitudes. The usual definition of a vector’s magnitude is \begin{align*} {\left\lVert {\mathbf{a}} \right\rVert} = \sqrt{{\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle}} \text{ and } {\left\lVert {\mathbf{a}} \right\rVert}^2 = {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} \end{align*}

As a final check, all vectors in $$S'$$ should be orthogonal to each other, such that \begin{align*} {\left\langle {\mathbf{v_i}},~{\mathbf{v_j}} \right\rangle} = 0 \text{ when } i \neq j \end{align*}

and all vectors in $$S''$$ should be orthonormal, such that \begin{align*} {\left\langle {\mathbf{v_i}},~{\mathbf{v_j}} \right\rangle} = \delta_{ij} \end{align*}

### The Fundamental Subspaces Theorem

Given a matrix $$A \in \mathrm{Mat}(m, n)$$, and noting that \begin{align*} A &: {\mathbb{R}}^n \to {\mathbb{R}}^m,\\ A^T &: {\mathbb{R}}^m \to {\mathbb{R}}^n \end{align*}

We have the following decompositions: \begin{align*} &{\mathbb{R}}^n &\cong \ker A &\oplus \operatorname{im}A^T &\cong \mathrm{nullspace}(A) &\oplus~ \mathrm{colspace}(A^T) \\ &{\mathbb{R}}^m &\cong \operatorname{im}A &\oplus \ker A^T &\cong \mathrm{colspace}(A) &\oplus~ \mathrm{nullspace}(A^T) \end{align*}

## Matrices

An $$m\times n$$ matrix is a map from $$n$$-dimensional space to $$m$$-dimensional space. The number of rows tells you the dimension of the codomain, the number of columns tells you the dimension of the domain.

The space of matrices is not an integral domain! Counterexample: if $$A$$ is singular and nonzero, there is some nonzero $$\mathbf{v}$$ such that $$A \mathbf{v} = \mathbf{0}$$. Then setting $$B = {\left[ {\mathbf{v}, \mathbf{v}, \cdots} \right]}$$ yields $$AB = 0$$ with $$A\neq 0, B\neq 0$$.

The rank of a matrix $$A$$ representing a linear transformation $$T$$ is $$\dim \operatorname{colspace}(A)$$, or equivalently $$\dim \operatorname{im}T$$.

$$\operatorname{rank}(A)$$ is equal to the number of nonzero rows in $$\operatorname{RREF}(A)$$.

\begin{align*} \mathrm{Trace}(A) = \sum_{i=1}^m A_{ii} \end{align*}

The following are elementary row operations on a matrix:

• Permute rows
• Multiply a row by a nonzero scalar
• Add a scalar multiple of one row to another

If $$A = {\left[ {\mathbf{a}_1, \mathbf{a}_2, \cdots} \right]} \in \mathrm{Mat}(m, n)$$ and $$B = {\left[ {\mathbf{b}_1, \mathbf{b}_2, \cdots} \right]} \in\mathrm{Mat}(n, p)$$, then \begin{align*} C \mathrel{\vcenter{:}}= AB \implies c_{ij} = \sum_{k=1}^n a_{ik}b_{kj} \end{align*} where $$1\leq i \leq m$$ and $$1\leq j \leq p$$. In words, each entry $$c_{ij}$$ is obtained by dotting row $$i$$ of $$A$$ against column $$j$$ of $$B$$.
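The entry formula, written out naively (with `numpy` used only to check against its built-in product; the matrices are made-up examples):

```python
import numpy as np

def matmul(A, B):
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must agree"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            # c_ij = sum_k a_ik b_kj: row i of A dotted with column j of B
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

A = np.array([[1., 2.], [3., 4.], [5., 6.]])   # 3×2
B = np.array([[1., 0., 2.], [0., 1., 3.]])     # 2×3
print(np.allclose(matmul(A, B), A @ B))        # True
```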

### Systems of Linear Equations

A system of linear equations is consistent when it has at least one solution. The system is inconsistent when it has no solutions.


Homogeneous systems are always consistent, i.e. there is always at least one solution.

• Tall matrices: more equations than unknowns, overdetermined
• Wide matrices: more unknowns than equations, underdetermined

There are three possibilities for a system of linear equations:

1. No solutions (inconsistent)
2. One unique solution (consistent, square or tall matrices)
3. Infinitely many solutions (consistent, underdetermined, square or wide matrices)

These possibilities can be checked by comparing $$r\mathrel{\vcenter{:}}=\operatorname{rank}(A)$$ and $$r_b$$:

• $$r < r_b$$: case 1, no solutions.
• $$r = r_b$$: case 2 or 3, at least one solution.
• $$r = r_b = n$$: case 2, a unique solution.
• $$r = r_b < n:$$ case 3, infinitely many solutions.
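This rank comparison can be done mechanically (the systems below are made-up examples):

```python
import numpy as np

def rank_case(A, b):
    r  = np.linalg.matrix_rank(A)
    rb = np.linalg.matrix_rank(np.column_stack([A, b]))  # rank of [A | b]
    n  = A.shape[1]
    if r < rb:
        return "no solutions"
    return "unique solution" if r == n else "infinitely many solutions"

A = np.array([[1., 1.], [1., 1.]])
print(rank_case(A, np.array([1., 2.])))           # no solutions
print(rank_case(A, np.array([1., 1.])))           # infinitely many solutions
print(rank_case(np.eye(2), np.array([1., 2.])))   # unique solution
```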

### Determinants

\begin{align*} \det{(A \mod p}) \mod p \equiv (\det{A}) \mod p \end{align*}

For $$2\times 2$$ matrices, \begin{align*} A^{-1} = \left( \begin{array}{cc} a & b \\ c & d \end{array}\right)^{-1} = \frac{1}{\det{A}}\left( \begin{array}{cc} d & -b \\ -c & a \end{array}\right) \end{align*} In words, swap the main diagonal entries, and flip the signs on the off-diagonal.

For each $$m$$ there is a function \begin{align*} \det: \operatorname{Mat}(m, m) &\to {\mathbb{R}}\\ A &\mapsto \det(A) \end{align*} satisfying the following properties:

• $$\det$$ is multiplicative, and restricts to a group homomorphism $$\operatorname{GL}_m({\mathbb{R}}) \to ({\mathbb{R}}^{\times}, \cdot)$$: \begin{align*} \det(AB) = \det(A) \det(B) \end{align*}

• Some corollaries: \begin{align*} \det(A^k) &= (\det A)^k \\ \det(A^{-1}) &= (\det A)^{-1} \\ \det(A^t) &= \det(A) .\end{align*}
• Invariance under adding scalar multiples of any row to another: \begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} = \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i + t\mathbf{a_j} & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} \end{align*}

• Sign change under row permutation: \begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} = (-1) \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \end{bmatrix} \end{align*}

• More generally, for a permutation $$\sigma\in S_n$$ applied to the rows, \begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} = \operatorname{sgn}(\sigma) \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_{\sigma(i)} & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_{\sigma(j)} & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \end{bmatrix} \end{align*}
• Multilinearity in rows: \begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} &= t \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} \\ \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} &= t^m \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} \\ \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& t_1 \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& t_2 \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t_m \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} &= \prod_{i=1}^m t_i \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} .\end{align*}

• Linearity in each row: \begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i + \textcolor{red}{\textbf{a}}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} =\det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} + \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{red}{\textbf{a}}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} .\end{align*}

• $$\det(A)$$ is the signed volume of the parallelepiped spanned by the columns of $$A$$.

• If any row of $$A$$ is all zeros, $$\det(A) = 0$$.

TFAE:

• $$\det(A) = 0$$
• $$A$$ is singular.

### Computing Determinants

Useful shortcuts:

• If $$A$$ is upper or lower triangular, $$\det(A) = \prod_i a_{ii}$$.

The minor $$M_{ij}$$ of $$A\in \operatorname{Mat}(n, n)$$ is the determinant of the $$(n-1) \times (n-1)$$ matrix obtained by deleting the $$i$$th row and $$j$$th column from $$A$$.

The cofactor $$C_{ij}$$ is the scalar defined by \begin{align*} C_{ij} \mathrel{\vcenter{:}}=(-1)^{i+j} M_{ij} .\end{align*}

For any fixed $$i$$, there is a formula \begin{align*} \det(A) = \sum_{j=1}^n a_{ij} C_{ij} .\end{align*}

Let \begin{align*} A = \left[\begin{array}{lll} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}\right] .\end{align*}

Then \begin{align*} \det A = 1 \cdot\left|\begin{array}{ll} 5 & 6 \\ 8 & 9 \end{array}\right|-2 \cdot\left|\begin{array}{ll} 4 & 6 \\ 7 & 9 \end{array}\right|+3 \cdot\left|\begin{array}{ll} 4 & 5 \\ 7 & 8 \end{array}\right| = 1 \cdot(-3)-2 \cdot(-6)+3 \cdot(-3) = 0 .\end{align*}
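The expansion above can be checked mechanically. A minimal pure-Python sketch (the helper names `det2`, `minor`, `det3` are illustrative, not standard):

```python
# Cofactor (Laplace) expansion along the first row of a 3x3 matrix,
# matching the formula det(A) = sum_j a_{1j} C_{1j}.

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(A, i, j):
    """The submatrix of A with row i and column j deleted."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det3(A):
    """Expand along the first row: sum over j of (-1)^j a_{0j} M_{0j}."""
    return sum((-1) ** j * A[0][j] * det2(minor(A, 0, j)) for j in range(3))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(det3(A))  # prints 0, agreeing with the worked example
```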

$$\det(A)$$ can be computed by reducing $$A$$ to $$\operatorname{RREF}(A)$$ (which is upper triangular) and keeping track of the following effects:

• $$R_i \mapsfrom R_i \pm t R_j$$: no effect.
• $$R_i \rightleftharpoons R_j$$: multiply by $$(-1)$$.
• $$R_i \mapsfrom tR_i$$: multiply by $$t$$.
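The three row-operation rules above translate directly into an elimination-based determinant routine. A minimal sketch (the function name `det_by_elimination` is illustrative); it only uses row additions and swaps, so the determinant is the signed product of the diagonal:

```python
# Determinant by Gaussian elimination, tracking the row-operation rules:
# R_i <- R_i - t R_j leaves det unchanged, a row swap flips the sign,
# and the determinant of the resulting triangular matrix is the
# product of its diagonal entries.

def det_by_elimination(A):
    A = [row[:] for row in A]  # work on a copy
    n = len(A)
    sign = 1
    for col in range(n):
        # find a usable pivot; a swap multiplies the determinant by -1
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return 0  # no pivot in this column => singular
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            sign = -sign
        # eliminate below the pivot; no effect on the determinant
        for r in range(col + 1, n):
            t = A[r][col] / A[col][col]
            A[r] = [a - t * b for a, b in zip(A[r], A[col])]
    prod = sign
    for i in range(n):
        prod *= A[i][i]
    return prod

print(det_by_elimination([[2, 1], [4, 5]]))  # 2*5 - 1*4 = 6
```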

### Inverting a Matrix

Given a linear system $$A\mathbf{x} = \mathbf{b}$$, writing $$\mathbf{x} = {\left[ {x_1, \cdots, x_n} \right]}$$, there is a formula (Cramer's rule) \begin{align*} x_i = \frac{\det(B_i)}{\det(A)} \end{align*} where $$B_i$$ is $$A$$ with its $$i$$th column replaced by $$\mathbf{b}$$.

Under the equivalence relation of elementary row operations, there is an equivalence of augmented matrices: \begin{align*} \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& I \end{bmatrix} \sim \begin{bmatrix} I &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& A^{-1} \end{bmatrix} \end{align*} where $$I$$ is the $$n\times n$$ identity matrix.
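The augment-and-reduce procedure can be sketched directly. A minimal pure-Python version (the helper name `invert` is illustrative; no pivot-size safeguards, and it assumes $$A$$ is invertible):

```python
# Inverting a matrix by row-reducing the augmented matrix [A | I] to
# [I | A^{-1}], as in the equivalence above.

def invert(A):
    n = len(A)
    # augment A with the n x n identity matrix
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        # swap in a row with a nonzero pivot if needed
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # scale the pivot row so the pivot entry is 1
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # clear every other entry in this column
        for r in range(n):
            if r != col:
                t = M[r][col]
                M[r] = [a - t * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]  # the right half is A^{-1}

print(invert([[1, 2], [3, 4]]))  # [[-2.0, 1.0], [1.5, -0.5]]
```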

\begin{align*} A^{-1} = {1\over \det(A)} {\left[ {C_{ij}} \right]}^t .\end{align*} where $$C_{ij}$$ is the cofactor at position $$i,j$$.

\begin{align*} \left(\begin{array}{cc} a& b \\ c& d \end{array}\right)^{-1} = {1 \over a d - b c} \left(\begin{array}{rr} d & -b \\ -c & a \end{array}\right) \quad \text{ where } ad-bc \ne 0 \end{align*}

What’s the pattern?

1. Always divide by determinant
2. Swap the diagonals

\begin{align*} \begin{bmatrix} + & - \\ - & + \end{bmatrix} \end{align*}

\begin{align*} A^{-1} \mathrel{\vcenter{:}}= \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} ^{-1} = {1 \over {\det A}} \begin{bmatrix} e i - f h & -(b i - c h) & b f - c e \\ -(d i - f g) &a i - c g &-(a f -c d) \\ d h - e g & -(a h - b g)& a e - b d \end{bmatrix} .\end{align*}

The pattern:

1. Divide by determinant
2. Each entry is determinant of submatrix of $$A$$ with corresponding col/row deleted

\begin{align*} \begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix} \end{align*}

3. Transpose at the end!
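The divide-by-determinant / cofactor / transpose recipe can be verified numerically. A minimal pure-Python sketch for the $$3\times 3$$ case (the helper names `det2`, `minor`, `inv3` are illustrative, and the test matrix is an assumed example):

```python
# The adjugate formula A^{-1} = adj(A) / det(A), where adj(A) is the
# transpose of the cofactor matrix C_{ij} = (-1)^{i+j} M_{ij}.

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(A, i, j):
    return [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def inv3(A):
    cof = [[(-1) ** (i + j) * det2(minor(A, i, j)) for j in range(3)]
           for i in range(3)]
    # cofactor expansion along the first row gives the determinant
    det = sum(A[0][j] * cof[0][j] for j in range(3))
    # transpose the cofactor matrix, then divide by the determinant
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]

A = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
Ainv = inv3(A)
# check that A * Ainv is the identity
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
print(prod)
```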

### Bases for Spaces of a Matrix

Let $$A\in \operatorname{Mat}(m, n)$$ represent a map $$T:{\mathbb{R}}^n\to {\mathbb{R}}^m$$.

The row rank and column rank of a matrix always agree:

\begin{align*} \dim \operatorname{rowspace}(A) = \dim \operatorname{colspace}(A) .\end{align*}

#### The row space

\begin{align*} \operatorname{im}(T)^\vee= \operatorname{rowspace}(A) \subset {\mathbb{R}}^n .\end{align*}

Reduce to RREF, and take nonzero rows of $$\mathrm{RREF}(A)$$.

#### The column space

\begin{align*} \operatorname{im}(T) = \operatorname{colspace}(A) \subseteq {\mathbb{R}}^m \end{align*}

Reduce to RREF, and take columns with pivots from original $$A$$.

Not enough pivots implies the columns don't span the entire codomain.

#### The nullspace

\begin{align*} \ker(T) = \operatorname{nullspace}(A) \subseteq {\mathbb{R}}^n \end{align*}

Reduce to RREF; columns without pivots correspond to free variables. Convert back to equations and pull the free variables out as scalar multipliers.

#### Eigenspaces

For each $$\lambda \in \operatorname{Spec}(A)$$, compute a basis for $$\ker(A - \lambda I)$$.

### Eigenvalues and Eigenvectors

A vector $$\mathbf{v}$$ is said to be an eigenvector of $$A$$ with eigenvalue $$\lambda\in \operatorname{Spec}(A)$$ iff \begin{align*} A\mathbf{v} = \lambda \mathbf{v} \end{align*} For a fixed $$\lambda$$, the corresponding eigenspace $$E_\lambda$$ is the span of all such vectors.

• Similar matrices have identical eigenvalues and multiplicities.

• Eigenvectors corresponding to distinct eigenvalues are always linearly independent.

• $$A$$ has $$n$$ distinct eigenvalues $$\implies A$$ has $$n$$ linearly independent eigenvectors.

• A matrix $$A$$ is diagonalizable $$\iff A$$ has $$n$$ linearly independent eigenvectors.

For $$\lambda\in \operatorname{Spec}(A)$$, \begin{align*} \mathbf{v}\in E_\lambda \iff \mathbf{v} \in \ker(A-I\lambda) .\end{align*}

Some miscellaneous useful facts:

• $$\lambda \in \operatorname{Spec}(A) \implies \lambda^2 \in \operatorname{Spec}(A^2)$$ with the same eigenvector.

• $$\prod \lambda_i = \det A$$

• $$\sum \lambda_i = \mathrm{Tr}~A$$
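The facts above are easy to check numerically. A minimal numpy sketch (the matrix is an assumed example):

```python
# Checking prod(lambda_i) = det(A), sum(lambda_i) = tr(A), and that
# lambda^2 is an eigenvalue of A^2 with the same eigenvector.
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)

assert np.isclose(np.prod(eigvals), np.linalg.det(A))
assert np.isclose(np.sum(eigvals), np.trace(A))

# same eigenvector, squared eigenvalue, for A^2
v = eigvecs[:, 0]
lam = eigvals[0]
assert np.allclose(A @ A @ v, lam ** 2 * v)
```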

#### Diagonalizability

An $$n\times n$$ matrix $$P$$ is diagonalizable iff its eigenspaces together span all of $$\mathbb{R}^n$$ (i.e. there are $$n$$ linearly independent eigenvectors, so they span the space.)

Equivalently, $$P$$ is diagonalizable iff there is a basis of $$\mathbb{R}^n$$ consisting of eigenvectors of $$P$$.

### Useful Counterexamples

\begin{align*} A \mathrel{\vcenter{:}}=\left[ \begin{array} { c c } { 1 } & { 1 } \\ { 0 } & { 1 } \end{array} \right] &\implies A^n = \left[ \begin{array} { c c } { 1 } & { n } \\ { 0 } & { 1 } \end{array} \right], && \operatorname{Spec}(A) = [1,1] \\ A \mathrel{\vcenter{:}}=\left[ \begin{array} { c c } { 1 } & { 1 } \\ { 0 } & { - 1 } \end{array} \right] &\implies A^2 = I_2, && \operatorname{Spec}(A) = [1, -1] \end{align*}
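Both counterexamples can be verified with plain integer matrix multiplication; a minimal sketch (the helper `matmul` is illustrative):

```python
# Verifying A^n = [[1, n], [0, 1]] for the shear matrix, and B^2 = I
# for the second matrix in the list above.

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 1]]
P = [[1, 0], [0, 1]]
for n in range(1, 6):
    P = matmul(P, A)
    assert P == [[1, n], [0, 1]]  # A^n has n in the corner

B = [[1, 1], [0, -1]]
assert matmul(B, B) == [[1, 0], [0, 1]]  # B is an involution
```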

## Example Problems

Determine a basis for \begin{align*} S = \left\{a_0 + a_1 x + a_2 x^2\mathrel{\Big|}a_0,a_1,a_2 \in \mathbb{R} \land a_0 - a_1 -2a_2 =0\right\}. \end{align*}

Let $$a_2=t, a_1=s, a_0=s+2t$$, then \begin{align*} S &= \left\{ (s+2t) + (sx+tx^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \left\{ (s+sx) + (2t+tx^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \left\{ s(1+x) + t(2+x^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \text{span}\left\{(1+x),(2+x^2)\right\} \end{align*}

and a basis for $$S$$ is \begin{align*} \left\{(1+x), (2+x^2)\right\} \end{align*}

: If $$V$$ is an $$n$$-dimensional vector space, then every set $$S$$ with fewer than $$n$$ vectors can be extended to a basis for $$V$$.

False: the set must also be linearly independent. Only linearly independent sets with fewer than $$n$$ vectors can be extended to form a basis for $$V$$.

: The set of all symmetric $$3 \times 3$$ matrices forms a three-dimensional subspace of $$M_{3}(\mathbb{R})$$.

This set forms a 6-dimensional subspace. A basis for this space would require six elements.

Given $$A=$$ \begin{align*}\begin{bmatrix} 1 & 3 \\ -2 & -6 \end{bmatrix}\end{align*} what is the dimension of the null space of $$A$$?

The augmented matrix for the system $$A\mathbf{x} = \mathbf{0}$$ is \begin{align*}\begin{bmatrix}[cc|c] 1 & 3 & 0 \\ 0 & 0 & 0 \end{bmatrix}\end{align*} which has one free variable.

Writing one variable in terms of another results in $$x_1 + 3x_2 = 0 \Rightarrow x_1 = -3x_2$$.

Let $$x_2 = r$$ where $$r \in \mathbb{R}$$, then $$S = \left\{ \mathbf{x} \in \mathbb{R}^2 : \mathbf{x} = r(-3,1), r \in \mathbb{R}\right\} = \text{span}\left\{(-3,1)\right\}$$.

So, the set $$B = \left\{(-3,1)\right\}$$ is a basis for the null space of $$A$$, and the null space has dimension 1.

Let $$S$$ be the subspace of $$\mathbb{R}^3$$ that consists of all solutions to the equation $$x-3y+z = 0$$. Determine a basis for $$S$$, and find dim[$$S$$].

The first goal is to find a way to express the set of 3-tuples that satisfy this equation.

Let $$y=r$$ and $$z=s$$, then $$x=3r-s$$. Then vectors $$\mathbf{v}$$ that satisfy the equation are all of the form \begin{align*} \mathbf{v} = (3r-s, r, s) = (3r,r,0)+(-s,0,s) = r(3,1,0) + s(-1,0,1). \end{align*} (Note - the goal here is to separate the dependent variables into different vectors so they can be written as a linear combination of something.)

The set $$S$$ that satisfies this equation is then \begin{align*} S &= \left\{ \mathbf{v} \in \mathbb{R}^3 : \mathbf{v} =r(3,1,0) + s(-1,0,1) \land r,s\in\mathbb{R} \right\} \\ &= \text{span}\left\{ (3,1,0), (-1,0,1)\right\} \end{align*}

All that remains is to check that the vectors in this span are linearly independent. This can be done by showing that \begin{align*} a(3,1,0) + b(-1,0,1) = (0,0,0) \end{align*} implies $$a=b=0$$.

Since the two vectors are linearly independent and span the solution set $$S$$, they form a basis for $$S$$ of dimension 2.

Determine a basis for the subspace of $$M_2(\mathbb{R})$$ spanned by \begin{align*} A_1 = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -1 & 4 \\ 1 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 5 & -6 \\ -5 & 1 \end{bmatrix}. \end{align*}

Note that because the set contains the zero matrix, it is linearly dependent. So only consider the other three, as they span the same subspace as the original set.

First, determine if $$\left\{ A_1, A_2, A_3\right\}$$ is linearly independent. Start with the equation \begin{align*} c_1A_1 + c_2A_2 + c_3A_3 = 0_2 \end{align*}

which gives \begin{align*} c_1 - c_2 + 5c_3 &= 0 \\ 3c_1 + 4c_2 - 6c_3 &= 0 \\ -c_1 + c_2 - 5c_3 &= 0 \\ 2c_1 + c_2 + c_3 &= 0 \end{align*}

which has the solution $$(-2r,3r,r)$$. So the set is linearly dependent by the relation \begin{align*} -2A_1 + 3A_2 + A_3 = 0 \text{ or }\\ A_3 = 2A_1 - 3A_2 \end{align*}

So $$\left\{A_1, A_2\right\}$$ spans the same subspace as the original set. It is also linearly independent, and therefore forms a basis for the original subspace.

Let $$A, B, C \in M_2 (\mathbb{R})$$. Define $$\langle A,B\rangle = a_{11}b_{11}+2a_{12}b_{12}+3a_{21}b_{21}$$. Does this define an inner product on $$M_2 (\mathbb{R})$$?

Instead, let $$\langle A,B\rangle = a_{11} + b_{22}$$. Does this define an inner product on $$M_2(\mathbb{R})$$?

Let $$p=a_0 + a_1 x + a_2 x^2$$ and $$q=b_0 + b_1 x + b_2 x^2$$. Define $$\langle p,q\rangle = \sum_{i=0}^{2}(i+1)a_i b_i$$. Does this define an inner product on $$P_2$$?

## Changing Basis

The transition matrix from a given basis $$\mathcal{B} = \left\{{\mathbf{b}_i}\right\}_{i=1}^n$$ to the standard basis is given by \begin{align*} A\mathrel{\vcenter{:}}= \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \end{bmatrix} ,\end{align*} and the transition matrix from the standard basis to $$\mathcal{B}$$ is $$A^{-1}$$.
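A minimal numpy sketch of both directions (the basis vectors here are an assumed example):

```python
# The transition matrix A has the basis vectors as its columns: it maps
# B-coordinates to standard coordinates, and A^{-1} maps back.
import numpy as np

b1, b2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])
A = np.column_stack([b1, b2])

coords_B = np.array([2.0, 3.0])   # coordinates relative to {b1, b2}
x = A @ coords_B                  # the same vector in standard coordinates
assert np.allclose(x, 2 * b1 + 3 * b2)

# going the other way: apply A^{-1} (here via solve) to recover B-coordinates
recovered = np.linalg.solve(A, x)
assert np.allclose(recovered, coords_B)
```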

## Orthogonal Matrices

Given a notion of orthogonality for vectors, we can extend this to matrices. A square matrix is said to be orthogonal iff $$QQ^T = Q^TQ = I$$. For rectangular matrices, we have the following characterizations: \begin{align*} QQ^T = I \implies &\text{The rows of } Q \text { are orthonormal,} \\ Q^TQ = I \implies &\text{The columns of } Q \text{ are orthonormal.} \end{align*}

To remember which condition is which, just recall that matrix multiplication $$AB$$ takes the inner product between the rows of $$A$$ and the columns of $$B$$. So if, for example, we want to inspect whether or not the columns of $$Q$$ are orthogonal, we should let $$B=Q$$ in the above formulation – then we just note that the rows of $$Q^T$$ are indeed the columns of $$Q$$, so $$Q^TQ$$ computes the inner products between all pairs of the columns of $$Q$$ and stores them in a matrix.

## Projections

An orthogonal projection $$P$$ induces a decomposition \begin{align*} \operatorname{dom}(P) = \ker(P) \oplus \ker(P)^\perp .\end{align*}

Distance from a point $$\mathbf{p}$$ to a line $$\mathbf{a} + t\mathbf{b}$$: let $$\mathbf{w} = \mathbf{p} - \mathbf{a}$$; then the distance is $${\left\lVert {\mathbf{w} - \operatorname{Proj}_{\mathbf{b}}(\mathbf{w})} \right\rVert}$$.

\begin{align*} \operatorname{Proj}_{\operatorname{range}(A)}(\mathbf{x}) = A(A^t A)^{-1} A^t \mathbf{x} .\end{align*}

Mnemonic: \begin{align*} P \approx {A^t A \over AA^t} .\end{align*}

With an inner product in hand and a notion of orthogonality, we can define a notion of orthogonal projection of one vector onto another, and more generally of a vector onto a subspace spanned by multiple vectors.

### Projection Onto a Vector

Say we have two vectors $$\mathbf{x}$$ and $$\mathbf{y}$$, and we want to define “the component of $$\mathbf{x}$$ that lies along $$\mathbf{y}$$”, which we’ll call $$\mathbf{p}$$. We can work out what the formula should be using a simple model:

We notice that whatever $$\mathbf{p}$$ is, it will lie in the direction of $$\mathbf{y}$$, and thus $$\mathbf{p} = \lambda \widehat{\mathbf{y}}$$ for some scalar $$\lambda$$, where in fact $$\lambda = {\left\lVert {\mathbf{p}} \right\rVert}$$ since $${\left\lVert {\widehat{\mathbf{y}}} \right\rVert} = 1$$. We will find that $$\lambda = {\left\langle {\mathbf{x}},~{\widehat{\mathbf{y}}} \right\rangle}$$, and so \begin{align*} \mathbf{p} = {\left\langle {\mathbf{x}},~{\widehat{\mathbf{y}}} \right\rangle}\widehat{\mathbf{y}} = \frac{{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle}}{{\left\langle {\mathbf{y}},~{\mathbf{y}} \right\rangle}} \mathbf{y} .\end{align*}

Notice that we can then form a “residual” vector $$\mathbf{r} = \mathbf{x} - \mathbf{p}$$, which should satisfy $$\mathbf{r} ^\perp\mathbf{p}$$. If we were to let $$\lambda$$ vary as a function of a parameter $$t$$ (making $$\mathbf{r}$$ a function of $$t$$ as well) we would find that this particular choice minimizes $${\left\lVert {\mathbf{r} (t)} \right\rVert}$$.

### Projection Onto a Subspace

In general, supposing one has a subspace $$S = \mathrm{span}\left\{{\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_n}\right\}$$ and (importantly!) the $$\mathbf{y}_i$$ are orthogonal, then the projection $$\mathbf{p}$$ of $$\mathbf{x}$$ onto $$S$$ is given by the sum of the projections onto each basis vector, yielding

\begin{align*} \mathbf{p} = \sum_{i=1}^n \frac{{\left\langle {\mathbf{x}},~{\mathbf{y}_i} \right\rangle}}{{\left\langle {\mathbf{y}_i},~{\mathbf{y}_i} \right\rangle}} \mathbf{y}_i = \sum_{i=1}^n {\left\langle {\mathbf{x}},~{\mathbf{y}_i} \right\rangle} \widehat{\mathbf{y}_i} .\end{align*}

Note: this is part of why having an orthogonal basis is desirable!

Letting $$A = [\mathbf{y}_1, \mathbf{y}_2, \cdots]$$, then the following matrix projects vectors onto $$S$$, expressing them in terms of the basis $$\mathbf{y}_i$$: \begin{align*} \tilde P_A = (A^TA)^{-1}A^T, \end{align*}

while this matrix performs the projection and expresses it in terms of the standard basis: \begin{align*} P_A = A(A^TA)^{-1}A^T. \end{align*}
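A minimal numpy sketch of the projection matrix $$A(A^TA)^{-1}A^T$$ (the matrix and vector here are assumed examples; the columns need not be orthogonal for this formula):

```python
# Projection onto the column space of A, checking that the result is
# idempotent, symmetric, and leaves residuals orthogonal to the columns.
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

# projections satisfy P^2 = P; orthogonal projections are also symmetric
assert np.allclose(P @ P, P)
assert np.allclose(P, P.T)

# the residual x - Px lies in the orthogonal complement of colspace(A)
x = np.array([1.0, 2.0, 4.0])
r = x - P @ x
assert np.allclose(A.T @ r, 0)
```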

Equation of a plane: given a point $$\mathbf{p}_0$$ on a plane and a normal vector $$\mathbf{n}$$, any vector $$\mathbf{x}$$ on the plane satisfies \begin{align*} {\left\langle {\mathbf{x} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle} = 0 \end{align*}

To find the distance between a point $$\mathbf{a}$$ and a plane through $$\mathbf{p}_0$$ with unit normal $$\widehat{\mathbf{n}}$$, we need only project $$\mathbf{a} - \mathbf{p}_0$$ onto the subspace spanned by the normal: \begin{align*} d = {\left\lvert {\left\langle {\mathbf{a} - \mathbf{p}_0},~{\widehat{\mathbf{n}}} \right\rangle} \right\rvert} .\end{align*}

One important property of projections is that for any vector $$\mathbf{v}$$ and for any subspace $$S$$, we have $$\mathbf{v} - P_S(\mathbf{v}) \in S^\perp$$. Moreover, if $$\mathbf{v} \in S^\perp$$, then $$P_S(\mathbf{v})$$ must be zero. This follows by noting that in the projection formula above, every inner product appearing in the sum vanishes, by definition of $$\mathbf{v} \in S^\perp$$, and so the projection is zero.

### Least Squares

$$\mathbf{x}$$ is a least squares solution to $$A\mathbf{x} = \mathbf{b}$$ iff \begin{align*} A^t A \mathbf{x} = A^t \mathbf{b} \end{align*}

The general setup here is that we would like to solve $$A\mathbf{x} = \mathbf{b}$$ for $$\mathbf{x}$$, where $$\mathbf{b}$$ is not in fact in the range of $$A$$. We thus settle for a unique “best” solution $$\tilde{\mathbf{x}}$$ such that the error $${\left\lVert {A\tilde{\mathbf{x}} - \mathbf{b}} \right\rVert}$$ is minimized.

Geometrically, the solution is given by projecting $$\mathbf{b}$$ onto the column space of $$A$$. To see why this is the case, define the residual vector $$\mathbf{r} = A\tilde{\mathbf{x}} - \mathbf{b}$$. We then seek to minimize $${\left\lVert {\mathbf{r}} \right\rVert}$$, which happens exactly when $$\mathbf{r} ^\perp\operatorname{im}A$$. But this happens exactly when $$\mathbf{r} \in (\operatorname{im}A)^\perp$$, which by the fundamental subspaces theorem, is equivalent to $$\mathbf{r} \in \ker A^T$$.

From this, we get the equation \begin{align*} A^T \mathbf{r} = \mathbf{0} \\ \implies A^T(A \tilde{\mathbf{x}} - \mathbf{b}) = \mathbf{0}\\ \implies A^TA\tilde{\mathbf{x}} = A^T \mathbf{b}, \end{align*}

where the last line is known as the normal equations.

If $$A$$ is an $$m\times n$$ matrix and is of full rank, so it has $$n$$ linearly independent columns, then one can show that $$A^T A$$ is nonsingular, and we thus arrive at the least-squares solution \begin{align*} \tilde{\mathbf{x}} = (A^TA)^{-1}A^T \mathbf{b} \hfill\blacksquare \end{align*}
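A minimal numpy sketch comparing the normal-equations solution against numpy's built-in least-squares routine (the data here is an assumed example with full-rank $$A$$):

```python
# Solving A^T A x = A^T b directly, then cross-checking with lstsq.
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_normal, x_lstsq)

# the residual lies in ker(A^T), i.e. it is orthogonal to colspace(A)
r = A @ x_normal - b
assert np.allclose(A.T @ r, 0)
```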

These equations can also be derived explicitly using Calculus applied to matrices, vectors, and inner products. This requires the use of the following formulas: \begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{a}} \right\rangle} &= \mathbf{a} \\ {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{A}\mathbf{x}} \right\rangle} &= (A+A^T)\mathbf{x} \end{align*}

as well as the adjoint formula \begin{align*} {\left\langle {A\mathbf{x}},~{\mathbf{y}} \right\rangle} = {\left\langle {\mathbf{x}},~{A^T \mathbf{y}} \right\rangle} .\end{align*}

From these, by letting $$A=I$$ we can derive \begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\lVert {\mathbf{x}} \right\rVert}^2 = {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle} = 2\mathbf{x} .\end{align*}

The derivation proceeds by solving the equation \begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\lVert {\mathbf{b} - A\mathbf{x}} \right\rVert}^2 = \mathbf{0} .\end{align*}

## Normal Forms

Every square matrix over $$\mathbb{C}$$ is similar to a matrix in Jordan canonical form.

## Decompositions

### The QR Decomposition

Gram-Schmidt is often used to find an orthonormal basis for, say, the range of some matrix $$A$$. With a small modification to this algorithm, we can write $$A = QR$$ where $$R$$ is upper triangular and $$Q$$ has orthonormal columns.

Why is this useful? One reason is that this also allows for a particularly simple expression of least-squares solutions. If $$A=QR$$, then $$R$$ will be invertible, and a bit of algebraic manipulation will show that \begin{align*} \tilde{\mathbf{x}} = R^{-1}Q^T\mathbf{b} .\end{align*}

How does it work? You simply perform Gram-Schmidt to obtain $$\left\{{\mathbf{u}_i}\right\}$$, then \begin{align*}Q = [\mathbf{u}_1, \mathbf{u}_2, \cdots ].\end{align*}

The matrix $$R$$ can then be written as

\begin{align*} r_{ij} = \begin{cases} {\left\langle {\mathbf{u}_i},~{\mathbf{x}_j} \right\rangle}, & i\leq j, \\ 0, & \text{else}. \end{cases} \end{align*}

Explicitly, this yields the matrix \begin{align*} R = \begin{bmatrix} {\left\langle {\mathbf{u}_1},~{\mathbf{x}_1} \right\rangle} & {\left\langle {\mathbf{u}_1},~{\mathbf{x}_2} \right\rangle} & {\left\langle {\mathbf{u}_1},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ 0 & {\left\langle {\mathbf{u}_2},~{\mathbf{x}_2} \right\rangle} & {\left\langle {\mathbf{u}_2},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ 0 & 0 & {\left\langle {\mathbf{u}_3},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ \vdots & \vdots & \vdots & \ddots \\ \end{bmatrix} \end{align*}
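The construction of $$Q$$ and $$R$$ above can be sketched directly. A minimal numpy implementation of classical Gram-Schmidt (the function name `gram_schmidt_qr` is illustrative; it assumes $$A$$ has linearly independent columns):

```python
# Classical Gram-Schmidt producing Q with orthonormal columns and R
# upper triangular with r_ij = <u_i, x_j> for i <= j, so that A = QR.
import numpy as np

def gram_schmidt_qr(A):
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # <u_i, x_j>
            v -= R[i, j] * Q[:, i]        # subtract the projection onto u_i
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]             # normalize to get u_j
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
assert np.allclose(Q.T @ Q, np.eye(2))    # orthonormal columns
assert np.allclose(Q @ R, A)              # A = QR
assert np.allclose(R, np.triu(R))         # R is upper triangular
```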

# Appendix: Lists of things to know

Textbook: Leon, Linear Algebra with Applications

## Topics

• 1.6: Partition Matrices
• 3.5: Change of Basis
• 4.1: Linear Transformations
• 4.2: Matrix Representations
• 4.3: Similarity
• Exam 1
• 5.1: Scalar Product in $${\mathbb{R}}^n$$
• 5.2: Orthogonal Subspaces
• 5.3: Least Squares
• 5.4: Inner Product Spaces
• 5.5: Orthonormal Sets
• 5.6: Gram-Schmidt
• 6.1: Eigenvalues and Eigenvectors
• Exam 2
• 6.2: Systems of Linear Differential Equations
• 6.3: Diagonalization
• 6.7: Positive Definite Matrices
• 6.5: Singular Value Decomposition
• 7.7: The Moore-Penrose Pseudo-Inverse
• Final Exam

## Definitions

• System of equations
• Homogeneous system
• Consistent/inconsistent system
• Matrix
• Matrix (i.e. $$A \mathbf{x} = \mathbf{b}$$)
• Inverse matrix
• Singular matrix
• Determinant
• Trace
• Rank
• Elementary row operation
• Row equivalence
• Pivot
• Row Echelon Form
• Reduced Row Echelon Form
• Gaussian elimination
• Block matrix
• Vector space
• Vector subspace
• Linear transformation
• Span
• Linear independence
• Basis
• Change of basis
• Dimension
• Row space
• Column space
• Image
• Null space
• Kernel
• Direct sum
• Projection
• Orthogonal subspaces
• Orthogonal complement
• Normal equations
• Least-squares solution
• Orthonormal
• Eigenvalue
• Eigenvector
• Characteristic polynomial
• Similarity
• Diagonalizable
• Inner product
• Bilinearity
• Multilinearity
• Defective
• Singular value decomposition
• QR factorization
• Gram-Schmidt process
• Spectral theorem
• Symmetric matrix
• Orthogonal matrix
• Positive-definite

## Lower-division review

• Systems of linear equations
• Consistent vs. Inconsistent
• Possibilities for solutions
• Geometric interpretation
• Matrix Inverses
• Detecting if a matrix is singular
• Computing the inverse
• Formula for 2x2 case
• Augment with the identity
• Cramer’s Rule
• Vector Spaces
• Definition in terms of closures
• Span
• Linear Independence
• Subspace and the subspace test
• Basis
• Common Computations
• Reduction to RREF
• Eigenvalues and eigenvectors
• Basis for the column space
• Basis for the nullspace
• Basis for the eigenspace
• Construct matrix from a given linear map
• Construct change of basis matrix
• Construct matrix projection onto subspace
• Convert a basis to an orthonormal basis

## Things to compute

• Construct a matrix representing a linear map
• With respect to the standard basis in both domain and range
• With respect to a nonstandard basis in the range
• With respect to a nonstandard basis in the domain
• With respect to nonstandard bases in both the domain and range
• Construct a change of basis matrix
• Check that a map is a linear transformation
• Compute the following spaces of a matrix and their orthogonal complements:
• Row space
• Column space
• Null space
• Compute the shortest distance between a point and a plane
• Compute the least squares solution to linear system
• Prove that something is a vector space
• Prove that a map is an inner product
• Compute determinants
• Compute the RREF of a matrix
• Compute characteristic polynomials, eigenvalues, and eigenvectors
• Diagonalize a matrix
• Solve a system of ODEs resulting arising from tank mixing
• Compute the singular value decomposition of a matrix
• Compute the rank and nullity of a matrix
• Convert a set of vectors to a basis
• Convert a basis to an orthonormal basis
• Determine if a matrix is diagonalizable
• Compute the matrix for a projection onto a subspace
• Find the QR factorization of a matrix

## Things to prove

• Prove facts about block matrices
• Prove facts about injective linear maps
• Prove facts about similar matrices
• Prove facts about orthogonal spaces and orthogonal complements
• Prove facts about inner products
• Prove facts about orthonormal sets
• Understand when a matrix can be diagonalized
• Prove facts about diagonalizable matrices
• Prove facts about the orthogonal decomposition theorem

# Ordinary Differential Equations

## Techniques Overview

\begin{align*} p(y)y' = q(x) && \hspace{10em} \text{separable} \\ \\ y'+p(x)y = q(x) && \text{integrating factor} \\ \\ y' = f(x,y), f(tx,ty) = f(x,y) && y = xV(x)\text{ COV reduces to separable} \\ \\ y' +p(x)y = q(x)y^n && \text{Bernoulli, divide by } y^n \text{ and COV } u = y^{1-n} \\ \\ M(x,y)dx + N(x,y)dy = 0 && M_y = N_x: \phi(x,y) = c \quad (\phi_x =M, \phi_y = N) \\ \\ P(D)y = 0 && x^ke^{rx},\ 0 \leq k < m, \text{ for each root } r \text{ of multiplicity } m \end{align*}

where a complex root $$z = a + bi$$ in $$e^{zx}$$ yields $$e^{ax}\cos bx,\ e^{ax}\sin bx$$.

## Types of Equations

• Separable equations: \begin{align*}p(y)\frac{dy}{dx} - q(x) = 0 \implies \int p(y) dy = \int q(x) dx + C\end{align*} \begin{align*} \frac{dy}{dx} = f(x)g(y) \implies \int \frac{1}{g(y)}dy = \int f(x) dx + C \end{align*}
• Population growth: \begin{align*}\frac{dP}{dt} = kP \implies \qquad P = P_0 e^{kt}\end{align*}
• Logistic growth:
• General form: $$\frac{dP}{dt} =(B(t) - D(t))P(t)$$
• Assume birth rate is constant $$B(t) = B_0$$ and death rate is proportional to instantaneous population $$D(t) = D_0 P(t)$$. Then let $$r = B_0, C = B_0/D_0$$ be the carrying capacity: \begin{align*}\frac{dP}{dt} = r\left( 1 - \frac{P}{C} \right)P \implies \qquad P(t) = \frac{P_0}{\frac{P_0}{C} + e^{-rt}(1 - \frac{P_0}{C})}\end{align*}
• First order linear: \begin{align*}\frac{dy}{dx} + p(x)y = q(x) \implies I(x) = e^{\int p(x) dx},\qquad y(x) = \frac{1}{I(x)}\left(\int q(x)I(x) dx + C\right)\end{align*}
• Exact:
• $$M(x,y)dx + N(x,y)dy = 0 \text{ is exact } \iff \exists \phi: \frac{\partial\phi}{\partial x} = M(x, y),~\frac{\partial\phi}{\partial y} = N(x, y) \\ \iff \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$$
• General solution: \begin{align*}\phi(x, y) = \int^x M(s, y) ds + \int^y N(x, t) dt - \int^y \frac{\partial}{\partial t} \left(\int^x M(s, t) ds\right)dt\end{align*} (where $$\int^x f(t) dt$$ means take the antiderivative of $$f$$ and consider it a function of $$x$$)
• Cauchy Euler: #todo
• Bernoulli: todo

## Linear Homogeneous

General form: \begin{align*} y^{(n)} + c_{n-1} y^{(n-1)} + \cdots + c_2y'' + c_1y' + c_0y = 0 \\ p(D)y = \prod (D-r_i)^{m_i} y= 0 \end{align*} where $$p$$ is a polynomial in the differential operator $$D$$ with roots $$r_i$$ of multiplicity $$m_i$$:

• Real roots: contribute $$m_i$$ solutions of the form \begin{align*}e^{rx}, xe^{rx}, \cdots, x^{m_i-1}e^{rx}\end{align*}

• Complex conjugate roots: for $$r=a+bi$$, contribute $$2m_i$$ solutions of the form \begin{align*}e^{(a\pm bi)x}, xe^{(a\pm bi)x}, ~\cdots,~ x^{m_i-1}e^{(a\pm bi)x} \\ = e^{ax}\cos(bx), e^{ax}\sin(bx),~ xe^{ax}\cos(bx), xe^{ax}\sin(bx),~ \cdots,~ \end{align*}

Example: by cases, for a second order equation of the form \begin{align*}ay'' + by' + cy = 0:\end{align*}

• Two distinct real roots: $$c_1 e^{r_1 x} + c_2 e^{r_2 x}$$
• One repeated real root: $$c_1 e^{rx} + c_2 x e^{rx}$$
• Complex conjugates $$\alpha \pm i \beta$$: $$e^{\alpha x}(c_1 \cos \beta x + c_2 \sin \beta x)$$
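The complex-conjugate case can be sanity-checked numerically. A minimal pure-Python sketch (the equation $$y'' + 2y' + 5y = 0$$, with roots $$-1 \pm 2i$$, is an assumed example), using central finite differences:

```python
# Check that y = e^{-x}(c1 cos 2x + c2 sin 2x) solves y'' + 2y' + 5y = 0,
# whose characteristic roots are -1 +/- 2i (alpha = -1, beta = 2).
import math

def y(x, c1=1.0, c2=1.0):
    return math.exp(-x) * (c1 * math.cos(2 * x) + c2 * math.sin(2 * x))

h = 1e-5
for x in [0.0, 0.5, 1.3]:
    yp = (y(x + h) - y(x - h)) / (2 * h)            # approximate y'
    ypp = (y(x + h) - 2 * y(x) + y(x - h)) / h**2   # approximate y''
    residual = ypp + 2 * yp + 5 * y(x)
    assert abs(residual) < 1e-4  # vanishes up to discretization error
```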

## Linear Inhomogeneous

General form: \begin{align*} y^{(n)} + c_{n-1} y^{(n-1)} + \cdots + c_2y'' + c_1y' + c_0y = F(x) \\ p(D)y = \prod (D-r_i)^{m_i} y = F(x) \end{align*}

Then solutions are of the form $$y_c + y_p$$, where $$y_c$$ is the solution to the associated homogeneous system and $$y_p$$ is a particular solution.

Methods of obtaining particular solutions

### Undetermined Coefficients

• Find an operator $$q(D)$$ that annihilates $$F(x)$$ (so $$q(D)F = 0$$)
• Find the general solution of $$q(D)p(D)y = 0$$, subtract off the known solutions from the homogeneous part to obtain the form of the trial solution $$A_0f(x)$$, where $$A_0$$ is the undetermined coefficient
• Substitute the trial solution into the original equation to determine $$A_0$$

Useful Annihilators: \begin{align*} &F(x) = p(x): & D^{\deg(p)+1} \\ &F(x) = p(x)e^{ax}: & (D-a)^{\deg(p)+1}\\ &F(x) = \cos(ax) + \sin(ax): & D^2 + a^2\\ &F(x) = e^{ax}(a_0\cos(bx) + b_0\sin(bx)): & (D-z)(D-{\overline{{z}}}) = D^2 -2aD + a^2 + b^2 \\ &F(x) = p(x)e^{ax}\cos(bx) + q(x)e^{ax}\sin(bx): & \left( (D-z)(D-{\overline{{z}}})\right)^{\max(\deg(p), \deg(q))+ 1} \end{align*} where $$z = a + bi$$.

## Systems of Differential Equations

General form: \begin{align*} \frac{\partial \mathbf{x}(t) }{\partial t} = A\mathbf{x}(t) + \mathbf{b}(t) \iff \mathbf{x}'(t) = A\mathbf{x}(t) + \mathbf{b}(t) \end{align*}

General solution to homogeneous equation: \begin{align*} c_1\mathbf{x_1}(t) + c_2\mathbf{x_2}(t)+ \cdots +c_n\mathbf{x_n}(t) = \mathbf{X}(t)\mathbf{c} \end{align*}

If $$A$$ is a matrix of constants, then $$\mathbf{x}(t) = e^{\lambda_i t}~\mathbf{v}_i$$ is a solution for each eigenvalue/eigenvector pair $$(\lambda_i, \mathbf{v}_i)$$. If $$A$$ is defective, you'll need generalized eigenvectors.

Inhomogeneous Equation: particular solutions given by \begin{align*} \mathbf{x}_p(t) = \mathbf{X}(t) \int^t \mathbf{X}^{-1}(s)\mathbf{b}(s) ~ds \end{align*}

## Laplace Transforms

Definitions: \begin{align*} H_ { a } ( t ) \mathrel{\vcenter{:}}=\left\{ \begin{array} { l l } { 0 , } & { 0 \leq t < a } \\ { 1 , } & { t \geq a } \end{array} \right.\\ \delta(t): \int_{\mathbb{R}}\delta(t-a)f(t)~dt &= f(a),\quad \int_{\mathbb{R}}\delta(t-a)~dt = 1\\ (f \ast g )(t) &= \int_0^t f(t-s)g(s)~ds \\ L[f(t)] &= L[f] =\int_0^\infty e^{-st}f(t)dt = F(s) .\end{align*} Useful property: for $$a\leq b$$, $$H_a(t) - H_b(t) = \chi_{[a,b)}(t)$$. \begin{align*} t^n, n\in{\mathbb{N}}\quad&\iff &\frac{n!}{s^{n+1}},\quad &s > 0 \\ t^{-\frac{1}{2}} \quad&\iff &\sqrt{\pi} s^{-\frac{1}{2}}\quad & s>0\\ e^{at} \quad&\iff &\frac{1}{s-a},\quad &s > a \\ \cos(bt) \quad&\iff &\frac{s}{s^2+b^2},\quad &s>0 \\ \sin(bt) \quad&\iff &\frac{b}{s^2+b^2},\quad &s>0 \\ \cosh(bt) \quad&\iff &\frac{s}{s^2 - b^2},\quad &s>{\left\lvert {b} \right\rvert} \\ \sinh(bt) \quad&\iff &\frac{b}{s^2-b^2},\quad &s>{\left\lvert {b} \right\rvert} \\ \delta(t-a) \quad&\iff &e^{-as} \quad& \\ H_a(t) \quad&\iff &s^{-1}e^{-as}\quad& \\ e^{at}f(t) \quad&\iff &F(s-a)\quad & \\ H_a(t)f(t-a) \quad&\iff &e^{-as}F(s)& \\ f'(t) \quad&\iff & sL(f) - f(0) & \\ f''(t) \quad&\iff &s^2L(f) -sf(0) - f'(0) &\\ f^{(n)}(t) \quad&\iff & s^nL(f) - \sum_{i=0}^{n-1} s^{n-1-i}f^{(i)}(0) & \\ (f \ast g)(t) \quad&\iff &F(s) G(s)\quad & \end{align*}

• For $$f$$ periodic with period $$T$$, $$L(f) = \frac{1}{1-e^{-sT}}\int_0^T e^{-st}f(t)~dt$$
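A table entry can be checked by direct numerical integration over a long interval. A minimal pure-Python sketch (the helper `laplace` and the specific values $$s=2, b=3$$ are illustrative), verifying $$L[\sin(bt)] = b/(s^2+b^2)$$:

```python
# Approximate L[f](s) = int_0^inf e^{-st} f(t) dt with the composite
# trapezoid rule on [0, T], where T is large enough that e^{-sT} is
# negligible.
import math

def laplace(f, s, T=20.0, n=100_000):
    h = T / n
    total = 0.5 * (f(0.0) + math.exp(-s * T) * f(T))
    for k in range(1, n):
        t = k * h
        total += math.exp(-s * t) * f(t)
    return total * h

s, b = 2.0, 3.0
approx = laplace(lambda t: math.sin(b * t), s)
exact = b / (s ** 2 + b ** 2)      # the table entry b/(s^2 + b^2)
assert abs(approx - exact) < 1e-5
```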


\begin{align*} L[e^{at} f(t)] = \int_0^\infty e^{(a-s)t}f(t)dt = F(s-a) .\end{align*}

The general technique for solving differential equations with Laplace Transforms:

• Take the Laplace Transform of all terms on both sides.
• Solve for $$L[y]$$ in terms of $$s$$.
• Attempt an inverse Laplace Transformation.
• This may involve partial fraction decomposition, completing the square, and splitting numerators to match terms with known inverse transformations.

## The Wronskian

For a collection of $$n$$ functions $$f_i: {\mathbb{R}} \to {\mathbb{R}}$$, define the $$n\times 1$$ column vector \begin{align*} W(f_i)(\mathbf{p}) \mathrel{\vcenter{:}}= \begin{bmatrix} f_i(\mathbf{p}) \\ f_i'(\mathbf{p}) \\ f_i''(\mathbf{p}) \\ \vdots \\ f_i^{(n-1)}(\mathbf{p}) \end{bmatrix} .\end{align*}

The Wronskian of this collection is defined as \begin{align*} W(f_1, \cdots, f_n)(\mathbf{p}) \mathrel{\vcenter{:}}= \det \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ W(f_1)(\mathbf{p}) & W(f_2)(\mathbf{p}) & \cdots & W(f_n)(\mathbf{p})\\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \end{bmatrix} .\end{align*}

If there exists some $$x_0 \in I$$ with $$W(x_0) \neq 0$$, then the set of functions $$\left\{{f_i}\right\}$$ is linearly independent on $$I$$.

The converse fails: $$W \equiv 0$$ on $$I$$ does not imply that $$\left\{{f_i}\right\}$$ is linearly dependent! Counterexample: $$\left\{{x^2, x{\left\lvert {x} \right\rvert}}\right\}$$ on $$(-1, 1)$$, where $$W \equiv 0$$ but the two functions are linearly independent.

### Linear Equations of Order $$n$$

The standard form of such equations is \begin{align*} y^{(n)} + a_1y^{(n-1)} + a_2y^{(n-2)} + \cdots + a_{n-2}y'' + a_{n-1}y' + a_ny = F(x). \end{align*}

All solutions will be the sum of the solution to the associated homogeneous equation and a single particular solution.

In the homogeneous case, examine the discriminant of the characteristic polynomial. Three cases arise: distinct real roots, repeated real roots, and pairs of complex conjugate roots.

That is, every real root contributes a term of $$ce^{rx}$$, while a multiplicity of $$m$$ multiplies the solution by a polynomial in $$x$$ of degree $$m-1$$.

Every pair of complex conjugate roots $$r \pm i\omega$$ contributes a term $$e^{rx}(a\cos \omega x + b\sin \omega x)$$, where $$r$$ is the real part of the roots and $$\omega$$ is the imaginary part.

In the nonhomogeneous case, assume a solution in the most general form of $$F(x)$$, and substitute it into the equation to solve for the constant terms (the method of undetermined coefficients).

Use an annihilating operator to reduce a nonhomogeneous equation to a homogeneous one, writing the equation as a polynomial in the operator $$D$$.

$$F(x)$$ of the form $$e^{ax}\sin(kx)$$ can be handled by writing $$e^{ax}\sin(kx) = \operatorname{Im}\qty{e^{(a+ki)x}}$$

# Algebra

## To Sort

• Burnside’s Lemma

• Cauchy’s Theorem

• If $${\left\lvert {G} \right\rvert} = n = \prod p_i^{k_i}$$, then for each $$i$$ there exists a subgroup $$H$$ of order $$p_i$$.
• The Sylow Theorems

• If $${\left\lvert {G} \right\rvert} = n = \prod p_i^{k_i}$$, then for each $$i$$ and each $$1 \leq k_j \leq k_i$$ there exists a subgroup $$H$$ of order $$p_i^{k_j}$$.
• Galois Theory

• Order $$p$$: One, $$Z_p$$

• Order $$p^2$$: Two abelian groups, $$Z_{p^2}, Z_p^2$$

• Order $$p^3$$:

• 3 abelian $$Z_{p^3}, Z_p \times Z_{p^2}, Z_p^3$$,

• 2 nonabelian: one is $$Z_{p^2} \rtimes Z_p$$.

• The other is the quaternion group for p = 2 and a group of exponent p for p > 2.
• Order $$pq$$:

• $$p \mathrel{\Big|}q-1$$: Two groups, $$Z_{pq}$$ and $$Z_q \rtimes Z_p$$
• Else cyclic, $$Z_{pq}$$
• Every element in a permutation group is a product of disjoint cycles, and the order is the lcm of the order of the cycles.

• The product ideal $$IJ$$ is not just elements of the form $$ij$$, it is all sums of elements of this form! The product alone isn’t enough.

• The intersection of any number of ideals is also an ideal

## Big List of Notation

\begin{align*} C(x) = && \left\{{g\in G : gxg^{-1} = x}\right\} && \subseteq G && \text{Centralizer} \\ C_G(x) = && \left\{{gxg^{-1} : g\in G}\right\} && \subseteq G && \text{Conjugacy Class} \\ G_x = && \left\{{g.x : g\in G}\right\} && \subseteq X && \text{Orbit} \\ x_0 = && \left\{{g\in G : g.x = x}\right\} && \subseteq G && \text{Stabilizer} \\ Z(G) = && \left\{{x\in G: \forall g\in G,~ gxg^{-1} = x}\right\} && \subseteq G && \text{Center} \\ \mathrm{Inn}(G) = && \left\{{\phi_g(x) = gxg^{-1} }\right\} && \subseteq {\operatorname{Aut}}(G) && \text{Inner Aut.} \\ \mathrm{Out}(G) = && {\operatorname{Aut}}(G) / \mathrm{Inn}(G) && && \text{Outer Aut.} \\ N(H) = && \left\{{g\in G: gHg^{-1} = H}\right\} && \subseteq G && \text{Normalizer} \end{align*}

## Group Theory

Notation: $$H < G$$ a subgroup, $$N < G$$ a normal subgroup, concatenation is a generic group operation.

• $${\mathbb{Z}}_n$$ the unique cyclic group of order $$n$$

• $$\mathbf{Q}$$ the quaternion group

• $$G^n = G\times G \times \cdots \times G$$, $$n$$ copies

• $$Z(G)$$ the center of $$G$$

• $$o(G)$$ the order of a group

• $$S_n$$ the symmetric group

• $$A_n$$ the alternating group

• $$D_n$$ the dihedral group of order $$2n$$

• Group Axioms

• Closure: $$a,b \in G \implies ab \in G$$
• Identity: $$\exists e\in G \mathrel{\Big|}a\in G \implies ae = ea = a$$
• Associativity: $$a,b,c \in G \implies (ab)c = a(bc)$$
• Inverses: $$a\in G \implies \exists b \in G \mathrel{\Big|}ab =ba = e$$
• Definitions:

• Order
• Of a group: $$o(G) = {\left\lvert {G} \right\rvert}$$, the cardinality of $$G$$
• Of an element: $$o(g) = \min\left\{{n\in {\mathbb{N}}: g^n = e}\right\}$$
• Index
• Center: the elements that commute with everything
• Centralizer: all elements that commute with a given element/subgroup.
• Group Action: a function $$f: G\times X \to X$$ satisfying
• $$x\in X, g_1,g_2 \in G \implies g_1.(g_2.x) = (g_1g_2). x$$
• $$x\in X \implies e.x = x$$
• Orbits partition any set
• Transitive Action
• Conjugacy Class: $$C \subset G$$ is a conjugacy class $$\iff$$
• $$x\in C, g\in G \implies gxg^{-1} \in C$$
• $$x,y \in C \implies \exists g\in G : gxg^{-1} = y$$
• i.e. subsets that are closed under $$G$$ acting on itself by conjugation and on which the action is transitive
• i.e. orbits under the conjugation action
• The order of any conjugacy class divides the order of $$G$$
• $$p$$-group: Any group of order $$p^n$$.
• Simple Group: no nontrivial normal subgroups
• Normal Series: $$0 {~\trianglelefteq~}H_0 {~\trianglelefteq~}H_1 \cdots {~\trianglelefteq~}G$$
• Composition Series: The successive quotients of the normal series
• Solvable: $$G$$ is solvable $$\iff$$ $$G$$ has an abelian composition series.
• One step subgroup test: \begin{align*} a,b \in H \implies a b^{-1} \in H \\ .\end{align*}

• Useful isomorphism invariants:

• Order profile of elements: $$n_1$$ elements of order $$p_1$$, $$n_2$$ elements of order $$p_2$$, etc
• Useful to look at elements of order $$2$$!
• Order profile of subgroups
• $$Z(A) \cong Z(B)$$
• Number of generators (generators are sent to generators)
• Number and size of conjugacy classes
• Number of Sylow$${\hbox{-}}p$$ subgroups.
• Commutativity
• “Being cyclic”
• Automorphism Groups
• Solvability
• Nilpotency
• Useful homomorphism invariants

• $$\phi(e) = e$$
• $${\left\lvert {g} \right\rvert} = m < \infty \implies {\left\lvert {\phi(g)} \right\rvert} = m$$
• Inverses, i.e. $$\phi(a)^{-1} = \phi(a^{-1})$$
• $$H < G \implies \phi(H) < G'$$
• $$H' < G' \implies \phi^{-1}(H') < G$$
• $${\left\lvert {G} \right\rvert} < \infty \implies {\left\lvert {\phi(G)} \right\rvert}$$ divides $${\left\lvert {G} \right\rvert}, {\left\lvert {G'} \right\rvert}$$

## Big Theorems

• Classification of Abelian Groups \begin{align*} G \cong {\mathbb{Z}}_{p_1^{k_1}} \oplus {\mathbb{Z}}_{p_2^{k_2}} \oplus \cdots \oplus {\mathbb{Z}}_{p_n^{k_n}} ,\end{align*} where $$(p_i, k_i)$$ are the set of elementary divisors of $$G$$.

• Isomorphism Theorems

\begin{align*} \phi: G \to G' \implies && \frac{G}{\ker{\phi}} \cong &~ \phi(G) \\ H {~\trianglelefteq~}G,~ K < G \implies && \frac{K}{H\cap K} \cong &~ \frac{HK}{H} \\ H,K {~\trianglelefteq~}G,~ K < H \implies && \frac{G/K}{H/K} \cong &~ \frac{G}{H} \end{align*}

• Lagrange’s Theorem: $$H < G \implies o(H) \mathrel{\Big|}o(G)$$

• Converse is false: $$o(A_4) = 12$$ but has no order 6 subgroup.
• The $$GZ$$ Theorem: $$G/Z(G)$$ cyclic implies that $$G \in \mathbf{Ab}$$.

• Orbit Stabilizer Theorem: $$G / x_0 \cong Gx$$

• The Class Equation

• Let $$G\curvearrowright X$$ and $$\mathcal{O}_i \subseteq X$$ be the nontrivial orbits, then \begin{align*} {\left\lvert {X} \right\rvert} = {\left\lvert { X_0 } \right\rvert} + \sum_{i} {\left\lvert {\mathcal{O}_i} \right\rvert} .\end{align*}
• The right hand side is the number of fixed points, plus a sum over all of the orbits of size greater than 1, where any representative within the orbit is chosen and we look at the index of its stabilizer in $$G$$.
• Let $$G\curvearrowright G$$ by conjugation and choose a representative $$x_i$$ from each nontrivial conjugacy class to obtain

\begin{align*} {\left\lvert {G} \right\rvert} = {\left\lvert {Z(G)} \right\rvert} + \sum_{i} \left[ G: C(x_i) \right] .\end{align*}

• Useful facts:

• $$H < G \in \mathbf{Ab} \implies H {~\trianglelefteq~}G$$
• Converse doesn’t hold, even if all subgroups are normal. Counterexample: $$\mathbf{Q}$$
• $$G / Z(G) \cong \mathrm{Inn}(G)$$
• $$H, K < G$$ with $$H \cong K \not\implies G/H \cong G/K$$
• Counterexample: $$G = {\mathbb{Z}}_4 \times{\mathbb{Z}}_2, H = \langle{(0,1)}\rangle, K = \langle{(2,0)}\rangle$$. Then $$G/H \cong {\mathbb{Z}}_4 \not\cong {\mathbb{Z}}_2^2 \cong G/K$$
• $$G\in\mathbf{Ab} \implies$$ for each $$p$$ dividing $$o(G)$$, there is an element of order $$p$$
• Any surjective homomorphism $$\phi: A \twoheadrightarrow B$$ between finite groups with $$o(A) = o(B)$$ is an isomorphism
• If $$G$$ is cyclic, for each $$d\mathrel{\Big|}{\left\lvert {G} \right\rvert}$$ there is exactly one subgroup of order $$d$$.
• Sylow Subgroups:

• Todo
• Big List of Interesting Groups

• $${\mathbb{Z}}_4, {\mathbb{Z}}_2^2$$
• $$D_4$$
• $$Q = \langle a , b | a ^ { 4 } = 1 , a ^ { 2 } = b ^ { 2 } , a b = b a ^ { 3 } \rangle$$ the quaternion group
• $$S_3$$, the smallest nonabelian group
• Chinese Remainder Theorem: \begin{align*} {\mathbb{Z}}_{pq} \cong {\mathbb{Z}}_p \oplus {\mathbb{Z}}_q \iff (p,q) = 1 \end{align*}

• Fundamental Theorem of Finitely Generated Abelian Groups:
• $$G = {\mathbb{Z}}^n \oplus \bigoplus {\mathbb{Z}}_{q_i}$$
• Finding all of the unique groups of a given order: #todo

### Cyclic Groups

• Generated by a single element: $$G = \langle a \rangle$$ for some $$a\in G$$
• For each $$d$$ dividing $$o(G)$$, there exists a subgroup $$H$$ of order $$d$$.
• If $$G = \langle a \rangle$$ with $$o(G) = n$$, then take $$H = \langle a^{\frac{n}{d}} \rangle$$

### The Symmetric Group

• Generated by:
• Transpositions
• #todo
• Cycle types: characterized by the multiset of cycle lengths in the disjoint cycle decomposition.
• Two elements are in the same conjugacy class $$\iff$$ they have the same cycle type.
• Inversions: given $$\tau = (p_1 \cdots p_n)$$, a pair $$p_i, p_j$$ is inverted iff $$i < j$$ but $$p_j < p_i$$
• Can count inversions $$N(\tau)$$
• Equal to the minimum number of adjacent transpositions needed to sort the permutation
• Sign of a permutation: $$\sigma(\tau) = (-1)^{N(\tau)}$$
• Parity of permutations forms the group $$({\mathbb{Z}}_2, +)$$
• even $$\circ$$ even = even
• odd $$\circ$$ odd = even
• even $$\circ$$ odd = odd

## Ring Theory

• Examples:
• Non-Examples:
• Definition of an Ideal
• Definitions of types of rings:
• Field
• Unique Factorization Domain (UFD)
• Principal Ideal Domain (PID)
• Euclidean Domain:
• Integral Domain
• Division Ring \begin{align*} \text{field} \implies \text{Euclidean Domain} \implies \text{PID} \implies \text{UFD} \implies \text{integral domain} .\end{align*}
• Counterexamples to inclusions are strict:
• An ED that is not a field: $${\mathbb{Z}}$$
• A PID that is not an ED: $${\mathbb{Z}}\qty[\frac{1+\sqrt{-19}}{2}]$$
• A UFD that is not a PID: $${\mathbb{Z}}[x]$$
• An integral domain that is not a UFD: $${\mathbb{Z}}[\sqrt{-5}]$$
• Integral Domains
• Unique Factorization Domains
• Prime Elements
• Prime Ideals
• Field Extensions
• The Chinese Remainder Theorem for Rings
• Polynomial Rings
• Irreducible Polynomials
• Over $${\mathbb{Z}}_2$$, the irreducibles of degree $$\leq 3$$: \begin{align*} x,~ x+1,~ x^2+x+1,~ x^3+x+1,~ x^3+x^2+1 .\end{align*}
• Eisenstein’s Criterion
• Gauss’ Lemma

# Number Theory

## Notation and Basic Definitions

\begin{align*} (a, b) \mathrel{\vcenter{:}}=\gcd(a, b) && \text{the greatest common divisor} \\ {\mathbb{Z}}_n && \text{the ring of integers} \mod n \\ {\mathbb{Z}}_n^{\times}&& \text{the group of units}\mod n .\end{align*}

A function $$f:{\mathbb{Z}}\to {\mathbb{Z}}$$ is said to be multiplicative iff \begin{align*} (a, b) = 1 \implies f(ab) = f(a) f(b) .\end{align*}

## Primes

Every integer $$n > 1$$ can be written uniquely as \begin{align*} n = \prod_{i=1}^m p_i^{k_i} \end{align*} where the $$p_i$$ are the $$m$$ distinct prime divisors of $$n$$.

Note that the number of distinct prime factors is $$m$$, while the total number of factors is $$\prod_{i=1}^m(k_i + 1)$$.

## Divisibility

\begin{align*} a{~\Bigm|~}b \iff b \equiv 0 \mod a \iff \exists k \text{ such that } ak = b \end{align*}

### $$\gcd, \operatorname{lcm}$$

$$\gcd(a, b)$$ can be computed by taking prime factorizations of $$a$$ and $$b$$, intersecting the primes occurring, and taking the lowest exponent that appears. Dually, $$\operatorname{lcm}(a, b)$$ can be computed by taking the union and the highest exponent.

\begin{align*} xy = \gcd{(x,y)}~\mathrm{lcm}{(x,y)} \end{align*}

If $$d\mathrel{\Big|}x$$ and $$d\mathrel{\Big|}y$$, then \begin{align*} \gcd(x,y) &= d\cdot \gcd\qty{ \frac x d, \frac y d} \\ \operatorname{lcm}(x,y) &= d\cdot \operatorname{lcm}\qty{ \frac x d, \frac y d} \end{align*}

\begin{align*} \gcd(x, y, z) &= \gcd(\gcd(x,y), z) \\ \gcd(x, y) &= \gcd(x\bmod y, y) \\ \gcd(x,y) &= \gcd(x-y, y) .\end{align*}
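These identities are easy to spot-check numerically; a small sketch with Python's `math.gcd` (the values $$84, 30$$ and common divisor $$6$$ are arbitrary choices for illustration):

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    # Uses the identity x*y = gcd(x, y) * lcm(x, y)
    return a * b // gcd(a, b)

x, y, d = 84, 30, 6          # d divides both x and y

assert x * y == gcd(x, y) * lcm(x, y)
assert gcd(x, y) == d * gcd(x // d, y // d)
assert lcm(x, y) == d * lcm(x // d, y // d)
assert gcd(x, y) == gcd(x % y, y) == gcd(x - y, y)
```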

### The Euclidean Algorithm

$$\gcd(a, b)$$ can be computed via the Euclidean algorithm, taking the final bottom-right coefficient.
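A sketch of the algorithm in Python; the extended version tracks the same quotients the tableau records and also produces Bézout coefficients $$x, y$$ with $$ax + by = \gcd(a,b)$$:

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b       # remainder sequence
    old_s, s = 1, 0       # running coefficients of a
    old_t, t = 0, 1       # running coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

g, x, y = extended_gcd(240, 46)
assert g == 2 and 240 * x + 46 * y == 2
```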

## Modular Arithmetic

Generally concerned with the multiplicative group of units $$({\mathbb{Z}}_n^{\times}, \times)$$.

The system \begin{align*} \begin{array} { c } { x \equiv a _ { 1 } \mod m _ { 1 } } \\ { x \equiv a _ { 2 } \mod m _ { 2 } } \\ { \vdots } \\ { x \equiv a _ { r } \mod m _ { r } } \end{array} \end{align*}

has a unique solution $$x \mod \prod m_i \iff (m_i, m_j) = 1$$ for each pair $$i,j$$, given by \begin{align*} x = \sum_{j=1}^r a_j \frac{\prod_i m_i}{m_j} \left[ \frac{\prod_i m_i}{m_j} \right]^{-1}_{\mod m_j} .\end{align*}
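The displayed formula translates directly into code. A sketch using Python's `pow(·, -1, m)` for the modular inverse (available in 3.8+); the test system is the classical example $$x \equiv 2 \pmod 3$$, $$x \equiv 3 \pmod 5$$, $$x \equiv 2 \pmod 7$$:

```python
from math import prod

def crt(residues: list[int], moduli: list[int]) -> int:
    """Solve x = a_j mod m_j for pairwise coprime m_j, via
    x = sum a_j * (M/m_j) * [(M/m_j)^{-1} mod m_j]  (mod M)."""
    M = prod(moduli)
    x = 0
    for a, m in zip(residues, moduli):
        M_j = M // m
        x += a * M_j * pow(M_j, -1, m)   # modular inverse of M/m_j mod m_j
    return x % M

assert crt([2, 3, 2], [3, 5, 7]) == 23   # unique solution mod 105
```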

\begin{align*} (a, n) = 1 \implies a^{\phi(n)} \equiv 1 \mod n .\end{align*}

\begin{align*} x^{p} &\equiv x \mod p \\ x^{p-1} &\equiv 1 \mod p \quad \text{ if } p \nmid x \end{align*}

### Diophantine Equations

Consider $$ax + by = c$$. This has solutions iff $$c = 0 \mod (a,b) \iff \gcd(a,b) \text{ divides } c$$.

### Computations

If $$x\equiv 0 \mod n$$, then $$x\equiv 0 \mod p^k$$ for all $$p^k$$ appearing in the prime factorization of $$n$$.

If there are factors of the modulus in the equation, peel them off with addition, using the fact that $$nk \equiv 0 \mod n$$. \begin{align*} x &\equiv nk + r \mod n \\ &\equiv r \mod n \end{align*}

So take $$x=463, n = 4$$, then use $$463 = 4\cdot 115 + 3$$ to write \begin{align*} 463 &\equiv y \mod 4 \\ \implies 4\cdot 115 + 3 &\equiv y \mod 4 \\ \implies 3&\equiv y\mod 4 .\end{align*}

For any $$n$$ and any $$d$$ dividing $$k$$, \begin{align*} x^k \mod n \equiv (x^{k/d} \bmod n)^d \mod n .\end{align*}

\begin{align*} 2^{25} &\equiv (2^5 \mod 5)^5 \mod 5 \\ &\equiv 2^5 \mod 5 \\ &\equiv 2 \mod 5 \end{align*}

Make things easier with negatives! For example, $$\mod 5$$, \begin{align*} 4^{25} &\equiv (-1)^{25} \mod 5\\ &\equiv (-1) \mod 5\\ &\equiv 4 \mod 5 \end{align*}
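Both tricks are mirrored by Python's three-argument `pow`, which does modular exponentiation directly; a quick check of the two examples above:

```python
# 2^25 mod 5, reduced via (2^5 mod 5)^5 mod 5 as in the text
assert pow(2, 25, 5) == pow(pow(2, 5, 5), 5, 5) == 2

# 4 = -1 mod 5, so 4^25 = (-1)^25 = -1 = 4 mod 5
assert pow(4, 25, 5) == pow(-1, 25, 5) == 4
```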

### Invertibility

\begin{align*} xa = xb \mod n \implies a = b \mod \frac{n}{(x,n)} .\end{align*}

\begin{align*} x\in {\mathbb{Z}}_n^{\times}\iff (x, n) = 1 ,\end{align*} and thus \begin{align*} {\mathbb{Z}}_n^\times = \left\{{1\leq x \leq n : (x,n) = 1}\right\} \end{align*} and $${\left\lvert {{\mathbb{Z}}_n^\times} \right\rvert} = \phi(n)$$.

One can reduce equations by dividing through by a unit. Pick any $$x$$ such that $$x{~\Bigm|~}a$$ and $$x{~\Bigm|~}b$$ with $$(x,n) = 1$$, then \begin{align*} a =b \mod n \implies \frac a x = \frac b x \mod n .\end{align*}

## The Totient Function

\begin{align*} \phi(n) = {\left\lvert {\left\{{1\leq x \leq n : (x,n) = 1}\right\}} \right\rvert} \end{align*}

\begin{align*} \phi(1) & = {\left\lvert {\left\{{1}\right\}} \right\rvert} = 1 \\ \phi(2) & = {\left\lvert {\left\{{1}\right\}} \right\rvert} = 1 \\ \phi(3) & = {\left\lvert {\left\{{1,2}\right\}} \right\rvert} = 2 \\ \phi(4) & = {\left\lvert {\left\{{1,3}\right\}} \right\rvert} = 2 \\ \phi(5) & = {\left\lvert {\left\{{1,2,3,4}\right\}} \right\rvert} = 4 \end{align*}

\begin{align*} \phi(p) & = p-1 \\ \phi(p^k) & = p^{k-1}(p-1) \\ \phi(n) &= n\prod_{i=1}^{m} \qty{1 - {1\over p_i}} \\ n &= \sum_{\tiny d{~\Bigm|~}n} \phi(d) \end{align*} where $$m$$ is the number of distinct prime divisors of $$n$$.

All numbers less than $$p$$ are coprime to $$p$$; there are $$p^k$$ numbers up to $$p^k$$ and the only ones not coprime to $$p^k$$ are the multiples of $$p$$, i.e. $$\left\{{p, 2p, \cdots, p^{k-1}\cdot p}\right\}$$, of which there are $$p^{k-1}$$, yielding $$p^k - p^{k-1}$$

Along with the fact that $$\phi$$ is multiplicative, so $$(p,q) = 1 \implies \phi(pq) = \phi(p)\phi(q)$$, compute this for any $$n$$ by taking the prime factorization.

With these properties, one can compute: \begin{align*} \phi(n) &= \phi\qty{ \prod_i p_i^{k_i}} \\ &= \prod_i p_i^{k_i-1}(p_i-1) \\ &= n \left(\frac{\prod_i (p_i-1)}{\prod_i p_i}\right) \\ &= n\prod_i \qty{ 1 - \frac{1}{p_i}} \end{align*}
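The product formula gives a direct implementation by trial division; a sketch, checked against the defining count of coprime residues:

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's totient via the product formula n * prod(1 - 1/p)."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # multiply result by (1 - 1/p)
            while m % p == 0:          # strip all copies of this prime
                m //= p
        p += 1
    if m > 1:                          # leftover prime factor > sqrt(n)
        result -= result // m
    return result

# Agrees with the definition |{1 <= x <= n : gcd(x, n) = 1}|
for n in range(1, 50):
    assert phi(n) == sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)
```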

\todo[inline]{Check and explain}

$$x$$ is a quadratic residue $$\bmod n$$ iff there exists an $$a$$ such that $$a^2 = x \mod n$$.

In $${\mathbb{Z}}_p^{\times}$$ for an odd prime $$p$$, exactly half of the elements (the even powers of a generator) are quadratic residues.

\begin{align*} -1\text{ is a quadratic residue in } {\mathbb{Z}}_p \iff p = 1 \mod 4 .\end{align*}

## Primality Tests

If $$n$$ is prime, then \begin{align*} (a, n) = 1 \implies a^{n-1} \equiv 1 \mod n \end{align*}

If $$n$$ is prime, then \begin{align*} x^2 \equiv 1 \mod n \implies x \equiv \pm 1 \mod n, \end{align*} so exhibiting a nontrivial square root of $$1$$ proves $$n$$ composite.
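The contrapositive of Fermat's little theorem gives a quick compositeness test; a sketch, with the caveat that Carmichael numbers such as 561 pass for every coprime base:

```python
def fermat_test(n: int, bases=(2, 3, 5, 7)) -> bool:
    """Fermat test: if a^(n-1) != 1 mod n for some a coprime to n,
    n is certainly composite; passing is only evidence of primality."""
    return all(pow(a, n - 1, n) == 1 for a in bases if n % a != 0)

assert fermat_test(97)          # prime, so it passes
assert not fermat_test(91)      # 91 = 7 * 13 is caught (e.g. by base 2)
assert fermat_test(561)         # Carmichael number: fools every coprime base!
```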

## Sequences in Metric Spaces

Every bounded sequence in $${\mathbb{R}}^n$$ has a convergent subsequence (Bolzano–Weierstrass).

In $${\mathbb{R}}^n, X$$ is compact $$\iff X$$ is closed and bounded.

Necessity of $${\mathbb{R}}^n$$: $$X = ({\mathbb{Z}}, d)$$ with $$d(x,y) = 1$$ for $$x\neq y$$ is closed, complete, and bounded, but not compact since $$\left\{{1,2,\cdots}\right\}$$ has no convergent subsequence

The converse holds in a general metric space iff bounded is replaced with totally bounded: compact $$\iff$$ complete and totally bounded

# Sequences

Notation: $$\left\{{a_n}\right\}_{n\in{\mathbb{N}}}$$ is a sequence, $$\displaystyle\sum_{i\in{\mathbb{N}}} a_i$$ is a series.

## Known Examples

• Known sequences: let $$c$$ be a constant. \begin{align*} c, c^2, c^3, \ldots &= \left\{{c^n}\right\}_{n=1}^\infty \to 0 && \forall {\left\lvert {c} \right\rvert} < 1 \\ \\ \frac{1}{c},\frac{1}{c^2},\frac{1}{c^3},\ldots &= \left\{{\frac{1}{c^n}}\right\}_{n=1}^\infty \to 0 &&\forall {\left\lvert {c} \right\rvert} > 1 \\ \\ 1,\frac{1}{2^c},\frac{1}{3^c},\ldots &= \left\{{\frac{1}{n^c}}\right\}_{n=1}^\infty \to 0 && \forall c > 0 \end{align*}

## Convergence

A sequence $$\left\{{x_j}\right\}$$ converges to $$L$$ iff \begin{align*} \forall \varepsilon > 0,\, \exists N > 0 \text{ such that } \quad n\geq N \implies {\left\lvert {x_n - L} \right\rvert} < \varepsilon .\end{align*}

\begin{align*} b_n \leq a_n \leq c_n \text{ and } b_n,c_n \to L \implies a_n \to L \end{align*}

If $$\left\{{a_j}\right\}$$ monotone and bounded, then $$a_j \to L = \lim\sup a_i < \infty$$.

$${\left\lvert {a_m - a_n} \right\rvert} \to 0 \in {\mathbb{R}}\implies \left\{{a_i}\right\}$$ converges.

### Checklist

• Is the sequence bounded?

• $$\left\{{a_i}\right\}$$ not bounded $$\implies$$ not convergent
• If bounded, is it monotone?
• $$\left\{{a_i}\right\}$$ bounded and monotone $$\implies$$ convergent
• Use algebraic properties of limits

• Epsilon-delta definition

• Algebraic properties and manipulation:

• Limits commute with $$\pm, \times$$, and $$\div$$ (when the denominator's limit is nonzero), and $$\lim C = C$$ for constants.

• E.g. Divide all terms by $$n$$ before taking limit

• Clear denominators

# Sums (“Series”)

A series is a sum of the form $$\sum_{i\in{\mathbb{N}}} a_i$$; a power series is a function of the form \begin{align*} f(x) = \sum_{j=1}^\infty c_j x^j .\end{align*}

## Known Examples

### The $$p$$-Series

\begin{align*} \sum_{k=1}^\infty k^p &< \infty &&\iff p < -1 \\ \sum_{k=1}^\infty \frac{1}{k^p} &< \infty &&\iff p > 1 \\ \sum_{k=1}^\infty \frac{1}{k} &= \infty && \end{align*}

### Convergent

\begin{align*} \sum_{n=1}^\infty \frac{1}{n^2} & < \infty \\ \sum_{n=1}^\infty \frac{1}{n^3} & < \infty \\ \sum_{n=1}^\infty \frac{1}{n^\frac{3}{2}} & < \infty \\ \sum_{n=0}^\infty \frac{1}{n!} & = e \\ \sum_{n=1}^\infty \frac{1}{c^n} & = \frac{1}{c-1} \quad (c > 1) \\ \sum_{n=1}^\infty (-1)^n \frac{1}{c^n} & = -\frac{1}{c+1} \quad (c > 1) \\ \sum_{n=1}^\infty (-1)^{n+1} \frac{1}{n} & = \ln 2 \end{align*}

### Divergent

\begin{align*} \sum_{n=1}^\infty \frac{1}{n} = \infty \\ \sum_{n=1}^\infty \frac{1}{\sqrt n} = \infty \end{align*}

## Convergence

Useful reference: http://math.hawaii.edu/~ralph/Classes/242/SeriesConvTests.pdf

$$a_n\to 0$$ does not imply $$\sum a_n < \infty$$. Counterexample: the harmonic series.
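A numeric illustration of the counterexample: the terms shrink to zero, but the partial sums track $$\ln N$$ and grow without bound ($$N = 2000$$ is an arbitrary cutoff):

```python
from math import log

N = 2000
terms = [1 / n for n in range(1, N + 1)]
partial_sum = sum(terms)

assert terms[-1] < 1e-3              # a_n -> 0 ...
assert partial_sum > log(N)          # ... yet H_N ~ ln N + 0.577 keeps growing
```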

Absolute convergence $$\implies$$ convergence

\begin{align*} \sum a_i \text{ converges } \implies \lim_{i\to\infty} a_i = 0 \end{align*}

### The Big Tests

• $$0 \leq a_n \leq b_n$$ and $$\sum b_n < \infty \implies \sum a_n < \infty$$
• $$0 \leq b_n \leq a_n$$ and $$\sum b_n = \infty \implies \sum a_n = \infty$$

\begin{align*} R =\lim_{n\to\infty} {\left\lvert {\frac{a_{n+1}}{a_n}} \right\rvert} \end{align*}

• $$R < 1$$: absolutely convergent
• $$R > 1$$: divergent
• $$R = 1$$: inconclusive

\begin{align*} R = \limsup_{n \to \infty} \sqrt[n]{{\left\lvert {a_n} \right\rvert}} \end{align*}

• $$R < 1$$: convergent
• $$R > 1$$: divergent
• $$R = 1$$: inconclusive

\begin{align*} f \text{ positive and decreasing},~ f(n) = a_n \implies \sum a_n < \infty \iff \int_1^\infty f(x) dx < \infty \end{align*}

\begin{align*} a_n, b_n > 0,~ \lim_{n\to\infty}\frac{a_n}{b_n} = L \in (0, \infty) \implies \qty{ \sum a_n < \infty \iff \sum b_n < \infty } \end{align*}

\begin{align*} a_n \downarrow 0 \implies \sum (-1)^n a_n < \infty \end{align*}

\begin{align*} \sum_{n=1}^\infty {\left\lVert {f_n} \right\rVert}_\infty < \infty \implies \exists f\text{ such that } {\left\lVert { \sum_{n=1}^\infty f_n - f} \right\rVert}_\infty \to 0 \end{align*} In other words, the series converges uniformly.

Slogan: Convergence of the sup norms implies uniform convergence.

The $$M$$ in the name comes from defining $$M_n \mathrel{\vcenter{:}}= \sup_x {\left\lvert {f_n(x)} \right\rvert}$$ and requiring $$\sum M_n < \infty$$.

### Checklist

• Do the terms tend to zero?
• $$a_i \not\to 0 \implies \sum a_i = \infty$$.
• Can check with L’Hopital’s rule
• There are exactly 6 tests at our disposal:
• Comparison, root, ratio, integral, limit, alternating
• Is the series alternating?
• If so, does $$a_n \downarrow 0$$?
• If so, convergent
• Is this series bounded above by a known convergent series?
• $$p$$ series with $$p>1$$, i.e. : $$\sum a_n \leq \sum \frac{1}{n^p} < \infty$$
• Geometric series with $${\left\lvert {x} \right\rvert} < 1$$, i.e. $$\sum a_n \leq \sum x^n$$
• Is this series bounded below by a known divergent series?
• $$p$$ series with $$p\leq 1$$, i.e. $$\infty = \sum \frac{1}{n^p} \leq \sum a_i$$
• Are the ratios strictly less than or greater than 1?
• $$<1 \implies$$ convergent
• $$>1 \implies$$ divergent
• Does the integral analogue converge?
• Integral converges $$\iff$$ sum converges
• Try the root test
• $$<1 \implies$$ convergent
• $$>1 \implies$$ divergent
• Try the limit test
• Attempt to divide each term to obtain a known convergent/divergent series

Some Pattern Recognition:

• $$(\text{stuff})!$$: Ratio test (only test that will work with factorials!!)
• $$(\text{stuff})^n$$: Root test or ratio test
• Replace $$a_n$$ with an $$f(x)$$ that’s easy to integrate - integral test
• $$p(x)$$ or $$\sqrt{p(x)}$$: comparison or limit test

Use the fact that \begin{align*} \lim_{k\to\infty} {\left\lvert {\frac{a_{k+1}x^{k+1}}{a_kx^k}} \right\rvert} = {\left\lvert {x} \right\rvert}\lim_{k\to\infty} {\left\lvert {\frac{a_{k+1}}{a_k}} \right\rvert} < 1 \implies \sum a_k x^k < \infty ,\end{align*} so take $$L \mathrel{\vcenter{:}}=\lim_{k\to\infty} \frac{a_{k+1}}{a_k}$$ and then obtain the radius as \begin{align*} R = \frac{1}{L} = \lim_{k\to\infty} {a_k \over a_{k+1}} \end{align*}

• Note $$L=0 \implies$$ absolutely convergent everywhere
• $$L = \infty \implies$$ convergent only at $$x=0$$.
• Also need to check endpoints $$R, -R$$ manually.
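A numeric sketch of the ratio formula, using the assumed example $$a_k = 2^{-k}$$ (a geometric power series whose radius of convergence is exactly $$2$$):

```python
# R = lim a_k / a_{k+1}; for a_k = 1/2^k every ratio is already exactly 2
def a(k: int) -> float:
    return 1.0 / 2**k

ratios = [a(k) / a(k + 1) for k in range(10)]
assert all(abs(r - 2.0) < 1e-12 for r in ratios)
```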

# Real Analysis

## Notation

A function is continuously differentiable iff $$f$$ is differentiable and $$f'$$ is continuous.

Conventions:

• Integrable means Riemann integrable.

\begin{align*} f && \text{a functional }{\mathbb{R}}^n \to {\mathbb{R}}\\ \mathbf{f} && \text{a function } {\mathbb{R}}^n\to {\mathbb{R}}^m \\ A, E, U, V && \text{open sets} \\ A' && \text{the limit points of }A \\ \mkern 1.5mu\overline{\mkern-1.5muA\mkern-1.5mu}\mkern 1.5mu && \text{the closure of }A \\ A^\circ && \text{the interior of }A \\ K && \text{a compact set} \\ \mathcal{R}_A && \text{the space of Riemann integrable functions on }A \\ C^j(A) && \text{the space of }j\text{ times continuously differentiable functions }f: {\mathbb{R}}^n \to {\mathbb{R}}\\ \left\{{f_n}\right\} && \text{a sequence of functions} \\ \left\{{x_n}\right\} && \text{a sequence of real numbers}\\ f_n \to f && \text{pointwise convergence} \\ f_n \rightrightarrows f && \text{uniform convergence} \\ x_n \nearrow x && x_i\leq x_j \text{ and }x_j\text{ converges to }x \\ x_n \searrow x && x_i\geq x_j \text{ and }x_j\text{ converges to }x \\ \sum_{k\in {\mathbb{N}}} f_k && \text{a series}\\ D(f) && \text{the set of discontinuities of }f .\end{align*}

## Big Ideas

Summary for GRE:

• Limits,

• Continuity,

• Boundedness,

• Compactness,

• Definitions of topological spaces,

• Lipschitz continuity

• Sequences and series of functions.

• Know the interactions between the following major operations:

• Continuity (pointwise limits)
• Differentiability
• Integrability
• Limits of sequences
• Limits of series/sums
• The derivative of a continuous function need not be continuous

• A continuous function need not be differentiable

• A uniform limit of differentiable functions need not be differentiable

• A limit of integrable functions need not be integrable

• An integrable function need not be continuous

• An integrable function need not be differentiable

\begin{align*} f,g\text{ differentiable on } [a,b] \implies \exists c\in[a,b] : \left[f ( b ) - f ( a ) \right] g' ( c ) = \left[g ( b ) - g ( a )\right] f' ( c ) \end{align*}

?

## Commuting Limits

• Suppose $$f_n \to f$$ (pointwise, not necessarily uniformly)
• Let $$F(x) = \int f(t)$$ be an antiderivative of $$f$$
• Let $$f'(x) = \frac{\partial f}{\partial x}(x)$$ be the derivative of $$f$$.

Then consider the following possible ways to commute various limiting operations:

Does taking the derivative of the integral of a function always return the original function? \begin{align*} [\frac{\partial}{\partial x}, \int dx]:\qquad\qquad \frac{\partial}{\partial x}\int f(x, t)dt =_? \int \frac{\partial}{\partial x} f(x, t)dt\\ \text{} \end{align*}

Answer: Sort of (but possibly not).

Counterexample: \begin{align*} f(x) = \begin{cases} 1 & x > 0 \\ -1 & x \leq 0 \end{cases} \implies \int f \approx {\left\lvert {x} \right\rvert}, \end{align*} which is not differentiable. (This is remedied by the so-called “weak derivative”)

Sufficient Condition: If $$f$$ is continuous, then both are always equal to $$f(x)$$ by the FTC.

Is the derivative of a continuous function always continuous? \begin{align*} [\frac{\partial}{\partial x}, \lim_{x_i\to x}]:\qquad\qquad \lim_{x_i \to x} f'(x_i) =_? f'(\lim_{x_i\to x} x_i) \end{align*} Answer: No.

Counterexample: \begin{align*} f ( x ) = \left\{ \begin{array} { l l } { x ^ { 2 } \sin ( 1 / x ) } & { \text { if } x \neq 0 } \\ { 0 } & { \text { if } x = 0 } \end{array} \right. \implies f ^ { \prime } ( x ) = \left\{ \begin{array} { l l } { 2 x \sin \left( \frac { 1 } { x } \right) - \cos \left( \frac { 1 } { x } \right) } & { \text { if } x \neq 0 } \\ { 0 } & { \text { if } x = 0 } \end{array} \right. \end{align*} which is discontinuous at zero.

Sufficient Condition: There doesn’t seem to be a general one (which is perhaps why we study $$C^k$$ functions).

Is the limit of a sequence of differentiable functions differentiable and the derivative of the limit?

\begin{align*} [\frac{\partial}{\partial x}, \lim_{f_n \to f}]:\qquad\qquad \lim_{f_n \to f}\frac{\partial}{\partial x}f_n(x) =_? \frac{\partial }{\partial x}\lim_{f_n \to f} f_n(x) \end{align*} Answer: Super no – even the uniform limit of differentiable functions need not be differentiable!

Counterexample: $$f_n(x) = \frac{\sin(nx)}{\sqrt{n}} \rightrightarrows f = 0$$ but $$f_n' \not\to f' = 0$$

Sufficient Condition: $$f_n \rightrightarrows f$$ and $$f_n \in C^1$$.
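A numeric look at the counterexample: the sup norm of $$f_n$$ shrinks like $$1/\sqrt n$$, while the derivative at $$x = 0$$ is exactly $$\sqrt n$$ and blows up:

```python
from math import sin, cos, sqrt

def f(n: int, x: float) -> float:
    return sin(n * x) / sqrt(n)

def f_prime(n: int, x: float) -> float:
    return sqrt(n) * cos(n * x)

n = 10_000
assert abs(f(n, 0.5)) <= 1 / sqrt(n)   # uniform convergence to 0
assert f_prime(n, 0.0) == sqrt(n)      # derivatives diverge at x = 0
```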

Is the limit of a sequence of integrable functions integrable and the integral of the limit?

\begin{align*} [\int dx, \lim_{f_n \to f}](f):\qquad\qquad \lim_{f_n \to f}\int f_n(x) dx =_? \int \lim_{f_n \to f} f_n(x) dx \end{align*}

Counterexample: Order $${\mathbb{Q}}\cap[0,1]$$ as $$\left\{{q_i}\right\}_{i\in{\mathbb{N}}}$$, then take \begin{align*} f_n(x) = \sum_{i=1}^n \indic{q_i} \to \indic{{{\mathbb{Q}}\cap[0,1]}} \end{align*} where each $$f_n$$ integrates to zero (only finitely many discontinuities) but $$f$$ is not Riemann-integrable.

Sufficient Condition: - $$f_n \rightrightarrows f$$, or - $$f$$ integrable and $$\exists M: \forall n, {\left\lvert {f_n} \right\rvert} < M$$ ($$f_n$$ uniformly bounded)

Is the integral of a continuous function also continuous?

\begin{align*} [\int dx, \lim_{x_i \to x}]:\qquad\qquad \lim_{x_i \to x} F(x_i) =_? F(\lim_{x_i \to x} x_i) \end{align*}

Proof: $$|f(x)| < M$$ on $$I$$, so given $$c$$ pick a sequence $$x\to c$$. Then \begin{align*} {\left\lvert {f(t)} \right\rvert} < M \implies \left\vert \int_c^x f(t)dt \right\vert \leq \int_c^x M dt \implies {\left\lvert {F(x) - F(c)} \right\rvert} \leq M{\left\lvert {x - c} \right\rvert} \to 0 \end{align*}

Is the limit of a sequence of continuous functions also continuous?

\begin{align*} [\lim_{x_i \to x}, \lim_{f_n \to f}]: \qquad\qquad \lim_{f_n \to f}\lim_{x_i \to x} f(x_i) =_? \lim_{x_i \to x}\lim_{f_n \to f} f_n(x_i)\\ \text{}\\ \end{align*}

Counterexample: $$f_n(x) = x^n$$ on $$[0,1]$$, which converges pointwise to a discontinuous limit ($$0$$ for $$x < 1$$, but $$1$$ at $$x=1$$)

Sufficient Condition: $$f_n \rightrightarrows f$$

Does a sum of differentiable functions necessarily converge to a differentiable function?

\begin{align*} \left[\frac{\partial}{\partial x}, \sum_{f_n}\right]: \qquad\qquad \frac{\partial}{\partial x} \sum_{k=1}^\infty f_k =_? \sum_{k=1}^\infty \frac{\partial}{\partial x} f_k \\ \text{} \\ \text{}\\ \end{align*}

Counterexample: $$f_n(x) = \frac{\sin(nx)}{\sqrt{n}} \rightrightarrows 0 \mathrel{\vcenter{:}}= f$$, but $$f_n' = \sqrt{n}\cos(nx) \not\to 0 = f'$$ (at, say, $$x=0$$)

Sufficient Condition: When $$f_n \in C^1, \exists x_0: f_n(x_0) \to f(x_0)$$ and $$\sum {\left\lVert {f_n'} \right\rVert}_\infty < \infty$$ (continuously differentiable, converges at a point, and the derivatives absolutely converge)

## Continuity

\begin{align*} f\text{ continuous } \iff \lim_{x \to p} f(x) = f(p) \end{align*}

\begin{align*} f:(X, d_X) \to (Y, d_Y) \text{ continuous } \iff \forall \varepsilon,~ \exists \delta \mathrel{\Big|}~ d_X(x,y) < \delta \implies d_Y(f(x), f(y)) < \varepsilon \end{align*}

\begin{align*} f(x) = \sin\qty{ \frac{1}{x} } \implies 0\in D(f) \end{align*}

The Dirichlet function is nowhere continuous: \begin{align*} f(x) = \indic{{\mathbb{Q}}} \end{align*}

The following function is continuous at infinitely many points and discontinuous at infinitely many points: \begin{align*} f(x) = \begin{cases} 0 & x\in{\mathbb{R}}\setminus{\mathbb{Q}}\\ \frac{1}{q} & x = \frac{p}{q} \in {\mathbb{Q}} \end{cases} \end{align*} Then $$f$$ is discontinuous on $${\mathbb{Q}}$$ and continuous on $${\mathbb{R}}\setminus{\mathbb{Q}}$$.

$$f$$ is continuous on $${\mathbb{R}}\setminus{\mathbb{Q}}$$:

• Fix $$\varepsilon$$, let $$x_0 \in {\mathbb{R}}-{\mathbb{Q}}$$, choose $$n: \frac{1}{n} < \varepsilon$$ using Archimedean property.
• Define $$S = \left\{{x\in{\mathbb{Q}}: x\in (0,1), x=\frac{m}{n'}, n' < n}\right\}$$
• Then $${\left\lvert {S} \right\rvert} \leq 1+2+\cdots (n-1)$$, so choose $$\delta = \min_{s\in S}{\left\lvert {s-x_0} \right\rvert}$$
• Then \begin{align*} x \in N_\delta(x_0) \implies f(x) < \frac{1}{n} < \varepsilon .\end{align*}

$$f$$ is discontinuous on $${\mathbb{Q}}$$:

• Let $$x_0 = \frac{p}{q} \in {\mathbb{Q}}$$ and $$\left\{{x_n}\right\} = \left\{{x_0-\frac{1}{n\sqrt 2}}\right\}$$. Then \begin{align*} x_n \uparrow x_0\text{ but } f(x_n) = 0 \to 0 \neq \frac{1}{q} = f(x_0) \end{align*}

There are no functions that are continuous on $${\mathbb{Q}}$$ but discontinuous on $${\mathbb{R}}-{\mathbb{Q}}$$

A continuous function on a compact space attains its extrema.

## Differentiability

\begin{align*} f'(p) \mathrel{\vcenter{:}}=\frac{\partial f}{\partial x}(p) = \lim_{x\to p} \frac{f(x) - f(p)}{x-p} \end{align*}

• For multivariable functions: existence and continuity of $$\frac{\partial \mathbf{f}}{\partial x_i} \forall i \implies \mathbf{f}$$ differentiable
• Necessity of continuity: an example of a continuous function with all partial and directional derivatives that is still not differentiable: \begin{align*} f(x, y) = \begin{cases} \frac{y^3}{x^2+y^2} & (x,y) \neq (0,0) \\ 0 & \text{else} \end{cases} .\end{align*}

### Properties, strongest to weakest

\begin{align*} C^\infty \subsetneq C^k \subsetneq \text{ differentiable } \subsetneq C^0 \subset \mathcal{R}_K .\end{align*}

• Example showing $$f\in C^0 \centernot\implies f$$ is differentiable and $$f$$ not differentiable $$\centernot\implies f \not\in C^0$$.
• Take $$f(x) = {\left\lvert {x} \right\rvert}$$ at $$x=0$$.
• Example showing that $$f$$ differentiable $$\centernot\implies f \in C^1$$:
• Take \begin{align*} f(x) = \begin{cases} x^2\sin\qty{ \frac{1}{x} } & x \neq 0 \\ 0 & x =0 \end{cases} \implies f'(x) = \begin{cases} -\cos\qty{\frac{1}{x}} + 2x\sin\qty{ \frac{1}{x} } & x \neq 0 \\ 0 & x=0 \end{cases} \end{align*} but $$\lim_{x\to 0}f'(x)$$ does not exist and thus $$f'$$ is not continuous at zero.
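A quick numerical check of this counterexample (function names are ad hoc): the difference quotients at $$0$$ are $$h\sin(1/h)$$, bounded by $$|h|$$, while $$f'$$ oscillates between values near $$\pm 1$$ arbitrarily close to $$0$$.

```python
import math

def f(x):
    # f(x) = x^2 sin(1/x), extended by f(0) = 0
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def fprime(x):
    # derivative for x != 0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Difference quotients at 0 are h*sin(1/h), bounded by |h|: f'(0) = 0 exists.
hs = (1e-2, 1e-4, 1e-6)
quotients = [f(h) / h for h in hs]
print(quotients)

# But f' has no limit at 0: at x_n = 1/(n*pi), f'(x_n) is approximately -(-1)^n.
samples = [fprime(1 / (n * math.pi)) for n in range(1, 7)]
print(samples)  # alternates near +1, -1, +1, ...
```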

Proof that $$f$$ differentiable $$\implies f \in C^0$$: \begin{align*} \lim_{x\to p} \qty( f(x) - f(p) ) = \lim_{x\to p} \frac{f(x)-f(p)}{x-p}\cdot(x-p) = f'(p)\cdot 0 = 0 .\end{align*}

## Giant Table of Relations

Bold are assumed hypothesis, regular text is the strongest conclusion you can reach, strikeout denotes implications that aren’t necessarily true.

\begin{align*} f' && f && \therefore f && F \\ \hline \\ \cancel{\text{exists}} && \mathbf{continuous} && \text{K-integrable} && \text{exists} \\ \cancel{\text{continuous}} && \mathbf{differentiable} && \text{continuous} && \text{exists} \\ \cancel{\text{exists}} && \mathbf{integrable} && \cancel{\text{continuous}} && \text{differentiable} \\ \end{align*}

Explanation of items in table:

• K-integrable: compactly integrable.
• $$f$$ integrable $$\implies F$$ differentiable $$\implies F \in C^0$$
• By definition and FTC, and differentiability $$\implies$$ continuity
• $$f$$ differentiable and $$K$$ compact $$\implies f$$ integrable on $$K$$.
• In general, $$f$$ differentiable $$\centernot\implies f$$ integrable. Necessity of compactness: \begin{align*} f(x) = e^x \in C^\infty({\mathbb{R}})\text{ but }\int_{\mathbb{R}}e^x dx \to \infty .\end{align*}
• $$f$$ integrable $$\centernot\implies f$$ differentiable
• An integrable function that is not differentiable: $$f(x) = |x|$$ on $${\mathbb{R}}$$
• $$f$$ differentiable $$\implies f$$ continuous (everywhere, so in particular a.e.)

## Integrability

• Sufficient criteria for Riemann integrability:
• $$f$$ continuous
• $$f$$ bounded and continuous almost everywhere, or
• $$f$$ uniformly continuous
• $$f$$ integrable $$\iff$$ bounded and continuous a.e.

If $$F$$ is a differentiable function on the interval $$[a,b]$$, and $$F'$$ is bounded and continuous a.e., then $$F' \in L_R([a, b])$$ and \begin{align*} \forall x\in [a,b]: \int_a^x F'(t)~dt=F(x)-F(a) \end{align*}

Suppose $$f$$ bounded and continuous a.e. on $$[a,b]$$, and define \begin{align*} F(x) \mathrel{\vcenter{:}}=\int_a^x f(t)~dt \end{align*} Then $$F$$ is absolutely continuous on $$[a,b]$$, and for $$p \in [a,b]$$, \begin{align*} f \in C^0(p) \implies F \text{ differentiable at } p,~ F'(p) = f(p), \text{ and } F' \stackrel{\tiny\mbox{a.e}}{=} f. \end{align*}

The Dirichlet function is Lebesgue integrable but not Riemann integrable: \begin{align*} f(x) = \indic{x \in {\mathbb{Q}}} \end{align*}

## List of Free Conclusions:

• $$f$$ integrable on $$U \implies$$:
• $$f$$ is bounded
• $$f$$ is continuous a.e. (the set of discontinuities has measure zero)
• $$\int f$$ is continuous
• $$\int f$$ is differentiable wherever $$f$$ is continuous (hence a.e.)
• $$f$$ continuous on $$U$$:
• $$f$$ is integrable on compact subsets of $$U$$
• $$f$$ is bounded, if $$U$$ is compact
• $$f$$ is integrable, if $$U$$ is compact
• $$f$$ differentiable at a point $$p$$:
• $$f$$ is continuous at $$p$$ (but not necessarily on any neighborhood of $$p$$)
• $$f$$ differentiable on all of $$U \implies f$$ continuous on $$U$$, hence continuous a.e.
• Defining the Riemann integral: #todo

## Convergence

### Sequences and Series of Functions

Define \begin{align*} s_n(x) \mathrel{\vcenter{:}}=\sum_{k=1}^n f_k(x) \end{align*} and \begin{align*} \sum_{k=1}^\infty f_k(x) \mathrel{\vcenter{:}}=\lim_{n\to\infty} s_n(x), \end{align*} which can converge pointwise, absolutely, uniformly, or not at all.

If $$\limsup_{k\in {\mathbb{N}}} {\left\lvert {f_k(x)} \right\rvert} \neq 0$$ then $$\sum_k f_k(x)$$ does not converge (term test).

If $$f$$ is injective, then $$f'$$ is nonzero in some neighborhood of ???

### Pointwise convergence

\begin{align*} f_n \to f = \lim_{n\to\infty} f_n .\end{align*} Summary: \begin{align*} \lim_{f_n \to f} \lim_{x_i \to x} f_n(x_i) \neq \lim_{x_i \to x} \lim_{f_n \to f} f_n(x_i) .\end{align*}

\begin{align*} \lim_{f_n \to f} \int_I f_n \neq \int_I \lim_{f_n \to f} f_n .\end{align*}

Pointwise convergence is strictly weaker than uniform convergence.

$$f_n(x) = x^n$$ on $$[0, 1]$$ converges pointwise but not uniformly.

• Towards a contradiction let $$\varepsilon = \frac{1}{2}$$.
• Let $$n = N\qty{\frac{1}{2} }$$ and $$x = \left(\frac{3}{4}\right)^\frac{1}{n}$$.
• Then $$f(x) = 0$$ but \begin{align*} {\left\lvert {f_n(x) - f(x)} \right\rvert} = x^n = \frac{3}{4} > \frac{1}{2} \end{align*}
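The argument above can be seen numerically (the evaluation points are illustrative choices): at any fixed $$x < 1$$ the values $$x^n$$ die off, but the witness points $$x = (3/4)^{1/n}$$ keep the error at $$3/4$$.

```python
# Pointwise vs. uniform convergence of f_n(x) = x^n on [0,1]:
# at any fixed x < 1 the values x^n -> 0, but the distance to the
# pointwise limit never drops below 3/4, witnessed at x = (3/4)^(1/n).
for n in (1, 5, 25, 125):
    x_fixed = 0.5
    x_witness = (3 / 4) ** (1 / n)
    print(n, x_fixed ** n, x_witness ** n)  # second column -> 0, third stays 0.75
```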

\begin{align*} f_n \text{ continuous} \centernot\implies f\mathrel{\vcenter{:}}=\lim_n f_n \text{ is continuous} .\end{align*}

Take \begin{align*} f_n(x) = x^n,\quad f_n(x) \to \indic[x = 1] .\end{align*}

\begin{align*} f_n \text{ differentiable} &\centernot\implies f'_n \text{ converges} \\ f'_n \text{ converges} &\not\implies \lim f'_n = f' .\end{align*}

Take \begin{align*} f_n(x) = \frac{1}{n}\sin(n^2 x) \to 0,&& \text{but } f'_n = n\cos(n^2 x) \text{ does not converge} .\end{align*}
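A small sketch of this counterexample (function names ad hoc): $$\sup_x|f_n(x)| \leq 1/n \to 0$$, yet $$f'_n(0) = n\cos(0) = n$$ diverges.

```python
import math

def f_n(n, x):
    return math.sin(n * n * x) / n

def fprime_n(n, x):
    return n * math.cos(n * n * x)

# sup_x |f_n(x)| <= 1/n -> 0, so f_n converges uniformly to 0.
# But at x = 0, f_n'(0) = n*cos(0) = n, which diverges: f_n' does not converge.
sups = [1 / n for n in (1, 10, 100, 1000)]
derivs_at_0 = [fprime_n(n, 0.0) for n in (1, 10, 100, 1000)]
print(sups)
print(derivs_at_0)  # 1.0, 10.0, 100.0, 1000.0
```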

\begin{align*} f_n\in \mathcal{R}_I \centernot\implies\lim_{f_n \to f} \int_I f_n = \int_I \lim_{f_n \to f} f_n .\end{align*}

May fail to converge to same value, take \begin{align*} f_n(x) = \frac{2n^2x}{(1+n^2x^2)^2} \to 0 && \text{but }\int_0^1 f_n = 1 - \frac{1}{n^2 + 1} \to 1\neq 0 .\end{align*}
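This can be checked numerically; a sketch using a simple midpoint rule (the step count is an arbitrary choice, enough for this smooth integrand):

```python
def f(n, x):
    return 2 * n * n * x / (1 + n * n * x * x) ** 2

def midpoint_integral(n, a=0.0, b=1.0, steps=200_000):
    # midpoint rule; adequate here since the integrand is smooth
    h = (b - a) / steps
    return sum(f(n, a + (k + 0.5) * h) for k in range(steps)) * h

for n in (1, 2, 4, 8):
    exact = 1 - 1 / (n * n + 1)
    approx = midpoint_integral(n)
    print(n, approx, exact)  # integrals tend to 1, while f_n -> 0 pointwise
```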

### Uniform Convergence

Notation: \begin{align*} f_n \rightrightarrows f= \lim_{n\to\infty} f_n \text{ and } \sum_{n=1}^\infty f_n \rightrightarrows S .\end{align*}

Summary: \begin{align*} \lim_{x_i \to x} \lim_{f_n \to f} f_n(x_i) = \lim_{f_n \to f} \lim_{x_i \to x} f_n(x_i) = \lim_{f_n \to f} f_n(\lim_{x_i \to x} x_i) .\end{align*}

\begin{align*} \lim_{f_n \to f} \int_I f_n = \int_I \lim_{f_n \to f} f_n .\end{align*}

\begin{align*} \sum_{n=1}^\infty \int_I f_n = \int_I \sum_{n=1}^\infty f_n .\end{align*}

“The uniform limit of a(n) $$x$$ function is $$x$$”, for $$x \in$$ {continuous, bounded}

• Equivalent to convergence in the uniform metric on the metric space of bounded functions on $$X$$: \begin{align*} f_n \rightrightarrows f \iff \sup_{x\in X} {\left\lvert {f_n(x) - f(x)} \right\rvert} \to 0 .\end{align*}

• $$(B(X,Y), {\left\lVert {} \right\rVert}_\infty)$$ is a metric space and $$f_n \rightrightarrows f \iff {\left\lVert {f_n - f} \right\rVert}_\infty \to 0$$ (where $$B(X,Y)$$ are bounded functions from $$X$$ to $$Y$$ and $${\left\lVert {f} \right\rVert}_\infty = \sup_{x\in X}\left\{{f(x)}\right\}$$)
• $$f_n \rightrightarrows f \implies f_n \to f$$ pointwise

• $$f_n$$ continuous $$\implies f$$ continuous

• i.e. “the uniform limit of continuous functions is continuous”
• $$f_n \in C^1$$, $$\exists x_0: f_n(x_0) \to f(x_0)$$, and $$f'_n \rightrightarrows g$$ $$\implies f$$ differentiable and $$f' = g~$$ (i.e. $$f'_n \to f'$$)

• Necessity of $$C^1$$ – look at failures of $$f'_n$$ to be continuous:
• Take $$f_n(x) = \sqrt{\frac{1}{n^2} + x^2} \rightrightarrows |x|$$, which is not differentiable at $$x=0$$
• Take $$f_n(x) = n^{-\frac{1}{2}}\sin(nx) \rightrightarrows 0 =: f$$, but $$f'_n(x) = n^{\frac{1}{2}}\cos(nx)$$ does not converge (e.g. $$f'_n(0) = \sqrt n \to \infty$$), so $$f'_n \not\to f' = 0$$
• $$f_n$$ integrable $$\implies f$$ integrable and $$\int f_n \to \int f$$

• $$f_n$$ bounded $$\implies f$$ bounded

• $$f_n \rightrightarrows f \centernot\implies f'_n$$ converges

• Uniform convergence of $$f_n$$ says nothing about $$f'_n$$ in general
• $$f_n' \rightrightarrows f' \centernot\implies f_n \rightrightarrows f$$

• Unless $$f_n$$ converges at at least one point.

$$\left\{{x_i}\right\} \to p \implies$$ every subsequence also converges to $$p$$.

Every convergent sequence in $$X$$ is a Cauchy sequence.

The converse need not hold in general, but if $$X$$ is complete, every Cauchy sequence converges. An example of a Cauchy sequence that doesn’t converge: take $$X={\mathbb{Q}}$$ and set $$x_i = \pi$$ truncated to $$i$$ decimal places.
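The truncation example can be made concrete with exact rationals (the digit string and helper name are illustrative choices):

```python
from fractions import Fraction

PI_DIGITS = "3.14159265358979323846"

def truncate_pi(i):
    # pi truncated to i decimal places, as an exact rational (an element of Q)
    digits = PI_DIGITS.replace(".", "")[: 1 + i]
    return Fraction(int(digits), 10 ** i)

xs = [truncate_pi(i) for i in range(1, 10)]
# Consecutive gaps shrink like 10^{-i}, so the sequence is Cauchy in Q --
# but its limit pi is irrational, so no limit exists inside Q.
gaps = [abs(b - a) for a, b in zip(xs, xs[1:])]
print([float(g) for g in gaps])
```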

If any subsequence of a Cauchy sequence converges, the entire sequence converges.

A metric $$d$$ on a space $$X$$ satisfies: \begin{align*} d(x,y) &\geq 0 && \text{Positive}\\ d(x,y) &= 0 \iff x = y && \text{Nondegenerate}\\ d(x,y) &= d(y,x) && \text{Symmetric}\\ d(x,y) &\leq d(x,p) + d(p,y) \quad \forall p && \text{Triangle Inequality} .\end{align*}


## Topology

Open Set Characterization: Arbitrary unions and finite intersections of open sets are open.

Closed Set Characterization: Arbitrary intersections and finite unions of closed sets are closed.

The best source of examples and counterexamples is the open/closed unit interval in $$\mathbb{R}$$. Always test against these first!

If $$f$$ is a continuous function, the preimage of every open set is open and the preimage of every closed set is closed.

In $${\mathbb{R}}$$, singleton sets and finite discrete sets are closed.

A singleton set is closed because its complement is open: \begin{align*} \left\{{p}\right\}^c = (-\infty, p) \cup(p, \infty) .\end{align*} Similarly for a finite discrete set $$\left\{{p_0, p_1, \cdots, p_n}\right\}$$, which wlog (by relabeling) can be assumed to satisfy $$p_0 < p_1 < \cdots < p_n$$: the complement \begin{align*} \left\{{p_0, p_1, \cdots, p_n}\right\}^c = (-\infty, p_0) \cup(p_0, p_1) \cup\cdots \cup(p_n, \infty) \end{align*} is a union of open sets.

This yields a good way to produce counterexamples to continuity.

In $$\mathbb{R}$$, singletons are closed. This means any finite subset is closed, as a finite union of singleton sets!

If $$X$$ is a compact metric space, then $$X$$ is complete and totally bounded.

If $$X$$ complete and $$X \subset Y$$, then $$X$$ closed in $$Y$$.

The converse does not hold in general: a closed subset need not be complete unless the ambient space is. Counterexample: $${\mathbb{Q}}\subset {\mathbb{Q}}$$ is closed but not complete, while $${\mathbb{Q}}\subset{\mathbb{R}}$$ is not closed.

If $$X$$ is compact and Hausdorff (e.g. a metric space), then for $$Y \subset X$$: $$Y$$ is compact $$\iff$$ $$Y$$ is closed.

A topological space $$X$$ is sequentially compact iff every sequence $$\left\{{x_n}\right\}$$ has a subsequence converging to a point in $$X$$.

If $$X$$ is a metric space, $$X$$ is compact iff $$X$$ is sequentially compact.

Note that in general, neither form of compactness implies the other.

## Counterexamples

There are functions differentiable only at a single point. Example: \begin{align*} f(x) = \begin{cases} x^2 & x\in {\mathbb{Q}}\\ -x^2 & x\in {\mathbb{R}}\setminus{\mathbb{Q}} \end{cases} .\end{align*}

This is discontinuous everywhere except for $$x=0$$, and you can compute \begin{align*} \lim_{h\to 0} {f(0+h) - f(0) \over h} = \lim_{h\to 0} \begin{cases} h & h\in {\mathbb{Q}}\\ -h & h\in {\mathbb{R}}\setminus{\mathbb{Q}} \end{cases} =0 .\end{align*}

The product of two non-differentiable functions can be differentiable: take $$f(x) = g(x) = {\left\lvert {x} \right\rvert}$$ which are not differentiable at $$x=0$$, then $$fg(x) = {\left\lvert {x} \right\rvert}^2$$ is differentiable at $$x=0$$.
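A quick numerical illustration (sample points chosen for the sketch): the difference quotients of $$|x|$$ at $$0$$ alternate between $$\pm 1$$, while those of $$|x|^2 = x^2$$ shrink to $$0$$.

```python
# g(x) = |x| has no derivative at 0: its difference quotients equal sign(h).
# But (g*g)(x) = |x|^2 = x^2 is differentiable at 0 with derivative 0.
def g(x):
    return abs(x)

def product(x):
    return abs(x) * abs(x)  # = x^2

hs = [(-1) ** k * 10.0 ** (-k) for k in range(1, 6)]
print([g(h) / h for h in hs])        # alternates between -1 and +1: no limit
print([product(h) / h for h in hs])  # -> 0
```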

A continuous function that is zero on a dense set $$A\subset X$$ is identically zero.

Since $$A$$ is dense, for any $$x\in X\setminus A$$ take a sequence $$\left\{{x_n}\right\}$$ in $$A$$ converging to $$x$$. Then $$0 = f(x_n) \to f(x)$$ implies $$f(x) = 0$$.

# Point-Set Topology

## Definitions

• Epsilon-neighborhood

• $$N_r(p) = \left\{{q \mathrel{\Big|}d_X(p,q) < r}\right\}$$
• Limit Point

• $$p$$ is a limit point of $$E$$ iff $$\forall N_r(p),~ \exists q\in E,~ q\neq p \mathrel{\Big|}q \in N_r(p)$$
• Equivalently, $$\forall N_r(p),~ (N_r(p)\setminus\left\{{p}\right\}) \cap E \neq \emptyset$$
• Let $$L(E)$$ be the set of limit points of $$E$$.
• Example: $$E = (0,1) \implies 0 \in L(E)$$
• Isolated Point

• $$p\in E$$ is an isolated point of $$E$$ iff $$p$$ is not a limit point of $$E$$
• Equivalently, $$\exists N_r(p) \mathrel{\Big|}N_r(p) \cap E = \left\{{p}\right\}$$
• The set of isolated points is $$E \setminus L(E)$$
• Perfect

• $$E$$ is perfect iff $$E$$ is closed and $$E \subseteq L(E)$$
• Equivalently, $$L(E) = E$$
• Interior

• $$p$$ is an interior point of $$E$$ iff $$\exists N_r(p) \mathrel{\Big|}N_r(p) \subseteq E$$
• Denote the interior of $$E$$ by $$E^\circ$$
• Exterior

• Closed sets

• $$E$$ is closed iff $$p$$ a limit point of $$E \implies p \in E$$
• Equivalently if $$L(E) \subseteq E$$
• Closed under finite unions, arbitrary intersections
• Open sets

• $$E$$ is open iff $$p\in E \implies p \in E^\circ$$
• Equivalently, if $$E \subseteq E^\circ$$
• Closed under arbitrary unions, finite intersections
• Boundary

• Closure

• Dense

• $$E$$ is dense in $$X$$ iff $$X \subseteq E \cup L(E)$$
• Connected

• Space of connected sets closed under unions (when the sets share a common point), products, and closures
• Convex $$\implies$$ connected
• Disconnected

• Path Connected

• $$\forall x,y \in X~ \exists f: I \to X \text{ continuous} \mathrel{\Big|}f(0) = x, f(1) = y$$
• Path connected $$\implies$$ connected
• Simply Connected

• Totally Disconnected

• Hausdorff

• Compact

• Every covering has a finite subcovering.
• $$X$$ compact and $$U \subset X: (U \text{ closed } \implies U \text{ compact })$$
• $$U \text{ compact } \implies U \text{ closed }$$ when $$X$$ is Hausdorff
• Closed under products

The space $$\left\{{\frac{1}{n}}\right\}_{n\in {\mathbb{N}}} \cup\left\{{0}\right\}$$ is compact, while $$\left\{{\frac{1}{n}}\right\}_{n\in {\mathbb{N}}}$$ alone is not (it is not closed in $${\mathbb{R}}$$).

List of properties preserved by continuous maps:

• Connectedness
• Compactness

Checking if a map is homeomorphism:

• $$f$$ a continuous bijection, $$X$$ compact, and $$Y$$ Hausdorff $$\implies f: X \to Y$$ is a homeomorphism.

# Probability

## Definitions

\begin{align*} L^2(X) &= \left\{{f: X \to {\mathbb{R}}: \int_{\mathbb{R}}f(x)^2 ~dx < \infty}\right\} &&\text{square integrable functions}\\ {\left\langle {g},~{f} \right\rangle}_{2} &= \int_{\mathbb{R}}g(x)f(x) ~dx &&\text{the } L^2 \text{ inner product}\\ {\left\lVert {f} \right\rVert}_2^2 &= {\left\langle {f},~{f} \right\rangle} = \int_{\mathbb{R}}f(x)^2 ~dx &&\text{norm}\\ E[{\,\cdot\,}] &= {\left\langle {{\,\cdot\,}},~{f} \right\rangle} &&\text{expectation}\\ (\tau_{p}f)(x) &= f(p- x) &&\text{translation}\\ (f \ast g)(p) &= \int_{\mathbb{R}}f(t)g(p-t)~dt = \int_{\mathbb{R}}f(t)(\tau_{p}g)(t) ~dt = {\left\langle {\tau_pg},~{f} \right\rangle} &&\text{convolution}\\ \end{align*}

For $$(\Sigma, E, \mu)$$ a probability space with sample space $$\Sigma$$, $$\sigma$$-algebra of events $$E$$, and probability measure $$\mu$$, a random variable is a measurable function $$X: \Sigma \to {\mathbb{R}}$$

The probability density function (PDF) $$f$$ of $$X$$ is given, for any $$U \subset {\mathbb{R}}$$, by the relation \begin{align*} P(X \in U) = \int_U f(x) ~dx \\ \implies P(a \leq X \leq b) = \int_a^b f(x) ~dx \end{align*}

The cumulative distribution function (CDF) is the antiderivative of the PDF: \begin{align*} F(x) = P(X \leq x) = \int_{-\infty}^x f(t) ~dt .\end{align*} This yields $${\frac{\partial F}{\partial x}\,} = f(x)$$ wherever $$f$$ is continuous.
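The PDF/CDF relationship can be sanity-checked numerically. A sketch using the exponential distribution, where both functions have closed forms (the rate $$\lambda = 2$$ and the midpoint rule are arbitrary choices for the illustration):

```python
import math

lam = 2.0  # Exponential(lam): pdf f(x) = lam*exp(-lam*x), cdf F(x) = 1 - exp(-lam*x)

def pdf(x):
    return lam * math.exp(-lam * x)

def cdf(x):
    return 1 - math.exp(-lam * x)

def midpoint_integral(fn, a, b, steps=100_000):
    h = (b - a) / steps
    return sum(fn(a + (k + 0.5) * h) for k in range(steps)) * h

x = 1.5
print(midpoint_integral(pdf, 0.0, x), cdf(x))  # F(x) = integral of f from 0 to x

eps = 1e-6
print((cdf(x + eps) - cdf(x - eps)) / (2 * eps), pdf(x))  # F'(x) = f(x)
```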

\begin{align*} E[X] \mathrel{\vcenter{:}}={\left\langle {\text{id}},~{f} \right\rangle} = \int_{\mathbb{R}}x f(x) ~dx .\end{align*} Also denoted $$\mu_X$$.

\begin{align*} E\left[\sum_{i\in{\mathbb{N}}} a_i X_i\right] = \sum_{i\in{\mathbb{N}}} a_i E[X_i] .\end{align*} Does not matter whether or not the $$X_i$$ are independent.

\begin{align*} \mathrm{Var}(X) &= E[(X - E[X])^2] \\ &= \int (x - E[X])^2 f(x) ~dx \\ &= E[X^2] - E[X]^2 \\ &\mathrel{\vcenter{:}}=\sigma^2(X) \end{align*} where $$\sigma$$ is the standard deviation. Can also be defined as $${\left\langle {(\text{id}- {\left\langle {\text{id}},~{f} \right\rangle})^2},~{f} \right\rangle}$$ Take the portion of the id function in the orthogonal complement of $$f$$, squared, and project it back onto $$f$$?

\begin{align*} \mathrm{Var}(aX + b) &= a^2\mathrm{Var}(X) \\ \mathrm{Var}\qty{ \sum_{\mathbb{N}}X_i } &= \sum_i \mathrm{Var}(X_i) + 2 \sum_{i < j}\mathrm{Cov}(X_i, X_j) .\end{align*}

\begin{align*} \mathrm{Cov}(X,Y) &= E[(X-\mu_X)(Y-\mu_Y)] \\ &= E[XY] - E[X]E[Y] \end{align*}

\begin{align*} \mathrm{Cov}(X, X) &= \mathrm{Var}(X) \\ \mathrm{Cov}(aX, Y) &= a\mathrm{Cov}(X,Y) \\ \mathrm{Cov}(\sum_{{\mathbb{N}}} X_i, \sum_{\mathbb{N}}Y_j) &= \sum_i \sum_j\mathrm{Cov}(X_i, Y_j) \\ .\end{align*}
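These identities hold exactly, which can be checked on a small made-up joint distribution using rational arithmetic (the pmf below is an arbitrary illustrative choice):

```python
from fractions import Fraction as F

# A small joint pmf for (X, Y), with exact rationals so identities hold on the nose.
pmf = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (2, 3): F(3, 8)}
assert sum(pmf.values()) == 1

def E(g):
    # expectation of g(X, Y) under the joint pmf
    return sum(p * g(x, y) for (x, y), p in pmf.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - EX) ** 2)
cov_XY = E(lambda x, y: (x - EX) * (y - EY))

# Var(X) = E[X^2] - E[X]^2 and Cov(X,Y) = E[XY] - E[X]E[Y]:
print(var_X == E(lambda x, y: x * x) - EX ** 2)
print(cov_XY == E(lambda x, y: x * y) - EX * EY)

# Cov(aX, Y) = a Cov(X, Y):
a = F(3)
print(E(lambda x, y: (a * x - a * EX) * (y - EY)) == a * cov_XY)
```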

Stirling's approximation: \begin{align*} k! \sim k^{k+\frac{1}{2}}e^{-k} \sqrt{2\pi} .\end{align*}
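The asymptotic can be checked directly; the ratio $$k!/(\sqrt{2\pi}\,k^{k+1/2}e^{-k})$$ tends to $$1$$ from above, at rate roughly $$1 + \frac{1}{12k}$$ (a known refinement, stated here only as a check):

```python
import math

def stirling(k):
    return math.sqrt(2 * math.pi) * k ** (k + 0.5) * math.exp(-k)

ratios = [math.factorial(k) / stirling(k) for k in (5, 20, 50, 100)]
print(ratios)  # decreasing toward 1, roughly 1 + 1/(12k)
```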

Markov's inequality: for $$X \geq 0$$ and $$a > 0$$, \begin{align*} P(X \geq a) \leq \frac 1 a E[X] \end{align*}

One-sided Chebyshev (Cantelli): for $$a > 0$$, \begin{align*} P(X - \mu \geq a) \leq \frac{\sigma^2}{\sigma^2 + a^2} .\end{align*}

Chebyshev's inequality: \begin{align*} P({\left\lvert {X - \mu} \right\rvert} \geq k) \leq \left( \frac \sigma k \right)^2 \end{align*}

Apply Markov to the variable $$(X-\mu)^2$$ and $$a=k^2$$
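Chebyshev can be verified exactly on a toy distribution; a sketch for a fair die (a standard example, not from the text), using rational arithmetic:

```python
from fractions import Fraction as F

# X uniform on {1,...,6}, exact arithmetic.
vals = range(1, 7)
p = F(1, 6)
mu = sum(p * v for v in vals)               # 7/2
var = sum(p * (v - mu) ** 2 for v in vals)  # 35/12

checks = []
for k in (F(1), F(3, 2), F(2)):
    tail = sum(p for v in vals if abs(v - mu) >= k)  # P(|X - mu| >= k)
    bound = var / k ** 2                             # (sigma/k)^2
    checks.append(tail <= bound)
    print(float(k), float(tail), float(bound))
print(checks)
```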

For $$X_i$$ i.i.d., \begin{align*} \lim_n \frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt n} \sim N(0, 1) .\end{align*}

Strong law of large numbers: \begin{align*} P(\frac{1}{n} \sum X_i \rightarrow \mu) = 1 .\end{align*}

Chernoff bound: for all $$t > 0$$, \begin{align*} P(X \geq a) \leq e^{-ta}M_X(t) \\ .\end{align*}

Jensen's inequality: for $$f$$ convex, \begin{align*} E[f(X)] \geq f(E[X]) \end{align*}

\begin{align*} H(X) = - \sum p_i \ln p_i \end{align*}