Note:
These are notes I took while studying for the Mathematics Subject GRE. There are likely a lot of errors and mistakes, please let me know if you find any!
Last updated: 2020-11-10
\[\begin{align*} \frac{\partial}{\partial x} \int_a^x f(t) dt = f(x) \\ \\ \end{align*}\]
\[\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(x, t) dt - \int_{a(x)}^{b(x)} \frac{\partial}{\partial x} f(x, t) dt &= f(x, t) \cdot \frac{\partial}{\partial x}(t) \bigg\rvert_{t=a(x)}^{t=b(x)} \\ \\ &= f(x, b(x))\cdot b'(x) - f(x, a(x))\cdot a'(x) \\ \\ \end{align*}\]
If \(f(x,t) = f(t)\) doesn’t depend on \(x\), then \({\frac{\partial f}{\partial x}\,} = 0\) and the second integral vanishes:
\[\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(t) dt &= f(b(x))\cdot b'(x) - f(a(x))\cdot a'(x) \end{align*}\]
Note that you can recover the original FTC by taking \[\begin{align*} a(x) &= c \\ b(x) &= x \\ f(x,t) &= f(t) .\end{align*}\]
\[\begin{align*} \frac{\partial}{\partial x} \int_{1}^{x} f(x, t) dt = \int_{1}^{x} \frac{\partial}{\partial x} f(x, t) dt + f(x, x) \end{align*}\]
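A minimal sympy sanity check of this rule (the example \(f(x,t) = \sin(xt)\), \(a(x) = x^2\), \(b(x) = x^3\) is made up):

```python
# Sanity check of the Leibniz rule on a made-up example:
# f(x, t) = sin(x*t), a(x) = x**2, b(x) = x**3.
import sympy as sp

x, t = sp.symbols('x t')
f = sp.sin(x * t)
a, b = x**2, x**3

lhs = sp.diff(sp.integrate(f, (t, a, b)), x)           # d/dx of the integral
rhs = (sp.integrate(sp.diff(f, x), (t, a, b))          # integral of df/dx
       + f.subs(t, b) * sp.diff(b, x)                  # + f(x, b(x)) b'(x)
       - f.subs(t, a) * sp.diff(a, x))                 # - f(x, a(x)) a'(x)

print(sp.simplify(lhs - rhs))   # expect 0
```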
Todo
\[\begin{align*} f \in C^0([a,b]) \text{ and differentiable on } (a,b) &\implies \exists p\in (a,b): f(b) - f(a) = f'(p)(b-a) \\ f \in C^0([a,b]) &\implies \exists p\in [a,b]: \int_a^b f(x)~dx = f(p)(b-a) .\end{align*}\]
If
\[\begin{align*} \lim_{x\to {\{\text{pt}\}}} f(x) = \lim_{x\to {\{\text{pt}\}}} g(x) \in \left\{{0, \pm \infty}\right\}, && \forall x \in I, g'(x) \neq 0, && \lim_{x\to{\{\text{pt}\}}} \frac{ f'(x)}{\ g'(x)} \text{ exists}, \\ \end{align*}\]
Then it is necessarily the case that \[\begin{align*} \lim _ { x \rightarrow {\{\text{pt}\}}} \frac { f ( x ) } { g ( x ) } = \lim _ { x \rightarrow {\{\text{pt}\}}} \frac { f ^ { \prime } ( x ) } { g ^ { \prime } ( x ) }. \end{align*}\]
Note that this includes the following indeterminate forms: \[\begin{align*} \frac{0}{0}, \quad \frac{\infty}{\infty}, \quad 0 \cdot \infty, \quad 0^{0}, \quad \infty^{0}, \quad 1^{\infty}, \quad \infty-\infty .\end{align*}\]
For \(0\cdot \infty\), can rewrite as \({0 \over {1\over \infty}} = {0\over 0},\) or alternatively \({\infty \over {1\over 0}} = {\infty \over \infty}.\)
For \(1^\infty, \infty^0,\) and \(0^0\), set \[\begin{align*} L \mathrel{\vcenter{:}}=\lim f^g \implies \ln L = \lim g \ln(f) \end{align*}\] to recover a \(0\cdot\infty\) form (e.g. \(\infty\cdot 0\), \(0\cdot\infty\), or \(0\cdot(-\infty)\)), which can then be rewritten as above.
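For example, a standard \(0^0\) form: \[\begin{align*} L \mathrel{\vcenter{:}}=\lim_{x\to 0^+} x^x \implies \ln L = \lim_{x\to 0^+} x\ln x = \lim_{x\to 0^+} \frac{\ln x}{1/x} = \lim_{x\to 0^+} \frac{1/x}{-1/x^2} = \lim_{x\to 0^+} (-x) = 0 \implies L = 1 .\end{align*}\]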
\[\begin{align*} T(a, x) &= \sum _ { n = 0 } ^ { \infty } \frac { f ^ { ( n ) } ( a ) } { n ! } ( x - a ) ^ { n } \\ &= f ( a ) + f'(a)( x - a ) + \frac { 1 } { 2 }f ^ { \prime \prime } ( a ) ( x - a ) ^ { 2 } \\ & \quad \quad + \frac { 1} { 6 } f ^ { \prime \prime \prime } ( a ) ( x - a ) ^ { 3 } + \frac{1}{24}f^{(4)}(a)(x-a)^4 + ~\cdots \end{align*}\] There is a bound on the error (Lagrange form of the remainder): \[\begin{align*} {\left\lvert {f(x) - T_k(a,x)} \right\rvert} \leq \frac{M}{(k+1)!} {\left\lvert {x-a} \right\rvert}^{k+1}, \qquad M \mathrel{\vcenter{:}}=\sup_{\xi \text{ between } a \text{ and } x} {\left\lvert {f^{(k+1)}(\xi)} \right\rvert}, \end{align*}\] where \(T_k(a, x) = \sum _ { n = 0 } ^ { k } \frac { f ^ { ( n ) } ( a ) } { n ! } ( x - a ) ^ { n }\) is the \(k\)th truncation.
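For example, approximating \(e^{0.1}\) by the degree-two truncation at \(a = 0\): \[\begin{align*} {\left\lvert {e^{0.1} - \qty{1 + 0.1 + \tfrac{(0.1)^2}{2}}} \right\rvert} \leq \frac{e^{0.1}}{3!}(0.1)^3 < 2\times 10^{-4} .\end{align*}\]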
Approximating change: \(\Delta y \approx f'(x) \Delta x\)
How to find \(\lim_{x\to a} f(x)\), in order of difficulty:
Plug in: if \(f\) is continuous, \(\lim_{x\to a} f(x) = f(a)\).
Check for indeterminate forms and apply L’Hopital’s Rule.
Algebraic rules
Squeeze theorem
Expand in Taylor series at \(a\)
Monotonic + bounded
One-sided limits: \(\lim_{x\to a^-} f(x) = \lim_{\varepsilon \to 0} f(a-\varepsilon)\)
Limits at zero or infinity: \[\begin{align*} \lim_{x\to\infty} f(x) = \lim_{\frac{1}{x}\to 0} f\qty{\frac{1}{x}} \text{ and } \lim_{x\to 0} f(x) = \lim_{x\to\infty} f\qty{1 \over x} \end{align*}\]
Be careful: limits may not exist!! Example: \(\lim_{x\to 0} \frac{1}{x}\) does not exist (the one-sided limits are \(\pm\infty\)).
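A quick sympy sanity check of a few limits using the strategies above:

```python
# Check a few limits symbolically (illustrative examples only).
import sympy as sp

x = sp.symbols('x')
print(sp.limit(sp.sin(x) / x, x, 0))        # 1   (Taylor expansion / L'Hopital)
print(sp.limit((1 + 1/x)**x, x, sp.oo))     # E   (a 1**oo indeterminate form)
print(sp.limit(1/x, x, 0, dir='+'))         # oo  (one-sided; the two-sided limit DNE)
```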
\[\begin{align*} f(x) = \frac{p(x)}{q(x)} = r(x) + \frac{s(x)}{t(x)} \sim r(x) \end{align*}\]
\[\begin{align*} {\frac{\partial }{\partial x}\,}(f\circ g) = (f' \circ g) \cdot g' \end{align*}\]
\[\begin{align*} {\frac{\partial }{\partial x}\,} f\cdot g =f'\cdot g + g' \cdot f \end{align*}\]
\[\begin{align*} {\frac{\partial }{\partial x}\,} \frac{f(x)}{g(x)} = \frac{f'g - g'f}{g^2} \end{align*}\]
Mnemonic: Low d-high minus high d-low
\[\begin{align*} {\frac{\partial f^{-1}}{\partial x}\,}(f(x_0)) = \left( {\frac{\partial f}{\partial x}\,} \right)^{-1}(x_0) = 1/f'(x_0) \end{align*}\]
\[\begin{align*} d(f(x)) = f'(x)~dx, \qquad d(f(y)) = f'(y)~dy \end{align*}\] - Often able to solve for \({\frac{\partial y}{\partial x}\,}\) this way by differentiating both sides of an implicit relation.
General series of steps: want to know some unknown rate \(y_t\)
Example: ladder sliding down wall
\[\begin{align*} \mu_f = \frac{1}{b-a}\int_a^b f(t) dt \end{align*}\]
Apply MVT to \(F(x)\).
Area in polar coordinates: \[\begin{align*} A = \int_{\theta_1}^{\theta_2} \frac{1}{2}r^2(\theta) ~d\theta \end{align*}\]
\[\begin{align*} \text{Disks} && V = \int \pi r(t)^2 ~dt \\ \text{Shells (cylinders)} && V = \int 2\pi r(t)h(t) ~dt .\end{align*}\]
\[\begin{align*} L &= \int ~ds && ds = \sqrt{dx^2 + dy^2} \\ &= \int_{x_0}^{x_1}\sqrt{1 + \qty{{\frac{\partial y}{\partial x}\,}}^2}~dx \\ &= \int_{y_0}^{y_1}\sqrt{\qty{{\frac{\partial x}{\partial y}\,}}^2 + 1}~dy \end{align*}\]
\[\begin{align*} SA = \int 2 \pi r(x) ~ds \end{align*}\]
Given a density \(\rho(\mathbf x)\) of an object \(R\), the \(i\)th coordinate \(\bar x_i\) of the center of mass is given by \[\begin{align*} \bar x_i = \frac {\displaystyle\int_R x_i\rho(x) ~dx} {\displaystyle\int_R \rho(x)~dx} \end{align*}\]
Given \(f(x)\), we want to find an antiderivative \(F(x) = \int f\) satisfying \(\frac{\partial}{\partial x}F(x) = f(x)\)
The standard form: \[\begin{align*} \int u dv = uv - \int v du \end{align*}\]
| Derivatives | Integrals | Signs | Result |
|---|---|---|---|
| \(u\) | \(v\) | NA | NA |
| \(u'\) | \(\int v\) | \(+\) | \(u\int v\) |
| \(u''\) | \(\int\int v\) | \(-\) | \(-u'\int\int v\) |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
Fill out until one column is zero (alternate signs). Get the result column by multiplying diagonally, then sum down the column.
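For example, computing \(\int x^2 e^x ~dx\) with \(u = x^2\) in the derivative column and \(v = e^x\) in the integral column: \[\begin{align*} \int x^2 e^x ~dx = x^2 e^x - 2x e^x + 2e^x + C .\end{align*}\]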
\[\begin{align*} \frac{\partial}{\partial x} \int_{a(x)}^{b(x)} f(x, t) dt - \int_{a(x)}^{b(x)} \frac{\partial}{\partial x} f(x, t) dt &= f(x, \cdot)\frac{\partial}{\partial x}(\cdot) \bigg\rvert_{a(x)}^{b(x)} \\ &= f(x, b(x))~b'(x) - f(x, a(x))~a'(x) \end{align*}\]
Let \(F(x)\) be an antiderivative and compute \(F'(x)\) using the chain rule.
LIPET: Log, Inverse trig, Polynomial, Exponential, Trig: generally let \(u\) be whichever one comes first.
The ridiculous trig sub: for any integrand containing only trig terms
Transforms any such integrand into a rational function of the new variable \(t\)
Let \(t = \tan\frac{x}{2}\), so that \(x = 2\tan^{-1}t\) and \(dx = \frac{2}{1+t^2}~dt\); then \(\sin x = \frac{2t}{1+t^2}\), \(\cos x = \frac{1-t^2}{1+t^2}\), and \[\begin{align*} \int_a^b f(x)~dx = \int_{\tan\frac{a}{2}}^{\tan\frac{b}{2}} f\qty{x(t)} \frac{2}{1+t^2}~dt \end{align*}\]
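For example, with \(t = \tan\frac{\theta}{2}\): \[\begin{align*} \int_0^{\pi/2} \frac{d\theta}{1+\cos\theta} = \int_0^{1} \frac{1+t^2}{2}\cdot \frac{2}{1+t^2}~dt = \int_0^1 1~dt = 1 .\end{align*}\]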
\[\begin{align*} \int_0^{\pi/2} \sin \theta~d\theta = \int_0^{\pi/2} \cos \theta~d\theta = 1 \end{align*}\]
Trig Formulas \[\begin{align*} \sin^2(x) &= \frac{1}{2}(1-\cos 2x) \\ \cos^2(x) &= \frac{1}{2}(1+\cos 2x) \\ \sin(2x) &= 2\sin x \cos x \\ \cos(2x) &= \cos^2 x - \sin^2 x \\ \sin^2 x + \cos^2 x &= 1 \\ \tan^2 x + 1 &= \sec^2 x \\ \end{align*}\]
Products of trig functions
Other small but useful facts: \[\begin{align*} \int_0^{2\pi} \sin \theta~d\theta = \int_0^{2\pi} \cos \theta~d\theta = 0 .\end{align*}\]
Critical points: boundary points, points where \(f'(x) = 0\), and points where \(f'\) is undefined
Second derivative test: at a point \(p\) with \(f'(p) = 0\), \(f''(p) > 0 \implies\) local min, \(f''(p) < 0 \implies\) local max, and \(f''(p) = 0\) is inconclusive.
Inflection points of \(h\) occur where \(h''\) changes sign, i.e. where \(h'\) changes from increasing to decreasing or vice versa. (Note that this is not where \(h'\) itself changes sign.)
Inverse function theorem: The slope of the inverse is reciprocal of the original slope
If two curves are tangent to each other at a point, then both their values and their derivatives agree there. Solve these two equations simultaneously for the \(x\) of tangency; it can then be used in the original equation.
Fundamental theorem of Calculus: if \(F' = f\), then \[\begin{align*} \int_a^b f(x)~ dx = F(b) - F(a) .\end{align*}\]
Min/maxing - either derivatives or Lagrange multipliers!
Distance from origin to plane: equation of a plane \[\begin{align*} P: ax+by+cz=d .\end{align*}\]
You can always just read off the normal vector \(\mathbf{n} = (a,b,c)\). So we have \({\left\langle {\mathbf{n}},~{\mathbf{x}} \right\rangle} = d\).
Since \(\lambda \mathbf{n}\) is normal to \(P\) for all \(\lambda\), solve \({\left\langle {\mathbf{n}},~{\lambda\mathbf{n}} \right\rangle} = d\), which gives \(\lambda = \frac{d}{ {\left\lVert {\mathbf{n}} \right\rVert}^2}\); the distance is then \({\left\lVert {\lambda\mathbf{n}} \right\rVert} = \frac{{\left\lvert {d} \right\rvert}}{{\left\lVert {\mathbf{n}} \right\rVert}}\).
A plane can be constructed from a point \(\mathbf{p}\) and a normal \(\mathbf{n}\) by the equation \({\left\langle {\mathbf{n}},~{\mathbf{x} - \mathbf{p}} \right\rangle} = 0\).
In a sine wave \(f(x) = \sin(\omega x)\), the period is given by \(2\pi/\omega\). If \(\omega > 1\), then the wave makes exactly \(\omega\) full oscillations in the interval \([0, 2\pi]\).
The directional derivative is the gradient dotted against a unit vector in the direction of interest
Related rates problems can often be solved via implicit differentiation of some constraint function
The second derivative of a parametric curve is not just the ratio of the second derivatives: if \(x = x(t)\) and \(y = y(t)\), then \[\begin{align*} \frac{d^2y}{dx^2} = \frac{\frac{d}{dt}\qty{\frac{y'(t)}{x'(t)}}}{x'(t)} \neq \frac{y''(t)}{x''(t)} .\end{align*}\]
For the love of god, remember the FTC! \[\begin{align*} \frac{\partial}{\partial x} \int_0^x f(y) dy = f(x) \end{align*}\]
Technique for asymptotic inequalities: WTS \(f < g\), so show \(f(x_0) < g(x_0)\) at a point and then show \(\forall x > x_0, f'(x) < g'(x)\). Good for big-O style problems too.
Notation: \[\begin{align*} \mathbf{v}, \mathbf{a}, \cdots && \text{vectors in }{\mathbb{R}}^n \\ \mathbf{R}, \mathbf{A}, \cdots && \text{matrices} \\ \mathbf{r}(t) && \text{A parameterized curve }\mathbf{r}: {\mathbb{R}}\to {\mathbb{R}}^n \\ \\ \widehat{\mathbf{v}} && {\mathbf{v} \over {\left\lVert {\mathbf{v}} \right\rVert}} .\end{align*}\]
\[\begin{align*} \mathbf{v} = [x, y] \in {\mathbb{R}}^2 \implies m = \frac{y}{x} .\end{align*}\]
\[\begin{align*} \mathbf{R}_\theta = \left[ \begin{array} { l l } { \cos \theta } & { - \sin \theta } \\ { \sin \theta } & { \cos \theta } \end{array} \right] \implies \mathbf{R}_{\frac{\pi}{2}} = \left[ \begin{array} { l l } { 0 } & { - 1 } \\ { 1 } & { 0 } \end{array}\right] .\end{align*}\]
\[\begin{align*} \mathbf{R}_{\frac{\pi}{2}} \mathbf{x} \mathrel{\vcenter{:}}= \mathbf{R}_{\frac{\pi}{2}} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -y \\ x \end{bmatrix} \in \mathbf{\mathbb{R}}\mathbf{x}^\perp .\end{align*}\] Thus if a planar line is defined by the span of \({\left[ {x, y} \right]}\) and a slope of \(m = y/x\), a normal vector is given by the span of \({\left[ {-y, x} \right]}\) of slope \(-{1 \over m} = -x/y\).
Given \(\mathbf{v}\), the rotated vector \(\mathbf{R}_{\frac{\pi}{2}} \mathbf{v}\) is orthogonal to \(\mathbf{v}\), so this can be used to obtain normals and other orthogonal vectors in the plane.
There is a direct way to come up with one orthogonal vector to any given vector: \[\begin{align*} \mathbf{v} = [a,b,c] \implies \mathbf{y} \mathrel{\vcenter{:}}= \begin{cases} [-(b+c), a, a] & c = 0 \text{ and } a + b = 0, \\ [c,c, -(a+b)] & \text{else} \end{cases} \in {\mathbb{R}}\mathbf{v}^\perp .\end{align*}\]
For a subspace given by a single vector \(\mathbf{a}\): \[\begin{align*} \mathrm{proj}_\mathbf{a}( \textcolor{Aquamarine}{\textbf{x}} ) = {\left\langle {\textcolor{Aquamarine}{\textbf{x}} },~{\mathbf{a}} \right\rangle}\mathbf{\widehat{a}} \hspace{8em} \mathrm{proj}_{\mathbf{a}}^\perp(\textcolor{Aquamarine}{\textbf{x}}) = \textcolor{Aquamarine}{\textbf{x}} - \mathrm{proj}_\mathbf{a}(\textcolor{Aquamarine}{\textbf{x}}) = \textcolor{Aquamarine}{\textbf{x}} - {\left\langle {\textcolor{Aquamarine}{\textbf{x}}},~{\mathbf{a}} \right\rangle}\widehat{\mathbf{a}} \end{align*}\]
In general, for a subspace \(\operatorname{colspace}(A) = \operatorname{span}\left\{{\mathbf{a}_1, \cdots, \mathbf{a}_n}\right\}\),
\[\begin{align*} \mathrm{proj}_A(\textcolor{Aquamarine}{\textbf{x}}) = A(A^T A)^{-1}A^T\textcolor{Aquamarine}{\textbf{x}}, \end{align*}\] which reduces to \(\sum_{i=1}^n {\left\langle {\textcolor{Aquamarine}{\textbf{x}} },~{\mathbf{a}_i} \right\rangle}\mathbf{\widehat{a}_i}\) when the columns \(\mathbf{a}_i\) are orthonormal.
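A small numpy sketch of the projection formula above (the matrix and vector are made-up examples):

```python
# Project a vector onto the column space of A using A(A^T A)^{-1} A^T.
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])        # columns span a plane in R^3
x = np.array([1.0, 2.0, 3.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T     # projection matrix onto colspace(A)
proj = P @ x
perp = x - proj

print(proj)
print(A.T @ perp)    # ~[0, 0]: the residual is orthogonal to colspace(A)
```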
\[\begin{align*} \text{General Equation} && Ax + By + C = 0 \\ \\ \text{Parametric Equation} && \mathbf{r}(t) = t\mathbf{x} + \mathbf{b} .\end{align*}\]
Characterized by an equation in inner products: for a point \(\mathbf{p}_0\) on \(L\) and a normal \(\mathbf{n}\), \[\begin{align*} \mathbf{y} \in L \iff {\left\langle {\mathbf{y} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle} = 0 \end{align*}\]
Given \(\mathbf{p}_0, \mathbf{p}_1\), take \(\mathbf{x} = \mathbf{p}_1 - \mathbf{p}_0\) and \(\mathbf{b} = \mathbf{p}_i\) for either \(i\): \[\begin{align*} \mathbf{r}(t) = t(\mathbf{p}_1 - \mathbf{p}_0) + \mathbf{p}_0 && = t\mathbf{p}_1 + (1-t) \mathbf{p}_0 .\end{align*}\]
If a line \(L\) is given by \[\begin{align*} \mathbf{r}(t) = t {\left[ {x_1, x_2, x_3} \right]} + {\left[ {p_1, p_2, p_3} \right]} ,\end{align*}\] then \[\begin{align*} (x, y, z) \in L \iff \frac{x-p_1}{x_1} = \frac{y-p_{2}}{x_2} = \frac{z-p_{3}}{x_3} .\end{align*}\]
The symmetric equation of the line through \({\left[ {2,1,-3} \right]}\) and \({\left[ {1,4,-3} \right]}\): the direction is \({\left[ {1,4,-3} \right]} - {\left[ {2,1,-3} \right]} = {\left[ {-1,3,0} \right]}\), so \[\begin{align*} \frac{x-2}{-1}=\frac{y-1}{3}, \qquad z = -3 .\end{align*}\]
Key idea: just need a point and a normal vector, and the gradient is normal to level sets.
For any locus \(f(\mathbf{x}) = 0\), we have \[\begin{align*} \mathbf{x} \in T_f(\mathbf{p}) \implies {\left\langle {\nabla f(\mathbf{p})},~{\mathbf{x}-\mathbf{p}} \right\rangle} = 0 .\end{align*}\]
Key idea: the gradient is normal.
To find a normal line, you just need a single point \(\mathbf{p}\) and a normal vector \(\mathbf{n}\); then \[\begin{align*} L = \left\{{\mathbf{x} \mathrel{\Big|}\mathbf{x} = \mathbf{p} + t\mathbf{n}}\right\} .\end{align*}\]
\[\begin{align*} \text{General Equation} && A x + B y + C z + D = 0 \\ \\ \text{Parametric Equation} &&\mathbf{y}(t,s) = t\mathbf{x}_1 + s\mathbf{x}_2 + \mathbf{b} \\ \\ .\end{align*}\]
Characterized by an equation in inner products: \[\begin{align*} \mathbf{y} \in P \iff {\left\langle {\mathbf{y} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle} = 0 \end{align*}\]
Determined by a point \(\mathbf{p}_0\) and a normal vector \(\mathbf{n}\)
Given \(\mathbf{v}_0, \mathbf{v}_1\), set \(\mathbf{n} = \mathbf{v}_0 \times\mathbf{v}_1\).
\[\begin{align*} \mathbf{r}(t) = [x(t), y(t), z(t)] .\end{align*}\]
We have an equation for the tangent vector at each point: \[\begin{align*} \widehat{ \mathbf{T} }(t) = \widehat{\mathbf{r}'}(t) ,\end{align*}\] so we can write \[\begin{align*} \mathbf{L}_{T}(t) = \mathbf{r}(t_0) + t \widehat{ \mathbf{T}}(t_0) \mathrel{\vcenter{:}}=\mathbf{r}(t_0) + t \widehat{\mathbf{r}'}(t_0) .\end{align*}\]
Suppose \(y = f(x)\). Set \(g(x, y) = f(x) - y\); then \[\begin{align*} \nabla g = [f_x(x), -1] \end{align*}\] is normal to the graph, so the slope of the normal line is \(m = -\frac{1}{f_x(x)}\).
Fix a point \(\mathbf{p}\). Key idea: find a subspace and project onto it.
Key equations: projection and orthogonal projection of \(\mathbf{b}\) onto \(\mathbf{a}\): \[\begin{align*} \mathrm{proj}_\mathbf{a}(\mathbf{b}) = {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle}\mathbf{\widehat{a}} \hspace{8em} \mathrm{proj}_{\mathbf{a}}^\perp(\mathbf{b}) = \mathbf{b} - \mathrm{proj}_\mathbf{a}(\mathbf{b}) = \mathbf{b} - {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle}\widehat{\mathbf{a}} \end{align*}\]
\[\begin{align*} d = {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{q} - \mathbf{p})} \right\rVert} = {\left\lVert {{\left\langle {\mathbf{q} - \mathbf{p}},~{\widehat{\mathbf{n}}} \right\rangle} \widehat{\mathbf{n}}} \right\rVert} = {\left\lvert {{\left\langle {\mathbf{q} - \mathbf{p}},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert} .\end{align*}\]
Special case: if \(\mathbf{p} = \mathbf{0}\), \[\begin{align*} d = {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{q})} \right\rVert} = {\left\lVert {{\left\langle {\mathbf{q}},~{\widehat{\mathbf{n}}} \right\rangle} \widehat{\mathbf{n}}} \right\rVert} = {\left\lvert {{\left\langle {\mathbf{q}},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert} .\end{align*}\]
Given a line \(L: \mathbf{x}(t) = t\mathbf{v}\) for some fixed \(\mathbf{v}\), use \[\begin{align*} d = {\left\lVert {\mathrm{proj}_\mathbf{v}^\perp(\mathbf{p})} \right\rVert} = {\left\lVert {\mathbf{p} - {\left\langle {\mathbf{p}},~{\widehat{\mathbf{v}}} \right\rangle}\widehat{\mathbf{v} }} \right\rVert} .\end{align*}\]
Given a line \(L: \mathbf{x}(t) = \mathbf{w}_0 + t\mathbf{w}\), let \(\mathbf{v} = \mathbf{x}(1) - \mathbf{x}(0)\) and proceed as above.
Given \(\mathbf{r}_1(t) = \mathbf{p}_1 + t\mathbf{v}_1\) and \(\mathbf{r}_2(t) = \mathbf{p}_2 + t\mathbf{v}_2\), let \(d\) be the desired distance.
Let \(\widehat{ \mathbf{n}} = \widehat{\mathbf{v}_1 \times \mathbf{v}_2}\), which is orthogonal to both lines.
Then project the vector connecting the two fixed points \(\mathbf{p}_i\) onto this subspace and take its norm: \[\begin{align*} d &= {\left\lVert {\mathrm{proj}_{\mathbf{n}}(\mathbf{p}_2 - \mathbf{p}_1)} \right\rVert} \\ &= {\left\lVert {{\left\langle {\mathbf{p}_2 -\mathbf{p}_1},~{\widehat{\mathbf{n}}} \right\rangle}\widehat{\mathbf{n}}} \right\rVert} \\ &= {\left\lvert {{\left\langle {\mathbf{p}_2 - \mathbf{p}_1},~{\widehat{\mathbf{n}}} \right\rangle}} \right\rvert}, \qquad \widehat{\mathbf{n}} \mathrel{\vcenter{:}}=\frac{\mathbf{v}_1 \times\mathbf{v}_2}{{\left\lVert {\mathbf{v}_1 \times\mathbf{v}_2} \right\rVert}} .\end{align*}\]
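A numeric sketch of this recipe on two made-up skew lines:

```python
# Distance between two skew lines via the cross-product normal.
import numpy as np

p1, v1 = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])   # r1(t) = p1 + t v1
p2, v2 = np.array([0.0, 1.0, 1.0]), np.array([0.0, 1.0, 0.0])   # r2(t) = p2 + t v2

n = np.cross(v1, v2)
n_hat = n / np.linalg.norm(n)
d = abs(np.dot(p2 - p1, n_hat))

print(d)   # 1.0 for this example
```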
\[\begin{align*} S = \left\{{(x,y,z) \mathrel{\Big|}f(x,y, z) = 0}\right\} \hspace{10em} z = f(x,y) \end{align*}\]
Given a function \(f: {\mathbb{R}}^n \to {\mathbb{R}}\), let \(S_k \mathrel{\vcenter{:}}=\left\{{\mathbf{p}\in {\mathbb{R}}^n ~{\text{s.t.}}~f(\mathbf{p}) = k}\right\}\) denote the level set for \(k\in {\mathbb{R}}\). Then \[\begin{align*} \nabla f(\mathbf{p}) \in S_k^\perp .\end{align*}\]
\[\begin{align*} \mathbf{v} &= [v_1, v_2, \cdots] && \text{a vector} \\ \\ \mathbf{e}_i &= [0, 0, \cdots, \overbrace{1}^{i \text{th term}}, \cdots, 0] && \text{the } i \text{th standard basis vector} \\ \\ \phi: {\mathbb{R}}^n &\to {\mathbb{R}} && \text{a functional on } {\mathbb{R}}^n\\ \phi(x_1, x_2, \cdots) &= \cdots && \\ \\ \mathbf{F}: {\mathbb{R}}^n &\to {\mathbb{R}}^n && \text{a multivariable function}\\ \mathbf{F}(x_1,x_2,\cdots) &= [\mathbf{F}_1(x_1, x_2, \cdots), \mathbf{F}_2(x_1, x_2, \cdots), \cdots, \mathbf{F}_n(x_1, x_2, \cdots)] \end{align*}\]
For a functional \(f:{\mathbb{R}}^n\to {\mathbb{R}}\), the partial derivative of \(f\) with respect to \(x_i\) is \[\begin{align*} {\frac{\partial f}{\partial x_i}\,}(\mathbf p) \mathrel{\vcenter{:}}=\lim_{h\to 0}\frac{f(\mathbf p + h\mathbf e_i) - f(\mathbf p)}{h} \end{align*}\]
\[\begin{align*} f: {\mathbb{R}}^2 &\to {\mathbb{R}}\\ {\frac{\partial f}{\partial x}\,}(x_0,y_0) &= \lim_{h \to 0} \frac{f(x_0+h, y_0) - f(x_0,y_0)}{h} \end{align*}\]
A function \(f: {\mathbb{R}}^n \to {\mathbb{R}}^m\) is differentiable iff there exists a linear transformation \(D_f: {\mathbb{R}}^n \to {\mathbb{R}}^m\) such that \[\begin{align*} \lim _ { \mathbf x \rightarrow \mathbf{p} } \frac { \left\| f (\mathbf x ) - f (\mathbf{p} ) - D_f (\mathbf x - \mathbf{p} ) \right\| } { \| \mathbf x - \mathbf{p} \| } = 0 .\end{align*}\]
\(D_f\) is the “best linear approximation” to \(f\).
When \(f\) is differentiable, \(D_f\) can be given in coordinates by
\[\begin{align*} (D_f)_{ij} = {\frac{\partial f_i}{\partial x_j}\,} \end{align*}\]
This yields the Jacobian of \(f\): \[\begin{align*} D_f(\mathbf{p}) = \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \nabla f_1(\mathbf{p}) & \nabla f_2(\mathbf{p}) & \cdots & \nabla f_m(\mathbf{p}) \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex} \end{bmatrix}^T = \left[ \begin{array} { c c c c } { \frac { \partial f _ { 1 } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { 1 } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \ldots } & { \frac { \partial f _ { 1 } } { \partial x _ { n } } ( \mathbf{p} ) } \\ { \frac { \partial f _ { 2 } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { 2 } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \dots } & { \frac { \partial f _ { 2 } } { \partial x _ { n } } ( \mathbf{p} ) } \\ { \vdots } & { \vdots } & { \ddots } & { \vdots } \\ { \frac { \partial f _ { m } } { \partial x _ { 1 } } ( \mathbf{p} ) } & { \frac { \partial f _ { m } } { \partial x _ { 2 } } ( \mathbf{p} ) } & { \cdots } & { \frac { \partial f _ { m } } { \partial x _ { n } } ( \mathbf{p} ) } \end{array} \right]. \end{align*}\]
Equivalently, the \(i\)th row of \(D_f(\mathbf{p})\) is \(\nabla f_i(\mathbf{p})^T\).
For a function \(f: {\mathbb{R}}^n \to {\mathbb{R}}\), the Hessian is a generalization of the second derivative, and is given in coordinates by \[\begin{align*} (H_f)_{ij} = {\frac{\partial ^2f}{\partial x_i \partial x_j}\,} \end{align*}\]
Explicitly, we have \[\begin{align*} H_f(\mathbf{p}) = \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \nabla f_{x_1}(\mathbf{p}) & \nabla f_{x_2}(\mathbf{p}) & \cdots & \nabla f_{x_n}(\mathbf{p}) \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex} \end{bmatrix}^T = \left[ \begin{array} { c c c } { \frac { \partial ^ { 2 } f } { \partial x _ { 1 } \partial x _ { 1 } } ( \mathbf{p} ) } & { \dots } & { \frac { \partial ^ { 2 } f } { \partial x _ { 1 } \partial x _ { n } } ( \mathbf{p} ) } \\ { \vdots } & { \ddots } & { \vdots } \\ { \frac { \partial ^ { 2 } f } { \partial x _ { n } \partial x _ { 1 } } ( \mathbf{p} ) } & { \cdots } & { \frac { \partial ^ { 2 } f } { \partial x _ { n } \partial x _ { n } } ( \mathbf{p} ) } \end{array} \right]. \end{align*}\]
Mnemonic: make matrix with \(\nabla f\) as the columns, and then differentiate variables left to right.
Write out tree of dependent variables:
Then sum each possible path.
Let subscripts denote which variables are held constant, then \[\begin{align*} \left({\frac{\partial z}{\partial x}\,}\right)_y &= \left({\frac{\partial z}{\partial x}\,}\right)_{u,y,v} \\ & + \left({\frac{\partial z}{\partial v}\,}\right)_{x,y,u} \left({\frac{\partial v}{\partial x}\,}\right)_y \\ & + \left({\frac{\partial z}{\partial u}\,}\right)_{x,y,v} \left({\frac{\partial u}{\partial x}\,}\right)_{v,y} \\ & + \left({\frac{\partial z}{\partial u}\,}\right)_{x,y,v} \left({\frac{\partial u}{\partial v}\,}\right)_{x,y} \left({\frac{\partial v}{\partial x}\,}\right)_y \end{align*}\]
Let \(z = f(x,y)\); then to approximate near \(\mathbf{p}_0 = {\left[ {x_0, y_0} \right]}\), \[\begin{align*} f(\textcolor{Aquamarine}{\textbf{x}}) &\approx f(\mathbf{p}_0) + \nabla f (\mathbf{p}_0) \cdot (\textcolor{Aquamarine}{\textbf{x}} - \mathbf{p}_0) \\ \implies f(x,y) &\approx f(\mathbf{p}_0) + f_x(\mathbf{p}_0)(x-x_0) + f_y(\mathbf{p}_0)(y-y_0) \\ .\end{align*}\]
Critical points of \(f\) given by points \(\mathbf{p}\) such that the derivative vanishes: \[\begin{align*} \operatorname{crit}(f) = \left\{{\mathbf{p}\in {\mathbb{R}}^n ~{\text{s.t.}}~D_f({\mathbf p}) = 0}\right\} \end{align*}\]
1. Compute \[\begin{align*} {\left\lvert {H_f(\mathbf p)} \right\rvert} \mathrel{\vcenter{:}}=\left| \begin{array} { l l } { f _ { x x } } & { f _ { x y } } \\ { f _ { y x } } & { f _ { y y } } \end{array} \right| ({ \mathbf p }) \end{align*}\] 2. Check by cases:
- \({\left\lvert {H_f(\mathbf p)} \right\rvert} > 0\) and \(f_{xx}(\mathbf p) > 0\): local min
- \({\left\lvert {H_f(\mathbf p)} \right\rvert} > 0\) and \(f_{xx}(\mathbf p) < 0\): local max
- \({\left\lvert {H_f(\mathbf p)} \right\rvert} < 0\): saddle point
- \({\left\lvert {H_f(\mathbf p)} \right\rvert} = 0\): inconclusive
What’s really going on?
Eigenvalues have same sign \(\iff\) positive definite or negative definite
Positive definite \(\implies\) convex \(\implies\) local min
Negative definite \(\implies\) concave \(\implies\) local max
The setup: \[\begin{align*} \text{Optimize } f(\mathbf x) &\quad \text{subject to } g(\mathbf x) = c \\ \implies \nabla f &= \lambda \nabla g \end{align*}\]
1. Use this formula to obtain a system of equations in the components of \(\mathbf x\) and the parameter \(\lambda\).
2. Use \(\lambda\) to obtain a relation involving only components of \(\mathbf{x}\).
3. Substitute relations back into the constraint to obtain a collection of critical points.
4. Evaluate \(f\) at the critical points to find the max/min.
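A small sympy sketch of this recipe, using a made-up objective \(f(x,y) = xy\) and constraint \(x + y = 10\):

```python
# Lagrange multipliers: solve grad(f) = lambda * grad(g) together with g = 0.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x * y
g = x + y - 10

eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
sols = sp.solve(eqs, [x, y, lam], dict=True)
print(sols)                         # [{x: 5, y: 5, lambda: 5}]
print([f.subs(s) for s in sols])    # [25], the constrained maximum
```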
For a change of variables \(g: {\mathbb{R}}^n \to {\mathbb{R}}^n\), an integrand \(f: {\mathbb{R}}^n \to {\mathbb{R}}\), and a region \(R\), \[\begin{align*} \int _ { g ( R ) } f ( \mathbf { x } ) ~d V = \int _ { R } (f \circ g) ( \mathbf { x } ) \cdot {\left\lvert {\det D_g ( \mathbf { x })} \right\rvert} ~d V \end{align*}\]
\(R\) is a region, \(S\) is a surface, \(V\) is a solid.
\[\begin{align*} \oint _ { \partial S } \mathbf { F } \cdot d \mathbf { r } = \oint _ { \partial S } [\mathbf{F}_1, \mathbf{F}_2, \mathbf{F}_3] \cdot [dx, dy, dz] = \oint_{{\partial}S} \mathbf{F}_1dx + \mathbf{F}_2dy + \mathbf{F}_3dz \end{align*}\]
The main vector operators \[\begin{align*} \nabla: ({\mathbb{R}}^n \to {\mathbb{R}}) &\to ({\mathbb{R}}^n \to {\mathbb{R}}^n) \\ \phi &\mapsto \nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \phi}{\partial x_i} ~\mathbf{e}_i \\ \\ \text{} \mathrm{div}(\mathbf{F}): ({\mathbb{R}}^n \to {\mathbb{R}}^n) &\to ({\mathbb{R}}^n \to {\mathbb{R}}) \\ \mathbf{F} &\mapsto \nabla \cdot \mathbf{F} \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \mathbf{F}_i}{\partial x_i} \\ \\ \text{} \mathrm{curl}(\mathbf{F}): ({\mathbb{R}}^3 \to {\mathbb{R}}^3) &\to ({\mathbb{R}}^3 \to {\mathbb{R}}^3) \\ \mathbf{F} &\mapsto \nabla \times\mathbf{F} \\ \\ \text{} \end{align*}\] Some terminology: \[\begin{align*} \text{Scalar Field} && \phi:&~ X \to {\mathbb{R}}\\ \text{Vector Field} && \mathbf{F}:&~ X\to {\mathbb{R}}^n\\ \text{Gradient Field} && \mathbf{F}:&~ X \to {\mathbb{R}}^n \mathrel{\Big|}\exists \phi: X\to {\mathbb{R}}\mathrel{\Big|}\nabla \phi = F \end{align*}\]
\[\begin{align*} \mathbf x \cdot \mathbf y = {\left\langle {\mathbf x},~{\mathbf y} \right\rangle} = \sum_{i=1}^n {x_i y_i} = x_1y_1 + x_2y_2 + \cdots && \text{inner/dot product} \\ {\left\lVert {\mathbf x} \right\rVert} = \sqrt{{\left\langle {\mathbf x},~{\mathbf x} \right\rangle}} = \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x_1^2 + x_2^2 + \cdots} && \text{norm} \\ \mathbf a \times\mathbf b = \mathbf{\widehat{n}} {\left\lVert {\mathbf a} \right\rVert}{\left\lVert {\mathbf b} \right\rVert}\sin\theta_{\mathbf a,\mathbf b} = \left| \begin{array}{ccc} \mathbf{\widehat{x}} & \mathbf{\widehat{y}} & \mathbf{\widehat{z}} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array}\right| && \text{cross product} \\ \\ D_\mathbf{u}(\phi) = \nabla \phi \cdot \mathbf{\widehat{u}} && \text{directional derivative} \\ \\ \nabla \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial}{\partial x_i} \mathbf{e}_i = \left[\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \cdots, \frac{\partial}{\partial x_n}\right] && \text{del operator} \\ \\ \nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \phi}{\partial x_i} ~\mathbf{e}_i = {\left[ { \frac{\partial \phi}{\partial x_1}, \frac{\partial \phi}{\partial x_2}, \cdots, \frac{\partial \phi}{\partial x_n}} \right]} && \text{gradient} \\ \\ \Delta \phi \mathrel{\vcenter{:}}=\nabla\cdot\nabla \phi \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial^2 \phi}{\partial x_i^2} = \frac{\partial^2 \phi}{\partial x_1^2} + \frac{\partial^2 \phi}{\partial x_2^2} + \cdots + \frac{\partial^2 \phi}{\partial x_n^2} && \text{Laplacian} \\ \\ \nabla \cdot \mathbf{F} \mathrel{\vcenter{:}}=\sum_{i=1}^n \frac{\partial \mathbf{F}_i}{\partial x_i} = \frac{\partial \mathbf{F}_1}{\partial x_1} + \frac{\partial \mathbf{F}_2}{\partial x_2} + \cdots + \frac{\partial \mathbf{F}_n}{\partial x_n} && \text{divergence} \\ \\ \nabla \times \mathbf { F } = \left| \begin{array} { c c c } { \mathbf { e }_1 } & { \mathbf { e }_2 } & { \mathbf { e }_3 } \\ { \frac { \partial } { \partial x } } & { \frac { \partial } { \partial y } } & { \frac { \partial } { \partial z } } \\ { \mathbf{F} _ { 1 } } & { \mathbf{F} _ { 2 } } & { \mathbf{F} _ { 3 } } \end{array} \right| = [\mathbf{F}_{3y} - \mathbf{F}_{2z}, \mathbf{F}_{1z}- \mathbf{F}_{3x}, \mathbf{F}_{2x} -\mathbf{F}_{1y}] && \text{curl} \\ \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot d \mathbf { S } = \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot \mathbf { n } ~dS && \text{surface integral} \end{align*}\]
\[\begin{align*} \oint _ { \partial S } \mathbf { F } \cdot d \mathbf { r } = \iint _ { S } ( \nabla \times \mathbf { F } ) \cdot d \mathbf { S } .\end{align*}\]
Note that if \(S\) is a closed surface, so \({\partial}S = \emptyset\), this integral vanishes.
\[\begin{align*} \oint _ { {\partial}R } ( L ~d x + M ~d y ) = \iint _ { R } \left( \frac { \partial M } { \partial x } - \frac { \partial L } { \partial y } \right) d x d y .\end{align*}\]
Recovering Green’s Theorem from Stokes’ Theorem:
Let \(\mathbf{F} = [L, M, 0]\), then \(\nabla\times\mathbf{F} = [0, 0, \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}]\)
\[\begin{align*} \iint_ { \partial V } \mathbf { F } \cdot d \mathbf { S } = \iiint _ { V } ( \nabla \cdot \mathbf { F } ) ~d V .\end{align*}\]
\[\begin{align*} D_{\mathbf{v}} f(\mathbf{p}) \mathrel{\vcenter{:}}=\frac{d}{dt} f(\mathbf{p} + t\mathbf{v}) \Big|_{t=0} .\end{align*}\]
Note that the directional derivative uses a normalized direction vector!
Suppose \(f:{\mathbb{R}}^n\to {\mathbb{R}}\) and \(\mathbf{v}\in {\mathbb{R}}^n\). Then \[\begin{align*} D_{\mathbf{v}}f(\mathbf{p}) = {\left\langle {\nabla f(\mathbf{p})},~{\mathbf{v}} \right\rangle} .\end{align*}\]
We first use the fact that we can find \(L\), the best linear approximation to \(f\): \[\begin{align*} L(\mathbf{x}) &\mathrel{\vcenter{:}}= f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{x} - \mathbf{p}) \\ \\ D_{\mathbf{v}}f(\mathbf{p}) &= D_{\mathbf{v}} L(\mathbf{p}) \\ &= \lim_{t\to 0} {L(\mathbf{p} + t\mathbf{v}) - L(\mathbf{p}) \over t}\\ &= \lim_{t\to 0} { f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{p} + t\mathbf{v} - \mathbf{p}) -\qty{f(\mathbf{p}) + D_f(\mathbf{p})(\mathbf{p} - \mathbf{p})} \over t }\\ &= \lim_{t\to 0} { D_f(\mathbf{p})(t\mathbf{v}) \over t} \\ &= D_f(\mathbf{p})\mathbf{v} \\ &\mathrel{\vcenter{:}}=\nabla f(\mathbf{p}) \cdot \mathbf{v} .\end{align*}\]
\[\begin{align*} x = r\cos\theta \\ y = r\sin\theta \\ dV \mapsto r \quad dr~d\theta \end{align*}\]
\[\begin{align*} x = r\cos\theta = \rho\sin\phi\cos\theta \\ y = r\sin\theta = \rho\sin\phi\sin\theta \\ z = \rho\cos\phi \\ dV \mapsto \rho^2 \sin\phi \quad d\rho~d\phi~d\theta \end{align*}\]
\[\begin{align*} \int_C f ~ds &\mathrel{\vcenter{:}}=\int_a^b (f\circ \mathbf{r})(t) ~{\left\lVert {\mathbf{r}'(t)} \right\rVert}~dt \\ &= \int_a^b f(x(t), y(t), z(t)) \sqrt{x_t^2 + y_t^2 + z_t^2} ~dt \end{align*}\]
The function \(\phi\) can be found using the same method from ODEs.
Parametrize the path \(C\) as \(\left\{{\mathbf{r}(t): t\in[a,b]}\right\}\), then \[\begin{align*} \int_C \mathbf F \cdot d\mathbf r & \mathrel{\vcenter{:}}=\int_a^b (\mathbf F \circ \mathbf r)(t) \cdot \mathbf r'(t) ~dt \\ &= \int_a^b [\mathbf F_1(x(t), y(t), \cdots), \mathbf F_2(x(t), y(t), \cdots)]\cdot[x_t, y_t, \cdots] ~dt \\ &= \int_a^b \mathbf F_1(x(t), y(t) \cdots)x_t + \mathbf F_2(x(t), y(t), \cdots)y_t + \cdots ~dt \end{align*}\]
Equivalently written:
\[\begin{align*} \int_a^b \mathbf F_1 ~dx + \mathbf F_2 ~dy + \cdots \mathrel{\vcenter{:}}=\int_C \mathbf F \cdot d\mathbf r \end{align*}\] in which case \([dx, dy, \cdots] \mathrel{\vcenter{:}}=[x_t, y_t, \cdots] = \mathbf r'(t)\).
\[\begin{align*} \iint_S \mathbf{F}\cdot d\mathbf{S} = \iint_S \mathbf{F}\cdot \mathbf{\widehat{n}} ~dS .\end{align*}\]
Given \(R\) and \(f(x,y) = 0\), \[\begin{align*} A(R) = \oint _ { {\partial}R } x ~d y = - \oint _ { {\partial}R } y ~d x = \frac { 1 } { 2 } \oint _ { {\partial}R } - y ~d x + x ~d y . \end{align*}\]
Compute \[\begin{align*} \oint_{{\partial}R} x ~dy = - \oint_{{\partial}R} y ~dx \\ = \frac{1}{2} \oint_{{\partial}R} -y~dx + x~dy = \frac{1}{2} \iint_R 1 - (-1) ~dA =\iint_R 1 ~dA \end{align*}\]
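For example, for the disk of radius \(a\) bounded by \(\mathbf{r}(t) = [a\cos t, a\sin t]\), \(t \in [0, 2\pi]\): \[\begin{align*} \frac{1}{2}\oint_{{\partial}R} -y~dx + x~dy = \frac{1}{2}\int_0^{2\pi} \qty{a^2\sin^2 t + a^2\cos^2 t}~dt = \pi a^2 .\end{align*}\]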
\(\nabla \cdot \mathbf{F} = 0 \not \implies \exists G:~ \mathbf{F} = \nabla\times G\). A counterexample
\[\begin{align*} \mathbf{F}(x,y,z) =\frac{1}{(x^2+y^2+z^2)^{3/2}}[x, y, z]~,\quad S = S^2 \subset {\mathbb{R}}^3 \\ \implies \nabla \cdot \mathbf{F} = 0 \text{ away from the origin, but } \iint_{S^2}\mathbf{F}\cdot d\mathbf{S} = 4\pi \neq 0 \end{align*}\] Where by Stokes’ theorem, \[\begin{align*} \mathbf{F} = \nabla\times\mathbf{G}\implies \iint_{S^2} \mathbf{F} &= \iint_{S^2} \nabla\times\mathbf{G} \\ \\ &= \oint_{{\partial}S^2}\mathbf{G}~d\mathbf{r} && \text{by Stokes}\\ &= 0 \end{align*}\] since \({\partial}S^2 = \emptyset\).
Sufficient condition: if \(\mathbf{F}\) is everywhere \(C^1\), \[\begin{align*} \exists \mathbf{G}:~ \mathbf{F} = \nabla \times\mathbf{G} \iff \iint_S \mathbf{F}\cdot d\mathbf{S} = 0 \text{ for all closed surfaces }S .\end{align*}\]
The underlying field will be assumed to be \({\mathbb{R}}\) for this section.
\[\begin{align*} \operatorname{Mat}(m, n) && \text{the space of all } m\times n\text{ matrices} \\ T && \text{a linear map } {\mathbb{R}}^n \to {\mathbb{R}}^m \\ A\in \operatorname{Mat}(m, n)&& \text{an } m\times n \text{ matrix representing }T \\ A^t\in \operatorname{Mat}(n, m) && \text{an } n\times m \text{ transposed matrix} \\ \mathbf{a} && \text{a } 1\times n \text{ column vector} \\ \mathbf{a}^t && \text{an } n\times 1 \text{ row vector} \\ A = {\left[ {\mathbf{a}_1, \cdots, \mathbf{a}_n} \right]} && \text{a matrix formed with } \mathbf{a}_i \text{ as the columns} \\ V, W && \text{vector spaces} \\ |V|, \dim(W) && \text{dimensions of vector spaces} \\ \det(A) && \text{the determinant of }A \\ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& \mathbf{b} \end{bmatrix} \mathrel{\vcenter{:}}={\left[ {\mathbf{a}_1, \mathbf{a}_2, \cdots \mathbf{a}_n, \mathbf{b}} \right]} && \text{augmented matrices} \\ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& B \end{bmatrix} \mathrel{\vcenter{:}}={\left[ {\mathbf{a}_1, \cdots \mathbf{a}_n, \mathbf{b}_1, \cdots, \mathbf{b}_m} \right]} && \text{block matrices}\\ \operatorname{Spec}(A) && \text{the multiset of eigenvalues of } A \\ A\mathbf{x} = \mathbf{b} && \text{a system of linear equations} \\ r\mathrel{\vcenter{:}}=\operatorname{rank}(A) && \text{the rank of }A\\ r_b = \operatorname{rank}\qty{ \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& \mathbf{b} \\ \end{bmatrix} } && \text{the rank of }A\text{ augmented by }\mathbf{b} .\end{align*}\]
\[\begin{align*} {\left\lvert {\ker(A)} \right\rvert} + {\left\lvert {\operatorname{im}(A)} \right\rvert} = {\left\lvert {\operatorname{dom}(A)} \right\rvert} ,\end{align*}\] where \(\operatorname{nullity}(A) = {\left\lvert {\ker(A)} \right\rvert}\), \(\operatorname{rank}(A) = {\left\lvert {\operatorname{im}(A)} \right\rvert}\), and \({\left\lvert {\operatorname{dom}(A)} \right\rvert} = n\) is the number of columns in the corresponding matrix.
Generalization: the following sequence is always exact: \[\begin{align*} 0 \to \ker(A) \xhookrightarrow{\text{id}} \operatorname{dom}(A) % \xrightarrow[]{A}\mathrel{\mkern-14mu}\rightarrow \operatorname{im}(A) \to 0 .\end{align*}\] Moreover, it always splits, so \(\operatorname{dom}A = \ker A \oplus \operatorname{im}A\) and thus \({\left\lvert {\operatorname{dom}(A)} \right\rvert} = {\left\lvert {\ker(A)} \right\rvert} + {\left\lvert {\operatorname{im}(A)} \right\rvert}\).
We also have \[\begin{align*} \dim(\operatorname{rowspace}(A)) = \dim(\operatorname{colspace}(A)) = \operatorname{rank}(A) .\end{align*}\]
Let \(A\) be an \(n\times n\) matrix. TFAE:

- \(A\) is invertible and has a unique inverse \(A^{-1}\)
- \(A^T\) is invertible
- \(\det(A) \neq 0\)
- The linear system \(A\mathbf{x} = \mathbf{b}\) has a unique solution for every \(\mathbf{b} \in {\mathbb{R}}^n\)
- The homogeneous system \(A\mathbf{x} = 0\) has only the trivial solution \(\mathbf{x} = 0\)
- \(\operatorname{rank}(A) = n\), i.e. \(A\) is full rank
- \(\mathrm{nullity}(A) \mathrel{\vcenter{:}}=\dim\mathrm{nullspace}(A) = 0\)
- \(A = \prod_{i=1}^k E_i\) for some finite \(k\), where each \(E_i\) is an elementary matrix
- \(A\) is row-equivalent to the identity matrix \(I_n\)
- \(A\) has exactly \(n\) pivots
- The columns of \(A\) are a basis for \({\mathbb{R}}^n\), i.e. \(\operatorname{colspace}(A) = {\mathbb{R}}^n\)
- The rows of \(A\) are a basis for \({\mathbb{R}}^n\), i.e. \(\mathrm{rowspace}(A) = {\mathbb{R}}^n\)
- \(\left(\operatorname{colspace}(A)\right)^\perp= \left(\mathrm{rowspace}(A)\right)^\perp= \left\{{\mathbf{0}}\right\}\)
- Zero is not an eigenvalue of \(A\)
- The rows of \(A\) are not coplanar (for \(3\times 3\) matrices)
Similarly, by taking negations, TFAE:

- \(A\) is singular (not invertible)
- \(\det(A) = 0\)
- The homogeneous system \(A\mathbf{x} = \mathbf{0}\) has nontrivial solutions
- \(\operatorname{rank}(A) < n\)
- Zero is an eigenvalue of \(A\)
Reformulated in terms of a linear map \(T: {\mathbb{R}}^n \to {\mathbb{R}}^n\), TFAE:

- \(T^{-1}: {\mathbb{R}}^n \to {\mathbb{R}}^n\) exists
- \(\operatorname{im}(T) = {\mathbb{R}}^n\)
- \(\ker(T) = 0\)
- \(T\) is injective
- \(T\) is surjective
- \(T\) is an isomorphism
- The system \(A\mathbf{x} = \mathbf{0}\) has only the trivial solution
It is common to want to know the range and kernel of a specific linear transformation \(T\). \(T\) can be given in many ways, but a general strategy for deducing these properties involves:
Express an arbitrary vector in \(V\) as a linear combination of its basis vectors, and set it equal to an arbitrary vector in \(W\).
Use the linear properties of \(T\) to make a substitution from known transformations
Find a restriction or relation given by the constants of the initial linear combination.
Useful fact: if \(V\leq W\) is a subspace and \(\dim(V) \geq \dim(W)\), then \(V=W\).
If \(V\subseteq W\), then \(V\) is a subspace of \(W\) if the following hold:
\[\begin{align*} (1) && \mathbf{0}\in V \\ (2) && \mathbf{a}, \mathbf{b}\in V\implies t\mathbf{a} + \mathbf{b}\in V .\end{align*}\]
Any set of two vectors \(\left\{{\mathbf{v}, \mathbf{w}}\right\}\) is linearly dependent \(\iff \exists \lambda :~\mathbf{v} = \lambda \mathbf{w}\), i.e. one is a scalar multiple of the other.
A set \(S\) forms a basis for a vector space \(V\) iff \(S\) is linearly independent and \(\operatorname{span}(S) = V\).
In this case, we define the dimension of \(V\) to be \({\left\lvert {S} \right\rvert}\).
The point of this section is to show how an inner product can induce a notion of “angle”, which agrees with our intuition in Euclidean spaces such as \({\mathbb{R}}^n\), but can be extended to much less intuitive things, like spaces of functions.
The Euclidean inner product is defined as \[\begin{align*} {\left\langle {\mathbf{a}},~{\mathbf{b}} \right\rangle} = \sum_{i=1}^n a_i b_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n .\end{align*}\]
Also sometimes written as \(\mathbf{a}^T\mathbf{b}\) or \(\mathbf{a} \cdot \mathbf{b}\).
Yields a norm \[\begin{align*} {\left\lVert {\mathbf{x}} \right\rVert} \mathrel{\vcenter{:}}=\sqrt{{\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle}} \end{align*}\]
which has a useful alternative formulation \[\begin{align*} {\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle} = {\left\lVert {\mathbf{x}} \right\rVert}^2 .\end{align*}\]
This leads to a notion of angle: \[\begin{align*} {\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} = {\left\lVert {\mathbf{x}} \right\rVert} {\left\lVert {\mathbf{y}} \right\rVert} \cos\theta_{x,y} \implies \cos \theta_{x,y} \mathrel{\vcenter{:}}=\frac{{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle}}{{\left\lVert {\mathbf{x}} \right\rVert} {\left\lVert {\mathbf{y}} \right\rVert}} = {\left\langle {\widehat{\mathbf{x}}},~{\widehat{\mathbf{y}}} \right\rangle} \end{align*}\] where \(\theta_{x,y}\) denotes the angle between the vectors \(\mathbf{x}\) and \(\mathbf{y}\).
Since \(\cos \theta=0\) exactly when \(\theta = \pm \frac \pi 2\), we can declare two vectors to be orthogonal exactly in this case: \[\begin{align*} \mathbf{x} \in \mathbf{y}^\perp\iff {\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} = 0 .\end{align*}\]
Note that this makes the zero vector orthogonal to everything.
Given a subspace \(S \subseteq V\), we define its orthogonal complement \[\begin{align*} S^\perp= \left\{{\mathbf{v}\in V {~\mathrel{\Big|}~}\forall \mathbf{s}\in S,~ {\left\langle {\mathbf{v}},~{\mathbf{s}} \right\rangle} = 0}\right\}. \end{align*}\]
Any choice of subspace \(S\subseteq V\) yields a decomposition \(V = S \oplus S^\perp\).
A useful formula is \[\begin{align*} {\left\lVert {\mathbf{x} + \mathbf{y}} \right\rVert}^2 = {\left\lVert {\mathbf{x}} \right\rVert}^2 + 2{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle} + {\left\lVert {\mathbf{y}} \right\rVert}^2 .\end{align*}\]
When \(\mathbf{x}\in \mathbf{y}^\perp\), this reduces to \[\begin{align*} {\left\lVert {\mathbf{x} + \mathbf{y}} \right\rVert}^2 = {\left\lVert {\mathbf{x}} \right\rVert}^2 + {\left\lVert {\mathbf{y}} \right\rVert}^2 .\end{align*}\]
Bilinearity: \[\begin{align*} {\left\langle {\sum_j \alpha_j \mathbf{a}_j},~{\sum_k \beta_k \mathbf{b}_k} \right\rangle} = \sum_j \sum_k \alpha_j \beta_k {\left\langle {\mathbf{a}_j},~{\mathbf{b}_k} \right\rangle}. \end{align*}\]
Symmetry: \[\begin{align*} {\left\langle {\mathbf{a}},~{\mathbf{b}} \right\rangle} = {\left\langle {\mathbf{b}},~{\mathbf{a}} \right\rangle} \end{align*}\]
Positivity: \[\begin{align*} \mathbf{a} \neq \mathbf{0} \implies {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} > 0 \end{align*}\]
Nondegeneracy: \[\begin{align*} \mathbf{a} = \mathbf{0} \iff {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} = 0 \end{align*}\]
Orthonormalizing a basis \(\left\{{\mathbf{x}_i}\right\}\) to produce an orthonormal basis \(\left\{{\mathbf{u}_i}\right\}\):
\[\begin{align*} \mathbf{u}_1 &= N(\mathbf{x_1}) \\ \mathbf{u}_2 &= N(\mathbf{x}_2 - {\left\langle {\mathbf{x}_2},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1)\\ \mathbf{u}_3 &= N(\mathbf{x}_3 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_2} \right\rangle}\mathbf{u}_2 ) \\ \vdots & \qquad \vdots \\ \mathbf{u}_k &= N(\mathbf{x}_k - \sum_{i=1}^{k-1} {\left\langle {\mathbf{x}_k},~{\mathbf{u}_i} \right\rangle}\mathbf{u}_i) \end{align*}\]
where \(N\) denotes normalizing the result.
The general setup here is that we are given a basis \(\left\{{\mathbf{x}_i}\right\}_{i=1}^n\) (not necessarily orthogonal) and we want to produce an orthonormal basis from it.
Why would we want such a thing? Recall that we often wanted to change from the standard basis \(\mathcal{E}\) to some different basis \(\mathcal{B} = \left\{{\mathbf{b}_1, \mathbf{b}_2, \cdots}\right\}\). We could form the change of basis matrix \(B = [\mathbf{b}_1, \mathbf{b}_2, \cdots]\), which acts on vectors in the \(\mathcal{B}\) basis according to \[\begin{align*} B[\mathbf{x}]_\mathcal{B} = [\mathbf{x}]_{\mathcal{E}} .\end{align*}\]
But to change from \(\mathcal{E}\) to \(\mathcal{B}\) requires computing \(B^{-1}\), which acts on vectors in the standard basis according to \[\begin{align*} B^{-1}[\mathbf{x}]_\mathcal{E} = [\mathbf{x}]_{\mathcal{B}} .\end{align*}\]
If, on the other hand, the \(\mathbf{b}_i\) are orthonormal, then \(B^{-1} = B^T\), which is much easier to compute. We also obtain a rather simple formula for the coordinates of \(\mathbf{x}\) with respect to \(\mathcal B\). This follows because we can write \[\begin{align*} \mathbf{x} = \sum_{i=1}^n {\left\langle {\mathbf{x}},~{\mathbf{b}_i} \right\rangle} \mathbf{b}_i \mathrel{\vcenter{:}}=\sum_{i=1}^n c_i \mathbf{b}_i, .\end{align*}\]
and we find that \[\begin{align*} [\mathbf{x}]_\mathcal{B} = \mathbf{c} \mathrel{\vcenter{:}}=[c_1, c_2, \cdots, c_n]^T. .\end{align*}\]
This also allows us to simplify projection matrices. Supposing that \(Q\) has orthonormal columns and letting \(S\) be the column space of \(Q\), recall that the projection onto \(S\) is defined by \[\begin{align*} P_S = Q(Q^TQ)^{-1}Q^T .\end{align*}\]
Since \(Q\) has orthonormal columns, it satisfies \(Q^TQ = I\), and this simplifies to \[\begin{align*} P_S = QQ^T. .\end{align*}\]
Given the basis \(\left\{{\mathbf{x}_i}\right\}\), we form an orthonormal basis \(\left\{{\mathbf{u}_i}\right\}\) iteratively as follows.
First define \[\begin{align*} N: {\mathbb{R}}^n &\to S^{n-1} \\ \mathbf{x} &\mapsto \widehat{\mathbf{x}} \mathrel{\vcenter{:}}=\frac {\mathbf{x}} {{\left\lVert {\mathbf{x}} \right\rVert}} \end{align*}\]
which projects a vector onto the unit sphere in \({\mathbb{R}}^n\) by normalizing. Then,
\[\begin{align*} \mathbf{u}_1 &= N(\mathbf{x_1}) \\ \mathbf{u}_2 &= N(\mathbf{x}_2 - {\left\langle {\mathbf{x}_2},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1)\\ \mathbf{u}_3 &= N(\mathbf{x}_3 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_1} \right\rangle}\mathbf{u}_1 - {\left\langle {\mathbf{x}_3},~{\mathbf{u}_2} \right\rangle}\mathbf{u}_2 ) \\ \vdots & \qquad \vdots \\ \mathbf{u}_k &= N(\mathbf{x}_k - \sum_{i=1}^{k-1} {\left\langle {\mathbf{x}_k},~{\mathbf{u}_i} \right\rangle}\mathbf{u}_i) \end{align*}\]
In words, at each stage, we take one of the original vectors \(\mathbf{x}_i\), then subtract off its projections onto all of the \(\mathbf{u}_i\) we’ve created up until that point. This leaves us with only the component of \(\mathbf{x}_i\) that is orthogonal to the span of the previous \(\mathbf{u}_i\) we already have, and we then normalize each \(\mathbf{u}_i\) we obtain this way.
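A minimal numpy sketch of this iteration (the input basis is a made-up example):

```python
# Classical Gram-Schmidt on the columns of X.
import numpy as np

def gram_schmidt(X):
    """Columns of X form a basis; returns a matrix with orthonormal columns."""
    U = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        u = X[:, k].astype(float)
        for i in range(k):
            u = u - np.dot(X[:, k], U[:, i]) * U[:, i]   # subtract projections onto earlier u_i
        U[:, k] = u / np.linalg.norm(u)                   # normalize
    return U

X = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
U = gram_schmidt(X)
print(np.round(U.T @ U, 10))   # identity: the columns are orthonormal
```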
Alternative Explanation:
Given a basis \[\begin{align*} S = \left\{\mathbf{v_1, v_2, \cdots v_n}\right\}, \end{align*}\]
the Gram-Schmidt process produces a corresponding orthogonal basis \[\begin{align*} S' = \left\{\mathbf{u_1, u_2, \cdots u_n}\right\} \end{align*}\] that spans the same vector space as \(S\).
\(S'\) is found using the following pattern: \[\begin{align*} \mathbf{u_1} &= \mathbf{v_1} \\ \mathbf{u_2} &= \mathbf{v_2} - \text{proj}_{\mathbf{u_1}} \mathbf{v_2}\\ \mathbf{u_3} &= \mathbf{v_3} - \text{proj}_{\mathbf{u_1}} \mathbf{v_3} - \text{proj}_{\mathbf{u_2}} \mathbf{v_3}\\ \end{align*}\]
where \[\begin{align*} \text{proj}_{\mathbf{u}} \mathbf{v} = (\text{scal}_{\mathbf{u}} \mathbf{v})\frac{\mathbf{u}}{\mathbf{{\left\lVert {u} \right\rVert}}} = \frac{\langle \mathbf{v,u} \rangle}{{\left\lVert {\mathbf{u}} \right\rVert}}\frac{\mathbf{u}}{\mathbf{{\left\lVert {u} \right\rVert}}} = \frac{{\left\langle {\mathbf{v}},~{\mathbf{u}} \right\rangle}}{{\left\lVert {\mathbf{u}} \right\rVert}^2}\mathbf{u} \end{align*}\] is the orthogonal projection of \(\mathbf{v}\) onto \(\mathbf{u}\).
The orthogonal set \(S'\) can then be transformed into an orthonormal set \(S''\) by simply dividing the vectors \(s\in S'\) by their magnitudes. The usual definition of a vector’s magnitude is \[\begin{align*} {\left\lVert {\mathbf{a}} \right\rVert} = \sqrt{{\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle}} \text{ and } {\left\lVert {\mathbf{a}} \right\rVert}^2 = {\left\langle {\mathbf{a}},~{\mathbf{a}} \right\rangle} \end{align*}\]
As a final check, all vectors in \(S'\) should be orthogonal to each other, such that \[\begin{align*} {\left\langle {\mathbf{v_i}},~{\mathbf{v_j}} \right\rangle} = 0 \text{ when } i \neq j \end{align*}\]
and all vectors in \(S''\) should be orthonormal, such that \[\begin{align*} {\left\langle {\mathbf{v_i}},~{\mathbf{v_j}} \right\rangle} = \delta_{ij} \end{align*}\]
Given a matrix \(A \in \mathrm{Mat}(m, n)\), and noting that \[\begin{align*} A &: {\mathbb{R}}^n \to {\mathbb{R}}^m,\\ A^T &: {\mathbb{R}}^m \to {\mathbb{R}}^n \end{align*}\]
We have the following decompositions: \[\begin{align*} &{\mathbb{R}}^n &\cong \ker A &\oplus \operatorname{im}A^T &\cong \mathrm{nullspace}(A) &\oplus~ \mathrm{colspace}(A^T) \\ &{\mathbb{R}}^m &\cong \operatorname{im}A &\oplus \ker A^T &\cong \mathrm{colspace}(A) &\oplus~ \mathrm{nullspace}(A^T) \end{align*}\]
An \(m\times n\) matrix is a map from \(n\)-dimensional space to \(m\)-dimensional space. The number of rows tells you the dimension of the codomain, the number of columns tells you the dimension of the domain.
The space of matrices is not an integral domain! Counterexample: if \(A\) is singular and nonzero, there is some nonzero \(\mathbf{v}\) such that \(A \mathbf{v} = \mathbf{0}\). Then setting \(B = {\left[ {\mathbf{v}, \mathbf{v}, \cdots} \right]}\) yields \(AB = 0\) with \(A\neq 0, B\neq 0\).
The rank of a matrix \(A\) representing a linear transformation \(T\) is \(\dim \operatorname{colspace}(A)\), or equivalently \(\dim \operatorname{im}T\).
\(\operatorname{rank}(A)\) is equal to the number of nonzero rows in \(\operatorname{RREF}(A)\).
\[\begin{align*} \mathrm{Trace}(A) = \sum_{i=1}^m A_{ii} \end{align*}\]
The following are elementary row operations on a matrix: swapping two rows, multiplying a row by a nonzero scalar, and adding a scalar multiple of one row to another row.
If \(A \in \mathrm{Mat}(m, n)\) has rows \(\mathbf{a}_1, \cdots, \mathbf{a}_m\) and \(B = {\left[ {\mathbf{b}_1, \mathbf{b}_2, \cdots} \right]} \in\mathrm{Mat}(n, p)\) has columns \(\mathbf{b}_j\), then \[\begin{align*} C \mathrel{\vcenter{:}}= AB \implies c_{ij} = \sum_{k=1}^n a_{ik}b_{kj} = {\left\langle {\mathbf{a_i}},~{\mathbf{b_j}} \right\rangle} \end{align*}\] where \(1\leq i \leq m\) and \(1\leq j \leq p\). In words, each entry \(c_{ij}\) is obtained by dotting row \(i\) of \(A\) against column \(j\) of \(B\).
A system of linear equations is consistent when it has at least one solution. The system is inconsistent when it has no solutions.
?
Homogeneous systems are always consistent, i.e. there is always at least one solution.
There are three possibilities for a system of linear equations: no solutions, exactly one solution, or infinitely many solutions.
These possibilities can be checked by comparing \(r\mathrel{\vcenter{:}}=\operatorname{rank}(A)\) with \(r_b\), the rank of the augmented matrix: \(r < r_b\) gives no solutions, \(r = r_b = n\) (the number of unknowns) gives a unique solution, and \(r = r_b < n\) gives infinitely many solutions.
\[\begin{align*} \det{(A \mod p}) \mod p \equiv (\det{A}) \mod p \end{align*}\]
For \(2\times 2\) matrices, \[\begin{align*} A^{-1} = \left( \begin{array}{cc} a & b \\ c & d \end{array}\right)^{-1} = \frac{1}{\det{A}}\left( \begin{array}{cc} d & -b \\ -c & a \end{array}\right) \end{align*}\] In words, swap the main diagonal entries, and flip the signs on the off-diagonal.
Let \(A \in \mathrm{Mat}(m, m)\) be square; then there is a function \[\begin{align*} \det: \operatorname{Mat}(m, m) &\to {\mathbb{R}}\\ A &\mapsto \det(A) \end{align*}\] satisfying the following properties:
\(\det\) is a group homomorphism onto \(({\mathbb{R}}, \cdot)\): \[\begin{align*} \det(AB) = \det(A) \det(B) \end{align*}\]
Invariance under adding scalar multiples of any row to another: \[\begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} = \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i + t\mathbf{a_j} & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} \end{align*}\]
Sign change under row permutation: \[\begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} = (-1) \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \end{bmatrix} \end{align*}\]
Multilinearity in rows: \[\begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} &= t \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} \\ \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} &= t^m \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} \\ \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& t_1 \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& t_2 \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& t_m \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} &= \prod_{i=1}^m t_i \det \begin{bmatrix} \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_1 & \rule[.5ex]{2.5ex}{0.5pt}\\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_2 & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \mathbf{a}_m & \rule[.5ex]{2.5ex}{0.5pt}\\ \end{bmatrix} .\end{align*}\]
Linearity in each row: \[\begin{align*} \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i + \textcolor{red}{\textbf{a}}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} =\det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{Aquamarine}{\textbf{a}}_i & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} + \det \begin{bmatrix} & \vdots & \\ \rule[.5ex]{2.5ex}{0.5pt}& \textcolor{red}{\textbf{a}}_j & \rule[.5ex]{2.5ex}{0.5pt}\\ & \vdots & \\ \end{bmatrix} .\end{align*}\]
\({\left\lvert {\det(A)} \right\rvert}\) is the volume of the parallelepiped spanned by the columns of \(A\) (the determinant itself is the signed volume).
If any row of \(A\) is all zeros, \(\det(A) = 0\).
TFAE:
Useful shortcuts:
The minor \(M_{ij}\) of \(A\in \operatorname{Mat}(n, n)\) is the determinant of the \((n-1) \times (n-1)\) matrix obtained by deleting the \(i\)th row and \(j\)th column from \(A\).
The cofactor \(C_{ij}\) is the scalar defined by \[\begin{align*} C_{ij} \mathrel{\vcenter{:}}=(-1)^{i+j} M_{ij} .\end{align*}\]
For any fixed \(i\), there is a formula \[\begin{align*} \det(A) = \sum_{j=1}^n a_{ij} C_{ij} .\end{align*}\]
Let \[\begin{align*} A = \left[\begin{array}{lll} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{array}\right] .\end{align*}\]
Then \[\begin{align*} \det A = 1 \cdot\left|\begin{array}{ll} 5 & 6 \\ 8 & 9 \end{array}\right|-2 \cdot\left|\begin{array}{ll} 4 & 6 \\ 7 & 9 \end{array}\right|+3 \cdot\left|\begin{array}{ll} 4 & 5 \\ 7 & 8 \end{array}\right| = 1 \cdot(-3)-2 \cdot(-6)+3 \cdot(-3) = 0 .\end{align*}\]
\(\det(A)\) can be computed by reducing \(A\) to row echelon form (which is upper triangular) and keeping track of the following effects: swapping two rows flips the sign of the determinant, scaling a row by \(t\) scales the determinant by \(t\), and adding a multiple of one row to another leaves it unchanged. The determinant is then the product of the diagonal entries of the reduced matrix, corrected by these factors.
Given a linear system \(A\mathbf{x} = \mathbf{b}\), writing \(\mathbf{x} = {\left[ {x_1, \cdots, x_n} \right]}\), there is a formula \[\begin{align*} x_i = \frac{\det(B_i)}{\det(A)} \end{align*}\] where \(B_i\) is \(A\) with the \(i\)th column replaced by \(\mathbf{b}\).
Under the equivalence relation of elementary row operations, there is an equivalence of augmented matrices: \[\begin{align*} \begin{bmatrix} A &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& I \end{bmatrix} \sim \begin{bmatrix} I &\fboxsep=-\fboxrule\!\!\!\fbox{\strut}\!\!\!& A^{-1} \end{bmatrix} \end{align*}\] where \(I\) is the \(n\times n\) identity matrix.
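A small sympy sketch of this row-reduction approach on a made-up \(2\times 2\) matrix:

```python
# Invert A by row-reducing the augmented matrix [A | I] to [I | A^{-1}].
import sympy as sp

A = sp.Matrix([[2, 1],
               [5, 3]])
augmented = A.row_join(sp.eye(2))       # [A | I]
reduced, _ = augmented.rref()           # -> [I | A^{-1}]
A_inv = reduced[:, 2:]

print(A_inv)          # Matrix([[3, -1], [-5, 2]])
print(A * A_inv)      # the identity matrix
```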
\[\begin{align*} A^{-1} = {1\over \det(A)} {\left[ {C_{ij}} \right]}^t .\end{align*}\] where \(C_{ij}\) is the cofactor at position \(i,j\).
\[\begin{align*} \left(\begin{array}{cc} a& b \\ c& d \end{array}\right)^{-1} = {1 \over a d - b c} \left(\begin{array}{rr} d & -b \\ -c & a \end{array}\right) \quad \text{ where } ad-bc \ne 0 \end{align*}\]
What’s the pattern?
\[\begin{align*} \begin{bmatrix} + & - \\ - & + \end{bmatrix} \end{align*}\]
\[\begin{align*} A^{-1} \mathrel{\vcenter{:}}= \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} ^{-1} = {1 \over {\det A}} \begin{bmatrix} e i - f h & -(b i - c h) & b f - c e \\ -(d i - f g) &a i - c g &-(a f -c d) \\ d h - e g & -(a h - b g)& a e - b d \end{bmatrix} .\end{align*}\]
The pattern:
\[\begin{align*} \begin{bmatrix} + & - & + \\ - & + & - \\ \ + & - & + \end{bmatrix} \end{align*}\]
Let \(A\in \operatorname{Mat}(m, n)\) represent a map \(T:{\mathbb{R}}^n\to {\mathbb{R}}^m\).
?
\[\begin{align*} \dim \operatorname{rowspace}(A) = \dim \operatorname{colspace}(A) .\end{align*}\]
\[\begin{align*} \operatorname{im}(T)^\vee= \operatorname{rowspace}(A) \subset {\mathbb{R}}^n .\end{align*}\]
Reduce to RREF, and take nonzero rows of \(\mathrm{RREF}(A)\).
\[\begin{align*} \operatorname{im}(T) = \operatorname{colspace}(A) \subseteq {\mathbb{R}}^m \end{align*}\]
Reduce to RREF, and take columns with pivots from original \(A\).
Not enough pivots implies the columns don't span the entire codomain
\[\begin{align*} \ker(T) = \operatorname{nullspace}(A) \subseteq {\mathbb{R}}^n \end{align*}\]
Reduce to RREF; columns without pivots correspond to free variables. Convert back to equations and pull the free variables out as scalar parameters.
For each \(\lambda \in \operatorname{Spec}(A)\), compute a basis for \(\ker(A - \lambda I)\).
A vector \(\mathbf{v}\) is said to be an eigenvector of \(A\) with eigenvalue \(\lambda\in \operatorname{Spec}(A)\) iff \[\begin{align*} A\mathbf{v} = \lambda \mathbf{v} \end{align*}\] For a fixed \(\lambda\), the corresponding eigenspace \(E_\lambda\) is the span of all such vectors.
Similar matrices have identical eigenvalues and multiplicities.
Eigenvectors corresponding to distinct eigenvalues are always linearly independent
\(A\) has \(n\) distinct eigenvalues \(\implies A\) has \(n\) linearly independent eigenvectors.
A matrix \(A\) is diagonalizable \(\iff A\) has \(n\) linearly independent eigenvectors.
For \(\lambda\in \operatorname{Spec}(A)\), \[\begin{align*} \mathbf{v}\in E_\lambda \iff \mathbf{v} \in \ker(A-I\lambda) .\end{align*}\]
Some miscellaneous useful facts:
\(\lambda \in \operatorname{Spec}(A) \implies \lambda^2 \in \operatorname{Spec}(A^2)\) with the same eigenvector.
\(\prod \lambda_i = \det A\)
\(\sum \lambda_i = \mathrm{Tr}~A\)
An \(n\times n\) matrix \(A\) is diagonalizable iff its eigenvectors span all of \(\mathbb{R}^n\) (i.e. there are \(n\) linearly independent eigenvectors).
Equivalently, \(A\) is diagonalizable iff there is a basis of \(\mathbb{R}^n\) consisting of eigenvectors of \(A\).
\[\begin{align*} A \mathrel{\vcenter{:}}=\left[ \begin{array} { c c } { 1 } & { 1 } \\ { 0 } & { 1 } \end{array} \right] &\implies A^n = \left[ \begin{array} { c c } { 1 } & { n } \\ { 0 } & { 1 } \end{array} \right], && \operatorname{Spec}(A) = [1,1] \\ A \mathrel{\vcenter{:}}=\left[ \begin{array} { c c } { 1 } & { 1 } \\ { 0 } & { - 1 } \end{array} \right] &\implies A^2 = I_2, && \operatorname{Spec}(A) = [1, -1] \end{align*}\]
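These two examples are easy to sanity-check in sympy; a short sketch (output comments describe the expected results, up to ordering):

```python
from sympy import Matrix

A = Matrix([[1, 1], [0, 1]])     # Jordan block: eigenvalue 1 with multiplicity 2
B = Matrix([[1, 1], [0, -1]])    # distinct eigenvalues 1 and -1

print(A.eigenvals())             # {1: 2}: only a 1-dimensional eigenspace
print(A.is_diagonalizable())     # False
print(B.eigenvals())             # {1: 1, -1: 1}
print(B.is_diagonalizable())     # True: n distinct eigenvalues

P, D = B.diagonalize()           # B = P * D * P^{-1}
assert P * D * P.inv() == B
assert B**2 == Matrix.eye(2)                 # matches B^2 = I_2 above
assert A**5 == Matrix([[1, 5], [0, 1]])      # matches the A^n pattern above
```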
Determine a basis for \[\begin{align*} S = \left\{a_0 + a_1 x + a_2 x^2\mathrel{\Big|}a_0,a_1,a_2 \in \mathbb{R} \land a_0 - a_1 -2a_2 =0\right\}. \end{align*}\]
Let \(a_2=t, a_1=s, a_0=s+2t\), then \[\begin{align*} S &= \left\{ (s+2t) + (sx+tx^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \left\{ (s+sx) + (2t+tx^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \left\{ s(1+x) + t(2+x^2)\mathrel{\Big|}s,t\in\mathbb{R} \right\} \\ &= \text{span}\left\{(1+x),(2+x^2)\right\} \end{align*}\]
and a basis for \(S\) is \[\begin{align*} \left\{(1+x), (2+x^2)\right\} \end{align*}\]
If \(V\) is an \(n\)-dimensional vector space, can every set \(S\) with fewer than \(n\) vectors be extended to a basis for \(V\)?
No: only linearly independent sets with fewer than \(n\) vectors can be extended to form a basis for \(V\).
Does the set of all \(3 \times 3\) symmetric matrices form a three-dimensional subspace of \(M_{3}(\mathbb{R})\)?
No: this set forms a 6-dimensional subspace, so a basis for it requires six elements.
Given \(A=\) \[\begin{align*}\begin{bmatrix} 1 & 3 \\ -2 & -6 \end{bmatrix}\end{align*}\] what is the dimension of the null space of \(A\)?
The augmented matrix for the system \(A\mathbf{x} = \mathbf{0}\) is \[\begin{align*}\begin{bmatrix}[cc|c] 1 & 3 & 0 \\ 0 & 0 & 0 \end{bmatrix}\end{align*}\] which has one free variable.
Writing one variable in terms of another results in \(x_1 + 3x_2 = 0 \Rightarrow x_1 = -3x_2\).
Let \(x_2 = r\) where \(r \in \mathbb{R}\); then \(S = \left\{ \mathbf{x} \in \mathbb{R}^2 : \mathbf{x} = r(-3,1), r \in \mathbb{R}\right\} = \text{span}\left\{(-3,1)\right\}\).
So the set \(B = \left\{(-3,1)\right\}\) is a basis for the null space of \(A\), and the null space has dimension 1.
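A one-line check of this example in sympy (assuming it's available):

```python
from sympy import Matrix

A = Matrix([[1, 3], [-2, -6]])
print(A.nullspace())               # [Matrix([[-3], [1]])]: basis vector (-3, 1)
print(A * Matrix([-3, 1]))         # the zero vector, so (-3, 1) is in ker(A)
print(A.rank())                    # 1, so nullity = 2 - 1 = 1
```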
Let \(S\) be the subspace of \(\mathbb{R}^3\) that consists of all solutions to the equation \(x-3y+z = 0\). Determine a basis for \(S\), and find dim[\(S\)].
The first goal is to find a way to express the set of 3-tuples that satisfy this equation.
Let \(y=r\) and \(z=s\), then \(x=3r-s\). Then vectors \(\mathbf{v}\) that satisfy the equation are all of the form \[\begin{align*} \mathbf{v} = (3r-s, r, s) = (3r,r,0)+(-s,0,s) = r(3,1,0) + s(-1,0,1). \end{align*}\] (Note: the goal here is to separate the free variables into different vectors so the solution can be written as a linear combination.)
The set \(S\) that satisfies this equation is then \[\begin{align*} S &= \left\{ \mathbf{v} \in \mathbb{R}^3 : \mathbf{v} =r(3,1,0) + s(-1,0,1) \land r,s\in\mathbb{R} \right\} \\ &= \text{span}\left\{ (3,1,0), (-1,0,1)\right\} \end{align*}\]
All that remains is to check that the vectors in this span are linearly independent. This can be done by showing that \[\begin{align*} a(3,1,0) + b(-1,0,1) = (0,0,0) \end{align*}\] forces \(a=b=0\).
Since the two vectors are linearly independent and span the solution set \(S\), they form a basis for \(S\) of dimension 2.
Determine a basis for the subspace of \(M_2(\mathbb{R})\) spanned by \[\begin{align*} A_1 = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -1 & 4 \\ 1 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 5 & -6 \\ -5 & 1 \end{bmatrix}. \end{align*}\]
Note that because the set contains the zero matrix, it is linearly dependent. So only consider the other three, as they span the same subspace as the original set.
First, determine if \(\left\{ A_1, A_2, A_3\right\}\) is linearly independent. Start with the equation \[\begin{align*} c_1A_1 + c_2A_2 + c_3A_3 = 0_2 \end{align*}\]
which gives \[\begin{align*} c_1 - c_2 + 5c_3 &= 0 \\ 3c_1 + 4c_2 - 6c_3 &= 0 \\ -c_1 + c_2 - 5c_3 &= 0 \\ 2c_1 + c_2 + c_3 &= 0 \end{align*}\]
which has the solution \((-2r,3r,r)\). So the set is linearly dependent by the relation \[\begin{align*} -2A_1 + 3A_2 + A_3 = 0 \text{ or }\\ A_3 = 2A_1 - 3A_2 \end{align*}\]
So \(\left\{A_1, A_2\right\}\) spans the same subspace as the original set. It is also linearly independent, and therefore forms a basis for the original subspace.
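To double-check this kind of computation, one can flatten each matrix into a vector of \(\mathbb{R}^4\) and row reduce; a sympy sketch:

```python
from sympy import Matrix

A1 = Matrix([[1, 3], [-1, 2]])
A2 = Matrix([[-1, 4], [1, 1]])
A3 = Matrix([[5, -6], [-5, 1]])

# Flatten each matrix into a column of R^4 and look for dependencies.
M = Matrix.hstack(*[Ai.reshape(4, 1) for Ai in (A1, A2, A3)])
print(M.rank())        # 2, so the three matrices are linearly dependent
print(M.nullspace())   # one vector proportional to (-2, 3, 1): -2*A1 + 3*A2 + A3 = 0
assert A3 == 2 * A1 - 3 * A2
```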
Let \(A, B \in M_2 (\mathbb{R})\). Define \(\langle A,B\rangle = a_{11}b_{11}+2a_{12}b_{12}+3a_{21}b_{21}\). Does this define an inner product on \(M_2 (\mathbb{R})\)?
Instead, let \(\langle A,B\rangle = a_{11} + b_{22}\). Does this define an inner product on \(M_2(\mathbb{R})\)?
Let \(p=a_0 + a_1 x + a_2 x^2\) and \(q=b_0 + b_1 x + b_2 x^2\). Define \(\langle p,q\rangle = \sum_{i=0}^{2}(i+1)a_i b_i\). Does this define an inner product on \(P_2\)?
The transition matrix from a given basis \(\mathcal{B} = \left\{{\mathbf{b}_i}\right\}_{i=1}^n\) to the standard basis is given by \[\begin{align*} A\mathrel{\vcenter{:}}= \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \end{bmatrix} ,\end{align*}\] and the transition matrix from the standard basis to \(\mathcal{B}\) is \(A^{-1}\).
Given a notion of orthogonality for vectors, we can extend this to matrices. A square matrix \(Q\) is said to be orthogonal iff \(QQ^T = Q^TQ = I\). For rectangular matrices, we have the following characterizations: \[\begin{align*} QQ^T = I \implies &\text{The rows of } Q \text { are orthonormal,} \\ Q^TQ = I \implies &\text{The columns of } Q \text{ are orthonormal.} \end{align*}\]
To remember which condition is which, just recall that matrix multiplication \(AB\) takes the inner product between the rows of \(A\) and the columns of \(B\). So if, for example, we want to inspect whether or not the columns of \(Q\) are orthogonal, we should let \(B=Q\) in the above formulation – then we just note that the rows of \(Q^T\) are indeed the columns of \(Q\), so \(Q^TQ\) computes the inner products between all pairs of the columns of \(Q\) and stores them in a matrix.
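A tiny numpy illustration of the rectangular case (the matrix \(Q\) here is a made-up example with orthonormal columns but too few of them to have orthonormal rows):

```python
import numpy as np

Q = np.array([[1, 0],
              [0, 1],
              [0, 0]], dtype=float)

print(np.allclose(Q.T @ Q, np.eye(2)))   # True: the columns are orthonormal
print(np.allclose(Q @ Q.T, np.eye(3)))   # False: the rows are not
```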
An orthogonal projection \(P\) induces a decomposition \[\begin{align*} \operatorname{dom}(P) = \ker(P) \oplus \ker(P)^\perp .\end{align*}\]
Distance from a point \(\mathbf{p}\) to a line \(\mathbf{a} + t\mathbf{b}\): let \(\mathbf{w} = \mathbf{p} - \mathbf{a}\); then the distance is \({\left\lVert {\mathbf{w} - \operatorname{Proj}_{\mathbf{b}}(\mathbf{w})} \right\rVert}\).
\[\begin{align*} \operatorname{Proj}_{\operatorname{range}(A)}(\mathbf{x}) = A(A^t A)^{-1} A^t \mathbf{x} .\end{align*}\]
Mnemonic: \[\begin{align*} P \approx {A A^t \over A^t A} .\end{align*}\]
With an inner product in hand and a notion of orthogonality, we can define a notion of orthogonal projection of one vector onto another, and more generally of a vector onto a subspace spanned by multiple vectors.
Say we have two vectors \(\mathbf{x}\) and \(\mathbf{y}\), and we want to define “the component of \(\mathbf{x}\) that lies along \(\mathbf{y}\)”, which we’ll call \(\mathbf{p}\). We can work out what the formula should be using a simple model:
We notice that whatever \(\mathbf{p}\) is, it will lie in the direction of \(\mathbf{y}\), and thus \(\mathbf{p} = \lambda \widehat{\mathbf{y}}\) for some scalar \(\lambda\), where in fact \(\lambda = {\left\lVert {\mathbf{p}} \right\rVert}\) since \({\left\lVert {\widehat{\mathbf{y}}} \right\rVert} = 1\). We will find that \(\lambda = {\left\langle {\mathbf{x}},~{\widehat{\mathbf{y}}} \right\rangle}\), and so \[\begin{align*} \mathbf{p} = {\left\langle {\mathbf{x}},~{\widehat{\mathbf{y}}} \right\rangle}\widehat{\mathbf{y}} = \frac{{\left\langle {\mathbf{x}},~{\mathbf{y}} \right\rangle}}{{\left\langle {\mathbf{y}},~{\mathbf{y}} \right\rangle}} \mathbf{y} .\end{align*}\]
Notice that we can then form a “residual” vector \(\mathbf{r} = \mathbf{x} - \mathbf{p}\), which should satisfy \(\mathbf{r} ^\perp\mathbf{p}\). If we were to let \(\lambda\) vary as a function of a parameter \(t\) (making \(\mathbf{r}\) a function of \(t\) as well) we would find that this particular choice minimizes \({\left\lVert {\mathbf{r} (t)} \right\rVert}\).
In general, supposing one has a subspace \(S = \mathrm{span}\left\{{\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_n}\right\}\) and (importantly!) the \(\mathbf{y}_i\) are orthogonal, then the projection \(\mathbf{p}\) of \(\mathbf{x}\) onto \(S\) is given by the sum of the projections onto each basis vector, yielding
\[\begin{align*} \mathbf{p} = \sum_{i=1}^n \frac{{\left\langle {\mathbf{x}},~{\mathbf{y}_i} \right\rangle}}{{\left\langle {\mathbf{y}_i},~{\mathbf{y}_i} \right\rangle}} \mathbf{y}_i = \sum_{i=1}^n {\left\langle {\mathbf{x}},~{\mathbf{y}_i} \right\rangle} \widehat{\mathbf{y}_i} .\end{align*}\]
Note: this is part of why having an orthogonal basis is desirable!
Letting \(A = [\mathbf{y}_1, \mathbf{y}_2, \cdots]\), the following matrix projects vectors onto \(S\), expressing them in terms of the basis \(\mathbf{y}_i\): \[\begin{align*} \tilde P_A = (A^TA)^{-1}A^T, \end{align*}\]
while this matrix performs the projection and expresses the result in terms of the standard basis: \[\begin{align*} P_A = A(A^TA)^{-1}A^T. \end{align*}\]
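A small numpy check of the projection formula \(P = A(A^TA)^{-1}A^T\); the spanning vectors and the vector being projected are arbitrary choices for illustration:

```python
import numpy as np

# Columns of A span the plane S = span{(1,0,1), (0,1,0)} in R^3.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T   # projection onto col(A), in the standard basis
x = np.array([1.0, 2.0, 3.0])
p = P @ x                               # the projection of x
r = x - p                               # the residual

print(np.allclose(A.T @ r, 0))          # True: the residual is orthogonal to col(A)
print(np.allclose(P @ P, P))            # True: P is idempotent, as a projection should be
```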
Equation of a plane: given a point \(\mathbf{p}_0\) on a plane and a normal vector \(\mathbf{n}\), any vector \(\mathbf{x}\) on the plane satisfies \[\begin{align*} {\left\langle {\mathbf{x} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle} = 0 \end{align*}\]
To find the distance between a point \(\mathbf{a}\) and a plane through \(\mathbf{p}_0\) with normal \(\mathbf{n}\), project \(\mathbf{a} - \mathbf{p}_0\) onto the subspace spanned by the normal: \[\begin{align*} d = {{\left\lvert {{\left\langle {\mathbf{a} - \mathbf{p}_0},~{\mathbf{n}} \right\rangle}} \right\rvert} \over {\left\lVert {\mathbf{n}} \right\rVert}} .\end{align*}\]
One important property of projections is that for any vector \(\mathbf{v}\) and for any subspace \(S\), we have \(\mathbf{v} - P_S(\mathbf{v}) \in S^\perp\). Moreover, if \(\mathbf{v} \in S^\perp\), then \(P_S(\mathbf{v})\) must be zero. This follows by noting that in the projection formula above, every inner product appearing in the sum vanishes, by definition of \(\mathbf{v} \in S^\perp\), and so the projection is zero.
\(\mathbf{x}\) is a least squares solution to \(A\mathbf{x} = \mathbf{b}\) iff \[\begin{align*} A^t A \mathbf{x} = A^t \mathbf{b} \end{align*}\]
The general setup here is that we would like to solve \(A\mathbf{x} = \mathbf{b}\) for \(\mathbf{x}\), where \(\mathbf{b}\) is not in fact in the range of \(A\). We thus settle for a unique “best” solution \(\tilde{\mathbf{x}}\) such that the error \({\left\lVert {A\tilde{\mathbf{x}} - \mathbf{b}} \right\rVert}\) is minimized.
Geometrically, the solution is given by projecting \(\mathbf{b}\) onto the column space of \(A\). To see why this is the case, define the residual vector \(\mathbf{r} = A\tilde{\mathbf{x}} - \mathbf{b}\). We then seek to minimize \({\left\lVert {\mathbf{r}} \right\rVert}\), which happens exactly when \(\mathbf{r} ^\perp\operatorname{im}A\). But this happens exactly when \(\mathbf{r} \in (\operatorname{im}A)^\perp\), which by the fundamental subspaces theorem, is equivalent to \(\mathbf{r} \in \ker A^T\).
From this, we get the equation \[\begin{align*} A^T \mathbf{r} = \mathbf{0} \\ \implies A^T(A \tilde{\mathbf{x}} - \mathbf{b}) = \mathbf{0}\\ \implies A^TA\tilde{\mathbf{x}} = A^T \mathbf{b}, \end{align*}\]
where the last line is described as the normal equations.
If \(A\) is an \(m\times n\) matrix and is of full rank, so it has \(n\) linearly independent columns, then one can show that \(A^T A\) is nonsingular, and we thus arrive at the least-squares solution \[\begin{align*} \tilde{\mathbf{x}} = (A^TA)^{-1}A^T \mathbf{b} \hfill\blacksquare \end{align*}\]
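A quick numerical sketch comparing the normal equations with numpy's built-in least-squares routine, on made-up data for a line fit:

```python
import numpy as np

# Overdetermined system: fit y = c0 + c1*t to four data points (made-up data).
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
A = np.column_stack([np.ones_like(t), t])      # design matrix

# Solve the normal equations A^T A x = A^T b directly ...
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# ... and compare with the library least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(x_normal, x_lstsq))          # True
```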
These equations can also be derived explicitly using Calculus applied to matrices, vectors, and inner products. This requires the use of the following formulas: \[\begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{a}} \right\rangle} &= \mathbf{a} \\ {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{A}\mathbf{x}} \right\rangle} &= (A+A^T)\mathbf{x} \end{align*}\]
as well as the adjoint formula \[\begin{align*} {\left\langle {A\mathbf{x}},~{\mathbf{x}} \right\rangle} = {\left\langle {\mathbf{x}},~{A^T \mathbf{x}} \right\rangle}. .\end{align*}\]
From these, by letting \(A=I\) we can derive \[\begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\lVert {\mathbf{x}} \right\rVert}^2 = {\frac{\partial }{\partial \mathbf{x}}\,} {\left\langle {\mathbf{x}},~{\mathbf{x}} \right\rangle} = 2\mathbf{x}\\ .\end{align*}\]
The derivation proceeds by solving the equation \[\begin{align*} {\frac{\partial }{\partial \mathbf{x}}\,} {\left\lVert {\mathbf{b} - A\mathbf{x}} \right\rVert}^2 = \mathbf{0}. .\end{align*}\]
Every square matrix over \({\mathbb{C}}\) is similar to a matrix in Jordan canonical form.
Gram-Schmidt is often used to find an orthonormal basis for, say, the range of some matrix \(A\). With a small modification to this algorithm, we can write \(A = QR\) where \(R\) is upper triangular and \(Q\) has orthonormal columns.
Why is this useful? One reason is that this also allows for a particularly simple expression of least-squares solutions. If \(A=QR\) and \(A\) has linearly independent columns, then \(R\) is invertible, and a bit of algebraic manipulation shows that \[\begin{align*} \tilde{\mathbf{x}} = R^{-1}Q^T\mathbf{b}. \end{align*}\]
How does it work? You simply perform Gram-Schmidt to obtain \(\left\{{\mathbf{u}_i}\right\}\), then \[\begin{align*}Q = [\mathbf{u}_1, \mathbf{u}_2, \cdots ].\end{align*}\]
The matrix \(R\) can then be written as
\[\begin{align*} r_{ij} = \begin{cases} {\left\langle {\mathbf{u}_i},~{\mathbf{x}_j} \right\rangle}, & i\leq j, \\ 0, & \text{else}. \end{cases} \end{align*}\]
Explicitly, this yields the matrix \[\begin{align*} R = \begin{bmatrix} {\left\langle {\mathbf{u}_1},~{\mathbf{x}_1} \right\rangle} & {\left\langle {\mathbf{u}_1},~{\mathbf{x}_2} \right\rangle} & {\left\langle {\mathbf{u}_1},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ 0 & {\left\langle {\mathbf{u}_2},~{\mathbf{x}_2} \right\rangle} & {\left\langle {\mathbf{u}_2},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ 0 & 0 & {\left\langle {\mathbf{u}_3},~{\mathbf{x}_3} \right\rangle} & \cdots & \\ \vdots & \vdots & \vdots & \ddots \\ \end{bmatrix} \end{align*}\]
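A numpy sketch of using QR for least squares (numpy's qr computes the reduced factorization by default; the data here are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)                  # reduced QR: Q is 3x2 with orthonormal columns
x_qr = np.linalg.solve(R, Q.T @ b)      # solve R x = Q^T b (R is upper triangular)

x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_qr, x_lstsq))       # True: both give the least-squares solution
```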
Textbook: Leon, Linear Algebra with Applications
\[\begin{align*} p(y)y' = q(x) && \hspace{10em} \text{separable} \\ \\ y'+p(x)y = q(x) && \text{integrating factor} \\ \\ y' = f(x,y), f(tx,ty) = f(x,y) && y = xV(x)\text{ COV reduces to separable} \\ \\ y' +p(x)y = q(x)y^n && \text{Bernoulli, divide by $y^n$ and COV $u = y^{1-n}$} \\ \\ M(x,y)dx + N(x,y)dy = 0 && M_y = N_x: \phi(x,y) = c (\phi_x =M, \phi_y = N) \\ \\ P(D)y = f(x,y) && x^ke^{rx} \text{ for each root } \end{align*}\]
where a pair of complex roots \(z = a \pm bi\) yields the real solutions \(e^{ax}\cos bx,\ e^{ax}\sin bx\).
General form: \[\begin{align*} y^{(n)} + c_{n-1} y^{(n-1)} + \cdots + c_2y'' + c_1y' + c_0y = 0 \\ p(D)y = \prod (D-r_i)^{m_i} y= 0 \end{align*}\] where \(p\) is a polynomial in the differential operator \(D\) with roots \(r_i\):
Real roots: contribute \(m_i\) solutions of the form \[\begin{align*}e^{rx}, xe^{rx}, \cdots, x^{m_i-1}e^{rx}\end{align*}\]
Complex conjugate roots: for \(r=a+bi\), contribute \(2m_i\) solutions of the form \[\begin{align*}e^{(a\pm bi)x}, xe^{(a\pm bi)x}, ~\cdots,~ x^{m_i-1}e^{(a\pm bi)x} \\ = e^{ax}\cos(bx), e^{ax}\sin(bx),~ xe^{ax}\cos(bx), xe^{ax}\sin(bx),~ \cdots,~ \end{align*}\]
Example: by cases, for a second order equation of the form \[\begin{align*}ay'' + by' + cy = 0:\end{align*}\]

- Two distinct real roots: \(c_1 e^{r_1 x} + c_2 e^{r_2 x}\)
- One repeated real root: \(c_1 e^{rx} + c_2 x e^{rx}\)
- Complex conjugates \(\alpha \pm i \beta\): \(e^{\alpha x}(c_1 \cos \beta x + c_2 \sin \beta x)\)
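These cases can be sanity-checked with sympy's ODE solver; a minimal sketch (the two equations are arbitrary examples of the distinct-real-root and complex-root cases, and sympy's printed form may differ slightly):

```python
from sympy import Function, dsolve, symbols, Eq

x = symbols('x')
y = Function('y')

# Distinct real roots: y'' - 3y' + 2y = 0 has characteristic roots r = 1, 2.
print(dsolve(Eq(y(x).diff(x, 2) - 3*y(x).diff(x) + 2*y(x), 0), y(x)))
# expected: y(x) = C1*exp(x) + C2*exp(2*x)

# Complex roots: y'' + 2y' + 5y = 0 has roots -1 +- 2i.
print(dsolve(Eq(y(x).diff(x, 2) + 2*y(x).diff(x) + 5*y(x), 0), y(x)))
# expected: y(x) = (C1*sin(2*x) + C2*cos(2*x))*exp(-x)
```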
General form: \[\begin{align*} y^{(n)} + c_{n-1} y^{(n-1)} + \cdots + c_2y'' + c_1y' + c_0y = F(x) \\ p(D)y = \prod (D-r_i)^{m_i} y = F(x) \end{align*}\]
Then solutions are of the form \(y_c + y_p\), where \(y_c\) is the solution to the associated homogeneous system and \(y_p\) is a particular solution.
Methods of obtaining particular solutions
Useful Annihilators: \[\begin{align*} &F(x) = p(x): & D^{\deg(p)+1} \\ &F(x) = p(x)e^{ax}: & (D-a)^{\deg(p)+1}\\ &F(x) = \cos(ax) + \sin(ax): & D^2 + a^2\\ &F(x) = e^{ax}(a_0\cos(bx) + b_0\sin(bx)): & (D-z)(D-{\overline{{z}}}) = D^2 -2aD + a^2 + b^2 \\ &F(x) = p(x)e^{ax}\cos(bx) + q(x)e^{ax}\sin(bx): & \left( (D-z)(D-{\overline{{z}}})\right)^{\max(\deg(p), \deg(q))+ 1} \end{align*}\] where \(z = a + bi\).
General form: \[\begin{align*} \frac{\partial \mathbf{x}(t) }{\partial t} = A\mathbf{x}(t) + \mathbf{b}(t) \iff \mathbf{x}'(t) = A\mathbf{x}(t) + \mathbf{b}(t) \end{align*}\]
General solution to homogeneous equation: \[\begin{align*} c_1\mathbf{x_1}(t) + c_2\mathbf{x_2}(t)+ \cdots +c_n\mathbf{x_n}(t) = \mathbf{X}(t)\mathbf{c} \end{align*}\]
If \(A\) is a matrix of constants: \(\mathbf{x}(t) = e^{\lambda_i t}~\mathbf{v}_i\) is a solution for each eigenvalue/eigenvector pair \((\lambda_i, \mathbf{v}_i)\). If \(A\) is defective, you'll need generalized eigenvectors.
Inhomogeneous Equation: particular solutions given by \[\begin{align*} \mathbf{x}_p(t) = \mathbf{X}(t) \int^t \mathbf{X}^{-1}(s)\mathbf{b}(s) ~ds \end{align*}\]
Definitions: \[\begin{align*} H_ { a } ( t ) \mathrel{\vcenter{:}}=\left\{ \begin{array} { l l } { 0 , } & { 0 \leq t < a } \\ { 1 , } & { t \geq a } \end{array} \right.\\ \delta(t): \int_{\mathbb{R}}\delta(t-a)f(t)~dt &= f(a),\quad \int_{\mathbb{R}}\delta(t-a)~dt = 1\\ (f \ast g )(t) &= \int_0^t f(t-s)g(s)~ds \\ L[f(t)] &= L[f] =\int_0^\infty e^{-st}f(t)dt = F(s) .\end{align*}\] Useful property: for \(a\leq b\), \(H_a(t) - H_b(t) = \indic{[a,b]}\). \[\begin{align*} t^n, n\in{\mathbb{N}}\quad&\iff &\frac{n!}{s^{n+1}},\quad &s > 0 \\ t^{-\frac{1}{2}} \quad&\iff &\sqrt{\pi} s^{-\frac{1}{2}}\quad & s>0\\ e^{at} \quad&\iff &\frac{1}{s-a},\quad &s > a \\ \cos(bt) \quad&\iff &\frac{s}{s^2+b^2},\quad &s>0 \\ \sin(bt) \quad&\iff &\frac{b}{s^2+b^2},\quad &s>0 \\ \cosh(bt) \quad&\iff &\frac{s}{s^2 - b^2},\quad &s > {\left\lvert {b} \right\rvert} \\ \sinh(bt) \quad&\iff &\frac{b}{s^2-b^2},\quad &s > {\left\lvert {b} \right\rvert} \\ \delta(t-a) \quad&\iff &e^{-as} \quad& \\ H_a(t) \quad&\iff &s^{-1}e^{-as}\quad& \\ e^{at}f(t) \quad&\iff &F(s-a)\quad & \\ H_a(t)f(t-a) \quad&\iff &e^{-as}F(s)& \\ f'(t) \quad&\iff & sL(f) - f(0) & \\ f''(t) \quad&\iff &s^2L(f) -sf(0) - f'(0) &\\ f^{(n)}(t) \quad&\iff & s^nL(f) - \sum_{i=0}^{n-1} s^{n-1-i}f^{(i)}(0) & \\ (f \ast g)(t) \quad&\iff &F(s)G(s)\quad & \end{align*}\]
\[\begin{align*} L[e^{at} f(t)] = \int_0^\infty e^{(a-s)t}f(t)dt = F(s-a), .\end{align*}\]
The general technique for solving differential equations with Laplace Transforms:

- Take the Laplace Transform of all terms on both sides.
- Solve for \(L[y]\) in terms of \(s\).
- Attempt an inverse Laplace Transform.
- This may involve partial fraction decomposition, completing the square, and splitting numerators to match terms with known inverse transforms.
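A few of the table entries and the shift rule can be checked with sympy's `laplace_transform` (a sketch; the exact printed form of each result may differ from the comments):

```python
from sympy import symbols, laplace_transform, inverse_laplace_transform, sin, exp, Heaviside

t, s, a, b = symbols('t s a b', positive=True)

print(laplace_transform(exp(a*t), t, s, noconds=True))         # 1/(s - a), valid for s > a
print(laplace_transform(sin(b*t), t, s, noconds=True))         # b/(b**2 + s**2)
print(laplace_transform(Heaviside(t - a), t, s, noconds=True)) # exp(-a*s)/s

# The shift rule L[e^{at} f(t)] = F(s - a), checked on f(t) = sin(bt):
print(laplace_transform(exp(a*t)*sin(b*t), t, s, noconds=True))  # b/(b**2 + (s - a)**2)

# Inverting recovers the original function (times a Heaviside factor):
print(inverse_laplace_transform(1/(s - a), s, t))              # exp(a*t)*Heaviside(t)
```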
For a collection of \(n\) functions \(f_i: {\mathbb{R}}\to {\mathbb{R}}\), define the \(n\times 1\) column vector \[\begin{align*} W(f_i)(\mathbf{p}) \mathrel{\vcenter{:}}= \begin{bmatrix} f_i(\mathbf{p}) \\ f_i'(\mathbf{p}) \\ f_i''(\mathbf{p}) \\ \vdots \\ f_i^{(n-1)}(\mathbf{p}) \end{bmatrix} .\end{align*}\]
The Wronskian of this collection is defined as \[\begin{align*} W(f_1, \cdots, f_n)(\mathbf{p}) \mathrel{\vcenter{:}}= \det \begin{bmatrix} \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ W(f_1)(\mathbf{p}) & W(f_2)(\mathbf{p}) & \cdots & W(f_n)(\mathbf{p})\\ \rule[-1ex]{0.5pt}{2.5ex}& \rule[-1ex]{0.5pt}{2.5ex}& & \rule[-1ex]{0.5pt}{2.5ex}\\ \end{bmatrix} .\end{align*}\]
If \(\exists x_0 \in I: W(x_0) \neq 0\), then the set of functions \(\left\{{f_i}\right\}\) is linearly independent on \(I\). (The converse fails; see below.)
\(W \equiv 0\) on \(I\) does not imply that \(\left\{{f_i}\right\}\) is linearly dependent! Counterexample: \(\left\{{x^2, x{\left\lvert {x} \right\rvert}}\right\}\) on \({\mathbb{R}}\): here \(W \equiv 0\), but the two functions are linearly independent. (The set \(\left\{{x, x+x^2, 2x-x^2}\right\}\) also has \(W \equiv 0\), but only because it is genuinely dependent: \(3(x) = (x+x^2) + (2x-x^2)\).)
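A sympy sketch of the Wronskian computations (the exponential pair is an arbitrary example of an independent set with nonvanishing Wronskian):

```python
from sympy import symbols, exp, simplify, wronskian

x = symbols('x')

# The dependent polynomial set mentioned above has W identically zero:
print(simplify(wronskian([x, x + x**2, 2*x - x**2], x)))   # 0

# Exponentials with distinct rates are independent; their Wronskian never vanishes:
print(simplify(wronskian([exp(x), exp(2*x)], x)))          # exp(3*x)
```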
The standard form of such equations is \[\begin{align*} y^{(n)} + a_1y^{(n-1)} + a_2y^{(n-2)} + \cdots + a_{n-2}y'' + a_{n-1}y' + a_ny = F(x). \end{align*}\]
All solutions will be the sum of the solution to the associated homogeneous equation and a single particular solution.
In the homogeneous case, examine the discriminant of the characteristic polynomial. Three cases arise:
That is, every real root contributes a term of \(ce^{rx}\), while a multiplicity of \(m\) multiplies the solution by a polynomial in \(x\) of degree \(m-1\).
Every pair of complex roots \(r \pm i\omega\) contributes a term \(e^{rx}(a\cos \omega x + b\sin \omega x)\), where \(r\) is the real part of the roots and \(\omega\) is the imaginary part.
In the nonhomogeneous case, assume a solution in the most general form of \(F(x)\), and substitute it into the equation to solve for the constants. For example, for \(F(x) = e^{2x}\) guess \(y_p = Ae^{2x}\); for \(F(x) = \cos(3x)\) guess \(y_p = A\cos(3x) + B\sin(3x)\).
Annihilator method: apply an annihilator of \(F(x)\) to reduce a nonhomogeneous equation to a homogeneous one, written as a polynomial in the operator \(D\).
\(F(x)\) of the form \(e^{ax}\sin(kx)\) can be handled by writing it as the imaginary part of \(e^{(a+ki)x}\).
Burnside’s Lemma
Cauchy’s Theorem
The Sylow Theorems
Galois Theory
More terms: http://mathroughguides.wikidot.com/glossary:abstract-algebra
Order \(p\): One, \(Z_p\)
Order \(p^2\): Two abelian groups, \(Z_{p^2}, Z_p^2\)
Order \(p^3\):
3 abelian: \(Z_{p^3},\ Z_p \times Z_{p^2},\ Z_p^3\);
2 nonabelian: for odd \(p\), \(Z_{p^2} \rtimes Z_p\) and the Heisenberg group \((Z_p \times Z_p) \rtimes Z_p\); for \(p = 2\), the dihedral group \(D_4\) and the quaternion group \(\mathbf{Q}\).
Order \(pq\) (for primes \(p < q\)): if \(p \nmid q-1\), only \(Z_{pq}\); if \(p \mathrel{\Big|}q-1\), there is also exactly one nonabelian group, \(Z_q \rtimes Z_p\).
Every element in a permutation group is a product of disjoint cycles, and its order is the lcm of the lengths of those cycles.
The product ideal \(IJ\) is not just elements of the form \(ij\), it is all sums of elements of this form! The product alone isn’t enough.
The intersection of any number of ideals is also an ideal
\[\begin{align*} C(x) = && \left\{{g\in G : gxg^{-1} = x}\right\} && \subseteq G && \text{Centralizer} \\ C_G(x) = && \left\{{gxg^{-1} : g\in G}\right\} && \subseteq G && \text{Conjugacy Class} \\ Gx = && \left\{{g.x : g\in G}\right\} && \subseteq X && \text{Orbit} \\ x_0 = && \left\{{g\in G : g.x = x}\right\} && \subseteq G && \text{Stabilizer} \\ Z(G) = && \left\{{x\in G: \forall g\in G,~ gxg^{-1} = x}\right\} && \subseteq G && \text{Center} \\ \mathrm{Inn}(G) = && \left\{{\phi_g(x) = gxg^{-1} }\right\} && \subseteq {\operatorname{Aut}}(G) && \text{Inner Aut.} \\ \mathrm{Out}(G) = && {\operatorname{Aut}}(G) / \mathrm{Inn}(G) && && \text{Outer Aut.} \\ N(H) = && \left\{{g\in G: gHg^{-1} = H}\right\} && \subseteq G && \text{Normalizer} \end{align*}\]
Notation: \(H < G\) a subgroup, \(N < G\) a normal subgroup, concatenation is a generic group operation.
\({\mathbb{Z}}_n\) the unique cyclic group of order \(n\)
\(\mathbf{Q}\) the quaternion group
\(G^n = G\times G \times \cdots \times G\)
\(Z(G)\) the center of \(G\)
\(o(G)\) the order of a group
\(S_n\) the symmetric group
\(A_n\) the alternating group
\(D_n\) the dihedral group of order \(2n\)
Group Axioms
Definitions:
One step subgroup test: \[\begin{align*} a,b \in H \implies a b^{-1} \in H \\ .\end{align*}\]
Useful isomorphism invariants:
Useful homomorphism invariants
Classification of Finite Abelian Groups \[\begin{align*} G \cong {\mathbb{Z}}_{p_1^{k_1}} \oplus {\mathbb{Z}}_{p_2^{k_2}} \oplus \cdots \oplus {\mathbb{Z}}_{p_n^{k_n}} ,\end{align*}\] where the \(p_i^{k_i}\) are the elementary divisors of \(G\) (the \(p_i\) need not be distinct).
Isomorphism Theorems
\[\begin{align*} \phi: G \to G' \implies && \frac{G}{\ker{\phi}} \cong &~ \phi(G) \\ H {~\trianglelefteq~}G,~ K < G \implies && \frac{K}{H\cap K} \cong &~ \frac{HK}{H} \\ H,K {~\trianglelefteq~}G,~ K < H \implies && \frac{G/K}{H/K} \cong &~ \frac{G}{H} \end{align*}\]
Lagrange’s Theorem: \(H < G \implies o(H) \mathrel{\Big|}o(G)\)
The \(GZ\) Theorem: \(G/Z(G)\) cyclic implies that \(G \in \mathbf{Ab}\).
Orbit Stabilizer Theorem: \(G / x_0 \cong Gx\)
The Class Equation
\[\begin{align*} {\left\lvert {G} \right\rvert} = {\left\lvert {Z(G)} \right\rvert} + \sum_{x_i} \left[ G: C(x_i) \right] ,\end{align*}\] where the sum runs over representatives \(x_i\) of the conjugacy classes \([x_i] = C_G(x_i)\) with more than one element, and \(\left[ G: C(x_i)\right] = {\left\lvert {C_G(x_i)} \right\rvert}\) is the size of the class.
Useful facts:
Sylow Subgroups:
Big List of Interesting Groups
Chinese Remainder Theorem: \[\begin{align*} {\mathbb{Z}}_{pq} \cong {\mathbb{Z}}_p \oplus {\mathbb{Z}}_q \iff (p,q) = 1 \end{align*}\]
Finding all of the unique groups of a given order: #todo
\[\begin{align*} (a, b) \mathrel{\vcenter{:}}=\gcd(a, b) && \text{the greatest common divisor} \\ {\mathbb{Z}}_n && \text{the ring of integers} \mod n \\ {\mathbb{Z}}_n^{\times}&& \text{the group of units}\mod n .\end{align*}\]
A function \(f:{\mathbb{Z}}\to {\mathbb{Z}}\) is said to be multiplicative iff \[\begin{align*} (a, b) = 1 \implies f(ab) = f(a) f(b) .\end{align*}\]
Every integer \(n > 1\) can be written uniquely as \[\begin{align*} n = \prod_{i=1}^m p_i^{k_i} \end{align*}\] where the \(p_i\) are the \(m\) distinct prime divisors of \(n\).
Note that the number of distinct prime factors is \(m\), while the total number of factors is \(\prod_{i=1}^m(k_i + 1)\).
\[\begin{align*} a{~\Bigm|~}b \iff b \equiv 0 \mod a \iff \exists k \text{ such that } ak = b \end{align*}\]
\(\gcd(a, b)\) can be computed by taking prime factorizations of \(a\) and \(b\), intersecting the primes occurring, and taking the lowest exponent that appears. Dually, \(\operatorname{lcm}(a, b)\) can be computed by taking the union and the highest exponent.
\[\begin{align*} xy = \gcd{(x,y)}~\mathrm{lcm}{(x,y)} \end{align*}\]
If \(d\mathrel{\Big|}x\) and \(d\mathrel{\Big|}y\), then \[\begin{align*} \gcd(x,y) &= d\cdot \gcd\qty{ \frac x d, \frac y d} \\ \operatorname{lcm}(x,y) &= d\cdot \operatorname{lcm}\qty{ \frac x d, \frac y d} \end{align*}\]
\[\begin{align*} \gcd(x, y, z) &= \gcd(\gcd(x,y), z) \\ \gcd(x, y) &= \gcd(x\bmod y, y) \\ \gcd(x,y) &= \gcd(x-y, y) .\end{align*}\]
\(\gcd(a, b)\) can be computed via the Euclidean algorithm: repeatedly replace the larger number by its remainder upon division by the smaller; the last nonzero remainder is \(\gcd(a,b)\).
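A minimal Python sketch of the Euclidean algorithm and its extended version (the inputs are arbitrary):

```python
def gcd(a, b):
    """Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b)."""
    while b:
        a, b = b, a % b
    return a

def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b) (Bezout coefficients)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

print(gcd(462, 1071))                    # 21
g, x, y = extended_gcd(462, 1071)
print(462 * x + 1071 * y == g)           # True: the Bezout identity
lcm = 462 * 1071 // g
print(g * lcm == 462 * 1071)             # True: xy = gcd(x,y) * lcm(x,y)
```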
Generally concerned with the multiplicative group \(({\mathbb{Z}}_n, \times)\).
The system \[\begin{align*} \begin{array} { c } { x \equiv a _ { 1 } \mod m _ { 1 } } \\ { x \equiv a _ { 2 } \mod m _ { 2 } } \\ { \vdots } \\ { x \equiv a _ { r } \mod m _ { r } } \end{array} \end{align*}\]
has a unique solution \(x \mod \prod m_i \iff (m_i, m_j) = 1\) for each pair \(i,j\), given by \[\begin{align*} x = \sum_{j=1}^r a_j \frac{\prod_i m_i}{m_j} \left[ \frac{\prod_i m_i}{m_j} \right]^{-1}_{\mod m_j} .\end{align*}\]
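A short Python sketch of this formula (assumes Python 3.8+ for `pow(N, -1, m)` and `math.prod`; the system solved is the classic \(x \equiv 2, 3, 2 \mod 3, 5, 7\)):

```python
from math import prod

def crt(residues, moduli):
    """Solve x = a_j (mod m_j) for pairwise-coprime moduli; return x mod prod(m_j)."""
    M = prod(moduli)
    x = 0
    for a_j, m_j in zip(residues, moduli):
        N_j = M // m_j
        x += a_j * N_j * pow(N_j, -1, m_j)   # pow(N, -1, m) is the inverse of N mod m
    return x % M

print(crt([2, 3, 2], [3, 5, 7]))   # 23, and indeed 23 = 2 mod 3 = 3 mod 5 = 2 mod 7
```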
For \((a, n) = 1\), \[\begin{align*} a^{\phi(n)} \equiv 1 \mod n .\end{align*}\]
\[\begin{align*} x^{p} &\equiv x \mod p \\ x^{p-1} &\equiv 1 \mod p \quad \text{ if } p \nmid x \end{align*}\]
Consider \(ax + by = c\). This has solutions iff \(c = 0 \mod (a,b) \iff \gcd(a,b) \text{ divides } c\).
If \(x\equiv 0 \mod n\), then \(x\equiv 0 \mod p^k\) for all \(p^k\) appearing in the prime factorization of \(n\).
If there are factors of the modulus in the equation, peel them off with addition, using the fact that \(nk \equiv 0 \mod n\). \[\begin{align*} x &\equiv nk + r \mod n \\ &\equiv r \mod n \end{align*}\]
So take \(x=463, n = 4\), then use \(463 = 4\cdot 115 + 3\) to write \[\begin{align*} 463 &\equiv y \mod 4 \\ \implies 4\cdot 115 + 3 &\equiv y \mod 4 \\ \implies 3&\equiv y\mod 4 .\end{align*}\]
For any \(n\) and any \(d\) dividing \(k\), \[\begin{align*} x^k \mod n \equiv (x^{k/d} \bmod n)^d \mod n .\end{align*}\]
\[\begin{align*} 2^{25} &\equiv (2^5 \mod 5)^5 \mod 5 \\ &\equiv 2^5 \mod 5 \\ &\equiv 2 \mod 5 \end{align*}\]
Make things easier with negatives! For example, \(\mod 5\), \[\begin{align*} 4^{25} &\equiv (-1)^{25} \mod 5\\ &\equiv (-1) \mod 5\\ &\equiv 4 \mod 5 \end{align*}\]
\[\begin{align*} xa = xb \mod n \implies a = b \mod \frac{n}{(x,n)} .\end{align*}\]
\[\begin{align*} x\in {\mathbb{Z}}_n^{\times}\iff (x, n) = 1 ,\end{align*}\] and thus \[\begin{align*} {\mathbb{Z}}_n^\times = \left\{{1\leq x \leq n : (x,n) = 1}\right\} \end{align*}\] and \({\left\lvert {{\mathbb{Z}}_n^\times} \right\rvert} = \phi(n)\).
One can reduce equations by dividing through by a unit. Pick any \(x\) such that \(x{~\Bigm|~}a\) and \(x{~\Bigm|~}b\) with \((x,n) = 1\), then \[\begin{align*} a =b \mod n \implies \frac a x = \frac b x \mod n .\end{align*}\]
\[\begin{align*} \phi(n) = {\left\lvert {\left\{{1\leq x \leq n : (x,n) = 1}\right\}} \right\rvert} \end{align*}\]
\[\begin{align*} \phi(1) & = {\left\lvert {\left\{{1}\right\}} \right\rvert} = 1 \\ \phi(2) & = {\left\lvert {\left\{{1}\right\}} \right\rvert} = 1 \\ \phi(3) & = {\left\lvert {\left\{{1,2}\right\}} \right\rvert} = 2 \\ \phi(4) & = {\left\lvert {\left\{{1,3}\right\}} \right\rvert} = 2 \\ \phi(5) & = {\left\lvert {\left\{{1,2,3,4}\right\}} \right\rvert} = 4 \end{align*}\]
\[\begin{align*} \phi(p) & = p-1 \\ \phi(p^k) & = p^{k-1}(p-1) \\ \phi(n) &= n\prod_{i=1}^{m} \qty{1 - {1\over p_i}} \quad \text{where } p_1, \cdots, p_m \text{ are the distinct primes dividing } n \\ n &= \sum_{\tiny d{~\Bigm|~}n} \phi(d) \end{align*}\]
All numbers less than \(p\) are coprime to \(p\). Among the \(p^k\) numbers \(1, \ldots, p^k\), the only ones not coprime to \(p^k\) are the multiples of \(p\), i.e. \(\left\{{p, 2p, 3p, \cdots, p^k}\right\}\), of which there are \(p^{k-1}\), yielding \(\phi(p^k) = p^k - p^{k-1}\).
Along with the fact that \(\phi\) is multiplicative, so \((p,q) = 1 \implies \phi(pq) = \phi(p)\phi(q)\), compute this for any \(n\) by taking the prime factorization.
With these properties, one can compute: \[\begin{align*} \phi(n) &= \phi\qty{ \prod_i p_i^{k_i}} \\ &= \prod_i p_i^{k_i-1}(p_i-1) \\ &= n \left(\frac{\prod_i (p_i-1)}{\prod_i p_i}\right) \\ &= n\prod_i \qty{ 1 - \frac{1}{p_i}} \end{align*}\]
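A small Python/sympy sketch of this computation (sympy's `factorint` supplies the prime factorization):

```python
from sympy import factorint

def phi(n):
    """Euler's totient via the product formula phi(n) = n * prod(1 - 1/p)."""
    result = n
    for p in factorint(n):          # iterate over the distinct prime factors of n
        result = result // p * (p - 1)
    return result

print(phi(12))                      # 4: the units mod 12 are {1, 5, 7, 11}
print(phi(7**3), 7**2 * 6)          # 294 294, matching phi(p^k) = p^{k-1}(p-1)
```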
\todo[inline]{Check and explain}
\(x\) is a quadratic residue \(\bmod n\) iff there exists an \(a\) such that \(a^2 = x \mod n\).
In \({\mathbb{Z}}_p^{\times}\) for \(p\) an odd prime, exactly half of the elements (the even powers of a generator) are quadratic residues.
For \(p\) an odd prime, \[\begin{align*} -1\text{ is a quadratic residue in } {\mathbb{Z}}_p \iff p \equiv 1 \mod 4 .\end{align*}\]
If \(n\) is prime and \(n \nmid a\), then \[\begin{align*} a^{n-1} \equiv 1 \mod n \end{align*}\]
If \(n\) is prime, then \[\begin{align*} x^2 \equiv 1 \mod n \implies x \equiv \pm 1 \mod n \end{align*}\] (the converse fails, e.g. for \(n = 9\)).
Every bounded sequence has a convergent subsequence.
In \({\mathbb{R}}^n, X\) is compact \(\iff X\) is closed and bounded.
Necessity of \({\mathbb{R}}^n\): \(X = {\mathbb{Z}}\) with the discrete metric \(d(x,y) = 1\) for \(x \neq y\) is closed, complete, and bounded, but not compact, since \(\left\{{1,2,\cdots}\right\}\) has no convergent subsequence.
The converse holds if bounded is replaced with totally bounded: a metric space is compact iff it is complete and totally bounded.
Notation: \(\left\{{a_n}\right\}_{n\in{\mathbb{N}}}\) is a sequence, \(\displaystyle\sum_{i\in{\mathbb{N}}} a_i\) is a series.
A sequence \(\left\{{x_j}\right\}\) converges to \(L\) iff \[\begin{align*} \forall \varepsilon > 0,\, \exists N > 0 \text{ such that } \quad n\geq N \implies {\left\lvert {x_n - L} \right\rvert} < \varepsilon .\end{align*}\]
\[\begin{align*} b_n \leq a_n \leq c_n \text{ and } b_n,c_n \to L \implies a_n \to L \end{align*}\]
If \(\left\{{a_j}\right\}\) monotone and bounded, then \(a_j \to L = \lim\sup a_i < \infty\).
\({\left\lvert {a_m - a_n} \right\rvert} \to 0 \in {\mathbb{R}}\implies \left\{{a_i}\right\}\) converges.
Is the sequence bounded?
Use algebraic properties of limits
Epsilon-delta definition
Algebraic properties and manipulation:
Limits commute with \(\pm, \times, \operatorname{Div}\) and \(\lim C = C\) for constants.
E.g. Divide all terms by \(n\) before taking limit
Clear denominators
A (power) series in \(x\) is a function of the form \[\begin{align*} f(x) = \sum_{j=1}^\infty c_j x^j .\end{align*}\]
\[\begin{align*} \sum_{k=1}^\infty k^p &< \infty &&\iff p < -1 \\ \sum_{k=1}^\infty \frac{1}{k^p} &< \infty &&\iff p > 1 \\ \sum_{k=1}^\infty \frac{1}{k} &= \infty && \end{align*}\]
\[\begin{align*} \sum_{n=1}^\infty \frac{1}{n^2} & < \infty \\ \sum_{n=1}^\infty \frac{1}{n^3} & < \infty \\ \sum_{n=1}^\infty \frac{1}{n^\frac{3}{2}} & < \infty \\ \sum_{n=0}^\infty \frac{1}{n!} & = e \\ \sum_{n=0}^\infty \frac{1}{c^n} & = \frac{c}{c-1} \quad ({\left\lvert {c} \right\rvert} > 1) \\ \sum_{n=0}^\infty (-1)^n \frac{1}{c^n} & = \frac{c}{c+1} \quad ({\left\lvert {c} \right\rvert} > 1) \\ \sum_{n=1}^\infty (-1)^{n+1} \frac{1}{n} & = \ln 2 \end{align*}\]
\[\begin{align*} \sum_{n=1}^\infty \frac{1}{n} = \infty \\ \sum_{n=1}^\infty \frac{1}{\sqrt n} = \infty \end{align*}\]
Useful reference: http://math.hawaii.edu/~ralph/Classes/242/SeriesConvTests.pdf
\(a_n\to 0\) does not imply \(\sum a_n < \infty\). Counterexample: the harmonic series.
Absolute convergence \(\implies\) convergence
\[\begin{align*} \sum a_i \text{ converges } \implies \lim_{i\to\infty} a_i = 0 \end{align*}\] (the contrapositive is the divergence test; the converse fails, as the harmonic series shows).
\[\begin{align*} R =\lim_{n\to\infty} {\left\lvert {\frac{a_{n+1}}{a_n}} \right\rvert} \end{align*}\]
\[\begin{align*} R = \limsup_{n \to \infty} \sqrt[n]{{\left\lvert {a_n} \right\rvert}} \end{align*}\]

- \(R < 1\): convergent
- \(R > 1\): divergent
- \(R = 1\): inconclusive
For \(f\) positive and decreasing with \(f(n) = a_n\), \[\begin{align*} \sum a_n < \infty \iff \int_1^\infty f(x) dx < \infty \end{align*}\]
For \(a_n, b_n > 0\), \[\begin{align*} \lim_{n\to\infty}\frac{a_n}{b_n} = L \text{ with } 0 < L < \infty \implies \qty{\sum a_n < \infty \iff \sum b_n < \infty} \end{align*}\]
\[\begin{align*} a_n \downarrow 0 \implies \sum (-1)^n a_n < \infty \end{align*}\]
\[\begin{align*} \sum_{n=1}^\infty {\left\lVert {f_n} \right\rVert}_\infty < \infty \implies \exists f\text{ such that } {\left\lVert { \sum_{n=1}^\infty f_n - f} \right\rVert}_\infty \to 0 \end{align*}\] In other words, the series converges uniformly.
Slogan: Convergence of the sup norms implies uniform convergence"
The \(M\) in the name comes from defining \(M_n \mathrel{\vcenter{:}}= \sup_x {\left\lvert {f_n(x)} \right\rvert}\) and requiring \(\sum M_n < \infty\).
Some Pattern Recognition:
Use the fact that \[\begin{align*} \lim_{k\to\infty} {\left\lvert {\frac{a_{k+1}x^{k+1}}{a_kx^k}} \right\rvert} = {\left\lvert {x} \right\rvert}\lim_{k\to\infty} {\left\lvert {\frac{a_{k+1}}{a_k}} \right\rvert} < 1 \implies \sum a_k x^k < \infty ,\end{align*}\] so take \(L \mathrel{\vcenter{:}}=\lim_{k\to\infty} \frac{a_{k+1}}{a_k}\) and then obtain the radius as \[\begin{align*} R = \frac{1}{L} = \lim_{k\to\infty} {a_k \over a_{k+1}} \end{align*}\]
A function is continuously differentiable iff \(f\) is differentiable and \(f'\) is continuous.
Conventions:
\[\begin{align*} f && \text{a functional }{\mathbb{R}}^n \to {\mathbb{R}}\\ \mathbf{f} && \text{a function } {\mathbb{R}}^n\to {\mathbb{R}}^m \\ A, E, U, V && \text{open sets} \\ A' && \text{the limit points of }A \\ \mkern 1.5mu\overline{\mkern-1.5muA\mkern-1.5mu}\mkern 1.5mu && \text{the closure of }A \\ A^\circ && \text{the interior of }A \\ K && \text{a compact set} \\ \mathcal{R}_A && \text{the space of Riemann integrable functions on }A \\ C^j(A) && \text{the space of }j\text{ times continuously differentiable functions }f: {\mathbb{R}}^n \to {\mathbb{R}}\\ \left\{{f_n}\right\} && \text{a sequence of functions} \\ \left\{{x_n}\right\} && \text{a sequence of real numbers}\\ f_n \to f && \text{pointwise convergence} \\ f_n \rightrightarrows f && \text{uniform convergence} \\ x_n \nearrow x && x_i\leq x_j \text{ and }x_j\text{ converges to }x \\ x_n \searrow x && x_i\geq x_j \text{ and }x_j\text{ converges to }x \\ \sum_{k\in {\mathbb{N}}} f_k && \text{a series}\\ D(f) && \text{the set of discontinuities of }f .\end{align*}\]
Summary for GRE:
Limits,
Continuity,
Boundedness,
Compactness,
Definitions of topological spaces,
Lipschitz continuity
Sequences and series of functions.
Know the interactions between the following major operations:
The derivative of a continuous function need not be continuous
A continuous function need not be differentiable
A uniform limit of differentiable functions need not be differentiable
A limit of integrable functions need not be integrable
An integrable function need not be continuous
An integrable function need not be differentiable
\[\begin{align*} f,g\text{ continuous on } [a,b] \text{ and differentiable on } (a,b) \implies \exists c\in(a,b) : \left[f ( b ) - f ( a ) \right] g' ( c ) = \left[g ( b ) - g ( a )\right] f' ( c ) \end{align*}\]
?
Then consider the following possible ways to commute various limiting operations:
Does taking the derivative of the integral of a function always return the original function? \[\begin{align*} [\frac{\partial}{\partial x}, \int dx]:\qquad\qquad \frac{\partial}{\partial x}\int f(x, t)dt =_? \int \frac{\partial}{\partial x} f(x, t)dt\\ \text{} \end{align*}\]
Answer: Only under hypotheses: it holds when \(f\) is continuous (by the FTC), but can fail otherwise.
Counterexample: \[\begin{align*} f(x) = \begin{cases} 1 & x > 0 \\ -1 & x \leq 0 \end{cases} \implies \int f \approx {\left\lvert {x} \right\rvert}, \end{align*}\] which is not differentiable. (This is remedied by the so-called “weak derivative”)
Sufficient Condition: If \(f\) is continuous, then both are always equal to \(f(x)\) by the FTC.
Is the derivative of a continuous function always continuous? \[\begin{align*} [\frac{\partial}{\partial x}, \lim_{x_i\to x}]:\qquad\qquad \lim_{x_i \to x} f'(x_n) =_? f'(\lim_{x_i\to x} x) \end{align*}\] Answer: No.
Counterexample: \[\begin{align*} f ( x ) = \left\{ \begin{array} { l l } { x ^ { 2 } \sin ( 1 / x ) } & { \text { if } x \neq 0 } \\ { 0 } & { \text { if } x = 0 } \end{array} \right. \implies f ^ { \prime } ( x ) = \left\{ \begin{array} { l l } { 2 x \sin \left( \frac { 1 } { x } \right) - \cos \left( \frac { 1 } { x } \right) } & { \text { if } x \neq 0 } \\ { 0 } & { \text { if } x = 0 } \end{array} \right. \end{align*}\] which is discontinuous at zero.
Sufficient Condition: There doesn’t seem to be a general one (which is perhaps why we study \(C^k\) functions).
Is the limit of a sequence of differentiable functions differentiable and the derivative of the limit?
\[\begin{align*} [\frac{\partial}{\partial x}, \lim_{f_n \to f}]:\qquad\qquad \lim_{f_n \to f}\frac{\partial}{\partial x}f_n(x) =_? \frac{\partial }{\partial x}\lim_{f_n \to f} f_n(x) \end{align*}\] Answer: Super no – even the uniform limit of differentiable functions need not be differentiable!
Counterexample: \(f_n(x) = \frac{\sin(nx)}{\sqrt{n}} \rightrightarrows f = 0\) but \(f_n' \not\to f' = 0\)
Sufficient Condition: \(f_n \in C^1\), \(f_n(x_0) \to f(x_0)\) for some \(x_0\), and \(f_n' \rightrightarrows g\); then \(f\) is differentiable with \(f' = g\).
Is the limit of a sequence of integrable functions integrable and the integral of the limit?
\[\begin{align*} [\int dx, \lim_{f_n \to f}](f):\qquad\qquad \lim_{f_n \to f}\int f_n(x) dx =_? \int \lim_{f_n \to f} f_n(x) dx \end{align*}\]
Answer: No.
Counterexample: Order \({\mathbb{Q}}\cap[0,1]\) as \(\left\{{q_i}\right\}_{i\in{\mathbb{N}}}\), then take \[\begin{align*} f_n(x) = \sum_{i=1}^n \indic{q_i} \to \indic{{{\mathbb{Q}}\cap[0,1]}} \end{align*}\] where each \(f_n\) integrates to zero (only finitely many discontinuities) but \(f\) is not Riemann-integrable.
Sufficient Condition:

- \(f_n \rightrightarrows f\), or
- \(f\) integrable and \(\exists M: \forall n, {\left\lvert {f_n} \right\rvert} < M\) (\(f_n\) uniformly bounded)
Is the integral of a continuous function also continuous?
\[\begin{align*} [\int dx, \lim_{x_i \to x}]:\qquad\qquad \lim_{x_i \to x} F(x_i) =_? F(\lim_{x_i \to x} x_i) \end{align*}\]
Answer: Yes.
Proof: \(|f(t)| < M\) on \(I\), so given \(c\) pick a sequence \(x\to c\). Then \[\begin{align*} {\left\lvert {f(t)} \right\rvert} < M \implies \left\vert \int_c^x f(t)dt \right\vert \leq \int_c^x M~ dt \implies {\left\lvert {F(x) - F(c)} \right\rvert} \leq M{\left\lvert {x-c} \right\rvert} \to 0 \end{align*}\]
Is the limit of a sequence of continuous functions also continuous?
\[\begin{align*} [\lim_{x_i \to x}, \lim_{f_n \to f}]: \qquad\qquad \lim_{f_n \to f}\lim_{x_i \to x} f(x_i) =_? \lim_{x_i \to x}\lim_{f_n \to f} f_n(x_i)\\ \text{}\\ \end{align*}\]
Answer: No.
Counterexample: \(f_n(x) = x^n\) on \([0,1]\) converges pointwise to \(\indic{x=1}\), which is discontinuous at \(1\).
Sufficient Condition: \(f_n \rightrightarrows f\)
Does a sum of differentiable functions necessarily converge to a differentiable function?
\[\begin{align*} \left[\frac{\partial}{\partial x}, \sum_{f_n}\right]: \qquad\qquad \frac{\partial}{\partial x} \sum_{k=1}^\infty f_k =_? \sum_{k=1}^\infty \frac{\partial}{\partial x} f_k \\ \text{} \\ \text{}\\ \end{align*}\]
Answer: No.
Counterexample: \(f_n(x) = \frac{\sin(nx)}{\sqrt{n}} \rightrightarrows 0 \mathrel{\vcenter{:}}= f\), but \(f_n' = \sqrt{n}\cos(nx) \not\to 0 = f'\) (at, say, \(x=0\))
Sufficient Condition: When \(f_n \in C^1, \exists x_0: f_n(x_0) \to f(x_0)\) and \(\sum {\left\lVert {f_n'} \right\rVert}_\infty < \infty\) (continuously differentiable, converges at a point, and the derivatives absolutely converge)
\[\begin{align*} f\text{ continuous } \iff \lim_{x \to p} f(x) = f(p) \end{align*}\]
\[\begin{align*} f:(X, d_X) \to (Y, d_Y) \text{ continuous } \iff \forall \varepsilon,~ \exists \delta \mathrel{\Big|}~ d_X(x,y) < \delta \implies d_Y(f(x), f(y)) < \varepsilon \end{align*}\]
\[\begin{align*} f(x) = \sin\qty{ \frac{1}{x} } \implies 0\in D(f) \end{align*}\]
The Dirichlet function is nowhere continuous: \[\begin{align*} f(x) = \indic{{\mathbb{Q}}} \end{align*}\]
The following function is continuous at infinitely many points and discontinuous at infinitely many points: \[\begin{align*} f(x) = \begin{cases} 0 & x\in{\mathbb{R}}\setminus{\mathbb{Q}}\\ \frac{1}{q} & x = \frac{p}{q} \in {\mathbb{Q}} \end{cases} \end{align*}\] Then \(f\) is discontinuous on \({\mathbb{Q}}\) and continuous on \({\mathbb{R}}\setminus{\mathbb{Q}}\).
\(f\) is discontinuous on \({\mathbb{Q}}\):
\(f\) is continuous on \({\mathbb{R}}\setminus{\mathbb{Q}}\):
There are no functions that are continuous on \({\mathbb{Q}}\) but discontinuous on \({\mathbb{R}}-{\mathbb{Q}}\)
A continuous function on a compact space attains its extrema.
\[\begin{align*} f'(p) \mathrel{\vcenter{:}}=\frac{\partial f}{\partial x}(p) = \lim_{x\to p} \frac{f(x) - f(p)}{x-p} \end{align*}\]
\[\begin{align*} C^\infty \subsetneq C^k \subsetneq \text{ differentiable } \subsetneq C^0 \subset \mathcal{R}_K .\end{align*}\]
Proof that \(f\) differentiable \(\implies f \in C^0\): \[\begin{align*} f(x) - f(p) = \frac{f(x)-f(p)}{x-p}(x-p) \xrightarrow{x\to p} f'(p)\cdot 0 = 0 ,\end{align*}\] since the difference quotient tends to \(f'(p)\) while \((x-p) \to 0\).
Bold are assumed hypothesis, regular text is the strongest conclusion you can reach, strikeout denotes implications that aren’t necessarily true.
\[\begin{align*} f' && f && \therefore f && F \\ \hline \\ \cancel{\text{exists}} && \mathbf{continuous} && \text{K-integrable} && \text{exists} \\ \cancel{\text{continuous}} && \mathbf{differentiable} && \text{continuous} && \text{exists} \\ \cancel{\text{exists}} && \mathbf{integrable} && \cancel{\text{continuous}} && \text{differentiable} \\ \end{align*}\]
Explanation of items in table:
If \(F\) is a differentiable function on the interval \([a,b]\), and \(F'\) is bounded and continuous a.e., then \(F' \in L_R([a, b])\) and \[\begin{align*} \forall x\in [a,b]: \int_a^x F'(t)~dt=F(x)-F(a) \end{align*}\]
Suppose \(f\) bounded and continuous a.e. on \([a,b]\), and define \[\begin{align*} F(x) \mathrel{\vcenter{:}}=\int_a^x f(t)~dt \end{align*}\] Then \(F\) is absolutely continuous on \([a,b]\), and for \(p \in [a,b]\), \[\begin{align*} f \in C^0(p) \implies F \text{ differentiable at } p,~ F'(p) = f(p), \text{ and } F' \stackrel{\tiny\mbox{a.e}}{=} f. \end{align*}\]
The Dirichlet function is Lebesgue integrable but not Riemann integrable: \[\begin{align*} f(x) = \indic{x \in {\mathbb{Q}}} \end{align*}\]
Define \[\begin{align*} s_n(x) \mathrel{\vcenter{:}}=\sum_{k=1}^n f_k(x) \end{align*}\] and \[\begin{align*} \sum_{k=1}^\infty f_k(x) \mathrel{\vcenter{:}}=\lim_{n\to\infty} s_n(x), \end{align*}\] which can converge pointwise, absolutely, uniformly, or not all.
If \(\limsup_{k\in {\mathbb{N}}} {\left\lvert {f_k(x)} \right\rvert} \neq 0\), then \(\sum_k f_k(x)\) does not converge.
If \(f\) is injective, then \(f'\) is nonzero in some neighborhood of ???
\[\begin{align*} f_n \to f = \lim_{n\to\infty} f_n .\end{align*}\] Summary: \[\begin{align*} \lim_{f_n \to f} \lim_{x_i \to x} f_n(x_i) \neq \lim_{x_i \to x} \lim_{f_n \to f} f_n(x_i) .\end{align*}\]
\[\begin{align*} \lim_{f_n \to f} \int_I f_n \neq \int_I \lim_{f_n \to f} f_n .\end{align*}\]
Pointwise convergence is strictly weaker than uniform convergence.
\(f_n(x) = x^n\) on \([0, 1]\) converges pointwise but not uniformly.
\[\begin{align*} f_n \text{ continuous} \centernot\implies f\mathrel{\vcenter{:}}=\lim_n f_n \text{ is continuous} .\end{align*}\]
Take \[\begin{align*} f_n(x) = x^n,\quad f_n(x) \to \indic[x = 1] .\end{align*}\]
\[\begin{align*} f_n \text{ differentiable} &\centernot\implies f'_n \text{ converges} \\ f'_n \text{ converges} &\not\implies \lim f'_n = f' .\end{align*}\]
Take \[\begin{align*} f_n(x) = \frac{1}{n}\sin(n^2 x) \to 0,&& \text{but } f'_n = n\cos(n^2 x) \text{ does not converge} .\end{align*}\]
\[\begin{align*} f_n\in \mathcal{R}_I \centernot\implies\lim_{f_n \to f} \int_I f_n \neq \int_I \lim_{f_n \to f} f_n .\end{align*}\]
May fail to converge to same value, take \[\begin{align*} f_n(x) = \frac{2n^2x}{(1+n^2x^2)^2} \to 0 && \text{but }\int_0^1 f_n = 1 - \frac{1}{n^2 + 1} \to 1\neq 0 .\end{align*}\]
Notation: \[\begin{align*} f_n \rightrightarrows f= \lim_{n\to\infty} f_n \text{ and } \sum_{n=1}^\infty f_n \rightrightarrows S .\end{align*}\]
Summary: \[\begin{align*} \lim_{x_i \to x} \lim_{f_n \to f} f_n(x_i) = \lim_{f_n \to f} \lim_{x_i \to x} f_n(x_i) = \lim_{f_n \to f} f_n(\lim_{x_i \to x} x_i) .\end{align*}\]
\[\begin{align*} \lim_{f_n \to f} \int_I f_n = \int_I \lim_{f_n \to f} f_n .\end{align*}\]
\[\begin{align*} \sum_{n=1}^\infty \int_I f_n = \int_I \sum_{n=1}^\infty f_n .\end{align*}\]
“The uniform limit of a(n) \(x\) function is \(x\)”, for \(x \in\) {continuous, bounded}
Equivalent to convergence in the uniform metric on the metric space of bounded functions on \(X\): \[\begin{align*} f_n \rightrightarrows f \iff \sup_{x\in X} {\left\lvert {f_n(x) - f(x)} \right\rvert} \to 0 .\end{align*}\]
\(f_n \rightrightarrows f \implies f_n \to f\) pointwise
\(f_n\) continuous \(\implies f\) continuous
\(f_n \in C^1\), \(\exists x_0: f_n(x_0) \to f(x_0)\), and \(f'_n \rightrightarrows g\) \(\implies f\) differentiable and \(f' = g~\) (i.e. \(f'_n \to f'\))
\(f_n\) integrable \(\implies f\) integrable and \(\int f_n \to \int f\)
\(f_n\) bounded \(\implies f\) bounded
\(f_n \rightrightarrows f \centernot\implies f'_n\) converges
\(f_n' \rightrightarrows f' \centernot\implies f_n \rightrightarrows f\)
\(\left\{{x_i}\right\} \to p \implies\) every subsequence also converges to \(p\).
Every convergent sequence in \(X\) is a Cauchy sequence.
The converse need not hold in general, but if \(X\) is complete, every Cauchy sequence converges. An example of a Cauchy sequence that doesn’t converge: take \(X={\mathbb{Q}}\) and set \(x_i = \pi\) truncated to \(i\) decimal places.
If any subsequence of a Cauchy sequence converges, the entire sequence converges.
\[\begin{align*} d(x,y) &\geq 0 && \text{Positive}\\ d(x,y) &= 0 \iff x = y && \text{Nondegenerate}\\ d(x,y) &= d(y,x) && \text{Symmetric}\\ d(x,y) &\leq d(x,p) + d(p,y) \quad \forall p && \text{Triangle Inequality} .\end{align*}\]
?
?
Open Set Characterization: Arbitrary unions and finite intersections of open sets are open.
Closed Set Characterization: Arbitrary intersections and finite unions of closed sets are closed.
The best source of examples and counterexamples is the open/closed unit interval in \(\mathbb{R}\). Always test against these first!
If \(f\) is a continuous function, the preimage of every open set is open and the preimage of every closed set is closed.
In \({\mathbb{R}}\), singleton sets and finite discrete sets are closed.
A singleton set is closed because its complement is open: \[\begin{align*} \left\{{p}\right\}^c = (-\infty, p) \cup(p, \infty) .\end{align*}\] Similarly, a finite discrete set \(\left\{{p_0, p_1, \cdots, p_n}\right\}\), which wlog (by relabeling) satisfies \(p_0 < p_1 < \cdots < p_n\), has open complement \[\begin{align*} \left\{{p_0, p_1, \cdots, p_n}\right\}^c = (-\infty, p_0) \cup(p_0, p_1) \cup\cdots \cup(p_n, \infty) .\end{align*}\]
This yields a good way to produce counterexamples to continuity.
In \(\mathbb{R}\), singletons are closed. This means any finite subset is closed, as a finite union of singleton sets!
If \(X\) is a compact metric space, then \(X\) is complete and bounded.
If \(X\) complete and \(X \subset Y\), then \(X\) closed in \(Y\).
The converse does not hold in general: a closed subset need not be complete unless the ambient space is complete. Example: \({\mathbb{Q}}\) is closed in \({\mathbb{Q}}\) but not complete, while \({\mathbb{Q}}\) is not closed in \({\mathbb{R}}\).
If \(X\) is compact, then \(Y \subset X \implies Y\) is compact \(\iff\) \(Y\) closed.
A topological space \(X\) is sequentially compact iff every sequence \(\left\{{x_n}\right\}\) has a subsequence converging to a point in \(X\).
If \(X\) is a metric space, \(X\) is compact iff \(X\) is sequentially compact.
Note that in general, neither form of compactness implies the other.
There are functions differentiable only at a single point. Example: \[\begin{align*} f(x) = \begin{cases} x^2 & x\in {\mathbb{Q}}\\ -x^2 & x\in {\mathbb{R}}\setminus{\mathbb{Q}} \end{cases} .\end{align*}\]
This is discontinuous everywhere except for \(x=0\), and you can compute \[\begin{align*} \lim_{h\to 0} {f(0+h) - f(0) \over h} = \lim_{h\to 0} \begin{cases} h & h\in {\mathbb{Q}}\\ -h & h\in {\mathbb{R}}\setminus{\mathbb{Q}} \end{cases} =0 .\end{align*}\]
The product of two non-differentiable functions can be differentiable: take \(f(x) = g(x) = {\left\lvert {x} \right\rvert}\) which are not differentiable at \(x=0\), then \(fg(x) = {\left\lvert {x} \right\rvert}^2\) is differentiable at \(x=0\).
A continuous function that is zero on a dense set \(A\subset X\) is identically zero.
Since \(A\) is dense, for any \(x\in X\setminus A\) take a sequence \(\left\{{x_n}\right\}\) in \(A\) converging to \(x\). Then \(0 = f(x_n) \to f(x)\) implies \(f(x) = 0\).
Epsilon-neighborhood
Limit Point
Isolated Point
Perfect
Interior
Exterior
Closed sets
Open sets
Boundary
Closure
Dense
Connected
Disconnected
Path Connected
Simply Connected
Totally Disconnected
Hausdorff
Compact
The space \(\left\{{\frac{1}{n}}\right\}_{n\in {\mathbb{N}}}\).
List of properties preserved by continuous maps:
Checking if a map is homeomorphism:
\[\begin{align*} L^2(X) &= \left\{{f: X \to {\mathbb{R}}: \int_{\mathbb{R}}f(x) ~dx < \infty}\right\} &&\text{square integrable functions}\\ {\left\langle {g},~{f} \right\rangle}_{2} &= \int_{\mathbb{R}}g(x)f(x) ~dx &&\text{the } L^2 \text{ inner product}\\ {\left\lVert {f} \right\rVert}_2^2 &= {\left\langle {f},~{f} \right\rangle} = \int_{\mathbb{R}}f(x)^2 ~dx &&\text{norm}\\ E[{\,\cdot\,}] &= {\left\langle {{\,\cdot\,}},~{f} \right\rangle} &&\text{expectation}\\ (\tau_{p}f)(x) &= f(p- x) &&\text{translation}\\ (f \ast g)(p) &= \int_{\mathbb{R}}f(t)g(p-t)~dt = \int_{\mathbb{R}}f(t)(T_{p}g)(t) ~dt = {\left\langle {T_pg},~{f} \right\rangle} &&\text{convolution}\\ \end{align*}\]
For \((\Sigma, E, \mu)\) a probability space with sample space \(\Sigma\) and probability measure \(\mu\), a random variable is a function \(X: \Sigma \to {\mathbb{R}}\)
The probability density function (PDF) of \(X\) is the function \(f\) such that for any \(U \subset {\mathbb{R}}\), \[\begin{align*} P(X \in U) = \int_U f(x) ~dx \\ \implies P(a \leq X \leq b) = \int_a^b f(x) ~dx \end{align*}\]
The cumulative distribution function (CDF) is the antiderivative of the PDF: \[\begin{align*} F(x) = P(X \leq x) = \int_{-\infty}^x f(t) ~dt ,\end{align*}\] which yields \({\frac{\partial F}{\partial x}\,} = f(x)\).
\[\begin{align*} E[X] \mathrel{\vcenter{:}}={\left\langle {\text{id}},~{f} \right\rangle} = \int_{\mathbb{R}}x f(x) ~dx .\end{align*}\] Also denoted \(\mu_X\).
\[\begin{align*} E\left[\sum_{i\in{\mathbb{N}}} a_i X_i\right] = \sum_{i\in{\mathbb{N}}} a_i E[X_i] .\end{align*}\] Does not matter whether or not the \(X_i\) are independent.
\[\begin{align*} \mathrm{Var}(X) &= E[(X - E[X])^2] \\ &= \int (x - E[X])^2 f(x) ~dx \\ &= E[X^2] - E[X]^2 \\ &\mathrel{\vcenter{:}}=\sigma^2(X) \end{align*}\] where \(\sigma\) is the standard deviation. It can also be written as \({\left\langle {(\text{id}- {\left\langle {\text{id}},~{f} \right\rangle})^2},~{f} \right\rangle}\): take the component of \(\text{id}\) orthogonal to the constants (i.e. subtract the mean), square it, and integrate it against \(f\).
\[\begin{align*} \mathrm{Var}(aX + b) &= a^2\mathrm{Var}(X) \\ \mathrm{Var}\qty{ \sum_{\mathbb{N}}X_i } &= \sum_i \mathrm{Var}(X_i) + 2 \sum_{i < j}\mathrm{Cov}(X_i, X_j) .\end{align*}\]
\[\begin{align*} \mathrm{Cov}(X,Y) &= E[(X-\mu_X)(Y-\mu_Y)] \\ &= E[XY] - E[X]E[Y] \end{align*}\]
\[\begin{align*} \mathrm{Cov}(X, X) &= \mathrm{Var}(X) \\ \mathrm{Cov}(aX, Y) &= a\mathrm{Cov}(X,Y) \\ \mathrm{Cov}(\sum_{{\mathbb{N}}} X_i, \sum_{\mathbb{N}}Y_j) &= \sum_i \sum_j\mathrm{Cov}(X_i, Y_j) \\ .\end{align*}\]
\[\begin{align*} k! \sim k^{k+\frac{1}{2}}e^{-k} \sqrt{2\pi} .\end{align*}\]
For \(X \geq 0\) and \(a > 0\), \[\begin{align*} P(X \geq a) \leq \frac 1 a E[X] \end{align*}\]
One-sided Chebyshev (Cantelli's inequality): for \(a > 0\), \[\begin{align*} P(X - \mu \geq a) \leq \frac{\sigma^2}{\sigma^2 + a^2} .\end{align*}\]
\[\begin{align*} P({\left\lvert {X - \mu} \right\rvert} \geq k) \leq \left( \frac \sigma k \right)^2 \end{align*}\]
Apply Markov to the variable \((X-\mu)^2\) and \(a=k^2\)
For \(X_i\) i.i.d., \[\begin{align*} \lim_n \frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt n} \sim N(0, 1) .\end{align*}\]
\[\begin{align*} P(\frac{1}{n} \sum X_i \rightarrow \mu) = 1 .\end{align*}\]
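A quick Monte Carlo sanity check of the LLN, CLT, and the Chebyshev bound, using uniform samples (the sample sizes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 5000

# X_i ~ Uniform(0, 1): mu = 1/2, sigma^2 = 1/12.
samples = rng.uniform(0, 1, size=(trials, n))
means = samples.mean(axis=1)

# Law of large numbers: the sample means cluster around mu = 0.5.
print(means.mean())                    # approximately 0.5

# Central limit theorem: sqrt(n) * (mean - mu) / sigma is approximately N(0, 1).
z = np.sqrt(n) * (means - 0.5) / np.sqrt(1 / 12)
print(z.std())                         # approximately 1

# Chebyshev: P(|Z| >= 2) <= 1/4; the empirical frequency is well below the bound.
print((np.abs(z) >= 2).mean())         # roughly 0.05 for an approximately normal Z
```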
For all \(t > 0\), \[\begin{align*} P(X \geq a) \leq e^{-ta}M_X(t) ,\end{align*}\] where \(M_X(t) = E[e^{tX}]\) is the moment generating function.
For \(f\) convex, \[\begin{align*} E[f(X)] \geq f(E[X]) \end{align*}\]
\[\begin{align*} H(X) = - \sum p_i \ln p_i \end{align*}\]