chem262


Geometric series

A geometric series multiplies each term by a fixed number to get the next term.

\[S = a + ax + ax ^ 2 + ax ^ 3 + \cdots + ax ^ n + \cdots\] \[S = \sum_{n = 0} ^ \infty ax ^ n = \sum_{n = m} ^ \infty ax ^ {n - m}\]

We may also define the partial sum, \(S_N\), of the first \(N\) terms:

\[S_N = a + ax + ax ^ 2 + ax ^ 3 + \cdots + ax ^ n + \cdots + ax ^ {N - 1}\] \[S_N = \sum_{n = 0} ^ {N - 1} ax ^ n = \sum_{n = m} ^ {N + m - 1} ax ^ {n - m}\]

Example: cavity ring-down spectroscopy

In a ring-down spectrometer, light is coupled in, then bounces back and forth between two mirrors. On each pass, an amount of light leaks past one of the mirrors into a detector behind.

Let reflectivity: \(R = 0.9999\), transmission: \(1 - R = 0.0001\).

We can find the geometric series describing the total light detected assuming that the initial intensity coupled into the cavity is \(I\):

\[S = I(1 - R)(1 + R + R ^ 2 + R ^ 3 + \cdots + R ^ {N - 1} + \cdots)\] \[a = I(1 - R)\] \[S = \sum_{n = 0} ^ \infty aR ^ n\] \[S_N = \sum_{n = 0} ^ {N - 1} aR ^ n\] \[S_NR = aR + aR ^ 2 + aR ^ 3 + \cdots + aR ^ N\] \[S_N - S_NR = (a + aR + aR ^ 2 + \cdots + aR ^ {N - 1}) - (aR + aR ^ 2 + aR ^ 3 + \cdots + aR ^ N)\] \[S_N - S_NR = a - aR ^ N\] \[S_N = \frac{a(1 - R ^ N)}{1 - R}\] \[S = \lim\limits_{N \to \infty} S_N = \lim\limits_{N \to \infty} \frac{a(1 - R ^ N)}{1 - R}\] \[|R| < 1 \implies \lim\limits_{N \to \infty} R ^ N = 0\] \[S = \frac{a}{1 - R} = \frac{I(1 - R)}{1 - R} = I\]

All light coupled will be detected, as the given reflectivity and transmission sum to \(1\).
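As a quick numerical check (a minimal Python sketch, with \(R\) and \(I\) taken from the text), the partial sums of the detected intensity approach \(I\):

```python
# Reflectivity and input intensity from the text.
R = 0.9999
I = 1.0
a = I * (1 - R)  # intensity leaked on the first pass

def detected(N):
    """Partial sum S_N: total intensity detected after N passes."""
    return sum(a * R**n for n in range(N))

for N in (10_000, 100_000, 1_000_000):
    print(N, detected(N))  # approaches I as N grows
```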

In practice, absorption may be calculated as \(A = \sigma(\lambda) N d\), where:

  - \(\sigma(\lambda)\) is the wavelength-dependent absorption cross section
  - \(N\) is the number density of absorbing molecules
  - \(d\) is the path length

Radius of convergence

For any geometric series:

\[|x| < 1 \implies S = \sum_{n = 0} ^ \infty ax ^ n = \frac{a}{1 - x}\]

Note that, for a geometric series, if \(|x| \geq 1\), the series diverges.

Power series

A power series has terms which can be expressed as powers of a variable.

\[\sum_{n = 0} ^ \infty a_nx ^ n = a_0 + a_1x + a_2x ^ 2 + \cdots + a_nx ^ n + \cdots\]

A power series can also be centered at a nonzero number, \(b\):

\[\sum_{n = 0} ^ \infty a_n(x - b) ^ n = a_0 + a_1(x - b) + a_2(x - b) ^ 2 + \cdots + a_n(x - b) ^ n + \cdots\]

A geometric series is simply a power series with constant coefficients, where \(a\) replaces \(a_n\).

Ratio test for convergence

We can find the ratio of successive terms of a power series:

\[\rho_n = \left|\frac{a_{n + 1}x ^ {n + 1}}{a_nx ^ n}\right|\] \[\rho = \lim\limits_{n \to \infty} \rho_n\]

The series converges if \(\rho < 1\) and diverges if \(\rho > 1\); the test is inconclusive if \(\rho = 1\).

Example: ratio test

We can find the interval over which \(\sum_{n = 1} ^ \infty \frac{n(-x) ^ n}{n ^ 2 + 1}\) will converge:

\[\rho_n = \left|\frac{(n + 1)(-x) ^ {n + 1}}{(n + 1) ^ 2 + 1} \times \frac{n ^ 2 + 1}{n(-x) ^ n}\right|\] \[\rho_n = \left|\frac{-x(n ^ 3 + n ^ 2 + n + 1)}{n ^ 3 + 2n ^ 2 + 2n}\right|\] \[\lim\limits_{n \to \infty} \rho_n = |-x| \times \lim\limits_{n \to \infty} \left|\frac{n ^ 3 + n ^ 2 + n + 1}{n ^ 3 + 2n ^ 2 + 2n}\right| = |-x| \times 1\] \[\rho = |-x| = |x|\]

The series will converge for \(|x| < 1\). As \(x = \pm1 \implies \rho = 1\), the ratio test is inconclusive there, and an alternative method is required to determine whether the series converges at \(x = \pm1\).
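A numerical sketch of this ratio (Python, with a hypothetical value \(x = 0.5\)) shows \(\rho_n\) approaching \(|x|\):

```python
# Terms of the series sum n(-x)^n / (n^2 + 1), with a hypothetical x = 0.5.
def term(n, x):
    return n * (-x)**n / (n**2 + 1)

def rho_n(n, x):
    return abs(term(n + 1, x) / term(n, x))

x = 0.5
for n in (10, 100, 500):
    print(n, rho_n(n, x))  # approaches |x| = 0.5
```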

Taylor series expansions

Assume a function \(f(x)\) can be expanded as a power series about a point \(x = b\):

\[f(x) = a_0 + a_1(x - b) + a_2(x - b)^2 + \cdots + a_n(x - b)^n + \cdots\]

The Taylor series of a function is characterized by its coefficients \(a_n\), \(n \in \mathbb{N}\). If \(f(x)\) is infinitely differentiable, these can be found straightforwardly:

\[f'(x) = a_1 + 2a_2(x - b) + 3a_3(x - b) ^ 2 + 4a_4(x - b) ^ 3 + \cdots + na_n(x - b) ^ {n - 1} + \cdots\] \[f''(x) = 2a_2 + 3 \times 2a_3(x - b) + 4 \times 3a_4(x - b) ^ 2 + \cdots + n(n - 1)a_n(x - b) ^ {n - 2} + \cdots\] \[f'''(x) = 3 \times 2a_3 + 4 \times 3 \times 2a_4(x - b) + \cdots + n(n - 1)(n - 2)a_n(x - b) ^ {n - 3} + \cdots\] \[f ^ {(n)}(x) = n(n - 1)(n - 2)(n - 3) \times \cdots \times 2a_n + (n + 1)n(n - 1)(n - 2) \times \cdots \times 2a_{n + 1}(x - b) + \cdots\]

\[f(b) = a_0 \iff a_0 = f(b) = \left.\frac{1}{0!} \times \frac{d ^ 0f(x)}{dx ^ 0} \right|_b\] \[f'(b) = a_1 \iff a_1 = f'(b) = \left.\frac{1}{1!} \times \frac{df(x)}{dx} \right|_b\] \[f''(b) = 2a_2 \iff a_2 = \frac{1}{2}f''(b) = \left.\frac{1}{2!} \times \frac{d ^ 2f(x)}{dx ^ 2} \right|_b\] \[f'''(b) = 3 \times 2a_3 \iff a_3 = \frac{1}{3 \times 2}f'''(b) = \left.\frac{1}{3!} \times \frac{d ^ 3f(x)}{dx ^ 3} \right|_b\] \[f ^ {(n)}(b) = n!a_n \iff a_n = \frac{1}{n!}f ^ {(n)}(b) = \left.\frac{1}{n!} \times \frac{d ^ nf(x)}{dx ^ n} \right|_b\]

\[f(x) = \sum_{n = 0} ^ \infty \left.\frac{1}{n!} \times \frac{d ^ nf(x)}{dx ^ n} \right|_b(x - b) ^ n\]

Maclaurin series

A Taylor series expanded about \(b = 0\) is called a Maclaurin series, which simplifies evaluation:

\[f(x) = \sum_{n = 0} ^ \infty \left.\frac{1}{n!} \times \frac{d ^ nf(x)}{dx ^ n} \right|_0x ^ n\]

Relevant expansions

\(\sin(x)\)

\(\sin(-x) = -\sin(x) \iff \sin(x)\) is an odd function. We therefore expect odd terms in the expansion of \(\sin(x)\):

\[f(x) = \sin(x) \implies f(0) = 0\] \[f'(x) = \cos(x) \implies f'(0) = 1\] \[f''(x) = -\sin(x) \implies f''(0) = 0\] \[f'''(x) = -\cos(x) \implies f'''(0) = -1\] \[f''''(x) = \sin(x) \implies f''''(0) = 0\]

\[f(x) = \sin(x) = x - \frac{x ^ 3}{3!} + \frac{x ^ 5}{5!} - \frac{x ^ 7}{7!} + \cdots\] \[\sin(x) = \sum_{n = 0} ^ \infty \frac{(-1) ^ nx ^ {2n + 1}}{(2n + 1)!}\] \[\sin(ax) = \sum_{n = 0} ^ \infty \frac{(-1) ^ n(ax) ^ {2n + 1}}{(2n + 1)!}\]
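The partial sums of this series can be checked against the standard library (a minimal Python sketch; the number of terms is arbitrary):

```python
import math

# Partial sums of the Maclaurin series for sin(x); terms=10 is arbitrary.
def sin_series(x, terms=10):
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(terms))

x = 1.2
print(sin_series(x), math.sin(x))  # agree to double precision
```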

\(\cos(x)\)

\(\cos(-x) = \cos(x) \iff \cos(x)\) is an even function. In the same fashion as \(\sin(x)\), we expect even terms in the expansion of \(\cos(x)\):

\[f(x) = \cos(x) \implies f(0) = 1\] \[f'(x) = -\sin(x) \implies f'(0) = 0\] \[f''(x) = -\cos(x) \implies f''(0) = -1\] \[f'''(x) = \sin(x) \implies f'''(0) = 0\] \[f''''(x) = \cos(x) \implies f''''(0) = 1\]

\[f(x) = \cos(x) = 1 - \frac{x ^ 2}{2!} + \frac{x ^ 4}{4!} - \frac{x ^ 6}{6!} + \cdots\] \[\cos(x) = \sum_{n = 0} ^ \infty \frac{(-1) ^ nx ^ {2n}}{(2n)!}\] \[\cos(ax) = \sum_{n = 0} ^ \infty \frac{(-1) ^ n(ax) ^ {2n}}{(2n)!}\]

\(e ^ {x}\)

\[f ^ {(n)}(x) = e^x \implies \forall n \in \mathbb{N} : f ^ {(n)}(0) = 1\] \[f(x) = e^x = 1 + x + \frac{x ^ 2}{2!} + \frac{x ^ 3}{3!} + \cdots\] \[e^x = \sum_{n = 0} ^ \infty \frac{x ^ n}{n!}\]

\[f(x) = e ^ {ax} \implies f(0) = 1\] \[f'(x) = ae ^ {ax} \implies f'(0) = a\] \[f''(x) = a ^ 2e ^ {ax} \implies f''(0) = a ^ 2\] \[f'''(x) = a ^ 3e ^ {ax} \implies f'''(0) = a ^ 3\] \[f''''(x) = a ^ 4e ^ {ax} \implies f''''(0) = a ^ 4\] \[f ^ {(n)}(x) = a ^ ne ^ {ax} \implies f ^ {(n)}(0) = a ^ n\]

\[f(x) = e ^ {ax} = 1 + ax + \frac{(ax) ^ 2}{2!} + \frac{(ax) ^ 3}{3!} + \cdots\] \[e ^ {ax} = \sum_{n = 0} ^ \infty \frac{(ax) ^ n}{n!}\]

Small angle formulae

We can use Maclaurin Series to approximate functions for small input values:

\[\sin(\theta) = \theta - \frac{\theta ^ 3}{3!} + \frac{\theta ^ 5}{5!} - \frac{\theta ^ 7}{7!} + \cdots \approx \theta\] \[\cos(\theta) = 1 - \frac{\theta ^ 2}{2!} + \frac{\theta ^ 4}{4!} - \frac{\theta ^ 6}{6!} + \cdots \approx 1 - \frac{\theta ^ 2}{2!}\]

Example: uses in astronomy

We can imagine the earth, the sun, and a distant star forming the points of a triangle with its right angle at the sun and a small angle \(\theta\) at the distant star. To find the distance to the star, we can measure the angle by which the star appears to shift in the sky as the earth moves from one side of the sun to the other.

Here:

\[\tan(\theta) = \frac{1 \text{ AU}}{D}\] \[\tan(\theta) = \frac{\sin(\theta)}{\cos(\theta)} \approx \frac{\theta}{1 - \frac{\theta ^ 2}{2!}} \approx \theta\] \[\theta \approx \frac{1 \text{ AU}}{D}\] \[D = \frac{1 \text{ AU}}{\theta} = \frac{1.5 \times 10 ^ 8 \text{ km}}{\theta}\]
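A short Python sketch of this estimate, using a hypothetical parallax angle \(\theta\) (in radians) and the \(1\) AU value from the text:

```python
import math

# Parallax distance sketch. theta is a hypothetical parallax angle in
# radians; 1 AU = 1.5e8 km as in the text.
AU_KM = 1.5e8

def distance_km(theta):
    return AU_KM / theta  # uses tan(theta) ≈ theta

theta = 1e-5
print(distance_km(theta))  # 1.5e13 km
# Relative error of the small-angle step at this theta:
print(abs(math.tan(theta) - theta) / theta)
```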

Error analysis

If \(\sum_{n = 0} ^ \infty a_n\) is an alternating series with \(|a_{n + 1}| < |a_n|\) and \(\lim\limits_{n \to \infty} a_n = 0\), then \(|R_n| < |a_{n + 1}|\).

\[R_n = S - S_n = a_{n + 1} + a_{n + 2} + a_{n + 3} + \cdots\] \[R_n = (a_{n + 1} + a_{n + 2}) + (a_{n + 3} + a_{n + 4}) + \cdots\]

As successive terms shrink in magnitude, each of the parenthesized pairs above, and therefore \(R_n\), shares a sign with \(a_{n + 1}\).

\[R_n = a_{n + 1} + (a_{n + 2} + a_{n + 3}) + (a_{n + 4} + a_{n + 5}) + \cdots\]

In the same fashion, each of the parenthesized pairs above has the sign opposite to \(a_{n + 1}\); the remaining terms therefore reduce the magnitude of \(a_{n + 1}\).

\[|R_n| < |a_{n + 1}|\]

Example: small angle formulae

We can find the range of \(\theta\) for which each small-angle formula has error \(\le 1\%\):

\[\sin(\theta) = \theta - \frac{\theta ^ 3}{3!} + \cdots \approx \theta\] \[|R_n| \le \left|\frac{\theta ^ 3}{3!}\right|\] \[\frac{\theta ^ 3}{3!} \le 0.01\theta \iff \theta ^ 2 \le 0.06\] \[\theta \le 0.245 \text{ rad}\] \[0.245 \text{ rad} \times \frac{180}{\pi} = 14 ^ \circ\]

\[|\theta| \le 14 ^ \circ \implies \text{error} \le 1\%\]

\[\cos(\theta) = 1 - \frac{\theta ^ 2}{2!} + \frac{\theta ^ 4}{4!} - \cdots \approx 1 - \frac{\theta ^ 2}{2!}\] \[|R_n| \le \left|\frac{\theta ^ 4}{4!}\right|\] \[\frac{\theta ^ 4}{4!} \le 0.01 \left(1 - \frac{\theta ^ 2}{2!}\right) \iff \theta ^ 4 \le 0.24 \left(1 - \frac{\theta ^ 2}{2!}\right)\] \[\theta \le 0.658 \text{ rad}\] \[0.658 \text{ rad} \times \frac{180}{\pi} = 38 ^ \circ\]

\[|\theta| \le 38 ^ \circ \implies \text{error} \le 1\%\]
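Both thresholds can be checked numerically (a Python sketch; the angles are the boundary values found above):

```python
import math

# Boundary angles found above: the relative error of each small-angle
# formula should be roughly 1% there.
def rel_err(approx, exact):
    return abs(approx - exact) / abs(exact)

err_sin = rel_err(0.245, math.sin(0.245))             # sin(theta) ≈ theta
err_cos = rel_err(1 - 0.658**2 / 2, math.cos(0.658))  # cos(theta) ≈ 1 - theta^2/2
print(err_sin, err_cos)  # both ≈ 0.01
```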

Taylor series expansion of \(\tan(x)\)

\[|x| < \frac{\pi}{2} \implies \tan(x) = x + \frac{x ^ 3}{3} + \frac{2x ^ 5}{15} + \frac{17x ^ 7}{315} + \cdots\]

Note that the previously demonstrated error analysis is not applicable, as this expansion of \(\tan(x)\) is not an alternating series.

\[\left(\sum_{n = 1} ^ \infty a_nx ^ n \text{ converges for } |x| < 1 \land n > N \implies |a_{n + 1}| < |a_n|\right) \implies |R_N| < \left|\frac{a_{N + 1}x ^ {N + 1}}{1 - |x|}\right|\]

\[|\theta| < 1 \implies \tan(\theta) = \theta + \frac{\theta ^ 3}{3} + \cdots\]

We can find the range of \(\theta\) for which the above expansion of \(\tan(\theta)\) has error \(\le 1\%\):

\[R < \frac{\theta ^ 3}{3} \times \frac{1}{1 - \theta} \le 0.01\theta\] \[\theta ^ 2 \le 0.03 - 0.03\theta \iff \theta ^ 2 + 0.03\theta - 0.03 \le 0\] \[\theta \le 0.159 \text{ rad} \approx 9 ^ \circ\]

\[|\theta| \le 9 ^ \circ \implies \text{error} \le 1\%\]

\(\tan(\theta) \approx \theta\) is therefore an excellent approximation for applications in astronomy, where, often, \(\theta \ll 1\).

Complex numbers

We can use the quadratic formula to find the roots of the equation \(z ^ 2 - 2z + 2 = 0\):

\[z = \frac{-b \pm \sqrt{b ^ 2 - 4ac}}{2a} = \frac{2 \pm \sqrt{4 - 4(2)}}{2} = 1 \pm \frac{\sqrt{4 - 8}}{2} = 1 \pm \sqrt{-1}\]

Let \(i = \sqrt{-1}\), then \(z = 1 \pm i\).

The above number is complex; in general, a complex number has the form \(z = x + iy\), where \(\Re(z) = x\) and \(\Im(z) = y\). We can plot complex numbers using real (horizontal) and imaginary (vertical) axes, with \(z = (x, y)\).

The modulus of a point on the complex plane is its distance from the origin: here, it can be calculated as \(|z| = \sqrt{x ^ 2 + y ^ 2}\).

\(z ^ *\) represents the complex conjugate of \(z\), in which the sign of the imaginary part (\(y\)) is flipped: \(\Re(z) = \Re(z ^ *) \land \Im(z) = -\Im(z ^ *) \land z ^ * = x - iy\). On the complex plane, this corresponds to \(z ^ * = (x, -y)\) and represents geometrically a reflection of \(z\) about the real axis.

\[|z| ^ 2 = zz ^ * = (x + iy)(x - iy) = x ^ 2 + y ^ 2 \implies |z| = \sqrt{x ^ 2 + y ^ 2} = \sqrt{zz ^ *}\]

Rather than using Cartesian coordinates, we can plot points on the complex plane using polar coordinates \((r, \theta)\), such that \(x = r\cos(\theta)\) and \(y = r\sin(\theta)\).

A right triangle with angle \(\theta\) at the origin is thus formed by the points \((0, 0)\), \((x, 0)\), and \((x, y)\), with hypotenuse \(r = |z|\).

We therefore may express \(z\) as any of:

\[z = x + iy = r\cos(\theta) + ir\sin(\theta) = r(\cos(\theta) + i\sin(\theta))\]

Let \(z_1 = x_1 + iy_1, z_2 = x_2 + iy_2\), then we can add:

\[z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2)\]

Subtraction can similarly be expressed:

\[z_1 - z_2 = (x_1 - x_2) + i(y_1 - y_2)\]

We can multiply by expanding the product and using \(i ^ 2 = -1\):

\[z_1z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1x_2 - y_1y_2 + i(x_1y_2 + x_2y_1)\]

To divide by a complex number, we first multiply the numerator and denominator by the complex conjugate of the denominator:

\[\frac{z_1}{z_2} = \frac{z_1z_2 ^ *}{z_2z_2 ^ *} = \frac{z_1z_2 ^ *}{|z_2| ^ 2}\] \[\frac{1}{z} = \frac{z ^ *}{|z| ^ 2}\]
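A minimal Python sketch of division via the conjugate, using hypothetical values \(z_1, z_2\); Python's built-in complex division should agree:

```python
# Hypothetical values; Python's complex type performs the same operation.
z1 = complex(3, 4)
z2 = complex(1, -2)

manual = z1 * z2.conjugate() / abs(z2)**2
print(manual, z1 / z2)  # both ≈ -1 + 2j
```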

Euler's formula

\[z = r(\cos(\theta) + i\sin(\theta)) = x + iy\]

\[\cos(\theta) = 1 - \frac{\theta ^ 2}{2!} + \frac{\theta ^ 4}{4!} - \frac{\theta ^ 6}{6!} + \cdots\] \[\sin(\theta) = \theta - \frac{\theta ^ 3}{3!} + \frac{\theta ^ 5}{5!} - \frac{\theta ^ 7}{7!} + \cdots\]

\[\cos(\theta) = 1 + \frac{(i\theta) ^ 2}{2!} + \frac{(i\theta) ^ 4}{4!} + \frac{(i\theta) ^ 6}{6!} + \cdots = \sum_{n = 0} ^ \infty \frac{(i\theta) ^ {2n}}{(2n)!}\] \[\sin(\theta) = -i\left(i\theta + \frac{(i\theta) ^ 3}{3!} + \frac{(i\theta) ^ 5}{5!} + \frac{(i\theta) ^ 7}{7!} + \cdots\right) = -i\sum_{n = 0} ^ \infty \frac{(i\theta) ^ {2n + 1}}{(2n + 1)!}\]

\[z = r(\cos(\theta) + i\sin(\theta))\] \[\frac{z}{r} = \cos(\theta) + i\sin(\theta) = \overbrace{\sum_{n = 0} ^ \infty \frac{(i\theta) ^ {2n}}{(2n)!}} ^ {\text{all even terms}} + \overbrace{\sum_{n = 0} ^ \infty \frac{(i\theta) ^ {2n + 1}}{(2n + 1)!}} ^ {\text{all odd terms}}\] \[\frac{z}{r} = \sum_{n = 0} ^ \infty \frac{(i\theta) ^ n}{n!} = e ^ {i\theta}\] \[\overbrace{\cos(\theta) + i\sin(\theta) = e ^ {i\theta}} ^ {\text{Euler's Formula}}\]

\[z_1 = r_1e ^ {i\theta_1}, z_2 = r_2e ^ {i\theta_2}\] \[z_1z_2 = r_1r_2e ^ {i(\theta_1 + \theta_2)}\] \[\frac{z_1}{z_2} = \frac{r_1e ^ {i\theta_1}}{r_2e ^ {i\theta_2}} = \frac{r_1}{r_2}e ^ {i(\theta_1 - \theta_2)}\] \[zz ^ * = re ^ {i\theta}re ^ {-i\theta} = r ^ 2\]

\[e ^ z = e ^ {x + iy} = e^xe ^ {iy} = e^x(\cos(y) + i\sin(y))\]
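Euler's formula can be checked numerically with the standard library (a minimal Python sketch):

```python
import cmath
import math

# e^{i*theta} should equal cos(theta) + i*sin(theta) for any real theta.
for theta in (0.0, 0.5, math.pi / 3, 2.0):
    lhs = cmath.exp(1j * theta)
    rhs = complex(math.cos(theta), math.sin(theta))
    assert abs(lhs - rhs) < 1e-12
print(cmath.exp(1j * math.pi))  # ≈ -1, Euler's identity
```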

Trigonometric functions

\[e ^ {-i\theta} = \overbrace{\cos(-\theta)} ^ {\cos(\theta)} + \overbrace{i\sin(-\theta)} ^ {-\sin(\theta)} = \cos(\theta) - i\sin(\theta)\]

\[e ^ {i\theta} + e ^ {-i\theta} = \cos(\theta) + i\sin(\theta) + \cos(\theta) - i\sin(\theta) = 2\cos(\theta)\] \[\cos(\theta) = \frac{e ^ {i\theta} + e ^ {-i\theta}}{2}\]

\[e ^ {i\theta} - e ^ {-i\theta} = \cos(\theta) + i\sin(\theta) - \cos(\theta) + i\sin(\theta) = 2i\sin(\theta)\] \[\sin(\theta) = \frac{e ^ {i\theta} - e ^ {-i\theta}}{2i}\]

\[z \in \mathbb{C} \implies \cos(z) = \frac{e ^ {iz} + e ^ {-iz}}{2} \land \sin(z) = \frac{e ^ {iz} - e ^ {-iz}}{2i}\]

We can find \(\cos(z) ^ *\) via Euler's Formula:

\[\cos(z) ^ * = \frac{(e ^ {iz}) ^ * + (e ^ {-iz}) ^ *}{2} = \frac{e ^ {-iz ^ *} + e ^ {iz ^ *}}{2} = \cos(z ^ *)\]

Hyperbolic functions

Let \(iy\) be a purely imaginary number where \(y\) is real, then:

\[\cos(iy) = \frac{e ^ {i(iy)} + e ^ {-i(iy)}}{2} = \frac{e ^ {-y} + e ^ y}{2} = \overbrace{\frac{e ^ y + e ^ {-y}}{2}} ^ {\cosh(y)}\] \[\sin(iy) = \frac{e ^ {i(iy)} - e ^ {-i(iy)}}{2i} = \frac{e ^ {-y} - e ^ y}{2i} \times \frac{i}{i} = i \times \overbrace{\frac{e ^ y - e ^ {-y}}{2}} ^ {\sinh(y)}\]

\[z \in \mathbb{C} \implies \cosh(z) = \frac{e ^ z + e ^ {-z}}{2} \land \sinh(z) = \frac{e ^ z - e ^ {-z}}{2}\]

Let \(x = \cos(z) \land y = \sin(z) \implies x ^ 2 + y ^ 2 = \cos ^ 2(z) + \sin ^ 2(z) = 1\). This relation produces a circle when plotted on the \(xy\)-plane.

In contrast, let \(x = \cosh(z) \land y = \sinh(z)\): \[\cosh ^ 2(z) = \frac{e ^ z + e ^ {-z}}{2} \times \frac{e ^ z + e ^ {-z}}{2} = \frac{e ^ {2z} + 2 + e ^ {-2z}}{4}\] \[\sinh ^ 2(z) = \frac{e ^ z - e ^ {-z}}{2} \times \frac{e ^ z - e ^ {-z}}{2} = \frac{e ^ {2z} - 2 + e ^ {-2z}}{4}\] \[\cosh ^ 2(z) - \sinh ^ 2(z) = \frac{e ^ {2z} + 2 + e ^ {-2z}}{4} - \frac{e ^ {2z} - 2 + e ^ {-2z}}{4} = \frac{2 - (-2)}{4} = 1\]

The relation \(x ^ 2 - y ^ 2 = 1\) instead produces a hyperbola when plotted on the \(xy\)-plane.
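A quick numerical check of \(\cosh ^ 2(y) - \sinh ^ 2(y) = 1\) for a few real arguments (a Python sketch):

```python
import math

# cosh^2(y) - sinh^2(y) = 1 for real y, built from e^y and e^{-y} as above.
for y in (-2.0, 0.0, 0.7, 3.0):
    cosh = (math.exp(y) + math.exp(-y)) / 2
    sinh = (math.exp(y) - math.exp(-y)) / 2
    assert abs(cosh**2 - sinh**2 - 1) < 1e-9
print("cosh^2 - sinh^2 = 1 holds numerically")
```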

Applications

Euler's formula can be used to aid in integration:

\[S = \int_{-\pi} ^ \pi \cos ^ 2(3x) dx = \int_{-\pi} ^ \pi \left(\frac{e ^ {i3x} + e ^ {-i3x}}{2}\right) ^ 2 dx = \frac{1}{4} \int_{-\pi} ^ \pi (e ^ {i6x} + 2 + e ^ {-i6x}) dx\] \[S = \frac{1}{4} \left(\int_{-\pi} ^ \pi e ^ {i6x} dx + \int_{-\pi} ^ \pi 2 dx + \int_{-\pi} ^ \pi e ^ {-i6x} dx \right)\] \[S = \left. \frac{1}{4} \left(\frac{e ^ {i6x}}{6i} + 2x - \frac{e ^ {-i6x}}{6i}\right) \right|_{-\pi} ^ \pi\] \[S = \left. \left(\frac{x}{2} + \frac{1}{12} \times \frac{e ^ {i6x} - e ^ {-i6x}}{2i}\right) \right|_{-\pi} ^ \pi\] \[S = \left. \left(\frac{x}{2} + \frac{\sin(6x)}{12}\right) \right|_{-\pi} ^ \pi\] \[S = \frac{\pi}{2} - \left(\frac{-\pi}{2}\right) + \frac{\sin(6\pi)}{12} - \frac{\sin(-6\pi)}{12}\] \[S = \int_{-\pi} ^ \pi \cos ^ 2(3x) dx = \pi\]
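The result can be confirmed with a simple midpoint-rule sum (a Python sketch; the grid size is arbitrary):

```python
import math

# Midpoint-rule check of the integral of cos^2(3x) over [-pi, pi];
# the grid size n is arbitrary.
n = 6_000
dx = 2 * math.pi / n
total = dx * sum(math.cos(3 * (-math.pi + (k + 0.5) * dx))**2 for k in range(n))
print(total, math.pi)
```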

The various expressions of complex numbers can help simplify operations such as the modulus:

\[e ^ z = e ^ {x + iy} = e^xe ^ {iy}\] \[|e ^ z| ^ 2 = e ^ ze ^ {z ^ *} = e^xe ^ {iy}e^xe ^ {-iy} = e ^ {2x}\] \[|e ^ z| ^ 2 = e ^ {2x} = e ^ {2\Re(z)}\]

\[\left|\frac{2e ^ {i\theta} - 1}{ie ^ {i\theta} + 2}\right| ^ 2 = \frac{4 - 2e ^ {-i\theta} - 2e ^ {i\theta} + 1}{1 - 2ie ^ {-i\theta} + 2ie ^ {i\theta} + 4}\] \[\frac{e ^ {i\theta} - e ^ {-i\theta}}{2i} = \sin(\theta) \implies e ^ {i\theta} - e ^ {-i\theta} = 2i\sin(\theta)\] \[2i(e ^ {i\theta} - e ^ {-i\theta}) = 2i(2i)\sin(\theta) = -4\sin(\theta)\] \[2\cos(\theta) = e ^ {i\theta} + e ^ {-i\theta}\] \[-2(e ^ {i\theta} + e ^ {-i\theta}) = -2(2\cos(\theta)) = -4\cos(\theta)\] \[\left|\frac{2e ^ {i\theta} - 1}{ie ^ {i\theta} + 2}\right| ^ 2 = \frac{4 - 4\cos(\theta) + 1}{1 - 4\sin(\theta) + 4} = \frac{5 - 4\cos(\theta)}{5 - 4\sin(\theta)}\] \[\left|\frac{2e ^ {i\theta} - 1}{ie ^ {i\theta} + 2}\right| = \sqrt{\frac{5 - 4\cos(\theta)}{5 - 4\sin(\theta)}}\]

Linear algebra

Matrices

A matrix is a rectangular array of quantities: numbers, variables, or functions.

\[N = \begin{bmatrix} 1 & 2 & 3 \newline 4 & 5 & 6 \newline 7 & 8 & 9 \end{bmatrix}\] \[V = \begin{bmatrix} x \newline y \newline z \end{bmatrix}\] \[F = \begin{bmatrix} \exp(ix) & 0 & \sin(x) \newline \cos(x) & x^2 & \left(1 - x^2\right) \end{bmatrix}\]

Here, \(N\) is a \(3 \times 3\) square matrix, \(V\) is a \(3 \times 1\) matrix, and \(F\) is a \(2 \times 3\) matrix.

The elements of a matrix are designated \(M_{ij}\), where \(i\) is the row of the element and \(j\) is the column:

\[M = \begin{bmatrix} M_{11} & M_{12} & M_{13} & M_{14} \newline M_{21} & M_{22} & M_{23} & M_{24} \newline M_{31} & M_{32} & M_{33} & M_{34} \end{bmatrix} \]

The transpose of a matrix interchanges rows and columns, where \(M^T_{ij} = M_{ji}\).

\[N = \begin{bmatrix} 1 & 2 & 3 \newline 4 & 5 & 6 \newline 7 & 8 & 9 \end{bmatrix} \implies N^T = \begin{bmatrix} 1 & 4 & 7 \newline 2 & 5 & 8 \newline 3 & 6 & 9 \end{bmatrix}\]

If the number of columns in one matrix is equal to the number of rows in another, we can multiply the two matrices:

\[\begin{bmatrix} a & b & c \newline d & e & f \end{bmatrix} \begin{bmatrix} g & h \newline i & j \newline k & l \end{bmatrix} = \begin{bmatrix} ag + bi + ck & ah + bj + cl \newline dg + ei + fk & dh + ej + fl \end{bmatrix}\]
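The row-by-column rule above can be sketched in pure Python (a minimal, unoptimized implementation):

```python
# Row-by-column product: columns of A must match rows of B.
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    assert all(len(row) == inner for row in A)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]         # 3 x 2
print(matmul(A, B))    # [[58, 64], [139, 154]]
```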

Linear equations

A linear equation is one which can be written \(\sum_{k=1}^n a_kx_k = b\), where each \(a_k\) is a real constant and the \(x_k\) are variables. More simply, a linear equation in three variables can be written in the form \(ax + by + cz = d\), where \(a, b, c, d\) are real constants.

We may also define a system of linear equations, and represent it as a matrix equation:

\[M = \begin{bmatrix} 1 & 1 & 2 \newline 2 & 4 & -3 \newline 3 & 6 & -5 \end{bmatrix} \implies \begin{cases} x + y + 2z = 9 \newline 2x + 4y - 3z = 1 \newline 3x + 6y - 5z = 0 \end{cases} \implies M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix}9 \newline 1 \newline 0 \end{bmatrix} \]

To systematically solve a system of linear equations, we can repeat the following operations until an expression is found for each variable:

  1. Multiply an equation by a nonzero constant
  2. Interchange two equations
  3. Add a multiple of one equation to another

Note that any system of equations will either have no solution, one unique solution, or infinitely many solutions.

Row reduction

We may solve the system similarly using the augmented matrix of the system, which places the solution to the system as the final column of the matrix:

\[A = \begin{bmatrix} 1 & 1 & 2 & 9 \newline 2 & 4 & -3 & 1 \newline 3 & 6 & -5 & 0 \end{bmatrix}\]

The operations we can perform on the rows of the augmented matrix are analogous:

  1. Multiply a row by a nonzero constant
  2. Interchange two rows
  3. Add a multiple of one row to another

We can begin by subtracting twice the first row from the second row:

\[\begin{bmatrix} 1 & 1 & 2 & 9 \newline 0 & 2 & -7 & -17 \newline 3 & 6 & -5 & 0 \end{bmatrix}\]

Subtract thrice the first row from the third row:

\[\begin{bmatrix} 1 & 1 & 2 & 9 \newline 0 & 2 & -7 & -17 \newline 0 & 3 & -11 & -27 \end{bmatrix}\]

Divide the second row by \(2\):

\[\begin{bmatrix} 1 & 1 & 2 & 9 \newline 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \newline 0 & 3 & -11 & -27 \end{bmatrix}\]

Subtract thrice the second row from the third row:

\[\begin{bmatrix} 1 & 1 & 2 & 9 \newline 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \newline 0 & 0 & -\frac{1}{2} & -\frac{3}{2} \end{bmatrix}\]

Multiply the third row by \(-2\):

\[\begin{bmatrix} 1 & 1 & 2 & 9 \newline 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \newline 0 & 0 & 1 & 3 \end{bmatrix}\]

Subtract the second row from the first row:

\[\begin{bmatrix} 1 & 0 & \frac{11}{2} & \frac{35}{2} \newline 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \newline 0 & 0 & 1 & 3 \end{bmatrix}\]

Add \(\frac{7}{2}\) times the third row to the second row:

\[\begin{bmatrix} 1 & 0 & \frac{11}{2} & \frac{35}{2} \newline 0 & 1 & 0 & 2 \newline 0 & 0 & 1 & 3 \end{bmatrix}\]

Subtract \(\frac{11}{2}\) times the third row from the first row:

\[\begin{bmatrix} 1 & 0 & 0 & 1 \newline 0 & 1 & 0 & 2 \newline 0 & 0 & 1 & 3 \end{bmatrix}\]

We can now read the solution of the system directly from the augmented matrix:

\[\overbrace{\begin{bmatrix} 1 & 0 & 0 & 1 \newline 0 & 1 & 0 & 2 \newline 0 & 0 & 1 & 3 \end{bmatrix}}^A \implies \overbrace{\begin{bmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \end{bmatrix}}^M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} 1 \newline 2 \newline 3 \end{bmatrix} \implies \begin{cases} x = 1 \newline y = 2 \newline z = 3 \end{cases}\]
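Substituting the solution back into the original system confirms it (a Python sketch, with \(M\) and the right-hand side taken from above):

```python
# The system and its solution from the row reduction above.
M = [[1, 1, 2],
     [2, 4, -3],
     [3, 6, -5]]
b = [9, 1, 0]
solution = [1, 2, 3]  # x, y, z

residuals = [sum(m * s for m, s in zip(row, solution)) - rhs
             for row, rhs in zip(M, b)]
print(residuals)  # [0, 0, 0]
```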

The key to solving the system was reducing \(M\) to the identity matrix, its "diagonalized" (reduced row echelon) form.

The rank of a matrix is its number of nonzero rows after row reduction: here, \(A\) and \(M\) each have a rank of \(3\). For any system of linear equations, if the rank of \(A\), the rank of \(M\), and the number of variables are equal to one another, there is one unique solution, and \(M\) can be diagonalized as shown above.

We can define a system of equations for which no solution exists:

\[\overbrace{\begin{bmatrix} 1 & 2 & 2 & 2 \newline 0 & 1 & 3 & 3 \newline 0 & 0 & 0 & 1 \end{bmatrix}}^A \implies \overbrace{\begin{bmatrix} 1 & 2 & 2 \newline 0 & 1 & 3 \newline 0 & 0 & 0 \end{bmatrix}}^M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} 2 \newline 3 \newline 1 \end{bmatrix} \implies \begin{cases} x + 2y + 2z = 2 \newline y + 3z = 3 \newline 0 = 1 \end{cases}\]

For any system of linear equations, if the rank of \(A\) is greater than that of \(M\), no solution exists.

Finally, we can define a system for which an infinite number of solutions exists:

\[\overbrace{\begin{bmatrix} 1 & 0 & -4 & -4 \newline 0 & 1 & 3 & 3 \newline 0 & 0 & 0 & 0 \end{bmatrix}}^A \implies \overbrace{\begin{bmatrix} 1 & 0 & -4 \newline 0 & 1 & 3 \newline 0 & 0 & 0 \end{bmatrix}}^M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} -4 \newline 3 \newline 0 \end{bmatrix} \implies \begin{cases} x - 4z = -4 \newline y + 3z = 3 \newline 0 = 0 \end{cases}\]

For any system of linear equations, if the rank of \(A\) equals the rank of \(M\) but is less than the number of variables, there are infinitely many solutions.

Determinants

A square matrix can be characterized by its determinant:

\[A = \begin{bmatrix} a & b \newline c & d \end{bmatrix} \implies \det(A) = \begin{vmatrix} a & b \newline c & d \end{vmatrix} = ad - bc\] \[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \newline a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \newline a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \newline \vdots & \vdots & \vdots & \ddots & \vdots \newline a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix} \implies \det(A) = \sum (-1)^{i + j}a_{ij}M_{ij}\]

Take one row or column and sum over its elements; \(M_{ij}\) is the minor: the determinant of the matrix remaining when row \(i\) and column \(j\) are removed from matrix \(A\).

\[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \newline a_{21} & a_{22} & a_{23} \newline a_{31} & a_{32} & a_{33} \end{bmatrix} \implies \det(A) = a_{11} \begin{vmatrix} a_{22} & a_{23} \newline a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \newline a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \newline a_{31} & a_{32} \end{vmatrix}\]
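The cofactor expansion can be transcribed directly into a short recursive function (a Python sketch, practical only for small matrices):

```python
# Recursive cofactor (Laplace) expansion along the first row.
def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1)**j * A[0][j] * det([row[:j] + row[j+1:] for row in A[1:]])
               for j in range(n))

print(det([[1, 2], [3, 4]]))                     # -2
print(det([[1, 1, 2], [2, 4, -3], [3, 6, -5]]))  # -1
```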

Example: solving for chemical properties

We can use determinants to solve systems of linear equations for chemical properties. These problems can be reduced to a system of homogeneous linear equations:

\[\overbrace{\begin{bmatrix} 1 & 1 & 2 & 0 \newline 2 & 4 & -3 & 0 \newline 3 & 6 & -5 & 0 \end{bmatrix}}^A \implies \overbrace{\begin{bmatrix} 1 & 1 & 2 \newline 2 & 4 & -3 \newline 3 & 6 & -5 \end{bmatrix}}^M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} x + y + 2z = 0 \newline 2x + 4y -3z = 0 \newline 3x + 6y - 5z = 0 \end{cases}\]

Such a system will always have the trivial solution \(x = y = z = 0\), and will have a nontrivial solution only if \(\det(M) = 0\).

As an example, we can solve for values of \(E\), energy, for which \(\det(M) = 0 \):

\[\begin{vmatrix} H_{11} - ES_{11} & H_{12} - ES_{12} & H_{13} - ES_{13} \newline H_{21} - ES_{21} & H_{22} - ES_{22} & H_{23} - ES_{23} \newline H_{31} - ES_{31} & H_{32} - ES_{32} & H_{33} - ES_{33} \end{vmatrix} = 0\]

The above is step three of a simplified calculation for the properties of the molecular orbitals of molecular nitrogen:

  1. Derive the solution from a partial differential equation
  2. Evaluate the multivariable integrals required for the elements \(H_{ij}, S_{ij}\)
  3. Expand the determinant and solve for the values of \(E\) which set the equation equal to \(0\)
  4. Calculate the gradient to find the minimum geometry

For a square matrix (the only type for which the \(\det\) is defined), the following properties hold:

  1. Multiply a row by a nonzero constant \(\implies\) the \(\det\) is multiplied by the constant
  2. Interchange two rows \(\implies\) the \(\det\) changes sign
  3. Add a multiple of one row to another \(\implies\) no change in the \(\det\)
  4. The value of \(\det\) is \(0\) if a row or column is all \(0\)

Determinants and row reduction (Laplace development)

We begin with a \(4 \times 4\) matrix:

\[\begin{vmatrix} 1 & 1 & 1 & 1 \newline 1 & 2 & 3 & 4 \newline 1 & 3 & 6 & 10 \newline 1 & 4 & 10 & 20 \end{vmatrix}\]

Subtract the first row from the second, third, and fourth rows:

\[\begin{vmatrix} 1 & 1 & 1 & 1 \newline 0 & 1 & 2 & 3 \newline 0 & 2 & 5 & 9 \newline 0 & 3 & 9 & 19 \end{vmatrix}\]

Subtract twice the second row from the third row:

\[\begin{vmatrix} 1 & 1 & 1 & 1 \newline 0 & 1 & 2 & 3 \newline 0 & 0 & 1 & 3 \newline 0 & 3 & 9 & 19 \end{vmatrix}\]

Subtract thrice the second row from the fourth row:

\[\begin{vmatrix} 1 & 1 & 1 & 1 \newline 0 & 1 & 2 & 3 \newline 0 & 0 & 1 & 3 \newline 0 & 0 & 3 & 10 \end{vmatrix}\]

Subtract thrice the third row from the fourth row:

\[\begin{vmatrix} 1 & 1 & 1 & 1 \newline 0 & 1 & 2 & 3 \newline 0 & 0 & 1 & 3 \newline 0 & 0 & 0 & 1 \end{vmatrix}\]

Taking the determinant using the first column allows a simple expression:

\[\det(A) = \begin{vmatrix} 1 & 2 & 3 \newline 0 & 1 & 3 \newline 0 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 3 \newline 0 & 1 \end{vmatrix} = 1\]

For any triangular matrix \(M\) (one with nonzero elements only on and above, or only on and below, the diagonal), \(\det(M)\) equals the product of its diagonal elements.

Vectors

A vector is an ordered, one-dimensional array capable of representing both magnitude and direction. A common example of a vector is velocity, \(\vec{v}\). In contrast, speed lacks direction and is therefore not a vector, but a scalar.

The coordinate representation of a vector can be written as a row or column vector with elements \(A_x, A_y, A_z\):

\[\begin{bmatrix} A_x & A_y & A_z \end{bmatrix} \lor \begin{bmatrix} A_x \newline A_y \newline A_z \end{bmatrix}\]

A vector can also be written in terms of the unit vectors \(\hat{i}, \hat{j}, \hat{k}\) (pointing in the \(x, y, z\) directions, respectively):

\[\vec{A} = A_x\hat{i} + A_y\hat{j} + A_z\hat{k}\]

The length (or magnitude or norm) of \(\vec{A}\) is given by:

\[|\vec{A}| = \|\vec{A}\| = \sqrt{A_x^2 + A_y^2 + A_z^2}\]

The more general expression for a vector of any size, \(n\), is:

\[\vec{A} = \begin{bmatrix} A_1 \newline A_2 \newline A_3 \newline \vdots \newline A_n \end{bmatrix} \implies \|\vec{A}\| = \sqrt{A_1^2 + A_2^2 + A_3^2 + \cdots + A_n^2} = \sqrt{\sum_{i = 1}^n A_i^2}\]

Vector addition

Vector addition is an element-by-element addition:

\[\vec{A} + \vec{B} = (A_x + B_x)\hat{i} + (A_y + B_y)\hat{j} + (A_z + B_z)\hat{k}\]

Vector addition is both associative and commutative (as the elements being added are simply scalars):

\[\vec{A} + \vec{B} = \vec{B} + \vec{A}\] \[(\vec{A} + \vec{B}) + \vec{C} = \vec{A} + (\vec{B} + \vec{C})\]

For any number of components, in row or column form:

\[\begin{bmatrix} A_1 \newline A_2 \newline A_3 \newline \vdots \newline A_n \end{bmatrix} + \begin{bmatrix} B_1 \newline B_2 \newline B_3 \newline \vdots \newline B_n \end{bmatrix} = \begin{bmatrix} A_1 + B_1 \newline A_2 + B_2 \newline A_3 + B_3 \newline \vdots \newline A_n + B_n \end{bmatrix}\]

Vector products

The dot product (or scalar product) of two vectors is the sum of element-by-element multiplication:

\[\vec{A} \cdot \vec{B} = A_xB_x + A_yB_y + A_zB_z\]

Let \(\theta\) be the angle between \(\vec{A}\) and \(\vec{B}\), then:

\[\vec{A} \cdot \vec{B} = |\vec{A}||\vec{B}|\cos(\theta)\] \[\theta = \cos^{-1}\left(\frac{\vec{A} \cdot \vec{B}}{|\vec{A}||\vec{B}|}\right)\]

The dot product can be thought of as the projection of \(\vec{A}\) onto \(\vec{B}\), multiplied by the magnitude of \(\vec{B}\). This operation is commutative:

\[\vec{A} \cdot \vec{B} = \vec{B} \cdot \vec{A}\]

We may generalize the dot product for any number of dimensions:

\[\vec{A} \cdot \vec{B} = \sum_{i=1}^n A_iB_i\]

The dot product offers a concise formula for the magnitude of a vector:

\[|\vec{A}| = \sqrt{\vec{A} \cdot \vec{A}} = \sqrt{\sum_{i = 1}^n A_i^2}\]

For \(|\vec{A}| \ne 0 \land |\vec{B}| \ne 0 \land \vec{A} \cdot \vec{B} = 0\), the vectors are said to be orthogonal, and to have no overlap.
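A minimal Python sketch of the dot product and the angle formula, using hypothetical vectors \(\vec{A}\) and \(\vec{B}\):

```python
import math

# Dot product and the angle between two hypothetical vectors.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

A = (1, 2, 2)
B = (2, 1, 2)
cos_theta = dot(A, B) / (math.sqrt(dot(A, A)) * math.sqrt(dot(B, B)))
print(dot(A, B), math.degrees(math.acos(cos_theta)))
```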

Cross products

The cross product is a binary operation on two vectors in three-dimensional space which results in a third vector perpendicular to both vectors, and therefore to the plane they span.

Let \(\theta\) be the angle between \(\vec{A}\) and \(\vec{B}\), and let \(\hat{n}\) be the unit vector perpendicular to both, oriented by the right-hand rule, then:

\[\vec{C} = \vec{A} \times \vec{B} = |\vec{A}||\vec{B}|\sin(\theta)\hat{n}\]

The magnitude of the cross product is given by:

\[|\vec{C}| = |\vec{A} \times \vec{B}| = |\vec{A}||\vec{B}||\sin(\theta)|\]

We can calculate the cross product via a determinant:

\[\vec{C} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \newline A_x & A_y & A_z \newline B_x & B_y & B_z \end{vmatrix}\] \[\vec{C} = (A_yB_z - A_zB_y)\hat{i} + (A_zB_x - A_xB_z)\hat{j} + (A_xB_y - A_yB_x)\hat{k}\]
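The component formula can be sketched in Python; the unit vectors give a quick check, since \(\hat{i} \times \hat{j} = \hat{k}\):

```python
# Component form of the cross product from the determinant above.
def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay*bz - az*by, az*bx - ax*bz, ax*by - ay*bx)

i_hat, j_hat = (1, 0, 0), (0, 1, 0)
print(cross(i_hat, j_hat))  # (0, 0, 1), i.e. k-hat
print(cross(j_hat, i_hat))  # (0, 0, -1): anticommutative
```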

Reversing the order of the vectors in a cross product flips the direction of the resulting vector; the cross product is therefore not commutative but anticommutative:

\[\vec{A} \times \vec{B} = -\vec{B} \times \vec{A}\]

Equations of lines

A point and a slope define a line; in two dimensions, this is given by:

\[\frac{y - y_0}{x - x_0} = m\] \[y = mx - mx_0 + y_0\] \[y = mx + \underbrace{(y_0 - mx_0)}_{\text{intercept}}\]

A point \((x_0, y_0)\) may also define a vector:

\[\vec{r}_0 = x_0\hat{i} + y_0\hat{j}\] \[\vec{r} = x\hat{i} + y\hat{j}\] \[\vec{r} - \vec{r}_0 = (x - x_0)\hat{i} + (y - y_0)\hat{j}\]

Let \(L\) be the line plotted by the relation between \(x\) and \(y\); then \(\vec{r} - \vec{r}_0\) lies along \(L\) whenever the point \((x, y)\) lies on \(L\).

Let \(\vec{A}\) begin at the origin and extend parallel to \(L\):

\[\vec{A} = a\hat{i} + b\hat{j} \implies m = \frac{b}{a} \implies \frac{y - y_0}{x - x_0} = \frac{b}{a}\] \[a, b \ne 0 \implies \frac{x - x_0}{a} = \frac{y - y_0}{b}\]

That the above expressions are proportional is the constraint which defines \(L\).

Let \(t\) be a scalar proportionality constant, then we can express \(L\) parametrically:

\[\vec{r} - \vec{r}_0 = \vec{A}t \iff \vec{r} = \vec{r}_0 + \vec{A}t\]

Generalizing the above equations to three dimensions, we can begin by taking a point along a line, \(L\), in three-dimensional space:

\[P = (x_0, y_0, z_0)\] \[\vec{r}_0 = x_0\hat{i} + y_0\hat{j} + z_0\hat{k}\] \[\vec{r} = x\hat{i} + y\hat{j} + z\hat{k}\] \[\vec{r} - \vec{r}_0 = (x - x_0)\hat{i} + (y - y_0)\hat{j} + (z - z_0)\hat{k}\] \[\vec{A} = a\hat{i} + b\hat{j} + c\hat{k}\] \[a, b, c \ne 0 \implies \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}\] \[\vec{r} - \vec{r}_0 = \vec{A}t \iff \vec{r} = \vec{r}_0 + \vec{A}t\] \[\vec{r} = (x_0 + at)\hat{i} + (y_0 + bt)\hat{j} + (z_0 + ct)\hat{k}\]

If any component of \(\vec{A}\) is \(0\), the corresponding coordinate is constant along \(L\), and the symmetric equations are written in terms of the nonzero components only.
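A quick numerical check that the parametric form satisfies the symmetric equations (NumPy sketch; the point \(\vec{r}_0\) and direction \(\vec{A}\) are arbitrary example values):

```python
import numpy as np

# Arbitrary example point and direction vector (all components nonzero)
r0 = np.array([1.0, 2.0, 3.0])
A = np.array([2.0, -1.0, 4.0])

for t in (-1.0, 0.5, 3.0):
    r = r0 + A * t
    # Symmetric form: (x - x0)/a = (y - y0)/b = (z - z0)/c = t
    ratios = (r - r0) / A
    assert np.allclose(ratios, t)
```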

Equation for a plane

To define a plane, we require a point in the plane and a vector normal to the plane.

Let \(\vec{r} - \vec{r}_0\) be a vector lying in the plane and \(\vec{N}\) be the vector normal to the plane, then:

\[\vec{r} - \vec{r}_0 = (x - x_0)\hat{i} + (y - y_0)\hat{j} + (z - z_0)\hat{k}\] \[\vec{N} = a\hat{i} + b\hat{j} + c\hat{k}\]

As the vectors are orthogonal:

\[\vec{N} \cdot (\vec{r} - \vec{r}_0) = 0\]

We can expand the above equation to obtain an equation for the plane:

\[a(x - x_0) + b(y - y_0) + c(z - z_0) = 0\] \[ax + by + cz = \underbrace{ax_0 + by_0 + cz_0}_{\text{constant}}\] \[ax + by + cz = d\]

We can obtain a unit vector normal to the plane, \(\hat{n}\):

\[\hat{n} = \frac{\vec{N}}{|\vec{N}|}\]
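The construction above can be checked numerically (NumPy sketch; the point and normal are arbitrary example values):

```python
import numpy as np

# Arbitrary example point in the plane and normal vector
r0 = np.array([1.0, 2.0, 3.0])
N = np.array([2.0, -1.0, 4.0])

# The plane ax + by + cz = d has d = N . r0
d = N @ r0

# Unit normal n = N / |N|
n_hat = N / np.linalg.norm(N)
assert np.isclose(np.linalg.norm(n_hat), 1)

# Any r with N . (r - r0) = 0 lies in the plane; construct one by
# moving from r0 along a direction perpendicular to N
v = np.cross(N, np.array([1.0, 0.0, 0.0]))  # perpendicular to N
r = r0 + 2.5 * v
assert np.isclose(N @ r, d)  # r satisfies the plane equation
```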

Matrices and matrix operations

The helix progression of biomolecules such as DNA is written as a screw operator in matrix notation:

\[\begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix}\]

The vector for residue \(i\) is converted into the vector for residue \(i + 1\), allowing us to describe the progression of the helix:

\[\begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} \to \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix}\]

Operations

Matrix equality and addition are defined only for two matrices of equal dimensions:

\[A = \begin{bmatrix} a_{11} & a_{12} \newline a_{21} & a_{22} \end{bmatrix} \land B = \begin{bmatrix} 1 & -1 \newline 2 & -3 \end{bmatrix}\] \[A = B \implies\begin{cases} a_{11} = 1 \newline a_{12} = -1 \newline a_{21} = 2 \newline a_{22} = -3 \end{cases}\] \[A + B = \begin{bmatrix} a_{11} + 1 & a_{12} - 1 \newline a_{21} + 2 & a_{22} - 3 \end{bmatrix}\]

Matrices can be multiplied by a scalar:

\[k \begin{bmatrix} a_1 \newline a_2 \end{bmatrix} = \begin{bmatrix} ka_1 \newline ka_2 \end{bmatrix}\] \[kA = \begin{bmatrix} ka_{11} & ka_{12} \newline ka_{21} & ka_{22} \end{bmatrix}\]

Let \(M\) be a \(2 \times 2\) matrix, then:

\[\det(M) = a_{11}a_{22} - a_{12}a_{21}\] \[\det(kM) = ka_{11}ka_{22} - ka_{12}ka_{21}\] \[\det(kM) = k^2 \det(M)\]

The example above expresses \(n = 2\), but the rule \(\det(kM) = k^n \det(M)\) holds for any \(n \times n\) matrix.

Matrix multiplication is defined only if the number of columns of the first matrix is equal to the number of rows of the second matrix:

\[AB = \begin{bmatrix} \begin{bmatrix} a_{11} & a_{12}\end{bmatrix} \cdot \begin{bmatrix} 1 \newline 2 \end{bmatrix} & \begin{bmatrix} a_{11} & a_{12}\end{bmatrix} \cdot \begin{bmatrix} -1 \newline -3 \end{bmatrix} \newline \begin{bmatrix} a_{21} & a_{22}\end{bmatrix} \cdot \begin{bmatrix} 1 \newline 2 \end{bmatrix} & \begin{bmatrix} a_{21} & a_{22}\end{bmatrix} \cdot \begin{bmatrix} -1 \newline -3 \end{bmatrix} \end{bmatrix}\] \[AB = \begin{bmatrix} a_{11} + 2a_{12} & -a_{11} - 3a_{12} \newline a_{21} + 2a_{22} & -a_{21} - 3a_{22} \end{bmatrix}\]

The screw matrix operator can be expanded:

\[\begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix}\] \[\begin{bmatrix} x_i\cos(2\pi/c) - y_i\sin(2\pi/c) \newline x_i\sin(2\pi/c) + y_i\cos(2\pi/c) \newline z_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix}\] \[\begin{bmatrix} x_i\cos(2\pi/c) - y_i\sin(2\pi/c) \newline x_i\sin(2\pi/c) + y_i\cos(2\pi/c) \newline z_i + P/c \end{bmatrix} = \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix} \implies \begin{cases} x_{i + 1} = x_i\cos(2\pi/c) - y_i\sin(2\pi/c) \newline y_{i + 1} = x_i\sin(2\pi/c) + y_i\cos(2\pi/c) \newline z_{i + 1} = z_i + P/c \end{cases}\]
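The expanded screw operation can be applied iteratively; a NumPy sketch follows (the values \(c = 10\) residues per turn and pitch \(P = 34\) are illustrative, loosely B-DNA-like, and are not taken from the notes):

```python
import numpy as np

# Screw operation: rotate by 2*pi/c about z, then translate by P/c along z.
# c and P below are illustrative example values.
c, P = 10, 34.0
theta = 2 * np.pi / c
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
t = np.array([0, 0, P / c])

r = np.array([1.0, 0.0, 0.0])   # residue 0
for _ in range(c):              # apply the screw operator c times
    r = R @ r + t

# After c steps we have rotated one full turn: same (x, y),
# raised by one full pitch P in z
assert np.allclose(r, [1.0, 0.0, P])
```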

A vector is nothing more than a \(1 \times n\), or \(n \times 1\), matrix; we can therefore generalize the dot (or scalar, or inner) product to the multiplication of these matrices:

\[\vec{V} \cdot \vec{W} = \begin{bmatrix} V_1 & V_2 & V_3 \end{bmatrix} \begin{bmatrix} W_1 \newline W_2 \newline W_3 \end{bmatrix} = V_1W_1 + V_2W_2 + V_3W_3\]

The tensor (or outer) product, not to be confused with the vector cross product above, is simply the multiplication of these matrices in reversed order:

\[\vec{W} \otimes \vec{V} = \begin{bmatrix} W_1 \newline W_2 \newline W_3 \end{bmatrix} \begin{bmatrix} V_1 & V_2 & V_3 \end{bmatrix} = \begin{bmatrix} W_1V_1 & W_1V_2 & W_1V_3 \newline W_2V_1 & W_2V_2 & W_2V_3 \newline W_3V_1 & W_3V_2 & W_3V_3 \end{bmatrix}\]

As demonstrated above, matrix multiplication is not commutative for non-square matrices. Neither must square matrices commute:

\[C = \begin{bmatrix} 1 & 0 \newline 0 & -1 \end{bmatrix} \land D = \begin{bmatrix} 0 & 1 \newline -1 & 0 \end{bmatrix}\] \[CD = \begin{bmatrix} 0 & 1 \newline 1 & 0 \end{bmatrix}\] \[DC = \begin{bmatrix} 0 & -1 \newline -1 & 0 \end{bmatrix}\]

The commutator operation gives the difference between the products of two matrices:

\[[C, D] = -[D, C] = CD - DC = \begin{bmatrix} 0 & 1 \newline 1 & 0 \end{bmatrix} - \begin{bmatrix} 0 & -1 \newline -1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 2 \newline 2 & 0 \end{bmatrix}\]

If two matrices commute, their commutator is \(0\).
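A short NumPy sketch of the commutator, using the same two matrices:

```python
import numpy as np

C = np.array([[1, 0], [0, -1]])
D = np.array([[0, 1], [-1, 0]])

def commutator(A, B):
    """[A, B] = AB - BA; equals the null matrix iff A and B commute."""
    return A @ B - B @ A

comm = commutator(C, D)
assert np.array_equal(comm, [[0, 2], [2, 0]])              # C and D do not commute
assert np.array_equal(commutator(C, C), np.zeros((2, 2)))  # any matrix commutes with itself
```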

Some named matrices

The null matrix, or zero matrix, has all elements equal to \(0\):

\[\begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix}\] \[\begin{bmatrix} 0 & 0 & \cdots & 0 \newline 0 & 0 & \cdots & 0 \newline \vdots & \vdots & \ddots & \vdots \newline 0 & 0 & \cdots & 0 \end{bmatrix}\]

The product of a null matrix and another matrix is the null matrix:

\[\begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix} \begin{bmatrix} a & b \newline c & d \end{bmatrix} = \begin{bmatrix} a & b \newline c & d \end{bmatrix} \begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix}\]

In this sense, the null matrix behaves as \(0\) does in scalar multiplication. The behavior is not always similar, however:

\[A = \begin{bmatrix} 1 & 2 \newline 3 & 6 \end{bmatrix} \land B = \begin{bmatrix} 10 & 4 \newline -5 & -2 \end{bmatrix}\] \[AB = \begin{bmatrix} 10 - 10 & 4 - 4 \newline 30 - 30 & 12 - 12 \end{bmatrix} = \begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix}\] \[\neg(AB = 0 \implies A = 0 \lor B = 0)\]
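The product above can be verified numerically (NumPy sketch):

```python
import numpy as np

A = np.array([[1, 2], [3, 6]])
B = np.array([[10, 4], [-5, -2]])

AB = A @ B
# AB is the null matrix even though neither A nor B is
assert np.array_equal(AB, np.zeros((2, 2)))
assert A.any() and B.any()  # both factors are nonzero
```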

The identity matrix is a square matrix with all diagonal elements equal to \(1\) and all other elements equal to \(0\):

\[\begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix}\] \[\begin{bmatrix} 1 & 0 & \cdots & 0 \newline 0 & 1 & \cdots & 0 \newline \vdots & \vdots & \ddots & \vdots \newline 0 & 0 & \cdots & 1 \end{bmatrix}\]

The product of an identity matrix and another matrix is that other matrix itself:

\[\begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \newline c & d \end{bmatrix} = \begin{bmatrix} a & b \newline c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \newline c & d \end{bmatrix} \iff AI = IA = A\]

In this sense, the identity matrix behaves as \(1\) does in scalar multiplication. As with the null matrix, however, the behavior is not always similar; consider the product of the two matrices below, calling them \(A\) and \(B\):

\[\begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \newline \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} \newline -\frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} = \begin{bmatrix} \frac{3}{4} + \frac{1}{4} & \frac{\sqrt{3}}{4} - \frac{\sqrt{3}}{4} \newline \frac{\sqrt{3}}{4} - \frac{\sqrt{3}}{4} & \frac{1}{4} + \frac{3}{4} \end{bmatrix} = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} \iff AB = I\]

In scalar multiplication, if \(ab = 1\), then \(a\) and \(b\) are inverses of each other:

\[a = \frac{1}{b} \land b = \frac{1}{a}\]

The same is true of matrices: the inverse of a matrix \(A\), written \(A^{-1}\), is the matrix such that \(AA^{-1} = I\).

Inverse matrices commute; \(AA^{-1} = A^{-1}A = I\):

\[\begin{bmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} \newline -\frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \newline \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} = \begin{bmatrix} \frac{3}{4} + \frac{1}{4} & -\frac{\sqrt{3}}{4} + \frac{\sqrt{3}}{4} \newline -\frac{\sqrt{3}}{4} + \frac{\sqrt{3}}{4} & \frac{1}{4} + \frac{3}{4} \end{bmatrix} = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} \iff BA = I\]

For square matrices, \(AB = I \implies BA = I\).

We can calculate the commutator of \(A\) and \(B\):

\[[A, B] = AB - BA = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \newline 0 & 0 \end{bmatrix}\]

When matrices commute, the commutator is the null matrix.

The rotational block of the screw operator is a rotation matrix, of the kind used in chemistry to describe rotations of molecules such as benzene:

\[\begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} \implies A = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \newline \sin(\theta) & \cos(\theta) \end{bmatrix}\]

Multiplying the rotation matrix by its counterpart with the sign of \(\theta\) flipped gives the identity matrix; \(R(-\theta)\) is therefore the inverse of \(R(\theta)\):

\[\begin{bmatrix} \cos(\theta) & -\sin(\theta) \newline \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \newline \sin(-\theta) & \cos(-\theta) \end{bmatrix} = \begin{bmatrix} 1 & 0 \newline 0 & 1 \end{bmatrix}\]

Inverting a matrix

A matrix which has an inverse is invertible; a matrix which is not invertible is singular. Only square matrices are invertible.

There is a formula to calculate the inverse of a matrix, involving the determinant:

\[\det(A) = \sum_j a_{ij}(-1)^{i + j}M_{ij}\] \[C_{ij} = (-1)^{i + j}M_{ij} \implies \det(A) = \sum_j a_{ij} C_{ij}\]

Here, \(M_{ij}\) is the minor: the determinant of the submatrix formed by deleting row \(i\) and column \(j\). The expansion may be taken along any row \(i\) (or, analogously, any column \(j\)).

Here, \(C\) is called the cofactor matrix. We can use it to invert a matrix:

\[A^{-1} = \frac{1}{|A|}C^T = \frac{1}{\det(A)}C^T\]

\[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \newline a_{21} & a_{22} & a_{23} \newline a_{31} & a_{32} & a_{33} \end{bmatrix}\] \[C = \begin{bmatrix} (-1)^{1 + 1}\begin{vmatrix} a_{22} & a_{23} \newline a_{32} & a_{33} \end{vmatrix} & (-1)^{1 + 2}\begin{vmatrix} a_{21} & a_{23} \newline a_{31} & a_{33} \end{vmatrix} & (-1)^{1 + 3}\begin{vmatrix} a_{21} & a_{22} \newline a_{31} & a_{32} \end{vmatrix} \newline (-1)^{2 + 1}\begin{vmatrix} a_{12} & a_{13} \newline a_{32} & a_{33} \end{vmatrix} & (-1)^{2 + 2}\begin{vmatrix} a_{11} & a_{13} \newline a_{31} & a_{33} \end{vmatrix} & (-1)^{2 + 3}\begin{vmatrix} a_{11} & a_{12} \newline a_{31} & a_{32} \end{vmatrix} \newline (-1)^{3 + 1}\begin{vmatrix} a_{12} & a_{13} \newline a_{22} & a_{23} \end{vmatrix} & (-1)^{3 + 2}\begin{vmatrix} a_{11} & a_{13} \newline a_{21} & a_{23} \end{vmatrix} & (-1)^{3 + 3}\begin{vmatrix} a_{11} & a_{12} \newline a_{21} & a_{22} \end{vmatrix} \end{bmatrix}\]

We can find whether a matrix \(A\) is invertible:

\[A = \begin{bmatrix} 1 & 2 \newline 3 & 6 \end{bmatrix}\] \[\det(A) = 1 \times 6 - 2 \times 3 = 0\]

As the determinant is \(0\), the matrix is singular. Indeed, a singular matrix can be defined as a matrix whose determinant equals \(0\).

\[A = \begin{bmatrix} 7 & 6 \newline 2 & 3 \end{bmatrix}\] \[\det(A) = 7 \times 3 - 6 \times 2 = 9\]

As the determinant is not \(0\), the matrix is invertible. We can find the inverse:

\[C = \begin{bmatrix} 3 & -2 \newline -6 & 7 \end{bmatrix}\] \[C^T = \begin{bmatrix} 3 & -6 \newline -2 & 7 \end{bmatrix}\] \[A^{-1} = \frac{1}{9}\begin{bmatrix} 3 & -6 \newline -2 & 7 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & -\frac{2}{3} \newline -\frac{2}{9} & \frac{7}{9} \end{bmatrix}\]
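The cofactor-transpose formula can be implemented directly; a NumPy sketch (the helper name `inverse_via_cofactors` is hypothetical):

```python
import numpy as np

def inverse_via_cofactors(A):
    """Invert a square matrix via A^{-1} = C^T / det(A),
    where C_ij = (-1)^(i+j) * M_ij and M_ij is the minor."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # Minor: delete row i and column j, then take the determinant
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T / np.linalg.det(A)

A = np.array([[7.0, 6.0], [2.0, 3.0]])
A_inv = inverse_via_cofactors(A)
assert np.allclose(A_inv, np.array([[3.0, -6.0], [-2.0, 7.0]]) / 9)
assert np.allclose(A @ A_inv, np.eye(2))
```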

Matrix definitions, formulas, and facts

We can take the transpose of a matrix:

\[A^T_{ij} = A_{ji}\] \[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \newline a_{21} & a_{22} & a_{23} \newline a_{31} & a_{32} & a_{33} \end{bmatrix} \implies A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \newline a_{12} & a_{22} & a_{32} \newline a_{13} & a_{23} & a_{33} \end{bmatrix}\]

Using index notation for matrix multiplication, \((AB)_{ij} = \sum_k A_{ik}B_{kj}\) (where \(k\) is the summation index), we can show that transposition reverses the order of a product:

\[(AB)^T_{ji} = (AB)_{ij} = \sum_k A_{ik}B_{kj} = \sum_k B^T_{jk}A^T_{ki} = (B^TA^T)_{ji}\] \[(AB)^T = B^TA^T\]

The above equality can be generalized to any number of matrices:

\[(ABCD)^T = D^TC^TB^TA^T\]

The \(\text{Kronecker } \delta\) function gives a definition for the identity matrix:

\[I_{ij} = \delta_{ij} = \text{Kronecker } \delta = \begin{cases} 1 \text{ if } i = j \newline 0 \text{ if } i \ne j \end{cases}\]

The inverse of a matrix \(A\) is defined by \(AA^{-1} = I\). An invertible matrix has exactly one inverse: if \(B\) and \(C\) are both inverses of \(A\), then

\[B = BI = B(AC) = (BA)C = IC = C\]

If \(A\) and \(B\) are invertible and equal in size, then \(AB\) is also invertible, and \((AB)^{-1} = B^{-1}A^{-1}\).

\[(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I\] \[(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I\]

We can generalize this property to any number of matrices:

\[(ABCD)^{-1} = D^{-1}C^{-1}B^{-1}A^{-1}\]

If \(A\) is invertible, then \(kA\) is also invertible, and \((kA)^{-1} = \frac{1}{k}A^{-1}\).

\[kA\frac{1}{k}A^{-1} \overset{?}{=} I\]

As the scalar multiplication of matrices is commutative:

\[\overbrace{\frac{k}{k}}^{1} \overbrace{AA^{-1}}^{I} \overset{?}{=} I\] \[I = I\]

A matrix is symmetric if \(A_{ij} = A_{ji}\) (or, equally, if \(A = A^T\)).

\[A = \begin{bmatrix} a & b & c \newline b & a & d \newline c & d & a \end{bmatrix} = A^T\]

A real matrix is orthogonal if \(A^{-1} = A^T\).

A matrix is skew-symmetric (or antisymmetric) if \(A_{ij} = -A_{ji}\) (or, equally, if \(A = -A^T\)). This implies that the diagonal elements of \(A\) are each equal to their respective opposites; all diagonal elements of \(A\) must therefore equal \(0\):

\[A = \begin{bmatrix} 0 & b & c \newline -b & 0 & d \newline -c & -d & 0 \end{bmatrix} = -A^T\]

For any square matrix \(A\), \(A + A^T\) is symmetric, and \(A - A^T\) is antisymmetric.

The trace of a matrix is the sum of its diagonal elements.

\[\text{Tr}(A) = \sum_{i = 1}^n A_{ii}\] \[\text{Tr}(I) = n\]

Linear combinations and basis sets

Two vectors, \(\vec{A}\) and \(\vec{B}\), are collinear if \(\vec{A} = \alpha\vec{B}\) for some scalar \(\alpha\).

Any two non-collinear vectors with the same origin, \(\vec{A}\) and \(\vec{B}\), define a plane. \(\vec{A}\) and \(\vec{B}\) are linearly independent and span the plane; they comprise a basis (or basis set) for the two-dimensional space of the plane.

Any vector in the plane, \(\vec{C}\), can be written as a linear combination of \(\vec{A}\) and \(\vec{B}\):

\[\vec{C} = \alpha\vec{A} + \beta\vec{B}\]

\(\vec{A}\), \(\vec{B}\), and \(\vec{C}\) are linearly dependent:

\[\begin{cases} \alpha\vec{A} + \beta\vec{B} + \gamma\vec{C} = 0 \newline \alpha\vec{A} + \beta\vec{B} + (-1)\vec{C} = 0 \end{cases}\]

There exists a nontrivial linear combination of the vectors \(\vec{A}\), \(\vec{B}\), and \(\vec{C}\) which equals \(0\).

We can calculate the properties of \(\vec{C}\) using the properties of the basis set, \(\vec{A}\) and \(\vec{B}\):

\[|\vec{C}|^2 = \vec{C} \cdot \vec{C} = (\alpha\vec{A} + \beta\vec{B}) \cdot (\alpha\vec{A} + \beta\vec{B}) = \alpha^2(\vec{A} \cdot \vec{A}) + 2\alpha\beta(\vec{A} \cdot \vec{B}) + \beta^2(\vec{B} \cdot \vec{B})\] \[|\vec{C}|^2 = \alpha^2 |\vec{A}|^2 + 2\alpha\beta |\vec{A}||\vec{B}|\cos(\theta_{AB}) + \beta^2|\vec{B}|^2\]

Calculations of this manner can be cumbersome; we therefore prefer to use an orthonormal basis to define the space of a plane.

Consider the \(xy\)-plane: the unit vectors \(\hat{i}\) and \(\hat{j}\) form an orthonormal basis set. "Ortho-" represents the orthogonality of \(\hat{i}\) and \(\hat{j}\) (\(\hat{i} \cdot \hat{j} = 0\)), and "-normal" represents the normalization of \(\hat{i}\) and \(\hat{j}\) (\(|\hat{i}| = |\hat{j}| = 1\)). The vectors are linearly independent and span the plane; any vector in the plane, \(\vec{C}\), can therefore be written as a linear combination of \(\hat{i}\) and \(\hat{j}\):

\[\vec{C} = c_x\hat{i} + c_y\hat{j}\] \[|\vec{C}|^2 = \vec{C} \cdot \vec{C} = (c_x\hat{i} + c_y\hat{j}) \cdot (c_x\hat{i} + c_y\hat{j}) = c_x^2 + c_y^2\]

Any vector in three-dimensional space, \(\vec{C}\), can be written as a linear combination of the orthonormal basis \(\hat{i}\), \(\hat{j}\), and \(\hat{k}\):

\[\vec{C} = c_x\hat{i} + c_y\hat{j} + c_z\hat{k}\] \[|\vec{C}|^2 = \vec{C} \cdot \vec{C} = (c_x\hat{i} + c_y\hat{j} + c_z\hat{k}) \cdot (c_x\hat{i} + c_y\hat{j} + c_z\hat{k}) = c_x^2 + c_y^2 + c_z^2\]
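A numerical check that, in an orthonormal basis, the squared magnitude is the sum of squared components (NumPy sketch; the components are arbitrary example values):

```python
import numpy as np

# Components of C in the orthonormal i, j, k basis (arbitrary example)
c = np.array([3.0, -4.0, 12.0])

# |C|^2 = C . C = cx^2 + cy^2 + cz^2
assert np.isclose(c @ c, 3**2 + 4**2 + 12**2)
assert np.isclose(np.linalg.norm(c), 13.0)
```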

Generalizations of basis sets

We have used the definitions of linear dependence and independence; we can make an equivalent definition for a set of functions:

\(\lbrace f_n(x) \rbrace\) are linearly dependent if there is a nontrivial set of constants, \(k_n\), such that \(\sum k_nf_n(x) = 0\).

\(\lbrace f_n(x) \rbrace\) are linearly independent if the above equation has only the trivial solution (all \(k_n = 0\)).

For example, we can define the set of functions corresponding to the Taylor series of \(e^x\):

\[e^x = \sum_{n = 0}^\infty \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots\]

Let \(f(x)\) be a linear combination of the set of functions \(1, x, x^2, x^3, \cdots, x^n\): \(\lbrace f_n(x) \rbrace = \lbrace x^n \rbrace\) spans the space of polynomials of degree \(\le n\). \(\lbrace x^n \rbrace\) is linearly independent, as no term can be written as a linear combination of the others. As \(n \to \infty\), the set spans the space of functions for which a Maclaurin series is defined.

Homogeneous equations

A system of \(n\) homogeneous linear equations in \(n\) unknowns has a nontrivial solution if and only if the equations are linearly dependent.

\[\begin{cases} x + y + 2z = 0 \newline 2x + 4y - 3z = 0 \newline 4x + 6y + z = 0 \end{cases} \implies \overbrace{\begin{bmatrix} 1 & 1 & 2 \newline 2 & 4 & -3 \newline 4 & 6 & 1 \end{bmatrix}}^M \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\]

We can find whether the equations are linearly dependent via row reduction:

\[\begin{bmatrix} 1 & 1 & 2 \newline 2 & 4 & -3 \newline 4 & 6 & 1\end{bmatrix}\]

Add twice the first row to the second row:

\[\begin{bmatrix} 1 & 1 & 2 \newline 4 & 6 & 1 \newline 4 & 6 & 1\end{bmatrix}\]

Subtract the second row from the third row:

\[\begin{bmatrix} 1 & 1 & 2 \newline 4 & 6 & 1 \newline 0 & 0 & 0 \end{bmatrix}\]

The ability to zero a row means that row is a linear combination of the others: the three equations are thus linearly dependent. Equivalently, a zeroed row gives \(\det(M) = 0\), which itself implies the linear dependence of the system.

The solution is the line, defined parametrically, passing through the origin:

\[\vec{r}(t) = \begin{bmatrix} -\frac{11}{7} \newline 1 \newline \frac{2}{7} \end{bmatrix} t\]
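We can verify both the singularity of \(M\) and the parametric solution numerically (NumPy sketch):

```python
import numpy as np

M = np.array([[1, 1, 2],
              [2, 4, -3],
              [4, 6, 1]], dtype=float)

# det(M) = 0, so the rows are linearly dependent and a
# nontrivial solution of M r = 0 exists
assert np.isclose(np.linalg.det(M), 0)

# The parametric solution r(t) = t * (-11/7, 1, 2/7)
direction = np.array([-11 / 7, 1.0, 2 / 7])
for t in (0.0, 1.0, -3.5):
    assert np.allclose(M @ (t * direction), 0)
```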

Matrix operators

A function, \(f\), is linear if \(f(a\vec{r}_1 + b\vec{r}_2)\) = \(af(\vec{r}_1) + bf(\vec{r}_2)\), where \(a, b\) are scalars.

A vector function, \(\vec{F}\), is linear if \(\vec{F}(a\vec{r}_1 + b\vec{r}_2)\) = \(a\vec{F}(\vec{r}_1) + b\vec{F}(\vec{r}_2)\).

An operator, \(O\), is linear if \(O(af + bg) = aO(f) + bO(g)\), where \(a, b\) are numbers and \(f, g\) are the type \(O\) operates on: functions, vectors, numbers, etc.

Consider the screw matrix:

\[O\vec{r}_i = \begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} x_{i + 1} \newline y_{i + 1} \newline z_{i + 1} \end{bmatrix} = \vec{r}_{i + 1}\]

We can determine if the screw matrix is a linear operator by showing whether \(O(k\vec{r}_i) = kO(\vec{r}_i)\):

\[O\vec{r}_i = \begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \newline y_i \newline z_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} x_i\cos(2\pi/c) - y_i\sin(2\pi/c) \newline x_i\sin(2\pi/c) + y_i\cos(2\pi/c) \newline z_i + P/c \end{bmatrix}\] \[Ok\vec{r}_i = \begin{bmatrix} \cos(2\pi/c) & -\sin(2\pi/c) & 0 \newline \sin(2\pi/c) & \cos(2\pi/c) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} kx_i \newline ky_i \newline kz_i \end{bmatrix} + \begin{bmatrix} 0 \newline 0 \newline P/c \end{bmatrix} = \begin{bmatrix} k(x_i\cos(2\pi/c) - y_i\sin(2\pi/c)) \newline k(x_i\sin(2\pi/c) + y_i\cos(2\pi/c)) \newline kz_i + P/c \end{bmatrix}\]

As \(P/c\) is not multiplied by the scalar \(k\), \(O(k\vec{r}_i) \ne kO(\vec{r}_i)\): \(O\) is not a linear operator.

The rotation operator is an example of a matrix operator and a linear transformation:

\[M\vec{r}_i = \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \newline \sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} x\cos(\theta) -y\sin(\theta) \newline x\sin(\theta) + y\cos(\theta) \newline z \end{bmatrix} = \vec{r'}_i\]

Matrix multiplication is linear:

\[M(a\vec{r}_1 + b\vec{r}_2) = aM\vec{r}_1 + bM\vec{r}_2\] \[M(a\vec{r}_1 + b\vec{r}_2) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \newline \sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} ax_1 + bx_2 \newline ay_1 + by_2 \newline az_1 + bz_2 \end{bmatrix} = \begin{bmatrix} (ax_1 + bx_2)\cos(\theta) -(ay_1 + by_2)\sin(\theta) \newline (ax_1 + bx_2)\sin(\theta) + (ay_1 + by_2)\cos(\theta) \newline az_1 + bz_2 \end{bmatrix} = \vec{r'}_i\] \[M(a\vec{r}_1 + b\vec{r}_2) = a \begin{bmatrix} x_1 \cos(\theta) - y_1 \sin(\theta) \newline x_1 \sin(\theta) + y_1 \cos(\theta) \newline z_1 \end{bmatrix} + b \begin{bmatrix} x_2 \cos(\theta) - y_2 \sin(\theta) \newline x_2 \sin(\theta) + y_2 \cos(\theta) \newline z_2 \end{bmatrix}\]

The rotation operator, \(M\), is thus linear.

A matrix operator, \(M\), is orthogonal if \(|\vec{r}| = |\vec{r'}|\), which expresses that the norm of the vector does not change. The inversion operator is an example of such a case:

\[M\vec{r} = \begin{bmatrix} -1 & 0 & 0 \newline 0 & -1 & 0 \newline 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} x \newline y \newline z \end{bmatrix} = \begin{bmatrix} -x \newline -y \newline -z \end{bmatrix}\] \[|\vec{r}| = \sqrt{x^2 + y^2 + z^2} = |\vec{r'}| = \sqrt{(-x)^2 + (-y)^2 + (-z)^2}\]

A square matrix, \(M\), is orthogonal if \(M^{-1} = M^T\). The inversion matrix is such an example:

\[\begin{bmatrix} -1 & 0 & 0 \newline 0 & -1 & 0 \newline 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} -1 & 0 & 0 \newline 0 & -1 & 0 \newline 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \end{bmatrix} = I\]

The rotation matrix is another example; multiplying it by its transpose gives the identity:

\[\begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \newline -\sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \newline \sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \end{bmatrix} = I\]

The following properties are equivalent:

  1. \(M\) is orthogonal
  2. The row vectors of \(M\) are orthonormal
  3. The column vectors of \(M\) are orthonormal

If \(M\) is an orthogonal matrix, \(\det(M) = \pm 1\).
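These properties can be checked numerically for the rotation and inversion matrices (NumPy sketch; the angle is an arbitrary example value):

```python
import numpy as np

theta = 0.7  # arbitrary example angle
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])

# Orthogonal: the transpose is the inverse, so rows and
# columns are orthonormal
assert np.allclose(R.T @ R, np.eye(3))
assert np.allclose(R @ R.T, np.eye(3))

# det = +1 for the rotation; det = -1 for the inversion matrix -I
assert np.isclose(np.linalg.det(R), 1)
assert np.isclose(np.linalg.det(-np.eye(3)), -1)

# Norm preservation: |R r| = |r|
r = np.array([1.0, -2.0, 0.5])
assert np.isclose(np.linalg.norm(R @ r), np.linalg.norm(r))
```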

We can use the rotation matrix to rotate a vector away from the \(x\)-axis by \(\theta\):

\[\begin{bmatrix} x_2 \newline y_2 \newline z_2 \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \newline \sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \newline y_1 \newline z_1 \end{bmatrix}\]

We can use the inverse of the rotation matrix to instead rotate the axes toward \(x\), effectively rotating the vector by \(-\theta\) away from \(x\) and establishing a new basis of axes:

\[\begin{bmatrix} x' \newline y' \newline z' \end{bmatrix} = \begin{bmatrix} \cos(\theta) & \sin(\theta) & 0 \newline -\sin(\theta) & \cos(\theta) & 0 \newline 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \newline y \newline z \end{bmatrix}\]

The transformation, \(P\), from one orthonormal basis to another is orthogonal:

\[P^{-1} = P^T\]

Complex matrices

We define the complex conjugate of a matrix by taking the complex conjugate of each element:

\[A^* = \begin{bmatrix} a^*_{11} & a^*_{12} & a^*_{13} \newline a^*_{21} & a^*_{22} & a^*_{23} \newline a^*_{31} & a^*_{32} & a^*_{33} \end{bmatrix}\]

For a real matrix, \(A = A^*\); for an imaginary matrix, \(A = -A^*\).

While we can apply the ideas of transposal, symmetry, orthogonality, and antisymmetry to complex matrices, these are generally used for real matrices. There is instead an equivalent set of ideas for complex matrices.

The transpose conjugate of \(A\) is obtained by taking the complex conjugate, then interchanging rows and columns:

\[A^\dagger = (A^*)^T\] \[A^\dagger_{ij} = A^*_{ji}\]

A matrix is Hermitian if \(A = A^\dagger\); \(A^\dagger\) is often called the Hermitian conjugate. If \(A\) is real, a symmetric matrix is Hermitian: Hermiticity is a generalization of matrix symmetry to complex matrices.

A matrix is anti-Hermitian if \(A = -A^\dagger\). If \(A\) is real, an antisymmetric matrix is anti-Hermitian.

A matrix is unitary if \(A^{-1} = A^\dagger\). If \(A\) is real, an orthogonal matrix is unitary: in the same fashion, unitarity is a generalization of matrix orthogonality to complex matrices.

In quantum chemistry, operators are often represented by complex matrices, for which Hermiticity and unitarity are important concepts.

We can also define operations for the properties of complex vectors:

\[\vec{A} \cdot \vec{B} \iff \begin{cases} A, B \in \mathbb{R} : \left(\sum_{i = 1}^n A_iB_i\right) \lor A^TB \newline A, B \in \mathbb{C} : \left(\sum_{i = 1}^n A^*_iB_i\right) \lor A^\dagger B \end{cases}\] \[|\vec{A}| \iff \begin{cases} A \in \mathbb{R} : \sqrt{\sum_{i = 1}^n A^2_i} \lor \sqrt{A^TA} \newline A \in \mathbb{C} : \sqrt{\sum_{i = 1}^n A^*_iA_i} \lor \sqrt{A^\dagger A} \end{cases}\] \[\vec{A} \perp \vec{B} \iff \begin{cases} A, B \in \mathbb{R} : \left(\sum_{i = 1}^n A_iB_i = 0\right) \lor A^TB = 0 \newline A, B \in \mathbb{C} : \left(\sum_{i = 1}^n A^*_iB_i = 0 \right) \lor A^\dagger B = 0 \end{cases}\]
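A NumPy sketch of the complex-vector operations (the vectors and matrix are arbitrary example values):

```python
import numpy as np

A = np.array([1 + 2j, 3 - 1j])
B = np.array([2j, 1 + 1j])

# The complex dot product conjugates the first vector
dot = np.sum(A.conj() * B)
assert np.isclose(dot, np.vdot(A, B))  # np.vdot conjugates its first argument

# |A|^2 = sum of A_i* A_i is real and nonnegative
norm_sq = np.sum(A.conj() * A)
assert np.isclose(norm_sq.imag, 0) and norm_sq.real > 0

# A Hermitian matrix equals its conjugate transpose
H = np.array([[2, 1 - 1j], [1 + 1j, 3]])
assert np.allclose(H, H.conj().T)
```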

Eigenvalues and eigenvectors

Consider a square matrix, \(A\): a nonzero vector \(\vec{x}\) is an eigenvector of \(A\) if \(A\vec{x} = \lambda\vec{x}\), where \(\lambda\) is a scalar (real or complex).

\[A\vec{x} = \lambda\vec{x} = \lambda I\vec{x} \implies A\vec{x} - \lambda I\vec{x} = 0 \implies \overbrace{(A - \lambda I)}^{n \times n \text{ matrix}}\vec{x} = 0\]

The above is a homogeneous linear equation; a nontrivial solution exists if and only if \(\det(A - \lambda I) = 0\) (which expresses that the matrix is singular).

We can find the eigenvalues of a matrix:

\[A = \begin{bmatrix} 2 & 1 & 0 \newline 3 & 2 & 0 \newline 0 & 0 & 4 \end{bmatrix}\] \[\det(A - \lambda I) = \begin{vmatrix} \begin{bmatrix} 2 & 1 & 0 \newline 3 & 2 & 0 \newline 0 & 0 & 4 \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \newline 0 & \lambda & 0 \newline 0 & 0 & \lambda \end{bmatrix} \end{vmatrix} = \begin{vmatrix} 2 - \lambda & 1 & 0 \newline 3 & 2 - \lambda & 0 \newline 0 & 0 & 4 - \lambda \end{vmatrix}\] \[\det(A - \lambda I) = 0 = (4 - \lambda)((2 - \lambda)^2 - 3)\] \[\begin{cases} 4 - \lambda = 0 \newline \lambda^2 - 4\lambda + 1 = 0 \end{cases} \implies \underbrace{\begin{cases} \lambda = 4 \newline \lambda = 2 \pm \sqrt{3} \end{cases}}_{\text{eigenvalues of } A}\]

For an \(n \times n\) matrix, \(n\) eigenvalues exist (counted with multiplicity; they need not be distinct).

Note that \(A\) is block diagonal: we can consider the square blocks of nonzero values separately, thus simplifying the problem. \(\det(A - \lambda I)\) becomes a product of terms corresponding to the blocks:

\[\underbrace{(4 - \lambda)}_{1 \times 1} \underbrace{((2 - \lambda)^2 - 3)}_{2 \times 2} = 0\]

We next can find the eigenvectors, \(\vec{x}\), which satisfy \(A\vec{x} = \lambda\vec{x}\) for each eigenvalue \(\lambda\):

\[\begin{bmatrix} 2 - \lambda & 1 & 0 \newline 3 & 2 - \lambda & 0 \newline 0 & 0 & 4 - \lambda \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\]

\[\lambda = 4 \implies \begin{bmatrix} -2 & 1 & 0 \newline 3 & -2 & 0 \newline 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\] \[\begin{bmatrix} -2a + b \newline 3a - 2b \newline 0 \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} a = 0 \newline b = 0 \newline c \in \mathbb{C} \end{cases}\] \[c \coloneqq 1 \implies \left. \begin{bmatrix} 0 \newline 0 \newline 1 \end{bmatrix} \right\rbrace \text{eigenvector of } A\]

\[\lambda = 2 + \sqrt{3} \implies \begin{bmatrix} -\sqrt{3} & 1 & 0 \newline 3 & -\sqrt{3} & 0 \newline 0 & 0 & 2 - \sqrt{3} \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\] \[\begin{bmatrix} -a\sqrt{3} + b \newline 3a - b\sqrt{3} \newline (2 - \sqrt{3})c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} a \in \mathbb{C} \newline b = a\sqrt{3} \newline c = 0 \end{cases}\] \[a \coloneqq 1 \implies \left. \begin{bmatrix} 1 \newline \sqrt{3} \newline 0 \end{bmatrix} \right\rbrace \text{eigenvector of } A\]

\[\lambda = 2 - \sqrt{3} \implies \begin{bmatrix} \sqrt{3} & 1 & 0 \newline 3 & \sqrt{3} & 0 \newline 0 & 0 & 2 + \sqrt{3} \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\] \[\begin{bmatrix} a\sqrt{3} + b \newline 3a + b\sqrt{3} \newline (2 + \sqrt{3})c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} a \in \mathbb{C} \newline b = -a\sqrt{3} \newline c = 0 \end{cases}\] \[a \coloneqq 1 \implies \left. \begin{bmatrix} 1 \newline -\sqrt{3} \newline 0 \end{bmatrix} \right\rbrace \text{eigenvector of } A\]
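The eigenpairs found by hand can be verified numerically (NumPy sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [3.0, 2.0, 0.0],
              [0.0, 0.0, 4.0]])

s3 = np.sqrt(3)
# Eigenpairs: each vector x satisfies A x = lambda x
lam1, x1 = 4.0,    np.array([0.0, 0.0, 1.0])
lam2, x2 = 2 + s3, np.array([1.0,  s3, 0.0])
lam3, x3 = 2 - s3, np.array([1.0, -s3, 0.0])

for lam, x in ((lam1, x1), (lam2, x2), (lam3, x3)):
    assert np.allclose(A @ x, lam * x)

# Eigenvectors are determined only up to scale: any multiple also works
assert np.allclose(A @ (7 * x2), lam2 * (7 * x2))
```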

We can consider another block-diagonal matrix:

\[A = \begin{bmatrix} 3 & -2 & 0 \newline -2 & 3 & 0 \newline 0 & 0 & 5 \end{bmatrix}\] \[a_{33} = 5 \implies \lambda = 5\] \[(3 - \lambda)^2 - 4 = 0 \implies (\lambda - 5)(\lambda - 1) = 0 \implies \lambda = 1, 5\] \[\lambda = 1, 5, 5\]

The duplicate \(\lambda = 5\) entries indicate degenerate eigenvalues.

\[\lambda = 1 \implies \begin{bmatrix} 2 & -2 & 0 \newline -2 & 2 & 0 \newline 0 & 0 & 4 \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\] \[\begin{bmatrix} 2a - 2b \newline -2a + 2b \newline 4c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} a \in \mathbb{C} \newline b = a \newline c = 0 \end{cases}\] \[a \coloneqq 1 \implies \left. \begin{bmatrix} 1 \newline 1 \newline 0 \end{bmatrix} \right\rbrace \text{eigenvector of } A\]

\[\lambda = 5 \implies \begin{bmatrix} -2 & -2 & 0 \newline -2 & -2 & 0 \newline 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix}\] \[\begin{bmatrix} -2a - 2b \newline -2a - 2b \newline 0 \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \implies \begin{cases} a \in \mathbb{C} \newline b = -a \newline c \in \mathbb{C} \end{cases}\] \[a \coloneqq 1, c \coloneqq 0 \implies \left. \begin{bmatrix} 1 \newline -1 \newline 0 \end{bmatrix} \right\rbrace \text{eigenvector of } A\] \[a \coloneqq 0, c \coloneqq 1 \implies \left. \begin{bmatrix} 0 \newline 0 \newline 1 \end{bmatrix} \right\rbrace \text{eigenvector of } A\]

The above two eigenvectors span the space corresponding to the doubly degenerate eigenvalue, \(\lambda = 5\).
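These claims are easy to verify numerically. Below is a quick, illustrative Python sketch (the notes themselves contain no code) that multiplies \(A\) by each eigenvector and confirms \(A\vec{v} = \lambda\vec{v}\):

```python
# Illustrative check: A v should equal lambda * v for each pair.
A = [[3, -2, 0],
     [-2, 3, 0],
     [0, 0, 5]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

pairs = [
    (1, [1, 1, 0]),    # non-degenerate eigenvalue
    (5, [1, -1, 0]),   # doubly degenerate eigenvalue...
    (5, [0, 0, 1]),    # ...with two independent eigenvectors
]

for lam, v in pairs:
    assert matvec(A, v) == [lam * x for x in v]
    print(lam, v, "checked")
```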

Matrix diagonalization

A square matrix \(M\) is diagonalizable if an invertible matrix \(C\) exists such that \(C^{-1}MC\) is diagonal; here, \(C\) is said to diagonalize \(M\).

If \(M\) is an \(n \times n\) matrix, then the following are equivalent:

  1. \(M\) is diagonalizable
  2. \(M\) has \(n\) linearly independent eigenvectors

To diagonalize a matrix \(M\):

  1. Find the linearly independent eigenvectors of \(M\)
  2. Form the matrix \(C\) with these eigenvectors as the columns
  3. The matrix \(C^{-1}MC\) will be diagonal with the eigenvalues on the diagonal

\[\begin{gather*} M \coloneqq \begin{bmatrix} 2 & 1 & 0 \newline 3 & 2 & 0 \newline 0 & 0 & 4 \end{bmatrix} \newline \lambda = \left\lbrace 2 + \sqrt{3}, 2 - \sqrt{3}, 4\right\rbrace \implies \left. \begin{bmatrix} 1 \newline \sqrt{3} \newline 0 \end{bmatrix}, \begin{bmatrix} 1 \newline -\sqrt{3} \newline 0 \end{bmatrix}, \begin{bmatrix} 0 \newline 0 \newline 1 \end{bmatrix} \right\rbrace \text{eigenvectors (not unique)} \newline C \coloneqq \begin{bmatrix} 1 & 1 & 0 \newline \sqrt{3} & -\sqrt{3} & 0 \newline 0 & 0 & 1 \end{bmatrix} \implies C^{-1} = \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{6} & 0 \newline \frac{1}{2} & -\frac{\sqrt{3}}{6} & 0 \newline 0 & 0 & 1 \end{bmatrix} \newline C^{-1}MC = \underbrace{\begin{bmatrix} 2 + \sqrt{3} & 0 & 0 \newline 0 & 2 - \sqrt{3} & 0 \newline 0 & 0 & 4 \end{bmatrix}}_{D} \end{gather*}\]

\(D\) is similar to \(M\); we have diagonalized \(M\) via a similarity transform.
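The similarity transform can be checked numerically. This is an illustrative Python sketch (not part of the notes) with the eigenvectors placed as the columns of \(C\):

```python
import math

# Check the worked example: with the eigenvectors of M as the columns
# of C, the product C^-1 M C should come out diagonal.
r3 = math.sqrt(3)

M = [[2, 1, 0],
     [3, 2, 0],
     [0, 0, 4]]
C = [[1, 1, 0],
     [r3, -r3, 0],
     [0, 0, 1]]
Cinv = [[0.5, r3 / 6, 0],
        [0.5, -r3 / 6, 0],
        [0, 0, 1]]

def matmul(A, B):
    """Plain triple-loop matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

D = matmul(Cinv, matmul(M, C))
for i in range(3):
    for j in range(3):
        if i != j:
            assert abs(D[i][j]) < 1e-12   # off-diagonal entries vanish
print([D[i][i] for i in range(3)])        # eigenvalues on the diagonal
```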

Not all square matrices are diagonalizable:

\[\begin{gather*} M \coloneqq \begin{bmatrix} -3 & 2 & 0 \newline -2 & 1 & 0 \newline 0 & 0 & -1 \end{bmatrix} \newline \lambda = \left\lbrace -1, -1, -1\right\rbrace \implies \begin{bmatrix} -3 + 1 & 2 & 0 \newline -2 & 1 + 1 & 0 \newline 0 & 0 & -1 + 1 \end{bmatrix} \begin{bmatrix} a \newline b \newline c \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \newline 0 \end{bmatrix} \newline \implies \begin{cases} -2a + 2b = 0 \newline -2a + 2b = 0 \newline 0 = 0 \end{cases} \implies \vec{x} = \begin{bmatrix} a \newline a \newline c \end{bmatrix} \end{gather*}\]

We cannot construct three linearly independent eigenvectors; \(M\) is not diagonalizable.

Eigenvalues, eigenvectors, and diagonalization

\[\begin{gather*} \vec{J} = I\vec{\omega} \impliedby \begin{cases} \vec{J} \coloneqq \text{angular momentum} \newline I \coloneqq \text{moment of inertia} \newline \vec{\omega} \coloneqq \text{angular frequency} \end{cases} \newline \begin{bmatrix} J_x \newline J_y \newline J_z \end{bmatrix} = \begin{bmatrix} I_{xx} & I_{xy} & I_{xz} \newline I_{yx} & I_{yy} & I_{yz} \newline I_{zx} & I_{zy} & I_{zz} \end{bmatrix} \begin{bmatrix} \omega_x \newline \omega_y \newline \omega_z \end{bmatrix} \implies \begin{dcases} I_{xx} = \sum_i m_i \left(y_i^2 + z_i^2\right) \newline I_{yy} = \sum_i m_i \left(x_i^2 + z_i^2\right) \newline I_{zz} = \sum_i m_i \left(x_i^2 + y_i^2\right) \newline I_{xy} = I_{yx} = -\sum_i m_ix_iy_i \newline I_{xz} = I_{zx} = -\sum_i m_ix_iz_i \newline I_{yz} = I_{zy} = -\sum_i m_iy_iz_i \end{dcases} \end{gather*}\]

We can use the above equations to describe the rotation of a molecule, methylene chloride (\(\text{CH}_2\text{Cl}_2\)), in an arbitrary coordinate system:

\[\begin{gather*} \begin{bmatrix} J_x \newline J_y \newline J_z \end{bmatrix} = \overbrace{\begin{bmatrix} 124.60 & 31.57 & -54.68 \newline 31.57 & 124.84 & 42.23 \newline -54.68 & 42.23 & 76.08 \end{bmatrix}}^{M} \begin{bmatrix} \omega_x \newline \omega_y \newline \omega_z \end{bmatrix} \newline \implies \begin{bmatrix} \text{C}1 \newline \text{H}2 \newline \text{H}4 \newline \text{Cl}1 \newline \text{Cl}2 \end{bmatrix} \mapsto \overbrace{\begin{bmatrix} 0.000 & 0.681 & 0.393 \newline 0.800 & 1.450 & -0.304 \newline -0.800 & 0.989 & 1.104 \newline -0.726 & 0.477 & -1.176 \newline 0.726 & -0.780 & 1.001 \end{bmatrix}}^{\text{coordinates}} \end{gather*}\]
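The tensor \(M\) above can be rebuilt from the Cartesian coordinates with the sums given earlier. The Python sketch below is illustrative: the atomic masses (most-abundant isotopes, in amu) are an assumption, and with the rounded coordinates the result reproduces \(M\) only approximately.

```python
# Sketch: inertia tensor from coordinates. Masses are assumed isotopic
# values (12C, 1H, 35Cl); coordinates are the rounded values above.
atoms = [
    (12.000,   (0.000, 0.681, 0.393)),    # C1
    (1.00783,  (0.800, 1.450, -0.304)),   # H2
    (1.00783,  (-0.800, 0.989, 1.104)),   # H4
    (34.96885, (-0.726, 0.477, -1.176)),  # Cl1
    (34.96885, (0.726, -0.780, 1.001)),   # Cl2
]

def inertia_tensor(atoms):
    """Accumulate I_xx = sum m (y^2 + z^2), I_xy = -sum m x y, etc."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, (x, y, z) in atoms:
        I[0][0] += m * (y * y + z * z)
        I[1][1] += m * (x * x + z * z)
        I[2][2] += m * (x * x + y * y)
        I[0][1] -= m * x * y
        I[0][2] -= m * x * z
        I[1][2] -= m * y * z
    # the tensor is symmetric
    I[1][0], I[2][0], I[2][1] = I[0][1], I[0][2], I[1][2]
    return I

for row in inertia_tensor(atoms):
    print(["%8.2f" % v for v in row])
```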

To gather the principal moments of inertia (the eigenvalues) along the diagonal:

  1. Take the eigenvalues of \(M\)
  2. Take the eigenvectors of \(M\)
  3. Combine the eigenvectors to form a matrix, \(C\)
  4. Find the inverse of \(C\), \(C^{-1}\)
  5. Take \(C^{-1}MC = M_{\text{diag}}\)

The above steps give

\[\begin{gather*} M_{\text{diag}} = \begin{bmatrix} 15.24 & 0 & 0 \newline 0 & 149.22 & 0 \newline 0 & 0 & 161.05 \end{bmatrix} \end{gather*}\]
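The eigenvalue step can also be carried out numerically. Below is a minimal Python sketch of the classical Jacobi rotation method for real symmetric matrices (a standard technique, not the procedure used in the notes), applied to the tensor above:

```python
import math

def matmul(X, Y):
    """Plain triple-loop matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def jacobi_eigenvalues(A, sweeps=100, tol=1e-10):
    """Repeatedly zero the largest off-diagonal element with a plane
    rotation; the diagonal converges to the eigenvalues."""
    A = [row[:] for row in A]
    n = len(A)
    for _ in range(sweeps):
        # locate the largest off-diagonal entry
        p, q, big = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(A[i][j]) > big:
                    p, q, big = i, j, abs(A[i][j])
        if big < tol:
            break
        # rotation angle that zeroes A[p][q]
        theta = 0.5 * math.atan2(2 * A[p][q], A[q][q] - A[p][p])
        c, s = math.cos(theta), math.sin(theta)
        J = [[float(i == j) for j in range(n)] for i in range(n)]
        J[p][p] = J[q][q] = c
        J[p][q], J[q][p] = s, -s
        JT = [[J[j][i] for j in range(n)] for i in range(n)]
        A = matmul(JT, matmul(A, J))   # A <- J^T A J
    return sorted(A[i][i] for i in range(n))

M = [[124.60, 31.57, -54.68],
     [31.57, 124.84, 42.23],
     [-54.68, 42.23, 76.08]]
print(jacobi_eigenvalues(M))
```

Each rotation is a similarity transform, so the eigenvalues are preserved while the off-diagonal entries are driven to zero.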

We now rewrite \(\vec{J} = I\vec{\omega}\) in the coordinate system which diagonalizes the moment of inertia, \(I\):

\[\begin{gather*} \vec{J} = I\vec{\omega} \implies C^{-1}\vec{J} = C^{-1}I\vec{\omega} \implies C^{-1}\vec{J} = \overbrace{\left(C^{-1}IC\right)}^{I_{\text{diag}}}C^{-1}\vec{\omega} \newline C^{-1}\vec{J} = I_{\text{diag}}C^{-1}\vec{\omega} \impliedby \begin{cases} C^{-1}\vec{J} \rightarrow \text{angular momentum around principal axes} \newline C^{-1}\vec{\omega} \rightarrow \text{angular frequency around principal axes} \end{cases} \newline \implies \vec{A}_{\text{new}} = C^{-1}\vec{A}_{\text{old}} \newline \implies \begin{bmatrix} \text{C}1 \newline \text{H}2 \newline \text{H}4 \newline \text{Cl}1 \newline \text{Cl}2 \end{bmatrix} \mapsto \overbrace{\begin{bmatrix} 0.000 & 0.681 & 0.393 \newline 0.800 & 1.450 & -0.304 \newline -0.800 & 0.989 & 1.104 \newline -0.726 & 0.477 & -1.176 \newline 0.726 & -0.780 & 1.001 \end{bmatrix}}^{\text{old coordinates}} \mapsto \overbrace{\begin{bmatrix} 0.000 & 0.787 & 0.000 \newline 0.000 & 1.408 & -0.924 \newline 0.000 & 1.408 & 0.924 \newline -1.452 & -0.175 & 0.000 \newline 1.452 & -0.175 & 0.000 \end{bmatrix}}^{\text{new coordinates}} \end{gather*}\]

Complex matrices and eigenvalues

Consider a general, non-Hermitian matrix \(M\):

\[\begin{gather*} M \coloneqq \begin{bmatrix} 2 & 3 - i \newline 3 - i & -1 \end{bmatrix} \newline \begin{vmatrix} 2 - \lambda & 3 - i \newline 3 - i & -1 - \lambda \end{vmatrix} = 0 \implies (2 - \lambda)(-1 - \lambda) - (3 - i)^2 = 0 \newline \implies \lambda^2 - \lambda - (10 - 6i) = 0 \newline \implies \lambda = \frac{1 \pm \sqrt{1 + 4(1)(10 - 6i)}}{2} = \overbrace{\frac{1}{2} \pm \frac{\sqrt{41 - 24i}}{2}}^{\text{not real}} \end{gather*}\]

Recall that a matrix \(A\) is Hermitian if \(A = A^\dagger\). Notably, a Hermitian matrix will have the following properties:

  1. Real values on the diagonal
  2. Real eigenvalues

Consider the Hermitian matrix \(M\):

\[\begin{gather*} M \coloneqq \begin{bmatrix} 2 & 3 - i \newline 3 + i & -1 \end{bmatrix} \newline \begin{vmatrix} 2 - \lambda & 3 - i \newline 3 + i & -1 - \lambda \end{vmatrix} = 0 \implies \lambda^2 - \lambda - 12 = 0 \newline \implies (\lambda + 3)(\lambda - 4) = 0 \implies \lambda = \left\lbrace -3, 4\right\rbrace \end{gather*}\]
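Both characteristic polynomials above are quadratics, so the two cases can be compared with a short Python sketch (illustrative; `eig2` is a hypothetical helper, not from the notes):

```python
import cmath

# Eigenvalues of a 2x2 matrix [[m11, m12], [m21, m22]] from its
# characteristic polynomial: l^2 - (trace) l + (determinant) = 0.
def eig2(m11, m12, m21, m22):
    tr = m11 + m22
    det = m11 * m22 - m12 * m21
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# Non-Hermitian (complex symmetric) matrix: complex eigenvalues
print(eig2(2, 3 - 1j, 3 - 1j, -1))
# Hermitian matrix: real eigenvalues -3 and 4
print(eig2(2, 3 - 1j, 3 + 1j, -1))
```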

Recall that a matrix \(A\) is unitary if \(A^{-1} = A^\dagger\). For real matrices, a real symmetric matrix is orthogonally diagonalizable:

\[\begin{gather*} C^TMC = D \impliedby \begin{cases} M \text{ symmetric} \newline C \text{ orthogonal} \newline D \text{ diagonal} \end{cases} \end{gather*}\]

Similarly, for complex matrices:

\[\begin{gather*} U^\dagger MU = D \impliedby \begin{cases} M \text{ Hermitian} \newline U \text{ unitary} \newline D \text{ diagonal} \end{cases} \end{gather*}\]

If \(M\) is an \(n \times n\) matrix, then the following are equivalent:

  1. \(M\) can be diagonalized by a unitary transform
  2. \(M\) is Hermitian

Unitary transforms

Consider the same matrix \(M\) from above:

\[\begin{gather*} M \coloneqq \begin{bmatrix} 2 & 3 - i \newline 3 + i & -1 \end{bmatrix} \implies \lambda = \left\lbrace -3, 4\right\rbrace \newline \implies \begin{bmatrix} 2 + 3 & 3 - i \newline 3 + i & -1 + 3 \end{bmatrix} \begin{bmatrix} a \newline b \end{bmatrix} = \begin{bmatrix} 0 \newline 0 \end{bmatrix} \implies \begin{cases} 5a + (3 - i)b = 0 \newline (3 + i)a + 2b = 0 \end{cases} \newline (3 + i)a + 2b = 0 \iff (3 - i)(3 + i)a + 2(3 - i)b = 0 \newline \iff 10a + 2(3 - i)b = 0 \newline \iff 5a + (3 - i)b = 0 \newline \implies \left.\begin{cases} 5a + (3 - i)b = 0 \newline (3 + i)a + 2b = 0 \end{cases}\right\rbrace \text{equivalent} \newline a = -\frac{1}{5}(3 - i)b \implies \vec{v}_1 = \begin{bmatrix} -\frac{1}{5}(3 - i)b \newline b \end{bmatrix} \end{gather*}\]

We can find \(b\) such that \(\vec{v}_1\) is normalized (\(\vec{v}_1^*\vec{v}_1 = 1\)):

\[\begin{gather*} \left(-\frac{1}{5}(3 + i)b\right)\left(-\frac{1}{5}(3 - i)b\right) + b^2 = 1 \newline \frac{10}{25}b^2 + b^2 = 1 \newline \frac{7}{5}b^2 = 1 \newline b^2 = \frac{5}{7} \newline b = \sqrt{\frac{5}{7}} \newline \vec{v}_1 = \begin{bmatrix} \frac{-(3 - i)}{\sqrt{35}} \newline \sqrt{\frac{5}{7}} \end{bmatrix} \end{gather*}\]

Repeating this process for the other eigenvector gives

\[\begin{gather*} \vec{v}_2 = \begin{bmatrix} \frac{3 - i}{\sqrt{14}} \newline \sqrt{\frac{2}{7}} \end{bmatrix} \newline U = \begin{bmatrix} \frac{-(3 - i)}{\sqrt{35}} & \frac{3 - i}{\sqrt{14}} \newline \sqrt{\frac{5}{7}} & \sqrt{\frac{2}{7}} \end{bmatrix} \implies U^\dagger = \begin{bmatrix} \frac{-(3 - i)}{\sqrt{35}} & \sqrt{\frac{5}{7}} \newline \frac{3 - i}{\sqrt{14}} & \sqrt{\frac{2}{7}} \end{bmatrix} \end{gather*}\]

Note that \(U\) is not unique:

  1. We could interchange columns
  2. We could multiply either \(\vec{v}_1\), \(\vec{v}_2\), or both by \(-1\)
  3. We could assume \(b \in \mathbb{C}\) when normalizing:

\[\begin{gather*} b^*b = \frac{5}{7} \implies b = \pm \sqrt{\frac{5}{7}} \text{ or } \pm i\sqrt{\frac{5}{7}} \newline \text{likewise for } \vec{v}_2 \end{gather*}\]

We can check that this indeed diagonalizes \(M\):

\[\begin{gather*} U^\dagger MU = D = \begin{bmatrix} \frac{-(3 - i)}{\sqrt{35}} & \sqrt{\frac{5}{7}} \newline \frac{3 - i}{\sqrt{14}} & \sqrt{\frac{2}{7}} \end{bmatrix} \begin{bmatrix} 2 & 3 - i \newline 3 + i & -1 \end{bmatrix} \begin{bmatrix} \frac{-(3 - i)}{\sqrt{35}} & \frac{3 - i}{\sqrt{14}} \newline \sqrt{\frac{5}{7}} & \sqrt{\frac{2}{7}} \end{bmatrix} \newline D = \left. \begin{bmatrix} -3 & 0 \newline 0 & 4 \end{bmatrix}\right\rbrace \text{eigenvalues on the diagonal} \end{gather*}\]
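The matrix products above can be verified with complex arithmetic. An illustrative Python sketch (not part of the notes) that checks both that \(U\) is unitary and that \(U^\dagger MU\) is diagonal:

```python
import math

s = math.sqrt
M = [[2, 3 - 1j], [3 + 1j, -1]]
U = [[-(3 - 1j) / s(35), (3 - 1j) / s(14)],
     [s(5 / 7), s(2 / 7)]]

def dagger(A):
    """Conjugate transpose (works for real and complex entries)."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def matmul(A, B):
    """Plain triple-loop matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = matmul(dagger(U), U)           # should be the 2x2 identity
D = matmul(dagger(U), matmul(M, U))  # should be diag(-3, 4)
assert abs(P[0][0] - 1) < 1e-9 and abs(P[0][1]) < 1e-9
print(D)
```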

Note that an orthogonal transformation of a real matrix \(A\) is also a unitary transformation.

Generalized vector spaces

A vector space is a set of elements \(\left\lbrace U, V, W, \cdots\right\rbrace\) together with the operations of addition and multiplication by a scalar (\(k \in \mathbb{C}\)) for which the following are true:

  1. The sum of any two vectors in the space is also a vector in the space (closure)
  2. Vector addition is commutative and associative
  3. There is a zero vector such that \(V + 0 = 0 + V = V\)
  4. Every element \(V\) has an additive inverse \(W\) such that \(V + W = 0\) (i.e., \(W = -V\))
  5. Multiplication by a scalar has the usual properties

\[\begin{gather*} \begin{cases} \text{Closure} \implies \exists A : A = U + V \newline \text{Commutative} \implies U + V = V + U \newline \text{Associative} \implies (U + V) + W = U + (V + W) \newline \text{Identity} \implies \exists 0 : V + 0 = V \newline \text{Inverse} \implies \exists W : W = -V \end{cases} \newline \text{Multiplication by scalar} \implies \begin{cases} k(U + V) = kU + kV \newline (k_1 + k_2)U = k_1U + k_2U \newline (k_1k_2)U = k_1(k_2U) \newline 0V = 0 \newline 1V = V \end{cases} \end{gather*}\]

Three-dimensional Cartesian space is a vector space. We also have:

  1. An inner product
  2. A norm
  3. Orthogonality

\[\begin{gather*} \begin{cases} \text{Inner product} \implies \exists (V \cdot U) \newline \text{Norm} \implies \exists ||V|| \newline \text{Orthogonality} \implies \exists (V, U) : V \cdot U = 0 \end{cases} \end{gather*}\]

We can generalize vector spaces to other collections of elements. Consider the space of polynomials of degree \(\le 3\):

\[\begin{gather*} f(x) \coloneqq a_0 + a_1x + a_2x^2 + a_3x^3 \newline \underbrace{\left\lbrace 1, x, x^2, x^3\right\rbrace}_{\text{basis set which spans the space}} \end{gather*}\]

Any function in the space can be written as a linear combination of the elements of the basis set. The above space obeys all the rules of vector spaces:

\[\begin{gather*} \begin{cases} f(x) \coloneqq a_0 + a_1x + a_2x^2 + a_3x^3 \newline g(x) \coloneqq b_0 + b_1x + b_2x^2 + b_3x^3 \newline z(x) \coloneqq 0 \end{cases} \newline \begin{cases} \text{Closure} \rightarrow f(x) + g(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + (a_3 + b_3)x^3 = h(x) \newline \text{Commutative} \rightarrow f(x) + g(x) = g(x) + f(x) \newline \text{Associative} \rightarrow (f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)) \newline \text{Identity} \rightarrow z(x) + g(x) = g(x) \newline \text{Inverse} \rightarrow b_i \coloneqq -a_i \implies g(x) = -f(x) \newline \end{cases} \newline \text{Multiplication by scalar} \implies \begin{cases} k(f(x) + g(x)) = kf(x) + kg(x) \newline (k_1 + k_2)f(x) = k_1f(x) + k_2f(x) \newline (k_1k_2)f(x) = k_1(k_2f(x)) \newline 0g(x) = z(x) \newline 1g(x) = g(x) \end{cases} \end{gather*}\]
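Because the operations act componentwise on the coefficients, the axioms can be demonstrated with a small Python sketch (illustrative; cubics represented as coefficient lists \([a_0, a_1, a_2, a_3]\)):

```python
# Cubic polynomials as coefficient lists; vector-space operations
# are componentwise addition and scalar multiplication.
def add(f, g):
    return [a + b for a, b in zip(f, g)]

def scale(k, f):
    return [k * a for a in f]

f = [1, 2, 0, -1]    # 1 + 2x - x^3
g = [0, 1, 3, 2]     # x + 3x^2 + 2x^3
zero = [0, 0, 0, 0]  # z(x)

assert add(f, g) == add(g, f)                                # commutative
assert add(add(f, g), zero) == add(f, g)                     # identity
assert add(f, scale(-1, f)) == zero                          # inverse
assert scale(2, add(f, g)) == add(scale(2, f), scale(2, g))  # distributive
print(add(f, g))
```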

We can now define an inner product and a norm for this space:

\[\begin{gather*} I(A, B) \coloneqq \text{Inner product} \rightarrow \begin{dcases} (A,B) \in \mathbb{R} \rightarrow I : A,B \mapsto \int A(x)B(x)dx \newline (A,B) \in \mathbb{C} \rightarrow I : A,B \mapsto \int A(x)^*B(x)dx \end{dcases} \newline \int A(x)^*B(x)dx = \braket{A(x)|B(x)} \rightarrow \left. \begin{cases} \bra{A(x)} \coloneqq \text{bra} \newline \ket{B(x)} \coloneqq \text{ket} \end{cases}\right\rbrace \text{bracket notation} \newline ||A||^2 = \braket{A|A} = \int A(x)^*A(x)dx = N : 0 < N \le \infty \newline \left. a = \frac{A}{\sqrt{N}} : \braket{a|a} = 1\right\rbrace \text{similar to unit vectors} \newline \braket{A|B} = 0 \rightarrow A,B \text{ are orthogonal} \newline \end{gather*}\]
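For real cubics, the inner product integral can be evaluated exactly. The Python sketch below is illustrative: the integration interval \([-1, 1]\) is an assumption (the notes leave it unspecified), and it uses \(\int_{-1}^{1} x^n\,dx = 0\) for odd \(n\) and \(2/(n+1)\) for even \(n\):

```python
from fractions import Fraction

# <f|g> = integral of f(x) g(x) over [-1, 1] for coefficient lists
# [a0, a1, a2, a3]; odd powers integrate to zero on a symmetric interval.
def inner(f, g):
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            n = i + j
            if n % 2 == 0:
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

one = [1, 0, 0, 0]
x = [0, 1, 0, 0]
print(inner(one, x))   # orthogonal on [-1, 1]
print(inner(x, x))     # squared norm of x
```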

We can apply this type of space to the orbitals of the hydrogen atom:

\[\begin{gather*} \left. \hat{H}\psi = E\psi\right\rbrace \text{Schrödinger equation} \newline \underbrace{\left\lbrace 1s, 2s, 2p_x, 2p_y, 2p_z, 3s, 3p_x, 3p_y, 3p_z, 3d_{z^2}, \cdots\right\rbrace}_{\text{an orthonormal basis set which spans the space}} \newline \braket{\psi_i|\psi_j} = \delta_{ij} \end{gather*}\]

The above space gives the possible wave functions for the electron.