Chapter 9. Quadratic Forms and Spectral Theorems
9.1 Quadratic Forms
A quadratic form is a homogeneous polynomial of degree two in several variables, and it can be written compactly using a symmetric matrix. Quadratic forms appear throughout mathematics: in optimization, the geometry of conic sections, statistics (variance), and physics (energy functions).
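Definition
Let $A$ be an $n \times n$ symmetric matrix and $x \in \mathbb{R}^n$. The quadratic form associated with $A$ is
$$Q(x) = x^T A x.$$
Expanded,
$$Q(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.$$
Because $A$ is symmetric ($a_{ij} = a_{ji}$), the cross-terms can be grouped naturally: for $i \neq j$, the terms $a_{ij} x_i x_j$ and $a_{ji} x_j x_i$ combine into $2 a_{ij} x_i x_j$.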
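The definition can be checked numerically. The following is a minimal sketch in plain Python, assuming a small hypothetical symmetric matrix `A` and vector `x` (neither comes from the text). It evaluates $x^T A x$ once as the full double sum over $a_{ij} x_i x_j$ and once with the symmetric cross-terms grouped.

```python
def quadratic_form(A, x):
    """Return Q(x) = x^T A x as the double sum of a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def quadratic_form_grouped(A, x):
    """Same value for symmetric A: squared terms plus doubled cross-terms (i < j)."""
    n = len(x)
    squares = sum(A[i][i] * x[i] ** 2 for i in range(n))
    cross = sum(2 * A[i][j] * x[i] * x[j]
                for i in range(n) for j in range(i + 1, n))
    return squares + cross


# Hypothetical symmetric matrix and vector, chosen only for illustration.
A = [[2, 1],
     [1, 3]]
x = [1, -2]

q_full = quadratic_form(A, x)
q_grouped = quadratic_form_grouped(A, x)
print(q_full, q_grouped)
```

Both functions agree only because `A` is symmetric; for a non-symmetric matrix the grouped version would ignore the entries below the diagonal.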
Examples
Example 9.1.1. For
Example 9.1.2. The quadratic form
$$Q(x, y) = x^2 + y^2$$
corresponds to the matrix $A = I_2$. It measures squared Euclidean distance from the origin.
Example 9.1.3. The conic section equation
is described by the quadratic form with
Diagonalization of Quadratic Forms
By choosing a new basis consisting of eigenvectors of $A$, we can rewrite the quadratic form without cross terms. If $A = P D P^T$ with $P$ orthogonal and $D$ diagonal, then, setting $y = P^T x$,
$$Q(x) = x^T A x = (P^T x)^T D (P^T x) = y^T D y.$$
Thus quadratic forms can always be expressed as a sum of weighted squares:
$$Q(x) = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2,$$
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$.
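As a small worked illustration, consider an assumed $2 \times 2$ matrix (it is not one of the examples in the text). Here $P$ denotes the orthogonal matrix whose columns are unit eigenvectors, $D$ the diagonal matrix of eigenvalues, and $y = P^T x$ the coordinates in the eigenvector basis. The last line confirms that the weighted squares reproduce the original form.

```latex
A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, \qquad
Q(x) = x^T A x = 3x_1^2 + 2x_1 x_2 + 3x_2^2.

% Eigenvalues 4 and 2, with unit eigenvectors (1, 1)/\sqrt{2} and (1, -1)/\sqrt{2}:
P = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
D = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}.

% New coordinates y = P^T x:
y_1 = \frac{x_1 + x_2}{\sqrt{2}}, \qquad
y_2 = \frac{x_1 - x_2}{\sqrt{2}}.

% Sum of weighted squares, with no cross term:
Q(x) = 4y_1^2 + 2y_2^2
     = 2(x_1 + x_2)^2 + (x_1 - x_2)^2
     = 3x_1^2 + 2x_1 x_2 + 3x_2^2.
```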
Geometric Interpretation
Quadratic forms describe geometric shapes:
- In 2D: ellipses, parabolas, hyperbolas.
- In 3D: ellipsoids, paraboloids, hyperboloids.
- In higher dimensions: generalizations of ellipsoids.
Diagonalization aligns the coordinate axes with the principal axes of the shape.
Why this matters
Quadratic forms unify geometry and algebra. They are central in optimization (minimizing energy functions), statistics (covariance matrices and variance), mechanics (kinetic energy), and numerical analysis. Understanding quadratic forms leads directly to the spectral theorem.
Exercises 9.1
- Write the quadratic form as $x^T A x$ for some symmetric matrix $A$.
- For , compute explicitly.
- Diagonalize the quadratic form .
- Identify the conic section given by .
- Show that if $A$ is symmetric, the quadratic forms defined by $A$ and $A^T$ are identical.
9.2 Positive Definite Matrices
Quadratic forms are especially important when their associated matrices are positive definite, since these guarantee positivity of energy, distance, or variance. Positive definiteness is a cornerstone in optimization, numerical analysis, and statistics.
Definition
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is called:
- Positive definite if $x^T A x > 0$ for all nonzero $x \in \mathbb{R}^n$.
- Positive semidefinite if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$.
Similarly, $A$ is negative definite if $x^T A x < 0$ for all nonzero $x$, and indefinite if $x^T A x$ takes both positive and negative values.
Examples
Example 9.2.1.
is positive definite, since
for all $x \neq 0$.
Example 9.2.2.
has quadratic form
This matrix is not positive definite, since .
Characterizations
For a symmetric matrix $A$:
- Eigenvalue test: $A$ is positive definite if and only if all eigenvalues of $A$ are positive.
- Principal minors test (Sylvester’s criterion): $A$ is positive definite if and only if all leading principal minors (determinants of the top-left $k \times k$ submatrices, $k = 1, \dots, n$) are positive.
- Cholesky factorization: $A$ is positive definite if and only if it can be written as
$$A = R^T R,$$
where $R$ is an upper triangular matrix with positive diagonal entries.
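The principal minors test is easy to try on small matrices. The sketch below, in plain Python, computes the leading principal minors with a naive cofactor expansion, which is adequate for hand-sized matrices but not efficient for large ones. The two test matrices and the helper names are assumptions made for illustration and do not come from the text.

```python
def determinant(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * determinant(minor)
    return total


def leading_principal_minors(A):
    """Determinants of the top-left k x k submatrices, k = 1..n."""
    n = len(A)
    return [determinant([row[:k] for row in A[:k]]) for k in range(1, n + 1)]


def is_positive_definite(A):
    """Sylvester's criterion; A is assumed to be symmetric."""
    return all(m > 0 for m in leading_principal_minors(A))


# Hypothetical test matrices, chosen only for illustration.
A_pd = [[2, -1, 0],
        [-1, 2, -1],
        [0, -1, 2]]
A_indef = [[1, 2],
           [2, 1]]

minors_pd = leading_principal_minors(A_pd)
minors_indef = leading_principal_minors(A_indef)
print(minors_pd, is_positive_definite(A_pd))
print(minors_indef, is_positive_definite(A_indef))
```

Note that the criterion as stated tests positive definiteness only; nonnegative leading minors alone do not establish positive semidefiniteness.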
Geometric Interpretation
- Positive definite matrices correspond to quadratic forms whose level sets $x^T A x = 1$ are ellipsoids centered at the origin.
- Positive semidefinite matrices define flattened ellipsoids (possibly degenerate).
- Indefinite matrices define hyperbolas or saddle-shaped surfaces.
Applications
- Optimization: Hessians of twice-differentiable convex functions are positive semidefinite; a Hessian that is positive definite everywhere guarantees strict convexity.
- Statistics: Covariance matrices are positive semidefinite.
- Numerical methods: Cholesky decomposition is widely used to solve systems with positive definite matrices efficiently.
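To illustrate the numerical use of Cholesky factorization, here is a minimal plain-Python sketch. It assumes the convention $A = R^T R$ with $R$ upper triangular, and solves $Ax = b$ by forward substitution on $R^T y = b$ followed by back substitution on $R x = y$. The function names and the sample system are hypothetical.

```python
import math


def cholesky_upper(A):
    """Return upper triangular R with A = R^T R.

    Raises ValueError if the symmetric matrix A is not positive definite.
    """
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # What is left of a_ii after subtracting the rows already computed.
        s = A[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        R[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            R[i][j] = (A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R


def solve_spd(A, b):
    """Solve A x = b for symmetric positive definite A using A = R^T R."""
    n = len(A)
    R = cholesky_upper(A)
    # Forward substitution: R^T y = b (row i of R^T is column i of R).
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(R[k][i] * y[k] for k in range(i))) / R[i][i]
    # Back substitution: R x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x


# Hypothetical system, chosen only for illustration.
A = [[4, 2],
     [2, 3]]
b = [2, 1]

R = cholesky_upper(A)
x = solve_spd(A, b)
print(R, x)
```

The failure branch doubles as a definiteness test: the factorization completes exactly when every pivot `s` is positive. For real workloads a library routine such as NumPy's `numpy.linalg.cholesky` would normally be preferred; by default it returns the lower triangular factor, which corresponds to $R^T$ here.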
Why this matters
Positive definiteness provides stability and guarantees in mathematics and computation. It ensures that quadratic energy functions are bounded below, that the corresponding minimization problems have unique solutions, and that variances in statistical models are never negative.
Exercises 9.2
- Use Sylvester’s criterion to check whether
is positive definite.
- Determine whether
is positive definite, semidefinite, or indefinite.
- Find the eigenvalues of
and use them to classify definiteness.
- Prove that all diagonal matrices with positive diagonal entries are positive definite.
- Show that if $A$ is positive definite, then so is $P^T A P$ for any invertible matrix $P$.
9.3 Spectral Theorem
The spectral theorem is one of the most powerful results in linear algebra. It states that every real symmetric matrix can be diagonalized by an orthonormal basis of eigenvectors. This links algebra (eigenvalues), geometry (orthogonal directions), and applications (stability, optimization, statistics).
Statement of the Spectral Theorem
If $A \in \mathbb{R}^{n \times n}$ is symmetric ($A^T = A$), then:
- All eigenvalues of $A$ are real.
- There exists an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$.
- Thus, $A$ can be written as
$$A = Q \Lambda Q^T,$$
where $Q$ is an orthogonal matrix ($Q^T Q = I$) and $\Lambda$ is diagonal with the eigenvalues of $A$ on the diagonal.
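The statement can be verified directly in the $2 \times 2$ case, where the eigen-decomposition has a closed form. The sketch below is plain Python under stated assumptions: the matrix $[[5, 2], [2, 2]]$ and the helper names are hypothetical. The helper takes the rotation angle $\theta = \tfrac{1}{2}\operatorname{atan2}(2b, a - d)$, builds an orthogonal $Q$ from it, and checks that $Q \Lambda Q^T$ rebuilds the matrix and that $Q^T Q = I$.

```python
import math


def eigh_2x2(a, b, d):
    """Eigen-decomposition of the symmetric matrix [[a, b], [b, d]].

    Returns (eigenvalues, Q) with eigenvalues in decreasing order and the
    matching orthonormal eigenvectors stored as the columns of Q.
    """
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = [mean + radius, mean - radius]
    # Column (c, s) belongs to the larger eigenvalue, (-s, c) to the smaller.
    Q = [[c, -s],
         [s, c]]
    return eigenvalues, Q


def reconstruct(eigenvalues, Q):
    """Form Q * diag(eigenvalues) * Q^T entry by entry."""
    n = len(eigenvalues)
    return [[sum(Q[i][k] * eigenvalues[k] * Q[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


# Hypothetical symmetric matrix [[5, 2], [2, 2]], chosen only for illustration.
eigenvalues, Q = eigh_2x2(5, 2, 2)
A_rebuilt = reconstruct(eigenvalues, Q)
# Q^T Q should be the identity up to rounding.
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(eigenvalues, A_rebuilt, QtQ)
```

For matrices larger than $2 \times 2$ there is no such closed form, and a library routine for symmetric matrices, for example NumPy's `numpy.linalg.eigh`, is the usual choice.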
Consequences
- Symmetric matrices are always diagonalizable, and because the eigenvector matrix is orthogonal, the diagonalization is numerically stable.
- Quadratic forms $x^T A x$ can be expressed in terms of eigenvalues and eigenvectors, showing ellipsoids aligned with eigen-directions.
- Positive definiteness can be checked by confirming that all eigenvalues are positive.
Example 9.3.1
Let
Recall the determinant formula for a $2 \times 2$ matrix: $\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$.
- Characteristic polynomial:
Eigenvalues: .
- Eigenvectors:
  - For : solve , giving .
  - For : solve , giving .
- Normalize eigenvectors:
- Then
So
Geometric Interpretation
The spectral theorem says every symmetric matrix acts like independent scaling along orthogonal directions. In geometry, this corresponds to stretching space (or reflecting it, when an eigenvalue is negative) along perpendicular axes.
- Ellipses, ellipsoids, and quadratic surfaces can be fully understood via eigenvalues and eigenvectors.
- Orthogonality ensures that the eigenvector directions remain perpendicular after the transformation.
Applications
- Optimization: The spectral theorem underlies classification of critical points via eigenvalues of the Hessian.
- PCA (Principal Component Analysis): Data covariance matrices are symmetric, and PCA finds orthogonal directions of maximum variance.
- Differential equations & physics: Symmetric operators correspond to measurable quantities with real eigenvalues (stability, energy).
Why this matters
The spectral theorem guarantees that symmetric matrices are as simple as possible: they can always be analyzed in terms of real eigenvalues and orthonormal eigenvectors. This provides both deep theoretical insight and powerful computational tools.
Exercises 9.3
- Diagonalize
using the spectral theorem.
- Prove that all eigenvalues of a real symmetric matrix are real.
- Show that eigenvectors corresponding to distinct eigenvalues of a symmetric matrix are orthogonal.
- Explain geometrically how the spectral theorem describes ellipsoids defined by quadratic forms.
- Apply the spectral theorem to the covariance matrix
and interpret the eigenvectors as principal directions of variance.
9.4 Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a widely used technique in data science, machine learning, and statistics. At its core, PCA is an application of the spectral theorem to covariance matrices: it finds orthogonal directions (principal components) that capture the maximum variance in data.
The Idea
Given a dataset of vectors $x_1, x_2, \dots, x_m \in \mathbb{R}^n$:
- Center the data by subtracting the mean vector $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i$.
- Form the covariance matrix
$$C = \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar{x})(x_i - \bar{x})^T$$
(some texts divide by $m - 1$ instead of $m$; the eigenvectors are the same either way).
- Apply the spectral theorem: $C = Q \Lambda Q^T$.
  - Columns of $Q$ are orthonormal eigenvectors (principal directions).
  - Eigenvalues in $\Lambda$ measure the variance explained by each direction.
- The first principal component is the eigenvector corresponding to the largest eigenvalue; it is the direction of maximum variance.
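The steps above can be sketched for two-dimensional data in plain Python. This is an illustration under stated assumptions rather than a general implementation: the sample points are hypothetical, the covariance matrix is normalized by $1/m$, and the eigen-decomposition uses the closed form available for symmetric $2 \times 2$ matrices.

```python
import math


def pca_2d(points):
    """PCA for 2-D points.

    Returns (mean, eigenvalues, Q): eigenvalues in decreasing order, with the
    matching orthonormal principal directions stored as the columns of Q.
    """
    m = len(points)
    # Step 1: center the data by subtracting the mean vector.
    mean = [sum(p[0] for p in points) / m, sum(p[1] for p in points) / m]
    centered = [(p[0] - mean[0], p[1] - mean[1]) for p in points]
    # Step 2: covariance matrix [[cxx, cxy], [cxy, cyy]] with 1/m normalization.
    cxx = sum(u * u for u, _ in centered) / m
    cxy = sum(u * v for u, v in centered) / m
    cyy = sum(v * v for _, v in centered) / m
    # Step 3: spectral decomposition of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    mid = (cxx + cyy) / 2
    radius = math.hypot((cxx - cyy) / 2, cxy)
    eigenvalues = [mid + radius, mid - radius]
    Q = [[c, -s],
         [s, c]]
    return mean, eigenvalues, Q


# Hypothetical data lying close to the line y = x.
points = [(-2.0, -1.9), (-1.0, -1.1), (0.0, 0.1), (1.0, 0.9), (2.0, 2.0)]
mean, eigenvalues, Q = pca_2d(points)

first_pc = (Q[0][0], Q[1][0])  # direction of maximum variance
explained = eigenvalues[0] / (eigenvalues[0] + eigenvalues[1])
print(first_pc, eigenvalues, explained)
```

For this sample the first direction comes out close to $(1, 1)/\sqrt{2}$ and carries almost all of the variance, so projecting onto it loses very little information. For higher-dimensional data, a symmetric eigen-solver such as NumPy's `numpy.linalg.eigh` would replace the closed-form step.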
Example 9.4.1
Suppose we have two-dimensional data points roughly aligned along the line . The covariance matrix is approximately
Eigenvalues are about and . The eigenvector for is approximately .
- First principal component: the line .
- Most variance lies along this direction.
- Second component is nearly orthogonal (), but variance there is tiny.
Thus PCA reduces the data to essentially one dimension.
Applications of PCA
- Dimensionality reduction: Represent data with fewer features while retaining most of the variance.
- Noise reduction: Small eigenvalues often correspond to noise; discarding those components filters the data.
- Visualization: Projecting high-dimensional data onto the top 2 or 3 principal components can reveal structure.
- Compression: PCA is used in image and signal compression.
Connection to the Spectral Theorem
The covariance matrix $C$ is always symmetric and positive semidefinite. Hence, by the spectral theorem, it has an orthonormal basis of eigenvectors with real eigenvalues, and positive semidefiniteness makes those eigenvalues nonnegative. PCA is nothing more than re-expressing the data in this eigenbasis.
Why this matters
PCA demonstrates how abstract linear algebra directly powers modern applications. Eigenvalues and eigenvectors give a practical method for simplifying data, revealing patterns, and reducing complexity. It is one of the most important algorithms derived from the spectral theorem.
Exercises 9.4
- Show that the covariance matrix is symmetric and positive semidefinite.
- Compute the covariance matrix of the dataset , and find its eigenvalues and eigenvectors.
- Explain why the first principal component captures the maximum variance.
- In image compression, explain how PCA can reduce storage by keeping only the top $k$ principal components.
- Prove that the sum of the eigenvalues of the covariance matrix equals the total variance of the dataset.