MIT18.06_5_Orthogonal and Gram-Schmidt
Carpe Tu Black Whistle

Orthogonal

We could say that this is part two of the fundamental theorem of linear algebra. Part one gives the dimensions of the four subspaces, part two says those subspaces come in orthogonal pairs, and part three will be about orthogonal bases for these subspaces.

"In other words, this is Part 2 of the fundamental theorem of linear algebra."

  1. Part one gives the dimensions of the four subspaces.
  2. Part two gives the orthogonality relations between those subspaces.
  3. Part three will be about orthogonal bases for these subspaces.

Orthogonal vectors and subspaces

In this lecture we learn what it means for vectors, bases and subspaces to be orthogonal. The symbol for this is $\perp$.
The “big picture” of this course is that the row space of a matrix is orthogonal to its nullspace, and its column space is orthogonal to its left nullspace.


[Figure: the big picture of the four fundamental subspaces, with $C(A^T) \perp N(A)$ in $\mathbb{R}^n$ and $C(A) \perp N(A^T)$ in $\mathbb{R}^m$]

The matrix has shape $m \times n$:

  • Each row/column space is orthogonal to a nullspace lying in the same ambient space.
  • The row space and the nullspace live in $\mathbb{R}^n$.
  • The column space and the nullspace of $A^T$ live in $\mathbb{R}^m$.

Orthogonal vectors

Orthogonal is just another word for perpendicular. Two vectors are orthogonal if the angle between them is 90 degrees. If two vectors $x$ and $y$ are orthogonal, they form a right triangle whose hypotenuse is the sum $x + y$. Thus, we can use the Pythagorean theorem to prove that the dot product $x^Ty$ is zero exactly when $x$ and $y$ are orthogonal. (The length squared $\|x\|^2$ equals $x^Tx$.) Indeed, expanding $\|x + y\|^2 = (x + y)^T(x + y) = x^Tx + 2x^Ty + y^Ty$ shows that the Pythagorean identity $\|x\|^2 + \|y\|^2 = \|x + y\|^2$ holds exactly when $x^Ty = 0$.

  • Orthogonal is just another name for perpendicular.
  • Two vectors are orthogonal when the angle between them is 90 degrees.
  • Pythagorean theorem: two mutually perpendicular vectors form the legs of a right triangle.
  • The hypotenuse is the sum of the two vectors.
  • This proves the dot product criterion: $x^Ty = 0$ exactly when $x \perp y$.

This gives a way to test orthogonality by computation.

Note that, as a special case, all vectors are orthogonal to the zero vector.

Orthogonal subspaces

Subspace $S$ is orthogonal to subspace $T$ means: every vector in $S$ is orthogonal to every vector in $T$.

The blackboard is not orthogonal to the floor; two vectors in the line where the blackboard meets the floor aren’t orthogonal to each other.

Note carefully that the blackboard and the floor are not two orthogonal subspaces. Imagine taking a ray in each plane at a 45-degree angle to the line of intersection; these two rays are not perpendicular.

In the plane, the space containing only the zero vector and any line through the origin are orthogonal subspaces. A line through the origin and the whole plane are never orthogonal subspaces. Two lines through the origin are orthogonal subspaces if they meet at right angles.

In the plane:

  • The subspace containing only the zero vector is orthogonal to any line through the origin.
  • A line through the origin and the whole plane are never orthogonal subspaces.
  • Two lines through the origin are orthogonal subspaces only when they meet at right angles.

Nullspace is perpendicular to row space

The row space of a matrix is orthogonal to the nullspace, because $Ax = 0$ means the dot product of $x$ with each row of $A$ is $0$. But then the product of $x$ with any combination of rows of $A$ must be 0.

Conclusion: the row space of a matrix is orthogonal to its nullspace.

  • $Ax = 0$ says that the dot product (inner product) of every row of $A$ with $x$ must be 0.
  • So $x$ is orthogonal to a spanning set of the row space.
  • Hence $x$ is orthogonal to every vector in the row space.

The column space is orthogonal to the left nullspace of $A$ because the row space of $A^T$ is perpendicular to the nullspace of $A^T$.

The analogous conclusion: the column space of a matrix is orthogonal to its left nullspace.

In some sense, the row space and the nullspace of a matrix subdivide $\mathbb{R}^n$ into two perpendicular subspaces. For $A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 4 & 10 \end{bmatrix}$, the row space has dimension 1 and basis $\begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}$, and the nullspace has dimension 2 and is the plane through the origin perpendicular to that vector.

To some extent, the row space and the nullspace split the space into two perpendicular subspaces (a numerical check follows this list):

  • Here the ambient space is three-dimensional and $\operatorname{rank}(A) = 1$.
  • So the row space has dimension 1: a line.
  • The nullspace has dimension 2: a plane.
  • The two subspaces are perpendicular to each other.
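
A quick numerical check of this example (a minimal NumPy sketch; the two nullspace vectors are read off from the special solutions of $Ax = 0$):

```python
import numpy as np

# Rank-1 example: the row space is the line through (1, 2, 5).
A = np.array([[1, 2, 5],
              [2, 4, 10]])

row = np.array([1, 2, 5])           # basis vector of the row space
n1 = np.array([-2, 1, 0])           # special solutions of Ax = 0,
n2 = np.array([-5, 0, 1])           # a basis of the nullspace

print(A @ n1, A @ n2)               # [0 0] [0 0]: both lie in N(A)
print(row @ n1, row @ n2)           # 0 0: the row space is orthogonal to N(A)
print(np.linalg.matrix_rank(A))     # 1, and 1 + 2 = 3 = n
```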

Not only is the nullspace orthogonal to the row space, their dimensions add up to the dimension of the whole space.

dimension of the row space + dimension of the nullspace = dimension of the whole space: $\dim C(A^T) + \dim N(A) = n$.

We say that the nullspace and the row space are orthogonal complements in $\mathbb{R}^n$.

The nullspace contains all the vectors that are perpendicular to the row space, and vice versa.

What follows previews the material on least squares and projections:

Due to measurement error (for instance when computing a satellite’s orbit), $Ax = b$ is often unsolvable if $m > n$. Our next challenge is to find the best possible solution in this case.

The matrix $A^TA$ plays a key role in this effort: the central equation is $A^TA\hat{x} = A^Tb$.
We know that $A^TA$ is square ($n \times n$) and symmetric. When is it invertible?
Suppose $A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}$. Then:

$$A^TA = \begin{bmatrix} 3 & 8 \\ 8 & 30 \end{bmatrix}$$

is invertible. $A^TA$ is not always invertible. In fact:

$$N(A^TA) = N(A), \qquad \operatorname{rank}(A^TA) = \operatorname{rank}(A).$$

We conclude that $A^TA$ is invertible exactly when $A$ has independent columns.

  • The square symmetric matrix $A^TA$ plays the central role in the least squares algorithm.
  • The central equation is $A^TA\hat{x} = A^Tb$.
  1. $A^TA$ is not always invertible.
  2. $A^TA$ is invertible exactly when the columns of $A$ are independent (see the check below).
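
A small NumPy check of this claim, using the example $A$ above together with a made-up matrix $B$ whose columns are dependent:

```python
import numpy as np

A = np.array([[1, 1],
              [1, 2],
              [1, 5]])
print(A.T @ A)                    # [[ 3  8] [ 8 30]]
print(np.linalg.det(A.T @ A))     # 26.0, nonzero: A^T A is invertible

B = np.array([[1, 2],
              [1, 2],
              [1, 2]])            # second column = 2 * first column
print(np.linalg.det(B.T @ B))     # 0.0: B^T B is singular
```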

Projections onto subspaces

Projections

If we have a vector $b$ and a line determined by a vector $a$, how do we find the point on the line that is closest to $b$?

[Figure 1: the projection $p$ of $b$ onto the line through $a$, with error $e = b - p$]

  • The vector $p$ lies on the line through $a$.
  • The vector $p$ is the projection of the vector $b$ onto the vector $a$.

We can see from Figure 1 that this closest point $p$ is at the intersection formed by a line through $b$ that is orthogonal to $a$. If we think of $p$ as an approximation of $b$, then the length of $e = b - p$ is the error in that approximation.
We could try to find $p$ using trigonometry or calculus, but it’s easier to use linear algebra. Since $p$ lies on the line through $a$, we know $p = xa$ for some number $x$. We also know that $a$ is perpendicular to $e = b - xa$:

$$a^T(b - xa) = 0,$$
$$xa^Ta = a^Tb,$$
$$x = \frac{a^Tb}{a^Ta} \qquad \text{and} \qquad p = ax = a\,\frac{a^Tb}{a^Ta}.$$

Doubling $b$ doubles $p$. Doubling $a$ does not affect $p$.

The perpendicularity of the projection error $e$ to the vector $a$ being projected onto is what determines the scaling factor $x$.

  • Doubling the vector $b$ also doubles the projection $p$.
  • Doubling the vector $a$ leaves the projection $p$ unchanged (both facts are checked in the sketch below).
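
A minimal NumPy sketch of projection onto a line; the particular vectors are made up for illustration:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])     # direction of the line (illustrative)
b = np.array([3.0, 1.0, 1.0])     # vector to project (illustrative)

x = (a @ b) / (a @ a)             # x = a^T b / a^T a
p = x * a                         # projection p = xa
print(a @ (b - p))                # ~0: the error e = b - p is perpendicular to a

p2b = (a @ (2 * b)) / (a @ a) * a          # doubling b ...
print(np.allclose(p2b, 2 * p))             # ... doubles p: True

a2 = 2 * a                                 # doubling a ...
p2a = (a2 @ b) / (a2 @ a2) * a2
print(np.allclose(p2a, p))                 # ... leaves p unchanged: True
```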

Projection matrix

We’d like to write this projection in terms of a projection matrix $P$, so that $p = Pb$:

$$p = a\,\frac{a^Tb}{a^Ta} = \frac{aa^T}{a^Ta}\,b,$$

so the matrix is:

$$P = \frac{aa^T}{a^Ta}.$$

Note that $aa^T$ is a three by three matrix, not a number; matrix multiplication is not commutative.

Matrix multiplication is not commutative: $aa^T$ is a matrix while $a^Ta$ is a number.

The column space of $P$ is spanned by $a$ because for any $b$, $Pb$ lies on the line determined by $a$. The rank of $P$ is 1. $P$ is symmetric. $P^2 = P$ because the projection of a vector already on the line through $a$ is just that vector. In general, projection matrices have the properties:

$$P^T = P \qquad \text{and} \qquad P^2 = P.$$

  • The column space of $P$ is spanned by $a$: for any $b$, $Pb$ lies on the line determined by $a$.
  • The rank of $P$ is 1.
  • $P$ is symmetric and $P^2 = P$.

**Proof of symmetry:** the denominator $a^Ta$ is a number (the inner product of $a$ with itself), so $P$ is just a scaling of $aa^T$, and $(aa^T)^T = aa^T$. These properties are checked numerically below.
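
A short check of these properties, reusing the illustrative vector $a$ from the earlier sketch:

```python
import numpy as np

a = np.array([[1.0], [2.0], [2.0]])   # a as a column vector (illustrative)
P = (a @ a.T) / (a.T @ a)             # P = aa^T / a^T a, here a 3x3 matrix

print(np.linalg.matrix_rank(P))       # 1: the column space of P is the line through a
print(np.allclose(P.T, P))            # True: P is symmetric
print(np.allclose(P @ P, P))          # True: projecting twice changes nothing
```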

Why project?

As we know, the equation $Ax = b$ may have no solution. The vector $Ax$ is always in the column space of $A$, and $b$ is unlikely to be in the column space. So, we project $b$ onto a vector $p$ in the column space of $A$ and solve $A\hat{x} = p$.

  • $Ax = b$ does not always have a solution.
  • The vector $Ax$ always lies in the column space of $A$, but $b$ need not.
  • We project $b$ onto the column space; with the projection $p$, the system $A\hat{x} = p$ has a solution, the best available one.

Projection in higher dimensions

Projection in a higher-dimensional space (here: onto a plane in $\mathbb{R}^3$).

In $\mathbb{R}^3$, how do we project a vector $b$ onto the closest point $p$ in a plane?
If $a_1$ and $a_2$ form a basis for the plane, then that plane is the column space of the matrix $A = \begin{bmatrix} a_1 & a_2 \end{bmatrix}$.

$a_1, a_2$ form a basis for the plane; by the perpendicularity condition, the projection error must be perpendicular to both basis vectors.

We know that $p = \hat{x}_1a_1 + \hat{x}_2a_2 = A\hat{x}$. We want to find $\hat{x}$. There are many ways to show that $e = b - A\hat{x}$ is orthogonal to the plane we’re projecting onto, after which we can use the fact that $e$ is perpendicular to $a_1$ and $a_2$:

$$a_1^T(b - A\hat{x}) = 0 \qquad \text{and} \qquad a_2^T(b - A\hat{x}) = 0.$$

In matrix form, $A^T(b - A\hat{x}) = 0$. When we were projecting onto a line, $A$ only had one column and so this equation looked like: $a^T(b - \hat{x}a) = 0$.
Note that $e = b - A\hat{x}$ is in the nullspace of $A^T$ and so is in the left nullspace of $A$. We know that everything in the left nullspace of $A$ is perpendicular to the column space of $A$, so this is another confirmation that our calculations are correct.
We can rewrite the equation $A^T(b - A\hat{x}) = 0$ as:

$$A^TA\hat{x} = A^Tb.$$

Combining the perpendicularity equations for the individual basis vectors into one matrix product gives the expression above, which rearranges to $A^TA\hat{x} = A^Tb$; it comes from $A^T(b - A\hat{x}) = 0$.
**An important concept:** the projection of the vector $b$ onto the column space of the matrix $A$ is $p = A\hat{x}$, with predicted coefficients $\hat{x}$.

When projecting onto a line, $A^TA$ was just a number; now it is a square matrix. So instead of dividing by $a^Ta$ we now have to multiply by $(A^TA)^{-1}$.
In $n$ dimensions,

$$\hat{x} = (A^TA)^{-1}A^Tb, \qquad p = A\hat{x} = A(A^TA)^{-1}A^Tb, \qquad P = A(A^TA)^{-1}A^T.$$

When projecting onto a line (the column space of $A$ has dimension 1), $A^TA$ is a real number; in higher dimensions it is a square matrix. A worked sketch follows.
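
A NumPy sketch of projecting onto a plane; the matrix and vector here are illustrative choices, not from the text above:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])                 # columns a1, a2 span a plane in R^3
b = np.array([6.0, 0.0, 0.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)  # solve A^T A x̂ = A^T b
p = A @ x_hat                              # p = A x̂, the projection of b
e = b - p                                  # error component, in N(A^T)

print(A.T @ e)                             # ~[0 0]: e is perpendicular to both columns
P = A @ np.linalg.inv(A.T @ A) @ A.T       # P = A (A^T A)^{-1} A^T
print(np.allclose(P @ b, p))               # True
```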

It’s tempting to try to simplify these expressions, but if $A$ isn’t a square matrix we can’t say that $(A^TA)^{-1} = A^{-1}(A^T)^{-1}$. If $A$ does happen to be a square, invertible matrix then its column space is the whole space and contains $b$. In this case $P$ is the identity, as we find when we simplify $A(A^TA)^{-1}A^T$. It is still true that:

$$P^T = P \qquad \text{and} \qquad P^2 = P.$$

  • If $A$ is not square, $(A^TA)^{-1} = A^{-1}(A^T)^{-1}$ does not hold.
  • If $A$ happens to be square and invertible, the vector $b$ is already in the column space of $A$, and $P = I$.


Least Squares

Suppose we’re given a collection of data points:

$$(1, 1), \quad (2, 2), \quad (3, 2),$$

and we want to find the closest line $b = C + Dt$ to that collection. If the line went through all three points, we’d have:

$$C + D = 1$$
$$C + 2D = 2$$
$$C + 3D = 2,$$

which is equivalent to:

$$Ax = b, \qquad A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad x = \begin{bmatrix} C \\ D \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}.$$

In the matrix equation above, from left to right: the matrix $A$, the vector $x$ of unknowns, and the vector $b$.
The line to be fitted is $b = C + Dt$; two parameters need to be determined.

In our example the line does not go through all three points, so this equation is not solvable. Instead we’ll solve:

$$A^TA\hat{x} = A^Tb.$$

Projection matrices and least squares

Projections

Last lecture, we learned that $P = A(A^TA)^{-1}A^T$ is the matrix that projects a vector $b$ onto the space spanned by the columns of $A$. If $b$ is perpendicular to the column space, then it’s in the left nullspace $N(A^T)$ of $A$ and $Pb = 0$. If $b$ is in the column space then $b = Ax$ for some $x$, and $Pb = b$.

  • If the vector $b$ is perpendicular to the column space, it lies in $N(A^T)$ and its projection is $Pb = 0$ (both subspaces pass through the origin).
  • If the vector $b$ is in the column space, then $Pb = b$.

**A very important point here:**

$Pb = b$ does not mean that $P$ is the identity matrix; for example, $Pb = 0$ whenever $b$ happens to be perpendicular to the column space. $P$ depends only on the matrix $A$, not on the vector $b$.

A typical vector will have a component $p$ in the column space and a component $e$ perpendicular to the column space (in the left nullspace); its projection is just the component in the column space.
The matrix projecting $b$ onto $e$ is:

$$I - P, \qquad e = (I - P)b.$$

Naturally, $I - P$ has all the properties of a projection matrix.

**$I - P$ can be thought of as the error projection matrix (my own name for it); it has all the properties of a projection matrix.**

Least squares

[Figure: the three data points and the closest line $b = C + Dt$]

We want to find the closest line $b = C + Dt$ to the points $(1, 1)$, $(2, 2)$ and $(3, 2)$. The process we’re going to use is called linear regression; this technique is most useful if none of the data points are outliers.

  • This process is called linear regression.
  • It works well provided none of the data points are outliers.

By “closest” line we mean one that minimizes the error represented by the distance from the points to the line. We measure that error by adding up the squares of these distances. In other words, we want to minimize $\|Ax - b\|^2 = \|e\|^2$.

To measure the error from the sample points to the line, we use the sum of the squared distances.

If the line went through all three points, we’d have:

$$C + D = 1$$
$$C + 2D = 2$$
$$C + 3D = 2,$$

but this system is unsolvable. It’s equivalent to $Ax = b$, where:

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}.$$

There are two ways of viewing this. In the space of the line we’re trying to find, $e_1$, $e_2$ and $e_3$ are the vertical distances from the data points to the line. The components $p_1$, $p_2$ and $p_3$ are the values of $C + Dt$ near each data point.
In the other view we have a vector $b$ in $\mathbb{R}^3$, its projection $p$ onto the column space of $A$, and its projection $e$ onto $N(A^T)$.

There are two ways to view the matrix equation above:

  1. Look for the error components $e_1, e_2, e_3$ with the smallest sum of squares, where the fitted (projected) values satisfy $p_i = \hat{C} + \hat{D}t_i$.
  2. The vector $b$ in $\mathbb{R}^3$ is split by projection: $p$, its projection onto the column space of $A$, lies in $C(A)$; the error $e$ is its projection onto the left nullspace $N(A^T)$.

[Figure: the vector $b$ split into its projection $p$ in $C(A)$ and the error $e$ in $N(A^T)$]

We will now find $\hat{x} = \begin{bmatrix} \hat{C} \\ \hat{D} \end{bmatrix}$ and $p$. We know:

$$A^TA\hat{x} = A^Tb, \qquad A^TA = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}, \qquad A^Tb = \begin{bmatrix} 5 \\ 11 \end{bmatrix}.$$

From this we get the normal equations:

$$3\hat{C} + 6\hat{D} = 5$$
$$6\hat{C} + 14\hat{D} = 11.$$

We solve these to find $\hat{D} = 1/2$ and $\hat{C} = 2/3$.

**The error vector $e$ here is the vertical distance between the regression value and the ground truth.**

We could also have used calculus to find the minimum of the following function of two variables:

$$e_1^2 + e_2^2 + e_3^2 = (C + D - 1)^2 + (C + 2D - 2)^2 + (C + 3D - 2)^2.$$

Either way, we end up solving a system of linear equations to find that the closest line to our points is $b = \frac{2}{3} + \frac{1}{2}t$.
This gives us:

$$p_1 = \frac{7}{6}, \quad p_2 = \frac{5}{3}, \quad p_3 = \frac{13}{6},$$

or $p = \begin{bmatrix} 7/6 \\ 5/3 \\ 13/6 \end{bmatrix}$ and $e = \begin{bmatrix} -1/6 \\ 1/3 \\ -1/6 \end{bmatrix}$. Note that $p$ and $e$ are orthogonal, and also that $e$ is perpendicular to the columns of $A$.
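
The same worked example in NumPy, confirming $\hat{C} = 2/3$, $\hat{D} = 1/2$ and the orthogonality claims:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations A^T A x̂ = A^T b
p = A @ x_hat
e = b - p

print(x_hat)      # [0.6667 0.5]  ->  Ĉ = 2/3, D̂ = 1/2
print(p)          # [1.1667 1.6667 2.1667]  =  (7/6, 5/3, 13/6)
print(p @ e)      # ~0: p and e are orthogonal
print(A.T @ e)    # ~[0 0]: e is perpendicular to the columns of A
```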

The matrix $A^TA$

We’ve been assuming that the matrix $A^TA$ is invertible. Is this justified?

So far we have been assuming that $A^TA$ is invertible; here we justify that assumption.

If $A$ has independent columns, then $A^TA$ is invertible.

To prove this we assume that $A^TAx = 0$, then show that it must be true that $x = 0$:

$$A^TAx = 0$$
$$x^TA^TAx = 0$$
$$(Ax)^T(Ax) = 0$$
$$Ax = 0.$$

Since $A$ has independent columns, $Ax = 0$ implies $x = 0$. So $A^TAx = 0$ only for $x = 0$, which proves $A^TA$ is invertible.

As long as the columns of $A$ are independent, we can use linear regression to find approximate solutions to unsolvable systems of linear equations.
The columns of $A$ are guaranteed to be independent if they are orthonormal, i.e. if they are perpendicular unit vectors like $\begin{bmatrix}1\\0\\0\end{bmatrix}$, $\begin{bmatrix}0\\1\\0\end{bmatrix}$ and $\begin{bmatrix}0\\0\\1\end{bmatrix}$, or like $\begin{bmatrix}\cos\theta\\\sin\theta\end{bmatrix}$ and $\begin{bmatrix}-\sin\theta\\\cos\theta\end{bmatrix}$.

Orthogonal matrices and Gram-Schmidt

In this lecture we finish introducing orthogonality.
Using an orthonormal basis or a matrix with orthonormal columns makes calculations much easier.
The Gram-Schmidt process starts with any basis and produces an orthonormal basis that spans the same space as the original basis.

  • In this lecture we finish the introduction of orthogonality.
  • Using an orthonormal basis, or a matrix with orthonormal columns, makes calculations much easier.
  • The Gram-Schmidt process starts from an arbitrary basis and generates an orthonormal basis spanning the same space.

Orthonormal vectors

The vectors $q_1, q_2, \ldots, q_n$ are orthonormal if:

$$q_i^Tq_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases}$$

In other words, they all have (normal) length 1 and are perpendicular (ortho) to each other. Orthonormal vectors are always independent.

For a set of vectors $q_1, \ldots, q_n$:

  1. The dot product of each vector with itself is 1: $q_i^Tq_i = 1$;
  2. The dot product of two different vectors is 0: $q_i^Tq_j = 0$ for $i \neq j$.

Orthonormal matrix

If the columns of $Q = \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix}$ are orthonormal, then $Q^TQ = I$ is the identity.
Matrices with orthonormal columns are a new class of important matrices to add to those on our list: triangular, diagonal, permutation, symmetric, reduced row echelon, and projection matrices. We’ll call them “orthonormal matrices”.
A square orthonormal matrix $Q$ is called an orthogonal matrix. If $Q$ is square, then $Q^TQ = I$ tells us that $Q^T = Q^{-1}$.
For example, if $Q = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ then $Q^T = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$. Both $Q$ and $Q^T$ are orthogonal matrices, and their product is the identity.

  • Write the matrix as a collection of column vectors $Q = \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix}$.
  • In $Q^TQ$, only the entries on the main diagonal are 1; all others are 0.
  • From this, a square $Q$ satisfies $Q^T = Q^{-1}$ (checked numerically below).
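
A short check of $Q^TQ = I$ and $Q^T = Q^{-1}$ on the permutation example above:

```python
import numpy as np

Q = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])                   # square matrix with orthonormal columns

print(np.allclose(Q.T @ Q, np.eye(3)))      # True: Q^T Q = I
print(np.allclose(Q.T, np.linalg.inv(Q)))   # True: Q^T = Q^{-1}
```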

The matrix $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ is orthogonal. The matrix $\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ is not, but we can adjust that matrix to get the orthogonal matrix $Q = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. We can use the same tactic to find some larger orthogonal matrices called Hadamard matrices:

$$Q = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}.$$

An example of a rectangular matrix with orthonormal columns is:

$$Q = \frac{1}{3}\begin{bmatrix} 1 & -2 \\ 2 & -1 \\ 2 & 2 \end{bmatrix}.$$

We can extend this to a (square) orthogonal matrix:

$$Q = \frac{1}{3}\begin{bmatrix} 1 & -2 & 2 \\ 2 & -1 & -2 \\ 2 & 2 & 1 \end{bmatrix}.$$

These examples are particularly nice because they don’t include complicated square roots.


Orthonormal columns are good

Suppose $Q$ has orthonormal columns. The matrix that projects onto the column space of $Q$ is:

$$P = Q(Q^TQ)^{-1}Q^T.$$

If the columns of $Q$ are orthonormal, then $Q^TQ = I$ and $P = QQ^T$. If $Q$ is square, then $P = I$ because the columns of $Q$ span the entire space.

The matrix projecting onto the column space of $Q$ is $P = Q(Q^TQ)^{-1}Q^T$.
By orthonormality $Q^TQ = I$, so $P = QQ^T$. If in addition $Q$ is square, we conclude $P = I$.
$P = I$ means that the columns of a square $Q$ span the entire space.

Many equations become trivial when using a matrix with orthonormal columns. If our basis is orthonormal, the projection component $\hat{x}_i$ is just $q_i^Tb$ because $A^TA\hat{x} = A^Tb$ becomes $\hat{x} = Q^Tb$.

Many matrix computations become much simpler once orthonormality is taken into account; the sketch below uses the rectangular $Q$ from the example above.
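
With orthonormal columns the normal equations collapse to $\hat{x} = Q^Tb$, and the projection matrix simplifies to $QQ^T$ (the vector $b$ is an illustrative choice):

```python
import numpy as np

Q = np.array([[1, -2],
              [2, -1],
              [2,  2]]) / 3.0              # orthonormal columns (example above)
print(np.allclose(Q.T @ Q, np.eye(2)))     # True: Q^T Q = I

b = np.array([3.0, 0.0, 3.0])              # illustrative vector
x_hat = Q.T @ b                            # x̂ = Q^T b: no system to solve
p = Q @ x_hat                              # projection of b onto C(Q)
print(np.allclose(Q @ Q.T @ b, p))         # True: P = QQ^T
```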

Gram-Schmidt

With elimination, our goal was “make the matrix triangular”. Now our goal is “make the matrix orthonormal”.

When using elimination, our goal was to make the matrix triangular.
Now the goal is to make the matrix orthonormal.

We start with two independent vectors $a$ and $b$ and want to find orthonormal vectors $q_1$ and $q_2$ that span the same plane.
We start by finding orthogonal vectors $A$ and $B$ that span the same space as $a$ and $b$.
Then the unit vectors $q_1 = \frac{A}{\|A\|}$ and $q_2 = \frac{B}{\|B\|}$ form the desired orthonormal basis.

  • We start from two independent vectors $a, b$.
  • Through projections we obtain the orthogonal vectors $A, B$, and from them the orthonormal basis $q_1, q_2$.
  • An important property: $q_1, q_2$ span the same space as $a, b$.

Let $A = a$. We get a vector orthogonal to $A$ in the space spanned by $a$ and $b$ by projecting $b$ onto $A$ and letting $B = b - p$. ($B$ is what we previously called $e$.)

$$B = b - \frac{A^Tb}{A^TA}A.$$

If we multiply both sides of this equation by $A^T$, we see that $A^TB = A^Tb - \frac{A^Tb}{A^TA}A^TA = 0$.

  • Let $A = a$: the first vector of the new basis is taken as-is.
  • $B$ is the vector $b$ minus the projection of $b$ onto $A$; the identity $A^TB = 0$ can be used to check the result.

**Note: $\frac{A^Tb}{A^TA}$ is a real number**, the scaling factor of the projection of the vector $b$ onto the vector $A$.

What if we had started with three independent vectors $a$, $b$ and $c$? Then we’d find a vector $C$ orthogonal to both $A$ and $B$ by subtracting from $c$ its components in the $A$ and $B$ directions:

$$C = c - \frac{A^Tc}{A^TA}A - \frac{B^Tc}{B^TB}B.$$

To get a third orthogonal vector from three independent vectors, we only need to subtract the projections onto the two orthogonal vectors already found.

For example, suppose $a = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}$. Then $A = a$ and:

$$B = b - \frac{A^Tb}{A^TA}A = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} - \frac{3}{3}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}.$$

Normalizing, we get:

$$q_1 = \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad q_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}.$$

The column space of $Q = \begin{bmatrix} q_1 & q_2 \end{bmatrix}$ is the plane spanned by $a$ and $b$.

Having obtained the orthogonal vectors, we divide each by its length to get the orthonormal basis; a runnable version of the procedure follows.
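
A minimal sketch of the classical Gram-Schmidt process (the helper name gram_schmidt is mine, not from the lecture); it reproduces $q_1$ and $q_2$ from the example above:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return a matrix whose columns are an orthonormal basis of span(vectors)."""
    basis = []
    for v in vectors:
        w = v.astype(float)
        for q in basis:
            w = w - (q @ v) * q      # subtract the component of v along each earlier q
        basis.append(w / np.linalg.norm(w))
    return np.column_stack(basis)

a = np.array([1, 1, 1])
b = np.array([1, 0, 2])
Q = gram_schmidt([a, b])
print(Q)                                 # q1 = (1,1,1)/sqrt(3), q2 = (0,-1,1)/sqrt(2)
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: the columns are orthonormal
```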

When we studied elimination, we wrote the process in terms of matrices and found $A = LU$. A similar equation $A = QR$ relates our starting matrix $A$ to the result $Q$ of the Gram-Schmidt process. Where $L$ was lower triangular, $R$ is upper triangular.
Suppose $A = \begin{bmatrix} a_1 & a_2 \end{bmatrix}$. Then:

$$A = QR: \qquad \begin{bmatrix} a_1 & a_2 \end{bmatrix} = \begin{bmatrix} q_1 & q_2 \end{bmatrix}\begin{bmatrix} q_1^Ta_1 & q_1^Ta_2 \\ q_2^Ta_1 & q_2^Ta_2 \end{bmatrix}.$$
The derivation of the relation $A = QR$ from this setup uses the orthogonality of each $q_i$ to the other vectors.
**Thanks to Hengchuan Zou for the proof (it took me quite a while to work through).**

As with elimination, we describe the process in matrix terms. Just as in $A = LU$, we factor the matrix $A$ of independent columns into an orthonormal matrix $Q$ times a relation matrix $R$.

If $R$ is upper triangular, then it should be true that $q_2^Ta_1 = 0$. This must be true because we chose $q_1$ to be a unit vector in the direction of $a_1$. All the later $q_i$ were chosen to be perpendicular to the earlier ones.

If $R$ is upper triangular: $a_1$ was chosen as the direction of the first orthonormal vector $q_1$, and every later orthonormal vector is perpendicular to the earlier directions, so $q_i^Ta_j = 0$ for $i > j$.

Notice that $R = Q^{-1}A = Q^TA$. This makes sense; $Q^{-1} = Q^T$ for a matrix with orthonormal columns.

Here we use the inversion property of orthonormal columns: the inverse is the transpose. A numerical confirmation follows.
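
Using the Gram-Schmidt example above as the columns of $A$: $R = Q^TA$ comes out upper triangular and $QR$ reproduces $A$:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [1.0, 2.0]])            # columns a1 = (1,1,1), a2 = (1,0,2)

q1 = A[:, 0] / np.linalg.norm(A[:, 0])
B = A[:, 1] - (q1 @ A[:, 1]) * q1     # Gram-Schmidt: remove the q1 component
q2 = B / np.linalg.norm(B)
Q = np.column_stack([q1, q2])

R = Q.T @ A
print(R)                              # [[1.732 1.732] [0. 1.414]]: upper triangular
print(np.allclose(Q @ R, A))          # True: A = QR
```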