
Chapter 2. Regression -- 02. Multiple Linear Regression (Translation)


Well, let’s give you a quick introduction to multiple linear regression. This is our
usual regression setup with the features, the label, and the predicted label, f(x).
Now I’m going to propose a very simple function that takes into account all the various factors
and weights each of them by some amount, okay? Now this model is linear, meaning it’s just
a weighted combination of the factors. But in reality, the model could be very complex.
For instance, you could multiply some of the features together and use those as a new feature
if you wanted to. In any case, where am I going to get these weights in reality, right?
I could make them up, but that’s not really a good idea because it ignores all the data.
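To make that linear form concrete, here is a minimal sketch, not from the lecture itself, of a prediction function that weights each feature by a coefficient; the coefficient and feature values below are made up purely for illustration.

```python
# A minimal sketch of a linear predictive model:
# f(x) = b0 + b1*x1 + b2*x2 + ..., i.e. a weighted combination of the features.

def f(x, b):
    """Predict the label for one example x (a list of feature values),
    given an intercept b[0] and one weight per feature in b[1:]."""
    return b[0] + sum(weight * feature for weight, feature in zip(b[1:], x))

# Illustrative, made-up coefficients and feature values (not fitted from data).
b = [0.0, 3.0, 10.0, 100.0]   # b0 (intercept), b1, b2, b3
x = [2.0, 0.5, 1.0]           # one example's feature values
print(f(x, b))                # 0 + 3*2 + 10*0.5 + 100*1 = 111.0
```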
So let’s learn them instead. So I’m going to stick to linear models like this for now. So the b’s are just these coefficients, which are, you know, 0, 3, 10, 100, whatever, in the example. So we’ll take our linear model, and then the method of least squares is simply: choose the coefficients, the b’s, to minimize the sum of squared errors. Okay, so the weights are learned from data? Right, how do we do this? This is an optimization problem that’s exactly the same as the one for simple linear regression; the only difference is that the function has multiple terms in it. And this is called the method of least squares; once again, you don’t need to program
this yourself. In fact, actually solving this problem is particularly easy for a computer, since there’s an analytical solution to the minimization problem for this case. So all you need to do is give it the data and tell it to do least squares regression and, boom, you’re done. You have the b’s, and you can use them in your predictive model f.
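To show what “give it the data and tell it to do least squares regression” can look like in practice, here is a small sketch; the lecture doesn’t name any particular tool, so NumPy’s lstsq is used purely as one illustrative choice, and the data are made up.

```python
import numpy as np

# Made-up training data: 4 examples, 2 features each, plus their labels.
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([5.0, 4.0, 7.0, 12.0])

# Add a column of ones so the first coefficient acts as the intercept b0.
X1 = np.column_stack([np.ones(len(X)), X])

# Method of least squares: choose the b's to minimize the sum of squared
# errors ||y - X1 @ b||^2.  lstsq computes the analytical solution.
b, residuals, rank, singular_values = np.linalg.lstsq(X1, y, rcond=None)

print("coefficients b:", b)

# The fitted b's plug straight into the predictive model f.
x_new = np.array([1.0, 2.5, 1.5])   # [1, x1, x2] for a new example
print("prediction:", x_new @ b)
```

Higher-level wrappers (for example, LinearRegression in scikit-learn) solve exactly the same minimization; either way, you never have to program the solver yourself.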
You can be creative in the choice of features you’re going to use. Like I said, you can use polynomials, you can square the variables, cube the variables, you can multiply them together, you could create indicator variables, like a variable that is 1 if the age of the person is above 60 and 0 otherwise. If you put too many variables in there, the optimization problem gets harder, and you could run into curse-of-dimensionality issues, but you really should put what you think are all the potentially important factors in there.
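Here is a sketch of the kinds of engineered features just described: squared variables, products of variables, and an indicator such as “1 if the person’s age is above 60, else 0.” The age threshold comes from the lecture’s example; the second feature name and all the numbers are made up for illustration.

```python
import numpy as np

# Made-up raw data: each row is [age, income].
raw = np.array([[35.0, 50.0],
                [62.0, 30.0],
                [71.0, 80.0]])
age, income = raw[:, 0], raw[:, 1]

# Engineered features. The model stays linear in the b's even though these
# columns are nonlinear functions of the original variables.
features = np.column_stack([
    age,                       # original variable
    income,                    # original variable
    age ** 2,                  # squared term (a simple polynomial feature)
    age * income,              # two features multiplied together
    (age > 60).astype(float),  # indicator: 1 if age is above 60, else 0
])
print(features)
```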
If you did put polynomials in there, it would allow the function to be kind of curvy and
interesting, sort of like that with all the curves; although you can see already that
if you put too many interesting features in there, you do run the risk of over-fitting.
So we’re going to have to handle that a few lectures from now, but you really should
put all the main features that you think are likely to be predictive.
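As a toy illustration of that over-fitting risk (entirely made-up data, not part of the lecture): with enough polynomial terms the fitted curve can pass through every training point, so a tiny training error by itself does not mean the model will predict well on new data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: a roughly linear relationship plus noise.
x = np.linspace(0.0, 1.0, 8)
y = 2.0 + 3.0 * x + rng.normal(scale=0.3, size=x.size)

for degree in (1, 7):
    # polyfit does least squares regression on polynomial features of x.
    coeffs = np.polyfit(x, y, degree)
    sse = np.sum((y - np.polyval(coeffs, x)) ** 2)
    print(f"degree {degree}: training sum of squared errors = {sse:.4f}")

# The degree-7 curve threads through all 8 points (training error near zero),
# which is exactly the curvy, over-fitted behavior to watch out for.
```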

