A Detailed Guide to Using Loss Functions in PyTorch

2023-08-03 15:20:04

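1. Loss functions

A loss function, also called an objective function, is one of the two elements required to compile a neural network model. The other essential element is the optimizer.

A loss function computes the difference between the label values and the predicted values. Machine learning offers many loss functions to choose from; typical ones measure the distance (squared difference) or the absolute difference between the two.

The loss must be a scalar, because vectors cannot be ordered directly (vectors themselves are compared through scalars such as norms).

Loss functions generally fall into four kinds: squared loss, log loss, hinge loss / 0-1 loss, and absolute loss.

We first define two 2-D tensors and then compute the loss between them with different loss functions.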

import torch
import torch.nn as nn
import torch.nn.functional as F

# Variable has been a no-op wrapper since PyTorch 0.4, so plain tensors are used
sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0],
                       [2.0, 3.0]])

The value of sample is [[1, 1], [1, 1]].

The value of target is [[0, 1], [2, 3]].

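1. nn.L1Loss

L1Loss is simple to compute: it is the mean of the absolute errors between the predicted values and the true values.

criterion = nn.L1Loss()
loss = criterion(sample, target)
print(loss)

The result is 1.

The computation goes like this:

First sum the absolute differences: |0-1| + |1-1| + |2-1| + |3-1| = 4;

then take the mean: 4/4 = 1.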
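The two steps above can be reproduced by hand and compared with the built-in loss. This is a minimal sketch that redefines sample and target so it runs on its own; the names manual and builtin are illustrative.

```python
import torch
import torch.nn as nn

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# Manual L1 loss: mean of the element-wise absolute differences
manual = (sample - target).abs().mean()
builtin = nn.L1Loss()(sample, target)

print(manual.item(), builtin.item())
```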

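2. nn.SmoothL1Loss

SmoothL1Loss is also known as Huber loss (the two coincide with the default parameters). Where the absolute error is below 1 it is a squared loss, 0.5 * error^2; otherwise it is an L1 loss shifted by 0.5, |error| - 0.5.

criterion = nn.SmoothL1Loss()
loss = criterion(sample, target)
print(loss)

The result is 0.625.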
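To see where 0.625 comes from, the piecewise rule can be written out directly. This sketch assumes the default threshold of 1 and redefines the two tensors so it is self-contained; per_element and manual are illustrative names.

```python
import torch
import torch.nn as nn

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

diff = (sample - target).abs()
# 0.5 * d^2 where d < 1, otherwise d - 0.5 (default threshold of 1 assumed)
per_element = torch.where(diff < 1, 0.5 * diff ** 2, diff - 0.5)
manual = per_element.mean()
builtin = nn.SmoothL1Loss()(sample, target)

print(per_element)
print(manual.item(), builtin.item())
```

The absolute errors are 1, 0, 1, 2, giving per-element losses 0.5, 0, 0.5, 1.5, whose mean is 0.625.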

3. nn.MSELoss

Squared loss. It is the mean of the squared differences between the predicted values and the true values.

criterion = nn.MSELoss()
loss = criterion(sample, target)
print(loss)

The result is 1.5.
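The same check works for the squared loss. A small sketch, again with the tensors redefined locally and illustrative variable names:

```python
import torch
import torch.nn as nn

sample = torch.ones(2, 2)
target = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# Manual MSE: (1 + 0 + 1 + 4) / 4
manual = ((sample - target) ** 2).mean()
builtin = nn.MSELoss()(sample, target)

print(manual.item(), builtin.item())
```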

4. nn.CrossEntropyLoss

Cross-entropy loss.

It took me a while to understand this one.

Let's start with a few examples.

Note that when the target holds class indices, it must be a long (int64) tensor.

import numpy as np
import torch

# cross entropy loss
pred = np.array([[0.8, 2.0, 1.2]])
CELoss = torch.nn.CrossEntropyLoss()
for k in range(3):
    target = np.array([k])
    loss2 = CELoss(torch.from_numpy(pred), torch.from_numpy(target).long())
    print(loss2)

Output:

tensor(1.7599,dtype=torch.float64)
tensor(0.5599,dtype=torch.float64)
tensor(1.3599,dtype=torch.float64)

If pred is changed to np.array([[0.8, 2.0, 2.0]]), the output is:

tensor(2.0334,dtype=torch.float64)
tensor(0.8334,dtype=torch.float64)
tensor(0.8334,dtype=torch.float64)

The last two outputs are identical, because classes 1 and 2 now have the same score.

The formula explains what is going on:
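Assuming the formula meant here is the one given in the PyTorch documentation for nn.CrossEntropyLoss with a class-index target, where x is the vector of raw scores for one sample and class is the index of the true class, it reads:

```latex
\operatorname{loss}(x, \mathit{class})
  = -\log\!\left(\frac{\exp(x[\mathit{class}])}{\sum_{j}\exp(x[j])}\right)
  = -x[\mathit{class}] + \log\!\left(\sum_{j}\exp(x[j])\right)
```

For pred = [0.8, 2.0, 1.2] and class = 1 this gives -2.0 + log(e^0.8 + e^2.0 + e^1.2) ≈ 0.5599, in agreement with the second output above.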

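(The loss can be read as two terms: the negative score of the true class, plus the log of the summed exponentials of all scores. The second term grows when the wrong classes score highly, so the loss gets larger.)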

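An implementation in numpy looks like this:

import numpy as np

pred = np.array([[0.8, 2.0, 2.0]])
nClass = pred.shape[1]
target = np.array([0])

def softmax(x):
    # not defined in the original listing; a standard row-wise softmax is assumed
    e = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

def labelEncoder(y):
    # one-hot encode the class indices
    tmp = np.zeros(shape=(y.shape[0], nClass))
    for i in range(y.shape[0]):
        tmp[i][y[i]] = 1
    return tmp

def crossEntropy(pred, target):
    target = labelEncoder(target)
    pred = softmax(pred)
    H = -np.sum(target * np.log(pred))
    return H

H = crossEntropy(pred, target)
print(H)

Output: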

2.0334282107562287

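It matches!

Looking back at the formula, class is simply the index of the true class (keep this in mind when calling nn.CrossEntropyLoss). The formula fuses the softmax that produces p with the y*log(p) term into a single expression, which I did not notice at first.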
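The fused form can also be checked directly in PyTorch. This is a sketch under the assumption that the loss equals -x[class] + log(sum_j exp(x[j])) averaged over samples; picked, manual and builtin are illustrative names.

```python
import torch
import torch.nn.functional as F

pred = torch.tensor([[0.8, 2.0, 2.0]], dtype=torch.float64)
target = torch.tensor([0])  # class index, int64 by default

# Score of the true class for each sample: x[class]
picked = pred.gather(1, target.unsqueeze(1)).squeeze(1)
# -x[class] + log(sum_j exp(x[j])), then averaged over the batch
manual = (-picked + torch.logsumexp(pred, dim=1)).mean()
builtin = F.cross_entropy(pred, target)

print(manual.item(), builtin.item())
```

Both values come out at about 2.0334, the same number the numpy version prints.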

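5. nn.BCELoss

Binary cross-entropy was already touched on in the cross-entropy section above: it treats {y, 1-y} as a two-outcome distribution, so the computed loss is larger than the plain cross-entropy term (it carries more information, because it includes the loss for both the positive and the negative class).

The result is -13.8155, presumably from applying nn.BCELoss to the sample and target defined earlier. The value is negative only because that target contains entries outside [0, 1]; with predictions and targets inside [0, 1], BCE is never negative.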
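For comparison, here is a small sketch of nn.BCELoss on valid inputs. The probabilities and labels are made-up illustrative values, not data from the article.

```python
import math
import torch
import torch.nn as nn

prob = torch.tensor([0.8, 0.3])    # predicted probabilities, already in (0, 1)
label = torch.tensor([1.0, 0.0])   # binary targets

loss = nn.BCELoss()(prob, label)

# -(y*log(p) + (1-y)*log(1-p)), averaged over the two elements
expected = -(math.log(0.8) + math.log(1 - 0.3)) / 2

print(loss.item(), expected)
```

Each element contributes either the positive-class term or the negative-class term, and the result is non-negative.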

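6. nn.NLLLoss

Negative log-likelihood loss (Negative Log Likelihood).

Placing a LogSoftmax layer in front of it makes it equivalent to the cross-entropy loss. Note that x[label] here differs from the one in the cross-entropy loss above: here it is a value that has already gone through the log operation. This loss is also commonly used in image recognition models.

The input to NLLLoss is a vector of log-probabilities and a target label (which does not need to be one-hot encoded). It does not compute the log-probabilities for us, so it suits networks whose last layer is log_softmax. nn.CrossEntropyLoss() is the same as NLLLoss(), except that it applies the log-softmax for us.

nn.NLLLoss and nn.CrossEntropyLoss are very similar in function. Both are typically used in multi-class classification models; in practice we tend to use NLLLoss more often.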
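The equivalence described above is easy to confirm. A minimal sketch with made-up scores and labels:

```python
import torch
import torch.nn as nn

logits = torch.tensor([[0.8, 2.0, 1.2],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([1, 2])  # class indices, not one-hot

# NLLLoss expects log-probabilities, so LogSoftmax goes first
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, labels)

# CrossEntropyLoss takes the raw scores and does the log-softmax itself
ce = nn.CrossEntropyLoss()(logits, labels)

print(nll.item(), ce.item())
```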

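7. nn.NLLLoss2d

Similar to the above but with extra dimensions; typically used on images.

input, (N, C, H, W)

target, (N, H, W)

For example, when a fully convolutional network is used for classification, every pixel of the output image gets a predicted class label.

criterion = nn.NLLLoss2d()
loss = criterion(sample, target)
print(loss)

Result: an error, so it cannot be used directly like this. The 2x2 sample and target do not have the (N, C, H, W) and (N, H, W) shapes listed above, and the target is not a long tensor. Later PyTorch releases also deprecate nn.NLLLoss2d in favor of nn.NLLLoss, which accepts the same shapes.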
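A version that does run needs inputs of the right shape and type. The sketch below uses nn.NLLLoss, on the assumption that it accepts (N, C, H, W) inputs as a replacement for NLLLoss2d; the sizes and random data are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 3, 4, 4

scores = torch.randn(N, C, H, W)           # raw per-pixel class scores
log_probs = F.log_softmax(scores, dim=1)   # log-probabilities over the class dimension
labels = torch.randint(0, C, (N, H, W))    # one int64 class index per pixel

loss = nn.NLLLoss()(log_probs, labels)
print(loss.item())
```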

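8. BCEWithLogitsLoss and MultiLabelSoftMarginLoss

BCEWithLogitsLoss:

Note the order of x and y here: x is the raw prediction output (before any sigmoid) and y is the ground-truth label, usually 0 or 1, although it may also be a probability such as [0.1, 0.9].

Compared with BCELoss, it applies the sigmoid for you, so the network's output layer does not need an activation function.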
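A quick way to see the relationship to BCELoss is to apply the sigmoid by hand. This is a sketch on random illustrative data:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3)                     # raw scores, no sigmoid applied
y = torch.randint(0, 2, (4, 3)).float()   # 0/1 labels

with_logits = nn.BCEWithLogitsLoss()(x, y)
# Same loss computed in two steps: sigmoid first, then plain BCE
two_step = nn.BCELoss()(torch.sigmoid(x), y)

print(with_logits.item(), two_step.item())
```

The fused version is generally preferred for numerical stability when the scores are large in magnitude.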

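MultiLabelSoftMarginLoss:

The latter is a special case of the former with all weights equal to 1 (with the per-element losses averaged over the classes).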
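That relationship can be checked on unweighted losses. A sketch with random illustrative data, assuming reduction='none' so the per-sample values stay visible:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randint(0, 2, (10, 3)).float()

bce = nn.BCEWithLogitsLoss(reduction='none')(x, y)            # shape (10, 3)
multi = nn.MultiLabelSoftMarginLoss(reduction='none')(x, y)   # shape (10,)

# Averaging the element-wise BCE over the class dimension gives the multi-label loss
print(torch.allclose(bce.mean(dim=1), multi, atol=1e-6))
```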

import torch
from torch import nn

x = torch.randn(10, 3)
y = torch.FloatTensor(10, 3).random_(2)

# double the loss for class 1
class_weight = torch.FloatTensor([1.0, 2.0, 1.0])
# double the loss for the last sample
element_weight = torch.FloatTensor([1.0] * 9 + [2.0]).view(-1, 1)
element_weight = element_weight.repeat(1, 3)

# reduction='none' replaces the deprecated reduce=False
bce_criterion = nn.BCEWithLogitsLoss(weight=None, reduction='none')
multi_criterion = nn.MultiLabelSoftMarginLoss(weight=None, reduction='none')

bce_criterion_class = nn.BCEWithLogitsLoss(weight=class_weight, reduction='none')
multi_criterion_class = nn.MultiLabelSoftMarginLoss(weight=class_weight,
                                                    reduction='none')

bce_criterion_element = nn.BCEWithLogitsLoss(weight=element_weight, reduction='none')
multi_criterion_element = nn.MultiLabelSoftMarginLoss(weight=element_weight,
                                                      reduction='none')

bce_loss = bce_criterion(x, y)
multi_loss = multi_criterion(x, y)

bce_loss_class = bce_criterion_class(x, y)
multi_loss_class = multi_criterion_class(x, y)

print(bce_loss_class)
print(multi_loss_class)

print('bce_loss', bce_loss)
print('bce loss mean', torch.mean(bce_loss, dim=1))
print('multi_loss', multi_loss)

9. Comparing BCEWithLogitsLoss with TensorFlow's sigmoid_cross_entropy_with_logits and softmax_cross_entropy_with_logits

For PyTorch's BCEWithLogitsLoss, see section 8 above.

import torch
from torch import nn

# reduction='none' replaces the deprecated reduce=False
bce_criterion = nn.BCEWithLogitsLoss(weight=None, reduction='none')
y = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]], dtype=torch.float64)
logits = torch.tensor([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]], dtype=torch.float64)
print(bce_criterion(logits, y))

Result:

tensor([[6.1442e-06,3.0486e+00,2.1269e+00],
[3.0486e+00,4.5399e-05,1.3133e+00],
[1.3133e+00,2.1269e+00,6.7153e-03],
[1.8150e-02,1.5023e-03,1.4633e+00],
[3.0486e+00,2.4757e-03,1.3133e+00]],dtype=torch.float64)

With TensorFlow's sigmoid_cross_entropy_with_logits:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]])
logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]]).astype(np.float32)

sess = tf.Session()
y = np.array(y).astype(np.float32)  # labels must be float and match the dtype of logits
E2 = sess.run(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
print(E2)

Result:

[[6.1441933e-06 3.0485873e+00 2.1269281e+00]
 [3.0485873e+00 4.5398901e-05 1.3132617e+00]
 [1.3132617e+00 2.1269281e+00 6.7153485e-03]
 [1.8149929e-02 1.5023102e-03 1.4632825e+00]
 [3.0485873e+00 2.4756852e-03 1.3132617e+00]]

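The results show that the two are equivalent.

In fact, both loss functions first apply a sigmoid to the predictions and then compute the cross-entropy.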
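The "sigmoid, then cross-entropy" computation can be written as one numerically stable expression, max(x, 0) - x*y + log(1 + exp(-|x|)), which is the form given in the TensorFlow documentation for sigmoid_cross_entropy_with_logits. The sketch below evaluates it in PyTorch on the same data; manual and builtin are illustrative names.

```python
import torch
import torch.nn as nn

y = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]],
                 dtype=torch.float64)
logits = torch.tensor([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]],
                      dtype=torch.float64)

# max(x, 0) - x*y + log(1 + exp(-|x|)), element-wise
manual = torch.clamp(logits, min=0) - logits * y + torch.log1p(torch.exp(-logits.abs()))
builtin = nn.BCEWithLogitsLoss(reduction='none')(logits, y)

print(manual)
print(torch.allclose(manual, builtin))
```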

Keras binary_crossentropy also calls TensorFlow's sigmoid_cross_entropy_with_logits.
See the Keras binary_crossentropy source.


from keras import backend as K

def loss_fn(y_true, y_pred, e=0.1):
    bce_loss = K.binary_crossentropy(y_true, y_pred)
    return K.mean(bce_loss, axis=-1)

y = K.variable([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]])
logits = K.variable([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]])
# Note: logits is passed as y_true and y as y_pred here, and from_logits
# defaults to False, which would explain the negative values printed below.
res = loss_fn(logits, y)
print(K.get_value(res))

from keras.losses import binary_crossentropy
print(K.get_value(binary_crossentropy(logits, y)))

Result:

[-31.59192 -26.336359 -5.1384177 -38.72286 -5.0798492]
[-31.59192 -26.336359 -5.1384177 -38.72286 -5.0798492]

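Similarly, with softmax_cross_entropy_with_logits:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

y = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]])
logits = np.array([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]]).astype(np.float32)

sess = tf.Session()
y = np.array(y).astype(np.float32)  # labels must be float and match the dtype of logits
E2 = sess.run(tf.nn.softmax_cross_entropy_with_logits(labels=y,
                                                      logits=logits))
print(E2)

Result:

[1.6878611e-04 1.0346780e-03 6.5883912e-02 2.6669841e+00 5.4985214e-02]

Note that the shape has changed: there is now one value per sample (N values in total).

Even if the sigmoid_cross_entropy_with_logits result above is reduced to the same shape, it gives [1.7251741 1.4539648 1.1489683 0.49431157 1.4547749], so the two still differ.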
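Both sets of numbers can be reproduced in PyTorch. This is a sketch that assumes the softmax variant is -sum(y * log_softmax(x)) per row and that the sigmoid values above were reduced with a row mean; the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

y = torch.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 0]],
                 dtype=torch.float64)
logits = torch.tensor([[12, 3, 2], [3, 10, 1], [1, 2, 5], [4, 6.5, 1.2], [3, 6, 1]],
                      dtype=torch.float64)

# Softmax cross-entropy: one value per row, classes compete with each other
softmax_ce = -(y * F.log_softmax(logits, dim=1)).sum(dim=1)

# Sigmoid cross-entropy: one value per element, reduced here with a row mean
sigmoid_ce = F.binary_cross_entropy_with_logits(logits, y, reduction='none').mean(dim=1)

print(softmax_ce)
print(sigmoid_ce)
```

The fourth row has two positive labels, which is where the two losses diverge most.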

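As for choosing between softmax_cross_entropy_with_logits and sigmoid_cross_entropy_with_logits: softmax treats the classes as mutually exclusive, while sigmoid scores each class independently. In my experience softmax gives better accuracy and better numerical stability, although the outcome also depends on the hyperparameters.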

2. Other, less commonly used losses

Function	Purpose
AdaptiveLogSoftmaxWithLoss	For imbalanced classes
