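Hand-Written NMS & Soft-NMS

1. NMS (Non-Maximum Suppression)

Introduction

Non-maximum suppression is widely used in object detection. Its main purpose is to eliminate redundant boxes and keep the best detection location for each object.

The idea is to sort all boxes by confidence and pick the box A with the highest confidence as the reference. Given a threshold, any other box B whose overlap with A (measured by IoU) exceeds the threshold is discarded. The highest-confidence box among those remaining is then picked as the next reference, and the process repeats until no boxes are left.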
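The "overlap" used above is the intersection over union (IoU) of two boxes. As a minimal sketch, assuming boxes are given as [x1, y1, x2, y2] with inclusive pixel coordinates (hence the +1, matching the implementation in this post), it could be computed for a single pair like this. The helper name `iou` is only illustrative.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as [x1, y1, x2, y2] (inclusive pixel coordinates)."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection
    iw = max(0.0, ix2 - ix1 + 1)
    ih = max(0.0, iy2 - iy1 + 1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / (area_a + area_b - inter)


a = [257, 118, 380, 250]
b = [255, 118, 360, 235]
print(round(iou(a, b), 4))
```

These two boxes are the two 0.7-scored rows of the example data used later in this post. Their IoU is about 0.73, so with a threshold of 0.7 NMS keeps only one of them.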

Implementation

Reference: the Faster R-CNN source code, lib/nms/py_cpu_nms.py

import numpy as np

def nms(dets, thresh):
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    score = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    keep = []
    order = score.argsort()[::-1]  # indices sorted by score, highest first
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Corners of the intersection between reference box i and every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        # Clamp width and height at 0 so non-overlapping boxes give zero intersection
        inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
        over = inter / (areas[i] + areas[order[1:]] - inter)  # IoU with box i
        index = np.where(over <= thresh)[0]  # np.where returns a tuple, hence [0]
        order = order[index + 1]  # index is relative to order[1:], so add 1 to index into order
    return keep


if __name__ == '__main__':
    dets = np.array([
        [204, 102, 358, 250, 0.5],
        [257, 118, 380, 250, 0.7],
        [280, 135, 400, 250, 0.6],
        [255, 118, 360, 235, 0.7]])
    thresh = 0.7
    res = nms(dets, thresh)
    print(res)

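2. Soft-NMS

Introduction

NMS has a serious weakness: because it is greedy, it outright removes every box whose IoU with the reference box exceeds the threshold. When two objects overlap heavily, their detection boxes also have an IoU above the threshold, so the lower-confidence box is deleted and one of the objects is missed.

To address this, Soft-NMS does not delete the boxes whose IoU exceeds the threshold; it lowers their confidence scores instead.

The formulas below compare the two: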

NMS:

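$$ s_i = \begin{cases} s_i, & \mathrm{iou}(M, b_i) < N_t \\ 0, & \mathrm{iou}(M, b_i) \geq N_t \end{cases} $$

Soft-NMS:

(1) Linear weighting:

$$ s_i = \begin{cases} s_i, & \mathrm{iou}(M, b_i) < N_t \\ s_i \left(1 - \mathrm{iou}(M, b_i)\right), & \mathrm{iou}(M, b_i) \geq N_t \end{cases} $$

(2) Gaussian weighting:

$$ s_i = s_i \exp\left(-\frac{\mathrm{iou}(M, b_i)^2}{\sigma}\right), \quad \forall b_i \notin \mathbb{D} $$

Here $M$ is the current highest-scoring box, $b_i$ is a remaining box with score $s_i$, $N_t$ is the IoU threshold, and $\mathbb{D}$ is the set of boxes already selected as final detections.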
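The three update rules can be put side by side in code. The following is a minimal sketch that follows the formulas literally (using >= for the threshold test). The function name `rescore` and the string values of `method` are hypothetical and do not come from the reference implementation.

```python
import math


def rescore(score, iou, Nt=0.3, sigma=0.5, method="linear"):
    """Return the updated score of a box whose IoU with the current top box M is `iou`."""
    if method == "hard":      # classic NMS: the box is dropped outright
        return 0.0 if iou >= Nt else score
    if method == "linear":    # Soft-NMS, linear decay above the threshold
        return score * (1.0 - iou) if iou >= Nt else score
    if method == "gaussian":  # Soft-NMS, continuous Gaussian decay (no threshold)
        return score * math.exp(-(iou * iou) / sigma)
    raise ValueError(f"unknown method: {method}")


for m in ("hard", "linear", "gaussian"):
    print(m, rescore(0.7, 0.5, method=m))
```

For a box with score 0.7 and an IoU of about 0.73 with the top box, classic NMS sets the score to 0, the linear rule lowers it to about 0.19, and the Gaussian rule with sigma = 0.5 lowers it to about 0.24. The box survives with a reduced score, and a final score threshold decides whether it is kept.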

Implementation

The Python code below follows the .pyx file in the Soft-NMS source code.

import numpy as np

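def soft_nms(boxes, sigma, Nt, threshold, method=1):  # Nt: IoU threshold; threshold: minimum final score for a box to be kept
    # Note: boxes is modified in place (rows are reordered and scores are decayed),
    # so the returned indices refer to the modified array.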
    N = boxes.shape[0]
    pos, maxpos, maxscore = 0, 0, 0

    for i in range(N):
        # N shrinks as boxes are discarded below; stop once i passes the last live box
        if i >= N:
            break
        maxscore = boxes[i, 4]
        maxpos = i

        tx1 = boxes[i, 0]
        ty1 = boxes[i, 1]
        tx2 = boxes[i, 2]
        ty2 = boxes[i, 3]
        ts = boxes[i, 4]

        pos = i + 1

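        # Find the highest-scoring box among positions i..N-1, then swap it into position i
        # 1. Locate the position of the highest score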
        while pos < N:
            if maxscore < boxes[pos, 4]:
                maxscore = boxes[pos, 4]
                maxpos = pos
            pos += 1

        # 2. Move the highest-scoring box to position i
        boxes[i, 0] = boxes[maxpos, 0]
        boxes[i, 1] = boxes[maxpos, 1]
        boxes[i, 2] = boxes[maxpos, 2]
        boxes[i, 3] = boxes[maxpos, 3]
        boxes[i, 4] = boxes[maxpos, 4]

        boxes[maxpos, 0] = tx1
        boxes[maxpos, 1] = ty1
        boxes[maxpos, 2] = tx2
        boxes[maxpos, 3] = ty2
        boxes[maxpos, 4] = ts

        # tx1, ty1, tx2, ty2, ts now hold the coordinates and confidence score of the highest-scoring box
        tx1 = boxes[i, 0]
        ty1 = boxes[i, 1]
        tx2 = boxes[i, 2]
        ty2 = boxes[i, 3]
        ts = boxes[i, 4]

        # 3. Reset pos, since step 1 left it at N
        pos = i + 1

        # Suppression step: rescore every box after position i against the box at i
        while pos < N:
            x1 = boxes[pos, 0]
            y1 = boxes[pos, 1]
            x2 = boxes[pos, 2]
            y2 = boxes[pos, 3]
            s = boxes[pos, 4]

            area = (x2 - x1 + 1) * (y2 - y1 + 1)
            iw = min(tx2, x2) - max(tx1, x1) + 1
            if iw > 0:
                ih = min(ty2, y2) - max(ty1, y1) + 1
                if ih > 0:
                    ovr = iw * ih / float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)

                    if method == 1:  # Soft-NMS, linear weighting
                        if ovr > Nt:
                            weight = 1 - ovr
                        else:
                            weight = 1
                    elif method == 2:  # Soft-NMS, Gaussian weighting
                        weight = np.exp(-(ovr * ovr) / sigma)
                    else:  # classic NMS
                        if ovr > Nt:
                            weight = 0
                        else:
                            weight = 1

                    boxes[pos, 4] = weight * boxes[pos, 4]

                    # If the score falls below threshold, overwrite this box with the last one and shrink N to drop it
                    if boxes[pos, 4] < threshold:
                        boxes[pos, 0] = boxes[N - 1, 0]
                        boxes[pos, 1] = boxes[N - 1, 1]
                        boxes[pos, 2] = boxes[N - 1, 2]
                        boxes[pos, 3] = boxes[N - 1, 3]
                        boxes[pos, 4] = boxes[N - 1, 4]
                        N -= 1
                        pos -= 1
            pos += 1

    keep = [i for i in range(N)]
    return keep


if __name__ == '__main__':
    boxes = np.array([
        [204, 102, 358, 250, 0.5],
        [257, 118, 380, 250, 0.7],
        [280, 135, 400, 250, 0.6],
        [255, 118, 360, 235, 0.7]])
    thresh = 0.3
    threshold = 0.25
    Nt = 0.3
    sigma = 0.5
    res = soft_nms(boxes, sigma, Nt, threshold)
    print(boxes[res])
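The function above reorders `boxes` in place, so the indices it returns refer to the modified array and not to the original input. As an alternative, here is a minimal vectorized sketch of the Gaussian variant that leaves the input untouched and returns indices into the original array. The name `soft_nms_gaussian` and the parameter `score_thresh` are hypothetical, and this is not the reference implementation.

```python
import numpy as np


def soft_nms_gaussian(boxes, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch. boxes: (N, 5) array of [x1, y1, x2, y2, score].

    Returns (keep, scores): indices into the original array in selection order,
    and the decayed scores of those boxes.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = boxes[:, 4].copy()  # work on a copy so the input is not modified
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    remaining = np.arange(len(boxes))
    keep = []
    while remaining.size > 0:
        # Pick the remaining box with the highest current score
        best = remaining[np.argmax(scores[remaining])]
        keep.append(int(best))
        remaining = remaining[remaining != best]
        if remaining.size == 0:
            break
        # IoU between the picked box and every remaining box
        xx1 = np.maximum(x1[best], x1[remaining])
        yy1 = np.maximum(y1[best], y1[remaining])
        xx2 = np.minimum(x2[best], x2[remaining])
        yy2 = np.minimum(y2[best], y2[remaining])
        inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
        iou = inter / (areas[best] + areas[remaining] - inter)
        # Decay the scores, then discard boxes that fell below the score threshold
        scores[remaining] *= np.exp(-(iou ** 2) / sigma)
        remaining = remaining[scores[remaining] >= score_thresh]
    return keep, scores[keep]


boxes = np.array([
    [204, 102, 358, 250, 0.5],
    [257, 118, 380, 250, 0.7],
    [280, 135, 400, 250, 0.6],
    [255, 118, 360, 235, 0.7]])
keep, new_scores = soft_nms_gaussian(boxes, sigma=0.5, score_thresh=0.25)
print(keep, new_scores)
```

Returning original indices makes it easy to look up other per-box data, such as class labels, that is stored alongside the detections.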