ByteTrack source code walkthrough (Python): parameters, line-by-line code analysis, object tracking

Article directory

  • 1. Get the index
  • 2. High-scoring boxes participate in matching, and there may be boxes left that cannot be matched
  • 3. Low-scoring boxes participate in matching
  • 4. Handling unconfirmed matches
  • 5. Create a new [STrack object]
  • 6. Throw away [STrack objects] that have not been matched to the frame for too long
  • 7. Output tracking box

1. Get the index

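self.args.track_thresh is the detection-score threshold that separates high-scoring boxes from low-scoring ones. It is compared against the detector's confidence only; the product of IoU and confidence is used later, as the matching cost in step 2.
remain_inds is a boolean mask that selects the high-scoring boxes (score >= self.args.track_thresh).
inds_second is a boolean mask that selects the low-scoring boxes (score strictly between 0.1 and self.args.track_thresh). Boxes that score 0.1 or less are discarded.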

        remain_inds = scores >= self.args.track_thresh

        inds_low = scores > 0.1
        inds_high = scores < self.args.track_thresh

        inds_second = np.logical_and(inds_low, inds_high)
        dets_second = bboxes[inds_second] # Low-scoring boxes
        dets = bboxes[remain_inds] # High-scoring boxes
        scores_keep = scores[remain_inds] # Scores of the high-scoring boxes
        scores_second = scores[inds_second] # Scores of the low-scoring boxes
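
To make the band boundaries concrete, here is a minimal, dependency-free sketch of the same split. The helper name split_by_score and the sample scores are made up for illustration, and plain lists stand in for the NumPy arrays the tracker uses.

```python
def split_by_score(scores, track_thresh, low_thresh=0.1):
    """Return (high_idx, low_idx): indices of high- and low-scoring boxes."""
    # High band: score >= track_thresh (the boundary value counts as high).
    high_idx = [i for i, s in enumerate(scores) if s >= track_thresh]
    # Low band: low_thresh < score < track_thresh (both ends exclusive).
    low_idx = [i for i, s in enumerate(scores) if low_thresh < s < track_thresh]
    return high_idx, low_idx


# Hypothetical detector confidences for five boxes.
scores = [0.92, 0.5, 0.35, 0.1, 0.05]
high_idx, low_idx = split_by_score(scores, track_thresh=0.5)
print(high_idx)  # boxes used in the first association
print(low_idx)   # boxes used in the second association
```

A score exactly equal to track_thresh lands in the high band, while a score of exactly 0.1 (or lower) falls into neither band and is dropped.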

2. High-scoring boxes participate in matching, and there may be boxes left that cannot be matched

detections is the list of STrack objects built from the high-scoring boxes.

        if len(dets) > 0:
            '''Detections'''
            detections = [STrack(STrack.tlbr_to_tlwh(tlbr), s) for
                          (tlbr, s) in zip(dets, scores_keep)]
        else:
            detections = []

STracks that have not been confirmed yet (track.is_activated is False) go into unconfirmed.
STracks that were already being tracked (track.is_activated is True) go into tracked_stracks.

        ''' Add newly detected tracklets to tracked_stracks'''
        unconfirmed = []
        tracked_stracks = [] # type: list[STrack]
        for track in self.tracked_stracks:
            if not track.is_activated:
                unconfirmed.append(track)
            else:
                tracked_stracks.append(track)

Both the tracked tracks and the lost tracks take part in matching against the high-scoring detections.
The key calculation in this step is dists = matching.fuse_score(dists, detections): the IoU similarity is multiplied by the box's confidence score, and the result is turned back into a cost.

The IoU with the detection boxes is computed from the boxes predicted by the Kalman filter, not from the tracks' last observed boxes.

        ''' Step 2: First association, with high score detection boxes'''
        strack_pool = joint_stracks(tracked_stracks, self.lost_stracks) # Tracked tracks plus lost tracks
        # Predict the current location with KF
        STrack.multi_predict(strack_pool) # Update mean and covariance
        dists = matching.iou_distance(strack_pool, detections) # IoU distance (1 - IoU) to every detection box; smaller is a better match, 1 means no overlap
        if not self.args.mot20:
            dists = matching.fuse_score(dists, detections) # Fuse the IoU similarity with the box's confidence score
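
The arithmetic behind the fused cost can be sketched for a single track/detection pair. This is a sketch under assumptions, not the library code: the helper names iou and fused_cost are hypothetical, boxes are (x1, y1, x2, y2) tuples with continuous coordinates, and the real implementation works on whole cost matrices and may use a slightly different pixel convention for the box area.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2), continuous coordinates."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fused_cost(track_box, det_box, det_score):
    """Cost after fusing: 1 - (IoU similarity * detection confidence)."""
    iou_dist = 1.0 - iou(track_box, det_box)   # plain IoU distance
    iou_sim = 1.0 - iou_dist                   # back to a similarity
    return 1.0 - iou_sim * det_score           # fused cost


predicted = (0, 0, 10, 10)    # hypothetical Kalman-predicted track box
detection = (5, 0, 15, 10)    # hypothetical detection box
plain = 1.0 - iou(predicted, detection)
fused = fused_cost(predicted, detection, det_score=0.9)
print(round(plain, 3), round(fused, 3))
```

Because the confidence is at most 1, fusing can only raise the cost: a less confident box needs a better overlap to stay under the matching threshold.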

matching.linear_assignment performs the matching. Whether an existing [STrack object] can be matched to a current detection box depends on the threshold self.args.match_thresh: pairs whose cost exceeds it are rejected. matches holds the matched index pairs; for example, [(1, 2)] means that the track at index 1 of strack_pool matched the detection at index 2 of detections.
u_track holds the indices of the [STrack objects] that were not matched, such as [4].
u_detection holds the indices of the detection boxes in detections that were not matched, such as [0].

        # matches: matched (track, detection) index pairs; u_track: unmatched tracks; u_detection: unmatched detection boxes
        matches, u_track, u_detection = matching.linear_assignment(dists, thresh=self.args.match_thresh)
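
To show the shape of the three return values, here is a simplified stand-in. It is deliberately not the solver the tracker uses: it matches greedily, lowest cost first, whereas matching.linear_assignment solves the assignment problem globally. The function name greedy_assignment, its n_cols argument, and the cost values are assumptions made for this illustration.

```python
def greedy_assignment(cost, n_cols, thresh):
    """Greedy, thresholded matching on a cost matrix given as a list of rows.

    Returns (matches, unmatched_rows, unmatched_cols), the same shape of
    result that the walkthrough calls matches, u_track, u_detection.
    """
    n_rows = len(cost)
    # Keep only pairs whose cost does not exceed the threshold, cheapest first.
    candidates = sorted(
        (cost[r][c], r, c)
        for r in range(n_rows)
        for c in range(n_cols)
        if cost[r][c] <= thresh
    )
    used_rows, used_cols, matches = set(), set(), []
    for _, r, c in candidates:
        if r not in used_rows and c not in used_cols:
            matches.append((r, c))
            used_rows.add(r)
            used_cols.add(c)
    unmatched_rows = [r for r in range(n_rows) if r not in used_rows]
    unmatched_cols = [c for c in range(n_cols) if c not in used_cols]
    return sorted(matches), unmatched_rows, unmatched_cols


# Rows are tracks, columns are detections (hypothetical costs).
cost = [
    [0.20, 0.90, 1.00],
    [0.85, 0.30, 1.00],
    [0.95, 0.99, 1.00],
]
matches, u_track, u_detection = greedy_assignment(cost, n_cols=3, thresh=0.8)
print(matches, u_track, u_detection)
```

Track 2 and detection 2 stay unmatched because every pair involving them costs more than the threshold.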

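For each matched pair, if track.state == TrackState.Tracked, the [STrack object] is updated with the detection (the Kalman filter's update step) and added to activated_starcks.

Otherwise the track was lost, so track.re_activate(det, self.frame_id, new_id=False) is called (this also runs the Kalman update step and keeps the old ID), and the track is added to refind_stracks.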

        for itracked, idet in matches:
            track = strack_pool[itracked]
            det = detections[idet]
            if track.state == TrackState.Tracked:
                track.update(detections[idet], self.frame_id)
                activated_starcks.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_stracks.append(track)

3. Low-scoring boxes participate in matching

r_tracked_stracks contains the [STrack objects] that are in the Tracked state but were left unmatched by the high-scoring boxes. Lost tracks that stayed unmatched do not take part in this second association.

        ''' Step 3: Second association, with low score detection boxes'''
        # association the untrack to the low score detections
        if len(dets_second) > 0:
            '''Detections'''
            detections_second = [STrack(STrack.tlbr_to_tlwh(tlbr), s) for
                                 (tlbr, s) in zip(dets_second, scores_second)]
        else:
            detections_second = []
        r_tracked_stracks = [strack_pool[i] for i in u_track if strack_pool[i].state == TrackState.Tracked]
        dists = matching.iou_distance(r_tracked_stracks, detections_second)

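The second association uses a fixed, lower cost threshold: matching.linear_assignment(dists, thresh=0.5). Because dists here is the plain IoU distance (1 - IoU, with no fuse_score step), a lower threshold is stricter: a pair needs an IoU of roughly 0.5 or more to be accepted.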

        matches, u_track, u_detection_second = matching.linear_assignment(dists, thresh=0.5)
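
A quick way to read these thresholds is to translate them back into IoU. The sketch below treats the threshold as a simple cut-off on the cost 1 - IoU; the helper name is hypothetical, the value 0.8 for match_thresh is only an illustrative assumption, and the real solver may treat exact boundary values differently.

```python
def passes_threshold(iou_value, thresh):
    """True if a pair with this IoU has a cost (1 - IoU) within the threshold."""
    return (1.0 - iou_value) <= thresh


# A fairly loose overlap: accepted by a threshold of 0.8, rejected by 0.5.
loose_at_08 = passes_threshold(0.3, thresh=0.8)
loose_at_05 = passes_threshold(0.3, thresh=0.5)
# A solid overlap: accepted by both.
solid_at_05 = passes_threshold(0.6, thresh=0.5)
print(loose_at_08, loose_at_05, solid_at_05)
```

So low-scoring boxes are only allowed to rescue a track when they overlap its predicted box well.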

Tracks that are matched here are updated and remain active.

        for itracked, idet in matches:
            track = r_tracked_stracks[itracked]
            det = detections_second[idet]
            if track.state == TrackState.Tracked:
                track.update(det, self.frame_id)
                activated_starcks.append(track)
            else:
                track.re_activate(det, self.frame_id, new_id=False)
                refind_stracks.append(track)


[STrack objects] in r_tracked_stracks that match neither a high-scoring nor a low-scoring box are marked as lost and appended to lost_stracks.
At this point every track in strack_pool has a destination: activated_starcks, refind_stracks, or the lost_stracks here (tracks that were already lost and stayed unmatched simply remain in self.lost_stracks).

        for it in u_track:
            track = r_tracked_stracks[it]
            if not track.state == TrackState.Lost:
                track.mark_lost()
                lost_stracks.append(track)

4. Handling unconfirmed matches

Why are high-scoring and low-scoring boxes matched in two separate passes? High-scoring boxes are the more reliable ones, so matching them first makes the association more accurate; low-scoring boxes are only used afterwards to recover tracks that are still unmatched.

unconfirmed holds the tracks from self.tracked_stracks whose is_activated flag is still False (see the loop in section 2); these are usually tracks that have been seen in only one frame.

u_detection holds the indices of the high-scoring boxes that were left unmatched by the first association.

Here unconfirmed is matched against those leftover boxes. If a match is found, the box continues the existing track and does not need a new ID.

thresh=0.7 is used here. These thresh values are important hyperparameters. If your detection model produces low confidence scores, they need to be tuned as well, not just the initial threshold.

        '''Deal with unconfirmed tracks, usually tracks with only one beginning frame'''
        detections = [detections[i] for i in u_detection]
        dists = matching.iou_distance(unconfirmed, detections)
        if not self.args.mot20:
            dists = matching.fuse_score(dists, detections)
        matches, u_unconfirmed, u_detection = matching.linear_assignment(dists, thresh=0.7)
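
One detail that is easy to miss: detections is rebuilt from u_detection before this assignment, so the indices it returns refer to the reduced list, not to the original one. A tiny illustration with placeholder strings instead of STrack objects (the values, including the assumed assignment result, are made up):

```python
# Four high-scoring detections; the first association left 1 and 3 unmatched.
detections = ["det_a", "det_b", "det_c", "det_d"]
u_detection = [1, 3]

# Same re-indexing as in the tracker: keep only the leftovers.
detections = [detections[i] for i in u_detection]

# Suppose the assignment against the unconfirmed tracks now reports that
# index 1 is still unmatched. That index points into the reduced list.
u_detection = [1]
still_unmatched = [detections[i] for i in u_detection]
print(detections, still_unmatched)
```

Here index 1 means det_d, the second leftover, not det_b.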

Handle the matches: each matched unconfirmed track is updated and added to activated_starcks.

        for itracked, idet in matches:
            unconfirmed[itracked].update(detections[idet], self.frame_id)
            activated_starcks.append(unconfirmed[itracked])

For the tracks in u_unconfirmed, that is, unconfirmed tracks that again failed to match a high-scoring box, mark the [STrack object] as removed and add it to removed_stracks.

        for it in u_unconfirmed:
            track = unconfirmed[it]
            track.mark_removed()
            removed_stracks.append(track)

5. Create a new [STrack object]

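A high-scoring box that cannot be matched to any existing [STrack object] is given a new ID, becomes a [STrack object] itself, and is added to activated_starcks, provided its score is at least self.det_thresh. Note that u_detection now indexes the reduced detections list built in step 4.

        """ Step 4: Init new tracks"""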
        for inew in u_detection:
            track = detections[inew]
            if track.score < self.det_thresh:
                continue
            track.activate(self.kalman_filter, self.frame_id)
            activated_starcks.append(track)
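
The score check means that not every leftover high-scoring box starts a track. A minimal sketch of that filter, using plain scores instead of STrack objects; the helper name and the value 0.6 for det_thresh are assumptions chosen for illustration.

```python
def select_new_track_indices(scores, unmatched_idx, det_thresh):
    """Indices of unmatched boxes confident enough to start a new track."""
    return [i for i in unmatched_idx if scores[i] >= det_thresh]


scores = [0.95, 0.55, 0.70]   # hypothetical scores of the leftover boxes
unmatched_idx = [0, 1]        # box 2 was matched to an unconfirmed track
new_idx = select_new_track_indices(scores, unmatched_idx, det_thresh=0.6)
print(new_idx)
```

Box 1 cleared the high-score band earlier but is still below det_thresh, so it is silently skipped.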

6. Throw away [STrack objects] that have not been matched to the frame for too long

If a [STrack object] from the previous frame cannot be matched to any box, it is moved to lost_stracks.
Here the [STrack objects] in self.lost_stracks are traversed. If a track has gone unmatched for more than self.max_time_lost frames (the buffer size), it is marked as removed and added to removed_stracks.

        """ Step 5: Update state"""
        for track in self.lost_stracks:
            if self.frame_id - track.end_frame > self.max_time_lost:
                track.mark_removed()
                removed_stracks.append(track)
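
The expiry condition is a strict comparison, which is worth checking with numbers. In this sketch the helper name is hypothetical, max_time_lost = 30 is an illustrative value, and end_frame is taken to be the last frame in which the track was matched (an assumption about STrack).

```python
def is_expired(frame_id, end_frame, max_time_lost):
    """True once a lost track has been unmatched for more than max_time_lost frames."""
    return frame_id - end_frame > max_time_lost


max_time_lost = 30
last_matched = 100   # hypothetical frame of the last successful match

expired_at_130 = is_expired(130, last_matched, max_time_lost)
expired_at_131 = is_expired(131, last_matched, max_time_lost)
print(expired_at_130, expired_at_131)
```

With these numbers the track survives through frame 130 and is removed at frame 131.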

7. Output tracking box

self.tracked_stracks contains all the [STrack objects] from the previous frame. In this frame, some of them were classified as unconfirmed; an unconfirmed track that found no matching high-scoring box was marked as removed (see 4. Handling unconfirmed matches), and tracked tracks that matched nothing at all were marked as lost. The following statement therefore keeps only the [STrack objects] from the previous frame that are still in the Tracked state.

self.tracked_stracks = [t for t in self.tracked_stracks if t.state == TrackState.Tracked]

The following statement merges in activated_starcks, the list of [STrack objects] that were successfully tracked in this frame.

A track is added to activated_starcks in this frame in any of these cases:

1. A high-scoring box matched a track in strack_pool whose state is TrackState.Tracked.

2. A low-scoring box matched a track in r_tracked_stracks (state TrackState.Tracked), using the threshold of 0.5.

3. A high-scoring box matched a track in unconfirmed, using the threshold of 0.7.

4. A high-scoring box that matched nothing became a new [STrack object].

self.tracked_stracks = joint_stracks(self.tracked_stracks, activated_starcks)

The following statement merges in refind_stracks, the list of [STrack objects] that were found again in this frame.

A track is added to refind_stracks in this frame in these cases:

1. A high-scoring box matched a track in strack_pool whose state is not TrackState.Tracked (that is, a lost track), so the track is re-activated.

2. A low-scoring box matched a track whose state is not TrackState.Tracked. In practice this branch is not reached, because r_tracked_stracks only contains tracks in the Tracked state.

self.tracked_stracks = joint_stracks(self.tracked_stracks, refind_stracks)

Update self.lost_stracks: drop the tracks that are being tracked again, add the tracks lost in this frame, and drop the tracks that have been removed.

self.lost_stracks = sub_stracks(self.lost_stracks, self.tracked_stracks)
self.lost_stracks.extend(lost_stracks)
self.lost_stracks = sub_stracks(self.lost_stracks, self.removed_stracks)
self.removed_stracks.extend(removed_stracks)
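
The list bookkeeping above boils down to set-like operations on track IDs. The sketch below uses hypothetical stand-ins, joint_by_id and sub_by_id, and assumes that joint_stracks and sub_stracks compare tracks by their track_id; ToyTrack is a made-up placeholder for STrack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTrack:
    track_id: int


def joint_by_id(list_a, list_b):
    """Union of two track lists, keeping the first occurrence of each ID."""
    seen, merged = set(), []
    for track in list(list_a) + list(list_b):
        if track.track_id not in seen:
            seen.add(track.track_id)
            merged.append(track)
    return merged


def sub_by_id(list_a, list_b):
    """Tracks of list_a whose ID does not appear in list_b."""
    drop = {track.track_id for track in list_b}
    return [track for track in list_a if track.track_id not in drop]


tracked = [ToyTrack(1), ToyTrack(2)]
refound = [ToyTrack(3)]              # a lost track that was matched again
tracked = joint_by_id(tracked, refound)

lost = [ToyTrack(3), ToyTrack(4)]    # lost list from the previous frame
lost = sub_by_id(lost, tracked)      # track 3 is no longer lost

tracked_ids = [t.track_id for t in tracked]
lost_ids = [t.track_id for t in lost]
print(tracked_ids, lost_ids)
```

Track 3 moves from the lost list back to the tracked list, which is what the first sub_stracks call achieves.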

Remove duplicate tracks, that is, tracks in the tracked list and the lost list whose boxes overlap heavily.

self.tracked_stracks, self.lost_stracks = remove_duplicate_stracks(self.tracked_stracks, self.lost_stracks)
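
As a rough sketch of what such de-duplication can look like: when a tracked box and a lost box nearly coincide, keep the track that has existed longer. The distance cut-off of 0.15, the "older track wins" rule, and all names below reflect my reading of the idea and are assumptions, not a copy of remove_duplicate_stracks.

```python
from dataclasses import dataclass


@dataclass
class ToyTrack:
    track_id: int
    box: tuple        # (x1, y1, x2, y2)
    start_frame: int  # frame in which the track was created
    frame_id: int     # last frame in which the track was updated


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def remove_duplicates(tracked, lost, max_dist=0.15):
    """Drop the younger track of every near-identical tracked/lost pair."""
    drop_tracked, drop_lost = set(), set()
    for i, t in enumerate(tracked):
        for j, l in enumerate(lost):
            if 1.0 - iou(t.box, l.box) < max_dist:
                age_t = t.frame_id - t.start_frame
                age_l = l.frame_id - l.start_frame
                if age_t > age_l:
                    drop_lost.add(j)
                else:
                    drop_tracked.add(i)
    kept_tracked = [t for i, t in enumerate(tracked) if i not in drop_tracked]
    kept_lost = [l for j, l in enumerate(lost) if j not in drop_lost]
    return kept_tracked, kept_lost


tracked = [
    ToyTrack(1, (0, 0, 10, 10), start_frame=1, frame_id=50),
    ToyTrack(2, (100, 100, 120, 120), start_frame=40, frame_id=50),
]
lost = [ToyTrack(3, (0, 0, 10, 10.5), start_frame=45, frame_id=48)]
kept_tracked, kept_lost = remove_duplicates(tracked, lost)
print([t.track_id for t in kept_tracked], [t.track_id for t in kept_lost])
```

Track 3 almost coincides with track 1 but is much younger, so it is the one that gets dropped.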

Only the activated items in self.tracked_stracks are output. The pitfall here is that a track created from a high-scoring detection seen in only one frame is not yet in the activated state, so it is not output (unless that frame is the first frame of the whole sequence).

output_stracks = [track for track in self.tracked_stracks if track.is_activated]
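
The pitfall can be reproduced with a toy model of the is_activated flag. ToyTrack below is a made-up stand-in that only models that flag; the assumed behaviour (activation on creation only in frame 1, otherwise on the first successful update) follows my reading of the tracker and should be checked against your copy of STrack.

```python
class ToyTrack:
    """Toy stand-in for STrack that models only the is_activated flag."""

    def __init__(self):
        self.is_activated = False

    def activate(self, frame_id):
        # Assumed behaviour: a new track is activated at once only in frame 1.
        if frame_id == 1:
            self.is_activated = True

    def update(self, frame_id):
        # Assumed behaviour: a successful match in a later frame activates it.
        self.is_activated = True


def is_output(track):
    """Same filter as the output line: only activated tracks are reported."""
    return track.is_activated


first_frame_track = ToyTrack()
first_frame_track.activate(frame_id=1)
output_in_first_frame = is_output(first_frame_track)

late_track = ToyTrack()
late_track.activate(frame_id=5)          # object first appears in frame 5
output_at_birth = is_output(late_track)
late_track.update(frame_id=6)            # matched again in frame 6
output_after_match = is_output(late_track)
print(output_in_first_frame, output_at_birth, output_after_match)
```

In other words, an object that first appears after frame 1 shows up in the output one frame late, and a detection that appears for a single frame never shows up at all.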
