Hi everyone! Thanks for visiting my seventh blog post. Over the last few days, I have worked on several tasks related to processing the data that my model produces, so in this post I will go over a few of those techniques. Let’s start by reviewing the overall project objectives and what I have already accomplished. The main goal of my project is to build an end-to-end system that uses deep learning and computer vision to track seizure patients in a clinical setting. Specifically, I am training and fine-tuning Detectron2, an object detection framework, on a pre-processed COCO dataset (see blog #2 for an explanation of what COCO is) of 50 patient videos, which amounts to over 30,000 frames of patient activity. Now that we have discussed the overall objectives, let’s look at an interesting secondary goal of my project.
As you can see in my project’s title, part of my project aims to use “Out-of-Distribution” (OOD) detection. You may be wondering what OOD detection is; in simple terms, it is a technique for identifying anomalous examples, i.e., samples that do not fit the distribution of the training data. To perform an OOD analysis, we can extract representations of the training data using a convolutional neural network such as ResNet-50. These representations are simply vector encodings of each image. In the case of seizure patient videos, samples that contain events such as a seizure or a nurse moving on camera fall outside the typical distribution of these representations. These examples are therefore anomalies and may have value for the end task of seizure detection. Now that we have explained what OOD detection is, let’s discuss how I plan to incorporate it into my project. Currently, I am still working on my main task of patient detection, and I will continue to do so until the end of Week 8. Heading into Week 9, I will take a first crack at OOD detection so that I can have some results by the end of my project.
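To make the idea concrete, here is a toy sketch (not my actual pipeline) of distance-based OOD scoring. It assumes we already have one feature vector per frame from a network like ResNet-50, and simply flags samples whose normalized distance from the mean of the training features is large; the data below is synthetic.

```python
import numpy as np

def ood_scores(train_feats, test_feats):
    """Score test samples by normalized distance from the mean of the
    training features (higher score = more anomalous)."""
    mu = train_feats.mean(axis=0)
    sigma = train_feats.std(axis=0) + 1e-8  # avoid divide-by-zero
    # Per-sample z-scored Euclidean distance from the training mean
    return np.linalg.norm((test_feats - mu) / sigma, axis=1)

# Toy demo: three in-distribution samples plus one obvious outlier
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(100, 8))   # stand-in for ResNet features
test = np.vstack([rng.normal(0.0, 1.0, size=(3, 8)),
                  np.full((1, 8), 10.0)])      # anomalous sample (row 3)
scores = ood_scores(train, test)
print(scores.argmax())  # → 3 (the outlier scores highest)
```

Real OOD methods are more sophisticated than a distance to the mean, but the overall shape is the same: encode images as vectors, then score how far each vector sits from the training distribution.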
Patient Detection Updates
Now that we have covered some logistical matters, let’s get into what I’ve been working on for the past week! This week, I took steps toward refining the post-processing stage of my model. As mentioned in last week’s blog, I completely reorganized my storage and code on the remote computer that I have been working on. Additionally, I have stored my code in a private GitHub repository, and I hope to share a public link to this repo by the end of Week 8! This week, I worked specifically on saving video inference results to the COCO format (a reversal of the preprocessing I did in Weeks 1 and 2) and on calculating the average precision (AP) of my model for each sub-video (a 600-frame segment of a larger video). Calculating the AP for each sub-video is a crucial step, since it will let me examine results across a diverse batch of videos and patients. It will also let me test my hypothesis that the model performs better under a certain set of conditions. At the moment, I believe the model performs better on videos that are in color, contain only the patient, and show the patient clearly in the main part of the frame (i.e., not in a corner or blocked by several blankets). Once I collect these APs, I can confirm those conditions and claim, with evidence, that my model performs well given the proper input data. Let’s now discuss how I accomplished these tasks on the technical side.
Before we dive into the post-processing steps, note that our Detectron2 model produces predictions, including the coordinates of a predicted bounding box, for a given input image. Converting videos to the COCO format takes multiple steps. First, I used a “predictor” function to get the predicted coordinates from Detectron2. I then stored these predicted bounding box coordinates in a JSON file and labeled them with the category “eegpt” (EEG Patient). Finally, I ran a script to reformat the resulting JSON file into valid COCO. Now let’s discuss how I started calculating average precision per sub-video. To calculate the AP for a sub-video, I need to compare the predicted bounding boxes with the actual (ground-truth) bounding boxes. Thankfully, most of this code is already written, so there is not much work left to do! The main tool here is a metric known as Intersection over Union (IoU). In simple terms, IoU measures the overlap between two bounding boxes (the true bounding box and the predicted bounding box). Two examples are presented below:
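As a quick aside before those examples, the reformatting step above can be sketched roughly as follows. This is not my actual script: the box layout (Detectron2 reports boxes as (x1, y1, x2, y2), while COCO annotations store [x, y, width, height]) is real, but the image IDs, category ID, and helper name are made up for illustration.

```python
import json

def boxes_to_coco(frame_boxes, category_id=1):
    """Convert per-frame (x1, y1, x2, y2) boxes into COCO-style
    annotation dicts. frame_boxes: {image_id: [(x1, y1, x2, y2), ...]}"""
    annotations = []
    ann_id = 1
    for image_id, boxes in frame_boxes.items():
        for x1, y1, x2, y2 in boxes:
            w, h = x2 - x1, y2 - y1
            annotations.append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": category_id,  # e.g. "eegpt" (EEG Patient)
                "bbox": [x1, y1, w, h],      # COCO uses [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return annotations

# One predicted box on frame 0, serialized as COCO-style JSON
preds = {0: [(10, 20, 110, 220)]}
print(json.dumps(boxes_to_coco(preds), indent=2))
```

A full COCO file also needs "images" and "categories" sections alongside these annotations, which is what the final reformatting script takes care of.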
As you can see above, IoU is calculated by dividing the Area of Overlap by the Area of Union. The higher the IoU, the better, since a higher value means the boxes overlap more; this is good for our model because it means the predicted bounding box and the actual bounding box are similar! Note that the predicted bounding box does not need to exactly match the actual bounding box, since the “actual” bounding box for each patient/image was hand-labeled by me in the first place. This is a common pattern in machine learning and computer vision tasks, since humans usually produce the best gold-standard labels (the values we want to predict). Beyond these scripts, I also worked on implementing some of the custom data augmentations that we discussed last week.
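The IoU calculation itself is only a few lines of code. Here is a minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) format (a standalone illustration, not taken from my scripts):

```python
def iou(box_a, box_b):
    """Intersection over Union for two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # Area of Overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter            # Area of Union
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # → 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # → 0.0 (no overlap)
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # half-overlapping → 1/3
```

Once we have an IoU per predicted box, a prediction is typically counted as correct when its IoU with a ground-truth box exceeds some threshold (0.5 is a common choice), which is what feeds into the AP numbers I will share next week.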
In summary, we discussed some logistical updates, some cool new post-processing steps, and interesting applications of our model! Next week, we will dive deeper into the results and I hope to share concrete numbers that describe how my model performs for certain videos. Thanks for sticking through the blog this week and I hope to see you next time.
- Image #1 (IoU Calculation): Wajih, et al. “Intersection over Union (IoU) for Object Detection.” PyImageSearch, 18 Apr. 2020, www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/.
- Image #2 (Image of Car with IoU): Wajih, et al. “Intersection over Union (IoU) for Object Detection.” PyImageSearch, 18 Apr. 2020, www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/.
- Facebookresearch. “Facebookresearch/detectron2.” GitHub, 4 Feb. 2021, github.com/facebookresearch/detectron2/blob/master/tools/plain_train_net.py.