Week 6: More Data Augmentation alongside Code Organization
Hi everyone! Welcome back to my senior project blog. It’s amazing that we are already in week 6! This week’s blog will be shorter than usual since there are not too many updates. For a broad overview, I spent time working on data augmentation and organizing my existing code. Overall, these steps are part of tuning my model. Before I work on an approach for detecting anomalies in patient movement (e.g., seizures), I hope to finish refining my patient detection model. Let’s now discuss what I did this week.
Data Augmentation
Since you may be wondering what data augmentation is, I will give a brief intro. Data augmentation is the practice of artificially boosting the size of a training set in machine learning. By applying transforms to existing data, we can create artificial examples that the network treats as completely separate data points. For example, if we have an image of a seizure patient, we can create new images from the original by cropping, resizing, flipping, or changing the colors and saturation. Why do I need data augmentation for my project? Since my training dataset of 30 videos is a bit limited, I am trying to increase the number of training examples through different augmentation techniques. If you are interested in how data augmentation works on a more technical level, be sure to check out last week’s blog!
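To make this concrete, here is a minimal sketch (plain NumPy, not my actual training code) of how one image can be turned into several “new” training examples with a flip, a random crop, and a brightness shift:

```python
import numpy as np

def augment(image, rng):
    """Return simple augmented copies of an HxWxC image."""
    h, w, _ = image.shape
    # Horizontal flip: mirror the image left-to-right.
    flipped = image[:, ::-1, :]
    # Random crop: keep a randomly positioned 80% window.
    ch, cw = int(h * 0.8), int(w * 0.8)
    y0 = rng.integers(0, h - ch + 1)
    x0 = rng.integers(0, w - cw + 1)
    cropped = image[y0:y0 + ch, x0:x0 + cw, :]
    # Brightness shift: scale pixel values and clip to the valid range.
    brightened = np.clip(image * 1.2, 0, 255).astype(image.dtype)
    return [flipped, cropped, brightened]

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
augmented = augment(img, rng)
print(len(augmented))  # three extra examples from one original
```

Each copy looks different to the network even though no new footage was collected, which is exactly the point for a small dataset like mine.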
Now, let’s discuss how I implemented data augmentation in my seizure patient detection model. Thankfully, the object detection library that I am using, Detectron2, provides an entire ecosystem of tools for training different computer vision models. One of these tools, the data loader, enables data augmentation. A data loader converts a raw dataset into the format that the Detectron2 model consumes; note that there are separate data loaders for training and testing. So how does the data loader provide data augmentation? The training data loader is where we specify which augmentations are applied during training. See the example from my code below:
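Here is a simplified version of that snippet (the specific augmentation parameters shown are illustrative, not my tuned values, and `cfg` is the config object built earlier in the pipeline):

```python
import detectron2.data.transforms as T
from detectron2.data import DatasetMapper, build_detection_train_loader

# Augmentations applied randomly to each training image; box
# annotations are transformed along with the pixels.
augmentations = [
    T.ResizeShortestEdge(short_edge_length=(640, 672, 704), max_size=1333),
    T.RandomBrightness(0.8, 1.2),
    T.RandomFlip(horizontal=True),
    T.RandomCrop("relative_range", (0.9, 0.9)),
]

# The mapper converts each COCO-format record into the tensor format
# Detectron2 expects, applying the augmentations along the way.
mapper = DatasetMapper(cfg, is_train=True, augmentations=augmentations)
train_loader = build_detection_train_loader(cfg, mapper=mapper)
```

The `train_loader` can then be passed to a trainer in place of the default loader, so augmentation happens on the fly with no extra files on disk.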
As you can see in the example, we build the training data loader by specifying the model config (a cfg object), a dataset mapper, and a series of augmentations. The dataset mapper takes a COCO-format dataset and maps it into a format suitable for Detectron2. In the example above, the augmentations include random resizing, brightness adjustments, flips, and crops. These augmentations are built into Detectron2’s transforms module (detectron2.data.transforms). Now, let’s understand what these data augmentations do and how they impact the model’s performance. During training, the data loader applies the augmentations to each image randomly, and across all the transforms the distribution is roughly uniform, meaning there will not be a substantially larger number of crops than flips. Since we are aiming to build a patient detection system, the bounding box is very important. Thus, it’s important to understand that when we transform an image, the bounding box annotation is transformed along with it. For example, if we flip an image horizontally, the bounding box is mirrored as well.
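To make the bounding-box behavior concrete, here is a tiny standalone sketch (plain Python, not Detectron2 itself) of how a horizontal flip remaps an `(x0, y0, x1, y1)` box:

```python
def flip_box_horizontal(box, image_width):
    """Mirror an (x0, y0, x1, y1) box across the vertical center line."""
    x0, y0, x1, y1 = box
    # The left edge of the flipped box comes from the old right edge.
    return (image_width - x1, y0, image_width - x0, y1)

# A patient bounding box in a 640-pixel-wide frame:
print(flip_box_horizontal((100, 50, 300, 400), 640))  # (340, 50, 540, 400)
```

Detectron2’s transforms do this bookkeeping automatically for every annotation, which is why we can stack augmentations without manually re-labeling anything.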
Let’s now discuss how data augmentation improved the performance of the model. After training the augmented model for 50,000 iterations, the minimum validation loss was 0.594, a slight improvement over the previous 0.599. While this may not seem like much, it is a neat finding: it shows that even basic augmentations can yield some performance improvement. By designing custom transforms and stacking more augmentations, we may be able to take the model to the next level. I will work on this over the weekend.
Code Organization
Given that the model pipeline I am using is becoming quite large, I realized it was necessary to spend a few days purely on organizing the code! With several different scripts and over three thousand lines of code in total, it’s important to have a structured way of accessing files and storing experiment results. Note that I am using a shared remote machine (via SSH) with GPUs, so clearing caches and optimizing for storage are not immediate concerns.
To store my code, I use a three-layer structure. At the top level is a folder with the annotated seizure patient video data alongside a “code” folder. The code folder breaks down into preproc (preprocessing), postproc (postprocessing), and network. Preproc contains scripts for converting the COCO data into a usable JSON file. Postproc contains scripts for video tracking and storing inference results. Network contains the main set of scripts for training and testing models with Detectron2. I also created a results folder with example videos and graphs. I have begun storing these files in a private GitHub repository, which is useful since I can modify code and commit changes instantly! I will aim to share the link to a public repo in next week’s blog.
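As a rough sketch, the layout described above can be reproduced with a few lines of pathlib (the folder names besides code/preproc/postproc/network and results are placeholders, not my exact names):

```python
import tempfile
from pathlib import Path

def make_project_skeleton(root):
    """Create the three-layer folder layout described above."""
    root = Path(root)
    # Layer 1: video data, code, and results live at the top level.
    # Layer 2: the code folder splits into preproc, postproc, and network.
    subdirs = ["data", "results", "code/preproc", "code/postproc", "code/network"]
    for sub in subdirs:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root

# Demo in a temporary directory (the real project uses a persistent path).
demo_root = make_project_skeleton(tempfile.mkdtemp())
```

Having the skeleton scripted means a fresh clone of the repo can be set up identically on any machine.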
Beyond making it easier to navigate what was previously a maze of files, organizing my code has also taught me a lot about using the command line on Ubuntu (Linux) machines! I’ve learned about “fish” (a user-friendly interactive shell), “tmux” (a terminal multiplexer that lets me keep several sessions running at once), and command-line tools for checking GPU usage and machine settings. It’s really cool since knowing these neat tools will serve me well in future projects.
Conclusions and Next Steps:
In this week’s shorter blog, we went over data augmentation techniques for my project as well as code organization. While these two topics may not seem incredibly useful in the short run, they will set up our pipeline for success in the long run! Next week, I will add in custom data augmentations (transforms), collect metrics like Average Precision, and continue tuning my model as I hope to begin wrapping up patient detection. I will also have some cool graphs to show. Thanks for sticking through the blog!