During Week 4, I connected the TrainingDataReader class I wrote last week to my FeatureExtractor class, which is the NLP portion of my project, so that I could feed the reader's data into the extractor and test it. At the beginning of the week, I talked with my external advisor on a video call to discuss the progress I had made so far and what I should work on for the rest of the week. I then made a few adjustments to my code so that the input from each file works with the FeatureExtractor class.
I started connecting the two classes so that the output of the TrainingDataReader class would be the input to the FeatureExtractor class. I was able to read the input, but I was not sure how to get the second element of a Pair, so I researched it and found that the second() method does this. I also needed a pattern that splits an article into an array of sentences. I first tried article.split(". "), but it did not work, so I researched how to use a pattern matcher to get an array of sentences. Once I had settled on a pattern, I tested it on my articles to check for bugs. Unfortunately, I found one and spent two days trying to figure out what I had done wrong: the code threw an ArrayIndexOutOfBoundsException, which I fixed by adding a new rule to my rulebook.
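To make the sentence-splitting step concrete, here is a minimal sketch of one way it could look. It is not my actual project code: the SentenceSplitDemo class, the small Pair class, and the pattern are all stand-ins I made up for illustration, and only the second() accessor comes from what I described above. One possible reason the first attempt failed is that String.split treats its argument as a regular expression, so an unescaped "." matches any character, not just a period.

```java
import java.util.regex.Pattern;

class SentenceSplitDemo {

    // Hypothetical stand-in for the Pair type; only second() is taken
    // from the post, everything else here is assumed for illustration.
    static final class Pair<A, B> {
        private final A first;
        private final B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        A first() { return first; }
        B second() { return second; }
    }

    // Split on whitespace that directly follows '.', '!' or '?'.
    // The lookbehind keeps the punctuation attached to its sentence.
    // This is a simple assumed pattern; it will mis-split abbreviations
    // such as "Dr. Smith".
    private static final Pattern SENTENCE_END = Pattern.compile("(?<=[.!?])\\s+");

    static String[] splitSentences(String article) {
        String trimmed = article.trim();
        if (trimmed.isEmpty()) {
            // Avoid returning a one-element array holding an empty string.
            return new String[0];
        }
        return SENTENCE_END.split(trimmed);
    }

    public static void main(String[] args) {
        // Assumed shape: the first element is a label, the second is the article text.
        Pair<String, String> item =
                new Pair<>("example-label", "It rained. We stayed in! Why not?");
        for (String sentence : splitSentences(item.second())) {
            System.out.println(sentence);
        }
    }
}
```

Returning an empty array for blank input is a deliberate choice in this sketch: code that indexes into the result then has to check the length first, which is one way to guard against an ArrayIndexOutOfBoundsException.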
Next week, I will be creating my machine learning model using a classifier and testing it out to see how accurate it is.
Thanks for reading!