Week 3 – More analysis
I have finished the Weka tutorial I was working on earlier. I have also done some work on document analysis, which could be useful for this project’s data sets. The document analysis involved training a classifier to tell whether a passage is about a given topic.
I have also joined the Coursera machine learning course. I am currently on Week 1, watching videos and reading the information about this course.
Finally, I am now trying out some algorithms on sample student performance data sets from Kaggle. One of them is IBk, a nearest-neighbor classifier. While it works on the data set, its error rate is very high, so I am looking for other algorithms that could work better on the data set. I should also be able to obtain the actual data set for the final project soon.
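To make the nearest-neighbor idea concrete, here is a minimal Python sketch of the technique that IBk is based on, together with a leave-one-out error estimate. It is not Weka's implementation. The feature names (hours studied, prior score) and the six data points are made up for illustration and do not come from the Kaggle data sets.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_error(X, y, k=3):
    """Fraction of points misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) != y[i]:
            errors += 1
    return errors / len(X)


# Hypothetical toy data: (hours studied, prior score) -> outcome.
X = [(1.0, 40.0), (2.0, 45.0), (1.5, 50.0), (8.0, 85.0), (9.0, 90.0), (7.5, 80.0)]
y = ["fail", "fail", "fail", "pass", "pass", "pass"]

for k in (1, 3, 5):
    print(k, leave_one_out_error(X, y, k))
```

On this toy set the error estimate depends strongly on k, which suggests that the number of neighbors is worth tuning before giving up on the classifier. Features measured on very different scales can also dominate the distance, so normalizing them first may help.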
It sounds like you’re making steady progress. Glad to hear that you were able to finish your tutorial. Regarding document analysis, I’ve heard of tf-idf analysis. Maybe you could look into it if you haven’t already?
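In case it helps, here is a rough Python sketch of one common tf-idf variant: term frequency as a share of the document's words, multiplied by the log of the inverse document frequency. The three sample sentences are made up, and libraries often use slightly different smoothing, so treat it as an illustration of the idea only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample passages.
docs = [
    "the exam covers algebra",
    "the exam covers geometry",
    "the team won the match",
]
weights = tf_idf(docs)
print(weights[0])
```

A word that appears in every passage, like "the", gets a weight of zero, while a word found in only one passage gets the highest weight. That makes the weights a reasonable input for a topic classifier.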
The Coursera Machine Learning course is extremely valuable, and I’m sure you’ll enjoy it. For Kaggle datasets, you can often run code that others have shared. It can be pretty useful and help you learn how to improve accuracy.
Great progress! I hope that you are able to get the actual data set soon!