Week 10: Troubleshooting the 3 black-boxes
I started this week off by working on the XGBoost package. To format the data correctly, I had to convert the training/testing data into a matrix with the as.matrix() function and the training/testing labels into a vector of integers with the as.integer() function. The two parameters "max.depth" and "nrounds" control the depth of each tree and the number of boosting rounds, and tuning them raises or lowers the model's accuracy. By fine-tuning these two settings, I was able to get an accuracy of 92.23%!
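Here's a rough sketch of what that setup looks like. This uses the built-in iris dataset rather than my project data, and the max.depth/nrounds values are just placeholders, not the tuned ones:

```r
# Minimal XGBoost sketch using the same as.matrix()/as.integer() conversions
# described above. Dataset and parameter values are illustrative only.
library(xgboost)

set.seed(42)
idx <- sample(nrow(iris), 0.8 * nrow(iris))

# Features must be a numeric matrix; labels must be 0-based integers for multi:softmax
train_x <- as.matrix(iris[idx, 1:4])
train_y <- as.integer(iris$Species[idx]) - 1
test_x  <- as.matrix(iris[-idx, 1:4])
test_y  <- as.integer(iris$Species[-idx]) - 1

# max.depth and nrounds are the two knobs being tuned in the post
model <- xgboost(data = train_x, label = train_y,
                 max.depth = 4, nrounds = 50,
                 objective = "multi:softmax", num_class = 3,
                 verbose = 0)

pred <- predict(model, test_x)
mean(pred == test_y)   # accuracy on the held-out split
```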
For randomForest, I also ran into a formatting issue, but thankfully, once I applied the as.matrix() and as.integer() functions to the data, the call went through. After I applied randomForest in regression mode to the data, I used the predict() function to predict the labels, again converting the testing data into a matrix with as.matrix(). Unfortunately, after running this, I only got an accuracy of around 55%, which is pretty low. During Week 11, I'll be working on raising this accuracy.
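For reference, here is a small randomForest sketch on the same toy data. One thing worth checking is the label type: if the labels are numeric, randomForest fits a regression forest, while passing them as a factor forces classification. This is just an assumption about where the low accuracy might come from, not a confirmed diagnosis:

```r
# Minimal randomForest classification sketch (illustrative data, not the project's).
library(randomForest)

set.seed(42)
idx <- sample(nrow(iris), 0.8 * nrow(iris))

train_x <- as.matrix(iris[idx, 1:4])
train_y <- as.factor(iris$Species[idx])   # factor labels -> classification mode
test_x  <- as.matrix(iris[-idx, 1:4])
test_y  <- as.factor(iris$Species[-idx])

rf <- randomForest(x = train_x, y = train_y, ntree = 500)

pred <- predict(rf, test_x)
mean(pred == test_y)   # accuracy on the held-out split
```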
I have only started looking into AdaBoost, and it seems the R package for it is called "fastAdaboost." I'll also be spending time next week figuring this out.
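From a first look at the fastAdaboost documentation, the interface seems to be roughly the following; I haven't run it against the project data yet, and this toy example only uses two iris species since adaboost() handles binary classification:

```r
# Sketch of the fastAdaboost API based on its documentation (untested on project data).
library(fastAdaboost)

set.seed(42)
two_class <- droplevels(subset(iris, Species != "setosa"))
idx <- sample(nrow(two_class), 0.8 * nrow(two_class))
train <- two_class[idx, ]
test  <- two_class[-idx, ]

# nIter is the number of boosting iterations (weak learners)
model <- adaboost(Species ~ ., data = train, nIter = 50)

pred <- predict(model, newdata = test)
mean(pred$class == test$Species)   # accuracy on the held-out split
```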
Thanks for reading!
Best,
Arshia