3- The one with the Data

Mar 11, 2019

Since you last had the incomparable pleasure of reading my senior project blog, about 17.5 quintillion bytes of data were generated. Needless to say, your digital footprint is much more extensive (and much more public) than your carbon footprint. What is privacy, anyways?

Moving on:

I spent this past week reveling in the world of complex data analysis and reading a REALLY long paper on the major cell types that compose all human tissues. The computational methods in that paper gave me some inspiration, and a subsequent paper by the GTEx Project itself inspired me to replicate their methods for finding tissue-specific genes. In replicating their methods, I have one goal: to compare the tissue-specific genes I discover using all GTEx samples with the tissue-specific genes I get after removing abnormal samples. This will allow me to decide whether the GTEx database is really representative of a normal genotype. Next week, I hope to begin my statistical analysis, but we all know how tedious data cleaning is 🙁.
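As a rough sketch, the comparison step could look something like this in Python. It assumes the tissue-specific genes have already been found for each tissue, once with all samples and once with the abnormal samples removed. The function name, the dictionary layout, and the placeholder gene names are all made up for illustration; none of this is the GTEx pipeline itself.

```python
def compare_gene_sets(all_samples, filtered_samples):
    """Compare per-tissue gene sets found with and without abnormal samples.

    Both arguments are hypothetical dicts mapping a tissue name to a list
    of tissue-specific gene names.
    """
    report = {}
    for tissue in sorted(set(all_samples) | set(filtered_samples)):
        before = set(all_samples.get(tissue, ()))
        after = set(filtered_samples.get(tissue, ()))
        union = before | after
        report[tissue] = {
            # genes called tissue-specific in both runs
            "shared": sorted(before & after),
            # genes that drop out once abnormal samples are removed
            "lost": sorted(before - after),
            # genes that only appear after abnormal samples are removed
            "gained": sorted(after - before),
            # Jaccard index: 1.0 means the two gene sets are identical
            "jaccard": len(before & after) / len(union) if union else 1.0,
        }
    return report


# Toy input with placeholder gene names (not real results).
all_samples = {"liver": ["GENE_A", "GENE_B", "GENE_C"], "lung": ["GENE_D"]}
filtered_samples = {"liver": ["GENE_A", "GENE_B", "GENE_E"], "lung": ["GENE_D"]}

report = compare_gene_sets(all_samples, filtered_samples)
for tissue, summary in report.items():
    print(tissue, summary)
```

A Jaccard index near 1.0 for every tissue would suggest the abnormal samples barely change which genes come out as tissue-specific. Low values would flag the tissues where they matter most.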

Meanwhile, my patient classifier model is currently on hold. Let's just say I have to go a little deeper in my learning.

4 Replies to “3- The one with the Data”

  1. Rishi A. says:

    If the first comic was true… that would be a major oof.

  2. Eva P. says:

    Data cleaning 🙁 Also how long is your model on hold for?

    1. Shreya S. says:

      It sucks, right!? I’ll probably pick up where I left off with the model in 1-2 weeks after I get far enough in the analysis.

  3. Cindy K. says:

    It sounds like your project is coming along nicely. Sounds fun!

