Research Update 6 ~Alex

Another good week! This week focused mostly on getting the plots ready for my paper. They have been resized and reformatted, with fixed titles, new colors, and an added legend, so they are now paper-ready. We have also been investigating which procedural steps give the best results. We extended last week's NA cost debate by testing costs between 0 and 0.4 so we can settle on a value. For now we have tentatively narrowed it down to 0.1 or 0.2, since that seems to be where the age cohorts disappear.
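To make the NA cost idea concrete, here is a minimal sketch of how such a cost could enter a distance between two sequences. It assumes an optimal-matching style edit distance, which this post does not actually confirm. The function name `om_distance`, the use of `None` for a missing state, and the default substitution and indel costs of 1.0 are all assumptions for illustration, not our real settings.

```python
from typing import Optional, Sequence

# Assumption: a missing (NA) state is represented by None.
State = Optional[str]


def om_distance(
    a: Sequence[State],
    b: Sequence[State],
    sub_cost: float = 1.0,    # hypothetical cost of swapping two observed states
    indel_cost: float = 1.0,  # hypothetical cost of an insertion or deletion
    na_cost: float = 0.1,     # cost of swapping an observed state with NA
) -> float:
    """Edit distance in which substitutions involving NA use their own cost."""

    def cost(x: State, y: State) -> float:
        if x == y:
            return 0.0          # identical states (including NA vs NA) are free
        if x is None or y is None:
            return na_cost      # one side is missing
        return sub_cost         # two different observed states

    n, m = len(a), len(b)
    # prev[j] holds the distance between a[:i-1] and b[:j]
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j] + indel_cost,                    # delete a[i-1]
                curr[j - 1] + indel_cost,                # insert b[j-1]
                prev[j - 1] + cost(a[i - 1], b[j - 1]),  # substitute or match
            )
        prev = curr
    return prev[m]
```

In this sketch, lowering `na_cost` makes a sequence with missing states look closer to an otherwise identical complete one, which is the knob we have been turning between 0 and 0.4.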

In addition to the NA cost, we are looking at adding normalization as a step, which will hopefully let us include longer sequences without age cohorts appearing. We are trying 4 different normalization procedures, including maxlength, which is the one shown in the plots. The plot titles give the oldest age included in the sequence, and the colors once again represent the ages.
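As a rough illustration of what maxlength normalization does, the sketch below divides a raw distance by the length of the longer of the two sequences. This is how I understand the usual definition, so treat it as an assumption. The function name `normalize_maxlength` is made up for this example.

```python
def normalize_maxlength(distance: float, len_a: int, len_b: int) -> float:
    """Divide a raw sequence distance by the length of the longer sequence.

    Assumption: this follows the common "maxlength" definition; the exact
    procedure used in our pipeline may differ.
    """
    longest = max(len_a, len_b)
    if longest == 0:
        return 0.0  # two empty sequences: nothing to normalize
    return distance / longest
```

The idea is that a raw distance of 3 between two sequences of length 10 counts for less than a raw distance of 3 between two sequences of length 4, so longer sequences are not pushed apart just because they have more positions to differ in.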

Finally, we looked at internal clustering measures to determine the optimal number of clusters. We used ASW and PBC as before, but this time we compared them across age cutoffs and NA costs. From these plots we determined that 5 clusters is the optimal number, because that is right before a large drop-off in clustering performance.
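For anyone curious how ASW (average silhouette width) is calculated, here is a small pure-Python sketch that works from a precomputed distance matrix and a list of cluster labels. The function name and the toy data are hypothetical, and this is not the code we actually ran. Singleton clusters are given a silhouette of 0, which is a common convention.

```python
from typing import Dict, List, Sequence


def average_silhouette_width(
    dist: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Average silhouette width from a square distance matrix and labels."""
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("ASW needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: silhouette taken as 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(labels)


# Toy data (made up): four points on a line, two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
good_asw = average_silhouette_width(toy_dist, [0, 0, 1, 1])
bad_asw = average_silhouette_width(toy_dist, [0, 1, 0, 1])
```

Computing this for each candidate number of clusters and looking for the point right before the score falls sharply is the kind of comparison the plots summarize.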

Other than validating the new variables, which is not going so well, everything worked this week! We'll continue struggling with the variables next week, hopefully with better results!

