Posts

Showing posts from July, 2018

Weekend 8: Lawrence Hall of Science

This weekend, we decided to visit the Lawrence Hall of Science, just up the hill from where we work. Unfortunately, we didn't get to do very much because we only had a little over an hour before it closed. For as little time as we had, I would say we spent it well.

Week 8 Review: 7/23 - 7/27

After week 7's setback, our research group had to rethink our data analysis approach. Instead of comparing the degree-of-change metric with 4 clusters against log(throughput), we chose a method using 2 clusters per window. In a very general sense, the performance of one data transfer could be considered "good" or "bad." We hypothesized that with two clusters, one would contain the normal transfers (the "good" ones) and the other the anomalous points (the "bad" ones). By considering the proportion of "good" points to all the points in one time window, we can approximate what the throughput should be for the window and, most notably, whether the throughput is abnormally low. We didn't work out a way to tell the normal cluster from the anomalous one because we weren't sure how well the approach would work in the first place. For the sake of time, we bypassed this hurdle by only considering the smaller of the two clusters. Our choice t
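As a rough illustration of the two-cluster idea described above, here is a minimal sketch. It assumes each window is just a list of throughput values and uses a basic 1-D k-means with k=2; the post does not say which clustering algorithm or features were actually used, so the function names and the sample window are hypothetical.

```python
def two_means_1d(values, iterations=50):
    """Split 1-D values into two clusters with a basic k-means (k=2).

    Assumption: the real analysis may use other features or another
    clustering algorithm; this is only an illustration.
    """
    # Start the two centers at the extremes of the window.
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            for v in values
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def smaller_cluster_fraction(values):
    """Fraction of the window's points that fall in the smaller cluster."""
    labels = two_means_1d(values)
    ones = sum(labels)
    smaller = min(ones, len(labels) - ones)
    return smaller / len(labels)


# Hypothetical window: six ordinary transfers and two very slow ones.
window = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 1.2, 0.9]
print(smaller_cluster_fraction(window))
```

Taking the smaller cluster sidesteps the question of which cluster is the "normal" one, at the cost of assuming that anomalies are always the minority in a window.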

Research update 7 ~Alex

This week has been very busy. The results section of my paper was due, so the majority of my time was spent typing, editing, and re-editing that. There were also some final decisions made about future steps. Instead of the NA cost strategy we've been following, we've decided to employ a delete strategy. In this strategy (which only works because the missing values occur only at the end of the sequences), we will delete all missing values and be left with sequences of different lengths. Then these mismatched sequences will be normalized and compared. This method gives faster and more accurate results than simply picking a substitution cost. Other than that change in the initial procedure, the rest of our methods have not changed. In addition to this decision, we have also decided on an ending age of 39 for our sequences. The new variables are finally fixed and we have a final dataset. We are ready to receive our final clustering results and have ensured they will be accurate wi

Week 7 Review: 7/16 - 7/20

This week, much effort was put into being able to compare average window throughput and degree of change. One challenge to overcome was how to scale the throughput data. To best understand the distribution of the average throughputs for every window, we made a histogram using the unscaled throughputs. It was evident that the magnitudes of the very high throughputs dwarfed the small throughputs by a factor of at least 10^2. Two methods were considered to scale the throughput: apply the reciprocal operation (^-1) or apply a logarithmic function. The logarithmic function most evenly represents small and high throughputs. After choosing our scaling method, we graphed the log(throughput) with the degree of change over time. From these figures, it seems there is no discernible relationship between degree of change and log(throughput). This was a big step back, as Alina and I had to rethink our approach to this project.
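To see why the logarithm evens things out while the reciprocal does not, here is a minimal sketch. The throughput values are made up for illustration, and base 10 is an assumption; the post does not say which base was used.

```python
import math

# Hypothetical average window throughputs spanning a few orders of magnitude.
# All values are greater than 1 so their base-10 logs are positive.
throughputs = [2.0, 5.0, 40.0, 300.0, 1000.0]


def reciprocal_scale(values):
    """Apply the reciprocal operation (x ** -1) to each value."""
    return [1.0 / v for v in values]


def log_scale(values):
    """Apply a base-10 logarithm to each value (base is an assumption)."""
    return [math.log10(v) for v in values]


def spread_ratio(values):
    """Ratio of the largest magnitude to the smallest magnitude."""
    magnitudes = [abs(v) for v in values]
    return max(magnitudes) / min(magnitudes)


print("raw:       ", spread_ratio(throughputs))
print("reciprocal:", spread_ratio(reciprocal_scale(throughputs)))
print("log10:     ", spread_ratio(log_scale(throughputs)))
```

The reciprocal only flips the ordering, so the largest and smallest values are still the same factor apart as in the raw data. The logarithm compresses that factor, which is why small and large throughputs end up sharing a histogram more evenly.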

Weekend 6: Pier 39

Our trip to Pier 39 on our 6th weekend was very fun. We started by getting lunch in Chinatown again. While in Chinatown, we passed a Chinese barber pole, so I just had to take a picture. For those who are unaware, I am involved in YSU's barbershop chorus, One Achord. My infatuation with barbershop music has led me to relish the peculiar nuances of the art form, such as barber pole appreciation. Click here for a quick video of the barbershop quartet "The Newfangled Four" singing in front of the world's tallest barber pole as reference. We first went into the Aquarium of the Bay. It wasn't that big, but there was a pretty wide variety. I admit I don't remember very well what kinds of animals I've seen at the Toledo Zoo/Aquarium, but I think this was the first time I've seen sea otters. There was some other intriguing aquatic life like stingrays, jellyfish, and octopuses. On the lower part of the pier, mostly everything was gif

Research Update 6 ~Alex

Another good week! This week has been focused mostly on improving the plots for when they are included in my paper. They've been resized and reformatted, the titles fixed, the colors changed, and a legend added, and they are paper ready. In addition, we have been investigating the best procedural steps to ensure optimal results. The NA cost debate from last week has been investigated further to include costs between 0 and 0.4 so we can decide definitively on a cost. Right now we have tentatively decided on 0.1 or 0.2, as that seems to be where the age cohorts disappear. In addition to NA cost, we are looking to include normalization as a step, which will hopefully allow us to include larger sequences without any age cohorts. We are trying 4 different normalization procedures, including maxlength, shown below. The titles are the oldest age included in the sequence. The colors are once again the ages. Finally, we have looked at internal clustering measures to determine the optimal number of cl