Posts

Week 10 Review: 8/6 - 8/10

I spent all my time this week preparing for the end of the internship. I submitted a poster for admission to the Association for Computing Machinery poster competition, to be held at the Supercomputing Conference in Dallas, Texas this year. Beyond that, I spent a lot of time organizing files and finishing mandatory assignments. On Monday I submitted a minute-long video to a small elevator pitch competition, and on Thursday I received an honorable mention, which was a tie between 5th and 6th place out of 10. Not bad for one hour of recording at midnight. Thursday was also the poster session for all the interns, and it went fairly well. The expense of organizing the event probably wasn't worth the number of visitors who came to participate, though. I will say that I was one of the few interns who presented their poster to Michael Stewart Witherell, the director of the lab! Now that was cool! The research poster presented at the final poster session. There was a lot plan

Week 9 Review: 7/30 - 8/3

At the start of this week I tried to clean and analyze more recent Tstat logs from January and July of this year. To my surprise, these files were formatted differently from my previous files, which meant I had to clean and compile them differently as well. This posed many unexpected problems that took much longer to solve than I first imagined. I had also planned to add a way for the algorithm to identify the "normal" cluster and to perform quantitative analysis on the similarity between the calculated sequences and the log(throughput). At the same time, this content needed to go into my poster for the intern poster session by noon on Friday. After cleaning, I tried to add the cluster identification step to the algorithm. I was running behind, so without finishing it I moved on to calculating quantitative similarity metrics. I didn't want to consider the similarity of the sequences when throughput is high (because our values turn out to be low) so I took the

Weekend 8: Lawrence Hall of Science

This weekend, we decided to visit the Lawrence Hall of Science, just up the hill from where we work. Unfortunately, we didn't do very much because we had only a little over an hour before it closed. For as little time as we had, I would say we spent it well.

Week 8 Review: 7/23 - 7/27

After week 7's setback, our research group had to rethink its data analysis strategy. Instead of comparing the degree of change metric with 4 clusters to log(throughput), we chose a method using 2 clusters per window. In a very general sense, the performance of one data transfer could be considered "good" or "bad." We hypothesized that with two clusters, one cluster would contain the normal transfers, the "good" ones, and the other would contain the anomalous points, the "bad" ones. By considering the proportion of "good" points to all the points in one time window, we can approximate what the throughput should be for the window and, most notably, tell when the throughput is abnormally low. We didn't consider a way to distinguish the normal cluster from the anomalous one because we weren't sure how well the approach would work in the first place. For the sake of time, we bypassed this hurdle by only considering the smaller of the two clusters. Our choice t
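As a rough illustration of the two-cluster idea, here is a minimal sketch. It assumes each transfer in a window is summarized by a single made-up numeric feature, and it uses a plain two-means split as a stand-in for whatever clustering the project actually ran. The function names and sample values are hypothetical.

```python
def two_means(values, iterations=20):
    """Split 1-D values into two clusters with a basic k-means (k=2) loop."""
    low, high = min(values), max(values)  # start the two centers at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each point to the nearer of the two centers.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if not group0 or not group1:
            break  # every point landed in one cluster; nothing left to update
        low = sum(group0) / len(group0)
        high = sum(group1) / len(group1)
    return labels


def smaller_cluster_fraction(values):
    """Fraction of points in the smaller cluster, treated here as the anomalous one."""
    labels = two_means(values)
    ones = sum(labels)
    return min(ones, len(labels) - ones) / len(labels)


# Hypothetical window: six similar transfers and two that look different.
window = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 2.1, 1.9]
print(smaller_cluster_fraction(window))
```

Using only the smaller cluster sidesteps the question of which cluster is "normal", at the cost of assuming anomalies are always the minority in a window.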

Research update 7 ~Alex

This week has been very busy. The results section of my paper was due, so the majority of my time was spent typing, editing, and re-editing it. There were also some final decisions made about future steps. Instead of the NA cost strategy we've been following, we've decided to employ a delete strategy. In this strategy (which only works because the missing values occur only at the end of the sequences) we will delete all missing values and be left with sequences of different lengths. Then these mismatched sequences will be normalized and compared. This method gives faster and more accurate results than simply picking a substitution cost. Other than that change in the beginning procedure, the rest of our methods have not changed. In addition to this decision, we have also decided on an ending age of 39 for our sequences. The new variables are finally fixed and we have a final dataset. We are ready to receive our final clustering results and have ensured they will be accurate wi
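The delete step itself can be sketched in a few lines. This is only an illustration: the state labels, the use of `None` as the missing marker, and the function name are all assumptions, and the normalization and comparison steps are not shown because the post does not describe them.

```python
def drop_trailing_missing(sequence, missing=None):
    """Remove missing markers from the end of a sequence only."""
    end = len(sequence)
    while end > 0 and sequence[end - 1] is missing:
        end -= 1
    return sequence[:end]


# Hypothetical sequences with missing values only at the end.
sequences = [
    ["A", "A", "B", None, None],
    ["A", "B", "B", "B", None],
    ["B", "B", "B", "B", "B"],
]

trimmed = [drop_trailing_missing(s) for s in sequences]
print([len(s) for s in trimmed])  # the lengths now differ from sequence to sequence
```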

Week 7 Review: 7/16 - 7/20

This week, much effort went into being able to compare average window throughput and degree of change. One challenge to overcome was how to scale the throughput data. To best understand the distribution of the average throughputs for every window, we made a histogram using the unscaled throughputs. It was evident that the very high throughputs dwarfed the small ones by a factor of at least 10^2. Two methods were considered to scale the throughput: apply the reciprocal operation, (^-1), or apply a logarithmic function. The logarithmic function represents small and high throughputs most evenly. After choosing our scaling method, we graphed the log(throughput) with the degree of change over time. From these figures, it seems there is no discernible relationship between degree of change and log(throughput). This was a big step back, as Alina and I had to rethink our approach to this project.
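A small sketch of why the logarithm spreads the values more evenly than the reciprocal. The throughput values below are made up, not project data, and base 10 is an assumption since the post does not say which base was used.

```python
import math

# Hypothetical average window throughputs spanning several orders of magnitude.
throughputs = [0.5, 2.0, 8.0, 40.0, 900.0, 5000.0]


def log_scale(values):
    """Return log10 of each value; every value must be positive."""
    return [math.log10(v) for v in values]


def reciprocal_scale(values):
    """Return 1/v for each value; every value must be nonzero."""
    return [1.0 / v for v in values]


logged = log_scale(throughputs)
recip = reciprocal_scale(throughputs)

# Equal ratios between throughputs become equal steps on the log scale.
print(logged[1] - logged[0], logged[2] - logged[1])

# The reciprocal squeezes the high throughputs together near zero
# while keeping the small ones far apart.
print(recip[0] - recip[1], recip[4] - recip[5])
```

With the reciprocal, differences among high throughputs almost vanish, so a plot would be dominated by the slowest windows. The log keeps both ends of the range visible.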

Weekend 6: Pier 39

Our trip to Pier 39 on our 6th weekend was very fun. We started by getting lunch in Chinatown again. While in Chinatown, we passed a Chinese barber pole, so I just had to take a picture. For those who are unaware, I am involved in YSU's barbershop chorus, One Achord. My infatuation with barbershop music has led me to relish the peculiar nuances of the art form, such as barber pole appreciation. Click here for a quick video of the barbershop quartet "The Newfangled Four" singing in front of the world's tallest barber pole as reference. We first went into the Aquarium of the Bay. It wasn't that big, but there was a pretty wide variety. I admit I don't remember very well what kinds of animals I've seen at the Toledo Zoo/Aquarium, but I think this was the first time I've seen sea otters. There was some other intriguing aquatic life like stingrays, jellyfish, and octopuses. On the lower part of the pier, mostly everything was gif