
Week of 10/10

Week of 10/1

Since I didn't get to it last week, here's my video explaining datasets and their applications on YouTube!


This week we learned about clustering and graphing clustering. Clustering is a pretty practical way to group a lot of data by similarity, and the "machine learning" aspect of it is generally considered to be unsupervised learning, since the algorithm finds the groups without being given any labels. Initially, we thought that we could focus on K-means clustering specifically and then work on the different aspects of K-means. However, some research showed us that K-means as a topic was far too narrow; we had to find a better topic to divvy up. As a group, we decided to focus on clustering as a whole instead and split it into six different topics.

[Image: hierarchical clustering]

I chose to focus on hierarchical clustering. At first glance, hierarchical clustering looks incredibly intimidating, as you can probably tell from the picture above. However, as I read a little more on it, it started to make a bit more sense to me. The computer groups the closest clusters together by proximity, one step at a time, until it reaches a point where all clusters have merged into one. Graphically, the result looks like a dendrogram: a tree whose branches help you identify the different steps/layers of grouping the computer went through while merging and organizing the clusters.
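To make the merging process a bit more concrete, here's a minimal sketch in plain Python. It's not a standard library routine, just my own illustration: the function name `agglomerative_merges`, the sample points, and the choice of measuring "proximity" as the distance between the two closest members of each cluster (usually called single linkage) are all assumptions. The list of merges it returns is the same information a dendrogram draws as branches.

```python
from math import dist


def agglomerative_merges(points):
    """Return the merges, in order, as (left_cluster, right_cluster, distance).

    Assumption: cluster-to-cluster distance is the distance between the two
    closest members (single linkage). Other linkage rules are possible.
    """
    # Start with every point in its own cluster; clusters hold point indices.
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        # Merge b into a; b > a, so deleting b does not shift a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical sample data: two tight pairs and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
for left, right, d in agglomerative_merges(points):
    print(f"merge {left} + {right} at distance {d:.2f}")
```

Each printed line corresponds to one "layer" of the dendrogram, and the merge distance is the height at which those two branches would join.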

There are two different kinds of hierarchical clustering: first is agglomerative, which means that you work your way up and start with each point as its own cluster until they all merge into one; the other is divisive, which means that you have one large cluster to begin with and then you split it up based on how far apart the points are from each other. While they both accomplish nearly the same end goal, agglomerative is far more popular and widely used. It's also the kind of hierarchical clustering you're most likely to run into in data analysis tools.
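For contrast with the bottom-up approach, here's a rough sketch of the divisive (top-down) idea, again in plain Python. To keep it short I'm assuming the data is just numbers on a line and that "split based on distance" means cutting at the biggest gap between neighboring values; the function name `divisive_clusters` and the parameter `k` (how many clusters to stop at) are made up for this example, and real divisive methods use more general splitting rules.

```python
def divisive_clusters(values, k):
    """Split one big cluster of 1-D values into up to k clusters, top-down.

    Assumption: each split cuts at the largest gap between neighboring
    values inside any current cluster.
    """
    # Start with everything in a single sorted cluster.
    clusters = [sorted(values)]
    while len(clusters) < k:
        best = None  # (gap size, cluster index, split position)
        for ci, cluster in enumerate(clusters):
            for pos in range(1, len(cluster)):
                gap = cluster[pos] - cluster[pos - 1]
                if best is None or gap > best[0]:
                    best = (gap, ci, pos)
        if best is None:
            break  # every cluster is a single point; nothing left to split
        _, ci, pos = best
        cluster = clusters[ci]
        # Replace the chosen cluster with its two halves.
        clusters[ci:ci + 1] = [cluster[:pos], cluster[pos:]]
    return clusters


# Hypothetical sample data.
print(divisive_clusters([1, 2, 3, 10, 11, 25], 3))
# prints [[1, 2, 3], [10, 11], [25]]
```

The first split cuts off 25 (the biggest gap), and the second separates 1 to 3 from 10 and 11, which is the reverse order of how an agglomerative run would merge them.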

This video helped me familiarize myself with hierarchical clustering, so you might find it useful/interesting:


At Caltech this Thursday, we learned about graphing clustering. It was something I hadn't thought about before, but listening to the grad student talk about it, it sounded pretty interesting, and I could actually understand it. That was kind of nice. Near the end, he challenged us to do some research on some centrality equations, and even though I still kind of don't get some of it, I think it gives me a nice direction to put my effort towards. I made a small summary chart of the notes I took that day, so here you go:



The software I used to create this rasterizes the image, which means that it stores the drawing as a grid of pixels rather than as vectors, which describe shapes with points and lines. Rasterizing makes images a bit blurry when you resize them, so they're not very versatile, but I hope that this works. I haven't done my assigned research on the equations yet, but I'll update y'all when I get the chance.
