About Creation

I have a rule. After every podcast, I write down 10 things I learned. I don’t know if anyone else does this. Do you do this? Some people make illustrations. They send me what they’ve learned. It’s a creation of a creation of a creation. A drawing of a podcast of someone’s life.But I broke my rule. It’s been over a month. And my brain is digging for the lessons from my interview with the creator of WordPress. I think I have Alzheimer’s. Matt was 19 years old when he started WordPress. It was 2003. Now WordPress.com gets more traffic than Amazon.com.The Wall Street Journal and The New York Times both use WordPress. I use WordPress.I wanted to know if it’s still worth the time and effort to make your own site. He said it is. That’s how you break out…“We’re trying to revitalize the independent web,” Matt Mullenweg said. He’s 33 now. “It’s not like these big sites are going anywhere. They’re fantastic. I use all of them, but you want balance. You need your own site that belongs to you… like your own home on the Internet.”This is part of Matt’s code. Not WordPress’s “code.” Matt’s like a robot. I mean that as a compliment. There are many signs of this: language, ability, he’s very exact.

Done? While there’s not necessarily a “correct” answer here, it’s most likely you split the bugs into four clusters. The spiders in one cluster, the pair of snails in another, the butterflies and moth into one, and the trio of wasps and bees into one more.That wasn’t too bad, was it? You could probably do the same with twice as many bugs, right? If you had a bit of time to spare — or a passion for entomology — you could probably even do this same with a hundred bugs.For a machine though, grouping ten objects into however many meaningful clusters is no small task, thanks to a mind-bending branch of maths called combinatorics, which tells us that are 115,975 different possible ways you could have grouped those ten insects together. Had there been twenty bugs, there would have been over fifty trillion possible ways of clustering them.With a hundred bugs — there’d be many times more solutions than there are particles in the known universe. How many times more? By my calculation, approximately five hundred million billion billion times more. In fact, there are more than four million billion googol solutions (what’s a googol?). For just a hundred objects.Almost all of those solutions would be meaningless — yet from that unimaginable number of possible choices, you pretty quickly found one of the very few that clustered the bugs in a useful way.Us humans take it for granted how good we are categorizing and making sense of large volumes of data pretty quickly. Whether it’s a paragraph of text, or images on a screen, or a sequence of objects — humans are generally fairly efficient at making sense of whatever data the world throws at us.Given that a key aspect of developing A.I. and Machine Learning is getting machines to quickly make sense of large sets of input data, what shortcuts are there available? Here, you can read about three clustering algorithms that can machines can use to quickly make sense of large datasets. This is by no means an exhaustive list — there are other algorithms out there — but they represent a good place to start!You’ll find for each a quick summary of when you might use them, a brief overview of how they work, and a more detailed, step-by-step worked example. I believe it helps to understand an algorithm by actually carrying out yourself. If you’re really keen, you’ll find the best way to do this is with pen and paper. Go ahead — nobody will judge!

 

Interior Design

Folie comes with Visual Portfolio Builder like no other theme.

Predefined Layouts & 100+ Options to choose from.

Creativity is the key

Even core Calypso project team members had to get over our intimidation. None of us were strong JavaScript developers. But as each day passed our experience built, we made mistakes, we reviewed them, we fixed them, and we learned. Once we had the project moving, we set better examples for other engineers, and shared our knowledge across the company. Developers have increasingly adopted SSGs because they are faster, more secure, and less complex than WordPress, and they typically fit in nicely with version control platforms like GitHub and Bitbucket. David Walsh (edit — although the post is on David Walsh’s blog, it was actually written by Eduardo Bouças)There are several variations on the algorithm described here. The initial method of ‘seeding’ the clusters can be done in one of several ways. Here, we randomly assigned every player into a group, then calculated the group means. This causes the initial group means to tend towards being similar to one another, which ensures greater repeatability.An alternative is to seed the clusters with just one player each, then start assigning players to the nearest cluster. The returned clusters are more sensitive to the initial seeding step, reducing repeatability in highly variable datasets. However, this approach may reduce the number of iterations required to complete the algorithm, as the groups will take less time to diverge.An obvious limitation to K-means clustering is that you have to provide a priori assumptions about how many clusters you’re expecting to find. There are methods to assess the fit of a particular set of clusters. For example, the Within-Cluster Sum-of-Squares is a measure of the variance within each cluster. The ‘better’ the clusters, the lower the overall WCSS.

The structure of the dendrogram gives us insight into how our dataset is structured. In our example, we see two main branches, with Humpback Whale and Fin Whale on one side, and the Bottlenose Dolphin/Risso’s Dolphin and Pilot Whale/Killer Whale on the other.In evolutionary biology, much larger datasets with many more specimens and measurements are used in this way to infer taxonomic relationships between them. Outside of biology, hierarchical clustering has applications in Data Mining and Machine Learning contexts.

The cool thing is that this approach requires no assumptions about the number of clusters you’re looking for. You can split the returned dendrogram into clusters by “cutting” the tree at a given height. This height can be chosen in a number of ways, depending on the resolution at which you wish to cluster the data.

 

Finish

So, sure. Maybe you buy into the notion that the future is static. And maybe you also believe it’s going to be mighty difficult for WordPress to wean those self-identified WordPress developers onto something that’s not PHP, but the WordPress team is not not dumb, and they’re not standing still. With the release of Calypso, they completely re-wrote the admin portal for WordPress using NodeJS and ReactJS whose output is of course, STATIC!In other words, the WordPress team has started to disrupt itself and has moved to JavaScript and GitHub version control. The challenge is going to be convincing hundreds of thousands of PHP developers to follow suit.

The structure of the dendrogram gives us insight into how our dataset is structured. In our example, we see two main branches, with Humpback Whale and Fin Whale on one side, and the Bottlenose Dolphin/Risso’s Dolphin and Pilot Whale/Killer Whale on the other.In evolutionary biology, much larger datasets with many more specimens and measurements are used in this way to infer taxonomic relationships between them. Outside of biology, hierarchical clustering has applications in Data Mining and Machine Learning contexts.

The cool thing is that this approach requires no assumptions about the number of clusters you’re looking for. You can split the returned dendrogram into clusters by “cutting” the tree at a given height. This height can be chosen in a number of ways, depending on the resolution at which you wish to cluster the data.

Recent News