
Posts

Showing posts from March, 2020

Google Colab - Increase RAM up to 25GB

Google Colab is a free Jupyter notebook environment hosted on Google's cloud servers. It lets you write and execute code directly in your browser, with free access to CPUs, GPUs and TPUs. Using Google Cloud or AWS directly requires a lot of setup: you have to spin up a cluster, create a notebook and only then start working. Google Colab, in contrast, is ready to use right away, and you can install libraries from within the notebook itself. These notebooks are very useful for training large models and processing huge datasets, especially for students and developers who cannot easily afford their own GPUs and TPUs. I was trying to run a memory-heavy job when the notebook crashed. That is how I found out how to increase the RAM, so I thought I would share it on my blog. The notebooks come with some constraints: you can run them for no more than 12 hours at a time, and you can use only 12 GB of RAM. There is no direct method or button t
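To see how much RAM a notebook session actually has, a quick check like the one below can help. This is only a minimal sketch: it assumes a Linux-style runtime where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`, and the helper name `total_ram_gb` is made up for illustration. It reports the memory of whatever machine it runs on and does not change any limit.

```python
import os


def total_ram_gb():
    """Return total physical RAM in GB, or None if the platform does not expose it."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")    # bytes per memory page
        page_count = os.sysconf("SC_PHYS_PAGES")  # number of physical pages
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are not available on every platform
        return None
    return page_size * page_count / 1024 ** 3


ram = total_ram_gb()
print("RAM not reported" if ram is None else f"Total RAM: {ram:.1f} GB")
```

Running this in a notebook cell before a memory-heavy job gives a rough idea of whether the job is likely to fit.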

Hierarchical clustering algorithm

Clustering is a technique for grouping objects so that similar objects end up in the same cluster and dissimilar objects end up in different clusters. It is widely used in industry to solve problems. This article is organized as follows: Introduction; Divisive and Agglomerative clustering; Agglomerative clustering; K means vs Hierarchical clustering. Introduction: Hierarchical clustering is one of the most popular clustering algorithms. It builds a hierarchy of clusters. There are two methods of hierarchical clustering, Divisive and Agglomerative. Please refer to the image below to get a sense of how hierarchical clusters look. [Image: Hierarchical clustering] Divisive and Agglomerative clustering: The divisive method is a top-down clustering method in which we first assign all the data points to a single cluster. We divide that single cluster into two, and we go on dividing the sub-clusters until we rea
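The top-down idea described above can be sketched in a few lines of Python. This is an illustrative simplification, not the article's own implementation: it works on 1-D points only, splits a cluster at its widest gap, and stops once a requested number of clusters is reached. The splitting rule, the stopping rule, and the names `split_at_widest_gap`, `divisive_clustering` and `num_clusters` are all assumptions made for the example.

```python
def split_at_widest_gap(cluster):
    """Split a sorted 1-D cluster in two at the largest gap between neighbours."""
    gaps = [cluster[i + 1] - cluster[i] for i in range(len(cluster) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return cluster[:cut], cluster[cut:]


def divisive_clustering(points, num_clusters):
    """Top-down clustering of 1-D points (assumed stopping rule: num_clusters reached)."""
    clusters = [sorted(points)]  # start with every point in a single cluster
    while len(clusters) < num_clusters:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break  # only single-point clusters remain
        # Assumption: always split the cluster with the largest spread.
        widest = max(splittable, key=lambda c: c[-1] - c[0])
        clusters.remove(widest)
        clusters.extend(split_at_widest_gap(widest))
    return sorted(clusters)


points = [1.0, 1.2, 0.8, 5.0, 5.3, 9.9, 10.1]
result = divisive_clustering(points, 3)
print(result)  # [[0.8, 1.0, 1.2], [5.0, 5.3], [9.9, 10.1]]
```

Agglomerative clustering runs in the opposite direction: it would start from single-point clusters and repeatedly merge the closest pair.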

K means clustering algorithm

Clustering is a technique for grouping objects so that similar objects end up in the same cluster and dissimilar objects end up in different clusters. It is widely used in industry to solve problems. For example, if we have a lot of documents and want to group them by domain, we can use clustering to put similar documents together. Here is a real-life example. You have 10 apples, 10 oranges and 10 bananas, all mixed together into a pile of 30 fruits. You want to separate them into apples, oranges and bananas. What would you do? Based on color and shape, you would recognize each fruit and easily separate them. In other words, you used features such as the shape and color of the fruit to separate them. Similarly, we create features for documents and put two documents in the same cluster if they are similar. This article assumes that you have already extracted the features from the objects and have them ready. In this article, I
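The fruit example can be written down as a small Python sketch. Each fruit is represented only by its features, and fruits with identical features are put in the same group. The feature values (such as "red" and "round") and the function name `group_by_features` are made up for illustration, and exact-match grouping is a stand-in for the idea of using features, not a clustering algorithm such as K means.

```python
import random
from collections import defaultdict

# Hypothetical (color, shape) features for 10 apples, 10 oranges and 10 bananas.
fruits = (
    [("red", "round")] * 10
    + [("orange", "round")] * 10
    + [("yellow", "long")] * 10
)
random.Random(0).shuffle(fruits)  # mix the fruits together


def group_by_features(items):
    """Group item positions by their feature tuple."""
    groups = defaultdict(list)
    for index, features in enumerate(items):
        groups[features].append(index)
    return dict(groups)


groups = group_by_features(fruits)
for features, members in groups.items():
    print(features, len(members))
```

Real features are rarely identical, which is why clustering algorithms compare objects by distance or similarity instead of exact equality.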