Machine Learning

Confused between Covariance and Correlation? Me too.
Covariance and Correlation are two key concepts in statistics, but what are they about? Most importantly, are they the same, or do they differ? Are they related to each other? What are the different types of correlation? These are the questions I had when I first started learning statistics. In this post, I have attempted to answer them. Let's go straight into it and try to understand Covariance and Correlation.…
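As a quick preview of the relationship the post unpacks, here is a minimal NumPy sketch (the toy data and variable names are my own, purely illustrative): covariance measures how two series move together, and dividing it by both standard deviations gives the unit-free correlation.

```python
import numpy as np

# Two toy series (hypothetical data, purely for illustration)
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.5, 3.9, 6.2, 7.8, 10.1])

# Covariance: average product of deviations from each mean
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Correlation: covariance scaled by both standard deviations,
# which bounds the value between -1 and 1
corr_xy = cov_xy / (x.std() * y.std())

print(cov_xy)                    # unbounded, depends on the units of x and y
print(corr_xy)                   # unit-free, close to 1 for this toy data
print(np.corrcoef(x, y)[0, 1])   # NumPy's built-in Pearson correlation, for comparison
```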

SQLAlchemy - ORM for Python
In this post, we will discuss the SQLAlchemy package, which is used to connect to relational databases such as SQLite, Postgres, MySQL, and many more. Some of you might wonder why we need a package to connect to a database when we can connect to it directly. SQLAlchemy is an ORM that cuts down our development effort and is very useful; you will understand the advantages of using it by the end of this post. An ORM (Object-Relational Mapper) is always useful and…
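To give a flavour of the ORM workflow the post describes, here is a minimal sketch (assuming SQLAlchemy 1.4+ and an in-memory SQLite database; the User class is my own example, not from the post). It shows how we work with Python objects while SQLAlchemy emits the SQL for us.

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

# Hypothetical mapped class: a Python object backed by a "users" table
class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

# In-memory SQLite keeps the example self-contained
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

# Work with Python objects; SQLAlchemy generates the INSERT and SELECT statements
session.add(User(name="alice"))
session.commit()

for user in session.query(User).all():
    print(user.id, user.name)
```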

XGBoost Detailed Explanation
In the previous post, Boosting Algorithms explained in detail, we discussed boosting algorithms and how they work. In this post, let's talk about one more popular algorithm in the boosting category: XGBoost. XGBoost (Extreme Gradient Boosting) is one of the most popular gradient boosting algorithms. It is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. NOTE: This gradient boosting tree algorithm is the same as the…
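For readers who want to see the library in action before reading on, here is a minimal sketch using XGBoost's scikit-learn-style API on a built-in dataset (the hyperparameters are illustrative, not the post's own settings).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier  # requires the xgboost package

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Gradient-boosted decision trees; illustrative hyperparameters only
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```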

Build flexible and accurate clusters with Gaussian Mixture Models
In the post Unsupervised Learning k-means clustering algorithm in Python, we discussed clustering and covered k-means, an unsupervised algorithm. In this post, we will work through the Gaussian Mixture Models (GMM) algorithm, another algorithm used to solve clustering problems. We will also talk about the limitations of the k-means algorithm and how GMM can help resolve them. Like k-means, GMM is also categorized as an unsupervised algorithm, but…
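As a small taste of what the post covers, here is a minimal scikit-learn sketch (synthetic data and illustrative settings, not the post's). Fitting a GMM with full covariance matrices lets clusters be elliptical and gives soft assignments, which is one way GMM is more flexible than k-means.

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic toy data with three clusters of different spreads
X, _ = make_blobs(n_samples=500, centers=3,
                  cluster_std=[1.0, 2.5, 0.5], random_state=0)

# Each component is a full-covariance Gaussian, so clusters can be elliptical
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X)

# Soft assignments: probability of each point belonging to each component
print(gmm.predict_proba(X[:5]).round(3))
print("hard labels for the first five points:", labels[:5])
```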

Lasso and Ridge Regression Detailed Explanation
In Linear Regression we saw that the complexity of the model is not controlled: Linear Regression only tries to minimize the error (e.g. MSE) and may produce arbitrarily complex coefficients. The model we develop should be as simple as possible, but not simpler. Regularization is a process used to create an optimally complex model, i.e. a model that is as simple as possible while still performing well on the training data. As we can see from the diagram shown above, our model…
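To make the idea concrete, here is a small scikit-learn sketch (the toy data, polynomial degree, and alpha values are my own, purely illustrative) comparing plain least squares with Ridge and Lasso on a deliberately over-complex polynomial model; the regularized fits keep the coefficients much smaller.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Noisy sine data: a degree-12 polynomial will overfit it badly
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=30)

for name, reg in [("ols", LinearRegression()),
                  ("ridge", Ridge(alpha=0.01)),
                  ("lasso", Lasso(alpha=0.001, max_iter=50000))]:
    # Same over-complex feature set for every model; only the penalty differs
    model = make_pipeline(PolynomialFeatures(degree=12), reg)
    model.fit(X, y)
    print(name, "largest |coefficient|:", np.abs(reg.coef_).max().round(2))
```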

Introduction to Bokeh Server
In the previous post, Data Visualization using the Bokeh package in Python, we learned about generating powerful visualizations using the Bokeh package in Python. If you have not gone through that post or don't know about Bokeh, I highly recommend reading it before proceeding further. We will continue our discussion of Bokeh and see how to build powerful applications using the Bokeh server. Advantages of the Bokeh Server: visualizations/reports can be viewed by a large…
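As a minimal illustration of the server workflow (the app below is hypothetical, not taken from the post): a Bokeh server app keeps Python callbacks alive on the server, so widgets can update plots with real Python code rather than JavaScript.

```python
# app.py -- run with: bokeh serve --show app.py
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Slider
from bokeh.plotting import figure

x = np.linspace(0, 10, 200)
plot = figure(title="Interactive sine wave")
line = plot.line(x, np.sin(x))

slider = Slider(start=1, end=10, value=1, step=1, title="Frequency")

def update(attr, old, new):
    # Callbacks run in Python on the Bokeh server, not in the browser
    line.data_source.data = {"x": x, "y": np.sin(new * x)}

slider.on_change("value", update)

# Attach the layout to the document served by `bokeh serve`
curdoc().add_root(column(slider, plot))
```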

Random Forests Explained in detail
In my previous post, I talked about Decision Trees and explained in detail how they work. In this post, we will talk about Random Forests, a random forest being a collection of decision trees. A random forest is almost always better than a single decision tree, which is why it is one of the most popular machine learning algorithms. Random forests use a technique known as bagging, which is an ensemble method. Before diving into random forests, let's spend some time…
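Here is a minimal scikit-learn sketch (dataset and settings are illustrative, not the post's) showing a random forest, i.e. an ensemble of decision trees each trained on a bootstrap sample of the data (bagging).

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each grown on a bootstrap sample; predictions are aggregated by voting
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("test accuracy:", accuracy_score(y_test, forest.predict(X_test)))
print("number of trees in the ensemble:", len(forest.estimators_))
```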

Unsupervised Learning k-means clustering algorithm in Python
Here we are, back again to discuss Unsupervised Learning. We will be discussing the k-means clustering algorithm to solve the unsupervised learning problem. In the previous posts, k-nearest neighbor algorithm for supervised learning in Python and Regression model in Machine Learning using Python, we saw techniques to solve supervised learning problems, where we had labeled data. In this post, we will look at techniques to solve problems involving unlabelled data (i.e. the…
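As a small preview, here is a minimal scikit-learn sketch (synthetic, unlabelled toy data) of k-means assigning points to clusters without being given any labels.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabelled toy data: we hand the algorithm only X, no targets
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# k-means alternates between assigning points to the nearest centroid
# and moving each centroid to the mean of its assigned points
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("cluster centers:\n", kmeans.cluster_centers_.round(2))
print("first ten assignments:", labels[:10])
```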

Regression model in Machine Learning using Python
In the previous article, k-nearest neighbor algorithm for supervised learning in Python, we explored supervised learning and saw the k-nearest neighbor algorithm for solving classification problems. In this post, we will explore algorithms used to solve regression problems (where the target variable is continuous). Let's get started. Linear Regression is a technique where we try to draw a line in order to…
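To preview the line-fitting idea, here is a minimal scikit-learn sketch with hypothetical one-feature data; LinearRegression finds the slope and intercept that minimize the squared error between the line and the observed points.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical one-feature data following roughly y = 3x + 5, plus noise
rng = np.random.RandomState(1)
X = rng.uniform(0, 10, size=(50, 1))
y = 3 * X.ravel() + 5 + rng.normal(scale=2.0, size=50)

# Fit the line y = slope * x + intercept that minimizes squared error
model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0].round(2), "intercept:", model.intercept_.round(2))
print("prediction at x=4:", model.predict([[4.0]])[0].round(2))
```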