Machine Learning
Does probability really help businesses?
In this post, we will start from the basics and cover what statistics really means intuitively. Later on, we will see examples where probability is actually used to make profitable business decisions. Let's get started. What is Statistics? Statistics is a branch of mathematics that deals with data collection, organization, analysis, interpretation, and presentation. There are two kinds of statistical calculation: Descriptive or Summary Statistics - where we apply techniques to get…
Introduction to Bokeh Server
In the previous post, Data Visualization using the Bokeh package in Python, we learned about generating powerful visualizations using the Bokeh package in Python. If you have not gone through this post or don't know about Bokeh, I highly recommend you go through the post before proceeding further. We will continue our discussion on Bokeh, and we will see how to build powerful applications using the Bokeh package. Advantages of Bokeh Server Visualization/reports can be viewed by a large…
Random Forests Explained in detail
In my previous post, I talked about the Decision Tree and explained in detail how it works. In this post, we will be talking about the Random Forest, which is a collection of Decision Trees. A random forest usually generalizes better than a single decision tree, which is one reason why it is among the most popular machine learning algorithms. Random forests use a technique known as bagging, which is an ensemble method. Before diving into the random forest itself, let's spend some time…
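As a rough illustration of the bagging idea mentioned in this excerpt, here is a minimal sketch of its two ingredients: drawing bootstrap samples (sampling rows with replacement) and combining the individual predictions by majority vote. The helper names `bootstrap_sample` and `majority_vote` are hypothetical and are not taken from the post; a real random forest also trains a decision tree on each sample and randomizes the features considered at each split.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement (one bootstrap sample)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the label predicted most often across the ensemble."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    data = list(range(10))  # stand-in for training rows
    print(bootstrap_sample(data, rng))
    print(majority_vote(["spam", "ham", "spam"]))
```

Each tree in the ensemble would be trained on its own bootstrap sample, and the forest's answer for a new data point would be the majority vote over the trees.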
Unsupervised Learning k-means clustering algorithm in Python
Here we are back again to discuss Unsupervised Learning. We will be discussing the k-means clustering algorithm to solve the Unsupervised Learning problem. In the previous posts, k-nearest neighbor algorithm for supervised learning in Python and Regression model in Machine Learning using Python, we saw techniques to solve Supervised Learning problems, where we had labeled data. In this post, we will be looking at techniques to solve problems related to unlabelled data (i.e. the…
Regression model in Machine Learning using Python
In the previous article, k-nearest neighbor algorithm for supervised learning in Python, we explored supervised learning and saw how the k-nearest neighbor algorithm solves classification problems. In this post, we will explore algorithms used to solve regression problems (where the target variable is continuous). Let's get started. Linear Regression is a technique where we try to draw a line in order to…
Machine Learning: Cleaning data
In the previous post, we saw how to import data from different types of file sources using various packages available in Python. If you haven't gone through Importing data using Python, I recommend going through it before stepping into this post. Once we have imported data into a dataset, it's very important to understand the data before proceeding further. It's important to understand the attributes (columns) of the dataset provided. Whenever we obtain a new dataset, our first task is to do…
Hierarchical clustering algorithm in Python
In the previous post, Unsupervised Learning k-means clustering algorithm in Python, we discussed the K Means clustering algorithm. We learned how to solve an Unsupervised Learning problem using the K Means clustering algorithm. If you don't know about the K Means clustering algorithm or have limited knowledge of it, I recommend going through that post. In this post, we will be looking at the Hierarchical Clustering algorithm, which is also used to solve Unsupervised Learning problems. As…
Logistic Regression Detailed Explanation
Logistic regression is a binary classification model, i.e. it helps make predictions in cases where the output is a categorical variable. We cannot simply draw a straight line to classify data points into two classes, so we use an S-shaped curve known as the sigmoid curve. The sigmoid function is represented as: $$\frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$ As we know, Linear Regression is represented as: $$h_\theta(x) = w^Tx$$ and the Logistic regression is represented as \(h_\theta(x) = g(w^Tx…
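To make the sigmoid formula in this excerpt concrete, here is a minimal sketch in plain Python. The function names, the example coefficients, and the 0.5 decision threshold are illustrative assumptions and are not taken from the post.

```python
import math


def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, beta_0, beta_1):
    """Probability of the positive class for a single feature value x."""
    return sigmoid(beta_0 + beta_1 * x)


def predict_label(x, beta_0, beta_1, threshold=0.5):
    """Turn the probability into a 0/1 label (the threshold is an assumption)."""
    return 1 if predict_proba(x, beta_0, beta_1) >= threshold else 0


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only for illustration.
    beta_0, beta_1 = -1.0, 1.0
    for x in (0.0, 1.0, 2.0):
        print(x, predict_proba(x, beta_0, beta_1), predict_label(x, beta_0, beta_1))
```

The linear part `beta_0 + beta_1 * x` can be any real number; the sigmoid maps it to a value between 0 and 1 that can be read as a probability. This simple form can overflow for very large negative inputs, so a production implementation would use a numerically stable variant.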
Limitations of the Linear Regression
In the previous post, we discussed Simple Linear Regression in detail. I recommend going through that post for a detailed understanding of Simple Linear Regression. Linear Regression makes a few assumptions in order to find the best-fit line. NOTE: These assumptions hold true for both Simple and Multiple Linear Regression. Let's understand the assumptions of Linear Regression and discuss them in detail. Limitations of the Linear Regression We cannot apply linear…