
Posts

Showing posts from December, 2019

Complete Data Science Pipeline

Data Science is not just modelling. To extract value from Data Science, the model needs to be integrated with the business and deployed as a product that users can access. Building a Data Science product involves several steps. In this article, I will discuss the complete Data Science pipeline. Steps involved in building a Data Science product: Understanding the Business Problem, Data Collection, Data Cleaning, Exploratory Data Analysis, Modelling, and Deployment. Let us discuss each step in detail. Understanding the business problem: We use Data Science to solve a problem. Without understanding the problem, we can’t apply Data Science to solve it. Understanding the business is very important in building a Data Science product. The model we build depends entirely on the problem we are solving. If the requirement changes, we need to adjust our algorithm so that it solves the new problem. For example, if we are build

Buzz words every Data Scientist should know

We hear a lot of buzz words related to AI and Data Science, and we often use them interchangeably. This happens when we lack clarity about what each term means. In this blog, I will explain some of the popular buzz words that every Data Scientist should understand. Artificial Intelligence: This is the ability of a machine to understand and complete tasks without any human intervention. We can say that we have achieved AI only when we cannot distinguish between a machine and a human. Currently, we are building software and machines that can outperform humans in specific tasks. If machines can outperform humans in all the tasks that a human can perform, then we can say that Artificial Intelligence has been achieved. Machine Learning: Machine Learning is a subfield of Artificial Intelligence. Using Machine Learning, machines can automatically learn from historical data and improve with experience without being explicitly programmed. We use statistical methods to develop robust models using past data. There are m

Feature Engineering for Data Science

Feature Engineering is one of the most important steps in solving a Data Science problem. It helps us create a robust, high-performance model. This is why Data Scientists spend time understanding the data and preprocessing it before diving into modelling. In this article, I will explain some feature engineering steps that I have learned from the course “How to Win a Data Science Competition: Learn from Top Kagglers”. If you have time, I recommend taking this course. Feature Engineering mainly involves feature preprocessing and feature generation. Most real-world data is noisy. We cannot expect all the features to be numerical. The data may also contain strings, timestamp features, etc. It is suggested to convert all the features into numerical and categorical features (some models can handle categorical features too). Using the existing features, we can also create new features based on domain knowledge. They might contribute to predicting the targ
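As a rough illustration of the preprocessing idea in this excerpt (turning strings and timestamps into numerical or categorical features), here is a minimal sketch in plain Python. The rows, the column names `city` and `signup`, and the helper functions `encode_labels` and `timestamp_features` are all hypothetical and are not taken from the article or the course; real projects would more likely use a library such as pandas or scikit-learn.

```python
from datetime import datetime


def encode_labels(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def timestamp_features(ts, fmt="%Y-%m-%d %H:%M:%S"):
    """Split a timestamp string into simple numerical features."""
    dt = datetime.strptime(ts, fmt)
    return {
        "year": dt.year,
        "month": dt.month,
        "day_of_week": dt.weekday(),  # Monday is 0, Sunday is 6
        "hour": dt.hour,
    }


# Hypothetical raw rows, invented purely for this illustration.
rows = [
    {"city": "Pune", "signup": "2019-12-02 09:30:00"},
    {"city": "Delhi", "signup": "2019-12-07 18:05:00"},
    {"city": "Pune", "signup": "2019-12-25 23:59:00"},
]

# Categorical string -> integer code.
city_codes, city_mapping = encode_labels([row["city"] for row in rows])

# Timestamp string -> several numerical features, merged with the city code.
features = [
    {"city_code": code, **timestamp_features(row["signup"])}
    for code, row in zip(city_codes, rows)
]
```

Integer label codes like these imply an ordering that does not exist between cities, so they tend to suit tree-based models better than linear ones; one-hot encoding is a common alternative.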

Time Series Forecasting using Artificial Neural Networks

In today’s world, everyone is excited to know what will happen in the future. Researchers have been developing sophisticated algorithms to forecast weather, earthquakes, stock prices and more. Nowadays, forecasting is used in almost every domain. In this article, I will explain how we can do time series forecasting using Neural Networks. Artificial Neural Networks have gained popularity in recent years as an alternative technique for time series forecasting. Neural Networks are universal approximators. They can detect non-linear patterns and hidden information in the data. They are data-driven and self-adaptive in nature. Unlike statistical methods, they do not require us to make statistical assumptions about the data. With statistical methods like ARIMA, we have to make sure that the time series is stationary. If the time series has seasonality, we have to use SARIMA, an extended version of ARIMA. However, we need not worry about that if we are using Neural Networks for forecasting. But,
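One common way to feed a time series to a neural network is to reframe it as a supervised learning problem with a sliding window: the previous few values become the inputs and the next value becomes the target. The sketch below shows only that framing step, in plain Python. The function name `make_windows`, the parameter `n_lags`, and the sample series are assumptions made for illustration; the excerpt does not say which framing the article itself uses.

```python
def make_windows(series, n_lags):
    """Turn a 1-D series into (inputs, targets) pairs for supervised learning.

    Each input is a list of n_lags consecutive values and the matching
    target is the value that immediately follows them.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for i in range(n_lags, len(series)):
        inputs.append(series[i - n_lags:i])
        targets.append(series[i])
    return inputs, targets


# Hypothetical series, invented purely for this illustration.
series = [10, 12, 13, 15, 18, 21]
X, y = make_windows(series, n_lags=3)
```

Each row of `X` paired with the matching entry of `y` could then serve as one training example for a feed-forward network. The window length is a modelling choice that usually needs tuning.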