Pandas in machine learning

Hello Shouters!! Today we will learn how to use pandas in machine learning.

In the earlier blog, we learned how to work with Google Colab. The most important part of any machine learning algorithm is the dataset, and that is why we connected our Google Drive to Google Colab. If you don’t know about Google Colab, I recommend going through the introduction to Google Colab first.

In this blog, we will learn how to use your dataset in Google Colab with pandas. If you know nothing about machine learning, I suggest you first read the blog on a practical approach to machine learning.

Pandas is a Python library used to load and manipulate the data in a dataset.

To do this, we first have to import the pandas library. In Google Colab, we can run each cell (where we write code) separately. For example, we can write import pandas in a cell and run it to see whether it raises an error.

We can run the cell either by clicking the play button or by pressing Ctrl+Enter. The next piece of code can go either in the same cell or in a new one; the choice is entirely ours.

Steps to follow-

We use import pandas as pd, where the ‘as’ keyword gives pandas the alias pd, so we can write pd instead of pandas every time, which makes coding a bit easier.
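To make the aliasing concrete, here is a minimal sketch you could run in a cell. It only shows that pd and pandas refer to the same module; the variable name same_module is just an illustration.

```python
import pandas          # plain import: the library is available as "pandas"
import pandas as pd    # aliased import: the same library is also available as "pd"

# Both names point to the very same module object.
same_module = pd is pandas
print(same_module)

# Any pandas attribute can now be reached through the shorter name.
print(pd.__version__)
```

If the cell runs without an error, pandas is installed and ready to use under the name pd.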

Next, upload the dataset to your Google Drive; you can get the dataset from here. It contains data on air flights from one place to another.

After the pandas library has been imported, the next step is to use its functions to do some data manipulation. First, we declare a variable named path and assign it the path of our dataset. To get the path, click on the drive in the left sidebar, then go to the folder where you uploaded your dataset. Now right-click on the dataset and select copy path.

Now paste the copied path into the path variable. The next step is to read the data with pandas. We load the whole dataset into a single variable, called a data frame, which we name ‘df’, and then display the first few rows using the ‘head’ function (five rows by default).
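The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the dataset is taken to be a CSV file, and because the real Drive path is specific to your account, the sketch writes a tiny stand-in file with made-up columns (origin, destination, passengers) so that it runs anywhere. In Colab, path would simply be the string you copied from the sidebar.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the real dataset: a tiny CSV with hypothetical flight columns.
# In Colab, skip this part and set path to the path copied from the Drive sidebar.
sample_csv = (
    "origin,destination,passengers\n"
    "DEL,BOM,180\n"
    "BOM,BLR,150\n"
    "BLR,DEL,165\n"
)
path = os.path.join(tempfile.mkdtemp(), "flights_sample.csv")
with open(path, "w") as f:
    f.write(sample_csv)

# Read the file into a data frame named df (assumes the dataset is a CSV file).
df = pd.read_csv(path)

# head() returns the first five rows by default; pass a number to change that.
print(df.head())
```

If your dataset is in another format, such as an Excel sheet, pandas has other readers like pd.read_excel that are used the same way.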

This is one of the basic functions of pandas; there are many more, which we will cover in our next blog.

Frequently Asked Questions-

Why do we need pandas?

We use pandas to load a dataset into a data frame and to manipulate the data in it.

How can we traverse the data frame?

The data frame can be traversed by using functions in the pandas library or by simply writing traversal code in Python, such as a for loop.
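As a small sketch of both approaches, the example below builds a tiny data frame with made-up flight columns (not the real dataset), walks through it row by row with the pandas iterrows function, and then computes a total with a column operation instead of a loop.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {
        "origin": ["DEL", "BOM", "BLR"],
        "destination": ["BOM", "BLR", "DEL"],
        "passengers": [180, 150, 165],
    }
)

# Approach 1: traverse row by row with a plain Python for loop.
# iterrows() yields (index, row) pairs, where row is a pandas Series.
routes = []
for index, row in df.iterrows():
    routes.append(f"{row['origin']}->{row['destination']}")
print(routes)

# Approach 2: let pandas work on the whole column at once.
total_passengers = int(df["passengers"].sum())
print(total_passengers)
```

For large datasets, the column-based approach is usually faster than looping over rows, so the loop is best kept for cases where each row really needs separate handling.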

