by Team InDeepData
Full form of CSV: Comma Separated Values
CSV files store tabular data as plain text, with the values in each row separated by commas. We mainly use CSV files to exchange and transfer data between applications.
In Python, we can read them with or without Pandas.
To download the dataset, please visit here.
How to read CSV file with Pandas
# Import the Pandas library
import pandas as pd
# Load the dataset file. Pandas has a read_csv() function that loads .csv files
file_path = "iris.csv"
data = pd.read_csv(file_path)
# Let's see the first five data points
print(data.head(5))
# Output: the first 5 rows of the dataset as a DataFrame, similar to:
#    sepal.length  sepal.width  petal.length  petal.width variety
# 0           5.1          3.5           1.4          0.2  Setosa
# 1           4.9          3.0           1.4          0.2  Setosa
# 2           4.7          3.2           1.3          0.2  Setosa
# 3           4.6          3.1           1.5          0.2  Setosa
# 4           5.0          3.6           1.4          0.2  Setosa
How to read CSV file without using the Pandas library
Python's built-in csv library can be used to read CSV files without using Pandas.
Here we use the reader() function to read the data from the file.
# Import the reader function from the csv library
from csv import reader
# Read the CSV file
def load_csv(filename):
    # Open the file in read mode; the with block closes it automatically
    with open(filename, "r", newline="") as file:
        # Parse each line of the file into a list of strings
        lines = reader(file)
        # Convert the parsed rows into a list
        data = list(lines)
    return data
if __name__ == "__main__":
    # Path of the dataset
    file_path = "iris.csv"
    data = load_csv(file_path)
    # Let's print the header row and the first 5 data points
    for row in data[:6]:
        print(row)
# Output: You will get the header row and the first 5 rows of the dataset as given below.
# ['sepal.length', 'sepal.width', 'petal.length', 'petal.width', 'variety']
# ['5.1', '3.5', '1.4', '.2', 'Setosa']
# ['4.9', '3', '1.4', '.2', 'Setosa']
# ['4.7', '3.2', '1.3', '.2', 'Setosa']
# ['4.6', '3.1', '1.5', '.2', 'Setosa']
# ['5', '3.6', '1.4', '.2', 'Setosa']
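Note that reader() returns every field as a string, as the quoted values in the output above show. If you need numbers, you have to convert them yourself. Below is a minimal sketch of one way to do that. The SAMPLE text and the to_floats() helper are made up for illustration, and the sketch assumes the first four columns are numeric, as they are in the rows shown above.

```python
import io
from csv import reader

# Hypothetical in-memory sample that mimics the first rows of iris.csv
SAMPLE = """sepal.length,sepal.width,petal.length,petal.width,variety
5.1,3.5,1.4,.2,Setosa
4.9,3,1.4,.2,Setosa
"""

def to_floats(rows, numeric_columns=4):
    """Split off the header and convert the first `numeric_columns` fields of each row to float."""
    header, *records = rows
    converted = [
        [float(value) for value in record[:numeric_columns]] + record[numeric_columns:]
        for record in records
    ]
    return header, converted

# io.StringIO lets reader() treat the string like an open file
rows = list(reader(io.StringIO(SAMPLE)))
header, records = to_floats(rows)
print(header)
print(records[0])  # [5.1, 3.5, 1.4, 0.2, 'Setosa']
```

With a real file, you could pass the list returned by load_csv() to the same helper.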
Although load_csv() is a helpful function, it has a limitation: it doesn't handle empty rows. For every blank line in the file, reader() returns an empty list, and that empty list ends up in the data.
We can solve this problem by building the list row by row and skipping the empty rows, as in the code snippet below.
Let's see:
from csv import reader
# Load the CSV file
def load_csv(filename):
    data = list()
    # Open the file in read mode; the with block closes it automatically
    with open(filename, "r", newline="") as file:
        # Parse each line of the file into a list of strings
        csv_reader = reader(file)
        for row in csv_reader:
            # Skip empty rows (blank lines in the file)
            if not row:
                continue
            data.append(row)
    return data
if __name__ == "__main__":
    # Path of the dataset
    filename = "iris.csv"
    data = load_csv(filename)
    # Let's print the header row and the first 5 data points
    for row in data[:6]:
        print(row)
# Output:
# ['sepal.length', 'sepal.width', 'petal.length', 'petal.width', 'variety']
# ['5.1', '3.5', '1.4', '.2', 'Setosa']
# ['4.9', '3', '1.4', '.2', 'Setosa']
# ['4.7', '3.2', '1.3', '.2', 'Setosa']
# ['4.6', '3.1', '1.5', '.2', 'Setosa']
# ['5', '3.6', '1.4', '.2', 'Setosa']
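The iris output looks the same with or without the check, so here is a small sketch that makes the effect of `if not row` visible. It uses a made-up in-memory sample (SAMPLE is hypothetical, not part of the dataset) that contains blank lines, and compares the rows with and without the filter.

```python
import io
from csv import reader

# Hypothetical sample with one blank line in the middle and one at the end
SAMPLE = "a,b\n1,2\n\n3,4\n\n"

# Without the check: each blank line becomes an empty list
all_rows = list(reader(io.StringIO(SAMPLE)))

# With the check: empty lists are skipped
non_empty_rows = [row for row in reader(io.StringIO(SAMPLE)) if row]

print(all_rows)        # [['a', 'b'], ['1', '2'], [], ['3', '4'], []]
print(non_empty_rows)  # [['a', 'b'], ['1', '2'], ['3', '4']]
```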
Thanks 🙌 for visiting InDeepData.