CSV files are widely used to store tabular data. Data can easily be exported from database tables or Excel files to CSV, and the format is easy for both people and programs to read. In this tutorial, we will learn how to parse CSV files in Python.
Parsing a file means reading the data from it and interpreting its structure. The file may be a plain text file or a spreadsheet.
CSV stands for Comma-Separated Values: each record is a line of text, and the fields within a record are separated from each other by commas. CSV files are often created by programs that handle large amounts of data. Data can easily be exported to CSV from spreadsheets and databases, and imported from CSV by other programs.
Parsing CSV files in Python is quite easy. Python has a built-in csv module that provides functionality for both reading data from and writing data to CSV files. The module supports several CSV formats, which it calls dialects, and this makes it easier to process files produced by different programs.
Reading CSV files using the inbuilt Python CSV module.
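import csv

# newline='' is recommended by the csv module so that it can handle line endings itself
with open('university_records.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file)
    # Each row is returned as a list of strings
    for row in reader:
        print(row)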
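When the first row of a file holds column names, csv.DictReader can map each row to a dictionary keyed by those names, which is often easier to read than positional indexes. The sketch below is only illustrative: the column names and sample rows are assumptions, since the real contents of university_records.csv are not shown, and it writes a small temporary file so that it can run on its own.

```python
import csv
import os
import tempfile

# Hypothetical sample data; the header names are assumptions for illustration.
SAMPLE = "name,branch,year,cgpa\nDavid,MCE,3,7.8\nLisa,PIE,3,9.1\n"


def read_records(path):
    """Return every data row of a CSV file as a dict keyed by the header row."""
    # newline='' lets the csv module handle line endings itself.
    with open(path, "r", newline="") as csv_file:
        return list(csv.DictReader(csv_file))


# Write the sample to a temporary file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_records.csv")
    with open(sample_path, "w", newline="") as f:
        f.write(SAMPLE)
    records = read_records(sample_path)

# Fields are accessed by column name; all values are still strings.
for record in records:
    print(record["name"], record["cgpa"])
```

Note that the csv module does not convert types, so a numeric column such as the hypothetical cgpa has to be converted with float() before doing arithmetic on it.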
For writing a file, we have to open it in write mode or append mode. Here, we will append the data to the existing CSV file.
import csv

row = ['David', 'MCE', '3', '7.8']
row1 = ['Lisa', 'PIE', '3', '9.1']
row2 = ['Raymond', 'ECE', '2', '8.5']

# 'a' appends to the existing file; newline='' avoids blank lines between rows on Windows
with open('university_records.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(row)
    writer.writerow(row1)
    writer.writerow(row2)
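The difference between write mode and append mode can be seen in a short sketch. It is a minimal illustration rather than part of the original example: the file name records_demo.csv and the header row are assumptions, and the file is created in a temporary directory so that nothing on disk is overwritten.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    """Write rows to a CSV file; mode 'w' replaces the file, 'a' adds to its end."""
    with open(path, mode, newline="") as csv_file:
        csv.writer(csv_file).writerows(rows)


def read_rows(path):
    """Read the whole file back as a list of lists of strings."""
    with open(path, "r", newline="") as csv_file:
        return list(csv.reader(csv_file))


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "records_demo.csv")  # hypothetical file name

    # 'w' creates the file, or truncates it if it already exists.
    write_rows(demo_path, [["name", "branch", "year", "cgpa"]], "w")
    # 'a' keeps what is already there and adds the new rows at the end.
    write_rows(demo_path, [["David", "MCE", "3", "7.8"], ["Lisa", "PIE", "3", "9.1"]], "a")
    after_append = read_rows(demo_path)

    # Opening with 'w' again discards the earlier contents.
    write_rows(demo_path, [["Raymond", "ECE", "2", "8.5"]], "w")
    after_overwrite = read_rows(demo_path)

print(after_append)
print(after_overwrite)
```

writerows takes a list of rows and writes them in one call, which is equivalent to calling writerow once per row.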
Another way to work with CSV files, and a very popular one for data analysis, is the pandas library. pandas is a Python data analysis library. It offers data structures, tools, and operations for working with and manipulating data, mostly in the form of one-dimensional or two-dimensional tables.
To work with CSV files this way, you need to install pandas. Installing pandas is quite simple; use the command below to install it with pip.
$ pip install pandas
Once the installation is complete, you are good to go.
Before you can use pandas to import your CSV data, you need to know where the data file is in your filesystem and what your current working directory is. I suggest keeping your code and the data file in the same directory, so that you can refer to the file by name alone without specifying a path.
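If the data file cannot live next to the code, one option is to build the path explicitly. The following is a minimal sketch using the standard pathlib module; the helper name resolve_data_path is hypothetical, and it only reports where a file would be looked up, without assuming that the file exists.

```python
from pathlib import Path


def resolve_data_path(file_name, base_dir=None):
    """Build an absolute path to a data file.

    base_dir defaults to the current working directory, which is where
    a bare file name such as 'ign.csv' would be looked up anyway.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / file_name).resolve()


csv_path = resolve_data_path("ign.csv")
print("Working directory:", Path.cwd())
print("Looking for:", csv_path)
print("File exists:", csv_path.exists())
```

The resulting Path object can be passed to pandas.read_csv in place of a bare file name.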
import pandas
result = pandas.read_csv('ign.csv')
print(result)
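Printing a whole DataFrame is rarely the best way to check what was loaded. The sketch below shows a few common inspection calls. Because the contents of ign.csv are not shown in this tutorial, it reads a small in-memory stand-in whose column names and values are purely hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in data; the real contents of ign.csv are not shown here.
SAMPLE = io.StringIO(
    "title,platform,score\n"
    "Game A,PC,8.5\n"
    "Game B,Console,7.0\n"
    "Game C,PC,9.1\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(SAMPLE)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the header row
print(df.dtypes)         # pandas infers a type for each column
print(df.head(2))        # first two rows only
```

Unlike the csv module, pandas infers column types while reading, so the hypothetical score column arrives as floating-point numbers rather than strings.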
Writing CSV files using pandas is as simple as reading them. The only new term used is DataFrame. A pandas DataFrame is a two-dimensional, heterogeneous tabular data structure: data is arranged in rows and columns, and both axes are labeled. A DataFrame consists of three main components: data, rows, and columns.
from pandas import DataFrame

C = {'Programming language': ['Python', 'Java', 'C++'],
     'Designed by': ['Guido van Rossum', 'James Gosling', 'Bjarne Stroustrup'],
     'Appeared': ['1991', '1995', '1985'],
     'Extension': ['.py', '.java', '.cpp'],
     }
df = DataFrame(C, columns=['Programming language', 'Designed by', 'Appeared', 'Extension'])
# index=False leaves the row index out of the file; header=True writes the column names
df.to_csv('program_lang.csv', index=False, header=True)
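To see what the index argument changes, it can help to look at the generated text directly. This is a small sketch using two of the columns from the example above; it relies on the fact that to_csv returns the CSV text as a string when no file path is given.

```python
import pandas as pd

df = pd.DataFrame({
    "Programming language": ["Python", "Java"],
    "Extension": [".py", ".java"],
})

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index)     # each line starts with the row index; its header cell is empty
print(without_index)  # only the actual columns are written
```

Leaving the index out is usually what you want when the file will be read by other programs, since the extra unnamed column would otherwise show up as data.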
We learned to parse a CSV file using the built-in csv module and the pandas library. There are other ways to parse files, but they are not widely used for CSV; libraries such as PlyPlus, PLY, and ANTLR are general-purpose tools for parsing text data. The code shown above is basic and straightforward, and should be understandable to anyone familiar with Python. However, manipulating complex data with empty or ambiguous entries is not as easy. It requires practice and knowledge of the various tools in pandas. CSV is a simple and widely supported way of saving and sharing tabular data, and pandas is an excellent alternative to the csv module. You may find it difficult in the beginning, but with a little bit of practice, you will master it.
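As a starting point for the empty entries mentioned above, here is a minimal sketch of how pandas reports and handles missing values. The sample records and the UNKNOWN placeholder are hypothetical, and dropping or filling are only two of several possible strategies.

```python
import io

import pandas as pd

# Hypothetical records with two empty fields.
SAMPLE = io.StringIO(
    "name,branch,cgpa\n"
    "David,MCE,7.8\n"
    "Lisa,,9.1\n"
    "Raymond,ECE,\n"
)

df = pd.read_csv(SAMPLE)

# Empty fields are read as missing values; count them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Two common responses: drop incomplete rows, or fill the gaps with a placeholder.
complete_rows = df.dropna()
filled = df.fillna({"branch": "UNKNOWN"})
print(complete_rows)
print(filled)
```

Both dropna and fillna return new DataFrames and leave the original unchanged, so the choice can be revisited later without reloading the file.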
Java and Python Developer for 20+ years, Open Source Enthusiast, Founder of https://www.askpython.com/, https://www.linuxfordevices.com/, and JournalDev.com (acquired by DigitalOcean). Passionate about writing technical articles and sharing knowledge with others. Love Java, Python, Unix and related technologies. Follow my X @PankajWebDev
Nice tutorial. In the first example of reading a CSV file, the code tries to close the file while also using a with statement. The with statement closes the resource for you, so you don't need to.
- Ankit Rana