In this post we will dive into Pandas data input and output.
This section will teach you how to read and write data to and from a variety of file types, including CSV, Excel, SQL, HTML, Parquet, and JSON. You'll also learn how to work with data from other sources, such as databases and websites.
Welcome to the Pandas data input and output section! Here we'll look at how to read and write data to and from several file formats.
By the end of this section, you will have a solid grasp of how to work with data in Pandas and will be able to import and export data in a number of formats.
Most Common Data Formats
"Comma Separated Values" is abbreviated as CSV. It is a file format for storing tabular data in plain text. Each line of a CSV file represents a row, and the values inside a row are separated by commas. As a result, it is a simple yet effective format for storing and exchanging data. The first line of a CSV file frequently contains the column headers, which identify the fields in the data. Numerous programmes, including Microsoft Excel and Google Sheets, and many programming languages, including Python, support CSV. CSV is also a popular format for exchanging data between systems.
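To make this concrete, here is a minimal sketch of parsing a small CSV document straight from a string (the column names are invented for the example):

```python
import io
import pandas as pd

# A small CSV document: the first line holds the column headers,
# each following line is one row of data.
csv_text = """name,age,city
Alice,30,London
Bob,25,Paris"""

# io.StringIO lets read_csv treat the string like a file on disk
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)           # two rows, three columns
print(list(df.columns))   # headers taken from the first line
```

The same `pd.read_csv()` call works identically with a file path instead of a `StringIO` buffer.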
Excel is a spreadsheet programme created by Microsoft. It is used to create and handle several forms of data, including numbers, text, and formulae. Excel files end in ".xls" or ".xlsx", and they hold data in a tabular format, akin to a table in a relational database. Each sheet in an Excel workbook represents a table, and each cell in a sheet represents a field. Excel has a plethora of built-in data manipulation and analysis capabilities, such as sorting, filtering, and graphing.
Pandas supports reading and writing JSON data using the pd.read_json() function and the DataFrame.to_json() method, respectively. This enables you to interact with JSON data in Python and conduct different data manipulation and analysis operations with ease.
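As a quick sketch, a JSON array of objects can be round-tripped through a DataFrame like this (the field names are invented for the example):

```python
import io
import pandas as pd

# A small JSON document in "records" orientation: a list of objects,
# one object per row.
json_text = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'

# Wrap the string in StringIO; recent pandas versions expect
# file-like input rather than a raw JSON string.
df = pd.read_json(io.StringIO(json_text))

# Serialise back to JSON, one object per row
print(df.to_json(orient="records"))
```

The `orient` parameter controls the JSON layout; `"records"` is convenient when each row should become one object.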
Parquet is a columnar storage format for big data. It is intended to facilitate the storing and retrieval of huge and complicated data collections. Because it stores data by column rather than by row, Parquet is especially well-suited for storing huge data sets with complicated schemas that are utilised for analytics.
One of Parquet's primary advantages is its ability to compress and encode data in order to decrease the amount of disc space required to store it. This increases storage efficiency and accelerates data retrieval. Furthermore, Parquet supports a variety of encoding techniques, including RLE, DICT, and PLAIN, which may be utilised to maximise storage and retrieval efficiency.
Many big data tools and platforms, including Apache Hadoop, Apache Spark, and Apache Impala, support Parquet. Many data processing frameworks, including Pandas, support it as well.
The pd.read_parquet() function and the DataFrame.to_parquet() method in Pandas allow you to read and write Parquet data. This enables you to deal with Parquet data and conduct different data manipulation and analysis activities using Python.
Pandas Data Input and Output
Pandas is a sophisticated Python library that lets you read and write data to and from a wide range of file types and data sources. Here is a list of some of the file types and data sources that Pandas can read and write:
- Pandas Input Data Types:
- CSV (Comma Separated Values) using pd.read_csv()
- Excel using pd.read_excel()
- SQL using pd.read_sql()
- JSON using pd.read_json()
- HTML using pd.read_html()
- SAS using pd.read_sas()
- STATA using pd.read_stata()
- HDF5 using pd.read_hdf()
- Pickle using pd.read_pickle()
- SQLite using pd.read_sql() with an SQLite connection
- Parquet using pd.read_parquet()
- and many more.
- Pandas Output Data Types:
- CSV using DataFrame.to_csv()
- Excel using DataFrame.to_excel()
- SQL using DataFrame.to_sql()
- JSON using DataFrame.to_json()
- HTML using DataFrame.to_html()
- STATA using DataFrame.to_stata()
- HDF5 using DataFrame.to_hdf()
- Pickle using DataFrame.to_pickle()
- Parquet using DataFrame.to_parquet()
- and many more.
Assume we have the following input CSV data:
ID,Name,Age
1,AAA,10
2,BBB,20
3,CCC,30
4,DDD,40
5,EEE,50
Pandas CSV Tutorial
In the following example we will read data from a CSV file, perform some data manipulation, and then save it back to CSV format.
import pandas as pd

# Read CSV file
df = pd.read_csv('data/data.csv')
print(df.columns)

# Perform data manipulation
df['new_column'] = df['ID'] + df['Age']

# Write CSV file
df.to_csv('data_modified.csv', index=False)
print(df)
The result is:
Index(['ID', 'Name', 'Age'], dtype='object')
   ID Name  Age  new_column
0   1  AAA   10          11
1   2  BBB   20          22
2   3  CCC   30          33
3   4  DDD   40          44
4   5  EEE   50          55
Pandas Excel Tutorial
import pandas as pd

# Read Excel file
df = pd.read_excel('data.xlsx')

# Perform data manipulation
df['new_column'] = df['column1'] + df['column2']

# Pandas Write Excel
df.to_excel('data_modified.xlsx', index=False)
Pandas SQL Tutorial
In the following example we will read data from an SQLite database file, perform some data manipulation, and then write the result back to the database.
import pandas as pd
import sqlite3

# Connect to SQLite database
conn = sqlite3.connect('data.db')

# Read SQL query
df = pd.read_sql('SELECT * FROM data', conn)

# Perform data manipulation
df['new_column'] = df['column1'] + df['column2']

# Write the result back to a new table
df.to_sql('data_modified', conn, index=False)
conn.close()
Pandas JSON Tutorial
In the following example we will read data from a JSON file, perform some data manipulation, and then save it back to JSON format.
import pandas as pd

# Read JSON file
df = pd.read_json('data.json')

# Perform data manipulation
df['new_column'] = df['column1'] + df['column2']

# Write JSON file (orient='records' writes one object per row;
# to_json does not accept index=False with the default orient)
df.to_json('data_modified.json', orient='records')
Pandas HTML Tutorial
In the following example we will read a table from an HTML file, perform some data manipulation, and then save it back to HTML format.
import pandas as pd

# read_html returns a list of DataFrames, one per table found,
# so take the first table
df = pd.read_html('data.html')[0]

# Perform data manipulation
df['new_column'] = df['column1'] + df['column2']

# Write HTML table
df.to_html('data_modified.html')
Pandas SAS Tutorial
In the following example we will read data from a SAS file and perform some data manipulation. Note that Pandas can read SAS files but has no writer for them, so we save the result to CSV instead.
import pandas as pd

# Read SAS file
df = pd.read_sas('data.sas7bdat')

# Perform data manipulation
df['new_column'] = df['column1'] + df['column2']

# Pandas has no to_sas() writer, so save the result as CSV instead
df.to_csv('data_modified.csv', index=False)
Pandas is a robust Python library that lets you read and write data from a wide range of file formats and data sources, including CSV, Excel, SQL, JSON, HTML, SAS, STATA, HDF5, Pickle, SQLite, Parquet, and many others. To read a file, use functions like pd.read_csv(), pd.read_excel(), pd.read_json(), and so on. To write a file, use methods like to_csv(), to_excel(), to_json(), and so on. Additionally, you may manipulate the data before saving it to a new file, for example by adding new columns or filtering rows.