Automating Excel Files with Python
Excel is widely used for data storage and analysis, making it a common target for automation. Python provides libraries such as openpyxl
and pandas
to read, write, and manipulate Excel files programmatically.
Installing Required Libraries
To work with Excel in Python, install the necessary libraries using pip
:
pip install openpyxl pandas
openpyxl
: Works with.xlsx
files (Excel 2007+).pandas
: Handles Excel data efficiently for data analysis.
Reading Excel Files
Loading an Excel File
import openpyxl
# Load an existing Excel file
workbook = openpyxl.load_workbook("data.xlsx")
# Display available sheets
print(workbook.sheetnames)
Accessing a Specific Sheet
sheet = workbook["Sheet1"]
# Get the value of a specific cell
cell_value = sheet["A1"].value
print(f"Value in A1: {cell_value}")
Reading Data from Multiple Cells
for row in sheet.iter_rows(min_row=1, max_row=5, values_only=True):
print(row)
Writing to Excel Files
Creating a New Excel File and Writing Data
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.title = "Sales Data"
# Writing data to cells
sheet["A1"] = "Product"
sheet["B1"] = "Price"
sheet["A2"] = "Laptop"
sheet["B2"] = 1200
# Save the workbook
workbook.save("sales.xlsx")
Appending Data to an Existing Excel File
workbook = openpyxl.load_workbook("sales.xlsx")
sheet = workbook.active
# Append new row
sheet.append(["Phone", 800])
# Save the updated file
workbook.save("sales.xlsx")
Modifying Excel Files
Updating Cell Values
workbook = openpyxl.load_workbook("sales.xlsx")
sheet = workbook.active
# Update cell value
sheet["B2"] = 1300 # Changing Laptop price
workbook.save("sales.xlsx")
Formatting Cells
from openpyxl.styles import Font, Color
workbook = openpyxl.load_workbook("sales.xlsx")
sheet = workbook.active
# Apply bold font to header
header_font = Font(bold=True)
for cell in sheet[1]: # First row (header)
cell.font = header_font
workbook.save("sales.xlsx")
Working with Multiple Sheets
Creating a New Sheet
workbook.create_sheet(title="Inventory")
workbook.save("sales.xlsx")
Copying Data Between Sheets
source_sheet = workbook["Sales Data"]
destination_sheet = workbook["Inventory"]
for row in source_sheet.iter_rows(values_only=True):
destination_sheet.append(row)
workbook.save("sales.xlsx")
Deleting Data and Sheets
Deleting a Sheet
workbook.remove(workbook["Inventory"])
workbook.save("sales.xlsx")
Deleting Rows and Columns
sheet.delete_rows(2) # Deletes row 2
sheet.delete_cols(2) # Deletes column B
workbook.save("sales.xlsx")
Converting Excel to CSV and Vice Versa
Excel to CSV
import pandas as pd
df = pd.read_excel("sales.xlsx")
df.to_csv("sales.csv", index=False)
CSV to Excel
df = pd.read_csv("sales.csv")
df.to_excel("sales.xlsx", index=False)
Automating Excel Reports
This script reads an Excel file, filters data, and creates a summary report.
import pandas as pd
# Load data
df = pd.read_excel("sales.xlsx")
# Filter products with price above $1000
filtered_df = df[df["Price"] > 1000]
# Save report
filtered_df.to_excel("high_value_sales.xlsx", index=False)
print("Report generated successfully.")
Conclusion
This section covered automating Excel tasks, including reading, writing, modifying, formatting, and converting Excel files. These techniques are useful for generating reports, processing large datasets, and automating data entry tasks.
Would you like additional examples or modifications?