deltagradient: Automating PDF Generation and Manipulation with Python

PDF is a common format for reports, invoices, and other documents. Python libraries such as reportlab, PyPDF2, and pdfplumber can generate PDFs, edit them, and extract their text.

Installing Required Libraries

pip install reportlab PyPDF2 pdfplumber

reportlab – Creates PDFs from scratch with text, images, tables, and charts.
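PyPDF2 – Merges, splits, and extracts text from PDFs. (The project is now developed under the name pypdf; the examples here use the PyPDF2 package as installed above.)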
pdfplumber – Extracts structured data from PDFs.
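
Before running the examples, it can help to confirm that the three packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is an illustrative choice, not part of any of these libraries.

```python
from importlib.util import find_spec

# Import names of the libraries used in this tutorial.
REQUIRED = ["reportlab", "PyPDF2", "pdfplumber"]


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, rerun the pip command above in the same environment that runs your scripts.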

Creating a PDF Using `reportlab`

from reportlab.pdfgen import canvas

# Create a new PDF
pdf = canvas.Canvas("output.pdf")

# Add text
pdf.setFont("Helvetica", 14)
pdf.drawString(100, 750, "Automated PDF Report")
pdf.drawString(100, 730, "Generated using Python")

# Save the PDF
pdf.save()

print("PDF created successfully.")
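
reportlab measures coordinates in points from the bottom-left corner of the page, so each new line of text needs a smaller y value. The example above does this by hand (750, then 730). A minimal sketch of the same idea for any number of lines follows; `line_positions` and `write_lines` are hypothetical helper names, and the sketch assumes all lines fit on a single page.

```python
def line_positions(n_lines, top=750, leading=20):
    """Return the y-coordinate for each line, moving down the page from `top`."""
    return [top - i * leading for i in range(n_lines)]


def write_lines(path, lines, x=100):
    """Draw each string on its own line (assumes everything fits on one page)."""
    # Imported here so line_positions() can be used without reportlab installed.
    from reportlab.pdfgen import canvas

    pdf = canvas.Canvas(path)
    pdf.setFont("Helvetica", 14)
    for text, y in zip(lines, line_positions(len(lines))):
        pdf.drawString(x, y, text)
    pdf.save()
```

Calling `write_lines("output.pdf", ["Automated PDF Report", "Generated using Python"])` would place the two strings at the same positions as the example above.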

Adding Images to a PDF

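from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# Create a letter-sized PDF
pdf = canvas.Canvas("pdf_with_image.pdf", pagesize=letter)

pdf.drawString(100, 750, "PDF with Image")
# Draw image.png with its lower-left corner at (100, 500)
pdf.drawImage("image.png", 100, 500, width=200, height=100)

pdf.save()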
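
Passing a fixed width and height stretches the image if its proportions differ from the box. One way to avoid this is to compute a size that fits the box while keeping the aspect ratio. The sketch below is plain arithmetic; `fit_within` is a hypothetical helper name.

```python
def fit_within(img_width, img_height, max_width, max_height):
    """Scale (img_width, img_height) to fit the box without changing the aspect ratio."""
    scale = min(max_width / img_width, max_height / img_height)
    return img_width * scale, img_height * scale


# A 400x400 image placed in the 200x100 box used above becomes 100x100.
width, height = fit_within(400, 400, 200, 100)
print(width, height)
```

The resulting values can be passed as `width` and `height` to `drawImage`. `drawImage` also accepts `preserveAspectRatio=True`, which may be simpler when you do not need the computed size yourself.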

Creating Tables in a PDF

from reportlab.platypus import SimpleDocTemplate, Table, TableStyle
from reportlab.lib import colors

# Create PDF document
pdf = SimpleDocTemplate("table_report.pdf")

# Table data
data = [["Product", "Sales", "Revenue"],
        ["Laptop", "100", "$50,000"],
        ["Phone", "200", "$30,000"]]

# Create table
table = Table(data)

# Add styles
style = TableStyle([
    ("BACKGROUND", (0, 0), (-1, 0), colors.grey),
    ("TEXTCOLOR", (0, 0), (-1, 0), colors.whitesmoke),
    ("ALIGN", (0, 0), (-1, -1), "CENTER"),
    ("FONTNAME", (0, 0), (-1, 0), "Helvetica-Bold"),
    ("BOTTOMPADDING", (0, 0), (-1, 0), 10),
    ("GRID", (0, 0), (-1, -1), 1, colors.black),
])

table.setStyle(style)

# Build PDF
pdf.build([table])

print("PDF with table created.")
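
In practice the table data usually comes from records rather than a hard-coded list. A minimal sketch of converting a list of dictionaries into the list-of-rows shape that `Table` expects follows; `rows_from_records` and the sample records are illustrative assumptions.

```python
def rows_from_records(records, columns):
    """Build Table-ready rows: a header row followed by one row per record."""
    header = list(columns)
    body = [[str(record.get(col, "")) for col in columns] for record in records]
    return [header] + body


# Hypothetical sales records, mirroring the hard-coded table above.
sales = [
    {"Product": "Laptop", "Sales": 100, "Revenue": "$50,000"},
    {"Product": "Phone", "Sales": 200, "Revenue": "$30,000"},
]
data = rows_from_records(sales, ["Product", "Sales", "Revenue"])
for row in data:
    print(row)
```

The resulting `data` can be passed to `Table(data)` exactly as in the example above. Missing keys become empty cells rather than raising an error.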

Merging Multiple PDFs Using `PyPDF2`

from PyPDF2 import PdfMerger

pdfs = ["file1.pdf", "file2.pdf"]
merger = PdfMerger()

for pdf in pdfs:
    merger.append(pdf)

merger.write("merged.pdf")
merger.close()

print("PDFs merged successfully.")
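
Instead of listing the files by hand, the inputs can be gathered from a folder. The sketch below assumes that file-name order is the desired merge order; `collect_pdfs` and `merge_directory` are hypothetical helper names.

```python
from pathlib import Path


def collect_pdfs(directory):
    """Return the PDF files directly inside `directory`, sorted by file name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def merge_directory(directory, output_path):
    """Merge every PDF in `directory` into `output_path`, in file-name order."""
    # Imported here so collect_pdfs() can be used without PyPDF2 installed.
    from PyPDF2 import PdfMerger

    merger = PdfMerger()
    for path in collect_pdfs(directory):
        merger.append(str(path))
    merger.write(str(output_path))
    merger.close()
```

Write the merged file to a different folder than the inputs. Otherwise a second run would pick up the previous output as one of the files to merge.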

Splitting a PDF File

from PyPDF2 import PdfReader, PdfWriter

reader = PdfReader("large.pdf")
writer = PdfWriter()

# Extract the first 3 pages (or fewer if the document is shorter)
for i in range(min(3, len(reader.pages))):
    writer.add_page(reader.pages[i])

# Save the extracted pages
with open("split.pdf", "wb") as output_pdf:
    writer.write(output_pdf)

print("PDF split successfully.")
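
The same approach extends to splitting a whole document into fixed-size parts. A minimal sketch under that assumption follows; `page_chunks`, `split_every`, and the `part_N.pdf` naming scheme are illustrative choices.

```python
def page_chunks(total_pages, chunk_size):
    """Return (start, stop) page-index pairs covering all pages; stop is exclusive."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_every(path, chunk_size, prefix="part"):
    """Write consecutive chunks of `path` to part_1.pdf, part_2.pdf, and so on."""
    # Imported here so page_chunks() can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(path)
    chunks = page_chunks(len(reader.pages), chunk_size)
    for number, (start, stop) in enumerate(chunks, 1):
        writer = PdfWriter()
        for index in range(start, stop):
            writer.add_page(reader.pages[index])
        with open(f"{prefix}_{number}.pdf", "wb") as output_pdf:
            writer.write(output_pdf)
```

For a 7-page document and a chunk size of 3, `page_chunks` yields three ranges, with the last part holding the single remaining page.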

Extracting Text from a PDF Using `PyPDF2`

from PyPDF2 import PdfReader

reader = PdfReader("document.pdf")

# Extract text from the first page
page = reader.pages[0]
text = page.extract_text()

print("Extracted text:", text)
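
To read a whole document rather than one page, loop over `reader.pages`. The sketch below skips pages that yield no text, which is typical for scanned pages that contain only images. `extract_all_text` is a hypothetical helper name, and it works with any object that exposes `pages` whose items have `extract_text()`.

```python
def extract_all_text(reader, separator="\n\n"):
    """Join the text of every page; pages with no extractable text are skipped."""
    texts = []
    for page in reader.pages:
        text = page.extract_text() or ""  # treat None or "" as "no text"
        if text.strip():
            texts.append(text.strip())
    return separator.join(texts)
```

With PyPDF2 this would be called as `extract_all_text(PdfReader("document.pdf"))`. Extraction quality depends on how the PDF was produced, so the result may need cleanup.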

Extracting Tables from a PDF Using `pdfplumber`

import pdfplumber

with pdfplumber.open("table_document.pdf") as pdf:
    page = pdf.pages[0]
    # extract_table() returns None when no table is found on the page
    table = page.extract_table() or []

    for row in table:
        print(row)
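
The extracted table is a list of rows, each a list of cell values, where an empty cell may be `None`. A minimal sketch of turning that into a list of dictionaries keyed by the header row follows; it assumes the first row is the header, and `table_to_records` is a hypothetical helper name.

```python
def table_to_records(table):
    """Turn a list of rows (first row = header) into a list of dicts."""
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, cells)))
    return records


# Hypothetical rows in the shape extract_table() returns (None marks an empty cell).
sample = [["Product", "Sales"], ["Laptop", "100"], ["Phone", None]]
print(table_to_records(sample))
```

Records in this form are easier to filter or to write out with `csv.DictWriter` than raw rows.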

Automating PDF Report Generation and Emailing

# Requires: pip install yagmail
import yagmail

# For Gmail, use an app password rather than your account password
yag = yagmail.SMTP("your_email@gmail.com", "your_password")

# Send the PDF report via email
yag.send(
    to="recipient@example.com",
    subject="Automated PDF Report",
    contents="Please find the attached PDF report.",
    attachments="output.pdf"
)

print("PDF report emailed successfully.")
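
Hard-coding a password in a script is risky if the file is shared or committed. One common alternative is to read the credentials from environment variables. The sketch below assumes two variables, `REPORT_EMAIL_USER` and `REPORT_EMAIL_PASSWORD`; both names, and the helpers `load_credentials` and `send_report`, are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Read the sender address and password from environment variables.

    REPORT_EMAIL_USER and REPORT_EMAIL_PASSWORD are hypothetical names.
    """
    user = env.get("REPORT_EMAIL_USER")
    password = env.get("REPORT_EMAIL_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set REPORT_EMAIL_USER and REPORT_EMAIL_PASSWORD first.")
    return user, password


def send_report(path, recipient):
    """Email the PDF at `path` using the credentials from the environment."""
    # Imported here so load_credentials() can be used without yagmail installed.
    import yagmail

    user, password = load_credentials()
    yag = yagmail.SMTP(user, password)
    yag.send(
        to=recipient,
        subject="Automated PDF Report",
        contents="Please find the attached PDF report.",
        attachments=path,
    )
```

With the variables set, `send_report("output.pdf", "recipient@example.com")` would send the same message as the example above.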

Conclusion

This section covered automating PDF generation and manipulation, including creating PDFs, adding images and tables, merging and splitting PDFs, extracting text, and emailing PDF reports. These techniques are useful for automating document workflows.
