Spire.Office Knowledgebase Page 16 | E-iceblue

Comprehensive Guide for Generating Barcodes in Python with Spire.Barcode

Barcodes are essential in inventory management, retail systems, logistics, and many other data-driven fields. For Python developers, generating barcodes in Python can be complex—especially when working with multiple formats or large-scale automation. That’s why a reliable Python barcode generator library is necessary to ensure flexible and efficient barcode creation, with support for various barcode types and batch processing.

This article provides an accurate and efficient approach to generating barcodes in Python using Spire.Barcode for Python.

Get Started with Spire.Barcode for Python

Why Choose Spire.Barcode?

Spire.Barcode for Python is a professional and user-friendly Python Barcode API designed for developers who want to add barcode generation or reading features to their Python applications. Here’s why it stands out:

  • Supports multiple barcode symbologies including Code 128, QR Code, EAN-13, UPC, and more
  • High-quality image output with complete customization settings
  • Comprehensive and easy-to-integrate API
  • No need for third-party dependencies
  • One library to generate and scan barcodes

Installation and Importing

To install Spire.Barcode for Python, simply run:

pip install spire.barcode

You can also install Free Spire.Barcode for Python for simple barcode generating tasks:

pip install spire.barcode.free

How to Generate Barcode in Python

To generate a barcode in Python, you typically need to define the barcode type (such as Code 128 or QR Code) and the content to encode. Using a library like Spire.Barcode, you can configure them in just a few lines of code.

Key Classes and Methods:

  • BarcodeSettings: Defines properties such as type, data, color, text, etc.
  • BarCodeGenerator: Generates barcode images based on settings.
  • GenerateImage(): Outputs barcode as an image stream.

Step 1: Import the Required Modules

To start coding your Python barcode generator, import the necessary classes.

  • Python
from spire.barcode import BarcodeSettings, BarCodeType, BarCodeGenerator, Code128SetMode, FontStyle, Color

Step2: Configure Barcode Settings

Create a BarcodeSettings object and define barcode properties:

  • Python
# Create a BarcodeSettings object
barcodeSettings = BarcodeSettings()
# Set the barcode type
barcodeSettings.Type = BarCodeType.Code128
# Set the barcode data
barcodeSettings.Data = "ABC123456789"
# Set the barcode code128 set mode
barcodeSettings.Code128SetMode = Code128SetMode.Auto
# Choose the data display position
barcodeSettings.ShowTextOnBottom = True
# Set the bottom text and style
barcodeSettings.BottomText = "Code 128 Example"
barcodeSettings.SetTextFont("Arial", 12.0, FontStyle.Regular)
barcodeSettings.ShowBottomText = True
# Set the background color
barcodeSettings.BackColor = Color.get_Beige()

Step 3: Generate the Barcode Image

Create a BarCodeGenerator object using the configured BarcodeSettings, then generate the barcode image as a stream and save it to a local file:

  • Python
# Create a BarCodeGenerator object
barcodeGenerator = BarCodeGenerator(barcodeSettings)
# Generate the barcode image
barcodeImage = barcodeGenerator.GenerateImage()
# Save the image
with open("output/Code 128.png", "wb") as fp:
fp.write(barcodeImage)

The generated barcode:

Python barcode generator Code 128 example using Spire.Barcode

This script allows you to generate a Code 128 barcode and save it as an image. Just replace the BarCodeType and Data value to customize.

Generating Other Barcode Types

Spire.Barcode for Python supports a wide range of barcode symbologies, including the most commonly used types such as Code 39, UPC, QR Code, and EAN-13. This ensures developers can generate barcodes for various applications with compatibility and ease.

Barcode Type Support Overview

Barcode Category Barcode Types (Examples) Free Version Paid Version
1D Linear Barcodes Codabar, Code11, Code25, Interleaved25, Code39, Code39Extended, Code93, Code93Extended, Code128, EAN8, EAN13, EAN128, EAN14, UPCA, UPCE, MSI, PostNet, Planet, SCC14, SSCC18, ITF14, ITF6, PZN, OPC ✅ (Partial) ✅ (All)
2D Barcodes QRCode, DataMatrix, Pdf417, Pdf417Macro, Aztec, MicroQR ✅ (QRCode only) ✅ (All)
Stacked/Composite Codes RSS14, RSS14Truncated, RSSLimited, RSSExpanded
Postal Barcodes USPS, SwissPostParcel, DeutschePostIdentcode, DeutschePostLeitcode, RoyalMail4State, SingaporePost4State

Advanced: Generate Barcodes in Bulk

You can easily generate multiple barcodes in bulk using Spire.Barcode for Python. This is especially useful for inventory management, batch labeling, or automated systems where each item requires a unique barcode.

Code Example

  • Python
# Create a list of barcode data
data = ["Barcode 1", "Barcode 2", "Barcode 3"]
# Loop through the data to generate barcodes
for barcode_data in data:
# Create a BarcodeSettings object
settings = BarcodeSettings()
# Set the barcode type and data
settings.Type = BarCodeType.Code39
settings.Data  = barcode_data
# Create a BarCodeGenerator object
generator = BarCodeGenerator(settings)
# Save the barcode image
barcode_stream = generator.GenerateImage()
with open(f"output/{barcode_data}.png", "wb") as file:
file.write(barcode_stream)

This Python script generates and saves a barcode image for each entry in the list, streamlining barcode creation workflow.

Conclusion

Generating barcodes in Python is simple and efficient with Spire.Barcode for Python. Whether you’re creating a single Code 128 barcode or automating batch QR code generation, this robust and flexible library gives you the control and quality needed for professional barcode integration. From supporting various symbologies to delivering high-quality output with minimal code, this Python barcode generator is an excellent tool for developers across industries.

Frequently Asked Questions

Q: How to create a barcode in Python?

You can create a barcode using libraries like Spire.Barcode for Python, which supports a variety of symbologies like Code 128, QR Code, and more. Just install the package, configure barcode settings, and save the output image.

Q: How is barcode generated?

Barcodes are generated by encoding data into a visual pattern of bars or modules. With Python, this is done through barcode libraries like Spire.Barcode, which translate string input into a corresponding image.

Q: How to create a code generator in Python?

If you're referring to a barcode generator, define the barcode type (e.g., Code 128), provide the data, and use a library like Spire.Barcode to generate and save the image. You can automate this process using loops and functions.

Q: How to generate QR code by Python?

You can use Spire.Barcode for Python to generate QR codes quickly and efficiently. Here's a full example that creates a QR code with embedded data:

  • Python
from spire.barcode import BarcodeSettings, BarCodeGenerator, BarCodeType

# Create a BarcodeSettings object
barcodeSettings = BarcodeSettings()
# Set the barcode type to QRCode
barcodeSettings.Type = BarCodeType.QRCode
# Set the barcode data
barcodeSettings.Data = "ABC123456"
# Generate the barcode
barcodeGenerator = BarCodeGenerator(barcodeSettings)
barcodeGenerator.GenerateImage()
with  open("output/QRCode.png", "wb") as f:
f.write(barcodeGenerator.GenerateImage())

Result:

Generate QR code in Python with Spire.Barcode library

This enables you to embed URLs, text, or IDs into scannable QR images.

See Also: How to Generate and Scan QR Codes with Python

Get a Free License

Spire.Barcode for Python offers a free trial license that removes limitations and watermarking. Get a free license today and explore the full capabilities of this powerful Python barcode library.

Visual guide for removing auto filters in Excel using Spire.XLS for .NET and C#

PDFs are widely used in reports, invoices, and digital forms due to their consistent formatting across platforms. However, their fixed layout makes editing difficult without specialized tools. For developers looking to edit PDF using Python, Spire.PDF for Python provides a comprehensive and easy-to-use solution. This Python PDF editor enables you to modify PDF files programmatically—changing text, replacing images, adding annotations, handling forms, and securing files—without relying on Adobe Acrobat or any external software.

In this article, we will explore how to use Spire.PDF for Python to programmatically edit PDFs in Python applications.

Why Use Python and Spire.PDF to Edit PDF Documents?

Python is a highly versatile programming language that provides an excellent platform for automating and managing PDF documents. When it comes to edit PDF Python tasks, Spire.PDF for Python stands out as a comprehensive and easy-to-use solution for all your PDF manipulation needs.

Benefits of Using Python for PDF Editing

  • Automation and Batch Processing: Streamline repetitive PDF editing tasks efficiently.
  • Cost-Effective: Reduce manual work, saving time and resources when you Python-edit PDF files.
  • Integration: Seamlessly incorporate PDF editing into existing Python-based systems and workflows.

Advantages of Spire.PDF for Python

Spire.PDF for Python is a standalone library that enables developers to create, read, edit, convert, and save PDF files without relying on external software. As a trusted Python PDF editor, it offers powerful features such as:

  • Text and Image Editing
  • Annotations and Bookmark Management
  • Form Field Handling
  • Security Settings (Encryption and Permissions)
  • Conversion to Word, Excel, HTML, and Images

To learn more about these specific features, visit the Spire.PDF for Python tutorials.

With its intuitive API design, Spire.PDF makes it easier than ever to edit PDF files in Python quickly and effectively, ensuring a smooth development experience.

Getting Started with Spire.PDF for Python

Installation:

To install Spire.PDF for Python, simply run the following pip command:

pip install spire.pdf

Alternatively, you can install Free Spire.PDF for Python, a free version suitable for small projects, by running:

pip install spire.pdf.free

You can also download the library manually from the links.

Basic Setup Example:

The following example demonstrates how to create a simple PDF using Spire.PDF for Python:

  • Python
from spire.pdf import PdfDocument, PdfFont, PdfBrushes, PdfFontFamily, PdfFontStyle

# Create a new PDF document
pdf = PdfDocument()
# Add a new page to the document
page = pdf.Pages.Add()
# Create a font
font = PdfFont(PdfFontFamily.TimesRoman, 28.0, PdfFontStyle.Bold)
# Create a brush
brush = PdfBrushes.get_Black()
# Draw the string using the font and brush
page.Canvas.DrawString("Hello, World", font, brush, 100.0, 100.0)
# Save the document
pdf.SaveToFile("output/NewPDF.pdf")
pdf.Close()

Result: The generated PDF displays the text "Hello, World" using Times Roman Bold.

PDF created using Spire.PDF for Python showing Hello World text

With Spire.PDF installed, you're now ready to edit PDFs using Python. The sections below explain how to manipulate structure, content, security, and metadata.

How to Edit an Existing PDF Using Spire.PDF for Python

Spire.PDF for Python provides a simple yet powerful way to edit PDF using Python. With its intuitive API, developers can automate a wide range of PDF editing tasks including modifying document structure, page content, security settings, and properties. This section outlines the core categories of editing and their typical use cases.

Edit PDF Pages and Structure with Python

Structure editing lets you manipulate PDF page order, merge files, or insert/delete pages—ideal for document assembly workflows.

  • Insert or Delete Pages

Use the Pages.Insert() and Pages.RemoveAt() methods of the PdfDocument class to insert or delete pages at specific positions.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfPageSize, PdfMargins, PdfPageRotateAngle

# Load a PDF document
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Insert and delete pages
# Insert at beginning
pdf.Pages.Insert(0, PdfPageSize.A4(), PdfMargins(50.0, 60.0), PdfPageRotateAngle.RotateAngle90)
# Delete second page
pdf.Pages.RemoveAt(1)

# Save the document
pdf.SaveToFile("output/InsertDeletePage.pdf")
pdf.Close()

Result:

PDF pages inserted and deleted with Python code using Spire.PDF

  • Merge Two PDF Files

The AppendPage() method allows you to combine PDFs by inserting pages from one document into another.

Code Example

  • Python
import os
from spire.pdf import PdfDocument

# Specify the PDF file path
pdfPath = "PDFs/"
# Read the PDF file names from the path and add them to a list
files = [pdfPath + file for file in os.listdir(pdfPath) if file.endswith(".pdf")]

# Load the first PDF file
pdf = PdfDocument()
pdf.LoadFromFile(files[0])
# Iterate through the other PDF files
for i in range(1, len(files)):
    # Load the current PDF file
    pdf2 = PdfDocument()
    pdf2.LoadFromFile(files[i])
    # Append the pages from the current PDF file to the first PDF file
    pdf.AppendPage(pdf2)

# Save the merged PDF file
pdf.SaveToFile("output/MergePDFs.pdf")
pdf.Close()

Result:

Merged PDF documents using Python and Spire.PDF

You may also like: Splitting PDF Files with Python Code

Edit PDF Content with Python

As a Python PDF editor, Spire.PDF supports a variety of content-level operations, including modifying text, images, annotations, and interactive forms.

  • Replace Text in a PDF

The PdfTextReplacer class can be used to find and replace text from a page. Note that precise replacement may require case and layout-aware handling.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfTextReplacer, ReplaceActionType, Color

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Iterate through the pages
for i in range(pdf.Pages.Count):
    page = pdf.Pages.get_Item(i)
    # Create a PdfTextReplacer object
    replacer = PdfTextReplacer(page)
    # Set the replacement options
    replacer.Options.ReplaceType = ReplaceActionType.IgnoreCase
    # Replace the text
    replacer.ReplaceAllText("drones", "ROBOTS", Color.get_Aqua()) # Setting the color is optional

# Save the merged PDF file
pdf.SaveToFile("output/ReplaceText.pdf")
pdf.Close()

Result:

Alt: Text replaced in a PDF file using Python with Spire.PDF

  • Replace Images in a PDF

Spire.PDF for Python provides the PdfImageHelper class to help you replace images in a PDF file with ease. By retrieving image information from a specific page, you can use the ReplaceImage() method to directly substitute the original image with a new one.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfImageHelper, PdfImage

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Get a page
page = pdf.Pages.get_Item(0)

# Create a PdfImageHelper instance
imageHelper = PdfImageHelper()
# Get the image info of the first image on the page
imageInfo = imageHelper.GetImagesInfo(page)[0]
# Load a new image
newImage = PdfImage.FromFile("Image.png")
# Replace the image
imageHelper.ReplaceImage(imageInfo, newImage)

# Save the PDF file
pdf.SaveToFile("output/ReplaceImage.pdf")
pdf.Close()

Result:

Image replacement in a PDF document using Spire.PDF for Python

  • Add Comments or Notes

To add comments or notes with Python, use the PdfTextMarkupAnnotation class and add it to the page’s AnnotationsWidget collection.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfTextFinder, PdfTextMarkupAnnotation, PdfRGBColor, Color

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("Sample.pdf")

# Get a page
page = pdf.Pages.get_Item(0)

#Create a PdfTextFinder instance and set the options
finder = PdfTextFinder(page)
finder.Options.Parameter.IgnoreCase = False
finder.Options.Parameter.WholeWord = True
# Find the text to comment
text = finder.Find("redefining entire industries")[0]

# Get the bound of the text
bound = text.Bounds[0]

# Add comment
commentText = ("This is a powerful expression, but a bit vague. "
                "You might consider specifying which industries are "
                "being redefined and how, to make the claim more "
                "concrete and credible.")
comment = PdfTextMarkupAnnotation("Commenter", commentText, bound)
comment.TextMarkupColor = PdfRGBColor(Color.get_Yellow())
page.AnnotationsWidget.Add(comment)

# Save the PDF file
pdf.SaveToFile("output/CommentNote.pdf")
pdf.Close()

Result:

Comment added to PDF using Python Spire.PDF annotations

  • Edit or Read Form Fields

Spire.PDF for Python allows you to programmatically fill out and read form fields in a PDF document. By accessing the FieldsWidget property of a PdfFormWidget object, you can iterate through all interactive form elements, such as text boxes, combo boxes, and checkboxes, and update or extract their values.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfFormWidget, PdfComboBoxWidgetFieldWidget, PdfCheckBoxWidgetFieldWidget, PdfTextBoxFieldWidget

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("EmployeeInformationForm.pdf")

forms = pdf.Form
formWidgets = PdfFormWidget(forms).FieldsWidget

# Fill the forms
for i in range(formWidgets.Count):
    formField = formWidgets.get_Item(i)
    if formField.Name == "FullName":
        textBox = PdfTextBoxFieldWidget(formField)
        textBox.Text = "Amanda Ray Thompson"
    elif formField.Name == "DateOfBirth":
        textBox = PdfTextBoxFieldWidget(formField)
        textBox.Text = "01/01/1980"
    elif formField.Name == "Gender":
        comboBox = PdfComboBoxWidgetFieldWidget(formField)
        comboBox.SelectedIndex  = [ 1 ]
    elif formField.Name == "Department":
        formField.Value = "Human Resources"
    elif formField.Name == "AgreeTerms":
        checkBox = PdfCheckBoxWidgetFieldWidget(formField)
        checkBox.Checked = True

# Read the forms
formValues = []

for i in range(formWidgets.Count):
    formField = formWidgets.get_Item(i)
    if isinstance(formField, PdfTextBoxFieldWidget):
        formValues.append(formField.Name + ": " + formField.Text)
    elif isinstance(formField, PdfComboBoxWidgetFieldWidget):
        formValues.append(formField.Name + ": " + formField.SelectedValue)
    elif isinstance(formField, PdfCheckBoxWidgetFieldWidget):
        formValues.append(formField.Name + ": " + str(formField.Checked))

# Write the form values to a file
with open("output/FormValues.txt", "w") as file:
    file.write("\n".join(formValues))

# Save the PDF file
pdf.SaveToFile("output/FilledForm.pdf")
pdf.Close()

Result:

PDF form fields filled and retrieved programmatically with Python and Spire.PDF

Explore more: How to Insert Page Numbers to PDF Using Python

Manage PDF Security with Python

PDF security editing is essential when dealing with sensitive documents. Spire.PDF supports encryption, password protection, digital signature handling, and permission settings.

  • Add a Password and Set Permissions

The Encrypt() method lets you secure a PDF with user/owner passwords and define allowed actions like printing or copying.

Code Example

  • Python
from spire.pdf import PdfDocument, PdfEncryptionAlgorithm, PdfDocumentPrivilege, PdfPasswordSecurityPolicy

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("EmployeeInformationForm.pdf")

# Create a PdfSecurityPolicy object and set the passwords and encryption algorithm
securityPolicy = PdfPasswordSecurityPolicy("userPSD", "ownerPSD")
securityPolicy.EncryptionAlgorithm = PdfEncryptionAlgorithm.AES_128

# Set the document privileges
pdfPrivileges = PdfDocumentPrivilege.ForbidAll()
pdfPrivileges.AllowPrint = True
pdfPrivileges.AllowFillFormFields  = True
# Apply the document privileges
securityPolicy.DocumentPrivilege = pdfPrivileges

# Encrypt the PDF with the security policy
pdf.Encrypt(securityPolicy)

# Save the PDF file
pdf.SaveToFile("output/EncryptedForm.pdf")
pdf.Close()

Result

Encrypted PDF file with password using Spire.PDF for Python

  • Remove the Password from a PDF

To open a protected file, provide the user password when calling LoadFromFile(), use Decrypt() to decrypt the document, and save it again unprotected.

Code Example

  • Python
from spire.pdf import PdfDocument

# Load the encrypted PDF file with the owner password
pdf = PdfDocument()
pdf.LoadFromFile("output/EncryptedForm.pdf", "ownerPSD")

# Decrypt the PDF file
pdf.Decrypt()

# Save the PDF file
pdf.SaveToFile("output/DecryptedForm.pdf")
pdf.Close()

Recommended for you: Use Python to Add and Remove Digital Signature in PDF

Edit PDF Properties with Python

Use Spire.PDF to read and edit PDF metadata and viewer preferences—key features for document presentation and organization.

  • Update Document Metadata

Update metadata such as title, author, or subject via the DocumentInformation property of the PDF document.

Code Example

  • Python
from spire.pdf import PdfDocument

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("EmployeeInformationForm.pdf")

# Set document metadata
pdf.DocumentInformation.Author = "John Doe"
pdf.DocumentInformation.Title = "Employee Information Form"
pdf.DocumentInformation.Producer  = "Spire.PDF"

# Save the PDF file
pdf.SaveToFile("output/EditProperties.pdf")
pdf.Close()

Result:

PDF metadata edited using Python Spire.PDF API

  • Set View Preferences

The ViewerPreferences property allows you to customize the viewing mode of a PDF (e.g., two-column layout).

Code Example

  • Python
from spire.pdf import PdfDocument, PdfPageLayout, PrintScalingMode

# Load the PDF file
pdf = PdfDocument()
pdf.LoadFromFile("EmployeeInformationForm.pdf")

# Set the viewer preferences
pdf.ViewerPreferences.DisplayTitle = True
pdf.ViewerPreferences.HideToolbar = True
pdf.ViewerPreferences.HideWindowUI = True
pdf.ViewerPreferences.FitWindow = False
pdf.ViewerPreferences.HideMenubar = True
pdf.ViewerPreferences.PrintScaling = PrintScalingMode.AppDefault
pdf.ViewerPreferences.PageLayout = PdfPageLayout.OneColumn

# Save the PDF file
pdf.SaveToFile("output/EditViewerPreference.pdf")
pdf.Close()

Result:

PDF viewer preferences set using Spire.PDF for Python

Similar topic: Change PDF Version Easily with Python Code

Conclusion

Editing PDFs using Python is both practical and efficient with Spire.PDF for Python. Whether you're building automation tools, editing digital forms, or securing sensitive reports, Spire.PDF equips you with a comprehensive suite of editing features—all accessible via clean and simple Python code.

With capabilities that span content editing, form interaction, document structuring, and security control, this Python PDF editor is a go-to solution for developers and organizations aiming to streamline their PDF workflows.

Frequently Asked Questions

Q: Can I edit a PDF using Python?

A: Yes, Python offers powerful libraries like Spire.PDF for Python that enable you to edit text, images, forms, annotations, and even security settings in a PDF file.

Q: How to edit a PDF using coding?

A: By using libraries such as Spire.PDF for Python, you can load an existing PDF, modify its content or structure programmatically, and save the changes with just a few lines of code.

Q: What is the Python library for PDF editor?

A: Spire.PDF for Python is a popular choice. It offers comprehensive functionalities for creating, reading, editing, converting, and securing PDF documents without the need for additional software.

Q: Can I modify a PDF for free?

A: Yes, you can use the free edition of Spire.PDF for Python to edit PDF files, although it comes with some limitations, such as processing up to 10 pages per document. Additionally, you can apply for a 30-day temporary license that removes all limitations and watermarks for full functionality testing.

PDF documents may occasionally include blank pages. These pages can affect the reading experience, increase the file size and lead to paper waste during printing. To improve the professionalism and usability of a PDF document, detecting and removing blank pages is an essential step.

This article shows how to accurately detect and remove blank pages—including those that appear empty but actually contain invisible elements—using Python, Spire.PDF for Python, and Pillow.

Install Required Libraries

This tutorial requires two Python libraries:

  • Spire.PDF for Python: Used for loading PDFs and detecting/removing blank pages.
  • Pillow: A library for image processing that helps detect visually blank pages, which may contain invisible content.

You can easily install both libraries using pip:

pip install Spire.PDF Pillow

Need help installing Spire.PDF? Refer to this guide:

How to Install Spire.PDF for Python on Windows

How to Effectively Detect and Remove Blank Pages from PDF Files in Python

Spire.PDF provides a method called PdfPageBase.IsBlank() to check if a page is completely empty. However, some pages may appear blank but actually contain hidden content like white text, watermarks, or background images. These cannot be reliably detected using the PdfPageBase.IsBlank() method alone.

To ensure accuracy, this tutorial adopts a two-step detection strategy:

  • Use the PdfPageBase.IsBlank() method to identify and remove fully blank pages.
  • Convert non-blank pages to images and analyze them using Pillow to determine if they are visually blank.

⚠️ Important:

If you don’t use a valid license during the PDF-to-image conversion, an evaluation watermark will appear on the image, potentially affecting the blank page detection.

Contact the E-iceblue sales team to request a temporary license for proper functionality.

Steps to Detect and Remove Blank Pages from PDF in Python

Follow these steps to implement blank page detection and removal in Python:

1. Define a custom is_blank_image() Method

This custom function uses Pillow to check whether the converted image of a PDF page is blank (i.e., if all pixels are white).

2. Load the PDF Document

Load the PDF using the PdfDocument.LoadFromFile() method.

3. Iterate Through Pages

Loop through each page to check if it’s blank using two methods:

  • If the PdfPageBase.IsBlank() method returns True, remove the page directly.
  • If not, convert the page to an image using the PdfDocument.SaveAsImage() method and analyze it with the custom is_blank_image() method.

4. Save the Result PDF

Finally, save the PDF with blank pages removed using the PdfDocument.SaveToFile() method.

Code Example

  • Python
import io
from spire.pdf import PdfDocument
from PIL import Image

# Custom function: Check if the image is blank (whether all pixels are white)
def is_blank_image(image):
        # Convert to RGB mode and then get the pixels
        img = image.convert("RGB")
        # Get all pixel points and check if they are all white
        white_pixel = (255, 255, 255)
        return all(pixel == white_pixel for pixel in img.getdata())

# Load the PDF document
pdf = PdfDocument()
pdf.LoadFromFile("Sample1111.pdf")

# Iterate through each page in reverse order to avoid index issues when deleting
for i in range(pdf.Pages.Count - 1, -1, -1):
    page = pdf.Pages.get_Item(i)
    # Check if the current page is completely blank
    if page.IsBlank():
        # If it's completely blank, remove it directly from the document
        pdf.Pages.RemoveAt(i)
    else:
        # Convert the current page to an image
        with pdf.SaveAsImage(i) as image_data:
            image_bytes = image_data.ToArray()
            pil_image = Image.open(io.BytesIO(image_bytes))
            # Check if the image is blank
            if is_blank_image(pil_image):
                # If it's a blank image, remove the corresponding page from the document
                pdf.Pages.RemoveAt(i)

# Save the resulting PDF
pdf.SaveToFile("RemoveBlankPages.pdf")
pdf.Close()

Python Find and Remove Blank Pages from PDF

Frequently Asked Questions (FAQs)

Q1: What is considered a blank page in a PDF file?

A: A blank page may be truly empty or contain hidden elements such as white text, watermarks, or transparent objects. This solution detects both types using a dual-check strategy.

Q2: Can I use this method without a Spire.PDF license?

A: Yes, you can run it without a license. However, during PDF-to-image conversion, an evaluation watermark will be added to the output images, which may affect the accuracy of blank page detection. It's best to request a free temporary license for testing.

Q3: What versions of Python are compatible with Spire.PDF?

A: Spire.PDF for Python supports Python 3.7 and above. Ensure that Pillow is also installed to perform image-based blank page detection.

Q4: Can I modify the script to only detect blank pages without deleting them?

A: Absolutely. Just remove or comment out the pdf.Pages.RemoveAt(i) line and use print() or logging to list detected blank pages for further review.

Conclusion

Removing unnecessary blank pages from PDF files is an important step in optimizing documents for readability, file size, and professional presentation. With the combined power of Spire.PDF for Python and Pillow, developers can precisely identify both completely blank pages and pages that appear empty but contain invisible content. Whether you're generating reports, cleaning scanned files, or preparing documents for print, this Python-based solution ensures clean and efficient PDFs.

Get a Free License

To fully experience the capabilities of Spire.PDF for Python without any evaluation limitations, you can request a free 30-day trial license.

page 16