Getting the coordinates of text or images in a PDF helps accurately identify elements, making it easier to extract content. This is especially important for data analysis, where specific information needs to be pulled from complicated layouts. Additionally, knowing these coordinates allows users to add notes, marks, or stamps in the right places, improving document interactivity and collaboration by letting them highlight important sections or add comments exactly where they're needed.

In this article, you will learn how to get the coordinates of specified text or a specified image in a PDF document in Java using the Spire.PDF for Java library.

Install Spire.PDF for Java

First of all, you're required to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>12.6.4</version>
    </dependency>
</dependencies>

Coordinate System in Spire.PDF

When you use Spire.PDF for Java to work with an existing PDF document, note that the origin of the coordinate system is at the top-left corner of the page. The x-axis extends to the right, and the y-axis extends downward.
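The PDF format itself places the default origin at the bottom-left corner of the page, with the y-axis pointing up. A coordinate obtained with a top-left origin may therefore need to be flipped before it is handed to a tool that works in native PDF space. The sketch below shows only that arithmetic. The class and method names are hypothetical and not part of Spire.PDF, and the page height of 842 points (A4) is an assumed example value.

```java
import java.awt.geom.Point2D;

// Hypothetical helper, not part of Spire.PDF: converts between a top-left
// origin (y grows downward) and a bottom-left origin (y grows upward).
class PageCoordinates {

    // Flip a point vertically within a page of the given height.
    // The same formula works in both directions.
    static Point2D.Double flipY(Point2D point, double pageHeight) {
        return new Point2D.Double(point.getX(), pageHeight - point.getY());
    }

    public static void main(String[] args) {
        double pageHeight = 842; // assumed A4 page height in points
        Point2D.Double topLeftBased = new Point2D.Double(72, 100);
        Point2D.Double bottomLeftBased = flipY(topLeftBased, pageHeight);
        System.out.printf("(%.1f, %.1f) -> (%.1f, %.1f)%n",
                topLeftBased.getX(), topLeftBased.getY(),
                bottomLeftBased.getX(), bottomLeftBased.getY());
    }
}
```

Only the y value changes; the x value is the same in both systems.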

Get Coordinates of the Specified Text in PDF

To start, you can use the PdfTextFinder.find() method to search for all occurrences of the specified text on the page, which results in a list of PdfTextFragment. After that, you can retrieve the coordinates of the first occurrence of the text using the PdfTextFragment.getPositions() method.

The steps to get coordinates of the specified text in PDF are as follows:

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.loadFromFile() method.
  • Get a specific page using PdfDocument.getPages().get() method.
  • Search for all occurrences of the specified text on the page using PdfTextFinder.find() method and return results in a list of PdfTextFragment.
  • Access a specific PdfTextFragment in the list, and get the coordinates of the fragment using PdfTextFragment.getPositions() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.texts.PdfTextFindOptions;
import com.spire.pdf.texts.PdfTextFinder;
import com.spire.pdf.texts.PdfTextFragment;
import com.spire.pdf.texts.TextFindParameter;

import java.awt.geom.Point2D;
import java.util.EnumSet;
import java.util.List;

public class GetTextCoordinates {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.pdf");

        // Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        // Create a PdfTextFinder object
        PdfTextFinder finder = new PdfTextFinder(page);

        // Set the find options
        PdfTextFindOptions options = new PdfTextFindOptions();
        options.setTextFindParameter(EnumSet.of(TextFindParameter.IgnoreCase));
        finder.setOptions(options);

        // Find all instances of the text
        List<PdfTextFragment> fragments = finder.find("Personal Data");

        // Get a specific text fragment (here, the first match)
        PdfTextFragment fragment = fragments.get(0);

        // Get the positions of the text (If the text spans multiple lines, there will be more than one position)
        Point2D[] positions = fragment.getPositions();

        // Get its first position
        double x = positions[0].getX();
        double y = positions[0].getY();

        // Print result
        System.out.println(String.format("The text is located at: (%f, %f).",x,y));
    }
}
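The printed values are easier to interpret once they are converted to a physical unit. The sketch below assumes the coordinates are expressed in points (1 point = 1/72 inch), the default unit of the PDF format; that assumption should be checked against the library documentation. The helper class is hypothetical and not part of Spire.PDF.

```java
// Hypothetical helper, not part of Spire.PDF. Assumes coordinates are
// expressed in points, where 1 point = 1/72 inch.
class PointUnits {

    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    // Convert a length in points to inches.
    static double toInches(double points) {
        return points / POINTS_PER_INCH;
    }

    // Convert a length in points to millimeters.
    static double toMillimeters(double points) {
        return points / POINTS_PER_INCH * MM_PER_INCH;
    }

    public static void main(String[] args) {
        double x = 144.0; // example x coordinate in points
        double y = 36.0;  // example y coordinate in points
        System.out.println("x = " + toInches(x) + " in, " + toMillimeters(x) + " mm");
        System.out.println("y = " + toInches(y) + " in, " + toMillimeters(y) + " mm");
    }
}
```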

Get Coordinates of the Specified Image in PDF

To begin, you can use the PdfImageHelper.getImagesInfo() method to retrieve information about all images on the specified page, storing the results in an array of PdfImageInfo. Next, you can obtain the X and Y coordinates of a specific image using the PdfImageInfo.getBounds().getX() and PdfImageInfo.getBounds().getY() methods.

The steps to get coordinates of the specified image in PDF are as follows:

  • Create a PdfDocument object.
  • Load a PDF file using PdfDocument.loadFromFile() method.
  • Get a specific page using PdfDocument.getPages().get() method.
  • Retrieve all the image information on the page using PdfImageHelper.getImagesInfo() method and return results in an array of PdfImageInfo.
  • Get X and Y coordinates of a specific image using PdfImageInfo.getBounds().getX() and PdfImageInfo.getBounds().getY() methods.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.utilities.PdfImageHelper;
import com.spire.pdf.utilities.PdfImageInfo;

public class GetImageCoordinates {

    public static void main(String[] args) {

        // Create a PdfDocument object
        PdfDocument doc = new PdfDocument();

        // Load a PDF file
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input2.pdf");

        // Get a specific page
        PdfPageBase page = doc.getPages().get(0);

        // Create a PdfImageHelper object
        PdfImageHelper helper = new PdfImageHelper();

        // Get image information from the page
        PdfImageInfo[] imageInfo = helper.getImagesInfo(page);

        // Get X, Y coordinates of the first image
        double x = imageInfo[0].getBounds().getX();
        double y = imageInfo[0].getBounds().getY();

        // Print result
        System.out.println(String.format("The image is located at: (%f, %f).",x,y));
    }
}
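The X and Y values give only the top-left corner of the image. If the bounds object is a java.awt.geom.Rectangle2D (an assumption here, suggested by the getX() and getY() calls above), the opposite corner and the center can be read from the same object. The sketch below uses a hand-built rectangle as a stand-in for a real image's bounds.

```java
import java.awt.geom.Rectangle2D;
import java.util.Locale;

// Stand-alone sketch: the rectangle is built by hand and stands in for
// the bounds of an image. It does not call Spire.PDF.
class ImageBoundsDemo {

    // Describe the corners and center of a rectangle. With a top-left
    // origin, getMaxY() is the bottom edge of the image.
    static String describe(Rectangle2D bounds) {
        return String.format(Locale.ROOT,
                "top-left=(%.1f, %.1f) bottom-right=(%.1f, %.1f) center=(%.1f, %.1f)",
                bounds.getX(), bounds.getY(),
                bounds.getMaxX(), bounds.getMaxY(),
                bounds.getCenterX(), bounds.getCenterY());
    }

    public static void main(String[] args) {
        // Assumed example: an image 200 wide and 100 high placed at (50, 80)
        Rectangle2D bounds = new Rectangle2D.Double(50, 80, 200, 100);
        System.out.println(describe(bounds));
    }
}
```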

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

PDF annotations are notes or markers added to documents, which are great for making comments, giving explanations, giving feedback, etc. Co-creators of documents often communicate with annotations. However, when the issues associated with the annotations have been dealt with or the document has been finalized, it is necessary to remove the annotations to make the document more concise and professional. This article shows how to delete PDF annotations programmatically using Spire.PDF for Java.

Remove the Specified Annotation

Annotations are page-level document elements. Therefore, deleting an annotation requires getting the page where the annotation is located first, and then you can use the PdfPageBase.getAnnotationsWidget().removeAt() method to delete the annotation. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the first page using PdfDocument.getPages().get() method.
  • Remove the first annotation from this page using PdfPageBase.getAnnotationsWidget().removeAt() method.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

public class RemoveAnnotation {
    public static void main(String[] args) {

        //Create an object of PdfDocument
        PdfDocument pdf = new PdfDocument();

        //Load a PDF document
        pdf.loadFromFile("C:/Annotations.pdf");

        //Get the first page
        PdfPageBase page = pdf.getPages().get(0);

        //Remove the first annotation
        page.getAnnotationsWidget().removeAt(0);

        //Save the document
        pdf.saveToFile("RemoveOneAnnotation.pdf");
    }
}
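Because removeAt() takes an index, removing several annotations one by one raises the usual question of shifting indexes. The sketch below uses a plain java.util.List as a stand-in for the annotation collection and assumes the collection renumbers the remaining items after each removal; that behavior is an assumption about Spire.PDF, not something stated above. Removing from the highest index down keeps the remaining indexes valid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch: a List<String> stands in for a page's annotations.
class RemoveByIndexDemo {

    // Remove the items at the given indexes, highest index first, so that
    // earlier removals do not shift the positions of items still to go.
    static <T> void removeAll(List<T> items, int... indexes) {
        int[] sorted = indexes.clone();
        Arrays.sort(sorted);
        for (int i = sorted.length - 1; i >= 0; i--) {
            // Skip duplicate indexes so nothing is removed twice
            if (i < sorted.length - 1 && sorted[i] == sorted[i + 1]) {
                continue;
            }
            items.remove(sorted[i]); // int argument: removes by index
        }
    }

    public static void main(String[] args) {
        List<String> annotations =
                new ArrayList<>(Arrays.asList("highlight", "note", "stamp", "link"));
        removeAll(annotations, 0, 2);
        System.out.println(annotations);
    }
}
```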

Remove All Annotations from a Page

Spire.PDF for Java also provides PdfPageBase.getAnnotationsWidget().clear() method to remove all annotations in the specified page. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the first page using PdfDocument.getPages().get() method.
  • Remove all annotations from the page using PdfPageBase.getAnnotationsWidget().clear() method.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

public class RemoveAllAnnotationPage {
    public static void main(String[] args) {

        //Create an object of PdfDocument
        PdfDocument pdf = new PdfDocument();

        //Load a PDF document
        pdf.loadFromFile("C:/Annotations.pdf");

        //Get the first page
        PdfPageBase page = pdf.getPages().get(0);

        //Remove all annotations in the page
        page.getAnnotationsWidget().clear();

        //Save the document
        pdf.saveToFile("RemoveAnnotationsPage.pdf");
    }
}

Remove All Annotations from a PDF Document

To remove all annotations from a PDF document, we need to loop through all pages in the document and delete all annotations from each page. The detailed steps are as follows.

  • Create a PdfDocument instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Loop through all pages to delete annotations.
  • Delete annotations in each page using PdfPageBase.getAnnotationsWidget().clear() method.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;

public class RemoveAllAnnotations {
    public static void main(String[] args) {

        //Create an object of PdfDocument
        PdfDocument pdf = new PdfDocument();

        //Load a PDF document
        pdf.loadFromFile("C:/Users/Sirion/Desktop/Annotations.pdf");

        //Loop through the pages in the document
        for (Object page : (Iterable) pdf.getPages()) {
            PdfPageBase pageBase = (PdfPageBase) page;
            //Remove annotations in each page
            pageBase.getAnnotationsWidget().clear();
        }


        //Save the document
        pdf.saveToFile("RemoveAllAnnotations.pdf");
    }
}

Monday, 11 January 2021 07:13

Add Line Numbers to a PDF in C#/VB.NET

This article demonstrates how to add line numbers before chunks of text in a PDF page by using Spire.PDF for .NET.

C#
using Spire.Pdf;
using Spire.Pdf.General.Find;
using Spire.Pdf.Graphics;
using Spire.Pdf.Texts;
using System.Drawing;

namespace AddLineNumber
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            //Load PDF document
            doc.LoadFromFile(@"C:\Users\Administrator\Desktop\input.pdf");

            //Get the first page
            PdfPageBase page = doc.Pages[0];

            //Find specified text in the first line
            PdfTextFinder finder = new PdfTextFinder(page);
            finder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord;
            PdfTextFragment topLine = finder.Find("C# (pronounced See Sharp)")[0];

            //Get line height
            float lineHeight = topLine.Bounds[0].Height;

            //Get a Y coordinate for the starting position of line numbers
            float y = topLine.Bounds[0].Location.Y - 2;

            //Find specified text in the second line
            PdfTextFinder secondfinder = new PdfTextFinder(page);
            secondfinder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord;
            PdfTextFragment secondLine = secondfinder.Find("language. C#")[0];

            //Calculate line spacing
            float lineSpacing = secondLine.Bounds[0].Top - topLine.Bounds[0].Bottom;

            //Find specified text in the last line
            PdfTextFinder bottomfinder = new PdfTextFinder(page);
            bottomfinder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord;
            PdfTextFragment bottomLine = bottomfinder.Find("allocation of objects")[0];

            //Get the bottom Y coordinate of the last line, which is the height of the line number area
            float height = bottomLine.Bounds[0].Bottom;

            //Create a font with the same size as the text in the PDF
            PdfFont font = new PdfFont(PdfFontFamily.TimesRoman, 11f);

            int i = 1;
            while (y < height)
            {
                //Draw line numbers at the beginning of each line
                page.Canvas.DrawString(i.ToString(), font, PdfBrushes.Black, new PointF(15, y));
                y += lineHeight + lineSpacing;
                i++;
            }

            //Save the document
            doc.SaveToFile("result.pdf");
        }
    }
}
VB.NET
Imports Spire.Pdf
Imports Spire.Pdf.General.Find
Imports Spire.Pdf.Graphics
Imports Spire.Pdf.Texts
Imports System.Drawing

Namespace AddLineNumber
    Class Program
        Shared Sub Main(ByVal args() As String)
        'Create a PdfDocument object
        Dim doc As New PdfDocument()

        'Load PDF document
        doc.LoadFromFile("C:\Users\Administrator\Desktop\input.pdf")

        'Get the first page
        Dim page As PdfPageBase = doc.Pages(0)

        'Find specified text in the first line
        Dim finder As New PdfTextFinder(page)
        finder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord
        Dim topLine As PdfTextFragment = finder.Find("C# (pronounced See Sharp)")(0)

        'Get line height
        Dim lineHeight As Single = topLine.Bounds(0).Height

        'Get a Y coordinate for the starting position of line numbers
        Dim y As Single = topLine.Bounds(0).Location.Y - 2

        'Find specified text in the second line
        Dim secondfinder As New PdfTextFinder(page)
        secondfinder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord
        Dim secondLine As PdfTextFragment = secondfinder.Find("language. C#")(0)

        'Calculate line spacing
        Dim lineSpacing As Single = secondLine.Bounds(0).Top - topLine.Bounds(0).Bottom

        'Find specified text in the last line
        Dim bottomfinder As New PdfTextFinder(page)
        bottomfinder.Options.Parameter = Spire.Pdf.Texts.TextFindParameter.WholeWord
        Dim bottomLine As PdfTextFragment = bottomfinder.Find("allocation of objects")(0)

        'Get the bottom Y coordinate of the last line, which is the height of the line number area
        Dim height As Single = bottomLine.Bounds(0).Bottom

        'Create a font with the same size as the text in the PDF
        Dim font As PdfFont = New PdfFont(PdfFontFamily.TimesRoman, 11.0F)

        Dim i As Integer = 1
        While y < height
            'Draw line numbers at the beginning of each line
            page.Canvas.DrawString(i.ToString(), font, PdfBrushes.Black, New PointF(15, y))
            y += lineHeight + lineSpacing
            i += 1
        End While

        'Save the document
        doc.SaveToFile("result.pdf")

        End Sub
    End Class
End Namespace
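The placement logic in both listings is plain arithmetic on the bounds of three text fragments, and it can be checked without a PDF at hand. The Java sketch below reproduces that arithmetic with hand-picked numbers: the class name, method name, and sample values are hypothetical, and only the formulas (the offset of 2, the spacing between lines, and the loop condition) follow the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of the line-number placement arithmetic.
class LineNumberPositions {

    // Compute the Y coordinate of each line number from the bounds of the
    // first line, the top of the second line, and the bottom of the last line.
    static List<Float> yPositions(float firstTop, float firstBottom,
                                  float secondTop, float lastBottom) {
        float lineHeight = firstBottom - firstTop;
        float lineSpacing = secondTop - firstBottom;
        float step = lineHeight + lineSpacing; // distance between line tops
        if (step <= 0) {
            throw new IllegalArgumentException("Second line must start below the first");
        }
        List<Float> positions = new ArrayList<>();
        // Start slightly above the first line, as in the listings above
        for (float y = firstTop - 2; y < lastBottom; y += step) {
            positions.add(y);
        }
        return positions;
    }

    public static void main(String[] args) {
        // Assumed layout: four lines, each 12 high, with a gap of 3 between them
        System.out.println(yPositions(100f, 112f, 115f, 157f));
    }
}
```

This approach assumes evenly spaced lines; a page with mixed font sizes or paragraph gaps would need the position of each line to be looked up individually.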

A previous article introduced how to compare two Word documents in C# and VB.NET. Starting with Spire.Doc V8.12.14, the library also supports retrieving the differences between two Word documents as a structured list. This article shows how to use Spire.Doc to get those differences by comparing two Word documents.

C#
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;
using Spire.Doc.Formatting.Revisions;
using System;

namespace GetWordDifferences
{
    class Program
    {
        static void Main(string[] args)
        {
            //Load the first Word document
            Document doc1 = new Document();
            doc1.LoadFromFile("Sample1.docx");

            //Load the second Word document
            Document doc2 = new Document();
            doc2.LoadFromFile("Sample2.docx");

            //Compare the two Word documents
            doc1.Compare(doc2, "Author");

            foreach (Section sec in doc1.Sections)
            {
                foreach (DocumentObject docItem in sec.Body.ChildObjects)
                {
                    if (docItem is Paragraph)
                    {
                        Paragraph para = docItem as Paragraph;
                        if (para.IsInsertRevision)
                        { 
                            EditRevision insRevison = para.InsertRevision;
                            EditRevisionType insType = insRevison.Type; 
                            string insAuthor = insRevison.Author; 
                            DateTime insDateTime = insRevison.DateTime; 
                        }

                        else if (para.IsDeleteRevision)
                        { 
                            EditRevision delRevison = para.DeleteRevision; 
                            EditRevisionType delType = delRevison.Type; 
                            string delAuthor = delRevison.Author; 
                            DateTime delDateTime = delRevison.DateTime; 
                        }

                        foreach (ParagraphBase paraItem in para.ChildObjects)
                        {
                            if (paraItem.IsInsertRevision)
                            { 
                                EditRevision insRevison = paraItem.InsertRevision; 
                                EditRevisionType insType = insRevison.Type; 
                                string insAuthor = insRevison.Author; 
                                DateTime insDateTime = insRevison.DateTime; 
                            }

                            else if (paraItem.IsDeleteRevision)
                            { 
                                EditRevision delRevison = paraItem.DeleteRevision; 
                                EditRevisionType delType = delRevison.Type; 
                                string delAuthor = delRevison.Author; 
                                DateTime delDateTime = delRevison.DateTime; 
                            }

                        }
                    }
                }
            }

            //Get the insert and delete revisions as lists
            DifferRevisions differRevisions = new DifferRevisions(doc1);
            var insertRevisionsList = differRevisions.InsertRevisions;
            var deleteRevisionsList = differRevisions.DeleteRevisions;
        }
    }
}
VB.NET
Imports Spire.Doc
Imports Spire.Doc.Documents
Imports Spire.Doc.Fields
Imports Spire.Doc.Formatting.Revisions
Imports System

Namespace GetWordDifferences
    
    Class Program
        
        Private Shared Sub Main(ByVal args() As String)
            'Load the first Word document
            Dim doc1 As Document = New Document
            doc1.LoadFromFile("Sample1.docx")
            'Load the second Word document
            Dim doc2 As Document = New Document
            doc2.LoadFromFile("Sample2.docx")
            'Compare the two Word documents
            doc1.Compare(doc2, "Author")
            For Each sec As Section In doc1.Sections
                For Each docItem As DocumentObject In sec.Body.ChildObjects
                    If (TypeOf docItem Is Paragraph) Then
                        Dim para As Paragraph = CType(docItem,Paragraph)
                        If para.IsInsertRevision Then
                            Dim insRevison As EditRevision = para.InsertRevision
                            Dim insType As EditRevisionType = insRevison.Type
                            Dim insAuthor As String = insRevison.Author
                            Dim insDateTime As DateTime = insRevison.DateTime
                        ElseIf para.IsDeleteRevision Then
                            Dim delRevison As EditRevision = para.DeleteRevision
                            Dim delType As EditRevisionType = delRevison.Type
                            Dim delAuthor As String = delRevison.Author
                            Dim delDateTime As DateTime = delRevison.DateTime
                        End If
                        
                        For Each paraItem As ParagraphBase In para.ChildObjects
                            If paraItem.IsInsertRevision Then
                                Dim insRevison As EditRevision = paraItem.InsertRevision
                                Dim insType As EditRevisionType = insRevison.Type
                                Dim insAuthor As String = insRevison.Author
                                Dim insDateTime As DateTime = insRevison.DateTime
                            ElseIf paraItem.IsDeleteRevision Then
                                Dim delRevison As EditRevision = paraItem.DeleteRevision
                                Dim delType As EditRevisionType = delRevison.Type
                                Dim delAuthor As String = delRevison.Author
                                Dim delDateTime As DateTime = delRevison.DateTime
                            End If
                            
                        Next
                    End If
                    
                Next
            Next
            'Get the difference about revisions
            Dim differRevisions As DifferRevisions = New DifferRevisions(doc1)
            Dim insetRevisionsList = differRevisions.InsertRevisions
            Dim deletRevisionsList = differRevisions.DeleteRevisions
        End Sub
    End Class
End Namespace
Tuesday, 10 November 2020 07:16

Set Excel Print Page Margins in Java

This article demonstrates how to set Excel page margins in Java applications before printing worksheets. With Spire.XLS for Java, you can set the top, bottom, left, right, header, and footer margins. Note that Spire.XLS for Java measures margins in inches, whereas Microsoft Excel shows them in centimeters (1 inch = 2.54 cm).
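If the target margins are known in centimeters, they need to be converted to inches before they are passed to the margin setters. The helper below is a minimal sketch of that conversion; the class and method names are hypothetical and not part of Spire.XLS.

```java
// Hypothetical helper, not part of Spire.XLS: converts margin values
// between centimeters and inches (1 inch = 2.54 cm).
class MarginUnits {

    static final double CM_PER_INCH = 2.54;

    // Convert a length in centimeters to inches.
    static double cmToInches(double centimeters) {
        return centimeters / CM_PER_INCH;
    }

    // Convert a length in inches to centimeters.
    static double inchesToCm(double inches) {
        return inches * CM_PER_INCH;
    }

    public static void main(String[] args) {
        // A 2.54 cm margin corresponds to a 1 inch margin
        System.out.println(cmToInches(2.54));
        // A 2 inch margin is about 5.08 cm
        System.out.println(inchesToCm(2));
    }
}
```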

import com.spire.xls.*;

public class SetMargins {
    public static void main(String[] args) {

        String outputFile="output/setMarginsOfExcel.xlsx";

        //Load the sample document from file
        Workbook workbook = new Workbook();
        workbook.loadFromFile("Sample.xlsx");

        //Get the first worksheet.
        Worksheet sheet = workbook.getWorksheets().get(0);

        //Get the PageSetup object of the first worksheet.
        PageSetup pageSetup = sheet.getPageSetup();

        //Set the page margins of bottom, left, right and top.
        pageSetup.setBottomMargin(2);
        pageSetup.setLeftMargin(1);
        pageSetup.setRightMargin(1);
        pageSetup.setTopMargin(3);
        
        //Set the margins of header and footer.
        pageSetup.setHeaderMarginInch(2);
        pageSetup.setFooterMarginInch(2);

        //Save to file.
        workbook.saveToFile(outputFile, ExcelVersion.Version2013);

    }
}

A digital signature is a type of electronic signature that can be used to verify the authenticity and integrity of digital documents. It can help recipients identify where the digital documents originate from and whether they have been changed by a third party after they were signed. In this article, we will demonstrate how to add or delete digital signatures in Excel in C# and VB.NET using Spire.XLS for .NET.

Install Spire.XLS for .NET

To begin with, you need to add the DLL files included in the Spire.XLS for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.XLS

Add a Digital Signature to Excel in C# and VB.NET

You can add a digital signature to protect the integrity of an Excel file. Once the digital signature is added, the file becomes read-only to discourage further editing. If someone makes changes to the file, the digital signature will become invalid immediately.
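The reason a change invalidates the signature is that the signature is computed over the signed content, so verification fails as soon as a single byte differs. The Java sketch below illustrates that general mechanism with the standard java.security API and a freshly generated RSA key pair. It is not how Excel or Spire.XLS stores signatures, and the sample content is made up.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

// Stand-alone sketch of signing and verifying bytes; unrelated to Spire.XLS.
class SignatureIntegrityDemo {

    // Generate a throwaway RSA key pair for the demonstration.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sign the data with the private key.
    static byte[] sign(byte[] data, KeyPair keys) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keys.getPrivate());
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Check the signature against the data with the public key.
    static boolean verify(byte[] data, byte[] signature, KeyPair keys) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(keys.getPublic());
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] original = "total=100".getBytes(StandardCharsets.UTF_8);
        byte[] tampered = "total=900".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(original, keys);

        System.out.println("original verifies: " + verify(original, signature, keys));
        System.out.println("tampered verifies: " + verify(tampered, signature, keys));
    }
}
```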

Spire.XLS for .NET provides the AddDigitalSignature method of Workbook class to add digital signatures to an Excel file. The detailed steps are as follows:

  • Initialize an instance of the Workbook class.
  • Load an Excel file using Workbook.LoadFromFile() method.
  • Initialize an instance of the X509Certificate2 class with the specified certificate (.pfx) file path and the password of the .pfx file.
  • Initialize an instance of the DateTime class.
  • Add a digital signature to the file using Workbook.AddDigitalSignature(X509Certificate2, string, DateTime) method.
  • Save the result file using Workbook.SaveToFile() method.
using Spire.Xls;
using Spire.Xls.Core.MergeSpreadsheet.Interfaces;
using System;
using System.Security.Cryptography.X509Certificates;

namespace AddSignatureInExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Workbook instance
            Workbook workbook = new Workbook();

            //Load an Excel file
            workbook.LoadFromFile("Sample.xlsx");

            //Load the certificate (.pfx) with its password
            X509Certificate2 cert = new X509Certificate2("gary.pfx", "e-iceblue");

            //Define the path to the certificate file
            string certificatePath = "gary.pfx";

            //Specify the signing time
            DateTime certtime = new DateTime(2020, 7, 1, 7, 10, 36);

            //Add a digital signature to the workbook using the certificate file, its password, a comment, and the signing time
            IDigitalSignatures signature = workbook.AddDigitalSignature(certificatePath, "e-iceblue", "Signed by Gary Zhang", certtime);

            //Save the result file
            workbook.SaveToFile("AddDigitalSignature.xlsx", FileFormat.Version2013);
        }
    }
}

Delete All Digital Signatures from Excel in C# and VB.NET

Spire.XLS for .NET provides the RemoveAllDigitalSignatures method of Workbook class for developers to remove digital signatures from an Excel file. The detailed steps are as follows:

  • Initialize an instance of the Workbook class.
  • Load an Excel file using Workbook.LoadFromFile() method.
  • Remove all digital signatures from the file using Workbook.RemoveAllDigitalSignatures() method.
  • Save the result file using Workbook.SaveToFile() method.
using Spire.Xls;

namespace DeleteSignatureInExcel
{
    class Program
    {
        static void Main(string[] args)
        {
            //Create a Workbook instance
            Workbook workbook = new Workbook();
            //Load an Excel file
            workbook.LoadFromFile("AddDigitalSignature.xlsx");

            //Remove all the digital signatures in the file
            workbook.RemoveAllDigitalSignatures();

            //Save the result file
            workbook.SaveToFile("RemoveDigitalSignature.xlsx", FileFormat.Version2013);
        }
    }
}

This article demonstrates how to find the text that matches a specific regular expression in a PDF document using Spire.PDF for Java.

import com.spire.pdf.*;
import com.spire.pdf.texts.*;
import java.awt.*;
import java.util.*;
import java.util.List;

public class FindText {
    public static void main(String[] args) {
        //Load a PDF document
        PdfDocument pdf = new PdfDocument();
        pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\test.pdf");

        //Create a PdfTextFindOptions object
        PdfTextFindOptions findOptions = new PdfTextFindOptions();

        //Loop through the pages
        for (Object page : (Iterable) pdf.getPages()) {
            PdfPageBase pageBase = (PdfPageBase) page;

            //Define a regular expression
            String pattern = "\\#\\w+\\b";
            // Set search parameter to use regular expression
            findOptions.setTextFindParameter(EnumSet.of(TextFindParameter.Regex));

            // Create a text finder object for the page
            PdfTextFinder textFinder = new PdfTextFinder(pageBase);

            // Find text fragments that match the pattern
            List<PdfTextFragment> finds = textFinder.find(pattern, findOptions);

            //Highlight the search results with yellow
            for (PdfTextFragment find : finds) {
                find.highLight(Color.yellow);
            }
        }

        //Save to file
        pdf.saveToFile("FindByPattern.pdf");
    }
}
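To see what the pattern in the listing matches, it can be tried against a plain string with java.util.regex. This is only an approximation: whether Spire.PDF evaluates regular expressions exactly as java.util.regex does is an assumption, and the sample sentence is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone sketch: tests a regular expression against a plain string.
class HashtagPatternDemo {

    // Return every substring of the text that matches the regular expression.
    static List<String> findMatches(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Same pattern as above: a '#' followed directly by word characters
        String pattern = "\\#\\w+\\b";
        String text = "Release notes: #java and #pdf_tools, see issue # 12.";
        // "# 12" is not matched because a space follows the '#'
        System.out.println(findMatches(pattern, text));
    }
}
```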

Converting Word documents (.doc or .docx) to PDF is a common requirement in many Java-based applications, especially those involving documentation, report generation, or digital archiving. This tutorial shows how to convert Word to PDF in Java using Spire.Doc for Java, and walks through the code examples from library integration to the conversion itself.

Let's dive into the Java code for Word to PDF conversion and help your application automate document processing with just a few lines of code.

How to Convert Doc/Docx Files to PDF with Spire.Doc for Java

Before going through the sample code, a word about the library used here: Spire.Doc for Java is a Java library for working with Word documents. Besides Word file format conversion, it supports more advanced settings such as page size adjustment, font embedding, and conversion of a specific area.

Follow the steps below to learn how you can convert Word files to PDF format with Spire.Doc for Java.

Step 1. Install Spire.Doc for Java

Before converting, you should add the Spire.Doc.jar file as a dependency in your Java program. You can download the JAR file from the official download page.

If you are using Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file directly:


<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.doc</artifactId>
        <version>14.6.0</version>
    </dependency>
</dependencies>

Step 2. Convert Word to PDF with Java Code

With the library integrated, you can now convert your Word files. Copy the code below into your Java program:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;

public class ConvertWordToPdf {

    public static void main(String[] args) {

        // Create a Document object
        Document doc = new Document();

        // Load a Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx");

        // Save the document to PDF
        doc.saveToFile("ToPDF.pdf", FileFormat.PDF);

        // Dispose resources
        doc.dispose();
    }
}
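
When several files need converting, the output name of each PDF has to be derived from its Word source. The following is a minimal sketch of that step using only the JDK. The class name PdfPathHelper and the method toPdfPath are hypothetical and are not part of Spire.Doc; the resulting path could be passed to doc.saveToFile() in the same way as the literal file name in the example above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper (not part of Spire.Doc): derives the output PDF path for a Word file.
class PdfPathHelper {

    // Returns a sibling path with the .doc/.docx extension replaced by .pdf
    static Path toPdfPath(Path wordFile) {
        String name = wordFile.getFileName().toString();
        String lower = name.toLowerCase(Locale.ROOT);
        if (!lower.endsWith(".doc") && !lower.endsWith(".docx")) {
            throw new IllegalArgumentException("Not a Word file: " + name);
        }
        String base = name.substring(0, name.lastIndexOf('.'));
        return wordFile.resolveSibling(base + ".pdf");
    }

    public static void main(String[] args) {
        Path input = Paths.get("reports", "Input.docx");
        // The printed path is what you would hand to doc.saveToFile(..., FileFormat.PDF)
        System.out.println(toPdfPath(input));
    }
}
```

Rejecting files that are not .doc or .docx up front keeps a batch job from handing unrelated files to the converter.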

Advanced Settings When Converting Word Files to PDF

Beyond basic conversion, Spire.Doc for Java offers additional options. For example, you can set a password to protect the PDF during the conversion with only a few lines of code. The following sections cover these settings: PDF/A conformance, password protection, and conversion of a specific section.

Convert Word files to PDF/A Format with Java Code

Spire.Doc for Java allows you to set the conformance level to PDF/A-1a. To apply this setting, configure a ToPdfParameterList object and pass it as a parameter to the Document.saveToFile() method.

Copy the code below to convert your doc/docx to PDF/A files with Java:

import com.spire.doc.Document;
import com.spire.doc.ToPdfParameterList;
import com.spire.doc.PdfConformanceLevel;

public class ConvertWordToPdfa {

    public static void main(String[] args) {

        // Create a Document object
        Document doc = new Document();

        // Load a Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx");

        // Create a ToPdfParameterList object
        ToPdfParameterList parameters = new ToPdfParameterList();

        // Set the conformance level for PDF
        parameters.setPdfConformanceLevel(PdfConformanceLevel.Pdf_A_1_A);

        // Save the document to a PDF file
        doc.saveToFile("C:\\Users\\Administrator\\Desktop\\ToPdfA.pdf", parameters);

        // Dispose resources
        doc.dispose();
    }
}

Convert Word Files to Password-Protected PDF in Java

To protect your data, you may need to set a password for the converted PDF file. You can encrypt the PDF with a password during the conversion itself. The encryption settings are stored in a ToPdfParameterList object; passing that object to the Document.saveToFile() method applies them when the file is saved.

Copy the code below to encrypt converted PDF files during conversion with Java:

import com.spire.doc.Document;
import com.spire.doc.PdfPermissionsFlags;
import com.spire.doc.ToPdfParameterList;

public class ConvertWordToPasswordProtectedPdf {

    public static void main(String[] args) {

        // Create a Document object
        Document doc = new Document();

        // Load a Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx");

        // Create a ToPdfParameterList object
        ToPdfParameterList parameters = new ToPdfParameterList();

        // Set open password and permission password for PDF
        parameters.getPdfSecurity().encrypt("openPsd", PdfPermissionsFlags.valueOf("permissionPsd"));

        // Save the document to PDF
        doc.saveToFile("PasswordProtected.pdf", parameters);

        // Dispose resources
        doc.dispose();
    }
}
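
The sample above hard-codes the password for brevity. In an application you may prefer to supply it from outside the source code. The sketch below uses only the JDK and reads the value from an environment variable. The class name PasswordSource and the variable name PDF_OPEN_PASSWORD are hypothetical choices for illustration and are not part of Spire.Doc.

```java
import java.util.Map;

// Hypothetical helper (not part of Spire.Doc): looks up a PDF password outside the source code.
class PasswordSource {

    // Returns the value stored under the given key, or fails fast if it is missing or blank
    static String require(Map<String, String> env, String key) {
        String value = env.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            // PDF_OPEN_PASSWORD is an assumed variable name; set it before running
            String openPassword = require(System.getenv(), "PDF_OPEN_PASSWORD");
            // openPassword could then replace the literal passed to encrypt(...)
            System.out.println("Password loaded (" + openPassword.length() + " characters)");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Taking the environment as a Map parameter keeps the lookup easy to test without changing the real process environment.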

Convert a Specific Section in Word to PDF in Java

You can also convert a single section of a Word document. Create a copy of the section with the Section.deepClone() method, add the copy to the section collection of another document with the SectionCollection.add() method, and save that new document as a PDF.

You can easily create a document containing the desired section from the source document with the following Java code:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;
import com.spire.doc.Section;

public class ConvertSectionToPdf {

    public static void main(String[] args) {

        // Create a Document object
        Document doc = new Document();

        // Load a Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx");

        // Get a specific section of the document (indexes are zero-based, so 1 is the second section)
        Section section = doc.getSections().get(1);

        // Create a new document object
        Document newDoc = new Document();

        // Clone the default style to the new document
        doc.cloneDefaultStyleTo(newDoc);

        // Clone the section to the new document
        newDoc.getSections().add(section.deepClone());

        // Save the new document to PDF
        newDoc.saveToFile("SectionToPDF.pdf", FileFormat.PDF);

        // Dispose resources
        doc.dispose();
        newDoc.dispose();
    }
}

How to Adjust Word Files When Converting Word to PDF with Java Code

To get the best conversion result, you can make further adjustments during the conversion process. For example, you can embed fonts, adjust the page size, set the image quality, create bookmarks, or modify hyperlinks with Java code.

Here, I will take one adjustment as an example to show how it works. The other adjustments listed above have their own tutorials.

Example: Set Image Quality During Word to PDF Conversion

Image quality is an important factor when converting DOC/DOCX files to PDF. It can affect both the conversion speed and the result. With Spire.Doc for Java, you can set the image quality to suit your needs.

Copy the following code into your Java program to adjust the image quality:

import com.spire.doc.Document;
import com.spire.doc.FileFormat;

public class SetImageQualityDuringConversion {

    public static void main(String[] args) {
        
        // Create a Document object
        Document doc = new Document();

        // Load a Word document
        doc.loadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx");

        // Set the image quality to 50% of the original quality
        doc.setJPEGQuality(50);

        // Preserve original image quality
        // doc.setJPEGQuality(100);

        // Save the document to PDF
        doc.saveToFile("SetImageQuality.pdf", FileFormat.PDF);

        // Dispose resources
        doc.dispose();
    }
}

Final Words

Converting Word to PDF in Java is simple and efficient with Spire.Doc for Java. From basic conversion to advanced options like password protection, section export, and image quality settings, you can handle it all with just a few lines of code.

If you'd like to access all features without limitations and remove the evaluation watermark, please request a 30-day trial license for yourself.

Monday, 20 July 2020 07:58

Edit Bookmarks in PDF in Java

This article demonstrates how to edit the existing bookmarks in a PDF file, for example by changing the bookmark title, font color, and text style, using Spire.PDF for Java.

import com.spire.pdf.PdfDocument;
import com.spire.pdf.bookmarks.PdfBookmark;
import com.spire.pdf.bookmarks.PdfTextStyle;
import com.spire.pdf.graphics.PdfRGBColor;

import java.awt.*;

public class EditBookmarks {
    public static void main(String[] args) {
        //Create a PdfDocument instance
        PdfDocument doc = new PdfDocument();
        //Load the PDF file
        doc.loadFromFile("Bookmarks.pdf");

        //Get the first bookmark
        PdfBookmark bookmark = doc.getBookmarks().get(0);
        //Change the title of the bookmark
        bookmark.setTitle("New Title");
        //Change the font color of the bookmark
        bookmark.setColor(new PdfRGBColor(new Color(255,0,0)));
        //Change the outline text style of the bookmark
        bookmark.setDisplayStyle(PdfTextStyle.Italic);

        //Edit child bookmarks of the first bookmark
        for (Object child : (Iterable) bookmark) {
            PdfBookmark childBookmark = (PdfBookmark) child;
            childBookmark.setColor(new PdfRGBColor(new Color(0,0,255)));
            childBookmark.setDisplayStyle(PdfTextStyle.Bold);

            //Edit the child bookmarks of each child bookmark
            for (Object grandchild : (Iterable) childBookmark) {
                PdfBookmark childBookmark2 = (PdfBookmark) grandchild;
                childBookmark2.setColor(new PdfRGBColor(new Color(160,160,122)));
                childBookmark2.setDisplayStyle(PdfTextStyle.Bold);
            }
        }

        //Save the result file
        doc.saveToFile("EditBookmarks.pdf");
        doc.close();
    }
}
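
The nested loops above handle two levels of child bookmarks. For outlines of arbitrary depth, the same idea is often written recursively. The following is a minimal sketch that uses a stand-in Node class rather than Spire.PDF's PdfBookmark, so the names OutlineWalk, Node, and visit are hypothetical. With the real library, the body of the visitor would call methods such as setColor() and setDisplayStyle() instead of recording titles.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a bookmark tree; NOT Spire.PDF's PdfBookmark class.
class OutlineWalk {

    static class Node {
        final String title;
        final List<Node> children = new ArrayList<>();

        Node(String title) {
            this.title = title;
        }

        // Adds a child with the given title and returns it so calls can be chained
        Node add(String childTitle) {
            Node child = new Node(childTitle);
            children.add(child);
            return child;
        }
    }

    // Visits a node and all of its descendants depth-first, recording "depth:title"
    static void visit(Node node, int depth, List<String> out) {
        out.add(depth + ":" + node.title);
        for (Node child : node.children) {
            visit(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Chapter 1");
        root.add("Section 1.1").add("Topic 1.1.1");
        root.add("Section 1.2");

        List<String> visited = new ArrayList<>();
        visit(root, 0, visited);
        System.out.println(visited);
    }
}
```

The depth argument makes it possible to pick a different color or style per level, as the nested loops do for the first and second levels.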

Monday, 30 January 2023 08:50

Java: Add, Edit, or Delete Bookmarks in PDF

A bookmark in a PDF document consists of formatted text that links to a specific section of the document. Readers can navigate by clicking the bookmarks displayed at the side of the page instead of scrolling up and down, which is very helpful in long documents. Well-organized bookmarks can also serve as a table of contents. When you create a PDF document with many pages, it's better to add bookmarks that link to significant content. This article shows how to add, modify, and remove bookmarks in PDF documents programmatically using Spire.PDF for Java.

Install Spire.PDF for Java

First of all, you're required to add the Spire.Pdf.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can easily import the JAR file in your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.pdf</artifactId>
        <version>12.6.4</version>
    </dependency>
</dependencies>

Add Bookmarks to a PDF Document

Spire.PDF for Java provides the PdfDocument.getBookmarks().add() method to add bookmarks to a PDF document. In addition to adding primary bookmarks, we can use the PdfBookmark.add() method to add a sub-bookmark to a primary bookmark. The PdfBookmark class also has methods for setting the destination, text color, and text style of bookmarks. The detailed steps of adding bookmarks to a PDF document are as follows.

  • Create a PdfDocument class instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Loop through the pages in the PDF document to add bookmarks and set their styles.
  • Add a primary bookmark to the document using PdfDocument.getBookmarks().add() method.
  • Create a PdfDestination class object, wrap it in a PdfGoToAction, and set it as the action of the primary bookmark using PdfBookmark.setAction() method.
  • Set the text color of the primary bookmark using PdfBookmark.setColor() method.
  • Set the text style of the primary bookmark using PdfBookmark.setDisplayStyle() method.
  • Add a sub-bookmark to the primary bookmark using PdfBookmark.add() method.
  • Use the above methods to set the destination, text color, and text style of the sub-bookmark.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfPageBase;
import com.spire.pdf.actions.PdfGoToAction;
import com.spire.pdf.bookmarks.PdfBookmark;
import com.spire.pdf.bookmarks.PdfTextStyle;
import com.spire.pdf.general.PdfDestination;
import com.spire.pdf.graphics.PdfRGBColor;

import java.awt.*;
import java.awt.geom.Point2D;

public class AddBookmark {
    public static void main(String[] args) {

        //Create a PdfDocument class instance
        PdfDocument pdf = new PdfDocument();

        //Load a PDF file
        pdf.loadFromFile("There's No Planet B.pdf");

        //Loop through the pages in the PDF file
        for (int i = 0; i < pdf.getPages().getCount(); i++) {
            PdfPageBase page = pdf.getPages().get(i);
            //Add a bookmark
            PdfBookmark bookmark = pdf.getBookmarks().add(String.format("Bookmark-%s", i + 1));
            //Set the destination page and location
            PdfDestination destination = new PdfDestination(page, new Point2D.Float(0, 0));
            bookmark.setAction(new PdfGoToAction(destination));
            //Set the text color
            bookmark.setColor(new PdfRGBColor(new Color(139, 69, 19)));
            //Set the text style
            bookmark.setDisplayStyle(PdfTextStyle.Bold);
            //Add a child bookmark
            PdfBookmark childBookmark = bookmark.add(String.format("Sub-Bookmark-%s", i + 1));
            //Set the destination page and location
            PdfDestination childDestination = new PdfDestination(page, new Point2D.Float(0, 100));
            childBookmark.setAction(new PdfGoToAction(childDestination));
            //Set the text color
            childBookmark.setColor(new PdfRGBColor(new Color(255, 127, 80)));
            //Set the text style
            childBookmark.setDisplayStyle(PdfTextStyle.Italic);
        }

        //Save the result file and release resources
        pdf.saveToFile("AddBookmarks.pdf");
        pdf.close();
    }
}

Edit Bookmarks in a PDF Document

We can also use the methods of the PdfBookmark class in Spire.PDF for Java to edit existing PDF bookmarks. The detailed steps are as follows.

  • Create a PdfDocument class instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the first bookmark using PdfDocument.getBookmarks().get() method.
  • Change the title of the bookmark using PdfBookmark.setTitle() method.
  • Change the font color of the bookmark using PdfBookmark.setColor() method.
  • Change the outline text style of the bookmark using PdfBookmark.setDisplayStyle() method.
  • Change the text color and style of the sub-bookmark using the above methods.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.bookmarks.PdfBookmark;
import com.spire.pdf.bookmarks.PdfTextStyle;
import com.spire.pdf.graphics.PdfRGBColor;

import java.awt.*;

public class EditBookmarks {
    public static void main(String[] args) {

        //Create a PdfDocument class instance
        PdfDocument doc = new PdfDocument();

        //Load a PDF file
        doc.loadFromFile("AddBookmarks.pdf");

        //Get the first bookmark
        PdfBookmark bookmark = doc.getBookmarks().get(0);
        //Change the title of the bookmark
        bookmark.setTitle("New Title");
        //Change the font color of the bookmark
        bookmark.setColor(new PdfRGBColor(new Color(255,0,0)));
        //Change the outline text style of the bookmark
        bookmark.setDisplayStyle(PdfTextStyle.Italic);

        //Edit sub-bookmarks of the first bookmark
        for (Object child : (Iterable) bookmark) {
            PdfBookmark childBookmark = (PdfBookmark) child;
            childBookmark.setColor(new PdfRGBColor(new Color(0,0,255)));
            childBookmark.setDisplayStyle(PdfTextStyle.Bold);
        }


        //Save the result file
        doc.saveToFile("EditBookmarks.pdf");
        doc.close();
    }
}

Delete Bookmarks from a PDF Document

We can use Spire.PDF for Java to delete any bookmark in a PDF document. The PdfDocument.getBookmarks().removeAt() method removes a specific primary bookmark, the PdfDocument.getBookmarks().clear() method removes all bookmarks, and the PdfBookmark.removeAt() method removes a specific sub-bookmark of a primary bookmark. The detailed steps of removing bookmarks from a PDF document are as follows.

  • Create a PdfDocument class instance.
  • Load a PDF document using PdfDocument.loadFromFile() method.
  • Get the first bookmark using PdfDocument.getBookmarks().get() method.
  • Remove the first sub-bookmark of the first bookmark using PdfBookmark.removeAt() method.
  • Save the document using PdfDocument.saveToFile() method.
import com.spire.pdf.PdfDocument;
import com.spire.pdf.bookmarks.PdfBookmark;

public class DeleteBookmarks {
    public static void main(String[] args) {

        //Create a PdfDocument instance
        PdfDocument pdf = new PdfDocument();

        //Load the PDF file
        pdf.loadFromFile("AddBookmarks.pdf");

        //Get the first bookmark
        PdfBookmark pdfBookmark = pdf.getBookmarks().get(0);

        //Delete the first sub-bookmark of the first bookmark
        pdfBookmark.removeAt(0);

        //Delete the first bookmark along with its child bookmark
        //pdf.getBookmarks().removeAt(0);

        //Delete all the bookmarks
        //pdf.getBookmarks().clear();

        //Save the result file and release resources
        pdf.saveToFile("DeleteBookmarks.pdf");
        pdf.close();
    }
}
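
When more than one sub-bookmark has to be removed by index, each removal shifts the positions of the items that follow it. A common way around this is to remove from the highest index downward. The sketch below shows the idea on a plain java.util.List standing in for a bookmark collection. The class and method names are hypothetical, and the sketch assumes that removeAt() shifts later items the same way List.remove(int) does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Illustration only: a List<String> stands in for a bookmark collection.
class RemoveByIndex {

    // Removes the items at the given indexes, highest index first, so that
    // earlier removals do not shift the positions still waiting to be removed
    static void removeAll(List<String> items, List<Integer> indexes) {
        // A descending TreeSet sorts the indexes and drops duplicates
        TreeSet<Integer> descending = new TreeSet<>(Collections.reverseOrder());
        descending.addAll(indexes);
        for (int index : descending) {
            items.remove(index); // int argument, so this is remove-by-position
        }
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>(
                Arrays.asList("Bookmark-1", "Bookmark-2", "Bookmark-3", "Bookmark-4"));
        removeAll(titles, Arrays.asList(0, 2));
        System.out.println(titles);
    }
}
```

Removing in ascending order instead would delete the wrong entries, because after the first removal every later item sits one position lower than its original index.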

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.