Spire.OCR for Java Program Guide Content

2024-09-12 01:23:30 Written by Koohji

Spire.OCR for Java is a professional OCR library for reading text from images in JPG, PNG, GIF, BMP, and TIFF formats. Developers can easily add OCR functionality to Java applications (J2SE and J2EE). It can recognize multiple characters and fonts in images, including bold and italic styles.

Spire.OCR for Java provides a very easy way to extract text from images. With just three lines of Java code, Spire.OCR can read text from a variety of common image formats, such as BMP, JPG, PNG, TIFF, and GIF.

Spire.OCR for Java offers developers a new model for extracting text from images. In this article, we will demonstrate how to extract text from images in Java using the new model of Spire.OCR for Java.

The detailed steps are as follows.

Step 1: Create a Java Project in IntelliJ IDEA.

Extract Text from Images Using the New Model of Spire.OCR for Java

Step 2: Add Spire.OCR.jar to Your Project.

Option 1: Install Spire.OCR for Java via Maven.

If you're using Maven, you can install Spire.OCR for Java by adding the following code to your project's pom.xml file:

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.ocr</artifactId>
        <version>2.1.5</version>
    </dependency>
</dependencies>

Option 2: Manually Import Spire.OCR.jar.

First, download Spire.OCR for Java from the following link and extract it to a specific directory:

https://www.e-iceblue.com/Download/ocr-for-java.html

Next, in IntelliJ IDEA, go to File > Project Structure > Modules > Dependencies. In the Dependencies pane, click the "+" button and select JARs or Directories. Navigate to the directory where Spire.OCR for Java is located, open the lib folder and select the Spire.OCR.jar file, then click OK to add it as the project’s dependency.
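Whichever option you choose, a quick classpath check can confirm that the JAR is visible to your project. The following is a minimal sketch and not part of Spire.OCR: ClasspathCheck and isOnClasspath are hypothetical names, and the class name com.spire.ocr.OcrScanner is assumed from the import statement used in Step 4.

```java
public class ClasspathCheck {

    // Returns true if a class with the given fully qualified name can be located
    // on the classpath. The class is looked up but not initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name, taken from "import com.spire.ocr.*" and OcrScanner in Step 4
        String ocrClass = "com.spire.ocr.OcrScanner";
        System.out.println(ocrClass + " available: " + isOnClasspath(ocrClass));
    }
}
```

If the check reports that the class is not available, the dependency was most likely not added to the module that contains your code.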


Step 3: Download the New Model of Spire.OCR for Java.

Download the model that matches your operating system from one of the following links.

Windows x64

Linux x64 (CentOS 8, Ubuntu 18, or later is required)

macOS 10.15 and later

Linux aarch

Then extract the package and save it to a specific directory on your computer. In this example, we saved the package to "D:\".
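If you are unsure which package matches your machine, the JVM's os.name and os.arch system properties can help. The sketch below maps them to the download labels listed above. ModelPicker and pickModel are hypothetical names, the matching rules are assumptions, and macOS is matched regardless of CPU architecture because the list above does not specify one.

```java
import java.util.Locale;

public class ModelPicker {

    // Maps os.name / os.arch values to one of the download labels listed in Step 3.
    // Returns "unsupported" when none of the listed packages matches.
    public static String pickModel(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        boolean x64 = arch.equals("amd64") || arch.equals("x86_64");
        boolean arm64 = arch.equals("aarch64") || arch.equals("arm64");

        if (os.startsWith("windows") && x64) return "Windows x64";
        if (os.startsWith("mac")) return "macOS 10.15 and later";
        if (os.startsWith("linux") && x64) return "Linux x64";
        if (os.startsWith("linux") && arm64) return "Linux aarch";
        return "unsupported";
    }

    public static void main(String[] args) {
        String label = pickModel(System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println("Model package to download: " + label);
    }
}
```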


Step 4: Implement Text Extraction from Images Using the New Model of Spire.OCR for Java.

Use the following code to extract text from images with the new OCR model of Spire.OCR for Java:

import com.spire.ocr.*;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class Main {
    public static void main(String[] args) {
        try {
            // Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            // Create an instance of the ConfigureOptions class to set up the scanner configurations
            ConfigureOptions configureOptions = new ConfigureOptions();

            // Set the path to the new model
            configureOptions.setModelPath("D:\\win-x64");

            // Set the language for text recognition. The default is English.
            // Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
            configureOptions.setLanguage("English");

            // Apply the configuration options to the scanner
            scanner.ConfigureDependencies(configureOptions);

            // Extract text from an image
            scanner.scan("Sample.png");

            // Save the extracted text to a text file
            saveTextToFile(scanner, "output.txt");

        } catch (OcrException e) {
            e.printStackTrace();
        }
    }

    private static void saveTextToFile(OcrScanner scanner, String filePath) {
        try {
            String text = scanner.getText().toString();
            try (BufferedWriter writer = new BufferedWriter(new FileWriter(filePath))) {
                writer.write(text);
            }
        } catch (IOException | OcrException e) {
            e.printStackTrace();
        }
    }
}
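One point to keep in mind when saving the result: on Java versions before 18, FileWriter(String) writes with the platform default charset, which may not be able to represent Chinese, Japanese, or Korean text. As a hedged alternative, the sketch below writes the text explicitly as UTF-8 using java.nio. Utf8TextSaver and save are hypothetical names, and the sample string only stands in for the text returned by scanner.getText().toString().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8TextSaver {

    // Writes the text to the given file as UTF-8, replacing any existing content.
    public static void save(String text, Path file) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Placeholder for the recognized text; \u4f60\u597d is a two-character Chinese sample
        String recognized = "Hello OCR / \u4f60\u597d";
        save(recognized, Paths.get("output.txt"));
    }
}
```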

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Java: Convert HTML to Excel

2024-08-30 01:03:24 Written by Koohji

HTML files often contain valuable datasets embedded within tables. However, analyzing this data directly in HTML can be cumbersome and inefficient. Converting HTML tables to Excel format allows you to take advantage of Excel's powerful data manipulation and analysis tools, making it easier to sort, filter, and visualize the information. Whether you need to analyze data for a report, perform calculations, or simply organize it in a more user-friendly format, converting HTML to Excel streamlines the process. In this article, we will demonstrate how to convert HTML files to Excel format in Java using Spire.XLS for Java.

Install Spire.XLS for Java

First, add the Spire.Xls.jar file as a dependency in your Java program. The JAR file can be downloaded from this link. If you use Maven, you can import the JAR file into your application by adding the following code to your project's pom.xml file.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.xls</artifactId>
        <version>16.6.5</version>
    </dependency>
</dependencies>

Convert HTML to Excel in Java

Spire.XLS for Java provides the Workbook.loadFromHtml() method for loading an HTML file. Once the HTML file is loaded, you can convert it to Excel format using the Workbook.saveToFile() method. The detailed steps are as follows.

  • Create an object of the Workbook class.
  • Load an HTML file using the Workbook.loadFromHtml() method.
  • Save the HTML file in Excel format using the Workbook.saveToFile() method.
The following Java code implements these steps:
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;

public class ConvertHtmlToExcel {
    public static void main(String[] args) {
        // Specify the input HTML file path
        String filePath = "C:\\Users\\Administrator\\Desktop\\Sample.html";

        // Create an object of the Workbook class
        Workbook workbook = new Workbook();
        // Load the HTML file
        workbook.loadFromHtml(filePath);

        // Save the HTML file in Excel XLSX format
        String result = "C:\\Users\\Administrator\\Desktop\\ToExcel.xlsx";
        workbook.saveToFile(result, ExcelVersion.Version2013);

        workbook.dispose();
    }
}
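To try the conversion without an existing HTML file, you can first generate a small one that contains a table. The sketch below uses only the Java standard library. SampleHtmlWriter and buildHtmlTable are hypothetical names, and the file name and table data are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SampleHtmlWriter {

    // Builds a minimal HTML document containing one table.
    // The first row is rendered as header cells. Cell values are assumed to be
    // plain text without HTML special characters; they are not escaped here.
    public static String buildHtmlTable(String[][] rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("<html><body><table border=\"1\">\n");
        for (int r = 0; r < rows.length; r++) {
            String tag = (r == 0) ? "th" : "td";
            sb.append("  <tr>");
            for (String cell : rows[r]) {
                sb.append('<').append(tag).append('>')
                  .append(cell)
                  .append("</").append(tag).append('>');
            }
            sb.append("</tr>\n");
        }
        sb.append("</table></body></html>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String[][] rows = {
            {"Product", "Quantity"},
            {"Apples", "12"},
            {"Pears", "7"}
        };
        // Hypothetical output location; adjust it to match the path your conversion code loads
        Path out = Paths.get("Sample.html");
        Files.write(out, buildHtmlTable(rows).getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```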


Insert HTML String into Excel in Java

In addition to converting HTML files to Excel, Spire.XLS for Java allows you to insert HTML strings directly into Excel cells using the CellRange.setHtmlString() method. The detailed steps are as follows.

  • Create an object of the Workbook class.
  • Get a specific worksheet by its index (0-based) using the Workbook.getWorksheets().get(index) method.
  • Get the cell that you want to add an HTML string to using the Worksheet.getCellRange() method.
  • Add an HTML string to the cell using the CellRange.setHtmlString() method.
  • Save the resulting workbook to a new file using the Workbook.saveToFile() method.
The following Java code implements these steps:
import com.spire.xls.CellRange;
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;

public class InsertHtmlStringInExcelCell {
    public static void main(String[] args) {
        // Create an object of the Workbook class
        Workbook workbook = new Workbook();
        // Get the first sheet
        Worksheet sheet = workbook.getWorksheets().get(0);

        // Specify the HTML string
        String htmlCode = "<p><font size='12'>This is a <b>paragraph</b> with <span style='color: red;'>colored text</span>.</font></p>";

        // Get the cell that you want to add the HTML string to
        CellRange range = sheet.getCellRange("A1");
        // Add the HTML string to the cell
        range.setHtmlString(htmlCode);

        // Auto-adjust the width of the first column based on its content
        sheet.autoFitColumn(1);

        // Save the resulting workbook to a new file
        String result = "C:\\Users\\Administrator\\Desktop\\InsertHtmlStringIntoCell.xlsx";
        workbook.saveToFile(result, ExcelVersion.Version2013);

        workbook.dispose();
    }
}
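When part of the HTML string comes from data or user input rather than a fixed literal, characters such as &, <, and > should be escaped before they are embedded in markup. The sketch below shows one way to do this with the standard library only. HtmlStringBuilder, escapeHtml, and boldRedParagraph are hypothetical helper names; the resulting string could then be passed to CellRange.setHtmlString() as in the example above.

```java
public class HtmlStringBuilder {

    // Escapes the characters that have special meaning in HTML text and attribute values.
    public static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps plain text in a bold, red paragraph after escaping it.
    public static String boldRedParagraph(String plainText) {
        return "<p><b><span style='color: red;'>" + escapeHtml(plainText) + "</span></b></p>";
    }

    public static void main(String[] args) {
        // prints <p><b><span style='color: red;'>Profit &amp; loss: 5 &lt; 7</span></b></p>
        System.out.println(boldRedParagraph("Profit & loss: 5 < 7"));
    }
}
```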


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
