Spire.Office Knowledgebase Page 32

Knowledgebase (2329)

Children categories

Spire.OfficeJs (5)

View items...

C#: Convert HTML to PDF using ChromeHtmlConverter

2024-11-05 01:00:28 Written by Koohji

HTML is widely used to present content in web browsers, but preserving its exact layout when sharing or printing can be challenging. PDF, by contrast, is a universally accepted format that reliably maintains document layout across various devices and operating systems. Converting HTML to PDF is particularly useful in web development, especially when creating printable versions of web pages or generating reports from web data.

Spire.PDF for .NET now supports a streamlined method to convert HTML to PDF in C# using the ChromeHtmlConverter class. This tutorial provides step-by-step guidance on performing this conversion effectively.

Convert HTML to PDF with ChromeHtmlConverter in C#
Generate Output Logs During HTML to PDF Conversion in C#

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for.NET package as references in your .NET project. The DLLs files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.PDF

Install Google Chrome

This method requires Google Chrome to perform the conversion. If Chrome is not already installed, you can download it from this link and install it.

Convert HTML to PDF using ChromeHtmlConverter in C#

You can utilize the ChromeHtmlConverter.ConvertToPdf() method to convert an HTML file to a PDF using the Chrome plugin. This method accepts 3 parameters, including the input HTML file path, output PDF file path, and ConvertOptions which allows customization of conversion settings like conversion timeout, PDF paper size and page margins. The detailed steps are as follows.

Create an instance of the ChromeHtmlConverter class and provide the path to the Chrome plugin (chrome.exe) as a parameter in the class constructor.
Create an instance of the ConvertOptions class.
Customize the conversion settings, such as the conversion timeout, the paper size and page margins of the converted PDF through the properties of the ConvertOptions class.
Convert an HTML file to PDF using the ChromeHtmlConverter.ConvertToPdf() method.

using Spire.Additions.Chrome;

namespace ConvertHtmlToPdfUsingChrome
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Specify the input URL and output PDF file path
            string inputUrl = @"https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/C-/VB.NET-Convert-Image-to-PDF.html";            
            string outputFile = @"HtmlToPDF.pdf";

            //Specify the path to the Chrome plugin
            string chromeLocation = @"C:\Program Files\Google\Chrome\Application\chrome.exe";

            //Create an instance of the ChromeHtmlConverter class
            ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);

            // Create an instance of the ConvertOptions class
            ConvertOptions options = new ConvertOptions();
            //Set conversion timeout
            options.Timeout = 10 * 3000;
            //Set paper size and page margins of the converted PDF
            options.PageSettings = new PageSettings()
            {
                PaperWidth = 8.27,
                PaperHeight = 11.69,
                MarginTop = 0,
                MarginLeft = 0,
                MarginRight = 0,
                MarginBottom = 0

            };

            //Convert the URL to PDF
            converter.ConvertToPdf(inputUrl, outputFile, options);
        }
    }
}

The converted PDF file maintains the same appearance as if the HTML file were printed to PDF directly through the Chrome browser:

C#: Convert HTML to PDF using ChromeHtmlConverter

Generate Output Logs During HTML to PDF Conversion in C#

Spire.PDF for .NET enables you to generate output logs during HTML to PDF conversion using the Logger class. The detailed steps are as follows.

Create an instance of the ChromeHtmlConverter class and provide the path to the Chrome plugin (chrome.exe) as a parameter in the class constructor.
Enable Logging by creating a Logger object and assigning it to the ChromeHtmlConverter.Logger property.
Create an instance of the ConvertOptions class.
Customize the conversion settings, such as the conversion timeout, the paper size and page margins of the converted PDF through the properties of the ConvertOptions class.
Convert an HTML file to PDF using the ChromeHtmlConverter.ConvertToPdf() method.

using Spire.Additions.Chrome;

namespace ConvertHtmlToPdfUsingChrome
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Specify the input URL and output PDF file path
            string inputUrl = @"https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/C-/VB.NET-Convert-Image-to-PDF.html";
            string outputFile = @"HtmlToPDF.pdf";

            // Specify the log file path
            string logFilePath = @"Logs.txt";

            //Specify the path to the Chrome plugin
            string chromeLocation = @"C:\Program Files\Google\Chrome\Application\chrome.exe";

            //Create an instance of the ChromeHtmlConverter class
            ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);
            //Enable logging
            converter.Logger = new Logger(logFilePath);

            //Create an instance of the ConvertOptions class
            ConvertOptions options = new ConvertOptions();
            //Set conversion timeout
            options.Timeout = 10 * 3000;
            //Set paper size and page margins of the converted PDF
            options.PageSettings = new PageSettings()
            {
                PaperWidth = 8.27,
                PaperHeight = 11.69,
                MarginTop = 0,
                MarginLeft = 0,
                MarginRight = 0,
                MarginBottom = 0

            };

            //Convert the URL to PDF
            converter.ConvertToPdf(inputUrl, outputFile, options);
        }
    }
}

Here is the screenshot of the output log file:

C#: Convert HTML to PDF using ChromeHtmlConverter

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Conversion

Tagged under

pdf net Conversion

Python: Import and Export PDF Form Data

2024-11-04 01:09:46 Written by Koohji

PDF forms are essential tools for collecting information across various industries. Understanding how to import and export this data in different formats like FDF, XFDF, and XML can greatly enhance your data management processes. For instance, importing form data allows you to update or pre-fill PDF forms with existing information, saving time and increasing accuracy. Conversely, exporting form data enables you to share collected information effortlessly with other applications, facilitating seamless integration and minimizing manual entry errors. In this article, we will introduce how to import and export PDF form data in Python using Spire.PDF for Python.

Import PDF Form Data from FDF, XFDF or XML Files in Python
Export PDF Form Data to FDF, XFDF or XML Files in Python

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

Package Manager

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Import PDF Form Data from FDF, XFDF or XML Files in Python

Spire.PDF for Python offers the PdfFormWidget.ImportData() method for importing PDF form data from FDF, XFDF, or XML files. The detailed steps are as follows.

Create an object of the PdfDocument class.
Load a PDF document using PdfDocument.LoadFromFile() method.
Get the form of the PDF document using PdfDocument.Form property.
Import form data from an FDF, XFDF or XML file using PdfFormWidget.ImportData() method.
Save the resulting document using PdfDocument.SaveToFile() method.

Python

from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Import PDF form data from an XML file
formWidget.ImportData("Data.xml", DataFormat.Xml)

# Import PDF form data from an FDF file
# formWidget.ImportData("Data.fdf", DataFormat.Fdf)

# Import PDF form data from an XFDF file
# formWidget.ImportData("Data.xfdf", DataFormat.XFdf)

# Save the resulting document
pdf.SaveToFile("Output.pdf")
# Close the PdfDocument object
pdf.Close()

Python: Import and Export PDF Form Data

Export PDF Form Data to FDF, XFDF or XML Files in Python

Spire.PDF for Python also enables developers to export PDF form data to FDF, XFDF, or XML files by using the PdfFormWidget.ExportData() method. The detailed steps are as follows.

Create an object of the PdfDocument class.
Load a PDF document using PdfDocument.LoadFromFile() method.
Get the form of the PDF document using PdfDocument.Form property.
Export form data to an FDF, XFDF or XML file using PdfFormWidget.ExportData() method.

Python

from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Export PDF form data to an XML file
formWidget.ExportData("Data.xml", DataFormat.Xml, "Form")

# Export PDF form data to an FDF file
# formWidget.ExportData("Data.fdf", DataFormat.Fdf, "Form")

# Export PDF form data to an XFDF file
# formWidget.ExportData("Data.xfdf", DataFormat.XFdf, "Form")

# Close the PdfDocument object
pdf.Close()

Python: Import and Export PDF Form Data

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Form Field

Tagged under

pdf Python Form Field

C#: Convert Word to Markdown

2024-10-31 06:06:12 Written by Koohji

Markdown, with its lightweight syntax, offers a streamlined approach to web content creation, collaboration, and document sharing, particularly in environments where tools like Git or Markdown-friendly editors are prevalent. By converting Word documents to Markdown files, users can enhance their productivity, facilitate easier version control, and ensure compatibility across different systems and platforms. In this article, we will explore the process of converting Word documents to Markdown files using Spire.Doc for .NET, providing simple C# code examples.

Convert Word to Markdown with C#
Convert Word to Markdown Without Images

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

Package Manager

PM> Install-Package Spire.Doc

Convert Word to Markdown with C#

Using Spire.Doc for .NET, we can convert a Word document to a Markdown file by loading the document using Document.LoadFromFile() method and then convert it to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method. The detailed steps are as follows:

Create an instance of Document class.
Load a Word document using Document.LoadFromFile() method.
Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
Release resources.

using Spire.Doc;

namespace WordToMarkdown
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdown.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Convert Word to Markdown Without Images

When using Spire.Doc for .NET to convert Word documents to Markdown files, images are stored in Base64 encoding by default, which can increase the file size and affect compatibility. To address this, we can remove the images during conversion, thereby reducing the file size and enhancing compatibility.

The following steps outline how to convert Word documents to Markdown files without images:

Create an instance of Document class.
Load a Word document using Document.LoadFromFile() method.
Iterate through the sections and then the paragraphs in the document.
Iterate through the document objects in the paragraphs:
- Get a document object through Paragraph.ChildObjects[] property.
- Check if it’s an instance of DocPicture class. If it is, remove it using Paragraph.ChildObjects.Remove(DocumentObject) method.
Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
Release resources.

using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

namespace WordToMarkdownNoImage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Iterate through the sections in the document
            foreach (Section section in doc.Sections)
            {
                // Iterate through the paragraphs in the sections
                foreach (Paragraph paragraph in section.Paragraphs)
                {
                    // Iterate through the document objects in the paragraphs
                    for (int i = 0; i < paragraph.ChildObjects.Count; i++)
                    {
                        // Get a document object
                        DocumentObject docObj = paragraph.ChildObjects[i];
                        // Check if it is an instance of DocPicture class
                        if (docObj is DocPicture)
                        {
                            // Remove the DocPicture instance
                            paragraph.ChildObjects.Remove(docObj);
                        }
                    }
                }
            }

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdownNoImage.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Published in Conversion

Tagged under

doc net Conversion

News Category

Knowledgebase (2329)

Children categories

Purchase (7)

Licensing (7)

Benchmark (1)

Java (482)

.NET (1323)

Cloud (13)

CPP (76)

Python (359)

AI (4)

JavaScript (51)

Spire.OfficeJs (5)

C#: Convert HTML to PDF using ChromeHtmlConverter

Install Spire.PDF for .NET

Install Google Chrome

Convert HTML to PDF using ChromeHtmlConverter in C#

Generate Output Logs During HTML to PDF Conversion in C#

Apply for a Temporary License

Python: Import and Export PDF Form Data

Install Spire.PDF for Python

Import PDF Form Data from FDF, XFDF or XML Files in Python

Export PDF Form Data to FDF, XFDF or XML Files in Python

Apply for a Temporary License

C#: Convert Word to Markdown

Install Spire.Doc for .NET

Convert Word to Markdown with C#

Convert Word to Markdown Without Images

Apply for a Temporary License

More...

Python: Set, Update, and Get Cell Values in Excel Worksheets

Python: Get or Replace Used Fonts in PDF

Python: Find and Highlight Data in Excel Worksheets

Python: Detect and Remove VBA Macros in Word Documents