Spire.Office Knowledgebase Page 32 | E-iceblue

HTML is widely used to present content in web browsers, but preserving its exact layout when sharing or printing can be challenging. PDF, by contrast, is a universally accepted format that reliably maintains document layout across various devices and operating systems. Converting HTML to PDF is particularly useful in web development, especially when creating printable versions of web pages or generating reports from web data.

Spire.PDF for .NET now supports a streamlined method to convert HTML to PDF in C# using the ChromeHtmlConverter class. This tutorial provides step-by-step guidance on performing this conversion effectively.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for.NET package as references in your .NET project. The DLLs files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Install Google Chrome

This method requires Google Chrome to perform the conversion. If Chrome is not already installed, you can download it from this link and install it.

Convert HTML to PDF using ChromeHtmlConverter in C#

You can utilize the ChromeHtmlConverter.ConvertToPdf() method to convert an HTML file to a PDF using the Chrome plugin. This method accepts 3 parameters, including the input HTML file path, output PDF file path, and ConvertOptions which allows customization of conversion settings like conversion timeout, PDF paper size and page margins. The detailed steps are as follows.

  • Create an instance of the ChromeHtmlConverter class and provide the path to the Chrome plugin (chrome.exe) as a parameter in the class constructor.
  • Create an instance of the ConvertOptions class.
  • Customize the conversion settings, such as the conversion timeout, the paper size and page margins of the converted PDF through the properties of the ConvertOptions class.
  • Convert an HTML file to PDF using the ChromeHtmlConverter.ConvertToPdf() method.
  • C#
using Spire.Additions.Chrome;

namespace ConvertHtmlToPdfUsingChrome
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Specify the input URL and output PDF file path
            string inputUrl = @"https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/C-/VB.NET-Convert-Image-to-PDF.html";            
            string outputFile = @"HtmlToPDF.pdf";

            //Specify the path to the Chrome plugin
            string chromeLocation = @"C:\Program Files\Google\Chrome\Application\chrome.exe";

            //Create an instance of the ChromeHtmlConverter class
            ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);

            // Create an instance of the ConvertOptions class
            ConvertOptions options = new ConvertOptions();
            //Set conversion timeout
            options.Timeout = 10 * 3000;
            //Set paper size and page margins of the converted PDF
            options.PageSettings = new PageSettings()
            {
                PaperWidth = 8.27,
                PaperHeight = 11.69,
                MarginTop = 0,
                MarginLeft = 0,
                MarginRight = 0,
                MarginBottom = 0

            };

            //Convert the URL to PDF
            converter.ConvertToPdf(inputUrl, outputFile, options);
        }
    }
}

The converted PDF file maintains the same appearance as if the HTML file were printed to PDF directly through the Chrome browser:

C#: Convert HTML to PDF using ChromeHtmlConverter

Generate Output Logs During HTML to PDF Conversion in C#

Spire.PDF for .NET enables you to generate output logs during HTML to PDF conversion using the Logger class. The detailed steps are as follows.

  • Create an instance of the ChromeHtmlConverter class and provide the path to the Chrome plugin (chrome.exe) as a parameter in the class constructor.
  • Enable Logging by creating a Logger object and assigning it to the ChromeHtmlConverter.Logger property.
  • Create an instance of the ConvertOptions class.
  • Customize the conversion settings, such as the conversion timeout, the paper size and page margins of the converted PDF through the properties of the ConvertOptions class.
  • Convert an HTML file to PDF using the ChromeHtmlConverter.ConvertToPdf() method.
  • C#
using Spire.Additions.Chrome;

namespace ConvertHtmlToPdfUsingChrome
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Specify the input URL and output PDF file path
            string inputUrl = @"https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/C-/VB.NET-Convert-Image-to-PDF.html";
            string outputFile = @"HtmlToPDF.pdf";

            // Specify the log file path
            string logFilePath = @"Logs.txt";

            //Specify the path to the Chrome plugin
            string chromeLocation = @"C:\Program Files\Google\Chrome\Application\chrome.exe";

            //Create an instance of the ChromeHtmlConverter class
            ChromeHtmlConverter converter = new ChromeHtmlConverter(chromeLocation);
            //Enable logging
            converter.Logger = new Logger(logFilePath);

            //Create an instance of the ConvertOptions class
            ConvertOptions options = new ConvertOptions();
            //Set conversion timeout
            options.Timeout = 10 * 3000;
            //Set paper size and page margins of the converted PDF
            options.PageSettings = new PageSettings()
            {
                PaperWidth = 8.27,
                PaperHeight = 11.69,
                MarginTop = 0,
                MarginLeft = 0,
                MarginRight = 0,
                MarginBottom = 0

            };

            //Convert the URL to PDF
            converter.ConvertToPdf(inputUrl, outputFile, options);
        }
    }
}

Here is the screenshot of the output log file:

C#: Convert HTML to PDF using ChromeHtmlConverter

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Python: Import and Export PDF Form Data

2024-11-04 01:09:46 Written by Koohji

PDF forms are essential tools for collecting information across various industries. Understanding how to import and export this data in different formats like FDF, XFDF, and XML can greatly enhance your data management processes. For instance, importing form data allows you to update or pre-fill PDF forms with existing information, saving time and increasing accuracy. Conversely, exporting form data enables you to share collected information effortlessly with other applications, facilitating seamless integration and minimizing manual entry errors. In this article, we will introduce how to import and export PDF form data in Python using Spire.PDF for Python.

Install Spire.PDF for Python

This scenario requires Spire.PDF for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip command.

pip install Spire.PDF

If you are unsure how to install, please refer to this tutorial: How to Install Spire.PDF for Python on Windows

Import PDF Form Data from FDF, XFDF or XML Files in Python

Spire.PDF for Python offers the PdfFormWidget.ImportData() method for importing PDF form data from FDF, XFDF, or XML files. The detailed steps are as follows.

  • Create an object of the PdfDocument class.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Get the form of the PDF document using PdfDocument.Form property.
  • Import form data from an FDF, XFDF or XML file using PdfFormWidget.ImportData() method.
  • Save the resulting document using PdfDocument.SaveToFile() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Import PDF form data from an XML file
formWidget.ImportData("Data.xml", DataFormat.Xml)

# Import PDF form data from an FDF file
# formWidget.ImportData("Data.fdf", DataFormat.Fdf)

# Import PDF form data from an XFDF file
# formWidget.ImportData("Data.xfdf", DataFormat.XFdf)

# Save the resulting document
pdf.SaveToFile("Output.pdf")
# Close the PdfDocument object
pdf.Close()

Python: Import and Export PDF Form Data

Export PDF Form Data to FDF, XFDF or XML Files in Python

Spire.PDF for Python also enables developers to export PDF form data to FDF, XFDF, or XML files by using the PdfFormWidget.ExportData() method. The detailed steps are as follows.

  • Create an object of the PdfDocument class.
  • Load a PDF document using PdfDocument.LoadFromFile() method.
  • Get the form of the PDF document using PdfDocument.Form property.
  • Export form data to an FDF, XFDF or XML file using PdfFormWidget.ExportData() method.
  • Python
from spire.pdf.common import *
from spire.pdf import *

# Create an object of the PdfDocument class
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("Forms.pdf")

# Get the form of the document
pdfForm = pdf.Form
formWidget = PdfFormWidget(pdfForm)

# Export PDF form data to an XML file
formWidget.ExportData("Data.xml", DataFormat.Xml, "Form")

# Export PDF form data to an FDF file
# formWidget.ExportData("Data.fdf", DataFormat.Fdf, "Form")

# Export PDF form data to an XFDF file
# formWidget.ExportData("Data.xfdf", DataFormat.XFdf, "Form")

# Close the PdfDocument object
pdf.Close()

Python: Import and Export PDF Form Data

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

C#: Convert Word to Markdown

2024-10-31 06:06:12 Written by Koohji

Markdown, with its lightweight syntax, offers a streamlined approach to web content creation, collaboration, and document sharing, particularly in environments where tools like Git or Markdown-friendly editors are prevalent. By converting Word documents to Markdown files, users can enhance their productivity, facilitate easier version control, and ensure compatibility across different systems and platforms. In this article, we will explore the process of converting Word documents to Markdown files using Spire.Doc for .NET, providing simple C# code examples.

Install Spire.Doc for .NET

To begin with, you need to add the DLL files included in the Spire.Doc for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.Doc

Convert Word to Markdown with C#

Using Spire.Doc for .NET, we can convert a Word document to a Markdown file by loading the document using Document.LoadFromFile() method and then convert it to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method. The detailed steps are as follows:

  • Create an instance of Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
  • Release resources.
  • C#
using Spire.Doc;

namespace WordToMarkdown
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdown.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Convert Word to Markdown Without Images

When using Spire.Doc for .NET to convert Word documents to Markdown files, images are stored in Base64 encoding by default, which can increase the file size and affect compatibility. To address this, we can remove the images during conversion, thereby reducing the file size and enhancing compatibility.

The following steps outline how to convert Word documents to Markdown files without images:

  • Create an instance of Document class.
  • Load a Word document using Document.LoadFromFile() method.
  • Iterate through the sections and then the paragraphs in the document.
  • Iterate through the document objects in the paragraphs:
    • Get a document object through Paragraph.ChildObjects[] property.
    • Check if it’s an instance of DocPicture class. If it is, remove it using Paragraph.ChildObjects.Remove(DocumentObject) method.
  • Convert the document to a Markdown file using Document.SaveToFile(filename: String, FileFormat.Markdown) method.
  • Release resources.
  • C#
using Spire.Doc;
using Spire.Doc.Documents;
using Spire.Doc.Fields;

namespace WordToMarkdownNoImage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an instance of Document class
            Document doc = new Document();

            // Load a Word document
            doc.LoadFromFile("Sample.docx");

            // Iterate through the sections in the document
            foreach (Section section in doc.Sections)
            {
                // Iterate through the paragraphs in the sections
                foreach (Paragraph paragraph in section.Paragraphs)
                {
                    // Iterate through the document objects in the paragraphs
                    for (int i = 0; i < paragraph.ChildObjects.Count; i++)
                    {
                        // Get a document object
                        DocumentObject docObj = paragraph.ChildObjects[i];
                        // Check if it is an instance of DocPicture class
                        if (docObj is DocPicture)
                        {
                            // Remove the DocPicture instance
                            paragraph.ChildObjects.Remove(docObj);
                        }
                    }
                }
            }

            // Convert the document to a Markdown file
            doc.SaveToFile("output/WordToMarkdownNoImage.md", FileFormat.Markdown);
            doc.Dispose();
        }
    }
}

C#: Convert Word to Markdown

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

page 32