Spire.Office Knowledgebase Page 26 | E-iceblue

In Excel file processing, using byte streams in Python to create, read, and modify Excel files enables efficient data manipulation and automation. This approach eliminates reliance on physical storage or local filesystems, making it ideal for cloud-based or memory-constrained environments. It also supports real-time data exchange, system integration, and instant feedback in web applications, promoting rapid development and adaptable workflows. In this article, we will explore how to use Spire.XLS for Python to dynamically process Excel workbooks by byte streams with simple Python code.

Install Spire.XLS for Python

This scenario requires Spire.XLS for Python and plum-dispatch v1.7.4. They can be easily installed in your Windows through the following pip commands.

pip install Spire.XLS

If you are unsure how to install, please refer to: How to Install Spire.XLS for Python on Windows

Create Excel Files and Save as Byte Streams in Python

With Spire.XLS for Python, we can create an Excel workbook by initializing a Workbook instance and populating it with data. Once the workbook is ready, we can save it to a Stream object and convert that stream into a bytes object for further use or storage. This method allows us to efficiently generate Excel files in memory without the need for disk storage.

Below are the steps for creating an Excel file and saving it as a byte stream with Python:

  • Create an instance of the Workbook class to initialize a new Excel workbook. The new workbook includes three default worksheets.
  • Retrieve a worksheet using the Workbook.Worksheets.get_Item() method.
  • Create a data list or obtain it from another source.
  • Iterate through rows and columns to populate the worksheet with data using the Worksheet.Range.get_Item().Value or NumberValue properties.
  • Format cells using the properties available in CellRange.Style.
  • Create a Stream object and save the workbook to it using the Workbook.SaveToStream() method.
  • Convert the stream to a bytes object using the Stream.ToArray() method.
  • Python
from spire.xls import Workbook, FileFormat, Stream, Color, HorizontalAlignType

# Create an instance of Workbook class
workbook = Workbook()

# Get the first worksheet
sheet = workbook.Worksheets.get_Item(0)

# Create a 2D list of data or read data from other sources
data = [
    ["Country", "Capital", "Population (Millions)", "Area (km²)", "Continent"],
    ["United States", "Washington, D.C.", 331, 9833520, "North America"],
    ["Canada", "Ottawa", 38, 9984670, "North America"],
    ["Brazil", "Brasília", 213, 8515767, "South America"],
    ["United Kingdom", "London", 68, 243610, "Europe"],
    ["Germany", "Berlin", 83, 357022, "Europe"],
    ["India", "New Delhi", 1391, 3287263, "Asia"],
    ["China", "Beijing", 1441, 9596961, "Asia"],
    ["Australia", "Canberra", 26, 7692024, "Oceania"],
    ["South Africa", "Pretoria", 60, 1219090, "Africa"],
    ["Japan", "Tokyo", 126, 377975, "Asia"]
]

# Insert the data into the worksheet
for i, row in enumerate(data):
    for j, value in enumerate(row):
        if isinstance(value, str):
            sheet.Range.get_Item(i + 1, j + 1).Value = value
        else:
            sheet.Range.get_Item(i + 1, j + 1).NumberValue = value

# Format the header row with new colors
headerRow = sheet.AllocatedRange.Rows.get_Item(0)
headerRow.Style.Color = Color.FromRgb(0, 102, 204)  # Blue color for the header
headerRow.Style.Font.FontName = "Calibri"
headerRow.Style.Font.Size = 14
headerRow.Style.Font.IsBold = True
headerRow.Style.Font.Color = Color.FromRgb(255, 255, 255)  # White text
headerRow.Style.HorizontalAlignment = HorizontalAlignType.Center

# Format the data rows with new alternating colors
for i in range(1, sheet.AllocatedRange.Rows.Count):
    row = sheet.AllocatedRange.Rows.get_Item(i)
    row.Style.Font.FontName = "Times New Roman"
    row.Style.Font.Size = 12
    row.Style.HorizontalAlignment = HorizontalAlignType.Left
    if i % 2 == 0:
        row.Style.Color = Color.FromRgb(229, 243, 255)  # Light blue for even rows
    else:
        row.Style.Color = Color.FromRgb(255, 255, 204)  # Light yellow for odd rows

# Auto-fit the columns
for i in range(sheet.AllocatedRange.Columns.Count):
    sheet.AutoFitColumn(i + 1)

# Create a Stream object
stream = Stream()

# Save the workbook to the stream
workbook.SaveToStream(stream, FileFormat.Version2016)
workbook.Dispose()

# Convert the stream to bytes
bytes_data = stream.ToArray()

# Write the bytes to a file or use them as needed
with open("output/CreateExcelByStream.xlsx", "wb") as file:
    file.write(bytes_data)

Excel Files Created with Python Through Byte Streams

Read Excel Files from Byte Streams in Python

To load an Excel workbook from a byte stream, we can convert the byte data into a Stream object and load it into a Workbook instance. Then, we can then access the worksheet data to extract and utilize the data within the Python application seamlessly.

The steps for reading Excel files from byte streams using Python are as follows:

  • Create or convert to a bytes object for the Excel file, or use an existing one.
  • Create a Stream object from the bytes.
  • Instantiate the Workbook class and load the Excel file from the Stream object using the Workbook.LoadFromStream() method.
  • Retrieve a worksheet using the Workbook.Worksheets.get_Item() method.
  • Iterate through rows and columns to access cell values using the Worksheet.AllocatedRange.get_Item().Value property.
  • Output the values or utilize them as needed.
  • Python
from spire.xls import Workbook, Stream

# Create a bytes object or use an existing one
with open("output/CreateExcelByStream.xlsx", "rb") as file:
    bytes_data = file.read()

# Create an instance of the Workbook class
workbook = Workbook()

# Load the Excel file from the byte stream
workbook.LoadFromStream(Stream(bytes_data))

# Get the first worksheet
sheet = workbook.Worksheets.get_Item(0)

# Read data from the worksheet
# Create a list to store the data
data = []
for i in range(sheet.AllocatedRange.Rows.Count):
    # Retrieve a row of data
    row = sheet.AllocatedRange.Rows.get_Item(i)
    # Create a list to store the row's data
    row_data = []
    for j in range(row.Cells.Count):
        # Get the value of the cell
        cellValue = sheet.AllocatedRange.get_Item(i + 1, j + 1).Value
        row_data.append(cellValue)
    data.append(row_data)

# Display the data or use it as needed
for row in data:
    print(row)

# Release resources
workbook.Dispose()

Data Read from Excel Streams with Spire.XLS

Modify Excel Files from Byte Streams in Python

Modifying Excel files from byte streams enables us to update or enhance data dynamically without saving it to disk. This method involves loading the byte stream into a Workbook instance, making changes to its content or formatting, and saving the changes back to a byte stream for reuse.

The following steps show how to modify an Excel workbook from a byte stream using Python:

  • Create or convert to a bytes object of the Excel file, or use an existing one.
  • Initialize a Stream object from the bytes and load it into a Workbook using the Workbook.LoadFromStream() method.
  • Access a worksheet using the Workbook.Worksheets.get_Item() method.
  • Modify cell values with the Worksheet.AllocatedRange.get_Item().Value property.
  • Format cells using properties in CellRange.Style and add borders with the CellRange.BorderAround() method or the CellRange.BorderInside() method.
  • Auto-fit column widths using the Worksheet.AutoFitColumn() method.
  • Save the workbook to a new Stream object using the Workbook.SaveToStream() method and convert it back to bytes or bytearray using Stream.ToArray() method.
  • Python
from spire.xls import Workbook, Stream, HorizontalAlignType, Color, FileFormat

# Create a bytes object or use an existing one
with open("output/CreateExcelByStream.xlsx", "rb") as file:
    bytes_data = file.read()

# Create an instance of the Workbook class
workbook = Workbook()

# Load the Excel file from the byte stream
stream = Stream(bytes_data)
workbook.LoadFromStream(stream)
stream.Close()

# Remove unnecessary worksheets (commented out in this case)
#for i in range(1, workbook.Worksheets.Count):
#    workbook.Worksheets.RemoveAt(i)

# Get the first worksheet
sheet = workbook.Worksheets.get_Item(0)

# Modify the style of the header row
headerRow = sheet.AllocatedRange.Rows.get_Item(0)
headerRow.Style.Font.Bold = False
headerRow.Style.Font.FontName = "Arial"
headerRow.Style.Font.Size = 12
headerRow.Style.HorizontalAlignment = HorizontalAlignType.Left
headerRow.Style.Color = Color.FromRgb(173, 216, 230) # Light blue background color
# Add outline borders for the header row
headerRow.BorderAround()

# Modify the style of the data rows
for i in range(1, sheet.AllocatedRange.Rows.Count):
    row = sheet.AllocatedRange.Rows.get_Item(i)
    row.Style.Font.FontName = "Consolas"
    row.Style.Font.Size = 11
    if i % 2 == 0:
        row.Style.Color = Color.FromRgb(240, 240, 240)  # Light gray background color for even rows
    else:
        row.Style.Color = Color.FromRgb(255, 255, 255)  # White background color for odd rows

# Auto-adjust the column widths
for i in range(sheet.AllocatedRange.Columns.Count):
    sheet.AutoFitColumn(i + 1)

# Save the modified Excel file
streamTemp = Stream()
workbook.SaveToStream(streamTemp, FileFormat.Version2016)
workbook.Dispose()

# Convert the stream to bytes
bytes_data = streamTemp.ToArray()

# Write the bytes to a file or use them as needed
with open("output/ModifiedExcel.xlsx", "wb") as file:
    file.write(bytes_data)

Modify Excel Worksheets with Python in Streams

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Spire.OCR for Python is a robust and professional Optical Character Recognition (OCR) library that enables developers to extract text from images in various formats, including JPG, PNG, GIF, BMP, and TIFF. It provides an intuitive and straightforward solution for integrating OCR capabilities into Python applications, allowing users to extract text from popular image formats with just a few lines of code.

The library supports text recognition in commonly used fonts like Arial, Times New Roman, Courier New, Verdana, Tahoma, and Calibri, with regular, bold, and italic text styles. It also enables developers to recognize text in multiple languages, including English, Chinese, French, German, Japanese, and Korean, making it a versatile tool for global applications.

HTML, the backbone of web development, is widely used to build and present content on the web. While HTML is great for creating dynamic and interactive web pages, it is not well suited for creating professional-looking documents. When faced with such requirements, converting HTML to Word format is an ideal solution.

By implementing the Html to Word conversion, you can preserve the structure and content of the HTML while applying appropriate formatting and styles in Word to ensure the document look more professional. In this article, you will learn how to convert HTML to Word in React using Spire.Doc for JavaScript.

Install Spire.Doc for JavaScript

To get started with converting Word documents to PDF in a React application, you can either download Spire.Doc for JavaScript from our website or install it via npm with the following command:

npm i spire.doc

Make sure to copy all the dependencies to the public folder of your project. Additionally, include the required font files to ensure accurate and consistent text rendering.

For more details, refer to the documentation: How to Integrate Spire.Doc for JavaScript in a React Project

Convert an HTML File to Word with JavaScript in React

With Spire.Doc for JavaScript, you can simply load an HTML file and then save it as a Word Doc or Docx format through the Document.SaveToFile() function. The following are the main steps to convert an HTML file to Word in JavaScript.

  • Load the font file to ensure correct text rendering.
  • Create a new document using the wasmModule.Document.Create() function.
  • Load the HTML file using the Document.LoadFromFile() function.
  • Save the HTML file to a Word file using the Document.SaveToFile() function.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spiredoc from the global window object
        const { Module, spiredoc } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spiredoc);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Doc.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert HTML file to Word
  const HtmlToWord = async () => {
    if (wasmModule) {

      // Load the font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('CALIBRI.ttf','/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Specify the input and output file paths
      const inputFileName = 'sample1.html';
      const outputFileName = 'HtmlToWord.docx';

      // Fetch the input file and add it to the VFS
      await wasmModule.FetchFileToVFS(inputFileName,'', `${process.env.PUBLIC_URL}/`);

      // Create a new document
      const doc = wasmModule.Document.Create();

      // Load the HTML file
      doc.LoadFromFile({fileName: inputFileName,fileFormat: wasmModule.FileFormat.Html,validationType: wasmModule.XHTMLValidationType.None});

      // Save the HTML file to a Word file
      doc.SaveToFile({fileName: outputFileName, fileFormat: wasmModule.FileFormat.Docx});

      // Read the generated Word file from VFS
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blog object from the Word file
      const modifiedFile = new Blob([modifiedFileArray], {type: "application/vnd.openxmlformats-officedocument.wordprocessingml.document"});

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement("a");
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 

      // Clean up resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert HTML File to Word Using JavaScript in React</h1>
      <button onClick={HtmlToWord} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Run the code to launch the React app at localhost:3000. Once it's running, click on the "Convert" button to download the Word file generated from an HTML file:

Run the React app at localhost:3000

Below is the input HTML file and the converted Word file:

Convert an Html file to a Word document with JavaScript in React

Convert an HTML String to Word with JavaScript in React

You can also convert an HTML string to Word by calling the Paragraph.AppendHTML() function to add the HTML string to a paragraph in Word and then save the Word document. The following are the main steps to convert an HTML string to a Word file in JavaScript.

  • Load the font file to ensure correct text rendering.
  • Specify the HTML string
  • Create a new document using the wasmModule.Document.Create() function.
  • Add a new section using the Document.AddSection() function.
  • Add a paragraph to the section using the Section.AddParagraph() function.
  • Append the HTML string to the paragraph using the Paragraph.AppendHTML() function.
  • Save the Word document using the Document.SaveToFile() function.
  • JavaScript
import React, { useState, useEffect } from 'react';

function App() {

  // State to hold the loaded WASM module
  const [wasmModule, setWasmModule] = useState(null);

  // useEffect hook to load the WASM module when the component mounts
  useEffect(() => {
    const loadWasm = async () => {
      try {

        // Access the Module and spiredoc from the global window object
        const { Module, spiredoc } = window;

        // Set the wasmModule state when the runtime is initialized
        Module.onRuntimeInitialized = () => {
          setWasmModule(spiredoc);
        };
      } catch (err) {

        // Log any errors that occur during loading
        console.error('Failed to load WASM module:', err);
      }
    };

    // Create a script element to load the WASM JavaScript file
    const script = document.createElement('script');
    script.src = `${process.env.PUBLIC_URL}/Spire.Doc.Base.js`;
    script.onload = loadWasm;

    // Append the script to the document body
    document.body.appendChild(script);

    // Cleanup function to remove the script when the component unmounts
    return () => {
      document.body.removeChild(script);
    };
  }, []); 

  // Function to convert HTML string to Word
  const HtmlStringToWord = async () => {
    if (wasmModule) {

      // Load the font file into the virtual file system (VFS)
      await wasmModule.FetchFileToVFS('CALIBRI.ttf','/Library/Fonts/', `${process.env.PUBLIC_URL}/`);

      // Specify the output file path
      const outputFileName = 'HtmlStringToWord.docx';

      // Specify the HTML string
      let HTML = "<html><head><title>HTML to Word Example</title><style>, body {font-family: 'Calibri';}, h1 {color: #FF5733; font-size: 24px; margin-bottom: 20px;}, p {color: #333333; font-size: 16px; margin-bottom: 10px;}";
      HTML+="ul {list-style-type: disc; margin-left: 20px; margin-bottom: 15px;}, li {font-size: 14px; margin-bottom: 5px;}, table {border-collapse: collapse; width: 100%; margin-bottom: 20px;}";
      HTML+= "th, td {border: 1px solid #CCCCCC; padding: 8px; text-align: left;}, th {background-color: #F2F2F2; font-weight: bold;}, td {color: #0000FF;}</style></head>";
      HTML+="<body><h1>This is a Heading</h1><p>This is a paragraph demonstrating the conversion of HTML to Word document.</p><p>Here's an example of an unordered list:</p><ul><li>Item 1</li><li>Item 2</li><li>Item 3</li></ul>";
      HTML+="<p>Here's a table:</p><table><tr><th>Product</th><th>Quantity</th><th>Price</th></tr><tr><td>Jacket</td><td>30</td><td>$150</td></tr><tr><td>Sweater</td><td>25</td><td>$99</td></tr></table></body></html>";

      // Create a new document
      const doc = wasmModule.Document.Create();
      
      // Add a section to the document
      let section = doc.AddSection();

      // Add a paragraph to the section
      let paragraph = section.AddParagraph();
    
      // Append the HTML string to the paragraph
      paragraph.AppendHTML(HTML.toString('utf8',0,HTML.length));
      
      // Save the Word file
      doc.SaveToFile({fileName: outputFileName,fileFormat: wasmModule.FileFormat.Docx2016});
    
      // Read the generated Word file from VFS
      const modifiedFileArray = wasmModule.FS.readFile(outputFileName);

      // Create a Blog object from the Word file
      const modifiedFile = new Blob([modifiedFileArray], {type: "application/vnd.openxmlformats-officedocument.wordprocessingml.document"});

      // Create a URL for the Blob
      const url = URL.createObjectURL(modifiedFile);

      // Create an anchor element to trigger the download
      const a = document.createElement("a");
      a.href = url;
      a.download = outputFileName;
      document.body.appendChild(a);
      a.click(); 
      document.body.removeChild(a); 
      URL.revokeObjectURL(url); 

      // Clean up resources
      doc.Dispose();
    }
  };

  return (
    <div style={{ textAlign: 'center', height: '300px' }}>
      <h1>Convert HTML String to Word Using JavaScript in React</h1>
      <button onClick={HtmlStringToWord} disabled={!wasmModule}>
        Convert
      </button>
    </div>
  );
}

export default App;

Convert an HTML string to a Word document with JavaScript in React

Get a Free License

To fully experience the capabilities of Spire.Doc for JavaScript without any evaluation limitations, you can request a free 30-day trial license.

page 26