Spire.Office Knowledgebase Page 46 | E-iceblue

Various written documents, such as academic papers, reports, and legal materials, often have specific formatting guidelines that encompass word count, page count, and other essential metrics. Accurately measuring these elements is crucial as it ensures that your document adheres to the required standards and meets the expected quality benchmarks. In this article, we will explain how to count words, pages, characters, paragraphs, and lines in a Word document in Python using Spire.Doc for Python.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python. It can be installed on Windows with the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Count Words, Pages, Characters, Paragraphs, and Lines in a Word Document in Python

Spire.Doc for Python offers the BuiltinDocumentProperties class, which exposes a Word document's built-in properties, including the number of words, pages, characters, paragraphs, and lines in the document.

The steps below explain how to get the number of words, pages, characters, paragraphs, and lines in a Word document in Python using Spire.Doc for Python:

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get the BuiltinDocumentProperties object using the Document.BuiltinDocumentProperties property.
  • Get the number of words, characters, paragraphs, lines, and pages in the document using the WordCount, CharCount, ParagraphCount, LinesCount, PageCount properties of the BuiltinDocumentProperties class, and append the result to a list.
  • Write the content of the list into a text file.

Python:
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")

# Create a list
sb = []

# Get the built-in properties of the document
properties = doc.BuiltinDocumentProperties

# Get the number of words, characters, paragraphs, lines, and pages and append the result to the list
sb.append("The number of words: " + str(properties.WordCount))
sb.append("The number of characters: " + str(properties.CharCount))
sb.append("The number of paragraphs: " + str(properties.ParagraphCount))
sb.append("The number of lines: " + str(properties.LinesCount))
sb.append("The number of pages: " + str(properties.PageCount))

# Save the data in the list to a text file
with open("result.txt", "w") as file:
    file.write("\n".join(sb))

doc.Close()
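
The report-building step above can be factored into a small helper. The following is a minimal sketch, not part of Spire.Doc: build_count_report is a hypothetical name, and it only assumes that the object passed in exposes the WordCount, CharCount, ParagraphCount, LinesCount, and PageCount attributes named above. A stand-in object with made-up numbers is used so the sketch runs without a document.

```python
from types import SimpleNamespace

# Labels paired with the BuiltinDocumentProperties attribute names used in this article
COUNT_FIELDS = [
    ("words", "WordCount"),
    ("characters", "CharCount"),
    ("paragraphs", "ParagraphCount"),
    ("lines", "LinesCount"),
    ("pages", "PageCount"),
]

def build_count_report(properties):
    """Return one 'The number of X: N' line per count (hypothetical helper)."""
    return [
        f"The number of {label}: {getattr(properties, attr)}"
        for label, attr in COUNT_FIELDS
    ]

# Stand-in with made-up numbers, used here in place of doc.BuiltinDocumentProperties
sample = SimpleNamespace(WordCount=120, CharCount=640, ParagraphCount=8,
                         LinesCount=15, PageCount=1)
report = build_count_report(sample)
print("\n".join(report))
```

With a loaded document, the same helper would presumably be called as build_count_report(doc.BuiltinDocumentProperties).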

Count Words and Characters in a Specific Paragraph of a Word Document in Python

In addition to retrieving the overall word count, page count, and other metrics for an entire Word document, you are also able to get the word count and character count for a specific paragraph by using the Paragraph.WordCount and Paragraph.CharCount properties.

The steps below explain how to get the number of words and characters of a paragraph in a Word document in Python using Spire.Doc for Python:

  • Create an object of the Document class.
  • Load a Word document using the Document.LoadFromFile() method.
  • Get a specific paragraph using the Document.Sections[sectionIndex].Paragraphs[paragraphIndex] property.
  • Get the number of words and characters in the paragraph using the Paragraph.WordCount and Paragraph.CharCount properties, and append the result to a list.
  • Write the content of the list into a text file.

Python:
from spire.doc import *
from spire.doc.common import *

# Create an object of the Document class
doc = Document()
# Load a Word document
doc.LoadFromFile("Input.docx")

# Get a specific paragraph
paragraph = doc.Sections.get_Item(0).Paragraphs.get_Item(0)

# Create a list
sb = []

# Get the number of words and characters in the paragraph and append the result to the list
sb.append("The number of words: " + str(paragraph.WordCount))
sb.append("The number of characters: " + str(paragraph.CharCount))

# Save the data in the list to a text file
with open("result.txt", "w") as file:
    file.write("\n".join(sb))

doc.Close()
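
To get the counts for every paragraph rather than a single one, the same access pattern can be wrapped in a loop. This is a sketch under assumptions: paragraph_counts is a hypothetical helper, and it assumes the Sections and Paragraphs collections expose a Count property alongside the get_Item() accessor used above, which is not confirmed in this article.

```python
def paragraph_counts(doc):
    """Collect (section index, paragraph index, words, characters) per paragraph.

    Assumes `doc` follows the access pattern shown above:
    Sections.get_Item(i) / Paragraphs.get_Item(j), plus an assumed Count property.
    """
    rows = []
    for i in range(doc.Sections.Count):
        section = doc.Sections.get_Item(i)
        for j in range(section.Paragraphs.Count):
            paragraph = section.Paragraphs.get_Item(j)
            rows.append((i, j, paragraph.WordCount, paragraph.CharCount))
    return rows
```

Each returned tuple can then be formatted and written to the text file in the same way as the single-paragraph result.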

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

When dealing with a large volume of customized documents such as contracts, reports, or personal letters, the variable feature in Word documents becomes crucial. Variables allow you to store and reuse information like dates, names, or product details, making the documents more personalized and dynamic. This article will delve into how to use Spire.Doc for Python to insert, count, retrieve, and delete variables in Word documents, enhancing the efficiency and flexibility of document management.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. They can be installed on Windows with the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Doc for Python on Windows

Add Variables into Word Documents with Python

The way Word variables work is based on the concept of "fields". When you insert a variable into a Word document, what you're actually doing is inserting a field, which points to a value stored either in the document properties or an external data source. Upon updating the fields, Word recalculates them to display the most current information.
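
The relationship between stored variables and the fields that display them can be pictured with a toy model. The sketch below is plain Python and does not use the Spire.Doc or Word object model; the dictionary layout, the update_fields name, and the fallback text for a missing variable are all made up for illustration.

```python
# Toy model only: plain Python, not the Spire.Doc or Word object model
variables = {"CompanyName": "E-ICEBLUE"}

# Each field stores the variable name it points to plus its last displayed result
fields = [{"variable": "CompanyName", "result": ""}]

def update_fields(fields, variables):
    """Recalculate every field from the current variable values."""
    for field in fields:
        # The fallback text for a missing variable is invented for this sketch
        field["result"] = variables.get(field["variable"], "<missing variable>")

update_fields(fields, variables)
print(fields[0]["result"])  # E-ICEBLUE

# Changing the stored value alone does not change what the field displays
variables["CompanyName"] = "Example Corp"
print(fields[0]["result"])  # still E-ICEBLUE

# The display catches up only once the fields are updated again
update_fields(fields, variables)
print(fields[0]["result"])  # Example Corp
```

This is why the examples in this article update the fields before saving: a field shows the current value only after it has been recalculated.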

Spire.Doc for Python offers the VariableCollection.Add(name, value) method to insert variables into Word documents. Here are the detailed steps:

  • Create a Document object.
  • Call the Document.AddSection() method to create a new section.
  • Call the Section.AddParagraph() method to create a new paragraph.
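  • Call the Paragraph.AppendField(fieldName, fieldType) method to add a variable field (FieldDocVariable) within the paragraph.
  • Call the Document.Variables.Add(name, value) method to add the variable and its value to the document's variable collection.
  • Set Document.IsUpdateFields to True to update the fields.
  • Save the document using the Document.SaveToFile() method.

Python: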
from spire.doc import *

# Create a Document object
document = Document()

# Add a new section to the document
section = document.AddSection()

# Add a new paragraph within the newly created section
paragraph = section.AddParagraph()

# Append a FieldDocVariable type field named "CompanyName" to the paragraph
paragraph.AppendField("CompanyName", FieldType.FieldDocVariable)

# Add the variable to the document's variable collection
document.Variables.Add("CompanyName", "E-ICEBLUE")

# Update fields
document.IsUpdateFields = True

# Save the document to a specified path
document.SaveToFile("AddVariable.docx", FileFormat.Docx2016)

# Dispose the document
document.Dispose()

Count the Number of Variables in a Word Document with Python

Here are the detailed steps to use the Document.Variables.Count property to get the number of variables:

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Use the Document.Variables.Count property to obtain the number of variables.
  • Print the count to the console.

Python:
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Get the count of variables in the document
count = document.Variables.Count

# Print to console
print(f"The count of variables:{count}")

Retrieve Variables from a Word Document with Python

Spire.Doc for Python provides the GetNameByIndex(int index) and GetValueByIndex(int index) methods to retrieve variable names and values by their indices. Below are the detailed steps:

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Call the Document.Variables.GetNameByIndex(index) method to obtain the variable name.
  • Call the Document.Variables.GetValueByIndex(index) method to obtain the variable value.
  • Call the Document.Variables.get_Item(name) method to obtain the variable value through the variable name.
  • Print the variable name and values to the console.

Python:
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Obtain variable name based on index 0
name = document.Variables.GetNameByIndex(0)

# Obtain variable value based on index 0
value = document.Variables.GetValueByIndex(0)

# Obtain variable value through the variable name
value1 = document.Variables.get_Item("CompanyName")

# Print to console
print("Variable Name:", name)
print("Variable Value:", value)
print("Variable Value (by name):", value1)
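
When a document holds more than one variable, the index-based methods can be combined with the Count property from the previous section to read them all. This is a minimal sketch: variables_to_dict is a hypothetical helper, and it assumes indices run from 0 to Count - 1, which the index-0 example above suggests but does not state.

```python
def variables_to_dict(variables):
    """Return {name: value} for every entry in a variable collection.

    Relies only on the members named in this article:
    Count, GetNameByIndex(index), and GetValueByIndex(index).
    """
    result = {}
    for i in range(variables.Count):
        result[variables.GetNameByIndex(i)] = variables.GetValueByIndex(i)
    return result
```

With a loaded document, this would presumably be called as variables_to_dict(document.Variables).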

Delete Variables from a Word Document with Python

The VariableCollection.Remove(name) method can be used to delete a specified variable from the document, with the parameter being the name of the variable.

  • Create a Document object.
  • Call the Document.LoadFromFile() method to load the document that contains the variables.
  • Call the Document.Variables.Remove(name) method to remove the variable.
  • Set Document.IsUpdateFields to True to update the fields.
  • Save the document using the Document.SaveToFile() method.

Python:
from spire.doc import *

# Create a Document object
document = Document()

# Load an existing document
document.LoadFromFile("AddVariable.docx")

# Remove the variable named "CompanyName"
document.Variables.Remove("CompanyName")

# Update fields
document.IsUpdateFields = True

# Save the document
document.SaveToFile("RemoveVariable.docx", FileFormat.Docx2016)

# Dispose the document
document.Dispose()
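
This article does not say how Remove(name) behaves when no variable with that name exists, so a cautious script can check for the name first. The following is a sketch under that assumption: remove_variable_if_present is a hypothetical helper built only from Count, GetNameByIndex(index), and Remove(name).

```python
def remove_variable_if_present(variables, name):
    """Remove `name` from the collection if it exists; return True when removed."""
    # Look the name up first, since the behavior of Remove() for a
    # missing name is not documented in this article
    names = [variables.GetNameByIndex(i) for i in range(variables.Count)]
    if name not in names:
        return False
    variables.Remove(name)
    return True
```

With a loaded document, this would presumably be called as remove_variable_if_present(document.Variables, "CompanyName") before updating the fields and saving.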

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Spire.Doc for Python is a robust library that enables you to read and write Microsoft Word documents using Python. With Spire.Doc, you can create, read, edit, and convert both DOC and DOCX file formats without requiring Microsoft Word to be installed on your system.

This article demonstrates how to install Spire.Doc for Python on Mac.

Step 1

Download the most recent version of Python for macOS and install it on your Mac. If you have already completed this step, proceed directly to step 2.

Step 2

Open VS Code and search for 'Python' in the Extensions panel. Click 'Install' to add support for Python in your VS Code.

Step 3

Click 'Explorer' > 'NO FOLDER OPENED' > 'Open Folder'.

Choose an existing folder as the workspace, or you can create a new folder and then open it.

Add a .py file to the folder you just opened and name it whatever you want (in this case, HelloWorld.py).

Step 4

Use the keyboard shortcut Ctrl + ` (backtick) to open the Terminal. Then, install Spire.Doc for Python by entering the following command in the terminal.

pip3 install spire.doc

Note that pip3 explicitly invokes the pip that belongs to a Python 3 installation, whereas pip runs whichever pip comes first on your PATH, which on some systems belongs to a different Python version. If pip on your system already points to Python 3, you can use the pip command instead.

Step 5

Open a Terminal window on your Mac, and type the following command to check which pip version is in use and which Python installation it belongs to.

python3 -m pip --version
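
A similar check can be made from inside Python, which helps confirm that the interpreter running your script is the one the package was installed into. This is a minimal sketch using only the standard library: is_importable is a hypothetical helper, and "spire" is assumed to be the top-level package name, based on the `from spire.doc import *` lines used in this article.

```python
import importlib.util
import sys

def is_importable(top_level_package):
    """Return True if the interpreter running this script can locate the package."""
    return importlib.util.find_spec(top_level_package) is not None

# Path of the interpreter that is executing this script
print("Interpreter:", sys.executable)

# "spire" is an assumed top-level package name (see the note above)
print("spire importable:", is_importable("spire"))
```

If the second line prints False, the package was probably installed for a different Python installation than the one selected in VS Code.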

Step 6

Add the following code snippet to the 'HelloWorld.py' file.

Python:
from spire.doc.common import *
from spire.doc import *

document = Document()
section = document.AddSection()
paragraph = section.AddParagraph()
paragraph.AppendText("Hello World")
document.SaveToFile("HelloWorld.docx", FileFormat.Docx2019)
document.Dispose()

After executing the Python file, you will find the resulting Word document in the 'EXPLORER' panel.
