Spire.Office Knowledgebase Page 48 | E-iceblue

Creating a table of contents in a Word document significantly enhances its navigability and readability. It serves as a road map for the document, enabling readers to quickly overview the structure and grasp the content framework. This feature facilitates easy navigation for users to jump to any section within the document, which is particularly valuable for lengthy reports, papers, or manuals. It not only saves readers time in locating information but also augments the professionalism of the document and enhances the user experience. Moreover, a table of contents is easy to maintain and update; following any restructuring of the document, it can be swiftly revised to reflect the latest content organization, ensuring coherence and accuracy throughout the document. This article will demonstrate how to use Spire.Doc for Python to create a table of contents in a newly created Word document within a Python project.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows
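
After installation, a quick check can confirm that the package is importable before running the examples. The following is a minimal sketch using only the standard library; the helper name is_importable is hypothetical, and the module name spire.doc is taken from the import statements used later in this article.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "spire") is missing
        return False


# Report whether the package used in this article is available
print("spire.doc importable:", is_importable("spire.doc"))
```

If the check prints False, rerun the pip command in the same Python environment that executes your script.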

Python Create a Table Of Contents Using Heading Styles

Using heading styles is the default way to generate a table of contents in Word: you mark titles and subtitles with heading styles of different levels, and Word's table of contents feature then collects them automatically. Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Add a paragraph using the Section.AddParagraph() method.
  • Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
  • Create a CharacterFormat object and set the font.
  • Apply a heading style to the paragraph using the Paragraph.ApplyStyle(BuiltinStyle.Heading1) method.
  • Add text content using the Paragraph.AppendText() method.
  • Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
  • Update the table of contents using the Document.UpdateTableOfContents() method.
  • Save the document using the Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Append a Table of Contents (TOC) paragraph
TOC_paragraph = section.AddParagraph()
TOC_paragraph.AppendTOC(1, 3)

# Create and set character format objects for font
character_format1 = CharacterFormat(doc)
character_format1.FontName = "Microsoft YaHei"

character_format2 = CharacterFormat(doc)
character_format2.FontName = "Microsoft YaHei"
character_format2.FontSize = 12

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)

# Add text and apply character formatting
text_range1 = paragraph.AppendText("Overview")
text_range1.ApplyCharacterFormat(character_format1)

# Insert normal content
paragraph = section.Body.AddParagraph()
text_range2 = paragraph.AppendText("Spire.Doc for Python is a professional Python Word development component that enables developers to easily integrate Word document creation, reading, editing, and conversion functionalities into their own Python applications. As a completely standalone component, Spire.Doc for Python does not require the installation of Microsoft Word on the runtime environment.")

# Add a paragraph with Heading 1 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading1)
text_range1 = paragraph.AppendText("Main Functions")
text_range1.ApplyCharacterFormat(character_format1)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 3 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading3)
textRange1 = paragraph.AppendText("Word Versions")
textRange1.ApplyCharacterFormat(character_format1)
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Convert File Documents with High Quality")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, Markdown, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange2.ApplyCharacterFormat(character_format2)

# Add a paragraph with Heading 2 style
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(BuiltinStyle.Heading2)
textRange1 = paragraph.AppendText("Other Technical Features")
textRange1.ApplyCharacterFormat(character_format1)

# Add regular content
paragraph = section.Body.AddParagraph()
textRange2 = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange2.ApplyCharacterFormat(character_format2)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingHeadingStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()

Python: Create a Table Of Contents for a Newly Created Word Document
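
To make the role of the two level arguments of AppendTOC more concrete, here is a minimal pure-Python sketch of how a level range selects and indents headings. It is not the Spire.Doc API and does not reproduce Word's layout; build_toc and the headings list are hypothetical names used only for illustration.

```python
from typing import List, Tuple


def build_toc(headings: List[Tuple[int, str]],
              lower_level: int = 1,
              upper_level: int = 3) -> List[str]:
    """Return indented TOC lines for headings whose level lies in the range."""
    lines = []
    for level, title in headings:
        if lower_level <= level <= upper_level:
            # Indent four spaces per level below the top of the range
            lines.append("    " * (level - lower_level) + title)
    return lines


# Heading levels and titles taken from the sample document above
headings = [
    (1, "Overview"),
    (1, "Main Functions"),
    (2, "Only Spire.Doc, No Microsoft Office Automation"),
    (3, "Word Versions"),
    (2, "Convert File Documents with High Quality"),
    (2, "Other Technical Features"),
]

toc_lines = build_toc(headings, 1, 3)
print("\n".join(toc_lines))
```

Calling build_toc(headings, 1, 2) in this sketch would drop the "Word Versions" entry, which mirrors what narrowing the level range of a table of contents is meant to do.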

Python Create a Table Of Contents Using Outline Level Styles

In a Word document, you can also create a table of contents from outline levels. Assign an outline level to a paragraph style through the ParagraphFormat.OutlineLevel property, then map each style to a table of contents level using the TableOfContent.SetTOCLevelStyle() method. Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a ParagraphStyle object and set the outline level using ParagraphStyle.ParagraphFormat.OutlineLevel = OutlineLevel.Level1.
  • Add the created ParagraphStyle object to the document using the Document.Styles.Add() method.
  • Add a paragraph using the Section.AddParagraph() method.
  • Create a table of contents object using the Paragraph.AppendTOC(int lowerLevel, int upperLevel) method.
  • Disable the default use of heading styles for building the table of contents by setting TableOfContent.UseHeadingStyles = False.
  • Apply the outline level style to the table of contents rules using the TableOfContent.SetTOCLevelStyle(int levelNumber, string styleName) method.
  • Create a CharacterFormat object and set the font.
  • Apply the style to the paragraph using the Paragraph.ApplyStyle(ParagraphStyle.Name) method.
  • Add text content using the Paragraph.AppendText() method.
  • Apply character formatting to the text using the TextRange.ApplyCharacterFormat() method.
  • Update the table of contents using the Document.UpdateTableOfContents() method.
  • Save the document using the Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Define Outline Level 1
titleStyle1 = ParagraphStyle(doc)
titleStyle1.Name = "T1S"
titleStyle1.ParagraphFormat.OutlineLevel = OutlineLevel.Level1
titleStyle1.CharacterFormat.Bold = True
titleStyle1.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle1.CharacterFormat.FontSize = 18
titleStyle1.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle1)

# Define Outline Level 2
titleStyle2 = ParagraphStyle(doc)
titleStyle2.Name = "T2S"
titleStyle2.ParagraphFormat.OutlineLevel = OutlineLevel.Level2
titleStyle2.CharacterFormat.Bold = True
titleStyle2.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle2.CharacterFormat.FontSize = 16
titleStyle2.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle2)

# Define Outline Level 3
titleStyle3 = ParagraphStyle(doc)
titleStyle3.Name = "T3S"
titleStyle3.ParagraphFormat.OutlineLevel = OutlineLevel.Level3
titleStyle3.CharacterFormat.Bold = True
titleStyle3.CharacterFormat.FontName = "Microsoft YaHei"
titleStyle3.CharacterFormat.FontSize = 14
titleStyle3.ParagraphFormat.HorizontalAlignment = HorizontalAlignment.Left
doc.Styles.Add(titleStyle3)

# Add a paragraph and insert a table of contents that is built from the custom styles
TOCparagraph = section.AddParagraph()
toc = TOCparagraph.AppendTOC(1, 3)
toc.UseHeadingStyles = False
toc.UseHyperlinks = True
toc.UseTableEntryFields = False
toc.RightAlignPageNumbers = True
toc.SetTOCLevelStyle(1, titleStyle1.Name)
toc.SetTOCLevelStyle(2, titleStyle2.Name)
toc.SetTOCLevelStyle(3, titleStyle3.Name)

# Define character format
characterFormat = CharacterFormat(doc)
characterFormat.FontName = "Microsoft YaHei"
characterFormat.FontSize = 12

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Overview")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a professional Word Python API specifically designed for developers to create, read, write, convert, and compare Word documents with fast and high-quality performance.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 1
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle1.Name)
paragraph.AppendText("Main Functions")

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Only Spire.Doc, No Microsoft Office Automation")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Spire.Doc for Python is a totally independent Python Word class library which doesn't require Microsoft Office installed on system. Microsoft Office Automation is proved to be unstable, slow and not scalable to produce MS Word documents. Spire.Doc for Python is many times faster than Microsoft Word Automation and with much better stability and scalability.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 3
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle3.Name)
paragraph.AppendText("Word Versions")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("Word97-03  Word2007  Word2010  Word2013  Word2016  Word2019")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Convert File Documents with High Quality")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, users can save Word Doc/Docx to stream, save as web response and convert Word Doc/Docx to XML, RTF, EMF, TXT, XPS, EPUB, HTML, SVG, ODT and vice versa. Spire.Doc for Python also supports to convert Word Doc/Docx to PDF and HTML to image.")
textRange.ApplyCharacterFormat(characterFormat)

# Add a paragraph and apply outline level style 2
paragraph = section.Body.AddParagraph()
paragraph.ApplyStyle(titleStyle2.Name)
paragraph.AppendText("Other Technical Features")

# Add a paragraph and set the text content
paragraph = section.Body.AddParagraph()
textRange = paragraph.AppendText("By using Spire.Doc for Python, developers can build any type of a 64-bit Python application to create and handle Word documents.")
textRange.ApplyCharacterFormat(characterFormat)

# Update the table of contents
doc.UpdateTableOfContents()

# Save the document
doc.SaveToFile("CreateTOCUsingOutlineStyles.docx", FileFormat.Docx2016)

# Release resources
doc.Dispose()
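
The mapping between custom styles and table of contents levels can be pictured with a small pure-Python sketch. It only models the idea behind SetTOCLevelStyle (a style name stands for a level) and is not how Spire.Doc implements it; collect_toc_entries and the sample paragraphs are hypothetical.

```python
def collect_toc_entries(paragraphs, level_styles):
    """Return (level, text) pairs for paragraphs whose style is mapped to a level.

    paragraphs:   list of (style_name, text) tuples in document order
    level_styles: dict mapping a TOC level number to a style name
    """
    style_to_level = {style: level for level, style in level_styles.items()}
    entries = []
    for style_name, text in paragraphs:
        level = style_to_level.get(style_name)
        if level is not None:  # paragraphs with unmapped styles are skipped
            entries.append((level, text))
    return entries


# Style names as defined in the example above
level_styles = {1: "T1S", 2: "T2S", 3: "T3S"}

paragraphs = [
    ("T1S", "Overview"),
    ("Normal", "Body text that should not appear in the table of contents."),
    ("T1S", "Main Functions"),
    ("T2S", "Only Spire.Doc, No Microsoft Office Automation"),
    ("T3S", "Word Versions"),
]

entries = collect_toc_entries(paragraphs, level_styles)
for level, text in entries:
    print(level, text)
```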


Python Create a Table Of Contents Using Image Captions

Using the Spire.Doc library, you can create a table of contents based on image captions by constructing the object as TableOfContent(Document, " \\h \\z \\c \"Picture\""). Below are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a table of contents object with tocForImage = TableOfContent(doc, " \\h \\z \\c \"Picture\""), where the switch string specifies how the table of contents is built.
  • Add a paragraph using the Section.AddParagraph() method.
  • Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForImage) method.
  • Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
  • Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
  • Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
  • Add an image using the Paragraph.AppendPicture() method.
  • Add a caption paragraph for the image using the DocPicture.AddCaption() method, including product information and formatting.
  • Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForImage) method.
  • Save the document using the Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a new document object
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a table of content object for images
tocForImage = TableOfContent(doc, " \\h \\z \\c \"Picture\"")

# Add a paragraph to the section
tocParagraph = section.Body.AddParagraph()

# Add the TOC object to the paragraph
tocParagraph.Items.Add(tocForImage)

# Add a field separator
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)

# Add text content
tocParagraph.AppendText("TOC")

# Add a field end mark
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add a blank paragraph to the section
section.Body.AddParagraph()

# Add a paragraph to the section
paragraph = section.Body.AddParagraph()

# Add an image
docPicture = paragraph.AppendPicture("images/DOC-Python.png")
docPicture.Width = 100
docPicture.Height = 100

# Add a caption paragraph for the image
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)

# Treat the returned object as a Paragraph so text can be appended to the caption
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.Doc for Python product")
paragraph.Format.AfterSpacing = 20

# Continue adding paragraphs to the section
paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PDF-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.PDF for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/XLS-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.XLS for Python product")
paragraph.Format.AfterSpacing = 20

paragraph = section.Body.AddParagraph()
docPicture = paragraph.AppendPicture("images/PPT-Python.png")
docPicture.Width = 100
docPicture.Height = 100
obj = docPicture.AddCaption("Picture", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText("  Spire.Presentation for Python product")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForImage)

# Save the document to a file
doc.SaveToFile("CreateTOCWithImageCaptions.docx", FileFormat.Docx2016)

# Dispose of the document object
doc.Dispose()
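
The switch string passed to TableOfContent follows the syntax of Word's TOC field. The sketch below assembles such a string for a given caption label and notes what each switch means in Word; the helper caption_toc_switches is hypothetical and is not part of Spire.Doc.

```python
# Meaning of the TOC field switches used in this article (Word field syntax)
SWITCH_NOTES = {
    "\\h": "make the entries hyperlinks to the captioned items",
    "\\z": "hide tab leaders and page numbers in Web Layout view",
    "\\c": "list the items whose captions use the given label",
}


def caption_toc_switches(label: str) -> str:
    """Build the switch string for a caption-based table of contents."""
    parts = ["\\h", "\\z", f'\\c "{label}"']
    # The leading space matches the strings used in the examples of this article
    return " " + " ".join(parts)


picture_switches = caption_toc_switches("Picture")
print(picture_switches)
```

Under these assumptions, the result for the label "Picture" is the same string that the example above passes to the TableOfContent constructor.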


Python Create a Table Of Contents Using Table Captions

Similarly, you can create a table of contents based on table captions by constructing the object as TableOfContent(Document, " \\h \\z \\c \"Table\""). Here are the detailed steps:

  • Create a Document object.
  • Add a section using the Document.AddSection() method.
  • Create a table of contents object with tocForTable = TableOfContent(doc, " \\h \\z \\c \"Table\""), where the switch string specifies how the table of contents is built.
  • Add a paragraph using the Section.AddParagraph() method.
  • Add the table of content object to the paragraph using the Paragraph.Items.Add(tocForTable) method.
  • Add a field separator using the Paragraph.AppendFieldMark(FieldMarkType.FieldSeparator) method.
  • Add the text content "TOC" using the Paragraph.AppendText("TOC") method.
  • Add a field end mark using the Paragraph.AppendFieldMark(FieldMarkType.FieldEnd) method.
  • Add a table using the Section.AddTable() method and set the number of rows and columns using the Table.ResetCells(int rowsNum, int columnsNum) method.
  • Add a table caption paragraph using the Table.AddCaption() method, including product information and formatting.
  • Update the table of contents to reflect changes in the document using the Document.UpdateTableOfContents(tocForTable) method.
  • Save the document using the Document.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *

# Create a new document
doc = Document()

# Add a section to the document
section = doc.AddSection()

# Create a TableOfContent object
tocForTable = TableOfContent(doc,  " \\h \\z \\c \"Table\"")

# Add a paragraph in the section to place the TableOfContent object
tocParagraph = section.Body.AddParagraph()
tocParagraph.Items.Add(tocForTable)
tocParagraph.AppendFieldMark(FieldMarkType.FieldSeparator)
tocParagraph.AppendText("TOC")
tocParagraph.AppendFieldMark(FieldMarkType.FieldEnd)

# Add two empty paragraphs in the section
section.Body.AddParagraph()
section.Body.AddParagraph()

# Add a table in the section
table = section.Body.AddTable(True)
table.ResetCells(1, 3)

# Add a caption paragraph for the table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" One row three columns")
paragraph.Format.AfterSpacing = 20

# Add a new table in the section
table = section.Body.AddTable(True)
table.ResetCells(3, 3)

# Add a caption paragraph for the second table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" Three rows three columns")
paragraph.Format.AfterSpacing = 20

# Add another new table in the section
table = section.Body.AddTable(True)
table.ResetCells(5, 3)

# Add a caption paragraph for the third table
obj = table.AddCaption("Table", CaptionNumberingFormat.Number, CaptionPosition.BelowItem)
paragraph = Paragraph(obj)
paragraph.AppendText(" Five rows three columns")
paragraph.Format.AfterSpacing = 20

# Update the table of contents
doc.UpdateTableOfContents(tocForTable)

# Save the document to a specified file
doc.SaveToFile("CreateTOCUsingTableCaptions.docx", FileFormat.Docx2016)

# Dispose resources
doc.Dispose()
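
Captions with different labels are numbered independently, which is what lets a table of contents pick out only the tables or only the pictures. The following pure-Python sketch models that behavior under the assumption that a numbered caption reads as the label followed by a running number; the CaptionNumbering class is hypothetical and does not use Spire.Doc.

```python
from collections import defaultdict


class CaptionNumbering:
    """Keeps one running counter per caption label, such as "Picture" or "Table"."""

    def __init__(self):
        self._counters = defaultdict(int)
        self.captions = []  # (label, full caption text) in document order

    def add_caption(self, label, text):
        """Number the caption within its own label sequence and record it."""
        self._counters[label] += 1
        caption = f"{label} {self._counters[label]}{text}"
        self.captions.append((label, caption))
        return caption

    def list_for(self, label):
        """Return the captions that belong to one label, in document order."""
        return [caption for caption_label, caption in self.captions
                if caption_label == label]


numbering = CaptionNumbering()
numbering.add_caption("Table", " One row three columns")
numbering.add_caption("Picture", "  Spire.Doc for Python product")
numbering.add_caption("Table", " Three rows three columns")

list_of_tables = numbering.list_for("Table")
print(list_of_tables)
```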


Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Rearranging slides in a PowerPoint presentation is a simple but essential skill. Whether you need to change the order of your points, group related slides together, or move a slide to a different location, the ability to efficiently reorganize your slides can help you create a more coherent and impactful presentation.

In this article, you will learn how to rearrange slides in a PowerPoint document in Python using Spire.Presentation for Python.

Install Spire.Presentation for Python

This scenario requires Spire.Presentation for Python and plum-dispatch v1.7.4. They can be easily installed in your system through the following pip command.

pip install Spire.Presentation

If you are unsure how to install, please refer to this tutorial: How to Install Spire.Presentation for Python on Windows

Rearrange Slides in a PowerPoint Document in Python

To reorder the slides in a PowerPoint document, create two Presentation objects: one for loading the original document and one for building the new document. Copying the slides from the original document to the new one in the desired sequence rearranges the slide order.

The following are the steps to rearrange slides in a PowerPoint document using Python.

  • Create a Presentation object.
  • Load a PowerPoint document using the Presentation.LoadFromFile() method.
  • Specify the slide order within a list.
  • Create another Presentation object for creating a new presentation.
  • Add the slides from the original document to the new presentation in the specified order using the Presentation.Slides.AppendBySlide() method.
  • Save the new presentation to a PPTX file using the Presentation.SaveToFile() method.
from spire.presentation.common import *
from spire.presentation import *

# Create a Presentation object
presentation = Presentation()

# Load a PowerPoint file
presentation.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.pptx")

# Specify the new slide order within a list
newSlideOrder = [4,2,1,3]

# Create another Presentation object
new_presentation = Presentation()

# Remove the default slide
new_presentation.Slides.RemoveAt(0)

# Append the slides from the original file to the new presentation in the new order
# (the numbers in newSlideOrder are 1-based, while the slide collection is 0-based)
for slideNumber in newSlideOrder:
    new_presentation.Slides.AppendBySlide(presentation.Slides[slideNumber - 1])

# Save the new presentation to file
new_presentation.SaveToFile("output/NewOrder.pptx", FileFormat.Pptx2019)

# Dispose resources
presentation.Dispose()
new_presentation.Dispose()

Python: Rearrange Slides in a PowerPoint Document
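
Before copying slides, it can help to validate the order list, since a slide number outside the presentation would cause an indexing error. Below is a minimal pure-Python sketch of the same reordering logic on a plain list; the reorder function is hypothetical, and the strings stand in for slide objects.

```python
def reorder(items, new_order):
    """Return the items arranged by 1-based positions given in new_order."""
    for position in new_order:
        if not 1 <= position <= len(items):
            raise ValueError(
                f"Slide number {position} is outside 1..{len(items)}"
            )
    # Convert each 1-based slide number to a 0-based list index
    return [items[position - 1] for position in new_order]


slides = ["Slide 1", "Slide 2", "Slide 3", "Slide 4"]
reordered = reorder(slides, [4, 2, 1, 3])
print(reordered)
```

This sketch only checks the range of each number, so an order list may also omit or repeat slides, just as the loop in the example above would allow.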


Word documents often contain valuable data in the form of tables, which can be used for reporting, data analysis, and record-keeping. However, manually extracting and transferring these tables to other formats can be a time-consuming and error-prone task. By automating this process using Python, we can save time, ensure accuracy, and maintain consistency. Spire.Doc for Python provides a seamless solution for the table extraction task, making it effortless to create accessible and manageable files with data from Word document tables. This article will demonstrate how to leverage Spire.Doc for Python to extract tables from Word documents and write them into text files and Excel worksheets.

Install Spire.Doc for Python

This scenario requires Spire.Doc for Python and plum-dispatch v1.7.4. Both can be installed on Windows with the following pip command.

pip install Spire.Doc

If you are unsure how to install, please refer to: How to Install Spire.Doc for Python on Windows

Extract Tables from Word Documents to Text Files with Python

Spire.Doc for Python offers the Section.Tables property to retrieve a collection of tables within a section of a Word document. Then, developers can use the properties and methods under the ITable class to access the data in the tables and write it into a text file. This provides a convenient solution for converting Word document tables into text files.

The detailed steps for extracting tables from Word documents to text files are as follows:

  • Create an object of the Document class and load a Word document using the Document.LoadFromFile() method.
  • Iterate through the sections in the document and get the table collection of each section through the Section.Tables property.
  • Iterate through the tables and create a string object for each table.
  • Iterate through the rows in each table and the cells in each row, get the text of each paragraph in the cell through the Paragraph.Text property, and add the cell text to the string.
  • Save each string to a text file.
from spire.doc import *
from spire.doc.common import *

# Create an instance of Document
doc = Document()

# Load a Word document
doc.LoadFromFile("Sample.docx")

# Loop through the sections
for s in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(s)
    # Get the tables in the section
    tables = section.Tables
    # Loop through the tables
    for i in range(0, tables.Count):
        # Get a table
        table = tables.get_Item(i)
        # Initialize a string to store the table data
        tableData = ''
        # Loop through the rows of the table
        for j in range(0, table.Rows.Count):
            # Loop through the cells of the row
            for k in range(0, table.Rows.get_Item(j).Cells.Count):
                # Get a cell
                cell = table.Rows.get_Item(j).Cells.get_Item(k)
                # Get the text in the cell
                cellText = ''
                for para in range(cell.Paragraphs.Count):
                    paragraphText = cell.Paragraphs.get_Item(para).Text
                    cellText += (paragraphText + ' ')
                # Add the text to the string
                tableData += cellText
                if k < table.Rows.get_Item(j).Cells.Count - 1:
                    tableData += '\t'
            # Add a new line
            tableData += '\n'

        # Save the table data to a text file
        with open(f'output/Tables/WordTable_{s+1}_{i+1}.txt', 'w', encoding='utf-8') as f:
            f.write(tableData)

# Close the document
doc.Close()

Python: Extract Tables from Word Documents
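
The text layout produced above (tab-separated cells, one row per line) can be sketched without Spire.Doc on plain Python data. In this sketch a table is a list of rows, a row is a list of cells, and a cell is a list of paragraph strings; table_to_text and save_tables are hypothetical helpers. Unlike the loop above, the sketch joins paragraphs with a single space and leaves no trailing space, and it creates the output folder if it does not exist yet.

```python
import os
import tempfile


def table_to_text(table):
    """Join paragraphs with spaces, cells with tabs, and rows with newlines."""
    lines = []
    for row in table:
        cells = [" ".join(paragraphs) for paragraphs in row]
        lines.append("\t".join(cells))
    return "".join(line + "\n" for line in lines)


def save_tables(tables, output_dir):
    """Write each table to its own text file and return the file paths."""
    os.makedirs(output_dir, exist_ok=True)  # avoid errors for a missing folder
    paths = []
    for index, table in enumerate(tables, start=1):
        path = os.path.join(output_dir, f"WordTable_{index}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(table_to_text(table))
        paths.append(path)
    return paths


sample_table = [
    [["Name"], ["Quantity"]],
    [["Apple", "Red"], ["3"]],
]
text = table_to_text(sample_table)

# Write to a temporary folder so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    saved = save_tables([sample_table], os.path.join(tmp, "output", "Tables"))
    saved_ok = os.path.exists(saved[0])
```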

Extract Tables from Word Documents to Excel Workbooks with Python

Developers can also utilize Spire.Doc for Python to retrieve table data and then use Spire.XLS for Python to write the table data into an Excel worksheet, thereby enabling the conversion of Word document tables into Excel workbooks.

Install Spire.XLS for Python via PyPI:

pip install Spire.XLS

The detailed steps for extracting tables from Word documents to Excel workbooks are as follows:

  • Create an object of the Document class and load a Word document using the Document.LoadFromFile() method.
  • Create an object of the Workbook class and clear the default worksheets using the Workbook.Worksheets.Clear() method.
  • Iterate through the sections in the document and get the table collection of each section through the Section.Tables property.
  • Iterate through the tables and create a worksheet for each table using the Workbook.Worksheets.Add() method.
  • Iterate through the rows in each table and the cells in each row, get the text of each paragraph in the cell through the Paragraph.Text property, and write the text to the worksheet using the Worksheet.SetCellValue() method.
  • Save the workbook using the Workbook.SaveToFile() method.
from spire.doc import *
from spire.doc.common import *
from spire.xls import *
from spire.xls.common import *

# Create an instance of Document
doc = Document()

# Load a Word document
doc.LoadFromFile('Sample.docx')

# Create an instance of Workbook
wb = Workbook()
wb.Worksheets.Clear()

# Loop through sections in the document
for i in range(doc.Sections.Count):
    # Get a section
    section = doc.Sections.get_Item(i)
    # Loop through tables in the section
    for j in range(section.Tables.Count):
        # Get a table
        table = section.Tables.get_Item(j)
        # Create a worksheet
        ws = wb.Worksheets.Add(f'Table_{i+1}_{j+1}')
        # Write the table to the worksheet
        for row in range(table.Rows.Count):
            # Get a row
            tableRow = table.Rows.get_Item(row)
            # Loop through cells in the row
            for cell in range(tableRow.Cells.Count):
                # Get a cell
                tableCell = tableRow.Cells.get_Item(cell)
                # Get the text in the cell
                cellText = ''
                for p in range(tableCell.Paragraphs.Count):
                    paragraph = tableCell.Paragraphs.get_Item(p)
                    cellText += paragraph.Text + ' '
                # Write the cell text to the worksheet
                ws.SetCellValue(row + 1, cell + 1, cellText)

# Save the workbook
wb.SaveToFile('output/Tables/WordTableToExcel.xlsx', FileFormat.Version2016)
doc.Close()
wb.Dispose()
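
The row + 1 and cell + 1 arguments in the code above suggest that the worksheet is addressed with 1-based row and column numbers, while Python loops count from 0. The following pure-Python sketch shows the same index shift with enumerate; cell_writes and sheet_name are hypothetical helpers, and the table is plain nested lists rather than Spire objects.

```python
def cell_writes(table):
    """Yield (row, column, text) tuples with 1-based row and column numbers."""
    for row_number, row in enumerate(table, start=1):
        for column_number, text in enumerate(row, start=1):
            yield row_number, column_number, text


def sheet_name(section_index, table_index):
    """Build a worksheet name from 0-based section and table indexes."""
    return f"Table_{section_index + 1}_{table_index + 1}"


table = [
    ["Name", "Quantity"],
    ["Apple", "3"],
]
writes = list(cell_writes(table))
for row_number, column_number, text in writes:
    print(row_number, column_number, text)
```

Each tuple corresponds to one call of the form ws.SetCellValue(row, column, text) in the example above.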


