To prevent your PDF document from being used without authorization, you can watermark it with text or an image. In this article, you will learn how to programmatically add text watermarks (single-line and multi-line) to a PDF in C# and VB.NET using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Add a Text Watermark to PDF

Spire.PDF does not provide an interface or class for handling watermarks in PDF files. You can, however, draw text such as "confidential", "internal use", or "draft" on each page to imitate a watermark. The main steps for adding a text watermark to every page of a PDF document are as follows.

  • Create a PdfDocument object and load a sample PDF document using the PdfDocument.LoadFromFile() method.
  • Create a PdfTrueTypeFont object, specify the watermark text, and measure the text size using the PdfFontBase.MeasureString() method.
  • Loop through all the pages in the document.
  • Translate the coordinate system of each page by the specified offsets using the PdfPageBase.Canvas.TranslateTransform() method, then rotate it 45 degrees counterclockwise using the PdfPageBase.Canvas.RotateTransform() method. This ensures that the watermark appears in the middle of the page at a 45-degree angle.
  • Draw the watermark text on the page using the PdfPageBase.Canvas.DrawString() method.
  • Save the document to another PDF file using the PdfDocument.SaveToFile() method.
  • C#

    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using System.Drawing;
    
    namespace AddTextWatermarkToPdf
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument pdf = new PdfDocument();
    
                //Load a sample PDF document
                pdf.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pdf");
    
                //Create a PdfTrueTypeFont object
                PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Arial", 50f), true);
    
                //Set the watermark text
                string text = "CONFIDENTIAL";
    
                //Measure the text size
                SizeF textSize = font.MeasureString(text);
    
                //Calculate the values of two offset variables,
                //which will be used to calculate the translation amount of the coordinate system
                float offset1 = (float)(textSize.Width * System.Math.Sqrt(2) / 4);
                float offset2 = (float)(textSize.Height * System.Math.Sqrt(2) / 4);
    
                //Traverse all the pages in the document
                foreach (PdfPageBase page in pdf.Pages)
                {
                    //Set the page transparency
                    page.Canvas.SetTransparency(0.8f);
    
                    //Translate the coordinate system by specified coordinates
                    page.Canvas.TranslateTransform(page.Canvas.Size.Width / 2 - offset1 - offset2, page.Canvas.Size.Height / 2 + offset1 - offset2);
    
                    //Rotate the coordinate system 45 degrees counterclockwise
                    page.Canvas.RotateTransform(-45);
    
                    //Draw watermark text on the page
                    page.Canvas.DrawString(text, font, PdfBrushes.DarkGray, 0, 0);
                }
    
                //Save the changes to another file
                pdf.SaveToFile("TextWatermark.pdf");
            }
        }
    }

  • VB.NET

    Imports Spire.Pdf
    Imports Spire.Pdf.Graphics
    Imports System.Drawing
    
    Namespace AddTextWatermarkToPdf
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim pdf As PdfDocument =  New PdfDocument()
    
                'Load a sample PDF document
                pdf.LoadFromFile("C:\Users\Administrator\Desktop\sample.pdf")
    
                'Create a PdfTrueTypeFont object
                Dim font As PdfTrueTypeFont =  New PdfTrueTypeFont(New Font("Arial",50f),True)
    
                'Set the watermark text
                Dim text As String =  "CONFIDENTIAL"
    
                'Measure the text size
                Dim textSize As SizeF =  font.MeasureString(text)
    
                'Calculate the values of two offset variables,
                'which will be used to calculate the translation amount of the coordinate system
                Dim offset1 As single = CType((textSize.Width * System.Math.Sqrt(2) / 4), single)
                Dim offset2 As single = CType((textSize.Height * System.Math.Sqrt(2) / 4), single)
    
                'Traverse all the pages in the document
                Dim page As PdfPageBase
                For Each page In pdf.Pages
                    'Set the page transparency
                    page.Canvas.SetTransparency(0.8f)
    
                    'Translate the coordinate system by specified coordinates
                    page.Canvas.TranslateTransform(page.Canvas.Size.Width / 2 - offset1 - offset2, page.Canvas.Size.Height / 2 + offset1 - offset2)
    
                    'Rotate the coordinate system 45 degrees counterclockwise
                    page.Canvas.RotateTransform(-45)
    
                    'Draw watermark text on the page
                    page.Canvas.DrawString(text, font, PdfBrushes.DarkGray, 0, 0)
                Next
    
                'Save the changes to another file
                pdf.SaveToFile("TextWatermark.pdf")
            End Sub
        End Class
    End Namespace

[Figure: C#/VB.NET: Add Text Watermarks to PDF]
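
The two offsets in the samples above can look arbitrary. A minimal sketch of the geometry shows that they place the centre of the rotated text at the centre of the page. It assumes a top-left origin with the y-axis pointing down and a rotation applied after the translation (the System.Drawing-style convention; whether Spire.PDF matches it in every detail is an assumption here). The class name WatermarkGeometry, the helper ToPage, and the page and text sizes are hypothetical and are not part of Spire.PDF.

```csharp
using System;
using System.Globalization;

static class WatermarkGeometry
{
    // Maps a point from the translated-then-rotated local frame to page coordinates.
    // Assumption: top-left origin, y pointing down, positive angles clockwise.
    public static (double X, double Y) ToPage(double x, double y, double tx, double ty, double angleDegrees)
    {
        double a = angleDegrees * Math.PI / 180.0;
        double px = tx + x * Math.Cos(a) - y * Math.Sin(a);
        double py = ty + x * Math.Sin(a) + y * Math.Cos(a);
        return (px, py);
    }

    static void Main()
    {
        double pageW = 595, pageH = 842; // hypothetical A4 page size in points
        double textW = 380, textH = 57;  // hypothetical result of MeasureString()

        // Same offset formulas as in the sample code
        double offset1 = textW * Math.Sqrt(2) / 4;
        double offset2 = textH * Math.Sqrt(2) / 4;
        double tx = pageW / 2 - offset1 - offset2;
        double ty = pageH / 2 + offset1 - offset2;

        // The text is drawn at (0, 0) in the local frame, so its centre is (textW/2, textH/2)
        var centre = ToPage(textW / 2, textH / 2, tx, ty, -45);

        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "text centre: ({0:F2}, {1:F2})", centre.X, centre.Y));
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "page centre: ({0:F2}, {1:F2})", pageW / 2, pageH / 2));
    }
}
```

Under these assumptions both lines report the same point, which is why the watermark ends up centred regardless of the text length or font size.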

Add Multi-Line Text Watermarks to PDF

There are times when you may want to add several lines of text watermarks to your document. To achieve a tiled watermark effect, you can use the PdfTilingBrush class, which produces a repeating tile pattern to fill a graphics area. The main steps for adding multi-line watermarks to a PDF document are as follows.

  • Create a PdfDocument object and load a sample PDF document using the PdfDocument.LoadFromFile() method.
  • Create a custom method InsertMultiLineTextWatermark(PdfPageBase page, String watermarkText, PdfTrueTypeFont font, int rowNum, int columnNum) to add multi-line text watermarks to a PDF page. The rowNum and columnNum parameters specify the number of rows and columns of the tiled watermarks.
  • Loop through all the pages in the document and call the custom InsertMultiLineTextWatermark() method to apply watermarks to each page.
  • Save the document to another file using the PdfDocument.SaveToFile() method.
  • C#

    using System;
    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using System.Drawing;
    
    namespace AddMultiLineTextWatermark
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument instance
                PdfDocument pdf = new PdfDocument();
    
                //Load a sample PDF document
                pdf.LoadFromFile(@"C:\Users\Administrator\Desktop\sample.pdf");
    
                //Create a PdfTrueTypeFont object
                PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Arial", 20f), true);
    
                //Traverse all the pages
                for (int i = 0; i < pdf.Pages.Count; i++)
                {
                    //Call InsertMultiLineTextWatermark() method to add text watermarks to the specified page
                    InsertMultiLineTextWatermark(pdf.Pages[i], "E-ICEBLUE CO LTD", font, 3, 3);
                }
    
                //Save the document to another file
                pdf.SaveToFile("MultiLineTextWatermark.pdf");
            }
    
            //Create a custom method to insert multi-line text watermarks to a page
            static void InsertMultiLineTextWatermark(PdfPageBase page, String watermarkText, PdfTrueTypeFont font, int rowNum, int columnNum)
            {
                //Measure the text size
                SizeF textSize = font.MeasureString(watermarkText);
    
                //Calculate the values of two offset variables, which will be used to calculate the translation amount of coordinate system
                float offset1 = (float)(textSize.Width * System.Math.Sqrt(2) / 4);
                float offset2 = (float)(textSize.Height * System.Math.Sqrt(2) / 4);
    
                //Create a tile brush
                PdfTilingBrush brush = new PdfTilingBrush(new SizeF(page.ActualSize.Width / columnNum, page.ActualSize.Height / rowNum));
                brush.Graphics.SetTransparency(0.3f);
                brush.Graphics.Save();
                brush.Graphics.TranslateTransform(brush.Size.Width / 2 - offset1 - offset2, brush.Size.Height / 2 + offset1 - offset2);
                brush.Graphics.RotateTransform(-45);
    
                //Draw watermark text on the tile brush
                brush.Graphics.DrawString(watermarkText, font, PdfBrushes.Violet, 0, 0);
                brush.Graphics.Restore();
    
                //Draw a rectangle (that covers the whole page) using the tile brush
                page.Canvas.DrawRectangle(brush, new RectangleF(new PointF(0, 0), page.ActualSize));
            }
        }
    }

  • VB.NET

    Imports System
    Imports Spire.Pdf
    Imports Spire.Pdf.Graphics
    Imports System.Drawing
    
    Namespace AddMultiLineTextWatermark
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument instance
                Dim pdf As PdfDocument =  New PdfDocument()
    
                'Load a sample PDF document
                pdf.LoadFromFile("C:\Users\Administrator\Desktop\sample.pdf")
    
                'Create a PdfTrueTypeFont object
                Dim font As PdfTrueTypeFont =  New PdfTrueTypeFont(New Font("Arial",20f),True)
    
                'Traverse all the pages
                For i As Integer = 0 To pdf.Pages.Count - 1
                    'Call InsertMultiLineTextWatermark() method to add text watermarks to the specified page
                    InsertMultiLineTextWatermark(pdf.Pages(i), "E-ICEBLUE CO LTD", font, 3, 3)
                Next
    
                'Save the document to another file
                pdf.SaveToFile("MultiLineTextWatermark.pdf")
            End Sub
    
            'Create a custom method to insert multi-line text watermarks to a page
            Shared  Sub InsertMultiLineTextWatermark(ByVal page As PdfPageBase, ByVal watermarkText As String, ByVal font As PdfTrueTypeFont, ByVal rowNum As Integer, ByVal columnNum As Integer)
                'Measure the text size
                Dim textSize As SizeF =  font.MeasureString(watermarkText)
    
                'Calculate the values of two offset variables, which will be used to calculate the translation amount of coordinate system
                Dim offset1 As single = CType((textSize.Width * System.Math.Sqrt(2) / 4), single)
                Dim offset2 As single = CType((textSize.Height * System.Math.Sqrt(2) / 4), single)
    
                'Create a tile brush
                Dim brush As PdfTilingBrush =  New PdfTilingBrush(New SizeF(page.ActualSize.Width / columnNum,page.ActualSize.Height / rowNum))
                brush.Graphics.SetTransparency(0.3f)
                brush.Graphics.Save()
                brush.Graphics.TranslateTransform(brush.Size.Width / 2 - offset1 - offset2, brush.Size.Height / 2 + offset1 - offset2)
                brush.Graphics.RotateTransform(-45)
    
                'Draw watermark text on the tile brush
                brush.Graphics.DrawString(watermarkText, font, PdfBrushes.Violet, 0, 0)
                brush.Graphics.Restore()
    
                'Draw a rectangle (that covers the whole page) using the tile brush
                page.Canvas.DrawRectangle(brush, New RectangleF(New PointF(0, 0), page.ActualSize))
            End Sub
        End Class
    End Namespace

[Figure: C#/VB.NET: Add Text Watermarks to PDF]
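
When choosing rowNum and columnNum, it can help to check whether the rotated text still fits inside one tile, since content that extends beyond a tile may be cut off. The following is a rough sketch of that check. It relies only on the fact that a rectangle of size w × h rotated by 45 degrees has a square bounding box with side (w + h)·√2/2. The class TileCheck and its methods are hypothetical helpers, not Spire.PDF APIs, and the sizes are made-up values.

```csharp
using System;

static class TileCheck
{
    // Side length of the bounding box of a textW x textH rectangle rotated by 45 degrees
    public static double RotatedBoxSide(double textW, double textH)
    {
        return (textW + textH) * Math.Sqrt(2) / 2;
    }

    // True if the rotated text fits in a tile of size (pageW / columnNum) x (pageH / rowNum)
    public static bool FitsInTile(double pageW, double pageH, int rowNum, int columnNum,
                                  double textW, double textH)
    {
        if (rowNum <= 0) throw new ArgumentOutOfRangeException(nameof(rowNum));
        if (columnNum <= 0) throw new ArgumentOutOfRangeException(nameof(columnNum));

        double tileW = pageW / columnNum;
        double tileH = pageH / rowNum;
        double side = RotatedBoxSide(textW, textH);
        return side <= tileW && side <= tileH;
    }

    static void Main()
    {
        // Hypothetical A4 page (595 x 842 points) split into a 3 x 3 grid
        Console.WriteLine(FitsInTile(595, 842, 3, 3, 200, 23)); // box side is about 157.7, tile is about 198.3 x 280.7
        Console.WriteLine(FitsInTile(595, 842, 3, 3, 300, 23)); // box side is about 228.4, wider than the tile
    }
}
```

If the check fails, reducing the font size or the number of columns is the simplest remedy.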

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Thursday, 24 August 2023 08:34

C#/VB.NET: Extract Tables from PDF

PDF is one of the most popular document formats for sharing and recording data. You may encounter situations where you need to extract data from PDF documents, especially data in tables. For example, your PDF invoices may store useful information in tables, and you want to extract that data for further analysis or calculation. This article demonstrates how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.
  • C#

    using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }

  • VB.NET

    Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                For pageIndex As Integer = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Determine if the table list is null
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the table in the list
                        Dim table As PdfTable
                        For Each table In tableList
                            'Get row number and column number of a certain table
                            Dim row As Integer =  table.GetRowCount()
                            Dim column As Integer =  table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String =  table.GetText(i,j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB.NET does not interpret "\r\n" escapes, so use vbCrLf for the line break
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

[Figure: C#/VB.NET: Extract Tables from PDF]
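
The sample above separates cells with a single space, which makes the columns hard to recover when a cell itself contains spaces. If the extracted data is meant for further analysis, one possible alternative is to write it as CSV. The sketch below shows only the formatting step and uses no Spire.PDF calls. It assumes the cell texts have already been collected into string arrays, one per row, and the names CsvWriter, EscapeCell, and ToCsv are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class CsvWriter
{
    // Quotes a cell when it contains a comma, a quote, or a line break,
    // and doubles any embedded quotes (the usual CSV convention)
    public static string EscapeCell(string cell)
    {
        cell = cell ?? string.Empty;
        bool needsQuotes = cell.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        string escaped = cell.Replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Joins each row with commas and terminates it with CRLF
    public static string ToCsv(IEnumerable<string[]> rows)
    {
        var builder = new StringBuilder();
        foreach (string[] row in rows)
        {
            builder.Append(string.Join(",", row.Select(c => EscapeCell(c))));
            builder.Append("\r\n");
        }
        return builder.ToString();
    }

    static void Main()
    {
        // Hypothetical cell texts, standing in for values returned by table.GetText(i, j)
        var rows = new List<string[]>
        {
            new[] { "Item", "Price" },
            new[] { "Widget, large", "12.50" },
        };
        Console.Write(ToCsv(rows));
    }
}
```

In the extraction loop, each row's cells could be gathered into an array and the result of ToCsv passed to File.WriteAllText() in place of the space-separated text.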

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

PDF is one of the most popular document formats for sharing and recording data. You may encounter situations where you need to extract data from PDF documents, especially data in tables. For example, your PDF invoices may store useful information in tables, and you want to extract that data for further analysis or calculation. This article shows how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.
  • C#

    using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }

  • VB.NET

    Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                For pageIndex As Integer = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Determine if the table list is null
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the table in the list
                        Dim table As PdfTable
                        For Each table In tableList
                            'Get row number and column number of a certain table
                            Dim row As Integer =  table.GetRowCount()
                            Dim column As Integer =  table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String =  table.GetText(i,j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB.NET does not interpret "\r\n" escapes, so use vbCrLf for the line break
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

[Figure: C#/VB.NET: Extract Tables from PDF]
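
Because the sample writes one table row per output line, a cell whose text contains its own line break would split that row across several lines. Whether the extracted cell text can contain line breaks depends on the source PDF, so treat this as an assumption. A small sketch of a clean-up step that could be applied to each cell before appending it is shown below; the class CellText and the method Normalize are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class CellText
{
    // Collapses every run of whitespace (spaces, tabs, line breaks) into one space
    // and trims the ends, so that each cell stays on a single line
    public static string Normalize(string cellText)
    {
        if (string.IsNullOrEmpty(cellText)) return string.Empty;
        return Regex.Replace(cellText, @"\s+", " ").Trim();
    }

    static void Main()
    {
        // Hypothetical cell text with an embedded line break and stray spaces
        Console.WriteLine(Normalize("Total\r\namount  (USD) ")); // prints: Total amount (USD)
    }
}
```

In the extraction loop this would amount to calling Normalize(table.GetText(i, j)) before adding the text to the string builder.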

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Thursday, 24 August 2023 08:31

C#/VB.NET: Extract Tables from PDF

PDF is one of the most popular document formats for sharing and recording data. You may encounter situations where you need to extract data from PDF documents, especially data in tables. For example, your PDF invoices may store useful information in tables, and you want to extract that data for further analysis or calculation. This article shows how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a TXT file.
  • C#

    using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }

  • VB.NET

    Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                For pageIndex As Integer = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Determine if the table list is null
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the table in the list
                        Dim table As PdfTable
                        For Each table In tableList
                            'Get row number and column number of a certain table
                            Dim row As Integer =  table.GetRowCount()
                            Dim column As Integer =  table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String =  table.GetText(i,j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                builder.Append("\r\n")
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

C#/VB.NET: Extract Tables from PDF

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license for yourself.

Thursday, 24 August 2023 08:29

C#/VB.NET: Extract Tables from PDF

Installed via NuGet

PM> Install-Package Spire.PDF

PDF is one of the most popular document formats for sharing and writing data. You may need to extract data from PDF documents, especially data stored in tables. For example, the tables in your PDF invoices may hold useful information that you want to extract for further analysis or calculation. This article demonstrates how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.
  • C#
  • VB.NET
using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }
Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                Dim pageIndex As Integer
                For pageIndex = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Check that the table list is neither Nothing nor empty
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the tables in the list
                        For Each table As PdfTable In tableList
                            'Get the row count and column count of the table
                            Dim row As Integer = table.GetRowCount()
                            Dim column As Integer = table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String = table.GetText(i, j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB does not process "\r\n" escapes, so append vbCrLf
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

C#/VB.NET: Extract Tables from PDF
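
The sample above separates cells with a single space, which makes column boundaries ambiguous when a cell itself contains spaces. One possible alternative is to write the cells as comma-separated values. The following is only a minimal sketch, assuming the cell text has already been collected row by row with PdfTable.GetText(); the CsvSketch class and its Escape and ToCsv helpers are hypothetical names introduced for illustration and are not part of Spire.PDF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper (not part of Spire.PDF): turns rows of cell text into CSV.
static class CsvSketch
{
    // Quote a field when it contains a comma, a double quote, or a line break,
    // and double any embedded quotes (the usual CSV convention).
    public static string Escape(string field)
    {
        field = field ?? string.Empty;
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Join the cells of each row with commas and end every row with CRLF.
    public static string ToCsv(IEnumerable<string[]> rows)
    {
        var builder = new StringBuilder();
        foreach (string[] row in rows)
        {
            builder.Append(string.Join(",", row.Select(Escape)));
            builder.Append("\r\n");
        }
        return builder.ToString();
    }
}
```

In the extraction loop, each row could be gathered into a string[] from table.GetText(i, j) and the collected rows passed to CsvSketch.ToCsv() before calling File.WriteAllText().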

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license for yourself.

Thursday, 24 August 2023 08:28

C#/VB.NET: Extract Tables from PDF

Installed via NuGet

PM> Install-Package Spire.PDF

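PDF is one of the most popular document formats for sharing and writing data. You may need to extract data from PDF documents, especially data stored in tables. For example, the tables in your PDF invoices may hold useful information that you want to extract for further analysis or calculation. This article demonstrates how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.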
  • C#
  • VB.NET
using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }
Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                Dim pageIndex As Integer
                For pageIndex = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Check that the table list is neither Nothing nor empty
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the tables in the list
                        For Each table As PdfTable In tableList
                            'Get the row count and column count of the table
                            Dim row As Integer = table.GetRowCount()
                            Dim column As Integer = table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String = table.GetText(i, j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB does not process "\r\n" escapes, so append vbCrLf
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

C#/VB.NET: Extract Tables from PDF

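Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license for yourself.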

Thursday, 24 August 2023 08:27

C#/VB.NET: Extract Tables from PDF

Installed via NuGet

PM> Install-Package Spire.PDF

PDF is one of the most popular document formats for sharing and writing data. You may need to extract data from PDF documents, especially data stored in tables. For example, the tables in your PDF invoices may hold useful information that you want to extract for further analysis or calculation. This article demonstrates how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.
  • C#
  • VB.NET
using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }
Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                Dim pageIndex As Integer
                For pageIndex = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Check that the table list is neither Nothing nor empty
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the tables in the list
                        For Each table As PdfTable In tableList
                            'Get the row count and column count of the table
                            Dim row As Integer = table.GetRowCount()
                            Dim column As Integer = table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String = table.GetText(i, j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB does not process "\r\n" escapes, so append vbCrLf
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

C#/VB.NET: Extract Tables from PDF

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license for yourself.

Thursday, 24 August 2023 08:21

C#/VB.NET: Extract Tables from PDF

Installed via NuGet

PM> Install-Package Spire.PDF

PDF is one of the most popular document formats for sharing and writing data. You may need to extract data from PDF documents, especially data stored in tables. For example, the tables in your PDF invoices may hold useful information that you want to extract for further analysis or calculation. This article demonstrates how to extract data from PDF tables and save it to a TXT file using Spire.PDF for .NET.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

  • Package Manager
PM> Install-Package Spire.PDF 

Extract Data from PDF Tables

The following are the main steps to extract tables from a PDF document.

  • Create an instance of the PdfDocument class.
  • Load the sample PDF document using the PdfDocument.LoadFromFile() method.
  • Extract tables from a specific page using the PdfTableExtractor.ExtractTable(int pageIndex) method.
  • Get the text of a certain table cell using the PdfTable.GetText(int rowIndex, int columnIndex) method.
  • Save the extracted data to a .txt file.
  • C#
  • VB.NET
using System.IO;
    using System.Text;
    using Spire.Pdf;
    using Spire.Pdf.Utilities;
    
    namespace ExtractPdfTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Load the sample PDF file
                doc.LoadFromFile(@"C:\Users\Administrator\Desktop\table.pdf");
    
                //Create a StringBuilder object
                StringBuilder builder = new StringBuilder();
    
                //Initialize an instance of PdfTableExtractor class
                PdfTableExtractor extractor = new PdfTableExtractor(doc);
    
                //Declare a PdfTable array
                PdfTable[] tableList = null;
    
                //Loop through the pages
                for (int pageIndex = 0; pageIndex < doc.Pages.Count; pageIndex++)
                {
                    //Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex);
    
                    //Determine if the table list is null
                    if (tableList != null && tableList.Length > 0)
                    {
                        //Loop through the table in the list
                        foreach (PdfTable table in tableList)
                        {
                            //Get row number and column number of a certain table
                            int row = table.GetRowCount();
                            int column = table.GetColumnCount();
    
                            //Loop through the rows and columns
                            for (int i = 0; i < row; i++)
                            {
                                for (int j = 0; j < column; j++)
                                {
                                    //Get text from the specific cell
                                    string text = table.GetText(i, j);
    
                                    //Add text to the string builder
                                    builder.Append(text + " ");
                                }
                                builder.Append("\r\n");
                            }
                        }
                    }
                }
    
                //Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString());
            }
        }
    }
Imports System.IO
    Imports System.Text
    Imports Spire.Pdf
    Imports Spire.Pdf.Utilities
    
    Namespace ExtractPdfTable
        Class Program
            Shared  Sub Main(ByVal args() As String)
                'Create a PdfDocument object
                Dim doc As PdfDocument =  New PdfDocument()
    
                'Load the sample PDF file
                doc.LoadFromFile("C:\Users\Administrator\Desktop\table.pdf")
    
                'Create a StringBuilder object
                Dim builder As StringBuilder =  New StringBuilder()
    
                'Initialize an instance of PdfTableExtractor class
                Dim extractor As PdfTableExtractor =  New PdfTableExtractor(doc)
    
                'Declare a PdfTable array
                Dim tableList() As PdfTable =  Nothing
    
                'Loop through the pages
                Dim pageIndex As Integer
                For pageIndex = 0 To doc.Pages.Count - 1
                    'Extract tables from a specific page
                    tableList = extractor.ExtractTable(pageIndex)
    
                    'Check that the table list is neither Nothing nor empty
                    If tableList IsNot Nothing AndAlso tableList.Length > 0 Then
                        'Loop through the tables in the list
                        For Each table As PdfTable In tableList
                            'Get the row count and column count of the table
                            Dim row As Integer = table.GetRowCount()
                            Dim column As Integer = table.GetColumnCount()
    
                            'Loop through the rows and columns
                            For i As Integer = 0 To row - 1
                                For j As Integer = 0 To column - 1
                                    'Get text from the specific cell
                                    Dim text As String = table.GetText(i, j)
    
                                    'Add text to the string builder
                                    builder.Append(text + " ")
                                Next
                                'VB does not process "\r\n" escapes, so append vbCrLf
                                builder.Append(vbCrLf)
                            Next
                        Next
                    End If
                Next
    
                'Write to a .txt file
                File.WriteAllText("Table.txt", builder.ToString())
            End Sub
        End Class
    End Namespace

C#/VB.NET: Extract Tables from PDF

Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents or lift the functional limitations, please request a 30-day trial license for yourself.

Installed via NuGet

PM> Install-Package Spire.PDF

A table provides quick and efficient access to data displayed in rows and columns in a visually appealing way. When presented in a table, data has a greater impact than when conveyed in words alone, and readers can easily compare the values and understand the relationships between them. In this article, you will learn how to create a table in PDF in C# and VB.NET using Spire.PDF for .NET.

Spire.PDF for .NET offers the PdfTable and PdfGrid classes for working with tables in a PDF document. The PdfTable class is used to quickly create simple, regular tables without much formatting, while the PdfGrid class is used to create more complex tables.

The table below lists the differences between these two classes.

              | PdfTable                                    | PdfGrid
Formatting
  Row         | Can be set through events. No API support. | Can be set through the API.
  Column      | Can be set through the API (StringFormat). | Can be set through the API (StringFormat).
  Cell        | Can be set through events. No API support. | Can be set through the API.
Others
  Column span | Not supported.                              | Can be set through the API.
  Row span    | Can be set through events. No API support. | Can be set through the API.
  Nested table| Can be set through events. No API support. | Can be set through the API.
  Events      | BeginCellLayout, EndCellLayout, BeginRowLayout, EndRowLayout, BeginPageLayout, EndPageLayout. | BeginPageLayout, EndPageLayout.

The following sections demonstrate how to create a table in PDF using the PdfTable class and the PdfGrid class, respectively.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Create a Table Using the PdfTable Class

The following are the steps to create a table using the PdfTable class.

  • Create a PdfDocument object.
  • Add a page using the PdfDocument.Pages.Add() method.
  • Create a PdfTable object.
  • Set the table style through the PdfTable.Style property.
  • Insert data into the table through the PdfTable.DataSource property.
  • Set the row height and row color through the BeginRowLayout event.
  • Draw the table on the PDF page using the PdfTable.Draw() method.
  • Save the document to a PDF file using the PdfDocument.SaveToFile() method.
  • C#
  • VB.NET
using System;
    using System.Data;
    using System.Drawing;
    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using Spire.Pdf.Tables;
    
    namespace CreateTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Add a page
                PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(40));
    
                //Create a PdfTable object
                PdfTable table = new PdfTable();
    
                //Set font for header and the rest cells
                table.Style.DefaultStyle.Font = new PdfTrueTypeFont(new Font("Times New Roman", 12f, FontStyle.Regular), true);
                table.Style.HeaderStyle.Font = new PdfTrueTypeFont(new Font("Times New Roman", 12f, FontStyle.Bold), true);
    
                //Create a DataTable
                DataTable dataTable = new DataTable();
                dataTable.Columns.Add("ID");
                dataTable.Columns.Add("Name");
                dataTable.Columns.Add("Department");
                dataTable.Columns.Add("Position");
                dataTable.Columns.Add("Level");
                dataTable.Rows.Add(new string[] { "1", "David", "IT", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "3", "Julia", "HR", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "4", "Sophie", "Marketing", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "7", "Wickey", "Marketing", "Sales Rep", "2" });
                dataTable.Rows.Add(new string[] { "9", "Wayne", "HR", "HR Supervisor", "2" });
                dataTable.Rows.Add(new string[] { "11", "Mia", "Dev", "Developer", "2" });
    
                //Set the datatable as the data source of table
                table.DataSource = dataTable;
    
                //Show header(the header is hidden by default)
                table.Style.ShowHeader = true;
    
                //Set font color and background color of header row
                table.Style.HeaderStyle.BackgroundBrush = PdfBrushes.Gray;
                table.Style.HeaderStyle.TextBrush = PdfBrushes.White;
    
                //Set text alignment in header row
                table.Style.HeaderStyle.StringFormat = new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle);
    
                //Set text alignment in other cells
                for (int i = 0; i < table.Columns.Count; i++)
                {
                    table.Columns[i].StringFormat = new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle);
                }
    
                //Register with BeginRowLayout event
                table.BeginRowLayout += Table_BeginRowLayout;
    
                //Draw table on the page
                table.Draw(page, new PointF(0, 30));
    
                //Save the document to a PDF file
                doc.SaveToFile("PdfTable.pdf");
            }
    
            //Event handler
            private static void Table_BeginRowLayout(object sender, BeginRowLayoutEventArgs args)
            {
                //Set row height
                args.MinimalHeight = 20f;
    
                //Alternate row color
                if (args.RowIndex < 0)
                {
                    return;
                }
                if (args.RowIndex % 2 == 1)
                {
                    args.CellStyle.BackgroundBrush = PdfBrushes.LightGray;
                }
                else
                {
                    args.CellStyle.BackgroundBrush = PdfBrushes.White;
                }
            }
        }
    }

C#/VB.NET: Create Tables in PDF
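
In a real application the rows usually come from objects rather than hard-coded string arrays. The sketch below is not part of the Spire.PDF API: the Employee type and the EmployeeTableBuilder helper are hypothetical names introduced only for illustration. It uses nothing but System.Data to build a DataTable with the same five columns as the example above, which could then be assigned to PdfTable.DataSource.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical record type, used only for this illustration.
public sealed class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Department { get; set; }
    public string Position { get; set; }
    public int Level { get; set; }
}

// Hypothetical helper; not part of Spire.PDF.
public static class EmployeeTableBuilder
{
    // Builds a DataTable with the same five columns as the sample above.
    public static DataTable ToDataTable(IEnumerable<Employee> employees)
    {
        if (employees == null) throw new ArgumentNullException(nameof(employees));

        DataTable dataTable = new DataTable();
        dataTable.Columns.Add("ID");
        dataTable.Columns.Add("Name");
        dataTable.Columns.Add("Department");
        dataTable.Columns.Add("Position");
        dataTable.Columns.Add("Level");

        foreach (Employee e in employees)
        {
            // Values are stored as strings, matching the string[] rows in the sample.
            dataTable.Rows.Add(e.Id.ToString(), e.Name, e.Department, e.Position, e.Level.ToString());
        }

        return dataTable;
    }
}
```

Under these assumptions, the hard-coded rows in the example could be replaced with table.DataSource = EmployeeTableBuilder.ToDataTable(employees); while the rest of the code stays unchanged.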

Create a Table Using PdfGrid Class

Below are the steps to create a table using the PdfGrid class.

  • Create a PdfDocument object.
  • Add a page to it using the PdfDocument.Pages.Add() method.
  • Create a PdfGrid object.
  • Set the table style through the PdfGrid.Style property.
  • Add rows to the table using the PdfGrid.Rows.Add() method, and columns using the PdfGrid.Columns.Add() method.
  • Insert data into specific cells through the PdfGridRow.Cells[index].Value property.
  • Span cells across columns or rows through the PdfGridRow.Cells[index].ColumnSpan or PdfGridRow.Cells[index].RowSpan property.
  • Set the formatting of a specific cell through the PdfGridRow.Cells[index].StringFormat and PdfGridRow.Cells[index].Style properties.
  • Draw the table on the PDF page using the PdfGrid.Draw() method.
  • Save the document to a PDF file using the PdfDocument.SaveToFile() method.
    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using Spire.Pdf.Grid;
    using System.Drawing;
    
    namespace CreateGrid
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Add a page
                PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(40));
    
                //Create a PdfGrid
                PdfGrid grid = new PdfGrid();
    
                //Set cell padding
                grid.Style.CellPadding = new PdfPaddings(1, 1, 1, 1);
    
                //Set font
                grid.Style.Font = new PdfTrueTypeFont(new Font("Times New Roman", 13f, FontStyle.Regular), true);
    
                //Add rows
                PdfGridRow row1 = grid.Rows.Add();
                PdfGridRow row2 = grid.Rows.Add();
                PdfGridRow row3 = grid.Rows.Add();
                PdfGridRow row4 = grid.Rows.Add();
                grid.Columns.Add(4);
    
                //Set column width
                foreach (PdfGridColumn col in grid.Columns)
                {
                    col.Width = 110f;
                }
    
                //Write data into specific cells
                row1.Cells[0].Value = "Order and Payment Status";
                row2.Cells[0].Value = "Order number";
                row2.Cells[1].Value = "Date";
                row2.Cells[2].Value = "Customer";
                row2.Cells[3].Value = "Paid or not";
                row3.Cells[0].Value = "00223";
                row3.Cells[1].Value = "2022/06/02";
                row3.Cells[2].Value = "Brick Lane Realty";
                row3.Cells[3].Value = "Yes";
                row4.Cells[0].Value = "00224";
                row4.Cells[1].Value = "2022/06/03";
                row4.Cells[3].Value = "No";
    
                //Span cell across columns
                row1.Cells[0].ColumnSpan = 4;
    
                //Span cell across rows
                row3.Cells[2].RowSpan = 2;
    
                //Set text alignment of specific cells
                row1.Cells[0].StringFormat = new PdfStringFormat(PdfTextAlignment.Center);
                row3.Cells[2].StringFormat = new PdfStringFormat(PdfTextAlignment.Left, PdfVerticalAlignment.Middle);
    
                //Set background color of specific cells
                row1.Cells[0].Style.BackgroundBrush = PdfBrushes.Orange;
                row4.Cells[3].Style.BackgroundBrush = PdfBrushes.LightGray;
    
                //Format cell border
                PdfBorders borders = new PdfBorders();
                borders.All = new PdfPen(Color.Orange, 0.8f);
                foreach (PdfGridRow pgr in grid.Rows)
                {
                    foreach (PdfGridCell pgc in pgr.Cells)
                    {
                        pgc.Style.Borders = borders;
                    }
                }
    
                //Draw table on the page
                grid.Draw(page, new PointF(0, 30));
    
                //Save the document to a PDF file
                doc.SaveToFile("PdfGrid.pdf");
            }
        }
    }
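
The example above fixes every column at 110 points, so the four columns take up 440 points regardless of the page. As a hedged alternative, the width can be derived from the drawable area instead. The helper below is a hypothetical, plain C# sketch (GridLayoutHelper is not a Spire.PDF type); it only does the arithmetic and leaves it to the caller to supply the available width.

```csharp
using System;

// Hypothetical helper; not part of Spire.PDF.
public static class GridLayoutHelper
{
    // Splits the available width evenly across the given number of columns.
    public static float EqualColumnWidth(float availableWidth, int columnCount)
    {
        if (columnCount <= 0) throw new ArgumentOutOfRangeException(nameof(columnCount));
        if (availableWidth <= 0f) throw new ArgumentOutOfRangeException(nameof(availableWidth));

        return availableWidth / columnCount;
    }
}
```

For instance, assuming a drawable width of roughly 515 points (an A4 page with 40-point margins on both sides), GridLayoutHelper.EqualColumnWidth(515f, 4) gives 128.75 points per column, and that value could be assigned to col.Width in the foreach loop instead of 110f. Reading the drawable width from the page itself, for example through page.Canvas.ClientSize.Width, is an assumption to verify against the Spire.PDF version in use.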


Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.


A table provides fast and efficient access to data arranged in rows and columns in a visually appealing way. Data presented in a table has more impact than plain text and lets readers easily compare values and understand the relationships between them. In this article, you will learn how to create a table in PDF in C# and VB.NET using Spire.PDF for .NET.

Spire.PDF for .NET offers the PdfTable and PdfGrid classes for working with tables in a PDF document. The PdfTable class is used to quickly create simple, regular tables without much formatting, while the PdfGrid class is used to create more complex tables.

The table below lists the differences between these two classes.

| Item | PdfTable | PdfGrid |
|---|---|---|
| Formatting | | |
| Row | Can be set through events. No API support. | Can be set through API. |
| Column | Can be set through API (StringFormat). | Can be set through API (StringFormat). |
| Cell | Can be set through events. No API support. | Can be set through API. |
| Others | | |
| Column span | Not supported. | Can be set through API. |
| Row span | Can be set through events. No API support. | Can be set through API. |
| Nested table | Can be set through events. No API support. | Can be set through API. |
| Events | BeginCellLayout, EndCellLayout, BeginRowLayout, EndRowLayout, BeginPageLayout, EndPageLayout | BeginPageLayout, EndPageLayout |

The following sections demonstrate how to create a table in PDF using the PdfTable class and the PdfGrid class, respectively.

Install Spire.PDF for .NET

To begin with, you need to add the DLL files included in the Spire.PDF for .NET package as references in your .NET project. The DLL files can be either downloaded from this link or installed via NuGet.

PM> Install-Package Spire.PDF

Create a Table Using PdfTable Class

Below are the steps to create a table using the PdfTable class.

  • Create a PdfDocument object.
  • Add a page to it using the PdfDocument.Pages.Add() method.
  • Create a PdfTable object.
  • Set the table style through the PdfTable.Style property.
  • Insert data into the table through the PdfTable.DataSource property.
  • Set the row height and row color through the BeginRowLayout event.
  • Draw the table on the PDF page using the PdfTable.Draw() method.
  • Save the document to a PDF file using the PdfDocument.SaveToFile() method.
    using System;
    using System.Data;
    using System.Drawing;
    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using Spire.Pdf.Tables;
    
    namespace CreateTable
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Add a page
                PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(40));
    
                //Create a PdfTable object
                PdfTable table = new PdfTable();
    
                //Set fonts for the header row and for the remaining cells
                table.Style.DefaultStyle.Font = new PdfTrueTypeFont(new Font("Times New Roman", 12f, FontStyle.Regular), true);
                table.Style.HeaderStyle.Font = new PdfTrueTypeFont(new Font("Times New Roman", 12f, FontStyle.Bold), true);
    
                //Create a DataTable
                DataTable dataTable = new DataTable();
                dataTable.Columns.Add("ID");
                dataTable.Columns.Add("Name");
                dataTable.Columns.Add("Department");
                dataTable.Columns.Add("Position");
                dataTable.Columns.Add("Level");
                dataTable.Rows.Add(new string[] { "1", "David", "IT", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "3", "Julia", "HR", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "4", "Sophie", "Marketing", "Manager", "1" });
                dataTable.Rows.Add(new string[] { "7", "Wickey", "Marketing", "Sales Rep", "2" });
                dataTable.Rows.Add(new string[] { "9", "Wayne", "HR", "HR Supervisor", "2" });
                dataTable.Rows.Add(new string[] { "11", "Mia", "Dev", "Developer", "2" });
    
                //Set the datatable as the data source of table
                table.DataSource = dataTable;
    
                //Show header (the header is hidden by default)
                table.Style.ShowHeader = true;
    
                //Set font color and background color of header row
                table.Style.HeaderStyle.BackgroundBrush = PdfBrushes.Gray;
                table.Style.HeaderStyle.TextBrush = PdfBrushes.White;
    
                //Set text alignment in header row
                table.Style.HeaderStyle.StringFormat = new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle);
    
                //Set text alignment in other cells
                for (int i = 0; i < table.Columns.Count; i++)
                {
                    table.Columns[i].StringFormat = new PdfStringFormat(PdfTextAlignment.Center, PdfVerticalAlignment.Middle);
                }
    
                //Register with BeginRowLayout event
                table.BeginRowLayout += Table_BeginRowLayout;
    
                //Draw table on the page
                table.Draw(page, new PointF(0, 30));
    
                //Save the document to a PDF file
                doc.SaveToFile("PdfTable.pdf");
            }
    
            //Event handler
            private static void Table_BeginRowLayout(object sender, BeginRowLayoutEventArgs args)
            {
                //Set row height
                args.MinimalHeight = 20f;
    
                //Alternate row color
                if (args.RowIndex < 0)
                {
                    return;
                }
                if (args.RowIndex % 2 == 1)
                {
                    args.CellStyle.BackgroundBrush = PdfBrushes.LightGray;
                }
                else
                {
                    args.CellStyle.BackgroundBrush = PdfBrushes.White;
                }
            }
        }
    }


Create a Table Using PdfGrid Class

Below are the steps to create a table using the PdfGrid class.

  • Create a PdfDocument object.
  • Add a page to it using the PdfDocument.Pages.Add() method.
  • Create a PdfGrid object.
  • Set the table style through the PdfGrid.Style property.
  • Add rows to the table using the PdfGrid.Rows.Add() method, and columns using the PdfGrid.Columns.Add() method.
  • Insert data into specific cells through the PdfGridRow.Cells[index].Value property.
  • Span cells across columns or rows through the PdfGridRow.Cells[index].ColumnSpan or PdfGridRow.Cells[index].RowSpan property.
  • Set the formatting of a specific cell through the PdfGridRow.Cells[index].StringFormat and PdfGridRow.Cells[index].Style properties.
  • Draw the table on the PDF page using the PdfGrid.Draw() method.
  • Save the document to a PDF file using the PdfDocument.SaveToFile() method.
    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using Spire.Pdf.Grid;
    using System.Drawing;
    
    namespace CreateGrid
    {
        class Program
        {
            static void Main(string[] args)
            {
                //Create a PdfDocument object
                PdfDocument doc = new PdfDocument();
    
                //Add a page
                PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(40));
    
                //Create a PdfGrid
                PdfGrid grid = new PdfGrid();
    
                //Set cell padding
                grid.Style.CellPadding = new PdfPaddings(1, 1, 1, 1);
    
                //Set font
                grid.Style.Font = new PdfTrueTypeFont(new Font("Times New Roman", 13f, FontStyle.Regular), true);
    
                //Add rows
                PdfGridRow row1 = grid.Rows.Add();
                PdfGridRow row2 = grid.Rows.Add();
                PdfGridRow row3 = grid.Rows.Add();
                PdfGridRow row4 = grid.Rows.Add();
                grid.Columns.Add(4);
    
                //Set column width
                foreach (PdfGridColumn col in grid.Columns)
                {
                    col.Width = 110f;
                }
    
                //Write data into specific cells
                row1.Cells[0].Value = "Order and Payment Status";
                row2.Cells[0].Value = "Order number";
                row2.Cells[1].Value = "Date";
                row2.Cells[2].Value = "Customer";
                row2.Cells[3].Value = "Paid or not";
                row3.Cells[0].Value = "00223";
                row3.Cells[1].Value = "2022/06/02";
                row3.Cells[2].Value = "Brick Lane Realty";
                row3.Cells[3].Value = "Yes";
                row4.Cells[0].Value = "00224";
                row4.Cells[1].Value = "2022/06/03";
                row4.Cells[3].Value = "No";
    
                //Span cell across columns
                row1.Cells[0].ColumnSpan = 4;
    
                //Span cell across rows
                row3.Cells[2].RowSpan = 2;
    
                //Set text alignment of specific cells
                row1.Cells[0].StringFormat = new PdfStringFormat(PdfTextAlignment.Center);
                row3.Cells[2].StringFormat = new PdfStringFormat(PdfTextAlignment.Left, PdfVerticalAlignment.Middle);
    
                //Set background color of specific cells
                row1.Cells[0].Style.BackgroundBrush = PdfBrushes.Orange;
                row4.Cells[3].Style.BackgroundBrush = PdfBrushes.LightGray;
    
                //Format cell border
                PdfBorders borders = new PdfBorders();
                borders.All = new PdfPen(Color.Orange, 0.8f);
                foreach (PdfGridRow pgr in grid.Rows)
                {
                    foreach (PdfGridCell pgc in pgr.Cells)
                    {
                        pgc.Style.Borders = borders;
                    }
                }
    
                //Draw table on the page
                grid.Draw(page, new PointF(0, 30));
    
                //Save the document to a PDF file
                doc.SaveToFile("PdfGrid.pdf");
            }
        }
    }


Apply for a Temporary License

If you would like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.
