What are the differences between structured and unstructured data? Electronic signature guide

What is structured data and how does it differ from unstructured data? How does it affect your daily work with electronic documents and the choice of the appropriate electronic signature format? In this article you will find a clear and practical explanation of this issue.

What is structured data?

Structured data is information organized according to a clearly defined schema. This means that individual data elements are located in the intended places - for example, in specific fields or tags - allowing computers to automatically locate and interpret them without guessing at the context.

Examples of structured data

  • XML files, such as a KSeF (Krajowy System e-Faktur) e-invoice. The Ministry of Finance states clearly that "PDF, JPG ≠ e-Invoice, e-Invoice = XML". A structured invoice is an invoice issued via KSeF together with the number assigned to identify it in the system. The definition of a structured invoice was introduced for the purposes of the VAT Act.
  • Electronic forms, where data are entered into specific fields (e.g., first name, last name, address).
  • Tabular files (CSV, XLS/XLSX), where information is arranged in rows and columns.

 

Thanks to their structure, such data are ideal for automatic processing. Computer systems can easily import them into databases, check their correctness and use the information, for example, to generate reports or statements. Typical examples are official forms and e-invoices, which can be mass-processed by government systems.
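To illustrate what "automatic processing" means in practice, here is a minimal Python sketch that reads fields from a small XML invoice. The element names (Invoice, Number, Seller, GrossAmount) are hypothetical and deliberately simplified; they are not the real KSeF schema.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical, simplified invoice. This is NOT the real KSeF schema.
INVOICE_XML = """<Invoice>
  <Number>FV/2024/001</Number>
  <Seller>Example Sp. z o.o.</Seller>
  <GrossAmount currency="PLN">1230.00</GrossAmount>
</Invoice>"""


def read_invoice(xml_text: str) -> dict:
    """Read known fields from their fixed places in the document."""
    root = ET.fromstring(xml_text)
    amount = root.find("GrossAmount")
    return {
        "number": root.findtext("Number"),
        "seller": root.findtext("Seller"),
        # Decimal avoids floating-point rounding for money values.
        "gross_amount": Decimal(amount.text),
        "currency": amount.get("currency"),
    }


invoice = read_invoice(INVOICE_XML)
print(invoice)
```

Because every value sits in a named element, the program needs no guessing: it asks for "Number" and receives the invoice number.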

Unstructured data - what is it?

Unstructured data do not have a fixed schema. They are simply texts, images or scans that are primarily intended to be read by humans. A computer cannot reliably extract detailed information from them without additional processing, because they lack metadata that defines the structure of the content.

Examples of unstructured data

  • PDF documents, such as scans of letters or traditional contracts.
  • DOCX or ODT text files, where the content is continuous and there are no isolated information fields.
  • Images or scans saved as JPG, PNG or PDF containing only graphics.

 

These documents are easy for humans to read, but require additional processing to extract usable data for computers.
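The "additional processing" can be sketched as follows. This Python example assumes the text of a letter has already been obtained (for instance by OCR) and tries to find an invoice number and a total with regular expressions. The sample texts and patterns are invented for illustration only.

```python
import re

# Hypothetical text of a letter, e.g. the result of OCR on a scan.
LETTER_TEXT = (
    "Dear Sir or Madam, please find attached invoice no. FV/2024/001 "
    "for a total of 1230.00 PLN, payable within 14 days."
)

# Patterns that only work if the author happened to use this exact wording.
INVOICE_NO = re.compile(r"invoice no\.\s*(\S+)")
TOTAL = re.compile(r"total of\s*([\d.]+)\s*([A-Z]{3})")


def guess_invoice_fields(text: str) -> dict:
    """Best-effort extraction; returns None for anything not found."""
    number = INVOICE_NO.search(text)
    total = TOTAL.search(text)
    return {
        "number": number.group(1) if number else None,
        "total": total.group(1) if total else None,
        "currency": total.group(2) if total else None,
    }


found = guess_invoice_fields(LETTER_TEXT)
missed = guess_invoice_fields("The amount due for the last invoice is 1230 zlotys.")
print(found)
print(missed)
```

The second call returns only empty values even though a human would understand the sentence at once. Without a schema, every change of wording can break the extraction.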

How to choose the format of an electronic signature?

The distinction between structured and unstructured data is important when choosing an electronic signature format.

XAdES signature - for structured data

XAdES (XML Advanced Electronic Signatures) is the optimal solution for XML documents, such as e-invoices and official forms (e.g. tax returns or Social Security applications). This signature does not interfere with the structure of the document, thus preserving data integrity. It can be saved inside the XML file or separately (as an external .xades or .xml file).

XAdES allows the signed document to be automatically loaded into the computer system, making further processing easier and faster.

PAdES and CAdES signatures - for unstructured data

  • PAdES (PDF Advanced Electronic Signatures) is ideal for PDF documents. The signature is integrated directly into the PDF file, so the recipient receives one convenient document for verification. In addition, the signature can be shown graphically in the document, which makes it easier to see that the document is signed and by whom.
  • CAdES (CMS Advanced Electronic Signatures) can be used to sign virtually any file (Word, image, ZIP). It typically creates a separate signature file (usually .p7s) that accompanies the original document. It is especially useful where you can't or don't want to convert the document to PDF. Keep in mind that this is the oldest of the three formats, derived from the classic cryptographic signature technology that has been in use for many years. You can read more about its limitations in "Internal vs. external signatures - differences, formats and applications (Part 2)".

In practice: which format to choose?

  • Structured data (e.g. XML) are best signed with the XAdES format.
  • Unstructured data (e.g. PDF, DOCX, scans) are best signed with the PAdES format (for PDF) or the CAdES format (for other file types).
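The rule of thumb above can be written as a short Python sketch. The function name suggest_signature_format is hypothetical, and the file extension is only a rough hint of the file type, so treat this as an illustration rather than a complete decision procedure.

```python
from pathlib import Path


def suggest_signature_format(filename: str) -> str:
    """Suggest a signature format from the file extension (heuristic only)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".xml":
        return "XAdES"  # structured data
    if suffix == ".pdf":
        return "PAdES"  # unstructured data in a PDF file
    return "CAdES"      # any other file type (DOCX, JPG, ZIP, ...)


for name in ["invoice.xml", "contract.PDF", "scan.jpg", "archive.zip"]:
    print(name, "->", suggest_signature_format(name))
```

In a real workflow the recipient's requirements should take precedence, for example when an office accepts only one specific format.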

 

Choosing the electronic signature format that matches the type of data ensures convenience, document integrity and efficient further use of the document by both humans and computer systems.

However, regardless of the format chosen, both types of signature (SimplySign and Certum Mini) guarantee the same level of security and legal validity, derived from a qualified certificate.

With this knowledge, we can consciously choose the format and type of signature best suited to a given situation, ensuring both convenience and compliance.

 

SimplySign and Certum Mini qualified signatures

If you have any concerns or questions, please contact us

+48 22 417 05 55

We will answer your questions, find a date that suits you and an advisor in Gdansk, Gdynia, Krakow, Warsaw or Wroclaw.

You can also write us an email at [email protected].
