Document AI Glossary

Key terms and concepts in intelligent document processing, AI extraction, and document automation — explained in plain language.

Confidence Score

A numeric value (0.0 to 1.0) that indicates how certain the extraction model is about a returned value. Used to route low-confidence results to human review and high-confidence results to automated workflows.
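
As a rough illustration of the routing described above, the sketch below splits extracted fields into an automated set and a human-review set. The threshold value, the `route` function, and the field names are hypothetical and chosen only for the example; a real cutoff would be tuned per field and per workflow.

```python
# Hypothetical cutoff; not a value taken from any specific product.
REVIEW_THRESHOLD = 0.85

def route(fields, threshold=REVIEW_THRESHOLD):
    """Split {name: (value, confidence)} into (automated, needs_review) dicts."""
    automated, needs_review = {}, {}
    for name, (value, confidence) in fields.items():
        # Confidence scores are expected to lie in the range 0.0 to 1.0.
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence for {name!r} is out of range: {confidence}")
        target = automated if confidence >= threshold else needs_review
        target[name] = value
    return automated, needs_review

# Example input: illustrative values only.
extracted = {
    "invoice_number": ("INV-1042", 0.97),
    "total": ("1250.50", 0.62),
}
automated, needs_review = route(extracted)
```

Here `invoice_number` would continue to the automated workflow, while `total` would be queued for a person to check.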

Document Digitization

The end-to-end process of converting physical or PDF documents into machine-readable, structured data — spanning scanning, text extraction, classification, and data extraction.

Few-Shot Learning for Document AI

A technique in which a small number of annotated examples is used to improve extraction accuracy. In DocumentIQ, users draw bounding boxes on documents to create few-shot examples that are injected into LLM prompts.
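
To make the idea of injecting examples into a prompt concrete, here is a minimal sketch. The `Example` type, the `build_prompt` function, the prompt wording, and the bounding-box coordinate convention are all assumptions made for illustration; this is not DocumentIQ's actual prompt format.

```python
from dataclasses import dataclass

@dataclass
class Example:
    field: str    # name of the field the user annotated
    text: str     # text found inside the drawn bounding box
    bbox: tuple   # hypothetical (x0, y0, x1, y1) page coordinates

def build_prompt(examples, document_text):
    """Assemble an extraction prompt that includes the annotated examples."""
    lines = ["Extract the labeled fields from the document.", "", "Examples:"]
    for ex in examples:
        lines.append(f'- field "{ex.field}" at bbox {ex.bbox}: "{ex.text}"')
    lines += ["", "Document:", document_text]
    return "\n".join(lines)

# Illustrative annotations, as a user might create by drawing boxes.
examples = [
    Example("invoice_number", "INV-1042", (50, 40, 180, 60)),
    Example("total", "1250.50", (400, 700, 480, 720)),
]
prompt = build_prompt(examples, "Invoice INV-2001 ... Total due: 980.00")
```

The resulting string would then be sent to whichever LLM performs the extraction.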

Intelligent Document Processing (IDP)

An AI-driven approach to extracting, classifying, and validating data from documents. IDP goes beyond basic OCR by understanding document context, layout, and meaning.

LLM-Based Document Extraction

Using large language models to extract structured data from document text. Instead of relying on rigid templates, LLMs use context to identify fields across varied document layouts.

Optical Character Recognition (OCR)

Technology that converts images of text (from scanned documents, photos, or PDFs) into machine-readable text. OCR is the foundation of document digitization, but on its own it produces only raw text, with no understanding of document context, layout, or meaning.

Retrieval-Augmented Generation (RAG)

A technique that enhances LLM responses by retrieving relevant document chunks via vector search before generating an answer. DocumentIQ uses RAG to power project-scoped chat assistants.
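
The retrieve-then-generate flow can be sketched in a few lines. This is a toy under stated assumptions: `embed` uses bag-of-words term counts as a stand-in for a real embedding model, the in-memory list stands in for a vector store, and all function names are hypothetical. It does not describe how DocumentIQ implements RAG.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    # Rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks, k=2):
    # Place the retrieved chunks in the prompt ahead of the question.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Invoice 1042 is due on 2024-03-01.",
    "The lease term is five years.",
    "Payment terms are net 30 days.",
]
prompt = build_prompt("When is invoice 1042 due?", chunks, k=1)
```

In a production setup, the embedding function and the similarity search would be replaced by a trained embedding model and a vector index; the overall shape (embed, search, assemble prompt, generate) stays the same.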

Structured Data Extraction

The process of converting unstructured document content into organized, machine-readable data with defined field schemas, types, and confidence scores.
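
As a small sketch of what a defined field schema with types and confidence scores might look like, the example below converts raw extracted strings into typed values. The `FieldSpec` and `ExtractedField` classes, the schema contents, and the sample values are all hypothetical and serve only to illustrate the definition.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable

@dataclass
class FieldSpec:
    name: str
    parse: Callable[[str], Any]  # converts raw text to the field's type

@dataclass
class ExtractedField:
    name: str
    value: Any
    confidence: float

# Hypothetical schema for an invoice.
SCHEMA = [
    FieldSpec("invoice_number", str),
    FieldSpec("total", float),
    FieldSpec("due_date", date.fromisoformat),
]

def to_structured(raw, schema=SCHEMA):
    """Turn {name: (raw_text, confidence)} into typed ExtractedField records."""
    result = []
    for spec in schema:
        text, confidence = raw[spec.name]
        result.append(ExtractedField(spec.name, spec.parse(text), confidence))
    return result

# Illustrative raw output from an extraction step.
raw = {
    "invoice_number": ("INV-1042", 0.98),
    "total": ("1250.50", 0.91),
    "due_date": ("2024-03-01", 0.77),
}
records = to_structured(raw)
```

Each record now carries a typed value (a string, a float, and a date) alongside the confidence score reported for it.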
