ToolMint
PDF Tools | 4 min read | April 29, 2026

Why Is My Scanned PDF So Large and How to Fix It

A short scanned document that should be a few hundred kilobytes often comes out as 20, 30, or even 50MB. This is one of the most common PDF complaints and one of the easiest to fix once you understand why it happens.

The Real Reason Scanned PDFs Are So Large

When you scan a physical document, the scanner captures each page as a photograph: a high-resolution raster image. That image is then embedded inside a PDF container. A single letter-size page scanned at 300 DPI is roughly 2550x3300 pixels, which is about 25MB of uncompressed color data. Even after the scanner compresses it (typically as JPEG), each color page can still be 2-5MB before any PDF overhead. A 10-page document contains 10 of these images. That is where the 20-50MB total comes from.
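The pixel arithmetic above can be checked with a short sketch. It assumes a US Letter page (8.5 x 11 inches) and 24-bit color (3 bytes per pixel); the function names are illustrative only, and the result is the uncompressed size, not what a scanner actually writes to disk.

```python
def page_pixels(dpi, width_in=8.5, height_in=11.0):
    """Return (width_px, height_px) for a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_rgb_bytes(dpi, width_in=8.5, height_in=11.0):
    """Uncompressed size of a 24-bit color scan: 3 bytes per pixel."""
    width_px, height_px = page_pixels(dpi, width_in, height_in)
    return width_px * height_px * 3


width_px, height_px = page_pixels(300)
print(width_px, height_px)                        # 2550 3300
print(round(raw_rgb_bytes(300) / 1_000_000, 1))   # 25.2 (MB, uncompressed)
```

The gap between roughly 25MB raw and 2-5MB on disk is the compression the scanner already applies to each page image.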

Color Scanning vs. Grayscale vs. Black and White

Color scans are the largest. A color image stores red, green, and blue values for every pixel. Grayscale stores only brightness. Black and white stores only on or off. For most text documents (contracts, forms, handwritten notes), color scanning adds little or no useful information. Switching to grayscale typically reduces file size by roughly 65%. Switching to black and white for pure text typically reduces it by 80-90%.

  • Color scan: up to 5MB per page (unnecessary for most text documents)
  • Grayscale: 1.5-2MB per page
  • Black and white: 0.3-0.8MB per page
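As a rough illustration of why the three modes differ so much, the sketch below compares uncompressed sizes using the usual bit depths (24 bits for color, 8 for grayscale, 1 for black and white). The page size and names are assumptions for the example. Real files are compressed, so the per-page figures above will not match these raw numbers exactly.

```python
# Bits stored per pixel in each scan mode (typical values).
BITS_PER_PIXEL = {"color": 24, "grayscale": 8, "black_and_white": 1}


def raw_page_bytes(mode, dpi=300, width_in=8.5, height_in=11.0):
    """Uncompressed bytes for one page in the given mode (row padding ignored)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * BITS_PER_PIXEL[mode] // 8


def reduction_vs_color(mode):
    """Fraction of raw data saved compared with a color scan."""
    return 1 - BITS_PER_PIXEL[mode] / BITS_PER_PIXEL["color"]


for mode in BITS_PER_PIXEL:
    size_mb = raw_page_bytes(mode) / 1_000_000
    saved = round(reduction_vs_color(mode) * 100)
    print(f"{mode}: {size_mb:.1f} MB raw, {saved}% smaller than color")
```

Before compression, grayscale holds one third of the data of color and black and white holds one twenty-fourth, which is the same direction as the savings quoted above.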

How to Fix an Already Large Scanned PDF

If the PDF has already been scanned at high resolution in color, use the Compress PDF tool. Upload the file and apply High compression. For scanned text documents, High compression is generally safe: it reduces redundant image data while keeping the text legible. A 30MB 10-page scanned contract typically compresses to under 2MB at High compression, with fully readable text.
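For readers who prefer to compress offline, here is a minimal sketch that calls Ghostscript from Python. This is an alternative technique, not a description of how the Compress PDF tool works internally. It assumes Ghostscript is installed and available as `gs` (on Windows the executable is usually `gswin64c`); the function names and file names are hypothetical.

```python
import subprocess

# Ghostscript quality presets, roughly from lowest to highest quality.
GS_PRESETS = ("/screen", "/ebook", "/printer", "/prepress")


def build_gs_command(src, dst, preset="/ebook", gs_binary="gs"):
    """Build the Ghostscript command that rewrites src as a smaller PDF at dst."""
    if preset not in GS_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        gs_binary,
        "-sDEVICE=pdfwrite",          # write a new PDF
        f"-dPDFSETTINGS={preset}",    # downsample and recompress images
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def compress_pdf(src, dst, preset="/ebook"):
    """Run Ghostscript; raises CalledProcessError if it fails."""
    subprocess.run(build_gs_command(src, dst, preset), check=True)
```

Calling `compress_pdf("scan.pdf", "scan-small.pdf")` would write a recompressed copy and leave the original untouched. The `/ebook` preset targets about 150 DPI images, which is usually enough for text; check the output for legibility before deleting the source file.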

Preventing Large Scans in the Future

Change your scanner settings before scanning. For text documents like contracts, letters, and forms: scan at 150-200 DPI in grayscale. This produces sharp, readable text at a fraction of the size of a color 300 DPI scan. For documents with photos or color diagrams that must be preserved accurately: use 200-300 DPI color. For archival copies that must be searchable: 300 DPI grayscale with OCR is the professional standard.
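To see how much the recommended settings save, the sketch below compares each one against a 300 DPI color scan using uncompressed pixel data. The preset names are made up for this example, and 200 DPI is used as the upper end of the 150-200 DPI range for text. Pixel count grows with the square of the DPI, so a small drop in resolution has a large effect.

```python
# Scanner settings from the recommendations above (preset names are illustrative).
SCAN_PRESETS = {
    "text": {"dpi": 200, "mode": "grayscale"},
    "photos_or_diagrams": {"dpi": 300, "mode": "color"},
    "searchable_archive": {"dpi": 300, "mode": "grayscale"},
}

# Color channels stored per pixel.
CHANNELS = {"grayscale": 1, "color": 3}


def relative_raw_size(preset, ref_dpi=300, ref_mode="color"):
    """Raw data for a preset as a fraction of a reference scan (default: 300 DPI color)."""
    settings = SCAN_PRESETS[preset]
    pixel_ratio = (settings["dpi"] / ref_dpi) ** 2   # area scales with DPI squared
    channel_ratio = CHANNELS[settings["mode"]] / CHANNELS[ref_mode]
    return pixel_ratio * channel_ratio


for name in SCAN_PRESETS:
    print(f"{name}: {relative_raw_size(name):.0%} of a 300 DPI color scan")
```

Under these assumptions a 200 DPI grayscale scan carries about 15% of the raw data of a 300 DPI color scan, before any compression is applied.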

Why OCR Makes Scanned PDFs Both Smaller and Searchable

Running OCR on a scanned PDF creates a text layer on top of the image. Some OCR tools then replace the image pages with the recognized text, dramatically reducing file size. Others add searchable text without removing the images. Use the PDF to Text tool to extract searchable text from scanned PDFs. For documents that must remain as PDF but be searchable, professional OCR tools create an invisible text layer while keeping the visual appearance.


Frequently Asked Questions

Should I scan documents at 300 DPI or 600 DPI?
300 DPI is plenty for readable text documents, and 150-200 DPI is often enough for everyday copies. 600 DPI is only necessary for fine detail like architectural drawings, small-print legal documents, or photo-quality preservation.
Will compressing a scanned PDF make the text unreadable?
Not for typical text documents: the text remains clear even at High compression. Only very fine print or extremely low-contrast scans may show degradation.
My scanner does not have a DPI setting — what do I do?
Use a phone scanning app like Microsoft Lens or Adobe Scan, which let you choose scan quality. Alternatively, compress the resulting PDF using the Compress PDF tool.
Can I convert a scanned PDF into a smaller text-based PDF?
OCR software can create a searchable text layer over the scanned image, but true text-based PDF conversion requires high-quality OCR and produces mixed results for handwritten content.
