Convert a PDF to Plain Text

Sometimes you want the words without the styling. Plain text is the most universal format there is.


You're feeding documents into an AI model, or running search across a hundred contracts, or just want to grep for a phrase. You don't care about formatting. You want the words from each document, in plain text.

When plain text is the right answer

Text analytics: word counts, topic modelling, sentiment analysis, fuzzy matching across documents.

AI prompts: feeding contract text to a language model.

Search: grep, ripgrep, command-line tools.

Light editing: quick copy-paste into emails or notes.

For anything that needs structure (headings, tables, formatting), prefer Word or HTML instead.
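For the search and text-analytics cases, a minimal Python sketch of what working with the converted files might look like. It assumes the .txt files sit together in one folder and are UTF-8 encoded; the function names `search_texts` and `word_counts` are illustrative and not part of Flint.

```python
import re
from collections import Counter
from pathlib import Path


def search_texts(folder, phrase):
    """Yield (file name, line number, line) for every line containing phrase.

    Matching is case-insensitive. Assumes the folder holds UTF-8 .txt files.
    """
    needle = phrase.casefold()
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                yield path.name, number, line.strip()


def word_counts(text):
    """Count case-folded words in a string."""
    return Counter(re.findall(r"\w+", text.casefold()))
```

Calling `list(search_texts("converted", "indemnity"))` would list every matching line in a hypothetical `converted` folder, much as grep would.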

How to convert

Flint's convert hub outputs plain text. One .txt file per PDF, all content in reading order. UTF-8 encoding by default — safe for any modern tool.

If the PDF is scanned, OCR runs automatically. OCR can misread characters, so proofread important passages.

Reading order

Preserving reading order is the converter's main job. Two-column layouts come across as a single column: the left column top to bottom, then the right. Footnotes appear at the end of each page or at the end of the document, depending on how the original was structured.

If the order is off, you usually want a different conversion (HTML or Word) that preserves structure better.

Encoding gotchas

Special characters (curly quotes, em-dashes, accented letters) sometimes come through as escape codes or are replaced with ASCII equivalents. UTF-8 output handles most cases. If you're feeding into a system that expects ASCII only, convert curly quotes to straight quotes with find-and-replace.
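If find-and-replace by hand gets tedious, the same substitution can be scripted. The sketch below is one possible approach in Python, not something Flint provides: the mapping covers only common typographic punctuation, and the names `PUNCTUATION_MAP`, `to_ascii_punctuation` and `non_ascii_chars` are made up for illustration. Accented letters are deliberately left alone, since replacing them changes the words.

```python
# Typographic punctuation mapped to plain ASCII stand-ins.
# This list is an assumption about what typically appears; extend it as needed.
PUNCTUATION_MAP = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
    "\u00a0": " ",    # non-breaking space
})


def to_ascii_punctuation(text):
    """Replace curly quotes, dashes and similar marks with ASCII equivalents."""
    return text.translate(PUNCTUATION_MAP)


def non_ascii_chars(text):
    """Return the sorted set of characters that are still outside ASCII."""
    return sorted({ch for ch in text if ord(ch) > 127})
```

Running `non_ascii_chars` on the result shows what is left over, such as accented letters, so you can decide case by case how the destination system should handle them.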

FAQ

Will tables come across?

As text rows. Columns are separated by whitespace, which is fine for human reading but messy for automated parsing. Use CSV or Excel for tabular data.
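To illustrate why whitespace-separated rows are awkward to parse, here is a rough Python sketch that splits a row wherever two or more whitespace characters appear in a row. It rests on an assumption that may not hold for a given file: that columns are always separated by at least two spaces and that cell text contains only single spaces. The function name `split_row` is hypothetical.

```python
import re


def split_row(line, min_gap=2):
    """Split a text row into cells on runs of at least min_gap whitespace characters.

    Heuristic only: an empty cell leaves no trace, so later cells shift left.
    """
    pattern = r"\s{%d,}" % min_gap
    return [cell for cell in re.split(pattern, line.strip()) if cell]
```

A row such as `Widget A    12   $4.50` splits into three cells, but a row with a blank cell or a tightly spaced column will not, which is why CSV or Excel output is the safer route for tabular data.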

Can I get just specific pages?

Yes — split the PDF first to extract specific pages, then convert that subset.

What about formatting like bold and italic?

Lost in plain text. The whole point of .txt is no formatting. Use Markdown or HTML if you want light formatting preserved.

Will it work for many languages?

Yes — UTF-8 supports virtually every modern language. Make sure your destination tool reads UTF-8 too.

Just the words, nothing else. Convert your PDF to text for analysis or search.

Try it now

Drop a PDF in and you'll be done in seconds — no install, files private to your account.
