Your accounting system exported 500 invoices into one PDF. Each invoice starts with 'INVOICE NUMBER:' on a new page. You need them as separate files.
Content-based splitting handles exactly this case. Flint starts a new file wherever a text pattern appears.
How content split works
Open Split PDF, drop in your invoice export, and choose 'split by content'. Type the text marker, in this case 'INVOICE NUMBER:'. Flint scans every page for that text and starts a new output file at each page where the marker appears. 500 invoices become 500 files.
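The boundary logic described above can be sketched in a few lines of Python. This is an illustration of the idea, not Flint's actual code: the function name split_points is hypothetical, page_texts stands in for whatever text each page yields, and keeping pages that precede the first marker as their own leading file is an assumption.

```python
def split_points(page_texts, marker):
    """Return (start, end) page-index ranges, end exclusive, one per output file."""
    if not page_texts:
        return []
    needle = marker.lower()
    # A page starts a new file when the marker appears anywhere in its text.
    starts = [i for i, text in enumerate(page_texts) if needle in text.lower()]
    # Assumption: pages before the first marker are kept as a leading file.
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    # Each file runs up to the next start, and the last one to the final page.
    ends = starts[1:] + [len(page_texts)]
    return list(zip(starts, ends))


pages = [
    "INVOICE NUMBER: 1001\nAcme Ltd",
    "continued line items",
    "INVOICE NUMBER: 1002",
    "Invoice number: 1003",
    "payment terms",
]
ranges = split_points(pages, "INVOICE NUMBER:")
```

Here the five sample pages produce three ranges: pages 0-1, page 2, and pages 3-4. A multi-page invoice stays together because only the page carrying the marker opens a new file.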
Pattern matching options
Plain text matches case-insensitively by default; turn on case-sensitive matching if needed. Regex mode handles patterns like 'Invoice [0-9]+', where the marker includes an invoice number that changes from one invoice to the next. Most accounting exports have a consistent format you can target with a simple phrase.
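To make the difference between the modes concrete, here is a rough Python equivalent using the standard re module. The build_matcher helper and its parameters are hypothetical, and whether Flint's regex mode also follows the case-sensitivity toggle is an assumption made for this sketch.

```python
import re


def build_matcher(marker, regex=False, case_sensitive=False):
    """Return a function that reports whether a page's text contains the marker."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # In plain-text mode, escape the marker so characters like [ ] + match literally.
    pattern = marker if regex else re.escape(marker)
    compiled = re.compile(pattern, flags)
    return lambda text: compiled.search(text) is not None


plain = build_matcher("INVOICE NUMBER:")
strict = build_matcher("INVOICE NUMBER:", case_sensitive=True)
by_regex = build_matcher("Invoice [0-9]+", regex=True)
literal = build_matcher("Invoice [0-9]+")
```

With these matchers, plain accepts 'invoice number: 7' while strict rejects it, and by_regex accepts 'Invoice 12345' while literal only matches the characters '[0-9]+' exactly as typed.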
What you should not target
Avoid markers that also appear mid-page, for example in body text or a running footer. Flint starts a new file at any page containing the marker, so you would split in the middle of a document. Avoid markers that don't always appear on a new page. Test with a small sample first: drop in 10 pages, run the split, confirm the boundaries are right, then scale up.
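If you can get at the extracted text of your sample pages, a quick audit along these lines can flag risky markers before you run the full split. It is a heuristic sketch, not a Flint feature: audit_marker and the head_chars threshold are hypothetical, and it assumes the extracted text roughly follows the visual top-to-bottom order of the page, which is not guaranteed for every PDF.

```python
def audit_marker(page_texts, marker, head_chars=200):
    """Group 1-based page numbers by where the marker shows up on each page."""
    needle = marker.lower()
    report = {"top": [], "mid_page": [], "repeated": []}
    for number, text in enumerate(page_texts, start=1):
        lowered = text.lower()
        first = lowered.find(needle)
        if first == -1:
            continue  # no marker on this page
        if lowered.count(needle) > 1:
            report["repeated"].append(number)
        # Assumption: a marker within the first head_chars characters is "at the top".
        if first < head_chars:
            report["top"].append(number)
        else:
            report["mid_page"].append(number)
    return report


sample = [
    "INVOICE NUMBER: 1\nline items",
    "x" * 300 + " see INVOICE NUMBER: 1 above",
    "no marker here",
    "Invoice Number: 2, ref invoice number: 2",
]
report = audit_marker(sample, "INVOICE NUMBER:")
```

Any page listed under mid_page or repeated is a sign the marker is too loose; tighten the phrase or switch to a more specific pattern before scaling up.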
Naming outputs from the matched text
If your marker captures the invoice number ('Invoice #12345'), Flint can use the matched text in the output filename, giving you 'Invoice-12345.pdf', 'Invoice-12346.pdf', and so on. This is the difference between sortable, searchable files and a folder of unidentifiable splits.
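The naming step amounts to pulling the captured number out of the match and turning it into a safe filename. The sketch below shows one way to do that in Python; output_name, its default pattern, and the character clean-up rule are assumptions for illustration, since how Flint builds filenames internally is not described here.

```python
import re


def output_name(page_text, pattern=r"Invoice #(\d+)", prefix="Invoice"):
    """Build a filename from the first capture group, or None if nothing matches."""
    match = re.search(pattern, page_text, re.IGNORECASE)
    if match is None:
        return None
    raw = match.group(1)
    # Replace anything that is awkward in filenames (slashes, spaces) with a hyphen.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "-", raw).strip("-")
    return f"{prefix}-{safe}.pdf"


name = output_name("Invoice #12345\nBill to: Acme Ltd")
missing = output_name("Account statement")
slashed = output_name("Invoice #A/77", pattern=r"Invoice #(\S+)")
```

A page with no match returns None, which is a useful cue to fall back to a numbered name rather than overwrite another file.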
FAQ
Does this work on scanned PDFs?
Only if the scan has been OCR'd (text layer present). Pure image scans aren't searchable, so content split can't see anything.
What if a marker spans two lines?
Most PDF text extraction reassembles lines. If yours doesn't, simplify the marker to a single-line phrase.
How do I know if my PDF has searchable text?
Try to select text in any PDF reader. If you can highlight and copy, text is there. If selection covers the whole page as one image, it's image-only.
Can I split by content AND by size at once?
One operation at a time. Split by content first, then size-split any oversize results.
Text marker in, individual files out. Split by content.