File uploads & OCR
When someone attaches a file to ChatGPT, Claude, Gemini, or any other supported tool, PromptSpotter extracts the text in the browser and runs the same detection rules over it — before the file leaves the page. Here’s what we support and how it works.
Supported file types
PromptSpotter extracts text from these formats:
- Plain text: .txt, .md, .csv, .tsv, .log, .json, .yaml, .xml
- Code: .js, .ts, .py, .go, .rb, .java, .rs, .sql and most other source-code extensions
- Documents: .pdf (with embedded text), .docx, .rtf
- Spreadsheets: .xlsx, .ods (cell text is extracted, formulas are not evaluated)
- Slides: .pptx (slide text and speaker notes)
- Images and scanned PDFs: only when OCR is turned on (see below)
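One way to picture the format list above is as a lookup from file extension to extraction category, with images gated on the OCR setting. The sketch below is illustrative only: the category names, the function names, and the image extensions (.png, .jpg, .jpeg) are assumptions, not PromptSpotter's actual implementation, and the code list is not exhaustive.

```typescript
// Hypothetical category names; the groupings mirror the supported-formats list.
type FileCategory =
  | "plain-text" | "code" | "document" | "spreadsheet" | "slides" | "image";

const CATEGORY_EXTENSIONS: [FileCategory, string[]][] = [
  ["plain-text", ["txt", "md", "csv", "tsv", "log", "json", "yaml", "xml"]],
  ["code", ["js", "ts", "py", "go", "rb", "java", "rs", "sql"]], // not exhaustive
  ["document", ["pdf", "docx", "rtf"]],
  ["spreadsheet", ["xlsx", "ods"]],
  ["slides", ["pptx"]],
  ["image", ["png", "jpg", "jpeg"]], // assumed; the article names no image extensions
];

// A Map avoids accidental hits on inherited object keys such as "constructor".
const EXTENSION_TO_CATEGORY = new Map<string, FileCategory>();
for (const [category, extensions] of CATEGORY_EXTENSIONS) {
  for (const ext of extensions) EXTENSION_TO_CATEGORY.set(ext, category);
}

// Returns the lowercased extension without the dot, or "" if there is none.
function extensionOf(filename: string): string {
  const dot = filename.lastIndexOf(".");
  return dot <= 0 ? "" : filename.slice(dot + 1).toLowerCase();
}

// Returns the extraction category, or null for formats that are not parsed.
function categorize(filename: string): FileCategory | null {
  return EXTENSION_TO_CATEGORY.get(extensionOf(filename)) ?? null;
}

// Images are only scanned when the workspace has OCR turned on.
// Note: a scanned PDF cannot be told apart from a text PDF by extension alone.
function shouldExtract(filename: string, ocrEnabled: boolean): boolean {
  const category = categorize(filename);
  if (category === null) return false;
  if (category === "image") return ocrEnabled;
  return true;
}
```

For example, `shouldExtract("screenshot.png", false)` is false under these assumptions, while `shouldExtract("report.PDF", false)` is true because the extension check is case-insensitive.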
How extraction works
When the user picks a file in the AI tool’s upload picker, the PromptSpotter extension intercepts the file before the page’s own upload handler sees it. We read the file with the browser’s built-in APIs (no upload to our servers), pull the text out locally, and run detection. If the file is clean, the upload proceeds as normal — the user sees no delay beyond a brief spinner.
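The local read-then-detect step described above can be sketched roughly as follows. This is a minimal illustration for plain-text files, not the extension's real code: `Rule`, `Finding`, `detect`, `scanBytes`, and the single example rule are all hypothetical, and how the extension actually intercepts the upload picker is not shown.

```typescript
interface Rule { name: string; pattern: RegExp; }
interface Finding { rule: string; fragment: string; }

// Illustrative rule only; PromptSpotter's real detection rules are not public here.
const EXAMPLE_RULES: Rule[] = [
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
];

// Runs every rule over the text and collects the matched substrings.
function detect(text: string, rules: Rule[]): Finding[] {
  const findings: Finding[] = [];
  for (const rule of rules) {
    // Fresh global regex per call so lastIndex state never leaks between scans.
    const re = new RegExp(rule.pattern.source, "g");
    let match: RegExpExecArray | null;
    while ((match = re.exec(text)) !== null) {
      findings.push({ rule: rule.name, fragment: match[0] });
    }
  }
  return findings;
}

// Decodes the file bytes in memory (no network call) and reports whether it is clean.
function scanBytes(
  bytes: Uint8Array,
  rules: Rule[],
): { clean: boolean; findings: Finding[] } {
  const text = new TextDecoder("utf-8").decode(bytes);
  const findings = detect(text, rules);
  return { clean: findings.length === 0, findings };
}
```

In a browser, the bytes could come from the standard File API, for instance `new Uint8Array(await file.arrayBuffer())`. Formats such as .docx or .pdf would need a format-specific extraction step before `detect` runs.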
The original file is never sent to PromptSpotter’s servers. Only the matched fragments (e.g. the substring that looks like an API key) are logged to your activity feed, and only if your workspace has logging turned on.
OCR for images and scanned PDFs
OCR — optical character recognition — is the process of pulling text out of pixels. We use it for screenshots, photos of whiteboards, and PDFs that were scanned rather than exported from a word processor. OCR is off by default because:
- It uses noticeably more CPU than text extraction, which can slow down older laptops.
- Many teams don’t upload images to AI tools, so they would pay the extra CPU cost for no benefit.
Turn OCR on for your workspace
- Go to Policy in the admin console.
- Scroll to File scanning and toggle OCR for images and scanned PDFs.
- Save. The change rolls out to all extensions within about a minute.
File types we don’t scan
Encrypted files (password-protected .zip archives, encrypted .pdf files) and proprietary binary formats we don’t parse (e.g. .dwg, .psd) pass through without scanning. If your team uploads sensitive content in formats we don’t cover, email info@promptspotter.com. We add formats based on what real teams use.
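To illustrate why a password-protected archive cannot be scanned, here is a sketch of how a client could recognize one from its header bytes. It relies on the ZIP format itself: a local file header starts with the signature 0x04034b50, and bit 0 of the general-purpose flag at byte offset 6 marks the entry as encrypted. The function name is hypothetical, it inspects only the first entry, and it is not a description of how PromptSpotter makes this decision.

```typescript
const ZIP_LOCAL_HEADER_SIGNATURE = 0x04034b50; // bytes "PK\x03\x04", little-endian

// Returns true when the first entry of a ZIP archive is flagged as encrypted.
// Returns false for non-ZIP data, truncated input, and unencrypted archives.
function firstZipEntryIsEncrypted(bytes: Uint8Array): boolean {
  if (bytes.length < 8) return false;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0, true) !== ZIP_LOCAL_HEADER_SIGNATURE) return false;
  const generalPurposeFlags = view.getUint16(6, true);
  return (generalPurposeFlags & 0x0001) !== 0; // bit 0 = entry is encrypted
}
```

Because .docx, .xlsx, and .pptx files are ZIP containers too, a check like this would report false for ordinary, unencrypted office documents.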