Thursday, January 05, 2023

Node.js PDF reader

adrienjoly/npm-pdfreader: 🚜 Parse text and tables from PDF files. (GitHub)

Read text and parse tables from PDF files.

Supports tabular data with automatic column detection, and rule-based parsing.
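To get a feel for what column detection involves, here is a minimal sketch in plain JavaScript. It assumes text items shaped like { x, y, text } (a position on the page plus a string). The helper name itemsToRows and the tolerance parameter are made up for this illustration, and this is not the library's actual algorithm, only one simple way the idea could work.

```javascript
// Hypothetical helper (not part of pdfreader): turns positioned text items
// into table rows. Items are assumed to look like { x, y, text }.
function itemsToRows(items, tolerance = 0.5) {
  // Detect columns: sort the x positions and keep one entry per cluster,
  // skipping values that sit within `tolerance` of the last kept column.
  const columns = [];
  for (const x of items.map((i) => i.x).sort((a, b) => a - b)) {
    if (columns.length === 0 || x - columns[columns.length - 1] > tolerance) {
      columns.push(x);
    }
  }

  // Find the index of the detected column closest to a given x.
  const columnOf = (x) => {
    let best = 0;
    for (let c = 1; c < columns.length; c++) {
      if (Math.abs(columns[c] - x) < Math.abs(columns[best] - x)) best = c;
    }
    return best;
  };

  // Bucket items into rows by rounded y, then drop each text into its column.
  // This bucketing is crude: values near a bucket boundary may be split.
  const rows = new Map();
  for (const item of items) {
    const key = Math.round(item.y / tolerance);
    if (!rows.has(key)) rows.set(key, new Array(columns.length).fill(""));
    rows.get(key)[columnOf(item.x)] = item.text;
  }

  // Return rows from top to bottom (ascending y).
  return [...rows.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([, cells]) => cells);
}

// Example input: two rows, two columns, with a slightly misaligned cell.
const sampleItems = [
  { x: 1, y: 1, text: "Name" },
  { x: 5, y: 1, text: "Qty" },
  { x: 1.1, y: 2, text: "Apple" },
  { x: 5, y: 2, text: "3" },
];
console.log(itemsToRows(sampleItems));
```

In real use the items would come from the parser rather than a hand-written array, and the tolerance would need tuning to the document's layout.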

Dependencies: it is based on pdf2json, which itself relies on Mozilla's pdf.js.

It runs on Node.js only and does not work from a web browser. It does not perform OCR.