Reverse Dependencies of pdftotext
The following projects have a declared dependency on pdftotext:
- Auto-Research — Generate scientific survey with just a query
- bbva2pandas — Parse BBVA monthly reports directly to a Dataframe
- chatlocal — chat with your local files
- ckanext-resource-indexer — no summary
- DataXtractor — DataXtractor is a versatile Python library designed to simplify the extraction of valuable data from a variety of sources, including images and PDF documents. Whether you need to extract text, tables, or structured content, DataXtractor provides powerful and intuitive tools to streamline the process.
- hetzner-fix-report — This package demonstrates building and publishing Python Packages with GitHub's infrastructure
- imagetocsv — Converts An Image to a CSV. This exists because Chorus 3.0 are bat-shit and only show images for vital metadata.
- ioc-parser-ng — Tool to extract indicators of compromise from security reports, next generation
- jsonify-resume — A cli that converts resumes into JSON Resume schema
- misp-modules — MISP modules are autonomous modules that can be used for expansion and other services in MISP
- monopoly-core — Monopoly is a Python library & CLI that converts bank statement PDFs to CSV
- monopoly-sg — PDF parsing for Singaporean banks
- nedextract — extract specific information from annual report files
- ofxstatement-french — OFXStatement plugin for French financial institutions like BanquePopulaire.
- oireachtas-data — Oireachtas debate data
- onegov.file — Images/files organized in collections.
- pdf-llm-tools — A family of LLM-enhanced PDF utilities
- pdf2dataset — Easily convert a subdirectory with big volume of PDF documents into a dataset, supports extracting text and images
- pdfparser — pdf parsing tools
- pdfs-rename — Bulk rename PDFs.
- PyBookReader — no summary
- pysin — PySin is a toolbox for text retrieval in unstructured documents datasets. It contains both a multi-type text extractor and a search engine. To test them, you can use the medical prescriptions generator that is also provided.
- sec-certs — A tool for data scraping and analysis of security certificates from Common Criteria and FIPS 140-2/3 frameworks
- sermos-tools — Sermos Tools
- serveliza — Serveliza description
- situacao — A command line tool to interact with Portugal COVID-19 data.
- smbcrawler — Search SMB shares for interesting files
- soam — Tools for time series analysis, plotting and reporting.
- webchanges — Check web (or command output) for changes since last run and notify. Anonymously alerts you of web changes, with
- xmpdf — Extracts email metadata and text from a PDF file
- yara-mail — A Python package and command line utility for scanning emails with YARA rules