Reverse Dependencies of docling
The following projects have a declared dependency on docling:
- crewai — Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
- data-prep-toolkit-lang — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms — Data Preparation Toolkit Transforms using Ray
- docetl — ETL with LLM operations.
- docitup — no summary
- docling-haystack — Docling Haystack converter
- docling-langchain — Docling LangChain integration
- Docs2KG — Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models
- doctext — no summary
- flexrag — A RAG Framework for Information Retrieval and Generation.
- gptparse — A tool for converting PDF documents to Markdown using OCR and vision language models
- hdfs-docling-analyze — A library for analyzing files from HDFS and saving results to MongoDB
- instructlab-sdg — Synthetic Data Generation
- langchain-docling — Docling LangChain integration
- langroid — Harness LLMs with Multi-Agent Programming
- leettools — AI Search Workflow with Document Pipelines.
- lionagi — An Intelligence Operating System.
- litdb — A literature database tool with GPT integration.
- llama-index-readers-docling — llama-index readers docling integration
- llms-txt-action — GitHub Action to make documentation more accessible to LLMs.
- lollmsvectordb — A modular text-based database manager for retrieval-augmented generation (RAG), seamlessly integrating with the LoLLMs ecosystem.
- markdrop — A comprehensive PDF processing toolkit that converts PDFs to markdown with advanced AI-powered features for image and table analysis. Supports local files and URLs, preserves document structure, extracts high-quality images, detects tables using advanced ML models, and generates detailed content descriptions using multiple LLM providers including OpenAI and Google's Gemini.
- observers — 🤗 Observers: A Lightweight Library for AI Observability
- parsestudio — Parse PDF files using different parsers.
- pdf2csv — A Python library and CLI tool to convert PDF files to CSV files.
- py-ai-workflows — A toolkit for AI workflows.
- PyAutoGen — A programming framework for agentic AI
- quackling — Quackling enables document-native generative AI applications
- ragnardoc — RAGNARDoc (RAG Native Automatic Reingestion for Documents) is a tool that runs natively on a developer workstation and automatically ingests local documents into various Retrieval Augmented Generation indexes. It is designed as a companion app for workstation RAG applications which would benefit from maintaining an up-to-date view of documents hosted natively on a user's workstation.
- sherlock-lit — Sherlock-lit helps you get everything an Abstract hides with a fast technical description card (Research questions, contribution, possible future works) of an NLP paper before reading it.
- sieves — Rapid prototyping and robust baselines for information extraction with zero- and few-shot models.
- spacy-layout — Use spaCy with PDFs, Word docs and other documents
- txtai — All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
- versionhq — LLM orchestration frameworks for model-agnostic AI agents that handle complex outbound workflows