Reverse Dependencies of trafilatura
The following projects have a declared dependency on trafilatura:
- agent-cloud — no summary
- agent-cloud-os — no summary
- agent-context — no summary
- agent-management-system — no summary
- agent.ngo — no summary
- agent-studio — No-code AI agent creation platform with Streamlit UI and FastAPI backend
- agent-system — no summary
- agentbox — no summary
- agentDB — no summary
- agentvm — no summary
- aicompleter — Interactive AI program framework for Python
- ams-core — no summary
- ams-python — no summary
- appnext — no summary
- auto-ams — no summary
- bytewax-azure-ai-search — Custom sink for Azure AI Search
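- chatgpt-mirai-qq-bot-scheduler — Scheduled tasks for lss233/chatgpt-mirai-qq-bot; provides an aggregated block for adding, deleting, and querying scheduled tasks (other installed blocks can also be configured for automatic invocation)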
- chatgpt-mirai-qq-bot-web-image-generate — WebImageGeneratePlugin for lss233/chatgpt-mirai-qq-bot
- chatgpt-mirai-qq-bot-web-search — WebSearch adapter for lss233/chatgpt-mirai-qq-bot
- code-context — no summary
- codegraph-agent — no summary
- contentmap — no summary
- crawzy — A web scraping tool for crawling through websites and retrieving textual content
- data-prep-toolkit-lang — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms-lang1 — Data Preparation Toolkit Transforms
- datatrove — HuggingFace library to process and filter large amounts of webdata
- dataverse — An open-source simplifies ETL workflow with Python based on Spark
- delphai-ml-utils — A Python package to manage delphai machine learning operations.
- deva — data eval in future
- django-embeddings — A Django application for embeddings
- docop-tasks-restricted — Tasks for docop that have more restrictive open source licensing
- dotagent — no summary
- dotagent-dev — no summary
- dotams — no summary
- dotnext — no summary
- dpk-html2parquet-transform-python — HTML2PARQUET Python Transform
- dspygen — A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama.
- duowen-agent — Duowen LLM core toolkit
- edubot — Basic Edubot module
- eric-chen-forward — Classifier for institution and scholar data
- genia — no summary
- Greek-scraper — Ultra-fast and efficient web scraper with GPU utilization for text cleaning and JSON output. Supports generic and language-specific scraping.
- griptape — Modular Python framework for LLM workflows, tools, memory, and data.
- hawkins-agent — A Python SDK for building AI agents with minimal code using Hawkins ecosystem with HawkinDB memory
- hawkins-agent-lib — A Python SDK for building AI agents with minimal code using Hawkins ecosystem with HawkinDB memory
- hawkins-agent-sdk — A Python SDK for building and managing AI agents with integrated memory, knowledge base, and workflow management
- hawkins-rag — A Python package for building RAG systems with HawkinsDB and multiple data source integrations
- hawkinsdb — A neuroscience-inspired memory layer for LLM applications
- hogwarts-browser-use — hogwarts browser use, customized edition for students of the Hogwarts Testing and Development Society
- incognitoGPT — no summary
- janito — Janito CLI tool
- keywords-en — keywords extract
- knowledge-storm — STORM: A language model-powered knowledge curation engine.
- langroid — Harness LLMs with Multi-Agent Programming
- langsearch — Easily create semantic search based LLM applications on your own data
- lavague-core — automation code generation from text instructions
- lightlang — A lightweight ergonomic framework for LLM workflows
- liteauto — free google results
- llama-cpp-agent — A framework for building LLM based AI agents with llama.cpp.
- llm_chat_term — Chat with LLMs from the terminal
- llm-server — no summary
- llmproxy — no summary
- lxmfy-news-bot — LXMFy News Bot using RSS and trafilatura to fetch full-text
- MainContentExtractor — A library to extract the main content from html. Developed for information on LLM and for feeding data into LangChain and LlamaIndex.
- medhaai — A flexible and powerful multi-agent AI framework for building advanced AI applications
- minet — A webmining CLI tool & library for python.
- multiai — A Python library for text-based AI interactions
- namas — no summary
- newsfeedback — Tool for extracting and saving news article metadata at regular intervals.
- next-ams — no summary
- next-llm — no summary
- nextagent — no summary
- nextapi — no summary
- nextpy-ai — no summary
- nlp-toolbox — Natural Language Processing Tools
- obsei — Obsei is an automation tool for text analysis need
- oneai-stage — NLP as a Service
- openagent-py — no summary
- openagentos — no summary
- openAMS — no summary
- openbb-sec — SEC extension for OpenBB
- opencopilot-ai — OpenCopilot Backend
- opendatagen — Data preparation system to build controllable AI system
- openlora — no summary
- opsci-toolbox — a complete toolbox
- parselite — A powerful web content fetcher and processor
- pyopengenai — A powerful web content fetcher and processor
- raggy — scraping stuff
- readium — A tool to extract and analyze documentation from repositories, directories, and URLs
- readthis — readthis - A command line tool to read a text file aloud
- redel — A toolkit for recursive delegation of LLMs
- safari-to-sqlite — Save tabs from Safari to a SQLite database. Supports Datasette. Can sync multiple devices with Turso.
- Scrapework — simple scraping framework
- scrapme — A comprehensive web scraping framework featuring both static and dynamic content extraction, automatic Selenium/geckodriver management, rate limiting, proxy rotation, and Unicode support (including Georgian). Built with BeautifulSoup4 and Selenium, it provides an intuitive API for extracting text, tables, links and more from any web source.
- searchflow — An assistant helping you to index webpages into structured datasets.
- shandu — Deep research system with LangChain and LangGraph
- site2md — Host an API to convert websites to markdown with optional features
- synthora — Synthora is a lightweight and extensible framework for LLM-driven Agents and ALM research. It provides essential components to build, test and evaluate agents. At its core, Synthora aims to assemble an agent with a single config, thus minimizing your effort in building, tuning, and sharing agents.
- thirdai — A faster cpu machine learning library