docler

0.3.0 docler-0.3.0-py3-none-any.whl

Wheel Details

Project: docler
Version: 0.3.0
Filename: docler-0.3.0-py3-none-any.whl
Size: 1450610
MD5: 0aa49c3e1f11678fa7e793055cbdfc04
SHA256: 3ad7b74ac6bf64f027c58578824da58c22d6b134c5d6a0bc3805c0689e131eb0
Uploaded: 2025-03-26 20:23:47 +0000

dist-info

METADATA

Metadata-Version: 2.4
Name: docler
Version: 0.3.0
Summary: Abstractions & Tools for OCR / document processing
Author-Email: Philipp Temminghoff <philipptemminghoff[at]googlemail.com>
Project-Url: Documentation, https://phil65.github.io/docler/
Project-Url: Source, https://github.com/phil65/docler
Project-Url: Issues, https://github.com/phil65/docler/issues
Project-Url: Discussions, https://github.com/phil65/docler/discussions
Project-Url: Code coverage, https://app.codecov.io/gh/phil65/docler
License: MIT License
        Copyright (c) 2024, Philipp Temminghoff
        Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
        The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Documentation
Classifier: Topic :: Software Development
Classifier: Topic :: Utilities
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: anyenv (>=0.4.1)
Requires-Dist: pydantic
Requires-Dist: pydantic-settings (>=2.8.1)
Requires-Dist: upathtools (>=0.4.3)
Requires-Dist: azure-ai-documentintelligence; extra == "all"
Requires-Dist: chromadb; extra == "all"
Requires-Dist: docling; extra == "all"
Requires-Dist: kreuzberg; extra == "all"
Requires-Dist: llama-index-readers-smart-pdf-loader; extra == "all"
Requires-Dist: llmling-agent[pydantic-ai]; extra == "all"
Requires-Dist: llmsherpa; extra == "all"
Requires-Dist: marker-pdf; extra == "all"
Requires-Dist: markitdown[all]; extra == "all"
Requires-Dist: mistralai; extra == "all"
Requires-Dist: openai; extra == "all"
Requires-Dist: pinecone[asyncio]; extra == "all"
Requires-Dist: qdrant-client[fastembed]; extra == "all"
Requires-Dist: streamlit; extra == "all"
Requires-Dist: upstash-vector; extra == "all"
Requires-Dist: azure-ai-documentintelligence; extra == "azure"
Requires-Dist: flagembedding; extra == "bge"
Requires-Dist: chromadb (>=0.6.3); extra == "chromadb"
Requires-Dist: diff-match-patch; extra == "diffs"
Requires-Dist: docling[ocrmac,rapidocr,vlm]; extra == "docling"
Requires-Dist: kreuzberg; extra == "kreuzberg"
Requires-Dist: litellm; extra == "litellm"
Requires-Dist: tokonomics; extra == "litellm"
Requires-Dist: llama-index; extra == "llama-index"
Requires-Dist: llama-parse; extra == "llama-parse"
Requires-Dist: marker-pdf; extra == "marker"
Requires-Dist: markitdown[all]; extra == "markitdown"
Requires-Dist: mistralai; extra == "mistralai"
Requires-Dist: openai; extra == "openai"
Requires-Dist: pinecone[asyncio]; extra == "pinecone"
Requires-Dist: qdrant-client[fastembed]; extra == "qdrant"
Requires-Dist: llama-index-readers-smart-pdf-loader; extra == "smart-pdf"
Requires-Dist: llmsherpa; extra == "smart-pdf"
Requires-Dist: streambricks; extra == "streamlit"
Requires-Dist: streamlit; extra == "streamlit"
Requires-Dist: tokonomics; extra == "streamlit"
Provides-Extra: all
Provides-Extra: azure
Provides-Extra: bge
Provides-Extra: chromadb
Provides-Extra: diffs
Provides-Extra: docling
Provides-Extra: kreuzberg
Provides-Extra: litellm
Provides-Extra: llama-index
Provides-Extra: llama-parse
Provides-Extra: marker
Provides-Extra: markitdown
Provides-Extra: mistralai
Provides-Extra: openai
Provides-Extra: pinecone
Provides-Extra: qdrant
Provides-Extra: smart-pdf
Provides-Extra: streamlit
Description-Content-Type: text/markdown
License-File: LICENSE
[Description omitted; length: 3152 characters]
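
The Requires-Dist fields above attach most dependencies to an extra through an `extra == "..."` marker. As a minimal sketch (not part of docler), the standard-library email parser can read a METADATA-style header block and group the requirements by extra. The sample text is a short excerpt of the fields as they are displayed above, and the names `SAMPLE`, `EXTRA_RE` and `group_by_extra` are hypothetical. The regular expression only handles the simple single-extra markers seen in this listing; a full marker grammar would need a dedicated parser.

```python
import re
from collections import defaultdict
from email.parser import Parser

# Excerpt of the METADATA fields, copied in the form shown in this listing.
SAMPLE = """\
Metadata-Version: 2.4
Name: docler
Version: 0.3.0
Requires-Dist: pydantic
Requires-Dist: chromadb (>=0.6.3); extra == "chromadb"
Requires-Dist: litellm; extra == "litellm"
Requires-Dist: tokonomics; extra == "litellm"
"""

# Matches markers of the form: extra == "name" (single or double quotes).
EXTRA_RE = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def group_by_extra(metadata_text):
    """Return {extra_name_or_None: [requirement, ...]} for a METADATA text."""
    msg = Parser().parsestr(metadata_text)
    groups = defaultdict(list)
    for value in msg.get_all("Requires-Dist") or []:
        # Everything before ';' is the requirement, everything after is the marker.
        requirement, _, marker = value.partition(";")
        match = EXTRA_RE.search(marker)
        key = match.group(1) if match else None  # None means "always installed"
        groups[key].append(requirement.strip())
    return dict(groups)


groups = group_by_extra(SAMPLE)
```

In this excerpt, `groups[None]` holds the unconditional requirement, while `groups["litellm"]` holds the two packages that an install of the `litellm` extra would add.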

WHEEL

Wheel-Version: 1.0
Generator: hatchling 1.27.0
Root-Is-Purelib: true
Tag: py3-none-any
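
The `Tag: py3-none-any` value is the same triple that ends the wheel filename: a Python tag, an ABI tag and a platform tag. The following sketch splits a wheel filename into its parts under the standard naming scheme `{distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl`. The names `WheelName` and `parse_wheel_filename` are hypothetical, and the sketch does not expand compressed tag sets such as `py2.py3`.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


parsed = parse_wheel_filename("docler-0.3.0-py3-none-any.whl")
```

For this wheel the three tag fields read `py3`, `none` and `any`, which matches `Root-Is-Purelib: true`: a pure-Python wheel with no ABI or platform constraint.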

RECORD

Path Digest Size
docler/__init__.py sha256=Dd2qu7EwmamHPQhqUvEKLrp5ulQk-9LipOrGnNh3i58 1082
docler/__main__.py sha256=ja8Ehs5tfhSl8KrWNQ9H0_laOnhhDKOa9OqcGdJtRjI 1413
docler/common_types.py sha256=YMvvSvtYi_RLIl0QXgHVAho1txhGGiJVOhIinr80leo 1215
docler/diffs.py sha256=WWN8G3O2ZnsI-X0aLGlen7EhKAgCACr-338nEjdtK0U 2468
docler/log.py sha256=oWeCVy3T6V6uIyFJMBgrAwaeBxPdz2A6oq7gjV4fK0c 362
docler/mime_types.py sha256=D_0DPfUTMqqbl2V27DsyWP37LFiorE1Zq4GP6FK_qRQ 5106
docler/models.py sha256=PYOv_9AciZ9Snbt6Ddw1cLjXGxMONEu20tYm2TpVx0A 6403
docler/provider.py sha256=1pt2HBo6fFJ6cynIg-rITKx87YDvo2d3w_dOQdcuk-M 2046
docler/py.typed sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
docler/streamlit_app.py sha256=BUgtttwK4NwlYvkaamVBZv1YRyyKQZrZCVOkav4Ayx0 4586
docler/streamlit_chunk_app.py sha256=If6g5fqu6zdDdOfj3dIYF2hJiuwnRxXfKX1ElK6l9p8 3337
docler/utils.py sha256=-NZAF4I7wyENR3rGpPpEQyU0-KWKxmwTRndqf5lGBYk 3819
docler/annotators/__init__.py sha256=Lq7-HbH3YzTgQLfBBCMm7FaRfRAJ4WjPHMlopGAuV0E 32
docler/annotators/ai_document_annotator.py sha256=6Jz4xLkWr7h0P3ykT2lON9VSyj86q58JelmOnuw-lrA 3984
docler/annotators/ai_image_annotator.py sha256=nOYnQpuzpCYJzPYsb_mBJTa26qmKLHY83IAdszBeb3w 5887
docler/annotators/base.py sha256=n44AkaF2jdA8bZXN1-ZYg4ekN30GUXACKCndgWB3a0E 909
docler/chunkers/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
docler/chunkers/base.py sha256=xUmTpSQ4YNpq2mm1JuG5HhC5f86RLxNFcBdz-PmZTsY 2360
docler/chunkers/ai_chunker/__init__.py sha256=u3T838746QLgWjdN11tonK0BIeg0Yk35fQUfyg7Byf4 101
docler/chunkers/ai_chunker/chunker.py sha256=SPKjw3bNBTt71LqoIhZIWKkwp0lFdqvweYLqx3HioaM 2940
docler/chunkers/ai_chunker/models.py sha256=wqHpJj1iMwLqmwFhN95maLI7rw938KwL4cVud0iTxqc 683
docler/chunkers/ai_chunker/utils.py sha256=qUT16DXOPLBEYZDzHMt58R7jhBGZn44oAu4GrJTqvTc 1186
docler/chunkers/llamaindex_chunker/__init__.py sha256=8oVxediDyEyM9g2vbBuXD9-7OWFq9j7cfPSE1aENDHg 133
docler/chunkers/llamaindex_chunker/chunker.py sha256=I3MlfhH0IUutwtrnKIBdo47jD2xyTdBdoA0u1ylJP2A 5497
docler/chunkers/markdown_chunker/__init__.py sha256=bCCjaeoYNJa6fj3SYPOfc9WZs23TZKIkeYMttBK-j3M 125
docler/chunkers/markdown_chunker/chunker.py sha256=Xbs59DCPxPUYzjiWdUpDW_gnuFyFnEYmXLLo75jfa2U 3537
docler/chunkers/markdown_chunker/utils.py sha256=X8TDY1y17cZAvrIW3KDqpSYH5GJz_ovx0wz8_2mFAE4 1670
docler/configs/__init__.py sha256=-YUMW0trkUDOA_zdx38JqI-iIAlMPw6WyzwaN2_ngTw 3843
docler/configs/annotator_configs.py sha256=3nHdWgkegfzOZojWoNoQoTBpCqkqMc7V4ZYEZB35FbA 3692
docler/configs/chunker_configs.py sha256=RJxhRGzv68mtBS1mAJET0bQQnMvcaH0a0X8J3mYkeg8 3689
docler/configs/converter_configs.py sha256=ZYVUgg-OgdLR_bZpB46Qx70ZdEoXRCw6thFJ1oTl3E8 10545
docler/configs/embedding_configs.py sha256=NHHrFg7PRpKx2clxG1oY8jct2uAuSqp5AuaLWLxEtVc 4401
docler/configs/file_db_configs.py sha256=DH36ZvmPFTD83QvzpSNoKVwltFKk2yalHsR9T8f5Bao 8489
docler/configs/processor_configs.py sha256=rSEtHfzSXna-arOw6O8OBJg_p9Oxgq0chpWBqTlQ_wk 2688
docler/configs/vector_db_configs.py sha256=gXJaVOIgLDRi2cjelIcr5T2QdokV5tYWTj_4qFT9N4o 4445
docler/converters/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
docler/converters/base.py sha256=F0Omi0gtdiW4aiwVZjsNAlkFfFarpkWIPTCdo18xxO8 6950
docler/converters/dir_converter.py sha256=mpjGDLP7SVoxOrUKCOUgNWO2yDeAydiHNgdYl3ve4DI 5581
docler/converters/exceptions.py sha256=JLa9K4h9jPGHsRZKD5a7KCL4poaOoclTyjtyYVfz_vc 213
docler/converters/registry.py sha256=sIS0t0-HfK56C0EBC-ZW3GYwMTkZtAqEC5jS8zgP-vM 7849
docler/converters/azure_provider/__init__.py sha256=8oWn40cDUMAVawD3-35uUHS29LJbaS7nclzHMS3jmo4 130
docler/converters/azure_provider/provider.py sha256=gamh5YrhtCGsQzJp1JtLt2N4J_wcTkQVmb56gHksvxA 7317
docler/converters/datalab_provider/__init__.py sha256=tPnPQTW10WfvBPq9WKUH5uPDh-9oAqpFj1il7xwX_-I 138
docler/converters/datalab_provider/provider.py sha256=lo7xTOQAt0WpYnSba3f7PzXB3TxijZvDzwXPSZHTrnQ 5502
docler/converters/datalab_provider/utils.py sha256=eVJqcfexsPkc9taC4OBkQySUyLGlQ6_PdyuHamUKBcg 2468
docler/converters/docling_provider/__init__.py sha256=tchV6Xpxk5EWdhOCRpFrWEoN1HI_t7KVMmvbsNIP7ro 138
docler/converters/docling_provider/provider.py sha256=h-n5w1uTt9ub4aMYOXbMavq4Wuw-zY9wlbZ9Bhj3gjc 5795
docler/converters/docling_provider/utils.py sha256=jSwrZ2fHkhSyZt2HMFVNi2PLeZOmPD8-GYfSEbZlSKk 1512
docler/converters/kreuzberg_provider/__init__.py sha256=cmd3yb5HhuQu0_mEwwwsX081v4Q1B3gvhfOkWQSMmwM 146
docler/converters/kreuzberg_provider/provider.py sha256=RLwbC9ehO8z2tNDIRfH5rk1Fzfwrv2HA7-F6ZEDnVaY 3257
docler/converters/llamaparse_provider/__init__.py sha256=kf4i9fMaiqt5nGKWJmFVwcIyQB3kdszDExbBjVKm5sw 142
docler/converters/llamaparse_provider/provider.py sha256=LURB27xr6oYen1b1yliCYAFX7ZfSoBp3mr82vraBup0 5987
docler/converters/llm_provider/__init__.py sha256=3fkbHBd8tpfR8BFlUtRBeFEJykBNJCINNqf_HAtIe2c 118
docler/converters/llm_provider/provider.py sha256=LOwc48gVe_HwyzCFJ1RodzgcfmaDj1BbsWwBAd8iLGs 3333
docler/converters/marker_provider/__init__.py sha256=XG6HHVGNytuv1mNH_hgCI3oE7BlYRYRZ0TJY78inMIg 126
docler/converters/marker_provider/provider.py sha256=7DWsv3Tajw4MKFgthCyZ_x7gx7uLooTTaVXoPrQIOvE 4673
docler/converters/markitdown_provider/__init__.py sha256=c8IE-Pba2FgZ6XV1AsoJNYiBz75dnFWqXx9JhI5aMDY 150
docler/converters/markitdown_provider/provider.py sha256=2iQ2Vf8m0ole_-djZXK9t03XQvj4AcoRwnY6zK8ZBBE 2941
docler/converters/mistral_provider/__init__.py sha256=fh-_z11e98GgW66pT7vMXEEKBiMTZAJ7adBOmlPprRM 130
docler/converters/mistral_provider/provider.py sha256=rtLYoZW76z8dwu0h8SB1jJ4Sty9Dk_7Ys9ZH9Za9NC0 3483
docler/converters/upstage_provider/__init__.py sha256=A6I1jxqJVk6H9s_7FCgF08OUrIfHYflTV073UHUaPm0 130
docler/converters/upstage_provider/provider.py sha256=v5HVwylEl5hu1qg35GRU9cxCnqK8evWo1Mzngj9P9EM 6599
docler/embeddings/__init__.py sha256=1u-lJ4XYWAOVcS5pGX0MwrONrRiBftdmjYEBU7Fp6GE 27
docler/embeddings/base.py sha256=8ogaHx0cX7E6gBHx-nnjuqkhwJHXsrt_lgZF_mH4Ug4 1454
docler/embeddings/bge_provider.py sha256=lgBAKofJogHnzUrJPCR2mvzbnmlCK8VR00ppxw3epn0 1434
docler/embeddings/litellm_provider.py sha256=mPrghBeWPdxGpLocxroCrMEelEXwMuJ4RLiAQ-FuY2s 3412
docler/embeddings/openai_provider.py sha256=mus09WImpCkmqFfJqcQGxY1BMeTgKtYpxsrsHWpil5U 3052
docler/embeddings/stf_provider.py sha256=RZS0ARVExFPOhTHy1M3pLmBBd-4KPiReSwybGDHZ2_s 1922
docler/processors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
docler/processors/ai_processor.py sha256=jEnOxCJawOU0akqc-ZxRzQ6MGY-xO956boldn2S0mi8 9802
docler/processors/base.py sha256=a3I5_2V7qdLRbWVg1QdyirVIfEi53OqFionksoooeCQ 841
docler/resources/pdf_sample.pdf sha256=OMl5LXJcRd1DFpnmo7Dw-OF8Y8msczE4fuMNzG5CpRE 142786
docler/resources/samplepptx.pptx sha256=SiKO-Kwabbi2kFNNzAUBuV6EmyaUuFBNB-VgmWQHU-4 413895
docler/resources/test.docx sha256=7hl0Yz87Hou1QgGrzneadVyxzVJZZBoIz3sbmzbexCs 135824
docler/resources/test.jpg sha256=k5CzRSX9BE32kmXgIqBjRqu20gOxTLybJHPAgMaA6C4 474288
docler/resources/test.pptx sha256=8LnlJSrsNzDJHydqfDuPSpiTp7ZlQPbGrxYKSYVcpQc 277515
docler/resources/test.xls sha256=F6lLZRTomY9NwlvHcmW29imCwYYUukQBqH-gH5D1Px0 27648
docler/resources/test.xlsx sha256=qGe-bs44pCJLz-NDEuXClu1SCnakMeBFRgPOsXH71K8 11562
docler/streamlit_app/__init__.py sha256=hOA2K-5rQI3OqV0se8Pqmi69aqU5nolOY-1GI4QQFkA 131
docler/streamlit_app/app.py sha256=DXczc0_db5o6-FthjMPwElnKKmjjtfPXe_4pu_GD6SQ 1597
docler/streamlit_app/chunkers.py sha256=-7ysIYdT2Nm2W1pKQwOjujPCeC7ebBE1V-Uc-FmvYWc 357
docler/streamlit_app/converters.py sha256=07PGjcNSUp1dXEHFoTCgTGV9eDV0Ai06uQULd1bI8Fc 1140
docler/streamlit_app/state.py sha256=rZ_Ca5Gcf78MhNnml9Kq2wiE9hWEMHHWYhSWJf95TYc 1080
docler/streamlit_app/step1_conversion.py sha256=a8tvIpsP9egpJAW4DrlVQUYTv4zD0amV9-c0Fo6zHWo 3927
docler/streamlit_app/step2_chunking.py sha256=BYFkwnQMTO7YJfft9TwkbJbz028oymeg1F53arPy2Ko 5864
docler/streamlit_app/step3_vectorstore.py sha256=G8vJ_JNtgpp-FTs7G0Jbfr9qcdEQBGUyqViR2tU5nCE 10484
docler/streamlit_app/utils.py sha256=zZ2TbOWbYQ5g6z2qe4FIHGr_e2W85l_QVXtPBE-6mvI 826
docler/vector_db/__init__.py sha256=zIB5alfyHdjs2t-Za8OyH3CAszF9wKLc39IOOgtNsuQ 39
docler/vector_db/base.py sha256=driFzZs1w8R3OS_NXZhP5_ulPbpXWmJvPAU2FmzDjmk 3637
docler/vector_db/base_manager.py sha256=3n-LV8tmgdSD-7nPv3qvx6PHSuyCcHRiIV7w5oyfHt4 1833
docler/vector_db/dbs/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
docler/vector_db/dbs/chroma_db/__init__.py sha256=5rEhGN03hTbeInV8wsnYG40p3iZo1V3u9j8bOtteH9E 219
docler/vector_db/dbs/chroma_db/db.py sha256=b0FdiZ4yhmIRhhVxOk2yT_Q6IM4FJAIxBipCrBTk6e4 6190
docler/vector_db/dbs/chroma_db/manager.py sha256=6H7uKFW-VJ1zU0O-C8kyecBmvl_GWy4xHVb7zIY-YZQ 7423
docler/vector_db/dbs/openai_db/__init__.py sha256=EZq0N9PwiTzKL6JEWKpFjyE7tvCPusanxMNJlY7zEAs 221
docler/vector_db/dbs/openai_db/db.py sha256=mJiS0RwQKrQP3q_Q_As8tNu1rKbiCiYBPRTz2W1hUwM 7280
docler/vector_db/dbs/openai_db/manager.py sha256=NTVSjKHKzaEN1_mTOnZZXpADDGsbcwbqa2b6F7wW5qY 9496
docler/vector_db/dbs/openai_db/utils.py sha256=kzTRsxjgm5FiA2CCLgzjzx0N53Ky0HNoX-ga7GE_fMY 1181
docler/vector_db/dbs/pinecone_db/__init__.py sha256=QhvvNMm8_eYFMzRQzL4bMhy_E4cSLi6xS5Rsq1oR9Pk 270
docler/vector_db/dbs/pinecone_db/db.py sha256=0ek_9InzzL_DRhk9nJv6ohJF95Rz80hLubLl-vy4Ta4 6878
docler/vector_db/dbs/pinecone_db/manager.py sha256=LS0ieiRY6UWaBRI7_M55cG-6rJC8kMqBR9-bfZo5itk 7060
docler/vector_db/dbs/pinecone_db/utils.py sha256=95Rv5KiIryoqChCX7Dmb69wCXCbP7R3jKd0i2PxLMec 2575
docler/vector_db/dbs/qdrant_db/__init__.py sha256=cMwP3fzZclzF0YsT9bFTMCE5V9B1qlhqvdURpcMJte0 125
docler/vector_db/dbs/qdrant_db/db.py sha256=sMghktiUW1sIruyG7-esZbTMVvGPKmlSqivtc47mXQc 7023
docler/vector_db/dbs/qdrant_db/utils.py sha256=cv-bu7ezC300SjIWof4gIv0g05A2cqzaFi0MJhHUZKM 683
docler-0.3.0.dist-info/METADATA sha256=jFD62UvXxYtpzgJH_6ycVsf_b8akuXdqDDtN1Qv66sA 7866
docler-0.3.0.dist-info/WHEEL sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ 87
docler-0.3.0.dist-info/entry_points.txt sha256=_34L9xb3GZQGO3nJd5M6Tnx4hOEG_kTdEv-d5DcQAEM 47
docler-0.3.0.dist-info/licenses/LICENSE sha256=AteGCH9r177TxxrOFEiOARrastASsf7yW6MQxlAHdwA 1078
docler-0.3.0.dist-info/RECORD
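
Each digest in RECORD is the SHA-256 hash of the file, encoded as URL-safe base64 with the trailing `=` padding removed. This is why every zero-byte file above (the empty `__init__.py` files and `py.typed`) shares the digest `47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU`. The sketch below recomputes such digests and checks a wheel archive against its own RECORD. It assumes the on-disk RECORD is the usual comma-separated file (`path,sha256=...,size`), whereas this listing displays it space-separated; `record_digest` and `find_mismatches` are hypothetical helper names.

```python
import base64
import csv
import hashlib
import zipfile


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest string for the given file content."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def find_mismatches(wheel):
    """Return the archive paths whose content does not match RECORD.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    mismatches = []
    with zipfile.ZipFile(wheel) as zf:
        record_name = next(
            name for name in zf.namelist() if name.endswith(".dist-info/RECORD")
        )
        record_text = zf.read(record_name).decode("utf-8")
        for row in csv.reader(record_text.splitlines()):
            if not row:
                continue
            name, digest, _size = row
            if not digest:
                # RECORD lists itself without a digest, as in the last row above.
                continue
            if record_digest(zf.read(name)) != digest:
                mismatches.append(name)
    return mismatches


EMPTY_DIGEST = record_digest(b"")
```

Calling `find_mismatches("docler-0.3.0-py3-none-any.whl")` on a downloaded copy would return an empty list if every listed file matches its recorded digest.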

entry_points.txt

docler = docler.__main__:cli
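
The entry point maps the command name `docler` to the object `cli` in the module `docler.__main__`. As a small sketch, `importlib.metadata.EntryPoint` can split such a specification without importing the package. The group is assumed to be `console_scripts`, since the listing above omits the section header of entry_points.txt.

```python
from importlib.metadata import EntryPoint

# The group name is an assumption; name and value are taken from the line above.
entry_point = EntryPoint(
    name="docler",
    value="docler.__main__:cli",
    group="console_scripts",
)

module_name = entry_point.module  # dotted module path before the colon
attr_name = entry_point.attr      # object name after the colon

# With docler installed, entry_point.load() would import the module and
# return the `cli` object; it is not called here so the sketch runs without it.
```

An installer generates a `docler` launcher script that, in effect, imports `module_name` and calls the object named by `attr_name`.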