hdfs-docling-analyze

Wheel Details

Project: hdfs-docling-analyze
Version: 0.3.0
Filename: hdfs_docling_analyze-0.3.0-py3-none-any.whl
Size: 4157965 bytes
MD5: 4ed5261d12c709f34db12c7762b69d1e
SHA256: 41188fc1349412c28d49da5c5c089cb910f6d92b2a5ade431cf2680a41b3da74
Uploaded: 2025-01-14 09:41:16 +0000
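
The size and SHA256 values listed above can be used to check a downloaded copy of the wheel before installing it. The following is a minimal sketch, not part of the package: the function names `file_sha256` and `matches_listing` are hypothetical, and the size is assumed to be in bytes.

```python
import hashlib
import os

# Values copied from the Wheel Details listing above.
EXPECTED_SIZE = 4157965  # assumed to be bytes
EXPECTED_SHA256 = "41188fc1349412c28d49da5c5c089cb910f6d92b2a5ade431cf2680a41b3da74"


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_size=EXPECTED_SIZE, expected_sha256=EXPECTED_SHA256):
    """True only if both the file size and the SHA-256 digest match."""
    if os.path.getsize(path) != expected_size:
        return False
    return file_sha256(path) == expected_sha256.lower()
```

Calling `matches_listing("hdfs_docling_analyze-0.3.0-py3-none-any.whl")` on a local download returns `True` only when both values agree with the listing.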

dist-info

METADATA

Metadata-Version: 2.1
Name: hdfs_docling_analyze
Version: 0.3.0
Summary: A library for analyzing files from HDFS and saving results to MongoDB
Author: Vo Nhu Y
Author-Email: vonhuy5112002[at]gmail.com
Home-Page: https://github.com/vonhuy1
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Requires-Dist: obsei[all]
Requires-Dist: docling (==2.14.0)
Requires-Dist: hdfs (==2.7.3)
Requires-Dist: pymongo (==4.10.1)
Requires-Dist: pylatexenc (==2.10)
Requires-Dist: Spire.Doc (==12.12.0)
Requires-Dist: asyncio
Requires-Dist: pytesseract (==0.3.13)
Requires-Dist: odfpy (==1.4.1)
Requires-Dist: python-docx (==1.1.2)
Requires-Dist: unstructured
Requires-Dist: langchain
Requires-Dist: langchain-community
Requires-Dist: pyexcel-ods3 (==0.6.1)
Requires-Dist: docx2txt (==0.8)
Requires-Dist: pypandoc (==1.14)
Requires-Dist: striprtf (==0.0.28)
Requires-Dist: xlrd (==2.0.1)
Requires-Dist: llama-extract
Requires-Dist: llama-parse
Requires-Dist: llama-index
Requires-Dist: llama-index-llms-openai
Requires-Dist: llama-index-embeddings-openai
Requires-Dist: transformers
Requires-Dist: torch
Requires-Dist: keras
Requires-Dist: pypdfium2
Requires-Dist: pypdf2
Requires-Dist: pi-heif
Requires-Dist: pdfminer
Requires-Dist: pdfminer.six
Requires-Dist: unstructured-inference
Description-Content-Type: text/markdown
[Description omitted; length: 352 characters]
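
METADATA uses an email-style header format, so the standard-library `email` parser can read it. The sketch below separates the exactly pinned requirements from the unpinned ones. It makes assumptions: `split_requirements` is a hypothetical helper, `SAMPLE` is a shortened excerpt of the fields above, and the regular expression only covers the simple `name[extras] (specifier)` forms that appear in this listing, not environment markers.

```python
import re
from email.parser import Parser

# Shortened excerpt of the METADATA fields shown above.
SAMPLE = """\
Metadata-Version: 2.1
Name: hdfs_docling_analyze
Version: 0.3.0
Requires-Dist: obsei[all]
Requires-Dist: docling (==2.14.0)
Requires-Dist: asyncio
Requires-Dist: pdfminer.six
"""

# name, optional [extras], optional (specifier); nothing else is handled.
_REQ = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(?:\(([^)]*)\))?\s*$"
)


def split_requirements(metadata_text):
    """Return ({name: pinned_version}, [unpinned names]) from METADATA text."""
    msg = Parser().parsestr(metadata_text)
    pinned, unpinned = {}, []
    for line in msg.get_all("Requires-Dist") or []:
        match = _REQ.match(line)
        if match is None:
            continue  # a form this simple pattern does not cover
        name, _extras, spec = match.groups()
        if spec and spec.startswith("=="):
            pinned[name] = spec[2:]
        else:
            unpinned.append(name)
    return pinned, unpinned


pinned, unpinned = split_requirements(SAMPLE)
print(pinned)    # {'docling': '2.14.0'}
print(unpinned)  # ['obsei', 'asyncio', 'pdfminer.six']
```

For anything beyond a quick inspection, a full requirement parser is a better choice than this regular expression.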

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.44.0)
Root-Is-Purelib: true
Tag: py3-none-any
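
The tag `py3-none-any` is the last three components of the wheel filename: Python tag, ABI tag, and platform tag. As a rough illustration, the sketch below splits a wheel filename into its parts. `WheelName` and `parse_wheel_filename` are hypothetical names, and the parser assumes a well-formed filename with an optional build tag.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split '{dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl'."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


outer = parse_wheel_filename("hdfs_docling_analyze-0.3.0-py3-none-any.whl")
bundled = parse_wheel_filename("tesserocr-2.7.1-cp311-cp311-win_amd64.whl")
print(outer.python_tag, outer.abi_tag, outer.platform_tag)        # py3 none any
print(bundled.python_tag, bundled.abi_tag, bundled.platform_tag)  # cp311 cp311 win_amd64
```

The second filename is the wheel bundled under `hdfs_analyze/third_party/` in RECORD. Its tags indicate CPython 3.11 on 64-bit Windows, whereas the outer wheel is tagged as platform independent.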

RECORD

Path Digest Size (bytes)
hdfs_analyze/__init__.py sha256=QYXBcyYxQYZbZZpkD8AjVNZpiBUq-6ocYw7VF8nuE9E 132
hdfs_analyze/config_docling.py sha256=eOT2f10Uv-ovCJwKutQq2HI6IJZDXyurfz2mH-2GSK8 3508
hdfs_analyze/config_hdfs.py sha256=B__uUzA1gM3zreqS5NzlhTPN6kYvyEZtbX3MUFTD_9U 6963
hdfs_analyze/connect_hdfs.py sha256=QBjnwQbeBXIkiQoLRQ7lB5JP8UDmpbbj_HV9y9qgutg 300
hdfs_analyze/connect_mongo.py sha256=eKqm368QjQX7FZCaRDNYNdp4l_o17WmlKEY-pG3LxnA 392
hdfs_analyze/extract_content_no_docling_v2.py sha256=RHW2enqG0oqgBAbmEuI7pHOjbImP-VN9-2MfHrnOJyY 16000
hdfs_analyze/forward_config.py sha256=sQfFFK7CHgmTG-5xL2kaHWpQSKuVe-UdBaI5t8y_77E 6358
hdfs_analyze/pipe_line_obsei.py sha256=UAg7XKE1XrmDBUvphOtVnad21vlULvxrP1P8KqbnQ3E 14161
hdfs_analyze/post_install.py sha256=qYpVeTLqhjzSSxwwMCn0oV_GjQhY9gqFtQ_bVxVNmOU 525
hdfs_analyze/obsei_module/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/__init__.py sha256=Uua3Zja0cXJoT0T-kricizXvaisc8G0o9JMcb-1BC0I 536
hdfs_analyze/obsei_module/obsei/_version.py sha256=go20U3RCVaJ2N55RnX4tO5rinfUCRV0puFyrHCto8yw 23
hdfs_analyze/obsei_module/obsei/configuration.py sha256=Cp_7wF-pbpTy6d4iu0_gL85pXDpNYc0SGsMCc51ZvyY 1239
hdfs_analyze/obsei_module/obsei/payload.py sha256=f_fAUv_NwlqwBXmEqYUNUYX5Fduuw2DdRvYIsPhok34 857
hdfs_analyze/obsei_module/obsei/process_workflow.py sha256=85_BU-6AsJJ1cQvW5qSzF7pBQfuAJvGTpoShzikvZlA 1936
hdfs_analyze/obsei_module/obsei/processor.py sha256=z2zwYvu73sNW6H4qhFUSKRBb212d6CZwXjy7gDMWc4s 2592
hdfs_analyze/obsei_module/obsei/analyzer/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/analyzer/base_analyzer.py sha256=VreSOdyERjnarLjRTVM1qyUQNWO8HfEpHCoopieDnz8 2544
hdfs_analyze/obsei_module/obsei/analyzer/classification_analyzer.py sha256=4Gnk47aONC0nKWpS2UYHTm31mhUwD17oJySNUAZYgms 6161
hdfs_analyze/obsei_module/obsei/analyzer/dummy_analyzer.py sha256=f8RBgbxpmbZ630fK5FN_pdhpv2XSh5_Bm5iUD6czedU 1331
hdfs_analyze/obsei_module/obsei/analyzer/ner_analyzer.py sha256=F3FD-xlvkzlAS_-noZOfvI6YJ9M6fQSNnuiblwLCvkU 5735
hdfs_analyze/obsei_module/obsei/analyzer/pii_analyzer.py sha256=kYpjrpkMvAa8Xw_wHVGr1-n2uMTxYi_v2T84rRs0a0s 7690
hdfs_analyze/obsei_module/obsei/analyzer/sentiment_analyzer.py sha256=r8F0_eU0Pq90I3xlePKpSFtPiy69OhMJPnadHaSWJIQ 3304
hdfs_analyze/obsei_module/obsei/analyzer/test_ner.py sha256=5ixoCwGZmLYu3fBkQG1bYoUz_l9DAaQFn5B4CgVCQh8 538
hdfs_analyze/obsei_module/obsei/analyzer/translation_analyzer.py sha256=RaN933gYUrIEt40PPGfOAXudL80Xu28-IruoUre_ya4 2451
hdfs_analyze/obsei_module/obsei/misc/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/misc/gpu_util.py sha256=9s21j9NwFNe5nfPz3QoGFN4WSTSRM1zsr5oMl7e17-w 446
hdfs_analyze/obsei_module/obsei/misc/utils.py sha256=fiegwYB38RFVnYOrqZLTwj6Hlvmc1qakBqfsmlUc0Jk 6525
hdfs_analyze/obsei_module/obsei/misc/web_search.py sha256=4a3OmsoTwcVdyJvVFZ1NCYBOTNRH6tJDCgLO_jC1jdA 1023
hdfs_analyze/obsei_module/obsei/misc/youtube_reviews_scrapper.py sha256=f5SGHeFZ_D_r06TP35ooPLvII0Pxie6f_s4vNCv8v2k 7906
hdfs_analyze/obsei_module/obsei/postprocessor/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/postprocessor/base_postprocessor.py sha256=qaRjEkPN15deSPKBCzPPxy16EJxPgUmd0zlZnbEEc8A 562
hdfs_analyze/obsei_module/obsei/postprocessor/inference_aggregator.py sha256=XHpIRC6D-HjKQUHNyqbPy5D5z0oZ6dkP8cNpsf024kg 2057
hdfs_analyze/obsei_module/obsei/postprocessor/inference_aggregator_function.py sha256=7aSsWvnmy18kMa8WOrl0nNw17xmL5_p0oSxoQVYHdPg 4374
hdfs_analyze/obsei_module/obsei/postprocessor/test3.py sha256=5cCbDh8EfklHhdEV09gMzsR_61opwYJ_SQ1UJz4Q9jU 1123
hdfs_analyze/obsei_module/obsei/preprocessor/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/preprocessor/base_preprocessor.py sha256=hrcTqceTKRtY5T3oIZaAfEG25xeHxIYRLckEUl1J2RI 576
hdfs_analyze/obsei_module/obsei/preprocessor/text_cleaner.py sha256=HyJlBvCIqjCQGlwLgVI2bCHqXmEYoz31kt1YktAS2eU 2819
hdfs_analyze/obsei_module/obsei/preprocessor/text_cleaning_function.py sha256=imZHLcbk5Szjt-MzNMZmYPsVzQ9AZEMOy8nQjhSsDIE 6080
hdfs_analyze/obsei_module/obsei/preprocessor/text_splitter.py sha256=Kf_gymU9_WB_oIzNqFsIAF5bAwIHczEEaicoaZmuUC8 4320
hdfs_analyze/obsei_module/obsei/preprocessor/text_tokenizer.py sha256=pr4hVo15nF4dgtO4heIQF3GsTuK-menIO80Pbfs-qj4 723
hdfs_analyze/obsei_module/obsei/sink/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/sink/base_sink.py sha256=kmlYD3pO4mcQrrEbaW4uxtE_Rkg0z-M_dtOJcgdq4qo 1366
hdfs_analyze/obsei_module/obsei/sink/dailyget_sink.py sha256=vrFpEYfVK7Bz3nPUL51YT4wdilPYRQrUzcJ-y_ZNGM4 5481
hdfs_analyze/obsei_module/obsei/sink/elasticsearch_sink.py sha256=8FHi1ebnCUheY22eQv2gAJLop67p2Bz-IkdnIXpRSKI 3559
hdfs_analyze/obsei_module/obsei/sink/http_sink.py sha256=WMBy8C1yQCiVQYa9GECX0-gG0BuUOiRe2opRj1KYWS0 1625
hdfs_analyze/obsei_module/obsei/sink/jira_sink.py sha256=M2BaJ6ZFLnQvfN22XTejJnWH7Ca8AODmYxWelddJFCo 3538
hdfs_analyze/obsei_module/obsei/sink/logger_sink.py sha256=2vl6sxhcTU5jY4X90VM6U5dbAvvGMyDz3e6NIQQT4iI 1198
hdfs_analyze/obsei_module/obsei/sink/pandas_sink.py sha256=kKtS2IMVHpJ3XBVoOeQe0bWnph4m2J4-YZGTdbP6BZ0 2386
hdfs_analyze/obsei_module/obsei/sink/slack_sink.py sha256=6nhvqfLuBpDnPjuLbfMnzFrsca4Gm7CdtiwuWbafHQU 2405
hdfs_analyze/obsei_module/obsei/sink/test1.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/sink/zendesk_sink.py sha256=oCNNvufLKuXW--KLmFPuoMRoOKJtKw22aPCWfulIBDE 5220
hdfs_analyze/obsei_module/obsei/source/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/source/appstore_scrapper.py sha256=omyVCH5zMoSfBmB1kbgElIiWJHgSuuLSkAZIqhlRGAY 5752
hdfs_analyze/obsei_module/obsei/source/base_source.py sha256=ak9a1yLv3vTRMWEQQ3GhPo33xxf2-LTiSbnAFumd1aE 570
hdfs_analyze/obsei_module/obsei/source/email_source.py sha256=m0FGfem1SqGdZiFI58IqO8RmmMnlYBdVK1Th3H24OX4 12548
hdfs_analyze/obsei_module/obsei/source/facebook_source.py sha256=YB-jaMH0ag7oLYZnQloxe0DCRt1Ee_m-xCuLdfqdPtk 6670
hdfs_analyze/obsei_module/obsei/source/google_maps_reviews.py sha256=_3Ufw02q_CK3FQNPpc40BJEDAC9Rkk0CafCJ4eKBNkg 4914
hdfs_analyze/obsei_module/obsei/source/google_news_source.py sha256=mJISAUdDVeu38vFfAI5jKINa22QPWxkwddY3ogLnk1U 6382
hdfs_analyze/obsei_module/obsei/source/module_url_test.py sha256=blqJAdb8ihhRIs_1B0HoaCJBNqRr3bA2dVf7m6CIVGA 19251
hdfs_analyze/obsei_module/obsei/source/pandas_source.py sha256=ma_6R2v8gmz0lpwFITz0rWfxm5cF-1FiPvHqQLhH3xs 1775
hdfs_analyze/obsei_module/obsei/source/playstore_reviews.py sha256=rOZoVJxuzsQF4pXFil_JgHf4NQcNzChup9dzahrEow0 5144
hdfs_analyze/obsei_module/obsei/source/playstore_scrapper.py sha256=iCgNBTKycr2xTIhlYUE6qcHLKVw9r8zifrU-Fk_z2o0 6081
hdfs_analyze/obsei_module/obsei/source/reddit_scrapper.py sha256=bZUpvurhQZMzzw5LCMaa6XKn0JoFkyTBMgpG0h24EPk 3599
hdfs_analyze/obsei_module/obsei/source/reddit_source.py sha256=h6zSE41BnZFL-Ur29NB8If0IkbnVF7nbfNSd0-Np4z8 5673
hdfs_analyze/obsei_module/obsei/source/twitter_source.py sha256=umSBK2cQqmWZW66ivA32gPuzh-WK2394DcnyT8U7Noc 10961
hdfs_analyze/obsei_module/obsei/source/website_crawler_source.py sha256=BGHbb8kDJNkfqAz_KFEg0evjqslEjeVhUop58QDJkGw 4235
hdfs_analyze/obsei_module/obsei/source/youtube_reviews.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/source/youtube_scrapper.py sha256=Fhp3I9EcKDrPIp67_8dVGabswscBDKhQwpDaYAPbQSs 4190
hdfs_analyze/obsei_module/obsei/workflow/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
hdfs_analyze/obsei_module/obsei/workflow/base_store.py sha256=hLiJhKi_qawR8k1m6_coD1hoepc-sUeBBHzWYiztM88 900
hdfs_analyze/obsei_module/obsei/workflow/store.py sha256=k-Yscr9nNUzNGhPKAY2kydQqJIGmFXX01fvMJeYRW08 7538
hdfs_analyze/obsei_module/obsei/workflow/workflow.py sha256=Ed4k9I1QfSbHh0DZ4cA9fK0ka2mnzH7CdTOfLFGIji0 1080
hdfs_analyze/third_party/tesserocr-2.7.1-cp311-cp311-win_amd64.whl sha256=sQYBAlPE7mA9e1p6T6Ts7ii0tibr8meMrgXt2l3V5Tc 4072897
hdfs_docling_analyze-0.3.0.dist-info/METADATA sha256=3HkQ6TTNkzlW6sQ1aUpbYVmf85ggJe_KmHpGJo9qduU 1795
hdfs_docling_analyze-0.3.0.dist-info/WHEEL sha256=eOLhNAGa2EW3wWl_TU484h7q1UNgy0JXjjoqKoxAAQc 92
hdfs_docling_analyze-0.3.0.dist-info/top_level.txt sha256=FZUHKAXNKQVTT8OS-BGGl8kxa79ClclolPM22Yif55M 13
hdfs_docling_analyze-0.3.0.dist-info/RECORD
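
Each RECORD digest above is the SHA-256 of the file, encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below shows how such a value can be recomputed; `record_digest` is a hypothetical helper name, not something the package provides.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest string for the given file contents."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


# The digest of empty content matches every zero-size entry in the listing.
print(record_digest(b""))
# sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU
```

This explains why all the zero-byte files in the listing, such as the empty `__init__.py` files, `sink/test1.py`, and `source/youtube_reviews.py`, share the same digest. To check an installed file, pass its bytes, for example `record_digest(open(path, "rb").read())`, and compare the result with the listed value.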

top_level.txt

hdfs_analyze