deboiler

2023.46.150 deboiler-2023.46.150-py3-none-any.whl

Wheel Details

Project: deboiler
Version: 2023.46.150
Filename: deboiler-2023.46.150-py3-none-any.whl
Size: 31656
MD5: 9721b42eefd36c72583de6d16a4ec395
SHA256: 4891db91c84c654fdd62941af7b8218f4110eda2c17b0a9adc0b6633f394005e
Uploaded: 2023-11-17 22:52:35 +0000
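
The Size and SHA256 values above can be used to check a downloaded copy of the wheel. The following is a minimal sketch, not part of deboiler itself: the function names `sha256_of_file` and `matches_listing` are hypothetical, and the local file path is whatever the reader saved the wheel as.

```python
import hashlib
from pathlib import Path

# Values copied from the "Wheel Details" listing above.
EXPECTED_SHA256 = "4891db91c84c654fdd62941af7b8218f4110eda2c17b0a9adc0b6633f394005e"
EXPECTED_SIZE = 31656  # bytes


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """True only if both the byte size and the SHA-256 digest match the listing."""
    file_path = Path(path)
    return (
        file_path.stat().st_size == EXPECTED_SIZE
        and sha256_of_file(file_path) == EXPECTED_SHA256
    )
```

A call such as `matches_listing("deboiler-2023.46.150-py3-none-any.whl")` would then report whether the local file agrees with the listing. The MD5 value is omitted from the check because SHA-256 is the stronger of the two digests.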

dist-info

METADATA

Metadata-Version: 2.1
Name: deboiler
Version: 2023.46.150
Summary: Deboiler is an open-source package to clean HTML pages across an entire domain
Author: Globality AI
Home-Page: https://github.com/globality-corp/deboiler
License: MIT
Keywords: deboiler,python,html cleaning
Requires-Dist: tqdm
Requires-Dist: pandas
Requires-Dist: fastavro
Requires-Dist: lxml
Requires-Dist: tldextract
Requires-Dist: importlib-metadata (<4.3)
Requires-Dist: langdetect
Requires-Dist: flake8-isort (>=3.0.1); extra == "lint"
Requires-Dist: flake8-print (>=3.1.0); extra == "lint"
Requires-Dist: flake8-logging-format; extra == "lint"
Requires-Dist: globality-black; extra == "lint"
Requires-Dist: mypy; extra == "typehinting"
Requires-Dist: types-setuptools; extra == "typehinting"
Provides-Extra: lint
Provides-Extra: typehinting
Description-Content-Type: text/markdown
License-File: LICENSE
[Description omitted; length: 7722 characters]
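
METADATA uses an email-style header format, so the standard library can read it. The sketch below separates unconditional requirements from those gated behind an extra, using a shortened excerpt of the fields above. The function name `split_requirements` is hypothetical, and the marker handling is deliberately naive: it only recognises a bare `extra == "..."` marker, whereas real installers evaluate full environment markers.

```python
from email.parser import Parser

# Shortened excerpt of the METADATA fields listed above.
METADATA_EXCERPT = """\
Metadata-Version: 2.1
Name: deboiler
Version: 2023.46.150
Requires-Dist: tqdm
Requires-Dist: importlib-metadata (<4.3)
Requires-Dist: flake8-isort (>=3.0.1); extra == "lint"
Requires-Dist: mypy; extra == "typehinting"
Provides-Extra: lint
Provides-Extra: typehinting
"""


def split_requirements(metadata_text):
    """Return (core, extras): core requirements and a dict of extra name -> requirements."""
    message = Parser().parsestr(metadata_text)
    core, extras = [], {}
    for entry in message.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        marker = marker.strip()
        if marker.startswith("extra =="):
            # Naive assumption: the marker is exactly `extra == "<name>"`.
            extra_name = marker.split("==", 1)[1].strip().strip("\"'")
            extras.setdefault(extra_name, []).append(requirement.strip())
        else:
            core.append(entry.strip())
    return core, extras


core, extras = split_requirements(METADATA_EXCERPT)
```

With the excerpt above, `core` holds the two unconditional requirements and `extras` maps `lint` and `typehinting` to their own lists, mirroring the `Provides-Extra` fields.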

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.41.3)
Root-Is-Purelib: true
Tag: py3-none-any
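
The tag `py3-none-any` is the same triple that ends the wheel filename: Python tag `py3`, ABI tag `none`, platform tag `any`, which together with `Root-Is-Purelib: true` marks a pure-Python wheel. As an illustration, the sketch below splits a wheel filename into its components. The names `WheelName` and `parse_wheel_filename` are hypothetical, and the parser assumes a well-formed filename with an optional build tag and no compressed tag sets.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split name-version[-build]-python-abi-platform.whl into its parts."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


parsed = parse_wheel_filename("deboiler-2023.46.150-py3-none-any.whl")
```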

RECORD

Path Digest Size
deboiler/__init__.py sha256=s9ylEuOBniYM7BHzRi1372-nXx0I4Ur5PVOD3rDZ-Ss 130
deboiler/deboiler.py sha256=BlQ5Q7Q4bnccfLUX_jmc1dz0s3rbpKizpyJmobGXxx0 14139
deboiler/logger.py sha256=6wmsAd07zZ-M3ceGuzrrEuOCSLBXhvvtpRthwVBZlF8 594
deboiler/lxml_query.py sha256=xPW8B1NaPkPnHZLfYtU3ty1oqV3TKfuud8C2W17f7Gw 1207
deboiler/dataset/__init__.py sha256=756K3naEGAP3rs76do7OHd30Uztul9up3MJHNuIT_n8 282
deboiler/dataset/avro_dataset.py sha256=xPY-ip6Nu20xWyQCgsHpr9WPPd1nc8yCt7hYCmCp4es 2617
deboiler/dataset/base.py sha256=SKtBhZxDijGpMEGsHgyeK63YBBf3WKPRjlJmcIHRhfQ 3071
deboiler/dataset/dataframe_dataset.py sha256=a7u-Ipv8VQ3E9D-doT9adg7KSu4vtCKw52CWlh3wHI0 1645
deboiler/dataset/json_dataset.py sha256=OWaRIjMF_7rvfJ2TloSC7gj7nLVbbxyGPbZJF3XKwzw 2124
deboiler/dataset/list_dataset.py sha256=5hIf93IeqRpGi3TR_OtCXnmzsxnBXgkiTlxPlvZB_K0 1422
deboiler/models/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
deboiler/models/lxml_node.py sha256=4_gJU09Ww_R-lOIjPfPGwF78GebBrlQkdJXzGDU78vY 12937
deboiler/models/page.py sha256=LG1IOLc8uAdjKQFDdMQl6Mg_uLU_PND36SvuaFMvzbE 4970
deboiler/models/tag.py sha256=aATO1NguOlfydb1_FVlgqljlDuA1TYyc4EN3qSMlh7Y 875
deboiler/tests/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
deboiler/tests/test_denoising.py sha256=ntb3ydQTX9Kh7SYagpHoN1hgkCE1P4txOMhz7wNE8uE 2593
deboiler/tests/test_end_to_end_pipeline.py sha256=7PE5c4YOC8iWNoxhagEVU1z8j4cXCDp1Ch13l5MUroQ 923
deboiler/tests/test_language_detection.py sha256=Li95LDv5vNDFGF-IR_8AeGeAumz5B5mNT7KFxgSgIn4 1459
deboiler/tests/test_operation_modes.py sha256=fPNdNx2Z-uMCl5EJ6dalqTZIG-RxCz61lc9UObKQN1Q 2080
deboiler/tests/test_text_extraction.py sha256=u5N1IgMWacLgkDOQxwP6L-kuutgGpe0TJ7qT-8ssUzY 2774
deboiler/tests/fixtures/__init__.py sha256=7zgE8-uN1BZRsE_zVpU1UQtbJYqZjkYElDGKsdzQwgA 200
deboiler-2023.46.150.dist-info/LICENSE sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ 11357
deboiler-2023.46.150.dist-info/METADATA sha256=NawTzInYOFFyHL-jkHTraGsbh_yHU79ObMnGpJRY6jM 8603
deboiler-2023.46.150.dist-info/WHEEL sha256=Xo9-1PvkuimrydujYJAjF7pCkriuXBpUPEjma1nZyJ0 92
deboiler-2023.46.150.dist-info/top_level.txt sha256=nazDh4hO0uW4esQJsnMKrAyLlRCvUMROKgqGB7G5bxw 9
deboiler-2023.46.150.dist-info/RECORD
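
Each Digest value in RECORD is the file's SHA-256 hash, encoded as URL-safe base64 with the trailing `=` padding removed, and each Size is the file length in bytes. A minimal sketch of that encoding follows; the function name `record_digest` is hypothetical. The two zero-byte `__init__.py` entries above give a convenient check, since the digest of empty content is fixed.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest string for the given file contents."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


# Digest listed above for the empty deboiler/models/__init__.py (size 0).
EMPTY_FILE_DIGEST = "sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU"
digest_matches = record_digest(b"") == EMPTY_FILE_DIGEST
```

The RECORD file's own entry carries no digest or size, because a file cannot contain its own hash.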

top_level.txt

deboiler