news-please


1.6.13 news_please-1.6.13-py3-none-any.whl

Wheel Details

Project: news-please
Version: 1.6.13
Filename: news_please-1.6.13-py3-none-any.whl
Size: 95941 bytes
MD5: b51310e2fd53fe9841aac1b2eddb42b8
SHA256: 0c378cc0c388cb5051e22493665bb714ab7ca057cf1c568ed015b1adbccbf373
Uploaded: 2024-07-29 09:01:03 +0000
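
The size and SHA256 values above can be used to check a downloaded copy of the wheel. The following is a minimal sketch, not part of news-please: the helper names `file_sha256` and `matches_listing` are hypothetical, and the local path passed in is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path

# Values copied from the "Wheel Details" listing above.
EXPECTED_SHA256 = "0c378cc0c388cb5051e22493665bb714ab7ca057cf1c568ed015b1adbccbf373"
EXPECTED_SIZE = 95941  # bytes


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Hypothetical helper: True if the file has the listed size and SHA-256."""
    path = Path(path)
    return path.stat().st_size == expected_size and file_sha256(path) == expected_sha256
```

Calling `matches_listing("news_please-1.6.13-py3-none-any.whl")` on a saved copy should return `True` only if both the size and the digest agree with the listing. SHA256 is preferred over the MD5 value for this check.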

dist-info

METADATA

Metadata-Version: 2.1
Name: news-please
Version: 1.6.13
Summary: news-please is an open source easy-to-use news extractor that just works.
Author: Felix Hamborg
Author-Email: felix.hamborg[at]uni-konstanz.de
Home-Page: https://github.com/fhamborg/news-please
Download-Url: https://github.com/fhamborg/news-please
License: Apache License 2.0
Keywords: news crawler news scraper news extractor crawler extractor scraper information retrieval
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Dist: Scrapy (>=1.1.0)
Requires-Dist: PyMySQL (>=0.7.9)
Requires-Dist: psycopg2-binary (>=2.8.4)
Requires-Dist: hjson (>=1.5.8)
Requires-Dist: elasticsearch (>=2.4)
Requires-Dist: beautifulsoup4 (>=4.3.2)
Requires-Dist: readability-lxml (>=0.6.2)
Requires-Dist: langdetect (>=1.0.7)
Requires-Dist: python-dateutil (>=2.4.0)
Requires-Dist: plac (>=0.9.6)
Requires-Dist: dotmap (>=1.2.17)
Requires-Dist: PyDispatcher (>=2.0.5)
Requires-Dist: warcio (>=1.3.3)
Requires-Dist: ago (>=0.0.9)
Requires-Dist: six (>=1.10.0)
Requires-Dist: lxml (>=3.3.5)
Requires-Dist: hurry.filesize (>=0.9)
Requires-Dist: bs4
Requires-Dist: faust-cchardet (>=2.1.18)
Requires-Dist: boto3
Requires-Dist: redis
Requires-Dist: newspaper4k (>=0.9.3.1)
Requires-Dist: lxml-html-clean (>=0.1.1)
Requires-Dist: typing-extensions (>=4.7.0)
Requires-Dist: pywin32 (>=220); sys_platform == "win32"
License-File: LICENSE.txt
[Description omitted; length: 603 characters]
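
METADATA uses an email-style header format, so the standard library's `email` parser can read it. The sketch below runs on a short excerpt of the fields listed above rather than on the real file; `read_requirements` is a hypothetical helper name, and splitting on `;` is a simplification that only separates the requirement from its environment marker without evaluating it.

```python
from email.parser import Parser

# A short excerpt of the METADATA fields listed above (not the full file).
METADATA_EXCERPT = """\
Metadata-Version: 2.1
Name: news-please
Version: 1.6.13
Requires-Dist: Scrapy (>=1.1.0)
Requires-Dist: bs4
Requires-Dist: pywin32 (>=220); sys_platform == "win32"
"""


def read_requirements(metadata_text):
    """Hypothetical helper: return (name, version, [(requirement, marker), ...])."""
    message = Parser().parsestr(metadata_text)
    requirements = []
    # Requires-Dist may appear many times, so get_all() is needed.
    for value in message.get_all("Requires-Dist", []):
        requirement, _, marker = value.partition(";")
        requirements.append((requirement.strip(), marker.strip() or None))
    return message["Name"], message["Version"], requirements


name, version, requirements = read_requirements(METADATA_EXCERPT)
for requirement, marker in requirements:
    print(requirement, "| marker:", marker)
```

In this excerpt only the `pywin32` entry carries a marker, which restricts that dependency to Windows. For real dependency resolution a dedicated requirement parser is a better fit than this string split.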

WHEEL

Wheel-Version: 1.0
Generator: setuptools (72.0.0)
Root-Is-Purelib: true
Tag: py3-none-any
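
The `py3-none-any` tag also appears in the wheel filename, which follows the pattern distribution, version, optional build tag, Python tag, ABI tag, platform tag. A minimal sketch of splitting the filename into those parts follows; `parse_wheel_filename` is a hypothetical helper that does no validation beyond counting the fields.

```python
def parse_wheel_filename(filename):
    """Hypothetical helper: split a wheel filename into its named components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of fields in: " + filename)
    return {
        "distribution": distribution,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("news_please-1.6.13-py3-none-any.whl")
print(info["python"], info["abi"], info["platform"])  # prints: py3 none any
```

Here `py3` means any Python 3 interpreter, `none` means no ABI requirement, and `any` means no platform restriction, which is consistent with `Root-Is-Purelib: true`.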

RECORD

Path Digest Size
newsplease/NewsArticle.py sha256=BWpTNs7LCPZn6VZ2dNUfsVvR5QQeJ80svuT1LAp9grI 1627
newsplease/__init__.py sha256=6C4bHLTFFpqJpHai2WO0oLnEkpZwkI-RijnvLI74yEQ 6448
newsplease/__main__.py sha256=6uA9lpXLnxmV8HhSuyu_1yIF27na_gKVh3oxPeLWJkM 25434
newsplease/config.py sha256=1FPARLJzTz4dbM_AUm6Y4clI3Ifo1cStn2m56t67RBY 9098
newsplease/helper.py sha256=DF6HVhRhLdg6KJ60aYxXYFSIvF4rYDEfi6XofZqlPVI 1262
newsplease/single_crawler.py sha256=D45xGPJTLJe3DmrZ3TYugklcWP-B4wYRaQs_W0iq6wA 11657
newsplease/config/config.cfg sha256=6M0NNStn50CQX0airln2gJwxr2aJxSLji9KqRJA-AKs 15666
newsplease/config/config_lib.cfg sha256=H6MphZGQzZf9c9Bn3T4Tw6jcczCWN_4D7q1_zY_uyx8 15071
newsplease/config/sitelist.hjson sha256=CcYIUWa9sOTAqSXz4Mcac_zchH9DxT3bsHsOQVEqaaw 2127
newsplease/crawler/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/crawler/commoncrawl_crawler.py sha256=1vcOf-qp7QgFZJDsYqXd6FitNTlrWx6SXLKl95-i27o 18927
newsplease/crawler/commoncrawl_extractor.py sha256=bFODqGHSg3MRpD7CCMqp22FPbfUwoyAMZ-M-qkVGdwM 15901
newsplease/crawler/items.py sha256=hVIrdX0sB90QnoQif4S_f5mFRUEfpAUaBE1xt2DnKbQ 1450
newsplease/crawler/response_decoder.py sha256=ZH3gltuwICu_Tu9dwvChDU8DyiQLwsd8WeQET7PFjcQ 1612
newsplease/crawler/simple_crawler.py sha256=ZdWaY6lFAvyGH-r5PZSaHt7b9d0575A2AYI69BWOXi4 3860
newsplease/crawler/spiders/__init__.py sha256=ULwecZkx3_NTphkz7y_qiazBeUoHFnCCWnKSjoDCZj0 161
newsplease/crawler/spiders/download_crawler.py sha256=BjtbrBvNj7bpAHBGTrOckg_0K_PlyY5T0I7rpInBnBc 1418
newsplease/crawler/spiders/gdelt_crawler.py sha256=EloTyh9oM6rrwzpfN2F0nmI3RbutxWKECqrFf4zwo1Q 3602
newsplease/crawler/spiders/newsplease_spider.py sha256=3oJGzSe0xJscABhwveQpP14QaSh8x-Lrtqfgl4AFPLE 812
newsplease/crawler/spiders/recursive_crawler.py sha256=SxO8zLY72mkRxzUPeggzvf8g6KAm0aHtT2wDnCpSX0Q 2066
newsplease/crawler/spiders/recursive_sitemap_crawler.py sha256=-UzqUSepYcJY48x4S3vWrwgDh9ELQmd_dOoM_WVueR4 2645
newsplease/crawler/spiders/rss_crawler.py sha256=FXw_pmo_YHz_fboN8t1XZm4fSVXKwE8_dpI_BhTxvGI 4785
newsplease/crawler/spiders/sitemap_crawler.py sha256=A1Mfwk1mXWoSlg1VYTrLiXT5Cs_IuEhZxbbTFXNVduo 2613
newsplease/examples/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/examples/commoncrawl.py sha256=4I5I83UT1Oi554kZh3b2cJ_yA4gbhh1TiGbPZ_s6CCM 9059
newsplease/examples/downloadfromfile.py sha256=Ry_ZmpQFWoBlZFnh_xnAmmry8a_uXfOAZ9qK9Cscih0 707
newsplease/examples/downloadfromurl.py sha256=hGjGoY1PFaN0b5Ut3As_0WOtwM7hHTqELRFwa4BoAfY 534
newsplease/helper_classes/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/helper_classes/class_loader.py sha256=-fuUlMyqaBwSsrF1bJnnMQgDz0Ot7NOxMI3uxER3VUo 702
newsplease/helper_classes/heuristics.py sha256=LQMj4bXXY2d9IDM6HOBx6FUw_kivm51FUn-v5tZJlWo 4819
newsplease/helper_classes/parse_crawler.py sha256=Nw0Kr9s4p-kmqJUkRo_TmDaRDJAyfvaZ8GieDOWqeNU 5021
newsplease/helper_classes/savepath_parser.py sha256=KocvVTRc9CzA6jdWXUl9eDyi6iEMuKS3cDNMBRta6ac 11877
newsplease/helper_classes/url_extractor.py sha256=xZ5bwoyLLSXyBEY6SX2-_Dmbq6IoKCjKzqmZUeU8l5Y 10139
newsplease/helper_classes/sub_classes/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/helper_classes/sub_classes/heuristics_manager.py sha256=bHSlXHOflCB8dT5zs6EDQTOxobehKRXiHNkSSdO3CdI 9811
newsplease/pipeline/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/pipelines.py sha256=OJ7CX4PFVj9M3snXrWGiBib8jIv014tgn-wxLP5atsM 43626
newsplease/pipeline/extractor/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/article_candidate.py sha256=YSgQSUsKumDjr6D282N2nRw_N7k8xP1RyJ8-tcIhU7w 362
newsplease/pipeline/extractor/article_extractor.py sha256=0QZeMplyTtPREx0hP8fEDbap0LH2-7nrhGcp5VWBrtc 3098
newsplease/pipeline/extractor/cleaner.py sha256=pfdE75LUrc3_gJLn4-5ibdKewoSZSlUumD25TgFyY4g 3909
newsplease/pipeline/extractor/comparer/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/comparer/comparer.py sha256=aSQDadWFZn2q45zLEKCCeqtuFW01BLaT_r1l6pRA6mI 1960
newsplease/pipeline/extractor/comparer/comparer_Language.py sha256=08ecx8B9nkRJ32B24X9xMhWajjaO39HEVLmCU2gBRE4 1934
newsplease/pipeline/extractor/comparer/comparer_author.py sha256=yTruRBP_4O3aRbSjFzREUCkg5_7C1CnlshzHfnssiuE 1349
newsplease/pipeline/extractor/comparer/comparer_date.py sha256=gJ6PqK2fZPkq_A9alJxCZDIVQ-yzbs8Z7T1hDgbxJl0 1274
newsplease/pipeline/extractor/comparer/comparer_description.py sha256=MieVaMIAL38VcvZPteQRrQbkKJupcWNUakoWzNUfsqA 1411
newsplease/pipeline/extractor/comparer/comparer_text.py sha256=cbRNgfpAgaOtJ_ktd5aXu5RlbQq5sLWUB4iePnvODuA 3329
newsplease/pipeline/extractor/comparer/comparer_title.py sha256=iyHdZEEi2gyQmsdYLFnuxT6q31eSna2olqjFWuyxCuI 3056
newsplease/pipeline/extractor/comparer/comparer_topimage.py sha256=mQjQ_J916FoRqBUpGITUMYYORaECIwAgxk9s8fAk6E0 1973
newsplease/pipeline/extractor/extractors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/extractors/abstract_extractor.py sha256=pho-PF2T_DswaU_JK0K3ZO2Qe5V3fZwLBP1ZCr3JNIY 2011
newsplease/pipeline/extractor/extractors/date_extractor.py sha256=i1I2ZS2r0hTPeB4CVH4l0G6mMLSYTcBCz-Js9K-8CRA 9236
newsplease/pipeline/extractor/extractors/lang_detect_extractor.py sha256=vn4j0Ffnrs9B4U_7tq_KcEXITSRogzvSCFag6fpbtSI 2905
newsplease/pipeline/extractor/extractors/newspaper_extractor.py sha256=g-A-N6fHbrNEddaM4io_covsRkj4kEW_COFerM-ddTM 1897
newsplease/pipeline/extractor/extractors/newspaper_extractor_no_images.py sha256=wBPopCzLy7yoEH0LidttcI4swIq0TD9t1NqU42BLywc 178
newsplease/pipeline/extractor/extractors/readability_extractor.py sha256=80shyarFytLD916ZI_dUKpXN4CATa-x8_ZXgkikXhHo 1306
news_please-1.6.13.dist-info/LICENSE.txt sha256=xazLvYVG6Uw0rtJK_miaYXYn0Y7tWmxIJ35I21fCOFE 11356
news_please-1.6.13.dist-info/METADATA sha256=4TA7FxE420ZYA8MGHpORso4_2aZ2gOFJ3sXXF6hnvIY 2624
news_please-1.6.13.dist-info/WHEEL sha256=Rp8gFpivVLXx-k3U95ozHnQw8yDcPxmhOpn_Gx8d5nc 91
news_please-1.6.13.dist-info/entry_points.txt sha256=Fpc0Ve-0092RkcfGKNjx6uxeZfFRaxZltNe4eeGn32U 111
news_please-1.6.13.dist-info/top_level.txt sha256=qaFdpp4zmVZSkpY7P4Yr4J5aP9v4D7FsU4Z_WUHEctY 11
news_please-1.6.13.dist-info/RECORD
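
Each digest in RECORD is the SHA-256 hash of the file, encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below reproduces that encoding and splits a row as it is displayed on this page (whitespace-separated; the RECORD file inside the wheel is comma-separated). The names `record_digest` and `parse_listing_row` are hypothetical, and the row parser assumes paths without spaces.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest string for the given file contents."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_listing_row(row):
    """Hypothetical helper: split a 'path digest size' row as shown above.

    The final RECORD row lists only a path, so digest and size may be None.
    """
    parts = row.split()
    path = parts[0]
    digest = parts[1] if len(parts) > 1 else None
    size = int(parts[2]) if len(parts) > 2 else None
    return path, digest, size


# An empty file hashes to the digest shared by the zero-byte __init__.py rows.
print(record_digest(b""))  # prints: sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU
```

To verify an installed file, read its bytes, pass them to `record_digest`, and compare the result and the byte length with the matching row.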

top_level.txt

newsplease

entry_points.txt

news-please = newsplease.__main__:main
news-please-cc = newsplease.examples.commoncrawl:main
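
Each entry point line maps a command name to a `module:attribute` target; installers typically generate a console command that imports the module and calls the named object. The sketch below shows that mapping with two hypothetical helpers, `parse_entry_point` and `resolve_target`. It parses the lines above without importing newsplease, and the resolution step is demonstrated on a standard-library target instead.

```python
import importlib


def parse_entry_point(line):
    """Hypothetical helper: split 'name = module:attr' into its three parts."""
    name, _, target = line.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module.strip(), attr.strip()


def resolve_target(module, attr):
    """Hypothetical helper: import the module and look up the dotted attribute."""
    obj = importlib.import_module(module)
    for piece in attr.split("."):
        obj = getattr(obj, piece)
    return obj


print(parse_entry_point("news-please = newsplease.__main__:main"))
# prints: ('news-please', 'newsplease.__main__', 'main')

# Resolution shown on a standard-library target so the sketch runs
# without news-please installed.
dumps = resolve_target("json", "dumps")
print(dumps({"ok": True}))  # prints: {"ok": true}
```

With the package installed, `news-please` would therefore call `main` in `newsplease/__main__.py`, and `news-please-cc` would call `main` in `newsplease/examples/commoncrawl.py`.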