cc2dataset

1.5.0 cc2dataset-1.5.0-py3-none-any.whl

Wheel Details

Project: cc2dataset
Version: 1.5.0
Filename: cc2dataset-1.5.0-py3-none-any.whl
Size: 12305 bytes
MD5: daf7cca8dee1d839a755ba4a5860f676
SHA256: e903a02b39f0bb98d320d966b80dd4abfc8646e385488d811abd6bd7e9619ef5
Uploaded: 2023-06-25 22:54:59 +0000
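
The MD5 and SHA256 values above can be checked against a downloaded copy of the wheel. The following is a minimal sketch using only the Python standard library; the helper names `file_digests` and `matches_listing` are illustrative and not part of cc2dataset.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha256_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


# Expected values copied from the wheel details above.
EXPECTED_MD5 = "daf7cca8dee1d839a755ba4a5860f676"
EXPECTED_SHA256 = "e903a02b39f0bb98d320d966b80dd4abfc8646e385488d811abd6bd7e9619ef5"


def matches_listing(path):
    """True if the file's digests equal the values listed for this wheel."""
    return file_digests(path) == (EXPECTED_MD5, EXPECTED_SHA256)
```

Calling `matches_listing("cc2dataset-1.5.0-py3-none-any.whl")` on a local download (the path is an assumption) should return `True` only for a byte-identical copy of the listed file.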

dist-info

METADATA

Metadata-Version: 2.1
Name: cc2dataset
Version: 1.5.0
Summary: Easily convert common crawl to image caption set using pyspark
Author: Romain Beaumont
Author-Email: romain.rom1[at]gmail.com
Home-Page: https://github.com/rom1504/cc2dataset
License: MIT
Keywords: machine learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.6
Requires-Dist: pyspark
Requires-Dist: pysimdjson
Requires-Dist: fsspec
Requires-Dist: pandas
Requires-Dist: loguru
Requires-Dist: pyarrow
Requires-Dist: fastwarc
Requires-Dist: s3fs
Requires-Dist: fire
Requires-Dist: requests
Description-Content-Type: text/markdown
[Description omitted; length: 4970 characters]
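
The METADATA file uses an email-style header format, so its fields can be read with the standard library's `email.parser`. The sketch below parses an abbreviated copy of the fields listed above; the `METADATA_TEXT` string is a shortened stand-in for the real file, not its full contents.

```python
from email.parser import Parser

# Abbreviated copy of the METADATA fields listed above (not the full file).
METADATA_TEXT = """\
Metadata-Version: 2.1
Name: cc2dataset
Version: 1.5.0
License: MIT
Requires-Dist: pyspark
Requires-Dist: pysimdjson
Requires-Dist: fsspec
"""

msg = Parser().parsestr(METADATA_TEXT)
name = msg["Name"]
version = msg["Version"]
# Repeated headers such as Requires-Dist are returned as a list.
requires = msg.get_all("Requires-Dist")
```

For an installed distribution, `importlib.metadata.requires("cc2dataset")` (Python 3.8+) returns the same requirement strings without parsing the file by hand.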

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.40.0)
Root-Is-Purelib: true
Tag: py3-none-any

RECORD

Path Digest Size (bytes)
cc2dataset/__init__.py sha256=XD70ZT1GtUZY1Cr6L8yKuLQPr0oVS2VRWtMN_0RNEGs 70
cc2dataset/main.py sha256=ByuNpG7_zPTB40V50CJSpzsLQgQjgPk7-jXEYdEv0JI 13307
cc2dataset/spark_session_builder.py sha256=Mn9DYNVZE06MmHZ00I-Y08b1pbFst3MEd02fLVx12yQ 3595
cc2dataset-1.5.0.data/data/README.md sha256=o7S7WTrU90TnYYjonOtjNl3yDJfKqNuMEb-OtmVok14 4968
cc2dataset-1.5.0.dist-info/LICENSE sha256=kQ7Sg07ctsEZeQxAgGGT8tsbjz8G5o_kY8ToEkxWfjE 1072
cc2dataset-1.5.0.dist-info/METADATA sha256=f3FG8L2y5wUzxuTPkhPSIl9MqZkSpGTgnzmmX7Vop5Q 5789
cc2dataset-1.5.0.dist-info/WHEEL sha256=pkctZYzUS4AYVn6dJ-7367OJZivF2e8RA9b_ZBjif18 92
cc2dataset-1.5.0.dist-info/top_level.txt sha256=sx3Zm58ophtFrhQbkH98g_KPtQrzbGjGjVSB6fkwDLg 11
cc2dataset-1.5.0.dist-info/RECORD
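
Each digest in the RECORD listing is the file's SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed, which is why the values differ in form from the hex SHA256 shown in the wheel details. In the RECORD file itself the three fields are comma-separated. The sketch below shows how one row could be recomputed from a file's bytes; `record_digest` and `record_row` are illustrative names, not functions from cc2dataset or any packaging tool.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Format a RECORD digest for `data`: 'sha256=' + unpadded urlsafe base64."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def record_row(path: str, data: bytes) -> str:
    """Build one RECORD row: path, digest, and size in bytes, comma-separated."""
    return f"{path},{record_digest(data)},{len(data)}"
```

To check an entry, read the corresponding file from the unpacked wheel in binary mode and compare `record_digest(data)` and `len(data)` with the listed digest and size. The RECORD file's own row carries no digest or size, as in the last row above.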

top_level.txt

cc2dataset