tkitSimhash

0.0.1.9 tkitSimhash-0.0.1.9-py2.py3-none-any.whl

Wheel Details

Project: tkitSimhash
Version: 0.0.1.9
Filename: tkitSimhash-0.0.1.9-py2.py3-none-any.whl
Size: 6275 bytes
MD5: 76dfa38b4b18ad8627d7b75fa7bedc65
SHA256: 2b07211229352ee23e7de38b91bc86f0bd0c19715c80072c0c98dd5727156611
Uploaded: 2022-09-24 14:35:23 +0000

dist-info

METADATA

Metadata-Version: 2.1
Name: tkitSimhash
Version: 0.0.1.9
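Summary: # Remove duplicates (duplicate content filtering) tkitSimhash zh As a rule of thumb, two documents can be judged similar when the Hamming distance between their feature fingerprints is less than 3. The book "The Beauty of Mathematics" (《数学之美》) describes this algorithm in detail in its discussion of information fingerprints. ```python from tkitSimhash import simHash sim=simHash() text1 = """' , in Valve's absence, the modern slew of co-op zombie games have not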
Author: Terry Chan
Author-Email: napoler2008[at]gmail.com
Home-Page: https://terrychanorg.jetbrains.space/p/tkittools/repositories/tkitRemoveDuplicates/files/master/README.md
Requires-Dist: jieba (>=0.42.1)
Requires-Dist: simhash (==2.1.2)
Requires-Dist: nltk (>=3.6)
Requires-Dist: pytest (==7.1.3)
Description-Content-Type: text/markdown
[Description omitted; length: 2958 characters]
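
The Summary field describes judging two documents similar when the Hamming distance between their fingerprints is less than 3. The following is a minimal standard-library sketch of that idea, not the tkitSimhash implementation: the function names `simhash`, `hamming_distance`, and `is_near_duplicate`, the regex tokenizer, the MD5-based token hash, and the 64-bit width are all assumptions made for illustration. The package itself depends on `jieba` and `simhash`, which this sketch does not use.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Return a `bits`-wide fingerprint of `text` (illustrative sketch only)."""
    # Assumption: simple word tokenization; Chinese text would need a
    # segmenter such as jieba, which this sketch deliberately avoids.
    tokens = re.findall(r"\w+", text.lower())
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a, text_b, threshold=3):
    """Apply the 'distance less than 3' rule of thumb from the Summary."""
    return hamming_distance(simhash(text_a), simhash(text_b)) < threshold
```

Calling `is_near_duplicate(doc_a, doc_b)` on two strings returns a boolean; the `threshold` parameter can be adjusted if the rule of thumb proves too strict or too loose for a given corpus.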

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.37.0)
Root-Is-Purelib: true
Tag: py2-none-any
Tag: py3-none-any
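
The two Tag lines above correspond to the compressed tag set `py2.py3-none-any` in the wheel filename. As a rough sketch of how such a set expands into individual tags, assuming the usual `name-version-python-abi-platform.whl` layout with no build tag (the helper name `expand_tags` is hypothetical):

```python
from itertools import product


def expand_tags(wheel_filename):
    """Expand the compressed tag set of a wheel filename into single tags."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + wheel_filename)
    stem = wheel_filename[: -len(".whl")]
    # The last three dash-separated fields are python, abi, and platform tags.
    python_tags, abi_tags, platform_tags = stem.split("-")[-3:]
    return [
        "-".join(combo)
        for combo in product(
            python_tags.split("."), abi_tags.split("."), platform_tags.split(".")
        )
    ]
```

For the filename listed on this page, the function yields one entry per Tag line in the WHEEL file.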

RECORD

Path Digest Size
tkitSimhash/__init__.py sha256=KJfCFTVUKKW4CoOIavolG6asnAPw6eHJNZD_Vgsf-1w 30
tkitSimhash/simHash.py sha256=Q2vEPwJm5PzSyn1GenwsE6sQa_QyGC2BUkm7NZ74X9w 7994
tkitSimhash-0.0.1.9.dist-info/METADATA sha256=TZ1689wLfoSVXMwYJUOik1sOiJOd83-7GGHQTw0x3JU 4076
tkitSimhash-0.0.1.9.dist-info/WHEEL sha256=WzZ8cwjh8l0jtULNjYq1Hpr-WCqCRgPr--TX4P5I1Wo 110
tkitSimhash-0.0.1.9.dist-info/top_level.txt sha256=-0FmjTtTXgth0QSIuVvEIThk6iEIO23gUmev8oHw5Jc 12
tkitSimhash-0.0.1.9.dist-info/RECORD
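
The Digest column uses the wheel RECORD convention: the SHA-256 digest encoded as URL-safe base64 with the trailing `=` padding removed. A small sketch of checking a file's bytes against one row of this listing follows; the helper names are hypothetical, and the parser assumes the space-separated layout shown on this page (the RECORD file inside a wheel is comma-separated).

```python
import base64
import hashlib


def record_digest(data):
    """Return the RECORD-style digest string for raw file bytes."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_row(row):
    """Split a 'path digest size' row as displayed on this page."""
    path, digest, size = row.rsplit(" ", 2)
    return path, digest, int(size)


def row_matches(row, data):
    """Check that both the digest and the size in a row match `data`."""
    _, digest, size = parse_record_row(row)
    return digest == record_digest(data) and size == len(data)
```

Reading a file from the unpacked wheel in binary mode and passing its bytes to `row_matches` together with the corresponding row indicates whether the file is intact.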

top_level.txt

tkitSimhash