Reverse Dependencies of pyspark
The following projects have a declared dependency on pyspark:
- powerbiclient — A Custom Jupyter Widget Library
- PPISleuth — Sniff out targets from a PPI network.
- pramen-py — Pramen transformations written in Python
- praxis-ml — A PySpark-based library for transparent and interpretable machine learning
- prda — Prda contains packages for data processing, analysis and visualization. The ultimate goal is to fill the “last mile” between analysts and packages.
- pre-ai-python — Microsoft AI Python Package
- prism-dev — The easiest way to create data pipelines in Python.
- prism-ds — The easiest way to create data pipelines in Python.
- prophecy-build-tool — Prophecy-build-tool (PBT) provides utilities to build and distribute projects created from the Prophecy IDE.
- prt-databricks-simplify-7da-data-ingest — Provides creation of data layers from pdro S3 buckets
- psdq — Ad Hoc Data Quality Tool for PySpark
- psyaitools — Loudness added.
- publicationpackage — This package contains the functions needed to produce a publication in a Jupyter Notebook.
- pulse-telemetry — Spark applications for transforming raw incoming data into a set of schemas for analysis.
- py-data-modori — LMOps Tool for Korean
- py-dataframe-show-reader — Reads the output of a DataFrame.show() statement into a DataFrame
- py-dbchat — Chat with your existing database without using a vector DB.
- py-homepass — Python package for basic interaction with the Plume Homepass API
- py-project-toml-kimtodd — Sample Python Project for creating a new Python Module
- py4phi — A library for encryption/decryption and analysis of sensitive data.
- pyautodata — Python library designed to minimize the setup/arrange phase of your unit tests
- pybda — Analysis of big biological data sets for distributed HPC clusters.
- pyBlindRL — A Python implementation of blind Richardson-Lucy deconvolution
- pycatcher — This package identifies outlier(s) for a given time-series dataset in simple steps. It supports day, week, month and quarter level time-series data.
- pydantic-cereal — Advanced serialization for Pydantic models
- pydantic-kedro — Kedro
- pydantic-spark — Converting Pydantic classes to Spark schemas
- pydataassist — PyDataUtils
- pydatavec — Python interface for DataVec
- pydeequ — PyDeequ - Unit Tests for Data (see the usage sketch after this list)
- pydeequ2 — PyDeequ2 - AWS clone
- pydeequ3 — PyDeequ3: PySpark 3 support for Deequ - AWS clone
- pydeequalb — PyDeequ - Unit Tests for Data
- pydisconet — analyzing the co-authorship network of researchers in the field of biology
- pyeqx — no summary
- pyeqx-core — no summary
- pyfate — no summary
- pyIcarus — Tools library for Security Research
- pyjanitor — Tools for cleaning pandas DataFrames
- pyjats — Quick Pydantic-based parser for JATS XML from e.g. EuropePMC
- pykiwi — pykiwi
- pyMicroeconomics — Functions for microeconomic analysis
- pyod-pyspark — no summary
- pyonion — A minimal implementation of the ONe Instance ONly algorithm
- pyoptimus — Optimus is the missing framework for cleaning and pre-processing data in a distributed fashion.
- pypas — Python wrapper for CyberArk Core PAS REST-API
- pype-spark — no summary
- pypersonality — Identify personality from a given text using the Myers-Briggs (MBTI) Personality Type dataset
- pyQualitas — A project to ensure data quality using Python
- pyrander — A random test lib
- pyrasterframes — Access and process geospatial raster data in PySpark DataFrames
- pyrecdp — A data processing bundle for Spark-based recommender system operations
- pyresumize — Resume parser written in Python 3. The module supports .pdf and .docx files.
- pysarplus — SAR prediction for use with PySpark
- pyschema2 — Schema generation and validation for PySpark
- pyseqtender — Distributed NGS pipelines made easy
- pysequila — An SQL-based solution for large-scale genomic analysis
- pysetl — A PySpark ETL Framework
- pySISF — A Python toolkit for SISF Images
- pyspark-bucketmap — Easily group PySpark data into buckets and map them to different values.
- pyspark-cli — PySpark Project Building Tool
- pyspark-config — Configurable data pipeline with Pyspark
- pyspark-connectors — An easy and quick way to connect and integrate a Spark project with many other data sources.
- pyspark-data-mocker — Mock a data lake easily to test your PySpark data application
- pyspark-datacol-diff — PySpark utility that quickly reports which attributes differ between two DataFrames with the same schema and primary key.
- pyspark-dataframe-wrappers — Sample Python Project for creating a new Python Module
- pyspark-dbscan — An "Efficient" Implementation of DBSCAN on PySpark
- pyspark-ds-toolbox — A PySpark companion for data science tasks.
- pyspark-eda — A Python package for univariate, bivariate, and multivariate data analysis using PySpark
- pyspark-event-correlation — Event correlation and change detection algorithm
- pyspark-factories — Create PySpark DataFrames with randomly generated data from a StructType schema
- pyspark-flame — A low-overhead sampling profiler for PySpark that outputs Flame Graphs
- pyspark-functions — no summary
- pyspark-graph — Pure pyspark implementation of graph algorithms
- pyspark-helpers — A collection of tools to help when developing PySpark applications
- pyspark-iomete — IOMETE's PySpark library that contains useful utilities for working with PySpark
- pyspark-json-loader — A package to load and preprocess JSON data using PySpark
- pyspark-pdf — Spark-Pdf is a library for processing documents using Apache Spark
- pyspark-prometheus — Prometheus instrumentation for Spark Streaming metrics.
- pyspark-regression — A tool for regression testing Spark Dataframes in Python
- pyspark-spy — Collect and aggregate Spark events for profitz, in a 🐍 way!
- pyspark-stubs — A collection of the Apache Spark stub files
- pyspark-supp — Data Engineer Support PySpark Library
- pyspark-test — Check that left and right Spark DataFrames are equal (see the usage sketch after this list).
- pyspark-testframework — Test framework for PySpark DataFrames
- pyspark-testing — Testing Framework for PySpark
- pyspark-types — `pyspark_types` is a Python library that provides a simple way to map Python dataclasses to PySpark StructTypes
- pyspark-util — PySpark utility functions
- pyspark-utilities — Spark utilities to be used by the Analytics BDA
- pyspark-utility — This project is meant to help PySpark users (on Databricks) with some utility functions.
- pyspark-val — PySpark validation & testing tooling
- pyspark-vector-files — Read vector files into a Spark DataFrame with geometry encoded as WKB.
- pyspark3d — Spark extension for processing large-scale 3D data sets
- pysparkcli — PySpark Project Building Tool
- pysparkdt — An open-source Python library for simplifying local testing of Databricks workflows that use PySpark and Delta tables.
- pysparkextra — extra utilities for pyspark.sql
- pysparkifier — Streamlined PySpark usage
- pysparkify — Spark-based ETL
- pysparklib — An elaborate collection of PySpark libraries and resources.
- pysparkly — Useful PySpark functions and extensions
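
Several of the packages above target PySpark testing and data quality. As a rough orientation, the sketch below shows how two of them, pyspark-test and pydeequ, are typically used. The calls follow their published APIs, but treat the details as assumptions to verify against each project's documentation; in particular, pydeequ also needs the Deequ jar on the Spark classpath and a SPARK_VERSION environment variable, which are omitted here.

```python
# Minimal sketch, not a verified recipe: typical usage of two packages
# from the list above. API names follow their public docs; check each
# project's README before relying on them.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# pyspark-test: assert two DataFrames are equal, e.g. inside a unit test.
from pyspark_test import assert_pyspark_df_equal

assert_pyspark_df_equal(df, df)

# pydeequ: declarative data-quality checks on a DataFrame.
# (Assumes the Deequ jar and SPARK_VERSION are already configured.)
from pydeequ.checks import Check, CheckLevel
from pydeequ.verification import VerificationResult, VerificationSuite

result = (
    VerificationSuite(spark)
    .onData(df)
    .addCheck(
        Check(spark, CheckLevel.Error, "basic checks")
        .isComplete("id")   # no nulls in id
        .isUnique("id")     # id values are unique
    )
    .run()
)

# One row per constraint, with its status and any failure message.
VerificationResult.checkResultsAsDataFrame(spark, result).show()
```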