Reverse Dependencies of pyspark
The following projects have a declared dependency on pyspark:
- spark-connect-proxy — A reverse proxy server which allows secure connectivity to a Spark Connect server
- spark-dataframe-tools — spark_dataframe_tools
- spark-expectations — Runs data quality rules in flight while the Spark job is running
- spark-generated-rules-tools — spark_generated_rules_tools
- spark-hdfs-tools — spark_hdfs_tools
- spark-hydro — Advanced Delta Lake-related tooling based on Apache Spark
- spark-insights — A package to generate HTML reports for Spark DataFrames with detailed data health checks.
- spark-llm — LLM assistant for the development of Spark applications
- spark-loader — loads spark
- spark-map — PySpark implementation of the `map()` function for Spark DataFrames
- Spark-Matcher — Record matching and entity resolution at scale in Spark
- spark-pipeline — Data Science oriented tools, mostly for Apache Spark
- spark-pit — PIT join library for PySpark
- spark-plotting-tools — spark_plotting_tools
- spark-privacy-preserver — Anonymizing Library for Apache Spark
- spark-quality-rules-tools — spark_quality_rules_tools
- spark-scaffolder-transforms-tools — spark_scaffolder_transforms_tools
- spark-silex — Silex adds more sparks to your project!
- spark-sql-to-sqlite — no summary
- sparkaid — Utils for working with Spark
- sparkautomapper — AutoMapper for Spark
- sparkautomapper.fhir — FHIR extensions for SparkAutoMapper
- SparkAutoML — For easy use of Spark's machine learning library
- SparkBoot — SparkBoot: an easy, YAML-based way to run PySpark
- sparkcraft — SparkCraft
- sparkdantic — A Pydantic -> Spark schema library (see the usage sketch after this list)
- sparkdataframecomparer — Deep comparer for Spark DataFrames
- sparkdh — no summary
- sparkfhirschemas — AutoMapper for Spark
- sparkkgml — From Knowledge Graphs to Machine Learning!
- sparklanes — A lightweight framework to build and execute data processing pipelines in PySpark (Apache Spark's Python API)
- sparklightautoml-dev — Spark-based distributed version of a fast and customizable framework for automatic ML model creation (AutoML)
- sparkly-em — Sparkly is a TF/IDF top-k blocking system for entity matching, built on top of Apache Spark and PyLucene.
- sparkmanager — A PySpark management framework
- SparkMinIOHandle — Spark MinIO Handler Package
- SparkMLTransforms — Transformations in Spark for ML Features
- sparkmon — sparkmon
- Sparkora — Exploratory data analysis toolkit for PySpark
- sparkouille — Ways to productionize machine learning predictions
- sparkpipelineframework — Framework for simpler Spark Pipelines
- sparkpipelineframework.testing — Testing Framework for SparkPipelineFramework
- sparkpl — A utility package for converting between PySpark and Polars DataFrames
- sparkql — sparkql: Apache Spark SQL DataFrame schema management for sensible humans
- sparksampling — pyspark-sampling
- SparkSchemafy — Formats Spark schema output into a schema definition
- sparksnake — Improving the development of Spark applications deployed as jobs on AWS services like Glue and EMR
- sparksql-helper — SparkSQL Helper
- sparksql-jupyter — Spark SQL magic command for Jupyter notebooks
- sparksql-magic — Spark SQL magic command for Jupyter notebooks (see the usage sketch after this list)
- SparkStream — A simple Spark streaming handler.
- sparkypandy — It's not spark, it's not pandas, it's just awkward...
- SPARQL2Spark — SPARQL Result to Spark
- spinecore — The core library of the spine library
- spirograph — A tool that makes building ML pipelines easier for non-technical users.
- spl-transpiler — Convert Splunk SPL queries into PySpark code
- splink — Fast probabilistic data linkage at scale (see the usage sketch after this list)
- sqlframe — Turning PySpark Into a Universal DataFrame API (see the usage sketch after this list)
- sqlmesh — no summary
- sqlmesh-cube — SQLMesh extension for generating Cube semantic layer configurations
- squirrel-datasets-core — Squirrel public datasets collection
- ssb-ipython-kernels — Jupyter kernels for working with Dapla services
- ssb-spark-tools — A collection of data processing Spark functions for use in Statistics Norway.
- stacks-data — A suite of utilities to support data engineering workloads within an Ensono Stacks data platform.
- statscanpy — Basic package for querying & downloading StatsCan data by table name.
- stoys — Stoys: Spark Tools @ stoys.io
- superannotate-databricks-connector — Custom functions to work with SuperAnnotate in Databricks
- sws-spark-dissemination-helper — A Python helper package providing streamlined Spark functions for efficient data dissemination processes
- synaptiq-datawarehouse — Add your description here
- synthesized-datasets — Publicly available datasets for benchmarking and evaluation.
- synthesized3 — Synthesized SDK.
- Synthius — A toolkit for generating and evaluating synthetic data in terms of utility, privacy, and similarity
- tabpipe — A toolkit for tabular data ML preprocessing pipelines.
- td-pyspark — Treasure Data extension for pyspark
- tecton — Tecton Python SDK
- tecton-parallel-retrieval — [private preview] Parallel feature retrieval for Tecton
- tecton-utils — [private preview] Utils for Tecton
- teehr — Tools for Exploratory Evaluation in Hydrologic Research
- test-amundsen-databuilder — Amundsen Data builder
- test-data-modori — LMOps Tool for Korean
- testabc — no summary
- testfate — no summary
- testKuldeep — Testing Databricks
- testlib123 — Library created to map two Datasets
- testuL — Library created to map two Datasets
- text-dedup — no summary
- tgedr-nihao — studies with financial data sources
- tgedr-pycode — Handy Python code
- tidal-algorithmic-mixes — Common transformers used by the TIDAL personalization team.
- tidal-per-transformers — Common transformers used by the TIDAL personalization team.
- tidy-tools — Declarative programming for PySpark workflows.
- tidypyspark — dplyr for pyspark
- timestep — Timestep AI CLI - free, local-first, open-source AI
- tinderbox — Shareable PySpark transformation sequence
- tinsel — PySpark schema generator
- tinytimmy — A simple, easy-to-use Data Quality (DQ) tool built with Python.
- tmlt.analytics — Tumult's differential privacy analytics API
- tmlt.core — Tumult's differential privacy primitives
- toolbox-pyspark — Helper files/functions/classes for generic PySpark processes
- trac-runtime — TRAC Model Runtime for Python
- tracdap-runtime — Runtime package for building models on the TRAC Data & Analytics Platform
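
Usage sketches for a few of the better-known packages above follow. Each is a minimal, unverified sketch: the names are taken from the projects' public READMEs and docs and should be treated as assumptions, not guaranteed APIs.

sparkdantic derives a Spark schema from a Pydantic model. A minimal sketch, assuming the `SparkModel` base class and its `model_spark_schema()` method from the project's README:

```python
# Minimal sketch: Pydantic model -> Spark schema via sparkdantic.
# `SparkModel` and `model_spark_schema()` follow the project's README
# and are assumptions here.
from typing import Optional

from sparkdantic import SparkModel


class User(SparkModel):
    id: int
    name: str
    email: Optional[str] = None


# Expected to return a pyspark.sql.types.StructType describing the model.
schema = User.model_spark_schema()
print(schema)
```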
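sparksql-magic exposes Spark SQL as a Jupyter cell magic. A notebook sketch, assuming the `sparksql_magic` extension name and `%%sparksql` magic from the project's README, plus an already-active SparkSession:

```python
# Cell 1: load the extension (assumes an active SparkSession in the notebook).
%load_ext sparksql_magic
```

```python
# Cell 2: run Spark SQL directly; the result is rendered as a table.
%%sparksql
SELECT 1 AS id, 'a' AS letter
```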
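splink scores candidate record pairs with a Fellegi-Sunter model. A minimal dedupe sketch in the style of the Splink 4 API (names are assumptions taken from the public docs); it uses the bundled DuckDB backend so the example is self-contained, though splink also ships a Spark backend:

```python
# Minimal dedupe sketch with splink (Splink 4-style API; names follow the
# public docs and are assumptions here). The DuckDB backend keeps the
# example self-contained; splink also runs on Spark.
import splink.comparison_library as cl
from splink import DuckDBAPI, Linker, SettingsCreator, block_on, splink_datasets

df = splink_datasets.fake_1000  # small bundled demo dataset

settings = SettingsCreator(
    link_type="dedupe_only",
    comparisons=[
        cl.ExactMatch("first_name"),
        cl.ExactMatch("surname"),
    ],
    blocking_rules_to_generate_predictions=[block_on("first_name")],
)

linker = Linker(df, settings, db_api=DuckDBAPI())
pairwise = linker.inference.predict()  # scored candidate pairs
print(pairwise.as_pandas_dataframe().head())
```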
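sqlframe re-implements the PySpark DataFrame API over plain SQL engines, so PySpark-style code can run without a cluster. A minimal sketch, assuming `DuckDBSession` and its `functions` module mirror their `pyspark.sql` counterparts as the project's README describes:

```python
# Minimal sketch: PySpark-style code on DuckDB via sqlframe (no Spark
# cluster needed). DuckDBSession and sqlframe.duckdb.functions mirror
# their pyspark.sql counterparts per the README; treat names as assumptions.
from sqlframe.duckdb import DuckDBSession
from sqlframe.duckdb import functions as F

session = DuckDBSession.builder.getOrCreate()
df = session.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "letter"])
df.groupBy("letter").agg(F.count("*").alias("n")).show()
```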