Reverse Dependencies of pyspark
The following projects have a declared dependency on pyspark:
- mkpipe — Core ETL pipeline framework for mkpipe.
- ml-verbs — Generic interfaces for machine learning
- ml2rt — Machine learning utilities for model conversion, serialization, loading, etc.
- ml4ir — Machine Learning libraries for Information Retrieval
- mlapp — IBM Services Framework for ML Applications: a Python 3 framework for building robust, production-ready machine learning applications. Official ML accelerator within the larger RAD-ML methodology.
- mlforecast — Scalable machine learning based time series forecasting
- mlops-cloud — no summary
- mlops-core — no summary
- mlpiper — An engine for running component based ML pipelines
- mlpr — A library for machine learning pipelines and report creation.
- mlpype-spark — no summary
- mlserver-mllib — Spark MLlib runtime for MLServer
- mltronsAutoDataPrep — First Automated Data Preparation library powered by Deep Learning to automatically clean and prepare TBs of data on clusters at scale.
- mockalot — Mockup data generator library.
- mockingbird — Generate mock documents in various formats (CSV, DOCX, PDF, TXT, and more) that embed seed data and can be used to test data classification software.
- model-fkeywords — A Natural Language Processing Library
- modern_robitcs_smc — Modern Robotics: Mechanics, Planning, and Control
- modern_robotics_smc — Modern Robotics: Mechanics, Planning, and Control
- moonspark — Logging helpers for PySpark
- more-pyspark — no summary
- mosaic-common-utils — Utils library for Mosaic
- mosaic-utils — Utils library for Mosaic
- mosaicml-streaming — Streaming lets users create PyTorch compatible datasets that can be streamed from cloud-based object stores
- mr_urdf_loader — Modern Robotics URDF Load Module
- mse — Make Structs Easy (MSE)
- mx-stream-core — Stream core package for mindx
- my-pyspark-package — A package to count nulls and -1s in PySpark DataFrames.
- myetljob-run — My first ETL library
- mymaplib-123 — Library created to map two Datasets
- myray — my ray desc
- namedframes — Named Data Frames
- narwhals — Extremely lightweight compatibility layer between dataframe libraries
- nbodyx — A JAX simulator for N body problems.
- nebius-connect — Nebius AI connector for Apache Spark™
- neuralforecast — Time series forecasting suite using deep learning models
- ng-ai — NebulaGraph AI Suite
- ng-data-pipelines-sdk — A library for interacting with data from Amazon S3 through PySpark. Read, write and transform data using a powerful and intuitive API with strong consistency and type checking, thanks to Pydantic. Compatible with Amazon MWAA running Airflow 2.7.2 and above.
- ngdi — NebulaGraph Data Intelligence Suite
- nh-prototype — no summary
- nichirin — TODO
- NikeCA — Standardize and Automate processes
- nixtlats — Python SDK for Nixtla API (TimeGPT)
- nlpbook — Applied Natural Language Processing in the Enterprise - An O'Reilly Media Publication
- no-spark-in-my-home — Yet another Python package for data generation
- noaa-object-data-delivery-pipeline — Pipeline for ingesting a sample dataset
- nolanm-portfolio-package — no summary
- NolanMQuantTradingEnvSetUp — no summary
- nops-metadata — Metadata producer tooling used in nOps.io
- numderivax — Numerical differentiation in JAX.
- nuna-sql-tools — Nuna Sql Tools contains utilities to create and manipulate schemas and sql statements.
- oarphpy — A collection of Python utils with an emphasis on Data Science
- obsrv — no summary
- ocean-spark-airflow-provider — Apache Airflow connector for Ocean for Apache Spark
- ocean-spark-connect — Spark Connect adapter for Ocean Spark
- ocean-sparkconnect — Spark Connect adapter for Ocean Spark
- ODP-DQ — DQ Solution to answer all DQ needs.
- olapy — OlaPy, an experimental OLAP engine based on Pandas
- omigo-ext — Extensions for omigo_core package
- omniduct — A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (including HDFS, Hive, Presto, MySQL, etc).
- onetl — One ETL tool to rule them all
- ons-metadata-validation — automated metadata validation for ONS metadata templates
- ons-utils — A suite of pyspark, pandas and general pipeline utils for ONS projects.
- openhunt — A Python library to expedite the analysis of data during hunting engagements
- openImageDatasetSDK — Python SDK for the Open Image Dataset.
- openImageDatasetSDKTest — Python SDK for the Open Image Dataset.
- openmetadata-data-profiler — Data Profiler Library for OpenMetadata
- openmeteo-requests — Open-Meteo Python Library
- OpenOA — A package for collecting and assigning wind turbine metrics
- openpredict — A package to help serve predictions of biomedical concept associations as a Translator Reasoner API.
- openwpm-utils — Tools for parsing crawl data generated by OpenWPM
- ophelia-spark — Ophelia is a spark miner AI engine that builds data mining & ml pipelines with PySpark.
- ophelian — Ophelian is a go-to framework for seamlessly putting ML & AI prototypes into production.
- oplangchain — langchain for OpenPlugin
- oracle-ads — Oracle Accelerated Data Science SDK
- outset — add zoom indicators, insets, and magnified panels to matplotlib/seaborn visualizations with ease!
- ovobdkit — Big Data Development Kit for OVO Big Data
- ovotestkit — Testing Kit for OVO Big Data
- owl-sanitizer-data-quality — Data Quality framework for Pyspark jobs
- packyak — Infrastructure for AI applications and machine learning pipelines
- pami — This software is being developed at the University of Aizu, Aizu-Wakamatsu, Fukushima, Japan
- pandera — A light-weight and flexible data validation and testing tool for statistical data objects.
- pano-airflow — Programmatically author, schedule and monitor data pipelines
- patek — A collection of utilities and tools for accelerating pyspark development and productivity.
- pathling — Python API for Pathling
- pb2df — A Python module for converting proto3 types/objects to Spark DataFrame objects.
- pbspark — Convert between protobuf messages and pyspark dataframes
- petastorm — Petastorm is a library enabling the use of Parquet storage from Tensorflow, Pytorch, and other Python-based ML training frameworks.
- pfeed — Data pipeline for algo-trading, getting and storing both real-time and historical data made easy.
- pii-anonymizer — Data Protection Framework is a python library/command line application for identification, anonymization and de-anonymization of Personally Identifiable Information data.
- pillar1 — Official package for Pillar1 company
- pineapple-spark — Pineapple is an extension of Apache Sedona for processing large-scale complex spatial queries
- pingpong-datahub — A CLI to work with DataHub metadata
- pipeasy-spark — an easy way to define preprocessing data pipelines for pyspark
- places-intel — A library for fetching and processing place data with polygons using Outscraper and Overpass APIs.
- ploosh — A framework to automate your tests for data projects
- ploosh-core — A framework to automate your tests for data projects
- poetry-demo5678 — no summary
- PoliPrompt — PoliPrompt performs data analysis with state-of-the-art foundation models
- polyexpose — polyexpose
- pou-shap — A unified approach to explain the output of any machine learning model.
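
For reference, a listing like the one above can be approximated locally by scanning the metadata of installed distributions. Below is a minimal sketch in Python; it is an illustration only, covers just the packages installed in the current environment rather than the full index, and uses a simple name match on the requirement string.

```python
# Minimal sketch: find installed distributions whose metadata declares a
# dependency on pyspark. Only the local environment is inspected.
import re
from importlib.metadata import distributions

def installed_pyspark_dependents():
    """Return names of installed packages that list pyspark in Requires-Dist."""
    dependents = set()
    for dist in distributions():
        for req in dist.requires or []:
            # A requirement string looks like "pyspark>=3.0; extra == 'ml'";
            # grab just the leading project name.
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if match and match.group(0).lower() == "pyspark":
                dependents.add(dist.metadata["Name"])
                break
    return sorted(dependents, key=str.lower)

if __name__ == "__main__":
    for name in installed_pyspark_dependents():
        print(name)
```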