Package PySpark job dependencies for GCP Dataproc
Learn how to package dependencies for a PySpark job running on a GCP Dataproc cluster
Google Cloud Dataproc is a managed service for running Apache Spark and other popular big data processing frameworks on Google Cloud Platform (GCP). With Dataproc, you can create and manage Spark clusters quickly, without having to operate the underlying infrastructure yourself.
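As a preview of the packaging workflow this guide covers, one common approach (an assumption here, not the only option) is to bundle pure-Python dependencies into a zip archive and hand that archive to Spark via `--py-files`. A minimal sketch, using a hypothetical local package named `mypackage/`:

```python
import pathlib
import shutil
import tempfile

# Hypothetical local package we want to ship alongside the job.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "mypackage"
pkg.mkdir()
(pkg / "__init__.py").write_text("VERSION = '0.1'\n")

# Bundle the package into deps.zip. Spark adds zip archives passed
# via --py-files to the Python path of the driver and executors.
archive = shutil.make_archive(
    str(root / "deps"),  # archive name without extension
    "zip",
    root_dir=root,
    base_dir="mypackage",
)
print(archive)
```

The resulting archive could then be attached when submitting the job, for example with `gcloud dataproc jobs submit pyspark main.py --cluster=my-cluster --region=us-central1 --py-files=deps.zip` (the cluster name, region, and file names are placeholders).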