This post is mostly for my own benefit: a way to organize the various things Saturn Cloud has published.

Dask

Beginner’s Guide

I recently started a series of articles aimed at beginners that incorporates practical tips we’ve picked up working with customers.

Just Start with the Dask LocalCluster: There is a lot out there…


Speed up your code immediately without spending a bunch of time learning new things.

Photo by Sigmund on Unsplash

This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The first article in the series is about using LocalCluster.

At Saturn Cloud, we manage a data…


It’s pretty frustrating when your browser disconnects from a Jupyter Notebook!

Photo by Nathan Dumlao on Unsplash

At Saturn Cloud, we manage a data science platform that provides Jupyter notebooks, Dask clusters and ways to deploy models, dashboards and jobs. As a result, we often help customers troubleshoot their notebooks, and network disconnects are a common issue.

We’ve gotten a number of customers struggling with long running…


Dask is great, and the LocalCluster is the easiest way to start

Photo by Christin Hume on Unsplash

This article is the first article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work.

At Saturn Cloud, we manage a data science platform that provides Jupyter notebooks, Dask clusters and ways…
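For readers who have not seen a LocalCluster before, here is a minimal sketch of what starting one can look like. It is an illustration under stated assumptions, not code from the article: the helper names `square` and `sum_of_squares_on_local_cluster` are hypothetical, and the worker count and the choice of thread-based workers are arbitrary.

```python
from dask.distributed import Client, LocalCluster


def square(x):
    """Toy task; stands in for any function you want to run in parallel."""
    return x * x


def sum_of_squares_on_local_cluster(n):
    """Start a LocalCluster, run a small computation on it, then shut it down."""
    # processes=False keeps the workers as threads inside this process,
    # which sidesteps multiprocessing start-up issues in short scripts.
    # dashboard_address=None disables the diagnostics dashboard.
    with LocalCluster(
        n_workers=2,
        threads_per_worker=1,
        processes=False,
        dashboard_address=None,
    ) as cluster:
        with Client(cluster) as client:
            # One task per input value, spread across the workers.
            futures = client.map(square, range(n))
            # Futures inside a list are resolved before sum() runs.
            total = client.submit(sum, futures)
            return total.result()


if __name__ == "__main__":
    print(sum_of_squares_on_local_cluster(10))
```

The same `Client` code should work unchanged if the cluster object is later swapped for a multi-machine one, which is part of why a LocalCluster is a convenient starting point.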


How to troubleshoot performance issues with Dask GroupBy Aggregations. A deep dive into the algorithm.

Example GroupBy Task Graph, courtesy of Julia Signell

At Saturn Cloud, we manage a data science platform that provides Jupyter notebooks, Dask clusters and ways to deploy models, dashboards and jobs. As a result, we often help customers troubleshoot Dask operations. Issues with GroupBys come up quite often.

Dask DataFrames are amazing for interactively exploring large datasets. However…


Bringing 100x Faster Data Science to Millions of Python Users

Snowflake (NYSE:SNOW), the cloud data platform, is partnering, integrating products, and pursuing joint go-to-market with Saturn Cloud to help data science teams get 100x faster results. Check out our January 13 webinar if that intrigues you.

Now SQL users can…


Kubernetes provides a ton of useful primitives for setting up your own infrastructure. However, the standard way of provisioning Kubernetes isn’t set up very well for data science workflows. This article describes those problems, and how we think about them.

Photo by Amy Elting on Unsplash

Disclaimer: I’m the CTO of Saturn Cloud — we build an enterprise data science platform focusing on huge performance gains leveraging Dask and RAPIDS.

Multiple AZs (Availability Zones) interact poorly with EBS (Elastic Block Store) and Jupyter

Love it or hate it, Jupyter is the most common data science IDE today. If you are provisioning a Kubernetes cluster for data scientists, they…


How to load data efficiently from Snowflake into Dask

Photo by Darius Cotoi on Unsplash

Snowflake is the most popular data warehouse among our Saturn users. This article will cover efficient ways to load Snowflake data into Dask so you can do non-SQL operations (think machine learning) at scale. Disclaimer: I’m the CTO of Saturn Cloud, and we focus on enterprise Dask.

The Basics

First, some basics…


The world is a scary scary place

Online marketing freaks me out... much like this tricycle rider

Recap

I recently quit my hedge fund job to pursue entrepreneurship. My hypothesis is that in the long term entrepreneurship outperforms high paying jobs, and that most risk of failure can be minimized when starting a business by choosing conservative (easy) problems, and applying…


Pandas Timestamps are the most common datetime representation in data science. This article goes over most of the common use cases.

Photo by Aron Visuals on Unsplash

Disclaimer: I’m the CTO of Saturn Cloud. We make it easy to connect your team with cloud resources. Want to use Jupyter and Dask? Deploy models, dashboards or jobs? Work from your laptop, or a 4 TB Jupyter instance? Get complete transparency into who is consuming what cloud resources? …
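As a rough illustration of the kind of Timestamp use cases the teaser above mentions, here is a short sketch. The dates and variable names are made up for the example, and the selection of operations is my own guess at what counts as "common", not a summary of the article.

```python
import pandas as pd

# Parse a string into a Timestamp (timezone-naive by default).
ts = pd.Timestamp("2021-03-01 09:30:00")

# Attach a timezone, then convert to another one.
ts_utc = ts.tz_localize("UTC")
ts_ny = ts_utc.tz_convert("America/New_York")

# Arithmetic with a Timedelta.
next_day = ts + pd.Timedelta(days=1)

# Round down to the start of the day.
floored = ts.floor("D")

# Vectorized parsing of a whole column of strings.
parsed = pd.to_datetime(pd.Series(["2021-03-01", "2021-04-15"]))
months = parsed.dt.month.tolist()

if __name__ == "__main__":
    print(ts_ny, next_day, floored, months)
```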

Hugo Shi

Founder of SaturnCloud.io
