How to set up Python on macOS the right way, for data teams

Quick Note…

This article is intended for a slightly more advanced audience and is written under the assumption that you are already familiar with the following tools/concepts:

If you are not familiar with these concepts, you should still be able to follow along, but I would recommend spending the time to dive deeper into each of these topics for a more holistic understanding. The links above lead to Corey Schafer’s YouTube channel, one of the best online resources out there, where he explains each of these concepts in detail.

Lastly, if your machine already has python3 installed, I would recommend uninstalling it before returning to this blog, and then following the instructions from here.

Intro

When your team is one person, what you do with your python environment does not really matter. However, when your team starts to scale, collaboration and version control become a new challenge.

This post focuses on how your team should set up their Python environment on each person's machine so you never run into build issues when collaborating on projects.

Avoid this trap…

Before we dive into what to do, I want to highlight the most common thing beginners do when they first start coding in Python, and why we should avoid doing this on our own machines. To put it simply, the wrong way to do it is to download Python from the official website, install everything globally, and skip virtual environments.

When you first start coding in Python, this typically does not lead to any issues. However, it becomes a problem when you start collaborating with other team members or increase the number of projects you are working on.

You want to make sure that different users can install the same configuration for the projects you are working on, on different machines.

If your teammate has python 3.1 installed globally on their machine, and you have python 3.9, it will likely lead to issues when you try to collaborate on the same project.

In order to smooth out the process of sharing your work, you need to be explicit with every project you work on. This includes being explicit about which version of python you are using and about which python packages should be installed.
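To make "explicit" concrete, here is a minimal sketch of what a globally installed setup leaves implicit. The helper names (environment_snapshot, differences) are hypothetical and not part of any tool used in this post; the sketch only assumes Python 3.8 or newer, where importlib.metadata is in the standard library.

```python
import sys
from importlib import metadata


def environment_snapshot():
    """Return the interpreter version and installed packages as plain data."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name.lower()] = dist.version
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "packages": dict(sorted(packages.items())),
    }


def differences(mine, theirs):
    """List human-readable mismatches between two snapshots."""
    problems = []
    if mine["python"] != theirs["python"]:
        problems.append(f"python: {mine['python']} != {theirs['python']}")
    for name in sorted(set(mine["packages"]) | set(theirs["packages"])):
        a = mine["packages"].get(name, "missing")
        b = theirs["packages"].get(name, "missing")
        if a != b:
            problems.append(f"{name}: {a} != {b}")
    return problems


if __name__ == "__main__":
    snapshot = environment_snapshot()
    print(snapshot["python"], "with", len(snapshot["packages"]), "packages")
```

If two teammates each ran environment_snapshot() and compared the results with differences(), every line in the output would be something their projects silently depend on but never wrote down.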

Once you and your team understand the necessary tools to accomplish this, sharing your work and collaboration becomes a breeze. The following section will outline all the tools you’ll need in order to do this in the most straightforward way possible.

pyenv & pipenv — The Holy Grail

To quote The Zen of Python, “explicit is better than implicit”. However, you wouldn’t know this to be true if you have installed python in the traditional way. Most python beginners do not know how to be explicit about which python version they are using. pyenv helps us do this by making it easy to switch between multiple versions of python. From here, we can be explicit about which version of python we are using when switching between various projects.

pipenv is a tool that combines the concepts of package management and virtual environments. Historically, these two concepts have been handled by two separate tools (pip and virtualenv), but pipenv combines the two and solves common workflow issues that people run into when using the traditional method. By using pipenv, we will be able to be explicit about which python packages we have installed for each project, while confining everything to a virtual environment. For a deeper dive, check out this video by Corey Schafer, where he offers a very thorough explanation of how pipenv works.

By combining pyenv and pipenv, we are being explicit about what version of python and what packages we are using for every project. This minimizes the risk of any project breaking due to future updates, and it simplifies collaboration across your team.

Now that we’ve gone over our tool selection, let’s dive into the step-by-step instructions.

Step 1: Ensure you have the Xcode Command Line Tools installed on your Mac

First things first, you need to ensure you have the Xcode Command Line Tools installed on your Mac. This gives you access to a number of commands which you can run through Terminal. Open Terminal and run the following command (it works from any directory).

$ xcode-select --install

Step 2: Install Homebrew (package management tool for MacOS)

The simplest way to do this is to paste the following command into your macOS Terminal. This is the installation script listed directly on Homebrew’s website.

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Step 3: Install pyenv via Homebrew

$ brew install pyenv

Step 4: Configure your .zshrc (or .bash_profile) to work with pyenv (this command makes pyenv initialize every time you open a new terminal session)

$ echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.zshrc
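After opening a new terminal, one way to confirm the configuration took effect is to check that pyenv's shims directory appears on your PATH. This is a rough sketch, assuming pyenv init has added the shims directory to PATH and that pyenv lives in its default location (~/.pyenv) unless PYENV_ROOT says otherwise. The function names are hypothetical.

```python
import os
from pathlib import Path


def pyenv_shims_dir(environ):
    """Return the shims directory: PYENV_ROOT/shims, defaulting to ~/.pyenv/shims."""
    root = environ.get("PYENV_ROOT") or str(Path.home() / ".pyenv")
    return str(Path(root) / "shims")


def shims_position(environ):
    """Return the index of the shims directory in PATH, or None if it is absent."""
    shims = pyenv_shims_dir(environ)
    entries = environ.get("PATH", "").split(os.pathsep)
    for index, entry in enumerate(entries):
        if entry.rstrip("/") == shims:
            return index
    return None


if __name__ == "__main__":
    position = shims_position(os.environ)
    if position is None:
        print("pyenv shims are not on PATH; open a new terminal or re-check ~/.zshrc")
    else:
        print(f"pyenv shims found at PATH position {position}")
```

A low position (ideally 0) means the shims are found before any other python on your machine, which is what lets pyenv decide which version runs.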

Step 5: Install python build dependencies (not having these can cause build issues when trying to install python via pyenv)

$ brew install openssl readline sqlite3 xz zlib

Step 6: Install the latest python version (3.9.1 as of writing)

$ pyenv install 3.9.1

Step 7: Set your global python default

$ pyenv global 3.9.1
# verify that it worked
$ pyenv version
3.9.1 (set by /Users/wbmcdonald4/.pyenv/version)
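The "set by" part of that output is worth reading, because it tells you where pyenv got the version from. As an illustration only, the sketch below parses a line in the format shown above and reports whether it came from a project-level .python-version file. The function names are hypothetical, and the sketch assumes a single active version on one line.

```python
import os
import re


def parse_pyenv_version(line):
    """Split '<version> (set by <source>)' into (version, source)."""
    match = re.match(r"^(\S+) \(set by (.+)\)$", line.strip())
    if match is None:
        raise ValueError(f"unrecognized pyenv version output: {line!r}")
    return match.group(1), match.group(2)


def is_project_level(source):
    """True when the version came from a .python-version file."""
    return os.path.basename(source) == ".python-version"


if __name__ == "__main__":
    sample = "3.9.1 (set by /Users/wbmcdonald4/.pyenv/version)"
    version, source = parse_pyenv_version(sample)
    print(version, "project-level" if is_project_level(source) else "global or other")
```

Here the source is pyenv's own version file, so 3.9.1 is the global default rather than a per-project choice.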

pipenv

Now that we’ve gone over pyenv, this next section covers pipenv, using the example of setting up a Jupyter workspace (a common tool that data teams use for analysis).

The first thing you’ll need to do is create the directory where you are looking to create your working environment, and tell it which version of python you want to use. I typically locate all my python projects within my Documents folder. I’ve added the path to the directory in the prompt to help clarify the following steps:

/Users/wbmcdonald4 $ cd Documents
/wbmcdonald4/Documents $ mkdir my_project
/wbmcdonald4/Documents $ cd my_project
/Documents/my_project $ touch .python-version

So far you have created the project directory and an empty .python-version file. This file tells pyenv which version of Python to run within this project. Now you will need to edit the file using nano or vim (both are built-in text editors).

/Documents/my_project $ nano .python-version

You’ll now find yourself editing the .python-version file in your terminal. Enter the Python version you would like to use for the project, then save and exit the file. For the version installed above, the file should contain a single line before you save it:

3.9.1

Now that pyenv knows which Python to use in this directory, you need to install pipenv onto this specific version of Python. Run the following command:

/Documents/my_project $ pip install pipenv

Make sure you are in the current project directory when you run this, to ensure that you install pipenv onto the correct version of python.
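If you want to double-check which interpreter you are on, the sketch below mimics, in simplified form, how pyenv looks for a .python-version file in the current directory and then in each parent directory, and compares what it finds with the running interpreter. The function names are hypothetical, and the sketch assumes the file holds a single full version such as 3.9.1.

```python
import sys
from pathlib import Path


def find_version_file(start):
    """Walk from `start` up to the filesystem root looking for .python-version."""
    current = Path(start).resolve()
    for directory in [current, *current.parents]:
        candidate = directory / ".python-version"
        if candidate.is_file():
            return candidate
    return None


def requested_version(version_file):
    """Return the first non-empty line of the file, or None if it is empty."""
    for line in Path(version_file).read_text().splitlines():
        if line.strip():
            return line.strip()
    return None


def interpreter_matches(requested, version_info=sys.version_info):
    """Compare a full version string such as '3.9.1' with an interpreter version."""
    running = ".".join(str(part) for part in version_info[:3])
    return running == requested


if __name__ == "__main__":
    version_file = find_version_file(Path.cwd())
    if version_file is None:
        print("No .python-version found here or in any parent directory")
    else:
        wanted = requested_version(version_file)
        print(f"{version_file} requests {wanted}; match: {interpreter_matches(wanted)}")
```

Run from the project directory with plain python, a mismatch suggests the shell is not picking up pyenv's shims, so pip would install pipenv somewhere else.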

We are almost there. The last thing I like to do is add the following environment variables to my .zshrc file (.bash_profile if you use bash instead of zsh) to help pipenv run a little smoother.

$ echo -e 'export PIPENV_NO_INHERIT=True' >> ~/.zshrc
$ echo -e 'export PIPENV_VERBOSITY="-1"' >> ~/.zshrc

The first ensures that pipenv does not inherit from parent directories, and the second suppresses a warning that occurs if you run pipenv while the virtual environment is already active. (This is a minor thing but is kind of annoying).
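Because these lines only take effect in new terminal sessions, it can be handy to confirm the variables are really set. A small sketch, with a hypothetical helper name, that reports any of the two variables whose value differs from what the commands above export:

```python
import os

# Values exported by the two commands above.
EXPECTED = {"PIPENV_NO_INHERIT": "True", "PIPENV_VERBOSITY": "-1"}


def unset_or_different(environ, expected=EXPECTED):
    """Return {name: current_value} for every variable that is not set as expected."""
    return {
        name: environ.get(name)
        for name, value in expected.items()
        if environ.get(name) != value
    }


if __name__ == "__main__":
    problems = unset_or_different(os.environ)
    print("all set" if not problems else f"check ~/.zshrc, found: {problems}")
```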

Now you are good to go. Let’s start by installing some standard packages that data teams typically use. Run the following pipenv command:

$ pipenv install jupyter jupyterlab numpy pandas matplotlib

This command accomplishes a few things. Since it is the first time you are running pipenv install in this directory, it will start by creating a virtual environment. From there, it installs these Python packages in that environment and creates two files (Pipfile & Pipfile.lock) that pipenv uses to manage the project dependencies. After this command finishes running, your directory should contain three files: .python-version, Pipfile, and Pipfile.lock.

If you open your Pipfile, you will see a [packages] section listing the five packages you just installed and a [requires] section recording the Python version.
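Separately from the Pipfile, Pipfile.lock is a plain JSON file that records the exact version pipenv resolved for every package, including dependencies you never named yourself. As a rough sketch (the function name is hypothetical), this reads the lock file's "default" section and lists those pins:

```python
import json
from pathlib import Path


def pinned_versions(lock_path):
    """Map each package in the lock file's "default" section to its pinned version."""
    lock = json.loads(Path(lock_path).read_text())
    pins = {}
    for name, entry in lock.get("default", {}).items():
        # PyPI entries carry a version such as "==1.2.3"; entries installed
        # from a VCS URL or local path may have no version, so fall back to None.
        pins[name] = entry.get("version", "").lstrip("=") or None
    return pins


if __name__ == "__main__":
    lock_file = Path("Pipfile.lock")
    if lock_file.is_file():
        for name, version in sorted(pinned_versions(lock_file).items()):
            print(name, version)
    else:
        print("No Pipfile.lock in the current directory")
```

This is the file that lets a teammate reproduce your environment: committing both Pipfile and Pipfile.lock means everyone installs the same versions.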

Now you should have everything you need to get started. To launch a notebook from within a virtual environment, just run the following command:

$ pipenv run jupyter notebook

This will launch Jupyter from within the project's virtual environment. Now when you open a new .ipynb file there, you should be able to import the packages we installed earlier with no issue.
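For a quick sanity check, you could paste something like the following into the first cell of a new notebook. It is only a sketch with hypothetical helper names: it reports whether the kernel is running inside a virtual environment and whether the packages installed earlier can be found, without actually importing them.

```python
import importlib.util
import sys


def in_virtualenv():
    """Detect a virtual environment created by venv or virtualenv."""
    # Older virtualenv releases set sys.real_prefix; venv-style environments
    # make sys.prefix differ from sys.base_prefix.
    return hasattr(sys, "real_prefix") or sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
print("missing packages:", missing_packages(["numpy", "pandas", "matplotlib"]))
```

An empty "missing packages" list and an interpreter path inside a virtualenvs directory suggest the notebook is using the environment pipenv created for this project.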

And that’s it! You should take this same approach any time you are starting a new project. If you have any questions, please post in the comments.
