Deep Learning CV Modeling

Train and deploy a TensorFlow model to detect Oil & Gas well pads in the Permian Basin

In this video and article, we’ll go over one of the more in-depth examples we provide in Workbench: a deep learning model using a GPU server.

Deep Learning CV Example

In this deep learning example, we will detect oil and gas well pads in the Permian Basin. The notebook takes us through training a model on image chips generated from our platform, validating that model, and deploying it over a large area. Running this entire example can take around 30 minutes using a GPU, but much longer using a CPU. Let’s start up Workbench with a GPU server.

If you already have a CPU server running, you can stop your session and restart with a GPU server by going to File > Hub Control Panel > Stop My Server > Start My Server > Launch Server.

To start a GPU-enabled server, we’ll choose the second option in this list of server options. This will take a bit longer to start than a CPU server as we corral the resources you need behind the scenes.

Now that we have a GPU-enabled Workbench instance, let’s take a look at the Deep Learning CV example in /example_notebooks/examples/05_deep_learning_cv.

You should see three supporting files: two Python modules containing additional code and a GeoJSON file of hand-drawn polygons around well pads. Back in the notebook, the first few cells import all the necessary packages (including the two modules defined in the extra Python files), load the GeoJSON training data, and visualize it using Workflows.



Next, we will generate the training data from that GeoJSON. One of the supporting Python modules contains the function that generates binary training images using the Scenes API.

As the notebook states, this step will take 10-20 minutes to run.
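To make the idea of a binary training image concrete, here is a minimal sketch of turning a polygon into a 0/1 mask. It is not the notebook's implementation, which relies on the Scenes API; the function names and the `pad_outline` coordinates are hypothetical, and the polygon is given in pixel coordinates rather than longitude and latitude.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices; the last vertex connects
    back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_polygon(polygon, width, height):
    """Return a height x width grid of 0/1, testing each pixel center."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical well pad outline in pixel coordinates (not from the notebook).
pad_outline = [(2, 2), (6, 2), (6, 5), (2, 5)]
mask = rasterize_polygon(pad_outline, width=8, height=8)
```

Pixels whose centers fall inside the outline are set to 1 and everything else to 0, which is the same kind of label image the model is trained against.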

Once the training data has been generated, let’s look at an example. The training data is stored as a TFRecords file, so we define a parsing function for that file and visualize the imagery and the binary image with the well pads rendered in yellow.


Model & Training

Next, we build a UNet classifier and compile it with the Adam optimizer. We also divide the samples into training and validation sets before fitting the model. The training step for this model takes about 1.5 minutes on the Workbench GPU. Once trained, we save the model to the local filesystem, zip it up, and then upload it to DL’s Storage service. There, the trained model can be accessed from other systems using your user’s credentials.
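The train/validation split mentioned above can be sketched in plain Python. This is an illustration only: the 80/20 ratio, the fixed seed, and the `split_samples` helper are assumptions, and the notebook may split its TFRecords data differently.

```python
import random


def split_samples(samples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of `samples` and split it into (train, validation)."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical sample IDs standing in for the image chips.
chip_ids = list(range(100))
train_ids, val_ids = split_samples(chip_ids, validation_fraction=0.2)
```

Shuffling before splitting keeps neighboring chips from landing in the same set purely because of the order in which they were generated.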


After reviewing the model’s training history, let’s test the model out on some imagery. We define a few functions to load the model from Storage and predict over a Tile, a specific type of AOI.

There are a few ways to define a tile, documented under the Scenes API. Here, we are defining a 128x128 pixel, 10-meter resolution area from a latitude and longitude pair.
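As a quick check on what those numbers mean on the ground, the short calculation below works out the footprint of such a tile. The helper name is hypothetical, and the sketch assumes a square tile with no padding.

```python
def tile_footprint_m(tilesize_px, resolution_m):
    """Return the side length, in meters, of a square tile on the ground."""
    return tilesize_px * resolution_m


side_m = tile_footprint_m(tilesize_px=128, resolution_m=10)
area_km2 = (side_m / 1000) ** 2
print(f"{side_m} m per side, {area_km2:.4f} km^2")
```

Each tile therefore covers 1280 m on a side, or roughly 1.6 square kilometers.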

Passing the key of this tile into the predict_image function we defined will retrieve imagery over the Tile and predict on it. The next cell shows this output.


Deploy & Collect Results

Next we’ll make a Catalog product for the output of our model. We will output data with one layer, the probability of the pixel being a well pad.

To learn how to create a new Catalog product see "Managing Products" in Introduction to the Catalog API.

Now, we want to deploy this model on a large AOI. The AOI itself will be subdivided into small Tiles. Each unit of work will take in one Tile, predict on the imagery that falls within that Tile, and upload the result to our Catalog product. The function to do so, predict_wellpads, performs all of this work using the Scenes and Catalog APIs. Testing the function shows that it generates the expected output and uploads the image to the Catalog.

def predict_wellpads(dltile_key, catalog_pid, upload=True):
    import descarteslabs as dl

    tile = dl.scenes.DLTile.from_key(dltile_key)

    im, pred, meta = predict_image(dltile_key)

    # Upload result to Catalog
    if upload:
        print("Upload to catalog {}".format(catalog_pid))
        catalog = dl.Catalog()
    return pred

Our scalable computing environment, Tasks, is an easy way to scale any Python code you write in a parallel fashion. When you call tasks.create_function, the returned object is itself a function that submits work to the Tasks API. This new function—here called async_predict—takes the same arguments as the one you defined in the notebook, but the actual work is done within an HPC cluster hosted by DL. In this case, we will limit the number of workers to 10, but you do not typically need to specify a limit.

tasks = dl.Tasks()
async_predict = tasks.create_function(

Now, we define an AOI, split the AOI into Tiles, and visualize it with Workflows.
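The idea of splitting an AOI into Tiles can be illustrated with a simplified grid over a bounding box. This is a sketch under stated assumptions, not how the platform computes its tiles: the `split_bounds_into_tiles` helper and the AOI extent are hypothetical, coordinates are assumed to be in projected meters, and the grid is anchored at the corner of the box rather than on a fixed global grid.

```python
import math


def split_bounds_into_tiles(min_x, min_y, max_x, max_y, tile_side_m):
    """Cover a projected bounding box (meters) with square tiles.

    Returns a list of (x0, y0, x1, y1) tuples. Tiles on the far edges may
    extend past the box so that the whole area is covered.
    """
    n_cols = math.ceil((max_x - min_x) / tile_side_m)
    n_rows = math.ceil((max_y - min_y) / tile_side_m)
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = min_x + col * tile_side_m
            y0 = min_y + row * tile_side_m
            tiles.append((x0, y0, x0 + tile_side_m, y0 + tile_side_m))
    return tiles


# Hypothetical 10 km x 5 km AOI in projected meters; 1280 m tiles
# correspond to 128 pixels at 10 m resolution.
tiles = split_bounds_into_tiles(0, 0, 10_000, 5_000, tile_side_m=1280)
print(len(tiles))
```

Each tuple in `tiles` is one independent unit of work, which is what makes the prediction step easy to run in parallel.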


Finally, we will submit all of this work to Tasks by calling async_predict for each tile in our list. DL Monitor is used to monitor the progress of your Tasks. We have 66 tiles over our AOI, so we see that 66 tasks were submitted. Once the image used to run these Tasks (the execution environment, not satellite imagery) has finished building, the number of active workers will rise from 0 and the workers will begin computing the Tasks in parallel.
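The submit-one-call-per-tile pattern can be tried locally with Python's standard library. The sketch below is an analogy only and does not use the Tasks API: `fake_predict` and the tile keys are hypothetical stand-ins, and a thread pool on your own machine takes the place of the hosted workers.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_predict(tile_key):
    """Hypothetical stand-in for predict_wellpads; returns a dummy result."""
    return {"tile": tile_key, "status": "done"}


# Hypothetical tile keys; the notebook derives real ones from the AOI.
tile_keys = [f"tile-{i}" for i in range(66)]

# Cap concurrency at 10 workers, mirroring the limit described above.
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(fake_predict, key) for key in tile_keys]
    # Collect results in submission order once each call completes.
    results = [future.result() for future in futures]

print(len(results))
```

As with Tasks, submitting returns immediately with a handle for each unit of work, and the results are gathered once the workers have finished.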

Once the Tasks are finished, 66 images will have been added to your Catalog product. The last step of this notebook visualizes the results.