Watson Studio Desktop — First Impressions

Mark Ryan
7 min read · Feb 24, 2019
Completed flow for a simple run through of the customer churn problem

I have been using Watson Studio (as it is called now), both the local and cloud versions, for a couple of years as a platform for machine learning. Recently a colleague encouraged me to try out Watson Studio Desktop (available for free trial here). This article summarizes my experience installing Watson Studio Desktop and running through a simple exercise to apply SVM and logistic regression to predict customer churn. Links to the Watson Studio Desktop trial, the stream / flow created in this exercise, and the dataset are listed at the end of this article.

Download and Installation

Once I had signed up for the trial, the download was straightforward: the 3.4 GB package took about 20 minutes.

The install has two stages:

  • the first stage, with a standard install GUI, took about 2 minutes
  • the second stage, which starts when you launch Watson Studio Desktop, took a little over 15 minutes

The install was flawless. I have to admit I had some trepidation about installing a data science environment locally after my experience installing a local client for Azure Machine Learning and its very particular system requirements. I was worried that the Watson Studio Desktop install would choke on Windows 7 or have a bunch of surprise prerequisites. As it turned out, I had nothing to worry about. From a standing start I was up and running in less than 45 minutes.

No issues with install

Features Available in Watson Studio Desktop

This article describes the features that were available in Watson Studio Desktop as of February 2019. Update: as of April 16, 2019, Jupyter notebooks are available in Watson Studio Desktop. See this article for an example of using notebooks in Watson Studio Desktop.

While I have tried out some of the non-coding features in Watson Studio, all of my projects have exploited Watson Studio’s Python notebook environment. In my exploration of Watson Studio Desktop I decided to take advantage of the opportunity to do a small project using Modeler, a non-coding environment. This environment will be very familiar to anybody who has used SPSS Modeler outside the context of Watson Studio.

Setting up projects in Watson Studio Desktop is identical to the experience with Watson Studio cloud, minus the requirement to identify cloud storage when you initially set up a project. In addition to Projects and Modeler, Watson Studio Desktop also includes Data Refinery, which you can use to cleanse and prepare data for a data science project. I decided to use the data manipulation features directly in Modeler, so I didn’t dig into Data Refinery in this run through of Watson Studio Desktop.

Watson Studio Desktop includes Data Refinery

Exercising Modeler in Watson Studio Desktop

I had recently worked through an example of using Python notebooks in Watson Studio cloud to predict customer churn. In this example several methods (including SVM and logistic regression) are used to predict whether a client is going to leave a mobile phone provider. I decided to take the same customer churn dataset and see if I could figure out how to use Modeler in Watson Studio Desktop to get roughly the same results that I got with the Python notebook example for SVM and logistic regression in Watson Studio cloud.

Before I started into the Modeler flow I completed the following steps to get ready:

  1. Added a new project
  2. Added the churn dataset as a data asset in the new project I had created. You do this exactly as you would in the other instantiations of Watson Studio: click Add to Project at the top of the project screen, select Data and then either browse or drag and drop the file you want to add in the data pane on the right.
  3. Added a Modeler Flow to the project I had created: click Add to Project at the top of the project screen and select Modeler Flow.
  4. Watched this brief video to get an overview of Modeler since I had never used SPSS Modeler before.

With the above steps under my belt it was relatively easy to get going. In Modeler you drag and drop elements from the Palette on the left, right click on the elements and Open them to define their behaviour, and connect elements to establish the flow.

Here’s what the completed flow looks like in Modeler:

Flow in Modeler for churn prediction

Here’s a summary of the elements in the flow:

  • Data Asset to identify the input dataset. To make the connection to the churn dataset added to the project in step 2 above: select the Data Asset icon that you have dragged from the palette; right click and Open, then click on Change Data Asset; select the data asset defined in the project for the churn dataset.
  • Filter to select the subset of fields (columns) that you want to keep from the input dataset: right click and Open. You can choose to select the fields you want to filter or the fields you want to retain.
  • Auto Data Prep to make intelligent choices to prepare the data: normalization, replacement of categorical strings with integers, etc. I can certainly see the value of having customized prep for each field, but I was impressed at how easy this feature made it to get to my goal of rapidly prototyping a churn prediction.
  • Partition to split the data set into train and test
  • Models for SVM and Regression — for both of these, right click and Open to identify the Target (label) field and the input fields.
  • Analysis for each model: right click and Run to see the results.

SVM results

Regression results
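For readers who think in code, here is a rough Python sketch of what the data-side nodes in the flow (Filter, Auto Data Prep and Partition) do conceptually. It is not how Modeler implements them. The toy rows, the column names, the 80/20 split and the choice of integer encoding plus min-max scaling are all assumptions made for illustration; the real churn dataset and Auto Data Prep's actual choices will differ.

```python
import random

# Hypothetical toy rows; these column names are NOT the real churn dataset schema.
rows = [
    {"id": i, "plan": plan, "minutes": minutes, "churn": churn}
    for i, (plan, minutes, churn) in enumerate([
        ("basic", 120.0, "no"), ("premium", 300.0, "no"),
        ("basic", 45.0, "yes"), ("premium", 210.0, "no"),
        ("basic", 80.0, "yes"), ("basic", 150.0, "no"),
        ("premium", 500.0, "no"), ("basic", 30.0, "yes"),
        ("premium", 260.0, "no"), ("basic", 95.0, "no"),
    ])
]

def filter_fields(rows, keep):
    """Filter step: retain only the named fields (columns)."""
    return [{k: r[k] for k in keep} for r in rows]

def encode_categoricals(rows, fields):
    """Prep step (assumed): replace each distinct string with an integer code."""
    out = [dict(r) for r in rows]
    for f in fields:
        # Sort the distinct values so the codes are deterministic.
        codes = {v: i for i, v in enumerate(sorted({r[f] for r in rows}))}
        for r in out:
            r[f] = codes[r[f]]
    return out

def min_max_normalize(rows, fields):
    """Prep step (assumed): rescale numeric fields to the range [0, 1]."""
    out = [dict(r) for r in rows]
    for f in fields:
        lo = min(r[f] for r in rows)
        hi = max(r[f] for r in rows)
        span = (hi - lo) or 1.0  # avoid dividing by zero for constant fields
        for r in out:
            r[f] = (r[f] - lo) / span
    return out

def partition(rows, train_fraction=0.8, seed=42):
    """Partition step: shuffle with a fixed seed, then split into train and test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

kept = filter_fields(rows, ["plan", "minutes", "churn"])
encoded = encode_categoricals(kept, ["plan", "churn"])
prepared = min_max_normalize(encoded, ["minutes"])
train, test = partition(prepared)
print(len(train), "training rows,", len(test), "test rows")
```

Note that once the categorical strings have been replaced by integers, the original labels are no longer visible in the prepared rows, which is why it helps to keep the unprepared data around for inspection.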

Keeping defaults in most cases and doing zero tuning, the models created using Modeler in Watson Studio Desktop got roughly the same test accuracy (80%) as the Python-based models from the original churn prediction example. Once I had installed Watson Studio Desktop, it took me less than 3 hours to get these results even though I had never used Modeler before.
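As a reminder of what a test accuracy figure means, here is a minimal Python sketch that computes the share of held-out records a model labels correctly. I am assuming this matches the headline accuracy in the Analysis output; the labels below are made up for illustration and are not results from the churn dataset.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Made-up test-partition labels, purely illustrative.
actual = ["yes", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "yes"]
print(f"{accuracy(actual, predicted):.0%}")  # 3 of 4 correct, prints 75%
```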

Here’s a summary of my overall impressions of Watson Studio Desktop after completing this exercise:

  • Install is painless.
  • Anybody who has used the other instantiations of Watson Studio will be able to use Watson Studio Desktop immediately with no issues.
  • It is very convenient to have direct, fast access to the filesystem on my laptop with Watson Studio Desktop, and I anticipate taking advantage of the ability to work with no internet connection.
  • Modeler is easy to use and mostly intuitive. I was briefly held up on two points. First, it wasn’t entirely intuitive to me how to associate a data asset in the project with a Data Asset in Modeler: the Change Data Asset button threw me off because there wasn’t an existing asset to change yet. Second, it wasn’t immediately obvious to me that the target/label was identified in the model objects.
  • I like the ability to export (though the feature is called “Download”) the finished Modeler flow / stream as a file.
  • The flexibility of the flow in Modeler is really handy, especially the ability to preview at any stage. As an example, I tried selecting Preview then Profile from Auto Data Prep and was alarmed to see that I had “lost” the category labels:

However, I could simply do exactly the same thing on Filter, the previous node in the flow, to get what I wanted, with the labels intact:

If you want to try out the exercise described in this article yourself:

Written by Mark Ryan

Technical writing manager at Google. Opinions expressed are my own.
