How to Train a Machine-Learning Model Without a PhD

What is a machine-learning model?
A machine-learning model is a learning algorithm that must be taught to recognize patterns. With enough examples, it will be able to find those patterns in things it wasn’t taught about.
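
As a rough illustration of "teach with examples, then recognize something new", here is a minimal sketch in Python. It is a toy nearest-centroid classifier and is not how Watson Knowledge Studio works internally; the function names `train` and `predict`, the labels, and the numbers are all made up for this example.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new, unseen features."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training examples: (features, label) pairs.
examples = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((4.0, 4.5), "large"),
    ((4.2, 3.9), "large"),
]

model = train(examples)

# Neither of these points appeared in the training examples.
print(predict(model, (1.1, 0.9)))
print(predict(model, (3.8, 4.1)))
```

The "model" here is nothing more than the averages learned from the examples. Real tools use far more sophisticated algorithms, but the workflow of supplying labeled examples and then asking about unseen data is the same.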

The Problem

There are already many ways to teach machine-learning models. Many of them require extensive training to use, all the way up to a PhD in mathematics or artificial intelligence. Many of the tools on the market require extensive setup, have command-line interfaces, and need large staffs to operate. IBM wanted a tool that just about anybody could use to make a machine-learning model that they could easily integrate into their application.

Identifying how machine learning works

As a designer, I am not usually well versed in every tool I work with. There is also a danger in becoming too entrenched in an industry: you start to learn and reuse its jargon, which can put off new users who don’t have the benefit of instruction from highly trained super-users. Machine learning was well outside my realm of expertise, which gave me a unique outsider’s view within the organization. But I still needed to know how the system worked at a workflow level, so I spent my first few months at IBM learning, at a high level, how much of this system worked.

Research

I installed and used as many competitors’ tools as I could. Regrettably, many of them were behind very expensive, multi-layered consulting paywalls. But there were some free tools out there, all the way up to Microsoft’s Azure Machine Learning Studio, that gave me some insight into how others had tackled machine-learning training. I also studied extensively what it took to make a model, what it needed to succeed, and when a user could know it was “finished” (it never is finished!). All the products on the market started from the expectation that the user was already highly familiar with machine-learning terminology, functionality, and outcomes. Watson Knowledge Studio’s goal was that users would need none of that, but that the power would be there once they needed it. Kind of like putting the transition from iMovie to Final Cut Pro all in the same application.

Brainstorming

This is where the bulk of my effort went. Because the project was so complex, I had to iterate rapidly to find my own blind spots when it came to training the machine-learning model. My knowledge of machine learning would never be fully complete, so I had to rely heavily on my colleagues to explain concepts and suss out the technical limitations of my approaches. I also spent a large amount of time trying to figure out which paths were the most painful for users to go through, which model training would give the best result for their effort, and how the flow of steps could be simplified. I also created personas during this time with the help of our customer-facing team members and internal analytics. Regrettably, they are IBM proprietary, so I can’t share them.

User flow

Establishing the user flow was of the utmost importance before any serious wireframing could be done, as it informed how a user would actually use the system to start and refine their models. I started out by creating extensive flowcharts in OmniGraffle that were complex enough that it seemed like only I could read them. They didn’t inform developers or managers of my intentions as well as I had hoped. So I created a “Concept Car”, a narrative way to explain a user flow to just about anybody. A concept car isn’t intended to be a perfect flow; it shows where a user fails, gets frustrated, has to reach out to others, is successful, or has to use external tools. I went through nine concept cars during this single project alone as I refined the flow.

Wireframing/Prototyping

Because of the extensive work done during brainstorming, there were only a couple of critical-path prototypes I wanted to test. If they didn’t work, it wasn’t worth committing the extensive effort needed to push their concepts to production. I only had one shot to push out working code due to an extensive backlog of other projects, so it had to be as close to the right thing as I could manage. Wireframing was done mostly in Sketch, and the wireframes were converted into interactive clickable prototypes in Axure (or hand-coded HTML when Axure was too heavy a tool for the simple thing I wanted to test).

Usability Testing

Although I didn’t run the usability testing for this project, I did help find many of the users, usually through extensive outreach and by scouring the analytics to make sure they were the right folks to talk to based on their use, or lack of use, of Watson. I also made and maintained personas of who we wanted to talk to, and I analyzed and reviewed all the findings that our usability researchers gathered. I tried to be on every usability test call, but as a matter of policy we tried to limit clients’ exposure to a handful of customer-facing people at IBM, so I wasn’t able to interact with customers directly.

Rollout

When I left IBM, this project was still being planned for development and rollout, so I can’t speak to that extensively. However, in the process of producing the various concepts, performing user testing, and wireframing/prototyping, I worked extensively with our development team, who happened to be based entirely in Japan. I would often work late after hours to have informal meetups with them so that they wouldn’t always have to be the ones calling in to us at 1 AM. I used these sessions to make sure designs were feasible, go over concept cars with them, and field any other concerns or issues they might have. Because of cultural differences, they would not always feel comfortable raising concerns when managers or higher-ups were in the room, and these meetings gave them a voice. Because of this, they were ready to start work as soon as time opened up in their packed schedules.