Willy Wonka & The Cognitive Big Data Factory

For me, working with big data technology is a lot like being a kid who gets the chance to work in Willy Wonka's Chocolate Factory. For the uninitiated, Willy Wonka and the Chocolate Factory is a film adaptation of the classic 1964 novel Charlie and the Chocolate Factory by Roald Dahl. It tells the story of a starry-eyed child, Charlie Bucket, who receives a "golden ticket" and visits Willy Wonka's secretive, mysterious chocolate factory, along with four other children from around the world. There they meet the Oompa-Loompas, diminutive orange people with green hair, who not only help make the candy but also dispense wisdom through cleverly choreographed musical numbers.

For a few minutes, then, please indulge my analogy to this story, because if you love information, the power of big data, and the thrill of working with cutting-edge information technology, choosing a career in data science and big data technology just might be your golden ticket to a world of pure imagination - and gainful employment, for that matter.

So hear me out, all you undecided college undergrads and social-networking-addled high school students. Come along with me on an imaginary tour, right now, into Willy Wonka's Cognitive Big Data Factory.

The factory I am alluding to is located in Pittsburgh's Squirrel Hill neighborhood, on the corner of Murray Ave. and Forbes Ave., on top of a Rite Aid Pharmacy, to be specific. More importantly, however, it is located on the corner of limitless career possibilities - and you. 

First, a little about me, your tour guide. I am not Willy Wonka. I am an IBM Watson Explorer Information Developer, somewhat of an Oompa-Loompa in Willy Wonka's Cognitive Big Data Factory. Moreover, I am a particular type of Oompa-Loompa - one given the challenging task of describing how all the magical things that are made in Willy Wonka's Cognitive Big Data Factory work. And, as you will soon learn, Willy Wonka makes a lot of candy and chocolate bars.

"Who can take a dataset, sprinkle it with dew... Cover it with chocolate and a content analytic or two... The dataman can. The dataman can, 'cause he mixes it with love and makes your data feel good."

Let's start with the basic requirements of working in a big data environment. You must love data. A penchant for thinking creatively about data is a huge plus. You can come from nearly any discipline under the sun (and Watson only knows I am living proof of that): biology, history, civil engineering, mathematics. Almost any field, you will find, can be touched by the power of data science. Nonetheless, it is the love of data that gets you through the door of Willy's Cognitive Big Data Factory, this one and all other factories like it.

Now that we are inside Willy Wonka's Cognitive Big Data Factory, let's have a look around.

"Hold your breath. Come with me. And you'll be - in a world of pure information. If you want to view data paradise... Look around and view it. Anything you want to do - do it (but be Agile about it)." 

All that sweet, colorful candy is structured data. And that big flowing river of milk chocolate - why, that is the never-ending stream of unstructured data. See all those Oompa-Loompas gathering around those whiteboards, which feature rainbows of colorful sticky notes? They are Software-Engineer-Loompas. Those big, colorful, candy-sized mushrooms? Those are workstations.

Look! Over there are QA-Loompas eating the candy. Professional Service-Loompas can't stop talking about how good our candy is. Other Information-Developer-Loompas, like me, are explaining to folks how to eat the candy - it should be easy, but sometimes it is not. UX-Loompas are making the candy easier to hold and more attractive to look at. Many of the Loompas are beginning to gather in semi-circles around a bongo drum (literally, we have a real bongo drum in our lab, which is beaten to herald the beginning of our daily stand-up meeting).

In short, myriad folks, hailing from all parts of the world and with a wide variety of technology backgrounds, come to work in Willy Wonka's Cognitive Big Data Factory and help make the candy and chocolate bars.

Okay, so what is so special about the candy they make? Well, you did sign the waiver not to tell Mr. Slugworth, right? You did. Great. These folks are all working on the everlasting cognitive gobstopper - otherwise known as IBM Watson.

The folks in the Squirrel Hill Cognitive Big Data factory work on the part of Watson that explores and discovers data, which is called Watson Explorer. Additionally, they work on the components that comprise Watson Explorer, including Watson Explorer Engine, Watson Application Builder, Watson Content Analytics and Watson Results Module. There are more components to Watson Explorer, but you likely get the point. 

On the grander scale, Watson is a "cognitive technology," which means that it processes information more like a human than like a computer. It represents a natural extension of what humans can do at their best. It can read and understand natural language, something very important in understanding that never-ending, flowing river of milk chocolate called unstructured data, which itself makes up about 80 percent of the world's candy store of data.

What else can Watson do? When asked a question, it can generate hypotheses, rapidly parse relevant evidence, and evaluate responses from disparate data sources. And through repeated use, Watson literally gets smarter by tracking feedback from its users and learning from both its successes and failures. Pretty cool candy, huh?

Now that you have glimpsed, dare I say, information paradise, how do you become part of it? How do you become an Oompa-Loompa at Willy Wonka's Cognitive Big Data Factory or, admittedly, even at Mr. Slugworth's Oracle factory, or at some other data factory at Google, Microsoft, and others? Well, you don't need to be lucky enough to find one of five winning candy bars in a billion to have your golden ticket punched. Your golden ticket is simply a choice that you can make today - your golden ticket is choosing a career in data science and information technology.

"There is no life I know that can compare - to working with information... Living there... you'll be free... If you truly wish to be."

To learn more about...

  • IBM Watson
  • University of Pittsburgh Information Science Programs
  • Carnegie Mellon University's School of Computer Science Programs
  • The Pitt Science Outreach Club and the Pittsburgh Data Jam
  • Getting involved with Pittsburgh Dataworks
