Chapter 1 | Machines That Learn Guidebook

Introduction to Machines That Learn

Artificial Intelligence

Artificial intelligence emulates human intelligence in machine systems, either to augment or automate human work. Advanced sub-fields of AI explore new, non-human forms of 'intelligence' that don't parallel our ways of thinking so closely.

Often when we hear the term 'artificial intelligence,' it connotes characters from science fiction. This resembles what AI scientists call 'General AI,' a mostly hypothetical form of AI with true, adaptive intelligence. Such an AI could emulate the human capability to teach ourselves about the world around us. General AI is also sometimes known as 'wide' or 'strong' AI.

This kind of AI belongs to the realm of futurists and is often found in science fiction, with characters like Wall-E, the android Data on Star Trek, and Ultron in Marvel comics. Pop culture references to AI in English-language culture began with the problematic computer "HAL 9000" from the movie adaptation of Arthur C. Clarke's 2001: A Space Odyssey and have continued ever since. These tropes tend to fall into three categories:

  • Massive super-mind computer villains like Skynet or The Matrix,
  • Lovable but emotionally-limited characters like Star Wars' C-3PO, or
  • Informative, disembodied voices, like the 'ship's voice' in Star Trek.

Similarly, the general public tends to think of AIs as:

  • Incomprehensible and concerning collections of computers 'making decisions' about our lives,
  • Clever toys or interactive androids like 'assistant bots' acting as greeters at large buildings or stores, with limited functionality, or
  • Practical voice assistants like Siri, Google Assistant, or Alexa.

In most of these models, we have imagined technology in some human-like image, with human-like motivations. In one sense, this is false: what is called AI today is primarily complex math formulas and access to a reference library of data, with no self-awareness like a human has. In another sense, it's true: the programmers of these machine systems incorporate their own ways of thinking (including bias) and feed the systems data that may include some very human gaps and inaccuracies.

How to think about machines that learn

Sometimes it's easier to think about AI and related technologies as 'machines that learn.'

Artificial Intelligence vs. Intelligence Augmentation

Artificial Intelligence (AI) is sometimes best thought of as 'intelligence augmentation.' Artificial intelligence implies being autonomous and aware in a way that is unrealistic for machines. Intelligence Augmentation better describes what machines are capable of at the moment, which is support for human intelligence.

Shifting from Spreadsheet to Algorithm

Artificial Intelligence is an excellent example of technology that is only as useful as the thinking we bring to it.

That is to say, underlying the technology of AI is thinking that needs to be understood to take advantage of the technology (or protect against poor implementations of it).

To properly use AI, we need to understand the concepts of prediction, algorithms, the (limits of) automation, data ethics, and data in motion. Otherwise, we'll probably not get very far. Until we shift our mental model from spreadsheets to algorithms, AI tech will only give us a slightly better spreadsheet.

Let's start by unpacking some of the terms related to learning through—and by—machines.

While there are extensive debates as to the difference between AI and Machine Learning, for the purposes of this article we'll consider Machine Learning to be a subset of AI. We'll explore a couple of critical sub-fields of practical AI and contextualize where they fit alongside other data- and analytics-related initiatives.


Expert Systems

Emulating Human Knowledge

For a long time, most machines that we interacted with couldn't adapt to changing circumstances. The core of many 'smart' things we use is an expert system. Expert systems attempt to mirror a human expert's knowledge and decision-making power, using ready-made decisions and pre-recorded knowledge.

For example, when you go to a medical website and answer a series of questions based on your symptoms, it's as if the machine behind that tool is going through a tree-like chart of decisions. Each answer you give triggers a pre-programmed follow-up question and, eventually, a result. Such a system on its own does not 'learn' or adapt but can help narrow down potential articles to read or suggest actions for the 'patient' who is visiting the website.
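The tree-like chart of decisions described above can be sketched in a few lines of Python. This is a minimal illustration, not how any real medical site works: the questions, the suggested articles, and the names TRIAGE_TREE and run_expert_system are all invented for this example, and none of it is medical guidance.

```python
# A toy expert system: a hand-written decision tree with no learning involved.
# Every question and suggestion here is made up for illustration only.
TRIAGE_TREE = {
    "question": "Do you have a fever?",
    "yes": {
        "question": "Do you have a cough?",
        "yes": "Read: articles on respiratory infections",
        "no": "Read: articles on fever without cough",
    },
    "no": {
        "question": "Do you have a headache?",
        "yes": "Read: articles on headaches",
        "no": "No matching article; consider contacting a clinician",
    },
}


def run_expert_system(tree, answers):
    """Walk the tree using a dict that maps each question to 'yes' or 'no'."""
    node = tree
    # Inner nodes are dicts holding a question; leaves are plain strings.
    while isinstance(node, dict):
        answer = answers[node["question"]]
        node = node[answer]
    return node


result = run_expert_system(
    TRIAGE_TREE,
    {"Do you have a fever?": "yes", "Do you have a cough?": "no"},
)
print(result)
```

Because every branch is written by hand, the system can only return results its authors anticipated. Adding a new condition means editing the tree.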

Expert systems were one of the earliest forms of artificial intelligence and are still in use today in more advanced forms. However, there are two critical challenges for designers of expert systems. First, they depend on extracting knowledge from experts in an efficient, cost-effective, and machine-readable way. Second, expert systems cannot adapt to changing circumstances without additional input from either humans or another source of knowledge. In the medical diagnosis example from before, the accuracy of that system is limited to what it already 'knows.' So, before a developer added COVID-19 to the system, anyone presenting with those specific symptoms would probably be incorrectly categorized as influenza or pneumonia.


Moving from information retrieval to statistical analysis

While expert systems work well in known, low-change situations like helping someone select the right help article or find the right type of customer service agent, their approach is fundamentally based on concrete knowledge, which is by definition about the past. But what if you need to find out new information about data? This is where analytics comes in.

Analytics refers to the discovery, interpretation, and communication of meaningful information in data. Analytics is a multidisciplinary field, encompassing many methodologies and with applications at several stages of the data supply chain. It is sometimes redundantly called data analytics to emphasize the use of digital data. While analytics generally involves reporting on the past (batch analytics) or present (real-time analytics), there are also future-focused analytics called 'predictive analytics.' Particular uses and data types have their own analytics sub-fields, such as web analytics, sales analytics, or analytics of natural language (usually referred to as natural language processing).

Predictive analytics

Using statistics to predict the future

Predictive analytics uses data and statistical algorithms not only to describe the past, as traditional analytics does, but also to calculate the likelihood of future outcomes. You can think of this as the difference between seeing a report of the traffic experienced during a summer day's commute and using that report (and many other data points) to model what traffic might look like tomorrow, when children return to school and the weather gets snowy.
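To make the contrast concrete, here is a deliberately simplified sketch. It assumes a tiny, made-up commute history, and the function names describe and predict are hypothetical. Real predictive systems use far richer statistical models; this one only counts how often heavy traffic occurred on past days with matching conditions.

```python
# Made-up commute history: (school_in_session, snowy, heavy_traffic)
HISTORY = [
    (False, False, False),
    (False, False, False),
    (False, False, True),
    (True, False, True),
    (True, False, False),
    (True, True, True),
    (True, True, True),
    (False, True, True),
]


def describe(history):
    """Traditional analytics: what share of past days had heavy traffic?"""
    return sum(heavy for _, _, heavy in history) / len(history)


def predict(history, school, snowy):
    """Predictive analytics (very simplified): estimate the likelihood of
    heavy traffic from past days whose conditions match tomorrow's."""
    similar = [heavy for s, w, heavy in history if s == school and w == snowy]
    if not similar:
        return describe(history)  # no matching days: fall back to the overall rate
    return sum(similar) / len(similar)


print(describe(HISTORY))                          # report on the past
print(predict(HISTORY, school=True, snowy=True))  # estimate for tomorrow
```

The first function only summarizes what already happened. The second uses the same records to estimate a likelihood for conditions that have not occurred yet.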

Receiving notification of an expected flight delay while waiting in the airport is now based on more than human prediction: Human expertise is paired with a machine system comparing current and past flights and referencing other systems that predict the weather or other influences. Travel-booking tools like Google Flights will even predict how likely a delay is on a flight you are considering purchasing so you can plan transfers accordingly. Machines are constantly analyzing and learning to provide up-to-date estimates.

Machines' ability to predict the future goes beyond simple statistics, however.

'Narrow' AI

Narrow AI refers to a more practical artificial intelligence than that of science fiction. Narrow AIs are focused on a particular area of expertise or limited scope. A well-publicized example is a purpose-specific AI that helps people schedule appointments. Such AI is sometimes better characterized as Intelligence Automation: decision support or decision automation to help humans. Google Assistant and other such services may appear to be wide AIs but are actually collections of many narrow AIs (scheduling, music playback, reading from Wikipedia, etc.) and are not a generalized intelligence that is self-aware or can truly teach itself. Other terms for narrow AI are 'artificial narrow intelligence' or 'weak AI.'

Most technologies that seem to be 'intelligent' are combinations of expert systems, basic analytics, and a field called machine learning.

Machine Learning

Machine learning is a collection of approaches that use statistical analyses to infer the probability of future events or possible trends in a dataset. Machine learning (often referred to as ML) powers effective predictive analytics and many other modern technologies, including narrow AI. ML sets itself apart from analytics by focusing on configuring computer systems to 'learn' without being specifically guided. By updating their statistical models and algorithms using a sample of data, ML systems aim to perform a set of functions with increasing accuracy and efficiency.

There are two main types of machine learning: supervised learning, suitable for specific tasks we already know of, and unsupervised learning, which is good for discovering things we didn't already know. They are also often combined in semi-supervised learning strategies that leverage multiple methods at once. Let's explore each and the most common kinds of analysis underneath them.

Supervised Machine Learning

Supervised machine learning works with pre-categorized (labeled, or 'structured') data; it is task-driven, and the objective is prediction. It's 'supervised' by data scientists who use computational thinking to provide carefully selected input data, monitor the output, and adjust the algorithms and models the machine uses.

In supervised machine learning, algorithm refinement is guided by feedback from human beings, so that the system gets better at spotting known patterns. For example, a human might train a software system to better recognize faces in photographs or follow a “textbook” diagnostic process in medicine.

When you have to complete a 'captcha' form to log in to a website, you're not just proving you're a human; you're training a machine to spot patterns as you do. Many years ago, those captchas focused on deciphering text so that machines could convert printed materials into digital versions (which they could later learn from). Now, such captchas focus on identifying cars, trucks, pedestrians, buses, and street features to train autonomous vehicles to understand the world they are navigating.


Classification refers to picking a category from an existing set of options. For example, a machine might look at a picture of socks and divide them by color.
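As a rough sketch of classification, the example below assigns a color label to a sock by finding the closest labeled example, a technique known as 1-nearest-neighbor. The technique is chosen here for brevity and is not named in this guide; the RGB values, labels, and the names LABELED_SOCKS and classify are all assumptions made for illustration.

```python
import math

# Labeled training examples: (red, green, blue) -> color name (made-up values).
LABELED_SOCKS = [
    ((200, 30, 40), "red"),
    ((220, 60, 50), "red"),
    ((30, 40, 190), "blue"),
    ((50, 70, 220), "blue"),
    ((240, 240, 235), "white"),
]


def classify(rgb):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    _, label = min(LABELED_SOCKS, key=lambda example: math.dist(example[0], rgb))
    return label


print(classify((210, 45, 45)))
print(classify((40, 50, 200)))
```

The output is always one of the labels already present in the training examples, which is what makes this classification and not regression.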


Regression analysis refers to assigning a numeric value to a given input, such as measuring the length of a sock.
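A minimal sketch of regression, assuming we want to estimate a sock's length from a shoe size. The data points are invented, and fit_line is a hypothetical helper that fits a straight line by ordinary least squares, one of the simplest regression methods.

```python
# Made-up training data: shoe size (EU) -> sock length in centimeters.
SIZES = [36, 38, 40, 42, 44]
LENGTHS_CM = [22.0, 23.5, 24.5, 26.0, 27.0]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(SIZES, LENGTHS_CM)
# Unlike classification, the output is a number on a continuous scale.
print(slope * 41 + intercept)
```

Here the answer can be any number, including values that never appeared in the training data.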

Unsupervised Machine Learning

Unsupervised learning is driven by the given dataset rather than a set task. Unsupervised learning works on 'unstructured' (not yet categorized) data. Unsupervised learning aims to identify patterns (or 'structures') which humans can then use to label data appropriately and/or discover things about the data set.

Ultimately, unsupervised ML is especially promising in that it can help humans spot patterns they may not have already seen, and which aren't based on ‘expert system’ decision trees. This is the machine version of the 'intuitive' element of a doctor’s diagnostic process, which associates symptoms and causes freely, in contrast to systematically diagnosing known issues 'by the textbook.'

IBM estimates that 80%-90% of the data companies have is unstructured. Unsupervised machine learning takes unstructured data (data with no taxonomy labels yet applied) and searches for patterns and trends. The goal is to cluster together data points that have some correlation to each other.

Unsupervised machine learning is used for:

  • Categorizing and correlating news
  • Recognizing similar images
  • Detecting anomalies or outliers
  • Identifying common attributes of users
  • Helping recommend things


Clustering works by dividing a dataset into similar groupings, like stacks of clothes that seem similar at first glance.
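One widely used clustering method is k-means. The sketch below is a stripped-down, one-dimensional version that groups garments by weight. The weights, the starting guesses for the centers, and the name k_means_1d are assumptions for illustration; no labels are supplied, so the groupings come only from the data.

```python
def k_means_1d(values, centers, rounds=10):
    """Tiny k-means: repeatedly assign each value to its nearest center,
    then move each center to the mean of the values assigned to it."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Keep the old center if no value was assigned to it this round.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up garment weights in grams (socks, shirts, and coats, but unlabeled).
WEIGHTS_G = [50, 55, 60, 400, 420, 900, 950]
centers, groups = k_means_1d(WEIGHTS_G, centers=[0, 500, 1000])
print(centers)
print(groups)
```

A human can then look at each resulting stack and decide what to call it, which is the labeling step that comes after clustering.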


Association works by identifying items that tend to appear together, like which items of clothing you have worn together in the past.
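A simple way to sketch association is to count how often pairs of items show up in the same outfit. The outfits below are invented, and pair_counts and confidence are hypothetical helper names; 'confidence' is used here in the usual association-rule sense of "of the outfits containing one item, what share also contain the other?"

```python
from collections import Counter
from itertools import combinations

# Made-up history of outfits worn in the past.
OUTFITS = [
    {"jeans", "t-shirt", "sneakers"},
    {"jeans", "t-shirt", "jacket"},
    {"suit", "tie", "dress shoes"},
    {"jeans", "sneakers"},
    {"suit", "dress shoes"},
]


def pair_counts(outfits):
    """Count how many outfits contain each pair of items."""
    counts = Counter()
    for outfit in outfits:
        # Sorting gives each pair one canonical order, e.g. ("jeans", "t-shirt").
        for pair in combinations(sorted(outfit), 2):
            counts[pair] += 1
    return counts


def confidence(outfits, item, other):
    """Of the outfits containing `item`, what share also contain `other`?"""
    with_item = [outfit for outfit in outfits if item in outfit]
    if not with_item:
        return 0.0
    return sum(other in outfit for outfit in with_item) / len(with_item)


print(pair_counts(OUTFITS).most_common(3))
print(confidence(OUTFITS, "suit", "dress shoes"))
```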

Dimensionality Reduction

Dimensionality reduction aims to find 'hidden' dependencies and attempt to generalize something from the data. This might be what clothing could be paired to make a good outfit, whether or not you have paired it together in the past, based on commonalities.
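One common dimensionality reduction technique is principal component analysis (PCA). This guide does not name a specific method, so treat the following as one possible illustration. It assumes each garment is described by two made-up scores that tend to move together, and collapses them into a single score along the direction in which the data varies most. The names principal_axis and reduce_to_1d are hypothetical.

```python
import math


def principal_axis(points):
    """Return the mean point and the unit direction of greatest variance
    for 2-D data, using the closed-form angle of a 2x2 covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))


def reduce_to_1d(points):
    """Project each 2-D point onto the principal axis: two numbers become one."""
    (mean_x, mean_y), (dx, dy) = principal_axis(points)
    return [(x - mean_x) * dx + (y - mean_y) * dy for x, y in points]


# Made-up (formality, warmth) scores for four garments.
GARMENTS = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8)]
print(reduce_to_1d(GARMENTS))
```

Garments with similar single scores could be candidates to pair, even if they have never been worn together, because the score captures what the two original measurements have in common.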

Neural Networks & Deep Learning

Neural networks infer meaning by comparing input against layers of prior learning. In our brains, some neurons fire for broad detail like ‘circle’ and ‘bright’ and then refer to other neurons which fire for more detail, like ‘yellow.’ In AI, this occurs through several layers of mathematical equations.
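The layering idea can be sketched with a very small network. In the example below, the weights are chosen by hand purely for illustration. A real neural network would learn its weights from data, and the names step, layer, and tiny_network are hypothetical. The first layer detects two broad features of the input, and the second layer combines them into a more specific answer.

```python
def step(x):
    """A neuron 'fires' (outputs 1) when its weighted input is positive."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of the inputs, then fires or not."""
    return [
        step(sum(w * i for w, i in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def tiny_network(a, b):
    # Layer 1: the first neuron fires for "a OR b", the second for "a AND b".
    hidden = layer([a, b], weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # Layer 2: fires when OR is on but AND is off, i.e. exactly one input is on.
    (output,) = layer(hidden, weights=[[1, -1]], biases=[-0.5])
    return output


for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))
```

Neither layer alone can answer "is exactly one input on?"; the answer emerges only from stacking them, which is the essence of layered learning.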

The phrase "deep learning" refers to neural networks with many layers. How many layers count as 'deep' is not strictly defined, and some experts consider the term jargon in its current usage. Deep learning can be trained with supervised, semi-supervised, or unsupervised methods, tapping complex datasets like the video feeds from autonomous vehicles, which it uses to learn to recognize stop signs or pedestrians.


Unintended consequences of machine learning

An unintended bias issue arose at Goldman Sachs, the financial institution hosting the Apple Card infrastructure. The firm developed an algorithm that screens applicants for creditworthiness. The algorithm, which likely used deep learning, somehow concluded erroneously that women were less creditworthy than men. In one notable case, a couple with completely joint finances received very different offers: the man received a credit limit 20 times higher than his wife's, despite her superior credit score. It was a scandal, and because the two firms were so interdependent, it was perhaps harder to identify the responsible party.

When the algorithm is complex and not visible to others, it’s hard to do the forensic analysis to determine the source(s) of the unintended outcome—was it the accuracy of the data fed into those algorithms? Or is the algorithm itself wrong? With feedback loops between the data and ‘self-teaching’ algorithms, the cause of these kinds of outcomes can be a mystery for end-users—and a deserved black eye for major brands.

You can dive deeper in Causeit's extensive guide to Data Ethics, written in conjunction with leading researchers like Accenture.

Recommendation Engines

Recommendation engines are the collection of algorithms that help users find things like new content, social connections, search results, or products. Because the internet is incredibly complex and constantly evolving, recommendation engines are crucial for navigation. The relevance and accuracy of recommendations depend on the simplicity of source data, the sample size of users, and the quality of algorithms used to recommend content. Generally, heavily labeled data like books are easier to recommend than, for example, Twitter posts whose meaning is dense with slang and acronyms.

Examples of recommendation engines include:

  • Search engines
  • Netflix's movie suggestions
  • Facebook's "people you may know" feature
  • Amazon's product suggestions
  • Clothing store Banana Republic's 'fit estimator'

Basic recommendation engines center on expert systems—when a journalist writes an article, they may also select related news articles, which a machine then retrieves and offers to the user as recommendations. More advanced recommendation engines augment these expert systems with many kinds of machine learning to predict and recommend things to users.
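A machine-learning-flavored recommender can be sketched as "people similar to you also liked." The example below is a minimal illustration under stated assumptions: the users, the items, and the names LIKES, jaccard, and recommend are invented, and Jaccard similarity (shared likes divided by total distinct likes) is just one of many possible similarity measures.

```python
# Made-up record of which items each user liked.
LIKES = {
    "ana": {"space documentary", "robot drama", "alien thriller"},
    "ben": {"space documentary", "robot drama", "desert epic"},
    "cy": {"romantic comedy", "period drama"},
}


def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes):
    """Suggest items liked by similar users that `user` has not liked yet."""
    mine = likes[user]
    scores = {}
    for other, theirs in likes.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Drop items backed only by users with nothing in common; best score first.
    ranked = [item for item, score in scores.items() if score > 0]
    return sorted(ranked, key=lambda item: (-scores[item], item))


print(recommend("ana", LIKES))
```

With so few users the result is trivial, which hints at why sample size matters: the more users and likes the system sees, the more meaningful the similarity scores become.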


Avoiding unintended recommendations—or recommendations that reveal too much

In one apocryphal case, in 2012, a parent found out that their teenaged daughter was pregnant because an e-commerce site mailed an advertisement suggesting baby-related products based on the daughter's search history. The algorithm was correct, but its designers hadn't foreseen that not all recommendations that could be made should be made.

Be careful with recommendation engines.

Correlation ≠ recommendation.

Just because something is correlated doesn't mean we should consider it recommended. For example, Robinhood (an online stock-trading service) will tell you what investors who own a given stock are also invested in. This does not mean that Robinhood recommends investing in those stocks, though—they say exactly that next to the list of related stocks. Distinguishing correlation (or 'related' things) from recommendation, as Robinhood does, sets realistic user expectations and avoids liability.

Business Implications

Augmenting or automating offerings with AI

AI offers many opportunities to improve an organization's offerings. Some are far easier than others, so it's helpful to start with brainstorming and then sort by difficulty or feasibility with a combination of business stakeholders, technologists, users, and ethicists. Several key questions can help you get started:

  • How could machine learning technologies create higher value for our customers?
  • Where could we increase our return on investment?
  • Where could we lower the cost of entry to our offerings?
  • Where could we reduce maintenance costs or effort for ourselves or our customers?
  • How could we make our products more accessible to entry-level users or markets?

Distinguishing Between Reality, Aspiration, and Hype

When thinking about advanced technologies, it's important to keep a keen eye on hype vs. reality. Gartner's 'hype cycle' helps us understand typical phases of discussion regarding advanced technologies. First, we can expect to experience a peak of inflated expectations—where science-fiction-like stories show us new possibilities. Understandably, when those promises are unfulfilled, we experience a trough of disillusionment where we give up on the idea of a new technology solving actual problems in our work lives. Later, we experience a 'slope of enlightenment' as the technology offers more practical applications, showing the initial signs of productive use. Finally, if all goes well, we experience a 'plateau of productivity' where the once-breakthrough technology becomes business as usual.

AI's hype cycle is no exception. We've been hearing about AI's promise for a long time. For the most part, the reality of AI is quite far from the promise.


AI falls into two categories:

  • General AI, which is mostly in the realm of science-fiction for now
  • Narrow AI, which refers to very specific machine 'intelligence' usually used to augment human capacity within a narrow scope

There are two main types of machine learning:

  • Supervised learning, which is good for specific tasks we already know of
  • Unsupervised learning, which is good for discovering things we didn't already know