Rules vs. Learning

For automated decisions, we can use explicit hard-coded rules. Or we can let machines learn the rules from data. What is the difference?

An example for a rule-based decision

We are familiar with rules to make decisions. Consider the example of a traffic light. When the light is red, we know we must stop and wait. When the light turns green, we know we can go. And when we approach a traffic light that is just switching from green to yellow, we need to decide whether to accelerate or brake. That decision could be based on our speed and/or the distance to the traffic light. The following image shows an example of a rule-based decision for this problem:

When a computer has all the required information (status of the traffic light, distance to the traffic light), we could easily automate this simple decision using a program similar to the one in the image above.
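
A minimal sketch of what such a hard-coded rule could look like is shown below. The concrete thresholds (50 km/h, 10 m, 30 m) are made up for illustration and are not meant as real traffic advice.

```javascript
// A hard-coded, rule-based decision for the yellow-light problem.
// The thresholds are illustrative assumptions, not official values.
function yellowLightDecision(speedKmh, distanceM) {
  if (distanceM < 10) {
    return "accelerate"; // too close to stop safely
  } else if (speedKmh > 50 && distanceM < 30) {
    return "accelerate"; // braking distance would exceed the remaining distance
  } else {
    return "brake";      // enough room to stop in front of the light
  }
}

console.log(yellowLightDecision(60, 25)); // "accelerate"
console.log(yellowLightDecision(30, 40)); // "brake"
```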

Some problems are hard for a computer, but easy for humans

The image below shows a cat. We know this immediately when we look at the image. This is because our brains have evolved to recognize important concepts in our environment automatically. Recognizing a tiger standing in front of us was, and still is, crucial for our survival.

An image is a series of bits to a computer

For computers, it's not that easy at all. A computer is a general-purpose machine that has no concept of things like cats or other objects. To a computer, the image below is nothing but a long list of numbers. To make things worse, these numbers aren't decimal but binary numbers, consisting only of 0s and 1s, or bits. Depending on whether it's a color or grayscale image, a chunk of a certain length represents the color of a single pixel.
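
To make this concrete, here is a tiny, made-up example of what an image looks like to a computer. The pixel values are invented; the point is only that the "image" is nothing but a flat list of numbers.

```javascript
// A made-up 4x4 grayscale "image": to the computer it is just 16 numbers
// (0 = black, 255 = white) stored one after another.
const pixels = [
    0,  34,  34,   0,
   34, 255, 255,  34,
   34, 255, 255,  34,
    0,  34,  34,   0
];

// The 2D layout above is only for our eyes; in memory it is a flat list,
// and each value is ultimately just a pattern of 8 bits:
console.log(pixels[5].toString(2).padStart(8, "0")); // "11111111"
```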

So unless we teach our computer how to recognize a cat in a stream of thousands or millions of 0s and 1s, it won't be able to do it. The question then is: Can we teach a computer what a cat looks like in an image using a set of rules, just like we did in the traffic light example above? Or do we need a different approach?

Rules are doomed to fail here

You can probably guess the answer, but let's at least give it a try first: What could a set of rules look like that tells us, in the end, whether an image contains a cat or not? Here is an idea, albeit a naive one: Let's imagine we create a prototypical cat blueprint and compare a new image pixel by pixel with that blueprint to calculate a similarity score. We could then define a similarity threshold to make the decision. This might be a good idea for simple, standardized objects that always appear the same in an image. But cats can appear very differently: sitting, running, jumping, chasing. They can be close to the camera or far away. Cats come in different shapes and colors. The possible combinations are practically endless, and this makes it impossible to come up with a blueprint filter for a cat.
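
As a rough sketch, the naive blueprint idea could be expressed like this. The blueprint image and the threshold of 0.9 are assumptions made up for illustration; both images are flat arrays of pixel values between 0 and 1.

```javascript
// Naive "blueprint" idea: compare an image pixel by pixel with a reference
// image and decide based on a similarity threshold.
function similarity(image, blueprint) {
  let sum = 0;
  for (let i = 0; i < image.length; i++) {
    sum += 1 - Math.abs(image[i] - blueprint[i]); // 1 = identical pixel value
  }
  return sum / image.length; // average similarity between 0 and 1
}

function isCat(image, catBlueprint, threshold = 0.9) {
  return similarity(image, catBlueprint) >= threshold;
}
```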

Before we solve the cat-recognition problem, why not start with something simpler?

Handwritten digits (MNIST)

This problem is famous in machine learning and is used as a benchmark for new machine learning algorithms. The benchmark is based on the so-called MNIST dataset (Modified National Institute of Standards and Technology database), which contains a total of 70,000 images of handwritten digits. Yann LeCun et al. introduced an approach called Convolutional Neural Networks in their paper Gradient-based learning applied to document recognition, which is still the basis of the best-performing neural network architectures for this problem today. State-of-the-art algorithms can correctly recognize ~99.75% of all digits.

A smaller image recognition problem

To simplify the problem of recognizing something in an image, let's downscale it. We now focus on small images of 28 by 28 pixels, in which each pixel can be black (0), white (1), or any shade of gray in between. The task is to classify any such image into one of 10 categories, representing the digits 0 through 9.

But let's go back to the basics and assume we don't have any of that yet. A handwritten digit from the MNIST dataset is represented in the same way as the cat image from above. It consists of a 28 by 28 matrix (= 784 pixels in total), in which each value represents a shade of gray between 0 (black) and 1 (white). To be precise, the 2D matrix is imposed by us for a better visual representation. All the computer sees is a long list of 784 times 8 bits, assuming we use 1 byte to represent a shade of gray. Knowing that one row has 28 pixels allows us to display the image correctly.
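
The following sketch shows how knowing the row length lets us turn the flat list back into a 28 by 28 matrix for display. The all-black dummy digit is made up; a real MNIST digit would contain 784 different gray values.

```javascript
// An MNIST digit arrives as a flat list of 784 gray values (0 = black, 1 = white).
// Knowing that one row has 28 pixels lets us cut the list into 28 rows of 28.
function toMatrix(flatPixels, width = 28) {
  const rows = [];
  for (let i = 0; i < flatPixels.length; i += width) {
    rows.push(flatPixels.slice(i, i + width));
  }
  return rows;
}

const dummyDigit = new Array(784).fill(0);    // a dummy all-black "digit"
const matrix = toMatrix(dummyDigit);
console.log(matrix.length, matrix[0].length); // 28 28
```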

Can rules work here?

Now that we have a simpler and well-defined problem, can we come up with a system of rules that classifies any image from the MNIST dataset into one of the 10 categories and thereby recognizes the digit correctly? You can try this yourself in exercise 3.1 Rules vs. Learning.

Variation is the problem

You might come up with a decent solution to the exercise, but it most likely won't be as good as state-of-the-art algorithms, which reach a precision of 99.75%. And it will be quite an effort to implement. So what makes this problem so difficult?

If all humans wrote the same way and all digits looked the same, the problem would be straightforward to solve. The prototypical blueprint approach suggested above would work perfectly: we could define a filter for every digit from 0 to 9 and apply each filter to a new image. The filter with the highest similarity value wins and becomes our prediction. This is essentially as if we wrote IF...ELSE rules comparing the image pixel by pixel to the digits 0 through 9; a filter is just a better way to imagine this.
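
Expressed as code, this pixel-perfect approach might look like the following sketch. It reuses the similarity() function from the cat example above; the ten templates (one reference image per digit) would have to be defined by hand.

```javascript
// Pixel-perfect template matching: one reference image ("filter") per digit,
// the template with the highest similarity wins.
function classifyDigit(image, templates) {
  let bestDigit = -1;
  let bestScore = -Infinity;
  for (let digit = 0; digit < 10; digit++) {
    const score = similarity(image, templates[digit]); // see the cat sketch above
    if (score > bestScore) {
      bestScore = score;
      bestDigit = digit;
    }
  }
  return bestDigit;
}
```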

However, as we can see from the image below, people write digits very differently, and they don't look the same at all. Just pick one digit and compare its instances: you will find many versions of the same digit. With this high degree of variation, our filter-based approach will probably fail, or at least perform much worse than it would in a pixel-perfect world. Variation is a problem for rule-based systems.

To illustrate the problem with the filter approach, look at the two 3s that are layered on top of each other, one in blue, the other in grayscale. Both clearly represent a 3, but the pixels are at different locations. A naive filter that doesn't account for this variation wouldn't work here.

We need rules that can generalize

To cope with the high variation in handwritten digits, we need less strict rules that generalize better. An example of such a rule, or "fuzzy" filter, is shown in the figure below. For each digit, a pixel in the blue area increases the likelihood of that particular class, while a pixel in the red area decreases it.

Let's look at the filter for the zero. The red area in the middle means that when we find pixels in that area in a new image, it is less likely that the digit in the image is a 0. In contrast, when we find pixels lying on the oval blue shape surrounding this red area, it is more likely a 0. To calculate the score for the class "0", we compare the image pixel by pixel with the fuzzy filter and add or subtract a value depending on whether the pixels fall into the blue or red areas. We calculate the scores for all 10 classes and choose the highest one as the winner.
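
A sketch of this scoring scheme is shown below. Each class has one weight per pixel: positive weights correspond to the blue areas, negative weights to the red ones. The weight values themselves are not defined here; they are exactly what has to be learned.

```javascript
// Score an image against a "fuzzy" filter: positive weights (blue areas)
// raise the score, negative weights (red areas) lower it.
function score(image, weights) {
  let s = 0;
  for (let i = 0; i < image.length; i++) {
    s += image[i] * weights[i];
  }
  return s;
}

// filters: an array of 10 weight arrays, one per digit class.
// The class with the highest score wins.
function classify(image, filters) {
  const scores = filters.map(w => score(image, w));
  return scores.indexOf(Math.max(...scores));
}
```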

Rules learned from data

The fuzzy filter from above is not something we want to hard-code manually. Note that the filter does not only contain pure blue or pure red, but every shade in between, down to no color at all. This means the filter doesn't just assign positive or negative values; it encodes a continuous range of weights. Some pixels are more important than others.

This would be a very cumbersome task if done manually: we'd have to go through all 70,000 images and calculate a precise value between -1 and +1 for each pixel in our filter, while making sure we don't overfit the filter to the last images we considered. This is exactly what a machine learning algorithm does when it learns the rules (here: the filters) to classify handwritten digits, but it can do so much faster and better than humans can. In the following sections, we'll learn how a machine learning algorithm learns from data.
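
As a rough preview, and greatly simplified, learning such weights from data could look like the sketch below: for every labeled example, the weights of the correct class are nudged towards the image and the weights of a wrongly predicted class away from it. This is only an illustrative (perceptron-style) update rule, not the exact algorithm discussed later in the course.

```javascript
// One simplified training step for the fuzzy filters defined above.
// image: flat array of 784 gray values, label: the true digit (0-9),
// filters: array of 10 weight arrays, updated in place.
function trainStep(image, label, filters, learningRate = 0.01) {
  const predicted = classify(image, filters);  // classify() from the sketch above
  if (predicted === label) return;             // nothing to correct
  for (let i = 0; i < image.length; i++) {
    filters[label][i]     += learningRate * image[i]; // strengthen the correct class
    filters[predicted][i] -= learningRate * image[i]; // weaken the wrongly chosen class
  }
}
```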

Slides

You can download the slides as PDF here.

References

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998. Available online.
