3.2 Fruits and Vegetables

In this coding challenge we'll use Google's Teachable Machine to let our web app distinguish between different kinds of fruits that we hold in front of the webcam.

Make a copy of the application template for each of the tasks below and write the necessary JavaScript and HTML code to solve them.

a) Train your first model

Before we write our first training process in JavaScript, let's begin with a simpler approach. Google's Teachable Machine is a tool that lets you train a sophisticated deep learning model right in your browser through an easy-to-use graphical user interface. All you need is 3 different fruits or vegetables.

Teachable Machine uses a technique called transfer learning, which reuses a fully trained deep neural network and only exchanges the final output layer. This requires far less data (images) than training a model from scratch.
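
To get an intuition for what happens under the hood, here is a minimal sketch of the idea in TensorFlow.js. This is not Teachable Machine's actual code: the MobileNet URL and the layer name are taken from the official TensorFlow.js examples, and tf is assumed to be loaded via a script tag.

```js
// Sketch of transfer learning: reuse a pre-trained MobileNet and put a
// small, trainable "head" on top of it (illustrative, not Teachable
// Machine's real implementation).
const NUM_CLASSES = 3; // one output per fruit

async function buildModel() {
  // Load a pre-trained MobileNet (URL from the official TF.js examples).
  const mobilenet = await tf.loadLayersModel(
    'https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_0.25_224/model.json'
  );

  // Cut the network off at an internal activation layer, discarding the
  // original 1000-class output layer.
  const layer = mobilenet.getLayer('conv_pw_13_relu');
  const truncated = tf.model({ inputs: mobilenet.inputs, outputs: layer.output });
  truncated.trainable = false; // keep the pre-trained weights fixed

  // The new output layer is the only part we train, which is why a
  // handful of images per class can be enough.
  const head = tf.sequential({
    layers: [
      tf.layers.flatten({ inputShape: truncated.outputs[0].shape.slice(1) }),
      tf.layers.dense({ units: NUM_CLASSES, activation: 'softmax' }),
    ],
  });

  return { truncated, head };
}
```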

Go to the Teachable Machine website for training an image model and create 3 classes, one for each fruit you want to distinguish. Add some images for each class using your webcam.

  • How many images do you think you should use for training the model?

  • What is the model's prediction when you are holding nothing in front of the camera, or an entirely different object that the model has never seen before?

  • When testing the model, what happens when you change the scene, such as the lighting, the background, or even your clothes?

  • Have a look at the details regarding your model's training progress and performance. What does this information tell you?

Once you're happy with your model's performance, let's move on to the next exercise!

b) Export the model

Playing around with the training and preview of the model is fun. But ultimately, we want to put the model to use in our application. Teachable Machine allows us to export the trained model and use it with the JavaScript machine learning library ml5.js. ml5.js builds on TensorFlow.js, the JavaScript version of the popular TensorFlow machine learning library developed by Google.

Export your model in the TensorFlow.js format and use the option to upload it to Google's cloud hosting. Choose the p5.js version of the generated code snippet and copy it into a new, empty Glitch project! Preview the website and see your model in action.

The JavaScript library p5.js simplifies many common tasks related to drawing and animation. Among other things, it includes a built-in lifecycle for animations with the functions preload(), setup(), and draw().
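
For orientation, the generated snippet roughly follows the structure below. The model URL is a placeholder for your own upload, and details may differ depending on the ml5.js version:

```js
// Rough shape of the code Teachable Machine generates for p5.js.
// Replace the placeholder URL with the one from your own export.
const imageModelURL = 'https://teachablemachine.withgoogle.com/models/YOUR_MODEL_ID/';

let classifier; // the ml5.js image classifier
let video;      // the webcam feed
let label = ''; // the latest prediction

function preload() {
  // Load the trained model before setup() runs.
  classifier = ml5.imageClassifier(imageModelURL + 'model.json');
}

function setup() {
  createCanvas(320, 260);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();    // we draw the video onto the canvas ourselves
  classifyVideo(); // start the classification loop
}

function draw() {
  background(0);
  image(video, 0, 0); // draw the current webcam frame
  fill(255);
  textAlign(CENTER);
  text(label, width / 2, height - 8); // show the latest prediction
}

function classifyVideo() {
  classifier.classify(video, gotResult);
}

function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  label = results[0].label; // results are sorted by confidence
  classifyVideo();          // classify the next frame
}
```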

  • Go through the generated JavaScript code line by line and try to get an understanding of what happens here. Write down any questions and lines that you don't understand.

  • What does the output of a machine learning model look like when we use it in a program?
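
As a hint for the last question: the ml5.js image classifier hands its callback an array of label/confidence pairs, sorted by confidence. A result might look something like this (values purely illustrative):

```js
// Example of the `results` array passed to the callback:
[
  { label: 'Banana', confidence: 0.91 },
  { label: 'Apple',  confidence: 0.06 },
  { label: 'Orange', confidence: 0.03 }
]
```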

c) Migrate the model into your app

Now that you understand the generated code at a basic level, let's see if we can transfer the model into our web application template. Try to decompose the whole process into smaller chunks that you migrate one by one.

  • To separate concerns and improve our code, let's create a separate JavaScript file for the integration of your model.

  • Start by initializing the global variables that we need, such as the model's URL.

  • Now, use the setup() function and try to get the webcam working in our application template. Take a look at the p5.js function parent() to place the video output in a good spot in your HTML document (see the sketch after this list).

  • Go through the rest of the code and decide which steps to migrate next into your new JavaScript file. Make the code simpler when possible and leave out anything you don't need.

  • In the last step, show the output of the model (predicted label and confidence) next to the webcam video and update it in real time.
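
To make these steps more concrete, here is one possible shape for the new file. This is a sketch, not the only correct solution: the element ids 'webcam' and 'prediction' are assumptions and must match containers in your own HTML template.

```js
// model.js -- integrates the Teachable Machine model into the app template.
// The model URL is a placeholder; the ids 'webcam' and 'prediction' are
// assumed to exist in the HTML.
const modelURL = 'https://teachablemachine.withgoogle.com/models/YOUR_MODEL_ID/';

let classifier;
let video;

function preload() {
  classifier = ml5.imageClassifier(modelURL + 'model.json');
}

function setup() {
  noCanvas(); // no p5 canvas needed: we embed the video element directly
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.parent('webcam'); // attach the video to a container in the template
  classifyVideo();
}

function classifyVideo() {
  classifier.classify(video, gotResult);
}

function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  const { label, confidence } = results[0];
  // Show label and confidence next to the video and keep them up to date.
  document.getElementById('prediction').textContent =
    `${label} (${(confidence * 100).toFixed(1)} %)`;
  classifyVideo();
}
```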

d) Connect the LED and display

With the model successfully integrated into our web app template, we can now combine the model's predictions with our connected hardware devices. Let's connect the LED to the model's prediction by lighting up the LED in the fruit's color (banana = yellow, apple = red, etc.). Additionally, output the predicted label and the confidence to the OLED display.
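
One way to structure this is a small lookup table from predicted label to LED color, called from your gotResult() callback. The functions setLedColor() and showOnDisplay() below are hypothetical placeholder names; use whatever functions your application template actually provides for the hardware:

```js
// Map each predicted fruit to an LED color.
const fruitColors = {
  Banana: { r: 255, g: 220, b: 0 }, // yellow
  Apple:  { r: 255, g: 0,   b: 0 }, // red
  Orange: { r: 255, g: 120, b: 0 }, // orange
};

function updateHardware(label, confidence) {
  // Turn the LED off for labels we have no color for.
  const color = fruitColors[label] ?? { r: 0, g: 0, b: 0 };
  setLedColor(color.r, color.g, color.b);                       // hypothetical helper
  showOnDisplay(`${label} ${(confidence * 100).toFixed(0)} %`); // hypothetical helper
}
```

A call to updateHardware(label, confidence) at the end of gotResult() keeps the LED and the display in sync with the on-screen prediction.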
