TinyML Tutorial

A practical tutorial for running machine learning models on microcontrollers and edge devices. Covers data collection, model training, quantization, TensorFlow Lite Micro deployment, and on-device inference.

Tutorial · Difficulty: Intermediate · 12 chapters · Updated May 10, 2026

About this tutorial

Run trained neural networks on hardware that costs less than a coffee.

Who This Is For

  • Python developers who know basic ML and want to deploy models to microcontrollers
  • Embedded engineers curious about adding inference to their devices
  • Hobbyists with an Arduino or ESP32 and some interest in AI

Contents

Fundamentals

  1. Introduction: What TinyML is, the hardware landscape, and your first inference
  2. ML Foundations: The ML pipeline from data to prediction, mapped to constrained hardware
  3. Development Environment: Setting up Python, TensorFlow, Arduino IDE, and TFLite Micro

Core Concepts

  4. Data Collection: Capturing sensor data from the device, labeling, and building datasets
  5. Model Training: Designing and training small models in Keras for edge targets
  6. Model Conversion: Quantization, TFLite conversion, and generating C arrays

Deployment

  7. TFLite Micro Deployment: Flashing a model to Arduino or ESP32 and running inference
  8. The Inference Engine: TFLite Micro API, memory allocation, and the interpreter loop

Advanced

  9. Sensor Integration: Reading accelerometer, microphone, and camera data for real inference
  10. Edge Impulse Workflow: Using Edge Impulse to collect, train, and deploy without writing all the glue
  11. Optimization: Pruning, quantization-aware training, and squeezing latency

Mastery

  12. Best Practices: Patterns, pitfalls, and what to do when the model lies

How to Use This Tutorial

  1. Read sequentially for the full arc from training to deployment
  2. Type the code. Don't copy-paste; the muscle memory matters for debugging
  3. Have hardware nearby. The Arduino Nano 33 BLE Sense Rev2 is the reference board; ESP32-S3 notes are included in the inference and sensor chapters

Quick Reference

Essential Commands

# Install Python dependencies
pip install tensorflow tflite-model-maker numpy

# Convert a Keras model to TFLite
python3 convert.py

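# Generate a C array from a .tflite file
# (xxd names the array model_tflite after the input file; rename it to g_model_data to match the sketch below)
xxd -i model.tflite > model_data.cc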

# Flash to Arduino (via arduino-cli)
arduino-cli compile --fqbn arduino:mbed_nano:nano33ble sketch/
arduino-cli upload  --fqbn arduino:mbed_nano:nano33ble -p /dev/ttyACM0 sketch/

# Monitor serial output
arduino-cli monitor -p /dev/ttyACM0 --config baudrate=115200
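
The `xxd -i` command above names the generated array after the input file (`model_tflite`), while the inference sketch expects `g_model_data`. The following is a minimal Python sketch of the same conversion that lets you pick the name. It assumes a hand-written `model_data.h` that declares `extern const unsigned char g_model_data[];` and `extern const unsigned int g_model_data_len;`. The helper name `tflite_to_c_array` and the `alignas(16)` qualifier are choices made for this illustration, not part of the tutorial's toolchain.

```python
from pathlib import Path


def tflite_to_c_array(model_bytes: bytes, var_name: str = "g_model_data") -> str:
    """Return C++ source that defines var_name and var_name_len from raw bytes."""
    if not model_bytes:
        raise ValueError("model file is empty")
    rows = []
    # 12 bytes per row, the same layout that xxd -i produces.
    for offset in range(0, len(model_bytes), 12):
        chunk = model_bytes[offset:offset + 12]
        rows.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(rows)
    return (
        f"alignas(16) const unsigned char {var_name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {var_name}_len = {len(model_bytes)};\n"
    )


if __name__ == "__main__":
    source = Path("model.tflite")
    if source.exists():
        # Including the header first gives the const array external linkage.
        Path("model_data.cc").write_text(
            '#include "model_data.h"\n\n' + tflite_to_c_array(source.read_bytes())
        )
```

Run it from the directory that holds `model.tflite`; it writes `model_data.cc` next to it and does nothing if the model file is missing.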

Minimal Inference Sketch

#include <TensorFlowLite.h>
#include "model_data.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

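// Scratch memory for tensors; increase it if AllocateTensors() fails.
constexpr int kTensorArenaSize = 8 * 1024;
alignas(16) uint8_t tensor_arena[kTensorArenaSize];

void setup() {
  Serial.begin(115200);
  while (!Serial) {}  // wait for the USB serial port

  const tflite::Model* model = tflite::GetModel(g_model_data);
  tflite::AllOpsResolver resolver;
  tflite::MicroInterpreter interpreter(
      model, resolver, tensor_arena, kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) {
    Serial.println("AllocateTensors() failed");
    return;
  }

  // Assumes a float32 model; an int8 model uses data.int8 instead.
  TfLiteTensor* input = interpreter.input(0);
  input->data.f[0] = 0.5f;  // your feature value

  if (interpreter.Invoke() != kTfLiteOk) {
    Serial.println("Invoke() failed");
    return;
  }

  TfLiteTensor* output = interpreter.output(0);
  float score = output->data.f[0];
  Serial.println(score);
}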

void loop() {}

Common Patterns

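# Quantize a model to int8
import tensorflow as tf

# `model` is a trained Keras model; `representative_data_gen` yields sample inputs for calibration
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type  = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Save the converted model for the xxd step
with open("model.tflite", "wb") as f:
    f.write(tflite_model)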
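
Because the settings above make the model's input and output tensors int8, the calling code has to convert float features to int8 before inference and convert the int8 result back afterwards. The sketch below shows that affine mapping in plain Python. The `scale` and `zero_point` values are hypothetical; the real ones are stored with each tensor (for example `input->params.scale` and `input->params.zero_point` in TFLite Micro). The function names are illustrative only.

```python
def quantize(value: float, scale: float, zero_point: int) -> int:
    """Map a float to int8: q = round(value / scale) + zero_point, then clamp."""
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))  # keep the result inside the int8 range


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map an int8 value back to a float: value = (q - zero_point) * scale."""
    return (q - zero_point) * scale


# Hypothetical parameters for illustration; read the real ones from the model.
scale, zero_point = 0.5, -3

q = quantize(10.0, scale, zero_point)        # int8 value fed to the model
restored = dequantize(q, scale, zero_point)  # float recovered from an int8 value
```

Values outside the representable range saturate at -128 or 127, which is one reason the representative dataset should cover the range of inputs the device will actually see.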

Learning Path Suggestions

ML practitioner moving to edge (3-4 hours)

  1. Skim chapters 1-2 (you know the ML part)
  2. Read chapters 3, 6, 7, 8 carefully
  3. Work through the sensor project in chapter 9
  4. Build a gesture classifier or keyword spotter

Embedded developer learning ML (6-8 hours)

  1. Read chapters 1-5 carefully
  2. Work through chapters 6-9 hands-on
  3. Try the Edge Impulse workflow in chapter 10 as a shortcut
  4. Return to chapter 11 to optimize your first real project

Why TinyML?

  • No connectivity required: inference happens on the device, not in the cloud
  • Low latency: no round-trip; response in milliseconds
  • Privacy: sensor data never leaves the hardware
  • Cost: a $4 microcontroller replaces a server for many classification tasks
  • Power: models running on a Cortex-M4 use milliwatts, not watts

Additional Resources

Hardware Version Note

This tutorial targets TensorFlow Lite Micro as of 2026 and the Arduino Nano 33 BLE Sense Rev2 as the primary reference board. ESP32-S3 notes are included for the inference and sensor chapters. The Edge Impulse chapter covers the web interface as of early 2026.

A note on scope: this tutorial stops at the edge of what a single microcontroller can do without an accelerator. Boards with dedicated neural accelerators (like the Arduino Nicla Voice or the Coral Dev Board Micro) follow the same workflow but need additional SDK steps that are not covered here.