TinyML Tutorial

Run trained neural networks on hardware that costs less than a coffee.

Who This Is For

Python developers who know basic ML and want to deploy models to microcontrollers
Embedded engineers curious about adding inference to their devices
Hobbyists with an Arduino or ESP32 and some interest in AI

Fundamentals

Introduction: What TinyML is, the hardware landscape, and your first inference
ML Foundations: The ML pipeline from data to prediction, mapped to constrained hardware
Development Environment: Setting up Python, TensorFlow, Arduino IDE, and TFLite Micro

Core Concepts

Data Collection: Capturing sensor data from the device, labeling, and building datasets
Model Training: Designing and training small models in Keras for edge targets
Model Conversion: Quantization, TFLite conversion, and generating C arrays

Deployment

TFLite Micro Deployment: Flashing a model to Arduino or ESP32 and running inference
The Inference Engine: TFLite Micro API, memory allocation, and the interpreter loop

Advanced

Sensor Integration: Reading accelerometer, microphone, and camera data for real inference
Edge Impulse Workflow: Using Edge Impulse to collect, train, and deploy without writing all the glue
Optimization: Pruning, quantization-aware training, and squeezing latency

Mastery

Best Practices: Patterns, pitfalls, and what to do when the model lies

How to Use This Tutorial

Read sequentially for the full arc from training to deployment
Type the code. Don't copy-paste; the muscle memory matters for debugging
Have hardware nearby. An Arduino Nano 33 BLE Sense Rev2 is the reference board, and ESP32-S3 notes are included in the inference and sensor chapters

Quick Reference

Essential Commands

# Install Python dependencies
pip install tensorflow tflite-model-maker numpy

# Convert a Keras model to TFLite
python3 convert.py

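# Generate a C array from a .tflite file
# (xxd names the array model_tflite; rename it to g_model_data and
#  declare it in model_data.h to match the inference sketch)
xxd -i model.tflite > model_data.cc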

# Compile and flash to Arduino (via arduino-cli); adjust the port for your system
arduino-cli compile --fqbn arduino:mbed_nano:nano33ble sketch/
arduino-cli upload  --fqbn arduino:mbed_nano:nano33ble -p /dev/ttyACM0 sketch/

# Monitor serial output
arduino-cli monitor -p /dev/ttyACM0 --config baudrate=115200
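
The `xxd -i` step names the array after the input file (`model_tflite`), while the inference sketch on this page expects `g_model_data`. A minimal Python sketch of the same conversion, assuming you would rather choose the symbol name yourself, might look like the following. The function name `tflite_to_c_array` and the `g_model_data_len` symbol are illustrative choices, not part of any TensorFlow tooling, and the `model_data.h` include is assumed from the sketch's own `#include`.

```python
def tflite_to_c_array(data: bytes, name: str = "g_model_data") -> str:
    """Render raw model bytes as C++ source, similar in spirit to `xxd -i`."""
    lines = []
    # Emit 12 bytes per line, each as a two-digit hex literal.
    for offset in range(0, len(data), 12):
        chunk = data[offset:offset + 12]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        # Including the header gives the const array external linkage in C++.
        '#include "model_data.h"\n\n'
        f"alignas(16) const unsigned char {name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )
```

To use it, read `model.tflite` in binary mode, pass the bytes to `tflite_to_c_array`, and write the returned string to `model_data.cc`. The header would then need a matching `extern const unsigned char g_model_data[];` declaration.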

Minimal Inference Sketch

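#include <TensorFlowLite.h>
#include "model_data.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Scratch memory for all tensors; increase it if AllocateTensors() fails.
constexpr int kTensorArenaSize = 8 * 1024;
alignas(16) uint8_t tensor_arena[kTensorArenaSize];

void setup() {
  Serial.begin(115200);
  while (!Serial) {}  // wait for the serial monitor to attach

  const tflite::Model* model = tflite::GetModel(g_model_data);
  tflite::AllOpsResolver resolver;  // links every op; fine for a first test
  tflite::MicroInterpreter interpreter(
      model, resolver, tensor_arena, kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) {
    Serial.println("AllocateTensors() failed");
    return;
  }

  // data.f is only valid for a float32 model; an int8-quantized model
  // exposes data.int8 instead.
  TfLiteTensor* input = interpreter.input(0);
  input->data.f[0] = 0.5f;  // your feature value

  if (interpreter.Invoke() != kTfLiteOk) {
    Serial.println("Invoke() failed");
    return;
  }

  TfLiteTensor* output = interpreter.output(0);
  float score = output->data.f[0];
  Serial.println(score);
}

void loop() {}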

Common Patterns

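# Quantize a model to int8
# Assumes `model` is a trained Keras model and `representative_data_gen`
# is a generator yielding sample inputs as lists of float32 arrays.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type  = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Save the result so it can be turned into a C array
with open("model.tflite", "wb") as f:
    f.write(tflite_model)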
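
Full-integer quantization maps each float value to an 8-bit integer through a scale and a zero point, following real ≈ scale × (q − zero_point). The plain-Python sketch below illustrates that arithmetic with assumed example values for `scale` and `zero_point`; for a real model these are chosen by the converter and stored with each tensor. The helper names are hypothetical, and the handling of exact rounding ties may differ from the runtime.

```python
def quantize(value: float, scale: float, zero_point: int) -> int:
    """Map a float to int8 using q = round(value / scale) + zero_point."""
    q = round(value / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover an approximate float using real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Assumed example parameters, not taken from any particular model.
scale, zero_point = 0.5, -3

q = quantize(2.0, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q, restored)  # prints: 1 2.0
```

On the device, an int8 model expects the same mapping to be applied before writing to the input tensor and reversed after reading the output; in TFLite Micro the per-tensor values are typically read from the tensor's `params.scale` and `params.zero_point` fields.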

Learning Path Suggestions

ML practitioner moving to edge (3-4 hours)

Skim chapters 1-2 (you know the ML part)
Read chapters 3, 6, 7, 8 carefully
Work through the sensor project in chapter 9
Build a gesture classifier or keyword spotter

Embedded developer learning ML (6-8 hours)

Read chapters 1-5 carefully
Work through chapters 6-9 hands-on
Try the Edge Impulse workflow in chapter 10 as a shortcut
Return to chapter 11 to optimize your first real project

Why TinyML?

No connectivity required: inference happens on the device, not in the cloud
Low latency: no round-trip; response in milliseconds
Privacy: sensor data never leaves the hardware
Cost: a $4 microcontroller replaces a server for many classification tasks
Power: models running on a Cortex-M4 use milliwatts, not watts

Hardware Version Note

This tutorial targets TensorFlow Lite Micro as of 2026 and the Arduino Nano 33 BLE Sense Rev2 as the primary reference board. ESP32-S3 notes are included for the inference and sensor chapters. The Edge Impulse chapter covers the web interface as of early 2026.

A note on scope: this tutorial stops at the edge of what a single microcontroller can do without an accelerator. Boards with dedicated neural accelerators (like the Arduino Nicla Voice or the Coral Dev Board Micro) follow the same workflow but need additional SDK steps that are not covered here.
