TinyML Tutorial
A practical tutorial for running machine learning models on microcontrollers and edge devices. Covers data collection, model training, quantization, TensorFlow Lite Micro deployment, and on-device inference.
About this tutorial
Run trained neural networks on hardware that costs less than a coffee.
Who This Is For
- Python developers who know basic ML and want to deploy models to microcontrollers
- Embedded engineers curious about adding inference to their devices
- Hobbyists with an Arduino or ESP32 and some interest in AI
Contents
Fundamentals
- Introduction: What TinyML is, the hardware landscape, and your first inference
- ML Foundations: The ML pipeline from data to prediction, mapped to constrained hardware
- Development Environment: Setting up Python, TensorFlow, Arduino IDE, and TFLite Micro
Core Concepts
- Data Collection: Capturing sensor data from the device, labeling, and building datasets
- Model Training: Designing and training small models in Keras for edge targets
- Model Conversion: Quantization, TFLite conversion, and generating C arrays
Deployment
- TFLite Micro Deployment: Flashing a model to Arduino or ESP32 and running inference
- The Inference Engine: TFLite Micro API, memory allocation, and the interpreter loop
Advanced
- Sensor Integration: Reading accelerometer, microphone, and camera data for real inference
- Edge Impulse Workflow: Using Edge Impulse to collect, train, and deploy without writing all the glue code yourself
- Optimization: Pruning, quantization-aware training, and squeezing latency
Mastery
- Best Practices: Patterns, pitfalls, and what to do when the model lies
How to Use This Tutorial
- Read sequentially for the full arc from training to deployment
- Type the code. Don't copy-paste; the muscle memory matters for debugging
- Have hardware nearby. An Arduino Nano 33 BLE Sense Rev2 is the reference board; ESP32-S3 notes are included in the inference and sensor chapters
Quick Reference
Essential Commands
# Install Python dependencies
pip install tensorflow numpy
# Convert a Keras model to TFLite
python3 convert.py
# Generate a C array from a .tflite file
# (xxd names the array model_tflite; rename it to g_model_data to match the sketch)
xxd -i model.tflite > model_data.cc
# Flash to Arduino (via arduino-cli)
arduino-cli compile --fqbn arduino:mbed_nano:nano33ble sketch/
arduino-cli upload --fqbn arduino:mbed_nano:nano33ble -p /dev/ttyACM0 sketch/
# Monitor serial output
arduino-cli monitor -p /dev/ttyACM0 --config baudrate=115200
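Where `xxd` is not available (for example on a stock Windows setup), the C array step can be approximated in Python. The following is a minimal sketch, not the tutorial's own tooling: the function name `tflite_to_c_array` is hypothetical, the default symbol `g_model_data` is chosen only to match the inference sketch, and the `alignas(16)` qualifier is an assumption about what the target build prefers.

```python
def tflite_to_c_array(data: bytes, var_name: str = "g_model_data",
                      per_line: int = 12) -> str:
    """Render raw model bytes as a C/C++ array definition, similar in spirit to `xxd -i`."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a two-digit hex literal; a trailing comma is legal in C initializers
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"alignas(16) const unsigned char {var_name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {var_name}_len = {len(data)};\n"
    )


# Stand-in bytes for illustration only; in practice, read the real file:
#   data = open("model.tflite", "rb").read()
demo = bytes([0x1C, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4C, 0x33])
print(tflite_to_c_array(demo))
```

Writing the returned string to `model_data.cc` and declaring the same two symbols as `extern` in `model_data.h` gives the sketch a name it can include without hand-editing the `xxd` output.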
Minimal Inference Sketch
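#include <TensorFlowLite.h>
#include "model_data.h"  // declares g_model_data, the C array holding the model
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/all_ops_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Scratch memory for input, output, and intermediate tensors.
// 8 KB suits a very small model; raise it if AllocateTensors() fails.
constexpr int kTensorArenaSize = 8 * 1024;
alignas(16) uint8_t tensor_arena[kTensorArenaSize];

void setup() {
  Serial.begin(115200);
  while (!Serial) {}  // wait for the serial monitor to connect

  const tflite::Model* model = tflite::GetModel(g_model_data);
  // AllOpsResolver links every op; a resolver listing only the ops
  // the model uses takes less flash.
  tflite::AllOpsResolver resolver;
  tflite::MicroInterpreter interpreter(
      model, resolver, tensor_arena, kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) {
    Serial.println("AllocateTensors failed (arena too small?)");
    return;
  }

  // Assumes a float32 model; an int8 model uses data.int8 instead of data.f.
  TfLiteTensor* input = interpreter.input(0);
  input->data.f[0] = 0.5f;  // your feature value
  if (interpreter.Invoke() != kTfLiteOk) {
    Serial.println("Invoke failed");
    return;
  }

  TfLiteTensor* output = interpreter.output(0);
  float score = output->data.f[0];
  Serial.println(score);
}

void loop() {}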
Common Patterns
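# Quantize a model to int8 (full-integer quantization)
import tensorflow as tf

# `model` is a trained Keras model. `representative_data_gen` is a generator
# you define that yields calibration samples, e.g. `yield [x.astype("float32")]`.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Save the flatbuffer so it can be turned into a C array
with open("model.tflite", "wb") as f:
    f.write(tflite_model)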
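With `inference_input_type` and `inference_output_type` set to `tf.int8`, the device has to feed int8 values and interpret int8 results. The mapping is the usual affine scheme, sketched below in plain Python. The function names and the `scale` and `zero_point` values are hypothetical and for illustration only. Real parameters are stored in the converted model, and on the device they should be readable from the tensor (for example `input->params.scale` and `input->params.zero_point` in TFLite Micro). Python's `round` may also differ from the device's rounding at exact .5 ties.

```python
def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to int8: q = round(x / scale) + zero_point, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover an approximate float: x = (q - zero_point) * scale."""
    return (q - zero_point) * scale


# Hypothetical parameters; real values come from the converted model
scale, zero_point = 0.5, -3

q = quantize(2.0, scale, zero_point)    # round(4.0) + (-3) = 1
x = dequantize(q, scale, zero_point)    # (1 - (-3)) * 0.5 = 2.0
print(q, x)
```

The same two formulas apply on the device: quantize each feature before writing it to the input tensor, and dequantize the output before comparing it against a threshold.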
Learning Path Suggestions
ML practitioner moving to edge (3-4 hours)
- Skim chapters 1-2 (you know the ML part)
- Read chapters 3, 6, 7, 8 carefully
- Work through the sensor project in chapter 9
- Build a gesture classifier or keyword spotter
Embedded developer learning ML (6-8 hours)
- Read chapters 1-5 carefully
- Work through chapters 6-9 hands-on
- Try the Edge Impulse workflow in chapter 10 as a shortcut
- Continue to chapter 11 to optimize your first real project
Why TinyML?
- No connectivity required: inference happens on the device, not in the cloud
- Low latency: no network round-trip; a small model typically responds in milliseconds
- Privacy: sensor data never leaves the hardware
- Cost: a $4 microcontroller replaces a server for many classification tasks
- Power: inference on a Cortex-M4 typically draws milliwatts, not watts
Additional Resources
- TensorFlow Lite Micro docs
- TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers (Warden & Situnayake)
- Edge Impulse Studio
- Arduino Nano 33 BLE Sense overview
- Pete Warden's TinyML blog
Hardware Version Note
This tutorial targets TensorFlow Lite Micro as of 2026 and the Arduino Nano 33 BLE Sense Rev2 as the primary reference board. ESP32-S3 notes are included for the inference and sensor chapters. The Edge Impulse chapter covers the web interface as of early 2026.
A note on scope: this tutorial stops at the edge of what a single microcontroller can do without an accelerator. Boards with dedicated neural accelerators (like the Arduino Nicla Voice or the Coral Dev Board Micro) follow the same workflow but need additional SDK steps not covered here.