
MarJazz — Jazz Piano Generation

Project Overview

MarJazz is a deep learning model for autoregressive MIDI music generation, trained on solo jazz piano performances from the PiJAMA dataset. The model uses a dual-block LSTM architecture with REMI tokenisation to generate coherent jazz piano pieces of roughly 60 seconds.


Architecture

Token IDs → Embedding (128-dim)
    ↓
LSTM Block 1 → Dropout → LayerNorm   (pitch contour, rhythm)
    ↓
LSTM Block 2 → Dropout → LayerNorm   (phrases, harmony)
    ↓
MLP Head (GELU) → Dropout → LayerNorm
    ↓
Linear → Logits (vocab_size)
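
The diagram above can be read as the following PyTorch module. This is a minimal sketch, not the project's actual code: the class names `LSTMBlock` and `DualBlockLSTM`, the dropout rate of 0.2, the width of the MLP head, and the vocabulary size in the usage lines are assumptions made for illustration. The embedding dimension, hidden size, and layer count follow the hyperparameter table.

```python
import torch
import torch.nn as nn


class LSTMBlock(nn.Module):
    """One block from the diagram: LSTM -> Dropout -> LayerNorm."""

    def __init__(self, input_size: int, hidden_size: int,
                 num_layers: int, dropout: float) -> None:
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, x, state=None):
        out, state = self.lstm(x, state)
        return self.norm(self.dropout(out)), state


class DualBlockLSTM(nn.Module):
    """Embedding -> two LSTM blocks -> MLP head -> logits over the vocabulary."""

    def __init__(self, vocab_size: int, embed_dim: int = 128,
                 hidden_size: int = 256, num_layers: int = 2,
                 dropout: float = 0.2) -> None:  # dropout rate is an assumption
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.block1 = LSTMBlock(embed_dim, hidden_size, num_layers, dropout)
        self.block2 = LSTMBlock(hidden_size, hidden_size, num_layers, dropout)
        # MLP head width is assumed to equal the LSTM hidden size.
        self.head = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.LayerNorm(hidden_size),
        )
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, token_ids, states=None):
        # states carries the (h, c) pair of each block between generation steps.
        s1, s2 = states if states is not None else (None, None)
        x = self.embedding(token_ids)
        x, s1 = self.block1(x, s1)
        x, s2 = self.block2(x, s2)
        return self.out(self.head(x)), (s1, s2)


# Usage: a batch of 2 sequences, 16 tokens each (vocab size 300 is hypothetical).
model = DualBlockLSTM(vocab_size=300).eval()
logits, states = model(torch.randint(0, 300, (2, 16)))
```

Returning the recurrent state alongside the logits lets generation feed one token at a time instead of re-running the whole context at every step.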

Tokenisation: REMI (REvamped MIDI-derived) — encodes music as Bar, Position, Pitch, Velocity, and Duration tokens, preserving bar-level structure essential for jazz phrasing.
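
To make the token layout concrete, here is a toy encoder that turns a few notes into a REMI-style sequence. It is a simplified illustration written from scratch, not MidiTok's implementation: the function name `toy_remi_encode`, the 16 positions per bar, the 4/4 bar of 1920 ticks, and the unquantised velocity values are all assumptions.

```python
from typing import List, Tuple

# (start_tick, duration_ticks, pitch, velocity)
Note = Tuple[int, int, int, int]


def toy_remi_encode(notes: List[Note], ticks_per_bar: int = 1920,
                    positions_per_bar: int = 16) -> List[str]:
    """Encode notes as Bar / Position / Pitch / Velocity / Duration tokens."""
    ticks_per_pos = ticks_per_bar // positions_per_bar
    tokens: List[str] = []
    current_bar = -1
    for start, dur, pitch, vel in sorted(notes):
        bar, offset = divmod(start, ticks_per_bar)
        # Emit one Bar token per bar boundary crossed, including empty bars.
        while current_bar < bar:
            tokens.append("Bar")
            current_bar += 1
        tokens.append(f"Position_{offset // ticks_per_pos}")
        tokens.append(f"Pitch_{pitch}")
        tokens.append(f"Velocity_{vel}")
        # Duration is measured in grid positions, with a minimum of one.
        tokens.append(f"Duration_{max(1, round(dur / ticks_per_pos))}")
    return tokens


notes = [(0, 480, 60, 80), (960, 240, 64, 72), (1920, 480, 67, 90)]
tokens = toy_remi_encode(notes)
# ['Bar', 'Position_0', 'Pitch_60', 'Velocity_80', 'Duration_4',
#  'Position_8', 'Pitch_64', 'Velocity_72', 'Duration_2',
#  'Bar', 'Position_0', 'Pitch_67', 'Velocity_90', 'Duration_4']
```

Because every bar line is an explicit token, the model can learn where phrases sit relative to the bar rather than inferring it from accumulated time shifts.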

Generation: Autoregressive sampling with temperature scaling, top-k filtering, and a repetition penalty to prevent motif loops.
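
One sampling step combining the three techniques might look like the sketch below. It uses only the standard library so the arithmetic stays visible; a real implementation would presumably operate on PyTorch tensors. The function name `sample_next_token` is hypothetical, and the repetition penalty shown (divide positive logits, multiply negative ones, for tokens already generated) is one common formulation assumed here, since the exact variant is not specified above. Default values come from the hyperparameter table.

```python
import math
import random
from typing import List, Optional, Sequence


def sample_next_token(logits: Sequence[float], history: Sequence[int],
                      temperature: float = 0.85, top_k: int = 30,
                      rep_penalty: float = 1.5,
                      rng: Optional[random.Random] = None) -> int:
    """Pick the next token id from raw logits."""
    chooser = rng if rng is not None else random
    scores: List[float] = list(logits)

    # Repetition penalty (assumed formulation): lower the score of every
    # token that already appears in the generated history.
    for tok in set(history):
        if scores[tok] > 0:
            scores[tok] /= rep_penalty
        else:
            scores[tok] *= rep_penalty

    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scores = [s / temperature for s in scores]

    # Top-k: keep only the k highest-scoring token ids.
    k = min(top_k, len(scores))
    top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]

    # Softmax weights over the survivors (max subtracted for stability).
    peak = scores[top[0]]
    weights = [math.exp(scores[i] - peak) for i in top]
    return chooser.choices(top, weights=weights, k=1)[0]


# Usage: token 3 was just played, so the penalty lets token 4 overtake it.
logits = [0.0] * 10
logits[3], logits[4] = 10.0, 8.0
next_token = sample_next_token(logits, history=[3], top_k=1)
```

In a full generation loop this function would be called once per step, with the chosen token appended to `history` and fed back into the model until the piece reaches the target length.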


Key Hyperparameters

Parameter        Value   Description
Embedding dim    128     Token embedding dimension
Hidden size      256     LSTM hidden state size
LSTM layers      2       Stacked layers per block
Context window   1024    Max sequence length (tokens)
Temperature      0.85    Softmax temperature at generation
Top-k            30      Top-k sampling
Rep. penalty     1.5     Repetition penalty factor
Target length    60 s    Target generation duration

Dataset Analysis

The model was trained on filtered MIDI files from PiJAMA. Quality filters were applied for pitch range, tempo, duration, and note density before tokenisation. The figures show pitch class and velocity distributions of the training corpus.
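
The filtering and the pitch class statistic could be sketched roughly as follows. Every threshold below is a placeholder chosen for illustration, since the actual cut-offs are not stated here; the function names and the note tuple layout are likewise hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# (start_s, end_s, pitch, velocity)
Note = Tuple[float, float, int, int]


def passes_quality_filters(
    notes: Sequence[Note],
    tempo_bpm: float,
    pitch_range: Tuple[int, int] = (21, 108),             # assumed: 88-key piano
    tempo_range: Tuple[float, float] = (40.0, 300.0),     # assumed, in BPM
    duration_range_s: Tuple[float, float] = (30.0, 1200.0),  # assumed
    density_range: Tuple[float, float] = (0.5, 30.0),     # assumed, notes/second
) -> bool:
    """Return True if a piece passes all four checks."""
    if not notes:
        return False
    duration = max(end for _, end, _, _ in notes)
    if duration <= 0:
        return False
    density = len(notes) / duration
    pitches_ok = all(pitch_range[0] <= p <= pitch_range[1]
                     for _, _, p, _ in notes)
    return (
        pitches_ok
        and tempo_range[0] <= tempo_bpm <= tempo_range[1]
        and duration_range_s[0] <= duration <= duration_range_s[1]
        and density_range[0] <= density <= density_range[1]
    )


def pitch_class_histogram(notes: Sequence[Note]) -> List[int]:
    """Count notes per pitch class (0 = C, 1 = C#, ..., 11 = B)."""
    counts = Counter(pitch % 12 for _, _, pitch, _ in notes)
    return [counts.get(pc, 0) for pc in range(12)]


# Usage: 100 synthetic notes spread over about a minute.
piece = [(i * 0.6, i * 0.6 + 0.5, 60 + i % 12, 80) for i in range(100)]
keep = passes_quality_filters(piece, tempo_bpm=120.0)
histogram = pitch_class_histogram(piece)
```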


Sample Output

A generated ~60-second jazz piano piece is included in samples/generated_song.mid. To listen, drag and drop the file into Signal MIDI Editor (browser-based, no install needed).


Tech Stack

  • PyTorch — model definition and training
  • MidiTok — REMI tokenisation
  • Symusic — fast MIDI I/O
  • pretty_midi — MIDI analysis and visualisation
  • NumPy, Pandas, Matplotlib

GitHub

markusle56/MarJazzzzz