AI Spaceship

Build Your Own LLM

Learn the full LLM pipeline through compact lessons and executable labs.

Learn to measure, test, and improve AI systems through hands-on labs: metrics, benchmarks, unit tests for LLMs, adversarial testing, regression suites, and production monitoring.

Course Snapshot

Modules: 5
Lessons: 17

Completion: 0%

Module 1: Foundations — Why Evaluation Matters

3 lessons

The Evaluation Mindset Open


Accuracy, Precision, and Recall Locked


Building a Scoring Function Locked


Module 2: Text & Generation Metrics

4 lessons

Exact Match and F1 Locked


BLEU and ROUGE Locked


Semantic Similarity Scoring Locked


Custom Rubric Graders Locked


Module 3: LLM-as-Judge

3 lessons

The LLM-as-Judge Pattern Locked


Designing Judge Prompts Locked


Calibrating and Validating Judges Locked


Module 4: Test Suites & Regression Testing

4 lessons

Building an Eval Dataset Locked


Assertion-Based Testing Locked


Regression Detection Locked


Statistical Significance Locked


Module 5: Adversarial & Robustness Testing

3 lessons

Prompt Injection Detection Locked


Perturbation Testing Locked


Edge Cases and Boundary Testing Locked


Evaluation & Testing for AI Systems
