Lab 6 — Interactive Explainer

SageMaker Pipelines & Model Registry

Build a 9-step ML pipeline that automates data processing, hyperparameter tuning, model evaluation, bias detection, and conditional model registration — all orchestrated as a single reproducible workflow.

🔗 SageMaker Pipelines 📦 Model Registry ⚖️ Clarify Bias 🏢 HCM Context 🧪 Lab 6

📋 Lab 6 Overview

This lab brings together everything from Labs 1–5 into a single automated workflow. Instead of running data processing, training, tuning, evaluation, and deployment as separate manual steps, you define them as a SageMaker Pipeline — a reproducible, auditable, end-to-end ML workflow that runs with one API call.

Duration: ~90 minutes • Phase: MLOps • Prerequisite: Understanding of Labs 1–5 concepts

What You Build

Input

💾 Feature Store

Customer churn data
30+ engineered features

Pipeline

🔗 9 Steps

Process → Tune → Eval
→ Clarify → Register

Output

📦 Model Registry

Versioned model
+ bias report + lineage

Key insight: The pipeline only registers the model if AUC exceeds a threshold (conditional step). This is the quality gate that prevents bad models from reaching production.

The 9 Pipeline Steps

# | Step Name | Type | What It Does
1 | ChurnModelProcess | Processing | Fetches data from Feature Store, splits into train/validation/test
2 | ChurnHyperParameterTuning | Tuning | XGBoost HPO with 5 hyperparameter ranges (like Lab 4)
3 | ChurnEvalBestModel | Processing | Evaluates the best model, computes AUC and classification metrics
4 | ChurnCreateModel | Model | Creates a SageMaker Model object from the best training job
5 | ChurnModelConfigFile | Processing | Generates Clarify analysis config (bias detection settings)
6 | ChurnTransform | Transform | Batch inference on test data to generate predictions
7 | ClarifyProcessingStep | Processing | Runs SageMaker Clarify to detect bias in predictions
8 | RegisterChurnModel | Register | Registers model in Model Registry with metrics and explainability reports
9 | CheckAUCScoreChurnEvaluation | Condition | Gates registration: only proceeds if AUC > threshold
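The ordering and gating above can be sketched as a toy driver in plain Python (no SageMaker SDK). All step bodies are illustrative stand-ins, and `AUC_THRESHOLD` is an assumed default; only the step order and the condition-gate logic mirror the lab.

```python
AUC_THRESHOLD = 0.75  # hypothetical default; the real pipeline takes this as a parameter

def run_pipeline(raw_features, measured_auc):
    """Toy driver mirroring the 9-step order. measured_auc stands in for
    the AUC that step 3 (ChurnEvalBestModel) would compute."""
    splits = {"train": raw_features[:7], "validation": raw_features[7:9],
              "test": raw_features[9:]}                      # 1 ChurnModelProcess
    best_model = {"algo": "xgboost", "eta": 0.1}             # 2 ChurnHyperParameterTuning
    evaluation = {"auc": measured_auc}                       # 3 ChurnEvalBestModel
    model = {"artifact": "model.tar.gz", **best_model}       # 4 ChurnCreateModel
    clarify_cfg = {"facet": "demographic_group"}             # 5 ChurnModelConfigFile (stand-in)
    predictions = [1, 0]                                     # 6 ChurnTransform (stand-in)
    bias_report = {"disparate_impact": 0.95}                 # 7 ClarifyProcessingStep (stand-in)
    registry = []
    if evaluation["auc"] > AUC_THRESHOLD:                    # 9 CheckAUCScoreChurnEvaluation
        registry.append({"model": model, "metrics": evaluation,  # 8 RegisterChurnModel
                         "bias": bias_report,
                         "status": "PendingManualApproval"})
    return registry

registered = run_pipeline(list(range(10)), measured_auc=0.82)  # passes the gate
skipped = run_pipeline(list(range(10)), measured_auc=0.60)     # gate blocks registration
```

Note how step 8 only runs inside the condition branch: a failing model leaves the registry untouched rather than registering with a "bad" flag.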

Key Concepts

🔗

SageMaker Pipelines

Orchestration service for ML workflows. Define steps as Python objects, connect inputs/outputs, execute with one call. Tracks every run with full metadata.

💾

Feature Store

Centralized repository for ML features. Serves both batch training (offline store) and real-time inference (online store). Ensures feature consistency.

📦

Model Registry

Version control for models. Stores model artifacts, metrics, bias reports, and approval status. Enables governed promotion from staging to production.

🧬

Model Lineage

Tracks the full provenance of a model: which data, features, training job, and pipeline run produced it. Essential for audit and compliance.

🔗 Pipeline Steps — Interactive Flow

Click any step to explore what it does, or auto-play to walk through the entire 9-step pipeline.

Pipeline flow (static summary of the interactive diagram): 🧹 Process (Feature Store → split) → 🎯 HPO Tuning (XGBoost + Bayesian) → 📊 Evaluate (AUC + metrics) → 📦 Create Model (SageMaker Model) → ⚙️ Config (Clarify settings) → 🔄 Transform (batch predictions) → ⚖️ Clarify (bias detection) → Condition (AUC > threshold?) → 📦 Register (Model Registry). The diagram groups these into two stages: Training & Evaluation, then Governance & Registration.

💾 SageMaker Feature Store

Before the pipeline runs, you populate a Feature Store with pre-engineered features. This decouples feature engineering from model training — features are computed once and reused across multiple pipeline runs, experiments, and models.

Feature Store Architecture

💾

Offline Store (S3)

Full historical feature data in Parquet format. Used for batch training and pipeline processing steps. Queryable via Athena SQL. This is what Lab 6 uses.

Online Store (DynamoDB)

Low-latency feature lookup for real-time inference. Returns the latest feature values for a given record ID in single-digit milliseconds.

Lab 6 Feature Group Schema

The churn prediction dataset includes 30+ engineered features from customer behavior data:

Feature | Type | Description
retained | Long (target) | 1 = customer stayed, 0 = churned
esent | Float | Number of emails sent to customer
eopenrate | Float | Email open rate (engagement signal)
eclickrate | Float | Email click-through rate
avgorder | Float | Average order value
ordfreq | Float | Order frequency (transactions per period)
paperless | Long | Paperless billing enabled (1/0)
refill | Long | Auto-refill subscription active
doorstep | Long | Doorstep delivery preference
first_last_days_diff | Float | Days between first and last order (tenure)
favday_* | Long | One-hot encoded preferred shopping day
💡 Why Feature Store matters: Without it, every pipeline run must re-compute features from raw data (slow, error-prone). With Feature Store, features are computed once by a dedicated feature engineering pipeline, then consumed by any number of training pipelines. This ensures training and inference use identical feature logic.
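Step 1 of the pipeline reads these rows from the offline store and splits them. A minimal sketch, with the offline store reduced to a plain list of dicts and an assumed 70/15/15 split (the lab's exact proportions may differ):

```python
import random

def split_offline_store(records, seed=42):
    """Sketch of step 1 (ChurnModelProcess): shuffle feature rows read from
    the offline store and split 70/15/15 into train/validation/test."""
    rows = list(records)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n = len(rows)
    n_train, n_val = int(n * 0.7), int(n * 0.15)
    return {
        "train": rows[:n_train],
        "validation": rows[n_train:n_train + n_val],
        "test": rows[n_train + n_val:],
    }

# Fake feature rows shaped like the schema above (record_id, retained, esent)
records = [{"record_id": i, "retained": i % 2, "esent": float(i)} for i in range(100)]
splits = split_offline_store(records)
```

In the real lab this read happens via an Athena query against the offline store's Parquet data rather than an in-memory list.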

📦 Model Registry

The Model Registry is version control for ML models. Every pipeline run that passes the quality gate produces a registered model version — complete with metrics, bias reports, and approval status.

Model Package Contents

Artifact | Source Step | Purpose
model.tar.gz | ChurnHyperParameterTuning | Trained XGBoost model artifact (deployable)
evaluation.json | ChurnEvalBestModel | AUC, accuracy, precision, recall, F1 metrics
Clarify bias report | ClarifyProcessingStep | Statistical parity, disparate impact analysis
Explainability report | ClarifyProcessingStep | SHAP values showing feature importance
Inference spec | Pipeline config | Container image, instance type, input format

Model Lifecycle

⏳ PendingManualApproval

Pipeline registers model
Awaits human review

✅ Approved

Reviewer approves
Ready for deployment

❌ Rejected

Fails review criteria
Archived, not deployed

🔔 The Condition Step is the automated gate. If AUC < threshold, the model is never registered — it doesn't even reach PendingManualApproval. This prevents obviously bad models from wasting reviewer time. Only models that pass the automated check get human review.
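The lifecycle can be sketched as a tiny state machine. The status names match SageMaker's `ModelApprovalStatus` values, but the transition rules below are governance assumptions for illustration (SageMaker itself lets you update approval status freely via `UpdateModelPackage`):

```python
# Assumed governance rules: pending models get reviewed; approved models
# can later be revoked; rejected models are archived for good.
ALLOWED = {
    "PendingManualApproval": {"Approved", "Rejected"},
    "Approved": {"Rejected"},  # revocation after approval
    "Rejected": set(),         # archived; no further transitions
}

def transition(current, new):
    """Apply one lifecycle transition, rejecting moves the rules forbid."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new

status = transition("PendingManualApproval", "Approved")  # reviewer signs off
```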

Conditional Registration Logic

The CheckAUCScoreChurnEvaluation step uses a ConditionGreaterThan check:

📝 If evaluation.json → binary_classification_metrics.auc.value > threshold:
→ Execute RegisterChurnModel (model enters registry)
Else:
→ Pipeline completes without registration (model discarded)
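The same check can be sketched in plain Python: read `evaluation.json`, follow the `binary_classification_metrics.auc.value` path the Condition step queries, and compare against the threshold (0.75 here is an assumed default, not the lab's exact value):

```python
import json
import pathlib
import tempfile

def auc_gate(evaluation_path, threshold=0.75):
    """Mirrors the ConditionGreaterThan check: True means the model
    would proceed to RegisterChurnModel."""
    report = json.loads(pathlib.Path(evaluation_path).read_text())
    auc = report["binary_classification_metrics"]["auc"]["value"]
    return auc > threshold

# Write a report shaped like the lab's evaluation.json and run the gate
with tempfile.TemporaryDirectory() as d:
    p = pathlib.Path(d) / "evaluation.json"
    p.write_text(json.dumps(
        {"binary_classification_metrics": {"auc": {"value": 0.82}}}))
    passed = auc_gate(p)                      # 0.82 > 0.75 -> register
    blocked = not auc_gate(p, threshold=0.90) # 0.82 < 0.90 -> discard
```

In the actual pipeline this comparison is declared with `ConditionGreaterThan` plus a `JsonGet` over the evaluation step's property file, so it runs inside the workflow rather than in your notebook.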

⚖️ SageMaker Clarify & Model Lineage

Responsible AI requires understanding both what your model predicts and why. Clarify detects bias in predictions, while lineage tracking provides full provenance of every model artifact.

Clarify Bias Detection

The ClarifyProcessingStep runs post-training bias analysis on the model's predictions. It checks whether the model treats different groups fairly.

📊

Pre-Training Bias

Detects imbalances in the training data itself. Example: if 90% of "retained" customers are from one demographic, the model may learn biased patterns.

🔍

Post-Training Bias

Measures whether the model's predictions are fair across groups. Checks disparate impact, statistical parity difference, and conditional demographic disparity.

⚠️ Why this matters for HCM: An attrition model that systematically flags employees from certain demographics as "high flight risk" could lead to discriminatory retention interventions. Clarify catches this before the model reaches production.
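Two of the post-training metrics named above are simple enough to compute by hand. A from-scratch sketch (toy data; Clarify reports these among many other metrics, and sign conventions for SPD vary by tool):

```python
def post_training_bias(preds, groups, advantaged):
    """Statistical parity difference (SPD) and disparate impact (DI)
    over 0/1 predictions, comparing the advantaged group to the other."""
    def positive_rate(g):
        grp_preds = [p for p, grp in zip(preds, groups) if grp == g]
        return sum(grp_preds) / len(grp_preds)
    adv = positive_rate(advantaged)
    other = next(g for g in set(groups) if g != advantaged)
    dis = positive_rate(other)
    return {
        "SPD": adv - dis,                       # 0 means parity
        "DI": dis / adv if adv else float("inf"),  # 1 means parity
    }

# Toy predictions: group A gets far more positive outcomes than group B
preds  = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
result = post_training_bias(preds, groups, advantaged="A")
```

Here group A's positive rate is 0.75 versus 0.25 for B, so SPD = 0.5 and DI ≈ 0.33, well below the common "four-fifths" (0.8) rule of thumb for DI.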

Model Lineage

Lineage tracking answers: "How was this model created?" — tracing from raw data through every transformation, training job, and evaluation step.

Lineage Component | What It Tracks | Why It Matters
Data Source | Feature Store group, S3 paths, data version | Reproduce training with the exact same data
Processing Job | Script version, parameters, output artifacts | Audit data transformations
Training Job | Algorithm, hyperparameters, instance type, duration | Understand model configuration
Evaluation | Metrics (AUC, F1), evaluation dataset | Compare model versions objectively
Bias Report | Clarify analysis results, fairness metrics | Compliance and responsible AI audit trail
💡 Lineage visualization: In the lab, you generate a visual graph showing all artifacts connected to your model — from the Feature Store query through the training job to the final registered model package. This is what auditors and compliance teams review.
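Conceptually, lineage is a directed graph of artifacts: walking the parent links from a registered model package recovers everything that produced it. A sketch with hypothetical artifact names (SageMaker builds this graph for you from lineage entities):

```python
# Each artifact maps to the artifacts it was derived from (its parents).
lineage = {
    "model-package-v3":    ["training-job-17", "evaluation.json", "bias-report"],
    "training-job-17":     ["train-split"],
    "evaluation.json":     ["training-job-17", "test-split"],
    "bias-report":         ["predictions"],
    "predictions":         ["training-job-17", "test-split"],
    "train-split":         ["feature-store-query"],
    "test-split":          ["feature-store-query"],
    "feature-store-query": [],
}

def ancestors(artifact, graph):
    """Collect every artifact reachable by following parent links."""
    seen, stack = set(), [artifact]
    while stack:
        for parent in graph.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

parents = ancestors("model-package-v3", lineage)  # full provenance of the model
```

An auditor's question like "which Feature Store query fed this registered model?" becomes a graph walk rather than archaeology through notebooks.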

🏢 HCM Mapping — AnyCompany Context

How does a SageMaker Pipeline apply to AnyCompany's ML products? Each product has different pipeline complexity, retraining frequency, and governance requirements.

Pipeline Scenarios at AnyCompany

🏢 Click a scenario to see how SageMaker Pipelines would be configured for different AnyCompany ML products.
🚨

Fraud Detection Pipeline

Monthly retraining with new fraud patterns. Strict quality gates — recall must exceed 95%.

📉

Attrition Prediction Pipeline

Quarterly retraining. Clarify bias checks critical — cannot discriminate by demographics.

🤖

AnyCompany Assist Fine-Tune

Weekly fine-tuning on new conversation data. A/B testing gate before full rollout.

💰

Salary Benchmarking Pipeline

Annual retraining with market data refresh. RMSE threshold as quality gate.


Lab 6 → AnyCompany Attrition Pipeline

Lab 6 Concept | AnyCompany Equivalent | Why It Matters
Customer churn target | Employee attrition (left_company) | Same binary classification problem structure
Feature Store (30 features) | HR Feature Store (50+ features from HRIS, Comp, Perf) | Centralized features shared across attrition, engagement, and flight-risk models
XGBoost HPO step | XGBoost + LightGBM comparison step | Production pipelines often compare multiple algorithms
AUC condition gate | AUC > 0.78 AND recall > 0.70 | Multiple metrics must pass — single AUC isn't enough for HR decisions
Clarify bias check | Bias analysis by gender, age, ethnicity, location | Legal requirement — cannot deploy discriminatory attrition model
Model Registry | Versioned model catalog with approval workflow | HR leadership must approve before model influences retention decisions
Lineage tracking | Full audit trail for compliance (GDPR, DPDP Act) | Must prove which data trained which model for regulatory audits
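The compound gate from the table is a small but important change from Lab 6's single-metric condition. A sketch (thresholds taken from the table; AND semantics mean a model with great AUC but poor recall is still blocked):

```python
def hr_quality_gate(metrics, auc_min=0.78, recall_min=0.70):
    """AnyCompany-style compound gate: every metric must pass,
    not just AUC."""
    return metrics["auc"] > auc_min and metrics["recall"] > recall_min

ok = hr_quality_gate({"auc": 0.81, "recall": 0.74})       # both pass -> register
blocked = hr_quality_gate({"auc": 0.85, "recall": 0.60})  # recall fails -> discard
```

In a real pipeline this would be expressed as multiple conditions on one ConditionStep, all of which must evaluate true for the registration branch to run.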

Production Pipeline Patterns

💡 Scheduled execution: In production, pipelines are triggered by EventBridge schedules (monthly for fraud, quarterly for attrition) or by data arrival events (new Feature Store data lands → pipeline starts automatically). No manual pipeline.start() calls.
🛡️ Multi-stage approval: AnyCompany's production pipeline adds a human approval step between registration and deployment. The pipeline registers the model, sends an SNS notification to the ML team, and waits for manual approval before triggering the deployment pipeline (Lab 5's blue/green traffic shift).