Leaderboard Operations

Learn how to view the leaderboard, create evaluations, compare models, and use filters.

Viewing Leaderboard

To view the leaderboard:

  1. Navigate to Leaderboard: Open the Leaderboard section
  2. View Stats: Check the overview statistics at the top
  3. Select View: Choose Table View or List View
  4. Review Results: Browse the evaluation results
  5. Filter Results: Apply filters to narrow the results

Leaderboard Display:

  • Stats Overview: Overview statistics at the top
  • Filters: Filter options below stats
  • Results: Evaluation results in selected view
  • Details: Detailed metrics for selected items

View Options:

  • Table View: Comprehensive tabular view
  • List View: Card-based view with details panel
  • Toggle: Switch between views easily

Filtering Results

Filter leaderboard results:

Filter Options:

  • By Benchmark: Show only results for chosen benchmarks
  • By Training: Show only results for chosen training runs
  • By Date: Restrict results to a date range
  • By Name: Search by training or checkpoint name
  • By Metric: Keep only results whose metric values fall within a given range

Filter Application:

  • Multiple Filters: Apply several filters simultaneously; they combine to narrow the results
  • Filter Persistence: Active filters are encoded in the URL, so filtered views survive reloads and can be bookmarked or shared
  • Clear Filters: Remove all active filters with one action
  • Filter Indicators: Visual indicators show which filters are active
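Because filters persist in the URL, a filtered view can also be reproduced programmatically. The sketch below shows one way to round-trip filter state through a query string using only Python's standard library; the parameter names (`benchmark`, `name`, `min_auc`) are illustrative assumptions, not this platform's actual query schema:

```python
from urllib.parse import parse_qs, urlencode

def filters_to_query(filters: dict) -> str:
    """Serialize active filters into a URL query string.

    Multi-valued filters (e.g. several benchmarks) become repeated
    parameters; empty filters are dropped.
    """
    pairs = []
    for key, value in filters.items():
        if not value:
            continue
        if isinstance(value, (list, tuple)):
            pairs.extend((key, v) for v in value)
        else:
            pairs.append((key, value))
    return urlencode(pairs)

def query_to_filters(query: str) -> dict:
    """Restore filter state from a query string (values come back as lists)."""
    return parse_qs(query)

# Hypothetical filter state for a shared leaderboard view:
state = {"benchmark": ["credit-default", "fraud"], "name": "bert", "min_auc": "0.8"}
query = filters_to_query(state)   # e.g. "benchmark=credit-default&benchmark=fraud&..."
restored = query_to_filters(query)
```

`parse_qs` always returns lists of values, which matches the multi-select nature of benchmark and training filters.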

Filter Benefits:

  • Focused View: Focus on specific models or benchmarks
  • Comparison: Compare specific subsets
  • Analysis: Analyze specific time periods
  • Search: Find specific models quickly

Creating New Evaluations

Create new evaluations from the leaderboard:

Evaluation Creation:

  1. Click "New Evaluation": Start the creation flow
  2. Select Checkpoint: Choose the checkpoint to evaluate
  3. Select Benchmark: Choose one or more benchmarks to evaluate against
  4. Run Evaluation: Start the evaluation
  5. Monitor Progress: Track the evaluation's progress
  6. View Results: Completed results appear in the leaderboard
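The steps above boil down to submitting a checkpoint, one or more benchmarks, and the run options (GPU allocation, priority). A minimal sketch of assembling such a request body with basic validation; every field name here (`checkpoint_id`, `benchmark_ids`, `gpus`, `priority`) is an assumption for illustration, not the platform's documented API:

```python
def build_evaluation_request(checkpoint_id, benchmark_ids, gpus=1, priority="normal"):
    """Assemble a new-evaluation request body.

    Field names and the priority levels are illustrative assumptions;
    check your deployment's API reference for the actual schema.
    """
    if not benchmark_ids:
        raise ValueError("select at least one benchmark")
    if priority not in {"low", "normal", "high"}:
        raise ValueError(f"unknown priority: {priority}")
    return {
        "checkpoint_id": checkpoint_id,
        "benchmark_ids": list(benchmark_ids),   # multiple benchmarks per run
        "gpus": gpus,                           # GPU allocation for the evaluation
        "priority": priority,
    }
```

Validating before submission surfaces mistakes (an empty benchmark list, a typo in the priority) immediately instead of after the job is queued.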

Evaluation Options:

  • Single Checkpoint: Evaluate one checkpoint
  • Multiple Benchmarks: Evaluate against multiple benchmarks
  • GPU Allocation: Configure GPU allocation for evaluation
  • Priority: Set evaluation priority

Evaluation Process:

  • Queued: The evaluation is waiting for resources
  • Running: The benchmark is being executed against the checkpoint
  • Completed: The evaluation has finished and metrics are computed
  • Results: Scores appear in the leaderboard
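A client that needs to wait for the Queued → Running → Completed lifecycle can poll for status. A hedged sketch; `get_status` stands in for whatever status call your deployment exposes, and the state names simply mirror the list above:

```python
import time
from enum import Enum

class EvalState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"

def wait_for_completion(get_status, poll_seconds=10, timeout=3600):
    """Poll an evaluation until it completes or the timeout expires.

    `get_status` is any callable returning the current EvalState,
    e.g. a wrapper around a status endpoint (hypothetical here).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_status()
        if state is EvalState.COMPLETED:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("evaluation did not complete in time")
```

Using `time.monotonic` for the deadline keeps the timeout correct even if the system clock changes mid-run.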

Comparing Models/Checkpoints

Compare models and checkpoints:

Comparison Methods:

  • Side-by-Side: Compare multiple models side-by-side
  • Metric Comparison: Compare specific metrics
  • Ranking Comparison: Compare rankings
  • Trend Comparison: Compare performance trends

Comparison Features:

  • Select Multiple: Select multiple items to compare
  • Metric Focus: Focus on specific metrics
  • Visual Comparison: Visual comparison charts
  • Export: Export comparison data
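Outside the UI, a side-by-side comparison is just a metrics-by-model table. A stdlib-only sketch that renders exported results as text; the record layout (checkpoint name mapped to a metric dict) is an assumption, not the platform's export format:

```python
def side_by_side(results, metrics):
    """Render a side-by-side comparison of selected checkpoints.

    `results` maps checkpoint name -> {metric: value}; metrics a
    checkpoint is missing render as "-".
    """
    names = list(results)
    rows = [["metric"] + names]                     # header row
    for m in metrics:
        rows.append([m] + [
            f"{results[n][m]:.3f}" if m in results[n] else "-" for n in names
        ])
    # Pad each column to its widest cell so the table lines up.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)) for row in rows
    )
```

For example, `side_by_side({"ckpt-a": {"auc": 0.91}, "ckpt-b": {"auc": 0.88, "ks": 0.42}}, ["auc", "ks"])` yields one row per metric with a column per checkpoint, which is exactly the shape of the UI's side-by-side view.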

Comparison Use Cases:

  • Model Selection: Compare candidate models to select the best one
  • Architecture Comparison: Compare different architectures
  • Hyperparameter Comparison: Compare different hyperparameters
  • Version Comparison: Compare model versions

Sorting by Metrics

Sort leaderboard by metrics:

Sorting Options:

  • Select Metric: Choose metric to sort by
  • Sort Direction: Ascending or descending
  • Apply Sort: Apply sorting to leaderboard
  • Multiple Sorts: Sort by multiple metrics (future)
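Conceptually, sorting the leaderboard is just an ordering of result records by one metric, with entries that lack the metric pushed to the bottom. A small illustrative sketch (the record shape is assumed):

```python
def sort_leaderboard(entries, metric, descending=True):
    """Sort leaderboard entries (dicts) by one metric.

    Entries missing the metric sort last regardless of direction, so
    incomplete evaluations never crowd out the top of the board.
    """
    def key(entry):
        value = entry.get(metric)
        if value is None:
            return (1, 0.0)                       # missing metric: sort last
        return (0, -value if descending else value)
    return sorted(entries, key=key)
```

With `descending=True` (the common "best models first" view), the highest metric values appear at the top.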

Sorting Metrics:

  • Accuracy: Fraction of predictions that are correct
  • ROC AUC: Area under the ROC curve
  • KS Statistic: Maximum separation between the positive and negative score distributions
  • Gini Coefficient: Equivalent to 2 × AUC − 1
  • Custom: Any other metric available in the results
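The ranking metrics above are closely related: the Gini coefficient is conventionally reported as 2 × AUC − 1, and the KS statistic is the largest gap between the true-positive and false-positive rates across thresholds. A self-contained sketch computing them from labels and scores (an O(n²) reference implementation for clarity, not the platform's internal code):

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank (Mann-Whitney) formulation: the probability
    that a random positive outscores a random negative, ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini_coefficient(labels, scores):
    """Gini coefficient as conventionally reported: 2 * AUC - 1."""
    return 2 * roc_auc(labels, scores) - 1

def ks_statistic(labels, scores):
    """Kolmogorov-Smirnov statistic: the largest gap between the
    score distributions of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        tpr = sum(s >= t for s in pos) / len(pos)  # true-positive rate at t
        fpr = sum(s >= t for s in neg) / len(neg)  # false-positive rate at t
        best = max(best, abs(tpr - fpr))
    return best
```

For example, with `labels = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, the AUC is 0.75, so the Gini coefficient is 0.5.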

Sorting Benefits:

  • Best Models First: See best models at top
  • Quick Comparison: Quickly compare by metric
  • Ranking: Understand model rankings
  • Selection: Easily identify top performers

Next Steps