Leaderboard Visualizations

The Leaderboard offers several views for inspecting and comparing model performance.

Table View

The Table View shows all evaluations in a single table:

Table Features:

  • Columns: Training, checkpoint, benchmarks, metrics, and ranking
  • Sorting: Sort by any column
  • Filtering: Filter by benchmark, training, or date
  • Heatmap: Optional heatmap visualization for scores
  • Expansion: Expand rows for detailed information
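
To illustrate the heatmap option, here is a minimal Python sketch that maps scores to colors. It assumes min-max normalization and a simple red-to-green ramp; the function name `heatmap_colors` and the color scheme are hypothetical, and the Leaderboard's actual coloring may differ.

```python
def heatmap_colors(scores):
    """Map each score to a hex color: lowest score is red, highest is green.

    Assumption: colors are scaled relative to the scores in the column.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    span = hi - lo
    colors = []
    for s in scores:
        # When all scores are equal, place them in the middle of the ramp.
        t = 0.5 if span == 0 else (s - lo) / span
        red = round(255 * (1 - t))
        green = round(255 * t)
        colors.append(f"#{red:02x}{green:02x}00")
    return colors


print(heatmap_colors([0.0, 1.0]))
```

Scaling within a column keeps small differences visible, at the cost of making colors incomparable across columns with different score ranges.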

Table Columns:

  • Training: Training name and information
  • Checkpoint: Checkpoint identifier and epoch
  • Benchmarks: Benchmark columns with metrics
  • Metrics: Metric values for each benchmark
  • Ranking: Current ranking position
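
The Ranking column can be thought of as the position of each entry when ordered by score. The sketch below is one way to compute it in Python, assuming higher scores rank first and tied scores share a rank; the function name `rank_by_score` and the tie-handling rule are assumptions, not documented Leaderboard behavior.

```python
def rank_by_score(scores):
    """Return a 1-based rank per name; tied scores share the better rank."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    for position, (name, score) in enumerate(ordered, start=1):
        previous = ordered[position - 2] if position > 1 else None
        if previous is not None and previous[1] == score:
            # Same score as the entry just above: reuse its rank.
            ranks[name] = ranks[previous[0]]
        else:
            ranks[name] = position
    return ranks


print(rank_by_score({"run-a": 0.9, "run-b": 0.8, "run-c": 0.9}))
```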

Table Benefits:

  • Comprehensive: See all data at once
  • Comparable: Easy to compare across models
  • Sortable: Sort by any metric
  • Filterable: Filter to specific models or benchmarks
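
As a rough illustration of what sorting and filtering amount to on evaluation rows, here is a minimal Python sketch. The `EvalRow` class, its field names, and both helper functions are hypothetical and are not part of the Leaderboard itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EvalRow:
    # Hypothetical row shape, loosely mirroring the table columns.
    training: str
    checkpoint: str
    benchmark: str
    score: float
    evaluated_on: date


def filter_rows(rows, benchmark=None, training=None, since=None):
    """Keep rows that match every filter that is not None."""
    return [
        r for r in rows
        if (benchmark is None or r.benchmark == benchmark)
        and (training is None or r.training == training)
        and (since is None or r.evaluated_on >= since)
    ]


def sort_rows(rows, column, descending=True):
    """Sort rows by the named attribute, highest first by default."""
    return sorted(rows, key=lambda r: getattr(r, column), reverse=descending)
```

For example, `sort_rows(filter_rows(rows, benchmark="bench-x"), "score")` would list the evaluations for one benchmark from best to worst.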

List View

The List View provides a card-based view of evaluations:

List Features:

  • Card Layout: Each evaluation is shown as a card
  • Summary Metrics: Key metrics displayed prominently
  • Details Panel: Detailed metrics in side panel
  • Selection: Select items to view details
  • Score Visualization: Visual score indicators

List Benefits:

  • Focused: Focus on one evaluation at a time
  • Detailed: Detailed metrics in side panel
  • Visual: Visual score indicators
  • Interactive: Interactive selection and exploration

Card Information:

  • Training Name: Name of the training
  • Checkpoint Info: Checkpoint identifier
  • Overall Score: Aggregated score across benchmarks
  • Key Metrics: Key performance metrics
  • Status: Evaluation status
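
The Overall Score is described as an aggregate across benchmarks, but the aggregation method is not stated here. The Python sketch below assumes an unweighted mean purely for illustration; `overall_score` is a hypothetical helper, and the real aggregation may weight benchmarks differently.

```python
from statistics import mean


def overall_score(benchmark_scores):
    """Aggregate per-benchmark scores into one number.

    Assumption: an unweighted mean. Returns None when there are no scores.
    """
    if not benchmark_scores:
        return None
    return mean(benchmark_scores.values())


print(overall_score({"bench-x": 0.5, "bench-y": 1.0}))
```

An unweighted mean only makes sense when all benchmarks report scores on the same scale.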

Detailed Metrics

View detailed metrics for each evaluation:

Metrics Display:

  • Per Benchmark: Metrics for each benchmark
  • All Metrics: All configured metrics displayed
  • Metric Values: Actual metric values
  • Comparisons: Compare with other evaluations
  • Trends: Performance trends over time

Metrics Information:

  • Metric Name: Name of the metric
  • Metric Value: Actual value
  • Benchmark: Benchmark the metric comes from
  • Interpretation: How to interpret the value
  • Comparison: Compare with other values

Detailed View Features:

  • Expandable: Expand to see all metrics
  • Sortable: Sort metrics by value
  • Filterable: Filter to specific metrics
  • Exportable: Export metrics data
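
The export format is not specified here. As one possible shape, the Python sketch below writes metrics to CSV, sorted by value. The column names (`benchmark`, `metric`, `value`) and the function `metrics_to_csv` are hypothetical.

```python
import csv
import io


def metrics_to_csv(metrics):
    """Serialize metric records to CSV text, highest value first.

    Each record must be a dict with exactly the keys in `fieldnames`.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["benchmark", "metric", "value"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in sorted(metrics, key=lambda m: m["value"], reverse=True):
        writer.writerow(row)
    return buffer.getvalue()


print(metrics_to_csv([
    {"benchmark": "bench-x", "metric": "accuracy", "value": 0.71},
    {"benchmark": "bench-y", "metric": "f1", "value": 0.8},
]))
```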

Stats Overview

Summary statistics appear at the top of the leaderboard:

Stats Displayed:

  • Total Checkpoints Created: Total checkpoints from all trainings
  • Checkpoints Evaluated: Number of checkpoints evaluated
  • Performance Trend: Performance trend over time
  • Benchmark Consistency: How closely the training datasets align with the benchmark datasets

Stats Information:

  • Total Checkpoints Created: Count of all checkpoints
  • Checkpoints Evaluated: Count of evaluated checkpoints
  • Performance Trend: Percentage change in performance
  • Benchmark Consistency: Alignment percentage
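
Two of these figures are simple arithmetic. The Python sketch below shows how a percentage change and an evaluated-checkpoint share could be computed; the function names and the choice of baseline are assumptions, since the exact definitions are not given here.

```python
def percent_change(previous, current):
    """Percentage change from `previous` to `current`.

    Returns None when the baseline is zero, where the change is undefined.
    """
    if previous == 0:
        return None
    return (current - previous) / abs(previous) * 100.0


def evaluation_coverage(evaluated, total):
    """Share of created checkpoints that have been evaluated, in percent."""
    if total == 0:
        return 0.0
    return evaluated / total * 100.0


print(evaluation_coverage(3, 12))
```

For instance, a score moving from 0.8 to 0.9 is a change of about 12.5 percent relative to the earlier value.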

Stats Benefits:

  • Quick Overview: Quick overview of leaderboard status
  • Trends: See performance trends
  • Consistency: Check dataset consistency
  • Progress: Track evaluation progress

Next Steps