Complete Machine Learning Workflow

This guide walks you through the end-to-end NeoSpace workflow, from connecting your data sources to deploying models in production.

Workflow Overview

The complete ML workflow in NeoSpace consists of six main stages:

  1. Connect Data Sources → 2. Create Datasets → 3. Train Models → 4. Evaluate Models → 5. Compare Performance → 6. Deploy Models

Each stage builds on the previous one, creating a seamless pipeline from raw data to production models.

Stage 1: Connect Data Sources

Objective: Connect your data sources to the NeoSpace platform.

Steps:

  1. Navigate to Integration (Connectors) section
  2. Click "New Integration"
  3. Select connector type (S3, Oracle, etc.)
  4. Configure connection details
  5. Enter credentials securely
  6. Validate connection
  7. Save connector
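
If your NeoSpace deployment exposes an HTTP API, the connector setup above could also be scripted. The sketch below is a hypothetical example: the endpoint path, payload fields, and authentication header are assumptions made for illustration, not a documented NeoSpace API.

    import os
    import requests

    # Hypothetical NeoSpace API base URL and token (assumed, not documented).
    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    # Register an S3 connector; field names are illustrative assumptions.
    connector = {
        "name": "sales-data-s3",
        "type": "s3",
        "config": {"bucket": "my-company-sales", "region": "us-east-1"},
        # Keep credentials in environment variables or a secret store.
        "credentials": {
            "access_key_id": os.environ["AWS_ACCESS_KEY_ID"],
            "secret_access_key": os.environ["AWS_SECRET_ACCESS_KEY"],
        },
    }

    resp = requests.post(f"{API}/connectors", json=connector, headers=HEADERS)
    resp.raise_for_status()
    connector_id = resp.json()["id"]  # assumed response shape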

Outcome: Your data sources are connected and accessible.

Next: Proceed to create datasets from connected data sources.

Stage 2: Create Datasets

Objective: Create and prepare datasets for training.

Steps:

  1. Navigate to Datasets section
  2. Click "New Dataset"
  3. Enter dataset information (name, structure, domain)
  4. Select data files from connectors
  5. Run File Analysis to understand data structure
  6. Configure Modeling:
    • Select feature columns
    • Select target columns
    • Exclude unwanted features
  7. Configure Data Partitioning:
    • Choose split method (percentage or date range)
    • Set training/validation splits
  8. Process the dataset so it reaches the READY state
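
If an API is available, the dataset definition above could be captured as a single configuration payload. The sketch below is hypothetical: the modeling and partitioning fields are assumptions that mirror the steps above, and the ids and file patterns are placeholders.

    import os
    import requests

    # Same hypothetical API base URL and headers as in the Stage 1 sketch.
    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    dataset = {
        "name": "sales-history",
        "domain": "retail",
        "structure": "tabular",
        # Connector id and file pattern are placeholders.
        "source": {"connector_id": "conn-123", "files": ["sales/2023/*.parquet"]},
        "modeling": {
            "features": ["store_id", "date", "promotion_flag"],
            "targets": ["units_sold"],
            "excluded": ["internal_row_id"],
        },
        "partitioning": {
            "method": "percentage",   # or "date_range"
            "train": 0.8,
            "validation": 0.2,
        },
    }

    resp = requests.post(f"{API}/datasets", json=dataset, headers=HEADERS)
    resp.raise_for_status()
    dataset_id = resp.json()["id"]

    # Trigger processing so the dataset reaches the READY state (assumed endpoint).
    requests.post(f"{API}/datasets/{dataset_id}/process", headers=HEADERS).raise_for_status()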

Outcome: Dataset is ready for training.

Next: Use dataset in training jobs.

Stage 3: Train Models

Objective: Train LDM models on your datasets.

Steps:

  1. Navigate to Training section
  2. Click "New Training"
  3. Enter training name and description
  4. Select Datasets:
    • Choose datasets for training
    • Configure features and targets per dataset
  5. Configure Data Split:
    • Set training/validation split
    • Choose split method
  6. Configure Architecture:
    • Select model architecture (NeoLDM or Transformer)
    • Configure model size
    • Customize YAML configuration if needed
    • Set GPU count
  7. Start Training
  8. Monitor Training:
    • View training logs
    • Track training metrics
    • Monitor checkpoints
    • Review training summary
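
The training configuration described in steps 4-6 could likewise be expressed as one request body. The sketch below is hypothetical: the architecture, split, and GPU fields are assumptions chosen to mirror the options above, and the YAML override is passed as an inline string.

    import os
    import requests

    # Same hypothetical API base URL and headers as in the earlier sketches.
    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    training = {
        "name": "sales-ldm-v1",
        "description": "First NeoLDM run on the sales-history dataset",
        "datasets": [
            {
                "dataset_id": "ds-456",   # a READY dataset (placeholder id)
                "features": ["store_id", "date", "promotion_flag"],
                "targets": ["units_sold"],
            }
        ],
        "split": {"method": "percentage", "train": 0.8, "validation": 0.2},
        "architecture": {
            "family": "NeoLDM",      # or "Transformer"
            "size": "small",         # start small for experimentation
            "gpus": 2,
            # Optional YAML override, passed through as text (assumed field).
            "yaml_overrides": "optimizer:\n  lr: 0.0003\n",
        },
    }

    resp = requests.post(f"{API}/trainings", json=training, headers=HEADERS)
    resp.raise_for_status()
    training_id = resp.json()["id"]

    # Poll the job to monitor state, metrics, and checkpoints (assumed endpoint).
    status = requests.get(f"{API}/trainings/{training_id}", headers=HEADERS).json()
    print(status.get("state"), status.get("latest_checkpoint"))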

Outcome: Trained model with checkpoints available.

Next: Evaluate models using benchmarks.

Stage 4: Evaluate Models

Objective: Evaluate trained models using benchmarks.

Steps:

  1. Navigate to Benchmark section
  2. Create Benchmark (if none exists):
    • Enter benchmark name and description
    • Select task type (Classification, Regression, etc.)
    • Select datasets for evaluation
    • Configure metrics
  3. Navigate to Leaderboard section
  4. Click "New Evaluation"
  5. Select Checkpoint: Choose checkpoint to evaluate
  6. Select Benchmark: Choose benchmark(s) to evaluate against
  7. Run Evaluation
  8. Monitor Evaluation: Track evaluation progress
  9. View Results: Results appear in the leaderboard
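
A scripted version of this evaluation flow might look like the hypothetical sketch below; the benchmark and evaluation endpoints, task type values, and metric names are all assumptions.

    import os
    import requests

    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    # Create a benchmark once; later evaluations can reuse it.
    benchmark = {
        "name": "sales-regression-benchmark",
        "task_type": "regression",
        "dataset_ids": ["ds-456"],          # placeholder dataset id
        "metrics": ["rmse", "mae", "r2"],
    }
    resp = requests.post(f"{API}/benchmarks", json=benchmark, headers=HEADERS)
    resp.raise_for_status()
    benchmark_id = resp.json()["id"]

    # Evaluate one checkpoint against the benchmark (placeholder checkpoint id).
    evaluation = {"checkpoint_id": "ckpt-789", "benchmark_ids": [benchmark_id]}
    resp = requests.post(f"{API}/evaluations", json=evaluation, headers=HEADERS)
    resp.raise_for_status()
    print("Evaluation started:", resp.json()["id"])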

Outcome: Model performance evaluated against benchmarks.

Next: Compare models and select best performers.

Stage 5: Compare Performance

Objective: Compare models and identify best performers.

Steps:

  1. Navigate to Leaderboard section
  2. Review Stats: Check overview statistics
  3. Apply Filters (if needed):
    • Filter by benchmarks
    • Filter by trainings
    • Filter by date range
    • Search by name
  4. Sort Results: Sort by relevant metrics
  5. Compare Models:
    • Select models to compare
    • Review metrics side-by-side
    • Analyze performance differences
  6. Select Best Model:
    • Review rankings
    • Consider all metrics
    • Align with business goals
    • Select best checkpoint
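
Once results are on the leaderboard, comparison is essentially filtering and sorting a table of metrics. The sketch below assumes a hypothetical endpoint that returns leaderboard entries as JSON; the field names and ids are placeholders.

    import os
    import requests

    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    # Fetch leaderboard entries for one benchmark (assumed endpoint and filters).
    entries = requests.get(
        f"{API}/leaderboard",
        params={"benchmark_id": "bm-001", "from": "2024-01-01"},
        headers=HEADERS,
    ).json()["entries"]

    # Rank by the metric that matters for the business goal (lower RMSE is better).
    ranked = sorted(entries, key=lambda e: e["metrics"]["rmse"])
    for entry in ranked[:5]:
        print(entry["checkpoint_id"], entry["training_name"], entry["metrics"])

    best_checkpoint = ranked[0]["checkpoint_id"]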

Outcome: Best performing model identified.

Next: Deploy model to inference server.

Stage 6: Deploy Models

Objective: Deploy best model to production.

Steps:

  1. Navigate to Inference Server section
  2. Click "Deploy Model"
  3. Select Model: Choose the best checkpoint
  4. Configure Deployment:
    • Set instance count
    • Configure GPU allocation
    • Set resource limits
  5. Deploy Model: Start deployment
  6. Monitor Deployment: Track deployment progress
  7. Verify Serving: Test that the model is serving correctly
  8. Monitor Performance:
    • Track inference latency
    • Monitor throughput
    • Check system health
  9. Scale if Needed: Adjust instances based on demand
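
Deployment then amounts to one more configuration payload plus monitoring. The sketch below is hypothetical; the deployment endpoint, resource fields, status values, and prediction route are assumptions.

    import os
    import time
    import requests

    API = os.environ.get("NEOSPACE_API", "https://neospace.example.com/api")
    HEADERS = {"Authorization": f"Bearer {os.environ['NEOSPACE_TOKEN']}"}

    deployment = {
        "checkpoint_id": "ckpt-789",     # the checkpoint selected in Stage 5
        "instances": 1,                  # start small, scale on demand
        "resources": {"gpus_per_instance": 1, "memory_gb": 16},
    }
    resp = requests.post(f"{API}/deployments", json=deployment, headers=HEADERS)
    resp.raise_for_status()
    deployment_id = resp.json()["id"]

    # Wait until the deployment reports that it is serving (assumed status values).
    while True:
        state = requests.get(f"{API}/deployments/{deployment_id}", headers=HEADERS).json()["state"]
        if state in ("serving", "failed"):
            break
        time.sleep(10)

    # Smoke-test with a single request before routing real traffic to the model.
    pred = requests.post(
        f"{API}/deployments/{deployment_id}/predict",
        json={"inputs": [{"store_id": 42, "date": "2024-06-01", "promotion_flag": 1}]},
        headers=HEADERS,
    )
    print(pred.status_code, pred.json())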

Outcome: Model deployed and serving predictions in production.

Next: Monitor and optimize production model.

Component Integration

How components work together:

Data Flow:

  • Connectors → Provide data sources
  • Datasets → Organize and prepare data
  • Training → Train models on datasets
  • Benchmark → Evaluate models
  • Leaderboard → Compare and select models
  • Inference Server → Deploy and serve models

Integration Points:

  • Datasets use data from Connectors
  • Training uses READY Datasets
  • Benchmarks evaluate Training checkpoints
  • Leaderboard aggregates Benchmark results
  • Inference Server deploys selected checkpoints
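
These integration points mean the whole workflow reduces to a chain of ids, each produced by one stage and consumed by the next. The sketch below is purely illustrative: the stub functions stand in for the hypothetical API calls shown in the stage sketches above and return placeholder ids.

    # Purely illustrative stubs; each stands in for one hypothetical API call
    # from Stages 1-6 and returns the id of the object that stage creates.

    def create_connector() -> str:
        return "conn-123"        # Stage 1: Connectors provide data sources

    def create_dataset(connector_id: str) -> str:
        return "ds-456"          # Stage 2: Datasets organize connector data

    def train_model(dataset_id: str) -> str:
        return "ckpt-789"        # Stage 3: Training produces checkpoints

    def evaluate_checkpoint(checkpoint_id: str) -> dict:
        return {"rmse": 3.2}     # Stage 4: Benchmarks score checkpoints

    def deploy_checkpoint(checkpoint_id: str) -> str:
        return "dep-001"         # Stage 6: Inference Server serves the checkpoint

    # The dependency chain mirrors the integration points listed above.
    connector_id = create_connector()
    dataset_id = create_dataset(connector_id)
    checkpoint_id = train_model(dataset_id)
    metrics = evaluate_checkpoint(checkpoint_id)   # results appear on the Leaderboard (Stage 5)
    deployment_id = deploy_checkpoint(checkpoint_id)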

Best Practices Workflow

Recommended practices for the complete workflow:

Data Preparation:

  • Ensure data quality before creating datasets
  • Use appropriate dataset structures
  • Configure proper data splits
  • Validate dataset health

Model Training:

  • Start with small models for experimentation
  • Monitor training closely
  • Save checkpoints regularly
  • Track training experiments

Model Evaluation:

  • Use consistent benchmarks
  • Evaluate all relevant checkpoints
  • Consider multiple metrics
  • Track evaluation history

Model Selection:

  • Don't rely solely on rankings
  • Consider business requirements
  • Validate on test data
  • Document selection decisions

Model Deployment:

  • Start with minimal instances
  • Monitor closely after deployment
  • Test thoroughly before production
  • Have a rollback plan ready

Workflow Optimization

Tips for optimizing your workflow:

Efficiency:

  • Reuse datasets across trainings
  • Use consistent benchmarks
  • Automate evaluation where possible
  • Track experiments systematically

Quality:

  • Ensure data quality at each stage
  • Validate configurations before proceeding
  • Monitor performance at each stage
  • Review and optimize continuously

Collaboration:

  • Document decisions and configurations
  • Share datasets and benchmarks
  • Collaborate on model development
  • Review and learn from experiments

Next Steps