
Dataset Operations

Learn how to process datasets, view data, manage multiple datasets, and handle deletions.

Processing a Dataset

Process a dataset to make it ready for training:

Processing Steps:

  1. Navigate to Dataset: Open the dataset's details page
  2. Complete Configuration: Make sure every required configuration setting is filled in
  3. Click "Process": Start the processing job
  4. Monitor Progress: Track the job's progress
  5. Wait for Completion: Let the job run until it finishes
  6. Verify Status: Confirm the dataset status is READY
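
Steps 4 to 6 amount to polling until the dataset reports READY. The following is a minimal sketch of that loop, not the platform's API: `get_status` is a hypothetical callable that returns the current status string, and the `PROCESSING` and `FAILED` status names are assumptions (only READY appears on this page).

```python
import time
from typing import Callable, Iterable


def wait_until_ready(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    failure_statuses: Iterable[str] = ("FAILED",),  # assumed status name
) -> str:
    """Poll get_status() until it returns READY, a failure status, or the timeout expires."""
    failures = set(failure_statuses)
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status == "READY":
            return status
        if status in failures:
            raise RuntimeError(f"processing ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"dataset still {status} after {timeout_seconds} s")
        time.sleep(poll_seconds)


# Simulated status source standing in for a real status lookup (hypothetical).
_statuses = iter(["PROCESSING", "PROCESSING", "READY"])
final_status = wait_until_ready(lambda: next(_statuses), poll_seconds=0.01)
```

Because processing can take minutes to hours, a generous timeout and a polling interval of several seconds are reasonable starting points.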

Processing Activities:

  • Data validation
  • Feature extraction
  • Data transformation
  • Partition creation
  • Quality checks
  • Index creation

Processing Time:

  • Depends on dataset size
  • Depends on data complexity
  • Can range from minutes to hours
  • Progress can be monitored in real time

Viewing Dataset Data

View and explore dataset contents:

Viewing Options:

  • Dataset Overview: Basic dataset information
  • Data Preview: Preview of dataset rows
  • Feature Details: Detailed feature information
  • Statistics: Dataset statistics and metrics

Data Preview:

  • View sample rows
  • See feature values
  • Check data types
  • Verify data quality

Feature Details:

  • Feature names and types
  • Feature statistics
  • Value distributions
  • Missing value information
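
To make the feature details concrete, here is a rough sketch of how such a summary could be computed from a handful of preview rows. It is an illustration only: the row format (a list of dictionaries), the `summarize_feature` helper, and the sample data are all assumptions, and the platform's own statistics may be computed differently.

```python
from collections import Counter
from typing import Any


def summarize_feature(rows: list[dict[str, Any]], name: str) -> dict[str, Any]:
    """Summarize one feature across sample rows: types, missing count, value counts."""
    # A value counts as missing when the key is absent or the value is None.
    values = [row.get(name) for row in rows]
    present = [v for v in values if v is not None]
    summary: dict[str, Any] = {
        "name": name,
        "types": sorted({type(v).__name__ for v in present}),
        "missing": len(values) - len(present),
        # Three most frequent values (values must be hashable).
        "distribution": Counter(present).most_common(3),
    }
    numeric = [v for v in present if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = sum(numeric) / len(numeric)
    return summary


# Made-up preview rows for illustration.
sample_rows = [
    {"age": 34, "plan": "basic"},
    {"age": None, "plan": "pro"},
    {"age": 51, "plan": "basic"},
    {"plan": "basic"},
]
age_summary = summarize_feature(sample_rows, "age")
plan_summary = summarize_feature(sample_rows, "plan")
```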

Managing Multiple Datasets

Manage multiple datasets efficiently:

Dataset List:

  • View all datasets
  • Filter by status, type, or one or more domains
  • Search by name
  • Sort by various criteria
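
The filter, search, and sort behavior of the list can be sketched as follows. This is a hedged illustration, not the product's implementation: the `DatasetInfo` record, its field names, and the sample catalog are hypothetical, and the rule that a dataset matches when it has any of the selected domains is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class DatasetInfo:
    # Hypothetical record; field names are illustrative, not the platform's schema.
    name: str
    status: str
    type: str
    domains: set[str] = field(default_factory=set)


def filter_datasets(
    datasets: Iterable[DatasetInfo],
    status: Optional[str] = None,
    domains: Optional[Iterable[str]] = None,
    type_: Optional[str] = None,
    name_contains: Optional[str] = None,
) -> list[DatasetInfo]:
    """Return datasets matching every supplied criterion, sorted by name."""
    wanted = set(domains or ())
    matches = []
    for ds in datasets:
        if status is not None and ds.status != status:
            continue
        if type_ is not None and ds.type != type_:
            continue
        # Assumption: a dataset matches if it has ANY of the selected domains.
        if wanted and not (wanted & ds.domains):
            continue
        if name_contains and name_contains.lower() not in ds.name.lower():
            continue
        matches.append(ds)
    return sorted(matches, key=lambda ds: ds.name.lower())


# Made-up catalog for illustration.
catalog = [
    DatasetInfo("Churn 2024", "READY", "tabular", {"sales", "retention"}),
    DatasetInfo("Web logs", "PROCESSING", "tabular", {"ops"}),
    DatasetInfo("churn archive", "READY", "tabular", {"retention"}),
]
ready_retention = filter_datasets(catalog, status="READY", domains={"retention", "ops"})
names = [ds.name for ds in ready_retention]
```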

Bulk Operations:

  • Select multiple datasets
  • Delete multiple datasets
  • Reprocess multiple datasets
  • Export dataset information

Organization:

  • Use descriptive names
  • Add descriptions
  • Assign one or more domains to each dataset to keep related datasets together
  • Tag datasets appropriately
  • Maintain consistent domain naming conventions

Deleting Datasets

Delete datasets that are no longer needed:

Deletion Process:

  1. Navigate to Dataset: Open the dataset's details page
  2. Click Delete: Start deletion process
  3. Confirm Deletion: Confirm you want to delete
  4. Wait for Completion: Wait for deletion to complete

Deletion Considerations:

  • In Use: Cannot delete datasets used in active trainings
  • Dependencies: Check for dependencies before deletion
  • Permanent: Deletion is permanent and cannot be undone
  • Data Files: Data files may be retained depending on configuration

Safe Deletion:

  • Check for active uses
  • Verify no dependencies
  • Back up the data if needed
  • Confirm deletion
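
The safe-deletion checklist can be expressed as a pre-check that collects every reason a deletion should not proceed. The sketch below is illustrative only: `check_deletion`, `DeletionCheck`, and the shape of the inputs (a set of dataset names used by active trainings, a mapping from dataset name to dependent items) are assumptions, not part of the product.

```python
from dataclasses import dataclass, field


@dataclass
class DeletionCheck:
    dataset: str
    blockers: list[str] = field(default_factory=list)

    @property
    def safe(self) -> bool:
        """True when no blocker was found."""
        return not self.blockers


def check_deletion(
    dataset: str,
    datasets_in_active_trainings: set[str],
    dependents: dict[str, list[str]],
    backup_required: bool = False,
    backed_up: bool = False,
) -> DeletionCheck:
    """Collect the reasons, if any, that deleting this dataset would be unsafe."""
    check = DeletionCheck(dataset)
    if dataset in datasets_in_active_trainings:
        check.blockers.append("used in an active training")
    for item in dependents.get(dataset, []):
        check.blockers.append(f"dependency: {item}")
    if backup_required and not backed_up:
        check.blockers.append("backup required but not taken")
    return check


# Made-up inputs for illustration.
in_use = {"Churn 2024"}
deps = {"Web logs": ["dashboard-export"]}
blocked = check_deletion("Churn 2024", in_use, deps)
clear = check_deletion("churn archive", in_use, deps, backup_required=True, backed_up=True)
```

Only when `safe` is true would the final confirmation step make sense, since deletion cannot be undone.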

Next Steps