Dataset Operations
Learn how to process datasets, view data, manage multiple datasets, and handle deletions.
Processing a Dataset
Process a dataset to make it ready for training:
Processing Steps:
- Navigate to Dataset: Open the dataset details page
- Complete Configuration: Make sure all configurations are complete
- Click "Process": Start the processing job
- Monitor Progress: Track the job's progress while it runs
- Wait for Completion: Let the job finish before using the dataset
- Verify Status: Confirm the dataset status is READY
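The monitor, wait, and verify steps above can also be scripted. The following is a minimal sketch, not the product's API: `get_status` is a hypothetical callable standing in for however you read the dataset status, and `FAILED` is an assumed terminal state (only READY is named in this guide).

```python
import time
from typing import Callable

# Assumption: READY comes from this guide; FAILED is a hypothetical failure state.
TERMINAL_STATES = {"READY", "FAILED"}


def wait_until_ready(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal state is reached, then return it.

    Raises TimeoutError if no terminal state is seen within timeout_seconds
    of accumulated polling delay.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"dataset still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated status sequence; replace the lambda with a real status lookup.
    simulated = iter(["PROCESSING", "PROCESSING", "READY"])
    final = wait_until_ready(lambda: next(simulated), poll_seconds=0)
    print("final status:", final)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.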
Processing Activities:
- Data validation
- Feature extraction
- Data transformation
- Partition creation
- Quality checks
- Index creation
Processing Time:
- Depends on dataset size
- Depends on data complexity
- Can take anywhere from a few minutes to several hours
- Monitor progress in real time
Viewing Dataset Data
View and explore dataset contents:
Viewing Options:
- Dataset Overview: Basic dataset information
- Data Preview: Preview of dataset rows
- Feature Details: Detailed feature information
- Statistics: Dataset statistics and metrics
Data Preview:
- View sample rows
- See feature values
- Check data types
- Verify data quality
Feature Details:
- Feature names and types
- Feature statistics
- Value distributions
- Missing value information
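To make the feature details listed above concrete, here is a minimal sketch of how such a summary could be computed for one feature. It assumes rows are available locally as a list of dictionaries and that a missing value is represented by `None` or an absent key; the function name and output keys are hypothetical and do not reflect how the product computes its statistics.

```python
from collections import Counter
from typing import Any


def summarize_feature(rows: list[dict[str, Any]], name: str) -> dict[str, Any]:
    """Summarize one feature: observed types, missing count, top values, numeric stats."""
    values = [row.get(name) for row in rows]          # absent keys become None
    present = [v for v in values if v is not None]    # None is treated as missing

    summary: dict[str, Any] = {
        "name": name,
        "types": sorted({type(v).__name__ for v in present}),
        "missing": len(values) - len(present),
        # Three most frequent values, as (value, count) pairs.
        "distribution": Counter(present).most_common(3),
    }

    # Numeric statistics only apply to int/float values (bool is excluded).
    numeric = [v for v in present if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = sum(numeric) / len(numeric)
    return summary


if __name__ == "__main__":
    sample = [{"age": 30}, {"age": None}, {"age": 40}, {}]
    print(summarize_feature(sample, "age"))
```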
Managing Multiple Datasets
Manage multiple datasets efficiently:
Dataset List:
- View all datasets
- Filter by status, domain, or type (several domains can be selected at once)
- Search by name
- Sort by various criteria
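The list operations above (filter, search, sort) can be pictured with a small in-memory sketch. The `Dataset` record and `filter_datasets` function are hypothetical. The sketch also assumes that selecting several domains matches datasets belonging to any of them, which this guide does not specify.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Dataset:
    """Hypothetical local record of a dataset shown in the list."""
    name: str
    status: str
    domains: set[str] = field(default_factory=set)


def filter_datasets(
    datasets: Iterable[Dataset],
    status: Optional[str] = None,
    domains: Optional[Iterable[str]] = None,
    name_contains: Optional[str] = None,
) -> list[Dataset]:
    """Return datasets matching every given criterion, sorted by name."""
    wanted = set(domains) if domains else set()
    result = []
    for ds in datasets:
        if status is not None and ds.status != status:
            continue
        # Assumption: a dataset matches if it has ANY of the selected domains.
        if wanted and not (ds.domains & wanted):
            continue
        # Case-insensitive substring search on the name.
        if name_contains and name_contains.lower() not in ds.name.lower():
            continue
        result.append(ds)
    return sorted(result, key=lambda ds: ds.name.lower())


if __name__ == "__main__":
    catalog = [
        Dataset("Sales 2023", "READY", {"finance", "retail"}),
        Dataset("Churn Survey", "PROCESSING", {"marketing"}),
        Dataset("Invoices", "READY", {"finance"}),
    ]
    for ds in filter_datasets(catalog, status="READY", domains=["finance", "marketing"]):
        print(ds.name)
```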
Bulk Operations:
- Select multiple datasets
- Delete multiple datasets
- Reprocess multiple datasets
- Export dataset information
Organization:
- Use descriptive names
- Add descriptions
- Assign one or more domains to each dataset (a dataset can belong to several domains)
- Tag datasets appropriately
- Maintain consistent domain naming conventions
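One way to keep domain names consistent is to normalize them before assigning them. The sketch below illustrates one possible convention (lowercase words joined by hyphens); the convention and the `normalize_domain` helper are assumptions for illustration, not rules enforced by the product.

```python
import re


def normalize_domain(name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumeric characters into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower())
    return slug.strip("-")


if __name__ == "__main__":
    for raw in ["Customer Support", "customer_support", "  NLP "]:
        print(repr(raw), "->", normalize_domain(raw))
```

With this convention, "Customer Support" and "customer_support" map to the same domain name, which avoids near-duplicate domains in filters.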
Deleting Datasets
Delete datasets that are no longer needed:
Deletion Process:
- Navigate to Dataset: Open the dataset you want to remove
- Click Delete: Start the deletion process
- Confirm Deletion: Confirm that you want to delete the dataset
- Wait for Completion: Wait for the deletion to finish
Deletion Considerations:
- In Use: Datasets used in active trainings cannot be deleted
- Dependencies: Check for dependencies before deleting
- Permanent: Deletion is permanent and cannot be undone
- Data Files: Data files may be retained depending on configuration
Safe Deletion:
- Check for active trainings that use the dataset
- Verify that nothing else depends on it
- Back up the data if needed
- Confirm deletion
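The checks above can be gathered into a single pre-deletion gate. This is a minimal sketch under stated assumptions: `DeletionCheck` and `deletion_blockers` are hypothetical names, and the lists of active trainings and dependencies are assumed to have been collected by hand or by your own tooling.

```python
from dataclasses import dataclass, field


@dataclass
class DeletionCheck:
    """Hypothetical record of what was found before deleting a dataset."""
    active_trainings: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)
    backed_up: bool = False
    confirmed: bool = False


def deletion_blockers(check: DeletionCheck, require_backup: bool = False) -> list[str]:
    """Return the reasons deletion should not proceed; an empty list means it is safe."""
    blockers = []
    if check.active_trainings:
        blockers.append("used in active trainings: " + ", ".join(check.active_trainings))
    if check.dependencies:
        blockers.append("has dependencies: " + ", ".join(check.dependencies))
    if require_backup and not check.backed_up:
        blockers.append("backup required but not done")
    if not check.confirmed:
        blockers.append("deletion not confirmed")
    return blockers


if __name__ == "__main__":
    check = DeletionCheck(active_trainings=["training-42"], confirmed=True)
    for reason in deletion_blockers(check):
        print("blocked:", reason)
```

Returning the full list of reasons, rather than stopping at the first one, lets you resolve every blocker in one pass. This matters because deletion is permanent.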
Next Steps
- Check the Usage Guide for best practices
- Review Core Concepts to understand dataset types