Generate AI Training Data
Generate synthetic vision data for AI training
Creating large, diverse datasets is a crucial step in training AI models, especially for tasks like image recognition, object detection, and classification. However, collecting and labeling real-world data can be time-consuming, costly, and sometimes impractical. This is where synthetic AI training data comes in as a powerful alternative.
Synthetic AI training data involves generating artificial data that mimics real-world conditions. Using simulation environments like Unity and tools such as realvirtual.io AIBuilder, developers can create vast amounts of synthetic data by varying parameters such as object position, lighting, textures, and camera angles. This process helps generate robust datasets that improve AI model performance without the need for large, manually collected datasets.
Benefits of Synthetic AI Training Data
Cost-effective: Eliminates the need for expensive data collection processes and reduces manual labeling time.
Control over variables: Allows developers to adjust conditions like lighting, camera perspective, and object configurations to cover a wide range of scenarios.
Faster iterations: Enables rapid creation of diverse training sets, leading to quicker AI model training cycles.
Enhanced robustness: By simulating edge cases and varying environments, AI models trained on synthetic data tend to generalize better to real-world conditions.
In realvirtual.io AIBuilder, the Variators are responsible for dynamically adjusting various parameters during the training process, helping generate a wide variety of synthetic data automatically. This method allows AI models to learn from a range of inputs and ensures the model performs well under different conditions.
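To make this concrete, the following C# sketch shows how a set of variators might drive a snapshot loop in Unity. It is a minimal illustration of the pattern only: the Variator base class, SyntheticDataLoop, and all member names are assumptions, not the actual AIBuilder API.

```csharp
using UnityEngine;

// Conceptual sketch only: class and member names are illustrative
// assumptions, not the actual realvirtual.io AIBuilder API.
public abstract class Variator : MonoBehaviour
{
    public bool applyOnInit = true;      // randomize once when recording starts
    public bool applyOnSnapshot = true;  // randomize again before each snapshot
    public abstract void Apply();        // change one aspect of the scene
}

public class SyntheticDataLoop : MonoBehaviour
{
    public Variator[] variators;  // e.g. sky, light, object, transform variators
    public int samples = 500;     // number of images to generate

    public void Generate()
    {
        // Apply-on-init variations happen once, before the first snapshot
        foreach (var v in variators)
            if (v.applyOnInit) v.Apply();

        for (int i = 0; i < samples; i++)
        {
            // Apply-on-snapshot variations happen before every capture
            foreach (var v in variators)
                if (v.applyOnSnapshot) v.Apply();

            // ...render the AI Camera here and write image + label to disk
        }
    }
}
```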
Components for Training
The AIBuilder Demo scene demonstrates how to create synthetic AI training data and train object detection models using the realvirtual.io AIBuilder framework. This demo is designed to showcase key functionalities like camera placement, data recording, and AI model training, using a conveyor belt and sorting system with different colored Lego bricks.
AI Camera
The AI Camera is a vital component in the synthetic training process. It allows you to position the camera within the scene to capture images from different angles, lighting conditions, and perspectives.
The camera's primary role is to record synthetic data, which is then used to train the AI model. In the Inspector window, set the Mode of the AI Camera to Training to generate data.
Once set up, the AI Camera continuously captures images of the objects (e.g., Lego bricks) on the conveyor belt, which will be used to train the AI to recognize and sort these objects.
The AI Camera is straightforward to use, requiring no additional configuration. Simply place it in the desired position where synthetic images should be generated.
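Under the hood, capturing an off-screen image from a Unity camera follows a standard pattern built entirely on stock Unity APIs (RenderTexture, Texture2D.ReadPixels, EncodeToPNG). The sketch below shows that generic pattern; AIBuilder's AI Camera additionally produces the matching labels, which is omitted here.

```csharp
using System.IO;
using UnityEngine;

// Generic Unity snapshot pattern (standard APIs only); AIBuilder's
// AI Camera wraps this kind of capture and adds labeling on top.
public static class CameraSnapshot
{
    public static void SaveToPng(Camera cam, int width, int height, string path)
    {
        var rt = new RenderTexture(width, height, 24);
        cam.targetTexture = rt;
        cam.Render();                      // render one frame off-screen

        RenderTexture.active = rt;
        var tex = new Texture2D(width, height, TextureFormat.RGB24, false);
        tex.ReadPixels(new Rect(0, 0, width, height), 0, 0);  // GPU -> CPU
        tex.Apply();

        File.WriteAllBytes(path, tex.EncodeToPNG());

        cam.targetTexture = null;          // restore normal rendering
        RenderTexture.active = null;
        Object.Destroy(rt);
        Object.Destroy(tex);
    }
}
```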
AIBuilder
The AIBuilder component manages the entire process of AI training and data recording.
The AIBuilder must be set to the desired mode, depending on the current stage of your workflow:
Training Mode: In this mode, synthetic training data can be generated using the AI Camera. Once enough data has been recorded, you can start the training process based on the generated data. This mode is used to prepare and train the AI model for object detection.
Detection Mode: In detection mode, the AI performs object detection based on the previously trained model. The detection is still done using the AI Camera (not a real one), making it ideal for testing and developing the entire AI-based process within a fully virtual digital twin. This allows you to test the detection capabilities and refine the AI integration with PLC and robotics interfaces before deploying it to a physical system. See section Testing AI in a Digital Twin.
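Conceptually, the two modes switch the same scene between producing training data and consuming the trained model. A minimal sketch of that switch follows; the enum and method names are assumptions, not the AIBuilder API.

```csharp
// Illustrative two-mode workflow; names are assumptions, not the
// actual AIBuilder API.
public enum AIBuilderMode { Training, Detection }

public class ModeSwitchSketch
{
    public AIBuilderMode Mode = AIBuilderMode.Training;

    public void Step()
    {
        if (Mode == AIBuilderMode.Training)
        {
            // Record labeled synthetic images, then train a model on them.
        }
        else
        {
            // Run the trained model on the AI Camera feed and pass the
            // detections on to the digital twin's PLC / robot interfaces.
        }
    }
}
```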
Training Data Recorder
The Data Recorder, located under "Step 1 - Data Recorder" in the AIBuilderDemo scene, is responsible for capturing synthetic data that will later be used for training AI models. This component collects labeled data from the scene and saves it into a designated folder.
In the scene, the Data Recorder works in combination with the AI Camera and the variators (such as the Sky and Light Variators) to create a wide variety of training data. You can configure key parameters like timing, sample count, and validation ratio to control how the synthetic data is generated and stored.
In the Inspector panel of the Training Data Recorder, you can:
Set Labels for the objects being captured (e.g., Lego brick types).
Adjust the Timing settings, including Time Scale and Delay between captures.
Specify the number of Samples to collect and the Validation Ratio for splitting the dataset into training and validation sets (see the sketch below).
Choose the folder to Export the data to, ensuring the synthetic data is saved for later training use.
Variators such as the Sky and Light Variator are also present to introduce variation in lighting conditions, ensuring the AI becomes robust across different environments.
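As an example of how the Validation Ratio behaves: with 500 samples and a ratio of 0.2, roughly 400 images end up in the training set and 100 in the validation set. A minimal sketch of such a split follows; the train/val folder layout is an assumption, not necessarily how AIBuilder organizes its export.

```csharp
using UnityEngine;

// Hypothetical split helper: routes each recorded sample to a training
// or validation folder according to the configured ratio.
public static class DatasetSplitSketch
{
    public static string FolderFor(float validationRatio)
    {
        // With a ratio of 0.2, about 20% of the samples land in validation
        bool isValidation = Random.value < validationRatio;
        return isValidation ? "dataset/val" : "dataset/train";
    }
}
```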
Starting the Training Process
To begin the AI training process:
Open the scene in Unity and ensure that the Data Recorder is properly configured with the labels, timing, and samples as needed.
Start the scene by entering Play Mode.
In the Inspector panel under the Data Recorder, click on Start Recording.
Once started, the Data Recorder will begin capturing synthetic data according to the defined settings. The images and their corresponding labels will be saved in the specified folder for later use in training the AI model.
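For background, Inspector buttons like Start Recording are typically implemented in Unity with a custom editor. The sketch below shows that generic pattern with hypothetical RecorderSketch names; it is not AIBuilder's actual editor code.

```csharp
using UnityEditor;
using UnityEngine;

// Hypothetical recorder stub standing in for the Training Data Recorder.
public class RecorderSketch : MonoBehaviour
{
    public int samples = 500;
    public void StartRecording() { /* begin the snapshot loop here */ }
}

// Generic Unity pattern behind an Inspector button such as Start Recording.
// In a real project this class goes in its own file inside an Editor folder.
[CustomEditor(typeof(RecorderSketch))]
public class RecorderSketchEditor : Editor
{
    public override void OnInspectorGUI()
    {
        DrawDefaultInspector();               // labels, timing, samples, ...

        GUI.enabled = Application.isPlaying;  // recording requires Play Mode
        if (GUILayout.Button("Start Recording"))
            ((RecorderSketch)target).StartRecording();
        GUI.enabled = true;
    }
}
```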
Variators
Sky Variator
The Sky Variator is responsible for varying the sky conditions during the synthetic data recording process. By changing the sky settings, the AI model is exposed to different lighting conditions, making it more robust to real-world variability. In the Inspector, you can configure the following options:
Apply On Init: Enables the sky variation when the recording starts.
Apply On Snapshot: Applies the sky variation each time a snapshot of synthetic data is recorded.
Step: Defines the number of different sky conditions that will be applied during the recording.
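A sky variator can be sketched on top of the illustrative Variator base class from earlier, for example by cycling through a set of skybox materials; this is an assumption about the implementation, not AIBuilder's actual code.

```csharp
using UnityEngine;

// Illustrative sky variator: steps through skybox materials, one per
// configured sky condition. Builds on the Variator sketch above.
public class SkyVariatorSketch : Variator
{
    public Material[] skyConditions;  // one skybox material per condition
    int current;

    public override void Apply()
    {
        if (skyConditions.Length == 0) return;
        RenderSettings.skybox = skyConditions[current];
        current = (current + 1) % skyConditions.Length;  // next condition
        DynamicGI.UpdateEnvironment();  // refresh ambient light to match
    }
}
```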
Light Variator
The Light Variator is responsible for adjusting the lighting conditions during data recording, helping to expose the AI to various lighting intensities and temperatures. In the Inspector, you can configure the following options:
Apply On Init: Enables light variation at the start of the recording.
Apply On Snapshot: Applies light variation during each snapshot.
Min/Max Intensity: Defines the range of light intensity that will be applied.
Min/Max Temperature: Defines the range of light temperature (from cooler to warmer tones) to be used.
Light: Specifies the light source to be varied, such as the sun in the scene.
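Building again on the illustrative Variator base class, a light variator might randomize intensity and color temperature within the configured ranges using standard Unity Light properties; the class itself is a sketch, not AIBuilder's implementation.

```csharp
using UnityEngine;

// Illustrative light variator mirroring the Inspector fields above.
public class LightVariatorSketch : Variator
{
    public Light targetLight;  // the light source to vary, e.g. the sun
    public float minIntensity = 0.5f, maxIntensity = 2f;
    public float minTemperature = 4000f, maxTemperature = 8000f;  // Kelvin

    public override void Apply()
    {
        targetLight.intensity = Random.Range(minIntensity, maxIntensity);
        targetLight.useColorTemperature = true;  // Kelvin-based color mode
        targetLight.colorTemperature =
            Random.Range(minTemperature, maxTemperature);
    }
}
```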
Object Variator
Apply On Init: This option is unchecked, meaning the variation of objects will not happen immediately when the scene starts.
Apply On Snapshot: This option is unchecked, indicating that variations won’t automatically occur when snapshots are taken.
Objects: There are four types of objects configured for variation:
2x2 Brick
3x2 Brick
4x2 Brick
6x2 Brick
Weights: Each object has a weight of 0.25, giving all four types an equal chance of being selected during the variation process.
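The weighted selection itself can be sketched as a cumulative walk over the weight list. Again this builds on the illustrative Variator base class; the spawn logic is a placeholder, not AIBuilder's code.

```csharp
using UnityEngine;

// Illustrative object variator: picks one prefab according to its weight.
public class ObjectVariatorSketch : Variator
{
    public GameObject[] objects;  // e.g. 2x2, 3x2, 4x2, 6x2 brick prefabs
    public float[] weights;       // e.g. { 0.25f, 0.25f, 0.25f, 0.25f }

    public override void Apply()
    {
        float total = 0f;
        foreach (float w in weights) total += w;

        // Walk the cumulative weights until the random sample falls inside one
        float sample = Random.Range(0f, total);
        for (int i = 0; i < objects.Length; i++)
        {
            if (sample < weights[i])
            {
                // Placeholder spawn; real placement on the conveyor omitted
                Instantiate(objects[i], transform.position, Quaternion.identity);
                return;
            }
            sample -= weights[i];
        }
    }
}
```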
Transform Variator
Apply On Init: Checked, meaning that variations in position, rotation, and scale will be applied when the scene is initialized.
Apply On Snapshot: Unchecked, so variations will not happen automatically with each snapshot.
Position Variation: The position of the object can vary between -0.5 and 0.5 units on each of the X, Y, and Z axes.
Rotation Variation: The object’s rotation can vary by up to 360 degrees along the X, Y, and Z axes.
Scale Variation: The object’s scale remains fixed (no scaling variation), as indicated by the zero values for scale variation.
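Using the demo's ranges (position offsets within ±0.5 units, rotations up to 360 degrees, no scaling), a transform variator can be sketched on the illustrative Variator base class like this; field names and placement logic are assumptions.

```csharp
using UnityEngine;

// Illustrative transform variator: randomizes position and rotation
// within configured ranges and leaves scale untouched.
public class TransformVariatorSketch : Variator
{
    public Transform target;
    public Vector3 positionVariation = new Vector3(0.5f, 0.5f, 0.5f);
    public Vector3 rotationVariation = new Vector3(360f, 360f, 360f);
    Vector3 basePosition;

    void Awake() => basePosition = target.position;

    public override void Apply()
    {
        // Offset each axis by a random amount within +/- the variation range
        target.position = basePosition + new Vector3(
            Random.Range(-positionVariation.x, positionVariation.x),
            Random.Range(-positionVariation.y, positionVariation.y),
            Random.Range(-positionVariation.z, positionVariation.z));

        // Random rotation up to the configured range on each axis
        target.rotation = Quaternion.Euler(
            Random.Range(0f, rotationVariation.x),
            Random.Range(0f, rotationVariation.y),
            Random.Range(0f, rotationVariation.z));

        // Scale is left unchanged (scale variation is zero in the demo)
    }
}
```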