Train an Object Detection Model
================================

.. role:: guimenu
.. role:: guiaction
.. role:: guioption

Follow this guide to learn how to prepare and, if necessary, train an object detection model.

Neura offers two ways to train an object detection model: using :ref:`AI Hub ` or the :ref:`AI Development Arena `. The latter is significantly faster and more user-friendly. The two use different object detection methods.

.. contents::
   :local:
   :depth: 2
   :backlinks: none
   :class: toc

.. _using-the-train-tab-method-neura-dlis1:

Using the AI Hub - Method: neura_DLIS1 or neura_DLIS3
-----------------------------------------------------

This section shows how to train an object detection model using **AI Hub**.

This tutorial assumes that you have at least one registered dataset, possibly generated using :ref:`AI Hub: Data Generation `.

.. note:: Find detailed information about the training process and a description of the training parameters :ref:`here. `

Step 1: Train a model and test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Using an external PC, access Maira GUI at `http://192.168.2.14:8080 `_ and switch to :strong:`AI Hub Mode`.
2. Click :guimenu:`TRAIN`.
3. Select :guioption:`Object Detection` as the **Type of Training**.
4. Select :guioption:`neura_DLIS1` or :guioption:`neura_DLIS3` as the **Method**.
5. Select one or more datasets from the **Dataset list** dropdown, together with their corresponding dataset types. Provide a suitable name for the model and define the number of iterations.
6. Click :guimenu:`Start Train`. The progress bar should gradually fill up.
7. Once the training is complete, a message indicates that the process has finished.
8. Successfully trained models are listed under :guimenu:`DATA MANAGEMENT` > :guimenu:`Trained Segmentation Models`. Use the database search to find an entry by its assigned name.

.. attention:: Don't forget to test your model after training. See :ref:`Model Testing` for more information.

.. _using-the-object-onboarding-wizard-method-neura-dlis2:

Using the AI Development Arena Object Onboarding Wizard - Method: neura_DLIS2
-----------------------------------------------------------------------------

This tutorial assumes that you have completed the tutorial about collecting real data (:ref:`Tutorial: Collect Real Data `) and saved a dataset with the type **OBJECT ONBOARDING**.

.. hint:: What is the :ref:`Object Onboarding Wizard `?

.. note::

   As of now, if you capture images using an external camera, you need to do the following to make the images available for Object Onboarding (to train a **neura_DLIS2** model).

   1. Ensure that each image file name starts with *rgb_* followed by a digit and ends with a **.png**, **.jpg** or **.jpeg** extension (e.g., *rgb_1.jpg*).
   2. Store the image files in a folder whose name matches the object name.
   3. Copy the folder to a USB drive and use the robot's **File Transfer** utility to transfer the folder to the AI server location: *object_perception/objects/object_templates*.

1. Using an external PC, access Maira GUI at `http://192.168.2.14:8080 `_ and switch to :strong:`AI Hub Mode`.
2. Click :guioption:`AI Dev Arena`. A new tab with `AI Dev Arena `_ opens.
3. Click :guioption:`Object Onboarding Wizard` to open the wizard. A new tab opens.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena06.png
      :align: center
      :class: padded-image

Step 1: Generating masks for the new object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4. Click :guioption:`Object Onboarding` and then :guioption:`Load Images`. Navigate to the *object_templates* directory and select the folder whose name matches the one you assigned when saving the dataset. Click :guioption:`OK`.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena07.png
      :align: center
      :class: padded-image

5. The first image is displayed together with a crosshair cursor. Draw a bounding box around the object:

   * Position the crosshair so that your object of interest sits tightly in (tangentially touching) the lower right quadrant. **Left-click** once.
   * Position the crosshair so that your object of interest sits tightly in (tangentially touching) the upper left quadrant. **Left-click** once. A complete bounding box is formed around the object.
   * Click :guioption:`Copy Box` to apply the relative position of the bounding box to all images.
   * Click through the images to review the bounding boxes. If an adjustment is needed, simply redraw the box.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena08.png
      :align: center
      :class: padded-image

   .. attention:: You only need to annotate one object per image. If an image contains multiple candidates that could be annotated, pick the one that is least occluded.

   .. note:: The bounding box does not need to be perfectly aligned. The majority of the object should be inside the box, but parts of it can extend beyond the boundary.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena09.png
      :align: center
      :class: padded-image

6. Click :guioption:`Generate Masks` to automatically generate red masks around your object's boundaries.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena10.png
      :align: center
      :class: padded-image

7. Click :guioption:`Save Masks` to save the mask templates. Click the :guioption:`Home` icon to return to the front page of the wizard.

Step 2: Train a model from the generated masks and test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

8. Click :guioption:`Inference` to create a model from the generated masks and to perform a sample inference. Click :guioption:`Load Images` and select the images from the file system to use for the sample inference. Click :guioption:`OK`.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena11.png
      :align: center
      :class: padded-image

   .. tip:: You could use an image from the *object_templates/*. If you would like to test the model on an unseen image captured using an external camera, save the image to a USB drive and use the Robot Mode's **File Manager** utility to transfer the image to *object_perception/objects/test_images*.

9. For each image you have selected, you can choose from the following prompt types to experiment with and evaluate their performance.

   .. list-table::
      :widths: 10 40
      :header-rows: 1

      * - Prompt Type
        - Description
      * - Visual
        - Primary detection mode for the *neura_DLIS2* method. Detects only the selected, onboarded objects.
      * - Everything
        - Detects all objects in the image.
      * - Text
        - Detects all objects whose attributes most closely match the input text prompt.

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena12.png
      :align: center
      :class: padded-image

   .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena14.png
      :align: center
      :class: padded-image

10. There are also several ``Advanced parameters`` that can be shown and configured.

    .. list-table::
       :widths: 10 40
       :header-rows: 1

       * - Parameter
         - Description
       * - detection threshold
         - The confidence score required for an object to be considered detected. A higher threshold means only high-confidence detections are retained, while a lower threshold allows more potential detections, including uncertain ones.
       * - visual similarity threshold
         - Determines how closely a detected object must match a reference object in terms of appearance. A high threshold ensures only highly similar objects are detected, while a lower threshold allows more variation.
       * - mask threshold min
         - The minimum threshold for a pixel to be considered part of the detected object's segmentation mask. A higher value results in a stricter, more refined mask, while a lower value captures more surrounding pixels.
       * - mask threshold max
         - The upper limit for the mask threshold, ensuring that pixels beyond a certain confidence level are included in the object’s segmentation.
       * - iou
         - Intersection over Union (IoU), a metric measuring the overlap between the predicted bounding box (or mask) and the ground truth. A higher IoU value means stricter matching, ensuring detected objects closely align with their actual shapes.

    .. figure:: /_static/ai_hub/ai_dev_arena/ai_dev_arena13.png
       :align: center
       :class: padded-image

11. After specifying the inference parameters for every image, click :guioption:`Predict`. The first time a new or updated model is loaded, the prediction may take a few extra seconds to complete. Subsequent predictions take about 0.5 seconds on average.

.. note:: If the inference results are not satisfactory, tune the ``Advanced Parameters``
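
To build intuition for the ``detection threshold`` and ``iou`` parameters, the following generic Python sketch shows how a confidence cutoff and an Intersection over Union value are commonly computed. It is an illustration only and not the wizard's actual implementation: the box layout ``(x_min, y_min, x_max, y_max)``, the function names ``iou`` and ``filter_by_score``, and the sample values are all assumptions made for this example.

.. code-block:: python

   from typing import List, Tuple

   # Hypothetical box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
   Box = Tuple[float, float, float, float]
   # Hypothetical detection record: a box plus its confidence score in [0, 1].
   Detection = Tuple[Box, float]


   def iou(a: Box, b: Box) -> float:
       """Return the Intersection over Union of two axis-aligned boxes."""
       # Corners of the overlapping rectangle (if the boxes overlap at all).
       ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
       ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
       # Clamp to zero so disjoint boxes yield an empty intersection.
       inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
       area_a = (a[2] - a[0]) * (a[3] - a[1])
       area_b = (b[2] - b[0]) * (b[3] - b[1])
       union = area_a + area_b - inter
       return inter / union if union > 0 else 0.0


   def filter_by_score(detections: List[Detection], threshold: float) -> List[Detection]:
       """Keep only detections whose confidence is at least the threshold."""
       return [d for d in detections if d[1] >= threshold]


   # Made-up sample detections, purely for illustration.
   sample = [
       ((0.0, 0.0, 10.0, 10.0), 0.9),
       ((5.0, 0.0, 15.0, 10.0), 0.4),
       ((20.0, 20.0, 30.0, 30.0), 0.75),
   ]

   # A higher threshold keeps fewer, more confident detections.
   kept_low = filter_by_score(sample, 0.5)
   kept_high = filter_by_score(sample, 0.8)

   # Two boxes that overlap over half their width share one third of their union.
   overlap = iou(sample[0][0], sample[1][0])

In this sketch, raising the threshold from 0.5 to 0.8 drops one further detection, which mirrors the trade-off described in the parameter table: stricter values give fewer but more reliable results.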