Data Flow

Data flow defines how data is passed between two isolated modules, e.g. the dataloader and the model.

In MMSelfSup, we mainly focus on the data flow between the dataloader and the model, and between the model and the visualizer. For the data flow between the model and the metric, please refer to the docs of other repositories, e.g. MMClassification. For the data flow between the model and the visualizer, you can refer to the visualization docs.

Data flow between dataloader and model

The data flow between the dataloader and the model can generally be split into three parts: i) PackSelfSupInputs packs the data from the previous transformations into a dictionary; ii) collate_fn stacks lists of tensors into batched tensors; iii) the data preprocessor moves all data to the target device, e.g. GPUs, and unpacks the dictionary from the dataloader into a tuple containing the input images and the meta info (SelfSupDataSample).

Data from dataset

In MMSelfSup, before being fed into the model, data goes through a series of transformations, called a pipeline, e.g. RandomResizedCrop and ColorJitter. No matter how many transformations are in the pipeline, the last one is always PackSelfSupInputs, which packs the data from the previous transformations into a dictionary. The dictionary contains two parts, namely inputs and data_samples.

# We omit some unimportant code here
from typing import Dict, List, Union

import torch
from mmcv.transforms import BaseTransform

from mmselfsup.structures import SelfSupDataSample


class PackSelfSupInputs(BaseTransform):

    def transform(
        self, results: Dict
    ) -> Dict[str, Union[List[torch.Tensor], SelfSupDataSample]]:

        packed_results = dict()
        if self.key in results:
            ...
            packed_results['inputs'] = img

        ...
        packed_results['data_samples'] = data_sample

        return packed_results

Note: inputs contains a list of images, e.g. the multiple views in contrastive learning. Even for a single view, PackSelfSupInputs still puts it into a list.
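The following sketch illustrates the shape of the packed dictionary. The function `pack_selfsup_inputs` is a hypothetical stand-in for `PackSelfSupInputs.transform`, not the actual MMSelfSup implementation; it only shows that `inputs` is always a list, even for a single view.

```python
import torch


def pack_selfsup_inputs(views, meta):
    """Hypothetical stand-in for PackSelfSupInputs.transform."""
    # `inputs` is always a list of image tensors, even for one view.
    if not isinstance(views, list):
        views = [views]
    return {'inputs': views, 'data_samples': meta}


# Two views per image, as in contrastive learning (e.g. SimCLR).
two_views = [torch.rand(3, 224, 224), torch.rand(3, 224, 224)]
packed = pack_selfsup_inputs(two_views, meta={'img_shape': (224, 224)})
assert isinstance(packed['inputs'], list) and len(packed['inputs']) == 2

# A single view is still wrapped into a list.
single = pack_selfsup_inputs(torch.rand(3, 224, 224), meta={})
assert isinstance(single['inputs'], list) and len(single['inputs']) == 1
```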

Data from dataloader

After receiving a list of dictionaries from the dataset, the collate_fn in the dataloader gathers the inputs from each dict and stacks them into batched tensors. In addition, the data_samples from each dict are collected into a list. The collate_fn then outputs a dict with the same keys as the dicts in the received list, and the dataloader outputs this dict.
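The collation step can be sketched as below. This is a simplified, hypothetical version of what the dataloader's collate_fn does here, not MMEngine's actual implementation: each view is stacked across the batch, and the data samples are gathered into a list.

```python
import torch


def collate_selfsup(batch):
    """Hypothetical sketch of the collate_fn: stack each view across the
    batch and collect data_samples into a list."""
    num_views = len(batch[0]['inputs'])
    inputs = [
        torch.stack([sample['inputs'][v] for sample in batch])
        for v in range(num_views)
    ]
    data_samples = [sample['data_samples'] for sample in batch]
    # The output dict has the same keys as each dict in the received list.
    return {'inputs': inputs, 'data_samples': data_samples}


# A batch of 4 samples, each with two 3x32x32 views.
batch = [
    {'inputs': [torch.rand(3, 32, 32), torch.rand(3, 32, 32)],
     'data_samples': {'idx': i}}
    for i in range(4)
]
out = collate_selfsup(batch)
assert len(out['inputs']) == 2                   # still two views
assert out['inputs'][0].shape == (4, 3, 32, 32)  # each view is batched
assert len(out['data_samples']) == 4
```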

Data from data preprocessor

The data preprocessor is the last step before the data is fed into the model. It applies image normalization, converts BGR to RGB, and moves all data to the target device, e.g. GPUs. After these steps, it outputs a tuple containing a list of batched images and a list of data samples.

from typing import List, Optional, Tuple

import torch
from mmengine.model import ImgDataPreprocessor


class SelfSupDataPreprocessor(ImgDataPreprocessor):

    def forward(
            self,
            data: dict,
            training: bool = False
    ) -> Tuple[List[torch.Tensor], Optional[list]]:

        assert isinstance(data,
                          dict), 'Please use default_collate in dataloader, \
            instead of pseudo_collate.'

        data = [val for _, val in data.items()]
        batch_inputs, batch_data_samples = self.cast_data(data)
        # channel transform
        if self._channel_conversion:
            batch_inputs = [
                _input[:, [2, 1, 0], ...] for _input in batch_inputs
            ]

        # Convert to float after channel conversion to ensure
        # efficiency
        batch_inputs = [input_.float() for input_ in batch_inputs]

        # Normalization. Here is what is different from
        # :class:`mmengine.ImgDataPreprocessor`. Since there are multiple views
        # for an image in some algorithms, e.g. SimCLR, each item in inputs
        # is a list, containing multi-views for an image.
        if self._enable_normalize:
            batch_inputs = [(_input - self.mean) / self.std
                            for _input in batch_inputs]

        return batch_inputs, batch_data_samples
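Putting these steps together, the toy walk-through below shows their effect on a single-view batch of uint8 BGR images. This is a sketch with illustrative normalization values (ImageNet statistics, scaled to the 0-255 range), not MMSelfSup's real configuration.

```python
import torch

# Illustrative per-channel statistics; real values come from the config.
mean = torch.tensor([123.675, 116.28, 103.53]).view(-1, 1, 1)
std = torch.tensor([58.395, 57.12, 57.375]).view(-1, 1, 1)

# One view of a batch of 4 uint8 BGR images (N, C, H, W).
batch_inputs = [torch.randint(0, 256, (4, 3, 32, 32), dtype=torch.uint8)]

# 1) BGR -> RGB by flipping the channel dimension.
batch_inputs = [x[:, [2, 1, 0], ...] for x in batch_inputs]
# 2) Convert to float after the channel conversion.
batch_inputs = [x.float() for x in batch_inputs]
# 3) Normalize each view.
batch_inputs = [(x - mean) / std for x in batch_inputs]

assert batch_inputs[0].dtype == torch.float32
assert batch_inputs[0].shape == (4, 3, 32, 32)
```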