A collection of profiled models used to estimate distributed training time for different Transformer encoder models partitioned with the Megatron partitioning strategy, across different target losses.