I have always been somewhat skeptical about the value of foundation models as the basis for models that deal with rather specialized fields of knowledge. If I am coding and training a model to analyze clinical and lab data related to infectious diseases, for example, I would include data from other fields of medicine in the training set. However, the vast amount of data in a foundation model that is far removed from infectious diseases seems to serve no purpose. Such data might even swamp the medical data.
If by data you mean tabular data, wouldn’t the pretrained model only cut training time? If the data used to evaluate the loss is the same, custom and pretrained models should converge at the same point, shouldn’t they?
Maybe I don’t understand the post.