LLM Engineering
In a system like OpenAI, essentially there are two pipelines: the training of the model and the serving of the model.
Importantly, the distinction between the two is the real-time constraint. Training and testing models can be done slowly, whereas serving the model needs to be as fast as possible.
For training, typically a separate pipeline will exist offline for researchers to test new iterations of the model. This could involve incorporating new training data, formatting and parsing training data, novel changes to the underlying algorithms, etc.