BrainBlog for Arrcus by Jason Bloomberg
At the center of the explosion of interest in AI are the models – in particular, the large language models (LLMs) that typically drive generative AI but also specialized language models (SLMs) of various sizes.
Models are like empty vessels until someone trains them – a time-consuming, resource-intensive process that takes place before putting them into use.
Training alone, however, doesn’t make a model useful. AI models deliver value via inferencing.
Inferencing refers to the process of applying a fully trained model to new data – data that drive whatever output the business requires, including decisions, predictions, or agentic behavior.
While training takes place beforehand, inferencing occurs whenever and wherever people require results – often in real time.
As a result, the infrastructure requirements for training and inferencing are quite different, especially for the network that must support the real-time nature of inferencing at scale.
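The train-once, infer-many split described above can be sketched in miniature. This is an illustrative toy, not an LLM: a one-variable least-squares fit stands in for the expensive training phase, and applying the fitted parameters to new data stands in for inferencing. All names and data values are made up for the example.

```python
# Toy sketch of the training vs. inferencing split.
# Training: a one-time, compute-heavy fit over historical data.
# Inferencing: applying the fixed model to new data, on demand.

def train(xs, ys):
    """Training phase: fit a one-variable least-squares line.
    Happens once, beforehand, on historical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept          # the "trained model"

def infer(model, x_new):
    """Inferencing phase: apply the already-trained model to new data.
    Happens repeatedly, wherever and whenever results are needed."""
    slope, intercept = model
    return slope * x_new + intercept

model = train([1, 2, 3, 4], [2, 4, 6, 8])   # slow, done once
print(infer(model, 10))                      # fast, done per request → 20.0
```

The asymmetry is the point: `train` is costly but runs once in a controlled environment, while `infer` is cheap per call but must run at scale, in real time, close to wherever the new data arrives – which is why the two phases place such different demands on infrastructure and the network.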
Click here to read the entire article.


