Tales from the Machinery Room - Customizing LLMs

This talk shares technical insights from TNG AI Research on creating the R1T Chimera models by merging DeepSeek models using Mixture-of-Experts and Assembly-of-Experts techniques. It covers theoretical foundations, technical methods, and key findings, illustrating how a small consultancy developed high-usage, innovative large language models.

Fabian Klemm, TNG Technology Consulting GmbH

When and where

Thursday, May 7, 16:15-17:05

Room D

Description

"The Germans just Frankensteined DeepSeek's R1 and V3 into something called R1T Chimera".

Well beyond this X post, the R1T and R1T2 Chimera models published by TNG have attracted considerable attention, with daily usage of more than 10 billion tokens via OpenRouter. So what kind of "Frankensteining" is going on there? How can a small software consultancy such as TNG produce its own models?

Our internal research team has been experimenting and publishing results with a focus on Mixture-of-Experts large language models and adaptations of the DeepSeek model family. First, under the name "Mixture of Tunable Experts", we manipulated the way experts work within a model. We then continued with the "Assembly-of-Experts" merging process, which resulted in the successful Chimera models.

This talk aims to give some technical insights into the work of TNG AI Research. We recap the theoretical basics and present our most important results. We provide technical details on how things were done, as well as some anecdotes about the successes and setbacks along this journey.

deepseek

models

experts

chimera

Speakers

Fabian Klemm

TNG Technology Consulting GmbH

Germany

Fabian Klemm completed his doctorate at the TU Munich in the field of discrete optimization and applied geometry in 2020 and joined TNG as a software consultant. Besides a lot of DevOps experience, Fabian has developed a strong interest in AI, and in LLM topics in particular. He contributes to the TNG Skainet team, which operates the TNG internal AI server rack, and since 2024 he has been working in the internal TNG AI research team that is responsible for the successful TNG DeepSeek Chimera models.