In a nutshell
- The first Romanian LLM was introduced today, a project led by Politehnica Bucharest, the University of Bucharest and the Institute of Logic and Data Science, with the support of BRD Groupe Société Générale.
- The model is open source, and it could be used for the development of AI tools and platforms.
- Alongside the LLM, the OpenLLM-Ro community is also being launched, with the aim of further developing AI implementation in Romania.
Get the details
Since the second half of 2023, a team of researchers from Politehnica Bucharest, the University of Bucharest and the Institute of Logic and Data Science has worked on developing and training the Romanian LLM. The contributing researchers carried out the project pro bono, according to Startup.ro. Politehnica Bucharest also provided the computing power needed to train the model.
BRD Groupe Société Générale, which supports innovation and future technologies in Romania, is the main partner of the project.
Efforts to develop language-specific models are often initiated by academic communities. In March this year, the Bulgarian Institute for Computer Science, Artificial Intelligence and Technology (INSAIT) launched BgGPT, the first free and open LLM trained for Bulgarian. Later that month, Greece introduced its own LLM, Meltemi. There are other examples across Europe, such as the German LeoLM and the Spanish Aguila.
A key objective in developing a Romanian LLM is to address the limitations of current open LLMs, which are trained primarily on English-language datasets. The new model was exposed to several million documents in Romanian, the researchers said.
In their own words
“We hope that the launch of this model is just the beginning of a long-term effort that will result in better LLMs for the Romanian language. We have already discovered a method that we want to apply to other recently released models (Llama-3 and Mistral) that generally perform better than the one we started with (Llama-2),” Traian Rebedea, associate professor at Politehnica Bucharest, principal researcher at NVIDIA and one of the technical coordinators of the OpenLLM-Ro initiative, told Startup.ro.
“We hope that both private and public entities understand the importance of developing large language and multimodal (text-image) models for the Romanian language. We welcome everyone to join us in the OpenLLM-Ro initiative and the research projects that will support it,” he said.
“By getting involved in the innovation landscape, we can help the latest technologies have a positive impact in Romanian society almost at the same pace as developments in the field at an international level,” said Horia Velicu, Head of the Innovation Lab at BRD Groupe Société Générale.