In a rapidly evolving landscape where the boundaries of communication are continually expanding, Polish startup ElevenLabs has emerged as one of the rising voice AI companies from Central and Eastern Europe. As a dedicated research and deployment company, their mission is unequivocal: to make content universally accessible in any language and voice.
Founded by two Polish childhood friends, Mati Staniszewski and Piotr Dabkowski, who shared a passion for technology, ElevenLabs embarked on a journey to utilize the power of AI in the text-to-speech space in 2022. Having secured substantial funding of over $20M in less than a year, ElevenLabs represents one of the latest emerging startups with CEE roots. With their recent release of a new deep learning model, the startup supports capabilities across 30 languages.
In the following interview, The Recursive sat down with the co-founder Mati to delve into the team’s inspiring journey, innovative approach and solution, views on the AI ecosystem development, and the profound impact they’re set to make on creators, publishers, and beyond.
The Recursive: Could you tell us more about the founding story behind ElevenLabs and your mission?
Mati Staniszewski: I met my co-founder and best friend in high school over 12 years ago. Later on, we both moved to the UK to study, and one thing that remained constant through the years was working together. We would get together to learn more about technologies we wanted to understand better, and one of them was voice.
We realized how many opportunities were still in the AI voice space. And then a second thing happened — we connected it with the state of Polish dubbing of foreign movies where we have a single narrator voiceover reading all the lines. So, we decided to start ElevenLabs as a research and deployment company with a mission to make quality content available across all languages.
In the field of text-to-speech technology, what distinguishes ElevenLabs from its competitors?
Mati Staniszewski: Our key focus areas are around creators, for example, people who create voiceovers for YouTube or social media or those in the publishing space who are creating audiobooks or want to narrate news articles. We’re also seeing huge demand in gaming, entertainment, and AI agents.
There are many text-to-speech technologies out in the market. But one thing they all have in common is how robotic they sound, so they are nearly immediately recognizable as not human. What we did differently was learning what makes a human voice sound human. We then built our foundational speech models off the back of this research (all of our research is in-house, which almost no startup currently does). We then created our own text-to-speech model and also a proprietary voice model that was able to mimic the innate ‘humanness’ of speech, offering users a much higher level of quality.
Thinking about the future, we want to continue being the best research audio AI hub in the world. Our first core element is working on the foundational audio technology and how we can make it applicable across more use cases. The second element focuses on integrating the technology into people’s workflow to simplify the process. We’re also thinking deeply about how we can support the industry and work closely with key stakeholders, such as voice actors, to create new opportunities.
As a young company established in 2022, how do you reflect on raising over $20M in less than a year? What advice would you give other founders?
Mati Staniszewski: We closed our first round of $2 million led by Credo Ventures last year, and then we raised our Series A earlier this year, which was the $20 million round. Each fundraising was such a different experience.
The first round was our first fundraising ever as founders, so we were convincing investors to engage with the concept of what we were building. When it comes to our Series A, we actually weren’t looking to fundraise at all; investors approached us when they saw our technology. And while this time we didn’t need the money in the short term, we needed great partners on the journey that would help us unlock the next stage of the company.
“I think my biggest tip is to really focus on building the product and thinking about why it is useful. Everything else will come if you are doing that well,” says Mati Staniszewski.
Of course, you need the capital to get there. But if you have the capital for the next six months or more, you can put all your focus on the product and users.
For those looking to fundraise, I would recommend speaking to investors in batches — so you speak with a few investors at the same time. This helps because you can compare feedback and reactions and get a better sense of whether things are going well.
You have recently released a new deep learning model, the Eleven Multilingual v2. How does this impact your company and the content creation space?
Mati Staniszewski: It’s a huge milestone for us, as we started with the goal of making content available in different voices and languages. Still, until recently, it was a very small proportion of what we were able to support. And our new model suddenly opens us to almost 30 languages. And we’re just so excited because now all the creators who have been asking us about languages like Korean and Japanese, or Mandarin or Romanian, finally have the tool they’ve been looking for.
This is great because it allows us to provide a higher diversity of languages to content creators worldwide. It is a huge step forward as we open ElevenLabs up to most of the world.
How do you envision the future of content consumption and creation shaped by the use of AI technology?
Mati Staniszewski: I believe there will be several stages. The first one, which is happening now, allows creators to produce content in a much easier way. The quality and availability of production will increase. And that’s great because, suddenly, so many new creators can access tools, such as ElevenLabs and others across the AI space, that they couldn’t previously afford. This will continue over the next year or two, when there’ll be an increasing amount of high-quality content that people can finally enjoy.
There’ll be a second stage, which will flip from creators producing content to people accessing content on demand. For example, in the case of a book, you will be able to open it yourself and choose which voice you’d like it to be read with. And you will be able to listen to it in perfect quality. Then, in the third stage, which may blend at some point, you will be able to not only consume existing content but also create and interact with content as a listener. So when you listen to a book, you can pause it and tell the generative AI tool to introduce new, unknown stories you want to explore or even speak to the characters yourself.
What ethical considerations guide your approach to developing and deploying AI solutions as AI technologies evolve?
Mati Staniszewski: This is a very important subject for both ElevenLabs and the whole field. Every company in the AI space has the responsibility to think about safeguards and ethical considerations as they create and release their work.
For us, the importance lies in traceability and accountability. In practice, this means you need to be a verified user to create voices, enabling us to trace back every clip generated to the account it came from. We also believe that education and providing tools for people to distinguish AI-generated content are increasingly important. It’s going to be essential to give listeners the ability to verify whether the content is generated by AI or not. To that end, we have created our own publicly released tool, an AI speech classifier, where you upload any audio and can find out if it includes AI audio created via ElevenLabs. Hopefully, over time, the tool will also be able to detect audio created on other platforms.
Can you tell me a little more about your international team? How do you view the availability and quality of Polish talent?
Mati Staniszewski: Our team of over 20 people is working remotely from multiple locations in Europe, including Slovakia, Poland, Germany, Switzerland, and a few others. We also have a smaller portion of the team in the US and Asia. However, we are planning to create hubs in the future where our employees will have the option to work both in person and remotely.
We have five employees in Poland. We see a lot of exceptional talent, especially on the technical side, who are from Poland and the CEE region. I think the level of education for those technical elements is just phenomenal.
However, what sets these regions apart from others, like the US or UK, is the familiarity with startup practices and processes. In Poland, people are often not familiar with them, so it’s slightly harder to make sure they know what the journey represents, what the equity element is, or how funding works. But the quality of the talent in Poland, based on the technical aptitude and passion for the subject, is currently the most important factor for us. We are just at the beginning, and we need people who love what they do and match their interests to their work.
There are many professionals who create great, generic product work all over the world. However, it’s something different to make research available at scale and serve an AI model efficiently to thousands or millions of people. And that’s where we have seen high expertise in backend engineering and deploying AI systems at scale in the CEE region.
What factors led you to establish ElevenLabs in the US rather than Europe? What advice would you give aspiring regional AI entrepreneurs looking to start their own companies?
Mati Staniszewski: Our main goal is to become a global company where all users of different languages have access to ElevenLabs. To that end, we didn’t constrain ourselves to one location and instead created a remote team. When we were thinking of the easiest way to move quickly and achieve our goal, the US presented the clearest ecosystem for us, which is why we’ve incorporated there.
It’s very important to define the problem you aim to solve, and then, before you start solving it, try to understand and dig a little deeper into the potential users and engage with them.
“Don’t start solving something with AI just because AI is everywhere right now. Instead, focus on something that you will actually enjoy solving and want to work on,” says Mati Staniszewski.