Best Practices For Building Voice Apps With Conversational Frameworks
May 24, 2022
Do developers hate frameworks? There is a common myth that real developers should build frameworks rather than use them, yet frameworks are indispensable, particularly when it comes to turning users’ ideas into code without having to know all the ins and outs.
Some would say that frameworks limit the possibilities, leaving no room for control. But that’s not true: frameworks can be adjusted and extended to fit the needs of a particular use case. Most frameworks rest on common principles, and it’s much easier to start with a ready-to-use framework that matches your needs and customize it for your task than to start a project from scratch. One of the main benefits of open-source frameworks is this room for quick and simple customization.
Conversational AI is no exception. From the outside, coding a voice-first project from scratch may seem effortless: after all, it’s all about putting queries and answers together. But as one goes deeper, it turns out that a voice UI is just as complicated a system as any other, with traps and pitfalls of its own. There is speech recognition, NLU, speech synthesis, and channel integration, all of which you’ll have to implement yourself, not to mention a huge increase in costs.
Sure, a team of in-house Python devs can easily create a basic website bot using any Python chatbot builder. However, this would break the golden rule of development: plan for the long term at the early stages and think through multiple outcomes. Doing so saves time once the project is ready to scale.
If you decide to go with a framework, make sure to choose one with a big community and active support, so that when a problem occurs, there will always be someone to help you out. Another great thing about frameworks is that someone has most likely already faced the problem you are dealing with: either it has already been solved, or other users are also looking for a solution. Thus, by tapping into the collective knowledge, you’ll get your issues fixed faster than by figuring them out on your own.
Choosing the right framework
First, you need to clarify all the important details: what languages you’ll use, who’ll be involved in the project, what products you’ll plug in, and what the performance requirements are.
Whatever programming language stands behind a framework, its features are transferred to the framework, imposing certain restrictions. For example, Java and Kotlin are statically typed languages, so type errors are caught at compile time, which results in more control over low-level logic and better optimization. Python is a higher-level language that is dynamically typed: types are checked only at runtime. In addition, the global interpreter lock (GIL) in CPython, the standard Python implementation, prevents multiple threads from executing Python code in parallel.
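To make the typing point concrete, here is a minimal Python sketch of runtime type checking. The function and variable names are made up for illustration. It shows that a name can be rebound to a value of another type, and that mixing incompatible types is only detected when the line actually runs, which is the kind of error a compiler for Java or Kotlin would report before the program starts.

```python
def try_add(left, right):
    """Add two values and report a type mismatch instead of crashing."""
    try:
        return left + right
    except TypeError as exc:
        # Python detects the mismatch only when this line executes.
        return f"TypeError: {exc}"


# A name is not tied to one type: rebinding it is allowed at runtime.
slot_value = 3
slot_value = "three"

numbers_ok = try_add(2, 3)        # both ints, plain addition
strings_ok = try_add("2", "3")    # both strings, concatenation
mismatch = try_add("2", 3)        # str + int, rejected at runtime

print(numbers_ok)
print(strings_ok)
print(mismatch)
```

In a voice app this matters for things like slot values coming back from an NLU engine: they often arrive as strings, so it is worth converting and validating them explicitly at the boundary.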
Advanced frameworks allow users to plug in nearly any component needed. In conversational AI frameworks, the NLU engine is usually the most important and complicated component, so when choosing a framework, opt for one backed by an NLU core. Another crucial thing is to ensure the framework has an active community and ongoing support. To do that, check when the last commit was made on GitHub.
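The last-commit check can also be scripted. Below is a rough sketch that assumes GitHub's public REST endpoint for listing commits and its usual response shape, where the timestamp sits under `commit.committer.date`. The helper names and the sample payload are invented for illustration, and unauthenticated requests are rate-limited, so treat it as a starting point rather than a finished tool.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request, urlopen


def fetch_latest_commit(owner, repo):
    """Fetch the most recent commit of a public repository (needs network)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/commits?per_page=1"
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)[0]


def days_since_commit(commit, now=None):
    """Return the number of whole days between a commit and `now` (UTC)."""
    committed_at = commit["commit"]["committer"]["date"]
    # GitHub returns ISO 8601 timestamps ending in "Z"; normalise for older Pythons.
    committed = datetime.fromisoformat(committed_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - committed).days


# Offline demonstration with a made-up payload in the assumed shape.
sample_commit = {"commit": {"committer": {"date": "2022-05-01T12:00:00Z"}}}
reference_time = datetime(2022, 5, 24, 12, 0, tzinfo=timezone.utc)
sample_age = days_since_commit(sample_commit, reference_time)
print(sample_age)
```

With network access, `days_since_commit(fetch_latest_commit(owner, repo))` gives the age of the latest commit for a given repository. A single number is only a rough signal, so it is worth looking at open issues and release frequency as well.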
Top Conversational AI frameworks
Whether you want to create a voice assistant for a mobile app or voice-first games for smart displays and smart TVs, conversational AI frameworks can help you with that. You can choose from dozens of voice tech frameworks that fit a variety of tasks. Here are the most advanced ones with active communities:
Rasa Open Source
Built on Python, Rasa is an open-source machine learning framework that allows for the automation of text- and voice-based conversations. Rasa has built-in NLU and can be used both as an end-to-end solution and as an NLU server. With this framework, you can create context-aware virtual assistants for custom conversational channels, Facebook Messenger, Slack, Google Hangouts, Webex Teams, Microsoft Bot Framework, Rocket.Chat, Mattermost, Telegram, Twilio, Alexa Skills and Google Home Actions.
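As an illustration of the NLU-server mode, the sketch below sends a message to a locally running Rasa server and picks the top intent from the reply. It assumes the server was started with its HTTP API enabled on the default port 5005, that the parse endpoint is `/model/parse`, and that the reply contains an `intent` object with `name` and `confidence`. The helper names, the threshold, and the sample reply are hypothetical, so check the details against the Rasa documentation for your version.

```python
import json
from urllib.request import Request, urlopen

# Assumption: a Rasa server is running locally with the HTTP API enabled.
RASA_URL = "http://localhost:5005"


def parse_message(text, base_url=RASA_URL):
    """Send a user message to the NLU endpoint and return the parsed JSON."""
    payload = json.dumps({"text": text}).encode("utf-8")
    request = Request(
        f"{base_url}/model/parse",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def top_intent(parse_result, threshold=0.5):
    """Return the intent name if its confidence reaches the threshold, else None."""
    intent = parse_result.get("intent") or {}
    if intent.get("confidence", 0.0) >= threshold:
        return intent.get("name")
    return None


# Offline demonstration with a made-up reply in the assumed shape.
sample_result = {
    "text": "hello there",
    "intent": {"name": "greet", "confidence": 0.93},
    "entities": [],
}
print(top_intent(sample_result))
```

Keeping the intent selection separate from the network call makes it easy to swap in another NLU engine later, or to route low-confidence messages to a fallback prompt.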
The TypeScript-based Jovo Framework enables voice experiences that work across multiple devices and platforms, from mobile phones to Amazon Alexa to Google Assistant to Raspberry Pi, and more.
Also built on TypeScript, BotPress is an open-source conversational AI platform for enterprises that enables conversations and workflow automation. However, while it boasts features like advanced permissions, security and data compliance, the open-source edition doesn’t cater to the needs of enterprises when it comes to features like the number of administrators, roles, multiple languages, white-label widgets and interfaces. With no voice markup or voice channels available, BotPress seems to be mainly aimed at bots, not voice interfaces.
DeepPavlov is an open-source framework built on Python. It allows for building NLU-powered multi-skill chatbots with multi-state support, contexts and so on. Other DeepPavlov models can be easily connected to the agent for annotation and evaluation. You can use the framework for pretty much anything, but it’ll require extra work and customization since there’s no channel support at the moment.
Frameworks are crucial if you want to create a unique custom solution and save time doing that. You can use open-source frameworks for free, even for commercial use. Some of them enable users to build even enterprise-grade projects with low investments, either for free or with additional payment for advanced features, integrations, or channels. What’s important is that frameworks don’t limit development and allow their users to create skills for smart devices, smartphones, bots, games, you name it. With frameworks, only imagination is the limit.