In the beginning was the word…the spoken word, that is. First generation mobile networks were designed and built to enable telephony on the move, so that users could not only make telephone calls when they were away from a landline connection, but could also be reached even when the calling party didn’t know where they were.
The aim was to reproduce the characteristics and the quality of the fixed-line PSTN. Cellular architecture, frequency reuse and a system of location registers made it all possible. The network was optimised for voice, and the channels used enough spectrum and bandwidth to make the voice calls intelligible. It was possible to send data over these analog cellular networks - just as it was possible to send data over the fixed telephone network. A few determined individuals had cellular modems and even mobile fax machines, which, like their fixed network equivalents, converted digital data streams into squeaks, whistles and crackling sounds.
Second generation digital networks introduced all-digital technology. The main design imperative was to squeeze more voice calls into the available spectrum, and to make use of improvements in silicon and software to bring new spectrum bands into use. Voice was now digitised, so that a component called a codec turned the analog sound waves of the voice into a digital data stream to send across the mobile network.
The shift to the second generation also introduced more sophisticated data transmission capability, including the SMS text messaging service and circuit switched data. A stopgap solution, GPRS in the world of GSM and ETSI specifications, introduced the packet switched data that was by then predominant in data transmission in the fixed network. It could support some specific data services and a limited mobile-only browsing experience, but it took until the third generation for mobile data to be able to support proper web browsing and multimedia services. Mobile devices became more capable too, with the advent of high-quality touchscreens and cameras.
Third generation mobile networks, introduced in the early 2000s, used an enhanced version of the packet core that had allowed 2G to carry data, while maintaining a circuit switched core network for voice calls. The higher bandwidth available allowed for a new codec for speech, called the Adaptive Multi Rate (AMR) codec. Its wideband variant (AMR-WB) allowed for high-definition voice when the network conditions allowed.
At the end of 2009 the first 4G network was launched. The Long Term Evolution (LTE) technology, on which this was based, offered much higher data rates and lower latency than 3G, enabling services such as HD video streaming and online gaming. It also offered a way to introduce packet-switched voice over LTE. From the telecom service providers’ perspective this meant better utilisation of spectrum and network resources and reduced dependency on circuit switched technology for voice.
But this also meant that voice calls on the LTE network needed a new technology, to enable calls to be carried without the circuits that had always been intrinsic to voice calls. Packet-based Voice over IP (VoIP) had by then been in use on the fixed network for some time, at first as a disruptive technology and later within enterprise and telecom service providers’ networks for fixed line voice.
The technology chosen to enable packet-switched voice on the 4G mobile network is called Voice over LTE (VoLTE). This was built on the high-definition voice capability introduced in 3G, enabled faster call set-up, and made possible services which integrated voice and data.
The technology stack underpinning VoLTE is IP Multimedia Subsystem (IMS). IMS is an architectural framework for delivering IP multimedia services integrated with data services on mobile devices. It provides a standardized interface for telecom service providers to offer a wide range of services, including voice, video, messaging, and data, over IP networks.
IMS is designed to be access-agnostic, supporting various types of networks such as LTE, Wi-Fi, and fixed broadband. IMS is crucial for VoLTE as it provides the underlying infrastructure needed to support voice services over LTE networks. VoLTE relies on IMS to manage and deliver voice calls, ensuring high-quality service and enabling features such as call continuity, supplementary services (like call forwarding and voicemail), and integration with other IP-based services. IMS handles the session control, authentication, and billing for VoLTE, ensuring that the voice services are reliable, secure, and seamlessly integrated with the telecom service providers’ overall service offer. IMS is also the underlying architecture for WebRTC, the standards-based approach to real time communications using web browsers on any device including the mobile network.
Not all implementations of VoLTE/IMS are the same. Over the course of the last few years, we have seen the emergence of compact versions that can be deployed for private network applications and also in the public cloud with a minimal footprint. These should be contrasted with the heavier options available from more traditional telecommunications equipment manufacturers that require dedicated hardware and software.
However, neither the compact all-in-one IMS implementations nor the dedicated hardware-based versions were able to evolve with the cloud and virtualization requirements of modern-day IT and telco environments.
The road to deployment of VoLTE was not entirely smooth.
Initially the packet-based 4G networks were not required to support voice: telecom service providers had invested in 3G networks and had already built circuit switched voice capability, so they could use ‘Circuit Switched Fall Back’ (CSFB), reverting to 2G and 3G networks for voice calls. This continued as the adoption of, and demand for, LTE kept increasing.
Over time 4G has become the most commonly deployed mobile technology. The arrival of 5G and the demand for more spectrum have put pressure on MNOs to sunset 2G and 3G networks in many countries, so that the spectrum can be re-farmed and reallocated to 5G or other services.
4G and 5G are rapidly replacing 2G and 3G as the generations of choice for mobile communication. In 2022, the share of devices in which 2G or 3G was the highest embedded technology generation was above 20% worldwide (and much higher in some regions, such as Europe at 30%). However, one of the stories of the next decade is that these technologies will be rapidly phased out of consumer and IoT deployments.
Partly this switch from 2G/3G to 4G/5G is because of the increasing closure of 2G and 3G networks around the world, necessitating a migration to higher generations. For instance, 2G and 3G networks have already been switched off, or are going through the process of switch-off, in Australia, Japan, South Korea and the United States, amongst others. In Europe, 2G and 3G networks have hung on for longer, but most telecom service providers are planning a shutdown of 3G in the next 3-5 years.
The result of these trends is that increasingly CSFB is not an option any more, either because the 2G/3G networks are not available, or because the device maker is choosing not to support the legacy technologies within their devices, or both. Support for VoLTE/IMS thus becomes essential in carrying voice on the mobile networks.
There is, of course, another way to provide voice calls on a network that’s designed for data. That’s what is usually called “Over The Top” (OTT) voice. This began on the fixed network as a way for users to bypass telecom service providers’ long distance and international call rates. The concept was popularised by Skype, which provided free computer-to-computer calls, and soon introduced video calling. Other application providers, notably OTT messaging providers such as WhatsApp, Telegram, Line and the Chinese provider WeChat, followed on and began to offer voice and video calls.
OTT voice has become very popular and is commonly used today. There are some significant upsides for users in OTT services.
But OTT voice has many downsides compared to the voice (and video) services offered by the telecom service providers.
VoLTE, as the name suggests, is aimed at delivering voice services over LTE networks. As network deployments evolve to 5G, a 5G variant is required in the form of Voice over New Radio (VoNR). This performs a similar function to VoLTE but with enhanced call quality, better network efficiency, the ability to implement dedicated network slices, and the ability to further integrate voice with data services.
VoNR, like VoLTE, uses the IMS to manage voice services, ensuring compatibility and continuity across different network technologies. VoNR is intended to support voice services over a 5G Standalone (SA) core network.
Where a 5G SA core is not yet deployed, voice on 5G devices is supported through a fallback to VoLTE, and this is currently the most common mechanism. It is unlikely that 5G network coverage will outstrip 4G in any markets for the next 5-10 years, meaning that support via 4G fallback will continue.
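The order of preference described above (VoNR where a 5G SA core supports it, otherwise VoLTE, otherwise CSFB where legacy networks survive) can be sketched as a small decision function. This is an illustrative simplification only: the `Capabilities` fields and the `select_voice_path` function are hypothetical names, and a real network negotiates these capabilities through signalling rather than through boolean flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability flags for one device in one location."""
    ims_voice: bool       # device and operator both support IMS-based voice
    vonr_available: bool  # attached to a 5G SA core that offers VoNR
    lte_available: bool   # a 4G network is in coverage
    cs_available: bool    # a 2G/3G circuit switched network is still in service


def select_voice_path(caps: Capabilities) -> str:
    """Return the voice path this simplified model would choose."""
    if caps.ims_voice and caps.vonr_available:
        return "VoNR"
    if caps.ims_voice and caps.lte_available:
        # Includes the common case of a 5G device falling back to 4G for voice.
        return "VoLTE"
    if caps.cs_available:
        # Only possible while legacy 2G/3G networks remain switched on.
        return "CSFB"
    return "no native voice"


if __name__ == "__main__":
    # A device without IMS support after the 2G/3G sunset has no native voice.
    stranded = Capabilities(ims_voice=False, vonr_available=False,
                            lte_available=True, cs_available=False)
    print(select_voice_path(stranded))
```

The last case is the one the sunset of 2G and 3G makes pressing: once `cs_available` is false everywhere, a device without VoLTE/IMS support is left with no operator voice service at all.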
Voice has a future beyond telephony. There is widespread evidence from around the world that the number of calls, and of call minutes, is in long-term decline. Users are increasingly substituting messaging - paradoxically, including voice notes - and interactive chat for telephone calls.
But voice is finding new use cases, especially in the Internet of Things (IoT). According to Transforma Insights: “Almost 20% of IoT devices are in a category of use case that has a requirement for voice support, up from 14% in 2023. The 2024 figure equates to 1.4 billion connections…the proportion of revenue that is dependent on supporting voice services is even more pronounced over the forecast period. By 2033 of the almost USD80 billion cellular IoT connectivity revenue opportunity, 22% is generated by applications that will have a requirement for voice support. In total over the years 2024-2033, cellular IoT connectivity revenue will be USD 536 billion globally, of which USD 102 billion (19%) will be generated by applications that have a requirement to support voice.”[1]
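As a quick sanity check, the quoted figures can be reproduced with simple arithmetic. The sketch below uses only the numbers in the quotation. The variable names are illustrative, and because the source says "almost USD80 billion" and "almost 20%", the derived 2033 revenue and total connection count are approximations rather than figures from the report.

```python
# Figures as quoted from Transforma Insights (USD billions unless noted).
total_revenue_2024_2033 = 536
voice_revenue_2024_2033 = 102
revenue_2033 = 80            # "almost USD80 billion", so an upper bound
voice_share_2033 = 0.22
voice_connections_2024 = 1.4  # billions of connections
voice_share_2024 = 0.20       # "almost 20%", so an upper bound

# Share of cumulative revenue that depends on voice support.
cumulative_share_pct = voice_revenue_2024_2033 / total_revenue_2024_2033 * 100

# Approximate voice-dependent revenue in 2033.
voice_revenue_2033 = round(revenue_2033 * voice_share_2033, 1)

# Implied total IoT connections in 2024 (a lower bound, since the
# true share is slightly under 20%).
implied_total_connections = round(voice_connections_2024 / voice_share_2024, 1)

print(f"Cumulative voice share: {cumulative_share_pct:.0f}%")
print(f"Voice-dependent revenue in 2033: about USD {voice_revenue_2033} billion")
print(f"Implied 2024 connections: at least {implied_total_connections} billion")
```

The first result matches the 19% stated in the quotation. The other two are derived estimates of roughly USD 17.6 billion and 7 billion connections.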
Voice capability is a requirement in automotive use cases such as eCall (where it is mandated by regulators), and in bCall (automated breakdown service). It’s also needed in other contexts, including security, personal tracking, worker safety and access control. An important emerging domain is AR/VR, where voice interaction and voice-enabled UIs are developing very rapidly, in both industrial and B2C deployments.
ng-voice is at the forefront of this voice evolution, offering a cloud-native IMS solution that is infrastructure-agnostic, cost-efficient, and highly automated. Our solution supports both VoLTE and VoNR through the same software stack, ensuring that telecom service providers can deliver high-quality voice and video services over 4G and 5G networks.
One of the key advantages of ng-voice's solution is its flexibility and scalability. By leveraging cloud-native technologies, ng-voice enables telecom service providers to deploy and manage their networks with unprecedented efficiency, reducing total cost of ownership by up to 70%.
This is particularly beneficial for telecom service providers looking to future-proof their networks and support new and emerging voice use cases like IoT. A cloud-based platform provides telecom service providers with the ability to unlock a very short innovation cycle, developing and refining revenue generating features tailored to their own specific needs.
[1] “Why VoLTE/VoNR is a critical part of an IoT connectivity provider's portfolio”, Transforma Insights, September 2024