Launching ESP-RTC: A Real-Time Audio-and-Video Communication Solution

Shanghai, China
Aug 5, 2022

ESP-RTC achieves stable, smooth, ultra-low-latency voice-and-video transmission in real time, providing an ideal solution for users who want to build low-cost and low-power audio-and-video products.

Espressif Systems (SSE: 688018.SH) is pleased to announce the release of ESP-RTC (ESP Real-Time Communication), an audio-and-video communication solution that delivers stable, smooth, ultra-low-latency voice-and-video transmission in real time.

ESP-RTC is built around Espressif's ESP32-S3-Korvo-2 multimedia development board. ESP32-S3-Korvo-2 is equipped with the ESP32-S3 AI SoC, along with a dual-microphone array for near-/far-field voice wake-up and speech recognition. It also integrates cameras, Micro SD cards, LCDs and other peripherals, and supports the processing of MJPEG video streams, which makes it a well-suited development board for users who wish to build low-cost and low-power audio-and-video products.

The ESP-RTC solution implements real-time audio-and-video transmission on top of Espressif's self-developed SIP (Session Initiation Protocol) stack, which includes a transport layer, a transaction layer and a session layer. The signaling module of ESP-RTC supports UDP, TCP and TLS, while its media transmission module supports RTP (over UDP), RTCP and SRTP, as well as TURN and other NAT traversal protocols. The transmission module also includes countermeasure algorithms, such as a jitter buffer and packet loss concealment (PLC), which mitigate the effects of packet loss, jitter, congestion and delay on weak networks and help keep real-time audio-and-video communication smooth.
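To illustrate the general idea behind a jitter buffer with packet loss concealment, here is a minimal Python sketch. It is not Espressif's implementation: the class name, the `depth` parameter and the "repeat the last frame" concealment strategy are assumptions chosen for brevity, and 16-bit RTP sequence-number wraparound is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JitterBuffer:
    """Hypothetical jitter buffer: reorders packets by sequence number
    and conceals a missing packet by repeating the last played payload."""

    depth: int = 3                # packets to collect before playout starts
    next_seq: int = 0             # sequence number expected at the next pop()
    started: bool = False         # True once playout has begun
    last_payload: bytes = b""     # most recently played payload, reused for PLC
    packets: Dict[int, bytes] = field(default_factory=dict)

    def push(self, seq: int, payload: bytes) -> None:
        """Store an incoming packet; drop it if its playout slot has passed."""
        if self.started and seq < self.next_seq:
            return
        self.packets[seq] = payload

    def pop(self) -> Optional[bytes]:
        """Return the next payload in order, or None while still pre-buffering."""
        if not self.started:
            if len(self.packets) < self.depth:
                return None
            self.started = True
            self.next_seq = min(self.packets)
        payload = self.packets.pop(self.next_seq, None)
        if payload is None:
            payload = self.last_payload   # simplistic PLC: repeat the last frame
        else:
            self.last_payload = payload
        self.next_seq += 1
        return payload


# Packets 2 and 1 arrive out of order, packet 3 is lost, packet 4 arrives.
jb = JitterBuffer(depth=3)
for seq, data in [(2, b"B"), (1, b"A"), (4, b"D")]:
    jb.push(seq, data)
played = [jb.pop() for _ in range(4)]   # [b"A", b"B", b"B", b"D"]
```

The buffer trades a small fixed delay (here, three packets) for in-order playout. Production PLC algorithms typically synthesize or interpolate the missing audio rather than repeating a frame.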

The ESP-RTC solution also includes an RTSP (Real Time Streaming Protocol) stack, whose media transmission module supports both RTP over UDP and RTP over TCP. ESP-RTC can act as an RTSP server that streams on demand to players such as VLC, FFmpeg, PotPlayer and KMPlayer, or as an RTSP client that works with EasyDarwin, an easy-to-use, open-source streaming platform framework.
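RTSP is a text-based protocol, so the difference between the two transport modes mentioned above shows up in the `Transport` header of the `SETUP` request. The Python sketch below builds such requests in the RTSP/1.0 format defined by RFC 2326. The helper names, the stream URL and the port numbers are hypothetical and do not come from ESP-RTC.

```python
from typing import Dict, Optional, Tuple

# Transport header values in RFC 2326 syntax; the port and channel numbers
# are illustrative placeholders.
TRANSPORT_UDP = "RTP/AVP;unicast;client_port=5000-5001"
TRANSPORT_TCP = "RTP/AVP/TCP;unicast;interleaved=0-1"


def build_rtsp_request(method: str, url: str, cseq: int,
                       headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize an RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(response: bytes) -> Tuple[int, str]:
    """Extract the status code and reason phrase from an RTSP response."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, reason = first_line.split(" ", 2)
    return int(code), reason


# Hypothetical stream URL; a real device would advertise its own.
url = "rtsp://example.local/live/track1"
setup_over_udp = build_rtsp_request("SETUP", url, 3, {"Transport": TRANSPORT_UDP})
setup_over_tcp = build_rtsp_request("SETUP", url, 3, {"Transport": TRANSPORT_TCP})
status = parse_status_line(b"RTSP/1.0 200 OK\r\nCSeq: 3\r\n\r\n")
```

With the interleaved TCP transport, the RTP and RTCP packets travel over the same TCP connection as the RTSP signaling, which can help on networks that block or drop UDP traffic.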

Using Espressif's self-developed algorithms, including acoustic echo cancellation (AEC), background noise suppression (BNS) and automatic gain control (AGC), ESP-RTC reduces sound interference in audio calls, improving the quality and stability of voice communication. ESP-RTC also uses Espressif's chip-level codec algorithm to provide users with a clear picture in their video calls. Furthermore, ESP-RTC takes advantage of the AI computing power of Espressif's ESP32-S3 SoC to achieve high-performance voice wake-up, voice recognition and image recognition. ESP-RTC is therefore suitable for developing smart speakers, video door-intercom systems, smart-home control panels, pet monitors, car monitors, children's toys and other applications.

The ESP-RTC solution supports open-source servers, such as FreeSWITCH and FreePBX, and can also connect to mature SFU cloud servers to enable group conference calls. Additionally, developers can quickly build audio-and-video communication applications with the help of Espressif's open-source ESP-IDF (IoT Development Framework) and ESP-ADF (Audio Development Framework).

If you want to know more about ESP-RTC, Espressif's real-time audio-and-video communication solution, please contact our customer support team. You can also go to Espressif's official Taobao store to buy the ESP32-S3-Korvo-2 development board and build your own audio-and-video-call gadget.

