Best real-time communication (RTC / VoIP) softphone on the Linux desktop?


The Debian community has recently started discussing how to choose the real-time communications (RTC/VoIP) desktop client for Debian 8 (jessie) users.

Debian 7 (wheezy), like Fedora, ships GNOME as the default desktop and the GNOME Empathy client is installed by default with it. Simon McVittie, the Empathy package maintainer, has provided a comprehensive response to the main discussion points, indicating that the Empathy project comes from an Instant Messaging (IM) background (it is extremely easy to set up and use for XMPP chat) but is not a strong candidate for voice and video.

Just how to choose an RTC/VoIP client then?

One question that is not answered definitively is just who should choose the default RTC client. Some people have strongly argued that the maintainers of individual desktop meta-packages should choose as they see fit.

Personally, I don't agree with this viewpoint and it is easy to explain why.

Just imagine that the maintainers of GNOME choose one RTC application, the maintainers of XFCE choose an alternative, and these two RTC applications don't talk to each other. If a GNOME user wants to call an XFCE user, do they have to go to extra effort to get an extra package installed? Do they even have to change their desktop? For power users these questions seem trivial, but for many of the friends and family we would like to contact with free software, it is not amusing.

When the user's goal is to communicate freely, and they are to remain free to choose any of the desktops, a higher-level choice of RTC client (or at least a set of protocols that all default clients must support) becomes essential.

Snail mail to the rescue?

There are several friends and family I want to be able to call with free software. The only way I could make it accessible to them was to burn self-booting Debian Live desktop DVDs with an RTC client pre-configured.

Once again, as a power user, maybe I have the capability to do this, but is this an efficient way to overcome those nasty proprietary RTC clients, burning one DVD at a time and waiting for it to be delivered by snail mail?

A billion browsers can't be wrong

WebRTC has been in the most recent stable releases of Firefox/Iceweasel and Chrome/Chromium for over a year now. Many users already have these browsers thanks to automatic updates. It is even working well on the mobile versions of these browsers.

In principle, WebRTC relies on existing technologies, such as the use of RTP as a transport for media streams. For reasons of security and call quality, however, the WebRTC standard mandates the use of several more recent standards, and existing RTC clients that lack them simply do not interoperate with WebRTC browsers.

It really is time for proponents of free software to decide if they want to sink or swim in this world of changing communications technology. Browsers will not dumb down to support VoIP softphones that were never really finished in the first place.

Comparing Empathy and Jitsi

There are several compelling RTC clients to choose from and several of them are now being compared on the Debian wiki. Only Jitsi stands out as offering the features needed for a world with a billion WebRTC browser users.

| Feature | Empathy | Jitsi | WebRTC requirement? | Comments |
|---|---|---|---|---|
| Interactive Connectivity Establishment (ICE) and TURN (relay) | Only for Gmail XMPP accounts, and maybe not for much longer | For all XMPP users with any standards-based TURN server, soon for SIP too | Mandatory | Enables effective discovery of NAT/firewall issues and refusal to place a call when there is a risk of one-way audio. Some legacy softphones support STUN, which is only a subset of ICE/TURN. |
| AVPF | No | Yes | Mandatory | Enables more rapid feedback about degrading network conditions, packet loss, etc. to help variable bit rate codecs adapt and maximise call quality. Most legacy VoIP softphones support AVP rather than AVPF. |
| DTLS-SRTP | No | Yes | Mandatory for Firefox, soon for Chrome too | DTLS-based peer-to-peer encryption of the media streams. Most legacy softphones support no encryption at all; some support the original SRTP mechanism based on SDES keys exchanged in the signalling path. |
| Opus audio codec | No | Yes | Strongly recommended. G.711 can also be used but does not perform well on low-bandwidth or unreliable connections | Opus is a variable bit rate codec that supersedes codecs like Speex, SILK, iLBC, GSM and CELT. It is the only advanced codec browsers are expected or likely to implement. Most of the legacy softphones support the earlier codecs (such as GSM) and some are coded in such a way that they can't support any variable bit rate codec at all. |
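
As a rough illustration of what the four features compared above look like on the wire, the following Python sketch inspects an SDP session description (the text blob two endpoints exchange when setting up a call) and reports which of them are advertised. It is only a sketch under stated assumptions: the function name `webrtc_readiness`, both sample SDP blobs, the addresses and the dummy fingerprint are invented for illustration, and real offers carry many more attributes. It is not how Jitsi, Empathy or any browser performs this check.

```python
import re


def webrtc_readiness(sdp: str) -> dict:
    """Report which WebRTC-relevant features an SDP blob advertises."""
    lines = [ln.strip() for ln in sdp.splitlines() if ln.strip()]
    # Media lines look like: m=<media> <port> <proto> <formats...>
    protos = [ln.split()[2] for ln in lines
              if ln.startswith("m=") and len(ln.split()) >= 3]
    return {
        # ICE endpoints advertise a username fragment for connectivity checks.
        "ice": any(ln.startswith("a=ice-ufrag:") for ln in lines),
        # AVPF profiles end in "AVPF" (RTP/AVPF, RTP/SAVPF, UDP/TLS/RTP/SAVPF).
        "avpf": bool(protos) and all(p.endswith("AVPF") for p in protos),
        # DTLS-SRTP endpoints publish a certificate fingerprint.
        "dtls_srtp": any(ln.startswith("a=fingerprint:") for ln in lines),
        # Older SDES-based SRTP puts the keys in the signalling path instead.
        "sdes_srtp": any(ln.startswith("a=crypto:") for ln in lines),
        # Opus is always signalled with a 48000 Hz clock rate.
        "opus": any(re.match(r"a=rtpmap:\d+ opus/48000", ln, re.IGNORECASE)
                    for ln in lines),
    }


# Hypothetical offer from a legacy softphone: plain RTP/AVP with G.711 and GSM.
LEGACY_SDP = """v=0
o=- 1 1 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 5004 RTP/AVP 0 3
a=rtpmap:0 PCMU/8000
a=rtpmap:3 GSM/8000
"""

# Hypothetical, heavily trimmed browser-style offer; the fingerprint is a dummy.
WEBRTC_SDP = """v=0
o=- 2 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111 0
c=IN IP4 0.0.0.0
a=ice-ufrag:abcd
a=ice-pwd:0123456789abcdefghijkl
a=fingerprint:sha-256 00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF
a=setup:actpass
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
"""

if __name__ == "__main__":
    for name, sdp in (("legacy", LEGACY_SDP), ("webrtc", WEBRTC_SDP)):
        print(name, webrtc_readiness(sdp))
```

Run against the two samples, the legacy offer advertises none of the four features while the browser-style offer advertises all of them, which is the gap a default client has to close.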

Retrofitting legacy softphones with all of these features is no walk in the park. Some of them may be able to achieve compliance more easily by simply throwing away their existing media code and rebuilding on top of the WebRTC media stack used by the browsers.

However, the Jitsi community have already proven that their code can handle all of these requirements by using their media processing libraries to power their JitMeet WebRTC video conferencing server.

Dreams are great, results are better

Several people have spoken out to say they want an RTC client that has good desktop integration (just like Empathy), but I have yet to see any of them contribute any code to such an effort.

Is this type of desktop integration the ultimate priority and stubbornly non-negotiable though? Is it more an example of zealous idealism that may snuff out hope of bringing the optimum communications tools into the hands of users?

As for solving all the other problems facing free communications software, the Jitsi community have been at it for more than 10 years. Just have a look at their scorecard on GitHub to see what I mean. Jitsi lead developer Emil Ivov has a PhD in multimedia and is a regular participant in the IETF, taking on some of the toughest questions, like how to make a world with two protocols (SIP and XMPP) friendly for real users.

A serious issue for all Linux distributions

Communications technology is one of the most pervasive applications and also one of the least forgiving.

Users have limited patience with phones that don't work, as the Australian Russell Crowe demonstrated in his infamous phone-throwing incident.

Maximizing the number of possible users is the key factor that makes networks fail or succeed. It is a knife that cuts both ways: as the free software community struggles with this issue, it undermines our credibility on other issues and makes it harder to bring free desktops to our friends, families and workplaces. Do we really want to see the rest of our work in these areas undermined, especially when there is at least one extremely viable option knocking at the door?