WebRTC video codec

Posted on May 25, 2013 at 08:10 in WebRTC
 

H.264? VP8? Which video codec should the IETF vote as WebRTC’s MTI (mandatory to implement)?

It doesn’t matter. We can argue IPR, processing, quality, power and hardware acceleration all day long. And the discussions are valuable and interesting. But the decision doesn’t matter.

WebRTC will be implemented in both browsers and apps but the browsers are the key. WebRTC is just another way to embed RTC in apps – better in certain respects but not earth shattering in that context without the browsers in the picture.

However, there are more than 1 billion browsers with VP8 implemented in their WebRTC implementations. Over 1,000,000,000.

Will those browsers use H.264 in their WebRTC implementations? No. So, even in a potential IETF vote of H.264 as the only MTI WebRTC video codec, VP8 (and VP9 etc) will still dominate browser based RTC, and that is the sweet spot for WebRTC. Good or bad, VP8 is already the de facto MTI video codec for browser-based RTC implementations.
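
To make the point concrete, here is a minimal sketch of how a codec ends up being chosen in an offer/answer exchange: the session can only use a codec that both ends actually implement, whatever the standard labels as mandatory. The helper name `pickCommonCodec` and the endpoint lists are hypothetical and exist only for illustration; real browsers negotiate through SDP rather than plain string lists.

```typescript
// Hypothetical helper (not part of any WebRTC API): return the first codec in
// the offerer's preference order that the answerer also supports.
function pickCommonCodec(offered: string[], supported: string[]): string | null {
  for (const codec of offered) {
    const match = supported.some(
      (candidate) => candidate.toUpperCase() === codec.toUpperCase()
    );
    if (match) {
      return codec;
    }
  }
  return null;
}

// Illustrative endpoints only; the codec lists are assumptions.
const browserOnlyVp8 = ["VP8"];
const h264OnlyEndpoint = ["H264"];
const dualCodecEndpoint = ["H264", "VP8"];

// An H.264-only endpoint has no video codec in common with a VP8-only browser.
console.log(pickCommonCodec(h264OnlyEndpoint, browserOnlyVp8));

// An endpoint that wants to reach those browsers ends up on VP8 anyway.
console.log(pickCommonCodec(dualCodecEndpoint, browserOnlyVp8));
```

Under these assumptions, anyone who wants to talk to the installed base of browsers has to ship VP8, which is the sense in which it is already the de facto MTI.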

More on WebRTC:

WebRTC update at Google I/O 2013

WebRTC – game changer with opportunities for emerging and traditional telecoms and service providers

    Feb 04, 2013
     

    You are in Barcelona and I am in Atlanta.   We are checking out some photos together via a photo sharing app.

    Avalanche Lake photo

     

    Avalanche Trail via WebRTC 

    While showing your photos, you gesture on your Droid to instantly connect audio between us, telling me that the photo we are looking at is from an unbelievable hike you took at Glacier National Park on the Avalanche Lake trail at Logan’s Point.  You tell me to hike it in June when the beargrass is blooming, and, when the trail hits the lake, to walk another quarter of a mile south along the perimeter of the lake to find a great swimming area (and to jump in, no matter how cold the water).

    There are a few different trails from the swimming area – one that is stunning and one that you found overrated – but you can’t remember which is which.  We see in the photo app that one of the guides is online so we ask her if she can join us to answer an Avalanche Lake question.  She decides to connect via video, helps us figure out which trail is which, and demonstrates how her external framed backpack will work better than the internal frame one I was going to buy.

    I save the entire session with one click so that I can swipe through to the audio and video when I go through these photos again.  I can also gesture to pull in smart links – links based on the keywords in the audio of our conversation, the geotags in the photos we reviewed, information from my social graph and preferences determined by my usage patterns (e.g. show me relevant Lonely Planet links but not TripAdvisor links).  A final click books the helpful guide.

    Avalanche trail via today’s disjointed communications paradigm

    You post a bunch of your Glacier photos on the web one night and I happen to see them.  I call you a week later but get your voice mail.  We eventually connect and you tell me about the Avalanche hike.  I think I remember the photo you are talking about, but I’m not even sure if I saw it on Twitter, MMS or email.  I don’t have time to check because you called me right before a meeting.  The other trail, pack recommendations and the good guide?  No chance.

    WebRTC – business drivers

    Our photo sharing experience is just one example of how WebRTC will change the communications world.  There are many more WebRTC benefits in the enterprise and SMB worlds, and far more potential benefits that we haven’t yet imagined – magic mixes of functionality, simplicity, cost and quality.  See @disruptivedean, @aswath, and @tsahil for more thoughts on the development of WebRTC.

    However, the WebRTC-enabled benefits aren’t enough by themselves.  WebRTC needs business drivers and business models to enable the magic mix of benefits for the early adopters, get us towards critical mass and profitably expand to the majority.

    Business drivers for the new providers

    In geeky telco circles, we discuss how hard it is for telcos to move from undesirable transactions like phone calls to the natural interactions of in-person communication. Our entire business models, OSS/BSS stacks, organizations and processes are built around these transactions.

    However, we don’t think about the other side as much: it is very hard and expensive for application providers to build into the transaction model of today’s communications, and most of them are too smart to even try. We’ve created a formidable moat around our legacy communications castle: telcos are locked in, and app providers are mainly locked out.

    WebRTC, paired with pervasive Internet, enables any service provider, any web developer and any application provider to become a telco.  WebRTC helps bridge the moat from a provider perspective and helps enable  freemium business models.  Simple voice and video will move towards free, and we’ll pay for the features on top of the voice and video. Any provider, not just your friendly neighborhood telco, will be able to leverage WebRTC to make money from freemium interactions and free us from the transaction shackles.  This includes both direct retail companies providing us with WebRTC-powered apps and services, as well as enablers like Twilio and Voxeo.  Who knows exactly how this ecosystem will develop, but it will develop.

    Business drivers for the mobile carriers

    • Can you hear me now?  If Verizon doesn’t cannibalize their own communications revenue, others will. WebRTC will help any application support voice and video. There is still value in our CLIDs (for now), integrated service delivery and any-to-any communications.  However, user experience and costs trump everything else.  If Verizon gives me identification/integration/pervasiveness, and helps to enable the experience of making video calls with one touch from my photo sharing app, then I’ll take it.  Else, you and I are still video chatting while perusing photos, but Verizon is asking why they can’t hear us now.
    • Can Verizon afford to hear me?  Verizon needs to offload traffic from their wireless last miles.  So does your mobile carrier. Verizon will not be able to collect enough revenue to offset the tremendous infrastructure and support costs of supporting 4G and better data for upcoming subscriber densities.  WebRTC enabled voice and video is a powerful tool for Verizon to leverage in their need to offload mobile data to other last miles.  An example is when you are in your home or business.  Verizon could easily detect that and transfer any sessions to/from your mobile phone on to your computer or tablet browser.  Verizon could put your mobile phone directory right on your browsers, enable you to call with your mobile CLID, gateway any calls that you make to the PSTN (or any island not directly compatible), and of course offer you integration with other WebRTC features.

    Business drivers for the fixed line carriers

    You can already see the big carriers starting to work Congress and the media to start lobbying for subsidies to tear down the business that our taxes and government regulations have helped prop up.  The fixed line business has a short half life.  The main leverage these carriers still have is inertia and our credit cards.  Can WebRTC help them survive?  Maybe, so they better take a shot.

    One idea as an example: convert all of our lines to DSL, enabling carriers to:

    • Keep selling POTS to folks that want it, but do it in an affordable, extensible manner.  Give these customers a WebRTC-powered VoIP client.  Wrap it up inside a phone-like appliance if necessary.  Unlimited calling for a flat rate of whatever price they are paying per month now.  Start saving millions in infrastructure, support and OSS/BSS.
    • Sell backup Internet access.  Sell that DSL connection as a second (or third) Internet connection.  WebRTC and the Internet of Things will help make these redundant connections increasingly important, even if it is just for low bandwidth transactions and emergency backup purposes.  Carriers may not even sell it directly to consumers – providers of certain services and apps may pay for the redundancy (perhaps you pay ADT for a platinum level of monitoring and ADT in turn pays your DSL provider).
    • Sell WebRTC enabled apps and services.  The carrier has the business and billing relationship with you.  Leverage it to sell WebRTC enabled apps and services, and/or serve as a commission-carrying channel for other WebRTC communications providers.
    • Sell primary Internet.  Some of us are close enough to central offices, and/or have limited enough Internet needs, that we buy our only Internet connection from the fixed line carrier.
    • Sell WebRTC IaaS to WebRTC app providers.  WebRTC can take advantage of cloud infrastructure such as proxy servers and TURN servers.  Telcos will have racks of empty data center space once they get rid of old class 5 and class 4 switches.  Telcos also have backbone bandwidth, peering relationships and network engineers.  Most WebRTC app providers don’t have those assets, and would prefer not to invest in them.  Infrastructure as a Service deals could be made.
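
As a rough illustration of the IaaS idea in the last bullet, this is what pointing a WebRTC app at carrier-hosted STUN/TURN infrastructure might look like. The object shape mirrors the `iceServers` entries of the browser's `RTCConfiguration`; the interfaces are declared locally so the sketch type-checks outside a browser. The helper `buildPeerConfig`, the hostname, the port and the credentials are all hypothetical placeholders.

```typescript
// Local stand-ins for the browser's RTCIceServer / RTCConfiguration shapes.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Hypothetical helper: build a configuration that uses a carrier-hosted
// STUN/TURN service. Port 3478 is the standard STUN/TURN port.
function buildPeerConfig(host: string, username: string, credential: string): PeerConfig {
  return {
    iceServers: [
      // STUN lets a client discover its public address.
      { urls: [`stun:${host}:3478`] },
      // TURN relays media when no direct path between peers can be found.
      {
        urls: [`turn:${host}:3478?transport=udp`, `turn:${host}:3478?transport=tcp`],
        username,
        credential,
      },
    ],
  };
}

// Placeholder values; a real deployment would issue short-lived credentials.
const carrierConfig = buildPeerConfig("turn.telco.example", "app-123", "not-a-real-secret");
console.log(JSON.stringify(carrierConfig, null, 2));
```

In a browser, an object of this shape would be passed to `new RTCPeerConnection(...)`; the relay bandwidth behind the TURN entry is the part an app provider would be buying from the carrier.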

    We can come up with many other similar scenarios.   Bottom line: WebRTC is an opportunity that the fixed line providers should not pass up.  They have business motivations, e.g. survival, to try to dive into WebRTC waters.  They have some assets to try to leverage while they still exist. 

    WebRTC red herrings

    • WebRTC will not make inter-island communications easy
    • WebRTC doesn’t provide identity and authentication solutions
    • WebRTC will be one of many – CU-RTC-WEB and others will also take segments
    • Browsers will not all support WebRTC and cross-browser compatibility will be problematic

    Many have argued that the statements above will doom WebRTC.  I made the argument too.  And, all the above statements, IMO, are true.  So what gives?  Well, at the end of the day, those statements don’t matter.  They are red herrings.

    • Islands are now in.  Communications islands used to be associated with uncertainty (who can I talk with on this island and how well will it work) and the possibility of work or friction (how difficult to get on and how difficult for my friend to get on)…the perceived costs were so high relative to the benefits (mostly cost) that most communications islands became lonely places (with a few exceptions).  However, now it is easy to go island hopping and we do it all the time.  With WebRTC helping to voice-enable and video-enable all islands, we will communicate from whatever island we are already on, or easily hop to another island if we prefer.  We often get caught up in the island red herring (I know I do) because of the utopia of any-to-any, IP-based communications, and our current any-to-any PSTN.  However, when the world changed 66 million years ago, did new dinosaurs replace the ones that became extinct?  No, entire new species developed because the entire ecosystem had changed.  Communications is the same – the entire ecosystem has changed - there will not be another PSTN because today’s Internet ecosystem is better suited for islands.
    • Multi-identity is the future.  Will WebRTC and friends federate identity across islands and use third-party identity providers?  Some will.  Other times we will choose to use the identity we use on a particular app (LinkedIn, Twitter, Facebook, etc.).  If our mobile carrier plays ball, we might choose our mobile CLID.  Other times we will choose to be anonymous.  The days of a handful of communications identities are gone.  WebRTC providers will put us in control of our identities and authentication choices, and it will be their job to take care of aspects like security and billing within that paradigm.
    • I love CU-RTC-WEB.  Yes, we are going to get a fragmented, heterogeneous communications ecosystem: WebRTC, CU-RTC-WEB, Flash, plug-ins, proprietary solutions, NextBigThing, etc.  We can argue WebRTC is better or worse than any of them.  WebRTC isn’t going to change the world by itself.  But it doesn’t need to.  The paradigm changes are pervasive Internet; separation of transport/access/connectivity from application and application provider; and the merging of communications, applications and services.  WebRTC is a very strong change agent in that paradigm shift and will work with others to change the communications world.
    • Remember IE6?  Browsers have never been compatible.  But most users don’t know it when the developers do a good enough job coding for the differences.  Remember all the CSS pain (or still feeling it?) of IE6 versus just about every other “modern” browser?  Whether WebRTC is wrapped in browsers or apps (and it will be in both), the developers will ensure that the differences between the implementations don’t cause us pain.  This isn’t like working around the incompatibilities caused by SIP SDP m-line differences – this will not be a major issue.

    Conclusions

    Telepresence is the closest current remote communications method to in-person communications: telepresence gives us distance collapsing features such as eyeball to eyeball quality, HD resolution and  immersive collaboration.  I use telepresence all the time and it is great.  However, telepresence, and its communications cousins, are still mainly discrete transactions, whereas in-person communication is a series of interactions.  The most disruptive user benefit of WebRTC is to melt the differences between the transaction and interaction paradigms, like in the Avalanche Lake series of interactions.  WebRTC will help move the (remote) communications world towards interactions.

    Timing is everything.  WebRTC is entering an Internet ecosystem that is full of new business model niches.  WebRTC will help providers fill many of these niches, and create new ones, helping to change the communications world.  This is not a one-winner war – WebRTC will act with other change agents to enable this paradigm change.

    Just like at Avalanche Lake, jump in, even if the water seems a bit intimidating.

     

      Jan 06, 2013
       

      Will WebRTC change the communications world? Since WebRTC is currently in a hype phase, I will take the contrarian position in this post, describing why WebRTC is not the Next Big Thing, and then take the pro-WebRTC view in a future post.

      WebRTC, half of a job done well

      WebRTC gets your voice and video up to (*any*) application. MediaStreams and the getUserMedia API provide a simple and standardized way for a browser to grab media and act on it, and that’s very powerful. Lots of functionality “moves” into the browser – the RTP stack and media codecs, firewall and NAT traversal mechanisms, signaling and media encryption functions. All goodness. However, the second half of the job is connecting two or more people/browsers/apps.

      Houston, we have a problem

      WebRTC took our video streams up to our respective browsers/apps. Now, how do we connect our browsers such that we can enjoy a video session? PeerConnection focuses on this inter-party voice and video communication aspect of WebRTC, but PeerConnection leaves too many key functions open or undefined. Each WebRTC app then needs to decide how to implement these functions (the positive view on this in a future pro-WebRTC post).

      WebRTC islands

      The list of critical functions that are largely undefined in PeerConnection includes codecs, authentication, identity and routing. Undefined or loosely defined functions often result in islands because all the implementers then make their own choices, often choices that don’t end up playing nicely together. Video codecs are still not locked down. On the audio codec side, G.711 will be mandated, but the more interesting wide-band audio codecs will likely all be optional. Everything else – identity, authentication, privacy, routing, presence, QoE, TURN (or not), etc. – is left to the application implementer. PeerConnection will result in islands. Some attractive islands with cool beaches (functionality), but new islands aren’t the disruption that many WebRTC advocates are hoping for.

      Part of each WebRTC island is floating in the SIP Ocean

      PeerConnection uses a SIP SDP-like offer/answer protocol (ROAP). SIP SDP is not horrible but it is limited and doesn’t necessarily fit with more disruptive, interaction-based communications paradigms (always-on video sessions for example). This emulation of SIP SDP puts part of each WebRTC island into the SIP Ocean, and specific implementations may drag entire WebRTC islands into the SIP Ocean. There are good reasons for WebRTC/PeerConnection to leverage some SIP aspects, but WebRTC would be far more disruptive if it stepped completely into new waters. Data Channels – the part of WebRTC that focuses on peer-to-peer data sharing – also leverages PeerConnection (will discuss Data Channels in the pro-WebRTC post).
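
For readers who have not stared at SDP before, the sketch below shows the kind of text an offer carries and how the media ("m=") lines and codec mappings can be pulled out of it. The parser `parseMediaSections` and the sample offer are hypothetical and heavily simplified; a real offer from a browser contains many more attributes (ICE candidates, fingerprints and so on), and this is not how PeerConnection itself processes SDP.

```typescript
interface MediaSection {
  kind: string;      // "audio", "video", ...
  port: number;
  protocol: string;  // e.g. "RTP/SAVPF"
  codecs: string[];  // encoding names taken from a=rtpmap lines
}

// Hypothetical, simplified parser: reads only m= and a=rtpmap lines.
function parseMediaSections(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | null = null;

  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const parts = line.slice(2).split(" ");
      current = {
        kind: parts[0] ?? "",
        port: Number(parts[1] ?? "0"),
        protocol: parts[2] ?? "",
        codecs: [],
      };
      sections.push(current);
    } else if (line.startsWith("a=rtpmap:") && current !== null) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>
      const encoding = line.split(" ")[1];
      if (encoding !== undefined) {
        current.codecs.push(encoding.split("/")[0] ?? "");
      }
    }
  }
  return sections;
}

// Made-up minimal offer with one audio and one video section.
const sampleOffer = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 RTP/SAVPF 0",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 RTP/SAVPF 100",
  "a=rtpmap:100 VP8/90000",
].join("\r\n");

const mediaSections = parseMediaSections(sampleOffer);
console.log(mediaSections);
```

Each "m=" line describes one discrete media stream to be set up, which is part of why the model feels transaction-oriented: the structure assumes a session that is offered, answered and torn down.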

      External WebRTC obstacles

      The current communications landscape is riddled with WebRTC obstacles.

      Microsoft and Apple

      Microsoft IE and Apple Safari don’t have defined roadmaps to support WebRTC (IE can be hacked via Chrome Frame). Microsoft has vested interests – the leading enterprise communications UC client (Lync) and the leading consumer voice/video solution (Skype) – and in fact has proposed a WebRTC alternative, CU-RTC-Web, which is more robust or more complex than WebRTC, depending on your perspective. Apple has been quiet on WebRTC, keeps tight control of Apple platforms, and has trodden lightly in real-time communications, for example keeping FaceTime relatively limited. Until Microsoft and/or Apple find business value in WebRTC, they are likely to be obstacles.

      Mobile

      Mobile browsers are not committed to WebRTC, and aren’t as easily extended as their non-mobile OS browser cousins. While Google is one of the primary WebRTC drivers, that does not necessarily bode well for mobile in general since Google’s mobile competitors may respond with anti-WebRTC, or at least anti-Google-version-WebRTC stances. It will be interesting to see if Google drives WebRTC on Android and Chromium, and makes apps like Google Hangouts, Google Talk, and Google Chat be WebRTC “compliant”. Perhaps an Amazon, Samsung or Facebook type company will WebRTC enable their mobile apps, but that is more island-ish than the more pervasive and universal results that WebRTC advocates would like to see.

      The old guard and the current dominant business model

      Most traditional telcos are slow moving and have incredibly complex and inefficient ecosystems (that even they don’t completely understand) built around yesterday’s communications landscape. This applies to their business models, processes, systems and people. Telco ecosystems are therefore not easily changed, even with the best intentions, meaning even a “pro-WebRTC” telco is unlikely to move the needle, and may even slow progress if it bends WebRTC to fit the traditional telco model. However, some telcos do seem to understand the opportunity (see Telefonica’s acquisition of TokBox), and that is the first step in the process.

      New guard “telcos”, for example Voxeo and Twilio, may be more interesting than the old world telcos, but I will save that discussion for the pro-WebRTC post. Any type of telco will need to move from the transaction-based paradigms and business models of old-world communications (calls made out of context of any process and billed per minute) to the interaction-based paradigm that we need to get to (our apps enable us to easily interact, including by voice and video, whenever, wherever and however we wish).

      Conclusion

      At the moment, WebRTC is really the getUserMedia API and MediaStreams. That’s powerful but not the disruption that most WebRTC folks are hoping for. Some of this is due to the construction of WebRTC itself, some of it is due to external factors. In combination, the result will be another good island, not a paradigm shift in and of itself. I think communication is moving to islands with or without WebRTC, islands connected by each user (“Younified Communications”), but that is not what WebRTC’s strongest advocates would like to see.

        Jan 03, 2013
         

        According to The Next Web, Facebook is rolling out full VoIP calling in Canada and adding short voice clips to their iOS and Android Messenger apps. Canada seems to be the beta market, but the assumption is full VoIP would soon be enabled across the entire Facebook universe.

        Just the tip of the iceberg
        The Techmeme conversation around Facebook VoIP is quickly skewing to disruption, transformation and revolution. However, Facebook is just one of the many apps that will add voice and video capabilities. None of the individual apps will take over VoIP or video, but the aggregate result will replace the PSTN, and get us to the Younified Communications paradigm.

        Video and WebRTC on deck?
        It is interesting that Facebook went mobile-first with VoIP. It could have been a pure opportunity-based decision, but it could also indicate that Facebook is building a WebRTC app for the desktop. I also assume that Facebook video is around the corner, likely for both the mobile apps and the web version.

          Nov 02, 2011
           

          If we can eliminate the enterprise LAN walled garden concept, it may be the most significant single catalyst to the next blitz of application, product and service innovation. LAN elimination would overhaul inter-enterprise communications – VoIP, video, telepresence and messaging. It would help enable the complete realization of the potential of VoIP and communications over IP, including the integration of real-time communications into all enterprise apps, workflows and processes ["enterprise LAN" refers to the way the LAN is administered and managed as a private walled garden (combination of firewalls, ALGs, policies, addressing, routing, etc.), rather than to any specific technical implementation of the LAN itself].

          Enterprise LAN islands often completely prevent effective, universal, inter-enterprise real-time communications (most frequently in video and telepresence), and in other cases the LAN reduces island to island communication to least common denominator type approaches such as PSTN bridges (most often in VoIP and messaging). This is a problem today and is even more important tomorrow as real-time communication becomes increasingly embedded in applications, e.g. true real-time video communication.

          We take the existence of the enterprise LAN paradigm for granted, as a necessary evil to work around. The LAN walled garden is assumed in all standards-body work around enterprise VoIP and video, vendor roadmaps, service provider offerings, etc. But maybe the LAN can be eliminated, at least for most cases? In general, I think this means distributing “security” to the individual applications and devices, each of which has drastically different security requirements, rather than trying to first wall off the island of apps and devices. Specifics for future posts.