Zoom’s earnings may have exceeded all expectations in 2020 as millions of people turned to video conferencing during the pandemic. But Zoom isn’t the only game in town, and one day meetings won’t be entirely virtual.

So Microsoft is showing off a new Intelligent Speaker system designed for use in Microsoft Teams hybrid meetings, where some people are in the room and others join remotely.

Not only can these speakers pick up multiple voices in a conference room, but they also use artificial intelligence to create transcripts in real time. They can even differentiate up to 10 different voices and add participants’ names to the transcript so you know who said what.

Microsoft Intelligent Speaker

Microsoft says it’ll make its intelligent speakers available in private preview later in 2021. The speaker has a 7-microphone array for detecting voices and uses speech recognition to provide real-time captions for meetings and transcripts that can be viewed later.

While transcripts can help create a record of in-person meetings or virtual ones, some features are clearly aimed at folks following along in real time via a computer screen. For example, there’s support for real-time captions, and The Verge reports the speakers also support translation, allowing remote participants to follow the meeting in their own language.

via Microsoft

7 replies on “Microsoft Intelligent Speaker can generate real-time transcripts with speaker names during Teams meetings”

  1. I think I’d like to review Microsoft’s privacy policies before I agreed to allow my employer to implement AI voice recognition in our meetings.

    If there’s a link between my name and some kind of “voice profile”, I’d like to be assured that it can’t be shared with anyone, especially with other parties that have API access to Microsoft accounts. I’d also like to see this feature require a user opt-in.

    These kinds of things are starting to make me more vigilant about the things that my employer subjects us to, in terms of digital privacy. I’d really like to start seeing some laws passed in my country to provide more privacy rights to employees.

    1. The article says: “The speaker has a 7-microphone array for detecting voices and uses speech recognition to provide real-time captions for meetings and transcripts that can be viewed later.”

      So it sounds like at least part of this is being done in the speaker. At the very least, I would guess that the speaker has machine learning hardware to help identify differences in voices, and then maybe it passes that data to a cloud service that matches that data to existing voice profiles?

      1. This could be a good way to use idle cloud resources. I would not be surprised if pricing is based on when the transcripts are delivered… one price for same-day and another cheaper price for next day. The microphone just does the signal conditioning and then zips the audio file for transport to MS Cloud.

        1. It definitely presents the option for MS to offer different amounts of priority.

          However, I don’t see this feature being very useful unless the data is available live during the meeting, or at the very least, immediately after the meeting is over.
