Running successful hybrid meetings is all about effective time and task management. When done right, video conferencing is a harmonious blend of audio and visual elements in a virtual setting so people can communicate as if they were in the same physical location. Yet many video conferences begin with participants sorting out connection problems and saying “Can you hear me?” accompanied by a hand wave for visual confirmation.

Audio is a critical component of video conferencing that is often overlooked. When voices are crisp, clear, and natural sounding, it creates a sense of presence and connection, making participants feel more engaged and involved in the discussion.

A ceiling-mounted speakerphone working in tandem with a camera is one way to streamline getting a meeting started. We spoke with Alvin Xu, Senior Manager at AVer’s Video Conferencing Department, about why the FONE700 takes conversations to the next level. (As a side note, among major AV manufacturers, Alvin and his engineering colleagues are the only audio & video R&D team based in Taiwan.)

Why Mount on the Ceiling?

Let’s get straight to the heart of this speakerphone. What’s the big deal with lifting the entire speaker-microphone setup from the conference table onto the ceiling?

In short, you get better audio coverage, better audio quality, and freedom of movement to speak from anywhere in the room. Plus, the clean aesthetics of a discreet flush-mount design are appealing.

Audio and Video Features

Sure, most people can appreciate a clutter-free table. You also mentioned better audio — does it really make such a big difference having the speakerphone on the ceiling instead of in front of you?

Anyone who’s been on a conference call will be familiar with the tendency to cluster around the speakerphone on the table, either because it’s your turn to speak and you’re not sure whether those on the other end can hear you, or because you can’t hear them and have to move closer and adjust the volume on the console.

So our inspiration was: “What if we make it seamless for you to walk into a room, plug your laptop (running Zoom, Teams, or whatever) into the control box and just start talking?” No connection worries, crowding around speakerphones, or fidgeting with volume controls.

From the ceiling, the FONE700 has 360-degree audio coverage that reaches all corners of the room. Contrast that with a poorly set-up table-top speakerphone: if you happen to be sitting far from the microphone, your voice can sound distant to people on the other end of the call, a kind of chamber effect. Not so with a ceiling solution. You get consistent high-quality audio output as if the person is talking right in front of you. And that’s the case even if you move around the room, because audio tracking works best from the top down, where there are no obstructions.

The FONE700 works in conjunction with AVer cameras.

The FONE700 isn’t only about audio. It syncs to a camera too. Can you elaborate on that?

With 3D audio tracking, the camera and FONE700 work seamlessly as a team: the FONE700 detects the active speaker’s location (coordinates) and guides the camera to automatically adjust its view and choose the best image. There’s a whole host of AI features such as noise suppression, double-talk detection, and de-reverberation at work to distinguish the active speaker from background noise and ensure that the camera focuses on the right person. No more manual adjustments! Enjoy smooth meeting flow, clear FONE700 audio, and focused visuals for a truly engaging meeting.

What Is “Good Audio Quality”?

Most people struggle to define good audio quality but can easily recognize bad audio when they hear it. What are some of the key factors that contribute to good audio quality for video conferencing?

In essence, good audio quality preserves the nuances of the human voice. Now how is that achieved?

One critical aspect is minimizing background noise, such as keyboard clacks, ambient chatter, or other disruptive sounds that can affect the clarity of the audio if within microphone range. Software in our video conferencing systems can help dampen that noise, but obviously it’s best to remove the source of the noise as much as possible.

On a technical level, remember that sound waves are generated by vibrations that vary in frequency. How often these sound waves are captured affects our perceived audio quality. This is called the sampling rate. It’s the number of times per second that an audio signal is captured and converted into digital format. Normally, people can hear sounds between 20 Hz and 20 kHz, and a digital system can only reproduce frequencies up to half its sampling rate. A higher sampling rate therefore preserves more high-frequency detail and nuance, resulting in a more natural, high-fidelity sound.
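To make the sampling-rate point concrete: a digital system can only represent frequencies up to half its sampling rate (the Nyquist limit), and a tone above that limit folds back as a false lower tone. The short Python sketch below works through that arithmetic. The function names and the sampling rates chosen are illustrative assumptions, not FONE700 specifications.

```python
def nyquist_limit_hz(sampling_rate_hz: float) -> float:
    """Highest frequency a given sampling rate can represent."""
    return sampling_rate_hz / 2.0


def min_sampling_rate_hz(highest_frequency_hz: float) -> float:
    """Lowest sampling rate that can represent the given frequency."""
    return 2.0 * highest_frequency_hz


def aliased_frequency_hz(tone_hz: float, sampling_rate_hz: float) -> float:
    """Frequency a pure tone appears at after sampling.

    Tones at or below the Nyquist limit come back unchanged; tones above
    it fold back to a lower, incorrect frequency.
    """
    nearest_multiple = round(tone_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(tone_hz - nearest_multiple)


if __name__ == "__main__":
    # Illustrative rates only (assumed for this example).
    for rate in (8_000, 16_000, 48_000):
        print(f"{rate} Hz sampling captures up to {nyquist_limit_hz(rate):.0f} Hz")
    print("Rate needed for 20 kHz hearing range:", min_sampling_rate_hz(20_000), "Hz")
    print("5 kHz tone sampled at 8 kHz appears at:", aliased_frequency_hz(5_000, 8_000), "Hz")
```

Covering the full 20 Hz to 20 kHz range of hearing therefore calls for a sampling rate of at least 40 kHz, which is why rates such as 44.1 kHz and 48 kHz are common in audio equipment.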

Bit depth is another important factor. This refers to the number of bits used to represent the amplitude (or loudness) of each audio sample in digital form. More bits, such as 24-bit, can capture a wider dynamic range and more subtle variations in volume than the more common 16-bit. The higher the bit depth, the richer and more lifelike the audio will sound.
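As a rough illustration of why bit depth matters, each extra bit doubles the number of loudness steps available, which adds about 6 dB of theoretical dynamic range. The sketch below computes that idealized figure; the function names are assumptions made for this example, and real-world equipment delivers less than the theoretical maximum.

```python
import math


def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude steps available at a given bit depth."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical ratio between the loudest and quietest step, in decibels."""
    return 20.0 * math.log10(quantization_levels(bit_depth))


if __name__ == "__main__":
    for bits in (16, 24):
        print(
            f"{bits}-bit: {quantization_levels(bits):,} levels, "
            f"about {dynamic_range_db(bits):.1f} dB of dynamic range"
        )
```

On these idealized numbers, 24-bit audio has 256 times as many amplitude steps as 16-bit audio.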

A high-quality video conferencing system optimizes all these factors to deliver an audio experience that feels natural and true-to-life.

Challenges with Audio Quality

What are some common pitfalls to avoid when it comes to video conferencing audio quality?

The most common problem we see is the acoustics of the conferencing environment. Echo, reverberation, and ambient noise in the room degrade audio quality if not properly mitigated. A typical example is a speakerphone placed near loud air conditioning equipment.

Often we also see a mismatch between audio equipment and room size — too few audio pickup points in a large room, which results in people sitting farther away sounding distant to people on the other end of the call.

And lastly, there are problems with bandwidth. Video conferencing audio nowadays is mostly transmitted over the Internet. With insufficient bandwidth, it’s like a traffic jam: too much data is trying to get through the pipeline at once.
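To get a feel for how much data audio can put on that pipeline, the sketch below multiplies sampling rate, bit depth, and channel count to get the raw, uncompressed bitrate. This is only back-of-the-envelope arithmetic under assumed settings: conferencing applications compress audio before sending it, so actual network usage is much lower than these raw figures, and the function name is hypothetical.

```python
def pcm_bitrate_kbps(sampling_rate_hz: int, bit_depth: int, channels: int = 1) -> float:
    """Raw (uncompressed) audio bitrate in kilobits per second."""
    return sampling_rate_hz * bit_depth * channels / 1000.0


if __name__ == "__main__":
    # Assumed example settings, not measurements of any product.
    settings = [
        ("16 kHz, 16-bit, mono", 16_000, 16, 1),
        ("48 kHz, 24-bit, mono", 48_000, 24, 1),
        ("48 kHz, 24-bit, stereo", 48_000, 24, 2),
    ]
    for label, rate, bits, channels in settings:
        print(f"{label}: {pcm_bitrate_kbps(rate, bits, channels):.0f} kbps uncompressed")
```

Higher sampling rates and bit depths improve fidelity, but they also raise the amount of data that has to be encoded and transmitted, which is the trade-off behind the traffic-jam analogy.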

Improving Audio Quality

What about the surrounding noise that invariably occurs in a meeting? Any suggestions for minimizing background noise?

Aside from designing the ideal meeting room, fortunately, a good speakerphone system will filter out a lot of the background noise.

The FONE700’s audio and visual pairing caters for the nuances of real-life meetings: someone sneezing very loudly, two people having a side conversation, or another person getting up to get a bottle of water. None of these incidental noises and movements pulls the camera and audio pickup away from the meeting’s center of attention.

The FONE700 has built-in features such as acoustic echo cancellation (AEC), advanced noise reduction, automatic gain control (AGC), double-talk detection, de-reverberation, AI noise suppression, and voice fencing.

These features are refined using AI voice sampling. With more profiles of distinctive human voices used to train the AI, our systems improve in their ability to preserve the richness and detail of the original audio signal. This creates a much more immersive experience that feels closer to an in-person conversation.

There are also specific sensitivity adjustments you can make in the FONE700. For example, to prevent the camera from panning dizzyingly to every person who makes a brief utterance, you can manually set a longer voice trigger duration.

You can also adjust the idle interval so that once someone stops talking, the camera immediately zooms out to group view rather than remaining (awkwardly!) fixated on that one person.
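The interplay between the voice trigger duration and the idle interval can be pictured as a small state machine. The Python sketch below is a simplified model written for this article, not AVer’s firmware or API: the class name, parameter names, and default values are all hypothetical. A new speaker must talk continuously for the trigger duration before the camera switches to them, and the camera returns to group view once the focused speaker has been silent for the idle interval.

```python
from dataclasses import dataclass
from typing import Optional

GROUP_VIEW = "group"


@dataclass
class CameraDirector:
    """Toy model of speaker-tracking sensitivity settings (hypothetical)."""

    voice_trigger_s: float = 2.0   # how long someone must talk before the camera moves
    idle_interval_s: float = 1.0   # silence allowed before returning to group view
    _focus: Optional[str] = None
    _candidate: Optional[str] = None
    _candidate_since: float = 0.0
    _last_voice: float = 0.0

    def update(self, now_s: float, speaker: Optional[str]) -> str:
        """Feed in who is talking right now; get back the view to show."""
        if speaker is None:
            # Silence: drop any pending candidate, and release the focus
            # once the focused speaker has been quiet long enough.
            self._candidate = None
            if self._focus is not None and now_s - self._last_voice >= self.idle_interval_s:
                self._focus = None
        elif speaker == self._focus:
            # The focused speaker is still talking.
            self._last_voice = now_s
            self._candidate = None
        else:
            # Someone else is talking: start or continue timing them.
            if speaker != self._candidate:
                self._candidate = speaker
                self._candidate_since = now_s
            if now_s - self._candidate_since >= self.voice_trigger_s:
                self._focus = speaker
                self._last_voice = now_s
                self._candidate = None
        return self._focus if self._focus is not None else GROUP_VIEW


if __name__ == "__main__":
    director = CameraDirector(voice_trigger_s=2.0, idle_interval_s=1.0)
    timeline = [(0.0, "alice"), (1.0, "alice"), (2.0, "alice"),
                (2.5, "bob"), (3.0, "alice"), (3.5, None), (4.0, None)]
    for t, who in timeline:
        print(f"t={t:>3}s speaker={who!s:<5} view={director.update(t, who)}")
```

In this toy timeline, the brief interjection at 2.5 seconds does not pull the camera away from the main speaker, and the view returns to the group one second after the talking stops. A longer trigger duration makes the camera steadier; a shorter idle interval makes it return to group view sooner.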

For the people on the other side of the video conference, all these refined adjustments make for a more natural two-way conversation.

Setting Up the FONE700

Let’s talk about setup requirements.

The IT department will initially configure camera and speakerphone settings to perform audio tracking in the conference room. Once set up, there’s no need to think about where in the room participants should sit for the best audio quality because it’s all covered.

Ultimately, the FONE700 fulfills the functional requirements of dedicated conference rooms which are popular in hybrid work environments nowadays. You set it and leave it. For meeting participants, it’s all about just walking in and starting to talk.

How many microphone pickup points are ideal?

That really depends on the room setup. Intuitively, the more mics, the better the sound quality. But why is that? Basically, a single omnidirectional microphone picks up sound equally from all directions. The more mics you combine in an array, the more directional the overall setup becomes, and the wider the range of sound frequencies over which it can stay focused on the person speaking. The bigger the space and the more people in it, the more mic pickup points you’ll need. Ultimately, cutting corners on the technical specifications will result in a subpar audio experience.

It’s also worth noting the importance of placing your microphone correctly. When using a headset, be sure the mic is at least 15-20 cm (6-8 inches) away from your mouth. In a conference room environment, stay within the pickup distance the microphone was designed for, usually no more than 3.5 m (11 ft).

What new features are you considering for future iterations of the FONE700?

Audio fencing has potential for further development. Basically, you can mask audio coming from certain designated zones in the room.

Another interesting area is multi-camera support. Just like TV interviews with a camera focused on each person and the audience sees different perspectives, we can replicate that in-person feel of being in the same room with other meeting participants. The FONE700 coordinates multiple audio and visual feeds so the entire conference call plays out like a well-orchestrated symphony!
