Recently, the technology outlet MacStories ran a hands-on test of Apple's newly launched Speech API. The technology transcribed a 34-minute, 7GB 4K video in just 45 seconds.
The technology was announced at WWDC 2025 and includes two key modules: SpeechAnalyzer and SpeechTranscriber. The MacStories team tested transcription performance with Yap, a tool built on these modules. In the test, Yap processed the video considerably faster than the other mainstream transcription tools it was compared against.
In the comparison, Yap completed the transcription in 45 seconds, while OpenAI’s Whisper (run through MacWhisper with the V3 Turbo model) took 101 seconds. Yap therefore needed about 55% less time, or put the other way, Whisper took more than twice as long. VidCap and MacWhisper with the V2 model took 1 minute and 55 seconds and 3 minutes and 55 seconds respectively, further highlighting Yap’s advantage.
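The percentages are easy to misread, so the arithmetic behind them can be sketched as follows. This is only an illustrative check using the timings and the 34-minute video length quoted in this article; the `compare` helper and the tool labels are names made up for this example, not part of any tool mentioned here.

```python
# Timings in seconds, taken from the figures quoted in the article.
timings = {
    "Yap": 45,
    "MacWhisper (V3 Turbo)": 101,
    "VidCap": 115,            # 1 min 55 s
    "MacWhisper (V2)": 235,   # 3 min 55 s
}

VIDEO_SECONDS = 34 * 60  # the 34-minute test video


def compare(baseline: float, other: float) -> tuple[float, float]:
    """Hypothetical helper: return (percent less time the baseline needed,
    how many times longer the other tool took)."""
    time_saved_pct = (other - baseline) / other * 100
    slowdown_factor = other / baseline
    return time_saved_pct, slowdown_factor


yap = timings["Yap"]
for name, seconds in timings.items():
    if name == "Yap":
        continue
    saved, factor = compare(yap, seconds)
    print(f"{name}: Yap needed {saved:.0f}% less time; "
          f"{name} took {factor:.1f}x as long")

# Real-time factor: seconds of video transcribed per second of processing.
print(f"Yap real-time factor: {VIDEO_SECONDS / yap:.0f}x")
```

Note that "55% less time" and "55% slower" are not the same claim: measured against Yap's 45 seconds, the 101-second run takes more than twice as long.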
All of the tools made some errors on proper nouns, such as misrecognizing "AppStories," so the test did not show a clear accuracy gap. Where Yap stands out is speed, which it owes to on-device processing. For users who transcribe several videos every week, that time saving adds up quickly.
Apple’s work on transcription not only improves efficiency but also makes life easier for creators, educators, and content producers. As the technology becomes more widespread, it may find further use in video processing and content generation within enterprises. In short, Apple’s new technology marks a major step forward for speech transcription and promises to make content production faster and smarter.