At Sherpany, the Engineering team runs an initiative called ‘Engineering Lab’, which allows each team member to spend 10% of their time working on a project of their own. This motivated one of our Windows Engineers, Max, to come up with an idea and build a working prototype as a proof of concept (PoC). The idea behind the PoC is to control our Sherpany App for Windows with the help of speech recognition technology offered by Microsoft.
The speech recognition technology identifies key phrases, which can be used to navigate the app. It understands basic commands such as ‘next page’ and ‘previous page’, which enables users to work and present hands-free. It even recognises these commands in multiple languages, making it a viable option for us, since we have customers all over Europe.
We found the key-phrase-based navigation to be fast and reliable, and it works offline. We used programmatic list constraints, which are designed to work well with short, distinct phrases or commands, for example ‘next page’ or ‘previous page’. The challenges started when a voice command was embedded in a normal sentence: it could not always be recognised, which could interrupt the flow of a presentation during a meeting. Using continuous speech recognition instead of programmatic list constraints is still error-prone and not yet ready for production.
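To make the difference concrete, here is a minimal sketch in Python. It is not the code from the PoC and does not use the Microsoft speech APIs; it only assumes that a recogniser has already turned audio into text. The command table and the function names (`COMMANDS`, `match_exact`, `match_in_sentence`, `navigate`) are hypothetical and exist purely for illustration.

```python
import re
from typing import Optional

# Hypothetical command table: each short phrase maps to a page offset.
COMMANDS = {
    "next page": 1,
    "previous page": -1,
}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(without_punctuation.split())


def match_exact(utterance: str) -> Optional[int]:
    """List-constraint style: the whole utterance must be a known phrase."""
    return COMMANDS.get(normalize(utterance))


def match_in_sentence(utterance: str) -> Optional[int]:
    """Naive spotting: accept a known phrase anywhere in the utterance."""
    text = normalize(utterance)
    for phrase, offset in COMMANDS.items():
        if phrase in text:
            return offset
    return None


def navigate(page: int, offset: Optional[int], page_count: int) -> int:
    """Apply a page offset and keep the result within 1..page_count."""
    if offset is None:
        return page  # nothing recognised, stay on the current page
    return max(1, min(page_count, page + offset))
```

With this sketch, `match_exact("Next page")` finds the command, while `match_exact("let us move on to the next page")` finds nothing, which mirrors the limitation described above. The naive `match_in_sentence` does pick up the command inside that sentence, but it would also react to "we will not go to the next page yet", a small illustration of why spotting commands in free speech is error-prone.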
In a second attempt, the Cognitive Services API provided by Microsoft turned out to be a more powerful way to address these limitations. However, it still wasn’t reliable enough, and the API also conflicted with important privacy requirements because it sends the recorded audio to Microsoft servers.
All in all, the powerful tools and APIs provided by Microsoft fascinated the team. Our ‘Engineering Lab’ project, led by Max, was rated internally as an interesting and promising approach. We are pleased that we managed to prove it works in a PoC.
At Sherpany, we strive to use cutting-edge technologies, but we also treat privacy and data protection as matters of utmost importance. For that reason, we won't implement speech recognition in the Sherpany App yet, but we will keep engaging with new technologies as part of our ambition to build award-winning apps.
Sherpany is hiring!
If you want to work in an environment where your contribution counts, check out our open positions. Sherpany is a young, multi-award-winning start-up with headquarters in Zurich and further offices in Lisbon, Paris, Milan and Wroclaw. With our SaaS solution, we turn formal meetings from time-wasters into value-creators for executives, board members, corporate secretaries and assistants by keeping the focus on the output.