In a discussion forum for travel writers (where I had posted an introduction of myself and our professional transcription & translation business), another member replied that, for simple dictation, she uses voice recognition software such as Dragon NaturallySpeaking.
When what she had said really sank in, I told our team and it felt like someone had kicked us in the gut!
Then our reaction changed to dismay and questioning, “Why should we even stay in business?” This came JUST after spending almost two months getting clear on why the transcription and translation business feels exciting to us and realizing that we want to assist and support creative, positive, motivated people to succeed in ways they have not been able to before, working on interesting projects!
We thought about emptying the Transcription Translation Services Website of all information — just leaving a notice (as a ‘public service’) pointing out to the people who THINK they need us to transcribe their audios that they should go and buy that software instead!
Rest assured, they were nothing but thoughts. Drastic thoughts!
Then we grabbed our thinking caps and started to think some more…
It’s definitely a legitimate question! Why SHOULD someone hire me if they can buy software for dictation and just do it themselves?
In what situations would dictation software be inferior to having a live, intelligent human being (who is passionate about helping their clients succeed) listening and transcribing their audio or video material instead?
Knowing such software exists, ANYONE might appropriately ask that question!
Having used this software in the past, and having received constructive criticism from our past clients, we decided to try it out again and answer the following questions…
The software cannot differentiate between multiple speakers. It will give you a series of words that could be mingled between two or more streams of thought, with punctuation all over the place.
It doesn’t. It guesses from the context and sometimes gets it wrong, which affects the overall quality of our transcriptions.
When people just speak naturally, their speech is filled with tons of filler sounds. Software sometimes catches them, but most of the time it won’t. People generally string multiple sentences together with ‘and’ and ‘but’ forever! Capturing all of this can be important in police cases or psychological research, and only a human can really deliver the transcription that is required.
As mentioned in the other answers, no. The software is generally programmed to base its punctuation on pauses in speech. Shorter pauses translate to commas (,) and longer pauses to periods or full stops (.). This can be annoying when revising transcriptions and trying to gather data from them.
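To make the pause-based approach concrete, here is a toy Python sketch of the idea. It is only an illustration: the function name `punctuate`, the two pause thresholds, and the sample timings are our own assumptions, not how any particular product actually works.

```python
# Toy illustration of pause-based punctuation.
# ASSUMPTIONS: the thresholds (0.3 s and 0.8 s) and the sample timings
# are made up for this example; real software is far more elaborate.

def punctuate(words_with_pauses, comma_pause=0.3, period_pause=0.8):
    """Turn (word, pause_after_in_seconds) pairs into punctuated text."""
    pieces = []
    capitalize = True  # the first word of each "sentence" gets a capital
    for word, pause in words_with_pauses:
        if capitalize:
            word = word.capitalize()
            capitalize = False
        if pause >= period_pause:
            word += "."        # long pause: treated as the end of a sentence
            capitalize = True
        elif pause >= comma_pause:
            word += ","        # short pause: treated as a comma
        pieces.append(word)
    return " ".join(pieces)


# A speaker who hesitates after "to" while searching for the next word.
sample = [("we", 0.1), ("went", 0.1), ("to", 0.9),
          ("the", 0.1), ("store", 0.4), ("yesterday", 1.2)]

print(punctuate(sample))  # We went to. The store, yesterday.
```

One hesitation in the middle of a thought is enough to split the sentence in the wrong place, which is exactly the kind of error a human transcriber would not make.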
Just like for a human transcriptionist, an unclear audio recording can affect the quality of a software’s transcriptions. Software isn’t anywhere near having the kind of brain power needed to differentiate between multiple sounds. Humans find it difficult too, but they can do it a lot better.
Depending on the software, the punctuation can be all over the place. It will require some post-production work to polish everything off and create a decent-quality product for our clients.
Transcription of any recording, unless it is filled with silence, takes 4 times as long as the actual recording takes to play. When you use software, it can substantially cut down the production time, but it will cause a lot of mistakes that will need at least an hour of cleaning up per 5 pages. An hour-long recording can produce 20 pages of written transcription.
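To put rough numbers on that, here is a small Python sketch that uses only the figures quoted above (4 times the recording length for manual work, 20 pages per recorded hour, one hour of cleanup per 5 pages). The function names are ours, chosen for illustration, and the sketch leaves out the time the software itself takes to run.

```python
# Back-of-the-envelope comparison using the figures from this post.
# ASSUMPTION: the function names and the idea of treating these figures
# as fixed rates are ours; real projects vary a lot.

def manual_hours(recording_hours, ratio=4.0):
    """Hours a human needs to transcribe a recording from scratch."""
    return recording_hours * ratio


def cleanup_hours(recording_hours, pages_per_recording_hour=20,
                  pages_cleaned_per_hour=5):
    """Hours needed to clean up the software's output afterwards."""
    pages = recording_hours * pages_per_recording_hour
    return pages / pages_cleaned_per_hour


print(manual_hours(1))   # 4.0
print(cleanup_hours(1))  # 4.0
```

On these figures, cleaning up after the software for a one-hour recording takes about as long as transcribing it by hand in the first place.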
Furthermore, in fostering our relationships with the clients we’ve worked with, there is the unquantifiable element of our team being a sort of objective outsider who can catch errors or discrepancies in the CONTENT. And we often even come up with valuable ideas to help them improve their material!
There is a creative, collaborative give and take between our clients and our transcribers and translators that often seems to be of benefit to BOTH beyond the action of me ‘just transcribing their audio recordings’.
So, I concluded, there IS still a need for my services by many people! Not all, but I’m sure enough to keep me busy. I actually do enjoy this kind of work under the right circumstances and with the types of clients I intend to connect with!
So in the end, I thanked that discussion forum member for her post and the internal thinking process it sent me through, because it helped me face a fear and come out stronger on the other side!
And then, as a welcome validation of everything I had deduced on my own, shortly after I had submitted my reply to her, she was kind enough to reply with a more detailed explanation of how the voice recognition software works and its definite limitations: everything I had suspected, and even more!
I truly have a valuable, worthwhile service to offer my clients. I’m very proud of my skills, my dedication, and my opportunity to make a contribution to the entire world by assisting my clients to develop their own gifts in ways they might never manage if it were up to them alone to type out their wisdom and creativity!