Are text to speech services and podcasters friends or foes?

Over the past few months, there has been a flurry of companies offering Text to Speech (TTS) services. Many of them are startups, the majority are in the early stages of growth, and almost all of them use one or more of Amazon Polly, Google Text to Speech, or similar solutions from IBM or Microsoft Azure. In this post, I will talk about a few such services and ponder the all-important question for podcasters: should they consider these services friends or foes?

Since I published this post in August 2021, the market for Text to Speech services has exploded, with options like “Emotive” TTS, while speech to text has seen releases such as Whisper from OpenAI. In the near future, I will update this post with the recent developments in the TTS space.

Recently, someone who runs a translation services business reached out to me. They wanted to know if we at gaathastory were looking for translators and voiceover artists or narrators. I see the usefulness of such a service, but it does not fit well within our workflow, that is, the way we do things at gaathastory. The discussion with this individual also gave rise to a question:
With the rise of multiple text to speech software options, does it really make sense in today’s world to have a huge pool of narrators to do voiceover work?
The short answer is yes, and more so for the regional languages of India. In the times to come, the situation may change, and we may have more Made in India solutions that serve this space.

Impact: Text to Speech and Podcasters

From a podcaster’s perspective, are these text to speech solutions a friend or a foe? I began to ponder this question.

Google Aunty Is Narrating A Story

In mid-2019, I tinkered with Amazon Polly and Google’s Text to Speech. I found them to be quite useful tools. We even sent out a couple of test episodes to our beta listeners. The feedback from one of the listeners, a young lad about five years old, was:
“Oh Google aunty is narrating the story!”
Of course, the “persona” or “avatar” that had narrated the story was the same voice persona that is used in Google Maps. A couple of years ago, we also experimented with different accents, as well as different genders, for the same story. In the end, we decided that using TTS was not something we wanted to do, definitely not for the market we serve, that is, children’s bedtime stories. The technology may evolve, and the sound output may become more and more human-like, but what would be missing by automating the narration is the human connection. This is essentially at odds with everything that gaathastory stands for.

The Idea Does Have Merit!

Text to speech does have its utility, and there are multiple instances where TTS might come into play, for example if you are an educator delivering online lectures, or you are creating videos or informational campaigns. Even for podcasters, it may hold a lot of value in certain use cases. Then there are marketing newsletters, webinars, slideshows with voiceover... The possibilities are many.
[Figure: Uses for Text to Speech (TTS) tools. Amar Vyas, 2021]

A good idea can be badly implemented!

One of the poorest implementations of TTS tools I have seen is in a series of YouTube videos. Several channels suffer from this malaise, particularly when somebody is trying to do a product demonstration or an unboxing of a new laptop, phone, or similar product. For example, I saw a video about Vodafone Idea, one of the largest telecom providers in India, and some of the challenges the company was experiencing. Moreover, a few media publications in India use what are probably subpar implementations of these technologies. They call the audio versions of their news stories podcasts, but the result is horribly done.

TTS in Indian languages

When we look at Indian languages, a Hindi option is definitely there among the personas. For English, you get multiple voices with “Indian” accents. (edit: what exactly is an Indian accent?) I would be really keen to see more Indian languages. Some solutions do offer Kannada, Marathi, Telugu, and Tamil, but they are few and far between. In our country, we have some 12 to 15 major languages, so finding good text to speech personas for each of them could be a challenge.

TTS Options available today

Over the past months, deal aggregator sites like AppSumo, PitchGround, Dealify, StackSocial, and others have listed several options for TTS. There are also several personal or small-scale projects on sites like GitHub, Product Hunt, or BetaList. In other words, we have a huge number of software as a service (SaaS) solutions in this space. They probably all use a similar set of technologies at the back end, i.e. solutions from Google Text to Speech, Amazon, IBM, or Microsoft Azure. This makes me wonder:
Does the front end really matter when all of them use the same back end technology?
Short answer: it probably does. The services differ in their offerings in many subtle and not so subtle ways. Pricing tiers vary, and so do the features. Some of them offer limited personas in the free or base tier, while in other cases the number of characters that can be converted at a time varies. So may the total number of words that can be converted to voice in a month. Some providers offer integrations with platforms such as podcast hosts.
The pricing, customer support, user interface, service levels, and value additions created by these companies could make all the difference.
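
Since some services cap the number of characters that can be converted at a time, a long blog post may need to be split before it is submitted. Below is a minimal sketch of one way to do that in Python, preferring sentence boundaries so the narration does not break mid-sentence. The function name split_for_tts and the default limit of 3000 characters are my own assumptions for illustration; they are not taken from any particular provider, so check the actual limit of the service you use.

```python
import re


def split_for_tts(text: str, max_chars: int = 3000) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    max_chars is a placeholder value, not a real provider limit.
    Chunks end on sentence boundaries where possible.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")

    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be converted separately and the resulting audio files joined in an editor. Adding up len(chunk) across all chunks also gives a rough idea of how much of a monthly character quota a post would consume.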

Services we have used at gaathastory

I created three audio samples using Play.ht and Lovo. Each audio is a TTS conversion of a blog post that I wrote this morning. As I began looking at other available TTS options, I realized that it has probably become a problem of plenty. Below are screenshots from Play.ht and Lovo for your quick reference.

[Screenshot: Audio generated in Play.ht using TTS (Text to Speech)]
[Screenshot: Audio generated in Lovo using TTS (Text to Speech)]

Final Thoughts: Text to Speech Services and Podcasters

In this post, Amar Vyas writes about the evolving market for Text to Speech (TTS) tools. He discusses the use cases for these tools and their impact on podcasters. Should podcasters adopt these tools? As technology evolves, it is natural for Speech to Text (STT) and Text to Speech (TTS) tools to become more robust. Podcasters should consider these tools an enabler and an ally of sorts, rather than a foe. I am convinced that these tools will benefit content creators, including podcasters, immensely.


*The likes of Tencent and Alibaba may have their own solutions, which I am not familiar with. Yandex Text to Speech is another solution that I have not tried, either myself or through one of the service providers mentioned above. You may also like to read a related post: testing of Speech to Text apps.
This post about text to speech services and podcasters was updated on 2022-03-07 and again on 29 Feb 2024.

History: The rise of Speech to Text programs

In the early 2000s, when carpal tunnel syndrome began to make the news, Speech to Text software started to get discussed quite a bit. One program in particular made headlines: Dragon NaturallySpeaking. This was a Windows program where one could dictate and the software would convert the spoken words to text. Carpal tunnel syndrome and its cousin, RSI (Repetitive Strain Injury), still exist, but technology has enabled more use cases for Speech to Text: I find it very convenient to dictate my blog posts.

In this post, I have published output from a couple of alternatives to Otter.ai. Over the past few weeks, I have explored a few apps other than Otter.ai (which I use extensively and love!). The reasons for considering alternatives to Otter.ai are:
a. It is always nice to be aware of alternatives to one’s SaaS of choice.
b. Languages: Otter provides transcription in English. For Hindi and Marathi, I have to explore other options anyway.
c. Otter does not offer the option to pay via PayPal, and I was not keen on entering my credit card details. Note: as of October 2021, Otter has confirmed that I can pay via my Apple ID. I may act on it.

Enter a plethora of apps:
a. Dictanote
b. Android speech to text converter
c. iOS speech to text program

Are the speech to text programs any good?

The proof is in the pudding. Below is the output from a couple of programs:

Dictanote Pro

Post the status why yes just let her wife alphabet and the reason for doing so is when I had started using on web browser on my MacBook Pro I had connected the microphone and somehow none of the words are getting recorded and transcribed as I’m sure I want to say why is this happening and the only word that Bhat pick up from the party I thought it was quite funny let it be the way it was but now that I have been able to figure out how to narrate record transcribe it’s actually working out quite well no complaints on that front as of now I will keep posting all Marathi and Hindi asWell if you can’t go well I may continue with right now there a drop in with you Karen Ya microphone I can disappear and I am writing but nothing is getting recorded and password so with that in mind while to offer using the speech to text app seems to have partially positive

Otter.ai on iOS using hands-free

Recorded while walking: so quite a few oops’es
Note: I would really like to know which email service is offered by Dr. Koh
Privacy focused email service providers concept has taken some prominence over the past few years, and then also gained significant sindelle light of multiple data breaches and also email servers themselves being exploited. With that in mind, I will talk about five services that I have tried or used or planning to use, and these are tutanota proton mail. Then his mail fits for this. Dr. Ko, email, and finally, as a larger film. He will relay services themselves. Most of them have premium versions and most recent one being the Go email account from Dr. Koh, and I’m not really sure how the signals shape up in the times to come.

iOS keyboard mic, transcript by iOS

Note: one long paragraph, no breaks for sentences. But quite good quality output.
Now let me see how good is the dictation that would be speech to text that comes out of the iOS device of course I am home now and I am using the default Apple or iOS dictation device and I have the hands free at the via Bluetooth hands-free text that I am using so this one does not have noise cancellation maybe later I may try it with a noise cancellation option but looks like much of the art capture and the conversion from speech to text is happening rather flawlessly which is an encouraging sign indeed.
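
For readers who want a number rather than an impression, a common way to compare transcripts like the ones above is word error rate (WER): the count of word substitutions, deletions, and insertions needed to turn the transcript into what was actually said, divided by the number of words actually said. Below is a minimal sketch in Python. It assumes you have typed out a reference version of what you dictated, which I have not done for the samples in this post; the function name word_error_rate is my own, and the comparison ignores case but not punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER of hypothesis (the app's transcript) against reference
    (what was actually said). Uses word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

A value of 0.0 means the transcript matches the reference word for word. Values can exceed 1.0 when the app inserts many extra words, so it is best read as a rough comparison between apps on the same recording rather than as a percentage.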
