Text-to-Speech

API Diversions

Experimenting with a new API via Cloudflare Workers.

James Montgomery

5 minute read

TL;DR Recently reinvigorated by my new UI work, I set myself the task of simplifying the front-end player code. No matter how I looked at it, I needed to rewrite the back-end API to support a front-end rewrite. But what if I didn’t need to do precisely that? Enter, stage left, Cloudflare Workers. Below I explain how I arrived at this. If you would like to view the Worker code, you can do so here.

New UI for my text-to-speech player

I revisit the UI, hoping to improve the general experience.

James Montgomery

3 minute read

TL;DR My previous user interface (UI) could, at best, be described as functional. Whilst it technically worked on every platform I tested, a desktop browser received the best experience. Recently @suivethefirst recommended Bootstrap to me, so I tried to build a better experience with it. Putting the minimum into MVP: Putting together the UI as it stood was the last functional step in my project. Delivering an audio player with a combination of CSS and JavaScript felt beyond what I had in the tank at the time.

Introducing async to my serverless text-to-speech player for jokes and quotes

Using AWS components, Cloudflare, and public APIs.

James Montgomery

3 minute read

TL;DR In this post, I revisit my serverless “jokes and quotes” player. The purpose was to remove the tight coupling between the client request and the audio playback API source. To explore the issue, I have introduced DynamoDB to store pre-generated results. Challenge overview: Logically, every component of my existing solution must succeed for audio to be delivered to the client. You can visualise it as follows:

A Serverless text-to-speech player for jokes and quotes

Using AWS components, Cloudflare, and public APIs.

James Montgomery

2 minute read

TL;DR I decided to dust off my text-to-speech list, implementing a serverless solution delivering random jokes and quotes. You may visit it at this address: https://tts-ja.mesmontgomery.co.uk/ A preview of the joke quality is embedded as audio in the post. Note: I can’t affect the humour quality 🤣. Solution overview: Upon visiting the page, an event triggers calls to the API routes for their respective jokes and quotes.

TTS project update 1 - adding texture to my generated speech

Introducing a pool of voices and choices in rate of speech, pitch and volume gain.

2 minute read

TL;DR In this post, I’d like to share an update on work to address some of the limitations in my text-to-speech project for Elite Dangerous. Namely: a single voice is used for the synthesis; and pitch, tone and emphasis are unchanged from the defaults. If you’d like to see the result of the work so far, here is a brief overview video; I’ll describe how we got here below:

Google Cloud Text-to-Speech with PowerShell

A guide for using PowerShell with the Google Cloud Text-to-Speech API.

6 minute read

TL;DR In this post, I’ll walk through the basics of using PowerShell to interact with the Google Cloud Text-to-Speech API. It is partly a documentation exercise and partly the guide I would have liked to read when I started on my Elite Dangerous Google Cloud Text-to-Speech project. As such, we’ll start with a basic script that produces an audio file and explain how it works. Later I’ll walk through some parameters of interest that affect the response.