AI and Recording Audiobooks

I found one audiobook series that I enjoy, almost more than reading it. The reason for the enjoyment is the way the actor creates the characters. Each character has its own unique voice. Even from book to book, the protagonist has the same voice, so there’s consistency in the series. During this particular story, one of the characters gets a broken nose. The actor adjusted the voice for this character to simulate a stuffed nose and pain.

I’ve read that one of the many uses for Artificial Intelligence (AI) could be to create audiobooks. Once a voice has been recorded (i.e., the author’s or actor’s), the AI would be able to create new audiobooks using its likeness. Likely this would take a fraction of the cost and time. A real human can only read for so many hours a day. Certain small sounds we make, such as throat clearing or sniffing, have to be edited out of the recording. AI would likely be able to create a new audio recording almost instantly without those human noises. However, would AI interpret the meaning properly to match voices to action?

I’m not an actor, but actors are skilled at creating new characters, sometimes simulating different accents, inflections, moods, and emotions, all with their voices. If the AI were creating an artistic reading, how would it be able to create new characters? Especially if the voice and character were new, one that the actor being imitated had never used before. Would the AI pick up on changes, like a broken nose, to adjust the voice to match the action?

This would require a subtlety of understanding that I’m not sure AI is capable of. In my opinion, this is a good thing. Part of what makes the arts exciting is the way they constantly change and evolve. I enjoy this actor’s ability to create characters with his voice and express the emotions of the action. The voices and characters are different in every book of the series, with the exception of the main character, who remains consistent.

Even if the audiobook were something without characters, e.g., an author reading non-fiction, there would still be inflection, emphasis, cadence, and the general rhythm of one’s voice. I’m sure AI could do a reasonably good job, likely good enough that most people wouldn’t notice the difference, but we would lose something intangible.

The Glory of the Analog Days

I’ve always considered myself lucky because I grew up using analog technology before migrating to digital. I’ve watched the transition happen. Even though I am sometimes a bit resistant, or slow, to adapt to new digital technologies, I appreciate that I know and understand both styles. As an added bonus, I also realize the meaning and history of many well-known icons, relics from the analog world, that may not have much meaning to somebody younger.

Here are some examples. Though recognizable to us as digital icons for actions or objects, they all come from the analog world.

The phone icon is a receiver from an old-fashioned phone, the kind with the curly cord connecting it to the base, or from early cordless models.

The diskette for “save” is from the era of floppy disks.

Paste is an icon of a clipboard, something most people rarely use.

A piece of paper symbolizes a document or file.

A paperclip indicates an attachment, presumably to represent how it clips two things together. However, in the digital world, this doesn’t work the same way.

Aside from understanding icons used throughout the digital world, there are other advantages to the analog world. Though I took it for granted growing up, when I created something, I knew it was unique. It was one-of-a-kind. This was both a benefit and a detriment. On the plus side, I felt confident I had total control over that one physical instance of my creation. This is important for something confidential or private. On the flip side, only having one instance isn’t good protection against disaster. What if there was a fire, flood, etc.? Or I lost it?

I was reminded of these things while watching Mission: Impossible – Dead Reckoning the other week. The movie starts, predictably enough, with the protagonist receiving a recorded message that self-destructs. However, the message was on an old-fashioned mini tape recorder, accompanied by a few sheets of paper. Naturally, the whole thing self-destructed within seconds of being read.

But in the digital world, could one ever feel confident that self-destruction included every instance? Could one ever feel confident that only one instance existed? The great benefit of digital technology is that it’s so easy to replicate. However, this is also a detriment with confidential or private information. And if you feel inclined to send a self-destructing message to someone, it had better be programmed to self-destruct in multiple places.

The Era of Beethoven’s 10th Symphony

As a hardcore Beethoven fan, I would personally feel gutted if some artificial intelligence (AI) churned out a “tenth” symphony in the same style. Beethoven had a unique talent and a distinct, instantly recognizable style. For example, his nine symphonies are all incredibly different from each other. Yet each one is unmistakably identifiable as Beethoven. In this era of generative AI, is it possible that a machine could learn enough to compose like Beethoven? What would that say about the future of music? And more importantly, what would that mean for musicians and composers?

It’s curious to wonder what sort of skills we may be losing by incorporating AI more into our daily lives. For instance, will people still learn to compose music? Or will they instead invest their time in learning how to train, or program, a generative AI chatbot to compose something? With the latter approach, would a composer even need to learn how to play an instrument or read music?

These same questions are relevant in nearly every profession and art form. Even dancers, who need their bodies to perform, could have the choreography generated by AI. I’m sure in the future, AI will be powerful and accurate enough to simulate a dance performance without real people. Or maybe we’ll all just wear Augmented Reality glasses for the performance. There have been many headlines in the news lately about studios using AI to capture the likeness of an actor/actress and reuse his/her face in future productions.

I blogged about this issue a few weeks ago in “The End of Originality.” AI is good at summarizing and recreating, but can it innovate? And can it innovate as well as, or better than, a human? Admittedly, our creations are based on our experiences. However, the interpretation and expression of our experiences are unique and offer limitless opportunities for innovation and originality. AI, by contrast, is trained and programmed to create based on what’s already available for consumption (i.e., the AI’s experiences). With so many people learning to use AI, we will definitely need to consider the ways in which we are being trained and reprogrammed.

Nobody will ever be able to compose exactly like Beethoven. And that’s a good thing, in my opinion.

The Power of Non-verbal Communication

Last week I took my second deepwater aqua fit class. I’ve been taking shallow water aqua fit on and off since 2007, but deepwater takes it to a new level. To clarify, I’m not treading water while exercising for 45 minutes. We all wear a foam belt to keep us floating. But it’s still very challenging and a real workout for the core muscles.

During the first class, I found it a bit hard to hear the instructor. The music was blaring. It echoed and resonated loudly in the cavernous pool area. My body was in the middle of the pool and the instructor was standing on the edge. We had a few physical demonstrations. However, if I didn’t understand, I couldn’t really look at anybody else to figure out the movements. With everyone’s body submerged in water, we couldn’t glance at anybody else for pointers.

In the second class, I had a different instructor. She barely spoke a word the entire class. Instead, she used her body, hands, and face to give us the exercises and counting. During our cardio portion, she even demonstrated how to slow down our breathing by pointing to her nose for inhale, then opening her mouth and pointing to it for exhale. For other exercises, she sat in a chair to show us how to hold our bodies in the water for some of the leg and arm work. Throughout the class, she continually pointed to her abs and bum to remind us that both had to be working to keep us upright and in one place as we did the exercises.

I’m so used to instructors talking (sometimes too much) during exercise classes that I didn’t realize how powerful a simple, wordless demonstration could be. I found her method surprisingly effective, and it eliminated some of the confusion I felt in the first class. It was a good reminder of how important non-verbal communication can be. As we once again learn how to navigate the physical world after all the isolation and lockdowns, it will be interesting to observe these subtle non-verbal mannerisms that tell us so much with so few movements.

Backing up Digital Photos

A friend of mine lost his phone a few months ago. His photos weren’t backed up. Naturally, this induced a mild ripple of panic in me since mine also weren’t backed up at the time. It’s always one of those things on the “to-do” list. The loose translation: I won’t make a move until something happens. Then I’ll swear a lot, berating myself for not having made the backups before. Only after learning the lesson the hard way would things change. This time, however, I made it a priority to back up over two years’ worth of photos before tragedy struck.

Since I dislike using the automatic backup option, I had to research other methods. It requires some time and effort to create manual backups. However, it works better for me because it offers me more control.

For my backups, I use an app (Photo Transfer App) that seamlessly moves photos from my phone to my laptop (or backup drive). Prior to this, I had tried other methods to move photos from my phone or iPad to my laptop/backup drive. Nothing ever worked that well, perhaps because of compatibility issues between Mac and Android. I tried connecting the devices with cables, using Bluetooth, and using built-in or downloaded apps. Finally, I found this app.

When determining what backup option will work best for you, here are some things to consider:

Convenience – if it’s not easy, you’re not likely to do it. Think of something that’s manageable. I do my backup every month.

Privacy – this is my hesitation with the automatic backup from my phone. I don’t trust cloud-based storage options. However, there are lots of secure cloud-based options available. Do your research!

Organization – sometimes I like to organize the photos in my phone. Ideally my backup system would allow me to maintain this system. Also, I hate it when the backup option includes photos in the “deleted” folder. This is how the iPad backup used to work and it drove me crazy. Deleted photos should stay deleted. Including them in a backup is a waste of time and space.

Location – it’s always good to consider having backups in more than one place (e.g., in the cloud and on a backup drive).
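For anyone who prefers a do-it-yourself route once the photos are on a computer, here is a minimal sketch in Python of how these considerations might look in practice. It is not how Photo Transfer App works, just an illustration. The folder name "deleted", the list of photo file types, and the function name back_up_photos are all my own assumptions. It copies photos into more than one backup location, keeps my folder organization, and leaves out anything in a "deleted" folder. It also assumes the backup locations are not inside the folder being backed up.

```python
import shutil
from pathlib import Path

# Assumptions for illustration: adjust these to match your own setup.
PHOTO_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic"}
SKIP_FOLDERS = {"deleted"}  # hypothetical name for a "deleted" folder


def back_up_photos(source, destinations):
    """Copy photos from source into every destination, keeping subfolders.

    Files that already exist in a destination are left alone.
    Returns the number of files copied.
    """
    source = Path(source)
    copied = 0
    for photo in sorted(source.rglob("*")):
        # Only back up files that look like photos.
        if not photo.is_file() or photo.suffix.lower() not in PHOTO_EXTENSIONS:
            continue
        relative = photo.relative_to(source)
        # Deleted photos should stay deleted: skip those folders entirely.
        if any(part.lower() in SKIP_FOLDERS for part in relative.parts[:-1]):
            continue
        for destination in destinations:
            target = Path(destination) / relative
            if target.exists():
                continue  # already backed up in this location
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(photo, target)  # copy2 also keeps file timestamps
            copied += 1
    return copied
```

With made-up paths, a monthly backup to two places could then be a single call such as back_up_photos("Pictures/phone", ["/Volumes/BackupDrive/photos", "CloudFolder/photos"]). Running it again only copies photos added since the last run.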

Be proactive with backing up photos and other precious things stored in your phone. Learning this lesson the hard way is no fun.

Discovering Audiobooks

As if it weren’t enough to continually debate between ebooks and paper books, I now have audiobooks to add as a third contender. I discovered them a little bit accidentally. Last fall I joined a book club. One of the members is an audiobook enthusiast. It opened my mind up to the idea a little bit.

Coincidentally, the first book we selected to read (Braiding Sweetgrass: Indigenous Wisdom, Scientific Knowledge, and the Teaching of Plants by Robin Wall Kimmerer) was very popular. I put a hold on every format available to ensure I could finish it before our first meeting. The ebook was available first. I finished it with only hours to spare on the loan. Immediately following, the audio version became free at the library. There were a few chapters I wanted to review again, so I checked it out. Despite my hesitation to try audiobooks, I quite enjoyed listening to the author read her own book.

Since then, I’ve listened to several audiobooks, for various reasons. Usually it’s because the library doesn’t have an electronic version (my preference) or the wait is too long for the paper version. Or sometimes the audiobook is the only version available! Though not my preference, I am enjoying a lot of things about the audiobook experience.

  1. I can listen while I’m doing other things. This is both a benefit and a detriment. It’s great listening to a story as I commute to work, cook, or fold laundry. But I also find I’m slightly less engaged and don’t retain details as well.
  2. It puts me to sleep as easily as reading does. I use an app called Libby, which allows me to set a timer for how long to listen before it shuts off automatically. Typically I fall asleep so quickly (within 10 to 15 minutes of listening, or less) that I end up having to rewind the story slightly the next day, but it’s not a big deal.
  3. It’s fun listening to an author read their own story, or a skilled actor who can add dimension to characters with different voices and accents.

Overall, I still prefer ebooks and paper books to audiobooks. Besides, I have several podcasts I enjoy listening to and there’s only so much “ear-time” available in a day. But I’ll definitely add audiobooks to the mix once in a while. It’s good to discover a new option.