Czytaj

arrow pointing down

How to create a David Attenborough-style narration with AI? A guide

Discover how to craft a David Attenborough-style narration using AI – a step-by-step guide with tools and techniques for creating documentary-like voiceovers.

Na tej stronie wykorzystujemy grafiki wygenerowane przy pomocy sztucznej inteligencji.

The following article is a supplement to the video produced on the Beyond AI channel. We encourage you to watch the video on our channel to get a more complete picture of the topics discussed and to see practical examples of how artificial intelligence is applied in daily life. Visit the Beyond AI channel to discover more fascinating AI-related content!

Watch this material on YouTube:

David Attenborough Narrating Life – An AI Experiment

Today, we are talking about a very interesting project created by Charlie Holtz. He utilized several AI technologies, specifically GPT Vision and Eleven Labs, to create a program that observes him through his computer camera and narrates his life in the voice of David Attenborough, as if it were a nature documentary.

Source: https://x.com/charliebholtz/status/1724815159590293764

Run the Voice Generation Script Yourself

Today, we will show you how to run this script on your own (since Charlie has made it publicly available) and what to change so that the narration is in Polish.

1. Clone the Repository from GitHub

To run this script, you must have a computer with Python installed. Charlie provides a link to his GitHub, where he placed the "narrator" project – visit GitHub.

Now, simply clone the repository shared by Charlie. After entering the "narrator" directory, you need to install all the required libraries using a simple PIP command.

<code>

‘ git clone https://github.com/cbh123/narrator.git

‘ cd narrator/

‘ pip install -r requirements.txt

</code>

2. Customize the Commands for the Polish Language

Next, to adapt the narration to the Polish language and use your access keys, we need to change a few things in the files. Most importantly, in the narrator.py file, it is worth changing the prompt used to generate the image description to Polish.

> "You are Krystyna Czubówna. Describe the photo as you would in a nature documentary. Be witty. Do not repeat yourself. Prepare a short description. If there is something even slightly funny in the photo, make it a huge sensation! Speak only in Polish."

Additionally, we must adjust the model parameters for the function call generating the audio to use a model that enables text generation in languages other than English, namely "eleven_multilingual_v2".

3. Export Key Values in the Terminal

The only thing left is to export the values of three keys in the terminal: the OpenAI API key, the Eleven Labs API key (where we also need to register), and the key pointing to the specific voice we want to use.

Remember that the keys for Eleven Labs resources must be provided in quotation marks.

4. Create Your Own Voice with Eleven Labs

Simply register on the Eleven Labs website, and then in the Voice Lab tab, select Add Generative Or Cloned Voice.

In the free version, we can create a synthetic voice by entering Voice Design. We choose parameters such as gender, age, and accent, and click Generate.

Once the voice is created, copy the voice ID, which we will export in the terminal to use it.

5. Run the Files for Capturing Photos and Generating Voice

After these changes, simply run two files in two separate terminal windows: capture.py and narrator.py.

  • capture.py is responsible for saving a photo from the computer's camera every two seconds.
  • narrator.py is responsible for sending that photo to GPT-4 Vision, downloading the description, and generating the voice for the narration.

Unfortunately, the whole process takes quite a while; generating the voice takes roughly as long as reading it out. You need to be patient.

Listen to the Voices We Generated!

You can hear this effect in the video starting at the 4:29 mark. It is a generated voice, but we can also use the paid version of Eleven Labs to upload any voice sample, such as our own or that of a person with a recognizable voice.

In our video starting at 5:25, you will see the result when we used Krystyna Czubówna's voice for educational purposes. The effect is truly stunning! Let us know under the video what you think of this experiment!

We invite you to visit the Beyond AI channel, which is dedicated to artificial intelligence. Our motto is "Your guide to the dynamic world of AI." Discover fascinating content and stay up to date with the latest trends in the field of AI!

Visit Beyond AI on YouTube

The Beyond AI channel is created by specialists from WEBSENSA, a company that has been providing AI solutions to leading representatives of various industries since 2011.

Inne wpisy z tej serii

How AI helps you publish videos on YouTube

Learn how artificial intelligence can help you unlock YouTube’s full potential and make video publishing easier and more efficient.

How to Choose the Perfect Gift? We Tested GPT!

Discover how AI can help you pick the perfect present — from personalised recommendations to analysing YouTube reviews and saving time with intelligent suggestions.