How to Generate an Audio Tone in FFmpeg?

FFmpeg can also generate sounds or tones natively. With the right kind of scripting you could make chiptune-like music straight from the terminal. For now, this example generates a single 2000Hz tone for 10 seconds:

 $ ffmpeg -f lavfi -i "sine=frequency=2000:duration=10" output.mp3  

A new argument, lavfi, needs to be introduced here because this time there is no input media. Libavfilter, or lavfi as seen above, is a virtual input device that lets filters such as sine generate data for the output.

sine
	Indicates the sine filter name. Sine generates signals made from sine waves

frequency, f
	Indicates the tone or frequency (with default 440Hz)

duration, d
	Indicates the duration of the generated frequency output
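
As a rough sketch of the scripting idea mentioned above, the snippet below prints one ffmpeg command per note instead of running anything. The helper name sine_source, the four frequencies and the note_N.mp3 file names are assumptions made up for illustration:

```shell
#!/bin/sh
# Build the lavfi input string for one sine tone.
# Arguments: frequency in Hz, duration in seconds.
sine_source() {
    printf 'sine=frequency=%s:duration=%s' "$1" "$2"
}

# Print one ffmpeg command per note; nothing is executed here.
# The frequencies are an arbitrary illustrative set (assumption).
n=1
for freq in 262 330 392 523; do
    printf 'ffmpeg -f lavfi -i "%s" note_%s.mp3\n' "$(sine_source "$freq" 0.25)" "$n"
    n=$((n + 1))
done
```

Each printed line has the same shape as the single-tone command above, so it can be pasted into the terminal to create that note.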

Love FFmpeg? Grab a copy of FFmpeg For Beginners on Kindle or Paperback to learn over 120 ways to master FFmpeg!

(FFmpeg) How to Change the Pitch / Sample Rate of an Audio Track (Mp3)?

Changing the pitch of an audio track means the tempo stays the same but the audio pitch increases/decreases. There isn’t a native way to change the pitch without also changing the playback speed. Luckily, chaining the atempo and asetrate filters together can achieve this effect.

In this example, the pitch of the audio decreases by 50% by changing the sample rate with asetrate, which by itself will result in a longer playback duration:

  $ ffmpeg -i input.mp3 -af "asetrate=44100*0.5" output.mp3 

Tip: Changing the sample rate to change the pitch can cause problems, because some players or websites (like Bandcamp) require audio with a 44.1kHz sample rate.

To keep the pitch change while setting the preferred sample rate, the aresample filter is needed, as seen below:

  $ ffmpeg -i input.mp3 -af "asetrate=44100*0.5,aresample=44100" output.mp3 

As stated earlier in this question, a true pitch change shouldn’t alter the tempo. Adding atempo speeds the audio back up, so the pitch changes while the tempo stays nearly the same. Halving the sample rate doubles the duration, so atempo=2.0 would restore the original length exactly; in the example below, an atempo between 1.5 and 1.7 gets close while leaving the track a little slower:

  $ ffmpeg -i input.mp3 -af "asetrate=44100*0.5,atempo=1.5,aresample=44100" output.mp3 
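
If you would rather compute the chain than tune it by ear, here is a minimal sketch. It assumes the input really is 44100Hz (pass a different rate as the second argument otherwise) and uses the exact reciprocal of the pitch ratio for atempo, which is a different choice from the 1.5 to 1.7 range above. The function name pitch_filter is hypothetical:

```shell
#!/bin/sh
# pitch_filter RATIO [RATE]
# RATIO: pitch factor (0.5 = one octave down).
# RATE:  sample rate of the input in Hz, assumed to be 44100 if omitted.
pitch_filter() {
    ratio=$1
    rate=${2:-44100}
    awk -v r="$ratio" -v sr="$rate" 'BEGIN {
        # asetrate plays the samples at sr*r; atempo=1/r undoes the speed change;
        # aresample converts the result back to the original sample rate.
        printf "asetrate=%d,atempo=%g,aresample=%d\n", int(sr * r + 0.5), 1 / r, sr
    }'
}

# Print the filter chain for a 50% pitch drop.
pitch_filter 0.5
```

The output can be dropped straight into the command: ffmpeg -i input.mp3 -af "$(pitch_filter 0.5)" output.mp3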

A Note about Pitch, Tempo & Sample Rate

There are no correct answers here for what works best; that’s up to you. Play around with each version and find a sound that works for your ears and for what you are trying to accomplish.

We Are コンピューター生成 by メトロイヤー
an album made entirely with FFmpeg

For example, I’ve used pitching down, echoes and randomized splicing to create sample-based Vaporwave music. The entire album was completely generated with FFmpeg using vintage Japanese commercials as input. Personally, I am very pleased with the sound.

Listen to it here: https://mthu.bandcamp.com/album/we-are

(FFmpeg) How to Change the Tempo of an (Mp3) Audio Track?

Changing the tempo of an audio file is easy with atempo. This filter only accepts one value, a number between 0.5 and 100. In this example, the audio tempo is increased by 50%:

  $ ffmpeg -i input.mp3 -af "atempo=1.50" output.mp3 

So how does 1.50 equal 50%? The value is a multiplier of the original speed: any value above 1.0 increases the tempo and any value below 1.0 decreases it.

Here the tempo is reduced by 50% instead:

  $ ffmpeg -i input.mp3 -af "atempo=0.50" output.mp3 

Reducing the tempo can make the playback sound choppy and robotic. This is because changing the tempo is a time-stretch: the audio gets slower or faster but the pitch does not change.
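
To make the percentage-to-multiplier step concrete, here is a small sketch that converts a percent change into an atempo value and rejects anything outside the 0.5 to 100 range mentioned above. The function name tempo_value is made up for this example:

```shell
#!/bin/sh
# tempo_value PERCENT -> atempo multiplier (50 -> 1.5, -50 -> 0.5)
tempo_value() {
    awk -v p="$1" 'BEGIN {
        v = 1 + p / 100
        # atempo accepts values from 0.5 to 100
        if (v < 0.5 || v > 100) exit 1
        printf "%g\n", v
    }'
}

tempo_value 50    # 50% faster
tempo_value -50   # 50% slower
```

Usage would look like: ffmpeg -i input.mp3 -af "atempo=$(tempo_value 50)" output.mp3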

(FFmpeg) How to Add an Echo to an Audio Track?

Adding an echo (or reflected sound) to an audio track is a great way to add more ambience or smoothness to a choppy playback. Whether to use it is a matter of taste, but given how simple it is, it’s a nice filter to have in the back pocket.

Here aecho uses gain, delays and decays to make a great airy echo effect that fades over time:

  $ ffmpeg -i input.mp3 -af "aecho=in_gain=0.5:out_gain=0.5:delays=500:decays=0.2" output.mp3  

In this filter the delay and decay arguments are plural because multiple delays and decays may be stacked using the | syntax for extra echo control, as seen below:

  $ ffmpeg -i input.mp3 -af "aecho=in_gain=0.5:out_gain=0.5:delays=500|200:decays=0.2|1.0" output.mp3  

Tip: The number of delays must equal the number of decays, so if there are 2 delay values, there have to be 2 decay values.

aecho
	Indicates the name of the echo filter

in_gain
	Indicates the input gain of the reflected signal (default 0.6)

out_gain
	Indicates the output gain of the reflected signal (default 0.3)

delays
	Indicates a list of time intervals (in milliseconds) between the original signal and reflections (0.0 to 90000.0 with default 1000)

decays
	Indicates the list of loudness values, one per reflected signal (0.0 to 1.0 with default 0.5)

Tip: If this filter is applied to video input, a pixel format may be required, -pix_fmt yuv420p.
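
Since mismatched delay and decay lists are an easy mistake, here is a hedged sketch that checks the counts before building the filter string. The helper names count_items and aecho_filter are hypothetical, not part of FFmpeg:

```shell
#!/bin/sh
# count_items "a|b|c" -> 3 (number of |-separated entries)
count_items() {
    n=$(printf '%s' "$1" | tr -cd '|' | wc -c)
    echo $((n + 1))
}

# aecho_filter IN_GAIN OUT_GAIN DELAYS DECAYS
# Prints the filter string, or fails if the two lists differ in length.
aecho_filter() {
    if [ "$(count_items "$3")" -ne "$(count_items "$4")" ]; then
        echo "delays and decays must have the same number of entries" >&2
        return 1
    fi
    printf 'aecho=in_gain=%s:out_gain=%s:delays=%s:decays=%s\n' "$1" "$2" "$3" "$4"
}

aecho_filter 0.5 0.5 '500|200' '0.2|1.0'
```

The printed string is the same filter used in the stacked example above, so it can be passed to -af as is.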

(FFmpeg) How to normalize audio?

Ever get an audio file that just isn’t loud enough, but when the volume is raised it peaks? The solution is to run the audio through a normalization algorithm. Normalization with loudnorm measures the perceived loudness of the track and adjusts it toward a target level while keeping the true peak under a set limit.

For audio signals on radio and TV broadcasts, a guideline exists for permitted loudness levels, which sets a standard for adjusting the volume throughout a track. The aim is to raise the volume without audibly changing the sound or quality. This normalization standard is called EBU R128, and it is what the loudnorm filter is built on.

If this is starting to feel complex, don’t worry, here are the recommended settings for normalizing audio with loudnorm:

  $ ffmpeg -i input.mp3 -af loudnorm=I=-16:LRA=11:TP=-1.5 output.mp3  

Tip: The above example can be found all over the internet, with no clear indication of who came up with these exact values. Play around with the values to get a sound you prefer.

loudnorm
	Indicates the name of the normalization filter

I, i
	Indicates the integrated loudness (-70 to -5.0 with default -24.0)

LRA, lra
	Indicates the loudness range (1.0 to 20.0 with default 7.0)

TP, tp
	Indicates the max true peak (-9.0 to 0.0 with default -2.0)

For more information on loudnorm, visit: https://ffmpeg.org/ffmpeg-filters.html#loudnorm
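
When playing around with the values, it helps to catch out-of-range numbers before FFmpeg does. The sketch below checks the three values against the ranges listed above (your FFmpeg build may accept slightly different limits) and prints the filter string. The function name loudnorm_filter is an assumption for illustration:

```shell
#!/bin/sh
# loudnorm_filter I LRA TP
# Prints a loudnorm filter string, or fails if a value is outside
# the ranges listed in the option table above.
loudnorm_filter() {
    awk -v i="$1" -v lra="$2" -v tp="$3" 'BEGIN {
        if (i < -70 || i > -5)    exit 1   # integrated loudness
        if (lra < 1 || lra > 20)  exit 1   # loudness range
        if (tp < -9 || tp > 0)    exit 1   # max true peak
        printf "loudnorm=I=%s:LRA=%s:TP=%s\n", i, lra, tp
    }'
}

loudnorm_filter -16 11 -1.5
```

Usage would look like: ffmpeg -i input.mp3 -af "$(loudnorm_filter -16 11 -1.5)" output.mp3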

(FFmpeg) How to Stream a File to YouTube?

Part 1: (FFMPEG) HOW TO OBTAIN A YOUTUBE STREAMING KEY?

Streaming a local file to YouTube is extremely easy, and it takes your FFmpeg skills to the next level with little effort.

With a minimal amount of code, this script (stream.sh) streams input.mp4 to YouTube:

YOUTUBE_URL="rtmp://a.rtmp.youtube.com/live2" # YouTube streaming URL
KEY="xxxx-xxxx-xxxx-xxxx" # Your YouTube streaming key

ffmpeg -re -i input.mp4 -framerate 30 -f FLV "$YOUTUBE_URL/$KEY"

There’s a new option in this code, -re, that is essential for a stream to run in real time. The -re option tells FFmpeg to read the input at its native frame rate, so the output is sent at that same rate. Without this flag, FFmpeg would send data as fast as it can read the file, far faster than a viewer can watch it, which defeats the point of going live.

Tip: -framerate 30 is required; otherwise the stream will not start.
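
One easy mistake with stream.sh is running it with the placeholder key still in place. Here is a minimal sketch of a guard that builds the full RTMP target and refuses an empty or xxxx-style key. The function name stream_target is hypothetical:

```shell
#!/bin/sh
# stream_target URL KEY -> URL/KEY
# Fails if the key is empty or still looks like the xxxx placeholder.
stream_target() {
    case "$2" in
        ""|xxxx-*)
            echo "set a real stream key first" >&2
            return 1
            ;;
    esac
    printf '%s/%s\n' "$1" "$2"
}

# Usage sketch (not run here), following the script above:
#   target=$(stream_target "$YOUTUBE_URL" "$KEY") || exit 1
#   ffmpeg -re -i input.mp4 -framerate 30 -f FLV "$target"
```

Stopping before ffmpeg starts gives a clearer error than a rejected connection would.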

(FFmpeg) How to obtain a YouTube Streaming key?

Before working on the FFmpeg For Beginners book, my experience of streaming with FFmpeg was zero, but when I reached out to the FFmpeg community everyone said I needed to cover this topic. Fortunately for me, streaming turned out to be way easier than I was expecting.

First go to https://www.youtube.com, click the ‘create a video’ button and select the ‘Go Live’ option from the dropdown, as seen in figure 109.0:

Figure 109.0: Go Live!

Select ‘New Stream’ and fill out the desired title, description, etc. Next, select ‘Create Stream’, as seen in figure 109.1:

Figure 109.1: Setting up a new stream

Tip: Make sure the ‘webcam’ option isn’t selected but ‘stream’ is. 

Tip: Actually… if you’d rather stream the webcam directly from YouTube, it’s probably easier than using FFmpeg.

YouTube will now wait for the stream data to hit the YouTube server. Before data can be sent, the URL and key must be set so the data can go over RTMP. In short, the Real-Time Messaging Protocol is a streaming protocol, and it is the one used for streaming to YouTube in these examples.

The current screen has these two variables available at the bottom left corner, as seen in figure 109.2:

Figure 109.2: stream key and stream URL

Keep these two variables around as they’ll be used in the following questions for streaming to YouTube.

To check that the stream is working, keep this window open to see a preview. To actually go live, the ‘Go Live’ button must be clicked. Once live, the live chat will be available and you’re ready to communicate with your fan base.

Now you are ready to stream to YouTube.

Part 2: (FFMPEG) HOW TO STREAM A FILE TO YOUTUBE?

(FFmpeg) How to cross fade two mp3 tracks?

In this section we’ll learn various useful filters specific to audio, though they all work on the audio track of a video as well. Since this is an audio-only section, -af is used instead of -filter_complex unless multiple audio input sources are required. See question “How to use filters (-vf/-af vs -filter_complex)?” for more details.

Making a music mix? Easily cross fade each track with FFmpeg within seconds. Cross fading allows for a smooth fade in and out between two audio files. In this example a cross fade of 1 second is applied:

 $ ffmpeg -i input1.mp3 -i input2.mp3 -filter_complex "acrossfade=duration=1:curve1=exp:curve2=exp" output.mp3  

acrossfade
	Indicates the filter name for crossfading

duration, d 
	Indicates the duration of the cross fade effect (a plain number is read as seconds)

curve1
	Indicates the cross fade curve for the first input

curve2 
	Indicates the cross fade curve for the second input

Although both cross fade curves in this example are exp (exponential), FFmpeg supports many curves. Below are 6 recommended ones:

log
	Logarithmic

par
	Parabola

ipar
	Inverted parabola

losi
	Logistic sigmoid

cub
	Cubic

nofade
	No fade applied

For more information on curves, visit: https://ffmpeg.org/ffmpeg-filters.html#afade
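
A mix usually has more than two tracks. As a sketch of how the same filter could be chained, the function below prints a -filter_complex graph that cross fades N inputs in order. The function name crossfade_graph and the [a1], [a2] labels are made up for illustration:

```shell
#!/bin/sh
# crossfade_graph N [SECONDS] [CURVE]
# Prints a filtergraph that cross fades inputs 0..N-1 one after another.
# The final output label is [a<N-1>], e.g. [a2] for three inputs.
crossfade_graph() {
    n=$1
    d=${2:-1}
    curve=${3:-exp}
    prev="[0:a]"
    graph=""
    i=1
    while [ "$i" -lt "$n" ]; do
        out="[a$i]"
        # Fade the mix so far (prev) into the next input, separated by ';'
        graph="${graph}${graph:+;}${prev}[$i:a]acrossfade=duration=${d}:curve1=${curve}:curve2=${curve}${out}"
        prev=$out
        i=$((i + 1))
    done
    printf '%s\n' "$graph"
}

crossfade_graph 3
```

For three tracks, usage would look like: ffmpeg -i input1.mp3 -i input2.mp3 -i input3.mp3 -filter_complex "$(crossfade_graph 3)" -map "[a2]" output.mp3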

(FFmpeg) How to adjust volume of an mp3?

As you learn more about FFmpeg, you’ll quickly find filters that solve problems in a matter of seconds in the terminal versus 5 minutes in GUI software. One example of this is adjusting the volume of an mp3. In the following one-liner, the volume is increased by 10dB:

  $ ffmpeg -i input.mp3 -af "volume=volume=10dB" output.mp3  

The volume value can also be negative to decrease the volume, as seen in the following example:

  $ ffmpeg -i input.mp3 -af "volume=volume=-10dB" output.mp3  

volume
	Sets the audio volume, where output_volume = input_volume * value

volume has over 17 different parameters that can be chained to precisely change the volume. For more information, visit: http://ffmpeg.org/ffmpeg-filters.html#toc-volume
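
The multiplication above works on plain numbers, while the examples use dB. A decibel value maps to a multiplier of 10^(dB/20), so here is a small sketch of that conversion. The function name db_to_linear is hypothetical:

```shell
#!/bin/sh
# db_to_linear DB -> linear multiplier, computed as 10^(DB/20)
db_to_linear() {
    awk -v db="$1" 'BEGIN { printf "%.4f\n", exp(db / 20 * log(10)) }'
}

db_to_linear 10    # multiplier for +10dB
db_to_linear -10   # multiplier for -10dB
```

So +10dB multiplies the signal by roughly 3.16 and -10dB by roughly 0.32, which is why a small dB change can be a large change in level.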

(FFmpeg) How to trim ‘x’ seconds from start and end of an audio track?

The last two questions covered trimming the start of an audio track and trimming the end, but what does the command look like combined? 

In this example, the input file is 1 minute long, and the requirement is to remove 10 seconds from the start and 10 seconds from the end, resulting in a new 40 second audio clip:

$ ffmpeg -t 00:00:50 -i input.mp3 -ss 00:00:10 -async 1 output.mp3    

In question “How to trim ‘x’ seconds from the end of a track?”, we learned that using a duration shorter than the input audio causes the end of the audio to be trimmed.

In question “How to trim ‘x’ seconds from the start of an audio track?”, we learned that through the use of seeking, trimming the start of an audio track is achieved. 

Together, the two make it quick and easy to manipulate the length of an audio track.
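
The arithmetic behind the command can be sketched as follows: -t gets the total length minus the end trim, and -ss gets the start trim. The function name trim_cmd is made up for this example, and it only prints the command (in the same shape as the one above) rather than running it:

```shell
#!/bin/sh
# trim_cmd TOTAL START_TRIM END_TRIM (all in whole seconds)
# Prints an ffmpeg command that keeps START_TRIM .. TOTAL-END_TRIM.
trim_cmd() {
    total=$1
    start=$2
    end=$3
    keep_until=$((total - end))      # value for -t: stop reading here
    if [ "$start" -ge "$keep_until" ]; then
        echo "nothing would be left after trimming" >&2
        return 1
    fi
    # Same option layout as the example command above
    echo "ffmpeg -t $keep_until -i input.mp3 -ss $start -async 1 output.mp3"
}

trim_cmd 60 10 10
```

For the 1 minute example this gives -t 50 and -ss 10, leaving 50 - 10 = 40 seconds of audio.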
