unix: Splitting an audio file into multiple files, each below a size threshold

bjg7j2ky  posted on 2022-11-23  in  Unix

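I have a FLAC file that I need to split into several separate FLAC files, each of which must be smaller than 100 MB. Is there a UNIX tool that can do this for me? Can I implement the logic myself?
Note: since FLAC is compressed, I think the simplest solution will require converting the file to WAV first.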

ugmeyewa #1

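Your question has two parts.

  • Converting the existing FLAC audio file to another format such as WAV
  • Splitting the converted WAV file into chunks of a specific size.

There is obviously more than one way to do this, but pydub offers a fairly simple way to accomplish both tasks. See the pydub documentation for details.

1) Converting the existing FLAC audio file to another format such as WAV

With pydub you can read the FLAC file and then export it as WAV, as follows: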

from pydub import AudioSegment

flac_audio = AudioSegment.from_file("sample.flac", "flac")
flac_audio.export("audio.wav", format="wav")

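2) Splitting the converted WAV file into chunks of a specific size

Again, there are several ways to do this. My approach is to determine the total duration and size of the converted WAV file, and then work out approximately how long a chunk of the desired size is.
The sample WAV file used here is 101,612 KB and runs for about 589 seconds, i.e. just under 10 minutes.

Observed WAV file size:

A stereo audio file with a 44.1 kHz frame rate (16-bit samples) is roughly 10 MB per minute. At 48 kHz it is slightly larger. The corresponding mono file is therefore about 5 MB per minute.
This approximation holds for our sample file, which comes to about 10 MB per minute.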

Calculated WAV file size:

The relationship between WAV file size and duration is given by
wav_file_size_in_bytes = (sample_rate (44100) * bits_per_sample (16) * number_of_channels (2 for stereo) * number_of_seconds) / 8   (8 bits = 1 byte)
Source: http://manual.audacityteam.org/o/man/digital_audio.html
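
As a quick check of the formula, here is a minimal sketch. The function name `pcm_data_bytes` is hypothetical, and the result covers only the raw PCM data, not the WAV header (typically 44 bytes for a plain PCM file).

```python
def pcm_data_bytes(sample_rate, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio (WAV header not included)."""
    return sample_rate * bits_per_sample * channels * seconds // 8


# The sample file from this answer: 44.1 kHz, 16-bit, stereo, 589 seconds.
sample_file_bytes = pcm_data_bytes(44100, 16, 2, 589)
# One minute of the same format, for comparison with the rule of thumb.
one_minute_bytes = pcm_data_bytes(44100, 16, 2, 60)

print(sample_file_bytes)  # 103899600
print(one_minute_bytes)   # 10584000
```

The one-minute figure, about 10.6 million bytes, agrees with the "roughly 10 MB per minute" observation for 44.1 kHz stereo.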

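Formula I used to compute the audio chunks:

The chunk duration is obtained by proportion:
a duration of X seconds gives a WAV file size of Y,
so what duration K, in seconds, gives a file size of 10 MB?
This gives K = X * 10 MB / Y
pydub.utils has a method make_chunks that produces chunks of a specific duration (in milliseconds). We use the formula above to determine the duration that corresponds to the desired size.
We use it to create chunks of 10 MB (or close to 10 MB) and export each one separately. The last chunk may be smaller, depending on the total size.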
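
When the size limit is strict, it matters whether K is rounded up or down to whole seconds. The following sketch works through the numbers for the sample file; the helper `chunk_seconds` and its `strict` flag are hypothetical names used only for illustration.

```python
import math


def chunk_seconds(duration_sec, wav_bytes, target_bytes, strict=True):
    """Chunk duration K = X * target / Y, in whole seconds.

    strict=True rounds down so the audio data of a chunk never exceeds
    target_bytes; strict=False rounds up, which can overshoot slightly.
    """
    k = duration_sec * target_bytes / wav_bytes
    return math.floor(k) if strict else math.ceil(k)


bytes_per_sec = 103899600 // 589  # 176400 for 44.1 kHz, 16-bit stereo

rounded_down = chunk_seconds(589, 103899600, 10_000_000)
rounded_up = chunk_seconds(589, 103899600, 10_000_000, strict=False)

print(rounded_down, rounded_down * bytes_per_sec)  # 56 9878400
print(rounded_up, rounded_up * bytes_per_sec)      # 57 10054800
```

Rounding down keeps each chunk under 10,000,000 bytes, while rounding up lands slightly above it.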

Here is working code.

from __future__ import print_function  # print() then behaves the same on Python 2 and 3

from pydub import AudioSegment
from pydub.utils import make_chunks

# Convert the FLAC file to WAV, then load the WAV
flac_audio = AudioSegment.from_file("sample.flac", "flac")
flac_audio.export("audio.wav", format="wav")
myaudio = AudioSegment.from_file("audio.wav", "wav")

channel_count = myaudio.channels        # number of channels
sample_width = myaudio.sample_width     # bytes per sample
duration_in_sec = len(myaudio) // 1000  # len() is in milliseconds; keep whole seconds
sample_rate = myaudio.frame_rate

print("sample_width=", sample_width)
print("channel_count=", channel_count)
print("duration_in_sec=", duration_in_sec)
print("frame_rate=", sample_rate)
bit_depth = sample_width * 8  # bits per sample (16 for this file)

wav_file_size = (sample_rate * bit_depth * channel_count * duration_in_sec) // 8
print("wav_file_size = ", wav_file_size)

file_split_size = 10000000  # 10 MB, i.e. 10,000,000 bytes
total_chunks = wav_file_size // file_split_size  # number of full-size chunks (informational only)

# Get the chunk duration by proportion (there is more than one way, of course):
#   duration_in_sec (X) --> wav_file_size (Y)
#   duration in sec (K) --> file size of 10 MB
#   K = X * 10 MB / Y
# Integer division rounds K down, so the audio data of a chunk fits within file_split_size.
chunk_length_in_sec = (duration_in_sec * file_split_size) // wav_file_size
chunk_length_ms = chunk_length_in_sec * 1000
chunks = make_chunks(myaudio, chunk_length_ms)

# Export all of the individual chunks as WAV files
for i, chunk in enumerate(chunks):
    chunk_name = "chunk{0}.wav".format(i)
    print("exporting", chunk_name)
    chunk.export(chunk_name, format="wav")

Output:

Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
sample_width= 2
channel_count= 2
duration_in_sec= 589
frame_rate= 44100
wav_file_size =  103899600
exporting chunk0.wav
exporting chunk1.wav
exporting chunk2.wav
exporting chunk3.wav
exporting chunk4.wav
exporting chunk5.wav
exporting chunk6.wav
exporting chunk7.wav
exporting chunk8.wav
exporting chunk9.wav
exporting chunk10.wav
>>>
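
On the question of implementing the logic yourself: for uncompressed PCM WAV input, the split can also be sketched with only Python's standard `wave` module. This is an illustration under stated assumptions, not a drop-in tool: `split_wav`, `WAV_HEADER_BYTES` and the demo file names are hypothetical, and the 44-byte header size is assumed to match what `wave` writes for plain PCM files.

```python
import os
import tempfile
import wave

WAV_HEADER_BYTES = 44  # assumed size of the PCM header written by `wave`


def split_wav(path, max_bytes, out_prefix):
    """Split a PCM WAV file into pieces whose file size stays <= max_bytes."""
    out_paths = []
    with wave.open(path, "rb") as src:
        frame_bytes = src.getsampwidth() * src.getnchannels()
        frames_per_chunk = (max_bytes - WAV_HEADER_BYTES) // frame_bytes
        if frames_per_chunk < 1:
            raise ValueError("max_bytes is too small to hold a single frame")
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = "{0}{1}.wav".format(out_prefix, len(out_paths))
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            out_paths.append(out_path)
    return out_paths


# Demo: two seconds of 8 kHz, 16-bit mono silence, split at 10,000 bytes.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.wav")
with wave.open(source, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 16000)

pieces = split_wav(source, 10000, os.path.join(workdir, "piece"))
sizes = [os.path.getsize(p) for p in pieces]
print(sizes)
```

Each piece could then be converted back to FLAC. FLAC output is normally smaller than the WAV it encodes, so pieces cut this way would normally stay under the same limit, although that is worth checking on the actual files.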
gopyfrb3 #2

I copied the code from here and turned it into a function. Maybe it will help someone!

import math
import os

from hurry.filesize import size
from pydub import AudioSegment
from pydub.utils import which, make_chunks

AudioSegment.converter = which("ffmpeg")

def mp3_to_chunks(link:str, mb_split:int=49283072, i_format:str="mp4", o_format:str="wav", filename_to_save:str="chunk"):
    # Note: mb_split is a size in bytes, despite its name.

    # Convert the input to WAV, then load the WAV
    source_audio = AudioSegment.from_file(link, format=i_format)
    source_audio.export("audio.wav", format="wav")
    myaudio = AudioSegment.from_file("audio.wav", "wav")
    channel_count = myaudio.channels
    sample_width = myaudio.sample_width    # bytes per sample
    duration_in_sec = len(myaudio) / 1000  # len() is in milliseconds
    sample_rate = myaudio.frame_rate
    bit_depth = sample_width * 8           # bits per sample
    wav_file_size = (sample_rate * bit_depth * channel_count * duration_in_sec) / 8
    file_split_size = mb_split
    # Round down so the audio data of a chunk does not exceed file_split_size
    chunk_length_in_sec = math.floor((duration_in_sec * file_split_size) / wav_file_size)
    chunk_length_ms = chunk_length_in_sec * 1000
    chunks = make_chunks(myaudio, chunk_length_ms)

    # Export every chunk and remember its file name
    list_chunks = []
    for i, chunk in enumerate(chunks):
        chunk_name = f"{filename_to_save}{i}.{o_format}"
        list_chunks.append(chunk_name)
        chunk.export(chunk_name, format=o_format)

    # Report on-disk sizes (os.path.getsize), not in-memory object sizes
    print(f"Original file size: {size(os.path.getsize('audio.wav'))}")

    for name in list_chunks:
        print(f"Size for {name}: {size(os.path.getsize(name))}")

    print("Check the content! File is saved 😏")

mp3_to_chunks('/content/Never Going Back Mashup  Best of 2021  Neha Kakkar Atif Aslam Jubin Nautiyal Emraan Hashmi.mp4')
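
A note on measuring the results: `sys.getsizeof` on a bytes object reports the size of the Python object in memory, which includes interpreter overhead, whereas `os.path.getsize` reports the size of the file on disk. A minimal sketch of the difference, using a throwaway temporary file as a stand-in for an exported chunk:

```python
import os
import sys
import tempfile

payload = b"\x00" * 1000  # stand-in for audio data

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

on_disk = os.path.getsize(tmp_path)  # bytes stored in the file
in_memory = sys.getsizeof(payload)   # size of the Python bytes object

print(on_disk)              # 1000
print(in_memory > on_disk)  # True on CPython: the object carries extra overhead

os.remove(tmp_path)
```

For checking chunks against a size limit, the on-disk figure is the one that matters.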
