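I'm trying to run a Colab notebook that trains OpenAI's Jukebox, but when the code reaches the function that loads the audio, I get this error:
File "/content/jukebox/jukebox/data/files_dataset.py", line 82, in get_song_chunk
    data, sr = load_audio(filename, sr=self.sr, offset=offset, duration=self.sample_length)
File "/content/jukebox/jukebox/utils/io.py", line 48, in load_audio
    frame = frame.to_ndarray(format='fltp') # Convert to floats and not int16
AttributeError: 'list' object has no attribute 'to_ndarray'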
It seems to be treating the frame as a list. When printed, it looks like this:
[<av.AudioFrame 0, pts=None, 778 samples at 22050Hz, stereo, fltp at 0x7fd03dd64150>]
When I try changing the line to frame = resampler.resample(frame), I get this error:
TypeError: 'av.audio.frame.AudioFrame' object cannot be interpreted as an integer
I don't know much about audio files, so I'm not sure how to debug this and would appreciate any help.
The full code for loading the audio is below.
def load_audio(file, sr, offset, duration, resample=True, approx=False, time_base='samples', check_duration=True):
    if time_base == 'sec':
        offset = offset * sr
        duration = duration * sr
    # Loads at target sr, stereo channels, seeks from offset, and stops after duration
    container = av.open(file)
    audio = container.streams.get(audio=0)[0] # Only first audio stream
    audio_duration = audio.duration * float(audio.time_base)
    if approx:
        if offset + duration > audio_duration*sr:
            # Move back one window. Cap at audio_duration
            offset = np.min(audio_duration*sr - duration, offset - duration)
    else:
        if check_duration:
            assert offset + duration <= audio_duration*sr, f'End {offset + duration} beyond duration {audio_duration*sr}'
    if resample:
        resampler = av.AudioResampler(format='fltp',layout='stereo', rate=sr)
    else:
        assert sr == audio.sample_rate
    offset = int(offset / sr / float(audio.time_base)) #int(offset / float(audio.time_base)) # Use units of time_base for seeking
    duration = int(duration) #duration = int(duration * sr) # Use units of time_out ie 1/sr for returning
    sig = np.zeros((2, duration), dtype=np.float32)
    container.seek(offset, stream=audio)
    total_read = 0
    for frame in container.decode(audio=0): # Only first audio stream
        if resample:
            frame.pts = None
            frame = resampler.resample(frame)
        frame = frame.to_ndarray(format='fltp') # Convert to floats and not int16
        read = frame.shape[-1]
        if total_read + read > duration:
            read = duration - total_read
        sig[:, total_read:total_read + read] = frame[:, :read]
        total_read += read
        if total_read == duration:
            break
    assert total_read <= duration, f'Expected {duration} frames, got {total_read}'
    return sig, sr
3 Answers
ki0zmccv #1
If the variable frame is being interpreted as a list, you can replace frame = resampler.resample(frame) with frame = resampler.resample(frame)[0]. After that change, the code should run normally.
2g32fytz #2
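Try replacing frame = frame.to_ndarray(format='fltp') with a direct assignment to the variable frame. If you want it to have a specific data type, you can change the dtype parameter of the ndarray function.
i7uq4tfw #3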
Try:
frame = frame[0].to_ndarray(format='fltp')
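Indexing with [0] keeps only the first element of the list. As a minimal sketch, assuming the list comes from a newer PyAV release in which resampler.resample() returns a list of frames rather than a single frame (older releases, as far as I know, returned one frame or None), the return value can be normalized so the loop works either way. The helper name as_frame_list is hypothetical; it is not part of Jukebox or PyAV.

```python
def as_frame_list(resampled):
    """Normalize the return value of resampler.resample(frame) to a list.

    Hypothetical helper (not part of Jukebox or PyAV). It assumes the
    resampler may return None, a single frame, or a list of frames,
    depending on the installed PyAV version.
    """
    if resampled is None:
        # Nothing was produced for this input frame.
        return []
    if isinstance(resampled, (list, tuple)):
        # List-returning behaviour: pass every frame through.
        return list(resampled)
    # Single-frame behaviour: wrap it so callers can always iterate.
    return [resampled]
```

In load_audio, the resample branch could then iterate over as_frame_list(resampler.resample(frame)) and copy each frame's array into sig in turn, instead of keeping only element [0]. Whether the resampler ever returns more than one frame per call depends on the PyAV version and the input, so treat this as a precaution rather than a required change.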