iOS: output AVSpeechSynthesizer speech to a file?

gpnt7bae · posted 12 months ago · in iOS

AVSpeechSynthesizer has a fairly simple API, but it has no built-in support for saving speech to an audio file.
I'm wondering whether there is a way around this: perhaps recording the output while it plays silently, for playback later? Or a more efficient approach.


rseugnpd1#

As of iOS 13, AVSpeechSynthesizer now has write(_:toBufferCallback:):

import AVFoundation

let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "test 123")
utterance.voice = AVSpeechSynthesisVoice(language: "en")
var output: AVAudioFile?

synthesizer.write(utterance) { (buffer: AVAudioBuffer) in
   guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
      fatalError("unknown buffer type: \(buffer)")
   }
   if pcmBuffer.frameLength == 0 {
     // done
   } else {
     // append buffer to file
     if output == nil {
       // the AVAudioFile initializer throws, so handle or propagate the error
       output = try? AVAudioFile(
         forWriting: URL(fileURLWithPath: "test.caf"),
         settings: pcmBuffer.format.settings,
         commonFormat: .pcmFormatInt16,
         interleaved: false)
     }
     try? output?.write(from: pcmBuffer)
   }
}
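One practical note: on a device, the bare relative path "test.caf" above is generally not writable from the app sandbox. A minimal sketch of building a sandbox-safe destination instead (the helper name and file name are illustrative):

import Foundation

// Illustrative helper: resolve the output file inside the app's Documents
// directory, which is writable in the iOS sandbox.
func speechOutputURL(fileName: String = "test.caf") throws -> URL {
    let documents = try FileManager.default.url(for: .documentDirectory,
                                                in: .userDomainMask,
                                                appropriateFor: nil,
                                                create: true)
    return documents.appendingPathComponent(fileName)
}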



tct7dpnv2#

As of now, AVSpeechSynthesizer does not support this; there is no way to get an audio file out of it. I tried a few weeks ago for one of my apps and found it was not possible, and nothing about AVSpeechSynthesizer has changed in iOS 8.
I also considered recording the sound during playback, but that approach has too many flaws: the user may be wearing headphones, the system volume may be low or muted, and the recording may pick up other external sounds. So I would not recommend it.


polhcujo3#

On OS X you can produce an AIFF file with NSSpeechSynthesizer's startSpeakingString:toURL: method (or, possibly, via some OS X-based service):
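A minimal Swift sketch of that call (the output path is a placeholder; the file is only complete once the delegate reports that speaking has finished):

import AppKit

let synthesizer = NSSpeechSynthesizer()
let outputURL = URL(fileURLWithPath: "/tmp/speech.aiff") // placeholder path
// Renders the utterance to an AIFF file instead of the speakers.
_ = synthesizer.startSpeaking("Hello from NSSpeechSynthesizer", to: outputURL)
// Keep `synthesizer` alive and wait for the delegate's
// speechSynthesizer(_:didFinishSpeaking:) callback before reading the file.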


2hh7jdfx4#

As some comments on @Jan's answer point out, unfortunately AVSpeechSynthesizer's write does not work on macOS. All other search results lead back to this question.
Also as noted in the comments, NSSpeechSynthesizer from import AppKit still works, although it is marked as deprecated: https://developer.apple.com/documentation/appkit/nsspeechsynthesizer
I tested the following code on macOS Ventura and it works. Essentially you need to find a voice string from NSSpeechSynthesizer.availableVoices rather than AVSpeechSynthesisVoice.speechVoices(), and you need to use the didFinish delegate callback to make sure writing has completed (not included in my test code: https://developer.apple.com/documentation/appkit/nsspeechsynthesizerdelegate/1448538-speechsynthesizer).
It is a little disappointing that we have to fall back on a legacy API for such basic functionality, but I suspect they call the same thing internally.

import AppKit

enum CoreError: Error { case unexpectedNil } // stand-in for the author's own error type

func test() throws {
    NSSpeechSynthesizer.availableVoices.forEach { print($0) }
    guard let synthesizer = NSSpeechSynthesizer(voice: NSSpeechSynthesizer.VoiceName(rawValue: "com.apple.voice.enhanced.en-US.Evan")) else {
        print("Voice not found")
        throw CoreError.unexpectedNil
    }
    synthesizer.rate = 0.4
    let result = synthesizer.startSpeaking("The quick brown fox jumped over the lazy dog. lol", to: URL(fileURLWithPath: "test.aiff"))
    print(result)
    sleep(1) // crude wait; use the didFinish delegate in real code
}
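For completeness, a minimal sketch of the didFinish delegate hook the answer mentions (the wrapper class is illustrative, not part of the answer's code):

import AppKit

// Illustrative wrapper: keeps the synthesizer alive and reports completion
// via speechSynthesizer(_:didFinishSpeaking:) instead of sleeping.
final class SpeechFileWriter: NSObject, NSSpeechSynthesizerDelegate {
    private let synthesizer = NSSpeechSynthesizer()
    private var onFinished: ((Bool) -> Void)?

    func write(_ text: String, to url: URL, completion: @escaping (Bool) -> Void) {
        onFinished = completion
        synthesizer.delegate = self
        _ = synthesizer.startSpeaking(text, to: url)
    }

    func speechSynthesizer(_ sender: NSSpeechSynthesizer,
                           didFinishSpeaking finishedSpeaking: Bool) {
        onFinished?(finishedSpeaking) // the file is complete once this fires
    }
}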



tnkciper5#

I now have this working on both iOS (17.1.1) and macOS (14.0) with Xcode 15.0.1. It is largely just a small tweak to the answer by @Jan Berkel that helps avoid the "square wave distortion" and the crashes reported in the comments on that answer. Referring to @Jan Berkel's answer, I found that in the call to the AVAudioFile initializer, the commonFormat: parameter should be .pcmFormatFloat32 to work on an iOS device and on macOS, whereas it should be .pcmFormatInt16 to avoid crashing when running in the simulator (don't ask me why; it's just an observation ;-).
Here is the relevant code that works on macOS and on an iOS device (but crashes in the simulator during output?.write(from: pcmBuffer)):

output = try AVAudioFile(
   forWriting: url,
   settings: pcmBuffer.format.settings,
   commonFormat: .pcmFormatFloat32,    // works on iOS device and on macOS, but crashes in simulator during self.output?.write(from: pcmBuffer)
   interleaved: false)

And here is the code that works in the simulator in Xcode (but produces noise on macOS and on an iOS device):

output = try AVAudioFile(
   forWriting: url,
   settings: pcmBuffer.format.settings,
   commonFormat: .pcmFormatInt16,    // this produces noise on macOS and iOS device, but works in simulator
   interleaved: false)
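
If you want a single code path that covers both cases, a sketch that encodes the above observation with a compile-time check (assuming the observation holds on your setup):

#if targetEnvironment(simulator)
let commonFormat: AVAudioCommonFormat = .pcmFormatInt16   // avoids the simulator crash
#else
let commonFormat: AVAudioCommonFormat = .pcmFormatFloat32 // avoids the noise on device and macOS
#endif

output = try AVAudioFile(
   forWriting: url,
   settings: pcmBuffer.format.settings,
   commonFormat: commonFormat,
   interleaved: false)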


That's it; that is the only tweak I made to @Jan Berkel's answer. Looking at the comments on that answer, it is clear there are various pitfalls in getting this working, most notably making sure the speech synthesizer is not released before the file has been generated, and making sure the callbacks can be received. So let me post sample code showing how to use the answer to create an audio file with synthesized speech and then play it back, keeping the synthesizer and the audio player alive long enough and receiving the various callbacks. The sample uses Swift async/await to handle the callbacks and has been tested on macOS, on an iOS device, and in the iOS simulator (with the caveats given above).

import Foundation
import AVFAudio

let url : URL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask,appropriateFor: nil, create: false)
    .appending(path: "test123.caf", directoryHint: .notDirectory)
let maker = SpokenAudioFileMaker()
let player = SpokenAudioFilePlayer()

func exampleOfUse() async {
    do {
        try await maker.makeAudioFile(text: "test 123", url: url)
        print("Made audio file")
        try await player.play(cafAudioFileUrl: url)
        print("Played audio file")
    } catch {
        fatalError(error.localizedDescription)
    }
}

class SpokenAudioFileMaker : NSObject, AVSpeechSynthesizerDelegate {
    let synthesizer = AVSpeechSynthesizer() // hold on to this object to prevent it from being released before file has been generated
    var output : AVAudioFile? = nil
    // for async-await:
    var audioMakerContinuation : CheckedContinuation<(success:Bool, dummy:Int), Error>? = nil
    
    override init() {
        super.init()
        synthesizer.delegate = self
    }
    
    func makeAudioFile(text:String, url:URL) async throws {
        let utterance = AVSpeechUtterance(string: text)
        _ = try await withCheckedThrowingContinuation({ (continuation:CheckedContinuation<(success:Bool, dummy:Int), Error>) in
            audioMakerContinuation = continuation
            //
            synthesizer.write(utterance) { (buffer: AVAudioBuffer) in
                guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
                    fatalError("unknown buffer type: \(buffer)")
                }
                if pcmBuffer.frameLength == 0 {
                    // the loop has completed - done.
                } else {
                    do {
                        // append buffer for this range of the utterance to file
                        if self.output == nil {     // essential to avoid file being overridden for each range of the utterance
                            self.output = try AVAudioFile(
                                forWriting: url,
                                settings: pcmBuffer.format.settings,
                                commonFormat: .pcmFormatFloat32,    // this works on iOS device and on macOS, but crashes in simulator during self.output?.write(from: pcmBuffer)
                                // commonFormat: .pcmFormatInt16,    // this produces noise on iOS device, but works in simulator
                                interleaved: false)
                        }
                        try self.output?.write(from: pcmBuffer)
                    } catch {
                        fatalError(error.localizedDescription)
                    }
                }
            }
        })
    }
    
    // MARK: AVSpeechSynthesizerDelegate

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("speechSynthesizer didFinish utterance: \(utterance.speechString)")
        self.output = nil
        audioMakerContinuation?.resume(returning: (success: true, dummy:0))
    }
    
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, willSpeakRangeOfSpeechString characterRange: NSRange, utterance: AVSpeechUtterance) {
        print("speechSynthesizer will speak range...")
    }
}

class SpokenAudioFilePlayer: NSObject, AVAudioPlayerDelegate {
    private var player : AVAudioPlayer?
    // for async-await:
    var audioPlayerContinuation : CheckedContinuation<(success:Bool, dummy:Int), Error>? = nil

    func play(cafAudioFileUrl:URL) async throws {
        let audioData = try Data(contentsOf: cafAudioFileUrl)
        player = try AVAudioPlayer(data:audioData, fileTypeHint:"caf")
        player?.delegate = self
        _ = try await withCheckedThrowingContinuation({ (continuation:CheckedContinuation<(success:Bool, dummy:Int), Error>) in
            audioPlayerContinuation = continuation
            //
            player?.play()
        })
    }

    // MARK: AVAudioPlayerDelegate
    
    func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
        print("audioPlayerDidFinishPlaying")
        audioPlayerContinuation?.resume(returning: (success: true, dummy:0))
    }

    func audioPlayerDecodeErrorDidOccur(_ player: AVAudioPlayer, error: Error?) {
        if let error = error {
            print(error.localizedDescription)
        }
        audioPlayerContinuation?.resume(returning: (success: false, dummy:0))
    }
}
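To kick the example off from a synchronous context (illustrative; for instance from an app entry point), you could wrap it in a Task:

Task {
    await exampleOfUse()
}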
