Swift: Use DataSource instead of MLDataTable when initializing in Create ML

e5nqia27 posted on 2023-06-21 in Swift

I am running the following code in an Xcode 14.3 playground on macOS Ventura 13.1.

let csvFile = Bundle.main.url(forResource: "all-data", withExtension: "csv")!
let dataTable = try MLDataTable(contentsOf: csvFile)

let (classifierEvaluationTable, classifierTrainingTable) = dataTable.randomSplit(by: 0.20, seed: 5)

let classifier = try MLTextClassifier(trainingData: classifierTrainingTable, textColumn: "text", labelColumn: "sentiment")

I get the following warning:

'init(trainingData:textColumn:labelColumn:parameters:)' was deprecated in macOS 13.0: Use DataSource instead of MLDataTable when initializing.

The problem is that there is no documentation on how to create a DataFrame or a DataSource.

Answer #1, by ego6inou

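To handle this case, we can use a DataFrame from the TabularData framework, which avoids the warning. The DataFrame-based initializer is not deprecated at present.
Here is an example I found of how to rewrite the code.
Previous version: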

let csvFile = Bundle.main.url(forResource: "all-data", withExtension: "csv")!
let dataTable = try MLDataTable(contentsOf: csvFile)

let (classifierEvaluationTable, classifierTrainingTable) = dataTable.randomSplit(by: 0.20, seed: 5)

let classifier = try MLTextClassifier(trainingData: classifierTrainingTable, textColumn: "text", labelColumn: "sentiment")

Updated version, which additionally needs this import:
import TabularData

let csvFile = Bundle.main.url(forResource: "all-data", withExtension: "csv")!
// Reading a CSV file can throw, so the initializer needs `try`.
let dataFrame = try DataFrame(contentsOfCSVFile: csvFile)

// The first slice holds roughly 20% of the rows, the second slice the rest.
let (classifierEvaluationSlice, classifierTrainingSlice) = dataFrame.randomSplit(by: 0.20, seed: 5)

let classifierTrainingFrame = DataFrame(classifierTrainingSlice)
let classifier = try MLTextClassifier(trainingData: classifierTrainingFrame, textColumn: "text", labelColumn: "sentiment")

In addition, we can evaluate the classifier, print the metrics, and save the model file:

let classifierEvaluationFrame = DataFrame(classifierEvaluationSlice)
// Evaluate the trained classifier on the held-out rows.
let metrics = classifier.evaluation(on: classifierEvaluationFrame, textColumn: "text", labelColumn: "sentiment")
print(metrics.classificationError)

let modelPath = URL(filePath: "YourPath/YourModelName.mlmodel")
try classifier.write(to: modelPath)
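
Since the question mentions that documentation on creating a DataFrame is hard to find: besides reading a CSV file, a DataFrame can also be assembled in memory, column by column. The following is only a minimal sketch. The sample strings and the name `sampleFrame` are made up for illustration, and three rows are far too few to train a useful classifier.

```swift
import TabularData

// Hypothetical sample data; the column names match the ones used above.
let texts = ["Profits rose sharply", "Sales fell again", "Revenue was unchanged"]
let labels = ["positive", "negative", "neutral"]

// Start with an empty DataFrame and append one typed column at a time.
var sampleFrame = DataFrame()
sampleFrame.append(column: Column(name: "text", contents: texts))
sampleFrame.append(column: Column(name: "sentiment", contents: labels))

// Inspect the result: number of rows and columns, then the table itself.
print(sampleFrame.shape)
print(sampleFrame)
```

With enough rows, a frame built this way could be passed to MLTextClassifier(trainingData:textColumn:labelColumn:) in the same way as the frame loaded from the CSV file.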
