BERTopic takes much longer to fit when using a representation model

Posted by qkf9rpyu 5 months ago in Other


Hi, I'm running BERTopic on a MacBook Pro M1 with the following parameters, using precomputed embeddings and sentence-transformers:

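from sklearn.feature_extraction.text import CountVectorizer
from bertopic import BERTopic
from bertopic.representation import OpenAI
from bertopic.vectorizers import ClassTfidfTransformer

# openai_client (an OpenAI client) and docs (the list of documents) are defined elsewhere
vectorizer_model = CountVectorizer(stop_words="english")
ctfidf_model = ClassTfidfTransformer(reduce_frequent_words=True)
representation_model = OpenAI(openai_client, model="gpt-3.5-turbo", delay_in_seconds=10, chat=True)

topic_model = BERTopic(
    vectorizer_model=vectorizer_model,
    ctfidf_model=ctfidf_model,
    nr_topics="auto",
    min_topic_size=max(len(docs) // 800, 10),
    representation_model=representation_model,
)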

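I noticed a large difference in fit time depending on whether I use the representation model: 5 minutes without it versus 35 minutes with it. Is there a particular reason for this? Since this step should run at the very end of the topic modeling process, I would not expect it to take 30 minutes to retrieve the keywords and documents and send the prompts to ChatGPT. Maybe something is happening that I'm not aware of. Thanks in advance.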
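One setting in the configuration above may account for much of the gap: `delay_in_seconds=10`. As far as I understand BERTopic's OpenAI representation, it sends one request per topic and sleeps for that delay before each request, so the waiting time alone grows linearly with the number of topics. A rough back-of-the-envelope sketch follows. The helper name `estimated_delay_minutes` is made up for illustration, and the topic count of 180 is a hypothetical value, not a number from this post.

```python
def estimated_delay_minutes(n_topics: int, delay_in_seconds: float) -> float:
    """Minutes spent purely waiting, assuming one delay per topic.

    This ignores the API latency itself, so the real overhead would be higher.
    """
    return n_topics * delay_in_seconds / 60


# 180 topics is a hypothetical count chosen only to illustrate the arithmetic.
print(estimated_delay_minutes(n_topics=180, delay_in_seconds=10))  # 30.0
```

If the model ends up with a topic count in that range, the fixed delay would already explain roughly 30 extra minutes. Lowering `delay_in_seconds` (as far as the API rate limits allow) should shrink the overhead proportionally.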
Is it possible to add the representation layer after the model has been fitted?
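On adding the representation afterwards: BERTopic has an `update_topics` method that accepts a `representation_model` argument, so one option might be to fit without the representation model and apply it in a separate step. Below is a minimal sketch of that idea, not a confirmed recommendation from the maintainers. The helper name `fit_then_add_representation` is hypothetical, and `topic_model` is assumed to be a `BERTopic` instance created without `representation_model`.

```python
def fit_then_add_representation(topic_model, docs, embeddings,
                                vectorizer_model, ctfidf_model,
                                representation_model):
    """Fit first, then add the (slow) representation in a separate step.

    Assumes topic_model is a BERTopic instance built WITHOUT a
    representation_model, and embeddings are the precomputed embeddings.
    """
    # Fast step: clustering plus the default c-TF-IDF keywords.
    topic_model.fit_transform(docs, embeddings=embeddings)

    # Slow step: recompute topic representations with the LLM-based model.
    # The vectorizer and c-TF-IDF models are passed again on the assumption
    # that update_topics falls back to defaults when they are omitted.
    topic_model.update_topics(
        docs,
        vectorizer_model=vectorizer_model,
        ctfidf_model=ctfidf_model,
        representation_model=representation_model,
    )
    return topic_model
```

Splitting the two steps this way also makes it easy to time them separately and see how much of the 35 minutes is spent on the representation step.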


Source: https://github.com/MaartenGr/BERTopic/issues/1930