Retry strategy for BERTopic OpenAI RateLimitError

Posted by but5z9lq 5 months ago in Other

For my representation model

import bertopic.representation

# Representation model that asks the chat model for one label per topic.
representation_model = bertopic.representation.OpenAI(
    model="gpt-35-turbo",
    chat=True,
    # delay_in_seconds=1,
    generator_kwargs={"engine": "gpt-35-turbo", "temperature": 0.1},
    prompt="""
Output a concise, English, lowercase topic label for the following keywords. Output only the label, no punctuation. Prefer single terms. If you are unable to perform the task, output: None. 
[KEYWORDS]
""",
)

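I sometimes hit a RateLimitError without knowing the exact cause (it seems to happen when training on large datasets of more than 100,000 documents).
Adding a wait between API calls, even just one second, increases training time several times over (I am not sure why).

Is it possible to use a different strategy that catches the RateLimitError when it occurs and adjusts accordingly?
If the RateLimitError is predictable (e.g. dependent on dataset size), can it be avoided?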

BERTopic

Source: https://github.com/MaartenGr/BERTopic/issues/1560

1 answer

hrysbysz1#

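Is it possible to use a different strategy that catches the rate limit error when it occurs and adjusts accordingly?
You can use exponential_backoff in OpenAI to achieve this: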

exponential_backoff: Retry requests with a random exponential backoff.
                     A short sleep is used when a rate limit error is hit,
                     then the request is retried. Increase the sleep length
                     if errors are hit until 10 unsuccessful requests.
                     If True, overrides `delay_in_seconds`.
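
To make the behaviour described in that docstring concrete, here is a minimal, self-contained sketch of a random exponential backoff loop. It is not BERTopic's actual implementation: `RateLimitError` is a stand-in class, and `retry_with_backoff`, `flaky_label_request` and all parameter names are hypothetical, chosen only for illustration.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the rate limit error raised by the API client (hypothetical)."""


def retry_with_backoff(func, initial_delay=1.0, base=2.0, jitter=True,
                       max_retries=10, sleep=time.sleep):
    """Call func(); on RateLimitError sleep, lengthen the delay, and retry."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after max_retries failed retries
            # Grow the delay exponentially; the random factor spreads retries out.
            delay *= base * (1 + (random.random() if jitter else 0.0))
            sleep(delay)


# Demo: a fake request that is rate limited three times, then succeeds.
calls = {"n": 0}


def flaky_label_request():
    calls["n"] += 1
    if calls["n"] <= 3:
        raise RateLimitError("rate limit hit")
    return "topic label"


waits = []  # record the sleeps instead of actually waiting
label = retry_with_backoff(flaky_label_request, jitter=False, sleep=waits.append)
print(label, waits)
```

Unlike a fixed delay, this only waits after a failure, so requests that succeed cost no extra time. In BERTopic you should not need to write this yourself: per the docstring above, passing `exponential_backoff=True` to `bertopic.representation.OpenAI(...)` turns on the built-in retry and overrides `delay_in_seconds`.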

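If the rate limit error is predictable (e.g. dependent on dataset size), can it be avoided?
That depends on the number of clusters you create, since one call is made per cluster to generate its label. If you have many clusters and set a delay of a few seconds, the total added wait is that delay multiplied by the number of clusters.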
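
As a rough illustration of that arithmetic, the sketch below estimates the extra waiting time a fixed delay adds, assuming exactly one labeling call per cluster. The function name `added_delay_seconds` and the cluster counts are hypothetical and not taken from the original question.

```python
def added_delay_seconds(n_clusters, delay_in_seconds):
    """Total fixed waiting time when one labeling call is made per cluster."""
    return n_clusters * delay_in_seconds


# Hypothetical cluster counts, each with a 1-second delay between calls.
for n_clusters in (50, 500, 5000):
    total = added_delay_seconds(n_clusters, delay_in_seconds=1)
    print(f"{n_clusters} clusters: {total} s (about {total / 60:.1f} min) of waiting")
```

Under this assumption the waiting time scales with the number of clusters rather than with the number of documents, although large datasets often produce many clusters.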

Answered 5 months ago
