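I am trying to mock DataFrame creation in Python using MagicMock(), but part of the code fails when called from the unit test function because of a comparison that MagicMock does not support.
Here is my testcase.py: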
sys.modules["pyspark.sql"] = MagicMock()

def test_process_batch():
    df = (
        [
            (1, "foo"),
            (2, "bar"),
        ],
        ["id", "label"]
    )
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.getOrCreate()
    spark_df = spark.createDataFrame(df)
    process_batch(spark_df, "123")
    assert True
In the main program, process_batch() contains a line that compares the DataFrame count with a number, as shown here:
def process_batch(data_frame, batchId):
    """Process streamed batch dataframe"""
    if (data_frame.count() > 0):
        ...
The unit test fails with the following error:
[CPython38-test] =================================== FAILURES ===================================
[CPython38-test] ______________________________ test_process_batch ______________________________
[CPython38-test]
[CPython38-test]     def test_process_batch():
[CPython38-test]         df = (
[CPython38-test]             [
[CPython38-test]                 (1, "foo"),
[CPython38-test]                 (2, "bar"),
[CPython38-test]             ],
[CPython38-test]             ["id", "label"]
[CPython38-test]         )
[CPython38-test]         from pyspark.sql import SparkSession
[CPython38-test]         spark = SparkSession.builder.getOrCreate()
[CPython38-test]         spark_df = spark.createDataFrame(df)
[CPython38-test]
[CPython38-test] >       process_batch(spark_df, "123")
[CPython38-test]
[CPython38-test] test/test_cia_optics_ingestion_glue_spark_streaming.py:54:
[CPython38-test] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[CPython38-test]
[CPython38-test] data_frame = <MagicMock name='mock.SparkSession.builder.getOrCreate().createDataFrame()' id='140188327287056'>
[CPython38-test] batchId = '123'
[CPython38-test]
[CPython38-test]     def process_batch(data_frame, batchId):
[CPython38-test]         """Process streamed batch dataframe"""
[CPython38-test]
[CPython38-test] >       if (data_frame.count() > 0):
[CPython38-test] E       TypeError: '>' not supported between instances of 'MagicMock' and 'int'
[CPython38-test]
Can you advise me on how to get past this?
1 Answer
As Samwise pointed out, you can also mock the count() method so that it returns a real integer: spark_df.count.return_value = 10. The complete working code looks like this: