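I am trying to create a TensorFlow dataset from DICOM images using the tf.data API and tensorflow_io, and I want to do some preprocessing based on the images' Hounsfield units. The DICOM images have shape (512, 512). I have extracted the PixelData from an image and want to convert it into a NumPy array of the appropriate shape with the following code: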

image_bytes = tf.io.read_file(image_path)
PixelData = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData).numpy()
pixel_array = np.frombuffer(PixelData, dtype=tf.uint16)
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

This code should be equivalent to

Image = pydicom.dcmread(image_path)
pixel_array = Image.pixel_array
print(pixel_array)

and

Image = pydicom.dcmread(image_path)
PixelData = Image.PixelData
pixel_array = np.frombuffer(PixelData, dtype=np.uint16)
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

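The DICOM tags are the same as those used by pydicom and are listed here. PixelData should return the raw byte values of the DICOM image. I have confirmed with pydicom that the raw pixel data is stored as np.uint16 values. However, when I try to convert the bytes returned by TensorFlow into a NumPy array with np.frombuffer, I get an error saying that the buffer size is not divisible by the element size.
Running the scripts above gives the following output shapes:
1. TensorFlow: does not run with tf.uint16; with tf.uint8 the output shape is (1310719,)
2. pydicom pixel_array directly: output shape (512, 512)
3. pydicom PixelData converted to a pixel array: output shape (512, 512)

Both pydicom examples give the same output, but the tensorflow_io DICOM tag seems to return something completely different. A sample DICOM file is attached here. Is there a problem with the library or with my implementation?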
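Edit: The DICOM image is actually stored as signed 16-bit integers, not unsigned. The following three snippets therefore produce the same output:
Direct pixel array from pydicom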

import pydicom
import numpy as np

dcm = pydicom.dcmread("ID_000012eaf.dcm")
print(dcm.pixel_array)

Converting PixelData to pixel_array manually

import pydicom
import numpy as np
dcm = pydicom.dcmread("ID_000012eaf.dcm")
PixelData = dcm.PixelData
pixel_array = np.frombuffer(PixelData, dtype=np.int16)
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

Getting the pixel array directly with tensorflow_io

import tensorflow as tf
import tensorflow_io as tfio
import numpy as np
image_bytes = tf.io.read_file("ID_000012eaf.dcm")
pixel_array = tfio.image.decode_dicom_image(image_bytes, on_error='lossy', scale='preserve', dtype=tf.float32).numpy()
pixel_array = pixel_array.astype('int16')
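# In-place true division (/=) raises on an int16 array; floor division keeps the dtype
pixel_array = pixel_array // 2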
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

However, for some reason this last snippet still does not work:

import tensorflow as tf
import tensorflow_io as tfio
import numpy as np

image_bytes = tf.io.read_file("ID_000012eaf.dcm")
PixelData = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData).numpy()
pixel_array = np.frombuffer(PixelData, dtype=np.int16)
print(pixel_array)

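Edit 2: These two snippets should work in theory, but both fail with an error saying that the length of the byte string is not divisible by the size of int16: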

import tensorflow as tf
import tensorflow_io as tfio
import numpy as np
image_bytes = tf.io.read_file("ID_000012eaf.dcm")
PixelData = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData).numpy()
pixel_array =  np.frombuffer(PixelData, dtype=np.int16)
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

and

import tensorflow as tf
import tensorflow_io as tfio
import numpy as np
image_bytes = tf.io.read_file("ID_000012eaf.dcm")
PixelData = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData)
pixel_array = tf.io.decode_raw(PixelData, tf.int16)
pixel_array = tf.reshape(pixel_array, (512,512))
print(pixel_array)

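Edit 3: After getting the hint that the byte string returned by decode_dicom_data contains hexadecimal values, I found a way to convert my data into the desired pixel_array, but I am curious why PixelData is stored this way: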

import tensorflow as tf
import tensorflow_io as tfio
import numpy as np
image_bytes = tf.io.read_file("ID_000012eaf.dcm")
PixelData = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData).numpy()
pixel_array = np.zeros(262144, dtype=np.int16)
start,stop = 0,4
for i in range(262144):
    pixel_array[i] = int(PixelData[start:stop], base=16)
    start+=5
    stop+=5
pixel_array = np.reshape(pixel_array, (512,512))
print(pixel_array)

PixelData from pydicom:

PixelData = b'0\xf80\xf80\xf80...'

PixelData from tensorflow_io:

PixelData = b'f830\\f830\\f830\\...'

Any suggestions on refactoring and linting the code would be highly appreciated. Many thanks to @ai2ys for helping me diagnose these problems.

The function tfio.image.decode_dicom_data decodes tag information, not pixel information.
To read the pixel data, use tfio.image.decode_dicom_image instead.

import tensorflow as tf
import tensorflow_io as tfio

image_bytes = tf.io.read_file(image_path)
pixel_data = tfio.image.decode_dicom_image(
    image_bytes,
    dtype=tf.uint16)

# type conversion and reshaping is not required
# as can be checked with the print statement
print(pixel_data.dtype, pixel_data.shape)

# if required the pixel_data can be converted to a numpy array
# but calculations like scaling and offset correction can 
# be done on tensors as well
pixel_data_nparray = pixel_data.numpy()

# reading tag information, e.g. rescale intercept and slope
intercept = tfio.image.decode_dicom_data(
    image_bytes,
    tfio.image.dicom_tags.RescaleIntercept)
slope = tfio.image.decode_dicom_data(
    image_bytes,
    tfio.image.dicom_tags.RescaleSlope)

print(intercept)
print(slope)
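
Since the goal of the question is preprocessing in Hounsfield units, the scaling and offset correction mentioned in the comments above can be sketched as follows. This is a minimal NumPy illustration, not part of tensorflow_io: the helper names `parse_decimal_tag` and `to_hounsfield` are hypothetical, the byte strings stand in for the tag values returned for RescaleIntercept and RescaleSlope, and it assumes those values arrive as ASCII decimal strings.

```python
import numpy as np


def parse_decimal_tag(raw: bytes) -> float:
    """Parse a decimal-string tag value such as b'-1024 ' into a float."""
    return float(raw.decode("ascii").strip())


def to_hounsfield(pixels: np.ndarray, slope: float, intercept: float) -> np.ndarray:
    """Apply the linear rescale HU = stored_value * slope + intercept."""
    return pixels.astype(np.float32) * slope + intercept


# Hypothetical tag values; real ones would come from the DICOM file.
intercept = parse_decimal_tag(b"-1024")
slope = parse_decimal_tag(b"1")

# Small stand-in for a (512, 512) stored pixel array.
stored = np.array([[0, 1024], [2024, 24]], dtype=np.int16)
print(to_hounsfield(stored, slope, intercept))
```

The same arithmetic works on tensors, so it can stay inside a tf.data pipeline if the tag strings are converted to numbers first.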

See the documentation for more information.

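Edit 2021-02-01, using the shared file:
It is also possible to read the pixel data with tfio.image.decode_dicom_data by passing tfio.image.dicom_tags.PixelData, but the returned byte string then has to be decoded.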

data = tfio.image.decode_dicom_data(image_bytes, tfio.image.dicom_tags.PixelData)
print(data)

Output (shortened):

tf.Tensor(b'f830\\f830\\f830\\f830\\ ...')

The hexadecimal value f830, interpreted as int16, is -2000.
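
One way to decode such a string without an explicit index loop is sketched below. It assumes the format seen in the output above (four hexadecimal digits per pixel, separated by backslashes, with no trailing separator); `decode_hex_pixel_data` is a hypothetical helper, not a tensorflow_io function. The same assumption also accounts for the odd length reported in the question: 262144 values of 4 characters plus 262143 separators is 1310719 bytes, which cannot be split into 2-byte elements.

```python
import numpy as np


def decode_hex_pixel_data(raw: bytes, rows: int, cols: int) -> np.ndarray:
    """Decode a string like b'f830\\f830\\...' into a (rows, cols) int16 array."""
    tokens = raw.split(b"\\")
    # Parse each token as an unsigned 16-bit value first ...
    unsigned = np.array([int(token, 16) for token in tokens], dtype=np.uint16)
    # ... then reinterpret the same bits as signed 16-bit integers.
    return unsigned.view(np.int16).reshape(rows, cols)


# Four pixels in the assumed text format.
sample = b"f830\\0000\\0001\\ffff"
print(decode_hex_pixel_data(sample, 2, 2))

# Length check for a 512 x 512 image in this format.
n_pixels = 512 * 512
print(n_pixels * 4 + (n_pixels - 1))
```

Parsing into uint16 and then viewing as int16 avoids assigning out-of-range Python integers such as 63536 directly into an int16 array.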

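I found the problem: my images use a signed 16-bit integer data type, but the tensorflow_io library has no such option. After converting the array values to signed 16-bit, the problem was solved. I had to decode the data in decode_dicom_image to a larger data type such as float32, cast it back to signed int16 in NumPy, and finally divide by 2 (I do not know why this last step is needed), but I ended up with a pixel_array identical to pydicom's output.
Everything now makes sense except converting the data from the PixelData dicom_tag, which still shows unexplained behavior. I have updated the Python script, which shows the different ways of converting a DICOM image with the different libraries, here for anyone interested.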
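
The signed-versus-unsigned point can be illustrated with the raw pydicom bytes quoted in the question (b'0\xf8' repeated). This is a small NumPy sketch assuming little-endian pixel storage: the same two bytes read as 63536 when treated as unsigned and as -2000 when treated as signed. In DICOM, the Pixel Representation tag (0028,0103) records which interpretation applies (0 for unsigned, 1 for two's complement).

```python
import numpy as np

# Four pixels, each stored as the two bytes 0x30 0xf8 (little-endian 0xf830).
raw = b"0\xf8" * 4

as_unsigned = np.frombuffer(raw, dtype="<u2")
as_signed = np.frombuffer(raw, dtype="<i2")

print(as_unsigned)
print(as_signed)

# Reinterpreting the unsigned array's bits gives the signed values.
print(np.array_equal(as_unsigned.view("<i2"), as_signed))
```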

python: Error when converting a DICOM image to a pixel_array with tensorflow_io
