How to determine the memory required by a Keras model?

yyyllmsg · asked 2023-11-19

I am using Keras 2.0.0 and I want to train a deep model with a huge number of parameters on a GPU. With images that are too large, I run out of memory (OOM). With images that are too small, the model's accuracy is worse than it could be. So I want to find the largest input image size that fits on my GPU. Is there any functionality to compute the memory requirement (comparable to model.summary()) given the model and the input data?
Thanks for your help!

laximzn5 #1

I created a complete function based on Fabrício Pereira's answer.

def get_model_memory_usage(batch_size, model):
    """Estimate the memory (in GB) needed to train `model` with `batch_size`."""
    import numpy as np
    try:
        from keras import backend as K
    except ImportError:
        from tensorflow.keras import backend as K

    shapes_mem_count = 0
    internal_model_mem_count = 0
    for l in model.layers:
        layer_type = l.__class__.__name__
        # Recurse into nested models (named 'Functional'/'Sequential' in TF2)
        if layer_type in ('Model', 'Functional', 'Sequential'):
            internal_model_mem_count += get_model_memory_usage(batch_size, l)
        # Count the number of units in this layer's output tensor
        single_layer_mem = 1
        out_shape = l.output_shape
        if type(out_shape) is list:
            out_shape = out_shape[0]
        for s in out_shape:
            if s is None:  # skip the (unknown) batch dimension
                continue
            single_layer_mem *= s
        shapes_mem_count += single_layer_mem

    trainable_count = np.sum([K.count_params(p) for p in model.trainable_weights])
    non_trainable_count = np.sum([K.count_params(p) for p in model.non_trainable_weights])

    # Bytes per number, depending on the backend float type
    number_size = 4.0
    if K.floatx() == 'float16':
        number_size = 2.0
    if K.floatx() == 'float64':
        number_size = 8.0

    total_memory = number_size * (batch_size * shapes_mem_count + trainable_count + non_trainable_count)
    gbytes = np.round(total_memory / (1024.0 ** 3), 3) + internal_model_mem_count
    return gbytes


UPDATE 2019.10.06: Added support for models that contain other models as layers.
UPDATE 2020.07.17: The function now works correctly in TensorFlow v2.
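
For reference, a minimal usage sketch (the small CNN below is a hypothetical example of mine, not part of the original answer):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(64, 3, activation='relu', input_shape=(224, 224, 3)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])

# Estimated training memory in GB for batch size 32
print(get_model_memory_usage(32, model))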

x7yiwoj4 #2

Here is my variant of @ZFTurbo's answer. It offers better handling of nested Keras models and different TensorFlow data types, and it removes the NumPy dependency. I wrote and tested this on TensorFlow 2.3.0; it may not work on earlier versions.

import tensorflow as tf

def keras_model_memory_usage_in_bytes(model, *, batch_size: int):
    """
    Return the estimated memory usage of a given Keras model in bytes.
    This includes the model weights and layers, but excludes the dataset.

    The model shapes are multiplied by the batch size, but the weights are not.

    Args:
        model: A Keras model.
        batch_size: The batch size you intend to run the model with. If you
            have already specified the batch size in the model itself, then
            pass `1` as the argument here.
    Returns:
        An estimate of the Keras model's memory usage in bytes.

    """
    default_dtype = tf.keras.backend.floatx()
    shapes_mem_count = 0
    internal_model_mem_count = 0
    for layer in model.layers:
        if isinstance(layer, tf.keras.Model):
            internal_model_mem_count += keras_model_memory_usage_in_bytes(
                layer, batch_size=batch_size
            )
        single_layer_mem = tf.as_dtype(layer.dtype or default_dtype).size
        out_shape = layer.output_shape
        if isinstance(out_shape, list):
            out_shape = out_shape[0]
        for s in out_shape:
            if s is None:
                continue
            single_layer_mem *= s
        shapes_mem_count += single_layer_mem

    trainable_count = sum(
        [tf.keras.backend.count_params(p) for p in model.trainable_weights]
    )
    non_trainable_count = sum(
        [tf.keras.backend.count_params(p) for p in model.non_trainable_weights]
    )

    total_memory = (
        batch_size * shapes_mem_count
        + internal_model_mem_count
        + trainable_count
        + non_trainable_count
    )
    return total_memory

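A minimal usage sketch (the model is a hypothetical example of mine; note that batch_size is keyword-only):

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# Estimated memory in bytes when running with batch size 64
print(keras_model_memory_usage_in_bytes(model, batch_size=64))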

isr3a4wc #3

Hope this helps...

  • Here is how to determine the number of shape units of your Keras model (var model); each shape unit occupies 4 bytes in memory:

import numpy
shapes_count = int(numpy.sum([numpy.prod(numpy.array([s if isinstance(s, int) else 1 for s in l.output_shape])) for l in model.layers]))
memory = shapes_count * 4

  • And here is how to determine the number of parameters of your Keras model (var model):

import numpy
from keras import backend as K
trainable_count = int(numpy.sum([K.count_params(p) for p in set(model.trainable_weights)]))
non_trainable_count = int(numpy.sum([K.count_params(p) for p in set(model.non_trainable_weights)]))
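
Combining the two snippets into a rough total, assuming 4-byte float32 values and a chosen batch_size (this mirrors the formula from the first answer above; it is my addition, not part of this answer):

batch_size = 32
total_bytes = 4 * (batch_size * shapes_count + trainable_count + non_trainable_count)
print(total_bytes / (1024.0 ** 3), "GB")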

szqfcxe2 #4

Considering that the previous answers do not take into account the memory required for gradients and/or intermediate outputs and/or mixed data types and/or nested models, I decided to give it a try as well. Note that the function returns the estimated memory requirement in bits, that the model must be compiled with fully known input shapes (including batch_size), and that the function does not consider the memory required for internal computations (e.g., neural attention). Microsoft has developed a method that is probably more accurate, but has not released the code.

import tensorflow as tf, warnings

# Define function to calculate one layer's memory requirement
def layer_mem(layer: tf.keras.layers.Layer, prev_layer_mem: int) -> int:
    # Check whether calculations can be performed
    if not hasattr(layer, "output_shape") or (None in layer.output_shape):
        msg = f"Check `model.summary(expand_nested=True)` and recompile model to ensure that {layer.name} has a fully defined `output_shape`, including `batch_size`. Using previous layer's memory requirement."
        warnings.warn(msg)
        return prev_layer_mem
    # Collect sizes: output units, parameters, and one gradient per parameter
    out_size = int(tf.reduce_prod(layer.output_shape))
    params = gradients = int(layer.count_params())
    # Parse the bit width from the dtype name, e.g. 'float32' -> 32
    bits = int(layer.dtype[-2:])
    # Calculate memory requirement
    return (params + gradients + out_size) * bits

# Define recursive function to gather all layers' memory requirements
def model_mem(model: tf.keras.Model) -> int:
    # Make limitations known
    warnings.warn("This function does not take into account the memory required for calculations (e.g., outer products)")
    # Initialize
    total_bits = 0
    prev_layer_mem = 0
    # Loop over layers in model
    for layer in model.layers:
        # In case of nested model...
        if hasattr(layer, "layers"):
            # ... apply recursion
            total_bits += model_mem(layer)
        else:
            # Calculate and add layer's memory requirement
            prev_layer_mem = layer_mem(layer, prev_layer_mem)
            total_bits += prev_layer_mem
    return total_bits

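Because the function requires every layer to have a fully defined output_shape (including the batch size), here is a hypothetical sketch of mine showing how to fix the batch size at model-definition time with tf.keras.Input and then call model_mem:

# Fixing batch_size in the Input layer makes every output_shape fully defined
inputs = tf.keras.Input(shape=(224, 224, 3), batch_size=32)
x = tf.keras.layers.Conv2D(16, 3)(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

bits = model_mem(model)
print(bits / 8 / (1024.0 ** 3), "GB")  # convert bits to gigabytes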

rbpvctlc #5

I believe that using a data generator, whether custom-written or one of Keras' built-in generators, will solve your problem. Memory errors usually appear when all of the loaded data becomes too heavy for the system; a generator instead breaks the dataset into chunks, so you never run out of memory and can train on any system.
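
As a minimal sketch of that idea, here is a hypothetical batch-loading generator built on tf.keras.utils.Sequence (the ChunkedGenerator name and the load_image helper are placeholders of mine, not part of the original answer):

import math
import numpy as np
import tensorflow as tf

class ChunkedGenerator(tf.keras.utils.Sequence):
    """Yields one batch at a time instead of holding the whole dataset in memory."""

    def __init__(self, image_paths, labels, batch_size=32):
        self.image_paths = image_paths
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return math.ceil(len(self.image_paths) / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = start + self.batch_size
        # load_image is a hypothetical helper that reads and resizes a single file
        batch_x = np.array([load_image(p) for p in self.image_paths[start:end]])
        batch_y = np.array(self.labels[start:end])
        return batch_x, batch_y

# Usage: model.fit(ChunkedGenerator(paths, labels, batch_size=32), epochs=10)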
