avr-gcc：二进制搜索与线性搜索

dldeef67 于 2023-10-16 发布在其他

关注(0)|答案(3)|浏览(129)

我在存储在程序内存中的几个长度约为30-200的struct数组中进行搜索，线性搜索如下所示：

Glyph getGlyph(Font font, uint16_t code) {
    for (size_t i = 0; i < font.length; i++) {
        if (pgm_read_word(&font.glyphs[i].code) == code) {
            static Glyph glyph;
            memcpy_P(&glyph, &font.glyphs[i], sizeof (Glyph));
            
            return glyph;
        }
    }
    
    // return question mark if unknown code point
    return getGlyph(font, 0x003f);
}

假设我正确地阅读了反汇编，每次迭代有20条指令，据我所知，其中3条指令相当昂贵：brcc、brne和rjmp。
只是评估这是否可以更有效地完成（功率），我实现了二进制搜索：

Glyph getGlyph(Font font, uint16_t code) {
    
    // https://en.wikipedia.org/wiki/Binary_search_algorithm
    int16_t l = 0;
    int16_t r = font.length - 1;
    
    while (l <= r) {
        uint8_t m = (l + r) / 2;
        uint16_t c = pgm_read_word(&font.glyphs[m].code);
        if (c < code) {
            l = m + 1;
        } else if (c > code) {
            r = m - 1;
        } else {
            // found code point, get and return glyph
            static Glyph glyph;
            memcpy_P(&glyph, &font.glyphs[m], sizeof (Glyph));
            
            return glyph;
        }
    }
    
    // return question mark if unknown code point
    return getGlyph(font, 0x003f);
}

这导致反汇编中每次迭代有35条指令，其中5条我认为相当昂贵：brlt、brcc（2x）和rjmp（2x）。
考虑到数组相当短，我想二分搜索的好处并不是很多，可能会被增加的复杂性所补偿？
它是否值得或是否有意义，在所有试图基准这一点？
程序是用-Os编译的。按照注解中的建议使用-O3编译，看起来指令的数量是一个倍数，但也许我只是没有正确解释反汇编。它运行在ATmega 328 p上，我目前正在用gcc 5.4.0和avr-libc 2.0构建它。我也试过gcc 13和avr-libc 2.1，但我发现反汇编很难阅读。
这只是一个simple thermometer and hygrometer with an e-ink display，它运行良好，我很满意的速度和功耗，虽然我缺乏经验，不能与其他任何东西相比。我真的只是好奇，想学习。

来源：https://stackoverflow.com/questions/77023635/avr-gcc-binary-search-vs-linear-search

3条答案

按热度按时间

8mmmxcuj1#

在这个程序中有很多大的瓶颈，它们不一定与搜索有关。为了解决这些问题，你更需要代码审查：

为什么要通过值传递结构体等？这在8比特上是非常低效的，即使是在小结构的情况下。
当你返回一个函数时，你首先将它从某处（flash）复制到static变量（. xml）中，然后再次将它复制到函数调用堆栈中，最后在调用者中再次复制它。当你可以返回一个指向原始项的const指针时，这是非常低效的，或者如果这些需要在运行时修改，只需复制它 * 一次 * -可能仍然返回一个const指针，让调用者复制数据。

（可能需要使用一些哈佛架构调整来直接访问闪存数据-无论您的编译器使用什么来处理，“PROGMEM”宏或类似的东西。）

避免不必要的大计数器和索引。如果你不打算超过255，那么不要使用uint16_t。而且几乎从不使用size_t，这将是一个32位类型，因此非常慢。
递归函数几乎不应该在任何上下文中使用，当然也不应该在嵌入式系统中使用。这部分必须重写为循环-我甚至不明白你为什么要使用递归。
使用标准库bsearch并不一定比手动推出自己的版本慢得多。使用示波器进行基准测试。
一般来说，二进制搜索是一个非常好的8位算法。我们既不需要担心分支也不需要担心对齐，只需要担心生成的原始指令量。另一方面，您实际上选择了仍在生产中的最慢的CPU之一，所以为什么要担心性能。

总的来说，我认为你会从选择现代MCU中受益，比如Cortex M。在C中编程旧的8位实际上是相当困难的，很容易出错。AVR是20世纪90年代的技术，没有太多的理由继续使用它们。（它们是学习汇编程序的不错选择，但仅此而已。

赞(0）回复(0）举报 2023-10-16

koaltpgm2#

在检查了你的github项目并提供了约束条件（ATmega328p）后，我想我理解了你的问题。
你必须保存你宝贵的2048字节RAM的每一位这ATmega。由于可用内存，将lookuptable[256]放置在RAM中是没有选择的。
将您的字体表转换为：

typedef struct
{
    uint8_t height;    /* Height of all glyphs of this font. */
    Glyph Glyphs[256]; /* Glyphs for all 256 charcodes*/
} newFont; /*sitting in PROGMEM*/

在以这种方式准备好数据之后，最终的getGlyph()变成了一行程序：

void getGlyph(Glyph *pDest, newFont const/*PROGMEM*/ * pFont, uint8_t code)
{
    memcpy_P(pDest, pFont->Glyphs[code], sizeof(*pDest));
}

赞(0）回复(0）举报 2023-10-16

jutyujz03#

如果你追求快速的代码，查找表可能是一个不错的选择。
有两种口味。下面的代码在GNU-C99中，因此__flash限定符被支持。这允许更容易理解的代码，而不是混乱的pgm_read_xxx。如果你坚持使用pgm_read_xxx，你将不得不翻译代码。
此外，代码使用 * 指针 * 代替来回复制Font和Glyph。完整的avr-gcc v5.4可编译代码在答案的末尾。

Flash中的查找表

这是首选的方式，因为它不需要宝贵的RAM。下面的代码假设查找表是用0初始化的，这是静态存储中数据的默认值。因此，当我们读取0的id时，我们必须检查is是否实际上是glyph[0]，或者是一些默认值，如code = '?'的glyph[0]

const __flash Glyph*
getGlyph_lookup_flash (const __flash Font *font, uint8_t code)
{
    uint8_t id = font->lookup_flash[code];
    // 0 is also the default initializer value for data in static storage.
    // Check whether we found glyph[0] or not.
    if (id == 0 && code != font->glyphs[0].code)
        id = font->lookup_flash['?'];

    return & font->glyphs[id];
}

备注：

如果您设法使用?的id初始化.lookup_flash的所有未使用的条目，则不再需要if指令。
如果你只使用一种字体，那么font->glyphs[0].code在编译时是已知的。在答案末尾的示例代码中，它可以替换为'a'。
正如您在注解中所写的，代码适合uint8_t，因此每个字体将花费额外的256字节程序内存。

内存查找表

当程序内存中的查找表由于某种原因不可行时，这是一种替代方法。实际上，我在ATmega 168/328中使用这种方法来处理具有可变字体大小的矢量字体，其中Flash中的查找表对于预先计算来说太痛苦了。当您基本上使用从0x 20到0x 7 f（ASCII）的代码并带有一些异常时，那么您可以手动处理异常，从而将表大小减少到例如0x 60字节。
代码使用相同的查找和不同的字体。如果检测到字体切换，则重置表格。对于未知的条目，使用线性搜索来查找索引及其ID。

#define INVALID_ID 0xff

typedef struct
{
    const __flash Font *pfont;
    uint8_t code_to_id[0x100];
} lookup_t;

static void init_lookup (uint8_t *p)
{
    uint8_t i = 0;
    do
    {
        *p++ = INVALID_ID;
    } while (++i);
}

const __flash Glyph*
getGlyph_lookup_ram (const __flash Font *font, uint8_t code)
{
    static lookup_t lookup_ram;

    // Unknown font: Reset `lookup_ram'.
    if (font != lookup_ram.pfont)
        init_lookup (lookup_ram.code_to_id);

    uint8_t *pid = & lookup_ram.code_to_id[code];

    if (*pid != INVALID_ID)
        // Known id.
        return & font->glyphs[*pid];

    // Unknown id: Do a linear search for glyph with required .code.
    const __flash Glyph *pglyph = & font->glyphs[0];
    for (size_t i = 0; i < font->length; ++i, ++pglyph)
    {
        if (pglyph->code == code)
        {
            *pid = i;
            return pglyph;
        }
    }
    
    // Return question mark if unknown code point.
    pglyph = getGlyph_lookup_ram (font, '?');

    // id for code is unknown. Set it to id of '?' so that the next time
    // we don't have to search all glyphs again.
    *pid = lookup_ram.code_to_id['?'];

    return pglyph;
}

avr-gcc v5.4可替换代码

最后，这里是完整的可编译示例，以供参考，
我添加了Font和Glyph的样板定义。

#include <stdint.h>
#include <stdlib.h>

typedef struct
{
    uint8_t code;
    uint8_t data[4];
} Glyph;

typedef struct
{
    size_t length;
    Glyph glyphs[4];
    uint8_t lookup_flash[0x100];
} Font;

#define ARRAY_SIZE(X) (sizeof (X) / sizeof (*X))

const __flash Font font1 =
{
    .length = ARRAY_SIZE (font1.glyphs),
    .glyphs =
    {
        [0] = { 'a', { 0xaa, 0xaa, 0xaa, 0xaa } },
        [1] = { 'B', { 0xBB, 0xBB, 0xBB, 0xBB } },
        [2] = { '2', { 0x22, 0x22, 0x22, 0x22 } },
        [3] = { '?', { 0x01, 0x23, 0x45, 0x67 } },
    },
    .lookup_flash =
    {
        ['a'] = 0,
        ['B'] = 1,
        ['2'] = 2,
        ['?'] = 3,
    }
};

const __flash Glyph*
getGlyph_lookup_flash (const __flash Font *font, uint8_t code)
{
    uint8_t id = font->lookup_flash[code];
    // 0 is also the default initializer value for data in static storage.
    // Check whether we found glyph[0] or not.
    if (id == 0 && code != font->glyphs[0].code)
        id = font->lookup_flash['?'];

    return & font->glyphs[id];
}

#define INVALID_ID 0xff

typedef struct
{
    const __flash Font *pfont;
    uint8_t code_to_id[0x100];
} lookup_t;

lookup_t lookup_ram;

static void init_lookup (uint8_t *p)
{
    uint8_t i = 0;
    do
    {
        *p++ = INVALID_ID;
    } while (++i);
}

const __flash Glyph*
getGlyph_lookup_ram (const __flash Font *font, uint8_t code)
{
    // Unknown font: Reset `lookup_ram'.
    if (font != lookup_ram.pfont)
    {
        init_lookup (lookup_ram.code_to_id);
        lookup_ram.pfont = font;
    }

    uint8_t *pid = & lookup_ram.code_to_id[code];

    if (*pid != INVALID_ID)
        // Known id.
        return & font->glyphs[*pid];

    // Unknown id: Do a linear search for glyph with required .code.
    const __flash Glyph *pglyph = & font->glyphs[0];
    for (size_t i = 0; i < font->length; ++i, ++pglyph)
    {
        if (pglyph->code == code)
        {
            *pid = i;
            return pglyph;
        }
    }
    
    // Return question mark if unknown code point.
    pglyph = getGlyph_lookup_ram (font, '?');

    // id for code is unknown. Set it to id of '?' so that the next time
    // we don't have to search all glyphs again.
    *pid = lookup_ram.code_to_id['?'];

    return pglyph;
}

int main (void)
{
    const __flash Glyph *pglyph1 = getGlyph_lookup_flash (&font1, '*');
    const __flash Glyph *pglyph2 = getGlyph_lookup_ram (&font1, 'B');

    // If you actually need the glyph in RAM, you can:
    Glyph glyph = *pglyph1;
    glyph = *pglyph2;

    return 0;
}

赞(0）回复(0）举报 2023-10-16

我来回答

avr-gcc：二进制搜索与线性搜索

3条答案

Flash中的查找表

内存查找表

avr-gcc v5.4可替换代码

相关问题

热门标签

最新问答