next.js 如何优化超过5万个元素的数组中的搜索？

1yjd4xko 于 2023-01-02 发布在其他

关注(0)|答案(3)|浏览(132)

我有一个字符串数组，大约有50，000个元素。

export const companies = [
  "000014",
  "000016",
  "000017",
  "000019",
  "000020",
  "000021",
  "000023",
  "000025",
  ...
]

这些是公司的名字，我为它们显示某些页面。我做了一个中间件，在其中运行一个循环，遍历这个大数组。

import { NextResponse, type NextRequest } from "next/server";
import { companies } from "./assets/companies";

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;

  // cycle for comparing current URL with companies
  await for (let i = 0; i < companies.length; i++) {
    if (pathname.startsWith(`/${companies[i]}`))
        return NextResponse.redirect(new URL("/", req.url)); // redirect to main page if companies compare with current pathname
  }
}

这需要一些时间，如何优化它呢？曾经有一个想法是将数组划分为块，但这也不是一个很好的选择。

next.js

来源：https://stackoverflow.com/questions/74975586/how-to-optimize-the-search-in-an-array-with-more-than-50-thousand-elements

3条答案

按热度按时间

kiayqfof1#

假设pathname看起来像/000014/abc/xyz，那么你可以完全摆脱数组迭代。

import { NextResponse, type NextRequest } from "next/server";
import { companies } from "./assets/companies";

const companiesSet = new Set(companies);

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  
  // Ideally you could get this from req.params.companyId instead, but whether that exists or not would depend on the routing code, which you haven't shown.
  const companyId = req.pathname.match(/\/([0-9]+)\//)?.[1];

  if (companiesSet.has(companyId)) {
    return NextResponse.redirect(new URL("/", req.url)); // redirect to main page if companies compare with current pathname
  }
}

话虽如此，50，000个元素并不是真的那么大，考虑到所有的事情，代码不应该特别慢，不必要的await和循环内部的字符串构建可能是一个问题。

赞(0）回复(0）举报 2023-01-02

n9vozmp42#

如果由于某种原因，你不能像@Aurast建议的那样只进行Map/对象查找（例如，你需要进行自由文本搜索来找到最匹配的名称），那么一个非常简单的方法是将公司分组，这样你就可以将列表分成更小的块，并且只搜索一个块。

export const companies = {
    "a": ["aardvarks, inc", "AppendeciteCorp", ],
    "b": ["boulevard bricklayers", "bandits and bakers", "brompton fan club"],
}

然后你可以只查第一个字母，以得到一个更小的数组，这将是非常简单的实现。
另一种方法是得出逻辑结论，然后递归地按首字母分组到一个名为trie或prefix tree的数据结构中，这样即使不是完全匹配，也可以非常快速地搜索内容。

赞(0）回复(0）举报 2023-01-02

bt1cpqcv3#

有几种方法可以提高性能

- - 算法：**您可以使用更快的搜索算法，如二进制搜索或哈希表查找。这些算法的时间复杂度较低，可以更有效地搜索数组。

以下是使用Binary Search的示例

二进制搜索算法

数组应排序以使用Binary Search

import { NextResponse, type NextRequest } from "next/server";
import { companies } from "./assets/companies";

function binarySearch(array, target) {
  let left = 0;
  let right = array.length - 1;

  while (left <= right) {
    const middle = Math.floor((left + right) / 2);
    const value = array[middle];

    if (value === target) {
      return middle;
    } else if (value < target) {
      left = middle + 1;
    } else {
      right = middle - 1;
    }
  }

  return -1;
}

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  const target = pathname.substring(1);
  // Use binary search to find company name in array
  const index = binarySearch(companies, target);
  if (index !== -1) {
    return NextResponse.redirect(new URL("/", req.url));
  }
}

哈希表

import { NextResponse, type NextRequest } from "next/server";
import { companies } from "./assets/companies";

const hashTable = {};
for (const company of companies) {
  hashTable[company] = true;
}

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  const target = pathname.substring(1);

  // Use hash table to find company name in array
  if (hashTable[target]) {
    return NextResponse.redirect(new URL("/", req.url));
  }
}

- - 高速缓存**：如果搜索结果不经常更改，则可以使用高速缓存来存储搜索结果，从而避免每次都搜索整个数组。这可以大大减少搜索数组所需的时间，特别是在重复执行相同搜索的情况下。

缓存

import { NextResponse, type NextRequest } from "next/server";
import { companies } from "./assets/companies";

const cache = new Map(); // Create a new cache

function linearSearch(array, target) {
  // Check if search result is in cache
  if (cache.has(target)) {
    return cache.get(target);
  }

  // Search through the array
  for (let i = 0; i < array.length; i++) {
    if (array[i] === target) {
      cache.set(target, i); // Add search result to cache
      return i;
    }
  }

  return -1;
}

export async function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  const target = pathname.substring(1);

  // Use linear search to find company name in array
  const index = linearSearch(companies, target);
  if (index !== -1) {
    return NextResponse.redirect(new URL("/", req.url));
  }
}

或者，您可以使用Redis数据库进行缓存。

时间复杂度

- - Linear search: O(n),**其中n是数组的大小。这意味着搜索时间随数组的大小线性增加。线性搜索是最基本的搜索算法，相对容易实现，但对于大型数组来说可能会很慢。
- - Binary search: O(log n),**其中n是数组的大小。这意味着搜索时间随数组大小的增加而呈对数增加。二进制搜索比线性搜索更有效，但它要求对数组进行排序。
- - Hash table: O(1),**其中n是数组的大小。这意味着搜索时间是恒定的，与数组的大小无关。哈希表对于搜索和插入操作非常有效，但它们会占用大量内存。
- - Cache: O(1),**其中n是高速缓存的大小。这意味着搜索时间是恒定的，不依赖于数组的大小。高速缓存可用于存储开销较大的搜索结果，并避免每次搜索整个数组的需要。高速缓存的大小通常比数组的大小小得多，因此搜索时间通常非常快。但是，高速缓冲存储器可随时间而填满，且可能需要周期性地清除以释放存储器。

赞(0）回复(0）举报 2023-01-02

我来回答

next.js 如何优化超过5万个元素的数组中的搜索？

3条答案

二进制搜索算法

哈希表

缓存

时间复杂度

相关问题

热门标签

最新问答