ruby 如何使用二进制搜索找到数组中的插入点?

dtcbnfnu  于 2022-12-22  发布在  Ruby
关注(0)|答案(3)|浏览(125)

在数组中进行二分查找的基本思想很简单,但是如果查找没有找到确切的项,它可能会返回一个“近似”索引(有时候我们可能会返回一个值大于或小于查找值的索引)。
为了寻找准确的插入点,似乎在我们得到近似位置之后,我们可能需要向左或向右“扫描”以寻找准确的插入位置,因此,比方说,在Ruby中,我们可以执行arr.insert(exact_index, value)
我有下面的解决方案,但是begin_index >= end_index时的部分处理有点乱,不知道能不能用一个更优雅的解决方案?
(this如果找到了精确匹配,解决方案并不关心扫描多个匹配项,因此为精确匹配返回的索引可能指向与该值对应的任何索引......但我认为,如果它们都是整数,我们总是可以在知道找到了精确匹配项之后搜索a - 1来找到左边界,或者搜索a + 1来找到右边界。)
我的解决方案:

DEBUGGING = true

def binary_search_helper(arr, a, begin_index, end_index)
  middle_index = (begin_index + end_index) / 2
  puts "a = #{a}, arr[middle_index] = #{arr[middle_index]}, " +
           "begin_index = #{begin_index}, end_index = #{end_index}, " +
           "middle_index = #{middle_index}" if DEBUGGING
  if arr[middle_index] == a
    return middle_index
  elsif begin_index >= end_index
    index = [begin_index, end_index].min
    return index if a < arr[index] && index >= 0  #careful because -1 means end of array
    index = [begin_index, end_index].max
    return index if a < arr[index] && index >= 0
    return index + 1
  elsif a > arr[middle_index]
    return binary_search_helper(arr, a, middle_index + 1, end_index)
  else
    return binary_search_helper(arr, a, begin_index, middle_index - 1)
  end
end

# for [1,3,5,7,9], searching for 6 will return index for 7 for insertion
# if exact match is found, then return that index
def binary_search(arr, a)
  puts "\nSearching for #{a} in #{arr}" if DEBUGGING
  return 0 if arr.empty?
  result = binary_search_helper(arr, a, 0, arr.length - 1)
  puts "the result is #{result}, the index for value #{arr[result].inspect}" if DEBUGGING
  return result
end

arr = [1,3,5,7,9]
b = 6
arr.insert(binary_search(arr, b), b)
p arr

arr = [1,3,5,7,9,11]
b = 6
arr.insert(binary_search(arr, b), b)
p arr

arr = [1,3,5,7,9]
b = 60
arr.insert(binary_search(arr, b), b)
p arr

arr = [1,3,5,7,9,11]
b = 60
arr.insert(binary_search(arr, b), b)
p arr

arr = [1,3,5,7,9]
b = -60
arr.insert(binary_search(arr, b), b)
p arr

arr = [1,3,5,7,9,11]
b = -60
arr.insert(binary_search(arr, b), b)
p arr

arr = [1]
b = -60
arr.insert(binary_search(arr, b), b)
p arr

arr = [1]
b = 60
arr.insert(binary_search(arr, b), b)
p arr

arr = []
b = 60
arr.insert(binary_search(arr, b), b)
p arr

结果:

Searching for 6 in [1, 3, 5, 7, 9]
a = 6, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = 6, arr[middle_index] = 7, begin_index = 3, end_index = 4, middle_index = 3
a = 6, arr[middle_index] = 5, begin_index = 3, end_index = 2, middle_index = 2
the result is 3, the index for value 7
[1, 3, 5, 6, 7, 9]

Searching for 6 in [1, 3, 5, 7, 9, 11]
a = 6, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = 6, arr[middle_index] = 9, begin_index = 3, end_index = 5, middle_index = 4
a = 6, arr[middle_index] = 7, begin_index = 3, end_index = 3, middle_index = 3
the result is 3, the index for value 7
[1, 3, 5, 6, 7, 9, 11]

Searching for 60 in [1, 3, 5, 7, 9]
a = 60, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = 60, arr[middle_index] = 7, begin_index = 3, end_index = 4, middle_index = 3
a = 60, arr[middle_index] = 9, begin_index = 4, end_index = 4, middle_index = 4
the result is 5, the index for value nil
[1, 3, 5, 7, 9, 60]

Searching for 60 in [1, 3, 5, 7, 9, 11]
a = 60, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = 60, arr[middle_index] = 9, begin_index = 3, end_index = 5, middle_index = 4
a = 60, arr[middle_index] = 11, begin_index = 5, end_index = 5, middle_index = 5
the result is 6, the index for value nil
[1, 3, 5, 7, 9, 11, 60]

Searching for -60 in [1, 3, 5, 7, 9]
a = -60, arr[middle_index] = 5, begin_index = 0, end_index = 4, middle_index = 2
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 1, middle_index = 0
a = -60, arr[middle_index] = 9, begin_index = 0, end_index = -1, middle_index = -1
the result is 0, the index for value 1
[-60, 1, 3, 5, 7, 9]

Searching for -60 in [1, 3, 5, 7, 9, 11]
a = -60, arr[middle_index] = 5, begin_index = 0, end_index = 5, middle_index = 2
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 1, middle_index = 0
a = -60, arr[middle_index] = 11, begin_index = 0, end_index = -1, middle_index = -1
the result is 0, the index for value 1
[-60, 1, 3, 5, 7, 9, 11]

Searching for -60 in [1]
a = -60, arr[middle_index] = 1, begin_index = 0, end_index = 0, middle_index = 0
the result is 0, the index for value 1
[-60, 1]

Searching for 60 in [1]
a = 60, arr[middle_index] = 1, begin_index = 0, end_index = 0, middle_index = 0
the result is 1, the index for value nil
[1, 60]

Searching for 60 in []
[60]
x759pob2

x759pob21#

这是Oracle Java中包含的Java的java.util.Arrays.binarySearch中的代码:

/**
     * Searches the specified array of ints for the specified value using the
     * binary search algorithm.  The array must be sorted (as
     * by the {@link #sort(int[])} method) prior to making this call.  If it
     * is not sorted, the results are undefined.  If the array contains
     * multiple elements with the specified value, there is no guarantee which
     * one will be found.
     *
     * @param a the array to be searched
     * @param key the value to be searched for
     * @return index of the search key, if it is contained in the array;
     *         otherwise, <tt>(-(<i>insertion point</i>) - 1)</tt>.  The
     *         <i>insertion point</i> is defined as the point at which the
     *         key would be inserted into the array: the index of the first
     *         element greater than the key, or <tt>a.length</tt> if all
     *         elements in the array are less than the specified key.  Note
     *         that this guarantees that the return value will be &gt;= 0 if
     *         and only if the key is found.
     */
    public static int binarySearch(int[] a, int key) {
        return binarySearch0(a, 0, a.length, key);
    }

    // Like public version, but without range checks.
    private static int binarySearch0(int[] a, int fromIndex, int toIndex,
                                     int key) {
        int low = fromIndex;
        int high = toIndex - 1;

        while (low <= high) {
            int mid = (low + high) >>> 1;
            int midVal = a[mid];

            if (midVal < key)
                low = mid + 1;
            else if (midVal > key)
                high = mid - 1;
            else
                return mid; // key found
        }
        return -(low + 1);  // key not found.
    }

该算法已被证明是适当的,我喜欢这样的事实,即您可以立即从结果中知道它是一个精确匹配还是插入点的提示。
下面是我将其翻译成ruby的方法:

# Inserts the specified value into the specified array using the binary
# search algorithm. The array must be sorted prior to making this call.
# If it is not sorted, the results are undefined.  If the array contains
# multiple elements with the specified value, there is no guarantee
# which one will be found.
#
# @param [Array] array the ordered array into which value should be inserted
# @param [Object] value the value to insert
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Array] the resulting array
def self.insert(array, value, from_index=0,  to_index=array.length)
  array.insert insertion_point(array, value, from_index, to_index), value
end

# Searches the specified array for an insertion point ot the specified value
# using the binary search algorithm.  The array must be sorted prior to making
# this call. If it is not sorted, the results are undefined.  If the array
# contains multiple elements with the specified value, there is no guarantee
# which one will be found.
#
# @param [Array] array the ordered array into which value should be inserted
# @param [Object] value the value to insert
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] the position where value should be inserted
def self.insertion_point(array, value, from_index=0,  to_index=array.length)
  raise(ArgumentError, 'Invalid Range') if from_index < 0 || from_index > array.length || from_index > to_index || to_index > array.length
  binary_search = _binary_search(array, value, from_index, to_index)
  if binary_search < 0
    -(binary_search + 1)
  else
    binary_search
  end
end

# Searches the specified array for the specified value using the binary
# search algorithm.  The array must be sorted prior to making this call.
# If it is not sorted, the results are undefined.  If the array contains
# multiple elements with the specified value, there is no guarantee which
# one will be found.
#
# @param [Array] array the ordered array in which the value should be searched
# @param [Object] value the value to search for
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] if > 0 position of value, otherwise -(insertion_point + 1)
def self.binary_search(array, value, from_index=0,  to_index=array.length)
  raise(ArgumentError, 'Invalid Range') if from_index < 0 || from_index > array.length || from_index > to_index || to_index > array.length
  _binary_search(array, value, from_index, to_index)
end

private
# Like binary_search, but without range checks.
#
# @param [Array] array the ordered array in which the value should be searched
# @param [Object] value the value to search for
# @param [Fixnum|Bignum] from_index ordered sub-array starts at
# @param [Fixnum|Bignum] to_index ordered sub-array ends the field before
# @return [Fixnum|Bignum] if > 0 position of value, otherwise -(insertion_point + 1)
def self._binary_search(array, value, from_index, to_index)
  low = from_index
  high = to_index - 1

  while low <= high do
    mid = (low + high) / 2
    mid_val = array[mid]

    if mid_val < value
      low = mid + 1
    elsif mid_val > value
      high = mid - 1
    else
      return mid # value found
    end
  end
  -(low + 1) # value not found.
end

代码返回的值与OP为其测试数据提供的值相同。

mdfafbf1

mdfafbf12#

更新2020

其实二分法的插入问题已经有了很好的研究,有左插入点和右插入点,代码可以在WikipediaRosetta Code上找到,比如要找到左插入点,代码是:

BinarySearch_Left(A[0..N-1], value) {
      low = 0
      high = N - 1
      while (low <= high) {
          // invariants: value > A[i] for all i < low
                         value <= A[i] for all i > high
          mid = (low + high) / 2
          if (A[mid] >= value)
              high = mid - 1
          else
              low = mid + 1
      }
      return low
  }

需要注意的是the overflow bug,所以mid确实应该被发现为low + floor((high - low) / 2)

之前的答案:

实际上,使用begin_index > end_index可以更好地处理它,而不是检查begin_index >= end_index,而且解决方案要干净得多:

def binary_search_helper(arr, a, begin_index, end_index)    
  if begin_index > end_index
    return begin_index
  else
    middle_index = (begin_index + end_index) / 2
    if arr[middle_index] == a
      return middle_index
    elsif a > arr[middle_index]
      return binary_search_helper(arr, a, middle_index + 1, end_index)
    else
      return binary_search_helper(arr, a, begin_index, middle_index - 1)
    end
  end
end

# for [1,3,5,7,9], searching for 6 will return index for 7 for insertion
# if exact match is found, then return that index
def binary_search(arr, a)
  return binary_search_helper(arr, a, 0, arr.length - 1)
end

而且使用迭代代替递归可能更快,也更少担心堆栈溢出。

h7appiyu

h7appiyu3#

左侧插入样品:

def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return left

右侧插入样品:

def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        if arr[mid] <= target:
            left = mid + 1
        else:
            right = mid - 1

return left # add - 1 for right most index of target

样品存在:

def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) >> 1
        if n < target:
            left = mid + 1
        elif n > target:
            right = mid - 1
        else:
            return True # or return mid for index

return False # or return -1 for not found

样本测试用例:arr = [1、2、3、4、5、5、5、5、5、6、7、8、9、10]
结果=二进制搜索(arr,5)

相关问题