Scala: a groupBy like Python's itertools.groupby

Posted by ldfqzlk8 on 2023-01-13 in Scala


In Python, I can use itertools.groupby to group consecutive elements that share the same key:

>>> items = [(1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4)]
>>> import itertools
>>> list(key for key,it in itertools.groupby(items, lambda tup: tup[0]))
[1, 2, 3, 1]

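Scala has groupBy too, but it produces a different result: a Map from each key to all the values in the iterable that have that key, rather than consecutive runs with the same key: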

scala> val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
items: List[(Int, Int)] = List((1,2), (1,5), (1,3), (2,9), (3,7), (1,5), (1,4))

scala> items.groupBy {case (key, value) => key}
res0: scala.collection.immutable.Map[Int,List[(Int, Int)]] = Map(2 -> List((2,9)), 1 -> List((1,2), (1,5), (1,3), (1,5), (1,4)), 3 -> List((3,7)))

What is the most eloquent way to get the same behavior as Python's itertools.groupby?


Source: https://stackoverflow.com/questions/24512600/groupby-like-pythons-itertools-groupby

6 Answers


frebpwbc1#

If you just want to discard consecutive duplicates, you can do this:

def unchain[A](items: Seq[A]) = if (items.isEmpty) items else {
  items.head +: (items zip items.drop(1)).collect{ case (l,r) if r != l => r }
}

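That is, compare the list with a copy of itself shifted by one position and keep only the items that differ from their predecessor. If you want to customize what counts as the same item (for example, comparing only by key), it is easy to add a same: (A, A) => Boolean parameter to the method and use !same(l, r).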
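A minimal sketch of that suggestion, assuming the comparison is passed as a second (curried) parameter list. The name unchainBy is made up for illustration and is not part of the original answer:

```scala
// Hypothetical variant of unchain that takes a caller-supplied equivalence test.
def unchainBy[A](items: Seq[A])(same: (A, A) => Boolean): Seq[A] =
  if (items.isEmpty) items
  else items.head +: (items zip items.drop(1)).collect { case (l, r) if !same(l, r) => r }

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
// Compare by key only, then project the keys of the surviving elements.
val keys = unchainBy(items)(_._1 == _._1).map(_._1)
// keys == List(1, 2, 3, 1)
```

Only the first element of each run survives, so this answers the "which keys, in which order" part of the question but not the "which elements belong to each run" part.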
If you want to keep the duplicates, you can use Scala's groupBy to get a very compact (but inefficient) solution:

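def groupSequential[A](items: Seq[A])(same: (A, A) => Boolean): Seq[Seq[A]] = {
  // Running group index: bump it whenever two neighbours differ.
  val ns = (items zip items.drop(1)).
    scanLeft(0){ (n,cc) => if (same(cc._1, cc._2)) n else n+1 }
  // Group by index, restore the original order, and drop the indices again.
  (ns zip items).groupBy(_._1).toSeq.sortBy(_._1).map(_._2.map(_._2))
}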


cnh2zyt32#

Use List.span, like this:

def keyMultiSpan(l: List[(Int,Int)]): List[List[(Int,Int)]] = l match {

  case Nil => List()
  case h :: t =>
    val ms = l.span(_._1 == h._1)
    ms._1 :: keyMultiSpan(ms._2)
}

So, given

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))

we get

keyMultiSpan(items).map { _.head._1 }
res: List(1, 2, 3, 1)

Update

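As @Paul suggested, here is a version with more readable syntax, an implicit class for possibly more concise usage, and type parameters for generality: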

implicit class RichSpan[A,B](val l: List[(A,B)]) extends AnyVal {

  def keyMultiSpan(): List[List[(A,B)]] = l match {

      case Nil => List()
      case h :: t =>
        val (f, r) = l.span(_._1 == h._1)
        f :: r.keyMultiSpan()
  }
}

and then use it as follows:

items.keyMultiSpan().map { _.head._1 }
res: List(1, 2, 3, 1)
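The recursion above is not tail recursive, so a list with very many runs could exhaust the stack. Here is a rough sketch of a stack-safe variant that also accepts an arbitrary key function. The name spanBy and its signature are assumptions for illustration, not something from the original answer:

```scala
import scala.annotation.tailrec

// Hypothetical generalisation of keyMultiSpan: any element type, any key function.
def spanBy[A, K](items: List[A])(key: A => K): List[List[A]] = {
  @tailrec
  def loop(rest: List[A], acc: List[List[A]]): List[List[A]] = rest match {
    case Nil => acc.reverse
    case h :: t =>
      val k = key(h)
      // Take the remainder of the current run from the tail, so every step consumes h.
      val (run, remaining) = t.span(key(_) == k)
      loop(remaining, (h :: run) :: acc)
  }
  loop(items, Nil)
}

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
val runKeys = spanBy(items)(_._1).map(_.head._1)
// runKeys == List(1, 2, 3, 1)
```

Accumulating the runs in reverse and reversing once at the end keeps each step constant time apart from the span itself.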


jtoj6r0c3#

Here is a concise but inefficient solution:

def pythonGroupBy[T, U](items: Seq[T])(f: T => U): List[List[T]] = {
  items.foldLeft(List[List[T]]()) {
    case (Nil, x) => List(List(x))
    case (g :: gs, x) if f(g.head) == f(x) => (x :: g) :: gs
    case (gs, x) => List(x) :: gs
  }.map(_.reverse).reverse
}

Here is a better version that calls f only once per element:

def pythonGroupBy2[T, U](items: Seq[T])(f: T => U): List[List[T]] = {
  if (items.isEmpty)
    Nil // no items means no groups, matching pythonGroupBy above
  else {
    val state = (List(List(items.head)), f(items.head))
    items.tail.foldLeft(state) { (state, x) =>
      val groupByX = f(x)
      state match {
        case (g :: gs, groupBy) if groupBy == groupByX => ((x :: g) :: gs, groupBy)
        case (gs, _) => (List(x) :: gs, groupByX)
      }
    }._1.map(_.reverse).reverse
  }
}

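Both solutions fold over items, building up a list of groups as they go. pythonGroupBy2 also keeps track of the f value of the current group. Because new elements and new groups are prepended, we have to reverse each group and the list of groups at the end to get the right order.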
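Python's itertools.groupby yields (key, group) pairs rather than bare groups. If you want that shape, the same fold can carry the key inside each group. This is only a sketch; the name groupByKeyRuns is hypothetical:

```scala
// Hypothetical helper: one fold, producing (key, group) pairs like itertools.groupby.
def groupByKeyRuns[T, K](items: Seq[T])(f: T => K): List[(K, List[T])] =
  items.foldLeft(List.empty[(K, List[T])]) { (acc, x) =>
    val k = f(x) // f is evaluated exactly once per element
    acc match {
      case (key, group) :: rest if key == k => (key, x :: group) :: rest
      case _                                => (k, List(x)) :: acc
    }
  }.reverse.map { case (key, group) => (key, group.reverse) }

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
val grouped = groupByKeyRuns(items)(_._1)
// grouped.map(_._1) == List(1, 2, 3, 1)
```

Storing the key next to each group removes the need for the separate state tuple used in pythonGroupBy2, and an empty input simply yields an empty list.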


3htmauhk4#

Try this:

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
val res = compress(items.map(_._1))

/** Eliminate consecutive duplicates of list elements **/
def compress[T](l : List[T]) : List[T] = l match {
  case head :: next :: tail if (head == next) => compress(next :: tail)
  case head :: tail => head :: compress(tail)
  case Nil => List()
}

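/** Tail recursive version (an alternative to the definition above; define only one of the two) **/
def compress[T](input: List[T]): List[T] = {
  // last is None until the first element has been seen, so no sentinel value can collide with real data
  @annotation.tailrec
  def comp(remaining: List[T], l: List[T], last: Option[T]): List[T] = {
    remaining match {
      case Nil => l
      case head :: tail if last.contains(head) => comp(tail, l, last)
      case head :: tail => comp(tail, head :: l, Some(head))
    }
  }
  comp(input, Nil, None).reverse
}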

Here compress is the solution to one of the 99 Problems in Scala.


w8ntj3qf5#

Hmm, I couldn't find anything out of the box, but this will do:

def groupz[T](list:List[T]):List[T] = {
      list match {
      case Nil => Nil
      case x::Nil => List(x)
      case x::xs if (x == xs.head) => groupz(xs)
      case x::xs => x::groupz(xs)
      }}

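// now let's add this functionality to the List class
// (an implicit class avoids the reflective call that the anonymous `new { ... }` wrapper needs)
implicit class PythonicGroupList[T](list: List[T]) { def pythonGroup: List[T] = groupz(list) }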

Now you can do:

val items = List((1, 2), (1, 5), (1, 3), (2, 9), (3, 7), (1, 5), (1, 4))
items.map(_._1).pythonGroup
res1: List[Int] = List(1, 2, 3, 1)


6qfn3psc6#

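Here is a simple solution to a problem I ran into at work. Space was not much of a concern in that case, so I did not worry about efficient iterators; it uses an ArrayBuffer to accumulate the results.
(Don't use this for large amounts of data.)

Sequential groupBy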

import scala.collection.mutable.ArrayBuffer

object Main {

  /** Returns consecutive keys and groups from the iterable. */
  def sequentialGroupBy[A, K](items: Seq[A], f: A => K): ArrayBuffer[(K, ArrayBuffer[A])] = {
    val result = ArrayBuffer[(K, ArrayBuffer[A])]()
  
    if (items.nonEmpty) {
      // Iterate, keeping track of when the key changes value.
      var bufKey: K = f(items.head)
      var buf: ArrayBuffer[A] = ArrayBuffer()
  
      for (elem <- items) {
        val key = f(elem)
  
        if (key == bufKey) {
          buf += elem
        } else {
          val group: (K, ArrayBuffer[A]) = (bufKey, buf)
          result += group
          bufKey = key
          buf = ArrayBuffer(elem)
        }
      }
  
      // Append last group.
      val group: (K, ArrayBuffer[A]) = (bufKey, buf)
      result += group
    }
    result
  }
  
  def main(args: Array[String]): Unit = {
    println("\nExample 1:")
    sequentialGroupBy[Int, Int](
      Seq(1, 4, 5, 7, 9, 8, 16), 
      x => x % 2
    ).foreach(println)

    println("\nExample 2:")
    sequentialGroupBy[String, Boolean](
      Seq("pi", "nu", "rho", "alpha", "xi"), 
      x => x.length > 2
    ).foreach(println)
  }
}

Running the code above produces the following output:

Example 1:
(1,ArrayBuffer(1))
(0,ArrayBuffer(4))
(1,ArrayBuffer(5, 7, 9))
(0,ArrayBuffer(8, 16))

Example 2:
(false,ArrayBuffer(pi, nu))
(true,ArrayBuffer(rho, alpha))
(false,ArrayBuffer(xi))
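For larger inputs, a lazy variant might be preferable. The following is a rough sketch, not part of the original answer: the name lazyGroupBy is hypothetical, and it relies only on the standard Iterator.buffered method to peek at the next element. Each run is still materialised as a List, but only one run is held at a time.

```scala
// Hypothetical lazy variant: yields (key, group) pairs one run at a time.
def lazyGroupBy[A, K](items: Iterator[A])(f: A => K): Iterator[(K, List[A])] =
  new Iterator[(K, List[A])] {
    // A BufferedIterator lets us inspect the next element without consuming it.
    private val it = items.buffered

    def hasNext: Boolean = it.hasNext

    def next(): (K, List[A]) = {
      val key = f(it.head) // throws NoSuchElementException if the input is exhausted
      val group = List.newBuilder[A]
      // Consume elements while they belong to the current run.
      // Note: f may be evaluated more than once for the same element.
      while (it.hasNext && f(it.head) == key) group += it.next()
      (key, group.result())
    }
  }

val runs = lazyGroupBy(Iterator(1, 4, 5, 7, 9, 8, 16))(_ % 2).toList
// runs == List((1, List(1)), (0, List(4)), (1, List(5, 7, 9)), (0, List(8, 16)))
```

The result mirrors Example 1 above, with List in place of ArrayBuffer.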
