Python中的数据构造函数的Haskell等价物?

bcs8qyzn  于 2023-03-28  发布在  Python
关注(0)|答案(5)|浏览(131)

在Haskell中,我可以定义一个二叉树如下:

data Bint a = Leaf a | Branch a (Bint a) (Bint a)

然后我可以对它进行一些操作如下:

height (Leaf a) = 1
height (Branch a l r) = 1 + (max (height l) (height r))

count (Leaf a) = 1
count (Branch a l r) = 1 + (count l) + (count r)

我知道Python在Haskell中没有data的等价物。如果有,请告诉我。
那么,如何在Python中定义二叉树,以及如何在其中实现上述两个函数?

wnrlj8wa

wnrlj8wa1#

我在这里使用的是一个类似于Haskell的函数式编程。从某种意义上说,这不是非常“Python”。特别是,它不是面向对象的。尽管如此,它仍然很有用,也很干净。
数据类型是一个类。具有多个数据构造函数的数据类型是一个具有关于如何构造的额外信息的类。当然,它需要一些数据。使用构造函数来确保所有树都是法律的的:

class BinTree (object):
    def __init__(self, value=None, left=None, right=None):
        if left == None and right == None and value != None:                
            self.isLeaf = True
            self.value = value
        elif left != None and right != None and value == None:
            self.isLeaf = False
            self.value = (left, right)
        else:
            raise ArgumentError("some help message")

这个构造函数调用起来有点不方便,所以有一些智能的构造函数很容易用途:

def leaf(value):
    return BinTree(value=value)

def branch(left, right):
    return BinTree(left=left, right=right)

我们如何得到这些值呢?让我们也做一些帮助程序:

def left(tree):
    if tree.isLeaf:
        raise ArgumentError ("tree is leaf")
    else:
        return tree.value[0]

def right(tree):
    if tree.isLeaf:
        raise ArgumentError ("tree is leaf")
    else:
        return tree.value[1]

def value(tree):
    if not tree.isLeaf:
        raise ArgumentError ("tree is branch")
    else:
        return tree.value

就是这样。你得到了一个纯“代数”数据类型,可以用函数访问:

def count(bin_tree):
    if bin_tree.isLeaf:
        return 1
    else:
        return count(left(bin_tree))+count(right(bin_tree))
8yparm6h

8yparm6h2#

最接近的可能是带有方法的类:

class Leaf:
    def __init__(self, value):
        self.value = value

    def height(self):
        return 1

    def count(self):
        return 1

class Branch:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def height(self):
        return 1 + max(self.left.height(), self.right.height())

    def count(self):
        return 1 + self.left.count() + self.right.count()

虽然这有点习惯,也有自己的优点,但它也缺乏Haskell版本的一些品质。最重要的是,函数定义必须与类沿着定义,在同一个模块中。您可以使用single-dispatch generic functions而不是方法来恢复。结果比方法和Haskell函数都更开放。并且在有益时允许将定义扩展到多个模块。

@singledispatch
def height(_):
    raise NotImplementedError()

@singledispatch
def count(_):
    raise NotImplementedError()

class Leaf:
    def __init__(self, value):
        self.value = value

@height.register(Leaf)
def height_leaf(leaf):
    return 1

@height.register(Leaf)
def count_leaf(leaf):
    return 1    

class Branch:
    def __init__(self, left, right):
        self.left = left
        self.right = right

@height.register(Branch)
def height_branch(b):
    return 1 + max(b.left.height(), b.right.height())

@count.register(Branch)
def count_branch(b):
    return 1 + b.left.count() + b.right.count()
lvmkulzt

lvmkulzt3#

Python不像Haskell那样有“数据构造器”的概念。你可以像大多数其他OOP语言一样创建类。或者你可以总是用内置的函数来表示你的数据类型,只定义处理它们的函数(这是在heapq内置模块中实现堆的方法)。
python和haskell之间的差异是巨大的,所以最好避免在haskell和python的语法/功能之间进行严格的比较,否则你最终会写出非pythonic和低效的python代码。
即使python确实有一些函数式的特性,它也不是一种函数式语言,所以你必须完全改变你的程序的范式来获得可读的,pythonic的和高效的程序。
使用类的可能实现可以是:

class Bint(object):
    def __init__(self, value, left=None, right=None):
        self.a = value
        self.left = left
        self.right = right

    def height(self):
        left_height = self.left.height() if self.left else 0
        right_height = self.right.height() if self.right else 0
        return 1 + max(left_height, right_height)

    def count(self):
        left_count = self.left.count() if self.left else 0
        right_height = self.right.count() if self.right else 0
        return 1 + left_count + right_count

代码可以简化一点,为leftright提供一个“更智能”的默认值:

class Nil(object):
    def height(self):
        return 0
    count = height

nil = Nil()

class Bint(object):
    def __init__(self, value, left=nil, right=nil):
        self.value = value
        self.left = left
        self.right = right
    def height(self):
        return 1 + max(self.left.height(), self.right.height())
    def count(self):
        return 1 + self.left.count() + self.right.count()

注意,这些实现允许节点只有一个子节点。
然而,你并不一定要使用类来定义数据类型。例如,你可以说Bint可以是一个单一元素的列表,根的值,或者是一个有三个元素的列表:值,左孩子和右孩子。
在这种情况下,您可以将函数定义为:

def height(bint):
    if len(bint) == 1:
        return 1
    return 1 + max(height(bint[1]), height(bint[2]))

def count(bint):
    if len(bint) == 1:
    return 1 + count(bint[1]) + count(bint[2])

另一种方法是使用namedtuple s:

from collections import namedtuple

Leaf = namedtuple('Leaf', 'value')
Branch = namedtuple('Branch', 'value left right')

def height(bint):
    if len(bint) == 1: # or isinstance(bint, Leaf)
        return 1
    return 1 + max(height(bint.left), height(bint.right))

def count(bint):
    if len(bint) == 1:  # or isinstance(bint, Leaf)
    return 1 + count(bint.left) + count(bint.right)
smdncfj3

smdncfj34#

五年了,我的答案是。
data Tree a = Leaf a | Branch (Tree a) (Tree a) deriving (Show)转换为Python...

不带类型检查(更像python)

class Tree:
    def __init__(self):
        pass

    def height(self):
        pass

    def count(self):
        pass

class Leaf(Tree):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return("Leaf " + str(self.value))

    def height(self):
        return 1

    def count(self):
        return 1

class Branch(Tree):
    def __init__(self, left, right):
        if isinstance(left, Tree) and isinstance(right, Tree):
            self.left = left
            self.right = right
        else:
            raise ValueError

    def __str__(self):
        return("Branch (" + str(self.left) + " " +
               str(self.right) + ")")

    def height(self):
        return 1 + max(self.left.height(), self.right.height())

    def count(self):
        return 1 + self.left.count() + self.right.count()

#usage : Branch(Leaf(5), Branch(Leaf(3), Leaf(2)))

带类型检查(更像Haskell)

Tree类的heightcount方法

class Tree:
    def __init__(self, tree, type_):
        def typecheck(subtree):
            if isinstance(subtree, Leaf):
                if not isinstance(subtree.value, type_):
                    print(subtree.value)
                    raise ValueError
            elif isinstance(subtree, Branch):
                typecheck(subtree.left)
                typecheck(subtree.right)
            else:
                raise ValueError
        typecheck(tree)
        self.tree = tree
        self.type_ = type_

    def __str__(self):
        return ("Tree " + self.type_.__name__ + "\n" + str(self.tree))

    def height(self):
        if isinstance(self, Leaf):
            return 1
        elif isinstance(self, Branch):
            return 1 + max(self.left.height(), self.right.height())
        else:
            return self.tree.height()

    def count(self):
        if isinstance(self, Leaf):
            return 1
        elif isinstance(self, Branch):
            return 1 + self.left.count() + self.right.count()
        else:
            return self.tree.count()

class Leaf(Tree):
    def __init__(self, value):
            self.value = value

    def __str__(self):
            return("Leaf " + str(self.value))

class Branch(Tree):
    def __init__(self, left, right):
        if isinstance(left, Tree) and isinstance(right, Tree):
            self.left = left
            self.right = right
        else:
            raise ValueError

    def __str__(self):
        return("Branch (" + str(self.left) + " " +
               str(self.right) + ")")

#usage tree1 = Tree(Branch(Leaf(5), Branch(Leaf(3), Leaf(2))), int)
#usage tree1.height() -> 3
#usage tree1.count() -> 5

Leaf和分支类的heightcount方法

class Tree:
    def __init__(self, tree, type_):
        def typecheck(subtree):
            if isinstance(subtree, Leaf):
                if not isinstance(subtree.value, type_):
                    print(subtree.value)
                    raise ValueError
            elif isinstance(subtree, Branch):
                typecheck(subtree.left)
                typecheck(subtree.right)
            else:
                raise ValueError
        typecheck(tree)
        self.tree = tree
        self.type_ = type_

    def __str__(self):
        return ("Tree " + self.type_.__name__ + "\n" + str(self.tree))

    def height(self):
        return self.tree.height()

    def count(self):
        return self.tree.count()

class Leaf(Tree):
    def __init__(self, value):
            self.value = value

    def __str__(self):
            return("Leaf " + str(self.value))

    def height(self):
        return 1

    def count(self):
        return 1

class Branch(Tree):
    def __init__(self, left, right):
        if isinstance(left, Tree) and isinstance(right, Tree):
            self.left = left
            self.right = right
        else:
            raise ValueError

    def __str__(self):
        return("Branch (" + str(self.left) + " " +
               str(self.right) + ")")

    def height(self):
        return 1 + max(self.left.height(), self.right.height())

    def count(self):
        return 1 + self.left.count() + self.right.count()

#usage Tree(Branch(Leaf(5), Branch(Leaf(3), Leaf(2))), int)
#usage tree1.height() -> 3
#usage tree1.count() -> 5

heightcount类外方法(最像haskell)

class Tree:
    def __init__(self, tree, type_):
        def typecheck(subtree):
            if isinstance(subtree, Leaf):
                if not isinstance(subtree.value, type_):
                    print(subtree.value)
                    raise ValueError
            elif isinstance(subtree, Branch):
                typecheck(subtree.left)
                typecheck(subtree.right)
            else:
                raise ValueError
        typecheck(tree)
        self.tree = tree
        self.type_ = type_

    def __str__(self):
        return ("Tree " + self.type_.__name__ + "\n" + str(self.tree))

class Leaf(Tree):
    def __init__(self, value):
            self.value = value

    def __str__(self):
            return("Leaf " + str(self.value))

class Branch(Tree):
    def __init__(self, left, right):
        if isinstance(left, Tree) and isinstance(right, Tree):
            self.left = left
            self.right = right
        else:
            raise ValueError

    def __str__(self):
        return("Branch (" + str(self.left) + " " +
               str(self.right) + ")")

def height(tree):
    if not isinstance(tree, Tree):
        raise ValueError
    if isinstance(tree, Leaf):
        return 1
    elif isinstance(tree, Branch):
        return 1 + max(height(tree.left), height(tree.right))
    else:
        return height(tree.tree)

def count(tree):
    if not isinstance(tree, Tree):
        raise ValueError
    if isinstance(tree, Leaf):
        return 1
    elif isinstance(tree, Branch):
        return 1 + count(tree.left) + count(tree.right)
    else:
        return count(tree.tree)

#usage tree1 = Tree(Branch(Leaf(5), Branch(Leaf(3), Leaf(2))), int)
#usage height(tree1) -> 3
#usage count(tree1) -> 5
pepwfjgg

pepwfjgg5#

使用最现代的Python习惯用法的答案是,您需要:

  • 类型注解
  • Python 3.10+支持结构模式匹配
  • 支持递归类型别名的静态类型检查工具(mypy,pyright)
  • 使用frozen=True定义不变性的数据类
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeAlias, TypeVar

A = TypeVar("A")
L = TypeVar("L")
R = TypeVar("R")

@dataclass(frozen=True)
class Leaf(Generic[A]):
    value: A

@dataclass(frozen=True)
class Branch(Generic[A, L, R]):
    root: A
    left: L
    right: R

Bint: TypeAlias = Leaf[A] | Branch[A, "Bint[A]", "Bint[A]"]

def height(x: Bint[A]) -> int:
    match x:
        case Leaf():
            return 1
        case Branch(_, l, r):
            return 1 + max(height(l), height(r))

def count(x: Bint[A]) -> int:
    match x:
        case Leaf():
            return 1
        case Branch(_, l, r):
            return 1 + count(l) + count(r)

这个答案的优点是:

  • 您可以完全删除运行时类型检查(isinstance),以使代码读取更简单、更高效
  • 当使用union类型而不是Leaf, Branch的超类时,可以进行穷举检查,因此如果match没有警告输入,则程序是正确的。

相关问题