mysql的python持久数据库连接

cgh8pdjw  于 2021-06-24  发布在  Mysql
关注(0)|答案(1)|浏览(328)

我正在用我的一个项目的刮痧。从spider中获取数据并将其传递到管道以插入数据库。这是我的数据库类代码:

import MySQLdb

class Database:

    host = 'localhost'
    user = 'root'
    password = 'test123'
    db = 'scraping_db'

    def __init__(self):
        self.connection = MySQLdb.connect(self.host, self.user, self.password, self.db,use_unicode=True, charset="utf8")
        self.cursor = self.connection.cursor()

    def insert(self, query,params):
        try:
            self.cursor.execute(query,params)
            self.connection.commit()
        except Exception as ex:
            self.connection.rollback()

    def __del__(self):
        self.connection.close()

下面是我的管道代码,用于处理刮下的项目并保存到mysql数据库中。

from con import Database 

class LinkPipeline(object):

    def __init__(self):
        self.db=Database()

    def process_item(self, item, spider):
        query="""INSERT INTO links (title, location,company_name,posted_date,status,company_id,scraped_link,content,detail_link,job_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s,%s)"""
        params=(item['title'], item['location'], item['company_name'], item['posted_date'], item['status'], item['company_id'], item['scraped_link'], item['content'], item['detail_link'],item['job_id'])
        self.db.insert(query,params)
        return item

从上面的流程中,我感觉到无论何时通过管道处理项目,当流程项目完成时,都会打开和关闭数据库连接。这将打开太多的数据库连接。我想要一种方法,在spider的整个生命周期中,我的数据库连接只打开一次,在spider关闭时关闭。
我读到在spider类中有open\u spider和close\u spider方法,如果我使用它们,那么如何将对数据库连接的引用从spider的start\u requests方法传递到pipeline类?
有没有更好的办法?

cld4siwp

cld4siwp1#

class MySpider(scrapy.Spider):
    name = "myspidername"

    host = 'localhost'
    user = 'root'
    password = 'test123'
    db = 'scraping_db'

    def __init__(self):
        self.connection = MySQLdb.connect(self.host, self.user, self.password, self.db,use_unicode=True, charset="utf8")
        self.cursor = self.connection.cursor()

    def insert(self, query,params):
        try:
            self.cursor.execute(query,params)
            self.connection.commit()
        except Exception as ex:
            self.connection.rollback()

    def __del__(self):
        self.connection.close()

然后在你的管道里这样做 spider.cursor 访问 cursor 并执行任何mysql操作。

class LinkPipeline(object):

    def process_item(self, item, spider):
        query="""INSERT INTO links (title, location,company_name,posted_date,status,company_id,scraped_link,content,detail_link,job_id) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s,%s)"""
        params=(item['title'], item['location'], item['company_name'], item['posted_date'], item['status'], item['company_id'], item['scraped_link'], item['content'], item['detail_link'],item['job_id'])
        spider.cursor.insert(query,params)
        return item

相关问题