Scrapy: How do I scrape location data from a Leaflet map?

rks48beu  posted 12 months ago  in  Other

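I want to get the locations (latitude, longitude) of the water level sensor markers shown on this website, but I can't find any HTML tags that contain them.
Any guidance would be greatly appreciated!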

2j4z5cfb1#

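In the browser's network inspector, you can see that the page sends a GET request that looks like an API call: https://app.pub.gov.sg/waterlevel/pages/GetWLInfo.aspx?type=WL&d=2023-08-15T07:00:00.000Z
Calling that endpoint returns the current data for all sensors, so the markers' coordinates come from this response rather than from the HTML. You can fetch and parse the data as follows: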

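from datetime import datetime  # used by parseTimestamp

import pandas as pd
import requests

# Fetch data
endpoint = "https://app.pub.gov.sg/waterlevel/pages/GetWLInfo.aspx"
params = {"type": "WL"}
response = requests.get(endpoint, params=params)
# Decode the delimited text payload; the parsing step reads it from `raw`
raw = response.content.decode("utf-8")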

# Parse data
dataSplit = [sensor.split("$#$") for sensor in raw.split("$#$$@$")]
data = []
for record in dataSplit:
    # Convert data to dict and typecast
    if record not in data and len(record) == 7:
        data.append(
            {
                "sensor-id": record[0],
                "sensor-name": record[1],
                "latitude": float(record[3]),
                "longitude": float(record[2]),
                "water-level": float(record[4]),
                "status": float(record[5]),
                "timestamp": parseTimestamp(record[6]),
            }
        )

# Convert to dataframe
df = pd.DataFrame(data)
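
The splitting step can be tried offline before touching the network. The sketch below uses a made-up payload that only mimics the delimiters assumed by the code above (`$#$` between fields, `$#$$@$` between records); the sensor IDs, names, and values are invented for illustration, and `split_records` is a hypothetical helper name, not part of the original answer.

```python
RECORD_SEP = "$#$$@$"  # separator between sensor records, as used above
FIELD_SEP = "$#$"      # separator between fields within one record


def split_records(raw: str) -> list:
    """Split a raw payload into 7-field records, skipping duplicates and malformed rows."""
    records = []
    for chunk in raw.split(RECORD_SEP):
        fields = chunk.split(FIELD_SEP)
        if len(fields) == 7 and fields not in records:
            records.append(fields)
    return records


# Invented sample data: two distinct sensors, the first one repeated once.
sample = (
    "S01$#$Sensor A$#$103.80$#$1.30$#$0.5$#$0$#$Aug 15 2023  7:05AM" + RECORD_SEP
    + "S02$#$Sensor B$#$103.90$#$1.35$#$1.2$#$1$#$Aug 15 2023  7:05AM" + RECORD_SEP
    + "S01$#$Sensor A$#$103.80$#$1.30$#$0.5$#$0$#$Aug 15 2023  7:05AM" + RECORD_SEP
)

records = split_records(sample)
for fields in records:
    print(fields[0], fields[1], fields[4])
```

The trailing separator produces an empty final chunk, which the length check discards, so only well-formed records reach the type-casting step.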

To parse the timestamp, use the following function (define it before running the parsing loop, which calls it):

def parseTimestamp(timestamp: str):
    # Standardise timestamp
    timestamp = timestamp.split(" ")
    while "" in timestamp:
        timestamp.remove("")
    # Pad day
    if len(timestamp[1]) < 2:
        timestamp[1] = "0" + timestamp[1]
    # Pad time
    if len(timestamp[-1].split(":")[0]) < 2:
        timestamp[-1] = "0" + timestamp[-1]
    timestamp = " ".join(timestamp)
    # Fields were re-joined with single spaces, so the format uses single spaces too
    timestamp = datetime.strptime(timestamp, "%b %d %Y %I:%M%p")
    return timestamp
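
A shorter variant is possible because `datetime.strptime` accepts unpadded values for `%d` and `%I`, so the manual zero-padding may not be needed. The sketch below assumes timestamps look like `Aug 15 2023  7:05AM` (inferred from the format string above, not checked against the live API) and that month names are parsed under an English locale; `parse_timestamp_compact` is a hypothetical name.

```python
from datetime import datetime


def parse_timestamp_compact(timestamp: str) -> datetime:
    """Collapse repeated whitespace, then parse e.g. 'Aug  5 2023  7:05AM'."""
    normalised = " ".join(timestamp.split())
    return datetime.strptime(normalised, "%b %d %Y %I:%M%p")


print(parse_timestamp_compact("Aug  5 2023  7:05AM"))
print(parse_timestamp_compact("Dec 15 2023 12:30PM"))
```

`str.split()` with no argument already drops empty strings, which replaces the `while "" in timestamp` loop.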
