Does Scrapy allow us to scrape data from a server that uses JavaScript scripts?

6rqinv9w  posted on 2022-11-09  in  Java

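On my university's web page, we retrieve our semester results by entering our name and student ID. I am currently learning web scraping for a project. Do Scrapy or BeautifulSoup offer a way to, for example, retrieve 100 results at once? You can view the page source here: http://app1.helwan.edu.eg/Commerce/HasasnUpMlist.asp
It uses code like the following: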

<html>
<head>
    <meta http-equiv="Content-Language" content="ar-eg">
    <title></title>

<link href="natiga.css" rel="stylesheet" type="text/css" />

<meta http-equiv="Content-Type" content="text/html; charset=windows-1256" />
<meta name="generator" content="Hassan_kandeell@yahoo.com" />
</head>
<body>

<script type="text/javascript">
<!--
var EW_DATE_SEPARATOR; // Default date separator
EW_DATE_SEPARATOR = "/";
if (EW_DATE_SEPARATOR == '') EW_DATE_SEPARATOR = '/';
EW_UPLOAD_ALLOWED_FILE_EXT = "gif,jpg,jpeg,bmp,png,doc,xls,pdf,zip"; // Allowed upload file extension
var EW_FIELD_SEP = ', '; // Default field separator
// Ajax settings
EW_LOOKUP_FILE_NAME = "ewlookup61.asp"; // lookup file name
EW_ADD_OPTION_FILE_NAME = "ewaddopt61.asp"; // add option file name
// Auto suggest settings
var EW_AST_SELECT_LIST_ITEM = 0;
var EW_AST_TEXT_BOX_ID;
var EW_AST_CANCEL_SUBMIT;
var EW_AST_OLD_TEXT_BOX_VALUE = "";
var EW_AST_MAX_NEW_VALUE_LENGTH = 5; // Only get data if value length <= this setting
// Multipage settings
var ew_PageIndex = 0;
var ew_MaxPageIndex = 0;
var ew_MinPageIndex = 0;
var EW_TABLE_CLASSNAME = "ewTable"; // Note: changed the class name as needed
var ew_MultiPageElements = new Array();
//-->
</script>
<script type="text/javascript" src="ew61.js"></script>
<script type="text/javascript" src="userfn61.js"></script>
<script language="JavaScript" type="text/javascript">
<!--
// Write your client script here, no need to add script tags.
// To include another .js script, use:
// ew_ClientScriptInclude("my_javascript.js");
//-->
</script>
<div align="center">
    <table border="0" width="1001" dir="rtl">
        <tr>
            <td width="995" colspan="2">
            <p align="center">
            <img border="0" src="Start.JPG" width="995" height="198"></td>
        </tr>
        <tr>
            <td bgcolor="#AC8601" width="737">
            <p align="center">&nbsp;</td>
            <td bgcolor="#800000" width="254">
            <p align="center"><font size="5" color="#FFFFFF"><b>نتائج كلية 
            التجارة وإدارة الأعمال</b></font></td>
        </tr>
    </table>
</div>

<script type="text/javascript">
<!--
var EW_PAGE_ID = "list"; // Page id
//-->
</script>
<script type="text/javascript">
<!--

function ew_ValidateForm2(fobj) {
    var infix = "";
    for (var i=0;i<fobj.elements.length;i++) {
        var elem = fobj.elements[i];
        if (elem.name.substring(0,2) == "s_" || elem.name.substring(0,3) == "sv_")
            elem.value = "";
    }
    return true;
}
//-->

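I am doing this for educational purposes only. I want to build a project for my colleagues, because the site's traffic is so high that it can take hours to get a single result. Thanks.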

tgabmvqs  #1

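Sure. If all the records are visible on the same page, you can scrape all the results at once using JavaScript, Scrapy, BeautifulSoup, and so on.
If the page shows its results through pagination, you should visit every page and scrape each one accordingly.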
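
As a rough illustration of the two cases above (all rows on one page, or rows spread over several pages), here is a minimal Python sketch. It uses only the standard library's `html.parser` in place of BeautifulSoup or Scrapy so that it runs without extra installs. Several things are assumptions, not facts about the site: that the results appear in a table whose class is `ewTable` (the name comes from `EW_TABLE_CLASSNAME` in the question's source, but the results markup itself is not shown), that pages are numbered from 1, and that a caller-supplied `fetch_page` function (hypothetical) returns the HTML of a given page. Submitting the name and student ID form is not modeled here.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the cell text of every row inside <table class="ewTable">."""

    def __init__(self, table_class="ewTable"):  # class name is an assumption
        super().__init__()
        self.table_class = table_class
        self.depth = 0      # > 0 while inside the target table
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            classes = (dict(attrs).get("class") or "").split()
            # Enter the target table, or track a table nested inside it.
            if self.depth or self.table_class in classes:
                self.depth += 1
        elif self.depth and tag == "tr":
            self._row = []
        elif self.depth and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "table" and self.depth:
            self.depth -= 1
        elif tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def parse_rows(html_text):
    """Return all rows of the target table found in one page of HTML."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


def scrape_all_pages(fetch_page, max_pages=100):
    """Call fetch_page(1), fetch_page(2), ... until a page yields no rows.

    fetch_page is a hypothetical callable supplied by the caller; it takes
    a page number and returns that page's HTML as a string.
    """
    results = []
    for page in range(1, max_pages + 1):
        rows = parse_rows(fetch_page(page))
        if not rows:
            break
        results.extend(rows)
    return results


if __name__ == "__main__":
    # Made-up sample pages, only to show the call pattern.
    fake_pages = {
        1: '<table class="ewTable"><tr><td>1001</td><td>Pass</td></tr></table>',
        2: '<table class="ewTable"><tr><td>1002</td><td>Fail</td></tr></table>',
    }
    print(scrape_all_pages(lambda n: fake_pages.get(n, "")))
```

In a real run, `fetch_page` would download each page and decode it with the charset the page declares (`windows-1256` in the question's source). Given the heavy traffic the question mentions, pausing between requests would be considerate.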
Hope this helps.
