Whenever I use a CSS selector to find the form on the page, I always get a null element.
final String LOGIN_FORM_URL = "https://student.naviance.com/sbrunswick";
// Fetch the login page (USER_AGENT is defined elsewhere in my code)
Connection.Response loginFormResponse = Jsoup.connect(LOGIN_FORM_URL)
        .method(Connection.Method.GET)
        .userAgent(USER_AGENT)
        .execute();
// first() returns null when the selector matches nothing
FormElement loginForm = (FormElement) loginFormResponse.parse()
        .select("div#main-container > div.components-NewLogin-style-loginFormBody > form")
        .first();
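I have been trying different CSS selectors to get loginForm, but I always get an error saying that it is null.
In case it helps, here is the tutorial I am using to learn web scraping: https://jsoup.programmingpedia.net/en/tutorial/4631/logging-into-websites-with-jsoup
I have tried all sorts of selectors, such as the following: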
div#main-container > div.components-NewLogin-style-loginFormBody > form
#main-container > div.components-NewLogin-style-loginFormBody > form
body > div > div > div > div > div > div > div > form
form.components-NewLogin-style-loginFormWrapper[data-test-id='login_form']
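and more minor variations. However, loginForm still ends up null. Can anyone help me? I have been stuck on this kind of problem for a while.
Here is the code from the website.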
<html lang="en-US">
<head>
<title>Login | Naviance Student</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="apple-touch-icon" href="/apple-icon.png">
<link rel="apple-touch-icon" sizes="76x76" href="/apple-icon-76x76.png">
<link rel="apple-touch-icon" sizes="114x114" href="/apple-icon-114x114.png">
<link rel="apple-touch-icon" sizes="144x144" href="/apple-icon-144x144.png">
<link rel="apple-touch-icon" sizes="152x152" href="/apple-icon-152x152.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-icon-180x180.png">
<link rel="apple-touch-startup-image" href="/apple-icon.png">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-title" content="Naviance Student">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="96x96" href="/favicon-96x96.png">
<link rel="manifest" href="/manifest.json">
<meta http-equiv="Page-Enter" content="RevealTrans(Duration=2.0,Transition=2)">
<meta http-equiv="Page-Exit" content="RevealTrans(Duration=3.0,Transition=12)">
<meta http-equiv="cleartype" content="on">
<meta name="msapplication-config" content="IEconfig.xml">
<meta name="application-name" content="Naviance Student">
<meta name="author" content="Naviance">
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<link href="/style-16726.css" rel="stylesheet">
<link rel="preload" href="/main.e6791.js" as="script">
<link rel="stylesheet" type="text/css" href="/0.style-c7e46.css">
<script type="text/javascript" async src="https://www.gstatic.com/recaptcha/releases/UFwvoDBMjc8LiYc1DKXiAomK/recaptcha__en.js" crossorigin="anonymous" integrity="sha384-K2LYnZEtBUcW6O6eiKyrX5HgXfaBzWmW7BmI0mEp+JFPi3pZyyiJwjMDjI12BtQg"></script>
<script type="text/javascript" async src="https://www.google-analytics.com/plugins/ua/linkid.js"></script>
<script type="text/javascript" async src="//bat.bing.com/bat.js"></script>
<script type="text/javascript" async src="//www.googleadservices.com/pagead/conversion_async.js"></script>
<script type="text/javascript" async src="https://www.google-analytics.com/analytics.js"></script>
<script async src="//www.googletagmanager.com/gtm.js?id=GTM-NPKP2M"></script>
<script charset="utf-8" src="/fc.common.f467c.js"></script>
<link rel="stylesheet" type="text/css" href="/56.style-ed477.css">
<script charset="utf-8" src="/fc.school-lookup.3f1be.js"></script>
<script src="https://googleads.g.doubleclick.net/pagead/viewthroughconversion/949855375/?random=1606140298936&cv=9&fst=1606140298936&num=1&guid=ON&resp=GooglemKTybQhCsO&u_h=1080&u_w=1920&u_ah=1040&u_aw=1920&u_cd=24&u_his=2&u_tz=-300&u_java=false&u_nplug=3&u_nmime=4&gtm=2wgb41&sendb=1&ig=1&frm=0&url=https%3A%2F%2Fstudent.naviance.com%2Fauth%2Ffclookup&ref=https%3A%2F%2Fwww.naviance.com%2F&tiba=Search%20for%20a%20School%20%7C%20Naviance%20Student&hn=www.googleadservices.com&async=1&rfmt=3&fmt=4"></script>
<link rel="stylesheet" type="text/css" href="/43.style-f1a23.css">
<script charset="utf-8" src="/fc.login.f56d4.js"></script>
</head>
<body data-new-gr-c-s-check-loaded="14.984.0" data-gr-ext-installed="">
<script src="/rewritten_config.js?v=1605811315155"></script>
<div id="root">
<div class="components-App-style-app">
<div>
<div style="height: 0px; width: 0px;"></div>
<div>
<div style="padding-bottom: 3rem;">
<a class="components-Header-styles-skipMain" href="#main-container">Skip to main content</a>
<header id="header" class="components-Header-styles-header" role="banner">
<div>
<nav class="components-Header-styles-nav" data-test-id="nav">
<div class="components-Header-styles-navContainer">
<a class="components-Header-styles-logoWrapper" href="/main"><img class="components-Header-styles-emblem" data-test-id="family_connection_emblem" src="/static/naviance-emblem-2c575.svg" alt="Logo" role="presentation"><img class="components-Header-styles-logo nophone" data-test-id="family_connection_header" src="/static/naviance-student-rgb-0a577.svg" alt="Naviance Student" role="img"></a>
</div>
</nav>
</div>
</header>
<div id="main-container" class="components-NewLogin-style-loginFormContainer">
<div class="components-NewLogin-style-loginFormBack">
<a href="/sbrunswick">
<figure class="components-Icon-style-icon">
<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 19 30">
<path fill-rule="evenodd" d="M12.858 29.028c.212.23.556.23.761.008l3.502-3.783a.614.614 0 0 0-.006-.819l-8.812-9.432 8.824-9.621a.597.597 0 0 0-.021-.813l-3.472-3.6a.524.524 0 0 0-.776.008L.342 14.583a.622.622 0 0 0 0 .834l12.516 13.61z"></path>
</svg>
</figure> Back</a>
</div>
<div class="components-NewLogin-style-loginFormBody">
<h3 class="components-NewLogin-style-loginWelcome">Welcome Student!</h3>
<div class="components-NewLogin-style-userTypeImageContainer">
<img src="/static/backpack-cd9ef.svg">
</div>
<p data-test-id="login_to_naviance"><strong>Login to Naviance</strong></p>
<form class="components-NewLogin-style-loginFormWrapper" data-test-id="login_form">
<label class="components-NewLogin-style-loginInputLabel" for="login-username">Email </label>
<input id="login-username" name="username" type="email" class="components-NewLogin-style-loginInput" placeholder="For example navigator@naviance.com" data-test-id="username" value=""><label class="components-NewLogin-style-loginInputLabel" for="login-password">Password</label>
<input id="login-password" name="password" type="password" class="components-NewLogin-style-loginInput" placeholder="Type password" data-test-id="password" value="">
<div class="components-NewLogin-style-loginRememberForget">
<label for="checkbox_7" class="components-Checkbox-styles-label components-Checkbox-styles-light"><input data-test-id="remember_me" aria-label="Select checkbox_7" name="remember" id="checkbox_7" class="components-Checkbox-styles-input" type="checkbox" checked>
<figure class="components-Icon-style-icon components-Checkbox-styles-icon">
<svg width="1792" height="1792" viewbox="0 0 1792 1792" xmlns="http://www.w3.org/2000/svg">
<path d="M1671 566q0 40-28 68l-724 724-136 136q-28 28-68 28t-68-28l-136-136-362-362q-28-28-28-68t28-68l136-136q28-28 68-28t68 28l294 295 656-657q28-28 68-28t68 28l136 136q28 28 28 68z"></path>
</svg>
</figure>
<div class="components-Checkbox-styles-children">
Remember me
</div></label><a href="/sbrunswick/forgot-password">Forgot your password?</a>
</div>
<div>
<div>
<div class="grecaptcha-badge" data-style="bottomright" style="width: 256px; height: 60px; display: block; transition: right 0.3s ease 0s; position: fixed; bottom: 14px; right: -186px; box-shadow: gray 0px 0px 5px; border-radius: 2px; overflow: hidden;">
<div class="grecaptcha-logo">
<iframe src="https://www.google.com/recaptcha/api2/anchor?ar=1&k=6LfAN84UAAAAABfGTP7s2vIfa9lpQWoXg28LcQGV&co=aHR0cHM6Ly9zdHVkZW50Lm5hdmlhbmNlLmNvbTo0NDM.&hl=en&type=image&v=UFwvoDBMjc8LiYc1DKXiAomK&theme=light&size=invisible&badge=bottomright&cb=319m7d6n7h1b" width="256" height="60" role="presentation" name="a-831s6wkibtn5" frameborder="0" scrolling="no" sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox"></iframe>
</div>
<div class="grecaptcha-error"></div><textarea id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response" style="width: 250px; height: 40px; border: 1px solid rgb(193, 193, 193); margin: 10px 25px; padding: 0px; resize: none; display: none;"></textarea>
</div><iframe style="display: none;"></iframe>
</div>
</div><button type="submit" class="components-NewLogin-style-btnNew components-NewLogin-style-loginBtn" disabled>Continue</button>
</form><a class="components-NewLogin-style-additionalHelp" target="_blank" rel="noopener noreferrer" href="https://student.naviance.com/additional-help">Need additional help?</a><a></a>
<p><a></a><a href="/sbrunswick/register">I'm new and need to register!</a></p>
</div>
</div>
</div>
<footer class="components-NewFooter-styles-footer">
<div class="components-NewFooter-styles-schoolInfo">
<div class="components-NewFooter-styles-west">
<div class="components-NewFooter-HobsonsBrand-styles-main">
<div>
<img class="components-NewFooter-HobsonsBrand-styles-hLogo" data-test-id="family_connection_header" src="/static/hobsons_w_tagline-fe51f.svg" alt="Hobsons">
</div>
<div class="components-NewFooter-HobsonsBrand-styles-linksDiv">
<span class="components-NewFooter-HobsonsBrand-styles-links"><a class="components-ClickHOC-styles-medium" href="/privacy-statement">Privacy Policy</a></span><span class="components-NewFooter-HobsonsBrand-styles-linkSeparator nophone"> | </span><span class="components-NewFooter-HobsonsBrand-styles-links"><a class="components-ClickHOC-styles-medium" href="/privacy-statement#ca">Your CA Privacy Rights</a></span>
</div>
<div class="components-NewFooter-HobsonsBrand-styles-copyright">
© 2020 Hobsons. All rights reserved worldwide.
</div>
</div>
</div>
<div class="components-NewFooter-styles-east">
<div class="components-NewFooter-UserInfo-styles-main">
<section class="card components-Card-styles-card components-NewFooter-UserInfo-styles-profileCard">
<div class="">
<div class="components-NewFooter-UserInfo-styles-profileSchool">
<div class="components-NewFooter-UserInfo-styles-schoolAddress">
<span><strong>South Brunswick High School</strong></span>
<div>
PO Box 183 750 Ridge Road
</div>
<div>
Monmouth Junction, NJ 08852-9721
</div>
<div>
<a href="tel:(732) 329-4044">p: (732) 329-4044</a>
</div>
<div>
<a href="http://www.sbschools.org/" target="_blank" rel="nofollow external noopener noreferrer">www.sbschools.org/</a>
</div>
</div>
</div>
</div>
</section>
</div>
</div>
</div>
</footer>
</div>
</div>
</div>
</div>
<script src="/fc.vendors~main.bb74e.js"></script>
<script src="/main.e6791.js" async></script>
<div class="ReactModalPortal"></div>
<div style="width:0px; height:0px; display:none; visibility:hidden;" id="batBeacon454285104070">
<img style="width:0px; height:0px; display:none; visibility:hidden;" id="batBeacon632230391008" width="0" height="0" alt="" src="https://bat.bing.com/action/0?ti=21008698&Ver=2&mid=21bd982f-afff-46e5-91b6-2a20e7b0ea84&sid=df4f0f802d9411eb9747713b1ab291c6&vid=b3661b601ac011ebb52bb79d756baba6&vids=0&pi=1200101525&lg=en-US&sw=1920&sh=1080&sc=24&tl=Search%20for%20a%20School%20%7C%20Naviance%20Student&p=https%3A%2F%2Fstudent.naviance.com%2Fauth%2Ffclookup&r=https%3A%2F%2Fwww.naviance.com%2F&lt=692&evt=pageLoad&msclkid=N&sv=1&rn=96110">
</div>
<script src="https://www.google.com/recaptcha/api.js?onload=onloadcallback&render=explicit" async></script>
<div style="visibility: hidden; position: absolute; width: 100%; top: -10000px; left: 0px; right: 0px; transition: visibility 0s linear 0.3s, opacity 0.3s linear 0s; opacity: 0;">
<div style="width: 100%; height: 100%; position: fixed; top: 0px; left: 0px; z-index: 2000000000; background-color: rgb(255, 255, 255); opacity: 0.5;"></div>
<div style="margin: 0px auto; top: 0px; left: 0px; right: 0px; position: absolute; border: 1px solid rgb(204, 204, 204); z-index: 2000000000; background-color: rgb(255, 255, 255); overflow: hidden;">
<iframe title="recaptcha challenge" src="https://www.google.com/recaptcha/api2/bframe?hl=en&v=UFwvoDBMjc8LiYc1DKXiAomK&k=6LfAN84UAAAAABfGTP7s2vIfa9lpQWoXg28LcQGV&cb=bcjuob9xkiu4" name="c-831s6wkibtn5" frameborder="0" scrolling="no" sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox" style="width: 100%; height: 100%;"></iframe>
</div>
</div>
</body>
</html>
This is the HTML I get when I log the HTML of loginFormResponse.
<html lang="en-US">
<head>
<title>Naviance Student</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="apple-touch-icon" href="/apple-icon.png">
<link rel="apple-touch-icon" sizes="76x76" href="/apple-icon-76x76.png">
<link rel="apple-touch-icon" sizes="114x114" href="/apple-icon-114x114.png">
<link rel="apple-touch-icon" sizes="144x144" href="/apple-icon-144x144.png">
<link rel="apple-touch-icon" sizes="152x152" href="/apple-icon-152x152.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-icon-180x180.png">
<link rel="apple-touch-startup-image" href="/apple-icon.png">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-title" content="Naviance Student">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="96x96" href="/favicon-96x96.png">
<link rel="manifest" href="/manifest.json">
<meta http-equiv="Page-Enter" content="RevealTrans(Duration=2.0,Transition=2)">
<meta http-equiv="Page-Exit" content="RevealTrans(Duration=3.0,Transition=12)">
<meta http-equiv="cleartype" content="on">
<meta name="msapplication-config" content="IEconfig.xml">
<meta name="application-name" content="Naviance Student">
<meta name="author" content="Naviance">
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<link href="/style-16726.css" rel="stylesheet">
<link rel="preload" href="/main.e6791.js" as="script">
</head>
<body>
<script src="/rewritten_config.js?v=1605811315155"></script>
<div id="root"></div>
<script src="/fc.vendors~main.bb74e.js"></script>
<script src="/main.e6791.js" async></script>
</body>
</html>
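One way to read the difference between the two HTML dumps above: the one returned for loginFormResponse has an empty <div id="root"></div> and no <form> element at all, while the one copied from the website has the form nested inside #root. If that reading is right, no selector can match in the fetched document. The markup presumably only appears after the page's scripts run in a browser, and Jsoup does not execute JavaScript. Below is a minimal sketch of that check using only the standard library. The class and method names (FormPresenceCheck, containsFormTag) and the two trimmed strings are hypothetical and only for illustration.

```java
import java.util.regex.Pattern;

class FormPresenceCheck {
    // Matches an opening <form> tag (with or without attributes), case-insensitively.
    private static final Pattern FORM_TAG =
            Pattern.compile("<form[\\s>]", Pattern.CASE_INSENSITIVE);

    /** Returns true if the raw HTML text contains an opening <form> tag. */
    static boolean containsFormTag(String html) {
        return FORM_TAG.matcher(html).find();
    }

    public static void main(String[] args) {
        // Trimmed from the HTML logged for loginFormResponse: #root is empty.
        String fetched = "<body><div id=\"root\"></div>"
                + "<script src=\"/main.e6791.js\" async></script></body>";
        // Trimmed from the HTML copied from the browser: the form sits under #root.
        String rendered = "<div id=\"root\"><form "
                + "class=\"components-NewLogin-style-loginFormWrapper\" "
                + "data-test-id=\"login_form\"></form></div>";

        System.out.println("fetched has form:  " + containsFormTag(fetched));   // false
        System.out.println("rendered has form: " + containsFormTag(rendered));  // true
    }
}
```

Running the same kind of check on loginFormResponse.body() before calling select(...) would show whether the form is in the response at all, or whether the selector is the actual problem.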