I want to extract the data from the rating (review) elements of a web page.
The relevant HTML is the following:
<ol data-bv-v="contentItemCollection:2" class="bv-content-list bv-content-list-reviews">
<li data-bv-v="contentItem:9" class="bv-content-item bv-content-top-review bv-content-review bv-content-loaded" itemprop="review" itemscope="" itemtype="http://schema.org/Review" data-content-id="Reviews-158638580">
<div data-bv-v="inlineProfile:13" class="bv-author-profile">
<div class="bv-inline-profile">
<div class="bv-author-avatar">
<div class="bv-author-avatar-nickname">
<div class="bv-content-author-name" role="presentation">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de oceaned03.">
<h3>oceaned03</h3>
</button>
</div>
</div>
</div>
<div class="bv-popup-prosnap-userinfo bv-contains-profile-button">
<div class="bv-content-author-name" role="presentation">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de oceaned03.">
<h3>oceaned03</h3>
</button>
</div>
<div class="bv-author-location"> <span> Clermont Ferrand </span> </div>
<div class="bv-author-userstats">
<ul class="bv-author-userstats-list" role="list">
<li class="bv-author-userstats-reviews"> <span class="bv-author-userstats-data"> Avis : </span> <span class="bv-author-userstats-value">1</span> </li>
<li class="bv-author-userstats-votes"> </li>
</ul>
</div>
<div class="bv-content-author-badges">
<ul class="bv-content-author-badges-list" role="presentation"> </ul>
</div>
<div class="bv-author-userinfo">
<ul role="list">
<li class="bv-author-cdv bv-first ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Sexe </span> <span class="bv-author-userinfo-value">une femme</span>
</li>
<li class="bv-author-cdv ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Age</span> <span class="bv-author-userinfo-value">18-24 ans</span>
</li>
<li class="bv-author-cdv ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Couleur des yeux</span> <span class="bv-author-userinfo-value">Bleus</span>
</li>
<li class="bv-author-cdv bv-last">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Type de peau</span> <span class="bv-author-userinfo-value">Sèche</span>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="bv-content-item-author-profile-offset bv-content-item-author-profile-offset-on">
<div class="bv-content-container">
<div class="bv-content-core ">
<div class="bv-content-header">
<div class="bv-content-data-summary">
<div class="bv-content-badges-container">
<ul class="bv-badge-summary bv-badge-first bv-badge-top-three" role="presentation">
<li class="bv-badge-image bv-badge-content-loyaltyyes--im-a-beauty-insider" role="presentation"> <img src="https://display.ugc.bazaarvoice.com/static/Sephora-FR/main_site/951/3232/fr_FR/images/badgeImages/loyaltyyes--im-a-beauty-insider.png" alt="Carte White" title="Carte White"> </li>
</ul>
</div>
<div class="bv-content-header-meta">
<span class="bv-content-rating bv-rating-ratio" itemprop="reviewRating" itemscope="" itemtype="http://schema.org/Rating">
<meta itemprop="ratingValue" content="5">
<meta itemprop="bestRating" content="5">
<span class="bv-rating-stars-container"> <abbr title="5 sur 5 étoiles." class="bv-rating bv-rating-stars bv-rating-stars-off" aria-hidden="true"> ★★★★★ </abbr> <abbr title="5 sur 5 étoiles." class="bv-rating-max bv-rating-stars bv-rating-stars-on bv-width-from-rating-stats-100" aria-hidden="true"> ★★★★★ </abbr> <span class="bv-off-screen">5 sur 5 étoiles.</span> </span>
</span>
<div class="bv-content-meta-wrapper">
<div class="bv-content-meta" role="presentation">
<div class="bv-content-reference-data bv-content-author-name">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de oceaned03." itemprop="author">
<h3>oceaned03</h3>
</button>
<div class="bv-content-datetime" role="presentation">
<meta itemprop="dateCreated" content="2020-06-24">
<meta itemprop="datePublished" content="2020-06-24">
<span class="bv-content-datetime-dot" aria-hidden="true">·</span> <span class="bv-content-datetime-stamp">il y a 5 mois </span>
</div>
</div>
</div>
</div>
</div>
<div class="bv-content-title-container">
<h3 class="bv-content-title" itemprop="headline"> Satisfaite </h3>
</div>
</div>
</div>
<div class="bv-content-details-offset-off">
<div class="bv-content-summary">
<div class="bv-content-summary-body" itemprop="reviewBody">
<div class="bv-content-summary-body-text">
<p>Très contente de mon achat. Je cherchais ce parfum depuis un temps en magasin et je suis heureuse qu’il soit disponible en ligne il sent tellement bon !! En plus en promo, génial ! <br>Livraison très rapide !</p>
</div>
<div class="bv-content-data">
<div class="bv-content-product-questions"> </div>
<div class="bv-content-tag-dimensions"> </div>
<ul class="bv-content-data-recommend-yes">
<li class="bv-content-data-label-container"> <span class="bv-content-data-icon" aria-hidden="true">✔</span> <span class="bv-content-data-label">Oui</span>, </li>
<li class="bv-content-data-value"> je recommande ce produit. </li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="bv-content-actions-container bv-active-feedback">
<div data-bv-v="feedback:12" class="bv-feedback-container">
<div class="bv-content-feedback-vote bv-content-feedback-vote-active" role="group" aria-label="Utilité du contenu">
<div class="bv-content-feedback-vote-request">
<p>Avez-vous trouvé cet avis utile ?</p>
</div>
<div class="bv-content-feedback-btn-container"> <button type="button" class="bv-content-btn bv-content-btn-feedback-yes bv-focusable" aria-label="1 personne a trouvé cet avis utile. Oui, review de oceaned03 est utile."> <span aria-hidden="true"> Oui · <span class="bv-content-btn-count" aria-hidden="true">1</span> </span> </button> <button type="button" class="bv-content-btn bv-content-btn-feedback-no bv-focusable" aria-label="0 personne a trouvé cet avis inutile. Non, review de oceaned03 n'est pas utile."> <span aria-hidden="true"> Non · <span class="bv-content-btn-count" aria-hidden="true">0</span> </span> </button> </div>
<div class="bv-content-feedback-vote bv-content-feedback-vote-active"> <button type="button" class="bv-content-report-btn bv-focusable" aria-label="Marquer « Satisfaite » de oceaned03 comme inapproprié."> Signalez un contenu inapproprié </button> </div>
</div>
</div>
</div>
<div class="bv-inline-form-container"></div>
<div data-bv-v="secondaryContentList:10" class="bv-secondary-content-list">
<ol data-bv-v="secondaryContentItemCollection:11" class="bv-content-list bv-content-list-clientresponses" role="presentation">
</ol>
</div>
</div>
</li>
<li data-bv-v="contentItem:14" class="bv-content-item bv-content-top-review bv-content-review bv-content-loaded" itemprop="review" itemscope="" itemtype="http://schema.org/Review" data-content-id="Reviews-156726085">
<div data-bv-v="inlineProfile:18" class="bv-author-profile">
<div class="bv-inline-profile">
<div class="bv-author-avatar">
<div class="bv-author-avatar-nickname">
<div class="bv-content-author-name" role="presentation">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de Jo56.">
<h3>Jo56</h3>
</button>
</div>
</div>
</div>
<div class="bv-popup-prosnap-userinfo bv-contains-profile-button">
<div class="bv-content-author-name" role="presentation">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de Jo56.">
<h3>Jo56</h3>
</button>
</div>
<div class="bv-author-location"> <span> Lorient </span> </div>
<div class="bv-author-userstats">
<ul class="bv-author-userstats-list" role="list">
<li class="bv-author-userstats-reviews"> <span class="bv-author-userstats-data"> Avis : </span> <span class="bv-author-userstats-value">3</span> </li>
<li class="bv-author-userstats-votes"> </li>
</ul>
</div>
<div class="bv-content-author-badges">
<ul class="bv-content-author-badges-list" role="presentation"> </ul>
</div>
<div class="bv-author-userinfo">
<ul role="list">
<li class="bv-author-cdv bv-first ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Sexe </span> <span class="bv-author-userinfo-value">une femme</span>
</li>
<li class="bv-author-cdv ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Age</span> <span class="bv-author-userinfo-value">18-24 ans</span>
</li>
<li class="bv-author-cdv ">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Couleur des yeux</span> <span class="bv-author-userinfo-value">Marrons</span>
</li>
<li class="bv-author-cdv bv-last">
<!-- UIA-7763 - removed default display so only translated strings matched by FB will display; can't remove defaultDisplay field entirely due to compilation errors, so used a value of '' --> <span class="bv-author-userinfo-data">Type de peau</span> <span class="bv-author-userinfo-value">Sèche</span>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="bv-content-item-author-profile-offset bv-content-item-author-profile-offset-on">
<div class="bv-content-container">
<div class="bv-content-core ">
<div class="bv-content-header">
<div class="bv-content-data-summary">
<div class="bv-content-badges-container">
<ul class="bv-badge-summary bv-badge-first bv-badge-top-three" role="presentation">
<li class="bv-badge-image bv-badge-content-loyaltyyes--im-a-vib-rouge" role="presentation"> <img src="https://display.ugc.bazaarvoice.com/static/Sephora-FR/main_site/951/3232/fr_FR/images/badgeImages/loyaltyyes--im-a-vib-rouge.png" alt="Carte Gold" title="Carte Gold"> </li>
</ul>
</div>
<div class="bv-content-header-meta">
<span class="bv-content-rating bv-rating-ratio" itemprop="reviewRating" itemscope="" itemtype="http://schema.org/Rating">
<meta itemprop="ratingValue" content="5">
<meta itemprop="bestRating" content="5">
<span class="bv-rating-stars-container"> <abbr title="5 sur 5 étoiles." class="bv-rating bv-rating-stars bv-rating-stars-off" aria-hidden="true"> ★★★★★ </abbr> <abbr title="5 sur 5 étoiles." class="bv-rating-max bv-rating-stars bv-rating-stars-on bv-width-from-rating-stats-100" aria-hidden="true"> ★★★★★ </abbr> <span class="bv-off-screen">5 sur 5 étoiles.</span> </span>
</span>
<div class="bv-content-meta-wrapper">
<div class="bv-content-meta" role="presentation">
<div class="bv-content-reference-data bv-content-author-name">
<button type="button" class="bv-author bv-fullprofile-popup-target bv-focusable" aria-label="Voir le profil de Jo56." itemprop="author">
<h3>Jo56</h3>
</button>
<div class="bv-content-datetime" role="presentation">
<meta itemprop="dateCreated" content="2020-05-22">
<meta itemprop="datePublished" content="2020-05-22">
<span class="bv-content-datetime-dot" aria-hidden="true">·</span> <span class="bv-content-datetime-stamp">il y a 6 mois </span>
</div>
</div>
</div>
</div>
</div>
<div class="bv-content-title-container">
<h3 class="bv-content-title" itemprop="headline"> Excellent </h3>
</div>
</div>
</div>
<div class="bv-content-details-offset-off">
<div class="bv-content-summary">
<div class="bv-content-summary-body" itemprop="reviewBody">
<div class="bv-content-summary-body-text">
<p>J’adore les parfums de cette marque car je trouve qu’ils sont captivant et surtout ils tiennent toute la journée ! Ils ont des odeurs originales et que l’on ne retrouve pas partout ! Je conseil fortement</p>
</div>
<div class="bv-content-data">
<div class="bv-content-product-questions"> </div>
<div class="bv-content-tag-dimensions"> </div>
<ul class="bv-content-data-recommend-yes">
<li class="bv-content-data-label-container"> <span class="bv-content-data-icon" aria-hidden="true">✔</span> <span class="bv-content-data-label">Oui</span>, </li>
<li class="bv-content-data-value"> je recommande ce produit. </li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="bv-content-actions-container bv-active-feedback">
<div data-bv-v="feedback:17" class="bv-feedback-container">
<div class="bv-content-feedback-vote bv-content-feedback-vote-active" role="group" aria-label="Utilité du contenu">
<div class="bv-content-feedback-vote-request">
<p>Avez-vous trouvé cet avis utile ?</p>
</div>
<div class="bv-content-feedback-btn-container"> <button type="button" class="bv-content-btn bv-content-btn-feedback-yes bv-focusable" aria-label="2 personnes ont trouvé cet avis utile. Oui, review de Jo56 est utile."> <span aria-hidden="true"> Oui · <span class="bv-content-btn-count" aria-hidden="true">2</span> </span> </button> <button type="button" class="bv-content-btn bv-content-btn-feedback-no bv-focusable" aria-label="0 personne a trouvé cet avis inutile. Non, review de Jo56 n'est pas utile."> <span aria-hidden="true"> Non · <span class="bv-content-btn-count" aria-hidden="true">0</span> </span> </button> </div>
<div class="bv-content-feedback-vote bv-content-feedback-vote-active"> <button type="button" class="bv-content-report-btn bv-focusable" aria-label="Marquer « Excellent » de Jo56 comme inapproprié."> Signalez un contenu inapproprié </button> </div>
</div>
</div>
</div>
<div class="bv-inline-form-container"></div>
<div data-bv-v="secondaryContentList:15" class="bv-secondary-content-list">
<ol data-bv-v="secondaryContentItemCollection:16" class="bv-content-list bv-content-list-clientresponses" role="presentation">
</ol>
</div>
</div>
</li>
</ol>
For example, I tried the following:
response.css('li.data-content-id').extract()
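But it returns an empty array.
Update
After looking at other elements of the page in the developer tools, the data I am looking for appears to be delivered in a batch.json document.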
3 answers
oxosxuxt1#
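Your code uses a class selector, but your HTML shows that data-content-id is an attribute, not a class. I am not familiar enough with Scrapy to say whether its CSS support covers attribute selection, but you can select on the data-content-id attribute with XPath instead.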
vltsax252#
data-content-id may also carry other values for other kinds of content on the page, so the XPath above may pick up unwanted elements. A CSS selector should work instead.
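Alternatively, if you really want to select on data-content-id, I believe a substring attribute selector is the right tool.
It fetches all elements whose data-content-id attribute contains Reviews-. A substring (wildcard) match is used because the number after Reviews- looks like an ID that differs for every review element.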
xeufq47z3#
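The required data is not in the page's HTML source!
It is loaded by a separate API call and rendered by JavaScript.
To see the raw HTML that Scrapy actually receives, use the View Source option (Ctrl+U in Chrome) rather than the element inspector.
Additional information: scraping dynamic content (docs)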
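If the reviews do come from the batch.json call mentioned in the question, one option is to request that URL directly and parse the JSON instead of the rendered HTML. The sketch below uses only the standard library and a made-up payload: the key names (`BatchedResults`, `Results`, `Rating`, `Title`, `UserNickname`) are assumptions about the response shape and should be checked against the real response in the browser's network tab.

```python
import json

# Hypothetical payload; key names are assumed, not copied from the real batch.json.
raw = """
{"BatchedResults": {"q0": {"Results": [
  {"Id": "158638580", "Rating": 5, "Title": "Satisfaite", "UserNickname": "oceaned03"},
  {"Id": "156726085", "Rating": 5, "Title": "Excellent", "UserNickname": "Jo56"}
]}}}
"""


def extract_reviews(data):
    """Flatten every result list in the payload into simple dicts."""
    reviews = []
    for query in data.get("BatchedResults", {}).values():
        for item in query.get("Results", []):
            reviews.append({
                "author": item.get("UserNickname"),
                "title": item.get("Title"),
                "rating": item.get("Rating"),
            })
    return reviews


reviews = extract_reviews(json.loads(raw))
for review in reviews:
    print(review)
```

In a Scrapy callback the same function could be fed `json.loads(response.text)` once the spider requests the JSON URL rather than the product page.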