PHP XPath: get the first paragraph after a heading

sqyvllje, posted on 2023-03-11 in PHP

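I want to add a FAQPage schema to my website.
For that, I need to find every <h2> and <h3> tag that contains a question mark; that heading is the question.
After that, I need the first <p> tag following the heading as the answer.
The final result should look like this: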

{
    "@type": "Question",
    "name": "How long does it take to process a refund?",
    "acceptedAnswer": {
        "@type": "Answer",
        "text": "CONTENT FROM FIRST P-TAG",
        "url": "https://www.example.com/answer#anchor_link"
    }
}
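  • The question's "name" is the text of the <h2>/<h3> tag.
  • The answer's "url" is the permalink plus an anchor link taken from the <h2>/<h3> tag.
    * These two parameters are already solved.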

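Unfortunately, I don't know how to get the first paragraph tag after the heading tag.
I need the content of that first paragraph for the following line: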

"text": "CONTENT FROM FIRST P-TAG",

Here is my current code:

<?php

$content_postid = get_the_ID();
$content_post   = get_post($content_postid);
$content        = $content_post->post_content;
$content        = apply_filters('the_content', $content);
$content        = str_replace(']]>', ']]&gt;', $content);

libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $content);

$xp = new DOMXPath($dom);
$query = "//h2[contains(., '?')] | //h3[contains(., '?')]";

$nodes = $xp->query($query);

$stack = [];

if ($nodes) {

    $faq_count = count($nodes);
    $faq_i = 1;
    
    echo '
    <script type="application/ld+json">
        {
            "@context": "https://schema.org",
            "@type": "FAQPage",
            "mainEntity": [';
    
        foreach($nodes as $node) {
        
            echo '{
                "@type": "Question",
                "name": "'.$node->nodeValue.'",
                "acceptedAnswer": {
                    "@type": "Answer",
                    "text": "CONTENT FROM FIRST P-TAG",
                    "url": "'.get_permalink().'#'.$node->getAttribute('id').'"
                }
            }';
            
            if ($faq_i != $faq_count) :  echo ','; endif; $faq_i++;
        
        }
    
    echo ']}</script>';

}
?>

As you can see, I use this line to find every <h2> and <h3> tag that contains a "?":

$query = "//h2[contains(., '?')] | //h3[contains(., '?')]";

I think I need a second $query to find the paragraph after the heading, but how can I check for the first tag after the heading?
I tried this additional query:

$query2 = "//h2[contains(., '?')]/following-sibling::p[1] | //h3[contains(., '?')]/following-sibling::p[1]";

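But neither following-sibling:: nor following:: works for me; it always shows the paragraph after the last heading.
Do I need to fetch the answer in the first query, so that I know which heading I am at?
Here is an example of $content_post (it is always different):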

<h2>Lorem ipsum dolor sit amet?</h2>

<p>consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim</p>

<p>veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.</p>

<h3>Duis autem vel eum?</h3>

<p>iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.</p>

<h2>Nam liber tempor cum soluta?</h2>

<h3>nobis eleifend option congue nihil</h3>

<p>imperdiet doming id quod mazim placerat facer possim assum. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.</p>

<p>Et wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.</p>

<h3>Duis autem vel?</h3>

<p>eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.</p>

<h4>Nam liber tempor cum soluta nobis</h4>

<p>eleifend option congue nihil imperdiet doming id quod mazim placerat facer possim assum.</p>

Answer #1 (dgtucam1)

Try changing your foreach like this and see if it works.

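foreach ($nodes as $node) {
    // Run the XPath relative to the current heading: its first following <p> sibling.
    $p   = $xp->query('./following-sibling::p[1]', $node)->item(0);
    $ans = $p ? $p->nodeValue : '';
    // json_encode() adds the double quotes and escapes the text for JSON.
    echo '{
            "@type": "Question",
            "name": ' . json_encode($node->nodeValue) . ',
            "acceptedAnswer": {
                "@type": "Answer",
                "text": ' . json_encode($ans) . '
            }
        }';
    // Separator between entries, as in the question's original loop.
    if ($faq_i != $faq_count) { echo ','; } $faq_i++;
}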
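
For reference, the same idea can be sketched as a standalone script that builds the whole FAQPage as a PHP array and lets json_encode() handle quoting and commas. This is only a sketch under a few assumptions: the sample HTML, the id attributes (q1, q2, q3) and the $permalink value are made up for illustration, and in WordPress $content and get_permalink() would take their place.

```php
<?php
// Hypothetical sample content; in WordPress this would be the filtered post content.
$content = <<<HTML
<h2 id="q1">Lorem ipsum dolor sit amet?</h2>
<p>First answer.</p>
<p>Second paragraph.</p>
<h3 id="q2">Duis autem vel eum?</h3>
<p>Second answer.</p>
<h2 id="q3">Nam liber tempor cum soluta?</h2>
<h3>nobis eleifend option congue nihil</h3>
<p>Paragraph under a non-question heading.</p>
HTML;

// Hypothetical base URL standing in for get_permalink().
$permalink = 'https://www.example.com/answer';

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $content);
$xp = new DOMXPath($dom);

$entities = [];
foreach ($xp->query("//h2[contains(., '?')] | //h3[contains(., '?')]") as $heading) {
    // Relative query: the second argument makes the current heading the context node.
    $p = $xp->query('./following-sibling::p[1]', $heading)->item(0);
    if ($p === null) {
        continue; // no paragraph follows this heading, so skip it
    }
    $entities[] = [
        '@type' => 'Question',
        'name'  => trim($heading->textContent),
        'acceptedAnswer' => [
            '@type' => 'Answer',
            'text'  => trim($p->textContent),
            'url'   => $permalink . '#' . $heading->getAttribute('id'),
        ],
    ];
}

$schema = [
    '@context'   => 'https://schema.org',
    '@type'      => 'FAQPage',
    'mainEntity' => $entities,
];

echo '<script type="application/ld+json">'
    . json_encode($schema, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE)
    . '</script>';
```

Note that following-sibling::p[1] takes the next <p> sibling even when another heading sits in between, so in this sample the third question picks up the paragraph that belongs to the non-question <h3>. If only a paragraph directly after the heading should count, ./following-sibling::*[1][self::p] is a stricter alternative; headings without such a paragraph are then skipped by the null check.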
