Cursor position with multiple matches (officer package in R)

sr4lhrrt  posted on 2023-03-27  in Other

I am trying to use the officer package in R to pinpoint the exact position at which to insert or replace an item.
For example:

cursor_reach(document, "Adverse Events")

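only finds the first occurrence of the phrase "Adverse Events".
So if a body paragraph mentions adverse events, the cursor lands on that paragraph first. What I actually want is to find the heading that contains "Adverse Events", so that I can insert some automatically generated text after that heading.
If the phrase "Adverse Events" appears before the heading, there seems to be no way to move the cursor to just after the heading?
Thanks!
I tried this: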

cursor_reach(document, "Adverse Events")

cursor_reach(document, "Adverse Events")

But this does not work...

h7appiyu (answer 1)

You can set the cursor position directly with the following command:

# assuming your rdocx object is called document and the 
# desired cursor location is 3

document$officer_cursor$which <- 3

Below is a wrapper function for determining the desired cursor position:

library(officer)  # read_docx(), styles_info(), body_add_par()
library(dplyr)
library(xml2)

cursor_reach_list <- function(x, keyword) {
  
  nodes_with_text <- xml_find_all(x$doc_obj$get(), "/w:document/w:body/*|/w:ftr/*|/w:hdr/*")
  if (length(nodes_with_text) < 1) {
    stop("no text found in the document", call. = FALSE)
  }
  text_ <- xml_text(nodes_with_text)
  test_ <- grepl(pattern = keyword, x = text_)
  if (!any(test_)) {
    stop(keyword, " has not been found in the document", 
         call. = FALSE)
  }
  # note: everything above was taken directly from officer's cursor_reach function

  # get the paragraph style associated with each paragraph
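  # (a node such as a table can hold several w:pStyle elements; only the
  # first is used so that there is exactly one value per node)
  style_ <- vapply(nodes_with_text,
                   function(node) {
                     ss <- xml_find_all(node, ".//w:pStyle")
                     if (length(ss) == 0) return("")
                     xml_attr(ss[[1]], "val", default = "")
                   },
                   character(1))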

  # put the results in a table
  result <- data.frame(para = seq_along(text_),
                       keyword.found = test_,
                       style_id = style_) %>%
    left_join(styles_info(x) %>%
                filter(style_type == "paragraph") %>%
                select(style_id, style_name),
              by = "style_id") %>%
    select(-style_id)

  print(result)
}
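Because the matching above is done with grepl(), the keyword is interpreted as a regular expression. If the heading text is known exactly, anchoring the pattern may be enough to skip body paragraphs that merely mention the phrase. A minimal base R sketch, assuming the heading paragraph consists of exactly the text "Adverse Events" (the sample strings are made up for illustration):

```r
# Hypothetical paragraph texts, standing in for xml_text(nodes_with_text)
text_ <- c(
  "A paragraph that mentions Adverse Events in passing.",
  "Some other text.",
  "Adverse Events",
  "Another paragraph."
)

# Unanchored pattern: matches the body paragraph and the heading
loose <- grepl("Adverse Events", text_)

# Anchored pattern: matches only a paragraph whose whole text is the phrase
strict <- grepl("^Adverse Events$", text_)

which(loose)   # 1 3
which(strict)  # 3
```

Since cursor_reach applies the same grepl() test, passing "^Adverse Events$" as its keyword should behave the same way. This does not help when a body paragraph has exactly the same text as the heading, which is where the style lookup becomes necessary.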

Demonstration with a simple document:

# create simple document for testing #####
doc <- read_docx()
doc <- body_add_par(doc, "A paragraph of normal text that contains the keywords Adverse Events, and precedes any heading.")
doc <- body_add_par(doc, "Some other text.")
doc <- body_add_par(doc, "Header Adverse Events", style = "heading 1")
doc <- body_add_par(doc, "Another paragraph after the header, to beef up the document.")
print(doc, "temp_file.docx")
rm(doc)

# load document & use cursor_reach_list to identify desired location #####

doc <- read_docx("temp_file.docx")
cursor_reach_list(doc, "Adverse Events")

# Result:
#  para keyword.found style_name
#1    1          TRUE     Normal
#2    2         FALSE     Normal
#3    3          TRUE  heading 1
#4    4         FALSE     Normal
#5    5         FALSE           

# Both paragraphs 1 & 3 contain the keywords, but para 1 uses the Normal style
# while para 3 uses heading 1, so para 3 is the heading we want.

# move cursor to para 3
doc$officer_cursor$which <- 3 

# insert text after heading
doc <- body_add_par(doc, "additional text in next line", pos = "after")

# save result to different location for ease of verification
print(doc, "temp_file1.docx")

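I am not familiar with your actual use case, so once the appropriate position has been identified, moving the cursor and inserting the new text are both done by hand here. You can automate all of this inside the wrapper function to suit your needs.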
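One possible way to automate the selection step is sketched below. The helper first_heading_match is hypothetical (it is not part of officer), and it assumes heading styles have names starting with "heading", as in the default template used above; other templates may use different names. The table is rebuilt by hand so the snippet runs without a document.

```r
# Hypothetical helper: return the index of the first paragraph that contains
# the keyword and whose style name looks like a heading.
# `result` is a data frame shaped like the output of cursor_reach_list().
first_heading_match <- function(result, style_pattern = "^heading") {
  # grepl() returns FALSE for NA style names, so unstyled nodes are skipped
  is_heading <- grepl(style_pattern, result$style_name, ignore.case = TRUE)
  hits <- result$para[result$keyword.found & is_heading]
  if (length(hits) == 0) {
    stop("keyword not found in any heading paragraph", call. = FALSE)
  }
  hits[1]
}

# Table from the demonstration, rebuilt by hand for illustration
result <- data.frame(
  para = 1:5,
  keyword.found = c(TRUE, FALSE, TRUE, FALSE, FALSE),
  style_name = c("Normal", "Normal", "heading 1", "Normal", NA),
  stringsAsFactors = FALSE
)

pos <- first_heading_match(result)
pos  # 3

# With a real document (not run here):
# doc$officer_cursor$which <- first_heading_match(cursor_reach_list(doc, "Adverse Events"))
# doc <- body_add_par(doc, "additional text in next line", pos = "after")
```

Note that officer_cursor$which is an internal field of the rdocx object, so assigning to it as in the answer may behave differently across officer versions.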
