ASP.NET: converting HTML to Word

Posted by myzjeezk on 2023-03-13 in .NET

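I am using docx.dll to convert HTML to Word, but the HTML tags are not converted. For example, the HTML is a p element containing:
<p><em><strong>adfa à asdf asdf</strong></em></p>
After the conversion, the Word file contains the same literal markup:
<p><em><strong>adfa à asdf asdf</strong></em></p>
My code is as follows: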

[System.Web.Services.WebMethod]
    public static void eprt_tml(int ptcn_id)
    {
        // Look up the testimonial, the name of its course, and the contact it belongs to.
        DataTable tml_tbl = DBclass.TruyVanTraVeTable1("select course_id, vname, vcontent from testimonial where contact_id='" + ptcn_id + "'");
        string course_name = DBclass.TruyVanTraVeGiaTri("select vname from course where id='" + tml_tbl.Rows[0]["course_id"] + "'");
        DataTable ptcn_tbl = DBclass.TruyVanTraVeTable1("select first_name, last_name, salutation_id, title from contact where id='" + ptcn_id + "'");
        string ptcn_name = ptcn_tbl.Rows[0]["last_name"].ToString() + " " + ptcn_tbl.Rows[0]["first_name"].ToString();

        try
        {
            // Load the template once and fill in its custom properties.
            DocX g_document = crt_from_tpl(DocX.Load(@"D:\Project\CRM1\tml\tml_tpt.docx"), course_name, tml_tbl.Rows[0]["vname"].ToString(), tml_tbl.Rows[0]["vcontent"].ToString(), ptcn_name, ptcn_tbl.Rows[0]["title"].ToString());

            // Save under a different name so the template itself is not overwritten.
            g_document.SaveAs(@"D:\Project\CRM1\tml\Invoice_The_Happy_Builder.docx");
        }
        catch (FileNotFoundException)
        {
            // The template 'tml_tpt.docx' was not found, so no document is produced.
        }
    }

    // Create the testimonial document from the template by setting its custom properties.
    // This is a private helper, so it does not need the WebMethod attribute.
    private static DocX crt_from_tpl(DocX template, string course_name, string vname, string vcontent, string ptcn_name, string ptcn_title)
    {
        template.AddCustomProperty(new CustomProperty("static_title", "Ứng Dụng Thực Tiễn Thành Công"));
        template.AddCustomProperty(new CustomProperty("tmlname", vname));
        template.AddCustomProperty(new CustomProperty("tmlcontent", vcontent));
        template.AddCustomProperty(new CustomProperty("ptcnname", ptcn_name));
        template.AddCustomProperty(new CustomProperty("ptcntitle", ptcn_title));
        template.AddCustomProperty(new CustomProperty("coursename", course_name));
        return template;
    }

How can I fix this?

ffx8fchx #1

Add these two lines to your code before saving the document.

g_document.ReplaceText(@"<p><em><strong>","");
g_document.ReplaceText(@"</strong></em></p>","");
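The two calls above only cover that one exact tag sequence. A more general variant, sketched here under the assumption that the HTML is cleaned before it is passed to AddCustomProperty, strips anything that looks like a tag and decodes entities. HtmlText and StripTags are hypothetical names, and the regular expression is a rough heuristic, not a full HTML parser.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Matches anything shaped like an HTML tag, e.g. <p>, </strong>, <br />.
    private static readonly Regex TagPattern = new Regex("<[^>]+>", RegexOptions.Compiled);

    // Returns the text content of a small HTML fragment with tags removed
    // and entities such as &amp; decoded. Hypothetical helper for illustration.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        string withoutTags = TagPattern.Replace(html, string.Empty);
        return WebUtility.HtmlDecode(withoutTags).Trim();
    }
}
```

It could then be called as HtmlText.StripTags(vcontent) when building the "tmlcontent" property, so the original ReplaceText calls are no longer needed.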

Note: this only removes the HTML tags; it does not apply any formatting.
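If the bold and italic formatting should survive, one possible approach is to split the fragment into runs of text and record which inline tags were open for each run. The sketch below handles only strong/b and em/i and ignores every other tag. TextRun and InlineHtmlParser are hypothetical names, not part of docx.dll.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// One piece of text together with the formatting that was active around it.
public sealed class TextRun
{
    public string Text { get; set; }
    public bool Bold { get; set; }
    public bool Italic { get; set; }
}

public static class InlineHtmlParser
{
    // Either a tag (optionally a closing one) or a stretch of text between tags.
    private static readonly Regex Token = new Regex(
        @"<(?<close>/?)(?<name>[a-zA-Z0-9]+)[^>]*>|(?<text>[^<]+)",
        RegexOptions.Compiled);

    public static List<TextRun> Parse(string html)
    {
        var runs = new List<TextRun>();
        int bold = 0;   // nesting depth of <strong>/<b>
        int italic = 0; // nesting depth of <em>/<i>

        foreach (Match m in Token.Matches(html ?? string.Empty))
        {
            if (m.Groups["text"].Success)
            {
                runs.Add(new TextRun
                {
                    Text = WebUtility.HtmlDecode(m.Groups["text"].Value),
                    Bold = bold > 0,
                    Italic = italic > 0
                });
                continue;
            }

            int delta = m.Groups["close"].Value == "/" ? -1 : 1;
            switch (m.Groups["name"].Value.ToLowerInvariant())
            {
                case "strong":
                case "b":
                    bold = Math.Max(0, bold + delta);
                    break;
                case "em":
                case "i":
                    italic = Math.Max(0, italic + delta);
                    break;
                // All other tags (p, span, br, ...) are ignored in this sketch.
            }
        }

        return runs;
    }
}
```

Each run could then be written into a paragraph of the document body, for example by appending run.Text and applying the library's bold and italic formatting when the flags are set. That means inserting the content as a paragraph instead of a custom property, which is an assumption about how the template is laid out.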

mm9b1k5b #2

Open your HTML file in Word, click "Save As", and then choose "Word Document" under "Save as type".

hl0ma9xz #3

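I have Microsoft Office 2016 installed on my Windows PC. You have to add a COM reference to your .NET Core project from this path:
C:\Windows\assembly\GAC_MSIL\office\15.0.0.0__71e9bce111e9429c\OFFICE.DLL
You must first save the HTML string as an .html file, then use the function below to load that HTML file into a Word document and save it as a *.docx file. You can use the following C# function to convert HTML text into a Word document: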

using Microsoft.Office.Interop.Word;
using Application = Microsoft.Office.Interop.Word.Application;
using Document = Microsoft.Office.Interop.Word.Document;

    public void SaveHtmlAsWordFile(string htmlFilePath)
    {
        // Same location and name as the HTML file, with a .docx extension
        string wordFile = System.IO.Path.ChangeExtension(htmlFilePath, ".docx");

        // Create an instance of the Word application
        Application wordApp = new Application();
        try
        {
            // Open the HTML file (pass a full path; Word does not resolve relative paths reliably)
            Document doc = wordApp.Documents.Open(htmlFilePath);

            // Save the document as a .docx file
            doc.SaveAs2(wordFile, WdSaveFormat.wdFormatDocumentDefault);

            // Close the document
            doc.Close();
        }
        finally
        {
            // Always quit Word, even on failure, so no hidden WINWORD.EXE process is left behind
            wordApp.Quit();
        }
    }
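
The function above expects a file on disk, so the HTML string has to be written out first. A minimal sketch of that step follows. HtmlFileWriter and WriteHtmlToTempFile are hypothetical names, and wrapping the fragment in a small UTF-8 page is an assumption intended to help Word read non-ASCII characters such as "à" correctly.

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlFileWriter
{
    // Wraps an HTML fragment in a minimal page, writes it to a uniquely named
    // file in the temp folder, and returns the full path of that file.
    public static string WriteHtmlToTempFile(string htmlFragment)
    {
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");

        string page =
            "<!DOCTYPE html><html><head><meta charset=\"utf-8\"></head><body>"
            + htmlFragment
            + "</body></html>";

        // UTF-8 with a byte order mark, so the encoding is unambiguous.
        File.WriteAllText(path, page, new UTF8Encoding(true));
        return path;
    }
}
```

The returned path is absolute, so it can be passed straight to SaveHtmlAsWordFile; the temporary .html file can be deleted once the .docx has been saved.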
