How to convert HTML to MS Word Document using PHP





Converting HTML to an MS Word document is mainly used in web applications to generate a Word file from dynamic HTML content. The Word document is one of the most popular file formats for exporting dynamic content for offline use. Exporting HTML content to Word can be implemented on the client side with JavaScript, but if the content is generated or stored on the server, a server-side conversion is needed.


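Server-side export is useful for converting dynamic HTML content to an MS Word document and offering it as a download. With PHP, the document can be generated directly from HTML content. In this tutorial, we will show you how to convert HTML to an MS Word document in PHP. Note that the approach shown here writes Word-compatible HTML rather than a true Office Open XML (.docx) package, so the output is best saved with a .doc extension; Word may refuse to open such a file if it is named .docx.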
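The core idea can be sketched in a few lines: Word is able to open an HTML page that is saved with a .doc extension. The helper name wrapHtmlForWord() below is hypothetical and is not part of the class presented in this tutorial; it only illustrates the principle under that assumption.

```php
<?php
// Hypothetical helper (not part of the tutorial's class): wrap an HTML
// fragment in a minimal Word-flavoured HTML page.
function wrapHtmlForWord(string $bodyHtml, string $title = ''): string
{
    // Escape the title so it cannot break out of the <title> element.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return '<html xmlns:o="urn:schemas-microsoft-com:office:office"' . "\n"
        . '      xmlns:w="urn:schemas-microsoft-com:office:word"' . "\n"
        . '      xmlns="http://www.w3.org/TR/REC-html40">' . "\n"
        . '<head>' . "\n"
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . "\n"
        . '<title>' . $safeTitle . '</title>' . "\n"
        . '</head>' . "\n"
        . '<body>' . $bodyHtml . '</body>' . "\n"
        . '</html>';
}

$doc = wrapHtmlForWord('<h1>Hello World!</h1>', 'Demo');

// Save the markup with a .doc extension (here in the system temp directory).
$target = sys_get_temp_dir() . '/demo.doc';
file_put_contents($target, $doc);
```

The class in the next section follows the same principle, with a much more detailed header that carries Word-specific page and font settings.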

HTML To Word (DOC/DOCX) Library

The HTML_TO_DOC class is a custom library that generates an MS Word document from HTML-formatted content using PHP. It provides the following methods:

1. setDocFileName() – Set the document file name (.doc is appended if the name has no .doc/.docx extension).

2. setTitle() – Set the document title.

3. getHeader() – Create the header section of the document.

4. getFotter() – Create the footer section of the document (the method name is spelled this way in the class).

5. createDoc() – Create the Word document and either save it or send it as a download.

6. _parseHtml() – Parse and filter the HTML from the source.

7. write_file() – Write the content to the Word file.

<?php 
class HTML_TO_DOC 
{ 
    public $docFile  = ''; 
    public $title    = ''; 
    public $htmlHead = ''; 
    public $htmlBody = ''; 
     
    /** 
     * Constructor 
     * 
     * @return void 
     */ 
    function __construct(){ 
        $this->title = ''; 
        $this->htmlHead = ''; 
        $this->htmlBody = ''; 
    } 
     
    /** 
     * Set the document file name 
     * 
     * @param String $docfile  
     */ 
    function setDocFileName($docfile){ 
        $this->docFile = $docfile; 
        if(!preg_match("/\.doc$/i",$this->docFile) && !preg_match("/\.docx$/i",$this->docFile)){ 
            $this->docFile .= '.doc'; 
        } 
        return;  
    } 
     
    /** 
     * Set the document title 
     * 
     * @param String $title  
     */ 
    function setTitle($title){ 
        $this->title = $title; 
    } 
     
    /** 
     * Return header of MS Doc 
     * 
     * @return String 
     */ 
    function getHeader(){ 
        $return = <<<EOH 
        <html xmlns:v="urn:schemas-microsoft-com:vml" 
        xmlns:o="urn:schemas-microsoft-com:office:office" 
        xmlns:w="urn:schemas-microsoft-com:office:word" 
        xmlns="http://www.w3.org/TR/REC-html40"> 
         
        <head> 
        <meta http-equiv=Content-Type content="text/html; charset=utf-8"> 
        <meta name=ProgId content=Word.Document> 
        <meta name=Generator content="Microsoft Word 9"> 
        <meta name=Originator content="Microsoft Word 9"> 
        <!--[if !mso]> 
        <style> 
        v\:* {behavior:url(#default#VML);} 
        o\:* {behavior:url(#default#VML);} 
        w\:* {behavior:url(#default#VML);} 
        .shape {behavior:url(#default#VML);} 
        </style> 
        <![endif]--> 
        <title>$this->title</title> 
        <!--[if gte mso 9]><xml> 
         <w:WordDocument> 
          <w:View>Print</w:View> 
          <w:DoNotHyphenateCaps/> 
          <w:PunctuationKerning/> 
          <w:DrawingGridHorizontalSpacing>9.35 pt</w:DrawingGridHorizontalSpacing> 
          <w:DrawingGridVerticalSpacing>9.35 pt</w:DrawingGridVerticalSpacing> 
         </w:WordDocument> 
        </xml><![endif]--> 
        <style> 
        <!-- 
         /* Font Definitions */ 
        @font-face 
            {font-family:Verdana; 
            panose-1:2 11 6 4 3 5 4 4 2 4; 
            mso-font-charset:0; 
            mso-generic-font-family:swiss; 
            mso-font-pitch:variable; 
            mso-font-signature:536871559 0 0 0 415 0;} 
         /* Style Definitions */ 
        p.MsoNormal, li.MsoNormal, div.MsoNormal 
            {mso-style-parent:""; 
            margin:0in; 
            margin-bottom:.0001pt; 
            mso-pagination:widow-orphan; 
            font-size:7.5pt; 
                mso-bidi-font-size:8.0pt; 
            font-family:"Verdana"; 
            mso-fareast-font-family:"Verdana";} 
        p.small 
            {mso-style-parent:""; 
            margin:0in; 
            margin-bottom:.0001pt; 
            mso-pagination:widow-orphan; 
            font-size:1.0pt; 
                mso-bidi-font-size:1.0pt; 
            font-family:"Verdana"; 
            mso-fareast-font-family:"Verdana";} 
        @page Section1 
            {size:8.5in 11.0in; 
            margin:1.0in 1.25in 1.0in 1.25in; 
            mso-header-margin:.5in; 
            mso-footer-margin:.5in; 
            mso-paper-source:0;} 
        div.Section1 
            {page:Section1;} 
        --> 
        </style> 
        <!--[if gte mso 9]><xml> 
         <o:shapedefaults v:ext="edit" spidmax="1032"> 
          <o:colormenu v:ext="edit" strokecolor="none"/> 
         </o:shapedefaults></xml><![endif]--><!--[if gte mso 9]><xml> 
         <o:shapelayout v:ext="edit"> 
          <o:idmap v:ext="edit" data="1"/> 
         </o:shapelayout></xml><![endif]--> 
         $this->htmlHead 
        </head> 
        <body> 
EOH; 
        return $return; 
    } 
     
    /** 
     * Return Document footer 
     * 
     * @return String 
     */ 
    function getFotter(){ 
        return "</body></html>"; 
    } 
 
    /** 
     * Create The MS Word Document from given HTML 
     * 
     * @param String $html :: HTML Content or HTML File Name like path/to/html/file.html 
     * @param String $file :: Document File Name 
     * @param Boolean $download :: Whether to download the file or save the file 
     * @return boolean  
     */ 
    function createDoc($html, $file, $download = false){ 
        if(is_file($html)){ 
            $html = @file_get_contents($html); 
        } 
         
        $this->_parseHtml($html); 
        $this->setDocFileName($file); 
        $doc = $this->getHeader(); 
        $doc .= $this->htmlBody; 
        $doc .= $this->getFotter(); 
                         
        if($download){ 
            @header("Cache-Control: ");// leave blank to avoid IE errors 
            @header("Pragma: ");// leave blank to avoid IE errors 
            @header("Content-type: application/octet-stream"); 
            @header("Content-Disposition: attachment; filename=\"$this->docFile\""); 
            echo $doc; 
            return true; 
        }else { 
            return $this->write_file($this->docFile, $doc); 
        } 
    } 
     
    /** 
     * Parse the html and remove <head></head> part if present into html 
     * 
     * @param String $html 
     * @return void 
     * @access Private 
     */ 
    function _parseHtml($html){ 
        $html = preg_replace("/<!DOCTYPE((.|\n)*?)>/ims", "", $html); 
        $html = preg_replace("/<script((.|\n)*?)>((.|\n)*?)<\/script>/ims", "", $html); 
        preg_match("/<head>((.|\n)*?)<\/head>/ims", $html, $matches); 
        $head = !empty($matches[1])?$matches[1]:''; 
        preg_match("/<title>((.|\n)*?)<\/title>/ims", $head, $matches); 
        // Keep a title set via setTitle() when the HTML has no <title> tag 
        $this->title = !empty($matches[1])?$matches[1]:$this->title; 
        $html = preg_replace("/<head>((.|\n)*?)<\/head>/ims", "", $html); 
        $head = preg_replace("/<title>((.|\n)*?)<\/title>/ims", "", $head); 
        $head = preg_replace("/<\/?head>/ims", "", $head); 
        $html = preg_replace("/<\/?body((.|\n)*?)>/ims", "", $html); 
        $this->htmlHead = $head; 
        $this->htmlBody = $html; 
        return; 
    } 
     
    /** 
     * Write the content in the file 
     * 
     * @param String $file :: File name to be saved 
     * @param String $content :: Content to be written 
     * @param [Optional] String $mode :: Write Mode 
     * @return boolean True on success else false 
     * @access Private 
     */ 
    function write_file($file, $content, $mode = "w"){ 
        $fp = @fopen($file, $mode); 
        if(!is_resource($fp)){ 
            return false; 
        } 
        // Report a failed write instead of always returning true 
        $written = fwrite($fp, $content); 
        fclose($fp); 
        return $written !== false; 
    } 
}
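
To see what the _parseHtml() step does to its input, the sketch below re-implements it as a stand-alone function. The name splitHtmlDocument() and the sample markup are hypothetical, and the patterns are simplified (`.*?` with the `s` modifier in place of `(.|\n)*?`), so treat it as an illustration rather than a drop-in replacement.

```php
<?php
// Hypothetical stand-alone version of the parsing step, for illustration only.
function splitHtmlDocument(string $html): array
{
    // Drop the doctype and any <script> blocks.
    $html = preg_replace('/<!DOCTYPE.*?>/is', '', $html);
    $html = preg_replace('/<script.*?>.*?<\/script>/is', '', $html);

    // Capture the <head> contents, then pull the <title> out of it.
    $head  = preg_match('/<head>(.*?)<\/head>/is', $html, $m) ? $m[1] : '';
    $title = preg_match('/<title>(.*?)<\/title>/is', $head, $m) ? $m[1] : '';
    $head  = preg_replace('/<title>.*?<\/title>/is', '', $head);

    // The body is whatever remains once <head> and the <body> tags are removed.
    $body = preg_replace('/<head>.*?<\/head>/is', '', $html);
    $body = preg_replace('/<\/?body.*?>/is', '', $body);

    return ['title' => $title, 'head' => $head, 'body' => $body];
}

$parts = splitHtmlDocument(
    '<!DOCTYPE html><html><head><title>Report</title>'
    . '<style>p{color:red}</style></head>'
    . '<body class="x"><script>alert(1)</script><p>Hi</p></body></html>'
);

print_r($parts);
```

For this sample the title is "Report", the remaining head content is the style block, and the script is gone. None of the patterns target the `<html>` tags, so they stay in the body part, just as they do in the class.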

Convert HTML to Word Document

The following example code uses the HTML_TO_DOC class to convert HTML content to an MS Word document and save it as a .doc file.

1. Load and initialize the HTML_TO_DOC class.

// Load library 
include_once 'HtmlToDoc.class.php';  
 
// Initialize class 
$htd = new HTML_TO_DOC();


2. Specify the HTML content you want to convert.

$htmlContent = ' 
    <h1>Hello World!</h1> 
    <p>This document is created from HTML.</p>';

3. Call the createDoc() function to convert HTML to Word document.

  • Specify the variable that holds the HTML content ($htmlContent).
  • Specify the document name for the saved Word file (my-document). The class appends .doc when the name has no .doc/.docx extension.

$htd->createDoc($htmlContent, "my-document");
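
The file name in this call has no extension because setDocFileName() appends .doc unless the name already ends in .doc or .docx. That rule can be sketched in isolation; normalizeDocFileName() is a hypothetical stand-alone copy written for this illustration, not a method of the class.

```php
<?php
// Hypothetical stand-alone copy of the extension rule, for illustration only.
function normalizeDocFileName(string $name): string
{
    // Keep names that already end in .doc or .docx (case-insensitive).
    if (preg_match('/\.docx?$/i', $name)) {
        return $name;
    }

    // Otherwise fall back to the .doc extension.
    return $name . '.doc';
}

echo normalizeDocFileName('my-document'), "\n";  // my-document.doc
echo normalizeDocFileName('report.DOCX'), "\n";  // report.DOCX
```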

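Download Word file:
To download the Word file instead of saving it on the server, set the third parameter of createDoc() to true. Because the method sends HTTP headers, call it before any other output is sent.

$htd->createDoc($htmlContent, "my-document", true);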
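
The class sends a generic application/octet-stream content type for the download. As a possible variation, the sketch below builds a header set using application/msword, the MIME type registered for .doc files. The helper buildDownloadHeaders() is hypothetical and not part of the class.

```php
<?php
// Hypothetical helper: build the response headers for a Word download.
function buildDownloadHeaders(string $fileName, string $content): array
{
    // Strip path parts and characters that would break the quoted filename.
    $safeName = str_replace(['"', "\r", "\n"], '', basename($fileName));

    return [
        'Content-Type: application/msword',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Content-Length: ' . strlen($content),
    ];
}

$doc     = '<html><body><p>Hello</p></body></html>';
$headers = buildDownloadHeaders('my-document.doc', $doc);

// In a web request, send the headers before any output, then the document.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    foreach ($headers as $line) {
        header($line);
    }
    echo $doc;
}
```

Keeping the header construction in a function that returns an array makes it easy to inspect or unit-test without an actual HTTP response.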

Create Word document from HTML File

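You can convert the contents of an HTML file to a Word document by passing the file path instead of an HTML string. The createDoc() method checks the first argument with is_file() and reads the file when it exists.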

$htd->createDoc("source.html", "my-document");
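
The file-or-string decision can be sketched on its own. The helper loadHtmlSource() below is hypothetical; it mirrors the is_file() check in createDoc() but raises an exception when an existing file cannot be read, which the class itself does not do.

```php
<?php
// Hypothetical helper mirroring the is_file() check in createDoc().
function loadHtmlSource(string $htmlOrPath): string
{
    if (is_file($htmlOrPath)) {
        $contents = file_get_contents($htmlOrPath);
        if ($contents === false) {
            throw new RuntimeException("Cannot read HTML file: $htmlOrPath");
        }
        return $contents;
    }

    // Not an existing file: treat the argument as markup.
    return $htmlOrPath;
}

// Demo with a temporary file standing in for source.html.
$path = tempnam(sys_get_temp_dir(), 'htd');
file_put_contents($path, '<p>From a file</p>');

$fromFile   = loadHtmlSource($path);
$fromString = loadHtmlSource('<p>Inline markup</p>');

unlink($path);
```

A relative path such as "source.html" is resolved against the current working directory. If no file is found there, the string "source.html" itself is treated as the HTML content.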


Conclusion

Various third-party libraries are available for HTML to Word conversion, but you can also convert HTML content to a Word document using PHP without any external library. The HTML_TO_DOC class provides an easy way to convert dynamic HTML content to a Word document and save or download it as a .doc file. You can extend the functionality of the class as per your needs.




Author Biography.

Lokesh Gupta

Overall 3+ years of experience as a Full Stack Developer with a demonstrated history of working in the information technology and services industry. I enjoy solving complex problems within budget and deadlines, putting my skills in PHP, MySQL, Python, Codeigniter, Yii2, Laravel, AngularJS, ReactJS, and NodeJS to best use. Thorough knowledge of UML and visual modeling, application architecture design, and business process modeling. Successfully delivered various projects, based on different technologies, across the globe.
