如何从网页将表格导出到 Excel

如何从网页将表格导出到 Excel。我希望出口包含所有的格式和颜色。

286350 次浏览

This is a php but you maybe able to change it to javascript:

<?php>
$colgroup = str_repeat("<col width=86>",5);
$data = "";
$time = date("M d, y g:ia");
$excel = "<html xmlns:o=\"urn:schemas-microsoft-com:office:office\" xmlns:x=\"urn:schemas-microsoft-com:office:excel\" xmlns=\"http://www.w3.org/TR/REC-html40\">
<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">
<html>
<head>
<meta http-equiv=\"Content-type\" content=\"text/html;charset=utf-8\" />
<style id=\"Classeur1_16681_Styles\">
.xl4566 {
color: red;
}
</style>
</head>
<body>
<div id=\"Classeur1_16681\" align=center x:publishsource=\"Excel\">
<table x:str border=0 cellpadding=0 cellspacing=0 style=\"border-collapse: collapse\">
<colgroup>$colgroup</colgroup>
<tr><td class=xl2216681><b>Col1</b></td><td class=xl2216681><b>Col2</b></td><td class=xl2216681 ><b>Col3</b></td><td class=xl2216681 ><b>Col4</b></td><td class=xl2216681 ><b>Col5</b></td></tr>
<tr><td class=xl4566>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
</table>
</div>
</body>
</html>";
$fname = "Export".time().".xls";
$file = fopen($fname,"w+");
fwrite($file,$excel);
fclose($file);
header('Content-Type: application/vnd.ms-excel');
header('Content-Disposition: attachment; filename="'.basename($fname).'"');
readfile($fname);
unlink($fname); ?>

This is actually more simple than you'd think: "Just" copy the HTML table (that is: The HTML code for the table) into the clipboard. Excel knows how to decode HTML tables; it'll even try to preserve the attributes.

The hard part is "copy the table into the clipboard" since there is no standard way to access the clipboard from JavaScript. See this blog post: Accessing the System Clipboard with JavaScript – A Holy Grail?

Now all you need is the table as HTML. I suggest jQuery and the html() method.

There are practical two ways to do this automaticly while only one solution can be used in all browsers. First of all you should use the open xml specification to build the excel sheet. There are free plugins from Microsoft available that make this format also available for older office versions. The open xml is standard since office 2007. The the two ways are obvious the serverside or the clientside.

The clientside implementation use a new standard of CSS that allow you to store data instead of just the URL to the data. This is a great approach coz you dont need any servercall, just the data and some javascript. The killing downside is that microsoft don't support all parts of it in the current IE (I don't know about IE9) releases. Microsoft restrict the data to be a image but we will need a document. In firefox it works quite fine. For me the IE was the killing point.

The other way is to user a serverside implementation. There should be a lot implementations of open XML for all languages. You just need to grap one. In most cases it will be the simplest way to modify a Viewmodel to result in a Document but for sure you can send all data from Clientside back to server and do the same.

This code is IE only so it is only useful in situations where you know all of your users will be using IE (like, for example, in some corporate environments.)

<script Language="javascript">
function ExportHTMLTableToExcel()
{
var thisTable = document.getElementById("tbl").innerHTML;
window.clipboardData.setData("Text", thisTable);
var objExcel = new ActiveXObject ("Excel.Application");
objExcel.visible = true;


var objWorkbook = objExcel.Workbooks.Add;
var objWorksheet = objWorkbook.Worksheets(1);
objWorksheet.Paste;
}
</script>

It is possible to use the old Excel 2003 XML format (before OpenXML) to create a string that contains your desired XML, then on the client side you could use a data URI to open the file using the XSL mime type, or send the file to the client using the Excel mimetype "Content-Type: application/vnd.ms-excel" from the server side.

  1. Open Excel and create a worksheet with your desired formatting and colors.
  2. Save the Excel workbook as "XML Spreadsheet 2003 (*.xml)"
  3. Open the resulting file in a text editor like notepad and copy the value into a string in your application
  4. Assuming you use the client side approach with a data uri the code would look like this:
    
    <script type="text/javascript">
    var worksheet_template = '<?xml version="1.0"?><ss:Workbook xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'+
    '<ss:Styles><ss:Style ss:ID="1"><ss:Font ss:Bold="1"/></ss:Style></ss:Styles><ss:Worksheet ss:Name="Sheet1">'+
    '<ss:Table>\{\{ROWS}}</ss:Table></ss:Worksheet></ss:Workbook>';
    var row_template = '<ss:Row ss:StyleID="1"><ss:Cell><ss:Data ss:Type="String">\{\{name}}</ss:Data></ss:Cell></ss:Row>';
    </script>
    
    
  5. Then you can use string replace to create a collection of rows to be inserted into your worksheet template
    
    <script type="text/javascript">
    var rows = document.getElementById("my-table").getElementsByTagName('tr'),
    row_data = '';
    for (var i = 0, length = rows.length; i < length; ++i) {
    row_data += row_template.replace('\{\{name}}', rows[i].getElementsByTagName('td')[0].innerHTML);
    }
    </script>
    
    
  6. Once you have the information collected, create the final string and open a new window using the data URI

    
    <script type="text/javascript">
    var worksheet = worksheet_template.replace('\{\{ROWS}}', row_data);

    window.open('data:application/vnd.ms-excel,'+worksheet); </script>

It is worth noting that older browsers do not support the data URI scheme, so you may need to produce the file server side for those browser that do not support it.

You may also need to perform base64 encoding on the data URI content, which may require a js library, as well as adding the string ';base64' after the mime type in the data URI.

Far and away, the cleanest, easiest export from tables to Excel is Jquery DataTables Table Tools plugin. You get a grid that sorts, filters, orders, and pages your data, and with just a few extra lines of code and two small files included, you get export to Excel, PDF, CSV, to clipboard and to the printer.

This is all the code that's required:

  $(document).ready( function () {
$('#example').dataTable( {
"sDom": 'T<"clear">lfrtip',
"oTableTools": {
"sSwfPath": "/swf/copy_cvs_xls_pdf.swf"
}
} );
} );

So, quick to deploy, no browser limitations, no server-side language required, and most of all very EASY to understand. It's a win-win. The one thing it does have limits on, though, is strict formatting of columns.

If formatting and colors are absolute dealbreakers, the only 100% reliable, cross browser method I've found is to use a server-side language to process proper Excel files from your code. My solution of choice is PHPExcel It is the only one I've found so far that positively handles export with formatting to a MODERN version of Excel from any browser when you give it nothing but HTML. Let me clarify though, it's definitely not as easy as the first solution, and also is a bit of a resource hog. However, on the plus side it also can output direct to PDF as well. And, once you get it configured, it just works, every time.

UPDATE - September 15, 2016: TableTools has been discontinued in favor of a new plugin called "buttons" These tools perform the same functions as the old TableTools extension, but are FAR easier to install and they make use of HTML5 downloads for modern browsers, with the capability to fallback to the original Flash download for browsers that don't support the HTML5 standard. As you can see from the many comments since I posted this response in 2011, the main weakness of TableTools has been addressed. I still can't recommend DataTables enough for handling large amounts of data simply, both for the developer and the user.

simple google search turned up this:

If the data is actually an HTML page and has NOT been created by ASP, PHP, or some other scripting language, and you are using Internet Explorer 6, and you have Excel installed on your computer, simply right-click on the page and look through the menu. You should see "Export to Microsoft Excel." If all these conditions are true, click on the menu item and after a few prompts it will be imported to Excel.

if you can't do that, he gives an alternate "drag-and-drop" method:

http://www.mrkent.com/tools/converter/

First, I would not recommend trying export Html and hope that the user's instance of Excel picks it up. My experience that this solution is fraught with problems including incompatibilities with Macintosh clients and throwing an error to the user that the file in question is not of the format specified. The most bullet-proof, user-friendly solution is a server-side one where you use a library to build an actual Excel file and send that back to the user. The next best solution and more universal solution would be to use the Open XML format. I've run into a few rare compatibility issues with older versions of Excel but on the whole this should give you a solution that will work on any version of Excel including Macs.

Open XML

Excel has a little known feature called "Web queries" which let you retrieve data from almost every web page without additional programming.

A web query basicly runs a HTTP request directly from within Excel and copies some or all of the received data (and optionally formatting) into the worksheet.

After you've defined the web query you can refresh it at any time without even leaving excel. So you don't have to actually "export" data and save it to a file - you'd rather refresh the data just like from a database.

You can even make use of URL parameters by having excel prompt you for certain filter criteria etc...

However the cons I've noticed so far are:

  • dynamicly loaded data is not accessible, because Javascript is not executed
  • URL length is limited

Here is a question about how to create web queries in Excel. It links to a Microsoft Help site about How-To Get external data from a Web page

A long time ago, I discovered that Excel would open an HTML file with a table if we send it with Excel content type. Consider the document above:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Java Friends</title>
</head>
<body>
<table style="font-weight: bold">
<tr style="background-color:red"><td>a</td><td>b</td></tr>
<tr><td>1</td><td>2</td></tr>
</table>
</body>
</html>

I ran the following bookmarklet on it:

javascript:window.open('data:application/vnd.ms-excel,'+document.documentElement.innerHTML);

and in fact I got it downloadable as a Excel file. However, I did not get the expected result - the file was open in OpenOffice.org Writer. That is my problem: I do not have Excel in this machine so I cannot try it better. Also, this trick worked more or less six years ago with older browsers and an antique version of MS Office, so I really cannot say if it will work today.

Anyway, in the document above I added a button which would download the entire document as an Excel file, in theory:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Java Friends</title>
</head>
<body>
<table style="font-weight: bold">
<tr style="background-color:red"><td>a</td><td>b</td></tr>
<tr><td>1</td><td>2</td></tr>
<tr>
<td colspan="2">
<button onclick="window.open('data:application/vnd.ms-excel,'+document.documentElement.innerHTML);">
Get as Excel spreadsheet
</button>
</td>
</tr>
</table>
</body>
</html>

Save it in a file and click on the button. I'd love to know if it worked or not, so I ask you to comment even for saying that it did not work.

Assumptions:

  1. given url

  2. the conversion has to be done on client side

  3. systems are Windows, Mac and linux

Solution for Windows:

python code that open the ie window and has access to it: theurl variable contain the url ('http://')

ie = Dispatch("InternetExplorer.Application")
ie.Visible = 1
ie.Navigate(theurl)

Note: if the page is not accessible directly, but login, you will need to handle this by entering the form data and emulating the user actions with python

here is the example

from win32com.client import Dispatch
ie.Document.all('username').value=usr
ie.Document.all('password').value=psw

the same manner for retrieval of data from web page. Let's say element with id 'el1' contain the data. retrieve the element text to the variable

el1 = ie.Document.all('el1').value

then when data is in python variable, you can open the excel screen in similar manner using python:

from win32com.client import Dispatch
xlApp = Dispatch("Excel.Application")
xlWb = xlApp.Workbooks.Open("Read.xls")
xlSht = xlWb.WorkSheets(1)
xlSht.Cells(row, col).Value = el1

Solution for Mac:

only the tip: use AppleScript - it has simple and similar API as win32com.client Dispatch

Solution for Linux:

java.awt.Robot might work for this it has click, key press (hot keys can be used), but none API for Linux that I am aware about that can work as simple as AppleScript

mozilla still support base 64 URIs. This allows you to compose dynamically the binary content using javascript:

<a href="data:application/vnd.ms-excel<base64 encoded binary excel content here>"> download xls</a>

if your excel file is not very fancy (no diagrams, formulas, macroses) you can dig into the format and compose bytes for your file, then encode them with base64 and put in to the href

refer to https://developer.mozilla.org/en/data_URIs

   function normalexport() {


try {
var i;
var j;
var mycell;
var tableID = "tblInnerHTML";
var drop = document.getElementById('<%= ddl_sections.ClientID %>');
var objXL = new ActiveXObject("Excel.Application");
var objWB = objXL.Workbooks.Add();
var objWS = objWB.ActiveSheet;
var str = filterNum(drop.options[drop.selectedIndex].text);
objWB.worksheets("Sheet1").activate; //activate dirst worksheet
var XlSheet = objWB.activeSheet; //activate sheet
XlSheet.Name = str; //rename




for (i = 0; i < document.getElementById("ctl00_ContentPlaceHolder1_1").rows.length - 1; i++) {
for (j = 0; j < document.getElementById("ctl00_ContentPlaceHolder1_1").rows(i).cells.length; j++) {
mycell = document.getElementById("ctl00_ContentPlaceHolder1_1").rows(i).cells(j);


objWS.Cells(i + 1, j + 1).Value = mycell.innerText;


//                                                objWS.Cells(i + 1, j + 1).style.backgroundColor = mycell.style.backgroundColor;
}
}


objWS.Range("A1", "L1").Font.Bold = true;
//                objWS.Range("A1", "L1").Font.ColorIndex = 2;
//                 objWS.Range("A1", "Z1").Interior.ColorIndex = 47;


objWS.Range("A1", "Z1").EntireColumn.AutoFit();


//objWS.Range("C1", "C1").ColumnWidth = 50;


objXL.Visible = true;


} catch (err) {
alert("Error. Scripting for ActiveX might be disabled")
return
}
idTmr = window.setInterval("Cleanup();", 1);


}




function filterNum(str) {


return str.replace(/[ / ]/g, '');
}

And now there is a better way.

OpenXML SDK for JavaScript.

https://openxmlsdkjs.codeplex.com/