I was looking over my generic export DataTable to Excel function the other day and noticed an issue. My default method was throwing everything into a string format, which was preventing users from highlighting number columns for subtotals. To fix the problem they could use “Text to Columns”, but this was becoming a common complaint I wanted to resolve. I came up with an improved “ExportToExcel” function that will insert the correct data type.
/// <summary>
/// Render DataTable to Excel File
/// </summary>
/// <param name = "sourceTable">Source DataTable</param>
/// <param name = "fileName">Destination File Name</param>
public static void ExportToExcel(DataTable sourceTable, string fileName)
{
HSSFWorkbook workbook = new HSSFWorkbook();
Sheet sheet = workbook.CreateSheet("Sheet1");
Row headerRow = sheet.CreateRow(0);
// Create Header Style
CellStyle headerCellStyle = workbook.CreateCellStyle();
headerCellStyle.FillForegroundColor = HSSFColor.GREY_25_PERCENT.index;
headerCellStyle.FillPattern = FillPatternType.SOLID_FOREGROUND;
// Create Date Style
CellStyle dateCellStyle = workbook.CreateCellStyle();
dateCellStyle.DataFormat = HSSFDataFormat.GetBuiltinFormat("m/d/yy");
// Build Header
foreach (DataColumn column in sourceTable.Columns)
{
Cell headerCell = headerRow.CreateCell(column.Ordinal);
headerCell.SetCellValue(column.ColumnName);
headerCell.CellStyle = headerCellStyle;
}
// Build Details (rows)
int rowIndex = 1;
int sheetIndex = 1;
const int maxRows = 65536;
foreach (DataRow row in sourceTable.Rows)
{
// Start new sheet max rows reached
if (rowIndex % maxRows == 0)
{
// Auto size columns on current sheet
for (int h = 0; h < headerRow.LastCellNum; h++)
{
sheet.AutoSizeColumn(h);
}
sheetIndex++;
sheet = workbook.CreateSheet("Sheet" + sheetIndex);
Row additionalHeaderRow = sheet.CreateRow(0);
for (int h = 0; h < headerRow.LastCellNum; h++)
{
Cell additionalHeaderColumn = additionalHeaderRow.CreateCell(h);
additionalHeaderColumn.CellStyle = headerRow.GetCell(h).CellStyle;
additionalHeaderColumn.SetCellValue(headerRow.GetCell(h).RichStringCellValue);
}
rowIndex = 1;
}
// Create new row in sheet
Row dataRow = sheet.CreateRow(rowIndex);
foreach (DataColumn column in sourceTable.Columns)
{
Cell dataCell = dataRow.CreateCell(column.Ordinal);
switch (column.DataType.FullName)
{
case "System.String":
dataCell.SetCellValue(row[column].ToString());
break;
case "System.Int":
case "System.Int32":
case "System.Int64":
case "System.Double":
case "System.Decimal":
double val;
dataCell.SetCellValue(Double.TryParse(row[column].ToString(), out val) ? val : 0);
break;
case "System.DateTime":
DateTime dt = new DateTime(1900, 01, 01);
DateTime.TryParse(row[column].ToString(), out dt);
dataCell.SetCellValue(dt);
dataCell.CellStyle = dateCellStyle;
break;
default:
dataCell.SetCellValue(row[column].ToString());
break;
}
}
rowIndex++;
}
for (int h = 0; h < headerRow.LastCellNum; h++)
{
sheet.AutoSizeColumn(h);
}
ExportToExcel(workbook, fileName);
}
The key part of the function above to review is the “switch (column.DataType.FullName)” code block. This grabs the DataTable’s column data type to use in the SetCellValue() call.
Once the workbook is built, we call a overload of the same function that expects a NPOI workbook. This overload will send a XLS file back to the user via their web browser using the HttpResponse stream.
/// <summary>
/// Render Excel File to HttpResponse (Browser)
/// </summary>
/// <param name="workbook">NPOI Workbook</param>
/// <param name="fileName">Destination File Name</param>
public static void ExportToExcel(HSSFWorkbook workbook, string fileName)
{
using (MemoryStream memoryStream = new MemoryStream())
{
workbook.Write(memoryStream);
memoryStream.Flush();
try
{
HttpResponse response = HttpContext.Current.Response;
response.ClearContent();
response.ClearHeaders();
response.Buffer = true;
response.ContentType = "application/vnd.ms-excel";
response.AddHeader("Content-Length", memoryStream.Length.ToString());
response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}", fileName));
response.BinaryWrite(memoryStream.GetBuffer());
response.Flush();
response.End();
}
catch
{
// Do nothing, error expected due to Flush();
}
}
}
I’ve used the above function to send files with 3-4 sheets full of data back to the browser… but the file size is usually “VERY LARGE”. To work around this, I have a helper function which will add the excel file to a zip (in memory) before sending back to the client.
In my last test, a 17.5MB document with 2 sheets of data was compressed to 3.5MB with my zip utility. There was still a ~13 second delay to generate the XLS file, but once prompted to download form the browser the file download completed in a few seconds.
#1 by Matt on June 23, 2011 - 2:53 pm
Hi Zach, sorry I wasn’t more specific. I was referring to Excel’s AutoFilter feature as found in Excel’s menu: Data > Filter > AutoFilter. Using an excel template that has column names already entered in the spreadsheet, you can turn on the AutoFilter feature in the template, and then it works as expected after you use NPOI to fill the rows beneath the header. But if you’re building your header via NPOI, then can you use NPOI to generate Excel’s AutoFilter feature? I see NPOI has an HSSFAutoFilter…wait…just looking again, and it appears you can call a “SetAutoFilter()” function by passing a CellRangeAddress. I haven’t read the “how to” for these…
#2 by Zach on June 23, 2011 - 3:05 pm
Matt,
I think most things can be done via NPOI, I usually use the POI Java Docs when I’m looking for something because I find them easier to search/navigate.
http://poi.apache.org/apidocs/overview-summary.html
I’ve also ported over a few bug fixes to NPOI, but in general Tony, the NPOI project creator is getting the bug fixes applied really quick and is always watching the forum/bug reports. If you find something broken, posting in the discussion/bug thread on NPOI CodePlex page is a good place to report it. You can also look at POI forums, cause you might see it was also found and fixed there. I was on my own custom build of NPOI for awhile, but as of 1.2.3 I’m back on the official build. I don’t think I have any changes in my current production built, but I tweak things so often it’s hard to remember what I did in weeks prior.
#3 by Zach on June 23, 2011 - 12:56 pm
Matt, what do you mean by AutoFilter? I commonly filter my results by applying filters/sorts on my DataTable.DefaultView, which then gets sent to my ExportToExcel() function. This way I can chop up and display the same results in a few different ways, without have to re-query the DB (e.g. DataTable.DefaultView.RowFilter = “Column1 = ‘Apple'”;). Don’t forget, when you apply filters/sorts, you want to then use the DataTable.DefaultView.ToTable() to get the new slice/view of data.
#4 by Matt on June 23, 2011 - 12:02 pm
Is it possible to add AutoFilter to the function?
#5 by Matt on June 11, 2011 - 2:15 pm
Thanks; this will be good to refer back to if the database hits are too heavy.
Another issue worth noting: while filing the excel template using NPOI’s “Get…” syntax, an issue arose where the cell format in the XLS template, which started out as “General”, was not being treated as a number after data-fill. The cells with integers were generating Excel’s little cell formatting “green triangle” warning. And the charts relying on those integers were then failing to display. As a result, I tweaked the CASE statement for System.Int to convert back to Int, as opposed to string.
After that, no more cell warning messages appeared, and the charts loaded.
#6 by Matt on June 10, 2011 - 11:03 pm
These functions are very useful. I’m wondering what your strategies are for implementing them? If I’m using a DataSet and multiple DataTables, do these functions always have to be called on the Page_Load event?
For instance, if you have a web page, and then a report version of that same web page, do you re-query the database and open a new page, or do you work with a copy of the DataTable that’s part of the web page?
Independent of NPOI, I’m running into an issue where the DataTable from the Page_Load event pulls all the correct data, but a DataTable from a DataTable variable in code behind isn’t putting out the data correctly. (Sorry for yet another question… )
#7 by Zach on June 11, 2011 - 10:17 am
Good question, personally I cache all my hits to the db for a few minutes… This way I can load the XLS file without pulling the data from the DB again. The only time this could be an issue, is when there is LOTS of data. When this happens, I limit the web based query to only provide the top 500 results and force the user to use the download button to access all the rows. Most people prefer to use EXCEL to review lots of data (1K+ rows).
As for why your DataTable results aren’t matching… I’m not sure without a little more information, but here is what I normally do when I pull the data and want to use it multiple times.
This example is simplified, normally I would pass criteria to the GetData() and use the criteria in the “cacheKey” variable.
Hope this helps, let me know if you have questions.