Excel data extraction - Issue with column data type


I am writing a C# library to read in Excel files (both xls and xlsx) and I'm coming across an issue.

The problem is exactly the one described in this question: if my Excel file has a column that holds string values but has a numeric value in the first row, the OLEDB provider assumes the column is numeric and returns NULL for every value in that column that is not numeric.

I am aware that, as in the answer provided, I can make a change in the registry, but since this is a library I plan to use on many machines and don't want to change every user's registry values, I was wondering if there is a better solution.

Maybe a DB provider other than ACE.OLEDB (and it seems JET is no longer supported well enough to be considered)?

Also, since this needs to work on XLS / XLSX, options such as EPPlus / XML readers won't work for the xls version.

1 solution

#1


Your connection string should look like this:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcelfile.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";

IMEX=1 is the part you need: it tells the provider to treat columns with mixed data types as text instead of guessing a numeric type. This should work without any need to edit the registry.

HDR=YES simply marks the first row as column headers. It is not needed for your particular problem, but I've included it anyway.

Always using IMEX=1 is a safer way to retrieve data from mixed-type columns.

Source: https://www.connectionstrings.com/excel/
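Since the question asks about both xls and xlsx, here is a minimal sketch of how the connection string could be built per file type. The class and method names (ExcelConnectionStrings, Build) are hypothetical and not part of the original answer, and the sketch assumes the ACE provider is used for both formats, with "Excel 8.0" for xls and "Excel 12.0 Xml" for xlsx.

```csharp
using System;
using System.IO;

// Hypothetical helper, not from the original answer.
static class ExcelConnectionStrings
{
    // Builds an ACE OLEDB connection string with IMEX=1 for .xls or .xlsx files.
    public static string Build(string path, bool hasHeader = true)
    {
        string extension = Path.GetExtension(path).ToLowerInvariant();

        // Pick the Extended Properties version token from the file extension.
        string excelVersion;
        switch (extension)
        {
            case ".xls":
                excelVersion = "Excel 8.0";
                break;
            case ".xlsx":
                excelVersion = "Excel 12.0 Xml";
                break;
            default:
                throw new ArgumentException("Unsupported extension: " + extension, nameof(path));
        }

        // IMEX=1 asks the provider to read mixed-type columns as text.
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"{1};HDR={2};IMEX=1\";",
            path,
            excelVersion,
            hasHeader ? "YES" : "NO");
    }
}
```

Calling ExcelConnectionStrings.Build(@"C:\test.xlsx") would then give a string equivalent to the one shown above, and passing a .xls path swaps in the older format token.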

Edit:

Here is the data I'm using:

[Image: data]

Here is the output:

[Image: output]

This is the exact code I used:

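// IMEX=1 makes the provider read mixed-type columns as text.
string connString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\test.xlsx;Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1""";

// DbClass is the author's own wrapper (see the link below).
using (DbClass db = new DbClass(connString))
{
    var x = db.dataReader("SELECT * FROM [Sheet1$]");
    while (x.Read())
    {
        // Print every field of the current row, tab-separated.
        for (int i = 0; i < x.FieldCount; i++)
            Console.Write(x[i] + "\t");
        Console.WriteLine();
    }
}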

The DbClass is a simple wrapper I made in order to make life easier. It can be found here:

http://tech.reboot.pro/showthread.php?tid=4713
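For readers who do not want to depend on the wrapper, the same read loop can be sketched with the standard System.Data.OleDb classes. This is not a reproduction of DbClass, whose internals are not shown here. It assumes a Windows machine with the ACE OLEDB provider installed, a workbook at C:\test.xlsx, and a worksheet named Sheet1.

```csharp
using System;
using System.Data.OleDb;

class Program
{
    static void Main()
    {
        // Same connection string as above; the path and sheet name are assumptions.
        string connString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\test.xlsx;Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1""";

        using (var connection = new OleDbConnection(connString))
        using (var command = new OleDbCommand("SELECT * FROM [Sheet1$]", connection))
        {
            connection.Open();

            using (OleDbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Print every field of the current row, tab-separated.
                    for (int i = 0; i < reader.FieldCount; i++)
                        Console.Write(reader[i] + "\t");
                    Console.WriteLine();
                }
            }
        }
    }
}
```

The using blocks dispose the connection, command, and reader even if the query throws, which is presumably the convenience the wrapper provides.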

