pyexcel-io - Let you focus on data, instead of file formats
Author: | C.W. |
---|---|
Source code: | http://github.com/pyexcel/pyexcel-io.git |
Issues: | http://github.com/pyexcel/pyexcel-io/issues |
License: | New BSD License |
Released: | 0.5.4 |
Generated: | Nov 10, 2017 |
Introduction
pyexcel-io provides one application programming interface (API) to read and write data in different excel formats. It makes information processing involving excel files a simple task. The data in excel files can be turned into an ordered dictionary with minimal code. This library focuses on data processing using excel files as storage media; hence fonts, colors and charts were not and will not be considered.
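As a minimal sketch of the point above, assuming pyexcel-io is installed, the following helper writes rows to a csv file with save_data and reads them back with get_data. The helper name roundtrip and the file name example.csv are illustrative only.

```python
import os
import tempfile


def roundtrip(rows):
    """Write rows to a temporary csv file and read them back."""
    # Imported lazily, so the sketch only needs pyexcel-io when it is called.
    from pyexcel_io import get_data, save_data

    path = os.path.join(tempfile.mkdtemp(), "example.csv")
    save_data(path, rows)  # a plain list of rows is stored as a single sheet
    return get_data(path)  # ordered dictionary mapping a sheet name to its rows
```

Calling roundtrip([[1, 2, 3], [4, 5, 6]]) should give back a dictionary with a single entry holding the two rows; for csv files the key is expected to be the file name.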
It was created due to the lack of a uniform programming interface to access data in different excel formats. A developer needs to use different methods of different libraries to read the same data in different excel formats, hence the resulting code is cluttered and unmaintainable. This is a challenge posed by users who do not know or care about the differences in excel file formats. Instead of educating the users about the specific excel format a data processing application supports, the library takes up the challenge and promises to support all known excel formats.
All the great work has been done by the individual library developers. This library only unites the data access API. That said, pyexcel-io also brings something new to the table: the “csvz” and “tsvz” formats, new format names as of 2014. They were invented and are supported by pyexcel-io.
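To make the idea of a zipped csv file concrete, here is a standard-library sketch for Python 3. It assumes that a “csvz” file is an ordinary zip archive holding one csv member per sheet; that layout, the member naming, and the helper names write_csvz_like and read_csvz_like are assumptions for illustration, not pyexcel-io's implementation.

```python
import csv
import io
import zipfile
from collections import OrderedDict


def write_csvz_like(sheets):
    """Pack {sheet name: rows} into zip bytes, one csv member per sheet."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, rows in sheets.items():
            text = io.StringIO()
            csv.writer(text).writerows(rows)
            archive.writestr(name + ".csv", text.getvalue())
    return buffer.getvalue()


def read_csvz_like(payload):
    """Unpack zip bytes back into an ordered {sheet name: rows} mapping."""
    sheets = OrderedDict()
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for member in archive.namelist():
            text = archive.read(member).decode("utf-8")
            # The csv module yields every cell as a string; no type detection.
            sheets[member[:-len(".csv")]] = list(csv.reader(io.StringIO(text)))
    return sheets


book = read_csvz_like(write_csvz_like(OrderedDict([("Sheet 1", [[1, 2], [3, 4]])])))
print(book)
```

Because this sketch does no type conversion, the numbers come back as strings.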
Installation
You can install pyexcel-io via pip:
$ pip install pyexcel-io
or clone it and install it:
$ git clone https://github.com/pyexcel/pyexcel-io.git
$ cd pyexcel-io
$ python setup.py install
To support individual excel file formats, install the corresponding plugins as needed:
Package name | Supported file formats | Dependencies | Python versions |
---|---|---|---|
pyexcel-io | csv, csvz [1], tsv, tsvz [2] | | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6, pypy |
pyexcel-xls | xls, xlsx(read only), xlsm(read only) | xlrd, xlwt | same as above |
pyexcel-xlsx | xlsx | openpyxl | same as above |
pyexcel-ods3 | ods | pyexcel-ezodf, lxml | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6 |
pyexcel-ods | ods | odfpy | same as above |
Package name | Supported file formats | Dependencies | Python versions |
---|---|---|---|
pyexcel-xlsxw | xlsx(write only) | XlsxWriter | Python 2 and 3 |
pyexcel-odsr | read only for ods, fods | lxml | same as above |
pyexcel-htmlr | html(read only) | lxml,html5lib | same as above |
To manage the list of installed plugins, use pip to add or remove a plugin. When you use virtualenv, you can have different plugins per virtual environment. If you have multiple plugins that do the same thing in your environment, you need to tell pyexcel which plugin to use per function call. For example, if both pyexcel-ods and pyexcel-odsr are installed and you want get_array to use pyexcel-odsr, call get_array(..., library='pyexcel-odsr').
Footnotes
[1] | zipped csv file |
[2] | zipped tsv file |
After that, you can start getting and saving data in the loaded format. There can be two plugins for the same file format, e.g. pyexcel-ods3 and pyexcel-ods. If you want to use only one, uninstall the unwanted one with pip uninstall. If you want to keep both installed but use one of them for one function call (or file type) and the other for another function call (or file type), pass the “library” option to get_data and save_data, e.g. get_data(.., library='pyexcel-ods').
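The per-file-type choice described above could be sketched as follows. The preference table and the helper names reader_options and load are hypothetical; only get_data and its library keyword come from the text, and the named plugin must already be installed.

```python
# Hypothetical preference table: which plugin to use for which file type.
PREFERRED_PLUGIN = {"ods": "pyexcel-ods"}


def reader_options(file_name):
    """Return the extra keywords to pass to get_data for this file."""
    extension = file_name.rsplit(".", 1)[-1].lower()
    plugin = PREFERRED_PLUGIN.get(extension)
    return {"library": plugin} if plugin else {}


def load(file_name):
    """Read a file, forcing the preferred plugin when one is configured."""
    # Needs pyexcel-io plus the chosen plugin; imported lazily for that reason.
    from pyexcel_io import get_data

    return get_data(file_name, **reader_options(file_name))


print(reader_options("report.ods"))
print(reader_options("report.csv"))
```

File types without an entry in the table get no library keyword, so pyexcel-io falls back to whichever plugin it would pick on its own.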
Note
pyexcel-text is no longer a plugin of pyexcel-io but a direct plugin of pyexcel.
pyexcel-io | xls | xlsx | ods | ods3 | odsr | xlsxw |
---|---|---|---|---|---|---|
0.5.1 | 0.5.0 | 0.5.0 | 0.5.0 | 0.5.0 | 0.5.0 | 0.5.0 |
0.4.x | 0.4.x | 0.4.x | 0.4.x | 0.4.x | 0.4.x | 0.4.x |
0.3.0+ | 0.3.0+ | 0.3.0 | 0.3.0+ | 0.3.0+ | 0.3.0 | 0.3.0 |
0.2.2+ | 0.2.2+ | 0.2.2+ | 0.2.1+ | 0.2.1+ | 0.0.1 | |
0.2.0+ | 0.2.0+ | 0.2.0+ | 0.2.0 | 0.2.0 | 0.0.1 | |