Welcome
This book contains a short introduction to pandas, a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool.
pandas offers data structures and operations for manipulating tables and time series. Its central structure is the DataFrame, which is similar to an in-memory spreadsheet. Like a spreadsheet:
A DataFrame stores data in cells.
A DataFrame has named columns (usually) and numbered rows.
Note
A DataFrame is a 2-dimensional data structure that can store data of different types (including characters, integers, floating-point values, categorical data and more) in columns. It is similar to a spreadsheet, a SQL table, or the data.frame in R.
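As a minimal sketch of the description above, the following builds a small DataFrame whose columns hold different types. The column names and values are made up for illustration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical sample data: names and values are illustrative only.
df = pd.DataFrame(
    {
        "name": ["Alice", "Bob", "Carol"],          # text
        "age": [34, 28, 45],                        # integers
        "height_m": [1.65, 1.80, 1.72],             # floating-point values
        "group": pd.Categorical(["a", "b", "a"]),   # categorical data
    }
)

# Columns are named; rows get a default numbered index starting at 0.
print(df)

# Each column carries its own data type.
print(df.dtypes)
```

Printing `df.dtypes` shows one type per column, which is the sense in which a DataFrame resembles a SQL table more than a plain grid of cells.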
To learn more about pandas, visit the getting started tutorials to see:
What kind of data does pandas handle?
How to create new columns derived from existing columns?
How to reshape the layout of tables?
How to combine data from multiple tables?
How to handle time series data with ease?
How to manipulate textual data?
Furthermore, you may want to review Python for Data Analysis, 3rd edition.
The Pandas Tutor tool lets you write pandas code in your browser and see how it transforms your data step by step: