reading-notes

pandas:

Import: import numpy as np, import pandas as pd

Object creation:

- Creating a Series by passing a list of values.

- Creating a DataFrame by passing a dict of objects that can be converted to a series-like structure.

- Creating a DataFrame by passing a NumPy array.

- EX: dates = pd.date_range('20130101', periods=6)

  • OUTPUT: DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04', '2013-01-05', '2013-01-06'], dtype='datetime64[ns]', freq='D')
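
The three creation styles listed above can be sketched as follows. This is a minimal illustration, not the only way to build these objects; the column labels, the seeded random generator, and the sample values are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# A Series built from a list; np.nan marks a missing value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# A DataFrame built from a NumPy array, indexed by the dates from the EX line.
dates = pd.date_range("20130101", periods=6)
rng = np.random.default_rng(0)  # seeded generator (illustrative choice)
df = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD"))

# A DataFrame built from a dict; scalar values are repeated for every row.
df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)

print(df2.dtypes)  # each column keeps its own dtype
```

Note that the columns of df2 have different dtypes, which is one of the main differences between a DataFrame and a plain NumPy array.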

Features of Pandas:

- Fast and efficient DataFrame object with default and customized indexing.

- Tools for loading data into in-memory data objects from different file formats.

- Data alignment and integrated handling of missing data.

- Label-based slicing, indexing and subsetting of large data sets.

- High performance merging and joining of data.

- Time Series functionality.
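
Two of the features above, label-based slicing and time series functionality, can be sketched briefly. The frame below and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small daily time series; the values are illustrative only.
dates = pd.date_range("2013-01-01", periods=6)
df = pd.DataFrame({"A": np.arange(6), "B": np.arange(6) * 10}, index=dates)

# Label-based slicing with .loc: both endpoints of the slice are included.
subset = df.loc["2013-01-02":"2013-01-04", ["A"]]

# Time series functionality: downsample the daily rows into 2-day sums.
two_day = df["A"].resample("2D").sum()

print(subset)
print(two_day)
```

Unlike positional slicing, a label slice with .loc includes its end label, which is why the subset here has three rows.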

Handling missing data:

Real-world data is rarely complete. One of the most common problems is missing values, and Pandas provides built-in tools to detect, drop, or fill them.
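
A minimal sketch of the usual ways to deal with missing values, using a small made-up frame. Filling with 0 is an assumption for the example; the right fill value depends on the data.

```python
import numpy as np
import pandas as pd

# Illustrative frame with two missing entries.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

mask = df.isna()       # boolean mask, True where a value is missing
dropped = df.dropna()  # drop every row that has at least one missing value
filled = df.fillna(0)  # replace missing values with 0

print(mask)
print(dropped)
print(filled)
```

None of these calls modify df itself; each returns a new object.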

Cleaning up data

Raw data is often messy, and analysis performed on uncleaned data can produce badly wrong results. Cleaning the data first is therefore essential, and Pandas makes this easy.
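
What "cleaning" means depends on the dataset. As one possible sketch, the example below trims stray whitespace, converts a text column to numbers, and removes duplicate rows. The frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical messy input: padded names, numbers stored as text, a duplicate row.
raw = pd.DataFrame(
    {
        "name": [" alice", "bob ", "bob ", "carol"],
        "age": ["30", "25", "25", "unknown"],
    }
)

clean = raw.copy()
clean["name"] = clean["name"].str.strip()                    # trim whitespace
clean["age"] = pd.to_numeric(clean["age"], errors="coerce")  # unparseable text becomes NaN
clean = clean.drop_duplicates().reset_index(drop=True)       # remove repeated rows

print(clean)
```

Using errors="coerce" turns values that cannot be parsed into missing values, which can then be handled with the missing-data tools described earlier.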

Input and output tools

Pandas provides a wide array of built-in tools for reading and writing data. During analysis you will need to move data between in-memory data structures and external sources such as files, web services, and databases.
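
A small round trip through CSV shows the read and write tools in action. To keep the sketch self-contained it writes to an in-memory buffer rather than a file on disk; the frame contents are illustrative.

```python
import io

import pandas as pd

# Illustrative data to write out and read back.
df = pd.DataFrame({"item": ["a", "b"], "qty": [1, 2]})

buffer = io.StringIO()          # in-memory stand-in for a file
df.to_csv(buffer, index=False)  # write without the row index
buffer.seek(0)                  # rewind before reading
restored = pd.read_csv(buffer)  # read the CSV text back into a DataFrame

print(restored)
```

The same to_csv and read_csv calls accept a file path in place of the buffer.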

Merging and joining of datasets

Analysis often requires merging and joining multiple datasets into a single final dataset before it can be analyzed properly.
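
A minimal sketch of combining datasets, assuming two small frames that share a column named key (the names and values are made up for the example).

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "rval": [4, 5, 6]})

# Inner join (the default): keep only keys present in both frames.
inner = pd.merge(left, right, on="key")

# Outer join: keep every key; values missing on one side become NaN.
outer = pd.merge(left, right, on="key", how="outer")

# Concatenation: stack frames on top of each other.
stacked = pd.concat([left, left], ignore_index=True)

print(inner)
print(outer)
```

The how argument controls which keys survive the join, much like the join types in SQL.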

Python support

This feature of Pandas is the deal closer. With a huge number of helpful libraries at your disposal, Python has become one of the most sought-after programming languages for data analysis. Because Pandas is part of the Python ecosystem, it lets us work alongside other libraries such as NumPy and Matplotlib.
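
As a small illustration of this interoperability, NumPy functions can be applied directly to Pandas objects, and a DataFrame can be converted to a plain NumPy array. The frame below is made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, 4.0, 9.0], "y": [16.0, 25.0, 36.0]})

# A NumPy function applied to a DataFrame returns a DataFrame with the same labels.
roots = np.sqrt(df)

# Convert to a plain NumPy array when another library expects one.
arr = df.to_numpy()

print(roots)
print(arr.shape)
```

Plotting works in a similar spirit: DataFrame.plot uses Matplotlib by default, so it needs Matplotlib installed and is left out of this sketch.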

Done by Omar-zoubi