Pandas - DataFrame¶
A DataFrame is:
- two-dimensional, labeled data
- able to hold different data types in different columns
- easy to manipulate, e.g. reshaping, slicing, grouping
Used for:
- data wrangling
- visualization
- data processing
- exploratory data analysis
- creating models
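As a rough illustration of the "easy to manipulate" point above, the sketch below shows one slicing, one grouping, and one reshaping call. The `sales` table and its column names are hypothetical sample data, not part of this notebook.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sales = pd.DataFrame({
    "country": ["uk", "uk", "us", "us"],
    "year": [2020, 2021, 2020, 2021],
    "revenue": [10, 12, 20, 25],
})

# Slicing: rows labeled 0 to 1 (inclusive with .loc) and two columns
subset = sales.loc[0:1, ["country", "revenue"]]

# Grouping: total revenue per country
totals = sales.groupby("country")["revenue"].sum()

# Reshaping: one row per country, one column per year
wide = sales.pivot(index="country", columns="year", values="revenue")

print(subset)
print(totals)
print(wide)
```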
Create a DataFrame from a dict¶
In [2]:
import pandas as pd
data = {
    'age': [22, 55, 43],
    'names': ['A', 'B', 'C'],
    'country': ['uk', 'us', 'de'],
}
df = pd.DataFrame(data)
print(df)
Stats about DataFrame¶
In [3]:
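print(df.describe())  # summary statistics for the numeric columns
df.info()             # prints its report itself and returns None, so no print() is needed
print(df.shape)       # (number of rows, number of columns)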
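By default `describe()` only summarizes numeric columns, so text columns such as the names and countries are left out. A minimal sketch of how to include them, using a separate variable `people` (an assumed name, chosen so the notebook's `df` is not overwritten):

```python
import pandas as pd

# Same sample data as above, under a different (assumed) variable name
people = pd.DataFrame({
    "age": [22, 55, 43],
    "names": ["A", "B", "C"],
    "country": ["uk", "us", "de"],
})

numeric_summary = people.describe()            # numeric columns only: 'age'
full_summary = people.describe(include="all")  # also count/unique/top/freq for text columns

print(people.dtypes)  # data type of each column
print(numeric_summary)
print(full_summary)
```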
Use specific columns¶
In [4]:
# keep only the listed columns (this overwrites the earlier df)
df = pd.DataFrame(data, columns=['age', 'country'])
print(df)
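The cell above picks columns while building the DataFrame from the dict. If the DataFrame already exists, columns can also be selected from it directly. A small sketch, assuming a separate variable `people` that holds the same sample data:

```python
import pandas as pd

people = pd.DataFrame({
    "age": [22, 55, 43],
    "names": ["A", "B", "C"],
    "country": ["uk", "us", "de"],
})

# A list of column names returns a new DataFrame with just those columns
age_country = people[["age", "country"]]

# A single column name returns a Series
ages = people["age"]

# Alternatively, drop the columns that are not wanted
without_names = people.drop(columns=["names"])

print(age_country)
print(type(ages))
print(without_names)
```

None of these calls modify `people`; each returns a new object.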
Filter Rows¶
In [5]:
# the boolean mask keeps only the rows where age is below 50
dfFiltered = df[df['age'] < 50]
print(dfFiltered)
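The same boolean-mask idea extends to several conditions. The sketch below shows three common variants; the variable `people` and the chosen conditions are assumptions made for illustration.

```python
import pandas as pd

people = pd.DataFrame({
    "age": [22, 55, 43],
    "country": ["uk", "us", "de"],
})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
under_50_not_uk = people[(people["age"] < 50) & (people["country"] != "uk")]

# Keep rows whose value is in a given list
in_uk_or_us = people[people["country"].isin(["uk", "us"])]

# The same kind of filter written as a query string
over_40 = people.query("age > 40")

print(under_50_not_uk)
print(in_uk_or_us)
print(over_40)
```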
NumPy array <-> pandas DataFrame¶
To convert a pandas DataFrame to a NumPy array, you can use:
- np.array(yourDataFrameVariable)
- yourDataFrameVariable.values
In [6]:
import numpy as np
print(type(dfFiltered))
print(dfFiltered)
npArray = np.array(dfFiltered)
print(type(npArray))
print(npArray)
# back to a DataFrame; the original column names and index are not carried over
dfAgain = pd.DataFrame(npArray)
print(type(dfAgain))
print(dfAgain)
In [7]:
npArray = dfFiltered.values
print(type(npArray))
print(npArray)
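Besides the two options shown above, pandas also provides `DataFrame.to_numpy()`, which the pandas documentation recommends over `.values`. The sketch below uses it and also passes the labels back in when rebuilding the DataFrame, so the column names and index survive the round trip. The variable `people` and its contents are assumed sample data that mimic the filtered rows.

```python
import pandas as pd

# Assumed sample data resembling the filtered rows (index 0 and 2)
people = pd.DataFrame(
    {"age": [22, 43], "country": ["uk", "de"]},
    index=[0, 2],
)

# Mixed column types usually produce an array of generic Python objects
arr = people.to_numpy()
print(arr.dtype)

# Rebuild the DataFrame, reusing the original labels
restored = pd.DataFrame(arr, columns=people.columns, index=people.index)

# Values come back as generic objects; let pandas infer better column types
restored = restored.infer_objects()
print(restored.dtypes)

# Converting a single numeric column keeps its numeric dtype
ages = people["age"].to_numpy()
print(ages.dtype)
```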