pandas series
A Pandas Series is a one-dimensional labeled array that can hold data of any type. It is comparable to a single column of a table.
pd series
pandas.Series(data, index, dtype, copy)
Parameter | Description |
---|---|
data | Array-like, dict, or scalar value. Contains the data stored in the Series |
index | Index labels. Values must be hashable and have the same length as data. Duplicate labels are allowed |
dtype | Data type of the Series. If None, the data type is inferred |
copy | Copy the input data. Default False |
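As a minimal sketch of how the dtype and index parameters behave, the snippet below requests a float type explicitly and then passes an index whose length does not match the data. The variable names asFloat and lengthError are only illustrative.

```python
import pandas as pd

# Ask for float64 explicitly instead of letting pandas infer an integer type.
asFloat = pd.Series([10, 20, 30], index=["A", "B", "C"], dtype="float64")
print(asFloat.dtype)  # float64

# For list-like data, the index must supply exactly one label per value.
try:
    pd.Series([10, 20, 30], index=["A", "B"])
    lengthError = None
except ValueError as err:
    lengthError = err
    print("ValueError:", err)
```

The second call fails because three values cannot be matched to two labels.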
The following example imports the Pandas module and then calls the Series() class constructor to create an empty series.
    import pandas as pd

    # Create an empty series
    mySeries = pd.Series()
You can also create a series from a NumPy array by passing the array to the Series() class constructor.
    import pandas as pd
    import numpy as np

    myArray = np.array([10, 20, 30, 40, 50])
    mySeries = pd.Series(myArray)
    print(mySeries)

Output

    0    10
    1    20
    2    30
    3    40
    4    50
    dtype: int64
In the above output, you can see that the index of a series starts from 0 by default. You can also define custom index labels for your series. To do so, pass your list of labels to the index parameter of the Series() constructor.
    import pandas as pd
    import numpy as np

    myArray = np.array([10, 20, 30, 40, 50])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(mySeries)

Output

    A    10
    B    20
    C    30
    D    40
    E    50
    dtype: int64
We passed the index values here. Now we can see the customized indexed values in the output.
Create a Series from dict
- You can also create a series using a dictionary.
- If index is passed, the values in data corresponding to the labels in the index are pulled out. Labels that are not keys of the dictionary get NaN.
- If no index is specified, the dictionary keys will become series indexes while the dictionary values are inserted as series items.
    import pandas as pd

    myDict = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}
    mySeries = pd.Series(myDict)
    print(mySeries)

Output

    A    1
    B    2
    C    3
    D    4
    E    5
    dtype: int64
    import pandas as pd

    myDict = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}
    mySeries = pd.Series(myDict, index=['B', 'G', 'D', 'E', 'Z'])
    print(mySeries)

Output

    B    2.0
    G    NaN
    D    4.0
    E    5.0
    Z    NaN
    dtype: float64
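The result is float64 rather than int64 because NaN is a floating-point value, so the integers are converted to make room for it. The sketch below shows one way to find the labels that had no match and to drop them again. The names missing and cleaned are illustrative, and converting back to int64 is an assumption about what you want afterwards.

```python
import pandas as pd

myDict = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}
mySeries = pd.Series(myDict, index=['B', 'G', 'D', 'E', 'Z'])

# Labels that were requested but are not keys of the dictionary.
missing = mySeries[mySeries.isna()].index.tolist()
print(missing)  # ['G', 'Z']

# Drop the missing entries and restore an integer dtype.
cleaned = mySeries.dropna().astype("int64")
print(cleaned)
```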
Create a Series from Scalar
- If data is a scalar value, provide an index. The value is repeated to match the length of the index.
    import pandas as pd

    mySeries = pd.Series(44, index=["A", "B", "C", "D", "E"])
    print(mySeries)

Output

    A    44
    B    44
    C    44
    D    44
    E    44
    dtype: int64
Accessing Items
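You can access series items by integer position with .iloc or by index label with square brackets. Passing an integer position directly in square brackets to a series with string labels is deprecated in recent Pandas versions, so .iloc is used for positions here.

    import pandas as pd
    import numpy as np

    myArray = np.array([10, 20, 30, 40, 50])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(mySeries.iloc[0])  # first item, by position
    print(mySeries['D'])     # item with label 'D'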
Output

    10
    40
Get the first two elements in the Series.
    import pandas as pd
    import numpy as np

    myArray = np.array([10, 20, 30, 40, 50])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(mySeries[:2])

Output

    A    10
    B    20
    dtype: int64
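Slicing can be made explicit with the .iloc and .loc indexers. The following is a small sketch of the difference between the two. The variable names firstTwo, aToC, and picked are illustrative.

```python
import pandas as pd

mySeries = pd.Series([10, 20, 30, 40, 50], index=["A", "B", "C", "D", "E"])

# .iloc selects by integer position; the stop position is excluded.
firstTwo = mySeries.iloc[:2]
print(firstTwo.tolist())  # [10, 20]

# .loc selects by label; both endpoints of a label slice are included.
aToC = mySeries.loc["A":"C"]
print(aToC.tolist())  # [10, 20, 30]

# A list of labels selects several items in the order given.
picked = mySeries.loc[["E", "A"]]
print(picked.tolist())  # [50, 10]
```

Note that a label slice includes its end label, unlike a positional slice.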
Finding Maximum and Minimum Values
We can find the minimum and maximum values of a series using the min() and max() functions from the NumPy module.
    import pandas as pd
    import numpy as np

    myArray = np.array([10, 20, 30, 40, 50])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(np.min(mySeries))
    print(np.max(mySeries))

Output

    10
    50
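A series also has its own min() and max() methods, so NumPy is not strictly required for this. As a short sketch, the example below uses them together with idxmin() and idxmax(), which return the label of the smallest and largest value.

```python
import pandas as pd

mySeries = pd.Series([10, 20, 30, 40, 50], index=["A", "B", "C", "D", "E"])

# Smallest and largest values.
print(mySeries.min())  # 10
print(mySeries.max())  # 50

# Labels at which those values occur.
print(mySeries.idxmin())  # A
print(mySeries.idxmax())  # E
```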
Finding Mean and Median
The mean() function from the NumPy module can help us find the mean of a given Pandas series.
    import pandas as pd
    import numpy as np

    myArray = np.array([5, 3, 7, 11, 15])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(np.mean(mySeries))

Output

    8.2
The median() function from the NumPy module can help us find the median of a given Pandas series.
    import pandas as pd
    import numpy as np

    myArray = np.array([5, 3, 7, 11, 15])
    mySeries = pd.Series(myArray, index=["A", "B", "C", "D", "E"])
    print(np.median(mySeries))

Output

    7.0
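The same statistics are available as methods on the series itself. The sketch below uses them and also shows how they treat missing values. The withGap series is made-up illustration data and is not part of the examples above.

```python
import pandas as pd

mySeries = pd.Series([5, 3, 7, 11, 15], index=["A", "B", "C", "D", "E"])
print(mySeries.mean())    # 8.2
print(mySeries.median())  # 7.0

# Series methods skip NaN by default; pass skipna=False to propagate it.
withGap = pd.Series([5.0, 3.0, float("nan"), 11.0])
print(withGap.mean())              # mean of 5.0, 3.0 and 11.0
print(withGap.mean(skipna=False))  # nan
```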