Lo-Fi Python

Feb 09, 2022

How to Convert a Python Dictionary to and from a pandas DataFrame

This is an example of how to cast a Python dict into a dataframe and vice versa. I picked up the df to dict part from this Python and R tips post and the dict to df part from a Stack Overflow post. The below adaptation begins by converting an "NFL quarterbacks" Python dictionary into a dataframe and then back into a dict.

Sometimes a dictionary is adequate to solve a problem with handy methods like get() and items(). You can also do a ton with a dict comprehension. When more complex tabular data operations are needed, the pandas pd.DataFrame class is well equipped for the job. Dictionaries and dataframes are delightfully interoperable, like Tom Brady and any football team on the planet.
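As a quick illustration of the dict-only approach, here is a minimal sketch using get(), items() and a dict comprehension. The variable names teams_to_qbs and la_qbs are made up for this example and do not appear elsewhere in the post.

```python
qbs_dict = {
    "Matthew Stafford": "Los Angeles Rams",
    "Joe Burrow": "Cincinnati Bengals",
    "Tom Brady": "Tampa Bay Buccaneers",
}

# get() returns a fallback value instead of raising KeyError for a missing key.
missing = qbs_dict.get("Tony Romo", "Name not found.")
print(missing)

# items() yields (key, value) pairs; a dict comprehension can flip them.
# teams_to_qbs is a hypothetical name used only for this sketch.
teams_to_qbs = {team: name for name, team in qbs_dict.items()}
print(teams_to_qbs["Cincinnati Bengals"])

# A comprehension can also filter while it builds the new dict.
la_qbs = {name: team for name, team in qbs_dict.items() if team.startswith("Los Angeles")}
print(la_qbs)
```

This works well for simple lookups and reshaping. Once you need grouping, joins or column-wise operations, a dataframe is usually the better fit.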

import pprint
import pandas as pd

qbs_dict = {
    "Matthew Stafford":"Los Angeles Rams",
    "Joe Burrow":"Cincinnati Bengals",
    "Tom Brady":"Tampa Bay Buccaneers",
    "Pat Mahomes":"Kansas City Chiefs",
    "Tony Romo":"Dallas Cowboys"
}
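# Build a two-column dataframe from the dict's (key, value) pairs.
qbs_df = pd.DataFrame(qbs_dict.items(), columns=["name", "team"])
# info() prints its own summary and returns None, so print() also emits "None".
print(qbs_df.info())
# Convert back to a dict: a Series indexed by name maps each name to a team.
qbs_dict = pd.Series(qbs_df.team.values, index=qbs_df.name.values).to_dict()
pprint.pprint(qbs_dict, sort_dicts=True)
# get() returns the fallback string when the key is missing.
print(qbs_dict.get("Tom Brady", "Name not found."))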

Terminal Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   name    5 non-null      object
 1   team    5 non-null      object
dtypes: object(2)
memory usage: 208.0+ bytes
None

{'Joe Burrow': 'Cincinnati Bengals',
 'Matthew Stafford': 'Los Angeles Rams',
 'Pat Mahomes': 'Kansas City Chiefs',
 'Tom Brady': 'Tampa Bay Buccaneers',
 'Tony Romo': 'Dallas Cowboys'}

Tampa Bay Buccaneers

Did you notice that pprint sorts dicts by default?

Here the printed dict is sorted alphabetically by the QBs' names. Per the pprint docs, you can turn this off with the sort_dicts keyword argument, which is new in Python 3.8:

pprint.pprint(qbs_dict, sort_dicts=False)
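The dataframe-to-dict step can be written a few other ways. The sketch below shows three options I'm confident exist in pandas and the standard library; it is not necessarily how the posts referenced above do it, and the names by_name, zipped and records are just for illustration.

```python
import pandas as pd

qbs_dict = {
    "Matthew Stafford": "Los Angeles Rams",
    "Joe Burrow": "Cincinnati Bengals",
    "Tom Brady": "Tampa Bay Buccaneers",
}
# list() makes the (name, team) pairs an explicit list of tuples.
qbs_df = pd.DataFrame(list(qbs_dict.items()), columns=["name", "team"])

# Option 1: index by name, select the team column, then Series.to_dict().
by_name = qbs_df.set_index("name")["team"].to_dict()

# Option 2: zip two columns together without building an intermediate Series.
zipped = dict(zip(qbs_df["name"], qbs_df["team"]))

# Option 3: DataFrame.to_dict(orient="records") gives one dict per row.
records = qbs_df.to_dict(orient="records")

print(by_name == qbs_dict)
print(zipped == qbs_dict)
print(records[0])
```

The first two options rebuild the original name-to-team mapping. The "records" orientation is handy when each row should become its own dict, for example before serializing to JSON.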

pandas Documentation

pandas installation documentation

pandas.DataFrame

pandas.Series

pandas.DataFrame.to_dict

pandas.DataFrame.info

Python Standard Library Documentation

pprint.pprint

dict

dict.get

Dec 31, 2021

Phone Number Cleaning Regex + pandas Series Example

This is a solution I worked out recently to strip phone numbers into a uniform format. To install pandas with pip, enter in command prompt:

python -m pip install pandas

The pandas string methods accept regular expressions and it's pretty neat! Behold the power of pandas and a regular expression to do trivial telephone tidying:

strip phone formatting with Python
import pandas as pd
s = pd.Series(data=["(010) 001-1010"], name="Phone", dtype="str")
# remove parentheses, hyphens and spaces with pandas + regex
# (raw string so the backslashes reach the regex engine unchanged)
s = s.str.replace(pat=r"\(|\)|-| ", repl="", regex=True)
print(s)
# resulting number: "0100011010"

Regex is cool.

Grasping the intricacies of what this code is doing feels elegant when you connect the dots... or pipes. The replace is done via the pandas str accessor. In the pat string, each parenthesis is escaped with a backslash, and the alternatives are separated by pipes "|". The pipes act as an "or" operator, so the pattern matches any one of the four characters (open parenthesis, close parenthesis, hyphen or space), and each match is replaced with an empty string. Pretty nifty. If you read the pandas docs, you'll find regex is accessible in different parts of the API. Dive in, it's some of my favorite documentation to snoop. There is so much you can do with pandas. This example shows how one str.replace call can handle the whole cleanup.
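The same cleanup can also be expressed with a character class or with \D, which matches any non-digit. This is a sketch of two alternatives, not a replacement for the pattern above. The second and third phone numbers are made-up sample formats added for illustration.

```python
import pandas as pd

# The first value is from the example above; the other two are hypothetical formats.
phones = pd.Series(
    ["(010) 001-1010", "010-001-1010", "010 001 1010"],
    name="Phone",
    dtype="str",
)

# Character class: match any single "(", ")", "-" or space.
cleaned = phones.str.replace(r"[()\- ]", "", regex=True)

# \D matches any character that is not a digit, so only digits remain.
digits_only = phones.str.replace(r"\D", "", regex=True)

print(cleaned.tolist())
print(digits_only.tolist())
```

The character class removes only the characters you list, while \D strips everything that is not a digit, including dots or letters you did not anticipate. Which one is right depends on how strict you want the cleaning to be.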

Further Reading:

pandas.Series documentation

pandas str.replace documentation

Source of the famous “Now you have two problems” quote