Lo-Fi Python

Nov 30, 2019

Inserting New Rows Into A Microsoft Access Database With Python and pyodbc

I recently automated the loading of data into a Microsoft Access database with pyodbc, a Python library for connecting to databases. ODBC stands for Open Database Connectivity. It can be used for a variety of Database Management Systems outside of Access also.

First, install libraries with pip. Enter in terminal or command prompt:

python -m pip install pyodbc
python -m pip install pandas

Next, check available Microsoft Access drivers on your computer. Enter the below statements into the Python interpreter:

1
2
3
python
>>> import pyodbc
>>> [x for x in pyodbc.drivers() if x.startswith('Microsoft Access Driver')]

Drivers for Access & Many Other Data Sources

The driver is the engine that allows you to connect to a specific type of database. The drivers available vary depending on your machine.

The two most common drivers for Access are Microsoft Access Driver (.mdb) and Microsoft Access Driver (.mdb, *.accdb). My computer only had *.mdb, which has been deprecated. My Access database was a .mdb file, so I was able to use this driver as shown below. Read more on Access drivers here.

"Drivers exist for all major DBMSs, many other data sources like address book systems and Microsoft Excel, and even for text or comma-separated values (CSV) files."  - Wikipedia

Database Data Types

I set all of the field data types to "Short Text" because I'm passing strings as SQL parameters below. Uploading as other data types may require additional formatting. To edit the data types of your table, open the table and select "Design View" under the "Home" tab. It got the job done for me!

Inserting new rows into a Microsoft Access Database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
import pandas as pd
import pyodbc


def df_to_row_tuples(df):
    """Use list comprehension to format df rows as a list of tuples:
    rows = [('Garfield','Orange','Eat'),('Meowth','White','Scratch')]
    """
    df = df.fillna('')
    rows = [tuple(cell) for cell in df.values]
    return rows

"""Rows are not added to DB until they are committed.
Pass each row tuple as a SQL parameter (?,?,?).
cursor.execute docs: https://www.mcobject.com/docs/Content/Programming/Python/Classes/Cursor/execute.htm
"""
conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb)};DBQ=C:\Add_Path\To_DB\Here\Your_DB.mdb;')
cursor = conn.cursor()
sql = ''' INSERT INTO Cats (Name, Color, Move)
          VALUES(?,?,?) '''
df = pd.read_csv('Cat Data.csv')
rows = df_to_row_tuples(df)
for row in rows:
    cursor.execute(sql, row)
conn.commit()

Conclusion

Running the above in command prompt uses pyodbc and SQL to add dataframe rows to a Microsoft Access DB table named "Cats". Passing each row as a SQL parameter has two benefits:

  1. It handles strings with single quotes (') and loads them to the DB.
  2. It protects against SQL injection attacks.

Access Limitation Disclaimer

Access topped out just shy of 10 million rows in my use case, when records stopped getting added to my database. So keep that in mind if you're thinking about using Access to store your data.

Supplementary Resources

Insert Values into MS Access Table using Python

pyodbc documentation

Microsoft Documentation pyodbc example

The Python Cursor Class

Psycopg Cursor Class Documentation