Lo-Fi Python

Dec 19, 2021

Memory Monitoring Python Libraries + Tools

If you write Python code, you've probably seen the dreaded "MemoryError" a time or two. Python raises it when an operation can't get the memory it asks for, typically because your computer has no spare RAM left, and your script stops. I recently experienced this frustration whilst trying to write hundreds of thousands of csv files. This time, however, I reached for tools that support smarter memory management. Now I can watch my computer's memory bounce around with the Windows Resource Monitor. Python has quite a few profiling libraries for monitoring memory too!

Python Libraries and Guides

Memory Management Overview, Python documentation

Memory Profiler: "monitor memory usage of Python code"

psutil: "Cross-platform lib for process and system monitoring in Python"

py-spy: "Sampling profiler for Python programs"

pyinstrument: "🚴 Call stack profiler for Python. Shows you why your code is slow!"

Scalene: "a high-performance, high-precision CPU, GPU, and memory profiler for Python"

Glances: "Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems."

Yappi: "Yet Another Python Profiler, but this time thread & coroutine & greenlet aware."

Fil: "A Python memory profiler for data processing and scientific computing applications" (Video)

line_profiler: "Line-by-line profiling for Python"

pprofile: "Line-granularity, thread-aware deterministic and statistic pure-python profiler"

Guppy 3: "Python programming environment and heap analysis toolset"

See also: The Python Profilers, Python documentation

CPython standard distribution comes with three deterministic profilers. cProfile, Profile and hotshot. cProfile is implemented as a C module based on lsprof, Profile is in pure Python and hotshot can be seen as a small subset of a cProfile.

Yappi Github, https://github.com/sumerc/yappi
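Worth noting if you're on Python 3: hotshot was removed, and the pure-Python profiler lives in the lowercase `profile` module. Here's a minimal sketch of trying out cProfile from the standard library. The `build_rows` workload and the `profile_call` helper are hypothetical names I made up for illustration, and cProfile measures time and call counts rather than memory.

```python
import cProfile
import io
import pstats


def build_rows(n):
    """Hypothetical workload: build n csv-like rows in memory."""
    return [f"{i},{i * 2}\n" for i in range(n)]


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Send the stats report to a string buffer instead of stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # show at most 5 entries
    return result, buffer.getvalue()


rows, report = profile_call(build_rows, 100_000)
print(report)
```

The report lists each function with its call count and cumulative time, which is a quick way to spot the slow spots before reaching for one of the heavier tools above.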

Windows Tools

Task Manager: Windows process management tool with some memory analytics

Collect Data in Windows with Performance Monitor

Resource Monitor: Windows tool with Memory, CPU, Disk and Network monitoring tabs

Resource Monitor can stop running processes and shows in-use, standby (cached) and free memory. Here it shows 7 Python scripts running with 49% of total memory consumed. Looks like we are running steady and safely below the point where a "MemoryError" would hit. We might be able to add a few more scripts with 51% of RAM available!
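If you'd rather check those numbers from inside a script, psutil exposes similar figures. This is a rough sketch, assuming psutil is installed (`pip install psutil`). The `memory_snapshot` and `has_headroom` helpers and the 80% threshold are my own illustrative choices, not anything psutil prescribes.

```python
import os

import psutil


def memory_snapshot():
    """Return system-wide and current-process memory figures."""
    vm = psutil.virtual_memory()
    proc = psutil.Process(os.getpid())
    return {
        "system_percent_used": vm.percent,               # e.g. 49.0
        "system_available_gb": vm.available / 1024**3,   # RAM still available
        "process_rss_mb": proc.memory_info().rss / 1024**2,  # this script's resident memory
    }


def has_headroom(max_percent_used=80.0):
    """True if system memory use is under the (arbitrary) threshold."""
    return psutil.virtual_memory().percent < max_percent_used


snapshot = memory_snapshot()
print(snapshot)
print("Safe to launch another script?", has_headroom())
```

A check like `has_headroom()` before kicking off another script is a crude guard, but it beats finding out the hard way.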


Memory Tips and Guides

  • Use only the data you need. Any data you read in and aren't using is held in memory. The usecols argument in pandas is a great way to read a csv and only use the columns you need.
  • Reading data in chunks with the chunksize argument is another way to reduce memory usage for large datasets.
  • Measuring the memory usage of a Pandas dataframe
  • Some tools are line-oriented, others are function-oriented. If your code contains large functions, you might favor a line-based profiling tool.
  • Be aware of the overhead some memory tools may incur. memory_profiler was clocked with a whopping 270x slowdown per the Scalene PyCon talk below. The talk shows an awesome comparison of these Python profiling libraries:
Scalene PyCon US 2021 Talk
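The pandas tips above can be sketched in a few lines. This is a toy example, assuming pandas is installed. The in-memory csv and its column names (`id`, `name`, `price`, `notes`) are made up for illustration, and a real file path would take the place of the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical csv data, built in memory so the example is self-contained.
csv_text = "id,name,price,notes\n" + "".join(
    f"{i},item{i},{i * 1.5},unused text\n" for i in range(1000)
)

# Tip 1: read only the columns you need with usecols.
full = pd.read_csv(io.StringIO(csv_text))
slim = pd.read_csv(io.StringIO(csv_text), usecols=["id", "price"])

# Tip 3: measure the memory each DataFrame uses, in bytes.
full_bytes = full.memory_usage(deep=True).sum()
slim_bytes = slim.memory_usage(deep=True).sum()
print(f"all columns: {full_bytes} bytes, two columns: {slim_bytes} bytes")

# Tip 2: read in chunks so only 250 rows are held at a time.
total = 0.0
rows_seen = 0
for chunk in pd.read_csv(io.StringIO(csv_text), usecols=["price"], chunksize=250):
    total += chunk["price"].sum()
    rows_seen += len(chunk)

print(f"summed {rows_seen} rows in chunks, total price = {total}")
```

Passing `deep=True` asks pandas to count the contents of string columns too, so the comparison reflects what the text columns really cost.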


Conclusion

When you'll see "MemoryError" depends on your computer's hardware, the size of your dataset and what operations your script needs to perform. Generally speaking, I/O operations like file reads and writes are among the more expensive ones.

The tools in this post will help you anticipate how much memory you have available, monitor your consumption more closely and avoid pushing your computer past its limits. You can reduce your memory consumption by reading data in chunks and only using the columns you need. Putting these tools and strategies to work can make getting things done with Python a smoother ride.