Lo-Fi Python

Oct 13, 2023

How I Sped Up My Python CLI By 25%

I recently noticed that the Yahoo Finance stock summary command line interface (CLI) I made seemed to be slowing down. Seeking to understand what was happening in my code, I remembered Python has multiple profilers available like Scalene, line_profiler, cProfile and pyinstrument. In this case, I was running my code on Python version 3.11.

First, I tried cProfile from the Python standard library. It is nice to have without any install required! However, I found its output to be tough to interpret. I also remembered I liked a talk I saw about Scalene, which gave a thorough overview of several Python profilers and how they're different. So next, I tried Scalene. Finally, I found pyinstrument and can safely say it is now my favorite Python profiler. This post will focus on how I used pyinstrument to make my command line tool faster.
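For reference, here is a minimal sketch of running cProfile from inside a script and sorting the results by cumulative time with pstats, which can make the output a little easier to scan. The slow_sum function is a made-up stand-in for real work and is not part of my CLI.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical stand-in for the work you want to profile."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and show only the 5 most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Limiting the report to a handful of entries helps, but I still found the flat table harder to read than a tree.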

Install pyinstrument with pip

pip install pyinstrument

I preferred the format in which pyinstrument presented the modules, functions and time they consumed in a tree structure. Scalene's percentage-based diagnosis was useful also. Scalene showed the specific lines where code was bottlenecked, whereas pyinstrument showed the time spent in each module and function. I liked that I could see time of specific functions from the external modules I was using with pyinstrument. For example, the beautiful soup and rich modules both consumed shockingly little time. However, the pandas module took a whole second.

Just importing the pandas module and doing nothing else was taking up to and sometimes over a second each time my CLI ran. On a script that takes about four seconds to execute, one second is 25% of the total run time! Once I realized this, I decided to only import the pandas module if my CLI's --csv argument was given. I was only using pandas to sort stocks and write a CSV. It wasn't critical functionality for my CLI.
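A quick way to double-check an import's cost outside of a profiler is to time it directly. The sketch below is only an illustration: time_import is a hypothetical helper, and it times the standard library csv module so it runs anywhere. Swap in "pandas" to see the difference on your own machine. Python also has a -X importtime flag (for example, python -X importtime finsou.py -s GOOG) that prints a per-module import breakdown.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return (seconds, was_cached) for importing module_name (hypothetical helper).

    Only the first import in a process is slow. Later imports hit the
    sys.modules cache, so run this in a fresh interpreter for a fair number.
    """
    already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, already_loaded


elapsed, cached = time_import("csv")
print(f"csv import: {elapsed:.4f}s (already cached: {cached})")
```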

My CLI script accepts a stock ticker as an argument. The below command fetches a stock report from Yahoo Finance and prints to the terminal. Swapping out "python" for pyinstrument runs the script and prints a pyinstrument report to your console.

Fetch a stock report from Yahoo.

pyinstrument finsou.py -s GOOG

pyinstrument Results With Normal Pandas Import

GOOG, Google

profiling a Python script with pyinstrument, before with GOOG

MSFT, Microsoft

profiling a Python script with pyinstrument, before with MSFT

The line for the pandas module looks like this:

0.946 <module> pandas/__init__.py:1

pyinstrument Results With Pandas Import Only If Necessary

After changing the pandas import to happen only when needed, it no longer eats almost a second of time. As a result, the script runs about a second faster each time! Below are the pyinstrument reports for two different stocks after changing my pandas import to only be called if it was actually used:

GOOG, Google

profiling a Python script with pyinstrument, after with GOOG

NVDA, Nvidia

profiling a Python script with pyinstrument, after with NVDA

Sidebar: HTTP Request Volatility

The script's run time fluctuates by about half a second to a few seconds depending on the HTTP GET request. It lags even more if my internet connection is weak or Yahoo throttles my request because I've made too many in a short period of time. My time savings weren't gained from tinkering with the HTTP request, even though that was a time-eater. I noticed that the requests module's GET request tends to fluctuate and sometimes causes an extra delay.

Simplified Python Example to Achieve Speed Gains

Below is the method I used to achieve a faster CLI. Heads up, this is a simplified illustration, not a working stock fetcher. It's only meant to explain how I made my code faster. You can find the actual script where I made this improvement here on Github.

import argparse
from bs4 import BeautifulSoup
from rich import print as rprint
# Original import --> lazy import only if csv argument given: import pandas as pd

def yahoo_finance_prices(url, stock):
    # Stub: the real script requests the page and parses it with BeautifulSoup.
    return "Stonk went up.", "1000%"

parser = argparse.ArgumentParser(
    prog="finsou.py",
    description="Beautiful Financial Soup",
    epilog="fin soup... yum yum yum yum",
)
parser.add_argument("-s", "--stocks", help="comma sep. stocks or portfolio.txt")
parser.add_argument("-c", "--csv", help='set csv export with "your_csv.csv"')
args = parser.parse_args()
prices = []
# Split the comma-separated tickers; looping over the raw string would yield single characters.
for stock in args.stocks.split(","):
    url = f"https://finance.yahoo.com/quote/{stock}/"
    summary, ah_pct_change = yahoo_finance_prices(url, stock)
    rprint(f"[steel_blue]{summary}[/steel_blue]\n")
    prices.append([stock, summary, url, ah_pct_change])
if args.csv:
    # Importing here shaves 1 second off the CLI when CSV is not required.
    import pandas as pd
    cols = ["Stock", "Price_Summary", "URL", "AH_%_Change"]
    stock_prices = pd.DataFrame(prices, columns=cols)
    stock_prices.to_csv(args.csv, index=False)

Make It Fast

"Make it work, make it right, make it fast." - Kent Beck

That's how I sped up my Python CLI by 25%. This method bucks the convention of keeping your import statements at the top of your script. In my case, it's a hobby project so I feel ok with making the trade-off of less readable code for a snappier CLI experience. You could also consider using the standard library csv module instead of pandas.

For Comparison, An import csv pyinstrument Report

profiling an import of the Python csv module

I clocked the csv module import at 0.003 or three thousandths of a second with pyinstrument. That's insanely fast compared to pandas. I chose to make a quick fix by shifting the import but using the csv module could be a better long-term solution for speeding up your scripts.
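As a rough sketch of that alternative, the sorting and CSV export could be done with the standard library alone. The sample rows, the output file name and the choice to sort by ticker are my assumptions for illustration. The column names match the simplified example earlier in this post.

```python
import csv

cols = ["Stock", "Price_Summary", "URL", "AH_%_Change"]
# Made-up sample rows in the same shape as the prices list.
prices = [
    ["SPOT", "Stonk went up.", "https://finance.yahoo.com/quote/SPOT/", "1000%"],
    ["BABA", "Stonk went up.", "https://finance.yahoo.com/quote/BABA/", "1000%"],
]

# Sort by ticker (first column) instead of using a pandas sort.
prices.sort(key=lambda row: row[0])

# newline="" is what the csv docs recommend when opening the output file.
with open("stock_prices.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(cols)
    writer.writerows(prices)
```

For a list of rows this small, csv.writer covers everything I was using pandas for.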

Supplementary Reading

Making a Yahoo Stock Price CLI With Python

The Python Profilers, Python Documentation

Stack Overflow Thread About Slow HTTP Requests

An Overview of Python Profiling and Diagnostic Tools

Oct 10, 2023

Making a Yahoo Stock Price Summary CLI with Python

Over the past few years, I found a few different external Python libraries that relied on a broken Yahoo Finance API. Apparently, the API changes frequently, leaving us developers in a tough spot troubleshooting tracebacks in order to get stock data. I wanted to check my stocks' prices from the terminal. 6 months ago, dealing with these frustrations inspired me to begin making a Python command line interface (CLI) to fetch stock info directly from the Yahoo Finance website.

With an idea and curiosity to see if I could make it work, I reached for the beautifulsoup4 library, the de facto HTML parser in Python. It turned out way better than I envisioned when I started. The CLI is working great, barring any changes to Yahoo's stock page HTML or CSS code. It accepts a stock ticker and grabs stock price data from the Yahoo website in your terminal. It is designed to monitor daily moves of stocks, including after hours prices.

Here is the Github repo with the code. I've named the CLI finsou.py, which I've been pronouncing to myself as "finsoupy", a word play on fin soup, short for financial soup. The standard library argparse module handles the CLI arguments. The CLI uses the requests, beautifulsoup4 and re modules. With these 3 modules, it retrieves stock info and organizes it into a tidy, color-coded report that is printed to your console. After getting the essential functionality working, I added improvements like the rich module for terminal color formatting and tqdm for a progress bar.

The CLI currently works after the US stock market has closed for normal market hours. Additionally, after hours prices for "over the counter" (OTC) traded stocks are not listed on Yahoo, so an error is returned for those stocks.

Getting Started with finsou.py

  1. First, install the necessary Python library dependencies:
pip install beautifulsoup4
pip install pandas
pip install requests
pip install rich
pip install tqdm
  2. Next, clone the Github repo:
git clone https://github.com/erickbytes/finsou.py.git
  3. Change directory into the finsou.py folder that contains the Python script:
cd finsou.py
  4. Query a stock's daily price summary:
# Print a daily stock summary for Virgin Galactic (SPCE).
python finsou.py -s SPCE
example stock summary report

Fetch a stock summary for multiple companies.

# Summarize a list of stocks.
python finsou.py -s BABA,SPOT,KO

Read a list of stocks from a .txt file.

# Read a list of stocks from a text file with one ticker on each line.
python finsou.py -s portfolio.txt -c "Portfolio Prices.csv"
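One way the -s argument could handle both forms is sketched below. This is not the actual finsou.py code. parse_stocks is a hypothetical helper that treats a value ending in .txt as a file with one ticker per line and anything else as a comma-separated list.

```python
from pathlib import Path


def parse_stocks(value):
    """Return a list of tickers from "BABA,SPOT,KO" or a .txt file path (hypothetical helper)."""
    if value.lower().endswith(".txt"):
        # One ticker per line; blank lines are skipped.
        lines = Path(value).read_text().splitlines()
        return [line.strip().upper() for line in lines if line.strip()]
    # Otherwise treat the value as a comma-separated list of tickers.
    return [ticker.strip().upper() for ticker in value.split(",") if ticker.strip()]


print(parse_stocks("BABA, spot,KO"))
```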

Research + download media from investor relations websites.

Note: currently the code needs to be modified depending on the HTML structure of the page.

# Note: this is experimental and results will vary. URLs are typically buried in nested span and div tags.
python finsou.py -s GRAB -r https://investors.grab.com/events-and-presentations

How It Works

Check out the finsou.py Python script to see the complete code for how this stock report is created. Here is a brief simplified example of the logic behind the code.

import re
import requests
from bs4 import BeautifulSoup

stock = "SNOW"
url = f"https://finance.yahoo.com/quote/{stock}/"
user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.1 (KHTML, like Gecko) Chrome/43.0.845.0 Safari/534.1"
headers = {
    "Cache-Control": "no-cache",
    "User-Agent": user_agent,
}
# Fetch the quote page, sending a browser-like User-Agent with the request.
page = requests.get(url, headers=headers).text
soup = BeautifulSoup(page, "html.parser")
# Raw string so the escaped parentheses reach the regex engine unchanged.
price_tags = soup.find_all(
    class_=re.compile(r"Fw\(b\) Fz\(36px\) Mb\(\-4px\) D\(ib\)")
)
# Strip thousands separators, e.g. "1,234.56" -> "1234.56".
mkt_close_price = price_tags[0].string.replace(",", "")
print(mkt_close_price)

First, an HTTP request is made and the response is parsed by beautiful soup using Python's html.parser. We can then use bs4 and the re module's re.compile function to return the HTML tags with a specific CSS class. Once we have the tags, beautiful soup gives us a ".string" attribute for each tag to return its contents as a string. This pattern was applied to return all of the data in the stock report. To find the CSS classes I wanted, I right-clicked the price or data on Yahoo's website in a Chrome browser and selected "Inspect". Doing this opens Chrome's developer tools and drops you into that spot in the HTML code, where you can find the class you want to target.
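The backslashes in that pattern matter because parentheses are regex metacharacters. Here is a small standard library sketch that shows the difference, using a made-up class string shaped like the one targeted in the example. It also shows re.escape, which can build the escaped pattern for you.

```python
import re

# Made-up class attribute value, shaped like the one targeted in the example.
class_attr = "Fw(b) Fz(36px) Mb(-4px) D(ib)"

# Unescaped, "(b)" is a capture group, so the literal parentheses never match.
unescaped = re.compile("Fw(b) Fz(36px)")
print(unescaped.search(class_attr))

# Escaped by hand in a raw string, the parentheses are matched literally.
escaped = re.compile(r"Fw\(b\) Fz\(36px\) Mb\(-4px\) D\(ib\)")
print(escaped.search(class_attr) is not None)

# re.escape produces an equivalent literal pattern from the plain string.
auto = re.compile(re.escape(class_attr))
print(auto.search(class_attr) is not None)
```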

No Official API, No Problem

It felt good to prove the concept that you don't need an official API to print stock data in your terminal! If you want to check in on your portfolio's daily moves, give this CLI a try: finsou.py Github Repo

If you're looking for a more robust finance Python module, I recommend yfinance for querying stock data.