Lo-Fi Python

Oct 27, 2023

Streamline Sharing Your Wi-Fi Network Details With Python

If you host a public space or office with shared Wi-Fi, a QR code skips the tedious process of exchanging your network's details. It's a nice alternative to asking people to manually enter an auto-generated, cryptic, error-prone 16-character password, especially when customers or new people frequently ask for the information. You could post a sign with the network name and password like most coffee shops do, or you could try a QR code. Here's how to create one for your Wi-Fi network.

To accomplish this task, I found the wifi-qrcode-generator library on PyPI. It makes creating a Wi-Fi QR code very simple with help from the pillow and qrcode modules. It is a great example of a library that has a very specific purpose and serves it well. The connection will only be automatic if your password is correct, so make sure you type it carefully.

The library has two ways to create a QR code:

  1. Run a Python script with the network details.
  2. Use wifi-qrcode-generator's CLI and respond to prompts for Wi-Fi details.

Install wifi-qrcode-generator

pip install wifi-qrcode-generator

Generating a QR Code With a Python Script

This code snippet prints the QR code to the terminal screen, then saves it as a PNG image.

#!/usr/bin/env python3
import wifi_qrcode_generator.generator

qr_code = wifi_qrcode_generator.generator.wifi_qrcode(
    ssid="add_wi-fi_network_name",
    hidden=False,
    authentication_type="WPA",
    password="add_wi-fi_password",
)
qr_code.print_ascii()
qr_code.make_image().save("wifi-qr-code.png")

QR Code Example Image

QR code image result

Wi-Fi Auto-Connected Confirmation

confirmation of wi-fi connection

Generating a QR Code With the CLI Command

The second way to use this module is via its built-in command line interface. It can be invoked with this command:

wifi-qrcode-generator

Small Projects for the Win

Some of my favorite coding happens when I start with a simple goal, research the libraries available, apply Python skills and get a tangible result in a short period of time. If you want to streamline sharing your Wi-Fi network, remember this practical Python library!

Oct 25, 2023

Formatting URL Parameters in Python

When I first started working with APIs, I had a bad habit of passing URL parameters as one long, ugly string. Anything longer than 79 characters violates PEP 8. A long string is also hard to read, and the code can be difficult to edit if the URL trails off the screen in your text editor. In this post, you'll find some alternatives to the primitive "long ugly string" approach.

Did you know? URL stands for "uniform resource locator".

Below are two ways to neatly format your URL parameters. Both involve using a Python dictionary. The requests API allows you to pass a dictionary or list of tuples to its params argument. Alternatively, if you want to see the full URL as a string, there's a sleek way to format URL arguments with urllib's urlencode function.

a visual breakdown of a url with parameters

source: Geeks for Geeks

Pass a dictionary to the requests params argument to include URL arguments.

You often want to send some sort of data in the URL’s query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument. - requests documentation, Passing Parameters in URLs
import requests

payload = {
    "email": "[email protected]",
    "message": "This email is not real.",
    "status": "inactive"
}
r = requests.get("https://httpbin.org/get", params=payload)
print(r.text)
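
The params argument also accepts a list of tuples, which is handy when the same key needs to repeat. A quick sketch:

import requests

# A list of tuples allows repeated keys, e.g. ?tag=python&tag=http
params = [("tag", "python"), ("tag", "http")]
r = requests.get("https://httpbin.org/get", params=params)
print(r.url)
# https://httpbin.org/get?tag=python&tag=http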

Use urllib's urlencode function to dynamically construct a URL from a dictionary.

import requests
from urllib.parse import urlencode

payload = {
    "email": "[email protected]",
    "message": "This email is not real.",
    "status": "inactive"
}
# Returns str of URL encoded parameters.
url_parameters = urlencode(payload)
# >>> url_parameters
# "email=example%40example.com&message=This+email+is+not+real.&status=inactive"
url = f"https://httpbin.org/get?{url_parameters}"
r = requests.get(url)
print(r.text)
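
If a value is a list, urlencode can expand it into repeated keys by passing doseq=True:

from urllib.parse import urlencode

# doseq=True expands each item of a list value into a repeated key.
print(urlencode({"tag": ["python", "http"]}, doseq=True))
# tag=python&tag=http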

Arguments can be a good thing.

This seemingly basic HTTP formatting was something that took me too long to realize. I hope it helps you keep your URLs tidy and your HTTP requests more readable.

Read More About URL Parameters

Passing Parameters in URLS, requests Documentation

urllib Examples, Python Documentation

requests API Documentation Reference

Stack Overflow, Python Dictionary to URL Parameters

Oct 13, 2023

How I Sped Up My Python CLI By 25%

I recently noticed that the Yahoo Finance stock summary command line interface (CLI) I made seemed to be slowing down. Seeking to understand what was happening in my code, I remembered Python has multiple profilers available like Scalene, line_profiler, cProfile and pyinstrument. In this case, I was running my code on Python version 3.11.

First, I tried cProfile from the Python standard library. It is nice to have without any install required! However, I found its output to be tough to interpret. I also remembered I liked a talk I saw about Scalene, which gave a thorough overview of several Python profilers and how they're different. So next, I tried Scalene. Finally, I found pyinstrument and can safely say it is now my favorite Python profiler. This post will focus on how I used pyinstrument to make my command line tool faster.

Install pyinstrument with pip

pip install pyinstrument

I preferred the format in which pyinstrument presented the modules, functions and time they consumed in a tree structure. Scalene's percentage-based diagnosis was useful too: it showed the specific lines where code was bottlenecked, whereas pyinstrument showed the time spent in each module and function. With pyinstrument, I liked that I could see the time spent in specific functions from the external modules I was using. For example, the beautifulsoup4 and rich modules both consumed shockingly little time. However, the pandas module took a whole second.

Just importing the pandas module and doing nothing else was taking up to and sometimes over a second each time my CLI ran. On a script that takes about four seconds to execute, one second is 25% of the total run time! Once I realized this, I decided to only import the pandas module if my CLI's --csv argument was given. I was only using pandas to sort stocks and write a CSV. It wasn't critical functionality for my CLI.

My CLI script accepts a stock ticker as an argument. The below command fetches a stock report from Yahoo Finance and prints to the terminal. Swapping out "python" for pyinstrument runs the script and prints a pyinstrument report to your console.

Fetch a stock report from Yahoo.

pyinstrument finsou.py -s GOOG
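
You can also profile a specific block of code with pyinstrument's Python API instead of wrapping the whole script. A minimal sketch:

from pyinstrument import Profiler

profiler = Profiler()
profiler.start()
# ... run the code you want to profile here ...
profiler.stop()
# Print the call tree report to the terminal.
profiler.print()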

pyinstrument Results With Normal Pandas Import

GOOG, Google

profiling a Python script with pyinstrument, before with GOOG

MSFT, Microsoft

profiling a Python script with pyinstrument, before with MSFT

The line for the pandas module looks like this:

0.946 <module> pandas/__init__.py:1

pyinstrument Results With Pandas Import Only If Necessary

After changing the pandas module to import only when needed, it is no longer eating almost a second of run time. As a result, the script runs about a second faster each time! Below are the pyinstrument reports for two different stocks after changing my pandas import to only be called if it is actually used:

GOOG, Google

profiling a Python script with pyinstrument, after with GOOG

NVDA, Nvidia

profiling a Python script with pyinstrument, after with NVDA

Sidebar: HTTP Request Volatility

The script's run time fluctuates from about half a second to a few seconds depending on the HTTP GET request. It lags even more if my internet connection is weaker or Yahoo throttles my request because I've made too many in a short period of time. My time savings weren't gained from tinkering with the HTTP request, even though that was a time-eater. I noticed the requests module's GET request tends to fluctuate and sometimes causes an extra delay.
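
To see the fluctuation yourself, you can time the GET request directly. A quick sketch (the URL is just an example):

import time

import requests

start = time.perf_counter()
r = requests.get("https://finance.yahoo.com/quote/GOOG/", timeout=10)
elapsed = time.perf_counter() - start
print(f"Request took {elapsed:.2f} seconds")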

Simplified Python Example to Achieve Speed Gains

Below shows the method I used to achieve a faster CLI. Heads up, this is a simplified sketch meant to show how I made my code faster, not the full program. You can find the actual script where I made this improvement here on Github.

import argparse
from rich import print as rprint
# Original import --> lazy import only if csv argument given: import pandas as pd

def yahoo_finance_prices(url, stock):
    # Stub standing in for the real scraping logic in finsou.py.
    return "Stonk went up.", "1000%"

parser = argparse.ArgumentParser(
    prog="finsou.py",
    description="Beautiful Financial Soup",
    epilog="fin soup... yum yum yum yum",
)
parser.add_argument("-s", "--stocks", help="comma sep. stocks or portfolio.txt")
parser.add_argument("-c", "--csv", help='set csv export with "your_csv.csv"')
args = parser.parse_args()
prices = list()
for stock in args.stocks.split(","):
    url = f"https://finance.yahoo.com/quote/{stock}/"
    summary, ah_pct_change = yahoo_finance_prices(url, stock)
    rprint(f"[steel_blue]{summary}[/steel_blue]\n")
    prices.append([stock, summary, url, ah_pct_change])
if args.csv:
    # Importing here shaves 1 second off the CLI when CSV is not required.
    import pandas as pd

    cols = ["Stock", "Price_Summary", "URL", "AH_%_Change"]
    stock_prices = pd.DataFrame(prices, columns=cols)
    stock_prices.to_csv(args.csv, index=False)

Make It Fast

"Make it work, make it better, make it fast." - Kent Beck

That's how I sped up my Python CLI by 25%. This method bucks the convention of keeping your import statements at the top of your script. In my case, it's a hobby project, so I feel OK making the trade-off of less conventional code for a snappier CLI experience. You could also consider using the standard library csv module instead of pandas, as sketched below.
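
For reference, here's a minimal sketch of that swap, assuming the same prices list of lists from the example above:

import csv

# Hypothetical rows shaped like the prices list in the example above.
prices = [["SPCE", "Stonk went up.", "https://finance.yahoo.com/quote/SPCE/", "1000%"]]
cols = ["Stock", "Price_Summary", "URL", "AH_%_Change"]
with open("stock_prices.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(cols)
    writer.writerows(prices)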

For Comparison, An import csv pyinstrument Report

profiling an import of the Python csv module

I clocked the csv module import at 0.003 seconds, or three thousandths of a second, with pyinstrument. That's insanely fast compared to pandas. I chose to make a quick fix by shifting the import, but using the csv module could be a better long-term solution for speeding up your scripts.

Supplementary Reading

Making a Yahoo Stock Price CLI With Python

The Python Profilers, Python Documentation

Stack Overflow Thread About Slow HTTP Requests

An Overview of Python Profiling and Diagnostic Tools

Oct 10, 2023

Making a Yahoo Stock Price Summary CLI with Python

Over the past few years, I found a few different external Python libraries that relied on a broken Yahoo Finance API. Apparently, the API changes frequently, leaving us developers in a tough spot, troubleshooting tracebacks in order to get stock data. I wanted to check my stocks' prices from the terminal. Six months ago, dealing with these frustrations inspired me to begin making a Python command line interface (CLI) to fetch stock info directly from the Yahoo Finance website.

With an idea and curiosity to see if I could make it work, I reached for the beautifulsoup4 library, the de facto HTML parser in Python. It turned out way better than I envisioned when I started. The CLI is working great, barring any changes to Yahoo's stock page HTML or CSS code. It accepts a stock ticker and grabs stock price data from the Yahoo website in your terminal. It is designed to monitor daily moves of stocks, including after hours prices.

Here is the Github repo with the code. I've named the CLI finsou.py, which I've been pronouncing to myself as "finsoupy", a word play on fin soup, short for financial soup. The standard library argparse module provides the CLI argument ingesting functionality. The CLI uses the requests, beautifulsoup4 and re modules to retrieve stock info and organize it into a tidy, color-coded report printed to your console. After getting the essential functionality working, I added improvements like the rich module for terminal color formatting and tqdm for a progress bar.

The CLI currently works after the US stock market's normal market hours have closed. Additionally, after hours prices for "over the counter" (OTC) traded stocks are not listed on Yahoo, so an error is returned for those stocks.

Getting Started with finsou.py

  1. First, install the necessary Python library dependencies:
pip install beautifulsoup4
pip install pandas
pip install requests
pip install rich
pip install tqdm
  2. Next, clone the Github repo:
git clone https://github.com/erickbytes/finsou.py.git
  3. Change directory into the finsou.py folder that contains the Python script:
cd finsou.py
  4. Query a stock's daily price summary:
# Print a daily stock summary for Virgin Galactic (SPCE).
python finsou.py -s SPCE
example stock summary report

Fetch a stock summary for multiple companies.

# Summarize a list of stocks.
python finsou.py -s BABA,SPOT,KO

Read a list of stocks from a .txt file.

# Read a list of stocks from a text file with one ticker on each line.
python finsou.py -s portfolio.txt -c "Portfolio Prices.csv"

Research + download media from investor relations websites.

Note: currently the code needs to be modified depending on the HTML structure of the page.

# Note: this is experimental and results will vary. URLs are typically buried in nested span and div tags.
python finsou.py -s GRAB -r https://investors.grab.com/events-and-presentations

How It Works

Check out the finsou.py Python script to see the complete code for how this stock report is created. Here is a brief simplified example of the logic behind the code.

import re
import requests
from bs4 import BeautifulSoup

stock = "SNOW"
url = f"https://finance.yahoo.com/quote/{stock}/"
user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.1 (KHTML, like Gecko) Chrome/43.0.845.0 Safari/534.1"
headers = {
    "Cache-Control": "no-cache",
    "User-Agent": user_agent,
}
page = requests.get(url, headers=headers).text
soup = BeautifulSoup(page, "html.parser")
price_tags = soup.find_all(
    class_=re.compile(r"Fw\(b\) Fz\(36px\) Mb\(-4px\) D\(ib\)")
)
mkt_close_price = price_tags[0].string.replace(",", "")
print(mkt_close_price)

First, an HTTP request is made and parsed by Beautiful Soup using Python's html.parser. We can then use bs4 and the re.compile function to return the HTML tags with a specific CSS class. Once we have the tags, Beautiful Soup gives us a ".string" attribute on each tag to return its contents as a string. This pattern was applied to return all of the data in the stock report. To find the CSS classes I wanted, I right-clicked the price or data on Yahoo's website in a Chrome browser and selected "Inspect". Doing this opens Chrome's developer tools and drops you into that spot in the HTML code, where you can find the class you want to target.

No Official API, No Problem

It felt good to prove the concept that you don't need an official API to print stock data in your terminal! If you want to check in on your portfolio's daily moves, give this CLI a try: finsou.py Github Repo

If you're looking for a more robust finance Python module, I recommend yfinance for querying stock data.

Oct 02, 2023

An Introduction to the LangChain Python Library

LangChain is a lauded Python library in the large language model space. It seems to be riding the AI hype train as of late and is getting mentioned everywhere I look. I wrote this post to better understand: what is LangChain? Warning: I learned a lot while researching it! Below you'll find basic information about what LangChain is and code examples for a few of its use cases. I connected a few different sources that helped fill the gaps in my knowledge. After reading this, you'll have a basic understanding of what this Python module does and some of the diverse ways it can be applied.

A huge bonus of this library is that it has excellent documentation. On the front page, they state its main value propositions.

The main value props of LangChain are:

Components: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not

Off-the-shelf chains: a structured assembly of components for accomplishing specific higher-level tasks. Off-the-shelf chains make it easy to get started. For complex applications, components make it easy to customize existing chains and build new ones.

The Github repo states LangChain is for "Building applications with LLMs through composability". Ok, so what is composability?

Composability is a system design principle that deals with the inter-relationships of components. A highly composable system provides components that can be selected and assembled in various combinations to satisfy specific user requirements. - Wikipedia

LangChain is important because it provides the function of "orchestration" in an LLM serving workflow:

The leading orchestration tool right now in the LLM space is LangChain, and it is a marvel. It has a feature list a mile long, it provides astonishing power and flexibility, and it enables you to build AI apps of all sizes and levels of sophistication. But with that power comes quite a bit of complexity. Learning LangChain isn’t necessarily an easy task, let alone harnessing its full power.

- Stephen Hood, So you want to build your own open source chatbot… – Mozilla Hacks

Sometimes an orchestration tool is required, but sometimes you can "roll your own" in a sense, as Mozilla did in their post about serving an LLM. When making your own large language model, you'll want to consider how it will be orchestrated. This is the functionality LangChain provides. In more complex LLM flows that involve tasks like writing code and subsequently running it, LangChain is essential.

Often people hook up LLMs as part of a sequence of operations. LangChain does this. It puts an LLM in series with some other tools. For instance, LLMs can’t do math, they just spout plausible answers. They can write code, because they’ve read so much of it in their training. By themselves, they can’t use it. But configure a LangChain agent with both an LLM and a Python interpreter, and it can answer word problems. First ask the LLM for a plan to solve the problem, given a Python interpreter; then when the LLM returns code, run it; then provide the answer to the LLM so it can structure the final response.

- Jessica Kerr, A Developer’s Starting Point for Integrating with LLMs

You'll also want to consider if you need to tune your own AI model, but beware that this can cost a lot of money in compute resources consumed. It seems more likely that for the AI-layperson coder, going with an off-the-shelf model via an API or open source code base makes more fiscal sense and is probably easier. However, you may achieve a unique quality of response by tuning your model to your specific use case, or providing examples as tokens for the model to consume before responding.

Model training needs lots and lots of machines in relatively close proximity to one another. It needs the absolute latest, greatest GPUs.

- Matthew Prince, James Governor, Cloudflare as an AI play. An Interview with Matthew Prince

An Example LLM Stack With LangChain

Overview of an LLM stack from Mozilla

source: So you want to build your own open source chatbot… – Mozilla Hacks

Installing the Python Libraries with pip

pip install langchain[all]
pip install openai

The Two Types of Language Models

There are two types of language models, which in LangChain are called:

LLMs: this is a language model which takes a string as input and returns a string

ChatModels: this is a language model which takes a list of messages as input and returns a message
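
Here's a minimal sketch of calling a ChatModel, based on the LangChain quickstart at the time of writing (the import paths and predict_messages method reflect late-2023 LangChain and may differ in newer versions):

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat_model = ChatOpenAI(openai_api_key="...")
# ChatModels take a list of messages as input and return a message.
result = chat_model.predict_messages([HumanMessage(content="Say hi!")])
print(result.content)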

Calling OpenAI Without a Chain from LangChain

from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

llm = OpenAI(openai_api_key="...")
chat_model = ChatOpenAI(openai_api_key="...")
text = "What would be a good company name for a company that makes colorful socks?"
llm.predict(text)
# >> Feetful of Fun
chat_model.predict(text)
# >> Socks O'Color

Chaining Components with LangChain

The chains are a Python class, as demonstrated in this pseudo-code from their docs.

Using an LLM in isolation is fine for simple applications, but more complex applications require chaining LLMs - either with each other or with other components.

LangChain provides the Chain interface for such "chained" applications. We define a Chain very generically as a sequence of calls to components, which can include other chains. The base interface is simple:

- LangChain Documentation, Chains How-To

class Chain(BaseModel, ABC):
    """Base interface that all chains should implement."""

    memory: BaseMemory
    callbacks: Callbacks

    def __call__(
        self,
        inputs: Any,
        return_only_outputs: bool = False,
        callbacks: Callbacks = None,
    ) -> Dict[str, Any]:
        ...

Chaining OpenAI Components

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
# Run the chain only specifying the input variable.
print(chain.run("colorful socks"))
# >> Socks O'Color

Natural Language Queries with LangChain

These examples are shown in Analyzing Structured Data.

from langchain.utilities import SQLDatabase
from langchain.llms import OpenAI
from langchain_experimental.sql import SQLDatabaseChain

# The documented examples use a Chinook DB.
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = OpenAI(temperature=0, verbose=True)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run("How many employees are there?")
# >>> 'There are 8 employees.'

Text to SQL Queries With Ability to Run the Query on the Database

from langchain.utilities import SQLDatabase
from langchain.chat_models import ChatOpenAI
from langchain.chains import create_sql_query_chain

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
chain = create_sql_query_chain(ChatOpenAI(temperature=0), db)
response = chain.invoke({"question": "How many employees are there"})
# The chain returns a SQL query, which you can then run on the database.
print(response)
# >>> SELECT COUNT(*) FROM Employee
db.run(response)
# >>> '[(8,)]'

Use a LangChain Agent to Describe a Database Table

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
# from langchain.agents import AgentExecutor
from langchain.agents.agent_types import AgentType
from langchain.llms import OpenAI
from langchain.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = OpenAI(temperature=0, verbose=True)
agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0)),
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
agent_executor.run("Describe the playlisttrack table")

Description of Database Table Result

> Entering new AgentExecutor chain...
Action: sql_db_list_tables
Action Input:
Observation: Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Thought: The PlaylistTrack table is the most relevant to the question.
Action: sql_db_schema
Action Input: PlaylistTrack
Observation:
CREATE TABLE "PlaylistTrack" (
    "PlaylistId" INTEGER NOT NULL,
    "TrackId" INTEGER NOT NULL,
    PRIMARY KEY ("PlaylistId", "TrackId"),
    FOREIGN KEY("TrackId") REFERENCES "Track" ("TrackId"),
    FOREIGN KEY("PlaylistId") REFERENCES "Playlist" ("PlaylistId")
)

/*
3 rows from PlaylistTrack table:
PlaylistId  TrackId
1   3402
1   3389
1   3390
*/
Thought: I now know the final answer
Final Answer: The PlaylistTrack table contains two columns, PlaylistId and TrackId, which are both integers and form a primary key. It also has two foreign keys, one to the Track table and one to the Playlist table.

> Finished chain.

Versatile + Flexible for Your LLM Needs

If you prefer Meta's LLaMA model over OpenAI, more power to you. LangChain supports both, and many more. At the time of this writing, the following models are documented: Anthropic, Anthropic Functions, Anyscale, Azure, Azure ML Chat Online Interface, Baidu Qianfan, Bedrock Chat, ERNIE-bot Chat, Fireworks, GCP Vertex API, JinaChat, Konko, LiteLLM, Llama API, MiniMax, Ollama, OpenAI, PromptLayer ChatOpenAI and vLLM Chat.

Wrapping Up With LangChain

These examples represent a few things you can do with this popular Python library. You're now a step closer to creating your next AI-infused product or service. No one needs to know it's just a wrapper for OpenAI and LangChain! The library's name makes more sense once you understand a bit of its context as an orchestrator: it chains the pieces of your large language model workflow together into a shiny, impressive AI solution.

Read More:

LangChain Documentation, Chains How To

LangChain Documentation, Deployments

Sep 18, 2023

RSS Is Thriving: Working With RSS Python Tools

There have been countless obituaries about RSS on the internet, like when Dan McKinley wrote that "Google Reader Killed RSS". For sure, the existence of Google Reader kept other readers from gaining a wider audience before it was shut down by Google, which likely had a suppressive impact on the adoption of RSS.

"Whatever happened to RSS?" is a question that murmurs thorugh the internet. The answer: it's still here and it's not going anywhere. I propose that RSS is not dead, but quietly regaining its strength and is on track to relevancy again. But did it actually fall off?

Did you know? RSS stands for "Really Simple Syndication".

Some say that with the rise of social media and email newsletters, RSS is not worth the time. That's precisely what the social media and search engine giants want, so you'll stay in their walled-garden platforms. Google killed Google Reader because it is not in their interest to support a syndication format that cuts out their search engine middleman role. Nonetheless, we only need to consciously choose to use RSS to bring it back to prominence. Who's with me?!

"RSS readers have not only survived in the era of social media, but are driving more and more attention back to themselves, as people are realizing the pitfalls"

- Brian Barrett. "It's Time for an RSS Revival". Wired. https://www.wired.com/story/rss-readers-feedly-inoreader-old-reader

You'll often hear the terms "aggregator" or "reader" when dealing with RSS; both mean a tool that collects multiple RSS feeds into a single readable format. There are plenty of free RSS reader websites to keep up with new material from your favorite websites. Currently, I use Feeder to keep up with the blogs I follow. Their free plan allows you to follow up to 200 RSS feeds. Feedly is another commonly suggested RSS reader.

RSS icon

A Few Benefits of RSS

  • Receive notice when a new post is published by your favorite websites and blogs.

  • The personal data of your readers is not collected.

  • An RSS subscription is less intrusive compared to blasting into someone's cluttered email inbox. Plus you won't end up in the spam folder.

  • Diversify your website's traffic to be less dependent on search engines and social media platforms.

    To know approximately how much RSS traffic you have, check how many HTTP requests have an XML content type. This blog post has a good breakdown of the HTTP-based approach. Typically, about 6% of total HTTP requests on my blog have an XML content type, indicating RSS traffic; a rough way to estimate this from a server access log is sketched below.
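
Below is a rough sketch of that estimate using only the standard library. The access.log path and the substring patterns are assumptions; adjust them to match your server's log format:

# Rough estimate of feed traffic from a web server access log.
xml_hits = 0
total = 0
with open("access.log") as log:
    for line in log:
        total += 1
        if ".xml" in line or ".rss" in line:
            xml_hits += 1
print(f"{xml_hits}/{total} requests ({xml_hits / total:.1%}) look like feed traffic")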

Python RSS Tools

The rest of this post will highlight some practical Python tools and resources for RSS. The libraries used in the below code examples can be installed with pip. There are also some standard library Python tools mentioned. Enjoy!

feedparser: Universal Feed Parser is a Python module for downloading and parsing syndicated feeds

Parse an RSS feed with feedparser.

import feedparser

feed = "https://feedparser.readthedocs.io/en/latest/examples/atom10.xml"
d = feedparser.parse(feed)
print(d['feed']['title'])
print(d.feed.published)

Detect a malformed RSS feed with feedparser "bozo".

import feedparser

# https://feedparser.readthedocs.io/en/latest/bozo.html#detecting-a-non-well-formed-feed
feed = "http://feedparser.org/tests/illformed/rss/aaa_illformed.xml"
d = feedparser.parse(feed)
# d.bozo: An integer, either 1 or 0. Set to 1 if the feed is not well-formed XML, and 0 otherwise.
print(d.bozo)
if d.bozo:
    exc = d.bozo_exception
    # Print bozo error message.
    print(exc.getMessage())
    # Print line number where exception occurred.
    print(f"Error at line {exc.getLineNumber()}")
else:
    print("Feed is well-formed RSS.")

feedvalidator and W3 RSS Validator: check if your feed is valid RSS or Atom. Below shows how you can open the W3 validator for your feed from the command line using the webbrowser module:

# Validate an RSS feed.
python -m webbrowser -t "https://validator.w3.org/feed/check.cgi?url=https://example.com/feeds/all.rss.xml"

atoma: an Atom, RSS and JSON feed parser

Parse an RSS feed with atoma.

import atoma
import requests

response = requests.get("https://example.com/feed.atom")
feed = atoma.parse_atom_bytes(response.content)
print(feed.title.value)

Additional RSS Tools, Reads + Resources

It's Time for an RSS Revival, Wired

Mozilla Thunderbird: an open source RSS client

Awesome Tech RSS: a list of tech RSS feeds you can follow

pelican-planet: a Pelican static site generator plugin that allows generating a page aggregating blog articles from other web sites. The pelican Python library also has built-in support for RSS and Atom feed generation.

Django Syndication Feed Framework: built-in RSS feed framework for Django websites

django-yarr: a lightweight, customisable RSS reader for the Django web framework

python-feedgen: generates atom feeds, RSS feeds and podcasts

A Roadmap to XML Parsers in Python, Real Python

lxml: a Pythonic, mature binding for the libxml2 and libxslt libraries

xml.sax API: standard library XML validation option that is based on a Java API

Python Documentation, XML Processing Modules

RSSerpent: open source software to create RSS feeds for websites without them

rawdog: an "RSS aggregator without visions of grandeur"

RSS2mastodon: a quick set of python scripts for auto-posting an RSS or Atom feed to Mastodon

Craigslist RSS Scraper Python Script

Sep 14, 2023

A Power Ranking of Python's Best Events and Conferences

Python events and conferences are a great way to learn about a niche problem someone solved, new libraries in the ecosystem or general programming topics. Here are some of the best events I've found or experienced related to Python.

The events on this list range from live, in-person conferences to smaller meetups and online conferences or seminars. Most events post their talks on YouTube, so it's easy to look up interesting talks and watch them for free.

This power ranking is meant to be in good fun and not taken too seriously. I ranked the events according to popularity, cost effectiveness (less $ to learn is better) and usefulness for the average Pythonista. For example, DjangoCon is a very popular conference, but it focuses on a particular web framework, so it is not as widely applicable as other events.

Python Events Power Ranking

  1. PyCon (insert local country or city here) aka Python conferences: PyCon US is probably the best known of this type of Python event. Many countries, cities or regions have a local version of their Python conference. There's EuroPython for all of Europe, individual countries like PyCon Portugal, PyCon Amsterdam, PyCascades (US Pacific Northwest region), PyCon Latam, Kiwi PyCon (New Zealand), PyCon Singapore, PyCon Hong Kong, PyCon Atlanta, PyTexas, PyBeach (Los Angeles) and so on. There are so many PyCon-like options and this is barely scratching the surface. Sometimes these events include longer training sessions in addition to shorter talks. I frequently watch the talks posted to the PyCon US YouTube channel every year. It's a good option if you don't mind waiting weeks or months for them to drop the videos.

  2. Python "User Group" Meetups: often you can find free events that host a speaker or two locally. These are less intensive than a full conference and maybe last a few hours. User groups are a more chill way to learn and connect with your fellow Python developers. Shout-out to ChiPy, the Chicago Python user group I attended that hosts regular meetups monthly.

    There are about 1,637 Python user groups worldwide in almost 191 cities, 37 countries and over 860,333 members. - https://wiki.python.org/moin/LocalUserGroups

  3. PyData: these scientific computing conferences showcase fascinating work from data professionals. I've watched a ton of PyData talks. They run these conferences in various cities and regions. Check out the PyData YouTube channel for technical talks. PyData is recommended by the SciPy project as well according to their website.

  4. Pyjamas: "the coziest Python conference of them all", this online conference had a very casual format to watch Python talks. The diversity of hosts and speakers from across the globe brought a wide range of perspectives that I enjoyed. The event had lots of engaging talks. It was one of my favorite Python events I've seen. Check out Pyjamas for a cozy online conference experience, I hope they continue to put it on! You can watch past talks on their YouTube channel also.

  5. Python Web Conf: this 5-year-old online conference boasts "top names in the Python community" and lasts for 5 days. In addition to typical Python talks, it includes "lightning talks" and "end of day socials". You can watch 80 videos from Python Web Conf 2023 on YouTube. This event is run by Six Feet Up, a woman-owned company.

  6. PyLadies Con + PyLadies Meetups: PyLadies is "a group of women developers worldwide who love the Python programming language", per their website. In addition to their annual online conference, they host local meetups all over the world. Shout-out to all the women Python developers out there!

  7. DjangoCon: Django is a popular Python web framework for building websites. This conference is tailored for Django developers and those who want to explore one of Python's most robust web frameworks. While searching for DjangoCon events, I found conferences like DjangoCon US, DjangoCon Europe and DjangoCon Africa. If you want to see what it's like, you can watch the past DjangoCon talks on their YouTube channel.

  8. SciPy: another conference to stoke your scientific computing prowess. The SciPy conference is held annually, with a few variations like EuroSciPy, SciPy Japan and SciPy India. It's associated with the SciPy library, open-source software for mathematics, science, and engineering. However, the SciPy project's website states, "The SciPy project doesn't organize its own conferences." Check the SciPy Calendar for local meetups also.

The Python community hosts a cornucopia of events and conferences. Go to them and learn in person or stay home and watch online. That's all for now, I hope you found Python events you didn't know about. Now go learn something new!

Python Events Resources

Interactive Visualization of Python Events

PyCon.org Events Calendar

Python.org Conferences Wiki

Python.org Conferences and Workshops

Python Training Events

Aug 13, 2023

How to Install Python 3.11 or 3.12 on a Linux Computer

Below are the steps I followed to install both Python 3.11 and Python 3.12 in my Ubuntu Linux shell. Make sure to adjust your Python version to match 3.11 or 3.12 in all commands.

I downloaded the .tgz file from Python.org, unsure at first how to build Python from it. Once I unpacked the compressed files, I found the build instructions in the README.rst and used them to build a functional Python 3.11 on my Ubuntu computer. Here's how to install the speedier Python versions 3.11 or 3.12.

How to Install Python 3.11 or 3.12

Install Linux build libraries.

I followed this step posted on this blog. If you don't do this, you'll likely see an error about a C compiler not being found when running the ./configure command.

sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev

Install SQLite Libraries (Django Requirement)

If you wish to make a Django website, install the SQLite libraries before you build Python.

sudo apt install sqlite3 libsqlite3-dev

Use curl to download the Python gzip file.

curl https://www.python.org/ftp/python/3.11.0/Python-3.11.0.tgz --output Python-3.11.0.tgz
download Python with curl

Unpack the gzip file to a folder.

tar -xvzf Python-3.11.0.tgz

Change directory, read the README + find build commands.

cd Python-3.11.0
cat README.rst

Build Python.

# Build Python on Unix, Linux, BSD, macOS, and Cygwin:
./configure --enable-optimizations
make
make test
sudo make install
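
Heads up: sudo make install overwrites the python3 binary if a previous version is installed. The README recommends make altinstall to keep your distribution's Python untouched, since it only installs a versioned binary:

# Optional alternative: installs python3.11 only, leaving python3 alone.
sudo make altinstall
python3.11 --version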

Building Python on Various Platforms

This will install Python as python3.

You can pass many options to the configure script; run ./configure --help to find out more. On macOS case-insensitive file systems and on Cygwin, the executable is called python.exe; elsewhere it's just python.

Building a complete Python installation requires the use of various additional third-party libraries, depending on your build platform and configure options. Not all standard library modules are buildable or useable on all platforms. Refer to the Install dependencies section of the Developer Guide for current detailed information on dependencies for various Linux distributions and macOS.

On macOS, there are additional configure and build options related to macOS framework and universal builds. Refer to Mac/README.rst.

On Windows, see PCbuild/readme.txt.

- Python 3.11 Linux README.rst

Aug 07, 2023

Opening Files From The Terminal With Text Editor CLIs

Most text editors can open files from a computer's command line shell. Here are 8 different text editor commands for opening a file:

IDLE (Python's Built-in Editor)

idle file.py
CLI help options for Python's IDLE text editor

image source: IDLE documentation

Sublime

subl template.rst
using subl CLI to open files in Sublime

VS Code

code file.py

Atom

atom file.py

Emacs

emacs -nw file.txt

source: Stack Overflow user Anthon

Notepad++

start notepad++ file.py

source: W3 Schools

TextEdit

open -a TextEdit file.txt

source: Stack Overflow user robmathers

Vim

vim file.txt

Or, from within Vim, open a file with the :e command, e.g. :e file.txt.

source: confirm blog

The ability to quickly pop open and view a file is essential. If you don't need to edit a file, the cat command prints its contents to the terminal screen. Tools like sed and awk are useful for command line file editing if you prefer to keep it in the terminal. For example:
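
# Replace every occurrence of "foo" with "bar" in file.txt (GNU sed syntax).
sed -i 's/foo/bar/g' file.txt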

Want to read more about text editors? Check out my text editor file size comparison here.

Jul 27, 2023

Analyzing Football AKA Soccer With Python

The world's game is fun to watch. It's obvious when a team is dominant against a weaker opponent, but what gives one team an edge over another? Is it short, crisp and reliable passing resulting in a high conversion percentage? Shots on goal? Quality touches? Clinicality in the final third, making the most of your chances, is what separates the champions from the rest. Apparently, some of the best teams also keep their passes on the ground. All of these things contribute to victory in some sense.

We all have our theories about what makes a great player or team. But how do we assess football performance from an analytics perspective? It is difficult to predict how teams with varying styles will match up. Fortunately, data is integrating with the football world, and extensive analytics resources and tactics are now available for free online.

If you're interested in football analytics, there seem to be a few directions you can go. Do you need to collect data? If you can record a game correctly, it can be converted into data from which winning insights are extracted. If you are lucky enough to already have data, what does it say about player and team performance? Can you study open data from professional teams to explore your hypotheses?

Searching the internet, FC Python was the first resource I found. They have some free tools available for collecting data from live games. I was impressed by the Python code for pitch heat maps tracking Abby Wambach's passing. Their example uses seaborn and matplotlib:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Arc
import seaborn as sns

%matplotlib inline

data = pd.read_csv("Data/passes.csv")
data.head()

fig, ax = plt.subplots()
fig.set_size_inches(14, 4)

# Plot One - distinct areas with few lines
plt.subplot(121)
sns.kdeplot(data["Xstart"], data["Ystart"], shade="True", n_levels=5)

# Plot Two - fade lines with more of them
plt.subplot(122)
sns.kdeplot(data["Xstart"], data["Ystart"], shade="True", n_levels=40)

plt.show()

# Create figure
fig = plt.figure()
fig.set_size_inches(7, 5)
ax = fig.add_subplot(1, 1, 1)

# Pitch Outline & Centre Line
plt.plot([0, 0], [0, 90], color="black")
plt.plot([0, 130], [90, 90], color="black")
plt.plot([130, 130], [90, 0], color="black")
plt.plot([130, 0], [0, 0], color="black")
plt.plot([65, 65], [0, 90], color="black")

# Left Penalty Area
plt.plot([16.5, 16.5], [65, 25], color="black")
plt.plot([0, 16.5], [65, 65], color="black")
plt.plot([16.5, 0], [25, 25], color="black")

# Right Penalty Area
plt.plot([130, 113.5], [65, 65], color="black")
plt.plot([113.5, 113.5], [65, 25], color="black")
plt.plot([113.5, 130], [25, 25], color="black")

# Left 6-yard Box
plt.plot([0, 5.5], [54, 54], color="black")
plt.plot([5.5, 5.5], [54, 36], color="black")
plt.plot([5.5, 0.5], [36, 36], color="black")

# Right 6-yard Box
plt.plot([130, 124.5], [54, 54], color="black")
plt.plot([124.5, 124.5], [54, 36], color="black")
plt.plot([124.5, 130], [36, 36], color="black")

# Prepare Circles
centreCircle = plt.Circle((65, 45), 9.15, color="black", fill=False)
centreSpot = plt.Circle((65, 45), 0.8, color="black")
leftPenSpot = plt.Circle((11, 45), 0.8, color="black")
rightPenSpot = plt.Circle((119, 45), 0.8, color="black")

# Draw Circles
ax.add_patch(centreCircle)
ax.add_patch(centreSpot)
ax.add_patch(leftPenSpot)
ax.add_patch(rightPenSpot)

# Prepare Arcs
leftArc = Arc(
    (11, 45), height=18.3, width=18.3, angle=0, theta1=310, theta2=50, color="black"
)
rightArc = Arc(
    (119, 45), height=18.3, width=18.3, angle=0, theta1=130, theta2=230, color="black"
)

# Draw Arcs
ax.add_patch(leftArc)
ax.add_patch(rightArc)

# Tidy Axes
plt.axis("off")

sns.kdeplot(data["Xstart"], data["Ystart"], shade=True, n_levels=50)
plt.ylim(0, 90)
plt.xlim(0, 130)

# Display Pitch
plt.show()
Analyzing football with Python

Impressive use of matplotlib and seaborn! This code is meant for a Jupyter notebook. I can't find the "passes.csv" data, but I suspect it uses StatsBomb's free footy dataset, which is on display in this Towards Data Science blog post also.

In another practical example of wrangling data, Tactics FC shows how to calculate goal conversion rate with pandas; a minimal version of that calculation is sketched below. I'm guessing basic stat-keeping and video are collected in great quantities by analytics teams during professional games. At half time, TV broadcasts typically show both teams' shots, passes and time of possession.
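
Here's a minimal sketch of that kind of calculation, using made-up shot totals rather than real match data:

import pandas as pd

# Hypothetical per-player shot totals.
shots = pd.DataFrame(
    {"player": ["Wambach", "Morgan"], "goals": [5, 3], "shots": [20, 10]}
)
# Goal conversion rate = goals scored per shot taken.
shots["conversion_pct"] = shots["goals"] / shots["shots"] * 100
print(shots)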

Another intriguing field of study is extensive simulation and the tracking of individual player positions on the pitch. Google hosted a Kaggle competition with Manchester City three years ago, where the goal was to train AI agents to play football. Formal courses are available, like the Mathematical Modeling of Football course at Uppsala University. There's also the football analytics topic on Github, which shows 100+ repos.

From that topic, I found Awesome Football Analytics, a long list of resources to browse through. It seems wise to stop through Jan Van Haaren's soccer analytics resources. I'm really looking forward to checking out Soccermatics for Python as well. There is a ton of football analytics work happening online.

I sense there is a passionate community pushing football analytics forward and innovating. There are many facets to consider, from video optimization and data collection to drawing insights from established datasets, tracking game stats and codifying player movements.

Techniques like simulation and decoding live games into data could result in recommendations for players to uncover new advantages, adjust their positioning, conserve their energy or look for chances in a vulnerable spot on the field. The best teams are probably asking how they can leverage data to inform their strategy on the pitch and win more games.

Watching football is so satisfying. Why not study it with Python? My prediction is that the beautiful game will progress and improve as teams develop a more sophisticated data strategy.
