Lo-Fi Python

Sep 25, 2024

When Microsoft Was Uncool and How They Flipped Apple

I began paying attention to what was relevant in the tech scene around 2014, in my 20s. Back then, I was just getting started studying Python. It was an interesting time in tech. The term "Big Data" was getting tossed around a lot, but the pandas library hadn't yet reached the mass adoption in data circles that it has today. People were still talking about Hadoop + MapReduce. (RIP)

In the 2010s, it didn't take much perusing online to find people in the Python community bashing Microsoft. If tech companies were a high school, Apple was the cool kid everybody wanted to know, and Microsoft was the kid nobody liked and everyone made fun of. Understandably so, since the Windows operating system didn't mesh with Python programming as well as Linux or Mac OS. By 2024, Microsoft had gotten their mojo back, or found the mojo they never had. Having used Windows a lot at my last job, I recognize that the OS and its Python implementation have flaws. I still got my work done without problems and without complaining. I continued to play around on Windows and write Python on it even though people trashed it online. I'm glad I did!

How did Microsoft flip Apple? Steve Ballmer stepped down in 2014, yielding to Satya Nadella as CEO. Since then, the company culture has shifted miraculously. In the Python community, they have made a huge impact by investing in the language. They constantly release free Python + AI courses, and they integrated Python into Excel. Guido van Rossum, the creator of Python, is employed there full-time, working on improving the language. It tells you a lot about how much has changed that Python's former BDFL is still working there after more than 3 years. Microsoft's culture change propelled it into the 2020s with newfound momentum. With some timely bets, they saw the AI revolution coming and capitalized first.

What is funny to see is that nowadays, fewer people are bashing Microsoft. I used to regularly see people teeing off online: "writing Python on Windows is such a terrible experience for XYZ thing, why is Windows so awful??" I see fewer of those people posting such thoughts now. Maybe they're still out there. If someone feels this way in 2024, they probably don't want to admit: Microsoft is Apple in 2012, and Apple is Microsoft in 2012. I posit they switched places in respective coolness among tech circles. People realized Apple is not the friend of developers or society in general. They are self-serving to a vicious degree. Apple is focused on maintaining their walled garden on iOS.

Microsoft is now a better advocate for techies and Python development. Sure, some people prefer to code on Macs, and more power to them. Linux is typically the favorite of the three, and it is awesome. It's also not released by a for-profit corporation, which is uber cool to developers.

Apple is also less cool due to their battle with Epic Games and their insistence on a 30% rake for in-app purchases on iOS. Not to mention their unwillingness to change their policies to comply with stricter European Union regulations on things like 3rd party app stores.

Microsoft is integrating AI deep into their products. Apple, after being slow on the uptake with AI, followed Microsoft's lead by partnering with OpenAI to roll its AI chat out to iPhones. Who is the leader here? In terms of "What have you done for me lately?", it's Microsoft. In terms of who supports open and free information, it's Microsoft. Who's cool now?

Microsoft vs. Apple Stock Price, All-Time

Feb 14, 2021

So You Want to Learn Python?

Here are a few Python concepts for beginners to explore if you are starting out with the language. In this post, I'll highlight my favorite "must-learn" tools to master that come with your Python installation. Understanding them will make you a more capable Python programmer and problem solver.

  1. Built-in Functions. They are awesome! You can do so much with these. Learn to apply them. You won't regret it! See also: An Intro to Python's Built-in Functions
  2. String methods. Want to capitalize, lowercase or replace characters in text? How about checking if a str.isdigit()? Get to know Python's string methods. I use these frequently. Also, the pandas string method implementations are great for applying them to tabular data.
  3. Docstrings. I truly enjoy adding docstrings at the beginning of my functions. They add clarity and ease of understanding.
  4. The Mighty Dictionary. Lists and tuples are useful too, but dictionaries are so handy with the ability to store and access key-value pairs.
  5. List Comprehensions. These allow you to perform transformations on lists in one line of code! I love the feeling when I apply a list comprehension that is concise, yet readable.
  6. Lambda Expressions. These can be used to apply a function "on the fly". I love their succinctness. It took me a few years to become comfortable with them. Sometimes it makes sense to use a lambda expression instead of a regular function to transform data.
  7. Date Objects. Wielding date objects and formatting them to your needs is a pivotal Python skill. Once you have it down, it unlocks a lot of automation and scripting abilities when combined with libraries like pathlib, os or glob for reading file metadata and then executing an action based on the date of the file, for example. I use date.today() a lot when I want to fetch today's date and timedelta to compare two dates. The datetime module is your friend, dive in. Must know for custom date formatting: strftime() and strptime(). See also: Time Format Codes
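
To make item 7 a little more concrete, here is a minimal sketch using only the standard library's datetime module. The variable names and sample dates are made up for illustration; the functions themselves (strptime, strftime, date.today and timedelta) are the ones named above.

```python
from datetime import date, datetime, timedelta

# strptime() parses a string into a datetime using format codes.
posted = datetime.strptime("Feb 14, 2021", "%b %d, %Y")

# strftime() formats a datetime back into whatever string you need.
iso_stamp = posted.strftime("%Y-%m-%d")

# Subtracting two dates returns a timedelta.
gap = date(2021, 2, 14) - date(2020, 8, 9)

# Adding a timedelta shifts a date forward or backward.
one_week_later = posted.date() + timedelta(days=7)

# date.today() is handy for "how old is this?" checks, e.g. on files.
is_older_than_30_days = date.today() - posted.date() > timedelta(days=30)

print(iso_stamp, gap.days, one_week_later, is_older_than_30_days)
```

The same comparison against date.today() is what you would typically run on a file's modified date before deciding to archive or delete it.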

For tabular data, I often use pd.to_datetime() to convert a series of strings to datetime objects:

# install pandas with this command: python -m pip install pandas
import pandas as pd
events = [
    ["USA Born", "1776-07-04"],
    ["9/11 Attacks", "2001-09-11"],
    ["Biden Inauguration", "2021-01-20"],
]
df = pd.DataFrame(events, columns=["events", "dates"])
# convert a pandas Series of strings to datetime objects
df["dates"] = pd.to_datetime(df["dates"])
print(df.dtypes)
print(df.head())

Just the tip of the iceberg...

The amazing part of Python is that its community has developed an astonishing plethora of external libraries which can be installed with pip. Usually I'll learn how to use new libraries after googling to find a well-written README on GitHub or helpful documentation. The language comes with an impressive line-up of baked-in tools and libraries way beyond what I've mentioned here. But I think this is a great start. Get to know these common Python language features and you'll be surprised how much you can do!
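
As a small taste of how the first six items on the list fit together, here is a minimal sketch. The summarize function and the sample words are hypothetical, made up purely for illustration; everything it calls ships with Python.

```python
def summarize(words):
    """Return a dict mapping each cleaned-up word to its length."""
    # String methods + list comprehension: strip whitespace, lowercase.
    cleaned = [w.strip().lower() for w in words]
    # Built-ins: map() applies len, zip() pairs things up, dict() builds the mapping.
    return dict(zip(cleaned, map(len, cleaned)))


words = ["  Python ", "PANDAS", "pip"]
lengths = summarize(words)

# Lambda expression: sort the dictionary's keys by their stored length, longest first.
by_length = sorted(lengths, key=lambda w: lengths[w], reverse=True)

# More built-ins: sum() and any() work on any iterable.
total = sum(lengths.values())
has_digit = any(w.isdigit() for w in ["abc", "42"])

# The docstring is available at runtime, and via help(summarize).
print(summarize.__doc__)
print(lengths, by_length, total, has_digit)
```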

Additional Comprehensive Python Learning Resources

How long did it take you to learn Python?

Practical Python Programming (free course)

Google Python Style Guide

What the f*ck Python!

PySanity

Aug 09, 2020

Pondering Join Algorithms

Truly enjoying this Intro to Database Systems course from Carnegie Mellon University. Some really great breakdowns of common join algorithms in this lecture. Here are my notes.

Lecture 11 - Join Algorithms (CMU Databases Systems / Fall 2019)

Prof. Andy Pavlo, Carnegie Mellon Database Group

Join Algorithms (screenshot from lecture)

Table Positioning for a Join

"In general, your smaller table should be the 'left' table when joining two tables." The professor demonstrates better performance by making the smaller table the "outer" table in a join.

Block Nested Loop Join [mysql example]

  • "The brute force approach"
  • A good option for joining if you have enough memory to hold a large table.
  • Always pick the smaller table as the outer table.
  • Buffer as much of your outer table in memory as possible to reduce redundant I/O.
  • Loop over the inner table or use an index.
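
The bullets above can be sketched in a few lines of Python. This is a toy in-memory version and not how a DBMS implements it: the tables are lists of dicts, block_size stands in for the buffer space available, and the function name and sample rows are my own assumptions.

```python
from itertools import islice


def block_nested_loop_join(outer, inner, key, block_size=2):
    """Join two lists of dicts on `key`, buffering the outer table in blocks."""
    results = []
    outer_iter = iter(outer)
    # Buffer the (smaller) outer table one block at a time.
    while block := list(islice(outer_iter, block_size)):
        # The inner table is scanned once per block, not once per outer row.
        for inner_row in inner:
            for outer_row in block:
                if outer_row[key] == inner_row[key]:
                    results.append({**outer_row, **inner_row})
    return results


# Hypothetical sample data: the smaller table goes on the outside.
departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Ops"}]
employees = [
    {"dept_id": 1, "name": "Ada"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
    {"dept_id": 3, "name": "Dee"},
]
joined = block_nested_loop_join(departments, employees, "dept_id")
print(joined)
```

A bigger block_size means fewer passes over the inner table, which is the "reduce redundant I/O" point from the notes.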

Index Nested Loop Join [CS Course definition]

Useful if an index is already available on the join key, or if you can create one to use for the join.
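
Here is a rough sketch of the idea, assuming a sorted list of (key, row position) pairs as a stand-in for a real ordered index such as a B+tree. The helper names and sample rows are hypothetical.

```python
from bisect import bisect_left


def build_sorted_index(table, key):
    """Sorted (key value, row position) pairs: a toy stand-in for an ordered index."""
    return sorted((row[key], pos) for pos, row in enumerate(table))


def index_nested_loop_join(outer, inner, inner_index, key):
    results = []
    for outer_row in outer:
        k = outer_row[key]
        # Binary search the index instead of scanning the whole inner table.
        i = bisect_left(inner_index, (k,))
        while i < len(inner_index) and inner_index[i][0] == k:
            results.append({**outer_row, **inner[inner_index[i][1]]})
            i += 1
    return results


departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Ops"}]
employees = [
    {"dept_id": 1, "name": "Ada"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
    {"dept_id": 3, "name": "Dee"},
]
employee_index = build_sorted_index(employees, "dept_id")
joined = index_nested_loop_join(departments, employees, employee_index, "dept_id")
print(joined)
```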

Sort-Merge Join [wikipedia]

Useful if one or both tables are sorted on a join key. Maximize sequential I/O.
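
A minimal in-memory sketch of the two phases, sort then merge. It assumes lists of dicts and handles duplicate keys on both sides; the function name and sample rows are made up for illustration. If an input is already sorted on the join key, its sort step is the part you get to skip.

```python
def sort_merge_join(left, right, key):
    # Phase 1: sort both inputs on the join key.
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])

    # Phase 2: walk both sorted inputs with two cursors.
    results = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key...
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # ...and pair it with every left row that has the same key.
            while i < len(left) and left[i][key] == lk:
                for right_row in right[j:j_end]:
                    results.append({**left[i], **right_row})
                i += 1
            j = j_end
    return results


left_rows = [{"id": 3, "a": "c"}, {"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 2, "a": "z"}]
right_rows = [{"id": 2, "b": "p"}, {"id": 4, "b": "q"}, {"id": 2, "b": "r"}, {"id": 1, "b": "s"}]
joined = sort_merge_join(left_rows, right_rows, "id")
print(joined)
```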

Sort-Merge Join (screenshot from lecture)

Hash Join

Usually the best performance, especially for large datasets.

  1. Phase #1 Build (Hash Table)
  2. Phase #2 Probe
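
The two phases can be sketched with a plain Python dict acting as the hash table. This assumes the build side fits in memory; the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def hash_join(build_side, probe_side, key):
    # Phase 1 (build): hash the smaller table on the join key.
    hash_table = defaultdict(list)
    for row in build_side:
        hash_table[row[key]].append(row)

    # Phase 2 (probe): look up each row of the other table in the hash table.
    results = []
    for row in probe_side:
        for match in hash_table.get(row[key], []):
            results.append({**match, **row})
    return results


departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Ops"}]
employees = [
    {"dept_id": 1, "name": "Ada"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
    {"dept_id": 3, "name": "Dee"},
]
joined = hash_join(departments, employees, "dept_id")
print(joined)
```

Each table is read exactly once, which is a big part of why this approach tends to win on large inputs.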

Use a Bloom filter to optimize the probe phase. It supports two set operations:

  1. insert a key
  2. lookup a key
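
A toy Bloom filter with just those two operations might look like the sketch below. The class, its default sizes and the choice of SHA-256 with a seed prefix are my own assumptions for illustration, not something from the lecture. A lookup that returns False means the key was definitely never inserted; True only means "maybe", so false positives are possible but false negatives are not.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive several bit positions per key by seeding one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def lookup(self, key):
        # False: definitely absent. True: possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


# Build phase: insert every join key from the build side.
bloom = BloomFilter()
for dept_id in (1, 2):
    bloom.insert(dept_id)

# Probe phase: skip the hash table entirely for keys the filter rules out.
probe_keys = [1, 2, 3, 4, 5]
candidates = [k for k in probe_keys if bloom.lookup(k)]
print(candidates)
```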

Additional Reading on Bloom Filters

Let's implement a Bloom Filter

Bloom Filters Debunked

Grace Hash Join [wikipedia]

  • "Do hash joins when things don't fit in memory."
  • Use a hash table for each table. Break both tables into buckets with the same hash function, then do a nested loop join on each pair of matching buckets. If the buckets do not fit in memory, use recursive partitioning until everything fits in memory for the join.

"Split outer relation into partitions based on the hash key."

Prof. Andy Pavlo on Hash Join algorithm
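
Here is a simplified sketch of the partitioning idea. In a real system each partition would be spilled to disk and read back one at a time; this version keeps everything in Python lists and leaves out recursive partitioning. The function names, num_partitions and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def partition(table, key, num_partitions):
    """Split a table into buckets based on the hash of the join key."""
    buckets = defaultdict(list)
    for row in table:
        buckets[hash(row[key]) % num_partitions].append(row)
    return buckets


def grace_hash_join(outer, inner, key, num_partitions=4):
    # Phase 1: partition both tables with the same hash function.
    outer_parts = partition(outer, key, num_partitions)
    inner_parts = partition(inner, key, num_partitions)

    # Phase 2: equal keys always land in the same partition number,
    # so only same-numbered buckets need to be compared.
    results = []
    for part_no, outer_bucket in outer_parts.items():
        inner_bucket = inner_parts.get(part_no, [])
        for outer_row in outer_bucket:
            for inner_row in inner_bucket:
                if outer_row[key] == inner_row[key]:
                    results.append({**outer_row, **inner_row})
    return results


departments = [{"dept_id": 1, "dept": "Eng"}, {"dept_id": 2, "dept": "Ops"}]
employees = [
    {"dept_id": 1, "name": "Ada"},
    {"dept_id": 2, "name": "Bob"},
    {"dept_id": 1, "name": "Cy"},
    {"dept_id": 3, "name": "Dee"},
]
joined = grace_hash_join(departments, employees, "dept_id")
print(joined)
```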

  • Hashing is almost always better than sorting for operator execution.

"No join algorithm works well in all scenarios."

-Prof. Andy Pavlo
