Lo-Fi Python

Jan 08, 2023

pymarketer: an HTTP + Spreadsheet Wrangling Python package

Typically, this blog reviews other libraries from Python's vast ecosystem. This time, it's pymarketer, a package I made for fun. It was created in a single day and can be installed from the GitHub repo. Have a go at my most-read post if you need help with pip.

Install with pip from the source GitHub repo:

python -m pip install git+https://github.com/erickbytes/pymarketer.git

The pymarketer package helps you do things like:

  • merge all the tabs of an Excel file into one CSV
  • generate HTTP code
  • make a word cloud image
  • split a CSV
  • merge CSVs
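pymarketer's own function names are not listed in this post, so the sketch below does not use its API. It only illustrates, with Python's standard csv module, what splitting a CSV into smaller pieces and merging them back together involves. The helper names split_rows and merge_rows, and the sample rows, are made up for illustration.

```python
import csv
import io

def split_rows(csv_text, chunk_size):
    """Split CSV text into chunks of at most chunk_size data rows, repeating the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(data), chunk_size):
        chunks.append([header] + data[start:start + chunk_size])
    return chunks

def merge_rows(chunks):
    """Merge chunks that share a header back into one list of rows."""
    merged = [chunks[0][0]]        # keep the header once
    for chunk in chunks:
        merged.extend(chunk[1:])   # skip each chunk's repeated header
    return merged

# hypothetical sample data
sample = "name,city\nBrewery A,Chicago\nBrewery B,Chicago\nBrewery C,Chicago\n"
chunks = split_rows(sample, chunk_size=2)
merged = merge_rows(chunks)
print(len(chunks), "chunks;", len(merged) - 1, "data rows after merging")
```

Repeating the header in every chunk keeps each piece usable as a standalone CSV.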

Generating a Word Cloud with the pymarketer Package via wordcloud

import pandas as pd
import pymarketer as pm

xl = "Chicago Breweries.xlsx"
df = pd.read_excel(xl)
# Make a wordcloud from a pandas dataframe.
wordcloud = pm.word_cloud(df)
wordcloud.to_file("Text Word Cloud Visualization.jpg")
Python wordcloud example

This package relied on several Python libraries to complete:

I'll likely expand on this in the future. Anyone who wrangles data might be able to apply this package to good profit. At minimum, you might find it interesting to take a look at the project's __init__.py to see how some of the functions are implemented.

Additional Resources

Jun 24, 2022

Hammock-Driven Development Notes

Occasionally you will find a video or talk that resonates with you in a great way. Rich Hickey's "Hammock Driven Development", a self-described "rant", is packed with wisdom. I keep coming back to re-watch it, and today I have written down some key points from this amazing rant!

Key Ideas

Take more time to think through your problem.

When was the last time you...

thought about something for a whole day?

thought about something for a whole month or year?

Hammock Driven Development, https://www.youtube.com/watch?v=f84n5oFoZBc

On Bugs

  • Bugs are cheaper to fix in development.
  • They are least expensive to avoid in design.
  • They are most expensive to fix in production.

Analysis & Design, Simplified

  • Identify the problem you are trying to solve.
  • Assess whether the proposed solution solves that problem.

On Problem Solving

solving problems by Rich Hickey

Problem Solving (cont.)

  • State the problem out loud.
  • Understand the problem's facts, context and constraints.
  • What don't you know?
  • Find problems in your solution.
  • Write it all down.

More Input, Better Output

  • Read in and around your space.
  • Look critically at other solutions.
  • You can't connect things you don't know about.

On Focus

  • On the hammock, no one knows if you're sleeping and they don't bother you because of this.
  • Computers are distracting.
  • Let loved ones know you are going to be "gone", focusing deeply for some time.

Waking Mind vs Background Mind

  • The waking mind is good at critical thinking.
  • Use waking time to assign tasks to background mind.
  • The background mind is good at making connections and good at strategy.

Sleep According to Scientific American:

  • While we sleep, the brain processes information we have learned.
  • Sleep makes memories stronger and weeds out irrelevant details.
  • Our brain finds hidden relations among memories to solve waking problems.

Closing Ideas

Write the proposed solution down. Hammock time is important "mind's eye time". We switch from "input mode" to "recall mode" during hammock time. Wait overnight, or sometimes months, to think about your problem, sleep sober for best results! Eventually coding is required, and your feedback loop is important, but "don't lean on it too much". You will be wrong, facts and requirements will change. Mistakes happen. That's fine, do not be afraid of being wrong. /rant

The notes in this blog post are paraphrased from this rant.

May 12, 2018

A Stroll Through Pandas 1.0, Python’s Tabular Data Powerhouse

Introduction

Thanks to pandas, I have automated some data cleaning and file reading processes at my job. Here are some terms and code that have been useful or interesting to me after 2 years of exploration. I also checked out "Python for Data Analysis" from the Chicago Public Library.

If I could suggest anything to be successful with pandas, it is repetition. I use it nearly every day at work. Dive into its API documentation. There are tons of useful tools there, laid out with meticulous detail and examples. I began learning pandas with this PyCon 2015 tutorial from Brandon Rhodes, it's informative and entertaining! (It's a little dated now but I still recommend it.) The Reproducible Data Analysis in Jupyter video series by Jake VanderPlas is also a great example of pandas-related workflows.

Table of Contents

  1. Python + pandas Installation and Version Compatibility
  2. Welcome to pandas 1.0
  3. Data Wrangling, Exploration and Broadcasting
    • Series.str & Series.dt accessors
    • apply, applymap, lambda and map
    • featuring pandas.to_markdown()
    • SQL operations with df.merge() and pandas.read_sql()
    • pandas.read_clipboard()
    • converting between Series and DataFrame
  4. Turning json API responses into a dataframe with pandas.json_normalize()
  5. Plotting Visualizations with matplotlib
  6. Supplementary Resources and Links

(1) Python + pandas Installation and Version Compatibility

Python 3.6 and higher can install pandas 1.0.

Installing Python 3.8 on Windows

For Windows installation, see the python docs for an installer, "Using Python on Windows".

Installing Python 3.8 on Ubuntu

Follow these steps to download and install Python 3.8 in the Ubuntu terminal. To upgrade to pandas 1.0, I installed Python 3.8, the latest stable release, "from source" on Ubuntu 16.04.

If you intend to use DataFrame.to_markdown() on Ubuntu, it might save you trouble to install the libbz2-dev package before you build Python from source, so that the '_bz2' module is included in the build.

On Ubuntu, I ran into ModuleNotFoundError: No module named '_bz2' and fixed by entering in the terminal:

sudo apt-get install libbz2-dev

I also saw this message when completing install:

The necessary bits to build these optional modules were not found. To find the necessary bits, look in setup.py in detect-modules() for the module's name.

If you need to re-build Python on Ubuntu, enter:

cd /home/erick/Python-3.8.0/
./configure --enable-loadable-sqlite-extensions && make && sudo make install

I installed the missing _bz2 and _sqlite3 modules, then re-built with these commands.
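After a source build, it can help to confirm that the optional modules made it in before going further. Here is a minimal sketch using the standard library's importlib. The list of module names is my own selection: _bz2 and _sqlite3 come from the steps above, while _ssl and _lzma are other optional modules I added as examples.

```python
import importlib.util

# Optional extension modules that a source build can skip without failing.
optional_modules = ["_bz2", "_sqlite3", "_ssl", "_lzma"]

# find_spec() returns None when a top-level module cannot be found.
missing = [name for name in optional_modules
           if importlib.util.find_spec(name) is None]
print("Missing optional modules:", missing or "none")
```

If a name shows up as missing, install its development package (for example libbz2-dev for _bz2) and re-build.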

Installing Older pandas Versions on Ubuntu

The version downloaded with this command is about 6 months behind the current version. For me, this installed pandas 0.17 on Ubuntu:

sudo apt-get install python3-pandas

As of February 2020, this command installs pandas version 0.24 with pip when used with Python 3.5 on Linux Ubuntu 16.04:

python3.5 -m pip install pandas

If pandas is already installed, you can upgrade with pip:

python -m pip install --upgrade pandas

To check if pip is installed:

python -m pip list

Best Practice: Virtual Environments

Create a virtual environment with your new Python version. venv wasn't included in my Python 3.8 installation on Ubuntu 16.04, so I installed virtualenv:

python -m pip install --user virtualenv

Let's create a new virtual environment. Enter in terminal or command prompt:

virtualenv -p python3.8 add_env_name_here

Now, activate your new virtual environment on Linux:

source add_env_name_here/bin/activate

Or activate environment on Windows:

cd add_env_name_here\scripts & activate

If you see "ImportError: Missing optional dependency 'tabulate'. Use pip or conda to install tabulate.", install tabulate, which DataFrame.to_markdown() depends on:

python -m pip install tabulate

To use pd.read_clipboard() on Linux, install xclip or xsel:

sudo apt-get install xclip

I also saw a prompt to install pyperclip:

python -m pip install pyperclip

Now install pandas 1.0 and matplotlib in your virtual environment for visualizations.

python3.8 -m pip install pandas
python -m pip install -U matplotlib

(2) Welcome to pandas 1.0

You did it! Welcome to the good life. The basis of pandas is the "dataframe", commonly abbreviated as df, which is similar to a spreadsheet. Another core pandas object is the pandas.Series object, which is similar to a Python list or numpy array. When imported, pandas is aliased as "pd". The pd object allows you to access many useful pandas functions. I'll use it interchangeably with pandas in this post.

The library’s name derives from panel data, a common term for multidimensional data sets encountered in statistics and econometrics.

pandas: a Foundational Python Library for Data Analysis and Statistics

  • Wes McKinney

(3) Data Wrangling, Exploration and Broadcasting

Data is commonly read in from file with pd.read_csv()

import pandas as pd
file_name = 'my_bank_statement.csv'
# you may sometimes need to specify an alternate encoding: encoding = "ISO-8859-1"
df = pd.read_csv(file_name, encoding='utf-8')
print(df.head())
print(df.shape) # returns a tuple: (# of rows, # of columns)
print(df.dtypes)
df.info()  # prints its summary directly and returns None, so no print() is needed
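The encoding comment above is easier to picture with a file that is not valid UTF-8. This is a small sketch under my own assumptions: the sample bytes are made up, and io.BytesIO stands in for a file on disk. It tries UTF-8 first and falls back to ISO-8859-1 when decoding fails.

```python
import io
import pandas as pd

# Made-up sample data with a non-ASCII character, encoded as ISO-8859-1 bytes.
raw_bytes = "name,amount\nCafé Latte,4.50\n".encode("ISO-8859-1")

try:
    df = pd.read_csv(io.BytesIO(raw_bytes), encoding="utf-8")
except UnicodeDecodeError:
    # The bytes are not valid UTF-8, so retry with the alternate encoding.
    df = pd.read_csv(io.BytesIO(raw_bytes), encoding="ISO-8859-1")

print(df)
```

Falling back blindly can hide genuinely corrupt files, so when you know which encoding the file was exported with, passing it directly is the safer choice.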

Create a dataframe from a list of Python lists, named movies below, with pd.DataFrame:

import pandas as pd

column_names = ["Title", "Release Date", "Character", "Actor", "Movie Budget", "Worldwide Gross"]
movies = [["Ocean's 11", "12/7/2001", "Danny Ocean", "George Clooney","$85,000,000"," $450,728,529"],
["Ocean's 11", "12/7/2001", "Tess Ocean", "Julia Roberts","$85,000,000"," $450,728,529"],
["Runaway Bride", "6/30/1999", "Ike Graham", "Richard Gere","$70,000,000","$309,457,509"],
["Runaway Bride", "6/30/1999", "Maggy Carpenter", "Julia Roberts","$70,000,000","$309,457,509"],
["Bonnie and Clyde", "9/1/1967", "Clyde Barrow", "Warren Beaty","$2,500,000", "$70,000,000"],
["Bonnie and Clyde", "9/1/1967", "Bonnie Parker", "Faye Dunaway","$2,500,000", "$70,000,000"]]

df = pd.DataFrame(movies, columns=column_names)
df = df[["Title","Character", "Actor", "Movie Budget", "Worldwide Gross"]]
# note: pandas 1.1 and later expect index=False in place of showindex=False
print(df.to_markdown(showindex=False, tablefmt="simple"))

The last line prints the table to our terminal with DataFrame.to_markdown(), new in pandas version 1.0.0.

Slicing and sorting a dataframe, removing duplicates, and working with datetime objects

  1. Let's create a new dataframe slice with only two columns
  2. Drop duplicate movies
  3. Convert the dates to datetime objects
  4. Get the year from an array of datetime objects
  5. Set the year as the dataframe index

df = pd.DataFrame(movies, columns=column_names)
date_df = df[['Title', 'Release Date']].drop_duplicates(subset=['Title'])
date_df['Release Date'] = pd.to_datetime(date_df['Release Date'])
# create year column using the pd.Series.dt datetime accessor:
date_df['Release Year'] = date_df['Release Date'].dt.year
date_df = date_df.sort_values(by=['Release Date'])
date_df = date_df.set_index('Release Year')
print(date_df.to_markdown(showindex=False, tablefmt='simple'))
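The .dt accessor only works on a column that has already been converted to datetimes, which is why pd.to_datetime() comes first. Here is a standalone sketch using the three release dates from the movies list. Passing an explicit format is my own addition; it removes any doubt about month/day order.

```python
import pandas as pd

dates = pd.Series(["12/7/2001", "6/30/1999", "9/1/1967"])
# Strings have no .dt accessor, so convert to datetimes first.
parsed = pd.to_datetime(dates, format="%m/%d/%Y")
years = parsed.dt.year.tolist()
months = parsed.dt.month.tolist()
print(years, months)
```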


Applying Broadcasting in pandas

Broadcasting means mapping a function or an arithmetic calculation over an array (using apply or map) or a dataframe (using applymap).
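As a small illustration of two of those methods, here is a sketch on a made-up dataframe with hypothetical column names a and b. It shows Series.map working element by element and DataFrame.apply working one column or one row at a time. applymap is left out here because the formatting example further down already uses it.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Series.map: the function receives one element at a time.
doubled = df["a"].map(lambda x: x * 2)

# DataFrame.apply: the function receives a whole column (default axis=0)...
column_totals = df.apply(lambda col: col.sum())

# ...or a whole row when axis=1.
row_totals = df.apply(lambda row: row["a"] + row["b"], axis=1)

print(doubled.tolist(), column_totals.tolist(), row_totals.tolist())
```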

"Summing up, apply works on a row/column basis of a DataFrame, applymap works element-wise on a DataFrame, and map works element-wise on a Series."

Applying a function to a pandas column

  • Convert columns to int and calculate the difference between two columns.
  • Let's format those integers back to dollars with Python's lambda and pandas' applymap for extra jazz.

def format_dollars_as_int(dollars):
    """Accepts a dollar formatted string, returns an int."""
    number = dollars.replace('$','').replace(',','')
    return int(number)

df = pd.DataFrame(movies, columns=column_names)
df = df.drop_duplicates(subset=['Title'])
df[['Movie Budget','Worldwide Gross']] = df[['Movie Budget','Worldwide Gross']].astype(str).applymap(format_dollars_as_int)
df['Movie Net Income'] = df['Worldwide Gross'] - df['Movie Budget']
money_columns = ['Movie Budget', 'Worldwide Gross','Movie Net Income']
df[money_columns] = df[money_columns].applymap(lambda x:'${:,}'.format(x))

Creating a new column and writing to a .csv file

  • Then add the IMDB ratings of our three films in a new column.
  • Finally, write the result to markdown and a csv file.

# create a new column with the three movies' IMDB ratings
df['IMDB Rating'] = [7.8, 5.5, 7.8]
print(df.to_markdown(showindex=False, tablefmt='simple'))
df.to_csv('Movies.csv', index=False)
print(df.Actor.value_counts().to_markdown(tablefmt="github"))

See also: pandas.Series.value_counts() https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.value_counts.html


Notice for column names without spaces, you can use dot notation instead of brackets:

# both valid ways to access column by name
df.Actor
df['Actor']

Lowercase column names with Python's map function:

df.columns = map(str.lower, df.columns)

Strip whitespace from a column of strings with the pandas.Series.str accessor:

df['Character'] = df['Character'].astype(str).str.strip()

Fix pesky leading zero zip codes with str.zfill():

log_df['zip'] = log_df['zip'].astype(str).str.zfill(5)
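Since log_df is not defined anywhere above, here is a self-contained version with made-up zip codes. It assumes the zips were read in as integers, which is how the leading zeros usually get lost.

```python
import pandas as pd

# Zip codes stored as integers lose their leading zeros (made-up sample values).
log_df = pd.DataFrame({"zip": [2134, 60614, 501]})
log_df["zip"] = log_df["zip"].astype(str).str.zfill(5)
print(log_df["zip"].tolist())
```

You can also avoid the problem at read time by passing dtype={"zip": str} to pd.read_csv(), which keeps the column as text from the start.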

Get a row by index label with pandas.DataFrame.loc[] (use .iloc[] to select by integer position instead):

first_row = df.loc[0, df.columns]
third_row = df.loc[2, df.columns]
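The difference between labels and positions matters once rows have been dropped. In this sketch, which uses only the movie titles from earlier, drop_duplicates leaves the index labels 0, 2 and 4, so label 2 and position 2 point at different rows.

```python
import pandas as pd

df = pd.DataFrame({"Title": ["Ocean's 11", "Ocean's 11", "Runaway Bride",
                             "Runaway Bride", "Bonnie and Clyde", "Bonnie and Clyde"]})
df = df.drop_duplicates(subset=["Title"])  # surviving index labels: 0, 2, 4

by_label = df.loc[2, "Title"]      # the row whose label is 2
by_position = df.iloc[2]["Title"]  # the third row, whatever its label
print(list(df.index), by_label, by_position)
```

Calling reset_index(drop=True) after dropping rows makes labels and positions line up again.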

Filter the df to get rows where the actor is 'Julia Roberts'.

julia_roberts_movies = df[df.Actor=='Julia Roberts'].reset_index(drop=True)
print(julia_roberts_movies.head())

"Get" an item from a column of lists with str.get().

# returns first item in each cell's list into new column
df['first_item'] = df['items'].str.get(0)

Execute SQL-like operations between dataframes with df.merge().

First, use df.copy() to create a new dataframe copy of our actors table above. By default, df.merge() uses an inner join to merge two dfs on a common column. Let's add each film's release year from our date_df to our original actors table, with an inner join based on 'Title':

actors = df.copy(deep=True)
# slice only the columns we want to merge:
date_df = date_df.reset_index()[['Title','Release Year']]  # 'Release Year' was set as the index earlier
joined_df = actors.merge(date_df, on='Title', how='inner')
# You can pass the number of rows to see to head. It defaults to 5.
print(joined_df.head(10))
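To see what the how argument changes, here is a sketch on two tiny made-up tables. "Some Other Film" is a hypothetical title with no match in the years table. The indicator=True flag is my own addition; it adds a _merge column showing where each row came from.

```python
import pandas as pd

actors = pd.DataFrame({"Title": ["Ocean's 11", "Runaway Bride", "Some Other Film"],
                       "Actor": ["George Clooney", "Richard Gere", "Julia Roberts"]})
years = pd.DataFrame({"Title": ["Ocean's 11", "Runaway Bride"],
                      "Release Year": [2001, 1999]})

# inner: only titles present in both tables survive
inner = actors.merge(years, on="Title", how="inner")
# left: every row of actors is kept; unmatched rows get NaN for the year
left = actors.merge(years, on="Title", how="left", indicator=True)

print(inner)
print(left)
```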

Execute database queries with pd.read_sql().

When the chunksize argument is passed, pd.read_sql() returns an iterator. We can use this to iterate through a database with lots of rows. When combined with DB connection libraries like pyodbc or SQLAlchemy, you can process a database in chunks. In this example, it's an Access DB connection via pyodbc to process 500,000 rows per chunk. Pyodbc works on a wide range of other databases also.

The code uses pd.Series.isin() to check whether each email is in the DB.

import pandas as pd
import pyodbc

# placeholder addresses to look up
emails = ['first@example.com', 'second@example.com', 'third@example.com']
connection_string = r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\path_to_db\emails_database.accdb;'
print(connection_string)
conn = pyodbc.connect(connection_string)
query = """
    SELECT *
    FROM   ADD_TABLE_NAME
    """
dfs = list()
for i, db_chunk in enumerate(pd.read_sql(query, conn, chunksize=500000)):
    emails_in_db = db_chunk[db_chunk.Email.isin(emails)]
    dfs.append(emails_in_db)
    print(i)
emails_in_db = pd.concat(dfs)
emails_in_db.to_csv('DB_Email_Query_Results.csv', index=False)

In case you are wondering, enumerate is a Python built-in for enumerating, or counting, an iterable (e.g. a list or generator) as you iterate through it.
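If you don't have an Access database handy, the same chunked pattern can be tried with Python's built-in sqlite3 module. This is only a sketch: the in-memory database, the contacts table and the addresses are all made up, and the chunk size is tiny so the loop runs more than once.

```python
import sqlite3
import pandas as pd

# An in-memory SQLite database stands in for the Access file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (Email TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(10)])
conn.commit()

emails = ["user3@example.com", "user8@example.com", "nobody@example.com"]
matches = []
for i, db_chunk in enumerate(pd.read_sql("SELECT * FROM contacts", conn, chunksize=4)):
    # Each chunk is a dataframe holding at most 4 rows.
    matches.append(db_chunk[db_chunk.Email.isin(emails)])
    print(f"chunk {i}: {len(db_chunk)} rows")
emails_in_db = pd.concat(matches, ignore_index=True)
conn.close()
print(emails_in_db)
```

Only the matching rows from each chunk are kept, so memory use stays bounded by the chunk size rather than the table size.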

Using pd.read_clipboard():

import pandas as pd
clipboard_contents = pd.read_clipboard()
print(clipboard_contents)

Use df.to_clipboard() to store a dataframe as clipboard text:

import pandas as pd
truths = ['pandas is great','I love pandas','pandas changed my life']
df = pd.DataFrame(truths, columns=['Truths'])  # one row per string
df.to_clipboard(index=False, sep='|')  # returns None, so don't reassign df
clipboard_contents = input('Press ctrl-v ')
print(clipboard_contents)

Convert the clipboard contents to df with pd.DataFrame() https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html:

import pandas as pd
clipboard_contents = input('Press ctrl-v ')
df = pd.DataFrame([clipboard_contents], columns=['Clipboard Data'])  # one cell holding the pasted text
print(df.head())

If the clipboard dataframe has one column, you could squeeze the clipboard contents into a pd.Series object:

import pandas as pd
# read_clipboard() already returns a dataframe; the first clipboard line becomes the header
df = pd.read_clipboard()
# squeeze a one-column dataframe into a Series
clipboard_series = df.squeeze(axis='columns')
print(type(clipboard_series))

Inversely, consider using pandas.Series.to_frame() to convert a Series to a dataframe:

import pandas as pd
clipboard_contents = pd.Series(input('Press ctrl-v '))
df = clipboard_contents.to_frame()
print(df.head())

(4) Turning json API responses into a dataframe with pd.json_normalize()

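Update: beginning in pandas 1.0, json_normalize is available in the top-level pandas namespace as pd.json_normalize(), used below. In older pandas versions, import it with from pandas.io.json import json_normalize.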

import pandas as pd
import requests
url = 'https://pseudo_API.com/endpoint/'  # placeholder URL; requests needs the scheme (https://)
parameters = {'page_size': 100, 'format': 'json', 'api_type': 'contact_sync'}
response = requests.get(url=url, params=parameters)
data = response.json() # decode response into json
df = pd.json_normalize(data['any_key'])

pandas.json_normalize() is now exposed in the top-level namespace. Usage of json_normalize as pandas.io.json.json_normalize is now deprecated and it is recommended to use json_normalize as pandas.json_normalize() instead (GH27586).

What's new in pandas 1.0.0
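Because the endpoint above is a placeholder, here is a sketch that runs without a network call. The payload is made up to resemble a typical JSON response with nested objects. It shows how pd.json_normalize() flattens nested keys into dotted column names.

```python
import pandas as pd

# A made-up response body shaped like a typical JSON API payload.
data = {
    "results": [
        {"id": 1, "contact": {"email": "a@example.com", "city": "Chicago"}},
        {"id": 2, "contact": {"email": "b@example.com", "city": "Austin"}},
    ]
}

# Nested dictionaries become columns named parent.child.
df = pd.json_normalize(data["results"])
print(df.columns.tolist())
print(df)
```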

(5) Plotting Visualizations with matplotlib

Make a bar plot of the movie release year counts using pandas and matplotlib formatting.

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
import matplotlib.ticker as ticker

column_names = ["Title", "Release Date", "Character", "Actor"]
rows = [["Ocean's 11", "12/7/2001", "Danny Ocean", "George Clooney"],
["Ocean's 11", "12/7/2001", "Tess Ocean", "Julia Roberts"],
["Runaway Bride", "6/30/1999", "Ike Graham", "Richard Gere"],
["Runaway Bride", "6/30/1999", "Maggy Carpenter", "Julia Roberts"],
["Bonnie and Clyde", "9/1/1967", "Clyde Barrow", "Richard Gere"],
["Bonnie and Clyde", "9/1/1967", "Bonnie Parker", "Julia Roberts"]]
df = pd.DataFrame(rows, columns=column_names)
# derive the Year column that the plot counts
df['Year'] = pd.to_datetime(df['Release Date']).dt.year
ax = df.Year.value_counts().plot(ylim=0, kind='bar', title='Release Year of Movies', rot=0)
ax.yaxis.set_major_locator(MaxNLocator(integer=True))
fig = ax.get_figure()
fig.tight_layout()
fig.savefig('images/Movie_Plot.png')

Use Jupyter Notebook to show plot, and/or download plot from command line.

Plot George Clooney's movies over time in a line graph. [Source]

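import matplotlib.pyplot as plt
# keep one actor's rows; .copy() lets us add a column without a SettingWithCopyWarning
clooney_df = df[df.Actor=='George Clooney'].copy()
clooney_df['Year'] = pd.to_datetime(clooney_df['Release Date']).dt.year
ax = clooney_df.groupby(['Year']).size().plot(ylim=0)
fig = ax.get_figure()
fig.savefig('figure.pdf')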

Jan 20, 2018

How to Install Libraries and Enable the pip Installer in Python

Python comes with a bunch of standard modules. My favorites are shutil, glob, datetime, time, os (operating system), re (regular expressions) and webbrowser. The standard library is loaded.

Inevitably, you'll want to install new libraries from Python's rich ecosystem of external modules. Enter pip, Python's handy package manager and people's champion.

This post will teach you some Python history, show how to install pandas, and help you troubleshoot problems if it's not working. You'll find Windows and Linux commands for venv setup (recommended). With pip, you'll feel like Neo when installing new modules. Any skill is at your fingertips. It's like learning kung fu. There's probably a library for that!

I know kung fu

First, Some Python Version Caveats + History

Python 2 reached end of life on January 1st, 2020. Python 2 has officially been sunset.

Python comes with pip now, no setup is required. But certain versions such as Python 3.2 or the Python 2.7 that came stock on my improbably still functioning 2008 black Macbook, for example, may not have it installed.

In December 2021, Python 3.6 reached "end of life phase". Python 3.6 is "now effectively frozen". Read more in PEP 494.

TLDR: use Python 3.7 to 3.11. This blog endorses using the lightning fast Python version 3.11 (released Oct. 2022).

Enter This in Your Terminal

python -m pip install pandas

Pandas is a super useful library for wrangling spreadsheet data, AKA "tabular" data. If successful, you should see activity that looks similar to the below screenshot, where I am installing openpyxl, an additional Python Excel library you'll likely want. You are good to go! This is the part where you get to feel like Neo! See Installing Python Modules in the Python Documentation for more detailed instructions.

neo_pip

To view all your installed libraries, enter:

pip list

Write a "requirements.txt" of installed libraries:

pip freeze > requirements.txt
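The same information is available from inside Python. Here is a rough sketch using the standard library's importlib.metadata (Python 3.8+) that builds name==version lines similar in spirit to pip freeze. It is not how pip itself produces its output, and the formatting is my own.

```python
from importlib import metadata

installed = []
for dist in metadata.distributions():
    name = dist.metadata.get("Name") if dist.metadata is not None else None
    if name:  # skip entries with missing or broken metadata
        installed.append(f"{name}=={dist.version}")

installed.sort(key=str.lower)
print("\n".join(installed))
```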

You can list your outdated packages with the --outdated argument:

pip list --outdated

Use pip's -h help command line argument:

pip -h

View your system and user pip config settings:

pip config debug

Supplementary Resources

Congrats on figuring out how to install packages with pip, have fun!

Having issues? Try upgrading your pip version.

python -m pip install --upgrade pip

Try the ensurepip command.

This command will install and upgrade pip to the newest version. New in Python 3.4:

python -m ensurepip --upgrade

"The ensurepip package provides support for bootstrapping the pip installer into an existing Python installation or virtual environment. This bootstrapping approach reflects the fact that pip is an independent project with its own release cycle, and the latest available stable version is bundled with maintenance and feature releases of the CPython reference interpreter."

- ensurepip Python Documentation

You should follow best practice and create a virtual environment, with either venv or virtualenv, before installing libraries. To create one with venv:

python3 -m venv add_env_name_here

After your environment is created, activate it with the first command below, then install a library on Ubuntu Linux:

source add_env_path_here/bin/activate
python -m pip install pandas

Alternatively, on Windows computers:

cd add_env_path_here\scripts & activate
python -m pip install pandas
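If you are not sure whether activation worked, Python can tell you. A minimal sketch, assuming an environment created with venv or a recent virtualenv; older virtualenv releases marked environments differently and are not handled here. The function name in_virtualenv is my own.

```python
import sys

def in_virtualenv():
    """Return True when this interpreter is running inside a virtual environment."""
    # Inside an environment, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print("Inside a virtual environment:", in_virtualenv())
print("Interpreter:", sys.executable)
```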

Getting the prefix right can be tricky.

In the install command, the prefix is a reference to your Python executable. You may just need to alter your prefix to call it correctly. Here are some to try in place of "python". Observe what happens when you run these command variations. Good luck!

python3 -m pip install pandas
python3.11 -m pip install pandas
py -m pip install pandas
pip3 install pandas
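To find out which of these prefixes exist on your machine, you can ask Python to search your PATH. A small sketch using only the standard library; the list of candidate names mirrors the commands above.

```python
import shutil
import sys

# The interpreter running this script. Using this path with "-m pip" targets it directly.
print("Current interpreter:", sys.executable)

# Check which command names can be found on PATH.
candidates = ["python", "python3", "py", "pip", "pip3"]
found = {name: shutil.which(name) for name in candidates}
for name, path in found.items():
    print(f"{name:8} -> {path or 'not found'}")
```

Calling pip through a specific interpreter (python -m pip) guarantees the library lands in that interpreter's packages, which a bare pip3 does not.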

How to Manually Enable the pip Installer

The rest of this post may be useful to you if you are:

  1. Working on legacy Python 2 or a Python 3 older than 3.4, for which pip may not be installed.
  2. Seeking to fix a faulty pip install that is not working properly.
  3. Curious to know how to manually set up pip.

This assumes Python is already installed. If you're running Windows 10, I found it easy to install Python from the Windows store. Download the get-pip.py file. Go to the link, right click the page and "Save As" a .py file to download. Then place the file where you want to access it. I placed mine in C:\Python27\Lib\site-packages

You could also download the file with curl:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

If you are not sure where your site-packages folder is, type python -m site into command prompt for file path ideas.
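You can also ask the interpreter directly. This sketch uses the standard library's sysconfig module; "purelib" is the key for the folder where pure-Python packages are installed for the interpreter running the script.

```python
import sysconfig

# The site-packages folder used by the interpreter running this script.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```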

Run the get-pip.py file.

Using command prompt's cd command with a Windows "&" operator to run the Python file in a Windows command prompt:

cd c:\Python27\Lib\site-packages & python get-pip.py

Or Linux terminal:

cd /Python27/Lib/site-packages && python get-pip.py

You should see some activity in command prompt that shows installation/updating of "setuptools" and "wheel". When it finishes, you have installed pip.

Type into command prompt at the same location:

python -m pip install requests

This installs the Requests module into your Python libraries. Requests is an HTTP module that is almost universally well regarded by the Python community.


Thanks for reading!
Check out these other posts with pip installed library examples:

fix Grammar and Spelling with language_tool_python and textblob

static site generation with pelican

text mojibake mash fixing with ftfy

a guide to making HTTP requests

simple GUI for scripts with gooey

Jan 14, 2018

Python File Handling Basics

The basis of many great programs revolves around a simple set of operations:

  1. Open a file.
  2. Do something with the file contents.
  3. Save the new file for the user.

Python is nice and simple for this. Paste the below lines into a text editor and save as a .py file. You need to have Python 3 installed. In the same folder as your .py file, save a .txt file with some words in it. Alright, let's write some code:

1
2
3
4
5
file_name = input("Enter your file name. e.g. words.txt")
file_handle = open(file_name, "r")
lines = file_handle.readlines()
print(lines)
file_handle.close()

In line 1, we ask the user to enter their file name with Python's input function. When the program runs, the user enters their text file name with extension. This line stores the name in a variable called file_name.

In line 2, we open your text file and store it in a variable I have named file_handle. Think of the file handle as a bridge between your code and the text file. Quick point about the 'r' above: that tells the program to open the file in "Read" mode. There are several different file modes in programming. Some modes are just for reading an existing file, some are just for writing a new file, and some are capable of both. This Stack Overflow post is well written and details the differences between file modes. Once established, the file handle allows you to read the file's contents or write new contents to the file.

In line 3, we are calling the .readlines() method on our file handle. This method takes the file contents and stores them, line by line, into a list named "lines". An alternative method is .read(), which opens the file and stores its contents as one string. Try switching this out in place of  .readlines() to check out the difference.

In line 4, we are printing the stored lines to show them to the user. We now have the file contents, ready to be used however we please.

In line 5, we are closing the file.
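To compare .read() and .readlines() side by side, here is a sketch that first creates a small throwaway file so it runs anywhere. The temporary folder and the sample text are my own additions. It uses the with statement, which closes each file automatically.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on words.txt existing.
folder = tempfile.mkdtemp()
file_name = os.path.join(folder, "words.txt")
with open(file_name, "w") as file_handle:
    file_handle.write("first line\nsecond line\n")

with open(file_name, "r") as file_handle:
    as_string = file_handle.read()       # one string with embedded newlines
with open(file_name, "r") as file_handle:
    as_list = file_handle.readlines()    # a list with one string per line

print(repr(as_string))
print(as_list)
```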

Below, we are going to write a new file using the with statement, which is generally accepted as the best way to read or write a file:

with open("Notes.txt", "w") as fhand:
    fhand.write("Did you know whales can live up to 90 years?")

In line 1, we call the same open function that we used in the first example, but this time, notice the "w". This indicates that we are opening the file in "write" mode. The with statement stores the file handle in a variable named fhand and closes the file for us when the indented block ends.

In line 2, we call the .write() method on our file handle, fhand, and pass it our text to be saved in our new file. This completes the creation of Notes.txt in the same folder as our .py program file.
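File modes are easiest to understand by watching what they do to an existing file. A short sketch: it writes to a temporary folder (my own choice, to avoid touching your files) and shows that "w" replaces the contents while "a" adds to the end.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Notes.txt")

with open(path, "w") as fhand:   # "w" creates the file
    fhand.write("first draft\n")
with open(path, "w") as fhand:   # "w" again wipes the file before writing
    fhand.write("second draft\n")
with open(path, "a") as fhand:   # "a" appends to what is already there
    fhand.write("an extra note\n")
with open(path) as fhand:        # the mode defaults to "r"
    contents = fhand.read()

print(contents)
```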

Your program is now ready to be run. Double-click your .py file to execute it.

Before learning Python, file operations were a mystery to me. It took me a while to understand this clearly, and I wanted to share. Once you master these basic file operations, programming gets to be a lot more fun. Do try it out for yourself :D

Aug 05, 2017

Oversimplified Javascript Terms

I'm finally coming around in my understanding of Javascript. Here are a few quick explanations to help you if you are new to it.

Javascript = The language of the web. Most commonly used as a complement to HTML and CSS to create an interactive website.

jQuery = A popular Javascript library with many powerful commands that are quick and easy to call.

Node.js = Software that allows you to run Javascript outside of a web browser, for example from the command line.

Express.js = A popular Node.js framework.

Angular = A popular front-end Javascript web framework. There are many out there but this seems to be the one I've heard of the most.

I've begun to see a pattern with programming languages:
1) Learn to execute the core building blocks. (using variables, loops, functions, etc.)
2) Learn more advanced libraries, documentation and uses.
3) Consider using and learning web frameworks depending on what you're trying to do with the language.
4) Practice to hone your knowledge. Build things you like.

I've also noticed that Javascript has been easier to learn than Python was for me, because it's not my first programming language. The concepts are the same. The syntax can trip me up at times, but I'm currently flying through Codecademy's Javascript courses. Sometimes it even seems fun!

Jul 28, 2017

Should You Go To Programming School?

There is no one-size-fits-all answer. Below are some thoughts that may help you decide.

  1. What are your programming goals? Get a coding job? Create an app or website? Become more productive at your current job?
  2. What is your current experience level? Are you starting fresh or do you already know a language or two?
  3. Do you have money saved up? Otherwise, you might need to take out a loan.

A computer science degree is typically most expensive. Coding bootcamps are a lower cost option that pack a wide curriculum into a few weeks or months, but they can still be pricey. The cheapest option is to take a piecemeal approach through various online courses.

School Advantages

  • Holistic approach. You get the ins and outs of programming from a proven curriculum.
  • Community. You learn with other students and from experienced teachers.
  • Job placement. Schools and bootcamps will often connect you to a company.
  • Credentials. You gain confidence and the backing of your skills by an established institution.

Potential Downsides

  • Tuition Money. A lot of what you need to know is available for free or cheap on the web.
  • Skill level match. Some bootcamps are oriented for beginners, others are more advanced. If you do a bootcamp, make sure it fits your skill level.

If you want a coding job, school makes sense. The bootcamps look to be effective if you can handle the up-front investment. It's possible to land a job without schooling but much tougher. I am currently considering Full Stack Academy and Coding Dojo. There are many out there. Codecademy is a popular route as well.

If you want to make an app or website, the school or the non-school route may both work. For the non-school route, the following languages are good places to start: (note - not a comprehensive list, these are my picks.)

  • Web App or Website: HTML, CSS, Python, Javascript
  • Web App or Website Framework: Flask, Django, py4web, Ruby on Rails, React
  • iOS app: Swift plus Apple's Xcode environment, Beeware (python library)
  • Android App: Java or Kotlin, Beeware
  • General Coding: Python or Ruby

If you want to be more productive at work, I recommend learning Python. More on Python and where to start here. Automate the Boring Stuff With Python is a great resource for boosting your productivity also.

It's not easy to decide whether or not school is for you. I'm still unsure after a year and a half of programming on the side. No matter what, continue to learn multiple languages and strive for a better grasp of the ones you know. Good luck!

My decision: continue self-study and learning online for free.

As of 8 months after writing this post, I have concluded that learning for free online was the right choice for me. I've achieved many of my programming goals in the last three years, thanks to materials available from Codecademy, Coursera, YouTube, Stack Overflow, countless helpful resources, interesting blogs, and documentation. I've talked with others who need the in-person assistance that a bootcamp offers to learn. Do what works for you. Good luck with your decision.

Update: Several years later, I also get paid to use Python and Excel for a living! I studied for free online intermittently over 2 years to achieve it.