How do you calculate stock valuation metrics like the Sharpe ratio? Recently I've been reading about common stock valuation metrics and wondering how I can apply them to my own portfolio. I started sketching formulas and entertained writing a script to calculate these metrics from scratch. But Python has no shortage of finance-related libraries. After some furious googling I found ffn, a far better option than rolling my own formulas. It's a "financial functions" library, installable with pip.
It will be interesting to observe how these metrics vary in my portfolio and learn more of ffn's API. I like that they use pandas dataframes within their library because I'm already familiar with them. At minimum, it's good to understand what stock formulas purport to measure and what it means if the measure is low or high. It makes sense to compare stocks in similar industries or competitors like NKE and ADDYY. This is a neat tool for stock nerds who want to level up their stock analysis, make smart decisions and ideally pad the portfolio!
The funny thing is... my lowest university grade was a "C" in my only Finance class. It wasn't easy for me to grasp. But with Python it's a little more interesting and easier to apply. Anyone can level up their finance skills thanks to a cornucopia of finance calculation libraries in the Python ecosystem.
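For intuition before reaching for a library, here is a minimal sketch of one of those formulas, the annualized Sharpe ratio, computed with plain pandas on synthetic daily returns (risk-free rate assumed zero; in practice ffn's built-in stats are the better option):

```python
import pandas as pd

# Synthetic daily returns stand in for real price data.
returns = pd.Series([0.01, -0.005, 0.02, 0.0, 0.01])

# Annualized Sharpe ratio: mean excess return over volatility,
# scaled by sqrt(252) trading days. Risk-free rate assumed zero here.
risk_free_rate = 0.0
excess = returns - risk_free_rate
sharpe = excess.mean() / excess.std() * (252 ** 0.5)
print(round(sharpe, 2))  # 11.4 for this toy series
```

A higher Sharpe ratio suggests better risk-adjusted returns, which is why it's useful when comparing two stocks in the same industry.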
import ffn

# ffn.get returns a pandas dataframe of market data.
data = ffn.get('tsla,spot,nflx,nke,addyy', start='2019-01-01', end='2020-12-31')
print(data.head())
stats = data.calc_stats()
print(stats.display())
side note on the pyfolio library
I first considered using pyfolio to pull stock data. It isn't quite ready "out of the box" to deliver the results pictured in its "single stock" example documentation. You'd need to find another way to fetch your market data or somehow patch the Yahoo Finance API within pyfolio. I preferred ffn, mostly because it worked right away after pip installing it and running the code above.
2024 Update
For a capable finance module, I recommend yfinance. It has worked well for me also.
ffn and pyfolio both depend on the Yahoo Finance API, which tends to change and break these libraries.
Troubleshooting traceback errors may be required.
Try these other Python financial analysis libraries:
My goal was to automate posting positive photos and quotes to my Facebook page, "Positive Thoughts Daily", with the Unsplash and Facebook APIs. Here's how I did it!
This implementation relies on my own collection of photos on Unsplash. I manually select the photos I like and add them to my collection. Then my app takes the new photos and posts one a day for me.
Side note: the free version of the Unsplash API is capped at 50 requests per week, which was enough for me.
Install the python-unsplash library. In the terminal enter:
python -m pip install python-unsplash
Decide what photo you want to post. This example fetches a random photo from my Unsplash collection. You can also fetch any photo at random, or pass in a query to get a certain type of photo.
from unsplash.api import Api
from unsplash.auth import Auth
import requests

"""Python-Unsplash library Github:
https://github.com/yakupadakli/python-unsplash
"""

client_id = "add_your_client_id"
client_secret = "add_your_client_secret"
redirect_uri = "add_your_redirect_uri"
code = ""

auth = Auth(client_id, client_secret, redirect_uri, code=code)
api = Api(auth)

# returns a python list containing a class
image = api.photo.random(collections=66610223)  # my collection id
image_id = image[0].id

# Use image_id to get random photo's download link from a collection.
url = f"https://api.unsplash.com/photos/{image_id}/download?client_id={client_id}"
r = requests.get(url)
print(r.text)
image = r.json()
download_link = image['url']
Posting the Unsplash Image to Facebook
"""Use download link and post to page with Facebook API."""page_id="add_page_id_from_about_section"url=f"https://graph.facebook.com/{page_id}/photos?access_token={page_access_token}&url={download_link}"r=requests.post(url)post_ids=r.json()print(post_ids)
Post Project Reflections
This was my first time working with the Facebook API. Honestly, it's a little crazy trying to balance all the token types in your head. There are about five different types of tokens used for different things! There is a bit of a learning curve, but ultimately I was able to figure out how to post a photo. It's a good challenge to build your API skills. The Unsplash API requires no OAuth tokens and is easier to pick up.
My Facebook page posts are now triggered by page loads on this website! I am using a MySQL database to track which images I post to make sure I don't duplicate any posts and to make sure I only post once every 24 hours. Ah, I love the smell of fresh automation in the morning. 😀
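The dedupe-and-throttle check boils down to a small piece of logic. This is a hypothetical sketch of it as a pure function; the real version pulls the posted image ids and last post timestamp from the MySQL table:

```python
from datetime import datetime, timedelta


def ok_to_post(image_id, posted_ids, last_posted_at, now=None):
    """Return True only if this image hasn't been posted before
    and at least 24 hours have passed since the last post."""
    now = now or datetime.now()
    if image_id in posted_ids:
        return False  # duplicate: already posted this image
    if last_posted_at and now - last_posted_at < timedelta(hours=24):
        return False  # throttle: posted too recently
    return True


# An already-posted image id is rejected.
print(ok_to_post("abc123", {"abc123"}, None))  # False
```

The page-load trigger then only posts when this check passes, which keeps the page from spamming duplicates.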
Go is a popular programming language choice so my ears perked up when this lecture began. These notes were taken as the professor explains why he teaches his class in Go. He also mentioned he'd be able to teach it with Python or Java. He used C++ years ago.
The beginning of this lecture was a great summary of:
key benefits of Golang
what threads are and why they're great
how Go, threads and async tie together
Go is Good for Distributed Systems
Go is concurrency-friendly. With concurrent threads, you can effectively split a task such as making web requests to a server into many threads, completing them simultaneously.
Event-driven programming: a single thread runs a single loop that waits for events.
Combining Threads and Event Driven Programming
"Create one thread for each procedure call."... "On each of the threads run a stripped down event driven loop. Sort of one event loop per core. That results in parallelism and I/O concurrency."
-Prof. Robert Morris
Postface: Concurrent Python Context
I've rarely if ever used multiple threads in Python. Simply running a single threaded script seems sufficient for most of my tasks. Maybe I could speed up API requests by splitting into threads when making a few hundred thousand requests? Apparently I'm missing out on concurrent threading efficiency gains.
I once experimented with the multiprocessing module's Process class, which worked on Linux but not Windows for me. I ended up taking a simpler, single-threaded approach instead. I've also heard of using multiprocessing pool objects. There are also the asyncio library and the concurrent.futures module to consider. The ProcessPoolExecutor looks promising.
Python also has the queue module. I haven't used it yet but at one point I watched a talk where Raymond Hettinger recommended queue as a good option if you want concurrency in Python.
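For a taste of what the queue module offers, here is a minimal producer/consumer sketch with queue.Queue and a single worker thread (a simplified pattern, not from Hettinger's talk):

```python
import queue
import threading

q = queue.Queue()
results = []


def worker():
    # Pull items until the producer signals completion with a None sentinel.
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * 2)
        q.task_done()


t = threading.Thread(target=worker)
t.start()

for n in range(5):
    q.put(n)
q.put(None)  # sentinel: tell the worker to stop
t.join()
print(sorted(results))  # [0, 2, 4, 6, 8]
```

queue.Queue handles the locking for you, which is much of why it gets recommended for thread communication.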
It seems there are many options available in Python but it's not clear which tools should be deployed and when. And your chosen concurrency strategy may add extra complexity. Handle with care. Or consider learning Go if you want to use threads to scale your distributed system.
Update: Python Concurrency Success
I recently deployed the ThreadPoolExecutor from the concurrent.futures module to efficiently move thousands of files to a new folder. So Python does have fairly accessible alternatives to concurrency. I guess I'll need to try Go sometime to compare!
from concurrent.futures import ThreadPoolExecutor
import numpy as np
import shutil
import os


def main():
    """Move files concurrently from the current working directory to a new folder.

    This script is adapted from the Python ThreadPoolExecutor documentation:
    https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor.shutdown
    """
    csvs = [f for f in os.listdir(os.getcwd()) if '.csv' in f]
    # Integer division: np.array_split needs a whole number of batches.
    split_num = len(csvs) // 4 + 1
    file_batches = np.array_split(csvs, split_num)
    # Write to a local folder named "csvs".
    dst_folder = "csvs"
    with ThreadPoolExecutor(max_workers=4) as e:
        for files in file_batches:
            # The last batch may hold fewer than four files, so submit
            # each file individually rather than unpacking exactly four.
            for csv in files:
                e.submit(shutil.move, csv, dst_folder)


if __name__ == '__main__':
    main()
Sometimes a spark comes from seemingly nowhere. That's when you reach for your tools and create. After a series of successful experiments,
I decided this stack might be my quickest, best shot to get a functional website up and running in Python.
I was pleasantly surprised to make rapid progress over the span of a quarantine weekend.
Here are the steps to create a MySQL backed website with Flask.
Hosting With pythonanywhere
pythonanywhere is a web hosting service like GoDaddy.
If you host your app with them, MySQL is the default database. Postgres integration is available at higher price tiers.
To get your Flask app's database up and running you need to:
Create your database (see the "Databases" tab in pythonanywhere)
Use the mysql terminal to create your tables
Use the mysql.connector API to connect to your table and execute SQL from your Flask app.
Using "ANALYZE TABLE" tooptimize themis periodically recommended by MySQL:
1
ANALYZETABLEMarijuana;
Installing Libraries in PythonAnywhere
You can use pip to install python libraries within the PythonAnywhere bash terminal. Go to the consoles tab and start a new bash terminal. Then to install a library, such as pandas:

pip install pandas
import mysql.connector
from flask import Flask, request
import pandas as pd
from datetime import date
import logging
import sys

app = Flask(__name__)
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)


@app.route("/")
def hello_world():
    """Call database and return data from df. Then display homepage."""
    try:
        email_df = get_database_table_as_dataframe()
        print(email_df.shape)
        html_page = render_homepage()
        return html_page
    except:
        logging.exception("Failed to connect to database.")


def render_homepage():
    """Note: you should use Flask's render_template to render HTML files.
    However, for example you can make a quick f-string HTML page that
    works in this code.
    """
    html_page = f"""<html><head>
        <link rel='stylesheet' href="/static/styles/some_file.css">
        <link rel="shortcut icon" type="image/x-icon" href="static/favicon.ico">
        <title>Dispensary Alerts</title></head>
        <body><h2></h2>
        <p>Get alerts for your dope.</p><br>
        <h6><b>Sign Up</b></h6><br>
        <div class="form">
        <form action="/add_signup_to_db" method="post"
            style="width:420px;text-align:center;display:block;">
        <input type="text" name="Signup Form">
        <input type="submit" value="Submit">
        </form></div><br><br>
        <p><b>Current Time:</b> {str(date.today())}</p>
        </body></html>"""
    return html_page


def get_database_table_as_dataframe():
    """Connect to a table named 'Emails'. Returns pandas dataframe."""
    try:
        connection = mysql.connector.connect(
            host="username.mysql.pythonanywhere-services.com",
            db="username$DatabaseName",
            user="username",
            password="password",
        )
        email_df = pd.read_sql(sql="""SELECT * FROM Emails""", con=connection)
        logging.info(email_df.head())
        return email_df
    except:
        logging.exception("Failed to fetch dataframe from DB.")
        return "Oops!"


@app.route("/add_signup_to_db", methods=["GET", "POST"])
def add_signup_to_db():
    """Pass data as SQL parameters with mysql."""
    # Flask routes receive no positional arguments; read the form value
    # submitted by the homepage's sign-up form instead.
    email = request.form.get("Signup Form", "")
    signup_date = str(date.today())
    try:
        connection = mysql.connector.connect(
            host="username.mysql.pythonanywhere-services.com",
            db="username$DatabaseName",
            user="username",
            password="password",
        )
        cursor = connection.cursor()
        sql = """INSERT INTO Emails (message, date) VALUES (%s, %s) """
        record_tuple = (email, signup_date)
        cursor.execute(sql, record_tuple)
        connection.commit()
    except mysql.connector.Error as error:
        logging.info("Failed to insert into MySQL table {}".format(error))
    except:
        logging.exception("Error inserting records to DB.")
    finally:
        if connection.is_connected():
            cursor.close()
            connection.close()
            return "MySQL connection is closed"
Iterative Development
Below: making my website look less like a "my first HTML" website, experimenting with my app's message name and adding a sign-up form connected to the database.
Note: if you see this error when making a request in pythonanywhere:
OSError: Tunnel connection failed: 403 Forbidden
It's likely because you are "whitelisted" on the free plan. Upgrading to the $5/month plan will fix it!
Scoping The Full Stack
I'm really enjoying this web development stack. Here are all of the tools and library choices for this website:
Flask is a little scary at first, but reasonable once you get a grasp of the basic syntax. Using the logging module to establish access, error
and server log feeds was a big step to finding my Python traceback fixing groove. It's a work in progress.
Recapping My Python Web Development and Database Experiences
I previously created a website with web2py,
another Python web framework like Flask and Django. I think it was a decent choice for me at that point in my Python journey. Back then,
I connected a MongoDB back-end to web2py. I randomly picked Mongo out of the DB hat and it worked well enough.
Of these two diverse Python stacks, I favor MySQL and Flask. But I learned a lot from watching web2py's tutorial videos and it's less intimidating for beginners. And I barely scratched the surface of web2py's "pure Python" pyDAL (Database Abstraction Layer), which seems pretty dope.
web2py's creator has a new framework in progress called py4web.
It has the same DAL and inherits many other web2py qualities.
Definitely looking forward to exploring the DAL on my first py4web website. I'll likely use it to connect to PostgreSQL or SQLite.
Maybe I'll install pyDAL with pip in the meantime.
Final Thoughts
Both of my websites are hosted with pythonanywhere, which gives you a text editor and bash terminal
to run your scripts in a shell environment. I'm so pleased with all of these tools. They fit together smoothly and made creating my website a fun experience.
This post contains all of my best API-related knowledge picked up since learning how to use them. All APIs have their own style, quirks and unique requirements. Below are general terminology, tips and examples if you're looking to tackle your first API.
Here's what is covered:
API & HTTP Lingo You Should Know
Testing and Exporting Python Request Code from Postman (Optional)
Formatting Your Request
Example GET and POST Requests
"Gotchyas" To Avoid
Sidebar: requests.Session()
Dig deeper into requests by raising your HTTPConnection.debuglevel
Terminology Clarification: I will refer to "items" or "data" throughout this post. This could be substituted for contacts or whatever data you are looking for. For example, you might be fetching a page of contacts from your CRM. Or fetching your tweets from Twitter's API. Or searching the Google location API, you might look up an address and return geo-location coordinates.
API & HTTP Lingo You Should Know
Hypertext Transfer Protocol (HTTP)
Per Mozilla, "Hypertext Transfer Protocol (HTTP) is an application-layer protocol for transmitting hypermedia documents, such as HTML. It was designed for communication between web browsers and web servers, but it can also be used for other purposes. HTTP follows a classical client-server model, with a client opening a connection to make a request, then waiting until it receives a response."
HTTP: you = client. API = way to communicate with server
Application Programming Interface (API)
Per Wikipedia, the purpose of an API is to simplify "programming by abstracting the underlying implementation and only exposing objects or actions the developer needs."
Representational State Transfer (REST)
REST is an architectural style of web APIs. It is the dominant architecture that many APIs use. Simple Object Access Protocol (SOAP) is another style I've heard of, but it seems less common nowadays.
A REST API is built for interoperability and has properties like: "simplicity of a uniform interface" and "visibility of communication between components by service agents." [Wikipedia] If an API follows REST, it has many good principles baked in.
GET, POST and PATCH
These are three common types of request methods.
GET: Read data returned, such as all of your tweets in the Twitter API.
POST: Create a new item, like writing a new tweet. Can also update existing data. Tweets aren't editable though!
PATCH: Similar to POST, this is typically used for updating data.
URL or "Endpoint"
This is the website target to send your request. Some APIs have multiple endpoints for different functionality.
URL Parameters
Values you pass to tell the API what you want. They are defined by the API specifications, which are usually well documented. In Python's requests library, they may be passed as keyword arguments. Sometimes they are passable directly within the endpoint url string.
Body or "Payload"
To make a request, you send a payload to the url. Often this is a JSON string with the API's URL parameters and values, AKA the request body. If the API is written specifically for Python, it might accept an actual Python dictionary.
Javascript Object Notation (JSON)
JSON is the data interchange standard for all languages. Usually it is the default way to pass data into and receive data from an API. If making a POST, you can check your json object is formatted correctly by using a json linter. Or try Python's json.tool! You can also pretty print your JSON or python dictionary with the pprint module. If you're using json.dumps remember it has pretty printing accessible by keyword arguments! These features are accessible in the standard library. Isn't Python great? See also: Python 101 - An Intro to Working with JSON
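For example, json.dumps and pprint both have pretty printing built in:

```python
import json
import pprint

payload = {"name": "P Sherman", "city": "Sydney", "tags": ["nemo", "dentist"]}

# json.dumps pretty prints with the indent and sort_keys keyword arguments.
pretty_json = json.dumps(payload, indent=4, sort_keys=True)
print(pretty_json)

# pprint does the same for Python objects.
pprint.pprint(payload)

# And json.loads decodes a JSON string back into a dictionary.
assert json.loads(pretty_json) == payload
```

The same indent trick works when you just want a readable request body while debugging.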
Pages
API data is commonly returned in multiple pages when there is a lot of data returned. Each page can be accessed one request at a time. Sometimes you can specify how many items you want on a page. But there is usually a maximum items per page limit like 100.
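A paging loop usually looks like the sketch below. The page/per_page parameter names are hypothetical (check your API's docs for the real ones), and fetch_page stands in for whatever function wraps your requests.get call:

```python
def fetch_all_items(fetch_page, per_page=100):
    """Collect items across pages until a page comes back empty.

    fetch_page is any callable taking (page, per_page) and returning a
    list of items; in real code it would wrap requests.get against your
    API's endpoint.
    """
    items, page = [], 1
    while True:
        batch = fetch_page(page, per_page)
        if not batch:
            break  # an empty page means we've read everything
        items.extend(batch)
        page += 1
    return items


# Simulate a 250-item API that serves at most 100 items per page.
fake_data = list(range(250))


def fake_api(page, per_page):
    return fake_data[(page - 1) * per_page : page * per_page]


print(len(fetch_all_items(fake_api)))  # 250
```

Some APIs instead return a "next page" token or URL in each response; the loop shape is the same, you just follow the token until it's empty.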
Status Code
Each request usually gives you a numeric code corresponding to what happened when the server tried to handle your request. There is also usually a message returned.
Headers
These usually contain website cookies and authorization info. They also may tell the API what kind of data you want back. JSON and XML are the two most common types of data to return. You can specify the return format in the content-type headers.
Authorization
Authorization varies widely. This is the level of identification you need to pass to the API to make a request. Public APIs might require none. Some just need a username and password. Others use the OAuth standard, which is a system involving credentials and tokens for extra security.
I recommend using Postman in most cases, depending on the complexity of the API. If the JSON syntax is straightforward, you can format your data as a python dictionary, then convert it to a JSON object with json.dumps from the standard library's json module. But JSON can be tricky sometimes. You may also need to pass a dictionary of HTTP headers.
Some APIs have "Postman Collections", a set of Python (or any language) script templates for the API. In those cases, it might make sense to use those resources.
Path One: Make HTTP request with json & requests libraries
Format Python dict with json.dumps from the standard library's json module. Infer API requirements from documentation. Use requests for HTTP.
Path Two: Make HTTP request with Postman & requests library
Use Postman to generate the JSON payload. Plug headers and payload into requests. Use requests library for HTTP.
Postman has a friendly interface for plugging in all your pieces and tinkering with your request body until it works. Make it easier on yourself and use Postman, especially if there are collections. An alternative is to troubleshoot in Python if you are confident in your grasp of the API. I use both options depending on my familiarity with the API at hand.
Formatting Your Request
Once you have the request working, you may export your Postman request to almost any language. For Python, you can sometimes export to the requests, http.client or urllib libraries. Hit the "code" button in Postman and then copy your code.
Paste your Postman headers, payload and url into your existing code.
You may want to use a dict or string formatting to pass values to your request parameters or url.
If the API uses a token or other form of authorization that needs to be refreshed intermittently, I usually have a function that returns a token. token = fetch_token() Then put the token in the headers dict. {"Authorization": f"basic {token}"} Finally pass your headers and payload to your requests.get, requests.post, or requests.request function along with the endpoint url. You're now ready to test the request.
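That pattern, sketched end to end. Here fetch_token is a hypothetical stand-in returning a dummy value, and the endpoint is made up; the request is prepared but not sent, just to show where the headers plug in:

```python
import requests


def fetch_token():
    """Hypothetical stand-in: a real version would request a fresh
    token from your API's auth endpoint and return it."""
    return "example_token_abc123"


token = fetch_token()
headers = {"Authorization": f"basic {token}"}

# Build (without sending) a GET request to a hypothetical endpoint.
prepared = requests.Request(
    "GET", "https://api.example.com/items", headers=headers
).prepare()
print(prepared.headers["Authorization"])  # basic example_token_abc123
```

In a live script you would call requests.get(url, headers=headers) instead of preparing the request, and re-run fetch_token whenever the API reports the token expired.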
If you choose not to use Postman, you can use the json library. See the use of json.dumps() to convert a dictionary to a JSON object in example #2 below.
Python Installation
You can install requests with pip. Alternatively, http.client is included within the Python standard library. If you want to convert HTTP response data to a dataframe or csv, install pandas.
import requests

# Find the best double-cheeseburger + fries $7 can buy.
payload = {"key": "Add_Google_API_Key_Here", "address": "Redhot Ranch"}
url = "https://maps.googleapis.com/maps/api/geocode/json"
# Optional: set a 5 second timeout for the http request.
r = requests.get(url=url, params=payload, timeout=5)
print(r.text)
print(r.status_code)
data = r.json()
# Extract the latitude, longitude and formatted address of the first matching location.
latitude = data["results"][0]["geometry"]["location"]["lat"]
longitude = data["results"][0]["geometry"]["location"]["lng"]
formatted_address = data["results"][0]["formatted_address"]
print(longitude)
print(latitude)
print(formatted_address)
# Optional: convert response into a dataframe with pandas.
# import pandas as pd
# location_df = pd.json_normalize(data['results'])
# location_df.to_csv('Locations.csv')
Above you can see:
requests makes it easy to see the server's text response with response.text
requests also makes JSON encoding easy with response.json()
pd.json_normalize() is convenient to convert the response dictionary to a dataframe.
Example #2: Encode a Python dictionary to json string and POST to a hypothetical API
Create a dictionary with request body data and pretty inspect it with pprint.
Encode the json string with json.dumps from the standard library's json module.
POST the encoded JSON to the endpoint url with requests.
import pprint
import json
import requests

# Hypothetical auth token for the fictional API below.
token = "add_your_api_token"


def dict_to_json_data():
    """Create request body with fictional contact details."""
    payload = {
        "first_name": "P",
        "last_name": "Sherman",
        "address": "42 Wallaby Way",
        "address_2": "",
        "city": "Sydney",
        "state": "NSW",
        "country": "AU",
        "zip": "2000",
    }
    pprint.pprint(payload)
    json_str = json.dumps(payload, ensure_ascii=True)
    # Optional: encode json str to utf-8.
    return json_str.encode("utf-8")


def post_data(json_str):
    """This is a fictional API request that passes a json object to requests.

    It decodes the server response with response.json() and
    returns a dictionary value by calling the data's keys.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "cache-control": "no-cache",
    }
    r = requests.request(
        method="POST",
        url="https://SomeSoftwareAPI.com/people/",
        data=json_str,
        headers=headers,
    )
    data = r.json()
    print(data.keys())
    # Call dict keys to get their values.
    contact_id = data["contact_id"]
    return contact_id


json_str = dict_to_json_data()
contact_id = post_data(json_str)
requests.request keyword argument alternatives for passing data
params – (optional) Dictionary, list of tuples or bytes to send in the query string for the Request.
data – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request
json – (optional) A JSON serializable Python object to send in the body of the Request
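You can see the difference between these keyword arguments by preparing (not sending) requests to a hypothetical endpoint:

```python
import json
import requests

url = "https://api.example.com/things"  # hypothetical endpoint

# json= serializes the dict and sets the Content-Type header for you.
via_json = requests.Request("POST", url, json={"a": 1}).prepare()
print(via_json.headers["Content-Type"])  # application/json

# data= with a pre-encoded string sends the body as-is; you set headers yourself.
via_data = requests.Request(
    "POST", url, data=json.dumps({"a": 1}),
    headers={"Content-Type": "application/json"},
).prepare()

# params= lands in the query string instead of the body.
via_params = requests.Request("GET", url, params={"page": 2}).prepare()
print(via_params.url)  # https://api.example.com/things?page=2
```

Prepared requests are also a handy way to inspect exactly what would go over the wire when a request misbehaves.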
Status codes are your friend. They offer a hint at why your request is not working. If you see 200 or 201, that's a good sign. They're usually helpful, but sometimes they can be misleading.
Ensure you are defining the correct content-type. I had an experience where Postman defined two conflicting content-type headers and it caused my request to fail. The server's error message indicated the problem was in my JSON, so it took me a while to figure out the headers were the problem.
Sometimes it makes a difference if your url has http:// vs. https:// in it. Usually https:// is preferred.
You might be able to improve performance by using a requests "session" object.
import requests

# A session adds a "keep-alive" header to your HTTP connection + stores cookies across requests.
s = requests.Session()
for page in range(0, 2):
    url = f"https://exampleapi.com/widgets/{str(page)}"
    r = s.get(url)
    print(r.text)
Dig deeper into requests by raising your HTTPConnection.debuglevel
HTTPResponse.debuglevel: A debugging hook. If debuglevel is greater than zero, messages will be printed to stdout as the response is read and parsed.
Source: http.client Python Docs
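Since requests uses http.client under the hood, raising that debug level makes it print the raw request and response traffic to stdout, a verbose but handy troubleshooting mode:

```python
import http.client
import logging

# Any value above 0 makes http.client print request/response details
# to stdout; requests goes through http.client, so its calls show too.
http.client.HTTPConnection.debuglevel = 1

# Optional: also surface urllib3's logging at DEBUG level.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)

# Any requests call made here would print the headers sent and received.

# Reset when you're done troubleshooting:
http.client.HTTPConnection.debuglevel = 0
```

Remember to turn it back off; the output gets noisy fast when you make many requests.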
Web Server Gateway Interface (WSGI, pronounced "Wis-Ghee")
"As described in PEP3333, the Python Web Server Gateway Interface (WSGI) is a way to make sure
that web servers and python web applications can talk to each other." Gunicorn is one
of a few Python WSGI servers.
web2py, a web framework I have used, is also WSGI-compatible.
Conclusion
I remember APIs seemed mysterious and daunting before I had used them. But like all things, they can be conquered with knowledge, understanding and tenacity to keep trying until you figure it out. Good luck!
Automatic pytest Testing and Failure Alert Monitoring
This is my solution to replace manually running pytest each day in command prompt. I want to automate running pytest every day, test if my automated python scripts ran smoothly and get notified if any tests fail.
pytest is a unit test framework for Python. pytest expects you to write each test as a self-contained function. One python file can contain many different test functions.
Writing a Test
Let's use test_file_date.py as our test. It uses the glob module and os.path.getmtime to find the csv with the most recent modification date on my Desktop, then tests whether that date is today, in my case, for an expected daily file drop.
from datetime import datetime, date
import glob
import os
import getpass


def test_csv_date_equals_today():
    """The match format is for a Windows path with Unix style pattern matching."""
    match = f"C:/Users/{getpass.getuser()}/Desktop/*.csv"
    # Get the most recent csv from a folder.
    csv = sorted(glob.iglob(match), key=os.path.getmtime)[-1]
    csv_timestamp = os.path.getmtime(csv)
    csv_date = datetime.fromtimestamp(csv_timestamp)
    print(csv_date.day)
    print(date.today().day)
    assert csv_date.day == date.today().day
Here's the pytest text output when the test is passing:
If you haven't used python with Windows Task Scheduler before, my previous post on creating a task may help you. We'll create two of them for our testing system.
Adding Your Task Scheduler Program: a Windows .bat file
Add your username to the text below and adjust the paths to your system. Then save a Windows .bat file with this text, which points to your pytest.exe file:
This example is referencing an .exe within a hypothetical "sandbox" virtual environment, located on my Desktop. If you have a virtualenv or venv, check the Scripts folder. (Bin on Linux.)
Breaking this out, there are five parts to the .bat file:
Windows' cmd command takes a program, so we're passing pytest. The --capture=sys argument tells pytest to capture the test results. Note: switching cmd /c to cmd /k forces the terminal to stay open when you are testing your bat file. You can double-click your .bat file to test run it.
^
circumflex represents a line continuation in Windows batch files for better readability
C:\Users\your_username\Desktop\test_file_date.py
Next we're passing our python file as an argument to pytest, testing our file's modified date matches today's date.
>
This is a Windows redirect. It redirects the pytest output from sys to a text file, which is the last argument in our .bat file:
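Putting those parts together, the whole .bat file is one logical command. This is a hypothetical reconstruction based on the pieces described above; the virtualenv path, username and results filename are placeholders to adjust for your system:

```bat
cmd /c C:\Users\your_username\Desktop\sandbox\Scripts\pytest.exe --capture=sys ^
C:\Users\your_username\Desktop\test_file_date.py ^
> C:\Users\your_username\Desktop\test_results\test_results.txt
```

Double-clicking the file (or swapping cmd /c for cmd /k to keep the window open) is an easy way to test it before wiring it into Task Scheduler.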
Browse to select your .bat file for your Windows Task Scheduler task:
Reading the Tests and Triggering Alerts
Passing tests signal your scripts are running successfully. When things don't work, email alerts of the failure help us respond quickly.
Let's set another task scheduler job to run read_test_results.py, to run a few minutes after the first job each day. See this example of running Python with Task Scheduler if you haven't triggered a python script from Task Scheduler before.
from datetime import date
import getpass
import logging
import os

"""Automate pytest with Windows Task Scheduler. Use Task Scheduler to run a batch file.
The batch file runs pytest and captures our pytest function results to sys.
If a text file contains a failure or error, write the test contents into a folder.
"""

logging.basicConfig(
    filename="Automated_Testing_Alerts.log",
    level=logging.INFO,
    format="%(asctime)s - %(message)s",
    datefmt="%d-%b-%y %H:%M:%S",
)
logging.info("Checking for failed tests...")

directory = f"C:/Users/{getpass.getuser()}/Desktop/test_results/"
test_results = [fname for fname in os.listdir(directory) if ".txt" in fname]

for txt_file in test_results:
    file_path = directory + txt_file
    with open(file_path) as f:
        text = f.read()
    if "FAILURES" in text:
        # Use a separate variable so the results directory isn't
        # overwritten for the next loop iteration.
        alert_directory = f"C:/Users/{getpass.getuser()}/Desktop/send_failure_alert/"
        name = f"{alert_directory}{txt_file}_Failed_Results_{date.today()}.txt"
        with open(name, "w+") as alert_file:
            alert_file.write(name)
            alert_file.write(text)
    else:
        print("No failed tests found in file:")
        print(txt_file)
Setting up Email Alert Notifications on a Box Folder
The above script checks the test results and creates a file containing any failed tests in a different folder. I edited that folder's settings to email me when a new file is added, thanks to Box notifications. We use Box for secure file sharing at my current day job.
Alternatively for those without Box, you could use 'ole trusty smtplib to send the failure alerts with python. I chose the easier, ready to go option. Remember, "Simple is better than complex."
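For reference, the smtplib route is only a few lines. This sketch builds the alert message; the addresses and SMTP server are placeholders, and the actual send is commented out:

```python
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "pytest FAILURE alert"
msg["From"] = "alerts@example.com"   # placeholder sender
msg["To"] = "me@example.com"         # placeholder recipient
msg.set_content("A pytest run failed. See the results file on the Desktop.")

# Uncomment to actually send via your SMTP server (placeholder host/port):
# with smtplib.SMTP("smtp.example.com", 587) as server:
#     server.starttls()
#     server.login("username", "password")
#     server.send_message(msg)
print(msg["Subject"])  # pytest FAILURE alert
```

The failure-checking script above could call this in place of writing the alert file to the Box folder.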
We now have email monitoring alerts if our csv file date test fails, thanks to:
Windows Task Scheduler (two tasks running each day in succession)
python/pytest
a Windows .bat file
Box (or smtplib or pywin32) email alerts
In Summation
The first task runs a .bat file to run pytest and create a text file with daily automation test results.
The second task runs a python file. Setting in motion:
Checking the test result text files, looking for failed tests
Creating a text file with any failing tests in a Box folder, if found
Then Box emails me if any test fails.
This was the first time I successfully deployed a Windows batch file. It took me many tries and plenty of googling to properly format the .bat file. They are worth understanding and seem to open up many possibilities on Windows. In this case it was a "glue" piece that allowed me to accomplish my goal: automated testing and alerts for my python scripts.
What we learn along the way shapes us. Learning to use shell commands with Python scripts extends their abilities to help you get stuff done. I've benefitted from improving both my Windows and Ubuntu shell knowledge, which can sometimes be a handy substitute or complement to python. Now, time to write more tests. Thanks for reading!
See also:
pytest-csv: write test results to a csv with this plugin
Disclaimer: This endeavor was before I discovered Apps Script, which may be an alternative solution to using pygsheets or other python libraries. pygsheets is interesting, but it could be a stretch to justify using it for something that could be done with Apps Script. Both are ways to solve a problem by automating Google Sheets operations.
This was done on the Windows 7 OS. First, install the libraries with pip. Enter in command prompt or terminal:

python -m pip install pygsheets pandas
Enable the Drive API for your account. Same as the last step, but search for Drive.
Create a Client Secret json file. Select "Credentials" tab, and "Create Credentials". Select Client Secret from options. Export from console and place in same directory as your .py file.
Create a Service Account json file by selecting it instead of "Client Secret".
Authorize pygsheets with your json files. See below.
Copy spreadsheet to Google Sheet with pandas and pygsheets. See below.
After completing the first 5 steps, import pygsheets and authorize your account with the client secret json file:
You will be prompted by the terminal to go to a hyperlink in a browser, get your authorization code, and enter that authorization code into the terminal.
Now, import both libraries needed and switch to authorize with your service json file. Then, load the csv to a dataframe with pandas. Finally, copy it to an existing Google Sheet with pygsheets:
```python
import pygsheets
import pandas as pd

# Select worksheets by id, index, or title.
gc = pygsheets.authorize(service_file='path/to/service_account_credentials.json')
sh = gc.open('add_google_sheet_name_here')
wks = sh.worksheet_by_title('add_sheet_tab_name_here')

# Set a pandas dataframe to a Google Sheet, starting at the 1st row, 1st column.
df = pd.read_csv('Example_Data.csv')
wks.set_dataframe(df, (1, 1))
```
[Example] Split and upload a sheet with 40 columns
Google Sheets limits importing to 26 columns and 1,000 rows at a time. So you'll have to load the sheets in chunks if you have more than that. This approach uses numpy's array_split:
```python
import pygsheets
import pandas as pd
import numpy as np

gc = pygsheets.authorize(client_secret='path/to/client_secret[...].json')
sh = gc.open('add_google_sheet_name_here')
wks = sh.worksheet_by_title('add_sheet_tab_name_here')
df = pd.read_csv('Data_to_GSheets.csv')

# split columns into two dataframes with numpy and pandas
first_half_cols, second_half_cols = np.array_split(df.columns, 2)
first_half = df[first_half_cols]
second_half = df[second_half_cols]

# set both dataframes side by side in the Google Sheet;
# the second starts one column past the end of the first
wks.set_dataframe(first_half, (1, 1))
start_column = first_half.shape[1] + 1
wks.set_dataframe(second_half, (1, start_column))
```
Conclusion
I found the terminal error messages from pygsheets to be very helpful while debugging the above. This module offers many other nifty spreadsheet operations. Solid library. You can now create and edit Google Sheets with Python.
Apps Script should probably be the default tool when working with Google Sheets because it is built in, but Python does have capable tools for working with Google Sheets.
You may or may not be familiar with PyCon's Lightning Talks, a session of quick-hitter talks given each year on a variety of topics. These "Lightning Scripts" are ten of my favorite Python scripts that have served me well as of late to perform a variety of tasks.
Highlights include the glob, os, calendar, webbrowser and pandas modules. Plus, how to convert a dataframe into a pdf and how to take a snapshot of a website with pdfkit. The webpage pdf was not the greatest snapshot in some cases, but did capture text and many of a live website's visual elements.
If you have Python installed, paste the script into a text editor and run your .py file from the command line or however you like to run your scripts. You may have to change file paths to work on your operating system.
The first 5 scripts use modules that ship with Python.
(1) Uppercase all of the file names in the current working directory.
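The code for script (1) didn't survive the copy here; below is a minimal sketch using only the standard library's os module (the function name is mine):

```python
import os

def uppercase_file_names(directory='.'):
    """Rename each file in directory to an uppercased version of its name."""
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            os.rename(path, os.path.join(directory, name.upper()))

# Example: uppercase_file_names() renames the files in the current directory.
```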
```python
import calendar
from datetime import date

today = date.today()
if calendar.day_name[today.weekday()] == 'Friday':
    print("Today is Friday.")
else:
    print("Today is not Friday.")
```
(4) Get the two most recent file names from a directory. This is a Windows file path example. I have escaped the backslashes below to make it work in my Atom text editor.
```python
import glob
import getpass
import os

# getctime == by created date, getmtime == by date modified
# get all files, regardless of extension
folder = f'C:/Users/{getpass.getuser()}/Desktop/'
list_of_files = glob.glob(folder + '*')
newest_file = sorted(glob.iglob('C:/Users/your_username/Desktop/*.csv'), key=os.path.getmtime)[-1]
second_newest_file = sorted(glob.iglob('C:/Users/your_username/Desktop/*.csv'), key=os.path.getmtime)[-2]
print(newest_file, second_newest_file)
```
```python
import pandas as pd

file_name = 'big_file.csv'
# slice off the extension
src_file_minus_ext = file_name.rsplit('.', 1)[0]
for i, df in enumerate(pd.read_csv(file_name, chunksize=250000)):
    csv = f'{src_file_minus_ext}{i}.csv'
    df.to_csv(csv, index=False, header=df.columns)
```
pdfkit relies on another library, wkhtmltopdf, which adds an extra wrinkle to setting it up for Windows users. I've laid out the differences between using pdfkit for Windows vs. Ubuntu below.
python -m pip install pdfkit
installing wkhtmltopdf on Windows
To use pdfkit on Windows, go to the link below and choose your version to download. This installs a Windows executable that pdfkit needs to find in order to work.
https://wkhtmltopdf.org/downloads.html
installing wkhtmltopdf on Ubuntu
sudo apt-get install wkhtmltopdf
For Windows, download wkhtmltopdf (https://wkhtmltopdf.org/downloads.html) and add this config patch for pdfkit:
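The config patch itself went missing above; here is a sketch of what it typically looks like, assuming the default wkhtmltopdf install path on Windows (adjust the path to wherever your installer put wkhtmltopdf.exe):

```python
import pdfkit

# Point pdfkit at the wkhtmltopdf executable installed on Windows.
config = pdfkit.configuration(
    wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
)
pdfkit.from_url('https://example.com', 'example.pdf', configuration=config)
```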
I am very pleased with pdfkit. I am also toying around with Python's PyFPDF (https://pyfpdf.readthedocs.io/en/latest/index.html) and PyPDF2 (https://pythonhosted.org/PyPDF2) libraries. Together, these three offer some unique abilities for creating and manipulating pdf files. And the pandas library is amazing for reading and writing tabular data sheets; I can't say enough great things about pandas. Worth the time. Hope you enjoyed these Lightning Scripts :)
Goal: recreate my resume in the dark Atom text editor theme
(background and fonts). Sub-goal: find a color eyedropper to grab
the actual color values of the Atom layout.
Approach #1: find an Atom eyedropper package to grab the colors. My
first thought was to find the easiest solution within the packages of
my Atom text editor. After searching Atom's packages, the two best
potential solutions were "an-color-eyedropper" and "color picker". The an-color-eyedropper
description sounds perfect: "A simple 'real' color picker. By 'real' I
mean it's able to pick colors anywhere on any screen."
Unfortunately it failed to install and displayed the error, "Unable to
download 400 Bad Request Repository inaccessible". It seems to rely on
the "python" Atom package which is now deprecated. I was unable to find
a repo anywhere by googling.
Color picker has
easy-to-follow instructions and installed with no problem. It allows you
to quickly select any color visually with sliders. Then the RGB or
Hexadecimal values of your color are added as text in the editor in
proper format. However, we are looking for a color grabber that pulls
colors from objects on screen. Color picker is more of a productivity-enhancing
and color-exploration tool for programmers. On to Python options.
Approach #2: Use the python tkcolorpicker package to grab the colors.
The first thing I found on Google was tkcolorpicker,
a package that uses the tkinter
library. I couldn't tell exactly what it was, so let's find out. First,
install it via pip:
python -m pip install tkcolorpicker
Then run the below script. Cool gadget for sure, but also not quite
what I was looking to use. It allows selection of a color with
sliders or input values, similar to Atom's color picker, but for user
input rather than color picking. Nice little tool. :D
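The script mentioned above was lost in formatting; here is a minimal reconstruction based on tkcolorpicker's askcolor function (the variable names and starting color are my guesses, chosen to match the description below):

```python
import tkinter as tk
from tkcolorpicker import askcolor

root = tk.Tk()
root.withdraw()  # hide the empty root window

# askcolor returns the chosen color as both an RGB tuple and a hex string
RGB_code, hex_code = askcolor((255, 255, 0), root)
print(RGB_code, hex_code)
```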
askcolor() returns a tuple with both the RGB and hex codes selected by
the user. Above, we are unpacking that tuple into the hex_code and
RGB_code variables.
Approach #3: Use the Python eyedropper package to grab the colors.
I then found eyedropper
for Windows, which has a minimalist repository and offers a simple
approach to desktop eyedropper functionality. Install eyedropper via
pip:
python -m pip install eyedropper
Hover your mouse over the object you want to grab the color from (in my
case, the Atom text editor background). Alternatively, I was able to run
eyedropper from the command line by entering:
py -m eyedropper
Mission possible. Then I hit ctrl+v in a text file and there was the
hex code for my Atom background. Some of the colors that eyedropper
grabbed were nearly identical to those in the Atom text editor dark
theme. Others were not quite the same. I made slight eyeball adjustments
to the colors for some of the fonts.
This desktop tool has great UX and an eyedropper feature. It runs on
Electron, a JavaScript-based framework. Coincidentally, Electron was
also used to build Atom.
GUI stands for "Graphical User Interface", aka the part of a program designed for human interaction. Adding a GUI to a Python
script allows anyone to run it without having to code or use the command line.
There are several GUI libraries in Python. A few I have heard of are Tkinter (in the standard library), wxPython, PyQt,
easygui, DearPyGui and PySimpleGUI. I explored Tkinter
back when I first got into Python. It was more intricate, offered more control over the look of your app, and took longer to pick up. Gooey is more of a pre-packaged GUI library.
The Gooey Github page was most useful to me and helped me to do what I needed.
The script posted in this blog helped as well. I needed to enable a human to
supply three files and enter a number. Gooey was a good match for this. The library has two branches:
some basic widgets piggyback off the argparse library
another part of the library uses a function called the GooeyParser. The GooeyParser offers more advanced widgets, like a file chooser. This was exactly what I was looking to use to pull in files for my script.
Installing Gooey
Argparse comes stock with Python. You can install Gooey via the pip installer. Open command prompt or terminal and enter:
python -m pip install Gooey
Below is a basic argparse/Gooey combination script. The argparse version offers a handful of widgets, such as checkboxes and dropdowns, but I had trouble getting them to work with the GooeyParser (used in the second script).
```python
from argparse import ArgumentParser
from gooey import Gooey


@Gooey(program_name='Report Generator', default_size=(575, 600))
def get_args():
    """Demonstrating Python's vars built-in to store arguments in a dict."""
    parser = ArgumentParser(description='A simple argument parser',
                            epilog='This is where you might put example usage')
    # positional arguments are always required, so no required=True is needed
    parser.add_argument('Name', action='store', help='Help text for option X')
    parser.add_argument('Email', help='Help text for option Y', default=False)
    parser.add_argument('Campaign Number', help='Help text for option Z', type=int)
    parser.add_argument('Campaign Segment', choices=['A', 'B', 'All'], default='A', nargs='?')
    user_inputs = vars(parser.parse_args())
    print(user_inputs)
    name = user_inputs['Name']
    campaign_number = user_inputs['Campaign Number']
    return parser.parse_args()


if __name__ == '__main__':
    get_args()
```
Side note: Check out Python's vars() built-in function above.
It returns the input data as a dictionary called user_inputs. Then we can get the values via the dictionary's keys. Pretty nifty!
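Here is vars() in isolation, using argparse's Namespace directly (the sample values are made up):

```python
from argparse import Namespace

# parse_args() returns a Namespace; vars() converts it to a plain dict
ns = Namespace(Name='Ada', Email='ada@example.com')
user_inputs = vars(ns)
print(user_inputs['Name'])  # Ada
```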
The @Gooey() part of the code is a decorator, an advanced Python feature.
Put simply, decorators are functions that modify the behavior of the function they are attached to.
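A toy decorator, unrelated to Gooey, shows the idea:

```python
def shout(func):
    """Wrap func so its return value comes back uppercased."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout
def greet(name):
    return f'hello, {name}'

print(greet('gooey'))  # HELLO, GOOEY
```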
Below is my script that uses the more advanced GooeyParser for its "FileChooser" widget. Gooey allows you to group widgets together
and set how many widgets per line with the gooey_options={} parameter.
```python
from gooey import Gooey, GooeyParser


@Gooey(program_name='Email Campaign Reporting Generator', default_size=(575, 600))
def get_args():
    """Two argument groups, using gooey_options to set the layout."""
    parser = GooeyParser(description='Export campaign report spreadsheets and upload below.')
    top_group = parser.add_argument_group(gooey_options={'show_border': False, 'columns': 1})
    top_group.add_argument('Contact List', help='Upload Send List (.xlsx)', widget='FileChooser')
    top_group.add_argument('Opens List', help='Upload Opens List (.xlsx)', widget='FileChooser')
    top_group.add_argument('Unsubscribe List', help='Upload Unsubscribe List (.xlsx)', widget='FileChooser')
    bottom_group = parser.add_argument_group(gooey_options={'show_border': False, 'columns': 1, 'required': False})
    bottom_group.add_argument('Campaign ID', action='store', help="Number found in the Campaign 'Reports' tab")
    bottom_group.add_argument('Campaign Segment', action='store', help='Enter A, B, or All. All lists supplied must match segment.')
    user_inputs = vars(parser.parse_args())
    print(user_inputs)
    return parser.parse_args()


if __name__ == '__main__':
    get_args()
```
Overall, Gooey knows what it wants to be: an easy-to-use GUI framework for Python, and it does that well. Here's a screenshot of my program's shiny GUI:
Now that I have a GUI on top of my program and it delivers the expected output file, I'm hoping to take it one step further by packaging it up as a Windows .exe file.
This would allow it to run as a desktop app on any Windows computer without the need to install Python or library dependencies.
I've only begun exploring options to do this but a few libraries I've heard of are pyinstaller, cx_Freeze and Py2Exe. Updates coming if I figure it out. Cheers :D
Update: I did figure out how to compile my Gooey app to a Windows application with Pyinstaller. You can read more on how I did it here.