Lo-Fi Python

Jul 25, 2024

Importing and Analyzing Data With Google Sheets Functions

This post focuses on Google Sheets formulas, rather than Python. If you're interested in accessing Google Sheets with Python, check out this post I wrote about the pygsheets library. I haven't personally tried it, but xlwings also looks like a decent option if you're looking for Google Sheets Python libraries.

Here are some functions I recently discovered to analyze data in Sheets. There a few different strategies you can use to import data into Google Sheets with the functions available. Below, you'll see some different approaches you use to get data into Google Sheets and then analyze it. The Google Sheets formulas rabbit hole is deep. Lots of power can be harnessed by getting familiar with the formulas it has built-in.

Year To Date SPARKLINE of a Stock

=SPARKLINE(GOOGLEFINANCE("Nasdaq:NVDA", "price", DATE(YEAR(TODAY()), 1, 1), TODAY(), "daily"), {"charttype", "line"; "linewidth", 2; "color", "green"})

Year to Date SPARKLINE of USD to MXN Currency Value with GOOGLEFINANCE + SPARKLINE USING IF to Dynamically Color the SPARKLINE

=SPARKLINE(GOOGLEFINANCE("CURRENCY:USDMXN", "price", TODAY()-365, TODAY(), "DAILY"), {"charttype", "line"; "linewidth", 2; "color", if(A2>0,"green","red")})

Above, we use an if condition to check another cell and set the color to green if > 0 or red if < 0.

Applying Google Sheets Sparklines

Market Cap with GOOOGLEFINANCE

=GOOGLEFINANCE("Nasdaq:NVDA", "marketcap")

Price/Earnings Ratio

=GOOGLEFINANCE("Nasdaq:MSFT", "pe")

Daily % Change of a Stock

=GOOGLEFINANCE("Nasdaq:TSLA","changepct") &"%"

Import the Price of ADA Cryptocurrency with IMPORTDATA

=IMPORTDATA("https://cryptoprices.cc/ADA")

This function imports the price of Cardano cryptocurrency from cryptoprices.cc.

Import the Market Cap of ADA Cryptocurrency with IMPORTDATA

=IMPORTDATA("https://cryptoprices.cc/ADA/MCAP")

This function imports the current market cap of Cardano cryptocurrency.

Import the Daily % Change of Ethereum Cryptocurrency with IMPORTXML and INDEX

=TEXT(
 IF(
     IMPORTXML("https://coinmarketcap.com/currencies/ethereum/", "//p/@data-change") = "down",
     "-" & INDEX(IMPORTXML("https://coinmarketcap.com/currencies/ethereum/", "//p[@data-change]"), 1, 2),
     INDEX(IMPORTXML("https://coinmarketcap.com/currencies/ethereum/", "//p[@data-change]"), 1, 2)
 ),
 "0.0%"
)

This method uses IMPORTXML to import data to Google Sheets by passing an "XPath query". The first line checks if the direction of the % change is "down". If it is down, then we know the % change is negative, otherwise the % change is positive. To select the HTML element, I right-clicked the number on the page that I wanted to import on coinmarketcap and selected "Inspect" to reference the HTML class names for the paragraph I was targeting.

Conditionally sum a range with SUMIFS, SUMIF and COUNTIF

=SUMIFS('sheet_name'!G:G, 'sheet_name'!N:N, ">0")

In the following examples, "sheet_name" = Your Google sheet's name. This function sums the corresponding cells in column N if column G contains a number greater than 0.

=SUMIF('sheet_name'!N:N, "Some Value",'sheet_name'!G:G)

This function sums all values in column G if the values in column N are equal to "Some Value".

=COUNTIF('sheet_name'!A:A, "Some Value")

Count all the cells in column A that equal "Some Value".

Select Columns from a Dataset

{ 'sheet_name'!A:D, 'sheet_name'!T:X, 'sheet_name'!Z:AA}

Google Sheets recognizes this format of bracket enclosed ranges of columns to select into your dataset. In the next example, you can see this applied.

Conditionally Select a Range of Cells from a Dataset with SORTN, FILTER, and REGEXMATCH

=SORTN(FILTER({'sheet_name'!A:A, 'sheet_name'!E:E, 'sheet_name'!P:P/1,'sheet_name'!F:F}, REGEXMATCH('sheet_name'!P:P, "-")), 5, 0, 3, TRUE)

This formula constructs a dataset, then filters it on a condition using REGEXMATCH to check if the cell contains a hyphen (-). 5 specifies how many rows to return, and 3 specifies which column to sort on, in this case the 3rd column.

Conditionally Select a Range of Cells from a Dataset with SORTN and QUERY with SQL-Like Language

=SORTN(QUERY({'sheet_name'!A:A, 'sheet_name'!E:E, 'sheet_name'!P:P,'sheet_name'!F:F}, "SELECT * WHERE Col3 IS NOT NULL"), 6, 0, 3, FALSE)

This queries rows that are not null, containing data.

Official Function Documentation

Google Sheets Function List

GOOGLEFINANCE

IMPORTDATA

IMPORTXML

SPARKLINE

FILTER

SUMIFS

QUERY

SORTN

INDEX

REGEXMATCH

Dec 21, 2019

Copying a pandas Dataframe to Google Sheets with pygsheets

Disclaimer: This endeavor was before I discovered AppScript, which may be an alternative solution to using pygsheets or other python libraries. pygsheets is interesting, but it could be a stretch to justify using it for something that could be done with AppScript. Both are ways to solve a problem by automating Google Sheet operations.

This was done on the Windows 7 OS. First, install libraries with pip. Enter in command prompt or terminal:

python -m pip install pandas
python -m pip install numpy
python -m pip install pygsheets

After installing necessary libraries, follow the steps documented by pygsheets:

  1. Create a Google Developer Account at console.developers.google.com
  2. Enable Sheets API to account
  3. Enable Drive API to account. Same as last step, but search for Drive.
  4. Create a Client Secret json file. Select "Credentials" tab, and "Create Credentials". Select Client Secret from options. Export from console and place in same directory as your .py file.
  5. Create a Service Account json file by selecting it instead of "Client Secret".
  6. Authorize pygsheets with your json files. See below.
  7. Copy spreadsheet to Google Sheet with pandas and pygsheets. See below.

After completing the first 5 steps, import pygsheets and authorize your account with the client secret json file:

1
2
import pygsheets
gc = pygsheets.authorize(client_secret='path/to/client_secret[...].json')

You will be prompted by the terminal to go to a hyperlink in a browser, get your authorization code, and enter that authorization code into the terminal.

Now, import both libraries needed and switch to authorize with your service json file. Then, load the csv to a dataframe with pandas. Finally, copy it to an existing Google Sheet with pygsheets:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
import pygsheets
import pandas as pd

"""Select worksheets by id, index, or title."""
gc = pygsheets.authorize(service_file='path/to/service_account_credentials.json')
sh = gc.open('add_google_sheet_name_here')
wks = sh.worksheet_by_title('add_sheet_tab_name_here')

"""Set a pandas dataframe to google sheet, starting at 1st row, 1st column"""
df = pd.read_csv('Example_Data.csv')
wks.set_dataframe(df,(1,1))

[Example] Split and upload a sheet with 40 columns

Google Sheets limits importing to 26 columns and 1,000 rows at a time. So you'll have to load the sheets in chunks if you have more than that. This approach uses numpy's array_split:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
import pygsheets
import pandas as pd
import numpy as np

gc = pygsheets.authorize(client_secret='path/to/client_secret[...].json')
sh = gc.open('add_google_sheet_name_here')
wks = sh.worksheet_by_title('add_sheet_tab_name_here')
df = pd.read_csv('Data_to_GSheets.csv')

# split columns into two dataframes with numpy and pandas
first_half_cols, second_half_cols = np.array_split(df.columns, 2)
first_half = df[first_half_cols]
second_half = df[second_half_cols]

# set both dataframes side-by-side in Google sheet
wks.set_dataframe(first_half,(1,1))
start_column = first_half.shape[1]
wks.set_dataframe(second_half,(1, start_column))

Conclusion

I found the terminal error messages from pygsheets to be very helpful while debugging the above. This module offers many other nifty spreadsheet operations. Solid library. You can now create and edit Google Sheets with Python.

AppsScript should probably be the default tool when working with Google Sheets because it is built in, but Python does have tools available to work with Google Sheets.

Resources

pygsheets Github

pygsheets Documentation

Google Sheets Documentation

pandas Documentation