From Google Sheets to BigQuery: Solving the Financial Team's Big Data Challenges

Finance teams in modern companies often face overwhelming volumes of data that need to be processed and analyzed. Traditional tools like Google Sheets may become insufficient as operations scale. In this article, we’ll explore how a development team solved this issue using Google AppSheet and BigQuery, and how this impacted the work of the finance department.


The Problem of Google Sheets Overload

The main problem we encountered was the sheer number of records related to actual payments, which caused our Google Sheets to slow down. All of the automations built on these sheets also began running far too slowly.

💡 Google Sheets has several hard limits, including a maximum of 10 million cells per spreadsheet, 18,278 columns per sheet, and a 100 MB size limit for files imported into Sheets. There is no separate row limit; rows are constrained only by the overall cell cap. Formula usage is limited as well, including a maximum of 1,000 GoogleFinance formulas and 50 external-data formulas, such as ImportHtml and ImportXml, per spreadsheet.

More information about limitations here:

https://spreadsheetpoint.com/google-sheets-limitations/

We faced challenges in processing the large number of payment transactions recorded daily in Google Sheets. These sheets were linked to various integrations, but due to the high volume of transactions, the sheets couldn’t handle the load: they were unable to process integrations and register transactions adequately.

The situation worsened when one of the sheets accumulated around 16,000 records. Any formula, whether a lookup, an average currency calculation, or even a simple date check, had to scan the entire data set. This caused significant delays, freezes, and sometimes the complete failure of the sheet.

An Alternative Solution to Sheets

To speed up and optimize our work, we decided to move to a specialized database: BigQuery from Google Cloud. This allowed us to significantly accelerate data processing.

💡 Google BigQuery is Google’s cloud service for working with large volumes of data. It’s a PaaS service that offers database management system features and integrates with many platforms. It allows you to load, store, and process large volumes of data using SQL queries. BigQuery provides high data processing speed and supports integration with services like Google Sheets and Microsoft Power BI, offering a wide range of analytical capabilities.

What is Google BigQuery and Why Should You Use It?
How to Work with Window Functions in Google BigQuery—A Detailed Guide
How to Get Started with BigData—Loading Data into Google BigQuery

The Solution Using Google Cloud and BigQuery

To address the overloaded Google Sheets issue and improve the handling of large data volumes, we developed and implemented a new solution using Google Cloud and BigQuery capabilities. We created a specialized database in a paid Google Cloud project, where strict privacy settings were established to allow data access only to authorized users.

To automate the data transfer process, we wrote a script in Google Apps Script, which was published as a web app. This enabled it to function as a webhook, receiving data from our internal systems, where appropriate triggers were set up to send data. The script then processes this data and redirects it to the BigQuery database.
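The webhook pattern described above can be sketched in Apps Script. This is a minimal illustration, not our production script: the payload fields (paymentId, amount, currency, paidAt), the project, dataset, and table names, and the row mapping are all assumptions, and the BigQuery Advanced Service must be enabled in the Apps Script project for the insert call to work.

```javascript
// Pure mapping function: turn an incoming webhook payload into a
// BigQuery row object. All field names here are illustrative, not
// the production schema.
function toBigQueryRow(payload) {
  return {
    payment_id: String(payload.paymentId),
    amount: Number(payload.amount),
    currency: payload.currency,
    paid_at: payload.paidAt
  };
}

// Apps Script web-app entry point, called when the published web app
// receives a POST request. Requires the BigQuery Advanced Service.
// 'my-project', 'finance_dataset', 'actual_payments' are placeholders.
function doPost(e) {
  var payload = JSON.parse(e.postData.contents);
  var row = toBigQueryRow(payload);
  BigQuery.Tabledata.insertAll(
    { rows: [{ json: row }] },
    'my-project', 'finance_dataset', 'actual_payments'
  );
  return ContentService.createTextOutput('ok');
}
```

Keeping the mapping in a separate pure function makes it easy to test without touching any Google service.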

Thus, the webhook in Google Apps Script receives data from our accounting systems, where webhook triggers are constantly active, and sends them to the BigQuery database, significantly speeding up and optimizing the process of handling large volumes of financial data.

Should You Use the Paid Version?

Here’s a comparative table of the benefits and features between the free and paid versions of Google Cloud:

| Feature | Free Version of Google Cloud | Paid Version of Google Cloud |
| --- | --- | --- |
| Access to Services | Limited access to selected services | Full access to all Google Cloud services |
| Support | Community support, online documentation | Several levels of paid support options |
| Computing Resources | Limited by Always Free limits | Scalable according to needs, with custom quotas |
| Storage Capacity | 30 GB-months of standard persistent disk, 5 GB-months of snapshot storage | Scalable to petabytes, pricing based on usage |
| Databases | Firestore: 1 GB storage; Cloud Bigtable: no direct free tier, but covered by the $300 trial credit | Fully scalable, with costs based on storage and processing |
| Network | Outbound traffic limits apply to free services | Higher or no outbound traffic limits, depending on service and pricing plan |
| API Calls | Limited | Higher or unlimited, depending on service |
| Pricing | Free tier available for selected services, $300 credit for 90 days | Pay-as-you-go, subscription, or custom pricing |

Read More About Webhooks:

WEBHOOKS
Adding a "Send Webhook" Trigger

Choosing the Technology for Data Optimization

We chose Google BigQuery for several reasons. First, it is accessible and relatively easy to use, making it a good choice even for beginners. Given Google's scale and reliability, we were also confident in the security and stability of the service. BigQuery is an SQL-based data warehouse that Google has extended with its own interactive features, which simplifies working with the data.

It’s important to note that BigQuery uses standard SQL, enhanced by Google Cloud features, so there’s no need to learn a new programming language. This makes the technology even more appealing to a wide range of users.
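Because BigQuery speaks standard SQL, a calculation that used to be a slow sheet formula, such as the average payment per currency, becomes an ordinary query. A small sketch, where the helper only assembles the query string so the SQL stays visible; the table name and column names are placeholders:

```javascript
// Build a standard SQL query computing the average payment amount per
// currency. The fully qualified table name is passed in by the caller.
function buildAvgByCurrencyQuery(table) {
  return [
    'SELECT currency, AVG(amount) AS avg_amount',
    'FROM `' + table + '`',
    'GROUP BY currency',
    'ORDER BY currency'
  ].join('\n');
}

// In Apps Script, the query could then be run via the BigQuery
// Advanced Service (project ID is a placeholder):
//   var sql = buildAvgByCurrencyQuery('my-project.finance_dataset.actual_payments');
//   BigQuery.Jobs.query({ query: sql, useLegacySql: false }, 'my-project');
```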

Moreover, we integrated BigQuery with Google AppSheet, which serves as an interface for interacting with the database. This allowed us to easily connect the database and effectively use it within our infrastructure. AppSheet ensures high performance through integration with Google Accounts and simplifies access control for different users, allowing detailed control over who can view or delete data.

For more details on how to use AppSheet and databases:

Use Data from MySQL
Using Cloud SQL as a Datasource for Your AppSheet App
Use Data from SQL Server
AppSheet with MySQL Database
Connect to an On-Premises Database

A Simple Way to Repeat the Data Optimization Process

If we needed to collect data on actual payments again, the simplest and shortest way would be to use a webhook via the Make platform (formerly known as Integromat), which would provide integration with the BigQuery API for automatic data addition. This approach is ideal for no-code solutions, especially for those without programming skills.

However, for those with programming experience, an alternative option would be writing a script in Google Apps Script. This is a free tool with a straightforward syntax that allows easy creation and editing of code for working with various Google services, including BigQuery. This approach would enable deeper and more flexible customization of the data processing process.

Connectors in Make: webhook-kommo-bigquery

Challenges in Working with BigQuery

During the implementation of our BigQuery project, we encountered several challenges, especially in the initial stages. The main difficulties involved understanding the structure of the BigQuery API and the specific features of the database, such as special data fields. For instance, we needed to accurately define data types in fields, such as when dealing with ID columns. It was important to specify that such fields should be marked as 'big number' rather than just 'number' to avoid errors when processing large numerical values.

This level of detail required careful study of the BigQuery documentation and service capabilities so we could configure the database correctly, keep data accurate across all fields, and avoid query-processing issues caused by data-type limitations.
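To illustrate the data-type point, here is a hypothetical schema for a payments table in the REST/Advanced Service schema format, with the ID column declared as BIGNUMERIC so large numeric identifiers are stored exactly rather than losing precision as floats; all field names are assumptions:

```javascript
// Illustrative BigQuery table schema. BIGNUMERIC holds very large exact
// numbers (suitable for big numeric IDs); NUMERIC is an exact decimal
// type appropriate for money amounts.
var paymentsSchema = {
  fields: [
    { name: 'payment_id', type: 'BIGNUMERIC', mode: 'REQUIRED' }, // large numeric ID
    { name: 'amount',     type: 'NUMERIC',    mode: 'REQUIRED' }, // exact money value
    { name: 'currency',   type: 'STRING',     mode: 'REQUIRED' },
    { name: 'paid_at',    type: 'TIMESTAMP',  mode: 'NULLABLE' }
  ]
};
```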

Helpful Documents:

BigQuery Documentation
BigQuery Service

Code and API Query Optimization

The most time-consuming phase of our project was developing a quality code structure and optimizing queries to the BigQuery API. We paid significant attention to understanding how the API works and writing efficient code that minimizes the number of necessary database queries. This was critical because each query to a specialized API incurs time and resource costs.

We aimed to make the code not only functional but also optimized for fast and efficient data processing. This task required careful planning and testing to ensure that API queries were formed correctly and executed without unnecessary delays. This process allowed us to significantly reduce data processing time and improve overall system performance.
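One concrete way to minimize the number of API calls, sketched here with an illustrative helper, is to batch rows and stream them in chunks instead of issuing one insert per row; the batch size and table identifiers are assumptions:

```javascript
// Split an array of rows into batches of at most batchSize elements.
// Pure function, independent of any Google service.
function chunkRows(rows, batchSize) {
  var batches = [];
  for (var i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical usage with the Apps Script BigQuery Advanced Service:
// one insertAll request per batch of 500 rows instead of per row.
//   chunkRows(pendingRows, 500).forEach(function (batch) {
//     BigQuery.Tabledata.insertAll(
//       { rows: batch.map(function (r) { return { json: r }; }) },
//       'my-project', 'finance_dataset', 'actual_payments');
//   });
```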


Principles and Practices in Implementing Efficient Coding

In the process of creating our BigQuery-based solution, we first had to optimize the code structure. In the early stages of development, the code was written somewhat hastily, which produced many redundant, repeated lines that duplicated the same database queries or data-processing steps.

Once we noticed this, we began extracting the repetitive actions into separate functions and actively using conditional statements such as if and switch, which made our queries noticeably faster and our code cleaner, more readable, and more efficient.
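The refactoring can be illustrated with a toy example: duplicated per-type handling collapsed into a single function built around a switch. The transaction types and field names are invented for the example:

```javascript
// Normalize a raw transaction record into one canonical shape.
// Before refactoring, each type had its own near-identical code path;
// a switch centralizes the per-type differences.
function normalizeTransaction(tx) {
  switch (tx.type) {
    case 'payment':
      return { kind: 'payment', amount: tx.amount };
    case 'refund':
      // Refunds are stored as negative amounts.
      return { kind: 'refund', amount: -Math.abs(tx.amount) };
    default:
      throw new Error('Unknown transaction type: ' + tx.type);
  }
}
```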


Skills Needed to Work with BigQuery

The main skills necessary for the successful implementation of the project were programming ability and the logical structuring of sequences of actions in the code.

It was also important to have a deep understanding of database operations, especially SQL queries like SELECT, INSERT, UPDATE, and DELETE. Helpful resources include:

Query Syntax
Data Manipulation Language (DML) Statements in GoogleSQL
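For reference, the four DML statement kinds listed above look like this in GoogleSQL, kept here as plain strings with placeholder dataset, table, and column names:

```javascript
// Minimal GoogleSQL DML examples; `ds.payments` and its columns are
// placeholders, not a real schema.
var dmlExamples = {
  select: 'SELECT payment_id, amount FROM `ds.payments` WHERE currency = "USD"',
  insert: 'INSERT INTO `ds.payments` (payment_id, amount, currency) VALUES (1, 9.99, "USD")',
  update: 'UPDATE `ds.payments` SET amount = 10.49 WHERE payment_id = 1',
  delete: 'DELETE FROM `ds.payments` WHERE payment_id = 1'
};
```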

In addition to technical skills, the ability to effectively search for information and solutions on specialized forums and in documentation was crucial.

Read about how to quickly get a response to your support query in Google Cloud:

Customer Care Procedures: Escalate a Case

As a result of careful work and optimization, our solution began functioning as a mirror copy of the current system, allowing us to accurately reconcile data and ensure a smooth transition to the new data processing system.


Conclusion

The implementation of Google AppSheet and BigQuery technologies had a significant impact on improving the efficiency of our company’s finance department. Overcoming the limitations of Google Sheets by migrating to a more powerful and flexible system enabled us not only to speed up the processing of large volumes of data but also to improve the accuracy of financial reporting.

The use of BigQuery, combined with process automation through Google Apps Script and webhooks, demonstrates how technological innovations can transform complex data processing tasks into simpler and more manageable processes.

This is an example of the successful use of modern cloud solutions to address specific business challenges, serving as a guide for other companies seeking to optimize their data management processes.
