Bill.com is a leading provider of cloud-based software that simplifies, digitizes, and automates back-office financial processes for small and mid-size businesses. Bill.com helps businesses streamline their financial workflow, generate and process invoices, stream approvals, send and receive payments, sync with their accounting systems, and manage their cash. It connects businesses from all industries, ranging from startups to established brands and nonprofits to franchises.
A modern self-service BI platform is an absolute necessity in any organization such as Bill.com, which has growing data needs and a variety of data consumers across channels. However, a self-service centralized solution without oversight, governance, and control can lead to chaos and quickly become a management nightmare for data teams. Today’s organizations face several challenges:
- Data silos – Data is scattered in different places and controlled by different groups using different technology solutions to store, manage, and tackle problems in silos. Data is either manually or semi-manually collected and processed for larger organizations to consume. It’s difficult to get granular, correlated details from the scattered data, and not everyone has access to the various data repositories or the capabilities to combine data in real time.
- Analyzing diverse datasets – Data structure and information vary from system to system. Additionally, an increasing amount of unstructured data is produced and consumed each day. To further complicate the problem, most of the data is labeled differently in different systems. Combining all this data requires a lot of data preparation and extract, transform, and load (ETL) jobs. This also requires trade-offs on what data to keep and what to lose, and continually changes the structure of data.
- Managing data access – With data stored in so many locations, it’s difficult to access all of it and link to external tools for analysis. Organizations need a solution to get the right data to the right people at the right time. Having one set of credentials and managing these credentials to access data could be very challenging.
- Machine learning – Artificial intelligence and machine learning (ML) require large and diverse datasets to generate valuable inferences, and you must develop models that can provide accurate inferences by learning from this data. Moving all the data to a centralized location and managing more relevant data increases the accuracy of the models.
With these challenges in mind, Bill.com set a goal to provide a single source of data: a centralized and secured repository that would allow Bill.com to store, govern, discover, and share all structured and unstructured data at any scale. The idea was that this repository would act as a foundation for ML, business intelligence (BI), and analytics use cases. This led Bill.com to implement a data lake solution.
In this post, we talk about how Bill.com implemented BI capabilities as part of this data lake solution, using AWS serverless services and Amazon QuickSight as a replacement for its legacy reporting solution. QuickSight is a cloud-scale BI service that you can use to deliver easy-to-understand insights. It connects to your data in the cloud and combines data from many different sources, including any third-party data, big data, spreadsheet data, software as a service (SaaS) data, B2B data, and more.
Overview of solution
To achieve a successful transition at Bill.com, the new service had to have parity with the current BI solutions in the following areas:
- Feature-rich dashboards
- High performance
- Ease of use
In addition, Bill.com wanted this solution to support the following capabilities:
- Enhanced authentication
- Deep linking of dashboards
- Data security and authorization
- Cross-channel portability
- Solution manageability
Bill.com started with this goal and began investing in and extending its existing data lake solution. This required Bill.com to move off its legacy reporting solutions and provide better options to empower the Bill.com data community. Bill.com evaluated QuickSight as its reporting solution while building out the new data lake on AWS. The dashboard features, security, and performance (powered by Amazon Athena and SPICE) of QuickSight stood out and exceeded the parity conditions Bill.com had set.
QuickSight supports identity federation through various SAML providers or enterprise identity providers (IdPs) such as Microsoft Active Directory. Bill.com uses Okta as its IdP and integrated QuickSight with Okta. Bill.com also pre-provisioned users with federated identities using the QuickSight APIs. A user who logs in to Okta assumes an AWS Identity and Access Management (IAM) role and is authenticated within QuickSight through IdP-initiated SAML authentication.
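As a rough illustration of pre-provisioning a federated user, the sketch below wraps the QuickSight `RegisterUser` API. It assumes `client` is a boto3 QuickSight client (for example `boto3.client("quicksight")`); the function name, role ARN, session name, and email are hypothetical placeholders, not values from the Bill.com setup.

```python
def register_federated_user(client, account_id, role_arn, session_name, email,
                            user_role="READER", namespace="default"):
    """Pre-provision a QuickSight user backed by a federated IAM role session.

    `client` is assumed to be a boto3 QuickSight client. With IdentityType
    "IAM", QuickSight ties the user to the given role and session name, which
    in a SAML setup usually matches the name the IdP passes when the user
    assumes the role.
    """
    return client.register_user(
        AwsAccountId=account_id,
        Namespace=namespace,
        IdentityType="IAM",        # federated through an IAM role, not a QuickSight-native login
        IamArn=role_arn,           # hypothetical role assumed through Okta
        SessionName=session_name,  # hypothetical; typically the user's email
        Email=email,
        UserRole=user_role,        # READER, AUTHOR, or ADMIN
    )
```

Registering users ahead of time lets an administrator decide each user's role and group membership before their first sign-in.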
In addition to IdP-initiated authentication, one of the important features that Bill.com was looking for was deep linking of dashboards. Bill.com users regularly used and shared deep-linked dashboards in the legacy solution, but doing so with a federated SAML solution requires service provider-initiated SAML authentication for QuickSight. This way, when a user chooses the dashboard URL, they are redirected to Okta for single sign-on and then sent to the specific dashboard.
The Bill.com team worked closely with the QuickSight team to become an early adopter of the service provider-initiated feature and jointly tested this functionality. This feature helped Bill.com achieve full parity with its legacy systems on dashboard shareability and accelerated the transition to QuickSight.
For more information about using QuickSight with Okta, see Federate Amazon QuickSight access with Okta.
Data security and authorization
Data security and authorization typically include the subsystems responsible for data security, data protection, and access authorization checks. QuickSight enables you to manage your users and content using a comprehensive set of security features. This includes role-based access control, Microsoft Active Directory integration, and AWS CloudTrail auditing.
Fine-grained access control
Bill.com needed fine-grained access control on the tables and columns that a user can see within QuickSight. Bill.com also needed a permission layer between QuickSight and Bill.com data resources, Amazon Simple Storage Service (Amazon S3) and AWS Glue, to control what data can be accessed and by whom. AWS Lake Formation fit these requirements perfectly.
The following diagram is a high-level depiction of the integration architecture.
As a Lake Formation admin, the first thing Bill.com did was register the Amazon S3 locations that contain the data. This has two benefits:
- Client applications like Athena and QuickSight don’t need explicit resource-level permissions on the S3 buckets
- A user with an assumed IAM role must go through Lake Formation, which determines whether access to a resource is granted to that role
Bill.com created specific IAM policies that determine which AWS Glue databases and tables can be accessed in QuickSight through Athena, but resource-level permissions are governed in Lake Formation.
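The two Lake Formation steps described above (registering an S3 location, then granting a role access to specific tables and columns) can be sketched as follows. This assumes `lf_client` is a boto3 Lake Formation client (`boto3.client("lakeformation")`); the function names and any ARNs, database, table, and column names passed in are hypothetical.

```python
def register_data_location(lf_client, s3_arn):
    """Register an S3 location so Lake Formation governs access to it."""
    return lf_client.register_resource(
        ResourceArn=s3_arn,
        UseServiceLinkedRole=True,  # let Lake Formation use its service-linked role
    )


def grant_column_select(lf_client, principal_arn, database, table, columns):
    """Grant SELECT on a subset of columns of one AWS Glue table to a principal."""
    return lf_client.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": principal_arn},
        Resource={
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),  # only these columns become visible
            }
        },
        Permissions=["SELECT"],
    )
```

With grants like these in place, a query issued from QuickSight through Athena under the assumed role returns only the columns Lake Formation allows for that role.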
Scoped IAM assignments
In addition to Lake Formation, Bill.com used the fine-grained IAM access control capabilities in QuickSight to scope down permissions for individual users and groups. A scope-down IAM policy assignment adds more control over the AWS resources that a user or group can access. Bill.com used these assignments to restrict access to specific Athena workgroups, AWS Glue databases and tables, and more. This is a great way to add multiple layers of access control, with Lake Formation providing resource-level control and scope-down IAM policy assignments in QuickSight allowing for fine-grained data control within different groups of users. For example, User A can have access to SalesGroup, under which they have access to SalesTeam (an IAM assignment), under which they can access the Sales folder.
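To make the idea of a scope-down policy concrete, here is an illustrative policy document built in Python. It is a sketch only: the function name, workgroup, and database names are hypothetical, and a working policy would likely need further statements (for example, for the Athena query results location).

```python
import json


def build_scope_down_policy(region, account_id, workgroup, database):
    """Build an IAM policy document limited to one Athena workgroup and one Glue database."""
    athena_workgroup = f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}"
    glue_prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AthenaWorkgroupOnly",
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                ],
                "Resource": [athena_workgroup],
            },
            {
                "Sid": "GlueDatabaseOnly",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartitions",
                ],
                "Resource": [
                    f"{glue_prefix}:catalog",
                    f"{glue_prefix}:database/{database}",
                    f"{glue_prefix}:table/{database}/*",
                ],
            },
        ],
    }


# Hypothetical values, serialized as it would be stored in IAM.
policy_json = json.dumps(
    build_scope_down_policy("us-west-2", "111122223333", "sales-wg", "sales_db"),
    indent=2,
)
```

Keeping the policy narrow means that a QuickSight group given this assignment cannot reach other workgroups or databases, even where the broader QuickSight service role would allow it.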
The following example code creates the group SalesGroup:
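A minimal sketch of this step, assuming `client` is a boto3 QuickSight client and using the `CreateGroup` API. The function name, account ID, and description are placeholders.

```python
def create_quicksight_group(client, account_id, group_name,
                            description="", namespace="default"):
    """Create a QuickSight group in the given namespace."""
    return client.create_group(
        AwsAccountId=account_id,
        Namespace=namespace,
        GroupName=group_name,     # for example, "SalesGroup"
        Description=description,  # free-text label shown to administrators
    )
```

For example, `create_quicksight_group(client, "111122223333", "SalesGroup", "Sales Group")` would create the group in the default namespace.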
The following code assigns a user to this group:
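A minimal sketch of this step, again assuming `client` is a boto3 QuickSight client, here using the `CreateGroupMembership` API. The function name and the member name `UserA` are placeholders.

```python
def add_user_to_group(client, account_id, member_name, group_name,
                      namespace="default"):
    """Add an existing QuickSight user to an existing QuickSight group."""
    return client.create_group_membership(
        AwsAccountId=account_id,
        Namespace=namespace,
        MemberName=member_name,  # the QuickSight user name, for example "UserA"
        GroupName=group_name,    # for example, "SalesGroup"
    )
```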
You can use the QuickSight console to assign an IAM policy to the group (see the following screenshot).
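The same kind of assignment can also be made programmatically. The sketch below uses the QuickSight `CreateIAMPolicyAssignment` API and assumes `client` is a boto3 QuickSight client; the function name and policy ARN are hypothetical, and the assignment name `SalesTeam` follows the earlier example.

```python
def assign_policy_to_group(client, account_id, assignment_name, policy_arn,
                           group_name, namespace="default"):
    """Attach a scope-down IAM policy to a QuickSight group through an assignment."""
    return client.create_iam_policy_assignment(
        AwsAccountId=account_id,
        Namespace=namespace,
        AssignmentName=assignment_name,     # for example, "SalesTeam"
        AssignmentStatus="ENABLED",         # DRAFT or DISABLED are also accepted
        PolicyArn=policy_arn,               # hypothetical customer-managed policy
        Identities={"group": [group_name]}, # "user" entries can be listed the same way
    )
```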
Within QuickSight, Bill.com needed different levels of content access that determined which users may view and edit content. In addition to the author and reader roles that a user gets in QuickSight, Bill.com needed to manage content access among different types of users.
The level of access Bill.com adopted was an open system with restrictions, which means that access to shared content is restricted in some way: either only certain people can edit certain content, or certain content is entirely invisible to particular people. Bill.com achieved this with the following configuration:
- Each functional team (Sales, Marketing, Product, Finance, and so on) has an associated folder
- Each user is associated with a group and a folder
- Each folder is assigned a group of users that can manage the folder content, and every QuickSight user is allowed view access to the folder through a default group
- The folder owners can create subfolders to further manage the content within that folder, and can decide which users can view the subfolders
- If certain content needs more restrictive access, you can create another group (for example, Finance Viewer) and grant only that group access to a subfolder
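The folder configuration above can be approximated with the QuickSight `CreateFolder` API, as in the following sketch. It assumes `client` is a boto3 QuickSight client; the function names, folder ID, and the default viewer group name are hypothetical, and the owner and viewer action lists should be checked against the current QuickSight API reference.

```python
# Actions for a group that manages a folder versus a group that can only view it.
OWNER_ACTIONS = [
    "quicksight:CreateFolder",
    "quicksight:DescribeFolder",
    "quicksight:UpdateFolder",
    "quicksight:DeleteFolder",
    "quicksight:CreateFolderMembership",
    "quicksight:DeleteFolderMembership",
    "quicksight:DescribeFolderPermissions",
    "quicksight:UpdateFolderPermissions",
]
VIEWER_ACTIONS = ["quicksight:DescribeFolder"]


def group_arn(region, account_id, group_name, namespace="default"):
    """Build the ARN of a QuickSight group."""
    return f"arn:aws:quicksight:{region}:{account_id}:group/{namespace}/{group_name}"


def create_team_folder(client, region, account_id, folder_id, folder_name,
                       owner_group, viewer_group):
    """Create a shared folder managed by one group and viewable by another."""
    return client.create_folder(
        AwsAccountId=account_id,
        FolderId=folder_id,
        Name=folder_name,
        FolderType="SHARED",
        Permissions=[
            {"Principal": group_arn(region, account_id, owner_group),
             "Actions": OWNER_ACTIONS},
            {"Principal": group_arn(region, account_id, viewer_group),
             "Actions": VIEWER_ACTIONS},
        ],
    )
```

A more restricted subfolder, such as one for a Finance Viewer group, could be created the same way by also passing `ParentFolderArn` and listing only that group as a viewer.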
The following screenshot shows an example of these shared folders.
SalesGroup (which we created earlier) gets full ownership of the
Having a well-defined governance model and access control has helped Bill.com accelerate QuickSight adoption within the company. Bill.com now has hundreds of users in QuickSight developing and sharing content every day. The total cost of data access through QuickSight is further optimized with SPICE storage and Athena. Bill.com achieved the following major benefits by adopting QuickSight:
- Data source compatibility
- Slick and smooth SPICE engine
- Portability across channels
- High scalability
- Smart interactive dashboards
Next, Bill.com is looking to further explore QuickSight capabilities:
We will share these experiences in future blog posts.
About the Authors
Kannan Iyengar heads the Bill.com data engineering team that is responsible for developing and operating the analytics and reporting platform at Bill.com. In his spare time, he likes running, hiking, and reading.
Manish Chugh is a Sr. Solutions Architect at AWS based in San Francisco, CA. He has worked with organizations ranging from large enterprises to early-stage startups. He is responsible for helping customers architect scalable, secure, and cost-effective workloads on AWS. In his free time, he enjoys hiking East Bay trails, road biking, and watching (and playing) cricket.