This guide explains how PostgreSQL Row-Level Security can be used to securely implement multitenant applications with Hibernate Reactive and the Quarkus framework.
In the context of Software-as-a-Service (SaaS) applications, multitenancy is a concept used to separate the data of different customers, also referred to as tenants. Multitenancy improves security by restricting the access of users to their own data. In the past, multitenancy was implemented mostly by either using a dedicated database instance per tenant or by controlling the data access within the application itself. In 2016, PostgreSQL introduced row-level security (RLS) as a core feature for natively implementing multitenancy by restricting data access based on the current user’s authorization. In this article, we demonstrate how to use RLS with Quarkus and Hibernate Reactive for implementing efficient and secure multitenant applications.
The naive approach to implementing multitenancy is scaling the number of databases up based on tenants served. All data for a particular tenant are stored in a separate and dedicated database to isolate them from other customers’ data. One of the advantages of the one-database-per-tenant approach is its flexibility: Each customer database can be scaled separately based on its individual needs. The flexibility comes at a high price: This approach does not share any resources, is very expensive to operate, and is the most complex to manage. Applications need to configure a unique connection pool for each customer (database), which causes extra overhead and complexity. Another drawback that comes with this approach is that using techniques like change data capture (CDC) can require more work than when storing all tenants in the same database.
Following the one-schema-per-tenant approach, a dedicated schema is created for each tenant. Compared to the one-database-per-tenant approach, this mode of operation offers less security since all of your data sit inside one database instance. Also, managing schema changes can become complicated since changes must be maintained consistently across all schemas. Both methods result in a large number of database objects, which can be more complex to manage. Implementing CDC can also require more work than when storing all tenants in the same schema.
Another way to implement multitenancy is
by handling the logic inside the application and always querying the database using a
WHERE
clause discriminator. This method does not isolate the data inside the database but lets
the application filter the data that the respective user is allowed to access.
While implementing multitenancy inside the application is fairly simple, it comes with
certain limitations. Compared to the two previous approaches, this method has fewer security
measures and is prone to developer-induced issues. If a database query without a
correctly-configured WHERE
clause slips through a code review, it could result in the
unauthorized access of sensitive user data. To mitigate this risk, a global wrapper could be used
within the application to automatically add the WHERE
clause to all database queries. Another
security issue that could be encountered is that attackers get access to the entire database
once the database credentials are breached whereas the other approaches allow the use of
different database users for different tenants.
PostgreSQL introduced its Row-Level Security (RLS) feature with the release of version 9.5. RLS allows to define policies that determine which users are allowed to view, update, and delete certain rows in a table based on which tenant they belong to. Policies can be defined for many different criteria, such as the user’s role, the user’s identifier, or the location from which the data is being queried.
RLS offers a few advantages compared to other methods outlined in this article:
At DataCater, we put a high value on being fast and responsive. Therefore, we try to adopt reactive practices wherever possible. We use Hibernate Reactive to ensure that our API calls are non-blocking for the best user experience possible. Unfortunately, at the time being, Hibernate Reactive does not support PostgreSQL RLS out of the box. This limitation is well known within the community, with some users avoiding Hibernate Reactive for certain use cases, or changing their data storage approach altogether. The following section describes how we overcame this hurdle and implemented multitenancy with PostgreSQL RLS and Hibernate Reactive using a custom solution. In a nutshell, we manually initialize RLS in PostgreSQL and inject the tenant information into all database queries executed by Hibernate Reactive.
By default, Hibernate generates the database schema as defined by the application at startup
time. Unfortunately, there is no way to let Hibernate enable RLS in PostgreSQL. To address this
issue, we switched from handling the database schema with Hibernate to using a
custom SQL script. The SQL script enables RLS in PostgreSQL, defines a tenant discriminator, and
populates the database schema. If you already have a PostgreSQL instance running you would
need to manually enable RLS and add a discriminator column (called tenant_id
in the
subsequent examples) to your existing tables.
In the following example, we create a PostgreSQL table (here called deployments
) with the
columns we need for the application and the additional column, tenant_id
, which will be used by
RLS as the discriminator. We set the default value of the column tenant_id
to the value
contained in the PostgreSQL custom setting datacater.tenant
. This is the setting our application
will set on calls to the database, as explained in the Modify Hibernate Reactive Calls to the
Database section.
CREATE TABLE deployments ( id UUID NOT NULL PRIMARY KEY, createdat TIMESTAMP, name VARCHAR(255), spec JSONB, updatedat TIMESTAMP, tenant_id TEXT DEFAULT current_setting('datacater.tenant')::text NOT NULL );
The PostgreSQL instance being used must have RLS activated for the tables it is to be used on. Here is an example showing how to enable RLS and create a policy that defines the access restriction strategy:
ALTER TABLE deployments ENABLE ROW LEVEL SECURITY; CREATE POLICY rls_policy ON deployments USING (tenant_id = current_setting('datacater.tenant'::text));
To utilize PostgreSQL RLS with Hibernate Reactive and the Quarkus framework, it is necessary
to manually add the tenant information to each database call. We developed a wrapper around
the Mutiny.SessionFactory
to globally configure the process. Our wrapper overrides
Hibernate’s withSession()
and withTransaction()
methods to initially apply tenant
information to the session and transaction queries, and then apply the defined work to the same
worker thread. This provides a seamless developer experience and allows these methods to be
used in the same manner as the original Hibernate functions.
public <T> Uni<T> withTransaction(BiFunction<Session, Transaction, Uni<T>> work) { String nativeQuery = String.format("SET datacater.tenant = %s", getPrincipal()); return sf.withTransaction( (session, transaction) -> { var update = session.createNativeQuery(nativeQuery).executeUpdate(); return update.onItem().transformToUni(ignore -> work.apply(session, transaction)); } ); }
The SecurityIdentity
containing the tenant data for the active user is injected into the class,
allowing us to extract a unique name for the use as a tenant ID.
@Inject SecurityIdentity si; private String getPrincipal() { String principalName = si.getPrincipal().getName(); if (principalName == null || principalName.isEmpty()) { throw new UnauthorizedException(); } LOGGER.info(String.format("Detected principal with name := %s.", principalName)); return principalName; }
Multitenancy is a crucial concept utilized by SaaS providers to manage customer data securely. The decision to choose the appropriate architecture primarily depends on the expected resource consumption of the software and the number of customers to be served. Although PostgreSQL RLS is an efficient way to secure data access, it comes with some drawbacks, such as Hibernate Reactive’s lack of support, which requires a custom solution. Overall, the proper implementation of multitenancy using PostgreSQL can enhance the security and scalability of SaaS applications, thereby improving the user experience.