Learn how to apply change data capture for extracting data changes from HubSpot.
HubSpot offers a suite of software products for inbound marketing, sales, and customer service. For instance, one of their products, HubSpot CRM, can be used for managing all kinds of relationships with customers, like contact people, companies, deals, or offers. While HubSpot offers powerful ways for managing the valuable data related to the business, the products, and the customers of a company, it lacks advanced support for analytics.
In this article, we show how to use DataCater’s REST/HTTP source connector for extracting data change events from your HubSpot CRM account. Using a pipeline, we could push the extracted events to data sinks, such as Google Cloud BigQuery, Snowflake, or another REST/HTTP API.
The HubSpot API provides multiple means for authentication, like OAuth2 or API keys, which are all supported by our REST/HTTP source connector. For ease of configuration, we use API keys in this article.
For all CRM objects, HubSpot provides a search endpoint in their API that can be used
to retrieve, sort, and filter records. For instance, the following POST
request retrieves companies sorted by the field hs_lastmodifieddate
in
ascending order:
curl \ -XPOST \ -H'Content-Type:application/json' \ -d'{"sorts": [{"propertyName": "hs_lastmodifieddate", "direction": "ASCENDING"}]}' \ https://api.hubapi.com/crm/v3/objects/companies/search/?hapikey=$YOUR_API_KEY
By default, the search endpoint returns 10 records. We can bump this
number up to 100 (HubSpot doesn’t allow more) by setting the parameter limit
:
curl \ -XPOST \ -H'Content-Type:application/json' \ -d'{"limit": 100, "sorts": [{"propertyName": "hs_lastmodifieddate", "direction": "ASCENDING"}]}' \ https://api.hubapi.com/crm/v3/objects/companies/search/?hapikey=$YOUR_API_KEY
The option filterGroups
allows us to select only records
matching specific filters (or requirements). For instance, we might use it to fetch
companies that have been modified in HubSpot (attribute hs_lastmodifieddate
) after 2022/01/01 00:00:00 (UNIX timestamp 1640991600
):
curl \ -XPOST \ -H'Content-Type:application/json' \ -d'{"filterGroups":[{"filters":[{"propertyName": "hs_lastmodifieddate","operator": "GT","value":1640991600}]}], "limit": 100, "sorts": [{"propertyName": "hs_lastmodifieddate", "direction": "ASCENDING"}]}' \ https://api.hubapi.com/crm/v3/objects/companies/search/?hapikey=$YOUR_API_KEY
By combining these different search options, we can (1) efficiently walk over all records stored in HubSpot and (2) implement change data capture by monitoring the HubSpot API for changes.
All we need to do is to recurringly execute an API request, which sorts companies by their change time (hs_lastmodifieddate
), and update the value in the filter on the attribute hs_lastmodifieddate
with each request such that a request searches for all companies, which have a value in hs_lastmodifieddate
that is greater than the greatest value of the prior request.
Sounds complicated to set up? Try out our REST/HTTP connector.
Please create a new REST source connector and fill out the fields URI, HTTP method, HTTP request headers, and HTTP request body, accordingly to the curl commands shown in the prior section:
Note that we use the variable ${offset.timestamp?datetime.iso?long}
in the HTTP request body. DataCater’s REST connector automatically fills it with the largest timestamp value it has processed. We also need to fill the fields JSON Pointer to records list with /results
and Name of the attribute holding the record key with id
to (1) point the connector to the JSON array in the API response that holds the records and (2) let the connector know which attribute holds unique record identifiers.
To finish the connector configuration, please navigate to the Change
Data Capture tab and choose updatedAt
as the Name
of the attribute holding the timestamp of the record’s most recent
update:
That’s it! You can save the data source, create a new pipeline, and start consuming change events from HubSpot in (near) real-time.
Using HubSpot’s search endpoint, we can perform powerful search queries
that sort and filter CRM records. By combining the sorts
and filterGroups
options, we can walk over all records
stored in HubSpot and monitor HubSpot for changes.
We used DataCater’s REST/HTTP source connector to implement change data capture with HubSpot without having to track the state of the data extraction, i.e., the timestamps, ourselves.