If you're new to GraphQL, skim this page for a brief introduction on what it is, and how it's different from other types of APIs. If you're already a pro, feel free to move along.
GraphQL is a new way of thinking about fetching and manipulating resources across the internet. Implicit in GraphQL is the understanding that all resources in a system are interconnected pieces of a larger graph, and that you should be able to read from multiple parts of the graph at one time in a way that is driven by the client, not the server.
What does that mean? Here's an example:
Suppose you have a podcasting API where you aggregate RSS feeds of popular podcasts and publish information about them that people can search and access. There might be a few resources that you manage in your system, such as Publishers, Podcasts, Podcast Episodes, and Podcast Hosts.
Although these are four distinct resources in your system, these resources are inter-related. A Publisher might have many Podcasts, a Podcast might have many Podcast Episodes, and a Podcast Episode will have at least one Podcast Host. You might even design your database schema around these relationships.
Additionally, each one of these resources has a number of attributes, like a name, or a date for a Podcast Episode.
If you wanted to expose this data programmatically to developers, you'd traditionally build a REST API to do it.
REST is actually an acronym (not just a state of mind) that stands for Representational State Transfer. REST APIs are traditionally JSON or XML APIs that are oriented around one or more resources, and use canonical HTTP verbs (GET, POST, DELETE, etc.) to signal intent.
For example, you might build a REST API to expose information about the podcasts in your system, with endpoints such as /api/podcast-episodes, /api/podcasts/:id, and /api/podcast-hosts/:id.
Now, imagine that you want to build an app that allows for simple navigation of podcasts. The app opens with a list of publishers. Clicking into one reveals a list of their podcasts, then clicking into that reveals a list of their episodes, etc. This API would power this easily. You'd make one API request per screen depending on what resource you wanted to view.
Now, this is obviously an oversimplified and contrived example, but it's illustrative of two problems you'd have building your app with this API:
What if you wanted to change your navigation so that the first screen was a list of 25 recent podcast episodes, and in each row you'd show the name of the podcast, the name of the episode, and the name of the host with their photo? How would you do that with the API that you have? You'd have to:
1. Request /api/podcast-episodes to get a list of podcast episodes with names, streaming URLs, and podcast and host ids.
2. Request /api/podcasts/:id for each episode to get details about the podcast, like its name and image.
3. Request /api/podcast-hosts/:id for each episode to get the host's photo.
If you're displaying 25 episodes, that's 51 API requests (1 + 25 + 25) you've made to render one screen. Even if you could do 2 requests in parallel and each one took only 100ms, that's ~2.6 seconds of waiting time before your app is usable. That's a lot of time for a fickle user who could just go back to Apple Podcasts (even though everyone knows Apple Podcasts is the worst one).
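The request count and waiting time can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes requests run in fixed-size batches where each batch costs one full round trip.

```python
import math


def requests_needed(episode_count: int) -> int:
    # One call for the episode list, then one podcast lookup
    # and one host lookup per episode.
    return 1 + 2 * episode_count


def wait_seconds(total_requests: int, parallelism: int, latency_ms: int) -> float:
    # Requests run in batches of `parallelism`; each batch costs one round trip.
    batches = math.ceil(total_requests / parallelism)
    return batches * latency_ms / 1000


total = requests_needed(25)
print(total)                                               # 51
print(wait_seconds(total, parallelism=2, latency_ms=100))  # 2.6
```

Raising the parallelism shortens the wait, but the request count still grows with every row you add to the screen.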
The second problem with this paradigm is how much extra data you've had to fetch that you didn't need. You had to make 25 API requests to /api/podcasts/:id just to fetch each podcast's name and image, even though the API probably returned a lot of other good info to you, like its air date and a list of all its episodes. Then, you had to do the same thing with /api/podcast-hosts/:id just to get the image URL of the host, even though you probably also got back the name of their alma mater and what Malcolm Gladwell book they're reading right now.
All that extra data isn't free: the server had to load it for you. Extra loading means extra latency, just to fetch data that you end up throwing away once you get it back.
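To see the waste in miniature, here is a small Python sketch. The response body is invented for illustration (every field name and value is an assumption, not the output of a real endpoint); it simply separates the fields the episode row renders from the ones that get thrown away.

```python
# Hypothetical response body for GET /api/podcasts/:id.
# Every field name and value here is illustrative.
podcast_response = {
    "id": 7,
    "name": "Example Podcast",
    "image": "https://example.com/podcast.png",
    "airDate": "2019-01-01",
    "episodes": [101, 102, 103],
}

# The episode row only renders the podcast's name and image.
needed = {"name", "image"}

used = {key: value for key, value in podcast_response.items() if key in needed}
discarded = sorted(set(podcast_response) - needed)

print(used)
print(discarded)  # ['airDate', 'episodes', 'id']
```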
Let's reimagine our resources above as an interconnected schema. It might look like this:
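One way to picture such a schema is as a set of linked types. The sketch below models the four resources as Python dataclasses. The field names (photo_url, streaming_url, image_url) and the sample values are assumptions for illustration, not the schema of any real API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodcastHost:
    name: str
    photo_url: str


@dataclass
class PodcastEpisode:
    name: str
    date: str
    streaming_url: str
    podcast: "Podcast"  # edge back to the parent podcast
    hosts: List[PodcastHost] = field(default_factory=list)


@dataclass
class Podcast:
    name: str
    image_url: str
    publisher: "Publisher"  # edge back to the parent publisher
    episodes: List[PodcastEpisode] = field(default_factory=list)


@dataclass
class Publisher:
    name: str
    podcasts: List[Podcast] = field(default_factory=list)


# Wire up a tiny graph with made-up data.
publisher = Publisher(name="Example Media")
podcast = Podcast(
    name="Example Podcast",
    image_url="https://example.com/podcast.png",
    publisher=publisher,
)
publisher.podcasts.append(podcast)

host = PodcastHost(name="Alex Host", photo_url="https://example.com/alex.png")
episode = PodcastEpisode(
    name="Episode 1",
    date="2019-01-01",
    streaming_url="https://example.com/ep1.mp3",
    podcast=podcast,
    hosts=[host],
)
podcast.episodes.append(episode)

# Walk from an episode up to its publisher in a single traversal.
print(episode.podcast.publisher.name)  # Example Media
```

Because every type holds references to its neighbors, you can start at any node and reach related data without a separate lookup per hop, which is the property a GraphQL schema exposes to clients.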
Given that we know the schema, GraphQL allows us to construct queries that selectively grab information from objects in the graph. We could write one GraphQL query to replace our 51 calls above:
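To make the idea concrete, here is a minimal Python sketch of what field selection does. It is a toy, not how a GraphQL server is actually implemented: the query text, the field names (recentEpisodes, photoUrl, and so on), and the sample data are all assumptions, and the select helper is a hypothetical stand-in for real query execution.

```python
# Hypothetical query text; the field names are assumptions for illustration.
QUERY = """
{
  recentEpisodes(first: 25) {
    name
    podcast { name }
    hosts { name photoUrl }
  }
}
"""

# The same selection written as nested dicts; None marks a leaf field.
SELECTION = {
    "name": None,
    "podcast": {"name": None},
    "hosts": {"name": None, "photoUrl": None},
}


def select(node, selection):
    """Return only the selected fields of node, recursing into objects and lists."""
    if isinstance(node, list):
        return [select(item, selection) for item in node]
    result = {}
    for field_name, sub_selection in selection.items():
        value = node[field_name]
        if sub_selection is None:
            result[field_name] = value
        else:
            result[field_name] = select(value, sub_selection)
    return result


# Stand-in for the data the server holds; all values are made up.
EPISODES = [
    {
        "name": "Episode 1",
        "streamingUrl": "https://example.com/ep1.mp3",
        "podcast": {
            "name": "Example Podcast",
            "image": "https://example.com/podcast.png",
            "airDate": "2019-01-01",
        },
        "hosts": [
            {
                "name": "Alex Host",
                "photoUrl": "https://example.com/alex.png",
                "almaMater": "Example University",
            }
        ],
    }
]

response = {"data": {"recentEpisodes": select(EPISODES, SELECTION)}}
print(response)
```

The response mirrors the shape of the query: the episode name, the podcast name, and each host's name and photo come back in one round trip, while unrequested fields such as streamingUrl and almaMater never leave the server.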
You can think of GraphQL almost like a hyper-targeted SQL query, pulling out just the data you need.
Not every type of API will benefit from GraphQL, but we believe that financial datasets especially benefit from a data access layer with a highly flexible query language. Financial datasets can be quite large, and the applications using this data are quite varied in use case and end-user expectations. No REST API would be able to accurately predict and model for every use case. GraphQL allows developers the flexibility to prototype and scale applications quickly and not have to think about building out their own data access layer.
GraphQL is definitely not a silver bullet, and we're by no means GraphQL maximalists, but we think it's a great innovation for external developer APIs that offers a ton of flexibility (as shown above) and a few other key benefits:
Check out awesome-graphql, the best repository for discovering great community tools built around GraphQL.
graphql.org is the definitive source for in-depth learning about the GraphQL spec and covers much more than what's covered here.