Threat actors are after your APIs. Whether it’s your data that they’re interested in, or abusing your services for their financial gain. The reasons why threat actors target different organizations can vary, but one of the first steps they take in their hacking methodology is often consistent and predictable - information gathering or reconnaissance.
When hackers gather information on a target application, they do so both passively and actively. Passive information gathering does not involve sending packets to the targeted application - it involves gathering information through other channels that are indirect to the main application or the company operating it as a means to remain stealthy (but not only). For example, passive information gathering can be done by looking up the company’s name on GitHub in hopes of finding public repositories that could give them hints on what technologies are in use, or to find credentials that were hard-coded but forgotten about that they can then reuse.
Active information gathering involves communicating with the application, this requires the threat actors to send some traffic to the application. Imagine you are defending a large-scale application that serves millions of clients. You can appreciate how difficult it becomes to identify anomalous behavior if you don’t know what anomalous traffic could look like. Inigo’s solution not only allows cherry-picking those GraphQL queries that are targeting your application in a riskier way and surface the interesting ones to your security team, but also block and alert on them.
GraphQL introduces a shift in how clients request their data from applications using its innovative declarative query language. In GraphQL, clients express their intent using a query payload. If you ever used REST APIs, you may recall that in REST, a client’s intention is expressed by the combination of the HTTP method (GET, PUT, POST, DELETE) and the resource path (such as /v1/users). For instance, a GET request to /v1/users would result in fetching the list of all users of a given application.
GraphQL, on the other hand, uses a single endpoint (such as /graphql) and some query to fetch a list of users. Such query could look like the following:
users {
name
email
}
So, how can threat actors discover whether an application is using GraphQL or another API technology? They do so by sending some query (the query does not have to be a valid one) to a list of possible endpoints that GraphQL may exist on, and observing the responses returned by the server.
For example, this is how a cURL request to a GraphQL endpoint could look like:
$ curl https://example.inigo.io/graphql -d '{"query":"query { users { name email } }"}' -H "Content-Type: application/json"
A typical GraphQL response is in JSON, and includes data or errors (or both) objects, which give away the fact this is a GraphQL API that responded:
{"errors":[{"message":"Cannot query field \"users\" on type \"Query\".","extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}]}
This method can then be automated so queries run against multiple endpoints in parallel. Possible endpoints could be:
If API versioning is in place, you may find GraphQL located in paths such as:
As you can see, APIs can be present in different locations. GraphQL servers can also be customized to point to a completely arbitrary location that isn’t on the list above. However, GraphQL API responses are often predictable, as dictated by the official GraphQL specification Response Format section (7.1), allowing actors to identify that GraphQL is the interface which they are interacting with. Here is an excerpt of the response format section:
A response to a GraphQL request must be a map. If the request raised any errors, the response map must contain an entry with key errors. The value of this entry is described in the “Errors” section. If the request is completed without raising any errors, this entry must not be present. If the request included execution, the response map must contain an entry with key data. The value of this entry is described in the “Data” section. If the request failed before execution, due to a syntax error, missing information, or validation error, this entry must not be present. The response map may also contain an entry with key extensions. This entry, if set, must have a map as its value. This entry is reserved for implementors to extend the protocol however they see fit, and hence there are no additional restrictions on its contents.
This means that keys such as data, errors and extensions are all keys you may find in a GraphQL response. Armed with this knowledge, threat actors can build this logic into scanning tools to find where your GraphQL APIs live. In a future post, we will explore what threat actors might do after a GraphQL server was found.
When threat actors are performing an information gathering activity, they typically try different variations of queries. Some may be valid, some may not. Invalid queries could result in server exceptions. Inigo’s solution provides a layer of protection for your GraphQL APIs, our unique position in your architecture allows us to detect malformed queries against existent and non-existent endpoints and mitigate against such activity. Using our GraphQL Security and Management solution. Reach out to us for more information.