The One Question
Last week, while attending an interview at a FinTech company, I was asked an interesting question:
Suppose I have an API endpoint which I want to expose to the public internet. How do I go about securing it?
While it is a simple question on the surface, the answer could span a large number of approaches depending on various factors.
To get some more clarity, I asked a few follow-up questions and received some more constraints:
- The endpoint returns a string response
- The endpoint will be exposed to public for consumption
- Security can be anything that makes sense for a publicly exposed API (open-ended scope)
I am not very involved in security topics in my day-to-day duties, but I do receive recommendations, feedback, and change requests from dedicated security teams to implement various mitigations for my own services from time to time.
After the interesting discussion, I decided to reflect on the question at my leisure, with a cup of tea and without the nervousness and time constraints, and came up with several approaches I would take to secure an API.
Disclaimer
NOTE: This is entirely an exploratory musing post and must not be taken as advice or best practice! Please consult qualified security experts for similar situations.
How To Secure My API on Public Web?
First off, if I were to expose my API to the public web, I would not do it directly. There are numerous reasons why; rather, the API should be fronted by some other layers.
Basic Security
Some first-level basic security approaches to take when we are starting out are discussed below.
Authentication and Authorization: or simply AuthN/AuthZ
When we hear the term security, we immediately think of access control and management. If the API is sensitive, we must implement adequate Authentication and Authorization.
- Authentication: the user is verified to be who they claim to be (identity)
- Authorization: the user has adequate permission to access the underlying resource (rights)
Depending on the consumer group, the API may implement various authentication schemes, such as JWT, sessions, or API keys. Additionally, the API must maintain an authorization layer to verify that the consumer has the appropriate rights to consume the particular resource our API provides.
Such implementations are conveniently backed by API gateways and identity and access management tools such as Keycloak. It is also possible to implement these layers directly in the backing service of the API using various frameworks and tools.
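As an illustration, here is a minimal sketch of a token-based AuthN/AuthZ check in a Python backend, using the PyJWT library. The secret, the "scopes" claim, and the scope names are illustrative assumptions, not a prescribed setup.

```python
# Minimal sketch: JWT-based authentication plus a scope check.
# SECRET_KEY and the "scopes" claim shape are hypothetical.
import jwt  # pip install PyJWT

SECRET_KEY = "change-me"  # in practice, load from a secret store

def authenticate(token: str) -> dict:
    """AuthN: verify the token's signature and expiry; return its claims."""
    # Raises jwt.InvalidTokenError if the token is tampered with or expired.
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])

def authorize(claims: dict, required_scope: str) -> bool:
    """AuthZ: check that the verified identity holds the required right."""
    return required_scope in claims.get("scopes", [])

# Usage: issue a token, then gate a resource on it.
token = jwt.encode({"sub": "alice", "scopes": ["quotes:read"]},
                   SECRET_KEY, algorithm="HS256")
claims = authenticate(token)
assert authorize(claims, "quotes:read")
```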
Validation
When requests come in to the API, the user may send invalid content. For example, if we expect a specific request envelope but the user instead sends a text file in the request body, this causes unnecessary processing and possible exceptions or errors.
We should always implement a request validation strategy on our API endpoint, allowing us to discard invalid requests and inform the user of the expected request content. Not only does this avoid processing garbage data, it also allows the user to notice issues on their end and make the appropriate corrections.
Validation can be implemented in various locations; however, the API gateway (discussed in the Scale section) or the backend are the most preferable locations.
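For example, a minimal validation sketch with Pydantic might look like the following; the request fields and their constraints are illustrative assumptions.

```python
# Minimal sketch: reject malformed request bodies before any processing.
from pydantic import BaseModel, Field, ValidationError

class QuoteRequest(BaseModel):  # hypothetical request envelope
    symbol: str = Field(min_length=1, max_length=10)
    quantity: int = Field(gt=0, le=1_000_000)

def handle(raw_body: dict) -> str:
    try:
        req = QuoteRequest(**raw_body)
    except ValidationError as exc:
        # Tell the user what was wrong instead of processing garbage.
        return f"400 Bad Request: {len(exc.errors())} invalid field(s)"
    return f"200 OK: quote for {req.quantity} x {req.symbol}"

print(handle({"symbol": "ACME", "quantity": 5}))  # accepted
print(handle({"symbol": "", "quantity": -1}))     # rejected early
```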
Sanitization
As most APIs accept content from and return content to the consumer, it is always important to sanitize data in both directions, especially when it is coming from the consumer side.
In this regard, a zero-trust approach is recommended, where all user-provided data is sanitized before processing and persistence. One common threat of unsanitized data is code injection, where the data is rendered or evaluated during processing wherever parsing is performed.
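The classic example is SQL injection. Below is a toy sketch contrasting unsafe string interpolation with a parameterized query; the table and the payload are illustrative.

```python
# Toy sketch: parameterized queries treat user input strictly as data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

user_input = "alice' OR '1'='1"  # a classic injection payload

# UNSAFE: splicing input into the SQL text lets the payload rewrite the query.
# conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

# SAFE: the driver binds the value; the payload matches nothing.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,))
print(rows.fetchall())  # []
```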
Error Handling
Any software implementation is prone to errors and exceptions. This aspect is often missed or ignored in the initial phase of a service's development.
The API will eventually hit some case (invalid data, a corrupted request, a backing service outage, a logical error, etc.) where an error is raised.
If such errors are not correctly handled, sensitive data or implementation details (stack traces!) may be exposed to the consumer. This kind of exposure can help a malicious actor determine specifics of the service in order to breach it or dump sensitive data.
A service must be implemented to handle known error cases and return a friendly error message to the consumer without exposing any sensitive or implementation details.
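As a sketch, a catch-all error handler (here with Flask; the framework choice and the messages are illustrative assumptions) logs the full details internally while the consumer only ever sees a generic message:

```python
# Minimal sketch: keep stack traces in our logs, return a friendly message.
import logging
from flask import Flask, jsonify

app = Flask(__name__)
log = logging.getLogger(__name__)

@app.errorhandler(Exception)
def handle_unexpected(exc):
    # Full details (including the stack trace) stay in our logs.
    log.exception("unhandled error")
    # The consumer never sees implementation details.
    return jsonify(error="Something went wrong. Please try again later."), 500
```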
Monitoring
An API must have adequate monitoring of all activities. The logs and traces of requests should be constantly collected and monitored for incidents and anomalies.
If our API is constantly receiving malicious data or raising exceptions, the correct level of logging and alerting will help us not only detect security breaches or attempted abuse, but also record an audit trail of how the service was abused. With adequate logs, we can triage any security issue and implement the required fixes.
Additionally, keeping logs and monitoring data helps determine the performance of the service and allows optimization, which saves costs as we scale the service in the future.
Another use of monitoring via logs and metrics is fraud detection. The events of each request can be forwarded to a fraud detection system to flag, alert on, and sometimes block abuse of the system, which could otherwise result in legal or other forms of loss.
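Here is a small sketch of structured, machine-readable request logging (the field names are illustrative assumptions) that a monitoring or fraud detection pipeline could consume:

```python
# Minimal sketch: emit one JSON event per request for downstream analysis.
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit = logging.getLogger("api.audit")

def log_request(client_ip: str, path: str, status: int, duration_ms: float) -> None:
    audit.info(json.dumps({
        "ts": time.time(),        # event timestamp
        "client_ip": client_ip,   # who called us
        "path": path,             # what they called
        "status": status,         # how it ended
        "duration_ms": duration_ms,
    }))

log_request("203.0.113.7", "/api/v1/quote", 200, 12.5)
```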
Headers
If the API can be accessed from a browser, it is important to also set the correct headers, such as Cross-Origin Resource Sharing (CORS) headers, Cross-Origin-Resource-Policy, and Content-Security-Policy.
The MDN (Mozilla Developer Network) articles linked above are very interesting and recommended to the reader who wants to learn more.
Without proper headers, a malicious actor can build a phishing site and use the API to extract the user’s sensitive data.
Additionally, a Cross-Site Request Forgery (CSRF) token must be used on the client side to prevent abuse of a legitimate session.
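A minimal sketch of setting such headers on every response (again with Flask; the policy values and the frontend origin are illustrative assumptions):

```python
# Minimal sketch: attach security headers to every response.
from flask import Flask

app = Flask(__name__)

@app.after_request
def set_security_headers(resp):
    resp.headers["Content-Security-Policy"] = "default-src 'self'"
    resp.headers["Cross-Origin-Resource-Policy"] = "same-origin"
    # Only allow browser calls from our own (hypothetical) frontend origin.
    resp.headers["Access-Control-Allow-Origin"] = "https://app.example.com"
    return resp
```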
Security Issues and Updates
Our API is usually implemented on top of some open-source or proprietary software. It is very important to regularly scan our code base, as well as the backing tools and frameworks, for the latest security vulnerabilities.
A service built on outdated tools is open to buffer overflow, remote code execution, privilege escalation, data leak, and other attacks.
Additionally, it is not only the frameworks: our own implementation may open the service up to various attacks. For example, if the service performs a raw eval on unsanitized data, it may execute malicious content, not only infecting the service but also enabling further escalation. Another common mishap is a poor implementation where the service unnecessarily loads massive amounts of data or performs sequential scans over large content on each request, causing severe degradation as the number of requests increases.
One of the popular tools for frequently scanning code bases and dependencies is SonarQube.
Scale
As our service scales with more instances to serve more consumers, additional steps are necessary to secure it beyond the basic “security” practices.
API Gateway
As the API may be part of a larger collection of services, exposing it to the public web directly does not make sense: consumers would have to keep a record of service identifiers (hosts, IPs, etc.), making it difficult to build on the API.
My first simple approach would be to put my API behind an API gateway (e.g. AWS API Gateway, Nginx, etc.). This makes the API conveniently available along with the other APIs in my fleet.
Additionally, the API gateway provides other layers of convenience, such as logging, monitoring, security, auditing, and controlled rollouts.
Rate Limiting
Any service exposed to consumers will eventually suffer abuse (both intentional and accidental). It is important to implement rate limiting on an API to balance user convenience against service load.
A broken implementation on the consumer side may generate a large number of requests over a very short window, overwhelming the service and dedicating too much of its resources to one consumer while locking out the others. Additionally, such behavior can cause dangerous levels of resource consumption (CPU, memory, disk, etc.), which may trigger very large automatic scaling in a cloud deployment and cause a massive billing headache.
Furthermore, malicious actors may direct a large-scale attack at the service, causing a Denial of Service (DoS): knocking out our API or locking out all other consumers as it works in a degraded mode.
To avoid such cases, we need to implement rate limiting. There are several well-known strategies (a sketch of the first appears after the list):
- Token Bucket
- Leaky Bucket
- Fixed Window Counter
- Sliding Window Log
- Sliding Window Counter
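Here is a minimal token bucket sketch in Python; the capacity and refill rate are illustrative assumptions.

```python
# Minimal sketch: token bucket -- allows short bursts, limits sustained rate.
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity          # maximum burst size
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should throttle (e.g. respond with HTTP 429)

bucket = TokenBucket(capacity=10, refill_per_sec=5)  # ~5 req/s, bursts of 10
```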
There are several locations, where rate-limiting can be implemented:
- API Gateway: requests are throttled at the entry point, but this incurs additional expense
- Backend/Service: requests are throttled at the service level, but this is complex and requires additional resources
- Client: requests are throttled at the user level, but this is unsafe and can be bypassed!
Load Balancing
As our number of users grows, rate limiting will not protect the service for very long: more users of our API will require more dedicated resources.
To serve a large number of users without degrading or overwhelming our service, we may initially continue to scale vertically (dedicating more resources, such as CPU, memory, and disk). Eventually, we will hit the limit of vertical scaling, where the service keeps degrading in performance and starts dropping requests.
In such cases, we can start scaling horizontally (i.e. increasing the number of available service instances). However, as each instance now has a unique identity that is difficult for the consumer to keep track of, we introduce a Load Balancer.
The responsibility of a load balancer is to intercept any incoming request to our API and redirect it to one of the instances of our service.
There are various approaches to how requests are distributed among the instances, such as:
- Round Robin: each new request goes to the next available instance in sequence
- Load Based: a new request is forwarded to the least busy instance
- Sticky Session: requests from a particular consumer are always forwarded to the same instance
- Quota: each new request is forwarded to the same instance until its quota (or queue) is full
There are numerous approaches to how load balancers work; further reading is recommended.
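As a toy illustration, the round-robin strategy above can be sketched in a few lines of Python; the instance addresses are illustrative.

```python
# Toy sketch: round-robin distribution over a fixed pool of instances.
from itertools import cycle

instances = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
ring = cycle(instances)

def route(request_id: int) -> str:
    target = next(ring)  # each request goes to the next instance in sequence
    return f"request {request_id} -> {target}"

for i in range(5):
    print(route(i))
```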
Additionally, a benefit of the load balancer is that the service can be scaled up (more instances added to the pool) or down (idle or least-utilized instances taken offline to save costs) without affecting incoming requests.
(Web Application) Firewall
As the service grows, despite our best intentions and precautions, our API will attract interest in the form of various security attacks and other incidents, or simply breach scanners and discovery bots.
One popular approach to mitigating such unintended requests or attention is a firewall. A firewall operates at the lower levels of the stack and blocks a large amount of traffic before it ever reaches our API. With intelligent traffic monitoring and behavioral features, a diverse set of firewall services is available; a Web Application Firewall (WAF) additionally inspects and filters traffic at the HTTP layer.
An additional benefit of a firewall is geo-blocking: if our API is only meant to be used by users in Germany, we can block any/all traffic originating from other locations, saving us considerable costs in traffic, processing, and mitigation effort.
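As a toy illustration of network-level blocking (a real firewall does this far below the application layer), here is a sketch using Python's ipaddress module; the blocked range is a hypothetical example.

```python
# Toy sketch: drop requests from blocked network ranges before processing.
from ipaddress import ip_address, ip_network

BLOCKED_RANGES = [ip_network("198.51.100.0/24")]  # hypothetical abusive range

def is_blocked(client_ip: str) -> bool:
    addr = ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)

print(is_blocked("198.51.100.42"))  # True  -> drop before it reaches the API
print(is_blocked("203.0.113.9"))    # False -> allow through
```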
Summary
While the sections described above are what come to mind with a little retrospection, they are by no means an exhaustive list. The scope of security is very broad and goes beyond my modest experience. However, if I were to expose an API on the public web, I would start by implementing everything in this post.
If you notice any inaccuracies, please leave a comment and I will fix them immediately. :)
One interesting reference for security topics is the OWASP project.