An application programming interface (API) defines how to interact with software. Software components interact and work together through APIs. An API is a software program’s contract that describes how other programs should communicate with it. It defines the publicly accessible functionality of the program, how that functionality is invoked, limitations on its use, and expectations when calling it.
An API can act as the front gate for its software. It can protect, log, filter, and approve communication between the software it fronts and external environments.
This section discusses the security of APIs.
Application Programming Interface Types
There are many different types of APIs. There are system-level APIs. There are programming-level APIs for packages or modules, with method or function and data structure specifications. These modules can be those of user or system libraries. An API can be the specification of remote calls available to its consumers, as with CORBA or RPC. Web-based APIs are based on HTTP/HTTPS. Web-based APIs exchange structured messages, usually in JavaScript Object Notation (JSON) or XML, often in the form of Simple Object Access Protocol (SOAP) structures. This section takes a general approach to describing API security and uses a REST-based model as the basis to explain concepts.
Risk Assessment
First, it is necessary to understand an API’s risk exposure. Each API should be subject to a risk assessment to determine the risk or residual risk it poses to the organization. An API risk assessment should take into account the data classification, the recovery objectives, and the real and total costs of downtime to the organization, among other things. Make a comprehensive list of the vulnerabilities and threats of the API. Focus on the appropriate risk response by implementing security capabilities to minimize the risks these vulnerabilities and threats pose to the software.
A threat model is useful in determining how an API can be compromised. One should think like an attacker when evaluating an API for its vulnerabilities and desirable targets to attack.
Security of APIs
An API’s risks determine what level of protection is appropriate for it. Despite the variety of APIs, they share many potential security capabilities. Designing security into an API is a necessity. There are a number of frameworks that offer security features that mitigate the top risks of APIs. Choose a framework that works with the chosen programming language and offers a comprehensive suite of security controls to mitigate the most critical security risks for your application. The following security control areas apply when securing APIs:
- Authentication and access control
- Input validation and sanitization
- Protection of resources
- Protecting communications
- Cryptography
- Security logging, monitoring, and alerting
Authentication and Access Control
Authentication verifies that the consumer calling the API is the entity that it claims to be. APIs protect sensitive data. Access to the software resources should be restricted to approved consumers. Authentication performs this activity.
There are many different authentication techniques. These include username- and password-based techniques, certificate-based authentication, OAuth, Security Assertion Markup Language (SAML), and others. Some authentication methods have different versions and implementations. Choose the authentication mechanism that best fits the usage patterns of the API. Enforce re-authentication for APIs that grant access to sensitive or privileged functionality.
Basic Authentication
Basic authentication is an authentication scheme that is built into the HTTP protocol. The username and password credentials are sent as a Base64-encoded string. Basic authentication is the simplest form of authentication and is on the weaker side of security, because Base64 is an encoding, not encryption. Basic authentication is not recommended when better options are available. If Basic authentication must be used, then it is recommended that effective communication encryption such as Transport Layer Security (TLS) is used as well.
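The weakness of Basic authentication can be illustrated with a minimal sketch. The function names below are hypothetical and chosen only for this example. The sketch builds an Authorization header value and then recovers the credentials from it without any key or secret, which is why an encrypted channel is needed.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str):
    """Recover (username, password) from a Basic header; no secret is required."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


# Anyone who can observe the header can read the credentials.
header = basic_auth_header("alice", "s3cret")
print(header)
print(decode_basic_auth(header))
```

Because decoding needs nothing but the header itself, the protection of the credentials rests entirely on the transport, such as TLS.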
Certificate-Based Authentication
Digital certificates can be used to identify an entity, such as a user, as part of an identity and authentication mechanism. Certificate-based authentication can be used by all endpoints such as users, servers, and the like. Distribution and management of digital certificates should be part of designing a certificate-based implementation.
Dynamic Tokens
Dynamic access tokens are typically tokens issued to users that assert that the bearer has a particular set of privileges for the lifetime of the token.
JWT
When software services are distributed across a network, communicating a user’s authentication and authorization status to the various systems is a challenge. The use of tokens to bear this and other information, collectively known as claims, on behalf of the user has been a common and useful strategy in this case. The JSON Web Token (JWT) is a type of dynamic token that uses JSON to describe these authentication and authorization claims in a standard format to be used across distributed systems. It is defined by the open standard RFC 7519 (https://tools.ietf.org/html/rfc7519).
An example of a claim that a JWT can make about a user is the user’s role. For instance, when a user is acting as an “administrator,” that role can be presented as part of the user’s credentials in a JWT.
JWT is designed to be easy to use. It also can be encrypted according to RFC 7516: JSON Web Encryption (https://tools.ietf.org/html/rfc7516) and signed according to RFC 7515: JSON Web Signature (https://tools.ietf.org/html/rfc7515).
OAuth
OAuth enables delegated authorization. It solves the problem of securely delegating access to a user across multiple resources. There are two OAuth specifications, OAuth 1.0a and OAuth 2.0. These two specifications are incompatible. OAuth 2.0 relies on HTTPS for its security. OAuth 1.0a has security built in to compensate for working with nonsecured protocols. Consider using JWT with OAuth.
Location-Based Access Control Restriction
API access can be restricted by the location of the API caller. This type of access restriction may be used where there are geopolitical or legal constraints on the use of the software, such as sanctions or export controls over data. Restricting access based on geographic location is not foolproof, however, because the caller can spoof the location.
Input Validation and Sanitization
Input validation and sanitization should be among the software security practitioner’s primary concerns. Because they accept data from the outside world, input channels expose software systems to potentially destructive or malicious data. Input validation and sanitization are controls against this exposure.
Input validation controls check for the safety and appropriateness of data. Input validation evaluates data for the correct data type, acceptable values, formatting, and particularly for the presence of any data that may be harmful or have unacceptable effects on the system. It also ensures that the input data is safe and secure for the system to accept and prohibits bad input from passing through.
Input sanitization takes this one step further by transforming the input data into a form that neutralizes the potential harm that it could do once accepted into the system. An example of input sanitization is to remove HTML markup or scripting text that may be embedded in input so as to prevent this disallowed content from becoming rendered as a part of subsequent pages that use this input. This is an active control to prevent injection attacks of all kinds.
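As a minimal sketch of sanitization, the example below neutralizes embedded markup by escaping it rather than removing it, which is a related but different technique from the stripping described above. It uses html.escape from the Python standard library; the function name sanitize_comment is hypothetical.

```python
import html


def sanitize_comment(text: str) -> str:
    """Escape HTML metacharacters so the text renders as data, not markup."""
    return html.escape(text, quote=True)


untrusted = "<script>alert('x')</script>"
print(sanitize_comment(untrusted))
```

Escaping preserves what the user typed while preventing a browser from interpreting it as script. The right transformation depends on where the data is ultimately used, so escaping for HTML does not protect a SQL query or a shell command.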
The API layer is a place to conduct input validation and malicious code injection inspections. An API is the front gate between external entities and the logic. As such, the API is the most important location to perform input validation. However, the software architecture of an API dictates where these controls should be imposed.
It is important to clarify what this means precisely. Often with API calls, there is a client side and a server side. The client side of the API is how the API presents itself to its consumers. This client-side presentation can be any of a number of different representations, such as an HTML web page form or a thick client. Validating and sanitizing input on the client side acts as a convenience for the user by directing them to input correct data. However, it cannot control data that is sent directly to the server side of the API. Thus, it is important for the security practitioner to assume that client-side API input validation and sanitization is insufficient.
It is not exaggerating to state that input validation and sanitization controls should be imposed on every input method to the server side of an API. The server side of the client/server model of an API often can be accessed directly without the client. When the server-side API can be directly accessed, the security practitioner needs to emphasize the importance of imposing input validation and sanitization controls on the inputs accepted by the server-side API.
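A server-side validation step might look like the following sketch. The request shape (a username and an age), the limits, and all names are assumptions made for illustration. The point is that the server checks type, format, and range itself and rejects anything unexpected, regardless of what a client claims to have validated.

```python
import re
from dataclasses import dataclass

# Allow-list pattern: only letters, digits, and underscores, 3 to 32 characters.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")
ALLOWED_FIELDS = {"username", "age"}


@dataclass(frozen=True)
class CreateUserRequest:
    username: str
    age: int


def validate_create_user(payload: object) -> CreateUserRequest:
    """Validate a decoded JSON body; raise ValueError on any violation."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    unexpected = set(payload) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    username = payload.get("username")
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        raise ValueError("username must be 3-32 letters, digits, or underscores")
    age = payload.get("age")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        raise ValueError("age must be an integer between 0 and 150")
    return CreateUserRequest(username=username, age=age)


print(validate_create_user({"username": "alice_01", "age": 30}))
```

Using an allow-list of accepted patterns, rather than a deny-list of known bad values, means that input the designer did not anticipate is rejected by default.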
Protection of Resources
The API protects its program from the external calling environment. Although good software environment security surrounds and protects the software with layers of security, the software itself needs to maintain its self-protective defensive posture. Specifically related to the software itself, a program’s API is its first line of defense against hackers.
An API’s design should include protecting software and its resources. These protection considerations should include the prevention of invalid input and of undesirable data loss, exposure, leakage, destruction, modification, or similar. An API also serves to hide the internal state and data of its program. While doing so, an API must maintain the availability of program services to its callers.
Protecting Communications
For APIs that have touchpoints with networks (internal to an organization or external on the Internet), sensitive data should be protected while in transit. A network-connected API should require encrypted communication, such as TLS.
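As a minimal sketch of requiring encrypted communication from the calling side, the example below configures a client TLS context with Python’s standard ssl module. The function name is hypothetical, and the TLS 1.2 floor is an assumption; the appropriate minimum version depends on the organization’s policy.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client-side TLS context with certificate and hostname checks."""
    # create_default_context() enables certificate verification and
    # hostname checking for client connections.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2 (assumed policy floor).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

A context like this can be passed to standard library clients, for example through the context parameter of urllib.request.urlopen, so that connections to the API fail rather than fall back to an unverified or outdated channel.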
Determining which, and to what degree, data protection mechanisms will be employed with the API depends upon its risk assessment. The higher the risk, the more likely it makes sense to use data protection mechanisms.
Cryptography
Cryptographic methods use a number of mechanisms to support the confidentiality, integrity, and availability of API message passing. Cryptographic-related mechanisms essential to API security include encrypted communications, hashing for message integrity, and the use of API keys.
Confidentiality of the message is supported by using encrypted communication protocols such as HTTPS. With HTTPS it is important to consider the role and use of certificates for confidentiality in setting up the encrypted communication channel as well as how they are used to authenticate trusted parties.
Hashing methods such as Message Authentication Codes (MACs) support message integrity and can be used to verify the integrity of authorization tokens passed between systems. This is particularly important for authorization tokens such as JWTs, which are used for access control decisions.
API keys are used to support the confidentiality and availability of API services. An API key is a secret key used to authenticate and authorize a client to use an API. Using an API key in this manner allows the API service to base access control on the presence of the key in the API request. Allowing or denying access to clients can be based on the key they use. Also, the use of an API key can reduce the impact of denial-of-service attacks by using the key as a mechanism to manage the volume of requests to the API service. Every request must carry an API key, and requests without one are denied. Too many requests from the same API key can be remediated by network load controls and use of the HTTP 429 status of “Too Many Requests.”
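The key check and request throttling described above can be sketched as follows. This is an illustrative in-memory model, not a production design: the class name, the sliding-window policy, and the use of 401 for a missing or unknown key are assumptions. Only the 429 status comes from the text.

```python
import hashlib
import time
from collections import defaultdict, deque


class ApiKeyGate:
    """Admit requests by API key and throttle each key with a sliding window."""

    def __init__(self, valid_keys, max_requests, window_seconds, clock=time.monotonic):
        # Store digests rather than the raw keys, so the gate holds no secrets.
        self._known = {self._digest(key) for key in valid_keys}
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock
        self._history = defaultdict(deque)

    @staticmethod
    def _digest(key: str) -> str:
        return hashlib.sha256(key.encode("utf-8")).hexdigest()

    def check(self, api_key) -> int:
        """Return the HTTP status the API would send for this request."""
        if not isinstance(api_key, str):
            return 401  # no key presented
        digest = self._digest(api_key)
        if digest not in self._known:
            return 401  # unknown key
        now = self._clock()
        recent = self._history[digest]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max_requests:
            return 429  # Too Many Requests
        recent.append(now)
        return 200


gate = ApiKeyGate({"demo-key"}, max_requests=2, window_seconds=60)
print([gate.check("demo-key") for _ in range(3)], gate.check(None))
```

Injecting the clock keeps the throttle testable. In a deployed service this state would usually live in a shared store or an API gateway so that all instances enforce the same limit.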
Security Logging, Monitoring, and Alerting
All activities at the various API layers offer valuable information about the callers’ and callees’ environments at the times of interaction. These series of events can be correlated to identify patterns of behavior or activity that reveal attempted attacks or attacks in progress.
Security logging must be part of the API logging. Rank and prioritize the events. These events should then be mapped to appropriate log streams and security event processing endpoints.
Some API event logs related to security are authentication failures, denied authorization, input validation failures, session management events, application errors, and the like. Follow your organization’s standard guidelines for the baseline requirements for security event logging.
The best logging is human readable, meaningful, and machine parseable. Events should be logged to the organization’s standard log collector mechanisms.
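One way to get log entries that are both human readable and machine parseable is to emit each event as a single JSON object. The sketch below uses Python’s standard logging module; the formatter class, the logger name, the "security" attribute, and the field names are assumptions for illustration, and an organization’s logging standard would define the real schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Structured details are passed with extra={"security": {...}}.
        entry.update(getattr(record, "security", {}))
        return json.dumps(entry, sort_keys=True)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
security_log = logging.getLogger("api.security")  # hypothetical logger name
security_log.addHandler(handler)
security_log.setLevel(logging.INFO)

# Log the event type and context, never the submitted password or token.
security_log.warning(
    "authentication_failure",
    extra={"security": {"client_ip": "203.0.113.7", "endpoint": "/login"}},
)
```

Keeping the event name stable and placing the variable details in separate fields makes the stream easy to filter and correlate downstream.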
This event stream should be monitored for security events. When attack patterns are discovered in log analysis, they should trigger a response such as a status change or an alarm. Rapid notification of security events is a capability that contributes to reducing the time to respond.
Aggregating these event logs and sending them to a SIEM to correlate the API events with other events occurring in the system is an important means to identify, detect, and alert on security events.
Information for the secure coding practitioner: some logging frameworks include functionality that maps log events to additional data or actions.
See “The 10 Commandments of Logging” (www.masterzen.fr/2013/01/13/the-10-commandments-of-logging/) for more information.
Security Testing APIs
Testing should be performed to validate the behavior of an API and verify the security and performance characteristics of the API as well. Testing should be comprehensive. It should include a variety of different calling subjects, from the different applicable kinds of endpoints that would call the API, not just web browsers.
A variety of test methods should be employed to see how APIs handle unexpected results.
A comprehensive API testing strategy should consider different degrees of knowledge about the API. Black-box testing, a form of testing that assumes no knowledge of its test targets, is used to test the API as it would usually be called, without knowledge of its internal software environment. White-box testing is done with an intimate knowledge of the API and its software. White-box testing is done because it is best to assume, and assure, that the API can defend against attacks that have the advantage of full knowledge of the software. White-box testing also provides rapid assessment of potential weaknesses in the software.
Fuzzing, also known as fuzz testing, is a testing technique that involves sending large amounts of random data, known as fuzz, to an API with the intention of breaking the software. Fuzz testing an API can discover susceptibility to denial of service, program crashes, failing code assertions, and potential memory leaks. Smart fuzzing is used during white-box testing to lower the amount of data and speed up the fuzzing process. Fuzzing generally tries to find and manipulate edge case test data to identify differences in application behavior.
Monkey testing is also a form of testing that intentionally focuses on breaking the software. Monkey testing differs from fuzz testing in that its techniques focus on random actions to break the API. Smart monkey testing is a particularly useful API testing technique because it is monkey testing based on knowledge of the system.
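The idea behind fuzzing can be shown with a very small harness. Everything here is hypothetical: parse_quantity stands in for an API handler, the edge-case list is an assumed seed corpus, and a real fuzzing effort would normally use a dedicated tool. The harness treats a ValueError as a documented rejection and reports any other exception as a finding.

```python
import random

# Assumed seed corpus of edge cases that are tried before random data.
EDGE_CASES = ["", "0", "-1", "1", "100", "101", " 7 ", "9" * 50, "1e3", "\x00"]


def parse_quantity(text: str) -> int:
    """Hypothetical handler under test: accept a quantity from 1 to 100."""
    value = int(text)
    if not 1 <= value <= 100:
        raise ValueError("quantity out of range")
    return value


def fuzz(target, iterations=1000, seed=0):
    """Call target with edge cases and random strings; collect unexpected errors."""
    rng = random.Random(seed)  # seeded so any finding can be reproduced
    inputs = list(EDGE_CASES)
    for _ in range(iterations):
        length = rng.randint(0, 20)
        inputs.append("".join(chr(rng.randint(0, 0x2FF)) for _ in range(length)))
    failures = []
    for data in inputs:
        try:
            target(data)
        except ValueError:
            pass  # the documented way for the handler to reject input
        except Exception as exc:  # anything else is a finding
            failures.append((data, exc))
    return failures


print(len(fuzz(parse_quantity)), "unexpected failures")
```

Recording the seed and the failing input makes each finding reproducible, which matters when the result is handed to a developer to fix.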
API Design Advantages
Good API design improves security. It defines the contract of how software is to be called and used. An API can also serve as a layer to control access to the software’s business logic, its data, and resources. An API promotes good software design by protecting the internal logic, data, and state from the outside environment.
Declare Software Contracts
An API is the accessible formal contract of a software program that describes how and by what rules other programs should communicate with it. APIs define the publicly accessible functionality of the program and how that functionality is invoked.
Along with describing the rules and specifications of communication, the API also serves as a protective boundary for the software.
Security Control Layering
Because an API acts as a gate at trust boundaries between the software’s internal environment and the software consumers, it is a natural place to apply security controls.
An API is naturally an access management function. This is a place to ensure only properly authenticated and authorized software users are allowed access to the software resources that the API protects. As a security practitioner, you should also consider whether session management is an appropriate control for the type of API and software functionality.
An API is an excellent place to observe and record event activity. Logging and monitoring are essential to a secure API.
Input validation and sanitization should be done at the API. All input into the software should be inspected and made safe for consumption by the internal software environment or disposed through an error handling process. Similarly, all data output through the API should be inspected for safety and security purposes by evaluating the content, format, and encoding. These input and output controls are an important part of a defense-in-depth approach to protecting the software environment from abuse.
Layering security controls at an API is a smart secure software development practice.
An API offers a natural point of inspection, management, and control for many safeguards because of its position at a trust boundary. An API also improves software security by promoting good design.
Promote Good Design
An API defines how the software can be called and used by its users and consumers without revealing the implementation details. Because an API defines the contract of how the software is called, the API should rarely change, if at all. Consistency in an API over time is necessary for the stability of systems that are built with dependencies upon the API.
An API is a layer of opportunity for the software designer to consider how to design the underlying software.
Because an API both separates and contains its associated software, it can significantly shape the software’s design. Designing software through the lens of its API improves software modularity. Modularity in software design promotes a concentration of business logic and improves reuse. These qualities derive from how an API naturally decouples program logic, separating the code from the API. This in turn encourages cohesion, which is the grouping of related program logic into units.
Modularity
The structure of the API, such as the function or method calls it supports, lends itself to modularity in the design of the software. Each API function can map to one or more modules that provide its business logic. In this way, an API defines the building blocks of a software system. Designing software in a modular fashion improves the quality and maintainability of the software. Modular software also supports the security of the software, as security mechanisms can be designed and built into the resulting modular building blocks.
Code Decoupling
Code decoupling, or loose coupling, is a design quality that minimizes implementation dependencies between software libraries and modules. An unchanging or slow-changing API, combined with modularly designed software, supports the decoupling of the code from the API. Code that is decoupled from the API can be maintained or changed without the API changing. The software can change without its consumers knowing.
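Decoupling can be illustrated with a small sketch in which the caller depends only on an interface. All names are hypothetical. The UserDirectory protocol plays the role of the stable API, and two interchangeable implementations sit behind it.

```python
from typing import Dict, Optional, Protocol


class UserDirectory(Protocol):
    """The stable contract that consumers depend on."""

    def email_for(self, user_id: str) -> Optional[str]: ...


class InMemoryDirectory:
    """First implementation: exact-match lookup in a dictionary."""

    def __init__(self, records: Dict[str, str]) -> None:
        self._records = dict(records)

    def email_for(self, user_id: str) -> Optional[str]:
        return self._records.get(user_id)


class CaseInsensitiveDirectory:
    """Replacement implementation: same contract, different internals."""

    def __init__(self, records: Dict[str, str]) -> None:
        self._records = {key.lower(): value for key, value in records.items()}

    def email_for(self, user_id: str) -> Optional[str]:
        return self._records.get(user_id.lower())


def contact_line(directory: UserDirectory, user_id: str) -> str:
    """Consumer code: written against the contract, not an implementation."""
    email = directory.email_for(user_id)
    return email if email is not None else "unknown"


print(contact_line(InMemoryDirectory({"u1": "a@example.com"}), "u1"))
print(contact_line(CaseInsensitiveDirectory({"U1": "a@example.com"}), "u1"))
```

Because contact_line never names a concrete class, the implementation behind the contract can be swapped or hardened without any change to the consumer.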
Cohesion
In software, cohesion is the design quality that describes the extent to which code logic belongs together. The more similar the software logic is, the more likely it will be a design benefit to associate or package that software together. APIs support cohesion. Using the API as design guidance to implement software in a modular fashion promotes cohesion of the business logic. Similar business logic, supporting the same or similar APIs, is typically packaged together in software modules. Cohesion like this improves software readability, maintainability, and changeability, and thus overall it improves the quality of the software, leading to improved security.