NOTE: Yuck is in its planning phase at the moment. No code exists, only this document. Feedback on this document is welcome, via normal Ick channels. Ick will continue to use Qvisqve for the time being, until Yuck is ready to replace it.
Introduction
Yuck is an identity provider that allows end users to securely authenticate themselves to web sites and applications. Yuck also allows users to authorize applications to act on their behalf. Yuck supports the OAuth2 and OpenID Connect protocols, and has an API to allow storing and managing data about end users, applications, and other entities related to authentication.
Yuck does not provide any services unrelated to authentication. Other services can work with Yuck to control access to them.
OpenID Connect (OIDC) is a protocol suitable for interactively authenticating a person (the end user). OAuth2 is suitable for non-interactive API clients, possibly ones acting on behalf of the end user.
Both OAuth2 and OpenID Connect provide a number of variants and extensions. Yuck implements the "client credentials grant" for OAuth2, and the "authorization code flow" for OIDC.
Yuck has an extensible architecture for supporting different ways for users to authenticate, and for optionally using multiple authentication factors. Initially it will implement traditional passwords and time-based one-time passwords (TOTP, same as "Google Authenticator").
The Yuck architecture supports different ways for storing the data and credentials it needs. Initially it comes with support for using the Muck JSON store, but support for, say, LDAP can be added.
Terminology and concepts
access token: a token which grants access to a service or resource; usually short-lived, but see refresh token
API client: a program that uses the API, either on behalf of an end-user, or on its own behalf
application: software that provides a service using the RP
authenticate: prove the identity of someone or something; "this is how you know I am who I say I am"; authentication can happen in any number of ways, and different relying parties may have different requirements: government ID; being able to read email sent to an email address; knowing a secret; possessing a unique thing; acting in a particular way; having particular body features (fingerprint, face, voice, hand shape, ...); etc.; the list is almost endless
authorize: grant access to an authenticated entity; "what are they allowed to do?"
end-user: a human using the system, typically the reason the system exists, can also be a subject
front end: provides the user interface to an end user via the user agent or browser; typically provides HTML, JS, CSS, and images, statically or generated dynamically, but could be audio, video, or anything else the user can interact with
IDP: short for identity provider
identify: claim an identity; "this is who I say I am"
identity: who a human is, or which instance of a program is
identity provider: software that authenticates end users and non-human entities, and also stores authorizations for them
JWT: a standard way to represent tokens, see JWT; Yuck will use digitally signed tokens
OAuth2: a protocol for authenticating software; see OAuth2
OIDC: short for OpenID Connect; a protocol for authenticating end users; see OIDC
refresh token: a token that can be used to get a new access token; usually long-lived, but can be revoked
relying party: software that relies on the IDP for authentication and authorization; often a resource provider, but can also do things on request instead of merely storing things
resource: data stored by a resource provider
resource provider: stores resources and allows authorized access to them; "database"
RP: short for relying party or resource provider
subject: a person whose personal information is handled by the system, see end-user
user agent: typically a web browser, but can be a mobile or desktop application; assumed to be under complete user control, and so trusted by the user, but not the ecosystem
Requirements
Yuck has at least the following high level requirements.
In this section, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119.
Each requirement and sub-requirement is given a unique name for easier reference in discussions.
- (SECURE) Yuck MUST be secure.
- (CREDSTORE) Yuck MUST store credentials in a way that minimises damage if they leak. Credentials SHOULD be stored hashed using a respected key derivation function (such as scrypt) with per-credential salting. Something stronger may be implemented instead.
- (MFA) Yuck MUST support multi-factor authentication using secure factors.
- (PROTOS) Yuck MUST use secure protocols to authenticate users and API clients.
- (HTTPS) Yuck MUST NOT ever use plain HTTP, only HTTPS.
- (AUDIT) Yuck SHOULD undergo security audits, and general scrutiny. Audits SHOULD happen regularly. (This is not an absolute requirement, as it depends on the availability of competent auditors. Yuck is not a for-profit project, and may not be able to pay them.)
- (SECUREANDUSABLE) The Yuck developers MUST keep security at the highest priority, without sacrificing usability.
- (QUALITY) The Yuck project MUST aim for high quality, by applying development methods that are known to work for achieving quality, such as test-driven development, automated test suites with high test coverage, and code review.
- (HSCALABLE) The Yuck architecture MUST be horizontally scalable to very large numbers of concurrent users and API clients.
- (NOTUNSCALABLE) The implementation might not scale to very many users or concurrent users, especially initially, but the architecture MUST NOT prevent a scalable implementation.
- (ADMINFRIENDLY) Yuck MUST be flexible for system administrators to manage, and for applications to use.
- (ADMINAPIS) Yuck SHOULD provide APIs for managing the entities and data it needs, such as for creating end users and API clients, or changing their credentials.
- (APPFRIENDLY) Yuck SHOULD enable applications to delegate all authentication to Yuck.
- (FREEDOM) Yuck MUST be free software. It MUST NOT require applications, API clients, and other software that works with Yuck to be free software.
- (PRIVACYSTORE) Yuck MUST NOT store personal information it does not need.
- (PRIVACYLEAK) Yuck MUST NOT leak personal information.
Architecture: the ecosystem
An IDP interacts with several other systems to enable end users to do their thing. The RP provides the actual service, and delegates authentication to the IDP. There can be other services in front of the RP, and for security reasons there has to be at least one for end-user authentication.
The end user interacts directly with their web browser or other user agent, which is assumed to be entirely under their control, and thus not trusted by the IDP or other components. The end user is assumed to trust what they use.
The browser talks to the facade (to get the HTML and JS and other files to present a UI to the user), and the IDP (to allow the user to authenticate themselves).
The facade holds the access token on behalf of an authenticated end user.
The facade talks to a backend, giving it the user's access token as proof of authentication and authorization.
The backend provides an API suitable for the service it provides. It also allows access based on the access token.
The resource provider stores data for the backend. It also allows access based on the access token.
Some access is not interactive by the end user, but by API clients that either act on behalf of the user, or are unrelated to them in any way. The end user can authorize an API client to access resources on their behalf. The authorization can limit the API client's access to a subset of what the end user can do. For example, if the end user can both read and write a resource, the authorization might only allow the API client to read the resource.
API clients that are unrelated to the user are authorized by the owners of the RP. See below for an example.
Authentication scenarios
As examples of how an authentication server might be used, consider an online banking system. It should support at least three scenarios.
End user interactively accesses their account: The end user opens up the bank web page, and logs in, and can interactively do whatever they're allowed to do: view their bank statement, transfer money, etc.
End user authorizes an API client: The end user, who happens to be a Unix sysadmin, might want to automatically retrieve their bank statement and feed it to their accounting system. They create an authorization for an API client that only allows it to retrieve the statement, but not do anything else. This creates, in the IDP, a new API client identity, which is tied to the end user's identity, so that whatever the API client does, it is known to act on behalf of the end user.
Bank pays interest automatically: The bank runs an API client, authorized by the bank to act autonomously and without end user authorization, which annually transfers interest from the bank's own account to each end user's account.
Obviously, a real bank would need a lot more scenarios, but these will do for discussing Yuck.
Data model
Yuck needs to store data about end users, applications, and API clients. It models the data as a set of "resources", which can be represented as JSON objects. Initially, Yuck will store the JSON objects in Muck, which is a dedicated JSON object store, but Yuck will be able to support any store that supports the following:
- an object can be created and assigned a unique ID and revision
- an object can be updated, with collision prevention using the revision (updater gives the revision of what they think is the newest revision; the store will fail the update if it isn't)
- an object can be retrieved, given the ID
- an object can be deleted, given the ID
- objects can be searched for, based on any field defined below, using case-independent equality or comparison to a pattern
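The store operations listed above could be sketched as follows. This is only an illustration under stated assumptions: the `MemoryStore` class, its method names, and the use of random hex strings for IDs and revisions are hypothetical, not Muck's actual API, and pattern search is left out for brevity.

```python
import uuid


class RevisionConflict(Exception):
    """Raised when an update names a revision that is no longer the newest."""


class MemoryStore:
    """Hypothetical in-memory stand-in for a Muck-like JSON object store."""

    def __init__(self):
        self._objects = {}  # object ID -> (revision, JSON-like dict)

    def create(self, obj):
        # The store, not the caller, chooses the ID and the first revision.
        obj_id = uuid.uuid4().hex
        rev = uuid.uuid4().hex
        self._objects[obj_id] = (rev, dict(obj))
        return obj_id, rev

    def get(self, obj_id):
        rev, obj = self._objects[obj_id]
        return rev, dict(obj)

    def update(self, obj_id, rev, obj):
        # Collision prevention: the caller must name the newest revision.
        current_rev, _ = self._objects[obj_id]
        if rev != current_rev:
            raise RevisionConflict(obj_id)
        new_rev = uuid.uuid4().hex
        self._objects[obj_id] = (new_rev, dict(obj))
        return new_rev

    def delete(self, obj_id):
        del self._objects[obj_id]

    def search(self, field, value):
        """Return IDs of objects whose string field equals value, ignoring case."""
        wanted = value.casefold()
        return [
            obj_id
            for obj_id, (_, obj) in self._objects.items()
            if isinstance(obj.get(field), str) and obj[field].casefold() == wanted
        ]
```

An updater that holds a stale revision gets `RevisionConflict` and is expected to fetch the object again before retrying.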
A user
A user resource represents the user. Its object ID, not a username, is used to identify users in the ecosystem. The object ID is unique, never changes, and is chosen by Yuck. Ideally it is never shown to the user, and is only used to reference the user internally.
The user resource stores the following data:
- allowed_scopes: (a list of strings) the scopes the user is allowed to have
Note that the user object does not store usernames or credentials in any way. A user may have any number of credentials, for multi-factor authentication. When a user is being authenticated, they must provide all their credentials.
A username
A username resource stores one name by which the user is identified to the system. As far as Yuck is concerned, a user may have any number of usernames, and they can change. The username is user-visible, and chosen by the user. Usernames need to be unique.
- user_ref: (a string) ID of the user resource for the user
- username: (a string) a username for the user
Yuck stores as little about a user as possible. For example, it does not store the full name, or any contact information. The applications may store that separately.
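As a rough illustration of how user and username resources relate, the sketch below resolves a username to the ID of its user resource. The sample IDs and the `find_user_id` helper are hypothetical, and comparing usernames case-independently is an assumption borrowed from the store's search requirement, not something this document specifies for usernames.

```python
def find_user_id(username_resources, name):
    """Return the user_ref for a username, or None if it is not known."""
    wanted = name.casefold()
    for resource in username_resources:
        if resource["username"].casefold() == wanted:
            return resource["user_ref"]
    return None


# Two usernames point at the same user resource; the user ID never changes
# even if the usernames do.
usernames = [
    {"user_ref": "user-0001", "username": "alice"},
    {"user_ref": "user-0001", "username": "alice@example.com"},
    {"user_ref": "user-0002", "username": "bob"},
]
```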
An OAuth2 API client
For OAuth2 API clients, the following data is stored:
- user_ref: (a string, or null) ID of the user resource for the user on behalf of whom the API client acts, if any
- allowed_scopes: (a list of strings) the scopes the API client is allowed to have
Note that an API client may act on behalf of a user, but does not need to do so. If user_ref is set to a non-empty string, it is acting on behalf of a user, and this will cause any access tokens the API client gets to have the sub claim set to the user's ID.
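The rule for the sub claim could look roughly like the sketch below. The `token_claims` helper is hypothetical. Representing the scope claim as a space-separated string, and silently dropping requested scopes that are not allowed (rather than rejecting the request), are assumptions made for the example.

```python
def token_claims(client, requested_scopes):
    """Build the claims for an access token issued to an API client resource."""
    allowed = set(client["allowed_scopes"])
    # Assumption: scopes that are not allowed are dropped, not treated as an error.
    granted = [scope for scope in requested_scopes if scope in allowed]
    claims = {"scope": " ".join(granted)}
    if client.get("user_ref"):
        # A non-empty user_ref means the client acts on behalf of that user.
        claims["sub"] = client["user_ref"]
    return claims
```

An autonomous client, with user_ref set to null, gets a token without a sub claim in this sketch.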
An OIDC application front end
For OIDC application front ends, the following data is stored:
- allowed_scopes: (a list of strings) the scopes the API client is allowed to have
- callbacks: (a list of strings) the callback URIs for the application
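One way the callbacks field might be used is to reject authentication requests whose redirect URI was not registered for the application. The sketch below assumes exact string comparison, which is a common recommendation for OAuth2 and OIDC but is not stated in this document. The helper name and sample resource are hypothetical.

```python
def is_registered_callback(application, redirect_uri):
    """Return True only if redirect_uri exactly matches a registered callback."""
    return redirect_uri in application["callbacks"]


app = {
    "allowed_scopes": ["openid"],
    "callbacks": ["https://app.example.com/callback"],
}
```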
A password credential for scrypt
For password based authentication for users, API clients, and application front ends, Yuck will store the following data:
- user_ref: (a string, or null) ID of the user resource for the user, if any
- client_ref: (a string, or null) ID of the resource for the API client, if any
- hash: (a string) password hashed using scrypt, encoded as hexadecimal
- salt: (a string) randomly chosen string to salt the hash, encoded as hexadecimal
- key_len: (an integer) used for scrypt
- N: (an integer) used for scrypt
- r: (an integer) used for scrypt
- p: (an integer) used for scrypt
Note that Yuck will require exactly one of user_ref and client_ref to be set to a non-empty string, and the other one to be null.
The key_len, N, r, and p fields are the scrypt parameters. They are stored so that they can later be varied without making previously stored passwords invalid.
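A minimal sketch of creating and checking such a credential with Python's standard `hashlib.scrypt` follows. The function names, the 16-byte salt, and the default parameter values are illustrative assumptions, not values chosen by this document.

```python
import hashlib
import hmac
import os


def make_password_credential(password, user_ref=None, client_ref=None,
                             key_len=32, N=2**14, r=8, p=1):
    """Return a password credential resource as a JSON-like dict."""
    # Exactly one of user_ref and client_ref may be set; the other stays null.
    if bool(user_ref) == bool(client_ref):
        raise ValueError("exactly one of user_ref and client_ref must be set")
    salt = os.urandom(16)  # fresh random salt for every credential
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=N, r=r, p=p, dklen=key_len)
    return {
        "user_ref": user_ref, "client_ref": client_ref,
        "hash": digest.hex(), "salt": salt.hex(),
        "key_len": key_len, "N": N, "r": r, "p": p,
    }


def verify_password(credential, password):
    """Recompute the hash with the stored parameters and compare."""
    digest = hashlib.scrypt(password.encode("utf-8"),
                            salt=bytes.fromhex(credential["salt"]),
                            n=credential["N"], r=credential["r"],
                            p=credential["p"], dklen=credential["key_len"])
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(digest.hex(), credential["hash"])
```

Because verification reads the parameters from the stored credential, new credentials can use stronger parameters while old ones keep working.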
A TOTP credential for a user
Yuck stores the TOTP credential for a user as follows:
- user_ref: (a string) ID of the user resource for the user
- the rest to be determined, when TOTP is implemented
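For orientation, the sketch below computes a time-based one-time password as specified in RFC 6238, using HMAC-SHA1. It says nothing about how Yuck will store the shared secret, which is still to be determined. The function name and its defaults (30-second step, 6 digits) are assumptions that match common authenticator apps.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Compute the RFC 6238 TOTP code for a shared secret at a given time."""
    counter = int(unix_time) // step  # number of time steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)
```

A verifier would typically compare the submitted code against the codes for the current time step and possibly its neighbours, to tolerate clock drift.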
External interfaces of Yuck
Yuck provides the following interfaces to the rest of the ecosystem:
- endpoints for managing users, API clients, OIDC application frontends, including their credentials
- an endpoint for OAuth2 API clients to get tokens using the client credentials grant
- endpoints for OIDC frontends to use for interactively authenticating the end users, and for getting the resulting tokens (including refreshed tokens)
- an endpoint for monitoring the health of Yuck
Details will be specified later.
Authentication protocols
This chapter will walk through each of the protocols Yuck supports, down to sample HTTP requests and responses.
Authorization information
Overview of how authorization happens in the ecosystem:
- The IDP keeps track of what each end user and API client is authorized to do. This is encoded by storing a list of "scopes". A scope is a permission to do something, such as "create a resource" or "update a resource the end user owns". See allowed_scopes in the user and API client resources.
- The access token identifies the end user. The token grants permission to its bearer to do specific actions, encoded as a list of scopes. Note that an access token need not have all the allowed scopes.
- The API provider actually implements the access control checks based on the access token and its contents. The API provider implements specific actions, and associates each with a scope, and checks that the token has that scope.
For example, assume that Alice is authorized for the actions "create resource" and "read resource owned by the user"; her allowed_scopes has the scopes create and read.
Alice creates an API client, but only allows it the read scope. When the API client gets an access token, it will have the sub claim set to alice, and the scope claim set to read. With such an access token, the API client can read any resources that Alice can read, but can't create new resources.
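The check an API provider makes for the Alice example could be sketched like this. It assumes the token has already been verified and decoded into a dict of claims, and that the scope claim is a space-separated string. The `token_allows` helper is hypothetical.

```python
def token_allows(claims, required_scope):
    """Return True if the decoded token claims include the required scope."""
    return required_scope in claims.get("scope", "").split()


# Claims of the access token issued to the API client Alice created.
alice_client_token = {"sub": "alice", "scope": "read"}
```

With these claims, a "read resource" action would be allowed and a "create resource" action refused, even though Alice herself is allowed both.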
OAuth2 for autonomous API clients
- walkthrough of an API client getting tokens via OAuth2 CC
- and using them
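Until the walkthrough is written, the sketch below shows what a client credentials grant token request generally looks like under RFC 6749, section 4.4. The token endpoint URL, client ID, and secret are placeholders, and authenticating the client with HTTP Basic is an assumption: Yuck's actual endpoint details are not yet specified. The request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scopes):
    """Build (but do not send) an OAuth2 client credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": " ".join(scopes),
    }).encode("ascii")
    # Assumption: the client authenticates itself with HTTP Basic auth.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8"))
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + basic.decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

The resulting access token would then be sent to the relying party as a bearer token in the Authorization header of each API request.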
OIDC for interactive end users
- walkthrough of an end-user causing facade to get tokens
- and facade using them
- web sessions
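As a placeholder for the walkthrough, the first step of the OIDC authorization code flow is for the facade to redirect the browser to the IDP with a request like the one built below. The endpoint URL and helper name are hypothetical. The query parameters are the standard ones from OpenID Connect Core, and the random state value is what the facade later checks on the callback to guard against cross-site request forgery.

```python
import secrets
import urllib.parse


def build_authorization_url(auth_endpoint, client_id, redirect_uri, scopes):
    """Return the URL to redirect the browser to, and the state to remember."""
    state = secrets.token_urlsafe(16)  # unguessable value tied to the web session
    query = urllib.parse.urlencode({
        "response_type": "code",  # selects the authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(["openid", *scopes]),  # "openid" marks an OIDC request
        "state": state,
    })
    return f"{auth_endpoint}?{query}", state
```

After the end user authenticates, the IDP redirects the browser back to the redirect URI with a code, which the facade exchanges for tokens.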
End users authorizing API clients
- walkthrough
Architecture: Yuck itself
The diagram above doesn't include parts of the ecosystem that are not part of Yuck or don't directly interact with Yuck.
Yuck consists of three sets of endpoints, and a data store. The endpoints implement the external interfaces for the authentication protocols, and for administration. The data store stores JSON objects.
An API client acting on behalf of an administrator will use the Yuck admin endpoints to manage users, API clients, and OIDC applications. An application frontend may provide a user interface for doing the same.
Note that the various Yuck endpoints and the processes implementing them do not need to interact except via the data store. This enables horizontal scalability to the extent the data store scales.
(It may be more sensible to have the application backend provide an interface for admin actions. It will still need to use the Yuck admin endpoints for doing that. This possibility has been left out of the diagram to avoid clutter.)
The data store
The data store will initially be Muck, which has a RESTful HTTP API for managing JSON objects. The API uses the kind of JWT access tokens for access control that Yuck creates. Yuck can create the tokens for its own use.
Later, support for other data stores can be added. LDAP is probably going to be desired. This can be done by implementing a new component that provides a Muck-like interface, but stores the data in LDAP. Similarly, support can be added for SQL databases, etc.