Basic Encoding Rules
Created:
2017-11-10
Updated:
2017-11-10
The Basic Encoding Rules for ASN.1 (BER) give one or more ways to represent any ASN.1 value as an octets sequence.
There are three methods to encode an ASN.1 value under BER, the choice of which depends on the type of value and whether the length of the value is known. The three methods are primitive definite-length, constructed definite-length and constructed indefinite-length.
Simple non-string types employ the primitive definite-length method; structured types employ either of the constructed methods; and simple string types employ any of the methods, depending on whether the length of the value is known.
Types derived by implicit tagging employ the method of the underlying type and types derived by explicit tagging employ the constructed methods.
In each method, the BER encoding has three or four parts:
- Identifier octets. These identify the class and tag number of the ASN.1 value, and indicate whether the method is primitive or constructed.
- Length octets. For the definite-length methods, these give the number of contents octets. For the constructed, indefinite-length method, these indicate that the length is indefinite.
- Contents octets. For the primitive definite-length method, these give a concrete representation of the value. For the constructed methods, these give the concatenation of the BER encodings of the components of the value.
- End-of-contents octets. For the constructed indefinite-length method, these denote the end of the contents. For the other methods, these are absent.
Primitive definite-length
Applies to simple types and types derived from simple types by implicit tagging. Requires that the length of the value be known in advance.
Identifier octets
Low-tag-number form. For tag numbers between 0 and 30. One octet. Bit 8 an 7 specify the class, bit 6 has value “0”, indicating that the encoding is primitive, and bits 5-1 give the tag number.
Class | Bit 8 | Bit 7 |
---|---|---|
Universal | 0 | 0 |
Application | 0 | 1 |
Context-specific | 1 | 0 |
Private | 1 | 1 |
High-tag-number form. Two or more octets. First octet is as the low-tag-number form, except that bits 5-1 all have value “1”. Second and following octets give the tag number, base 128, most significant digit first, with as few digits as possible, and with the the bit 8 of each octet except the last set to “1”.
Length octets
Short form. For lengths between 0 to 127. One octet. Bit 8 has value “0” and bits 7-1 give the length.
Long form. For lengths between 0 and 2^1008-1. Two to 127 octets. Bit 8 of the first octet has value “1” and bits 7-1 give the number of additional length octets. Second and following octets give the length, base 256, most significant digit first.
Note: the value 11111111(binary) shall not be used. This restriction is introduced for possible future extension.
**Contents octet **
Gives a concrete representation of the value.
Constructed definite-length
Applies to simple string types, structured types, types derived by simple string types and structured types by implicit tagging, and types derived from anything by explicit tagging. Requires that the length of the value be known in advance.
Identifier octets. Like primitive definite-length except that bit 6 has value “1”.
Length octets. As described for primitive definite-length types.
Content octets. The concatenation of the BER encodings of the components. For string types that is the string payload.
Constructed indefinite-length
Applies to simple string types, structured types, types derived from simple string types and structured types by implicit tagging, and types derived from anything by explicit tagging. It does not require that the length of the value to be known in advance.
Identifier octets. Like constructed definite-length types.
Length octets. One octet, 0x80
Content octets. As described for constructed definite-length types.
End-of-contents. Two octets, 0x00 0x00. The two bytes have such a meaning only when are found in a type/length position.
Distinguished Encoding Rules
The DER are a subset of BER, and give exactly on way to represent any ASN.1 value as an octet string. DER is intended for applications in which a unique octet string encoding is needed, as in the case where a digital signature is computed on an ASN.1 value. DER is defined by X.509.
DER adds the following restrictions to the BER rules:
- The length is encoded using less bytes as possible, possibly short form.
- Indefinite length method is not used.
Other restructions that are defined for the particular types are described when required.
Notation and Encoding for some types
NULL
Denotes a ’null value'.
BER encoding. Primitive. The contents octets are empty.
DER encoding. Always 05 00
.
INTEGER
An arbitrary integer.
ASN.1 notation:
INTEGER [{ identifier_1 (value_1) ... identifier_n (value_n) }]
Where indentifier_1 ... identifier_n
are optional distinct identifiers and
value_1 ... value_n
are optional integer values. The identifiers, when
present, are associated with values of the type.
BER encoding. Primitive. The contents octets give the value of the integer, base 256, in two’s complement form, most signficant digit first, with the minimun number of octets. The value 0 is encoded as a single 00 octet.
DER encoding. Same as BER.
Examples
Value | Encoding |
---|---|
0 | 02 01 00 |
127 | 02 02 00 7F |
128 | 02 02 00 80 |
256 | 02 02 01 00 |
-128 | 02 01 80 |
-129 | 02 FF 7F |
BIT STRING
An arbitrary string of bits.
ASN.1. notation:
BIT STRING
BER encoding.
Primitive or constructed. In primitive encoding the first content octet gives the number of bits by which the length of the bit string is less than the next multiple of eight (the number of unused bits in the tail). The second and following octets give the value of the bit string, converted to an octet string. Padding is eventually done in the tail.
In constructed encoding, the content octets give the concatenation of the BER encoding of consecutive substrings of the bit string, where each substring except the last has a length that is a multiple of eight bits.
The padding bits can have any value.
DER encoding. Primitive. The content octets are as for a primitive BER encoding, except that the bit string is padded with zero-valued bits.
Example. Encoding of the same bit string value: “011011100101110111”
03 04 06 6e 5d c0 DER encoding
03 81 04 06 6e 5d c0 Long form of length octets
23 09 Constructed encoding : "0110111001011101" + "11"
03 03 00 6e 5d
03 02 06 c0
OCTET STRING
An arbitrary string of octets.
ASN.1 notation:
OCTET STRING [SIZE ({size | size_1 .. size_2})]
where size
, size_1 .. size_2
are optional size constraints.
In the form size_1 .. size_2
the octet string must have between size_1
and
size_2
octets.
BER encoding
Primitive or constructed. In primitive encoding the contents octet are the characters in the IA5 string, encoded in ASCII. In constructed encoding, the contents octets give the concatenation of the BER encodings of consecutive substrings of the OCTET STRING.
DER encoding. Primitive.
IA5String
An arbitrary string of IA5 characters (same as ASCII).
ASN.1 notation:
IA5String
BER encoding.
Primitive or constructed. In primitive encoding the contents octets give the value of the octet string, first octet to last octet. In constructed encoding, the contents octets give the concatenation of the BER encodings of consecutive substrings of the IA5 string.
DER encoding. Primitive.
PrintableString
An arbitrary string of printable characters from the following character set:
A .. Z
a .. z
0 .. 9
(space) ' ( ) + , - . / : = ?
ASN.1 notation:
PrintableString
Encoding is equal to the IA5String encoding.
OBJECT IDENTIFIER
Is a sequence of integer components that identify a well-known object such as an algorithm or a directory-name attribute. Can have any number of components and components can have any non-negative value. There are at least two components. Values are assigned by registration authorities.
ASN.1 notation:
{ [identifier] component_1 ... component_n }
component_i = identifier_i | identifier_i(value_i) | value_i
Where value_1 ... value_n
are optional integer values.
The form without identifier
is the “complete” value with all its components;
the form with identifier
abbreviates the beginning components with another
object identifier value.
The identifiers identifier_1 ... identifier_n
are intended for documentation,
but they must correnspond to the integer value when both are present.
These identifiers can appear without integer values only if they are among a
small set of identifiers defined in X.208.
Example
{ iso(1) member-body(2) 840 113549 }
{ 1 2 840 113549 }
BER encoding.
Primitive. the contents octets are the concatenation of n-1 octet strings, where n is the number of components in the complete object identifier.
The first subidentifier is encoded base 40: 40⋅value_1 + value_2
. Thus
value_2
is limited to the range 0 to 39.
The other identifiers are encoded, base 128, most significant bit first,
with as few digits as possible, and with bit 8 of each octet except the last
set to 1
.
DER encoding
Same as BER.
Example. Encoding of { 1 2 840 113549 1 }
1 * 40 + 2 = 42 = 0x2A
840 = 0x86, 0x48
113549 = 0x86, 0xF7, 0x0D
Encoding (TAG,LEN,VAL): 06 07 2A 86 48 86 F7 0D 01
ANY
Denotes an arbitrary value of an arbitrary type.
ASN.1 notation:
ANY [DEFINED BY identifier]
where identifier is an optional identifier. In the ANY
form, the actual type
is indeterminate. The ANY DEFINED BY
identifier can only appear in a component
of a SEQUENCE
or SET
type for which identifier identifies some other
component with type INTEGER
or OBJECT IDENTIFIER
. In that form, the actual
type is determined by the value of the other component, either in the
registration of the object identifier value, or in a table of integer values
BER/DER encoding. Same as BER/DER encoding of the actual value.
CHOICE
A union of one or more alternatives.
ASN.1 notation:
CHOICE {
[identifier_1] Type_1
...
[identifier_n] Type_n
}
Where identifier_1
, …, identifier_n
are optional, distinct identifiers
for the alternatives, and Type_1
, …, Type_n
are the types alternatives.
The types must have distinct tags (explicit/implicit tagging may be required).
BER/DER encoding. Same as BER/DER encoding of the chosen alternative.
Tagged types
Tagging allows to specify the presence of an optional parameter withing a data structure. The mechanism allows to avoid ambiguity in the case that one optional element preceeds another with the same type.
With tagging the decoder can decide whether the value is for the optional element or for the element that follows. That is, we assign a special tag to the optional element.
Implicit
Is a type derived from another by changing the tag of the underlying type. Implicit tagging is mainly used for optional SEQUENCE components.
ASN.1 notation:
[[class ]number] IMPLICIT Type
class = UNIVERSAL | APPLICATION | PRIVATE
Where Type is a type, class is an optional class name, and number is the tag number within the class, a non-negative integer. If the class name is absent, then the tag is CONTEXT-SPECIFIC.
BER/DER encoding. Primitive or constructed, depending on the underlying type. Content octets are as for the BER/DER encoding of the underlying value.
With implicit tags the decoder must, implicitly, know the type of the underlying type, this is because the real type has been masked by the tag. The advantage of implicitly tagged types is that they require less space to be serialized.
Explicit
Is a type derived from another by adding an outer tag (and length = 1) to the underlying type. Indeed is encoded like a constructed type with one element.
ASN.1 notation:
[[class ]number] EXPLICIT Type
class = UNIVERSAL | APPLICATION | PRIVATE
Where Type is a type, class is an optional class name, and number is the tag number within the class, a non-negative integer. If the class name is absent, then the tag is CONTEXT-SPECIFIC.
If the IMPLICIT/EXPLICIT keyword is not specified then EXPLICIT is taken as the default.
BER/DER encoding. Constructed. Contents octets are the BER/DER encoding of the underlying value.
With explicit tags the decoder will be able to decode the type even without known, a priori, the type of the tagged value. Another importand advantage of explicit tags is that the type con be changed in the future without changing the tag.