Object Description in Cryptography Using ASN.1

Overview

At this point one thing probably has become obvious to you. There are a number of areas in cryptography that, in order for communication to take place, require two implementations to have the same idea about how algorithms are implemented as well as have a common language for exchanging required parameter information for those algorithms.

This chapter introduces ASN.1, which is a language designed for just this purpose. By the end of this chapter, you should

Finally, you will see how to use the EncryptedPrivateKeyInfo class defined in the JCE, as well as see what is taking place when it does its job.

What Is ASN.1?

Abstract Syntax Notation 1, or ASN.1, came out of the standards developed by ISO and CCITT (renamed ITU-T at the start of the 1990s) when work was being done on Open Systems Interconnection (OSI) standards. Work on OSI started in the early 1980s when, reputedly, hundreds of people used to attend the standards meetings. The primary design goal for ASN.1 was to provide a standard notation for use in specifying protocols that was concise in its encoding. Prior to the rise of the Internet, its use was mainly in the area of telecommunication standards, but we now see it in widespread use for describing key encodings, secure protocols, and algorithm parameters. The main standard defining it is X.680, and there are an additional six standard documents that build on X.680: X.681, X.682, X.683, X.690, X.691, and X.693. You'll find details of these listed in Appendix D, but as you can see, put briefly, ASN.1 is big!

It is not an easy thing to sum up in a few pages either. Most people who write about it tend to sprinkle quotes from H. P. Lovecraft's At the Mountains of Madness into the commentary, as well as biblical offerings from the story of the Tower of Babel, in an attempt to help their readers work through the mysteries of ASN.1 and how it is used. True, it can be baroque, almost unfathomable in places, but it has a long history and has actually solved a lot of the problems associated with the building of universal protocols. So, while I doubt it is the final word on the problem of universal communication, it definitely has a lot to offer.

More importantly, in the case of cryptography, almost every standard you are likely to make use of, such as the PKCS standards and the PKIX RFCs, uses ASN.1 somewhere. Although this does not mean you have to understand ASN.1 in depth, it does mean you need some knowledge of it and how it works. Understanding something about it also gives you insight into some of the design decisions that were made when the Java cryptography APIs were developed and why various classes work together the way they do.

Still, as I have mentioned, both ASN.1 and its syntax can seem kind of weird, and it is best approached with a sense of humor. To put you into the right frame of mind, I would remind you of Chancellor Gorkon's claim in Star Trek VI: The Undiscovered Country: "You have not experienced Shakespeare until you have read him in the original Klingon." I would not go as far as comparing most cryptographic standard documents to Shakespeare, but ASN.1 is definitely the original Klingon!

Getting Started

It is not necessary to add any new functionality to the Utils class for this chapter. However, as I will be extending it again in later chapters and to keep the examples regular, I'll define a new version of the Utils class to start the chapter5 package with. This new version is simply an extension of the one used in Chapter 4 and looks as follows:

package chapter5;

/**
 * Chapter 5 Utils
 */
public class Utils extends chapter4.Utils
{
}

Create the new chapter5 package and type the Utils class in.

Now you are ready to proceed.

Basic ASN.1 Syntax

The underlying syntax of ASN.1, or at least the bit you have to deal with, is quite simple. There are three things you need to be able to recognize when you are trying to read an ASN.1 module: the comment syntax, object identifiers, and the module structure.

Comment Syntax

As with Java, there are two commenting styles used in ASN.1: one for block comments, which are delimited by /* and */, and one for single-line comments, which start with -- and end at either the next -- or the end of the line.

Unlike Java, ASN.1 block comments can contain other block comments. The block comment syntax is newer than the line comment syntax, so for historical reasons, you will often see several consecutive line comments used where a block comment would have done.

Object Identifiers

Object identifiers, often referred to as OIDs for short, provide a unique handle for an object or ASN.1 module in the ISO/ITU-T universe. They were introduced into ASN.1 to make it possible to construct a globally unique namespace, which also made it easy for organizations to be allocated part of that namespace, which they could then extend as required. For this reason, the best way to think of the structure of an OID is as a path, or arc, through a tree.

For example, the RSA Security algorithm that you are using when you call Signature.getInstance() with SHA256withRSA has an object identifier associated with it that goes like this:

iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-1(1) 11

with the leftmost end representing where you start at the root of the tree. In its short form, this OID would be seen as "1.2.840.113549.1.1.11", and looking at the OID definition, you can see that RSA Security was assigned the OID "1.2.840.113549" by the U.S. branch of ISO. After that, RSA Security started its own branch, assigning the OID "1.2.840.113549.1" to its PKCS standards, and then further assigning OID "1.2.840.113549.1.1" to PKCS #1 and finally getting to the algorithm, which has simply been given the number 11, resulting in the OID for SHA-256 with RSA encryption: "1.2.840.113549.1.1.11". You can see another way of looking at the assignments that were done up to RSA Security in Figure 5-1.

Figure 5-1

There are three primary branches, or arcs, on the object identifier tree, all of which you will see used from time to time. The assignments are ITU-T with the number 0, ISO with the number 1, and joint ISO/ITU-T organizations with the number 2. After that, how the space is carved up becomes quite arbitrary, as the primary arc owners allocated numbers at the next level down to other organizations as they needed to. That said, despite the apparently arbitrary nature of how an OID gets created, the numbers are globally unique and serve the purpose of providing identification for ASN.1 modules, data types, algorithms, and just about anything else you can imagine. The only complication is that, because OIDs are based around organization rather than subject, in some cases you will notice the same cryptographic algorithm referred to by different OIDs.
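The relationship between the named form of an OID and its dotted short form can be sketched in a few lines of Java. This is only an illustration: the class and method names (OidPath, toDotted) are my own, and the parser handles nothing beyond the "name(number)" and bare-number components shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OidPath {
    // Matches either a "name(number)" component or a bare "number" component.
    private static final Pattern COMPONENT =
        Pattern.compile("(?:[A-Za-z][A-Za-z0-9-]*\\((\\d+)\\))|(\\d+)");

    /** Collects the number of each arc, in order, and joins them with dots. */
    static String toDotted(String namedForm) {
        List<String> arcs = new ArrayList<>();
        Matcher m = COMPONENT.matcher(namedForm);
        while (m.find()) {
            arcs.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return String.join(".", arcs);
    }

    public static void main(String[] args) {
        // Prints 1.2.840.113549.1.1.11
        System.out.println(toDotted(
            "iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-1(1) 11"));
    }
}
```

Reading the output from left to right is the same as walking down the tree from the root: each number is one arc.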

The Module Structure

A module is typically structured along the following lines:

ModuleName { ObjectIdentifier }
DEFINITIONS Tagging TAGS ::=
BEGIN
EXPORTS export_list ;
IMPORTS import_list ;
body
END

ModuleName and ObjectIdentifier have values that are used to identify the module being described. For example, the module defined in RFC 3161 (Time-Stamp Protocol) starts as follows:

PKIXTSP {iso(1) identified-organization(3) dod(6) internet(1) security(5) mechanisms(5) pkix(7) id-mod(0) id-mod-tsp(13)}

This tells you the name of the module is PKIXTSP and that it is associated with an object identifier, which in its basic form will be 1.3.6.1.5.5.7.0.13.

Tagging tells you the tagging environment for the module. You will read about tagging in more detail a bit later, but you can expect to see one of IMPLICIT, EXPLICIT, or AUTOMATIC. If no tagging environment is specified, that is, the TAGS keyword is missing altogether, you can assume tagging is EXPLICIT. In the case of RFC 3161, you will see the following:

DEFINITIONS IMPLICIT TAGS ::=

This means the default tagging type for the module is IMPLICIT. The "::=" symbol looks a bit like an assignment but more correctly reads as "is defined as." Later you will see that definitions can follow it, but in this case nothing follows it, as everything between the BEGIN and END is included.

The export_list is the list of types that this module defines that other ASN.1 modules can import. If EXPORTS is missing altogether, it means that everything defined in the module can be imported by another one. If the export list itself is missing, so that you just see "EXPORTS ;", it means nothing is available for export.

The import_list is the list of types that are being imported into the module and where they are from. This is the thing that is of the most interest to you. If you are trying to implement a particular protocol or algorithm suite and are using an ASN.1 module as your reference, the import_list tells you where to look for definitions that are not in the module you are using as your primary reference point. An example of an import list, also from RFC 3161, is as follows:

IMPORTS
    Extensions, AlgorithmIdentifier
    FROM PKIX1Explicit88 {iso(1) identified-organization(3)
    dod(6) internet(1) security(5) mechanisms(5) pkix(7)
    id-mod(0) id-pkix1-explicit-88(1)}

    GeneralName
    FROM PKIX1Implicit88 {iso(1) identified-organization(3)
    dod(6) internet(1) security(5) mechanisms(5) pkix(7)
    id-mod(0) id-pkix1-implicit-88(2)}

    ContentInfo
    FROM CryptographicMessageSyntax {iso(1)
    member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9)
    smime(16) modules(0) cms(1)}

    PKIFreeText
    FROM PKIXCMP {iso(1) identified-organization(3)
    dod(6) internet(1) security(5) mechanisms(5) pkix(7)
    id-mod(0) id-mod-cmp(9)} ;

As you can see, the component imports just follow a format of

import_item FROM source

where import_item is the type, or value, being imported and source is the name and OID for the module where import_item has been defined. Note the semicolon on the end of the collection of imports; it terminates the IMPORTS statement. Anything you read after the semicolon is a local definition. Of course, if the IMPORTS statement is missing, you should have all the information you need in the ASN.1 module in front of you.

Now consider body. It is terminated by the keyword END, and it is where all the type and value definitions that the ASN.1 module provides are found. Here are some examples taken from the body of the module defined in RFC 3161:

id-ct-TSTInfo OBJECT IDENTIFIER ::= { iso(1) member-body(2)
    us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) ct(1) 4}

TSAPolicyId ::= OBJECT IDENTIFIER

The first line defines a value. In this case, it is saying id-ct-TSTInfo is of the type OBJECT IDENTIFIER and has the OID value 1.2.840.113549.1.9.16.1.4. Looking at the line also gives you the basic syntax for ASN.1 value definitions:

name type ::= value

where name is the name you want to refer to value with, type is the type of the value, and value is specified in whatever notation is appropriate for the particular type given by type.

The second definition defines a new type that can be used in the module. What it is saying is that where you see TSAPolicyId, you are looking at a special case of OBJECT IDENTIFIER. As you can probably see, the basic syntax is

newType ::= type

where newType is the new type being created and type is the existing type it is based on. As with other languages, types you define can be used to build other types and definitions.

ASN.1 names can consist of upper- and lowercase letters, numbers, and dashes (the "-" character). Like any language, there are also a couple of conventions for creating names. Module names and type names all start with uppercase letters, whereas names for everything else start with lowercase letters.

A lot more can go into an ASN.1 module, but this gives you enough to make sense of the various RFCs, PKCS documents, and other standards, so I will stop here. The next thing you need to do is look at the basic types that are available.

ASN.1 Types

Broadly speaking, ASN.1 types fall into three categories: simple types, string types, and structured types. The string types are further subdivided into two more categories, those that just deal with raw bits and those that represent specific character encodings. The structured types consist of two container types, SEQUENCE and SET, that allow you to build complex structures using all the categories of type.

I will start with the simple types, which are used to represent fundamental values such as booleans, integers, and dates. I will then deal with the string and structured types.

Simple Types

The simple types in ASN.1 are BOOLEAN, ENUMERATED, INTEGER, NULL, OBJECT IDENTIFIER, UTCTime, and GeneralizedTime.

There are no real surprises with most of these:

BOOLEAN encodes a true or false value.

ENUMERATED is a special case of INTEGER, and INTEGER can be used to represent signed integers of any magnitude. Note that I said signed: INTEGER values are encoded as two's-complement numbers, high byte first in "big endian" format.

You can think of NULL in a similar way to the Java null , although there is a slight twist, as it is ASN.1's way of distinguishing a value set to nothing, rather than absent, which you will see later is also a possibility.

You have already learned what object identifiers are in the section on basic ASN.1 syntax. Not surprisingly, OBJECT IDENTIFIER is the type they are given.
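The signed, big-endian representation used for INTEGER happens to be the one java.math.BigInteger produces from toByteArray(), so a quick experiment shows what the value bytes look like. This is a minimal sketch covering only the value bytes, not the tag or length that surround them in an encoding; the class and method names are mine.

```java
import java.math.BigInteger;

final class IntegerContents {
    /** Returns the minimal two's-complement, big-endian bytes for a value. */
    static byte[] contentOctets(long value) {
        return BigInteger.valueOf(value).toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(contentOctets(127)));   // 7f
        System.out.println(hex(contentOctets(128)));   // 0080
        System.out.println(hex(contentOctets(-128)));  // 80
    }
}
```

Note the leading zero byte on 128: without it the top bit would be set and the value would read as negative.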

UTCTime and GeneralizedTime deserve some special attention. Both are used to define a "Coordinated Universal Time," and both represent time as strings of ASCII. The major differences are that GeneralizedTime has a four-digit year and can represent seconds to an arbitrary precision, whereas UTCTime has a two-digit year and cannot go any lower than seconds in its resolution. Although it should be obvious how a GeneralizedTime is used, a question remains: How do you deal with the two-digit year in UTCTime?

One interpretation of UTCTime is that the two-digit year spans the century from 1950 to 2049, but others are also used. A UTCTime can also be interpreted as going from 1900 to 1999, or as being on a sliding window; for example, if it's 2005, the digits 55 to 99 are interpreted as indicating 1955 to 1999, and 00 to 54 as meaning 2000 to 2054. How you work this one out depends on the standard you are working with, but you will be relieved to know that for the most part people have settled on the meaning that maps 50 to 99 as 1950 to 1999, and 00 to 49 as 2000 to 2049.
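The commonly settled-on mapping is simple enough to write down directly. The following is a small sketch of just that one convention (50 to 99 as 1950 to 1999, 00 to 49 as 2000 to 2049); the class and method names are illustrative, and a standard that specifies a different window would need different code.

```java
final class UtcTimeYear {
    /** Maps a two-digit UTCTime year onto the 1950-2049 window. */
    static int fullYear(int twoDigitYear) {
        if (twoDigitYear < 0 || twoDigitYear > 99) {
            throw new IllegalArgumentException("year must be in the range 00 to 99");
        }
        return twoDigitYear >= 50 ? 1900 + twoDigitYear : 2000 + twoDigitYear;
    }

    public static void main(String[] args) {
        System.out.println(fullYear(50));  // 1950
        System.out.println(fullYear(99));  // 1999
        System.out.println(fullYear(0));   // 2000
        System.out.println(fullYear(49));  // 2049
    }
}
```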

Bit String Types

The two bit string types are BIT STRING and OCTET STRING.

BIT STRING allows you to store an arbitrary string of bits of an arbitrary length. For this reason there are two components to a bit string: The first is a string of octets that contains the actual string of bits and zero to seven pad bits to make the string a multiple of 8 bits in length, and the second is a pad count that records how many pad bits were added. A bit string can be of zero length.

OCTET STRING allows you to store a string of octets and maps quite nicely onto a Java byte array.
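To make the two components of a BIT STRING concrete, here is a short sketch that works out how many octets and how many pad bits a given number of bits requires. The names are my own and nothing here is taken from a particular library.

```java
final class BitStringLayout {
    /** Number of octets needed to hold bitCount bits. */
    static int octetCount(int bitCount) {
        return (bitCount + 7) / 8;
    }

    /** Number of pad bits (0 to 7) needed to fill out the last octet. */
    static int padBits(int bitCount) {
        return octetCount(bitCount) * 8 - bitCount;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {0, 1, 8, 9, 12}) {
            System.out.println(bits + " bits: " + octetCount(bits)
                + " octet(s), " + padBits(bits) + " pad bit(s)");
        }
    }
}
```

An OCTET STRING needs no such bookkeeping, which is why a plain byte array is enough to represent one.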

Character String Types

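ASN.1 has many character string types, almost all of which appear in some standard or another. The character string types are as follows: BMPString, GeneralString, GraphicString, IA5String, NumericString, PrintableString, TeletexString, UniversalString, UTF8String, VideotexString, and VisibleString.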

As you go through the different character string types, you will see that the distinctions made between each type are all based on what characters can be in a string of the particular type being considered, as the character range goes from a restricted 7-bit character set to a full 32-bit character set. Once you have had the experience of looking at a range of ASN.1 modules, you will probably realize that the use of a particular character string type is as much a reflection of the hardware and software that was available when the module was being written as it is a reflection of the mind of the module writer.

The BMPString takes its name from the "Basic Multilingual Plane," which contains all the characters associated with the "living languages," with a fixed 16 bits per character. Its character set is the one represented by ISO 10646, the same set represented by the Unicode standard. It is quite a natural fit with the Java programming language and now sees extensive use.

The GeneralString and the GraphicString are actually related, so I will discuss them together. Both string types are related to the character sets described in the International Register of Coded Character Sets detailed in ISO 2375. The GraphicString can contain any one of the printing characters that appear in the register, but not the control characters. A GeneralString can contain control characters as well. Since the arrival of Unicode, use of these two types is becoming rarer and rarer.

The IA5String takes its name from an old ITU-T recommendation "International Alphabet 5." ASCII is actually a variant of this alphabet, and these days the type is considered to cover the whole of ASCII with its character set.

The NumericString can contain the digits 0 to 9 and the space character.

The PrintableString can contain a limited range of ASCII characters, which consists of the uppercase letters "A" to "Z", the lowercase letters "a" to "z", the digits "0" to "9", and the following other characters: " " (space), "'" (apostrophe), "(" and ")" (parentheses), "+", "," (comma), "-", ".", "/", ":", "=", and "?".

The TeletexString was originally known as the T61String , as it was originally based on the character set specified in CCITT Recommendation T.61 for Teletex. It is an 8-bit-per-character type, but there is the added feature that the ASCII ESC character starts an escape sequence that changes which actual characters are represented by the character stream that follows. The way this works is that each escape sequence should cause a display device to change the lookup table used for interpreting characters to the one indicated by the escape sequence. This allows the TeletexString to be used to support a wide range of languages, but it also makes it very difficult to interpret if you are on the receiving end of one.

The UniversalString is another update for internationalization. It was added to ASN.1 in 1994 to allow for the representation of strings made up of 32-bit characters. It is very rare to see one of these, as most modern languages are built on Unicode so there is not a lot of native support. Nonetheless, you will run into them occasionally; just hope you will not have to display the contents of an arbitrary UniversalString using Java just yet.

The UTF8String takes its name from "Universal Transformation Format, 8 bit," the same encoding that is often discussed with Java. This character string type works nicely with Unicode while still allowing the representation of the full character set possible in a UniversalString. It is the recommended string type for full internationalization, so you will see these used increasingly.

The VideotexString was designed to accommodate characters that can be used to build simple images. The strings are a mixture of pixel information and control codes, where an 8-bit character is typically considered to contain a 3×2 array of pixels. Fortunately, the type was designed for use with videotex systems, and, to date, I have never had to deal with one of these in a standards document involving cryptography.

The VisibleString could originally contain only characters that appeared in ISO 646, and occasionally you will even see it referred to as an ISO646String. Since 1994 it has been interpreted as containing plain ASCII, but unlike an IA5String, it includes only the printing characters plus space. No control characters are allowed.

Reading through this list you have probably realized that the broad coverage of character sets that some of these string types purport to represent makes them, for want of a better phrase, simply scary. Fortunately, it is very unusual to encounter the more bizarre ones, largely due to the standardization of some of the simpler string types on variants of ASCII and the growth of Unicode. So take a deep breath, relax, and read on.
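Of the types above, PrintableString is restricted enough that checking a candidate string by hand is practical. The sketch below assumes the character set as X.680 defines it (letters, digits, space, and the characters ' ( ) + , - . / : = ?); the class and method names are only for illustration.

```java
final class PrintableStringCheck {
    // The non-alphanumeric characters permitted in a PrintableString.
    private static final String OTHER = " '()+,-./:=?";

    /** Returns true if every character is allowed in a PrintableString. */
    static boolean isPrintableString(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || OTHER.indexOf(c) >= 0;
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPrintableString("Test CA (2005)"));    // true
        System.out.println(isPrintableString("user@example.com"));  // false, '@' is not allowed
    }
}
```

A check like this is useful when deciding which string type to use for a value: anything that fails it needs one of the broader types, such as IA5String or UTF8String.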

Structured Types

ASN.1 has two structured types: SEQUENCE and SET.

You will also see these used as SEQUENCE OF and SET OF. When you see the OF keyword, the SET or SEQUENCE will only contain ASN.1 objects of the type specified afterwards. For example:

Counters ::= SEQUENCE OF INTEGER

indicates that an object of type Counters contains zero or more ASN.1 objects, all of type INTEGER.

As for the difference between a SEQUENCE and a SET: A SEQUENCE specifies the order of its components in its declaration, whereas a SET is an unordered collection. For example:

DigestInfo ::= SEQUENCE {
    digestAlgorithm AlgorithmIdentifier,
    digest OCTET STRING
}

tells you that an ASN.1 object of type DigestInfo is a sequence with two elements, the first of which is an AlgorithmIdentifier, the second an OCTET STRING. On the other hand, if you created a similar example for a SET, as in:

InfoSet ::= SET {
    digestAlgorithm AlgorithmIdentifier,
    digest OCTET STRING
}

it just means you will find the two component types in the SET, but not necessarily in that order. For the most part, uses of SET will look like this:

attrValues SET OF AttributeValue

which is considerably easier to make sense of.

Type Annotations

Two annotations can be applied to types. One is OPTIONAL, which when applied to a field means it can be left out totally. The other is DEFAULT, which specifies the value for the field if it is not present.

You will see both of these in the context of SEQUENCE and SET objects, for example:

VersionedData ::= SEQUENCE {
    version INTEGER DEFAULT 0,
    data OCTET STRING OPTIONAL
}

DEFAULT tells you that the value may be left out of the encoding of the SEQUENCE. If it is, you should set the version field in whatever Java object you are using to represent your VersionedData object to the value 0.

The data field, however, is marked as OPTIONAL, which means it can be left out. As you might also have an encoding that has not included the version field because it is set to its default value of 0, a SEQUENCE representing a VersionedData object can contain zero elements (version at its default, data absent), one element (either version or data), or two elements (both version and data).

Note that being OPTIONAL and absent is not the same as setting the field to NULL. In fact you cannot use NULL here anyway, as the field has to contain a type of OCTET STRING. As I hinted at earlier, unlike in Java, where null is a value you can assign to any extension of Object, NULL in ASN.1 is a specific value.
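One way to carry DEFAULT and OPTIONAL over into a Java object is sketched below. It assumes a hypothetical holder class for VersionedData in which a missing version is replaced by the default of 0 and a missing data field is represented by a Java null; the element count mirrors an encoder that leaves default values out.

```java
final class VersionedData {
    static final int DEFAULT_VERSION = 0;

    private final int version;
    private final byte[] data;  // null means the OPTIONAL field was absent

    /** Pass null for any field that did not appear in the encoding. */
    VersionedData(Integer encodedVersion, byte[] encodedData) {
        this.version = (encodedVersion != null) ? encodedVersion : DEFAULT_VERSION;
        this.data = encodedData;
    }

    int getVersion() {
        return version;
    }

    boolean hasData() {
        return data != null;
    }

    /** Elements written by an encoder that omits fields set to their default. */
    int encodedElementCount() {
        int count = 0;
        if (version != DEFAULT_VERSION) {
            count++;
        }
        if (data != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(new VersionedData(null, null).getVersion());                 // 0
        System.out.println(new VersionedData(null, null).encodedElementCount());        // 0
        System.out.println(new VersionedData(1, new byte[4]).encodedElementCount());    // 2
    }
}
```

The Java null here only records absence inside the program; it never turns into an ASN.1 NULL in the encoding.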

Tagging

Every encoding of a standard ASN.1 type has a default tag value of one octet already, which serves the purpose of allowing someone parsing a byte stream containing ASN.1-encoded objects to work out how to interpret the bytes following. You can see the tag values for some of the common types that are specified for BER encoding in Table 5-1. The default tag value occupies the lower five bits (bits 5-1) of the available eight in the octet, and there are modifiers that can be applied to the default tag value.

Table 5-1

| Base Type | Tag Value | Base Type | Tag Value | Base Type | Tag Value |
|---|---|---|---|---|---|
| BOOLEAN | 0x01 | ENUMERATED | 0x0a | IA5String | 0x16 |
| INTEGER | 0x02 | UTF8String | 0x0c | UTCTime | 0x17 |
| BIT STRING | 0x03 | SEQUENCE | 0x10 | GeneralizedTime | 0x18 |
| OCTET STRING | 0x04 | SET | 0x11 | VisibleString | 0x1a |
| NULL | 0x05 | NumericString | 0x12 | UniversalString | 0x1c |
| OBJECT IDENTIFIER | 0x06 | PrintableString | 0x13 | BMPString | 0x1e |

The most important of these modifiers for you is bit 6, which if set means the type is a constructed type. What this means is that the byte stream following will be made up of other ASN.1 objects that need to be assembled to make up the object being parsed. I'll deal with this in more detail when you look at BER encoding, but for now it's enough to know that SEQUENCE and SET are always marked with the constructed bit. Therefore, although the tag value for SEQUENCE is 0x10 and SET is 0x11, the encoded values you will encounter will be 0x30 to indicate a SEQUENCE follows and 0x31 to indicate a SET, because both these types are composed of one or more other ASN.1 objects.

Bits 8 and 7 specify the class of the tag. As I'm currently talking about default types, these bits will both be zero, which indicates they are in the UNIVERSAL class, which generally means the actual tag value is for one of the predefined ASN.1 types. The other tag classes are CONTEXT-SPECIFIC, PRIVATE, and APPLICATION. The bit values associated with each class are as follows:

| Class | Bit 8 | Bit 7 |
|---|---|---|
| UNIVERSAL | 0 | 0 |
| APPLICATION | 0 | 1 |
| CONTEXT-SPECIFIC | 1 | 0 |
| PRIVATE | 1 | 1 |

You can ignore PRIVATE and APPLICATION because, as their names suggest, they are used in specific ASN.1 modules designed for specific applications. So at least in the case of the ASN.1 modules you deal with in cryptography for open standards, you will never run into them, and if you did, interpreting them would be dependent on the documentation accompanying the module. CONTEXT-SPECIFIC, on the other hand, is the default class of tagging and by far the most common, so I will concentrate on the CONTEXT-SPECIFIC class of tags here.
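Pulling a single tag byte apart into its class, constructed flag, and tag number is a matter of a few shifts and masks. The following sketch handles only the one-byte case, where the bottom five bits hold the tag number directly; the class and method names are mine.

```java
final class TagOctet {
    /** Bits 8 and 7 give the class of the tag. */
    static String tagClass(int octet) {
        switch ((octet >> 6) & 0x03) {
            case 0:  return "UNIVERSAL";
            case 1:  return "APPLICATION";
            case 2:  return "CONTEXT-SPECIFIC";
            default: return "PRIVATE";
        }
    }

    /** Bit 6 marks a constructed type. */
    static boolean isConstructed(int octet) {
        return (octet & 0x20) != 0;
    }

    /** Bits 5 to 1 give the tag number. */
    static int tagNumber(int octet) {
        return octet & 0x1f;
    }

    public static void main(String[] args) {
        int sequence = 0x30;
        System.out.println(tagClass(sequence) + " constructed=" + isConstructed(sequence)
            + " number=" + tagNumber(sequence));  // UNIVERSAL constructed=true number=16
        int octetString = 0x04;
        System.out.println(tagClass(octetString) + " constructed=" + isConstructed(octetString)
            + " number=" + tagNumber(octetString));  // UNIVERSAL constructed=false number=4
    }
}
```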

In addition to the predefined tag values, ASN.1 also allows the users to specify their own tag values. The syntax used to specify a tagged type in ASN.1 follows the pattern:

[ (class) number ] (TagStyle) Type

where the parameters in (), class and TagStyle, are optional and Type is the type that the tag in this context now represents. If class, which can be one of APPLICATION, CONTEXT-SPECIFIC, or PRIVATE, is left out, it is considered to be the default, that is, CONTEXT-SPECIFIC. So, if you see something like

encodedKey [1] OCTET STRING

you can tell that encodedKey is an OCTET STRING, which has the CONTEXT-SPECIFIC tag of value 1, done using the default style of tagging for the module. If the IMPLICIT or EXPLICIT keywords appear for TagStyle, then they override the tag style for the module, for that tag only, to be either IMPLICIT or EXPLICIT. The tag style for the module is determined by the tagging environment that has been specified in the DEFINITIONS block of the ASN.1 module, and as I mentioned earlier, it can be one of three tagging possibilities: AUTOMATIC, EXPLICIT, and IMPLICIT.

For small value tags, from 0 to 30, the actual value of the tag is stored in the bottom five bits, where the value associated with a normal ASN.1 tag goes otherwise. For tag values from 31 to 127, the bottom five bits are all set to 1 and the next octet is used to contain the tag number. If the number is higher than 127, the top bit of the next octet is set to 1 and the number is stored in seven-bit chunks, with the top bit of each octet being set to 1 if there is another chunk to follow.

By way of example, if you ignore whether bit 6 gets set for the moment, an ASN.1 object that has been given a 0 tag, that is, has [0] in front of its type, will have the tag byte 0x80. If it has been tagged with [32], the tag will take two bytes: 0x9f to indicate a tag number of 31 or greater, and 0x20 to give the actual value of the tag. Finally, if it has been tagged with [128], there will be three bytes of tag: 0x9f, then 0x81 to give the first seven bits of the value with bit 8 set to indicate another byte follows, and 0x00 to give the next seven bits of the value with bit 8 not set to indicate it is the last seven bits of the tag value.
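The rules for laying out a tag number can be sketched as a small encoder. This is an illustration of the scheme just described rather than code from any particular library; the names TagEncoder and encodeTag are mine, and the class bits are passed in already shifted into bits 8 and 7.

```java
import java.io.ByteArrayOutputStream;

final class TagEncoder {
    static final int CONTEXT_SPECIFIC = 0x80;

    static byte[] encodeTag(int classBits, boolean constructed, int tagNumber) {
        if (tagNumber < 0) {
            throw new IllegalArgumentException("tag number must not be negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int first = classBits | (constructed ? 0x20 : 0x00);

        if (tagNumber < 31) {
            // Small tag numbers fit in the bottom five bits.
            out.write(first | tagNumber);
            return out.toByteArray();
        }

        // Bottom five bits all set: the tag number follows in 7-bit chunks.
        out.write(first | 0x1f);
        long n = tagNumber;  // long avoids int shift wrap-around for large values
        int shift = 0;
        while ((n >> (shift + 7)) != 0) {
            shift += 7;
        }
        for (; shift >= 0; shift -= 7) {
            int chunk = (int) ((n >> shift) & 0x7f);
            // The top bit is set on every chunk except the last.
            out.write(shift > 0 ? (chunk | 0x80) : chunk);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (int tag : new int[] {0, 32, 128}) {
            StringBuilder sb = new StringBuilder();
            for (byte b : encodeTag(CONTEXT_SPECIFIC, false, tag)) {
                sb.append(String.format("%02x ", b & 0xff));
            }
            System.out.println("[" + tag + "] -> " + sb.toString().trim());
        }
    }
}
```

Run as is, this prints 80 for [0], 9f 20 for [32], and 9f 81 00 for [128], matching the bytes worked through above.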

The next sections review the three tagging environments, EXPLICIT, IMPLICIT, and AUTOMATIC, in the context of tags created in the CONTEXT-SPECIFIC class.

EXPLICIT Tagging

If you see nothing in the DEFINITIONS block of an ASN.1 module, or you see

DEFINITIONS EXPLICIT TAGS ::=

then the default tagging style for that module is EXPLICIT. You might also see something like this:

encodedKey [1] EXPLICIT OCTET STRING

which means regardless of the default tagging for the module, encodedKey is an explicitly tagged OCTET STRING with the tag value 1.

EXPLICIT tagging actually wraps the underlying encoding, so it is the easiest to interpret. By "wrap" I mean that an explicitly tagged object has another object around it that serves the purpose of carrying the tag value. The easiest way to understand this is to start looking at the actual bytes produced during an encoding. For example, looking at the encodedKey definition again, assuming you started with a 32-byte array when you made it, printing one in hex might give you the following bytes in the header:

a1 22 04 20...

where 0xa1 tells you that you have a constructed (bit 6 is set), context-specific (bit 8 is set), tagged object with the tag value 1 (the bottom five bits) with a body of length 34 bytes (0x22). The body then starts with 0x04, the universal tag for an OCTET STRING, and the OCTET STRING has a body of length 32 bytes (0x20), which holds the bytes that you started with.

Note that the constructed bit (bit 6) is set in the byte starting the tag header. It is what tells you there is another encoding wrapped inside the encoding of the tagged object. You will see how important the constructed bit becomes in interpreting tagged objects when you look at IMPLICIT tagging next.
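The wrapping can be reproduced by hand in a few lines. The sketch below is deliberately limited: it assumes tag numbers below 31 and contents shorter than 128 bytes, so that both the tag and the length fit in a single byte each, and the helper names are my own.

```java
final class ExplicitTagExample {
    /** Builds tag, one-byte length, then content. */
    static byte[] tlv(int tag, byte[] content) {
        if (content.length > 127) {
            throw new IllegalArgumentException("sketch handles one-byte lengths only");
        }
        byte[] out = new byte[content.length + 2];
        out[0] = (byte) tag;
        out[1] = (byte) content.length;
        System.arraycopy(content, 0, out, 2, content.length);
        return out;
    }

    /** Encodes "[tagNumber] EXPLICIT OCTET STRING". */
    static byte[] explicitOctetString(int tagNumber, byte[] value) {
        if (tagNumber < 0 || tagNumber > 30) {
            throw new IllegalArgumentException("sketch handles tag numbers 0 to 30 only");
        }
        byte[] inner = tlv(0x04, value);             // the OCTET STRING itself
        return tlv(0x80 | 0x20 | tagNumber, inner);  // context-specific, constructed wrapper
    }

    public static void main(String[] args) {
        byte[] encoded = explicitOctetString(1, new byte[32]);
        // Prints a1 22 04 20
        System.out.println(String.format("%02x %02x %02x %02x",
            encoded[0] & 0xff, encoded[1] & 0xff, encoded[2] & 0xff, encoded[3] & 0xff));
    }
}
```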

IMPLICIT Tagging

Often you will see

DEFINITIONS IMPLICIT TAGS ::=

at the start of an ASN.1 module. If you see this, it means that the tagging style in the module is IMPLICIT. You may also see a declaration like

keyEncoded [1] IMPLICIT OCTET STRING

This also indicates that you are looking at an IMPLICIT tag.

The IMPLICIT tag style takes its name from the fact that the original tag value associated with the object it tags is overridden so the original tag value is now only implicit from the context in which the encoding is interpreted. As you will see, this can also introduce a certain amount of ambiguity.

Once again, the best way to deal with this is to look at the bytes produced in the header. This time the 32-byte value for the octets making up the encoded key gives the following header:

81 20 ...

This time you have a tagged object, with the tag value 1, which is 32 bytes in length. What happened to the OCTET STRING tag? Yes, it is gone. It has been replaced with the tag number you specified (0x01), marked in the tag byte as CONTEXT-SPECIFIC (0x80), thus giving you 0x81.
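For comparison with the wrapped form, here is the same kind of hand-built sketch for an implicitly tagged OCTET STRING. It again assumes tag numbers below 31 and contents shorter than 128 bytes, and the names are illustrative only.

```java
final class ImplicitTagExample {
    /** Encodes "[tagNumber] IMPLICIT OCTET STRING". */
    static byte[] implicitOctetString(int tagNumber, byte[] value) {
        if (tagNumber < 0 || tagNumber > 30 || value.length > 127) {
            throw new IllegalArgumentException("sketch handles one-byte tags and lengths only");
        }
        byte[] out = new byte[value.length + 2];
        // The context-specific tag takes the place of the 0x04 OCTET STRING tag.
        // Bit 6 stays clear because the underlying type is not constructed.
        out[0] = (byte) (0x80 | tagNumber);
        out[1] = (byte) value.length;
        System.arraycopy(value, 0, out, 2, value.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] encoded = implicitOctetString(1, new byte[32]);
        // Prints 81 20
        System.out.println(String.format("%02x %02x", encoded[0] & 0xff, encoded[1] & 0xff));
    }
}
```

The result is two bytes shorter than the explicitly tagged version, which is the saving IMPLICIT tagging buys at the cost of losing the original type information.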

Without getting into a debate about the merits of this tagging style, you can see how important it is to handle it correctly. Just how vital this is becomes obvious when you look at what can happen with the encoding for an ASN.1 structure that might be defined as follows:

TroublesomeSequence ::= SEQUENCE {
    encoding1 OCTET STRING,
    encoding2 [0] OCTET STRING OPTIONAL
}

which is used in the following context:

troublesome [1] TroublesomeSequence

Now imagine that you encode a value for troublesome with both encoding1 and encoding2 created using 32-byte octet strings. The header for troublesome will look as follows:

a1 44 04 20 ...

Note that this time the byte starting the header, 0xa1, has bit 6 set, indicating that the value is constructed. This has happened because the IMPLICIT tag is overriding the tag value of a constructed type, in this case a SEQUENCE. After the tag byte, you then have a length byte and you can see the 0x04 indicating your first octet string. So far, so good.

Now make it more interesting. The encoding2 field in TroublesomeSequence is OPTIONAL, meaning it can be left out. So I will now look at the header for the encoding of troublesome, where only encoding1 has been set to a 32-byte value and encoding2 has been left out. Doing this gives the following bytes in the header:

a1 22 04 20 ...

Now look back at the example for EXPLICIT tagging and see if you can find a difference. That's right, there is no difference. It is not possible to tell the difference between a particular ASN.1 base type with a tag of type EXPLICIT and a tagged SEQUENCE, or SET, of type IMPLICIT containing only one member that is of the same base type as used in the EXPLICIT case.

This leaves you with the following general rule, which you will be reminded of again later.

Important  

When interpreting encodings containing ASN.1 objects with IMPLICIT tagging, you must write code to interpret each IMPLICIT tag explicitly.
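The ambiguity can be reproduced without any ASN.1 library. The following sketch builds both encodings by hand, using a 32-byte OCTET STRING of zeros, and compares them. The class and method names are hypothetical, and the tlv() helper only handles short-form lengths (content under 128 bytes), which is all this example needs.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

final class TaggingAmbiguity {
    // Hypothetical helper: tag byte, one short-form length byte, then content.
    static byte[] tlv(int tag, byte[] content) {
        if (content.length > 127) {
            throw new IllegalArgumentException("short-form lengths only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(content.length);
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    // [1] EXPLICIT OCTET STRING: a constructed context tag wrapped around
    // the complete OCTET STRING encoding.
    static byte[] explicitOctetString() {
        return tlv(0xa1, tlv(0x04, new byte[32]));
    }

    // [1] IMPLICIT SEQUENCE { OCTET STRING }: encode the SEQUENCE, then let
    // the context tag replace the 0x30 SEQUENCE tag.
    static byte[] implicitSequence() {
        byte[] sequence = tlv(0x30, tlv(0x04, new byte[32]));
        sequence[0] = (byte) 0xa1;
        return sequence;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(explicitOctetString(), implicitSequence()));
    }
}
```

Both byte arrays start with a1 22 04 20, so a decoder that does not know the definition it is parsing cannot tell which of the two structures it has been given.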

AUTOMATIC Tagging

You will see this only in the DEFINITIONS section of the module. In the particular world you are looking at, AUTOMATIC is very rare. If you do see it, it means that everything in a SEQUENCE or SET is automatically tagged, with the first element tagged as 0. The tags are added using the IMPLICIT style unless the item that would be tagged is already tagged, or is a CHOICE item.

This one is almost better left to using an ASN.1 tool for dealing with it. If you have to deal with one by hand, the best way is to print the module out and then record what the tag values should be by hand. After that, cross your fingers and hope that you haven't missed one or tagged a CHOICE item by mistake.

CHOICE

The CHOICE type indicates that the ASN.1 field, or variable, will be one of a group of possible ASN.1 types or structures. If you are looking for another equivalent, the CHOICE type is very similar to a union in C or Pascal, the difference being that tagging is normally used to resolve any possible ambiguities. For example, looking at

SignerIdentifier ::= CHOICE {
    issuerAndSerialNumber IssuerAndSerialNumber,
    subjectKeyIdentifier [0] SubjectKeyIdentifier
}

you can see that a SignerIdentifier can be either of the type IssuerAndSerialNumber or an object with a 0 tag of the underlying type SubjectKeyIdentifier.

The zero tag will be applied with the IMPLICIT style; that is, it will override the default tag value for whatever type makes up a SubjectKeyIdentifier. This leads to an interesting issue about choice types that was touched on in the discussion on the AUTOMATIC style of tagging. Because choice types contain tag values that are used to distinguish which item in the CHOICE is represented, any object of type CHOICE is never tagged using the IMPLICIT style. I will repeat this as well; it is important!

Important  

A tag applied to an ASN.1 object of type CHOICE is always applied using the EXPLICIT style of tagging.
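As a rough illustration of how the tags resolve a CHOICE, the sketch below decides which SignerIdentifier alternative is present by looking only at the first byte of the encoding. It assumes, as in RFC 3852, that IssuerAndSerialNumber is a SEQUENCE (tag 0x30) and that SubjectKeyIdentifier is an OCTET STRING, so the implicitly tagged alternative starts with 0x80 when it is encoded in primitive form. The class and method names are invented for the example.

```java
final class SignerIdentifierChoice {
    // Hypothetical helper: pick the CHOICE alternative from the leading tag byte.
    static String alternative(byte[] encoding) {
        int tag = encoding[0] & 0xff;
        switch (tag) {
            case 0x30: // universal, constructed, SEQUENCE
                return "issuerAndSerialNumber";
            case 0x80: // context-specific, primitive, tag number 0
                return "subjectKeyIdentifier";
            default:
                throw new IllegalArgumentException(
                    "unexpected tag 0x" + Integer.toHexString(tag));
        }
    }

    public static void main(String[] args) {
        byte[] keyId = { (byte) 0x80, 0x02, 0x01, 0x02 }; // [0] with two content bytes
        System.out.println(alternative(keyId));
    }
}
```

If an outer tag were applied implicitly to the whole CHOICE, it would overwrite exactly the byte this method depends on, which is why the tagging has to be explicit.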

CLASS

The CLASS type was introduced in 1994 because of problems with the ANY syntax and ASN.1 macros, partly a result of the ambiguities they made possible in definitions, and also because the syntax was almost impossible to deal with properly if you were trying to write an automated tool. Strictly speaking, the ANY type is no longer supported, although as you will see, it still lives on. It represents the one non-backward-compatible change made in ASN.1's history, and even 10 years on, we still seem to be coming to terms with it.

The effect is that prior to 1994, where you would have written

AlgorithmIdentifier ::= SEQUENCE {
    algorithm OBJECT IDENTIFIER,
    parameters ANY DEFINED BY algorithm OPTIONAL
}

you now write

ALGORITHM-IDENTIFIER ::= CLASS {
    &id OBJECT IDENTIFIER UNIQUE,
    &Type OPTIONAL
}
WITH SYNTAX { OID &id [PARAMETERS &Type] }

AlgorithmIdentifier { ALGORITHM-IDENTIFIER:InfoObjectSet } ::= SEQUENCE {
    algorithm ALGORITHM-IDENTIFIER.&id({InfoObjectSet}),
    parameters ALGORITHM-IDENTIFIER.&Type({InfoObjectSet}{@.algorithm}) OPTIONAL
}

You'll see the definition using CLASS in PKCS #1. The definition using ANY appeared in X.509. The {@.algorithm} in the second definition provides the equivalent to the DEFINED BY algorithm. It tells you that for a given value of the algorithm field, the parameters field is constrained to the associated parameter value in InfoObjectSet.

Now look at the parameterization in the definition of AlgorithmIdentifier, the actual structure that contains ASN.1 values, as in:

AlgorithmIdentifier { ALGORITHM-IDENTIFIER:InfoObjectSet } ::= SEQUENCE {

The purpose behind this is to make the following definition possible in PKCS #1:

DigestInfo ::= SEQUENCE {
    digestAlgorithm DigestAlgorithm,
    digest OCTET STRING
}

DigestAlgorithm ::= AlgorithmIdentifier { {PKCS1-v1-5DigestAlgorithms} }

where PKCS1-v1-5DigestAlgorithms is defined as:

PKCS1-v1-5DigestAlgorithms ALGORITHM-IDENTIFIER ::= {
    { OID id-md2 PARAMETERS NULL } |
    { OID id-md5 PARAMETERS NULL } |
    { OID id-sha1 PARAMETERS NULL } |
    { OID id-sha256 PARAMETERS NULL } |
    { OID id-sha384 PARAMETERS NULL } |
    { OID id-sha512 PARAMETERS NULL }
}

or, in common language, the possible values for a PKCS #1 digest algorithm identifier.

I have glossed over the fact that if you look at the actual PKCS #1 document, you will see that id-md2, id-md5, and so on are defined with the type OBJECT IDENTIFIER, but you now have the general idea.
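If it helps to think about an information object set in programming terms, it behaves much like a lookup table from the algorithm OID to the type the parameters field must have. The following Java sketch models PKCS1-v1-5DigestAlgorithms that way. The class and method names are made up for illustration, and the dotted OID strings are the registered values for these digests rather than something taken from the definitions above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DigestAlgorithmSet {
    // Hypothetical model of the object set: OID -> required parameters type.
    static final Map<String, String> PARAMETER_TYPES = new LinkedHashMap<>();
    static {
        PARAMETER_TYPES.put("1.2.840.113549.2.2", "NULL");      // id-md2
        PARAMETER_TYPES.put("1.2.840.113549.2.5", "NULL");      // id-md5
        PARAMETER_TYPES.put("1.3.14.3.2.26", "NULL");           // id-sha1
        PARAMETER_TYPES.put("2.16.840.1.101.3.4.2.1", "NULL");  // id-sha256
        PARAMETER_TYPES.put("2.16.840.1.101.3.4.2.2", "NULL");  // id-sha384
        PARAMETER_TYPES.put("2.16.840.1.101.3.4.2.3", "NULL");  // id-sha512
    }

    // Plays the role of {@.algorithm}: the OID selects the parameters type,
    // and an OID outside the set is rejected.
    static String parametersTypeFor(String oid) {
        String type = PARAMETER_TYPES.get(oid);
        if (type == null) {
            throw new IllegalArgumentException("OID not in object set: " + oid);
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(parametersTypeFor("1.3.14.3.2.26"));
    }
}
```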

Encoding Rules

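There are currently five encoding methods recognized for encoding ASN.1 objects into streams of bytes: the Basic Encoding Rules (BER), the Canonical Encoding Rules (CER), the Distinguished Encoding Rules (DER), the Packed Encoding Rules (PER), and the XML Encoding Rules (XER).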

I mention all of them for completeness. As you can see, the number of methods indicates that people have had a few goes at producing encoding methods to date, and there are probably still more to be written. Fortunately, the only two methods of interest here are BER encoding and DER encoding.

BER Encoding

BER stands for Basic Encoding Rules. As you've probably guessed from the example encodings you've seen so far, BER encoding follows the tag-length-value (TLV) convention. A tag is used to identify the type, a value defining the length of the content is next, and then the actual value of the content follows.

BER encoding offers three methods for encoding an ASN.1 object: primitive definite-length, constructed definite-length, and constructed indefinite-length.

Simple types employ the primitive definite-length method, bit and character string types employ whichever method is most expedient, and structured types employ one of the constructed methods. If an object is tagged with the IMPLICIT style, the encoding used is the same as that used for the type of the object being tagged. If an object is tagged with the EXPLICIT style, one of the constructed methods will be used to encode the tagging.

How is it decided which method is most expedient? Strictly speaking, the decision is made on the basis of whether you know how long the encoding of the object will be when you start writing it out. However, in some cases, standards do specify BER indefinite-length, so in situations like that, you will end up with objects that are indefinite-length encoded regardless of whether it would have been possible to hold the object in memory. To fully understand what this means, you need to take a look at the three methods in more detail.

The Primitive Definite-Length Method

The definite-length methods all require that you know the length of what you are trying to encode in advance. The primitive definite-length method is appropriate for any nonstructured type, or implicitly tagged versions of the same, and an encoding of this type is created by first encoding the tag assigned to the object, encoding the length, and then writing out the encoding of the body.

You'll look at how the bodies are encoded in more detail later, but how the encoding of the length is done is worth looking at here. If the length is less than or equal to 127, a single octet is written out containing the actual length as a 7-bit number. If the length is greater than 127, the first octet written out has bit 8 set, and bits 7-1 represent the number of octets following that contain the actual length. The length is then written out, one octet at a time, high-order octet first.

For example, a length of 127 will produce a length encoding with 1 byte of the value 0x7f, a length of 128 will produce a 2-byte encoding with the values 0x81 and 0x80, and a length of 1,000 will produce a 3-byte encoding: 0x82, 0x03, and 0xe8. This is the simplest method of encoding and, as you will see, is required for DER encodings.
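The length rules are compact enough to write out directly. Here is a minimal sketch in plain Java that produces the definite-length octets for a non-negative int length. The class name BerLength and its methods are hypothetical and are not part of any library discussed in this chapter.

```java
import java.io.ByteArrayOutputStream;

final class BerLength {
    // Encode a definite length: short form up to 127, long form above that.
    static byte[] encode(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (length <= 127) {
            out.write(length);                 // single octet, bit 8 clear
        } else {
            int count = 0;                     // octets needed for the length
            for (int v = length; v != 0; v >>>= 8) {
                count++;
            }
            out.write(0x80 | count);           // bit 8 set, bits 7-1 = count
            for (int shift = (count - 1) * 8; shift >= 0; shift -= 8) {
                out.write(length >>> shift);   // high-order octet first
            }
        }
        return out.toByteArray();
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encode(127)));  // 7f
        System.out.println(toHex(encode(128)));  // 8180
        System.out.println(toHex(encode(1000))); // 8203e8
    }
}
```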

The Constructed Definite-Length Method

Length octets in this case are generated the same way as for the primitive definite-length method, but the initial byte in the tag associated with any object encoded in this fashion will have bit 6 set, indicating the encoding is of the constructed type.

As you would imagine, the regular structured types such as SEQUENCE and SET, or implicitly tagged objects derived from them, are still encoded as the concatenation of the BER encoding of the objects that make them up. Likewise, explicitly tagged objects are encoded using the BER encoding of the object that was tagged. Where this does become different is when bit string and character string types, or implicit types derived from them, are encoded using the constructed definite-length method.

When this happens, the original bit string, or character string, is encoded as a series of substrings that are of the same base type as the constructed string. For example, if you are trying to encode a byte array using the constructed definite-length method as an OCTET STRING, the encoding will start with an OCTET STRING tag with bit 6 set indicating that it is constructed. Then, after the length octets, the body of the encoding will be made up of a series of smaller OCTET STRING encodings using the primitive definite-length method, the sum of which will be the byte array that you were originally trying to encode.

You might use this method where you have several values that make up a single ASN.1 type that exist separately prior to creating an encoding of the ASN.1 type. For example, an e-mail address may be defined as a single ASN.1 type but be assembled from parts "name" + "@" + "domain" prior to encoding, which can be encoded as substrings of a constructed string representing the full address.
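As a sketch of what such an encoding looks like, the following plain Java code builds a constructed definite-length OCTET STRING from separate parts. It uses an OCTET STRING rather than a character string type to stay with the byte array case described above, the names are invented for the example, and the tlv() helper is limited to short-form lengths.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ConstructedOctetString {
    // Hypothetical helper: tag byte, one short-form length byte, then content.
    static byte[] tlv(int tag, byte[] content) {
        if (content.length > 127) {
            throw new IllegalArgumentException("short-form lengths only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(content.length);
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    // Each part becomes a primitive OCTET STRING (0x04); the parts are then
    // wrapped in an OCTET STRING tag with bit 6 set (0x04 | 0x20 = 0x24).
    static byte[] encode(byte[]... parts) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            byte[] sub = tlv(0x04, part);
            body.write(sub, 0, sub.length);
        }
        return tlv(0x24, body.toByteArray());
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] encoding = encode(
            "name".getBytes(StandardCharsets.US_ASCII),
            "@".getBytes(StandardCharsets.US_ASCII),
            "domain".getBytes(StandardCharsets.US_ASCII));
        System.out.println(toHex(encoding));
    }
}
```

A decoder recovers the original value by concatenating the contents of the three substrings, which gives the bytes of "name@domain".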

The Constructed Indefinite-Length Method

Unlike the previous two methods, the constructed indefinite-length method does not require you to know the length of the encoding you are trying to construct in advance. With this method, the encoding of the tag value follows the same procedure as for the constructed definite-length method; however, the length is written out as the single octet of the value 0x80, and instead of being able to use the length of the encoding to determine when you reach the end of the contents in the body of the encoding, there is an end-of-contents marker: two octets of the value 0x00, which actually equate to tag 0, length 0. Other than the requirement for and the presence of the end-of-contents marker, encoding of objects is handled in much the same way as for constructed definite-length.

This method of encoding is useful where the length of the value is not known at the time the tag and length for the value are encoded. This method is common when encodings are very large and memory or efficiency constraints prevent the entire value from being buffered to determine its length before encoding it.
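To show how this works when streaming, here is a minimal sketch in plain Java that copies an InputStream of unknown length into an indefinite-length OCTET STRING, one small chunk at a time. The class name, the method names, and the chunk size parameter are assumptions made for the example, and chunks are capped at 127 bytes so that each substring can use a single length octet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class IndefiniteLengthWriter {
    static void write(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize < 1 || chunkSize > 127) {
            throw new IllegalArgumentException("chunk size must be 1..127");
        }
        out.write(0x24);            // OCTET STRING tag with bit 6 (constructed) set
        out.write(0x80);            // indefinite length
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(0x04);        // primitive OCTET STRING substring
            out.write(n);           // short-form length of this chunk
            out.write(buf, 0, n);
        }
        out.write(0x00);            // end-of-contents marker:
        out.write(0x00);            // tag 0, length 0
    }

    // Convenience wrapper for in-memory data.
    static byte[] encode(byte[] data, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            write(new ByteArrayInputStream(data), out, chunkSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (byte b : encode(new byte[] { 1, 2, 3, 4, 5 }, 2)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        System.out.println(sb);
    }
}
```

At no point does the writer need to know how much data is still to come; the two zero octets at the end take the place of the outer length.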

DER Encoding

The Distinguished Encoding Rules, or DER, are so called because they make identical data within identical ASN.1 definitions reduce to identical binary encodings. This is particularly important in security applications where the binary data will be digitally signed. There is also an interesting covert channel made possible with BER encodings where equivalent, but different BER encodings can be used to transmit extra information. For example, an octet string representing encrypted data could be represented using a constructed method where the length of the substrings making up the encrypted data could be used to leak information about either the data itself or the key used to encrypt it. As DER always reduces a value to the same encoding no matter what, such a covert channel is not possible.

DER adds the following restrictions to BER encoding to make this possible:

DER encoding is the most common form of encoding you will encounter, and it is also the simplest to perform. The only area of complication is the sorting of the objects contained in SET objects. A DER-encoded SET is sorted by ordering the objects inside it according to their encoded value in ascending order. Encodings are compared by padding them with trailing zeros so they are all the same length, with the result that a DER-encoded SET will be ordered on the tag value of each object. Be careful about relying on this, though. A BER-encoded SET is not necessarily sorted, so if you are trying to write code to handle both BER- and DER-encoded SET objects, it is a mistake to rely on the ordering taking place.
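The comparison described above can be sketched in a few lines of plain Java. The comparator below treats each encoding as a string of unsigned bytes and pads the shorter one with trailing zeros. The class and method names are invented for the example; an ASN.1 library would normally do this for you when it writes out a SET.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DerSetOrder {
    // Compare two encodings as unsigned bytes, padding the shorter with zeros.
    static int compare(byte[] a, byte[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] & 0xff : 0;
            int y = i < b.length ? b[i] & 0xff : 0;
            if (x != y) {
                return x - y;
            }
        }
        return 0;
    }

    // Return the member encodings in ascending order.
    static List<byte[]> sort(List<byte[]> encodings) {
        List<byte[]> sorted = new ArrayList<>(encodings);
        sorted.sort(DerSetOrder::compare);
        return sorted;
    }

    public static void main(String[] args) {
        List<byte[]> members = Arrays.asList(
            new byte[] { 0x0c, 0x01, 0x61 },   // UTF8String "a"
            new byte[] { 0x04, 0x00 },         // empty OCTET STRING
            new byte[] { 0x02, 0x01, 0x01 });  // INTEGER 1
        for (byte[] member : sort(members)) {
            System.out.printf("tag 0x%02x%n", member[0] & 0xff);
        }
    }
}
```

Because the tag is the first byte of every encoding, members of different types end up ordered by tag, as the text says.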

The Bouncy Castle ASN.1 API

The Bouncy Castle ASN.1 API evolved to deal with the ASN.1 binary encoding and binary decoding requirements of the other Bouncy Castle APIs and the provider implementation. As such, although it does not represent a full implementation of ASN.1, it does cover most of the issues that seem to arise when dealing with cryptographic protocols and structures.

The main package for the API is org.bouncycastle.asn1, and there are a variety of packages off org.bouncycastle.asn1 that contain classes for assisting with the implementation of various message and data formats. For example, org.bouncycastle.asn1.pkcs has classes for use with the PKCS standards, org.bouncycastle.asn1.cms has classes for supporting the ASN.1 objects in RFC 3852, and org.bouncycastle.asn1.x509 has classes for supporting the ASN.1 objects used in X.509.

The org.bouncycastle.asn1 package has a few simple ideas underlying it to support the encoding requirements you run into with cryptographic protocols. The following conventions apply:

The need for having the DEROutputStream behave the way that it does is quite important. Two common issues arise when you are working with other implementations of protocols and also in the general sense when you are generating hashes and signatures.

The general issue of being compatible with other implementations is that a lot of implementations seem to ignore the requirement to support BER encoding even when it is specified in the documentation for the standard that is apparently being implemented. So on more than one occasion, you may find yourself having to convert an object that is BER encoded into one that is only DER encoded. The property DEROutputStream has of forcing DER encoding makes this conversion quite simple.

For example, imagine you have a BER-encoded object in a byte array called berData that you want to convert to a DER-encoded object in a byte array that you will call derData. Using the Bouncy Castle ASN.1 API, you can achieve the conversion to DER with the following lines:

ASN1InputStream aIn = new ASN1InputStream(berData);
ByteArrayOutputStream bOut = new ByteArrayOutputStream();
DEROutputStream dOut = new DEROutputStream(bOut);
dOut.writeObject(aIn.readObject());
derData = bOut.toByteArray();

If you run into a problem where, say, you have a PKCS #12 file from one product that refuses to import into another, there is a good chance it is the inability of the second product to deal with BER encoding that is the problem.

In a similar vein, standards that specify methods for calculating digital signatures, hashes, or MACs on ASN.1-encoded data will often specify that it must be calculated on the DER encoding of the objects. You can use virtually the same technique to do the conversion, depending on how you have everything set up. The only difference might be that you need to iterate on aIn.readObject() if berData contains more than one object.

Creating the Basic ASN.1 Types

The object hierarchy in the ASN.1 base package works as follows. The root object for all the simple types, bit string types, and structured and tagged types is DERObject. As it turns out, a better name would have been ASN1Object, but originally, it seemed you might be able to get away without dealing with BER, so when you see the name you can think of it as a chance to learn from example. There is no escaping it!

At the moment, a number of types have both BER and DER encoding implementations, and they have parent implementations that follow the convention of starting with ASN1. Most types produce DER-compatible encodings. Currently the following implementations are associated with the following ASN.1 types:

All these objects provide implementations of Object.equals() and Object.hashCode() .

Only the object names starting with DER or BER provide constructors, and these are pretty much what you would expect. The constructors for classes corresponding to the bit string types can take byte arrays, with DERBitString also being able to take the number of pad bits. There are constructors for the classes corresponding to the character string types that can take String objects, except for DERUniversalString, which will only take a byte array, because Java is not able to represent 32-bit characters directly. There are constructors for the time-based classes that take Date objects. A DERObjectIdentifier can be constructed from a String representation of an OBJECT IDENTIFIER; the constructor for DERBoolean simply takes true or false. DERInteger and DEREnumerated both take int and BigInteger. The constructors for the tagged object types take a flag specifying whether the tagging is explicit or not, a tag number, and the object to be tagged.

Finally, the BER and DER types supporting SEQUENCE and SET can be either constructed as structures that have a single object inside them by using a constructor that takes a single encodable object or as structures that contain multiple objects by using an ASN.1 API equivalent to Vector or ArrayList, the ASN1EncodableVector class.

You will also notice that the base classes for each type usually have two getInstance() methods on them. The first simply takes an object and returns it as whatever the base type is. This is a convenience method that lets you avoid the casting and other conversions that can be necessary when manipulating ASN.1 objects. The other method takes an ASN1TaggedObject and a boolean argument, the purpose of which is to say whether the tagging present on the ASN1TaggedObject is explicit, in which case the boolean is true, or implicit, in which case the boolean is false. It is this getInstance() pattern that is used throughout the ASN.1 API to deal with the complications that might arise due to implicit tagging with the various types.

Dealing with Tagging

The main issue with tagging, as you saw earlier, is that if an object is implicitly tagged, the actual tag the object is meant to have is implicit from the context in which its encoding appears. For the most part, without knowledge of what has been encoded originally, the best you can tell from the actual encoded information is that you have a tag value associated with a bunch of bytes. The exception to this is if the tag is marked as constructed, but even then the best you can do is tell you have a bunch of objects, or possibly just one object, which may in fact be inside an explicitly tagged SET or SEQUENCE.

From the point of view of programming this, a few simple examples will help. For example, say you created a BIT STRING from a byte array representing a string of bits whose length was a multiple of 8, and then wanted to create an implicitly tagged version of it with the tag value 1. You could do this with the following code:

DERBitString bits = new DERBitString(byteArray, 0);
ASN1TaggedObject taggedBits = new DERTaggedObject(false, 1, bits);

To recover bits from the tagged object you would then use the following:

bits = DERBitString.getInstance(taggedBits, false);

Likewise if bits was actually wrapped in a SEQUENCE because you were trying to encode the following structure:

WrappedBits ::= SEQUENCE { bits BIT STRING }

and you were trying to implicitly tag the outer sequence as well, with a tag value of 1, you might end up with the following fragment instead:

DERBitString bits = new DERBitString(byteArray, 0);
ASN1Sequence wBits = new DERSequence(bits);
ASN1TaggedObject taggedWBits = new DERTaggedObject(false, 1, wBits);

and this time recovering it from the tagged object would be as follows:

wBits = ASN1Sequence.getInstance(taggedWBits, false);
bits = (DERBitString)wBits.getObjectAt(0);

As I mentioned earlier, it is the use of false that tells the API that it is dealing with an implicitly tagged object. If, instead, the tagging used in the code fragments was explicit, you would replace every occurrence of false with true.

So that covers the basics. Before you go on to look at some real examples, you'll try creating an ASN.1 structure of your own.

Defining Your Own Objects

Imagine you are trying to create a Java object implementing the following ASN.1 structure using DER encoding:

MyStructure ::= SEQUENCE {
    version INTEGER DEFAULT 0,
    created GeneralizedTime,
    baseData OCTET STRING,
    extraData [0] UTF8String OPTIONAL,
    commentData [1] UTF8String OPTIONAL
}

The first thing you can notice is the presence of tagging. In this case the tagging is not specified in the actual structure, so you need to know what tagging environment you are in. You therefore look at the DEFINITIONS block in the module the structure appears in and see this:

DEFINITIONS IMPLICIT TAGS ::=

So you know tags have to be handled implicitly.

The other thing you can notice is the use of a DEFAULT. As mentioned earlier, in a DER-encoded object, a field that's set to its default value must be left out of the encoding. In a BER-encoded object it may or may not be present. In this case, you are using DER, so you must ensure that you do not include the version field in the encoding if it is set to its default value. In general, though, it is a good idea to adopt the following policy for BER as well.

Important  

A field set to its specified default is left out of the encoding.

The final thing to note is the use of OPTIONAL. You have to take into account that one of, or both of, extraData and commentData might not be present in the encoding, and that, while they have the same base types, the tagging is used to distinguish them.

So now that you know what you are up against, take a look at the code.

Try It Out: Implementing an ASN.1-Based Java Object

This example is quite large, so I'll go through it in stages. The first step is to provide the necessary imports and to extend the general-purpose class that the API uses for creating ASN.1 objects: org.bouncycastle.asn1.ASN1Encodable. So here is the basic class header:

package chapter5;

import java.util.Date;

import org.bouncycastle.asn1.*;

/**
 * Implementation of an example ASN.1 structure.
 *
 * MyStructure ::= SEQUENCE {
 *     version INTEGER DEFAULT 0,
 *     created GeneralizedTime,
 *     baseData OCTET STRING,
 *     extraData [0] UTF8String OPTIONAL,
 *     commentData [1] UTF8String OPTIONAL }
 *
 */
public class MyStructure extends ASN1Encodable {
    private DERInteger version;
    private DERGeneralizedTime created;
    private ASN1OctetString baseData;
    private DERUTF8String extraData = null;
    private DERUTF8String commentData = null;

Now you need at least one constructor that will allow you to create the object from an ASN1Sequence that you might have just read from an ASN1InputStream. Note that doing this is a little involved, because any one of three fields in the actual sequence may be missing from the encoding. If the version field was its default value, it will be left out, and the extraData and commentData fields are optional, so they may be missing as well. What follows is a simple way of dealing with the optional fields; in some circumstances, you might want to include code that confirms the order the optional fields appear in the sequence as well, rather than simply recognizing them.

    /**
     * Constructor from an ASN.1 SEQUENCE
     */
    public MyStructure(ASN1Sequence seq) {
        int index = 0;

        // check for version field: it is only present if it is not the default
        if (seq.getObjectAt(0) instanceof DERInteger) {
            this.version = (DERInteger)seq.getObjectAt(0);
            index++;
        } else {
            this.version = new DERInteger(0);
        }

        this.created = (DERGeneralizedTime)seq.getObjectAt(index++);
        this.baseData = (ASN1OctetString)seq.getObjectAt(index++);

        // check for optional fields, using the tag number to tell them apart
        for (int i = index; i != seq.size(); i++) {
            ASN1TaggedObject t = (ASN1TaggedObject)seq.getObjectAt(i);

            switch (t.getTagNo()) {
            case 0:
                extraData = DERUTF8String.getInstance(t, false);
                break;
            case 1:
                commentData = DERUTF8String.getInstance(t, false);
                break;
            default:
                throw new IllegalArgumentException(
                    "Unknown tag " + t.getTagNo() + " in constructor");
            }
        }
    }

Having written a constructor to get you from the ASN.1 world to the Java world, you now also need a constructor to get you from the Java world into a form where you can produce an ASN.1 binary encoding. The following is just a basic one, as you can imagine you might write convenience constructors depending on how you, or your fellow developers, were likely to use the class. As you might also imagine, this constructor is simpler than the one that builds from an ASN1Sequence, although you are still checking for the presence, or not, of optional fields.

Note that you are also creating the internal objects so they will encode in DER format, regardless of whether the object is written to a DEROutputStream or an ASN1OutputStream.

    /**
     * Constructor from corresponding Java objects and primitives.
     */
    public MyStructure(
        int version,
        Date created,
        byte[] baseData,
        String extraData,
        String commentData) {
        this.version = new DERInteger(version);
        this.created = new DERGeneralizedTime(created);
        this.baseData = new DEROctetString(baseData);

        if (extraData != null) {
            this.extraData = new DERUTF8String(extraData);
        }

        if (commentData != null) {
            this.commentData = new DERUTF8String(commentData);
        }
    }

I have skipped the get() methods altogether, as they are obvious. The last fragment you need to look at is the implementation of the abstract ASN1Encodable.toASN1Object() method and the end of the class. When ASN1OutputStream.writeObject() is called, the toASN1Object() method is invoked to produce an object that can then be encoded on the stream. You should note that while the code again checks for the presence of the optional fields before adding them to the ASN1EncodableVector, which will be used to create the DERSequence object, you must also check the value of the version field and only include it if it is not its default value.

    /*
     * Produce an object suitable for writing to an ASN1/DEROutputStream
     */
    public DERObject toASN1Object() {
        ASN1EncodableVector v = new ASN1EncodableVector();

        if (version.getValue().intValue() != 0) {
            v.add(version);
        }

        v.add(created);
        v.add(baseData);

        if (extraData != null) {
            v.add(new DERTaggedObject(false, 0, extraData));
        }

        if (commentData != null) {
            v.add(new DERTaggedObject(false, 1, commentData));
        }

        return new DERSequence(v);
    }
}

Now you can try a simple test class. The test class dumps the ASN.1 binary encoding that is generated for each version of MyStructure you create to the screen as hex. It does this by using the convenience method ASN1Encodable.getEncoded() that MyStructure inherits to generate the byte encoding. See what you get.

package chapter5;

import java.util.Date;

/**
 * Test for MyStructure
 */
public class MyStructureTest {
    public static void main(String[] args) throws Exception {
        byte[] baseData = new byte[5];
        Date created = new Date(0); // 1/1/1970

        MyStructure structure = new MyStructure(
            0, created, baseData, null, null);

        System.out.println(Utils.toHex(structure.getEncoded()));
        if (!structure.equals(structure.toASN1Object())) {
            System.out.println("comparison failed.");
        }

        structure = new MyStructure(0, created, baseData, "hello", null);

        System.out.println(Utils.toHex(structure.getEncoded()));
        if (!structure.equals(structure.toASN1Object())) {
            System.out.println("comparison failed.");
        }

        structure = new MyStructure(0, created, baseData, null, "world");

        System.out.println(Utils.toHex(structure.getEncoded()));
        if (!structure.equals(structure.toASN1Object())) {
            System.out.println("comparison failed.");
        }

        structure = new MyStructure(0, created, baseData, "hello", "world");

        System.out.println(Utils.toHex(structure.getEncoded()));
        if (!structure.equals(structure.toASN1Object())) {
            System.out.println("comparison failed.");
        }

        structure = new MyStructure(1, created, baseData, null, null);

        System.out.println(Utils.toHex(structure.getEncoded()));
        if (!structure.equals(structure.toASN1Object())) {
            System.out.println("comparison failed.");
        }
    }
}

And here is the output:

3018 180f31393730303130313030303030305a 04050000000000
301f 180f31393730303130313030303030305a 04050000000000 800568656c6c6f
301f 180f31393730303130313030303030305a 04050000000000 8105776f726c64
3026 180f31393730303130313030303030305a 04050000000000 800568656c6c6f 8105776f726c64
301b 020101 180f31393730303130313030303030305a 04050000000000

In the listing above, a space separates each object's encoding from the next so that it is easier to see where the encoding of one object starts and finishes. You will look at this in more depth in the following section.

How It Works

There have been a few steps to make all this happen. The first is that you have extended ASN1Encodable to create an object suitable for passing to an ASN1OutputStream, and you have implemented the toASN1Object() method to construct a DERSequence object that contains the primitive types you want to encode. The second is that you have written a constructor that allows you to take the types that you normally use in Java programming and convert them into their ASN.1 counterparts. Likewise, you provided a constructor to get you from the ASN.1 view of the MyStructure object, where it exists only as a sequence, back to the Java viewpoint. The only things missing are the get() methods, and in this case you would just add whatever was appropriate to the application you were trying to develop.

Having a closer look at the output of the example also gives you more of an insight into what is going on when the encoding is being generated.

Starting at the beginning, you can see the tag for a SEQUENCE (0x10) ORed together with the value indicating a constructed type, 0x20 (bit 6), giving you the value 0x30. The next byte is the length byte, and after that the values making up the internals of the SEQUENCE start to appear. Looking at the first four lines, you can see that the first encoding appearing in the sequence is that of a GeneralizedTime, which starts with a tag of 0x18. This has happened because, in the first four cases, the version field has its default value so is left out. On the fifth line, you can see the first object in the sequence is an INTEGER, tag 0x02, and this has happened because the version field is now 1, a value different from its default of 0.

As you would expect, the GeneralizedTime is always present, and you also see the OCTET STRING (tag 0x04) making up the baseData field. Then you come to the UTF8Strings, which would normally start with a tag of 0x0c. However, because of implicit tagging, it starts with 0x80 in the case of the extraData field, which has a tag value of 0, and 0x81 in the case of the commentData field, which has a tag value of 1. The easiest place to see this is to look at the difference at the end of the encoding on lines 2 and 3 where, on line 2 the commentData field is absent, and on line 3 the extraData field is absent.
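You can check this reading of the output with a few lines of plain Java that walk the tag-length-value triples inside the SEQUENCE and name each field from its tag byte. This is only a sketch: the class and method names are made up, it knows nothing beyond the five tags used by MyStructure, and it assumes every length fits in a single octet, which is true for the encodings shown above.

```java
import java.util.ArrayList;
import java.util.List;

final class MyStructureWalker {
    static byte[] fromHex(String hex) {
        byte[] data = new byte[hex.length() / 2];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return data;
    }

    // Walk the contents of the outer SEQUENCE and name each field by its tag.
    static List<String> fields(byte[] encoding) {
        if ((encoding[0] & 0xff) != 0x30) {
            throw new IllegalArgumentException("not a SEQUENCE");
        }
        int end = 2 + (encoding[1] & 0xff);   // short-form length assumed
        List<String> names = new ArrayList<>();
        int pos = 2;
        while (pos < end) {
            int tag = encoding[pos] & 0xff;
            int len = encoding[pos + 1] & 0xff;
            switch (tag) {
                case 0x02: names.add("version"); break;      // INTEGER
                case 0x18: names.add("created"); break;      // GeneralizedTime
                case 0x04: names.add("baseData"); break;     // OCTET STRING
                case 0x80: names.add("extraData"); break;    // [0] IMPLICIT
                case 0x81: names.add("commentData"); break;  // [1] IMPLICIT
                default:
                    throw new IllegalArgumentException(
                        "unknown tag 0x" + Integer.toHexString(tag));
            }
            pos += 2 + len;                   // skip tag, length, and body
        }
        return names;
    }

    public static void main(String[] args) {
        // Fourth and fifth lines of the test output, with the spaces removed.
        System.out.println(fields(fromHex(
            "3026180f31393730303130313030303030305a"
            + "04050000000000800568656c6c6f8105776f726c64")));
        System.out.println(fields(fromHex(
            "301b020101180f31393730303130313030303030305a04050000000000")));
    }
}
```

Note that the walker can only name the 0x80 and 0x81 fields because it has been told what they are; nothing in the bytes themselves says they contain UTF8Strings.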

At this point you should have a basic understanding of what is happening when an ASN.1 structure you have seen the definition for is encoded. Before you go on to look at some real-world structures, I will just diverge briefly to mention the classes that can be used to examine structures for which you only have the encoded object.

Analyzing an Unknown Encoded Object

The API also provides a general-purpose class that allows you to get a more or less human-readable dump of an ASN.1-encoded object. You can find it in the package org.bouncycastle.asn1.util; it is called ASN1Dump. It has a single method on it called ASN1Dump.dumpAsString(), which takes a single ASN.1 encodable object and returns the hierarchy it contains as a String object.

There is also an associated utility class called Dump in the same package. It contains a main method, which takes a single argument being a file that you want to run the ASN1Dump class over. Running a command like this:

java org.bouncycastle.asn1.util.Dump id.p12

will dump out every ASN.1 object found in the file id.p12.

Try It Out: Using ASN1Dump

Let's try a simple example that builds on the work you did in the last section and see what ASN1Dump produces:

    package chapter5;

    import java.util.Date;

    import org.bouncycastle.asn1.util.ASN1Dump;

    /**
     * Example for ASN1Dump using MyStructure.
     */
    public class ASN1DumpExample {
        public static void main(String[] args) throws Exception {
            byte[] baseData = new byte[5];
            Date created = new Date(0); // 1/1/1970

            MyStructure structure = new MyStructure(
                                    0, created, baseData, "hello", "world");

            System.out.println(ASN1Dump.dumpAsString(structure));

            structure = new MyStructure(1, created, baseData, "hello", "world");

            System.out.println(ASN1Dump.dumpAsString(structure));
        }
    }

When you run the example, you should expect to see the two structures get dumped out as follows:

    DER Sequence
        GeneralizedTime(19700101000000GMT+00:00)
        DER Octet String[5]
        Tagged [0] IMPLICIT
            UTF8String(hello)
        Tagged [1] IMPLICIT
            UTF8String(world)

    DER Sequence
        Integer(1)
        GeneralizedTime(19700101000000GMT+00:00)
        DER Octet String[5]
        Tagged [0] IMPLICIT
            UTF8String(hello)
        Tagged [1] IMPLICIT
            UTF8String(world)

As you can see, you have two SEQUENCE objects, both of which conform to the structure outlined in the previous section.

How It Works

ASN1Dump takes an object that extends ASN1Encodable and traverses its structure, building up an indented tree view of the internals of the object.

There is one thing a little odd about the output, though. As I have said previously, correctly working out what an implicitly tagged object is requires knowledge of the structure being parsed. In the previous output, the ASN1Dump class has correctly identified what is contained in the tagged objects. What is going on?

The answer, of course, is that it's cheating. As the tagged object has been constructed from the real objects involved, ASN1Dump can tell what type it is. Try adding the following lines to the example and running it again (you will need to import org.bouncycastle.asn1.ASN1InputStream as well):

    ASN1InputStream aIn = new ASN1InputStream(structure.getEncoded());

    System.out.println(ASN1Dump.dumpAsString(aIn.readObject()));

You will see the following extra lines of output:

    DER Sequence
        Integer(1)
        GeneralizedTime(19700101000000GMT+00:00)
        DER Octet String[5]
        Tagged [0] IMPLICIT
            DER Octet String[5]
        Tagged [1] IMPLICIT
            DER Octet String[5]

This makes more sense. As the implicit tagging has overridden the tag for the UTF8String, the best the ASN1Dump class can do is recognize the implicitly tagged objects as being of the type OCTET STRING. Still, the ASN1Dump class is doing the best it can, and it does provide you with a basic analysis tool for dealing with ASN.1-encoded messages. Having come this far, it is time to look at how ASN.1 applies to real-world situations in the JCA and JCE.
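To see why a parser working only from the bytes has so little to go on, it can help to walk an encoding by hand. The following is a rough sketch of a tag and length walker written against the JDK alone. It is not how ASN1Dump is implemented; the class TlvWalkSketch is hypothetical, and it assumes single-octet tags, definite lengths, and well-formed input, with no bounds checking.

```java
/**
 * Sketch of a walker that lists the tag and length of each encoded object.
 * TlvWalkSketch is a hypothetical helper; it assumes well-formed DER input.
 */
final class TlvWalkSketch {
    /** Returns one line per encoded object, indented by nesting depth. */
    static String dump(byte[] der) {
        StringBuilder sb = new StringBuilder();
        walk(der, 0, der.length, 0, sb);
        return sb.toString();
    }

    private static void walk(byte[] buf, int pos, int end, int depth, StringBuilder sb) {
        while (pos < end) {
            int tag = buf[pos++] & 0xff;
            int len = buf[pos++] & 0xff;
            if (len > 0x7f) {
                // long form: the low 7 bits give the number of length octets that follow
                int count = len & 0x7f;
                len = 0;
                for (int i = 0; i < count; i++) {
                    len = (len << 8) | (buf[pos++] & 0xff);
                }
            }
            for (int i = 0; i < depth; i++) {
                sb.append("    ");
            }
            sb.append(String.format("tag 0x%02x, length %d", tag, len)).append('\n');
            if ((tag & 0x20) != 0) {
                // constructed bit set: the contents are themselves encoded objects
                walk(buf, pos, pos + len, depth + 1, sb);
            }
            pos += len;
        }
    }

    public static void main(String[] args) {
        // SEQUENCE { INTEGER 1, [0] IMPLICIT UTF8String "hello" }, encoded by hand
        byte[] sample = {
            0x30, 0x0a,
            0x02, 0x01, 0x01,
            (byte) 0x80, 0x05, 'h', 'e', 'l', 'l', 'o'
        };
        System.out.print(dump(sample));
    }
}
```

The last object in the sample is reported only as tag 0x80 with five bytes of content. Without the ASN.1 definition there is no way to tell that those five bytes were meant to be a UTF8String, which is the same position ASN1Dump found itself in above.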

Using ASN.1 in Java: Some Real Examples

I have already mentioned that objects of the type AlgorithmParameters usually have an ASN.1 equivalent. As it happens, public and private keys that return X.509 and PKCS#8 as their formats also return encodings created using ASN.1 when their Key.getEncoded() method is called. It is also an ASN.1 object that sits inside a PKCS #1 V1.5 signature when it is encrypted using an RSA private key.

Because these are all Java objects you are already familiar with, and you are going to be dealing with more ASN.1-based objects later in this book, now have a look at what goes on with these objects from the point of their ASN.1 binary encoding.

Some Basic ASN.1 Structures

There are a couple of structures that show up frequently enough in the ASN.1 modules associated with cryptography that they deserve mentioning before you start looking at some examples. The first is called AlgorithmIdentifier and originally appeared in X.509; the second is Attribute and originally appeared in the ISO/ITU-T useful definitions module. You will have a look at these common structures first, because they will provide you with some background when it comes to dealing with the examples.

The AlgorithmIdentifier Structure

The AlgorithmIdentifier structure serves simply to hold an object identifier representing a particular algorithm and an optional parameters structure that holds the parameters required. You will encounter a few variations of the structure, as some people define it for themselves , but even the variations usually boil down to this basic ASN.1 structure:

    AlgorithmIdentifier ::= SEQUENCE {
        algorithm OBJECT IDENTIFIER,
        parameters SomeASN1Type OPTIONAL }

Pre-1994 the SomeASN1Type would have been ANY DEFINED BY algorithm. These days, of course, you will see a CLASS definition to show the linkage between the OBJECT IDENTIFIER representing the algorithm and the parameters field's actual type. This tells you that any ASN.1 structure can occupy that field and that the value you find there will depend on the value of the algorithm field.

One further thing you need to know: For historical reasons, the optional parameters field is often set to NULL instead of being left out, so you will often see a standard that specifies that the parameters field must be set to NULL rather than being left out. This happened because when the 1988 syntax for AlgorithmIdentifier was translated to the 1997 one, the OPTIONAL somehow got left out. Although this was later fixed via a defect report, whatever you do, if you are creating an AlgorithmIdentifier and you see parameters are NULL , make sure you include the NULL . Empty is not the same as NULL .
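To make the difference between an absent parameters field and a NULL one concrete, here is a small sketch that assembles both DER encodings of an AlgorithmIdentifier for SHA-256 by hand, using only the JDK. The class AlgIdNullSketch and its helpers are invented for this illustration; they handle only short-form lengths and assume the first OID subidentifier fits in one byte.

```java
import java.io.ByteArrayOutputStream;

/**
 * Sketch comparing an AlgorithmIdentifier with and without a NULL parameters field.
 * AlgIdNullSketch is a hypothetical helper, not part of any library.
 */
final class AlgIdNullSketch {
    /** Wraps the concatenated parts in a tag and a short-form length (under 128 bytes). */
    static byte[] tlv(int tag, byte[]... parts) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            body.write(part, 0, part.length);
        }
        if (body.size() > 127) {
            throw new IllegalArgumentException("only short-form lengths are handled");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(body.size());
        out.write(body.toByteArray(), 0, body.size());
        return out.toByteArray();
    }

    /** Content octets of an OBJECT IDENTIFIER: base-128 digits, high bit set on all but the last. */
    static byte[] oidContent(long... arcs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((int) (arcs[0] * 40 + arcs[1]));  // first two arcs share one subidentifier
        for (int i = 2; i < arcs.length; i++) {
            long v = arcs[i];
            int groups = 1;
            for (long t = v >>> 7; t != 0; t >>>= 7) {
                groups++;
            }
            for (int g = groups - 1; g >= 0; g--) {
                int b = (int) ((v >>> (7 * g)) & 0x7f);
                out.write(g > 0 ? (b | 0x80) : b);
            }
        }
        return out.toByteArray();
    }

    /** AlgorithmIdentifier for SHA-256 (2.16.840.1.101.3.4.2.1), with or without NULL. */
    static byte[] sha256Id(boolean includeNull) {
        byte[] oid = tlv(0x06, oidContent(2, 16, 840, 1, 101, 3, 4, 2, 1));
        return includeNull ? tlv(0x30, oid, tlv(0x05)) : tlv(0x30, oid);
    }

    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("parameters NULL:   " + hex(sha256Id(true)));
        System.out.println("parameters absent: " + hex(sha256Id(false)));
    }
}
```

The two encodings differ by the trailing 05 00 and by the length byte of the outer SEQUENCE, so anything that compares or hashes the encoded AlgorithmIdentifier will treat them as different values.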

The Attribute Structure

The Attribute structure is another general structure that you will see a lot. The following definition is from RFC 3852:

    Attribute ::= SEQUENCE {
        attrType OBJECT IDENTIFIER,
        attrValues SET OF AttributeValue }

    AttributeValue ::= ANY

Astute readers will remember that in the earlier discussion on the basics of ASN.1, support for ANY was withdrawn in 1994. The fact that RFC 3852 was published in 2004, some 10 years later, stands as a testimonial to how much of an upheaval that withdrawal caused. Versions of Attribute based on the use of the CLASS type and parameterization are starting to appear as well; see the ASN.1 module for PKCS #7 V1.6 as an example. However, the definition from RFC 3852 will be sufficient for the purposes here.

As you can see, the Attribute structure is basically a tagged SET , the content of which is determined by the OBJECT IDENTIFIER in the attrType field.

Encoding an IV

In the case of most block ciphers, such as AES in CBC mode, the only parameter value that is likely to be required in the parameters field of an AlgorithmIdentifier representing an encoding is the IV, and for most algorithms, it is defined as:

IvParam ::= OCTET STRING

So how do you convert an IV into its ASN.1 binary encoding using the JCA? Let's take a look at one approach.

Try It Out: Encoding an IV with ASN.1

Here is a basic example that uses an IvParameterSpec and an AlgorithmParameters object to create an ASN.1 binary encoding of an IV. Try running it.

    package chapter5;

    import java.security.AlgorithmParameters;

    import javax.crypto.spec.IvParameterSpec;

    import org.bouncycastle.asn1.ASN1InputStream;
    import org.bouncycastle.asn1.util.ASN1Dump;

    /**
     * Example showing IV encoding
     */
    public class IVExample {
        public static void main(String[] args) throws Exception {
            // set up the parameters object
            AlgorithmParameters params = AlgorithmParameters.getInstance(
                                                               "AES", "BC");
            IvParameterSpec ivSpec = new IvParameterSpec(new byte[16]);

            params.init(ivSpec);

            // look at the ASN.1 encoding.
            ASN1InputStream aIn = new ASN1InputStream(params.getEncoded("ASN.1"));

            System.out.println(ASN1Dump.dumpAsString(aIn.readObject()));
        }
    }

When you run the example, you should see the following output:

DER Octet String[16]

This output conforms to the ASN.1 description you saw for an IV earlier.

How It Works

The AlgorithmParameters class is the key to generating ASN.1 binary encodings for parameters. In this case, you have initialized the params object with an IvParameterSpec containing the IV you want to produce an encoding for. Then you called AlgorithmParameters.getEncoded() , explicitly requesting an ASN.1 binary encoding.

As you have already seen, you can also recover the AlgorithmParameters object for a particular cipher you have just used by calling Cipher.getParameters() , which will return an object that is already initialized. Likewise, this returned object's getEncoded() method will also return the ASN.1 binary encoding for the parameters. You will see an example of this a bit later.
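The same round trip can be sketched without Bouncy Castle. The example below assumes that the default provider in your JDK supplies an AlgorithmParameters implementation for "AES" and that, like the one above, it encodes the IV as a bare OCTET STRING. The class name IvEncodingSketch and its two methods are made up for the illustration.

```java
import java.io.IOException;
import java.security.AlgorithmParameters;
import java.security.GeneralSecurityException;

import javax.crypto.spec.IvParameterSpec;

/**
 * Sketch of encoding an IV and recovering it again with the default provider.
 * IvEncodingSketch is a hypothetical helper written for this illustration.
 */
final class IvEncodingSketch {
    /** Returns the provider's primary (ASN.1) encoding of the IV. */
    static byte[] encodeIv(byte[] iv) {
        try {
            AlgorithmParameters params = AlgorithmParameters.getInstance("AES");
            params.init(new IvParameterSpec(iv));
            return params.getEncoded();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Rebuilds the parameters from their encoding and extracts the IV. */
    static byte[] decodeIv(byte[] encoded) {
        try {
            AlgorithmParameters params = AlgorithmParameters.getInstance("AES");
            params.init(encoded);
            return params.getParameterSpec(IvParameterSpec.class).getIV();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = encodeIv(new byte[16]);

        // for an OCTET STRING you would expect tag 0x04 followed by the length of the IV
        System.out.printf("tag 0x%02x, length %d, total bytes %d%n",
                          encoded[0] & 0xff, encoded[1] & 0xff, encoded.length);
        System.out.println("recovered IV length: " + decodeIv(encoded).length);
    }
}
```

Passing the encoded bytes to AlgorithmParameters.init() is how a receiver would normally rebuild the parameters before handing them to Cipher.init().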

Inside a PKCS #1 V1.5 Signature

I mentioned in the last chapter that PKCS #1 V1.5 signatures also included a structure around the hash, in addition to the padding that was applied. Like the IV parameters, the structure is very simple and is known as a DigestInfo object. It holds details of the message digest algorithm used to create the hash in the signature, as well as the actual bytes making up the hash that was calculated during the signing process. When a signature is sealed with a private key, the hash it contains is exported as the DER encoding of a DigestInfo structure. It is this encoded string of bytes that then has padding applied before encryption with the private key produces the final signature.

Earlier in this chapter, you looked at the DigestInfo structure when I was discussing the use of the ASN.1 CLASS type in the definition of AlgorithmIdentifier. The structure was defined as follows:

    DigestInfo ::= SEQUENCE {
        digestAlgorithm DigestAlgorithm,
        digest OCTET STRING }

    DigestAlgorithm ::= AlgorithmIdentifier { {PKCS1-v1-5DigestAlgorithms} }

As you can see, the DigestAlgorithm type is an AlgorithmIdentifier . There is one twist here, though; the extra bit at the end in the braces, namely:

{ {PKCS1-v1-5DigestAlgorithms} }

tells you something about this particular extension of AlgorithmIdentifier. Its set of possible values comes from another structure called PKCS1-v1-5DigestAlgorithms, which consists of a list of object identifier and parameter pairs that represent the current range of PKCS #1 V1.5 signature types supported by PKCS #1. I won't repeat the possible values here, as you can find them in the original discussion on the CLASS type; however, I will mention that it is the encoded form of one of the possible values listed in PKCS1-v1-5DigestAlgorithms that you will find inside the plaintext of PKCS #1 V1.5 signatures. So much for the background; now you'll see whether that is the case.

Try It Out: Looking Inside a PKCS #1 V1.5 Signature

Try the following example. As you can see, it uses the regular Signature class to create a byte array containing a signature created using SHA-256 and RSA. However, in the verification part, it uses a Cipher to unlock the signature and then parses the structure contained in the decrypted block, checking the section containing the hash that was calculated when the signature was created against one generated for the same data using SHA-256.

    package chapter5;

    import java.security.*;

    import javax.crypto.Cipher;

    import org.bouncycastle.asn1.ASN1InputStream;
    import org.bouncycastle.asn1.ASN1OctetString;
    import org.bouncycastle.asn1.ASN1Sequence;
    import org.bouncycastle.asn1.util.ASN1Dump;

    /**
     * Basic class for exploring PKCS #1 V1.5 Signatures.
     */
    public class PKCS1SigEncodingExample {
        public static void main(String[] args) throws Exception {
            KeyPairGenerator keyGen = KeyPairGenerator.getInstance("RSA", "BC");

            keyGen.initialize(512, new SecureRandom());

            KeyPair keyPair = keyGen.generateKeyPair();
            Signature signature = Signature.getInstance("SHA256withRSA", "BC");

            // generate a signature
            signature.initSign(keyPair.getPrivate());

            byte[] message = new byte[] { (byte)'a', (byte)'b', (byte)'c' };

            signature.update(message);

            byte[] sigBytes = signature.sign();

            // verify hash in signature
            Cipher cipher = Cipher.getInstance("RSA/None/PKCS1Padding", "BC");

            cipher.init(Cipher.DECRYPT_MODE, keyPair.getPublic());

            byte[] decSig = cipher.doFinal(sigBytes);

            // parse the signature
            ASN1InputStream aIn = new ASN1InputStream(decSig);
            ASN1Sequence seq = (ASN1Sequence)aIn.readObject();

            System.out.println(ASN1Dump.dumpAsString(seq));

            // grab a digest of the correct type
            MessageDigest hash = MessageDigest.getInstance("SHA-256", "BC");

            hash.update(message);

            ASN1OctetString sigHash = (ASN1OctetString)seq.getObjectAt(1);

            if (MessageDigest.isEqual(hash.digest(), sigHash.getOctets())) {
                System.out.println("hash verification succeeded");
            } else {
                System.out.println("hash verification failed");
            }
        }
    }

Running the example, you should see the following:

    DER Sequence
        DER Sequence
            ObjectIdentifier(2.16.840.1.101.3.4.2.1)
            NULL
        DER Octet String[32]

    hash verification succeeded

So you found the correct hash value and the signature hash verified as expected. Looking at the dump, you can see the OBJECT IDENTIFIER value for SHA-256 and then a 32-byte OCTET STRING, which is the actual SHA-256 hash that was calculated.

How It Works

As I mentioned earlier, the byte array that gets padded during signature calculation is actually a DER encoding of a DigestInfo object. The signature is then a representation of the padded DER encoding that has been encrypted with the private key. Decrypting the signature with the public key gives you back this DER-encoded stream, and if you look at the section of the output representing the ASN.1 dump, you can see that stream produces at the top level a single sequence that contains two objects, the first of which is itself a sequence.

If you look back at the definition for an AlgorithmIdentifier , you can see that the OBJECT IDENTIFIER giving the value for algorithm field is 2.16.840.1.101.3.4.2.1. If you look it up in a registry of identifiers, you will discover this is for the algorithm SHA-256. The other thing you can tell from the dump is that the parameters field in the AlgorithmIdentifier is the value NULL , which you would expect, as the digest does not require any input other than the data it is supposed to verify. Of course, it could have been left out altogether because the field is optional, but this is another one of those moments where you have to allow for history.
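Because everything in a SHA-256 DigestInfo other than the hash itself is fixed, its DER encoding is just a constant 19-byte header followed by the 32 bytes of the digest. The sketch below builds that encoding by hand and compares it with what comes out of a decrypted signature. It assumes the default JDK providers, in particular that the provider's "RSA/ECB/PKCS1Padding" cipher accepts a public key in decrypt mode and that its SHA256withRSA signature includes the NULL parameters. DigestInfoSketch and its method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.Signature;
import java.util.Arrays;

import javax.crypto.Cipher;

/**
 * Sketch comparing a hand-built DigestInfo with the one recovered from a signature.
 * DigestInfoSketch is a hypothetical helper written for this illustration.
 */
final class DigestInfoSketch {
    // SEQUENCE { SEQUENCE { OID 2.16.840.1.101.3.4.2.1, NULL }, OCTET STRING of 32 bytes }
    static final byte[] SHA256_HEADER = {
        0x30, 0x31,
        0x30, 0x0d,
        0x06, 0x09, 0x60, (byte) 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x01,
        0x05, 0x00,
        0x04, 0x20
    };

    /** Builds the DigestInfo encoding you would expect to find inside the signature. */
    static byte[] expectedDigestInfo(byte[] message) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(message);
            byte[] out = Arrays.copyOf(SHA256_HEADER, SHA256_HEADER.length + hash.length);
            System.arraycopy(hash, 0, out, SHA256_HEADER.length, hash.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signs the message, then strips the padding using the public key. */
    static byte[] recoveredDigestInfo(KeyPair pair, byte[] message) {
        try {
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initSign(pair.getPrivate());
            signature.update(message);
            byte[] sigBytes = signature.sign();

            Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            cipher.init(Cipher.DECRYPT_MODE, pair.getPublic());
            return cipher.doFinal(sigBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static KeyPair newRsaPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] message = { 'a', 'b', 'c' };
        byte[] recovered = recoveredDigestInfo(newRsaPair(), message);

        System.out.println("DigestInfo matches: "
                + Arrays.equals(expectedDigestInfo(message), recovered));
    }
}
```

If a signer left the NULL out of the AlgorithmIdentifier, the header would be two bytes shorter and a byte-for-byte comparison like this one would fail, which is one practical reason the NULL matters.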

Encoding PSS Signature Parameters

Although PSS signatures themselves do not contain an ASN.1 structure, they do have algorithm parameters. PSS algorithm parameters are interesting to look at, because it is possible for every field in their corresponding ASN.1 structure to be set to its default value. The structure used for PSS parameters is defined in PKCS #1 and, after some simplifying, looks something like this:

    RSASSA-PSS-params ::= SEQUENCE {
        hashAlgorithm [0] HashAlgorithm DEFAULT sha1,
        maskGenAlgorithm [1] MaskGenAlgorithm DEFAULT mgf1SHA1,
        saltLength [2] INTEGER DEFAULT 20,
        trailerField [3] TrailerField DEFAULT trailerFieldBC }

    HashAlgorithm ::= AlgorithmIdentifier

    MaskGenAlgorithm ::= AlgorithmIdentifier

    TrailerField ::= INTEGER

I won't go into the specifics of the default values here, but they are also the ones represented by PSSParameterSpec.DEFAULT , so you would expect that creating a PSS signature with the default parameter set and retrieving its AlgorithmParameters would produce an empty SEQUENCE in its ASN.1 binary encoding.

Try It Out: Encoding PSS Parameters

Here is a simple example using parameters with the Signature class that allows you to have a look at the encodings being produced. Strictly speaking, the setting of the default parameters is not necessary, as the Signature.getInstance() method returns a Signature object that already has parameters set. However, it does show how the use of algorithm parameters is different with the Signature class compared to the Cipher class. Have a look at the example, run it, and see what it produces.

    package chapter5;

    import java.security.AlgorithmParameters;
    import java.security.Signature;
    import java.security.spec.PSSParameterSpec;

    import org.bouncycastle.asn1.ASN1InputStream;
    import org.bouncycastle.asn1.util.ASN1Dump;

    /**
     * Example showing PSS parameter recovery and encoding
     */
    public class PSSParamExample {
        public static void main(String[] args) throws Exception {
            Signature signature = Signature.getInstance("SHA1withRSAandMGF1", "BC");

            // set the default parameters
            signature.setParameter(PSSParameterSpec.DEFAULT);

            // get the default parameters
            AlgorithmParameters params = signature.getParameters();

            // look at the ASN.1 encoding.
            ASN1InputStream aIn = new ASN1InputStream(params.getEncoded("ASN.1"));

            System.out.println(ASN1Dump.dumpAsString(aIn.readObject()));
        }
    }

Running the example produces the following output:

DER Sequence

indicating you have an empty sequence.

How It Works

The call to Signature.getParameters() returns an AlgorithmParameters object that is set to contain the default parameters for creating a PSS signature. As they are default parameters, you would expect none of the fields in the RSASSA-PSS-params to be included in the encoding, and consequently the SEQUENCE will be empty, which is what the output indicates.

An interesting thing to do here is to change one of the parameters and see how the encoding changes. In Chapter 4 I mentioned that PSS signatures can be created with a zero salt size . You can configure the Signature object used in the example by changing the call to setParameter() from

signature.setParameter(PSSParameterSpec.DEFAULT);

to

signature.setParameter(new PSSParameterSpec(0));

You should see the following output:

    DER Sequence
        Tagged [2]
            Integer(0)

This output indicates that the sequence is no longer empty. It is now carrying a 0 value for the saltLength field, which has a default value of 20.
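The rule at work here is that DER leaves out any field whose value equals its DEFAULT. The following sketch encodes RSASSA-PSS-params by hand for the one case discussed above, where only saltLength may differ from its default. It assumes the explicit tagging shown in the dump, so the INTEGER is wrapped in a constructed context-specific tag. PssSaltEncodingSketch is a made-up helper and is not how any provider implements the encoding.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigInteger;

/**
 * Sketch of DER's "omit defaults" rule applied to the saltLength field.
 * PssSaltEncodingSketch is a hypothetical helper written for this illustration.
 */
final class PssSaltEncodingSketch {
    static final int DEFAULT_SALT_LENGTH = 20;

    /** Wraps content in a tag and a short-form length (content under 128 bytes). */
    static byte[] tlv(int tag, byte[] content) {
        if (content.length > 127) {
            throw new IllegalArgumentException("only short-form lengths are handled");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        out.write(content.length);
        out.write(content, 0, content.length);
        return out.toByteArray();
    }

    /** Encodes the parameters assuming every field except saltLength has its default value. */
    static byte[] encode(int saltLength) {
        byte[] body = new byte[0];
        if (saltLength != DEFAULT_SALT_LENGTH) {
            // INTEGER content is the minimal two's complement form of the value
            byte[] integer = tlv(0x02, BigInteger.valueOf(saltLength).toByteArray());
            // [2] EXPLICIT: context-specific class, constructed, tag number 2
            body = tlv(0xa2, integer);
        }
        return tlv(0x30, body);
    }

    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("salt length 20: " + hex(encode(20)));
        System.out.println("salt length 0:  " + hex(encode(0)));
    }
}
```

With the default salt length the result is the two bytes 30 00, an empty SEQUENCE. With a salt length of 0 it becomes 30 05 a2 03 02 01 00, which corresponds to the Tagged [2] Integer(0) entry in the dump.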

Encoding Public and Private Keys

I have already touched on the fact that the encoded forms of public and private keys contain a considerable amount of structure in them, and as it happens, the language used for describing these structures is ASN.1. Looking at it in the same manner as with other Java specification objects, an encoded form of a key is simply a value object, so as you would imagine, the JCA provides wrapping objects that can wrap the encoded forms and then be used to convert them back into keys using the KeyFactory class.

Because there are different ASN.1 structures built for handling public and private keys, the JCA provides two classes for wrapping key encodings. The first one you will look at, used for wrapping public key encodings, is the X509EncodedKeySpec. The second one, used for wrapping private key encodings, is the PKCS8EncodedKeySpec .

The X509EncodedKeySpec Class

The java.security.spec.X509EncodedKeySpec class takes its name from the origins of the structure used to wrap public keys. It has a single constructor on it that takes a byte array that should contain a DER encoding of the structure that appears in the key block of an X.509 certificate. This is the encoding that should be returned by a public key that has a return value for Key.getFormat() of X.509.

The structure for representing a public key was also defined in X.509 and is named SubjectPublicKeyInfo . In ASN.1 it appears as follows:

    SubjectPublicKeyInfo ::= SEQUENCE {
        algorithm AlgorithmIdentifier,
        subjectPublicKey BIT STRING }

You can see it has two elements: an AlgorithmIdentifier , which in this case is used to signify what algorithm the key is for, and then a BIT STRING , which is used to store an encoding of the key material.

The reason for the BIT STRING is that, as you have seen already, asymmetric keys generally require different parameters depending on the algorithm, and as it should be possible to use the same basic structure to wrap anything from elliptic curve to RSA keys, a BIT STRING was settled on as the most general object to use. So, while the examples that follow will deal only with RSA keys, the principles can be applied to the encoding of any public key. It is just a matter of knowing what underlying structure goes in the subjectPublicKey field.

So, how do you deal with RSA public keys? As it turns out, X.509 also defined a structure for RSA public keys, which is also included in PKCS #1. The ASN.1 definition is

    RSAPublicKey ::= SEQUENCE {
        modulus INTEGER,
        publicExponent INTEGER }

and the subjectPublicKey field in a SubjectPublicKeyInfo structure is simply a BIT STRING wrapping a DER encoding of the previous structure.

This leaves you with the algorithm field. You need a specific AlgorithmIdentifier to indicate that the content of the subjectPublicKey field is an RSA public key. In the case of RSA, the OBJECT IDENTIFIER you use in the algorithm field of the AlgorithmIdentifier is

rsaEncryption OBJECT IDENTIFIER ::= { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-1(1) 1 }

and the parameters are defined as having the value NULL . (Remember, this is very different from empty; this means the parameters field is expected to have the NULL value in it.) Using the Bouncy Castle APIs, the Java equivalent for the AlgorithmIdentifier would be

    AlgorithmIdentifier rsaKey = new AlgorithmIdentifier(
                     PKCSObjectIdentifiers.rsaEncryption, new DERNull());

So when Key.getEncoded() is called on an RSA public key, the key material is encoded into the ASN.1 RSAPublicKey structure, the DER encoding of the structure is then converted into a BIT STRING, and the resulting BIT STRING and the RSA-specific AlgorithmIdentifier are then used to assemble the SubjectPublicKeyInfo object. The SubjectPublicKeyInfo object is then written out as a DER-encoded stream and returned in a byte array.

So having retrieved the byte array from Key.getEncoded() , how do you convert it back into a public key? You will find out now.

Try It Out: Using the X509EncodedKeySpec

Try the following example. It generates a small RSA key pair and then takes the output of Key.getEncoded() for the public key and uses it to create an X509EncodedKeySpec , which is then used to regenerate the original key. Along the way, it uses Bouncy Castle's ASN.1 package to dump out the underlying structure, so you can compare what gets generated to the previous discussion.

    package chapter5;

    import java.security.*;
    import java.security.spec.X509EncodedKeySpec;

    import org.bouncycastle.asn1.ASN1InputStream;
    import org.bouncycastle.asn1.util.ASN1Dump;
    import org.bouncycastle.asn1.x509.SubjectPublicKeyInfo;

    /**
     * Simple example showing use of X509EncodedKeySpec
     */
    public class X509EncodedKeySpecExample {
        public static void main(String[] args) throws Exception {
            // create the keys
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA", "BC");

            generator.initialize(128, Utils.createFixedRandom());

            KeyPair pair = generator.generateKeyPair();

            // dump public key
            ASN1InputStream aIn = new ASN1InputStream(
                                         pair.getPublic().getEncoded());
            SubjectPublicKeyInfo info = SubjectPublicKeyInfo.getInstance(
                                                       aIn.readObject());

            System.out.println(ASN1Dump.dumpAsString(info));
            System.out.println(ASN1Dump.dumpAsString(info.getPublicKey()));

            // create from specification
            X509EncodedKeySpec x509Spec = new X509EncodedKeySpec(
                                         pair.getPublic().getEncoded());
            KeyFactory keyFact = KeyFactory.getInstance("RSA", "BC");
            PublicKey pubKey = keyFact.generatePublic(x509Spec);

            if (pubKey.equals(pair.getPublic())) {
                System.out.println("key recovery successful");
            } else {
                System.out.println("key recovery failed");
            }
        }
    }

You should see something like the following output:

    DER Sequence
        DER Sequence
            ObjectIdentifier(1.2.840.113549.1.1.1)
            NULL
        DER Bit String[26, 0]

    DER Sequence
        Integer(193768625448396182147878757503948840199)
        Integer(65537)

    key recovery successful

As you can see from the ASN.1 dump information, the SubjectPublicKeyInfo structure (the first sequence) and the ASN.1 RSAPublicKey structure (the second sequence) are what we expected. Recreating the key has worked as well.

How It Works

You used the org.bouncycastle.asn1.x509.SubjectPublicKeyInfo object in the example to make life easier for yourself in reconstructing the encoded public key so you can print its contents. Like any class in the org.bouncycastle.asn1.x509 package, it is simply a value object. The main reason you use it here is that it has a convenient method called SubjectPublicKeyInfo.getPublicKey() that reconstructs the ASN.1 object encoded in the subjectPublicKey field of the ASN.1 structure.

The next step in the example is the creation of the X509EncodedKeySpec from the same encoded information that you just dumped. As you did with the other key specification objects for asymmetric algorithms, you simply create a KeyFactory of the right type and call the KeyFactory.generatePublic() method to create the public key.

The string representation of the object identifier can also be used to create the KeyFactory. For example, rather than creating the KeyFactory with the line:

KeyFactory keyFact = KeyFactory.getInstance("RSA", "BC");

you could have instead written

KeyFactory keyFact = KeyFactory.getInstance( info.getAlgorithmId().getObjectId().getId(), "BC");

This can be very useful if you do not know the name of the algorithm beforehand, or in situations where you may have a variety of different types of encoded key types and you are trying to minimize the amount of code required to handle them.
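When you already hold a key object rather than just its encoding, the name returned by Key.getAlgorithm() serves the same purpose as the object identifier. The sketch below shows the round trip using only the JDK's default providers. X509RoundTripSketch and its methods are invented for the illustration, and the 2,048-bit key size is an arbitrary choice.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;

/**
 * Sketch of recovering a public key from its X.509 encoding with the default provider.
 * X509RoundTripSketch is a hypothetical helper written for this illustration.
 */
final class X509RoundTripSketch {
    /** Rebuilds a public key from its SubjectPublicKeyInfo encoding. */
    static PublicKey roundTrip(PublicKey original) {
        try {
            X509EncodedKeySpec spec = new X509EncodedKeySpec(original.getEncoded());
            // the algorithm name is taken from the key rather than hard-coded
            KeyFactory keyFact = KeyFactory.getInstance(original.getAlgorithm());
            return keyFact.generatePublic(spec);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static KeyPair newRsaPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PublicKey pub = newRsaPair().getPublic();

        System.out.println("format: " + pub.getFormat());
        // the outer SubjectPublicKeyInfo is a SEQUENCE, so the first byte should be 0x30
        System.out.printf("first byte: 0x%02x%n", pub.getEncoded()[0] & 0xff);
        System.out.println("recovered key equal: " + roundTrip(pub).equals(pub));
    }
}
```

Because the encoding is a standard SubjectPublicKeyInfo structure, bytes produced by one provider should be readable by another provider that supports the same algorithm.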

The PKCS8EncodedKeySpec Class

Like the X509EncodedKeySpec class, the java.security.spec.PKCS8EncodedKeySpec class also takes its name from the standard responsible for it. RSA Security's PKCS #8 entitled "Private Key Information Syntax" deals with encoding private keys, both with and without encryption, and the PKCS8EncodedKeySpec is designed to deal with private keys that have not been encrypted.

PrivateKeyInfo is the name of the structure for dealing with encoded private keys that have not been encrypted. Calling Key.getEncoded() on a private key whose Key.getFormat() method returns PKCS#8 will return the DER encoding of a PrivateKeyInfo structure representing the material required to construct the private key.

The full definition of PrivateKeyInfo , defined in PKCS #8, reads as follows:

    PrivateKeyInfo ::= SEQUENCE {
        version Version,
        privateKeyAlgorithm PrivateKeyAlgorithmIdentifier,
        privateKey PrivateKey,
        attributes [0] IMPLICIT Attributes OPTIONAL }

    Version ::= INTEGER {v1(0)}

    PrivateKeyAlgorithmIdentifier ::= AlgorithmIdentifier

    PrivateKey ::= OCTET STRING

    Attributes ::= SET OF Attribute

A look through the ASN.1 tells you a few things: The version number is currently always zero; the type PrivateKeyAlgorithmIdentifier is a renaming of AlgorithmIdentifier , which in this case is a similar structure to the one you are used to; and the privateKey field contains an OCTET STRING .

Note that the attributes field is implicitly tagged with the value zero. Assuming you had an ASN1Sequence object, called seq , that represented a PrivateKeyInfo object, you would need to take the implicit tagging into account, and the optional nature of the field with some code like this:

    if (seq.size() == 4) {
        attributes = ASN1Set.getInstance(
                         (ASN1TaggedObject)seq.getObjectAt(3), false);
    }

First you check if the attributes field is present. If it is, you recover the ASN1Set that the attributes field represents by using the static ASN1Set.getInstance() method for tagged objects and set the second argument, which determines the type of tagging to use, to false . The false parameter indicates implicit tagging. As I mentioned before, this is important because if the attributes field has only one element in its SET , attempting to parse the field assuming explicit tagging will make the field appear to be an explicitly tagged version of the structure represented by the element. The fact it is meant to be a SET containing that element will disappear.

So take a look at what happens when you try to wrap an RSA private key in a PrivateKeyInfo structure. First, the privateKeyAlgorithm field holds the same contents as the algorithm field in the SubjectPublicKeyInfo structure for an RSA public key: you use the OBJECT IDENTIFIER rsaEncryption with the parameters field set to NULL. As with the RSA public key, there needs to be another structure to provide a DER-encoded object to go in the OCTET STRING represented by the privateKey field of the PrivateKeyInfo structure. In the case of an RSA private key, PKCS #1 defines a structure called RSAPrivateKey, which looks as follows:

    RSAPrivateKey ::= SEQUENCE {
        version Version,
        modulus INTEGER,
        publicExponent INTEGER,
        privateExponent INTEGER,
        prime1 INTEGER,
        prime2 INTEGER,
        exponent1 INTEGER,
        exponent2 INTEGER,
        coefficient INTEGER,
        otherPrimeInfos OtherPrimeInfos OPTIONAL }

    Version ::= INTEGER { two-prime(0), multi(1) }
        (CONSTRAINED BY {-- version must be multi if otherPrimeInfos present --})

    OtherPrimeInfos ::= SEQUENCE SIZE(1..MAX) OF OtherPrimeInfo

    OtherPrimeInfo ::= SEQUENCE {
        prime INTEGER,
        exponent INTEGER,
        coefficient INTEGER }

The structure is pretty close to what you would expect. You have a collection of INTEGER objects representing a regular RSA private key that uses Chinese Remainder Theorem and an optional extra field for a sequence of extra values on the end in case the key is a multi-prime one. The value MAX is application-dependent, but you can expect in most provider implementations it will be big enough for the number of OtherPrimeInfo structures held by any encoded keys you are using.

More formally, a look at the comment in the constraints on the Version value tells us that a call to Key.getEncoded() on a Java object implementing RSAPrivateCrtKey should produce a two-prime version with the version field set to 0. An RSAMultiPrimePrivateCrtKey, on the other hand, will produce a multi version with the version field set to 1 with the extra coefficients for a multi-prime RSA private key in the sequence represented by otherPrimeInfos. If it was a multi-prime key, you will find that the OtherPrimeInfo structures making up the sequence have the values corresponding to those in the RSAOtherPrimeInfo objects associated with the Java instance of the key.

So, how do you use this knowledge in Java? The first thing you should look at is the mechanism for creating a private key using a KeyFactory . As you would expect, it is just the same as for a public key. Given a Java object privKey , which implements the RSAPrivateKey interface, you can create a PKCS8EncodedKeySpec by writing

    PKCS8EncodedKeySpec pkcs8Spec = new PKCS8EncodedKeySpec(privKey.getEncoded());

Having created the pkcs8Spec object, you can then re-create the private key as follows:

    KeyFactory keyFact = KeyFactory.getInstance("RSA", "BC");
    PrivateKey priv = keyFact.generatePrivate(pkcs8Spec);

The Bouncy Castle APIs also provide a wrapper class that allows you to dump the internals of the structure returned by the encoding of a private key. You can find the class in the org.bouncycastle.asn1.pkcs package and its name is PrivateKeyInfo . If you wanted to try dumping the contents of Key.getEncoded() for an object implementing the PrivateKey interface, you could use the following code fragment:

    ASN1InputStream aIn = new ASN1InputStream(priv.getEncoded());
    PrivateKeyInfo info = PrivateKeyInfo.getInstance(aIn.readObject());

    System.out.println(ASN1Dump.dumpAsString(info));
    System.out.println(ASN1Dump.dumpAsString(info.getPrivateKey()));

and you should see the contents of the PrivateKeyInfo structure printed out, followed by the contents of the structure encoded in the string of octets contained in the privateKey field of the PrivateKeyInfo structure.

When you are exporting private keys, the normal reason is to make them available to another application, or perhaps persist them on disk. Given that it is a private key you are dealing with in this case, you will often want to encrypt the DER-encoded data representing the private key as well. PKCS #8 also provides a structure for putting together an encrypted encoding for a key. It builds on the PrivateKeyInfo object, and the JCE provides a class that allows you to construct and manipulate the encrypted encoding directly. The class you use for this is the EncryptedPrivateKeyInfo class.

The EncryptedPrivateKeyInfo Class

The javax.crypto.EncryptedPrivateKeyInfo class allows you to package encrypted private key data with details of the encryption algorithm used to create it. The class takes its name from a structure of the same name defined in PKCS #8, where it is defined as follows:

EncryptedPrivateKeyInfo ::= SEQUENCE {
    encryptionAlgorithm  AlgorithmIdentifier,
    encryptedData        EncryptedData
}

EncryptedData ::= OCTET STRING

Other than the fact that the octets held in the encryptedData field represent an encrypted encoding of a PrivateKeyInfo object, this is a very simple structure. You have the encryptionAlgorithm field, which contains an AlgorithmIdentifier describing the algorithm used for encryption and any parameters that might need to be passed to another cipher trying to implement the same algorithm, and then you have the encrypted data enclosed in an OCTET STRING.

You'll now review some of the methods on the EncryptedPrivateKeyInfo class so you can see how it is built on top of the PKCS #8 EncryptedPrivateKeyInfo structure.

EncryptedPrivateKeyInfo()

The EncryptedPrivateKeyInfo class has three constructors on it. One is used to create the object from a byte array that contains an ASN.1-encoded PKCS #8 EncryptedPrivateKeyInfo structure. The other two are used for creating an EncryptedPrivateKeyInfo object that will be used to produce the ASN.1 binary encoding of the structure in PKCS #8.

EncryptedPrivateKeyInfo.getAlgParameters()

This method returns an AlgorithmParameters object, which carries the parameter information that is needed, together with the key, to initialize the cipher used to decrypt the private key. As you have probably already guessed, this is the same information that is carried in the parameters field of the AlgorithmIdentifier in the PKCS #8 EncryptedPrivateKeyInfo structure.

EncryptedPrivateKeyInfo.getKeySpec()

The getKeySpec() method takes an appropriately initialized cipher and returns a PKCS8EncodedKeySpec that contains the encoding for the PKCS #8 PrivateKeyInfo object that was encrypted when the EncryptedPrivateKeyInfo object was created. You can then pass the key specification to a KeyFactory and create a private key suitable for use with the provider you are using.

EncryptedPrivateKeyInfo.getEncoded()

This method returns the ASN.1 binary encoding of the EncryptedPrivateKeyInfo structure defined in PKCS #8 as a byte array. If you are exporting one of these objects, the data returned by this method is the one you want to use. Do not confuse getEncoded() with the EncryptedPrivateKeyInfo.getEncryptedData() method. That method only returns the value of the encryptedData field, so, as you would expect, it does not include any of the information about the encryption algorithm used.
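The methods described above fit together roughly as in the following sketch, which exports a key as an encoded EncryptedPrivateKeyInfo and then reloads it from the bytes alone. It assumes the JDK's default provider and its algorithm name PBEWithSHA1AndDESede for the PKCS #12 Triple-DES scheme; the class EncryptedKeyExport, its method names, and the iteration count of 2048 are illustrative assumptions.

```java
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.SecureRandom;
import java.security.spec.PKCS8EncodedKeySpec;
import javax.crypto.Cipher;
import javax.crypto.EncryptedPrivateKeyInfo;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

public class EncryptedKeyExport {
    // Assumption: JDK name for the PKCS #12 SHA-1 / 3-key Triple-DES PBE scheme
    static final String PBE_ALG = "PBEWithSHA1AndDESede";

    /** Wraps the key under the password and returns the encoded PKCS #8 structure. */
    public static byte[] exportKey(PrivateKey key, char[] password) throws Exception {
        byte[] salt = new byte[20];
        new SecureRandom().nextBytes(salt);

        SecretKey pbeKey = SecretKeyFactory.getInstance(PBE_ALG)
                .generateSecret(new PBEKeySpec(password));
        Cipher cipher = Cipher.getInstance(PBE_ALG);
        cipher.init(Cipher.WRAP_MODE, pbeKey, new PBEParameterSpec(salt, 2048));
        byte[] wrapped = cipher.wrap(key);

        // getEncoded() includes the algorithm details; getEncryptedData() would not
        return new EncryptedPrivateKeyInfo(cipher.getParameters(), wrapped).getEncoded();
    }

    /** Rebuilds the private key from the encoded structure using only the password. */
    public static PrivateKey importKey(byte[] encoded, char[] password, String keyAlg)
            throws Exception {
        EncryptedPrivateKeyInfo info = new EncryptedPrivateKeyInfo(encoded);

        SecretKey pbeKey = SecretKeyFactory.getInstance(info.getAlgName())
                .generateSecret(new PBEKeySpec(password));
        Cipher cipher = Cipher.getInstance(info.getAlgName());
        // salt and iteration count come from the structure itself
        cipher.init(Cipher.DECRYPT_MODE, pbeKey, info.getAlgParameters());

        PKCS8EncodedKeySpec spec = info.getKeySpec(cipher);
        return KeyFactory.getInstance(keyAlg).generatePrivate(spec);
    }
}
```

The byte array returned by exportKey() is what you would write to disk or hand to another application, since it is the only form that carries everything the importing side needs apart from the password.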

Now take a look at how this all goes together.

Try It Out: Using EncryptedPrivateKeyInfo and PBE

The following example builds not only on the discussion of the PKCS8EncodedKeySpec and the EncryptedPrivateKeyInfo class, but also on the previous discussion about the AlgorithmParameters class and password-based encryption. Try running it and then read on.

package chapter5;

import java.security.*;
import java.security.spec.PKCS8EncodedKeySpec;

import javax.crypto.Cipher;
import javax.crypto.EncryptedPrivateKeyInfo;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/**
 * Simple example showing how to use PBE and an EncryptedPrivateKeyInfo object.
 */
public class EncryptedPrivateKeyInfoExample {
    public static void main(String[] args) throws Exception {
        // generate a key pair
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA", "BC");

        kpg.initialize(128, Utils.createFixedRandom());

        KeyPair pair = kpg.generateKeyPair();

        // wrapping step
        char[] password = "hello".toCharArray();
        byte[] salt = new byte[20];
        int iCount = 100;
        String pbeAlgorithm = "PBEWithSHAAnd3-KeyTripleDES-CBC";
        PBEKeySpec pbeKeySpec = new PBEKeySpec(password, salt, iCount);
        SecretKeyFactory secretKeyFact = SecretKeyFactory.getInstance(
                pbeAlgorithm, "BC");
        Cipher cipher = Cipher.getInstance(pbeAlgorithm, "BC");

        cipher.init(Cipher.WRAP_MODE, secretKeyFact.generateSecret(pbeKeySpec));

        byte[] wrappedKey = cipher.wrap(pair.getPrivate());

        // create carrier
        EncryptedPrivateKeyInfo pInfo = new EncryptedPrivateKeyInfo(
                cipher.getParameters(), wrappedKey);

        // unwrapping step - note we only use the password
        pbeKeySpec = new PBEKeySpec(password);

        cipher = Cipher.getInstance(pInfo.getAlgName(), "BC");

        cipher.init(Cipher.DECRYPT_MODE,
                secretKeyFact.generateSecret(pbeKeySpec), pInfo.getAlgParameters());

        PKCS8EncodedKeySpec pkcs8Spec = pInfo.getKeySpec(cipher);
        KeyFactory keyFact = KeyFactory.getInstance("RSA", "BC");
        PrivateKey privKey = keyFact.generatePrivate(pkcs8Spec);

        if (privKey.equals(pair.getPrivate())) {
            System.out.println("key recovery successful");
        } else {
            System.out.println("key recovery failed");
        }
    }
}

Assuming all went according to plan, running the example will produce the message:

key recovery successful

As usual, the RSA key size is absurdly small, so that the output fits better when you look at the ASN.1 structures a bit further on. Other than that, after the key generation, there is a lot going on. Therefore, you will work through the example step-by-step.

How It Works

The first stage is familiar from the discussion in Chapter 2 about password-based encryption and the discussion on key wrapping in Chapter 4. You set up a PBEKeySpec and create a SecretKeyFactory to handle it using the PKCS #12 algorithm PBEWithSHAAnd3-KeyTripleDES-CBC. The PBE algorithm uses SHA-1 and a mixing function similar to the one you looked at in Chapter 3 to create three DES keys and an IV, which are then used to initialize a Triple-DES cipher operating in CBC mode.

Having initialized the Cipher object appropriately with the key generated by the SecretKeyFactory, you then use the Cipher.wrap() method to create the byte array that contains the encrypted version of the encoding of the private key. Next, you construct the EncryptedPrivateKeyInfo in the following manner:

EncryptedPrivateKeyInfo pInfo = new EncryptedPrivateKeyInfo( cipher.getParameters(), wrappedKey);

As you saw earlier, Cipher.getParameters() returns an AlgorithmParameters object. One feature of this object is that it has an AlgorithmParameters.getEncoded() method on it that returns an encoded form of the parameters it contains. In this case, the getEncoded() method returns the DER encoding of the following structure, which is defined in PKCS #12:

pkcs-12PbeParams ::= SEQUENCE {
    salt        OCTET STRING,
    iterations  INTEGER
}

The pkcs-12PbeParams structure carries the salt and the iteration count that you initialized the PBEKeySpec object, pbeKeySpec, with. It is the SEQUENCE represented by the pkcs-12PbeParams structure that serves as the parameters field in the AlgorithmIdentifier contained in the encryptionAlgorithm field of the PKCS #8 EncryptedPrivateKeyInfo object. You can dump out this structure by adding

System.out.println(ASN1Dump.dumpAsString( new ASN1InputStream(cipher.getParameters().getEncoded()).readObject()));

which will produce the following extra output:

DER Sequence
    DER Octet String[20]
    Integer(100)

As you can see, the iteration count is 100, and if you dumped out the contents of the OCTET STRING, you would find it contained the bytes making up the salt that you passed in to the PBEKeySpec originally.
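The same two values can also be read back without parsing any ASN.1, by asking the AlgorithmParameters object for a PBEParameterSpec. The sketch below assumes the JDK's default provider and its name PBEWithSHA1AndDESede for the same PKCS #12 scheme, and it passes the salt and iteration count to the cipher through a PBEParameterSpec rather than through the PBEKeySpec; the class PbeParamsInspect and the method paramsOf are hypothetical names.

```java
import java.security.AlgorithmParameters;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

public class PbeParamsInspect {
    /** Initializes a PBE cipher and returns the parameters it reports back. */
    public static PBEParameterSpec paramsOf(char[] password, byte[] salt, int iCount)
            throws Exception {
        // Assumption: JDK name for the PKCS #12 Triple-DES PBE scheme
        String alg = "PBEWithSHA1AndDESede";
        SecretKey key = SecretKeyFactory.getInstance(alg)
                .generateSecret(new PBEKeySpec(password));

        Cipher cipher = Cipher.getInstance(alg);
        cipher.init(Cipher.ENCRYPT_MODE, key, new PBEParameterSpec(salt, iCount));

        // The AlgorithmParameters object holds the salt and iteration count
        AlgorithmParameters params = cipher.getParameters();
        return params.getParameterSpec(PBEParameterSpec.class);
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[20];
        PBEParameterSpec spec = paramsOf("hello".toCharArray(), salt, 100);

        System.out.println("iterations: " + spec.getIterationCount());
        System.out.println("salt matches: " + Arrays.equals(salt, spec.getSalt()));
    }
}
```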

The only apparently missing ingredient is the OBJECT IDENTIFIER that is required to populate the algorithm field contained in the AlgorithmIdentifier inside the encryptionAlgorithm field. Funnily enough, if you were to add

System.out.println(cipher.getParameters().getAlgorithm());

on the line after the call to cipher.wrap() in the example, you would see the following extra line printed in the output:

1.2.840.113549.1.12.1.3

which happens to be the value of the OBJECT IDENTIFIER defined in PKCS #12 for PBEWithSHAAnd3-KeyTripleDES-CBC. This last piece of information then allows the EncryptedPrivateKeyInfo object you are creating to fill in the encryptionAlgorithm field of the ASN.1 structure it contains.

The last step in the example is where you recover the encrypted private key that is stored in pInfo, our EncryptedPrivateKeyInfo object. In this case, you only have to initialize the PBEKeySpec as follows:

pbeKeySpec = new PBEKeySpec(password);

as the other parameters are stored in the AlgorithmParameters object that is extracted from the EncryptedPrivateKeyInfo object. After the pbeKeySpec is converted to a key and the cipher is initialized for decryption, a PKCS8EncodedKeySpec is retrieved from the EncryptedPrivateKeyInfo object and the original private key is recovered. As you would expect, adding the following lines:

ASN1InputStream aIn = new ASN1InputStream(pkcs8Spec.getEncoded());
PrivateKeyInfo info = PrivateKeyInfo.getInstance(aIn.readObject());

System.out.println(ASN1Dump.dumpAsString(info));
System.out.println(ASN1Dump.dumpAsString(info.getPrivateKey()));

after the point where pkcs8Spec is set produces the following extra output:

DER Sequence
    Integer(0)
    DER Sequence
        ObjectIdentifier(1.2.840.113549.1.1.1)
        NULL
    DER Octet String[100]

DER Sequence
    Integer(0)
    Integer(193768625448396182147878757503948840199)
    Integer(65537)
    Integer(176280162144807927893221216887705181313)
    Integer(16432478544733070881)
    Integer(11791807603515952679)
    Integer(5879506071310035233)
    Integer(6852660091055565157)
    Integer(6468103171312594380)

The first ASN.1 dump is that of the PrivateKeyInfo structure, and it contains an AlgorithmIdentifier structure whose algorithm field has been set to the OBJECT IDENTIFIER for RSA encryption, indicating that the PrivateKeyInfo structure contains an RSA key. The second ASN.1 dump is the private key encoded as a version 0 RSAPrivateKey structure from PKCS #1, which simply contains the values making up the private key in much the same way as the RSAPrivateCrtKeySpec class does.

There is one thing in the example worth further discussion. When you created the encrypted byte array to initialize the EncryptedPrivateKeyInfo object with, you did the following:

cipher.init(Cipher.WRAP_MODE, secretKeyFact.generateSecret(pbeKeySpec));

byte[] wrappedKey = cipher.wrap(pair.getPrivate());

which returns a byte array containing the encrypted form of the ASN.1-encoded PrivateKeyInfo object that was returned by the private key's Key.getEncoded() method. You could have used the following instead:

cipher.init(Cipher.ENCRYPT_MODE, secretKeyFact.generateSecret(pbeKeySpec));

byte[] wrappedKey = cipher.doFinal(pair.getPrivate().getEncoded());

The reason this was not done is to draw attention to the fact that if you were using a provider based on a hardware cryptographic device and the device has been set up not to publish private keys outside of it (often the very reason for having one), Key.getEncoded() on the private key might actually return null because the device forbids the operation. The reason this might happen is that the provider will be expecting developers to use the wrapping mechanism instead, as it allows the provider to encrypt the private key information before it leaves the safety of the device.
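The wrapping mechanism has a counterpart on the receiving side, Cipher.unwrap(), which hands back a key object directly so the calling code never touches the decrypted encoding. The following sketch shows the pair together, assuming the JDK's default provider and its algorithm name PBEWithSHA1AndDESede; the class WrapUnwrapKey and its method names are illustrative, and whether a given hardware-backed provider supports this PBE scheme is something you would need to check for that provider.

```java
import java.security.PrivateKey;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.PBEParameterSpec;

public class WrapUnwrapKey {
    // Assumption: JDK name for the PKCS #12 Triple-DES PBE scheme
    static final String PBE_ALG = "PBEWithSHA1AndDESede";

    /** Encrypts the key's encoding without the caller ever calling getEncoded(). */
    public static byte[] wrap(PrivateKey key, SecretKey pbeKey, PBEParameterSpec spec)
            throws Exception {
        Cipher cipher = Cipher.getInstance(PBE_ALG);
        cipher.init(Cipher.WRAP_MODE, pbeKey, spec);
        return cipher.wrap(key);
    }

    /** Recovers a private key object of the given algorithm from the wrapped bytes. */
    public static PrivateKey unwrap(byte[] wrapped, SecretKey pbeKey,
                                    PBEParameterSpec spec, String keyAlg)
            throws Exception {
        Cipher cipher = Cipher.getInstance(PBE_ALG);
        cipher.init(Cipher.UNWRAP_MODE, pbeKey, spec);
        return (PrivateKey) cipher.unwrap(wrapped, keyAlg, Cipher.PRIVATE_KEY);
    }
}
```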

Summary

This chapter looked at ASN.1, its binary encodings, and how they are used by a variety of classes in the JCA and JCE to allow you to pass data structures around in an application-independent manner. Building on this, you have also had a closer look at how public and private keys are encoded and how to make use of the EncryptedPrivateKeyInfo class.

Over the course of the chapter, you learned

ASN.1 has uses that go well beyond just the encoding of algorithm parameters and asymmetric keys. It is also used to provide messaging formats and provides the method for encoding X.509 certificates, which are a fundamental part of most public key infrastructure (PKI) solutions. The next chapter looks at how X.509 certificates are constructed, as well as how to use the JCA classes that support them.

Exercises

1. What happens to fields set to their default values when the ASN.1 structure that contains them is encoded using DER?

2. How would you implement the following ASN.1 type using the Bouncy Castle ASN.1 API?

MyChoice ::= CHOICE {
    message  UTF8String,
    id       INTEGER
}

3. What is meant by the word IMPLICIT in respect to a style of tagging? Think of a simple example of how it would be done using one of the classes representing an ASN.1 primitive in the Bouncy Castle API. What implication does the IMPLICIT style have for items that are derived from the CHOICE type?

4. What are the two classes used to hold the DER encodings of public and private keys, respectively? What class is used to convert the encodings back into actual keys?

5. What does an EncryptedPrivateKeyInfo object contain?

Answers

1. What happens to fields set to their default values when the ASN.1 structure that contains them is encoded using DER?

They are left out of the encoding.

2. How would you implement the following ASN.1 type using the Bouncy Castle ASN.1 API?

MyChoice ::= CHOICE {
    message  UTF8String,
    id       INTEGER
}

Here is one way of doing it. As you can see, the implementation reads a lot like a Java version of a C, or Pascal, union. Note the use of the ASN1Choice interface. Use of this will reduce the likelihood that this object is mistakenly tagged implicitly.

public class MyChoice extends ASN1Encodable implements ASN1Choice {
    ASN1Encodable value;

    public MyChoice(DERInteger value) {
        this.value = value;
    }

    public MyChoice(DERUTF8String value) {
        this.value = value;
    }

    public boolean isInteger() {
        return (value instanceof DERInteger);
    }

    public DERUTF8String getMessage() {
        if (isInteger()) {
            throw new IllegalStateException("not a message!");
        }
        return (DERUTF8String)value;
    }

    public DERInteger getId() {
        if (isInteger()) {
            return (DERInteger)value;
        }
        throw new IllegalStateException("not an id!");
    }

    public DERObject toASN1Object() {
        return value.toASN1Object();
    }
}

3. What is meant by the word IMPLICIT in respect to a style of tagging? Think of a simple example of how it would be done using one of the classes representing an ASN.1 primitive in the Bouncy Castle API. What implication does the implicit style have for items that are derived from the CHOICE type?

Objects that are assigned a tag value using the implicit style have their own tag value overridden by the tag value assigned. In the Bouncy Castle API, this is done by setting the explicit parameter in the tagged object constructor to false. For example, the ASN.1 declaration

value [0] IMPLICIT INTEGER

would be created as a DER tagged value using the following:

DERTaggedObject t = new DERTaggedObject(false, 0, derIntegerValue);

and then recovered using the following:

derIntegerValue = DERInteger.getInstance(t, false);

Remember too that if an object is already tagged, implicitly tagging it will remove the tag value. For this reason, as they commonly contain tagged values, any tag applied to an item of type CHOICE is applied explicitly.

4. What are the two classes used to hold the DER encodings of public and private keys, respectively? What class is used to convert the encodings back into actual keys?

The X509EncodedKeySpec is used to contain public keys. The PKCS8EncodedKeySpec is used for carrying the encodings of private keys. The KeyFactory class is used to take the information in the encoded key specification objects and produce actual keys.

5. What does an EncryptedPrivateKeyInfo object contain?

An EncryptedPrivateKeyInfo contains an encrypted encoding of a PKCS #8 PrivateKeyInfo object and the parameter and algorithm details that were used to do the encryption. On decryption, the recovered information is used to create a PKCS8EncodedKeySpec, which is then used to create a private key.
