Ritter's Crypto Glossary and
Dictionary of Technical Cryptography

Technical Cryptographic Terms Explained

Hyperlinked definitions and discussions of many cryptographic, mathematic, logic, statistics, and electronics terms used in cipher construction and analysis.

A Ciphers By Ritter Page


Terry Ritter

Current Edition: 1999 Jan 19

For a basic introduction to cryptography, see Learning About Cryptography. Please feel free to send comments and suggestions for improvement to: ritter@io.com. You may wish to help support this work by patronizing Ritter's Crypto Bookshop.


Contents

A
Absolute, AC, Additive Combiner, Additive RNG, Affine, Affine Boolean Function, Alphabet, Alternative Hypothesis, Amplifier, Amplitude, Analog, AND, ASCII, Associative, Asymmetric Cipher, Attack, Augmented Repetitions, Authentication, Authenticating Block Cipher, Autokey, Avalanche, Avalanche Effect
B
Back Door, Balance, Balanced Block Mixer, Balanced Block Mixing, Balanced Combiner, Base-64, Bel, Bent Function, Bernoulli Trials, Bijective, Binary, Binomial Distribution, Birthday Attack, Birthday Paradox, Bit, Block, Block Cipher, Block Size, Boolean, Boolean Function, Boolean Function Nonlinearity, Boolean Logic, Boolean Mapping, Break, Brute Force Attack, Bug, Byte
C
Capacitor, CBC, c.d.f., CFB, Chain, Chaos, Chi-Square, Cipher, Cipher Taxonomy, Ciphering, Ciphertext, Ciphertext Expansion, Ciphony, Circuit, Clock, Closed, Code, Codebook, Codebook Attack, Combination, Combinatoric, Combiner, Commutative, Complete, Component, Computer, Conductor, Confusion, Confusion Sequence, Congruence, Contextual, Conventional Cipher, Convolution, Correlation, Correlation Coefficient, CRC, Cryptanalysis, Cryptanalyst, Cryptographer, Cryptographic Mechanism, Cryptography, Cryptography War, Cryptology, Current
D
dB, DC, Debug, Decipher, Decryption, Deductive Reasoning, Defined Plaintext Attack, Degrees of Freedom, DES, Decibel, Decimal, Design Strength, Deterministic, Dictionary Attack, Differential Cryptanalysis, Diffusion, Digital, Diode, Distribution, Distributive, Divide and Conquer, Domain, Dyadic, Dynamic Keying, Dynamic Substitution Combiner, Dynamic Transposition
E
ECB, Electric Field, Electromagnetic Field, Electronic, Encipher, Encryption, Entropy, Ergodic, Extractor, Exclusive-OR
F
Factorial, Fallacy, Fast Walsh Transform, FCSR, Feistel Construction, Fenced DES, Fencing, Fencing Layer, FFT, Field, Finite Field, Flip-Flop, Fourier Series, Fourier Theorem, Fourier Transform, Frequency, Function, FWT
G
Gain, Galois Field, Gate, GF 2n, Goodness of Fit, Group
H
Hamming Distance, Hardware, Hash, Hexadecimal (Hex), Homophonic, Homophonic Substitution
I
IDEA, Ideal Secrecy, i.i.d., Inductive Reasoning, Inductor, Injective, Insulator, Integer, Intermediate Block, Interval, Into, Inverse, Invertible, Involution, Irreducible, IV
J
Jitterizer
K
KB, Kb, Kerckhoff's Requirements, Key, Key Distribution Problem, Keyspace, Keyed Substitution, Known Plaintext Attack, Kolmogorov-Smirnov
L
Latency, Latin Square, Latin Square Combiner, Layer, LFSR, Linear, Linear Complexity, Linear Feedback Shift Register, Linear Logic Function, Logic, Logic Function, LSB
M
M-Sequence, Machine Language, Magnetic Field, Man-in-the-Middle Attack, Mapping, Markov Process, Mathematical Cryptography, Maximal Length, MB, Mb, Mechanism, Mechanistic Cryptography, Mersenne Prime, Message Digest, Message Key, MITM, Mixing, Mixing Cipher, Mod 2, Mod 2 Polynomial, Mode, Modulo, Monadic, Monoalphabetic Substitution, Monographic, Multiple Encryption
N
Nominclator, Nominal, Nonlinearity, NOT, Null Hypothesis
O
Object Code, Objective, Octal, Octave, OFB, One Time Pad, One-To-One, One Way Diffusion, Onto, Opcode, Operating Mode, Opponent, OR, Order, Ordinal, Orthogonal, Orthogonal Latin Squares, OTP, Overall Diffusion
P
Padding, Password, Patent, Patent Infringement, Perfect Secrecy, Permutation, PGP, Physically Random, Pink Noise, Plaintext, Poisson Distribution, Polyalphabetic Combiner, Polyalphabetic Substitution, Polygram Substitution, Polygraphic, Polynomial, Polyphonic, Population, Population Estimation, Power, Primitive, Primitive Polynomial, Prime, Prior Art, PRNG, Process, Pseudorandom, Public Key Cipher
R
Random, Random Number Generator, Random Variable, Range, Really Random, Relay, Research Hypothesis, Resistor, Ring, Root, RMS, Root Mean Square, RNG, Round, RSA, Running Key
S
Salt, Sample, S-Box, Scalable, Secrecy, Secret Code, Secret Key Cipher, Security, Security Through Obscurity, Semiconductor, Semigroup, Session Key, Set, Shift Register, Shuffle, Sieve of Eratosthenes, Significance, Simple Substitution, Software, Source Code, State, Stationary Process, Statistic, Statistics, Steganography, Stochastic, Stream Cipher, Strength, Strict Avalanche Criterion (SAC), Subjective, Substitution, Substitution-Permutation, Substitution Table, Superencryption, Surjective, Switch, Switching Function, Symmetric Cipher, Symmetric Group, System, System Design
T
Table Selection Combiner, TEMPEST, Transformer, Transistor, Transposition, Trap Door, Triple DES, Truly Random, Trust, Truth Table, Type I Error, Type II Error
U
Unary, Unexpected Distance, Unicity Distance, Uniform Distribution
V
Variable Size Block Cipher, Voltage
W
Walsh Functions, Weight, Whitening White Noise Wire
X
XOR

Absolute
In the study of logic, something observed similarly by most observers, or something agreed upon, or which has the same value each time measured. Something not in dispute, unarguable, and independent of other state. As opposed to contextual.

AC
Alternating Current: Electrical power which repeatedly reverses direction of flow. As opposed to DC.

Generally used for power distribution because the changing current supports the use of transformers. Utilities can thus transport power at high voltage and low current, which minimize "ohmic" or I2R losses. The high voltages are then reduced at power substations and again by pole transformers for delivery to the consumer.

Additive Combiner
An additive combiner uses numerical concepts similar to addition to mix multiple values into a single result.

One example is byte addition modulo 256, which simply adds two byte values, each in the range 0..255, and produces the remainder after division by 256, again a value in the byte range of 0..255. Subtraction is also an "additive" combiner.

Another example is bit-level exclusive-OR which is addition mod 2. A byte-level exclusive-OR is a polynomial addition.

Additive RNG
(Additive random number generator.) A LFSR-based RNG typically using multi-bit elements and integer addition (instead of XOR) combining. References include:
Knuth, D. 1981. The Art of Computer Programming, Vol. 2, Seminumerical Algorithms. 2nd ed. 26-31. Addison-Wesley: Reading, Massachusetts.
Marsaglia, G. and L. Tsay. 1985. Matrices and the Structure of Random Number Sequences. Linear Algebra and its Applications. 67: 147-156.

Advantages include:

In addition, a vast multiplicity of independent cycles has the potential of confusing even a "quantum computer," should such a thing become possible.

   For Degree-n Primitive, and Bit Width w

   Total States:       2nw
   Non-Init States:    2n(w-1)
   Number of Cycles:   2(n-1)(w-1)
   Length Each Cycle:  (2n-1)2(w-1)
   Period of LSB:      2n-1

The binary addition of two bits with no carry input is just XOR, so the lsb of an Additive RNG has the usual maximal length period.

A degree-127 Additive RNG using 127 elements of 32 bits each has 24064 unique states. Of these, 23937 are disallowed by initialization (the lsb's are all "0") but this is just one unusable state out of 2127. There are still 23906 cycles which each have almost 2158 steps. (The Cloak2 stream cipher uses an Additive RNG with 9689 elements of 32 bits, and so has 2310048 unique states. These are mainly distributed among 2300328 different cycles with almost 29720 steps each.)

Note that any LFSR, including the Additive RNG, is very weak when used alone. But when steps are taken to hide the sequence (such as using a jitterizer and Dynamic Substitution combining) the result can have significant strength.

Affine
Generally speaking, linear. Sometimes affine generalizes "linearity" to expressions of multiple independent variables, with only a single-variable expression being called "linear." From analytic and algebraic geometry.

Assume the flat plane defined by two arbitrary unit vectors e1, e2 and a common origin O; this is a coordinate "frame." Assume a grid of lines parallel to each frame vector, separated by unit lengths (a "metric" which may differ for each vector). If the vectors happen to be perpendicular, we have a Cartesian coordinate system, but in any case we can locate any point on the plane by its position on the grid.

An affine transformation can change the origin, the angle between the vectors, and unit vector lengths. Shapes in the original frame thus become "pinched," "squashed" or "stretched" images under the affine transformation. This same sort of thing generalizes to higher degree expressions.

The Handbook of Mathematics says that if e1, e2, e3 are linearly independent vectors, any vector a can be expressed uniquely in the form a = a1e1 + a2e2 + a3e3 where the ai are the affine coordinates. (p.518)

The VNR Concise Encyclopedia of Mathematics says "All transformations that lead to a uniquely soluble system of linear equations are called affine transformations." (p.534)

Affine Boolean Function
A Boolean function which can be represented in the form:
anxn + an-1xn-1 + ... + a1x1 + a0
where the operations are mod 2: addition is Exclusive-OR, and multiplication is AND.

Note that all of the variables xi are to the first power only, and each coefficient ai simply enables or disables its associated variable. The result is a single Boolean value, but the constant term a0 can produce either possible output polarity.

Here are all possible 3-variable affine Boolean functions (each of which may be inverted by complementing the constant term):

     affine    truth table

          c    0  0  0  0  0  0  0  0
         x0    0  1  0  1  0  1  0  1
      x1       0  0  1  1  0  0  1  1
      x1+x0    0  1  1  0  0  1  1  0
   x2          0  0  0  0  1  1  1  1
   x2+   x0    0  1  0  1  1  0  1  0
   x2+x1       0  0  1  1  1  1  0  0
   x2+x1+x0    0  1  1  0  1  0  0  1

Alphabet
The set of symbols under discussion.

Alternative Hypothesis
In statistics, the statement formulated so that the logically contrary statement, the null hypothesis H0 has a test statistic with a known distribution for the case when there is nothing unusual to detect. Also called the research hypothesis H1, and logically identical to "NOT-H0" or "H0 is not true."

Amplifier
a component or device intended to sense a signal and produce a larger version of that signal. In general, any amplifying device is limited by available power, frequency response, and device maximums for voltage, current, and power dissipation.

Transistors are analog amplifiers which are basically linear over a reasonable range and so require DC power. In contrast, Relays are classically mechanical devices with direct metal-to-metal moving connections, and so can handle generally higher power and AC current.

Amplitude
The signal level, or height.

Analog
Pertaining to continuous values. As opposed to digital or discrete quantities.

AND
A Boolean logic function which is also mod 2 multiplication.

ASCII
A public code for converting between 7-bit values 0..127 (or 00..7f hex) and text characters. ASCII is an acronym for American Standard Code for Information Interchange.
DEC HEX CTRL CMD    DEC HEX CHAR    DEC HEX CHAR   DEC HEX CHAR

  0  00  ^@  NUL     32  20  SPC     64  40  @      96  60  '
  1  01  ^A  SOH     33  21   !      65  41  A      97  61  a
  2  02  ^B  STX     34  22   "      66  42  B      98  62  b
  3  03  ^C  ETX     35  23   #      67  43  C      99  63  c
  4  04  ^D  EOT     36  24   $      68  44  D     100  64  d
  5  05  ^E  ENQ     37  25   %      69  45  E     101  65  e
  6  06  ^F  ACK     38  26   &      70  46  F     102  66  f
  7  07  ^G  BEL     39  27   '      71  47  G     103  67  g
  8  08  ^H  BS      40  28   (      72  48  H     104  68  h
  9  09  ^I  HT      41  29   )      73  49  I     105  69  i
 10  0a  ^J  LF      42  2a   *      74  4a  J     106  6a  j
 11  0b  ^K  VT      43  2b   +      75  4b  K     107  6b  k
 12  0c  ^L  FF      44  2c   ,      76  4c  L     108  6c  l
 13  0d  ^M  CR      45  2d   -      77  4d  M     109  6d  m
 14  0e  ^N  SO      46  2e   .      78  4e  N     110  6e  n
 15  0f  ^O  SI      47  2f   /      79  4f  O     111  6f  o
 16  10  ^P  DLE     48  30   0      80  50  P     112  70  p
 17  11  ^Q  DC1     49  31   1      81  51  Q     113  71  q
 18  12  ^R  DC2     50  32   2      82  52  R     114  72  r
 19  13  ^S  DC3     51  33   3      83  53  S     115  73  s
 20  14  ^T  DC4     52  34   4      84  54  T     116  74  t
 21  15  ^U  NAK     53  35   5      85  55  U     117  75  u
 22  16  ^V  SYN     54  36   6      86  56  V     118  76  v
 23  17  ^W  ETB     55  37   7      87  57  W     119  77  w
 24  18  ^X  CAN     56  38   8      88  58  X     120  78  x
 25  19  ^Y  EM      57  39   9      89  59  Y     121  79  y
 26  1a  ^Z  SUB     58  3a   :      90  5a  Z     122  7a  z
 27  1b  ^[  ESC     59  3b   ;      91  5b  [     123  7b  {
 28  1c  ^\  FS      60  3c   <      92  5c  \     124  7c  |
 29  1d  ^]  GS      61  3d   =      93  5d  ]     125  7d  }
 30  1e  ^^  RS      62  3e   >      94  5e  ^     126  7e
 31  1f  ^_  US      63  3f   ?      95  5f  _     127  7f  DEL

Associative
A dyadic operation in which two sequential operations on three arguments can first operate on either the first two or the last two arguments, producing the same result in either case: (a + b) + c = a + (b + c).

Also see: commutative and distributive.

Asymmetric Cipher
A public key cipher.

Attack
General ways in which a cryptanalyst may try to "break" or penetrate the secrecy of a cipher. These are not algorithms; they are just approaches as a starting place for constructing specific algorithms.

Classically, attacks were neither named nor classified; there was just: "here is a cipher, and here is the attack." And while this gradually developed into named attacks, there is no overall attack taxonomy. Currently, attacks are often classified by the information available to the attacker or constraints on the attack, and then by strategies which use the available information. Not only ciphers, but also cryptographic hash functions can be attacked, generally with very different strategies.

Informational Constraints

We are to attack a cipher which enciphers plaintext into ciphertext or deciphers the opposite way, under control of a key. The available information necessarily constrains our attack strategies.

Attack Strategies

The goal of an attack is to reveal some unknown plaintext, or the key (which will reveal the plaintext). An attack which succeeds with less effort than a brute-force search we call a break. An "academic" ("theoretical," "certificational") break may involve impractically large amounts of data or resources, yet still be called a "break" if the attack would be easier than brute force. (It is thus possible for a "broken" cipher to be much stronger than a cipher with a short key.) Sometimes the attack strategy is thought to be obvious, given a particular informational constraint, and is not further classified.

Many attacks try to isolate unknown small components or aspects so they can be solved separately, a process known as divide and conquer. Also see: security.

Augmented Repetitions
When sampling with replacement, eventually we again find some object or value which has been found before. We call such an occurrence a "repetition." A value found exactly twice is a double, or "2-rep"; a value found three times is a triple or "3-rep," and so on.

For a known population, the number of repetitions expected at each level has long been understood to be a binomial expression. But if we are sampling in an attempt to establish the effective size of an unknown population, we have two problems:

  1. The binomial equations which predict expected repetitions do not reverse well to predict population, and
  2. Exact repetitions discard information and so are less accurate than we would like. For example, if we have a double and then find another of that value, we now have a triple, and one less double. So if we are using doubles to predict population, the occurrence of a triple influences the predicted population in exactly the wrong direction.

Fortunately, there is an unexpected and apparently previously unknown combinatoric relationship between the population and the number of combinations of occurrences of repeated values. This allows us to convert any number of triples and higher n-reps to the number of 2-reps which have the same probability. So if we have a double, and then get another of the same value, we have a triple, which we can convert into three 2-reps. The total number of 2-reps from all repetitions (the augmented 2-reps value) is then used to predict population.

We can relate the number of samples s to the population N through the expected number of augmented doubles Ead:

     Ead(N,s) = s(s-1) / 2N .
This equation is exact, provided we interpret all the exact n-reps in terms of 2-reps. For example, a triple is interpreted as three doubles; the augmentation from 3-reps to 2-reps is (3 C 2) or 3. The augmented result is the sum of the contributions from all higher repetition levels:
           n    i
     ad = SUM  ( ) r[i] .
          i=2   2
where ad is the number of augmented doubles, and r[i] is the exact repetition count at the i-th level.

And this leads to an equation for predicting population:

     Nad(s,ad) = s(s-1) / 2 ad .
This predicts the population Nad as based on a mean value of augmented doubles ad. Clearly, we expect the number of samples to be far larger than the number of augmented doubles, but an error in the augmented doubles ad should produce a proportionally similar error in the predicted population Nad. We typically develop ad to high precision by averaging the results of many large trials.

However, since the trials should have approximately a simple Poisson distribution (which has only a single parameter), we could be a bit more clever and fit the results to the expected distribution, thus perhaps developing a bit more accuracy.

Also see the article: Estimating Population from Repetitions in Accumulated Random Samples, and the Population Estimation Worksheets in JavaScript page of the Ciphers By Ritter / JavaScript computation pages.

Authentication
One of the objectives of cryptography: Assurance that a message has not been modified in transit or storage (message authentication or message integrity). Also key authentication for public keys. Also user or source identification, which may verify the right to send the message in the first place.

Message Authentication

One form of message authentication computes a CRC hash across the plaintext data, and appends the CRC remainder (or result) to the plaintext data: this adds a computed redundancy to an arbitrary message. The CRC result is then enciphered along with the data. When the message is deciphered, if a second CRC operation produces the same result, the message can be assumed unchanged.

Note that a CRC is a fast, linear hash. Messages with particular CRC result values can be constructed rather easily. However, if the CRC is hidden behind strong ciphering, an Opponent is unlikely to be able to change the CRC value systematically or effectively. In particular, this means that the CRC value will need more protection than a simple exclusive-OR stream cipher or the exclusive-OR approach to handling short last blocks in a block cipher.

A similar approach to message authentication uses a nonlinear cryptographic hash function. These also add a computed redundancy to the message, but generally require significantly more computation than a CRC. It is thought to be exceedingly difficult to construct messages with a particular cryptographic hash result, so the hash result perhaps need not be hidden by encryption.

One form of cryptographic hash is DES CBC mode: using a key different than that used for encryption, the final block of ciphertext is the hash of the message. This obviously doubles the computation when both encryption and authentication are needed. And since any cryptographic hash is vulnerable to birthday attacks, the small 64-bit block size implies that we should be able to find two different messages with the same hash value by constructing and hashing "only" about 232 different messages.

Another approach to message authentication is to use an authenticating block cipher; this is often a block cipher which has a large block, with some "extra data" inserted in an "authentication field" as part of the plaintext before enciphering each block. The "extra data" can be some transformation of the key, the plaintext, and/or a sequence number. This essentially creates a homophonic block cipher: If we know the key, many different ciphertexts will produce the same plaintext field, but only one of those will have the correct authentication field.

The usual approach to authentication in a public key cipher is to encipher with the private key. The resulting ciphertext can then be deciphered by the public key, which anyone can know. Since even the wrong key will produce a "deciphered" result, it is also necessary to identify the resulting plaintext as a valid message; in general this will also require redundancy in the form of a hash value in the plaintext. The process provides no secrecy, but only a person with access to the private key could have enciphered the message.

User Authentication

The classical approach to user authentication is a password; this is "something you know." One can also make use of "something you have" (such as a secure ID card), or "something you are" (biometrics).

The classic problem with passwords is that they must be remembered by ordinary people, and so carry a limited amount of uniqueness. Easy-to-remember passwords are often common language phrases, and so often fall to a dictionary attack. More modern approaches involve using a Diffie-Hellman key exchange, plus the password, thus minimizing exposure to a dictionary attack. This does require a program on the user end, however.

Key Authentication

In secret key ciphers, key authentication is inherent in secure key distribution.

In public key ciphers, public keys are exposed and often delivered insecurely. But someone who uses the wrong key may unknowingly have "secure" communications with an Opponent, as in a man-in-the-middle attack. It is thus absolutely crucial that public keys be authenticated or certified as a separate process. Normally this implies the need for a Certification Authority or CA.

Authenticating Block Cipher
A block cipher mechanism which inherently contains an authentication value or field.

Autokey
A cipher whose key is produced by message data. One common form is "ciphertext feedback," where ciphertext is "fed back" into the state of the random number generator used to produce the confusion sequence for a stream cipher.

Avalanche
The observed property of a block cipher constructed in layers or "rounds" with respect to a tiny change in the input. The change of a single input bit generally produces multiple bit-changes after one round, many more bit-changes after another round, until, eventually, about half of the block will change. An analogy is drawn to an avalanche in snow, where a small initial effect can lead to a dramatic result. As originally described by Feistel:
"As the input moves through successive layers the pattern of 1's generated is amplified and results in an unpredictable avalanche. In the end the final output will have, on average, half 0's and half 1's . . . ." [p.22]

Feistel, H. 1973. Cryptography and Computer Privacy. Scientific American. 228(5): 15-23.

Also see mixing, diffusion, overall diffusion, strict avalanche criterion, complete, S-box, and the bit changes section of the Ciphers By Ritter / JavaScript computation pages.

Avalanche Effect
The result of avalanche. As described by Webster and Tavares:
"For a given transformation to exhibit the avalanche effect, an average of one half of the output bits should change whenever a single input bit is complemented." [p.523]

Webster, A. and S. Tavares. 1985. On the Design of S-Boxes. Advances in Cryptology -- CRYPTO '85. 523-534.

Also see the bit changes section of the Ciphers By Ritter / JavaScript computation pages.


Back Door

A cipher design fault, planned or accidental, which allows the apparent strength of the design to be easily avoided by those who know the trick. When the design background of a cipher is kept secret, a back door is often suspected. Similar to trap door.

Balance
A term used in S-box and Boolean function analysis. As described by Lloyd:
"A function is balanced if, when all input vectors are equally likely, then all output vectors are equally likely."

Lloyd, S. 1990. Properties of binary functions. Advances in Cryptology -- EUROCRYPT '90. 124-139.

There is some desire to generalize this definition to describe multiple-input functions. (Is a function "balanced" if, for one value on the first input, all output values can be produced, but for another value on the first input, only some output values are possible?) Presumably a two-input balanced function would be balanced for either input fixed at any value, which would essentially be a Latin square or a Latin square combiner.

Balanced Block Mixer
A process or any implementation (for example, hardware, computer software, hybrids, or the like) for performing Balanced Block Mixing.

Balanced Block Mixing
The block mixing mechanism described in U.S. Patent 5,623,549 (see the BBM articles on the Ciphers By Ritter page).

A Balanced Block Mixer is an m-input-port m-output-port mechanism with various properties:

  1. The overall mapping is one-to-one and invertible: Every possible input value (over all ports) to the mixer produces a different output value (including all ports), and every possible output value is produced by a different input value;

  2. Each output port is a function of every input port;

  3. Any change to any one of the input ports will produce a change to every output port;

  4. Stepping any one input port through all possible values (while keeping the other input ports fixed) will step every output port through all possible values.

If we have a two port mixer, with input ports labeled A and B, output ports labeled X and Y, and some irreducible mod 2 polynomial p of degree appropriate to the port size, a Balanced Block Mixer is formed by the equations:

X = 3A + 2B (mod 2)(mod p),
Y = 2A + 3B (mod 2)(mod p).

This particular BBM is a self-inverse or involution, and so can be used without change whether enciphering or deciphering. One possible value for p for mixing 8-bit values is 100011011.

Balanced Block Mixing functions probably should be thought of as orthogonal Latin squares. For example, here is a tiny nonlinear "2-bit" BBM:

   3 1 2 0   0 3 2 1       30  13  22  01
   0 2 1 3   2 1 0 3   =   02  21  10  33
   1 3 0 2   1 2 3 0       11  32  03  20
   2 0 3 1   3 0 1 2       23  00  31  12

Suppose we wish to mix (1,3); 1 selects the second row up in both squares, and 3 selects the rightmost column, thus selecting (2,0) as the output. Since there is only one occurrence of (2,0) among all entry pairs, this discrete mixing function is reversible, as well as being balanced on both inputs.

Cryptographic advantages of balanced block mixing include the fact that each output is always balanced with respect to either input, and that no information is lost in the mixing. This allows us to use balanced block mixing as the "butterfly" operations in a fast Walsh-Hadamard transform or the well-known FFT. By using the mixing patterns of these transforms, we can mix 2n elements such that each input is guaranteed to affect each and every output in a balanced way. And if we use keying to generate the tables, we can have a way to mix huge blocks in small nonlinear mixing tables with overall mixing guarantees.

Also see Mixing Cipher, Dynamic Substitution Combiner, Variable Size Block Cipher, and the Active Balanced Block Mixing in JavaScript page of the Ciphers By Ritter / JavaScript computation pages.

Balanced Combiner
In the context of cryptography, a combiner mixes two input values into a result value. A balanced combiner must provide a balanced relationship between each input and the result.

In a statically-balanced combiner, any particular result value can be produced by any value on one input, simply by selecting some appropriate value for the other input. In this way, knowledge of only the output value provides no information -- not even statistical information -- about either input.

The common examples of cryptographic combiner, including byte exclusive-OR (mod 2 polynomial addition), byte addition (integer addition mod 256), or other "additive" combining, are perfectly balanced. Unfortunately, these simple combiners are also very weak, being inherently linear and without internal state.

A Latin square combiner is an example of a statically-balanced reversible nonlinear combiner with massive internal state. A Dynamic Substitution Combiner is an example of a dynamically or statistically-balanced reversible nonlinear combiner with substantial internal state.

Base-64
A public code for converting between 6-bit values 0..63 (or 00..3f hex) and text symbols accepted by most computers:
        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f

   0    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P
   1    Q  R  S  T  U  V  W  X  Y  Z  a  b  c  d  e  f
   2    g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v
   3    w  x  y  z  0  1  2  3  4  5  6  7  8  9  +  /

   use "=" for padding

Bel
The base-10 logarithm of the ratio of two power values (which is also the same as the difference between the log of each power value). The basis for the more-common term decibel: One bel equals 10 decibels.

Bent Function
A bent function is a Boolean function whose fast Walsh transform has the same absolute value in each term (except, possibly, the zeroth). This means that the bent function has the same distance from every possible affine Boolean function.

We can do FWT's in "the bottom panel" at the end of Active Boolean Function Nonlinearity Measurement in JavaScript page of the Ciphers By Ritter / JavaScript computation pages.

Here is every bent sequence of length 4, first in {0,1} notation, then in {1,-1} notation, with their FWT results:

   bent {0,1}       FWT           bent {1,-1}       FWT

   0  0  0  1    1 -1 -1  1        1  1  1 -1    2  2  2 -2
   0  0  1  0    1  1 -1 -1        1  1 -1  1    2 -2  2  2
   0  1  0  0    1 -1  1 -1        1 -1  1  1    2  2 -2  2
   1  0  0  0    1  1  1  1       -1  1  1  1    2 -2 -2 -2
   1  1  1  0    3  1  1 -1       -1 -1 -1  1   -2 -2 -2  2
   1  1  0  1    3 -1  1  1       -1 -1  1 -1   -2  2 -2  2
   1  0  1  1    3  1 -1  1       -1  1 -1 -1   -2 -2  2 -2
   0  1  1  1    3 -1 -1 -1        1 -1 -1 -1   -2  2  2  2
These sequences, like all true bent sequences, are not balanced, and the zeroth element of the {0,1} FWT is the number of 1's in the sequence.

Here are some bent sequences of length 16:

   bent {0,1}    0 1 0 0   0 1 0 0   1 1 0 1   0 0 1 0
    FWT          6,-2,2,-2,2,-2,2,2,-2,-2,2,-2,-2,2,-2,-2
   bent {1,-1}   1 -1 1 1   1 -1 1 1   -1 -1 1 -1   1 1 -1 1
    FWT          4,4,-4,4,-4,4,-4,-4,4,4,-4,4,4,-4,4,4

   bent {0,1}    0 0 1 0   0 1 0 0   1 0 0 0   1 1 1 0
    FWT          6,2,2,-2,-2,2,-2,2,-2,-2,-2,-2,2,2,-2,-2
   bent {1,-1}   1 1 -1 1   1 -1 1 1   -1 1 1 1   -1 -1 -1 1
    FWT          4,-4,-4,4,4,-4,4,-4,4,4,4,4,-4,-4,4,4

Bent sequences are said to have the highest possible uniform nonlinearity. But, to put this in perspective, recall that we expect a random sequence of 16 bits to have 8 bits different from any particular sequence, linear or otherwise. That is also the maximum possible nonlinearity, and here we actually get a nonlinearity of 6.

There are various more or less complex constructions for these sequences. In most cryptographic uses, bent sequences are modified slightly to achieve balance.

Bernoulli Trials
In statistics, observations or sampling with replacement which has exactly two possible outcomes, typically called "success" and "failure." Bernoulli trials have these characteristics:

Bernoulli trials have a Binomial distribution.

Bijective
A mapping f: X -> Y which is both one-to-one and onto. For each unique x in X there is corresponding unique y in Y. An invertible mapping function.

Binary
From the Latin for "dual" or "pair." Dominantly used to indicate "base 2": The numerical representation in which each digit has an alphabet of only two symbols: 0 and 1. This is just one particular coding or representation of a value which might otherwise be represented (with the exact same value) as octal (base 8), decimal (base 10), or hexadecimal (base 16). Also see bit and Boolean.

Possibly also the confusing counterpart to unary when describing the number of inputs or arguments to a function, but dyadic is almost certainly a better choice.

Binomial Distribution
In statistics, the probability of finding exactly k successes in n independent Bernoulli trials, when each trial has success probability p:
               n    k      n-k
   P(k,n,p) = ( )  p  (1-p)
               k

This ideal distribution is produced by evaluating the probability function for all possible k, from 0 to n.

If we have an experiment which we think should produce a binomial distribution, and then repeatedly and systematically find very improbable test values, we may choose to reject the null hypothesis that the experimental distribution is in fact binomial.

Also see the binomial section of the Ciphers By Ritter / JavaScript computation pages.

Birthday Attack
A form of attack in which it is necessary to obtain two identical values from a large population. The "birthday" part is the realization that it is far easier to find an arbitrary matching pair than to match any particular value. Often a hash attack.

Also see: birthday paradox.

Birthday Paradox
The apparent paradox that, in a schoolroom of only 23 students, there is a 50 percent probability that at least two will have the same birthday. The "paradox" is that we have an even chance of success with at most 23 different days represented.

The "paradox" is resolved by noting that we have a 1/365 chance of success for each possible pairing of students, and there are 253 possible pairs or combinations of 23 things taken 2 at a time. (To count the number of pairs, we can choose any of the 23 students as part of the pair, then any of the 22 remaining students as the other part. But this counts each pair twice, so we have 23 * 22 / 2 = 253 different pairs.)

We can compute the overall probability of success from the probability of failure (1 - 1/365 = 0.99726) multiplied by itself for each pair. The overall probability of failure is thus 0.99726253 (0.99726 to the 253rd power) or 0.4995. So the success probability for 253 pairs is 0.5005.

We can relate the probability of finding at least one "double" of some birthday (Pd) to the expected number of doubles (Ed) as:

   Pd = 1 - e-Ed  ,

so

   Ed = -Ln( 1 - Pd )

and

   365 * -Ln( 0.5 ) = 365 * 0.693 = 253  .

Also see: Estimating Population from Repetitions in Accumulated Random Samples, my "birthday" article.

Bit
A contraction of "binary digit." The smallest possible unit of information. A Boolean value: True or False; Yes or No; one or zero; Set or Cleared. Virtually all information to be communicated or stored digitally is coded in some way which fundamentally relies on individual bits. Alphabetic characters are often stored in eight bits, which is a byte.

Block
Some amount of data treated as a single unit. For example, the DES block cipher has a 64-bit block. So DES ciphers 64 bits (8 bytes or typically 8 ASCII characters) at once.

A 64-bit block supports 264 or about 1.8 x 1019 block values or code values. Each different permutation of those values can be considered a complete code. A block cipher has the ability to select from among many such codes using a key.

It is not normally possible to block-cipher just a single bit or a single byte of a block. An arbitrary stream of data can always be partitioned into one or more fixed-size blocks, but it is likely that at least one block will not be completely filled. Using fixed-size blocks generally means that the associated system must support data expansion in enciphering, if only by one block. Handling even minimal data expansion may be difficult in some systems.

Block Cipher
A cipher which requires the accumulation of data (in a block) before ciphering can complete. Other than simple transposition ciphers, this seems to be the province of ciphers designed to emulate a keyed simple substitution with a table of size far too large to realize. A block cipher operates on a block of data (for example, multiple bytes) in a single ciphering, as opposed to a stream cipher, which operates on bytes or bits as they occur. Block ciphers can be called "codebook-style" ciphers. Also see Variable Size Block Cipher.

A block cipher is a transformation between plaintext block values and ciphertext block values, and is thus an emulated simple substitution on huge block-wide values. Within a particular block size, both plaintext and ciphertext have the same set of possible values, and when the ciphertext values have the same ordering as the plaintext, ciphering is obviously ineffective. So effective ciphering depends upon re-arranging the ciphertext values from the plaintext ordering, and this is a permutation of the plaintext values. A block cipher is keyed by constructing a particular permutation of ciphertext values.

Block Cipher Data Diffusion

In an ideal block cipher, changing even a single bit of the input block will change all bits of the ciphertext result, each with independent probability 0.5. This means that about half of the bits in the output will change for any different input block, even for differences of just one bit. This is overall diffusion and is present in a block cipher, but not in a stream cipher. Data diffusion is a simple consequence of the keyed invertible simple substitution nature of the ideal block cipher.

Improper diffusion of data throughout a block cipher can have serious strength implications. One of the functions of data diffusion is to hide the different effects of different internal components. If these effects are not in fact hidden, it may be possible to attack each component separately, and break the whole cipher fairly easily.

Partitioning Messages into Fixed Size Blocks

A large message can be ciphered by partitioning the plaintext into blocks of a size which can be ciphered. This essentially creates a stream meta-cipher which repeatedly uses the same block cipher transformation. Of course, it is also possible to re-key the block cipher for each and every block ciphered, but this is usually expensive in terms of computation and normally unnecessary.

A message of arbitrary size can always be partitioned into some number of whole blocks, with possibly some space remaining in the final block. Since partial blocks cannot be ciphered, some random padding can be introduced to fill out the last block, and this naturally expands the ciphertext. In this case it may also be necessary to introduce some sort of structure which will indicate the number of valid bytes in the last block.

Block Partitioning without Expansion

Proposals for using a block cipher supposedly without data expansion may involve creating a tiny stream cipher for the last block. One scheme is to re-encipher the ciphertext of the preceding block, and use the result as the confusion sequence. Of course, the cipher designer still needs to address the situation of files which are so short that they have no preceding block. Because the one-block version is in fact a stream cipher, we must be very careful to never re-use a confusion sequence. But when we only have one block, there is no prior block to change as a result of the data. In this case, ciphering several very short files could expose those files quickly. Furthermore, it is dangerous to encipher a CRC value in such a block, because exclusive-OR enciphering is transparent to the field of mod 2 polynomials in which the CRC operates. Doing this could allow an Opponent to adjust the message CRC in a known way, thus avoiding authentication exposure.

Another proposal for eliminating data expansion consists of ciphering blocks until the last short block, then re-positioning the ciphering window to end at the last of the data, thus re-ciphering part of the prior block. This is a form of chaining and establishes a sequentiality requirement which requires that the last block be deciphered before the next-to-the-last block. Or we can make enciphering inconvenient and deciphering easy, but one way will be a problem. And this approach cannot handle very short messages: its minimum size is one block. Yet any general-purpose ciphering routine will encounter short messages. Even worse, if we have a short message, we still need to somehow indicate the correct length of the message, and this must expand the message, as we saw before. Thus, overall, this seems a somewhat dubious technique.

On the other hand, it does show a way to chain blocks for authentication in a large-block cipher: We start out by enciphering the data in the first block. Then we position the next ciphering to start inside the ciphertext of the previous block. Of course this would mean that we would have to decipher the message in reverse order, but it would also propagate any ciphertext changes through the end of the message. So if we add an authentication field at the end of the message (a keyed value known on both ends), and that value is recovered upon deciphering (this will be the first block deciphered) we can authenticate the whole message. But we still need to handle the last block padding problem and possibly also the short message problem.

Block Size and Plaintext Randomization

Ciphering raw plaintext data can be dangerous when the cipher has a small block size. Language plaintext has a strong, biased distribution of symbols and ciphering raw plaintext would effectively reduce the number of possible plaintexts blocks. Worse, some plaintexts would be vastly more probable than others, and if some known plaintext were available, the most-frequent blocks might already be known. In this way, small blocks can be vulnerable to classic codebook attacks which build up the ciphertext equivalents for many of the plaintext phrases. This sort of attack confronts a particular block size, and for these attacks Triple-DES is no stronger than simple DES, because they both have the same block size.

The usual way of avoiding these problems is to randomize the plaintext block with an operating mode such as CBC. This can ensure that the plaintext data which is actually ciphered is evenly distributed across all possible block values. However, this also requires an IV which thus expands the ciphertext.

Another approach is to apply data compression to the plaintext before enciphering. If this is to be used instead of plaintext randomization, the designer must be very careful that the data compression does not contain regular features which could be exploited by The Opponents.

An alternate approach is to use blocks of sufficient size for them to be expected to have a substantial amount of uniqueness or "entropy." If we expect plaintext to have about one bit of entropy per byte of text, we might want a block size of at least 64 bytes before we stop worrying about an uneven distribution of plaintext blocks. This is now a practical block size.

Boolean
TRUE or FALSE; one bit of information.

Boolean Function
A function which produces a Boolean result. The individual output bits of an S-box can each be considered to be separate Boolean functions.

Boolean Function Nonlinearity
The number of bits which must change in the truth table of a Boolean function to reach the closest affine Boolean function. This is the Hamming distance from the closest "linear" function.

Typically computed by using a fast Walsh-Hadamard transform on the Boolean-valued truth table of the function. This produces the unexpected distance to every possible affine Boolean function (of the given length). Scanning those results for the maximum value implies the minimum distance to some particular affine sequence.

Especially useful in S-box analysis, where the nonlinearity for the table is often taken to be the minimum of the nonlinearity values computed for each output bit.

Also see the Active Boolean Function Nonlinearity Measurement in JavaScript page of the Ciphers By Ritter / JavaScript computation pages.

Boolean Logic
The logic which applies to variables which have only two possible values. Also the digital hardware devices which realize such logic, and are used to implement a electronic digital computers.

Boolean Mapping
A mapping of some number n Boolean variables into some number m Boolean results. For example, an S-box.

Break
The result of a successful cryptanalytic attack. To destroy the advantage of a cipher in hiding information.

A cipher is "broken" when the information in a message can be extracted without the key, or when the key itself can be recovered. The strength of a cipher can be considered to be the minimum effort required for a break, by any possible attack. A break is particularly significant when the work involved need not be repeated on every message.

The use of the term "break" can be misleading when an impractical amount of work is required to achieve the break. This case might be better described a "theoretical" or "certificational" weakness.

Block Size
The amount of data in a block. For example, the size of the DES block is 64 bits or 8 bytes or 8 octets.

Brute Force Attack
A form of attack in which each possibility is tried until success is obtained. Typically, a ciphertext is deciphered under different keys until plaintext is recognized. On average, this may take about half as many decipherings as there are keys.

Recognizing plaintext may or may not be easy. Even when the key length of a cipher is sufficient to prevent brute force attack, that key will be far too small to produce every possible plaintext from a given ciphertext (see perfect secrecy). Combined with the fact that language is redundant, this means that very few of the decipherings will be words in proper form. Of course, if the plaintext is not language, but is instead computer code, compressed text, or even ciphertext from another cipher, recognizing a correct deciphering can be difficult.

Brute force is the obvious way to attack a cipher, and the way any cipher can be attacked, so ciphers are designed to have a large enough keyspace to make this much too expensive to use in practice. Normally, the design strength of a cipher is based on the cost of a brute-force attack.

Bug
Technical slang for "error in design or implementation." An unexpected system flaw. Debugging is a normal part of system development and interactive system design.

Byte
A collection of eight bits. Also called an "octet." A byte can represent 256 different values or symbols. The common 7-bit ASCII codes used to represent characters in computer use are generally stored in a byte; that is, one byte per character.


Capacitor

A basic electronic component which acts as a reservoir for electrical power in the form of voltage. A capacitor thus acts to "even out" the voltage across its terminals, and to "conduct" voltage changes from one terminal to the other. A capacitor "blocks" DC and conducts AC in proportion to frequency. Capacitance is measured in Farads: A current of 1 Amp into a capacitance of 1 Farad produces a voltage change of 1 Volt per Second across the capacitor.

Typically, two conductive "plates" or metal foils separated by a thin insulator, such as air, paper, or ceramic. An electron charge on one plate attracts the opposite charge on the other plate, thus "storing" charge. A capacitor can be used to collect a small current over long time, and then release a high current for a short time, as used in a camera strobe or "flash."

Also see inductor and resistor.

CBC
CBC or Cipher Block Chaining is an operating mode for block ciphers. CBC mode is essentially a crude meta-stream cipher which streams block transformations.

In CBC mode the ciphertext value of the preceding block is exclusive-OR combined with the plaintext value for the current block. This has the effect of distributing the combined block values evenly among all possible block values, and so prevents codebook attacks.

On the other hand, ciphering the first block generally requires an IV or initial value to start the process. The IV necessarily expands the ciphertext, which may or may not be a problem. And the IV must be dynamically random-like so that statistics cannot be developed on the first block of each message sent under the same key.

In CBC mode, each random-like confusing value is the ciphertext from each previous block. Clearly this ciphertext is exposed to The Opponent, so there would seem to be little benefit associated with hiding the IV, which is just the first of these values. But if The Opponent knows the first sent plaintext, and can intercept and change the message IV, The Opponent can manipulate the first block of received plaintext. Because the IV does not represent a message enciphering, manipulating this value does not also change any previous block.

Accordingly, the IV may be sent enciphered or may be specifically authenticated in some way. Alternately, the complete body of the plaintext message may be authenticated, often by a CRC. The CRC remainder should be block ciphered, perhaps as part of the plaintext.

c.d.f.
In statistics, cumulative distribution function. A function which gives the probability of obtaining a particular value or lower.

CFB
CFB or Ciphertext FeedBack is an operating mode for a block cipher.

CFB is closely related to OFB, and is intended to provide some of the characteristics of a stream cipher from a block cipher. CFB generally forms an autokey stream cipher. CFB is a way of using a block cipher to form a random number generator. The resulting pseudorandom confusion sequence can be combined with data as in the usual stream cipher.

CFB assumes a shift register of the block cipher block size. An IV or initial value first fills the register, and then is ciphered. Part of the result, often just a single byte, is used to cipher data, and the resulting ciphertext is also shifted into the register. The new register value is ciphered, producing another confusion value for use in stream ciphering.

One disadvantage of this, of course, is the need for a full block-wide ciphering operation, typically for each data byte ciphered. The advantage is the ability to cipher individual characters, instead of requiring accumulation into a block before processing.

Chain
An operation repeated in a sequence, such that each result depends upon the previous result, or an initial value. One example is the CBC operating mode.

Chaos

The unexpected ability to find numerical relationships in physical processes formerly considered random. Typically these take the form of iterative applications of fairly simple computations. In a chaotic system, even tiny changes in state eventually lead to major changes in state; this is called "sensitive dependence on initial conditions." It has been argued that every good computational random number generator is "chaotic" in this sense.

In physics, the "state" of an analog physical system cannot be fully measured, which always leaves some remaining uncertainty to be magnified on subsequent steps. And, in many cases, a physical system may be slightly affected by thermal noise and thus continue to accumulate new information into its "state."

In a computer, the state of the digital system is explicit and complete, and there is no uncertainty. No noise is accumulated. All operations are completely deterministic. This means that, in a computer, even a "chaotic" computation is completely predictable and repeatable.

Chi-Square
In statistics, a goodness of fit test used for comparing two distributions. Mainly used on nominal and ordinal measurements. Also see: Kolmogorov-Smirnov.

In the usual case, many independent samples are counted by category or separated into value-range "bins." The reference distribution gives us the the number of values to expect in each bin. Then we compute a X2 test statistic related to the difference between the distributions:

   X2 = SUM( SQR(Observed[i] - Expected[i]) / Expected[i] )

("SQR" is the squaring function, and we require that each expectation not be zero.) Then we use a tabulation of chi-square statistic values to look up the probability that a particular X2 value or lower (in the c.d.f.) would occur by random sampling if both distributions were the same. The statistic also depends upon the "degrees of freedom," which is almost always one less than the final number of bins. See the chi-square section of the Ciphers By Ritter / JavaScript computation pages.

The c.d.f. percentage for a particular chi-square value is the area of the statistic distribution to the left of the statistic value; this is the probability of obtaining that statistic value or less by random selection when testing two distributions which are exactly the same. Repeated trials which randomly sample two identical distributions should produce about the same number of X2 values in each quarter of the distribution (0% to 25%, 25% to 50%, 50% to 75%, and 75% to 100%). So if we repeatedly find only very high percentage values, we can assume that we are probing different distributions. And even a single very high percentage value would be a matter of some interest.

Any statistic probability can be expressed either as the proportion of the area to the left of the statistic value (this is the "cumulative distribution function" or c.d.f.), or as the area to the right of the value (this is the "upper tail"). Using the upper tail representation for the X2 distribution can make sense because the usual chi-squared test is a "one tail" test where the decision is always made on the upper tail. But the "upper tail" has an opposite "sense" to the c.d.f., where higher statistic values always produce higher percentage values. Personally, I find it helpful to describe all statistics by their c.d.f., thus avoiding the use of a wrong "polarity" when interpreting any particular statistic. While it is easy enough to convert from the c.d.f. to the complement or vise versa (just subtract from 1.0), we can base our arguments on either form, since the statistical implications are the same.

It is often unnecessary to use a statistical test if we just want to know whether a function is producing something like the expected distribution: We can look at the binned values and generally get a good idea about whether the distributions change in similar ways at similar places. A good rule-of-thumb is to expect chi-square totals similar to the number of bins, but distinctly different distributions often produce huge totals far beyond the values in any table, and computing an exact probability for such cases is simply irrelevant. On the other hand, it can be very useful to perform 20 to 40 independent experiments to look for a reasonable statistic distribution, rather than simply making a "yes / no" decision on the basis of what might turn out to be a rather unusual result.

Since we are accumulating discrete bin-counts, any fractional expectation will always differ from any actual count. For example, suppose we expect an even distribution, but have many bins and so only accumulate enough samples to observe about 1 count for every 2 bins. In this situation, the absolute best sample we could hope to see would be something like (0,1,0,1,0,1,...), which would represent an even, balanced distribution over the range. But even in this best possible case we would still be off by half a count in each and every bin, so the chi-square result would not properly characterize this best possible sequence. Accordingly, we need to accumulate enough samples so that the quantization which occurs in binning does not appreciably affect the accuracy of the result. Normally I try to expect at least 10 counts in each bin.

But when we have a reference distribution that trails off toward zero, inevitably there will be some bins with few counts. Taking more samples will just expand the range of bins, some of which will be lightly filled in any case. We can avoid quantization error by summing both the observations and expectations from multiple bins, until we get a reasonable expectation value (again, I like to see 10 counts or more). In this way, the "tails" of the distribution can be more properly (and legitimately) characterized.

Cipher
In general, a key-selected secret transformation between plaintext and ciphertext. Specifically, a secrecy mechanism or process which operates on individual characters or bits independent of semantic content. As opposed to a secret code, which generally operates on words, phrases or sentences, each of which may carry some amount of complete meaning. Also see: cryptography, block cipher, stream cipher, a cipher taxonomy, and substitution.

A good cipher can transform secret information into a multitude of different intermediate forms, each of which represents the original information. Any of these intermediate forms or ciphertexts can be produced by ciphering the information under a particular key value. The intent is that the original information only be exposed by one of the many possible keyed interpretations of that ciphertext. Yet the correct interpretation is available merely by deciphering under the appropriate key.

A cipher appears to reduce the protection of secret information to enciphering under some key, and then keeping that key secret. This is a great reduction of effort and potential exposure, and is much like keeping your valuables in your house, and then locking the door when you leave. But there are also similar limitations and potential problems.

With a good cipher, the resulting ciphertext can be stored or transmitted otherwise exposed without also exposing the secret information hidden inside. This means that ciphertext can be stored in, or transmitted through, systems which have no secrecy protection. For transmitted information, this also means that the cipher itself must be distributed in multiple places, so in general the cipher cannot be assumed to be secret. With a good cipher, only the deciphering key need be kept secret.

A Cipher Taxonomy
For the analysis of cipher operation it is useful to collect ciphers into groups based on their functioning (or intended functioning). The goal is to group ciphers which are essentially similar, so that as we gain an understanding of one cipher, we can apply that understanding to others in the same group. We thus classify not by the components which make up the cipher, but instead on the "black-box" operation of the cipher itself.

We seek to hide distinctions of size, because operation is independent of size, and because size effects are usually straightforward. We thus classify serious block ciphers as keyed simple substitution, just like newspaper amusement ciphers, despite their obvious differences in strength and construction. This allows us to compare the results from an ideal tiny cipher to those from a large cipher construction; the grouping thus can provide benchmark characteristics for measuring large cipher constructions.

We could of course treat each cipher as an entity unto itself, or relate ciphers by their dates of discovery, the tree of developments which produced them, or by known strength. But each of these criteria is more or less limited to telling us "this cipher is what it is." We already know that. What we want to know is what other ciphers function in a similar way, and then whatever is known about those ciphers. In this way, every cipher need not be an island unto itself, but instead can be judged and compared in a related community of similar techniques.

Our primary distinction is between ciphers which handle all the data at once (block ciphers), and those which handle some, then some more, then some more (stream ciphers). We thus see the usual repeated use of a block cipher as a stream meta-cipher which has the block cipher as a component. It is also possible for a stream cipher to be re-keyed or re-originate frequently, and so appear to operate on "blocks." Such a cipher, however, would not have the overall diffusion we normally associate with a block cipher, and so might usefully be regarded as a stream meta-cipher with a stream cipher component.

The goal is not to give each cipher a label, but instead to seek insight. Each cipher in a particular general class carries with it the consequences of that class. And because these groupings ignore size, we are free to generalize from the small to the large and so predict effects which may be unnoticed in full-size ciphers.

  1. BLOCK CIPHER
    A block cipher requires the accumulation of some amount of data or multiple data elements for ciphering to complete. (Sometimes stream ciphers accumulate data for convenience, as in cylinder ciphers, which nevertheless logically cipher each character independently.)

    (Note that this definition is somewhat broader than the now common understanding of a huge, and thus emulated, Simple Substitution. But there are ciphers which require blocked plaintext and which do not emulate Simple Substitution, and calling these something other than "block" ciphers negates the advantage of a taxonomy.)

    1. SUBSTITUTION CIPHER
      • A "codebook" or "simple substitution."
      • Each code value becomes a distinguishable element. Thus, substitution generally converts a collection of independent elements to a single related unit.
      • Keying constitutes a permutation or re-arrangement of the fixed set of possible code values.
      • Avalanche or data diffusion is a natural consequence of an arbitrary selection among all possible code values.
      • The usual complete binary substitution distributes bit-changes between code values binomially, and this effect can be sampled and examined statistically.
      • Avalanche is two-way diffusion in the sense that "later" plaintext can change "earlier" ciphertext.
      • A conventional block cipher is built from small components with a design intended to simulate a substitution table of a size vastly larger than anything which could be practically realized.

      1. Transposition Cipher
        • Clearly, it is necessary for all message elements which will be transposed to be collected before operations begin; this is the block cipher signature.
        • Any possible transposition is necessarily a subset of an arbitrary substitution; thus, transposition can be seen as a particular keying subset of substitution.
        • Notice, however, that the usual avalanche signature of substitution is not present, and of course the actual data values are not changed at all by transposition, just moved about.
        • Also notice that we are close to using the idea of permutation in two very different ways: first as a particular n-bit to n-bit substitution, and second as a particular re-arrangement of characters in the block. These have wildly different ciphering effects.

  2. STREAM CIPHER
    • A stream cipher does not need to accumulate some amount of data or multiple data elements for ciphering to complete. (Since we define only two main "types" of cipher, a stream cipher is the opposite of a block cipher and vise versa. It is extremely important that the definitions for block and stream ciphering enclose the universe of all possible ciphers.)
    • A stream cipher has the ability to transform individual elements one-by-one. The actual transformation usually is a block transformation, and may be repeated with the same or different keying.
    • In a stream cipher, data diffusion may or may not occur, but if it does, it is necessarily one-way (from earlier to later elements).
    • Since elements are ciphered one-by-one, changing part of the plaintext can affect that part and possibly later parts of the ciphertext; this is a stream cipher signature.
    • The simple re-use of a block transformation to cover more data than a single block is a stream operation.

    1. CONFUSION SEQUENCE
      • With a truly random sequence, used once, we have a one time pad.
      • With a pseudorandom confusion sequence and a simple additive combiner, we have a Vernam cipher.
      • A simple additive transformation becomes weak upon the second character ciphered, or immediately, under known plaintext, making strength dependent on the confusion sequence.
      • More complex transformations imply the need for correspondingly less strong confusion sequences.

      1. Autokey
        • Normally the use of ciphertext, but also perhaps plaintext, as the cipher key.
        • Can create a random-like confusion stream which will re-synchronize after ciphertext data loss.
        • Under known-plaintext, the common "ciphertext feedback" version exposes both the confusion sequence and the input which creates that sequence. This is a lot of pressure on a single transformation.

    2. MONOALPHABETIC (e.g., DES CBC)
      • The repeated use of a single fixed substitution.
      • A conventional block cipher simulates a large substitution.
      • A substitution becomes weak when its code values are re-used.
      • Code value re-use can be minimized by randomizing the plaintext block (e.g., CBC). This distributes the plaintext evenly across the possible block values, but at some point the transformation itself must change or be exposed.
      • Another alternative is to use a very large block so that code value re-use is made exceedingly unlikely. A large block also has room for a dynamic keying field which would make code value re-use even more unlikely.

    3. POLYALPHABETIC
      • The use of multiple fixed substitutions.
      • By itself, the use of multiple alphabets in a regular sequence is inherently not much stronger than just a single alphabet.
      • It is of course possible to select an alphabet or transformation at pseudo-random, for example by re-keying DES after every block ciphered. This brings back sequence strength as an issue, and opens up the sequence generator starting state as an IV.
      • A related possibility is the use of a Latin square combiner which effectively selects among a balanced set of different fixed substitution alphabets.

      1. Cylinder
        • A cipher which has or simulates the use of a number of different alphabet disks on a common rod.
        • Primary keying is the arrangement of the alphabet around each disk, and the selection and arrangement of disks.
        • By entering the plaintext on one row, any of n-1 other rows can be sent as ciphertext; this selection is an IV.
        • If the plaintext data are redundant, it is possible to avoid sending the IV by selecting the one of n-1 possible decipherings which shows redundancy. But this is not generally possible when ciphering arbitrary binary data.
        • If an IV is selected first, each character ciphering in that "chunk" is independent of each other ciphering. There is no data diffusion.
        • In general, each disk is used at fixed periodic intervals through the text, which is weak.
        • The ciphertext selection is homophonic, in the sense that different ciphertext rows each represent exactly the same plaintext.
        • Cylinder operation is not polyphonic in the usual sense: While a single ciphertext can imply any other row is plaintext, generally only one row has a reasonable plaintext meaning.

    4. DYNAMIC
      • The use of one (monoalphabetic) or multiple (polyalphabetic) substitutions which change during ciphering.

    5. ITERATIVE
      • The iterative re-use of a stream cipher with a new random IV on each iteration so as to eventually achieve the effect of a message key.
      • Each iteration seemingly must expand the ciphertext by the size of the IV, although this is probably about the same expansion we would have with a message key.
      • Unfortunately, each iteration will take some time.

Ciphering
The use of a cipher. The general term which includes both enciphering and deciphering.

Ciphertext
The result of enciphering. Ciphertext will contain the same information as the original plaintext, but hide the original information, typically under the control of a key. Without the key it should be impractical to recover the original information from the ciphertext.

Ciphertext Expansion
When the ciphertext is larger than the original plaintext.

Ciphertext expansion is the general situation: Stream ciphers need a message key, and block ciphers with a small block need some form of plaintext randomization, which generally needs an IV to protect the first block. Only block ciphers with a large size block generally can avoid ciphertext expansion, and then only if each block can be expected to hold sufficient uniqueness or "entropy" to prevent a codebook attack.

It is certainly true that in most situations of new construction a few extra bytes are not going to be a problem. However, in some situations, and especially when a cipher is to be installed into an existing system, the ability to encipher data without requiring additional storage can be a big advantage. Ciphering data without expansion supports the ciphering of data structures which have been defined and fixed by the rest of the system, provided only that one can place the cipher at the interface "between" two parts of the system. This is also especially efficient, as it avoids the process of acquiring a different, larger, amount of store for each ciphering. Such an installation also can apply to the entire system, and not require the re-engineering of all applications to support cryptography in each one.

Ciphony
Audio or voice encryption. A contraction of "ciphered telephony."

Circuit
The "circular" flow of electrons from a power source, through conductors and components and back to the power source. Or the arrangement of components which allows such flow and performs some function.

Clock
A repetitive or cyclic timing signal to coordinate state changes in a digital system. A clock can coordinate the movement of data and results through various stages of processing. Although a clock signal is digital, the source of the repetitive signal is almost always an analog circuit.

In an analog system we might produce a known delay by slowly charging a capacitor and measuring the voltage across it continuously until the voltage reaches the desired level. A big problem with this is that the circuit becomes increasingly susceptible to noise at the end of the interval.

In a digital system we create a delay by simply counting clock cycles. Since all external operations are digital, noise effects are virtually eliminated, and we can easily create accurate delays which are as long as the count in any counter we can build.

Closed
An operation on a set which produces only elements in that set.

Code
Symbols or values which stand for symbols, values, sequences, or even operations (as in computer "opcodes"). As opposed to a cipher, which operates only on individual characters or bits, classically, codes also represent words, phrases, and entire sentences. One application was to decrease the cost of telegraph messages. In modern usage, a code is often simply a correspondence between information (such as character symbols) and values (such as the ASCII code or Base-64), although computer opcodes do have independent meanings and variable lengths.

Coding is a very basic part of modern computation and generally implies no secrecy or information hiding. Some codes are "secret codes," however, and then the transformation between the information and the coding is kept secret. Also see: cryptography and substitution.

Codebook
Literally, the listing or "book" of code transformations. More generally, any collection of such transformations. Classically, letters, common words and useful phrases were numbered in a codebook; messages transformed into those numbers were "coded messages." Also see nomenclator. A "codebook style cipher" refers to a block cipher.

Codebook Attack
A form of attack in which The Opponent simply tries to build or collect a codebook of all the possible transformations between plaintext and ciphertext under a single key. This is the classic approach we normally think of as "codebreaking."

The usual ciphertext-only approach depends upon the plaintext having strong statistical biases which make some values far more probable than others, and also more probable in the context of particular preceding known values. Such attacks can be defeated if the plaintext data are randomized and thus evenly and independently distributed among the possible values. (This may have been the motivation for the use of a random confusion sequence in a stream cipher.)

When a codebook attack is possible on a block cipher, the complexity of the attack is controlled by the size of the block (that is, the number of elements in the codebook) and not the strength of the cipher. This means that a codebook attack would be equally effective against either DES or Triple-DES.

One way a block cipher can avoid a codebook attack is by having a large block size which will contain an unsearchable amount of plaintext "uniqueness" or entropy. Another approach is to randomize the plaintext block, often by using an operating mode such as CBC.

Combination
The mathematical term for any particular subset of symbols, independent of order. (Also called the binomial coefficient.) The number of combinations of n things, taken k at a time, read "n choose k" is:
    n
   ( ) = C(n,k) =  n! / (k! (n-k)!)
    k

See the combinations section of the Ciphers By Ritter / JavaScript computation pages. Also see permutation.

Combinatoric
Combinatorics is a branch of mathematics, like analysis or number theory. Combinatorics is often related to counting the subsets of finite sets. One result is to help us to understand the probability of a particular subset in the universe of possible values.

Consider a block cipher: For any given size block, there is some fixed number of possible messages. Since every enciphering must be reversible (deciphering must work), we have a 1:1 mapping between plaintext and ciphertext blocks. The set of all plaintext values and the set of all ciphertext values is the same set; particular values just have different meanings in each set.

Keying gives us no more ciphertext values, it only re-uses the values which are available. Thus, keying a block cipher consists of selecting a particular arrangement or permutation of the possible block values. Permutations are a combinatoric topic. Using combinatorics we can talk about the number of possible permutations or keys in a block cipher, or in cipher components like substitution tables.

Permutations can be thought of as the number of unique arrangements of a given length on a particular set. Other combinatoric concepts include binomials and combinations (the number of unique given-length subsets of a given set).

Combiner
In a cryptographic context, a combiner is a mechanism which mixes two data sources into a single result. A "combiner style cipher" refers to a stream cipher.

Reversible combiners are used to encipher plaintext into ciphertext in a stream cipher. The ciphertext is then deciphered into plaintext using a related inverse or extractor mechanism.

Irreversible or non-invertible combiners are often used to mix multiple RNG's into a single confusion sequence, also for use in stream cipher designs.

Also see balanced combiner, additive combiner and complete, and The Story of Combiner Correlation: A Literature Survey, in the Literature Surveys and Reviews section of the Ciphers By Ritter page.

Commutative
A dyadic operation in which exchanging the two argument values must produce the same result: a + b = b + a.

Also see: associative and distributive.

Complete
A term used in S-box analysis to describe a property of the value arrangement in an invertible substitution or, equivalently, a block cipher. If we have some input value, and then change one bit in that value, we expect about half the output bits to change; this is the result of diffusion; when partial diffusion is repeated we develop avalanche; and the ultimate result is strict avalanche. Completeness tightens this concept and requires that changing a particular input bit produce a change in a particular output bit, at some point in the transformation (that is, for at least one input value). Completeness requires that this relationship occur at least once for every combination of input bit and output bit. It is tempting to generalize the definition to apply to multi-bit element values, where this makes more sense.

Completeness does not require that an input bit change an output bit for every input value (which would not make sense anyway, since every output bit must be changed at some point, and if they all had to change at every point, we would have all the output bits changing, instead of the desired half). The inverse of a complete function is not necessarily also complete.

As originally defined in Kam and Davida:

"For every possible key value, every output bit ci of the SP network depends upon all input bits p1,...,pn and not just a proper subset of the input bits." [p.748]
Kam, J. and G. Davida. 1979. Structured Design of Substitution-Permutation Encryption Networks. IEEE Transactions on Computers. C-28(10): 747-753.

Component
A part of a larger construction; a building-block in an overall design or system. Modern digital design is based on the use of a few general classes of pre-defined, fully-specified parts. Since even digital logic can use or even require analog values internally, by enclosing these values the logic component can hide complexity and present the appearance of a fully digital device.

The most successful components are extremely general and can be used in many different ways. Even as a brick is independent of the infinite variety of brick buildings, a flip-flop is independent of the infinite variety of logic machines which use flip-flops.

The source of the ability to design and build a wide variety of different electronic logic machines is the ability to interconnect and use a few very basic but very general parts.

Electronic components include

Cryptographic system components include:

Computer
Originally the job title for a person who performed a laborious sequence of arithmetic computations. Now a machine for performing such calculations.

A logic machine with:

  1. Some limited set of fundamental computations. Typical operations include simple arithmetic and Boolean logic. Each operation is selected by a particular operation code value or "opcode." This is a hardware interpretation of the opcode.

  2. The ability to follow a list of instructions or commands, performing each in sequence. Thus capable of simulating a wide variety of far more complex "instructions."

  3. The ability to execute or perform at least some instructions conditionally, based on parameter values or intermediate results.

  4. The ability to store values into a numbered "address space" which is far larger than the instruction set, and later to recover those values when desired.

Also see: source code, object code and software.

Conductor
A material in which electron flow occurs easily. Typically a metal; usually copper, sometimes silver, brass or even aluminum. A wire. As opposed to an insulator.

Confusion
Those parts of a cipher mechanism which change the correspondence between input values and output values. In contrast to diffusion.

Confusion Sequence
The sequence combined with data in a stream cipher. Normally produced by a random number generator, it is also called a "running key."

Contextual
In the study of logic, an observed fact dependent upon other facts not being observed. Or a statement which is conditionally true, provided other unmentioned conditions have the appropriate state. As opposed to absolute.

Conventional Cipher
A secret key cipher.

Congruence
Casually speaking, the remainder after a division of integers.

In number theory we say than integer a (exactly) divides integer b (denoted a | b) if and only if there is an integer k such that ak = b.

In number theory we say that integer a is congruent to integer b modulo m, denoted a = b (mod m), if and only if m | (a - b). Here m is the divisor or modulus.

Convolution
Polynomial multiplication. A multiplication of each term against each other term, with no "carries" from term to term. Also see correlation.

Used in the analysis of signal processing to develop the response of a processing system to a complicated real-valued input signal. The input signal is first separated into some number of discrete impulses. Then the system response to an impulse -- the output level at each unit time delay after the impulse -- is determined. Finally, the expected response is computed as the sum of the contributions from each input impulse, multiplied by the magnitude of each impulse. This is an approximation to the convolution integral with an infinite number of infinitesimal delays. Although originally accomplished graphically, the process is just polynomial multiplication.

It is apparently possible to compute the convolution of two sequences by taking the FFT of each, multiplying these results term-by-term, then taking the inverse FFT. While there is an analogous relationship in the FWT, in this case the "delays" between the sequences represent mod 2 distance differences, which may or may not be useful.

Correlation
In general, the probability that two sequences of symbols will, in any position, have the same symbol. We expect two random binary sequences to have the same symbols about half the time.

One way to evaluate the correlation of two real-valued sequences is to multiply them together term-by-term and sum all results. If we do this for all possible "delays" between the two sequences, we get a "vector" or 1-dimensional array of correlations which is a convolution. Then the maximum value represents the delay with the best correlation.

Correlation Coefficient
The value from -1 to +1 describing the correlation of two binary sequences, averaged over the length of interest. Correlation coefficient values are related to the probability that, given a symbol from one sequence, the other sequence will have that same symbol. A value of:

"The correlation coefficient associated with a pair of Boolean functions f(a) and g(a) is denoted by C(f,g) and is given by

C(f,g) = 2 * prob(f(a) = g(a)) - 1 ."

Daemen, J., R. Govaerts and J. Vanderwalle. 1994. Correlation Matrices. Fast Software Encryption. 276. Springer-Verlag.

CRC
Cyclic Redundancy Check: A fast error-check hash based on mod 2 polynomial operations.

A CRC is essentially a fast remainder operation over a huge numeric value which is the data. (For best speed, the actual computation occurs as mod 2 polynomial operations.) The CRC result is an excellent (but linear) hash value corresponding to the data.

No CRC has any appreciable strength, but some applications -- even in cryptography -- need no strength:

Cryptanalysis
That aspect of cryptology which concerns the strength analysis of a cryptographic system, and the penetration or breaking of a cryptographic system. Also "codebreaking."

Because the