Set Theory Primer for Music 

Part I. Nonlinear Sets

copyright © 1998, 2002, rev 2005 by Larry Solomon



Basic Definitions

Abdo. see pitch number

div, or directed-interval vector, also interval string. The distance between successive (ordered) pcs cycling to an octave. A prime div is the div of a prime form.

Forte Prime. A generalized version of a set that includes its inversion.

Index number. The transposition number, in semitones, above a reference pc. P5 would be a transposition up 5 semitones from P0.

Interval. The distance between two pitches. In set theory intervals are measured by the number of semitones. Thus, CE is not a major third (M3) but 4 semitones, or simply 4. A minor sixth would be 8.

Interval class (ic). The distance between two pitch classes, measured by the shortest distance. C to G may be the interval of 7, but its interval class is 5. Thus, the largest ic is the tritone (6).

Interval String. see div

Modulo 12 (mod12). An arithmetic system nearly identical to that of a clock, where 13=1, 14=2 etc. However, in modulo 12 the number 12=0. If we want to know what 2 hours past 11 is (11+2), we say it is one o'clock (1). Thus, in mod12, 11+2=1, and there is no number greater than 11.
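The clock arithmetic and the interval-class rule above can be tried out in a few lines of Python. This is only an illustrative sketch; the function names `mod12` and `interval_class` are made up for this example and do not come from any library.

```python
def mod12(n: int) -> int:
    """Reduce any integer to a pitch number in the range 0..11."""
    return n % 12

def interval_class(a: int, b: int) -> int:
    """Shortest distance between two pitch classes, from 0 up to 6."""
    d = (b - a) % 12
    return min(d, 12 - d)

print(mod12(11 + 2))         # two hours past eleven: 1
print(mod12(12))             # 12 wraps around to 0
print(interval_class(0, 7))  # C to G: interval 7, interval class 5
```

Python's `%` operator always returns a non-negative result for a positive modulus, so a subtraction such as 0-9 also lands in the range 0 to 11.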

Normal form or normal order. A cyclic permutation of a pc set arranged in ascending order as compactly as possible with respect to the first pc. Each pc is represented by a pitch number in the absolute-do system. The normal order of an F major chord would be 590.

Pitch class (pc). All pitches with the same name plus their enharmonic equivalents; e.g. all C#s make up a single pitch class. But, Db and Bx are also in the same class.

Pitch number (pin). Each pc can be represented by a number from 0 to 11 in the twelve-tone system.

C  C#  D  D#  E  F  F#  G  G#  A  Bb  B
0  1   2  3   4  5  6   7  8   9  10  11
0  1   2  3   4  5  6   7  8   9  A   B

The first row of numbers in this table indicates the decimal notation for each pc. The last row shows the same pcs in hexadecimal (base 16) notation. The table shows abdo (absolute-do) notation, where C is always zero (0). In the reldo (relative-do) notation, any pc may be set to zero, usually the first of an arbitrary rotation. Thus, FAC is represented as 590 in abdo, but as 047 in reldo.
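As a rough illustration of the two notations, the following Python sketch converts note names to pitch numbers and prints them in abdo and reldo form. The helper names (`pin`, `abdo`, `reldo`) and the spelling of accidentals (`#`, `b`, `x`) are assumptions made for this example only.

```python
# Pitch numbers in abdo notation (C = 0); accidentals shift by semitones.
NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
SHIFTS = {"#": 1, "b": -1, "x": 2}  # sharp, flat, double sharp
DIGITS = "0123456789AB"             # A = 10, B = 11, as in the table

def pin(name):
    """Pitch number for a note name such as 'C#', 'Bb', or 'Bx'."""
    value = NATURALS[name[0]]
    for accidental in name[1:]:
        value += SHIFTS[accidental]
    return value % 12

def abdo(names):
    """Digits with C fixed at zero."""
    return "".join(DIGITS[pin(n)] for n in names)

def reldo(names):
    """Digits with the first pc of the list set to zero."""
    first = pin(names[0])
    return "".join(DIGITS[(pin(n) - first) % 12] for n in names)

print(abdo(["F", "A", "C"]))   # 590
print(reldo(["F", "A", "C"]))  # 047
```

Enharmonic spellings collapse to the same pin, so `pin("C#")`, `pin("Db")`, and `pin("Bx")` all give 1.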

pcs or pitch-class set. A group of pitch classes.

Protoprime. The prime form of a set without including its inversion.

Reldo. see Pitch number

Unordered set, (or nonlinear set). A set whose temporal order is irrelevant, as in chords.


1. Identifying a Set Class, the Prime

The first and most important way that a pc set is identified is by its protoprime, or simply prime. However, sometimes the normal form, or normal order, is used. The normal order may be considered as a step on the way to the prime. So, we'll start by figuring it. As in the above glossary, the normal order is a cyclic permutation of a pc set arranged in ascending order as compactly as possible with respect to the first pc. Each pc is represented by a pitch number in the absolute-do system. Thus, the normal order of an F major chord would be 590. Here are the quickest steps for finding the normal form and then the prime:

1. Eliminate any duplicate pcs. For example, given D# C G# F# A C G#, eliminate the duplicate C and G#. With the pitch duplications eliminated, this pc set is initially labeled 30869.

2. Place the numbers in ascending order, 03689.

3. Figure the intervals between consecutive pairs of pitch numbers, cycling back to the initial pc and working in mod12. 3-0=3, 6-3=3, 8-6=2, 9-8=1, 0-9=3 (that is, 12-9). Thus, the intervals are 33213. This is called the directed-interval vector, or div.

pins = 0 3 6 8 9
div  = 3 3 2 1 3

4. Find the index number by locating the largest interval number. In this case, the largest interval number is 3, but there are three of them. When the largest number occurs more than once, choose the one with the smallest number following it (cyclically). This would be the second 3 in the above example, which is followed by 2. The pin (pitch number) at which this interval arrives is the index number. In our example the index pin is 6. (If the "smallest" number occurs more than once following a tie, then the next number should be considered using the same criteria, etc.)

5. Arrange the pins ascending from the index number: 68903. This is the normal form. The normal form has very little use and can be discarded.

6. The prime is figured from the normal form by setting the first pin to zero by transposition. This is done in our example by subtracting 6. Subtract the same number from all the pins, working in mod12. [6-6=0, 8-6=2, 9-6=3, 0-6=6, 3-6=9]. The result is 02369. This is the prime, which may be simplified to 2369, omitting the superfluous leading zero.

A more elegant (simpler) way to get the prime is to use the div, or interval string. In our example this is 33213. The largest interval followed by the smallest is the second 3. This points to the next interval as the starting interval for the prime. Therefore, cycling the intervals, we get 21333. The digits should always sum to 12 to complete an octave. If we build a set class from this we get 02369. To get the Inversion (I), reverse the div: 33312. Then find the inversion's prime div by the same method: 12333. Building a set class from this we get: I=01369.

7. The Forte prime may or may not differ from the protoprime. To get the Forte Prime (after Allen Forte), compare the prime with its inversion. The one which is most compacted toward zero is the Forte Prime. In this case, compare P=02369 with I=01369. The latter is most compacted (smallest interval) near zero. So, the Forte Prime is 01369. Remember that Forte Primes do not discriminate between major and minor.
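The steps above can be sketched in Python using the div shortcut. This is a minimal sketch, not a definitive implementation: all function names are hypothetical, and "most compacted toward zero" is taken here to mean a digit-by-digit comparison from the left, which is an assumption about how ties between P and I should be settled.

```python
DIGITS = "0123456789AB"

def div_of(pins):
    """Cyclic directed-interval vector of a set of pitch numbers."""
    s = sorted(set(p % 12 for p in pins))  # drop duplicates, sort ascending
    # A one-note set gets the single interval 12 (a full octave).
    return [(s[(i + 1) % len(s)] - s[i]) % 12 or 12 for i in range(len(s))]

def prime_div(div):
    """Rotate the div so a largest interval comes last; among ties, keep
    the rotation whose following intervals are smallest."""
    biggest = max(div)
    rotations = [div[i + 1:] + div[:i + 1]
                 for i in range(len(div)) if div[i] == biggest]
    return min(rotations)

def build(div):
    """Build pitch numbers upward from 0, ignoring the closing interval."""
    pins, total = [0], 0
    for step in div[:-1]:
        total += step
        pins.append(total)
    return "".join(DIGITS[p] for p in pins)

def protoprime(pins):
    return build(prime_div(div_of(pins)))

def inversion_prime(pins):
    # Reversing the div projects the intervals in the opposite direction.
    return build(prime_div(div_of(pins)[::-1]))

def forte_prime(pins):
    # Assumption: compare P and I digit by digit and keep the smaller.
    return min(protoprime(pins), inversion_prime(pins))

example = [3, 0, 8, 6, 9, 0, 8]   # D# C G# F# A C G#
print(protoprime(example))        # 02369
print(inversion_prime(example))   # 01369
print(forte_prime(example))       # 01369
```

Comparing whole rotations works here because every candidate rotation ends with the same largest interval, so the comparison is decided by the intervals that follow it.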

Ex 1.

  1. Consider the pitch set D#, Bb, F#, D, Ab, A#, G#.
  2. Assign pitch numbers, i.e., 3A628A8.
  3. Eliminate duplications, including enharmonics, i.e., 3A628.
  4. Put the numbers in order: 2368A.
  5. Compute the cyclic div by subtraction: 13224. (Remember that the last interval cycles the set back to the first pitch.)
  6. Find the largest interval (4), the index.
  7. Cycle the div starting with the next interval, in this case the div is already in the required order: 13224.
  8. Starting with zero (0), generate the prime: P=01468 .

NOTICE: This is all that is required to identify the set class (The set class is the protoprime). The following are additional steps for computing the inversion and the Forte Prime.

  1. To get the inversion, project the div intervals in the reverse order (42231).
  2. As before, find the largest interval: 42231. (The index here is 4.)
  3. Cycle the div from the next interval: 22314
  4. Generate the Inversion from this: 02478.
  5. If desired, the Forte Prime can be computed by selecting the most compact form from P and I, which in this case is 01468.

Ex 2.

  1. Pitch set: BGDFBbBD#A#E#.
  2. Assign pitch numbers (pins), using hexadecimal: B725AB3A5
  3. Eliminate pin duplications: B725A3
  4. Put numbers in order: 2357AB
  5. Compute cyclic div by subtraction: 122313
  6. Find largest interval followed by smallest combination, cyclically: 122313 (the last 3 is followed by 1,2 cyclically, whereas the other 3 is followed by 1,3, so the last 3 is the index).
  7. Cycle the div starting with the number following the index: 122313 (same result in this case)
  8. Starting with zero (0) generate the prime: 013589 (the last 3 of the div cycles the set back to zero)

To find the Forte Prime:

  1. Reverse (retrograde) the div to get the inverse div: 313221
  2. Find the index for the inversion by locating the largest interval followed by the smallest combination: 313221
  3. Cycle the div starting with the number following the index: 132213
  4. Generate the Inversion, starting again with zero: 014689. In some cases the inversion is identical with the protoprime. These sets are called mirrors.
  5. Find the Forte Prime by comparing the prime with the inversion (when they differ) and choose the most compact form, e.g., 013589

The Forte Primes can be computed in real time on the Internet at Paul Nelson's Tools.

Primes from the Keyboard

The keyboard of a piano, organ, or other electronic keyboard, can be used to simplify the determination of the prime. For example, a dominant seventh chord, e.g., a C7, can be imagined or played on a keyboard in four possible configurations or positions.

[Figure: the four keyboard positions of the C7 chord, labeled (1) to (4).]

These are the cyclical rotations of this set, commonly known as root position and first, second, and third inversions. From these, choose the most compact form under the hand, i.e., number 2, which encompasses the smallest interval, a minor sixth or 8 semitones. Using this form, set the first pc to zero, i.e., using reldo, and figure the intervals in semitones above it; i.e., 0368. This is the prime. This method can be used for any set. If two or more forms compete with the smallest span, choose the one with the smallest interval from the bass to the next note above.

Interval String Notation

The most elegant way to represent a set class is with Interval String Notation (ISN), first documented by Ernst-Lecher Bacon in The Monist, 27:1, October 1917, under the title "Our Musical Idiom". In this system a pc set is represented by a series of intervals (in semitones) that fills an octave. Thus, the minor chord, 037, becomes 345 in ISN: the intervals between the pcs, with one more to complete the octave. On close scrutiny it will be ascertained that a set class is really a directed-interval vector, or div, rather than a pitch-class set. A new catalog of sets may be constructed with this notation in which set identity is very elegant and economical, without the need for set names; e.g., the C7 chord, 0368 (4-27B), becomes 3324. The half-diminished seventh (0258 or 4-27) becomes simply 2334. In ISN, major (435) and minor (345) chords are clearly nonequivalent, just as are the dominant seventh and half-diminished seventh. The intervals in ISN should always add up to 12.
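The conversion between a prime form and its interval string is mechanical, as the following Python sketch suggests. The names `to_isn` and `from_isn` are hypothetical, and the sketch assumes primes are written with the digits 0-9, A, B used elsewhere in this primer.

```python
def to_isn(prime):
    """Interval string of a prime form such as '0368'."""
    pins = [int(ch, 16) for ch in prime]           # 'A' -> 10, 'B' -> 11
    steps = [b - a for a, b in zip(pins, pins[1:])]
    steps.append(12 - pins[-1] + pins[0])          # closing interval fills the octave
    return "".join(format(s, "X") for s in steps)

def from_isn(isn):
    """Prime form (starting on 0) built from an interval string."""
    steps = [int(ch, 16) for ch in isn]
    if sum(steps) != 12:
        raise ValueError("the intervals must add up to 12")
    pins, total = [0], 0
    for s in steps[:-1]:                           # the last interval only closes the cycle
        total += s
        pins.append(total)
    return "".join(format(p, "X") for p in pins)

print(to_isn("037"))     # 345
print(to_isn("047"))     # 435
print(to_isn("0368"))    # 3324
print(from_isn("2334"))  # 0258
```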

Why Does a (proto) Prime Differ from the Forte Prime?

Allen Forte's "prime forms" are actually combined pairs of protoprimes (or simply, primes) that are not perceived or conceived as such in our music. As a simple example, 047, the major chord, does not appear in Forte's table, but is subsumed into 037, the minor chord. (It is important to recognize that set theory calls these chords "inversions", which is not the same as the traditional concept of chord inversion as determined by the bass note.) Thus, it is impossible to distinguish these "inversions" in Forte's system, e.g., impossible to distinguish major chords from minor. This problem expands to all distinct pairs of prime inversions. The dominant-seventh (0368), as another example, is subsumed into the half-diminished seventh (0258), making them indistinguishable. The same is true for more complex sets. The Table of Set Classes retains all the original Forte set-names, but identifies each distinct inversion as a "B" form, an identifying label that is suffixed to the Forte name for each inversion. Thus, these additional primes are reinstated to their proper position in the pantheon of chords, distinct, yet related, to their inversions. In no way does this subtract from the information of set theory, nor does it change Forte's foundational sets. Rather, it embraces them and expands upon them; i.e., more information is provided -- information that is omitted by subsumption of inversions into the same set class. It also has the additional benefit of simplifying the determination of the prime form by elimination of the steps that include the inversion, normal form, "best normal form", which are unnecessary and have little use.

It is maintained by some theorists that the reduction in the Forte primes is valid because of the "atonal" context for which set theory was designed; i.e., major and minor chords are the same in an "atonal" context. But the division between "tonality" and "atonality" is itself questionable and objectively indefinable, just as is "atonal music". (See my web essay Tonality, Modality and Atonality.) Schoenberg himself maintained that "atonality" is actually a misnomer and is indeterminate. Even the concept "pantonality" can only be defined subjectively. Forte himself uses set theory to analyze Stravinsky's Rite of Spring and other at least marginally tonal music, such as Scriabin's late music. Major and minor chords are found in the Rite, and they are rendered indistinguishable by Forteian analysis. I would contend that these chords are not heard as identical sonorities in this or any other context. They are simply not normally perceived as equivalent set classes. The division between tonal and atonal music is very unclear. The problem is exacerbated in the "atonal" work of Schoenberg, Ives, Satie, and others.


2. Identifying the Set Name and Set Class

The set name is found in the Table of Pc Sets. The Prime, 02369, is found as set number 125 with the set name 5-31B. The Forte Prime, 01369, is 5-31.*


Set name. A tag used to identify a pc set. Allen Forte's set name consists of two numbers separated by a dash. An example is 4-27. The number before the dash indicates the cardinality of the set (the number of pcs it contains). The number after the dash, the ordinal number, indicates a catalog number determined by its alphanumeric order in the complete list of sets. Another way to identify a set is by its prime form, which may also be its set name; 047, the major chord may be represented as simply 47 and minor as 37.

Set class. All the pc sets represented by a single prime form, including transpositions. In Forte's system the set class also includes the inversion. Thus, 037 and 047 are in the same set class.

*Note: another way to find the Forte Prime is to look up the Prime in a table of sets and note its set-name (5-31B). Find the same set-name without the B at the end (5-31). This is the Forte Prime, 01369.


3. Interval Vector

Interval vector (iv). The total ic content of a pc set. This is normally represented by an array of six digits, where the first indicates the number of semitones, the second the number of whole tones, the third the number of ic3, etc. The last digit indicates the number of tritones. The iv for a major chord is 001110.

Taking our example 02369, we can figure the interval vector using the following method:

1. Subtract the first pin from the following pins: 2369
2. Subtract the second pin from the following pins: 147
3. Subtract the third pin from the following pins: 36
4. Subtract the fourth pin from the last: 3

5. The intervals are: 2369147363. These need to be converted to ics; i.e., any number over 6 must be converted to a number less than 7. This is done by subtracting any number over 6 from 12, thereby inverting the interval. There are two such numbers here: 9 and 7, which when subtracted from 12 become 3 and 5 respectively. These are their ic numbers. So, the original list becomes 2363145363.

6. Tally the number of each ic and place each number in the corresponding position of the iv array; i.e., there is one 1, one 2, four 3s, one 4, one 5, and two 6s. Therefore, the iv is 114112.

Another way to find the iv is to look it up in the table. Notice that inversionally related sets have the same iv; thus, the Forte Prime also has an iv of 114112.
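The tallying method above amounts to visiting every pair of pcs once, which the following Python sketch does with `itertools.combinations` from the standard library. The function name `interval_vector` is chosen for this example only.

```python
from itertools import combinations

def interval_vector(pins):
    """Six-digit tally of interval classes 1..6 over every pair of pcs."""
    iv = [0] * 6
    for a, b in combinations(sorted(set(p % 12 for p in pins)), 2):
        d = (b - a) % 12
        ic = min(d, 12 - d)   # any interval over 6 is inverted
        iv[ic - 1] += 1
    return "".join(str(n) for n in iv)

print(interval_vector([0, 2, 3, 6, 9]))  # 114112 (the prime)
print(interval_vector([0, 1, 3, 6, 9]))  # 114112 (the Forte Prime)
print(interval_vector([0, 4, 7]))        # 001110 (major chord)
```

For very large sets a single tally can exceed 9, in which case the joined string is no longer six characters long; returning the list `iv` instead would avoid that.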


4. Set Relations

Complement. All the pcs that are not in a given pc set.

Subset Relation. Two sets are so related when one set is included within the other. The sets must be of differing cardinalities.

Similarity Relation. Sets of the same cardinality may be related by their similarity. There are several different types, the most important of which are described below.

Directed Interval (di). The distance between two pins that are placed in an order. E.g., the di of E to C is 8, whereas the di from C to E is 4.

Directed Interval Vector (div). The di between each successive pair in a series of pins; e.g., the div of ECGAC is 8723.


Since pc sets are not bound by the octave, two pc sets are equivalent if they map under rotation and/or transposition. EGC (470) and CEG (047) are equivalent by rotation. GBDF (7B25) and FACEb (5903) are equivalent by transposition. The first operation involves rotating the pitch numbers as in a circle. The second, transposition, is a matter of addition; add 2 to 5903 and the result is 7B25. Thus, pc sets are equivalent by these two operations. Sets FBGD (5B72) and CEGBb (047A) are equivalent after both operations, rotation and transposition.
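A short Python sketch can confirm these equivalences. Because rotation does not change pc content, the sketch simply ignores order and tries all twelve transpositions; the function names are hypothetical.

```python
def transpose(pins, n):
    """Add n semitones to every pitch number, in mod12."""
    return [(p + n) % 12 for p in pins]

def equivalent(a, b):
    """True if some transposition of a has the same pcs as b (order ignored)."""
    target = set(p % 12 for p in b)
    return any(set(transpose(a, n)) == target for n in range(12))

print(transpose([5, 9, 0, 3], 2))                 # [7, 11, 2, 5], i.e. 7B25
print(equivalent([4, 7, 0], [0, 4, 7]))           # True, by rotation
print(equivalent([5, 11, 7, 2], [0, 4, 7, 10]))   # True, rotation and transposition
print(equivalent([0, 4, 7], [0, 3, 7]))           # False, major vs. minor
```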

Inversion and Z-Related Sets

Inversion is achieved by projecting the intervals of a set in the opposite direction. Mathematically, this is achieved by subtracting the pins from 12 (the modulus). As an example, 047, the major chord, is inverted by subtracting its pins from 12 to give 085. When placed in prime form, 085 becomes 037. Thus, 037, the minor chord, is the inverse of 047, the major. In Forte's system inversionally related sets are equivalent and are made indistinguishable. Thus, 047 is subsumed into 037. In the system used here, inversionally related sets are identified as distinct but related by mutual inversion. Inversionally related sets always have the same interval vector.

Z-related sets are sets that have the same interval vector but are not related by transposition or inversion. Additionally, when two sets have the same name except for the B ending on one, the sets are inversionally related. This makes it easy to establish these relations in the tables. Z-related sets are accompanied by an extension on the set name with two dots followed by an ordinal number. This identifies the ordinal number of another set having the same interval vector.


Mirror Sets

A mirror set is actually not a relation between sets, but a relation that a set may have within itself. Such a set results in the equivalent set when inverted and is thereby called a mirror set. All such sets are indicated in the table of sets with an asterisk after their set names. Thus, such a set has no distinct inverse, but instead, each is its own inverse.
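Both the subtraction from 12 described under inversion and the mirror property can be checked mechanically. The following Python sketch uses hypothetical function names and treats a set as a mirror when some transposition of its inversion reproduces the original pcs.

```python
def invert(pins):
    """Invert a set by subtracting each pitch number from 12, in mod12."""
    return [(12 - p) % 12 for p in pins]

def is_mirror(pins):
    """True if some transposition of the inversion has the same pcs as the set."""
    original = set(p % 12 for p in pins)
    inverted = invert(original)
    return any({(p + n) % 12 for p in inverted} == original for n in range(12))

print(invert([0, 4, 7]))                  # [0, 8, 5]
print(is_mirror([0, 3, 6]))               # True, the diminished chord
print(is_mirror([0, 2, 4, 6, 8, 10]))     # True, the whole-tone scale
print(is_mirror([0, 4, 7]))               # False, its inverse is the minor chord
```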

Subset Relation

When one set is included within another, they are said to be in the subset relation, also known as the inclusion relation. This may be abbreviated S for the subset relation. A familiar example of this is the incomplete dominant-seventh chord. BGF is a subset of GBDF. BDF, the diminished chord, is also a subset of the dominant seventh.

The subset relation is the only relation that two sets of differing cardinalities may have. However, Forte has also described special set complexes that relate groups of such sets. The first is called the set complex K, where a set OR its complement are in the inclusion relation (superset or subset) with all the other sets in its group. The second is called the set complex Kh, where a set AND its complement are in the inclusion relation with all the other sets in its group. The latter is more selective. The set to which all the others are so related is called a nexus set, which is used as a reference. A table of the set complexes Kh may be found in SAM, appendix 3.

Not all subset relations are equally significant. For instance, the statement that the major chord is a subset of the 12-note set, although true, is insignificant, because all sets are subsets of the 12-note set; i.e., the statement is not discriminating. The larger the superset is, the less significant are its subsets. A 3-note subset is more significant as a subset of a 4-note set than as a subset of a 9-note set. This observation leads to a method for establishing subset significance. It is suggested here that a subset whose cardinality is more than half that of the superset is more significant than one that is not. In this way one may make a distinction between significant subsets (mapping>50%), labelled with an S, and less-significant subsets (mapping<=50%), labelled with a +.

How to Determine Subset Relations

To determine if one set is contained in another, place both in prime form. In some cases, the pins of the smaller set will be the same as those of the larger set and can, thus, reveal the subset relation. But, most of the time this will not be the case.

Figure the intervals between successive pins of both sets. These intervals should, as usual, be considered cyclic. If the div of the smaller set can be aligned with the larger or with successive sums of the larger, then the subset relation applies.

Example 1. GBF, prime form 026, has a div of 246 (cyclically). GBDF, prime form 0368, has a div of 3324 (considered cyclically, the div is 3324332... etc.). 246 aligns with the latter starting with the 24. The 6 is the sum of the two 3s; i.e., 24(3+3) = 246. Therefore, the two sets are in the subset relation.
Example 2. Compare 023568 (div=212124) with 0134679A (div=12121212). The div of the first set aligns with the second when starting with pin 7 [212124 = 21212(1+2+1)]. Therefore, they have the subset relation.
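The alignment test used in these two examples can be sketched in Python as follows. The function names are hypothetical; `div_aligns` tries every starting point in the larger div and sums successive intervals until each interval of the smaller div is matched, and `significance` applies the 50% labelling suggested earlier.

```python
def div_aligns(small, large):
    """True if the cyclic div `small` matches successive sums of `large`."""
    n = len(large)
    for start in range(n):
        pos, matched = start, True
        for target in small:
            total = 0
            # Add up consecutive intervals of the larger div, never
            # going more than once around the cycle.
            while total < target and pos < start + n:
                total += large[pos % n]
                pos += 1
            if total != target:
                matched = False
                break
        if matched and pos == start + n:
            return True
    return False

def significance(small_cardinality, large_cardinality):
    """'S' when the subset maps more than half of the superset, else '+'."""
    return "S" if small_cardinality / large_cardinality > 0.5 else "+"

print(div_aligns([2, 4, 6], [3, 3, 2, 4]))                            # True
print(div_aligns([2, 1, 2, 1, 2, 4], [1, 2, 1, 2, 1, 2, 1, 2]))       # True
print(div_aligns([4, 3, 5], [2, 2, 2, 2, 2, 2]))                      # False
print(significance(3, 4), significance(3, 9))                         # S +
```

The last `div_aligns` call shows that the major chord (435) is not a subset of the whole-tone scale, since no run of 2s can sum to 3.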


Complement Relation

Two sets are in the complement relation when one contains all the pcs that are excluded from the other. For example, a C major scale excludes "black keys". The five black keys are its complement, and vice versa. Thus, a 7-note set has a 5-note complement, a 4-note set has an 8-note complement, etc., where the complement cardinalities always add up to 12.

In the table of sets, sets with complementary cardinalities and the same ordinal number are complements. But, if one has a B ending, the complement does not. Exceptions to this are indicated with a < sign in the table of sets. This sign indicates that the complement has the same name ending.

All hexachords have hexachord complements. Many are their own complements, except for the B ending. Those that are not their own complements identify their complements by the ordinal number that follows two dots in the set name. These dots also identify Z-sets, those that have the same interval vector.

Every set is in the subset relation to its complement except for the complementary couple 7-Z12/5-Z12.

How to Determine if Two Sets are in the Complement Relation

The cardinality of the two sets must add up to 12. Write the pins of the complement of one of the sets and place it in prime form. If this matches the other set, the two sets are complements.
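A small Python sketch of this check follows. Instead of reducing the complement to prime form, it compares the complement against all twelve transpositions of the other set, which is a substituted shortcut with the same outcome when inversions are kept distinct; the function names are hypothetical.

```python
def complement(pins):
    """All pitch numbers 0..11 that are not in the given set."""
    present = set(p % 12 for p in pins)
    return [p for p in range(12) if p not in present]

def are_complements(a, b):
    """True if the complement of a is some transposition of b."""
    comp = set(complement(a))
    return any({(p + n) % 12 for p in b} == comp for n in range(12))

c_major = [0, 2, 4, 5, 7, 9, 11]
print(complement(c_major))                         # [1, 3, 6, 8, 10], the black keys
print(are_complements(c_major, [0, 2, 4, 7, 9]))   # True
print(are_complements(c_major, [0, 1, 2, 3, 4]))   # False
```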


Similarity Relations

Similarity relations are used to describe the relationships between sets of the same cardinality. These are based upon pc similarity and ic similarity.

Allen Forte describes four basic types of similarity relations, which he designates Rp, Ro, R1, and R2. Rp is determined by pc similarity, and the other three are determined by ic similarity.

Rp, maximum similarity of pc, exists when the two sets of cardinality C have at least one common subset of cardinality C-1, which is the same as saying that there is one unmatched pc.

Forte remarks that Rp by itself is "not especially significant" because it is too common, and leaves it at that. By Forte's Rp criterion, for example, 014 would be just as similar to 047 (major chord) as is 037 (minor chord). And, 014 is just as similar to 037 (minor chord) as is 036 (diminished chord). This does not agree with the way these chords are commonly perceived.

For a revision of Rp that would agree more with our perceived notions of similarity, one could require that the unmatched pc pair be within a semitone of a match. This makes the criterion for maximal similarity more selective and more discriminating, and it corresponds more closely to our perceived notions of similarity. This type of similarity could be designated as simply P.

Ro, called "minimum similarity", exists when two sets of cardinality C have no corresponding interval vector digits in common.

R1 and R2 are known as maximum similarity of interval class. Such is the case when four out of their six iv digits are equivalent. The remaining two dissimilar digits determine the difference between R1 and R2. If these are the same numbers but switched in position, then the relation is R1. If the two dissimilar digits are simply not equivalent, even if switched, the relation is R2.
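The three ic-based relations can be read off a pair of interval vectors, as in this Python sketch. The function name `ic_similarity` is hypothetical, and the vectors in the usage lines belong to the tetrachords 0124, 0134, 0123, 0148, and 0127, chosen here only as illustrations.

```python
def ic_similarity(iv_a, iv_b):
    """Classify two six-digit interval vectors as 'Ro', 'R1', 'R2', or 'none'."""
    differing = [i for i in range(6) if iv_a[i] != iv_b[i]]
    if len(differing) == 6:
        return "Ro"                      # no digit corresponds
    if len(differing) == 2:              # four of six digits match
        i, j = differing
        switched = iv_a[i] == iv_b[j] and iv_a[j] == iv_b[i]
        return "R1" if switched else "R2"
    return "none"

print(ic_similarity("221100", "212100"))  # R1: 0124 and 0134
print(ic_similarity("321000", "221100"))  # R2: 0123 and 0124
print(ic_similarity("101310", "210021"))  # Ro: 0148 and 0127
```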


When are Sets Maximally Similar?

1. They must be the same cardinality C.
2. They have four out of six iv digits corresponding and parallel (same position). (R1/R2)
3. They have all but one pc correspondence. (Rp)

Maximal similarity exists when two pc sets of the same cardinality can be mapped to one another, with the exception of one pc in one of the sets. Additionally, the interval vectors of the two sets must have four out of six matches.

These may be called the X relation (Forte's R1) when the unmatched digits are switched and the O relation (Forte's R2) when they are not. (The reasons for changing Forte's symbols become apparent when a single letter is needed on a "Relations Triangle". Additionally, X and O are easier to remember, because they neatly describe their respective relations in the shape of the letters.)


The R Relation

Another type of maximal similarity, called simply the R relation, is more selective, and is based upon a perception model that is statistically determined. By this criterion it is assumed that the perception of the similarity of small sets is easier than is the perception of the similarity of large ones; i.e., the similarity of, say, 9-note sets would be more difficult to perceive than that of their complements, 3-note sets. Thus, a formula is constructed to simulate this difference in perception. To satisfy R the sets must meet the following criteria:

1. Two pc sets of the same cardinality can be mapped to one another, with the exception of one pc in one of the sets, which must be within a semitone of a match with the unmatched pc of the other set.

2. There must be a minimum of interval correspondence T, where T equals the total number of ics corresponding in the two sets. T must be equal to or greater than SC/8, where S is the sum of the ics in the set cardinality (the sum of the digits in the iv), and C is the cardinality.

As an example, compare the following two sets, given in prime form with their interval vectors:

013578       232341
013579       142422

C=6, the cardinality. Comparing the primes, only one pin pair is unmatched (8 and 9), and they are a semitone apart, which satisfies condition 1. Then observing the ivs, the sum (S) of the ics in this cardinality is 15. (This is determined by adding the numbers in either iv. All sets of cardinality C will have the same S.)

To determine T: The number of ics common in each iv position is equal to the smaller number (comparing each pair of digits in the same position), and 132321 is the result. Adding these together gives T, which is 12 in this case.

The number 8 is the cardinality chosen to represent a reasonable limit to the perception of R; i.e., sets having cardinalities greater than 7 cannot have the R relation (although they may have other maximally similar relations).

Since SC/8=11.25 and T=12 is greater, the two sets of our example meet the criteria for the R relation.
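The arithmetic of this example can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch makes one simplifying assumption: condition 1 is tested on the two forms exactly as given, without first searching for a transposition that matches them.

```python
from itertools import combinations

def interval_vector(pins):
    """Counts of interval classes 1..6 over every pair of pcs, as a list."""
    iv = [0] * 6
    for a, b in combinations(sorted(set(pins)), 2):
        d = (b - a) % 12
        iv[min(d, 12 - d) - 1] += 1
    return iv

def r_relation(a, b):
    """Return (T, SC/8, verdict) for two pin lists of equal cardinality."""
    set_a, set_b = set(a), set(b)
    c = len(set_a)
    only_a, only_b = set_a - set_b, set_b - set_a
    # Condition 1: exactly one unmatched pc in each set, a semitone apart.
    near = False
    if len(set_b) == c and len(only_a) == 1 and len(only_b) == 1:
        d = (only_a.pop() - only_b.pop()) % 12
        near = min(d, 12 - d) == 1
    # Condition 2: T is the sum of the smaller digit at each iv position.
    iv_a, iv_b = interval_vector(a), interval_vector(b)
    t = sum(min(x, y) for x, y in zip(iv_a, iv_b))
    threshold = sum(iv_a) * c / 8
    return t, threshold, near and t >= threshold

print(r_relation([0, 1, 3, 5, 7, 8], [0, 1, 3, 5, 7, 9]))  # (12, 11.25, True)
```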


PC Invariance

Various operations are commonly performed on pc sets. These include rotations (CEG becomes EGC, GCE, ECG, etc.) and registral displacement. These operations have no effect on pc content. That is, the pc content remains invariant under these operations.

Two operations that are commonly performed on pc sets that can alter pc content are (1) transposition and (2) inversion.

Transpositional PC Invariance

Pc sets often maintain some pc invariance after a transposition. A common example is the whole-tone scale, 02468A, or 6-35*. When this set is transposed by 2 semitones all of its pcs are held invariant. This would, of course, have important compositional consequences. It would be important, then, to know when pcs are held invariant under the operation of transposition. We can determine this by examining the iv of a pc set. 6-35* has an iv of 060603. The number that appears in each position indicates the number of pcs held invariant when transposed by its respective ic. Thus, the 6 that appears in ic position 2 reveals that when this set is transposed by 2 semitones (t2), 6 pcs (all) are held invariant. Since this is an ic transposition it may be either up or down, i.e., t2 or t10. A zero in the first position reveals that when this set is transposed by 1 semitone no pcs are held constant (also true for t11). The same is true of t3 and t9. There is a 6 in the ic4 position, meaning that at t4 or t8, 6 pcs (all) are again held constant. The 3 in the ic6 position, however, may seem puzzling at first. But, recall that 6 is its own inversion; therefore, the number of invariant pcs is double the number in the ic6 position; i.e., t6 also results in 6 invariant pcs.

As another example, consider set 6-7*, 012678, with iv=420243. The iv tells us that there will be 4 pcs invariant when t=1 (or t11), 2 pcs invariant when t=2 (or t10), no pcs invariant when t=3 (or t9), 2 pcs invariant when t=4 (or t8), 4 pcs invariant when t=5 (or t7), and 6 pcs invariant when t=6.
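These counts can be verified directly by transposing the set and intersecting it with itself, as in the following Python sketch. The function name `invariants` is hypothetical.

```python
def invariants(pins, n):
    """Pcs of the set that are still in the set after transposition by n."""
    s = set(p % 12 for p in pins)
    return sorted(s & {(p + n) % 12 for p in s})

whole_tone = [0, 2, 4, 6, 8, 10]   # 6-35*, iv = 060603
set_6_7 = [0, 1, 2, 6, 7, 8]       # 6-7*,  iv = 420243

print([len(invariants(whole_tone, n)) for n in range(1, 7)])  # [0, 6, 0, 6, 0, 6]
print([len(invariants(set_6_7, n)) for n in range(1, 7)])     # [4, 2, 0, 2, 4, 6]
print(invariants(set_6_7, 1))                                 # [1, 2, 7, 8]
```

The counts for t1 through t5 equal the iv digits, and the count for t6 is double the last digit, as described above.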


Inversional PC Invariance

Inversion can and often does lead to pc invariance. But, this operation needs to be considered in combination with transposition, i.e., TnI, representing a transposition n of the inversion. The easiest way to determine this type of pc invariance is to form an addition table with the set represented horizontally and vertically. Let us take 2478 as an example.

+ | 2 4 7 8
--+--------
2 | 4 6 9 A
4 | 6 8 B 0
7 | 9 B 2 3
8 | A 0 3 4

Notice that the set is placed at the top, horizontally, and down the left side, vertically. The numbers within the table are the sums at the intersections of the pins. By tallying these numbers we can ascertain the invariant pcs under TnI. For example, the two 9s reveal that there are 2 common pcs at T9I. Further, the table tells us which pcs are invariant under this T, namely those that create the intersection: 2 and 7. The two 4s indicate 2 common pcs at T4I, and they are 2 and 8, etc.
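The tally of the addition table can be automated with a few lines of Python. This is a sketch with a hypothetical function name; it relies on the fact that a pc a stays in the set under TnI exactly when some pc b of the set satisfies a+b=n in mod12.

```python
from collections import defaultdict

def tni_invariants(pins):
    """Map each index n to the pcs of the set that remain in it under TnI."""
    s = sorted(set(p % 12 for p in pins))
    held = defaultdict(set)
    for a in s:                 # row of the addition table
        for b in s:             # column of the addition table
            held[(a + b) % 12].add(a)
    return {n: sorted(v) for n, v in sorted(held.items())}

table = tni_invariants([2, 4, 7, 8])
print(table[9])    # [2, 7]: two common pcs at T9I
print(table[4])    # [2, 8]: two common pcs at T4I
print(table[8])    # [4]: one common pc at T8I
print(1 in table)  # False: no common pcs at T1I
```

Index numbers that never appear as a sum in the table, such as 1, 5, and 7 for this set, are absent from the result, meaning no pcs are held invariant at those levels.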