Smith set
In voting systems, the Smith set, named after John H. Smith, but also known as the top cycle, or as Generalized Top-Choice Assumption (GETCHA), is the smallest non-empty set of candidates in a particular election such that each member defeats every candidate outside the set in a pairwise election. The Smith set provides one standard of optimal choice for an election outcome. Voting systems that always elect a candidate from the Smith set pass the Smith criterion and are said to be 'Smith-efficient' or to satisfy the Smith criterion.
A set of candidates each of whose members pairwise defeats every candidate outside the set is known as a dominating set.
The Smith set can be seen as defining a voting method (Smith's method) which is most often encountered when extended by an IRV tie-break as Smith/IRV or as Tideman's Alternative, or by minimax as Smith/minimax.
Properties of Smith sets
- The Smith set always exists and is well defined (see next section).
- The Smith set can have more than one candidate, either because of pairwise ties or because of cycles, such as in Condorcet's paradox.
- The Condorcet winner, if one exists, is the sole member of the Smith set. If weak Condorcet winners exist then they are in the Smith set.
- The Smith set is always a subset of the mutual majority-preferred set of candidates, if one exists.[1]
Properties of dominating sets
Theorem: Dominating sets are nested; that is, of any two dominating sets in an election, one is a subset of the other.
Proof: Suppose on the contrary that there exist two dominating sets, D and E, neither of which is a subset of the other. Then there must exist candidates d ∈ D, e ∈ E such that d ∉ E and e ∉ D. But by hypothesis d defeats every candidate not in D (including e) while e defeats every candidate not in E (including d), which is a contradiction. ∎
Corollary: It follows that the Smith set is the smallest non-empty dominating set, and that it is well defined.
Theorem: If D is a dominating set, then there is some threshold θD such that the elements of D are precisely the candidates whose Copeland scores are at least θD. (A candidate's Copeland score is the number of other candidates whom he or she defeats plus half the number of other candidates with whom he or she is tied.)
Proof: Choose d as an element of D with minimum Copeland score, and identify this score with θD. Now suppose that some candidate e ∉ D has a Copeland score not less than θD. Then since d belongs to D and e doesn't, it follows that d defeats e; and in order for e's Copeland score to be at least equal to d's, there must be some third candidate f against whom e gets a better score than does d. If f ∈ D, then we have an element of D who does not defeat e, and if f ∉ D then we have a candidate outside of D whom d does not defeat, leading to a contradiction either way. ∎
Schwartz set comparison
The Schwartz set, known as the Generalized Optimal-Choice Axiom or GOCHA, is closely related to and is always a subset of the Smith set. The Smith set is larger if and only if a candidate in the Schwartz set has a pair-wise tie with a candidate that is not in the Schwartz set.
The Smith set can be constructed from the Schwartz set by repeatedly adding two types of candidates until no more such candidates exist outside the set:
- candidates that have pairwise ties with candidates in the set,
- candidates that defeat a candidate in the set.
Note that candidates of the second type can only exist after candidates of the first type have been added.
The Smith criterion
The Smith criterion is satisfied by any voting method whose winner always belongs to the Smith set. Any method which satisfies the Smith criterion must also satisfy the Condorcet criterion; hence any method (such as IRV) which is not Condorcet consistent must also fail the Smith criterion. Minimax is the best known Condorcet method which does not satisfy the Smith criterion.
Computing the Smith set
The logical properties of dominating sets stated above can be used to construct an efficient algorithm for computing the Smith set. We have seen that the dominating sets are nested by Copeland score. It follows that by adjusting the Copeland threshold it is possible to work through the nested sets in increasing order of size until a dominating set is reached; and this set is necessarily the Smith set. Darlington sketches this method.[2]
Testing whether a set is a dominating set at each stage might repeat some calculations, but this can fairly easily be avoided leading to an algorithm whose work factor is quadratic in the number of candidates.
Detailed algorithm
The algorithm can be presented in detail through an example. Suppose that the results matrix is as follows:
2nd 1st |
A | B | C | D | E | F | G | score | |
---|---|---|---|---|---|---|---|---|---|
A | – | 1 | 1 | 1 | 1 | 1 | 0 | 5 | |
B | 0 | – | 0 | 0 | 1 | 0 | 0 | 1 | |
C | 0 | 1 | – | 0 | 1 | 1/2 | 1 | 31/2 | |
D | 0 | 1 | 1 | – | 1 | 1 | 1 | 5 | |
E | 0 | 0 | 0 | 0 | – | 0 | 0 | 0 | |
F | 0 | 1 | 1/2 | 0 | 1 | – | 0 | 21/2 | |
G | 1 | 1 | 0 | 0 | 1 | 1 | – | 4 |
Here an entry in the main table is 1 if the first candidate was preferred to the second by more voters than preferred the second to the first; 0 if the opposite relation holds; and 1/2 if there is a tie. The final column gives the Copeland score of the first candidate.
The algorithm to compute the Smith set is agglomerative: it starts with the Copeland set, which is guaranteed to be a subset of it but will often be smaller, and adds items until no more are needed. The first step is to sort the candidates according to score:
2nd 1st |
A | D | G | C | F | B | E | score | |
---|---|---|---|---|---|---|---|---|---|
A | – | 1 | 0 | 1 | 1 | 1 | 1 | 5 | |
D | 0 | – | 1 | 1 | 1 | 1 | 1 | 5 | |
G | 1 | 0 | – | 0 | 1 | 1 | 1 | 4 | |
C | 0 | 0 | 1 | – | 1/2 | 1 | 1 | 31/2 | |
F | 0 | 0 | 0 | 1/2 | – | 1 | 1 | 21/2 | |
B | 0 | 0 | 0 | 0 | 0 | – | 1 | 1 | |
E | 0 | 0 | 0 | 0 | 0 | 0 | – | 0 |
We look at the highest score – 5 – and consider the candidates (the Copeland winners) whose score is at least this high, i.e. {A,D}. These certainly belong to the Smith set, and any candidates whom they do not defeat will need to be added. To find undefeated candidates we look at the cells in the table below the top-left 2×2 square containing {A,D} (this square is shown with a broken border): the cells in question are shaded yellow in the table. We need to find the lowest (positionally) non-zero entry among these cells, which is the cell in the G row. All candidates as far down as this row, and any lower rows with the same score, need to be added to the set, which expands to {A,D,G}.
Now we look at any new cells which need to be considered, which are those below the top-left square containing {A,D,G}, but excluding those in the first two columns which we have already accounted for. The cells which need attention are shaded pale blue. As before we locate the positionally lowest non-zero entry among the new cells, adding all rows down to it, and all rows with the same score as it, to the expanded set, which now comprises {A,D,G,C}.
We repeat the operation for the new cells below the four members which are known to belong to the Smith set. These are shaded pink, and allow us to find any candidates not defeated by any of {A,D,G,C}. Again there is just one, F, whom we add to the set.
The cells which come into consideration are shaded pale green, and since all their entries are zero we do not need to add any new candidates to the set, which is therefore fixed as {A,D,G,C,F}. And by noticing that all the entries in the black box are zero, we have confirmation that all the candidates above it defeat all the candidates within it.
The following C function illustrates the algorithm by returning the cardinality of the Smith set for a given doubled results matrix r and array s of doubled Copeland scores. There are n candidates; ri j is 2 if more voters prefer i to j than prefer j to i, 1 if the numbers are equal, and 0 if more voters prefer j to i than prefer i to j ; si is the sum over j of the ri j . The candidates are assumed to be sorted in decreasing order of Copeland score.
int smithset(int ** r, int * s, int n) {
int row, col, lhs, rhs;
for (rhs = 1, lhs = 0; lhs < rhs; lhs = rhs, rhs = row + 1) {
for (; rhs < n && s[rhs] == s[rhs - 1]; rhs++); /* this line optional */
for (col = rhs, row = n; col == rhs && row >= rhs; row--)
for (col = lhs; col < rhs && r[row - 1][col] == 0; col++);
}
return lhs;
}
Smith's method
The Smith set can be seen as defining a ranked voting method in which...
All members of the Smith set are winners. Usually combined with another method.[3]
A notable example of Smith's method in combination is Smith/IRV, which reduces the field of candidates to the Smith set and then uses instant-runoff voting to break the tie if this set has more than one element. A more complicated method (also invoking IRV) is Tideman's Alternative, whose article contains a table listing important properties of the two methods. Smith/minimax uses a significantly different procedure for breaking ties.[2]
Elimination ties
The IRV component of both Smith/IRV and Tideman's Alternative will occasionally encounter a tie for bottom place amongst first preferences (the probability of this becomes arbitrarily small as the number of voters increases). When such a tie arises it may be broken in various ways. For Smith/IRV, the set of all candidates with the fewest first-order votes whose votes together total less than any other candidate's can be eliminated without changing the outcome; Tideman's Alternative recalculates the Smith Set after each single elimination, and cannot be optimized in this manner.
Instant-runoff voting cannot accept ballots with two candidates at the same rank, but reducing the field to the Smith set is possible even if voters have indicated ties between candidates. Ballots with equal first-choice rankings after eliminating non-Smith candidates must then be discarded.
References
- http://dss.in.tum.de/files/brandt-research/dodgson.pdf
- R. B. Darlington, 'Are Condorcet and minimax voting systems the best?' (v8, 2021).
- Condorcet.org mirror page.
Further reading
- Ward, Benjamin (1961). "Majority Rule and Allocation". Journal of Conflict Resolution. 5 (4): 379–389. doi:10.1177/002200276100500405. S2CID 145231466. In an analysis of serial decision making based on majority rule, describes the Smith set and the Schwartz set.
- Smith, J.H. (1973). "Aggregation of Preferences with Variable Electorates". Econometrica. The Econometric Society. 41 (6): 1027–1041. doi:10.2307/1914033. JSTOR 1914033. Introduces a version of a generalized Condorcet Criterion that is satisfied when pairwise elections are based on simple majority choice, and for any dominating set, any candidate in the set is collectively preferred to any candidate not in the set. But Smith does not discuss the idea of a smallest dominating set.
- Fishburn, Peter C. (1977). "Condorcet Social Choice Functions". SIAM Journal on Applied Mathematics. 33 (3): 469–489. doi:10.1137/0133030. Narrows Smith's generalized Condorcet Criterion to the smallest dominating set and calls it Smith's Condorcet Principle.
- Schwartz, Thomas (1986). The Logic of Collective Choice. New York: Columbia University Press. Discusses the Smith set (named GETCHA) and the Schwartz set (named GOTCHA) as possible standards for optimal collective choice.
- Somdeb Lahiri (nd), "Group and multi-criteria decision making". Outlines some properties of choice sets.