De re/de dicto distinction: a logicians’ perspective on data anonymity

Abstract

The distinction between de re (of the thing) and de dicto (of what is said) readings of sentences has long been the topic of studies in logic and philosophy of language. The article proposes to apply these concepts to anonymity. It argues that, in the proposed setting, de dicto knowledge preserves anonymity, while de re knowledge does not. The article also considers a third, “overt,” form of knowledge. The main technical result is a sound and complete logical system that captures the interplay between a data traceability expression and the de re, de dicto, and overt knowledge modalities. The article also shows that the three knowledge modalities are not definable through each other even in the presence of the traceability expression.

Introduction

Imagine that you have a nice neighbor whom you really like as a person. At the same time, you happen to hate rich people. Your neighbor is rich, but you do not know this. Thus, you like your neighbor and you hate him at the same time. However, you have these feelings in two distinct senses. You like him as an individual, but you hate the designator (“rich person”) that applies to him. In the philosophy of language, this distinction is often called de re/de dicto distinction [1–6]. You like the neighbor de re (of the thing) and hate him de dicto (of what is said). The de re/de dicto distinction is an umbrella term that is understood differently by different scholars and applies not only to preferences. As another example, imagine that you are travelling on a train and a fellow passenger starts giving you surprising medical advice. You most likely will not trust the stranger de re. However, you might be the type of person who trusts all doctors. If the passenger, unknown to you, happened to be a doctor, then you would trust her de dicto despite not trusting de re.

In addition to our examples of preferences and trust, de re/de dicto distinction also manifests itself in knowledge. Imagine a young woman Ann who is walking through a park. Ann was given up for adoption as a baby and she has never met her mom. Ann is curious about who her mom might be, so she has asked a close friend to look for the mom. While walking through the park, Ann sees an old woman sitting on a bench. Unknown to both of them, the woman is Ann’s mother. Let us now consider two possible scenarios of what might happen next. In the first scenario, the woman tells Ann that she, the woman, is sick. In this case, Ann knows that the woman is sick, the woman is Ann’s mother, thus, Ann de re knows that her mom is sick. In the second scenario, while chatting with the woman on the bench about the weather, Ann gets a text from her friend. The text tells her that the friend was able to trace Ann’s mother and has learned that the mom is sick. In this case, Ann knows de dicto that her mom is sick. Note that these two forms of knowledge are very different. Under the first scenario, Ann might tell the woman that Ann is sorry that she, the woman, is sick. She might recommend a doctor. It would be strange for Ann to recommend a doctor under the second scenario.

As our final example, imagine that Ann is your neighbor and that you are working as a technician at a health clinic. Ann visits the clinic, and a nurse there draws Ann’s blood sample and sends it to you with a doctor’s order to test if “the patient” is pregnant. You conduct the test and learn that the patient indeed is pregnant. The patient is your neighbor Ann, but you don’t know this. In spite of this, you now de dicto know that Ann is pregnant. Because you know this de dicto, your knowledge is really about the designator “patient,” not about Ann as a human being. This knowledge, as long as it stays de dicto, does not violate Ann’s anonymity. However, if you can trace the blood sample to your neighbor Ann, then you will know de re that she is pregnant. This knowledge will break Ann’s anonymity and violate her privacy. As this example shows, de re/de dicto distinction, when applied to knowledge based on data (such as test results), gives a way to characterize anonymity. Of course, privacy/anonymity is a broad term whose different aspects might be captured by different formal approaches. We are not suggesting here that de re/de dicto distinction can replace other, already existing, approaches to defining anonymity. But we think it can complement them in a meaningful way.

In this article, we propose a formal logical system that describes the interaction between de re (non-anonymized) knowledge based on data and de dicto (anonymized) forms of such knowledge. In addition to modalities representing those two forms of knowledge, the system also includes data tracability expression and a third, overt, knowledge modality. To define a formal semantics of this logical system, we combine two previously independent research areas: two-dimensional egocentric logic and data-informed epistemic logic. The main technical results of this work are the undefinability of the proposed modalities through each other and a completeness theorem for the logical system describing their interplay.

The rest of the article is organized as follows. First, we discuss egocentric logics and data-informed knowledge. Then, we give several more formal examples illustrating the difference between de re, de dicto, and overt knowledge. In the “Syntax and semantics” section, we introduce the language of our system and its formal semantics. In the “Undefinability results” section, we show that the modalities capturing these three forms of knowledge are not definable through each other. In the “Axioms” section, we state the axioms and the inference rules of our logical system. We prove their soundness and completeness in the next two sections. In the “Future work” section, we discuss k-anonimyty and how it can be captured in our setting.

Preliminaries

Egocentric logic

Traditionally, the semantics of modal logics is defined in terms of a satisfaction relation |$w\Vdash \varphi$| between a possible world w and a formula |$\varphi$|⁠. Under such semantics, the formula |$\varphi$| captures a property of the world w. Prior proposed to consider egocentric logics that capture the properties of agents rather than the properties of possible worlds [7]. The semantics of such logics can be defined in terms of a satisfaction relation |$a\Vdash \varphi$| between an agent a and a formula |$\varphi$|⁠. For example, to express the fact that agent a is pregnant, we can write |$a\Vdash \text{''is pregnant''}$|⁠. In the egocentric setting, the Boolean connectives can be used in the standard way. The statement |$a\Vdash \text{``is pregnant''} \wedge \text{``lives in Bath''}$| means that agent a is pregnant and lives in Bath. Seligman, Liu, and Girard suggested to consider friendship modality |${\sf F}$| (read “for each friend”) in the egocentric setting [8,9]. For example, they read the statement |$a\Vdash {\sf F}\, \text{``lives in Bath''}$| as “each friend of agent a lives in Bath”. As usual, modalities can be combined with Boolean connectives. For instance, the statement |$a\Vdash {\sf F}\lnot (\text{``is pregnant''} \wedge \text{``lives in Bath''})$| means that agent a does not have a pregnant friend who lives in Bath. At the same time, |$a\Vdash \lnot {\sf F}\lnot (\text{``is pregnant''} \wedge \text{``lives in Bath''})$| means that agent a does have such a friend. Modality |${\sf F}$| is also used in [10,11]. We previously introduced modality |${\sf L}$|⁠, which stands for “likes those who” [12]. For example, the statement |$a\Vdash {\sf L}\, \text{``lives in Bath''}$| means that agent a likes those who live in Bath. Modalities can be nested. For instance, the following statement |$a\Vdash {\sf L}{\sf L}\, \text{``lives in Bath''}$| means that agent a likes those who like people living in Bath.

Independently from the works on egocentric logics, a new class of two-dimensional semantics have been proposed in the philosophy of language [13]. Under such semantics, the truth value of a sentence depends not only on the current world but on some other parameter sometimes called indexical. Stalnaker suggested visualizing world/indexical combinations as cells in a two-dimensional matrix [14]. Grove and Halpern combined the egocentric logic and two-dimensional semantics ideas by considering a formal logical system where the role of an indexical is played by an agent [15–17]. The formal semantics of their logical system is defined in terms of a ternary satisfaction relation |$w,a\Vdash \varphi$|⁠. For example, the following statement: |$w,a\Vdash \text{``is pregnant''}$| means that agent a is pregnant in world w (but might not be in some other worlds). In this setting, one can define the “knows about herself” modality |${\sf K}$|⁠. Using such a modality, for example, one can say that, in world w, agent a knows that she is pregnant: |$w,a\Vdash {\sf K}\, \text{``is pregnant''}$|⁠. In this article, we refer to the Grove-Halpern class of semantics as (two-dimensional) egocentric semantics. Such semantics has been used to define “know-who” [18,19] and “know how to tell apart” [20] modalities. Besides the introduction of a new modality, an important contribution of [20] is a new technique for proving completeness technique for two-dimensional semantics. The technique builds on Stalnaker [14] matrices to recursively define the canonical model as a limit of an infinite chain of matrices of maximal consistent sets of formulae. In the current article, we further develop this construction by specifying tree structures on the cells of the matrices.

Data-informed knowledge

Imagine that there is a digital clock on a wall. We use t to denote the time shown by the clock. Suppose that in world w the clock shows 13:24. Thus, anyone who sees the clock would conclude that it is afternoon now. This conclusion can be made by any agent as long as the agent has access to the value of t. We say that this knowledge is informed by data variable t and write it as

$$\begin{eqnarray} w\Vdash {\sf K}_t\text{``It is afternoon''}. \end{eqnarray}$$

This type of knowledge in the setting where t is a Boolean variable has been proposed by Grossi, Lorini, and Schwarzentruber [21]. For arbitrary variables, it was introduced by Baltag and van Benthem [22]. The term “data-informed knowledge” is suggested by us [23,24]. We also considered data-informed beliefs [25,26]. In this work, we consider data-informed knowledge in two-dimensional egocentric settings.

De re/de dicto data-informed knowledge

Imagine a hypothetical Family Planning survey that asks participants just three questions: their age, sex, and if they expect a child. To keep the example simple, let us assume that the survey has been conducted only among four people, agents a, b, c, and d. Suppose that it is common knowledge that agent d is 25 and the rest are 23. Also, it is common knowledge that agents a and c are female and agents b and d are male. Assuming only women can be pregnant, there are 4 possible ways the survey can be answered. We call such combinations “possible worlds” and depict them in Fig. 1 as worlds |$w_1$|⁠, |$w_2$|⁠, |$w_3$|⁠, and |$w_4$|⁠. The figure also shows that a and b as well as c and d are married couples, but this becomes important only later in our story. Finally, suppose now that the real world is |$w_1$|⁠. In other words, only a is expecting a child.

If somebody has access only to the anonymized results of agent a’s answers, |$(23,F,+)$|⁠, then that person would know that the respondent is pregnant. Since the respondent, in this case, is agent a, one can say that the a’s survey data informs the knowledge that a is pregnant:

$$\begin{eqnarray} w_1,a\Vdash {\sf K}_{\text{age,sex,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(1)

At the same time, the observer of the anonymized survey data |$(23,F,+)$| would not be able to distinguish this a’s data in world |$w_1$| from the c’s data in world |$w_2$|⁠, see Fig. 1. Thus, one can argue, that the a’s survey data does not inform the knowledge that a is pregnant:

$$\begin{eqnarray} w_1,a\nVdash {\sf K}_{\text{age,sex,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(2)

The inconsistency between statements (1) and (2) comes from the fact that they refer to two different types of knowledge. In statement (1), modality |${\sf K}$| captures the knowledge about the name “participant,” which, in the current world, happens to refer to agent a. In statement (2), modality |${\sf K}$| captures the knowledge about the actual participant. As we discussed in the introduction, in the philosophy of language, this distinction between a statement about the name of an object and an actual object is known as de dicto/de re distinction. De dicto (“of what is said”) specifies that a statement refers to the name (such as “participant” in our case) and de re (“of the thing”) specifies that a statement refers to the actual object. In this article, instead of a single knowledge modality |${\sf K}$|⁠, we use modalities |${\sf D}$| and |${\sf R}$| for de dicto and de re knowledge, respectively. Thus, in our example,

$$\begin{eqnarray} &&w_1,a\Vdash {\sf D}_{\text{age,sex,reply}}\, \text{``is pregnant''},\\ &&w_1,a\nVdash {\sf R}_{\text{age,sex,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

Because de re data-informed knowledge of a formula |$\varphi$| reveals that |$\varphi$| is true about the actual person, it does not preserve the anonymity of the participant and potentially violates her privacy. At the same time, if personal data only informs de dicto knowledge, it reveals |$\varphi$| about “the participant,” not an actual person. As a result, generally speaking, it preserves the anonymity of the participant.

At this point, the reader might think that de re data-informed knowledge is stronger than de dicto one. In other words, knowing something de re implies knowing de dicto. Perhaps surprisingly, this is not true.

Indeed, let us modify our Family Planning survey as shown in Fig. 2. Here, the agents a, b, c, and d are the same two couples of the same age as in the previous example. They are again the only participants in a family planning survey. Let us suppose that instead of sex, the survey asks about the city in which the participant lives. As one can see from the figure, agents a and b live in Bath and agents c and d live in York. We assume that the city in which each couple lives, just as their ages, is public information. In addition, instead of “Are you expecting a child?,” the survey asks “Is your family expecting a child?”. The four possible worlds in this new setting are depicted in Fig. 2. Just like before, we assume that only agent a is pregnant. In other words, the current world is |$w_1$|⁠.

Note that by looking at agent a’s anonymized survey data, |$(23,Bath,+)$|⁠, one would not be able to conclude that “the participant” is pregnant because a’s data is indistinguishable from the survey data for agent b in the very same world |$w_1$| and agent b is not pregnant:

$$\begin{eqnarray} w_1,a\nVdash {\sf D}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(3)

At the same time, it is easy to see that in any possible world, if the survey data for any agent is |$(23,Bath,+)$|⁠, then the agent a is pregnant in that world, see Fig. 2. Thus, any observer of the anonymized survey data |$(23,Bath,+)$| would know that agent a is pregnant. In other words, any observer of agent a’s anonymized survey data in world |$w_1$| would de re know that agent a is pregnant:

$$\begin{eqnarray} w_1,a\Vdash {\sf R}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(4)

Overt data-informed knowledge and traceability

In the previous section, we introduced two forms of knowledge that can be informed by anonymized data. As we have seen from the two examples, neither of these two forms is stronger than the other in the sense that knowing something de dicto does not imply knowing de re and vice versa. In this section, we consider the knowledge informed by non-anonymized data. That is, we consider what can be concluded from personal data when not only the data is given, but it is also explicitly stated about which agent this data is. We refer to this third form of knowledge as overt data-informed knowledge and represent it by modality |${\sf O}$|⁠.

To illustrate the difference between the three forms of data-informed knowledge, consider yet another variation of our Family Planning survey depicted in Fig. 3. The only difference from Fig. 2 is that c had to relocate to Bath while still being married to agent d who remains in York. Using the same argument as we gave for statement (3), we can see that agent a’s anonymized personal data does not inform de dicto knowledge that she is pregnant:

$$\begin{eqnarray} w_1,a\nVdash {\sf D}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(5)

At the same time, in world |$w_1$|⁠, any observer looking only at a’s anonymized data, |$(23,Bath,+)$|⁠, will also not be able to infer de re that agent a is pregnant:

$$\begin{eqnarray} w_1,a\nVdash {\sf R}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(6)

This is because the personal data of agent a in world |$w_1$| is the same as the personal data of agent c in world |$w_2$|⁠, where agent a is not pregnant. Thus, by relocating agent c to Bath we made statement (4) false.

Observe now that anyone who, in world |$w_1$|⁠, has access to agent a’s personal data |$(23,Bath,+)$| and is also told that this data belongs to a, knows that she is pregnant:

$$\begin{eqnarray} w_1,a\Vdash {\sf O}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

(7)

This is because agent a is pregnant in each world where her personal data is |$(23,Bath,+)$|⁠.

In this article, we investigate the dependencies between de re, de dicto, and overt forms of knowledge. To understand the relation between them, it is also important to consider the notion of traceability. We say that a set of data variables X is traceable to agent a in world w if knowing the values of data variables for agent a in world w informs the knowledge of which agent the data belongs to. We denote this by |$w,a\Vdash !X$|⁠. For example, |$w_1,b\Vdash !\lbrace \text{age,sex,reply}\rbrace$| in the example depicted in Fig. 1. Indeed, in that example, there is only one agent of age 23 who is male. At the same time, |$w_1,a\nVdash !\lbrace \text{age,sex,reply}\rbrace$| for the same example, because data |$(23,F,+)$| could come from agent a in world |$w_1$| or, for instance, from agent c in world |$w_2$|⁠.

Diagram shows four worlds and four agents. It specifes age, sex, and reply of each agent in each world.

Figure 1.

Possible worlds in a Family Planning survey. Reply |$+$| means that the participant is expecting a child.

Open in new tab Download slide

Diagram shows four worlds and four agents. It specifes age, city, and reply of each agent in each world.

Figure 2.

Possible worlds in a survey. Reply |$+$| means that the participant’s family is expecting a child.

Open in new tab Download slide

Syntax and semantics

We start by defining the class of models that will be used to give formal semantics of our logical system. Throughout the rest of the article, we assume a fixed set of data variables V and a fixed set of atomic propositions. Examples of data variables from the two previous sections are |$age$|⁠, |$sex$|⁠, |$city$|⁠, and |$reply$|⁠. Examples of atomic propositions are “is pregnant” and “lives in Bath.” By a dataset we mean any subset of V.

Intuitively, we assume that data variables have certain values that depend on the world and the agent. In the example depicted in Fig. 3, in world |$w_1$| for agent a, the value of data variable |$reply$| is |$+$| and the value of data variable |$age$| is 23. As we will see in Definition 2, the actual values of data variables are not important for our semantics. The only important thing is if the data variable has the same or different values for any two given world-agent combinations. Hence, formally, it is more convenient to view a data variable as an equivalence relation on such pairs. We adopt this approach in the definition below.

Definition 1.

A model is a tuple |$(W,\mathcal {A},\lbrace \sim _x\rbrace _{x\in V},\pi )$|⁠, where

W is a (possibly empty) set of worlds,
|$\mathcal {A}$| is a (possibly empty) set of agents,
|$\sim _x$| is an equivalence relation on |$W\times \mathcal {A}$| for each data variable |$x\in V$|⁠,
|$\pi (p)\subseteq W\times \mathcal {A}$| for each atomic proposition p.

Note that, unlike the traditional approach in modal logic, valuation |$\pi (p)$| is a set of pairs rather than a set of possible worlds. For example, if “is pregnant” is one of the atomic propositions, then |$(u,a)\in \pi (\text{``is pregnant''})$| means that agent a is pregnant in world u.

The language |$\Phi$| of our logical system is defined by the following grammar:

$$\begin{eqnarray} \varphi :=p \,\,| \,\,!X \,\,| \,\,\lnot \varphi \,\,| \,\,\varphi \vee \varphi \,\,| \,\,{\sf O}_X\varphi \,\,| \,\,{\sf D}_X\varphi \,\,| \,\,{\sf R}_X\varphi , \end{eqnarray}$$

where p is an atomic proposition and |$X\subseteq V$| is a dataset. We read |$!X$| as “dataset X is traceable,” |${\sf O}_X\varphi$| as “dataset X informs overt knowledge of |$\varphi$|⁠,” |${\sf D}_X\varphi$| as “dataset X informs de dicto knowledge of |$\varphi$|⁠,” and |${\sf R}_X\varphi$| as “dataset X informs de re knowledge of |$\varphi$|⁠.” We assume that implication |$\rightarrow$| and biconditional |$\leftrightarrow$| as well as the Boolean constants true |$\top$| and false |$\bot$| are defined through negation and disjunction in the usual way.

Definition 2.

For any model |$(W,\mathcal {A},\lbrace \sim _x\rbrace _{x\in V},\pi )$|⁠, world |${w\in W}$|⁠, agent |$a\in \mathcal {A}$|⁠, and formula |$\varphi \in \Phi$|⁠, the satisfaction relation |$w,a\Vdash \varphi$| is defined as follows:

|$w,a\Vdash p$|⁠, if |$(w,a)\in \pi (p)$|⁠,
|$w,a\Vdash \, !X$|⁠, if |$a=b$| for each world |$u\in W$| and each agent |$b\in \mathcal {A}$| such that |$(w,a)\sim _X (u,b)$|⁠,
|$w,a\Vdash \lnot \varphi$|⁠, if |$w,a\nVdash \varphi$|⁠,
|$w,a\Vdash \varphi \vee \psi$|⁠, if either |$w,a\Vdash \varphi$| or |$w,a\Vdash \psi$|⁠,
|$w,a\Vdash {\sf O}_X\varphi$|⁠, if |$u,a\Vdash \varphi$| for each world |$u\in W$| such that |$(w,a)\sim _X (u,a)$|⁠,
|$w,a\Vdash {\sf D}_X\varphi$|⁠, if |$u,b\Vdash \varphi$| for each world |$u\in W$| and each agent |$b\in \mathcal {A}$| such that |$(w,a)\sim _X (u,b)$|⁠,
|$w,a\Vdash {\sf R}_X\varphi$|⁠, if |$u,a\Vdash \varphi$| for each world |$u\in W$| and each agent |$b\in \mathcal {A}$| such that |$(w,a)\sim _X (u,b)$|⁠.

Recall that, informally, the relation |$(w,a)\sim _X (u,b)$| means that all variables in dataset X have the same values for agent a in world w as for agent b in world u. Thus, item 2 above captures the fact that in world w dataset X about agent a uniquely identifies the agent.

Item 5 of the above definition assumes that the identity of agent a is revealed, so it only considers worlds u in which dataset X for this agent have the same values as in world w.

Item 6 and item 7 do not assume that the identity is revealed. Thus, they consider all possible world-agent combinations |$(u,b)$| for which the value of dataset X is the same as for |$(w,a)$|⁠.

Undefinability results

In this section, we show that none of the three modalities is definable through the two others even if the traceability expression is used. We use the “truth sets algebra” technique for proving undefinability, which has been proposed in [27] and used in [24,28,29]. Unlike the more traditional “bisimulation” method, this technique uses a single model. The results in this section use the two definitions below:

Definition 3.

Definition 4.

Formulae |$\varphi ,\psi \in \Phi$| are semantically equivalent if |$[\![ \varphi ]\!] =[\![ \psi ]\!]$| in each model.

Without loss of generality, in this section, we suppose that language |$\Phi$| contains a single atomic proposition p and a single data variable x. Alternatively, in the models we construct below, the valuation of all atomic propositions could defined the same way as for p. Similarly, equivalence relations for other data variables can be defined the same as for data variable x.

Undefinability of |${\sf R}$| through |$!$|⁠, |${\sf O}$|⁠, and |${\sf D}$|

Consider a model that contains three worlds: w, u, and v, and two agents a and b. Suppose that equivalence relation |$\sim _x$| divides set |$\lbrace w,u,v\rbrace \times \lbrace a,b\rbrace$| into two equivalence classes: |$\lbrace (w,a),(w,b),(u,b)\rbrace$| and |$\lbrace (u,a),(v,a),(v,b)\rbrace$|⁠. Informally, it means that data variable x has one value, say |$+$|⁠, for each pair in the first class and another value, say -, for each pair in the second class. We depict these values in the left-most diagram in Fig. 4. Let |$\pi (p)=\lbrace (w,a),(w,b),(u,b)\rbrace$|⁠.

We visualize the truth sets of various formulae as |$3\times 2$| tables whose rows are indexed with worlds and whose columns are indexed with agents. A cell |$(w,a)$| in such a table is colored gray if the pair |$(w,a)$| belongs to the truth set. The four diagrams in the center of Fig. 4 visualize the truth sets |$[\![ p]\!]$|⁠, |$[\![ \bot ]\!]$|⁠, |$[\![ \lnot p]\!]$|⁠, and |$[\![ \top ]\!]$|⁠.

The idea behind the truth set algebra technique is to show that any formula that does not use modality |${\sf R}$| has a truth set equal to one of the sets |$[\![ p]\!]$|⁠, |$[\![ \bot ]\!]$|⁠, |$[\![ \lnot p]\!]$|⁠, and |$[\![ \top ]\!]$|⁠. At the same time, a formula that uses modality |${\sf R}$| can have a different truth set.

Lemma 1.

|$[\![ {\sf O}_\varnothing \varphi ]\!] , [\![ {\sf D}_\varnothing \varphi ]\!] ,[\![ {\sf O}_x\varphi ]\!]$|⁠, |$[\![ {\sf D}_x\varphi ]\!] \!\in \!\lbrace [\![ p]\!] ,[\![ \bot ]\!] ,[\![ \lnot p]\!] ,[\![ \top ]\!] \rbrace$| for any formula |$\varphi \in \Phi$| such that |$[\![ \varphi ]\!] \in \lbrace [\![ p]\!] ,[\![ \bot ]\!] ,[\![ \lnot p]\!] ,[\![ \top ]\!] \rbrace$|⁠.

Proof.

Suppose that |$[\![ \varphi ]\!] =[\![ p]\!]$|⁠. First, we show that |$[\![ {\sf O}_\varnothing \varphi ]\!] =[\![ \bot ]\!] =\varnothing$|⁠. Indeed, assume the opposite. Thus, there is a world |$w^{\prime }\in \lbrace w,u,v\rbrace$| and an agent |$a^{\prime }\in \lbrace a,b\rbrace$| such that |$(w^{\prime },a^{\prime })\in [\![ {\sf O}_\varnothing \varphi ]\!]$|⁠. Hence, |$w^{\prime },a^{\prime }\Vdash {\sf O}_\varnothing \varphi$| by Definition 3. Then, |$v,a^{\prime }\Vdash \varphi$| by item 5 of Definition 2 because |$(w^{\prime },a^{\prime })\sim _\varnothing (v,a^{\prime })$|⁠. Thus, |$(v,a^{\prime })\in [\![ \varphi ]\!]$| by Definition 3. Hence, by the assumption |$[\![ \varphi ]\!] =[\![ p]\!]$|⁠, we have |$(v,a^{\prime })\in [\![ p]\!]$|⁠. Observe in Fig. 4 that the truth set |$[\![ p]\!]$| does not contain any single pair |$(x,y)$|⁠, where |$x=v$|⁠. This is a contradiction. Therefore, |$[\![ {\sf O}_\varnothing \varphi ]\!] =[\![ \bot ]\!]$|⁠.

The proof that |$[\![ {\sf D}_\varnothing \varphi ]\!] =[\![ \bot ]\!]$| is similar, but it uses item 6 of Definition 2 instead of item 5. In Fig. 4, we visualize both results by a directed edge from diagram |$[\![ p]\!]$| to diagram |$[\![ \bot ]\!]$| labeled with |${\sf O}_\varnothing ,{\sf D}_\varnothing$|⁠.

Next, we show that |$[\![ {\sf O}_x\varphi ]\!] =[\![ p]\!]$|⁠.

|$(\subseteq )$|⁠: Let us consider any world |$w^{\prime }\in \lbrace w,u,v\rbrace$| and any agent |$a^{\prime }\in \lbrace a,b\rbrace$| such that |$(w^{\prime },a^{\prime })\in [\![ {\sf O}_x\varphi ]\!]$|⁠. Thus, |$w^{\prime },a^{\prime }\Vdash {\sf O}_x\varphi$| by Definition 3. Note that |$(w^{\prime },a^{\prime })\sim _{x}(w^{\prime },a^{\prime })$|⁠. Hence, |$w^{\prime },a^{\prime }\Vdash \varphi$| by item 5 of Definition 2. Then, |$(w^{\prime },a^{\prime })\in [\![ \varphi ]\!]$| again by Definition 3. Therefore, |$(w^{\prime },a^{\prime })\in [\![ p]\!]$| by the assumption |$[\![ \varphi ]\!] =[\![ p]\!]$|⁠.

|$(\supseteq )$|⁠: Consider any |$w^{\prime }\in \lbrace w,u,v\rbrace$| and any agent |$a^{\prime }\in \lbrace a,b\rbrace$| such that

$$\begin{eqnarray} (w^{\prime },a^{\prime })\in [\![ p]\!] . \end{eqnarray}$$

(8)

By Definition 3, it suffices to show that |$w^{\prime },a^{\prime }\Vdash {\sf O}_x\varphi$|⁠. Toward this proof, consider any |$w^{\prime \prime }\in \lbrace w,u,v\rbrace$| such that |$(w^{\prime },a^{\prime })\sim _x (w^{\prime \prime },a^{\prime })$|⁠. By item 5 of Definition 2, it suffices to establish that |$w^{\prime \prime },a^{\prime }\Vdash \varphi$|⁠.

Observe that the assumption |$(w^{\prime },a^{\prime })\sim _x (w^{\prime \prime },a^{\prime })$| implies that |$(w^{\prime },a^{\prime })\in [\![ p]\!]$| iff |$(w^{\prime \prime },a^{\prime })\in [\![ p]\!]$|⁠, see Fig. 4. Then, |$(w^{\prime \prime },a^{\prime })\in [\![ p]\!]$| by statement (8). Thus, by the assumption |$[\![ \varphi ]\!] =[\![ p]\!]$| it follows that |$(w^{\prime \prime },a^{\prime })\in [\![ \varphi ]\!]$|⁠. Therefore, |$w^{\prime \prime },a^{\prime }\Vdash \varphi$| by Definition 3.

The proof that |$[\![ {\sf D}_x\varphi ]\!] =[\![ p]\!]$| is similar, but it uses item 6 of Definition 2 instead of item 5. In Fig. 4, we visualize both results by a directed loop edge from diagram |$[\![ p]\!]$| back to the same diagram |$[\![ p]\!]$| labeled with |${\sf O}_x,{\sf D}_x$|⁠.

Lemma 2.

|$[\![ \varphi ]\!] \in \lbrace [\![ p]\!] ,[\![ \bot ]\!] ,[\![ \lnot p]\!] ,[\![ \top ]\!] \rbrace$| for any formula |$\varphi \in \Phi$| that does not contain modality |${\sf R}$|⁠.

Proof.

We prove the statement of the lemma by induction on the structural complexity of formula |$\varphi$|⁠.

Suppose that formula |$\varphi$| has the form |$!X$|⁠, where dataset X is either |$\varnothing$| or |$\lbrace x\rbrace$|⁠. Observe from the left-most diagram in Fig. 4 that for any |$w^{\prime }\in \lbrace w,u,v\rbrace$| and any |$a^{\prime }\in \lbrace a,b\rbrace$| there is |$w^{\prime \prime }\in \lbrace w,u,v\rbrace$| and an agent |$a^{\prime \prime }\in \lbrace a,b\rbrace$| such that |$(w^{\prime },a^{\prime })\sim _X (w^{\prime \prime },a^{\prime \prime })$| and |$a^{\prime }\ne a^{\prime \prime }$|⁠. Thus, by item 2 of Definition 2, it follows that |$w^{\prime },a^{\prime }\nVdash \, !X$| for each world |$w^{\prime }\in \lbrace w,u,v\rbrace$| and each agent |$a^{\prime }\in \lbrace a,b\rbrace$|⁠. Hence, truth set |$[\![ !X]\!]$| is empty by Definition 2. Then, |$[\![ !X]\!] =[\![ \bot ]\!]$|⁠, see Fig. 4. Therefore, |$[\![ !X]\!] \in \lbrace [\![ p]\!] ,[\![ \bot ]\!] ,[\![ \lnot p]\!] ,[\![ \top ]\!] \rbrace$|⁠.

If formula |$\varphi$| has either the form |${\sf O}_X\psi$| or the form |${\sf D}_X\psi$|⁠, then the statement of the lemma follows from the induction hypothesis and Lemma 1.

Next, let us make an auxiliary observation based on the left-most diagram in Fig. 4.

Lemma 3.

For any world |$w^{\prime }\in \lbrace w,u,v\rbrace$| and any agent |$a^{\prime }\in \lbrace a,b\rbrace$|⁠, if data variable x has value |$+$| at |$(w^{\prime },a^{\prime })$|⁠, then data variable x has value |$+$| at |$(w^{\prime },b)$|⁠.

Lemma 4.

|$[\![ {\sf R}_x p]\!] \notin \lbrace [\![ p]\!] ,[\![ \bot ]\!] ,[\![ \lnot p]\!] ,[\![ \top ]\!] \rbrace$|⁠.

Proof.

It suffices to prove that the truth set |$[\![ {\sf R}_x p]\!]$| is the one depicted by the right-most diagram in Fig. 4. In other words it is enough to show that for any cell |$(w^{\prime },a^{\prime })$| of that diagram, |$w^{\prime },a^{\prime }\Vdash {\sf R}_x p$| iff cell |$(w^{\prime },a^{\prime })$| is gray.

|$(\Rightarrow ):$| Suppose that the cell |$(w^{\prime },a^{\prime })$| is white at the right-most diagram in Fig. 4. We consider the following two cases separately:

Case 1: |$(w^{\prime },a^{\prime })=(w,a)$|⁠. Note that |$(u,a)\notin \pi (p)$| by the choice of valuation function |$\pi$|⁠. Hence, |$(u,a)\nVdash p$| by item 1 of Definition 2. Additionally, |$(w,a)\sim _x (u,b)$|⁠, see the left-most diagram in Fig. 4. Thus, |$(w,a)\nVdash {\sf R}_x p$| by item 7 of Definition 2. Therefore, |$(w^{\prime },a^{\prime })\nVdash {\sf R}_x p$| by the assumption |$(w^{\prime },a^{\prime })=(w,a)$| of the case.

Case 2: |$(w^{\prime },a^{\prime })\ne (w,a)$|⁠. Thus, the assumption that the cell |$(w^{\prime },a^{\prime })$| is white at the right-most diagram in Fig. 4 implies that the same cell |$(w^{\prime },a^{\prime })$| is white at the second-from-left diagram in Fig. 4. Hence, |$(w^{\prime },a^{\prime })\notin [\![ p]\!]$|⁠. Then, |$(w^{\prime },a^{\prime })\nVdash p$| by Definition 3. Therefore, it follows that |$(w^{\prime },a^{\prime })\nVdash {\sf R}_x p$| by item 7 of Definition 2 and because |$(w^{\prime },a^{\prime })\sim _x (w^{\prime },a^{\prime })$|⁠.

|$(\Leftarrow ):$| Suppose that the cell |$(w^{\prime },a^{\prime })$| is gray at the right-most diagram in Fig. 4. Thus |$a^{\prime }=b$| and the value of data variable x is |$+$|⁠, see Fig. 4. Consider any world |$w^{\prime \prime }\in \lbrace w,u,v\rbrace$| and any agent |$a^{\prime \prime }\in \lbrace a,b\rbrace$| such that |$(w^{\prime },b)\sim _x (w^{\prime \prime },a^{\prime \prime })$|⁠. By item 7 of Definition 2, it suffices to show that |$(w^{\prime \prime },b)\Vdash p$|⁠.

Indeed, the assumption |$(w^{\prime },b)\sim _x (w^{\prime \prime },a^{\prime \prime })$| means that the value of data variable x at cell |$(w^{\prime \prime },a^{\prime \prime })$| is the same as its value at cell |$(w^{\prime },b)$|⁠, which is |$+$|⁠. Hence, the value of data variable x at cell |$(w^{\prime \prime },b)$| is also |$+$| by Lemma 3. Hence, |$(w^{\prime \prime },b)\in [\![ p]\!]$|⁠, see Fig. 4. Therefore, |$(w^{\prime \prime },b)\Vdash p$| by Definition 3.

The next result follows from Lemma 2 and Lemma 4.

Theorem 1 (undefinability).

Formula |${\sf R}_x p$| is not semantically equivalent to any formula in language |$\Phi$| that does not contain modality |${\sf R}$|⁠.

Figure 3.

Possible worlds in a survey. Reply |$+$| means that the participant’s family is expecting a child.

Open in new tab Download slide

Diagram visualizes the ndistinguishability relation for variable x and five truth sets in the model.

Figure 4.

Toward the proof of Theorem 1.

Open in new tab Download slide

Undefinability of |${\sf O}$| and |${\sf D}$|

The proof of the next theorem is similar to the proof of Theorem 1 except that instead of Fig. 4 it uses Fig. 5.

Theorem 2 (undefinability).

Formula |${\sf O}_x p$| is not semantically equivalent to any formula in language |$\Phi$| that does not contain modality |${\sf O}$|⁠.

The proof of the next theorem is also similar to the proofs of the two previous theorems except that it uses Fig. 6. Note that, in this case, data variable x has three possible values that we denote by -, 1, and 2, see Fig. 6.

Theorem 3 (undefinability).

Formula |${\sf D}_x p$| is not semantically equivalent to any formula in language |$\Phi$| that does not contain modality |${\sf D}$|⁠.

Figure 5.

Toward the proof of Theorem 2.

Open in new tab Download slide

Figure 6.

Toward the proof of Theorem 3.

Open in new tab Download slide

Axioms

In addition to propositional tautologies in language |$\Phi$|⁠, our logical system contains the following axioms:

Truth: |${\sf O}_X\varphi \rightarrow \varphi$|⁠,
Introspection of De Re Knowledge: |${\sf R}_X\varphi \rightarrow {\sf O}_X{\sf R}_X\varphi$|⁠,
Negative Introspection:
|$\lnot \Box _X\varphi \rightarrow \Box _X\lnot \Box _X\varphi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D}\rbrace$|⁠,
Distributivity:
|$\Box _X(\varphi \rightarrow \psi )\rightarrow (\Box _X\varphi \rightarrow \Box _X\psi )$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D},{\sf R}\rbrace$|⁠,
Monotonicity:
|$\Box _X\varphi \rightarrow \Box _Y\varphi$| and |$!X\rightarrow !Y$|⁠, where |$X\subseteq Y$| and |$\Box \in \lbrace {\sf O},{\sf D},{\sf R}\rbrace$|⁠,
Overt Knowledge: |$\Box _X\varphi \rightarrow {\sf O}_X\varphi$|⁠, where |$\Box \in \lbrace {\sf D},{\sf R}\rbrace$|⁠,
Introspection of Traceability: |$!X\rightarrow {\sf D}_X!X$|⁠,
Data-Free Knowledge: |$\Box _\varnothing \varphi \rightarrow {\sf R}_\varnothing \varphi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D}\rbrace$|⁠,
Traceable Data: |$!X\rightarrow ({\sf O}_X\varphi \rightarrow \Box _X\varphi )$|⁠, where |$\Box \in \lbrace {\sf D},{\sf R}\rbrace$|⁠.

The Truth, the Negative Introspection, and the Distributivity axioms are the standard axioms of the epistemic logic. The Truth axiom is valid for all three modalities, |${\sf O}$|⁠, |${\sf D}$|⁠, and |${\sf R}$|⁠. We list it only for modality |${\sf O}$| because this principle for the other two modalities can be derived from using the Overt Knowledge axiom. At the same time, the Negative Introspection axiom is not universally valid for modality |${\sf R}$|⁠. As usual in epistemic logic, the positive introspection principle is provable from the Negative Introspection and some other axioms. We show this in Lemma 6. The positive introspection principle is also not universally valid for modality |${\sf R}$|⁠. However, a weaker form of it, expressed by the Introspection of De Re Knowledge axiom, holds.

The Introspection of Traceability axiom states that if the data uniquely identifies the agent, then the data informs the de dicto knowledge of this. We state this axiom for modality |${\sf D}$|⁠, the same property for modality |${\sf O}$| and |${\sf R}$| is derivable.

The Data-Free Knowledge axiom, when combined with the Overt Knowledge axiom, implies that the formulae |${\sf O}_\varnothing \varphi$| and |${\sf R}_\varnothing \varphi$| are equivalent. Note that the formula |${\sf D}_\varnothing \varphi$| is not equivalent to the two formulae above. Indeed, |$w,a\Vdash {\sf O}_\varnothing \varphi$| means that |$u,a\Vdash \varphi$| for each world u, see Definition 2. At the same time, |$w,a\Vdash {\sf D}_\varnothing \varphi$| means that |$u,b\Vdash \varphi$| is true for each world u and each agent b.

Finally, the Traceable Data axiom, when combined with the Overt Knowledge axiom, implies that there is no difference between the three forms of knowledge if the data is traceable.

We write |$\vdash \varphi$| and say that formula |$\varphi \in \Phi$| is a theorem of our logical system if this formula is derivable from our axioms using the Necessitation and the Modus Ponens inference rules:

$$\begin{eqnarray} \frac{\varphi }{{\sf D}_X\varphi } \frac{\varphi , \,\, \,\, \,\, \,\, \varphi \rightarrow \psi }{\psi }. \end{eqnarray}$$

We omit the Nesessitation inference rules for modalities |${\sf O}$| and |${\sf R}$| because these rules are derivable in our system. In addition to unary relation |$\vdash \varphi$|⁠, we also consider binary relation |$\Gamma \vdash \varphi$| between a (possibly infinite) set of formulae |$\Gamma \subseteq \Phi$| and a formula |$\varphi \in \Phi$|⁠. We say that |$\Gamma \vdash \varphi$| is true if formula |$\varphi$| is derivable from the theorems of our logical system and an additional set of assumptions |$\Gamma$| using the Modus Ponens inference rule only. Note that statements |$\varnothing \vdash \varphi$| and |$\vdash \varphi$| are equivalent. We say that a set of formulae |$\Gamma \subseteq \Phi$| is consistent if |$\Gamma \nvdash \bot$|⁠.

We conclude this section with a few technical observations that will be used in the proof of the completeness.

Lemma 5.

|$\vdash \Box _X\varphi \rightarrow \varphi$|⁠, where |$\Box \in \lbrace {\sf D},{\sf R}\rbrace$|⁠.

Proof.

Note that |$\vdash \Box _X\varphi \rightarrow {\sf O}_X\varphi$| by the Overt Knowledge axiom. At the same time, |$\vdash {\sf O}_X\varphi \rightarrow \varphi$| by the Truth axiom. Therefore, |$\vdash \Box _X\varphi \rightarrow \varphi$| by propositional reasoning.

The next lemma states a well-known observation that the positive introspection principle is derivable from S5 axioms. To keep the article self-contained, we give its proof in the appendix.

Lemma 6.

|$\vdash \Box _X\varphi \rightarrow \Box _X\Box _X\varphi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D}\rbrace$|⁠.

Lemma 7.

|$\vdash \Box _X\varphi \leftrightarrow {\sf O}_X\Box _X\varphi$|⁠, where |$\Box \in \lbrace {\sf D},{\sf R}\rbrace$|⁠.

Proof.

Note that the formula |${\sf O}_X\Box _X\varphi \rightarrow \Box _X\varphi$| is an instance of the Truth axiom. Thus, it suffices to show that |$\vdash \Box _X\varphi \rightarrow {\sf O}_X\Box _X\varphi$|⁠. If |$\Box$| is modality |${\sf R}$|⁠, then the last formula is an instance of the Introspection of De Re Knowledge axiom.

Finally, let us prove |$\vdash {\sf D}_X\varphi \rightarrow {\sf O}_X{\sf D}_X\varphi$|⁠. It follows from Lemma 6 that |$\vdash {\sf D}_X\varphi \rightarrow {\sf D}_X{\sf D}_X\varphi$|⁠. At the same time, the following formula is an instance of the Overt Knowledge axiom:|${\sf D}_X{\sf D}_X\varphi \rightarrow {\sf O}_X{\sf D}_X\varphi$|⁠. Therefore, |$\vdash {\sf D}_X\varphi \rightarrow {\sf O}_X{\sf D}_X\varphi$| by propositional reasoning.

The proof of the next standard lemma is also in the appendix.

Lemma 8.

If |$\varphi _1,\dots ,\varphi _n\vdash \psi$|⁠, then |$\Box _X\varphi _1,\dots ,\Box _X\varphi _n\vdash \Box _X\psi$|⁠, for any modality |$\Box \in \lbrace {\sf O},{\sf D},{\sf R}\rbrace$|⁠.

Lemma 9 (Lindenbaum).

Any consistent set of formulae can be extended to a maximal consistent set of formulae.

Proof.

The standard proof of Lindenbaum’s lemma [30, Proposition 2.14] applies here.

Soundness

Theorem 4 (soundness).

If |$\vdash \varphi$|⁠, then |$w,a\Vdash \varphi$| for any world w and any agent a of any model |$(W,\mathcal {A},\lbrace \sim _x\rbrace _{x\in V},\pi )$|⁠.

Proof.

The soundness of the Truth, the Negative Introspection, the Distributivity, the Monotonicity, and Overt Knowledge axioms as well as of the Necessitation and the Modus Ponens inference rules is straightforward. Below, we prove the soundness of each of the remaining axioms as a separate claim.

Claim 1.

If |$w,a\Vdash {\sf R}_X\varphi$|⁠, then |$w,a\Vdash {\sf O}_X{\sf R}_X\varphi$|⁠.

Proof of Claim.

Let us consider any world |$u\in W$| such that |$(w,a)\sim _X (u,a)$|⁠. By item 5 of Definition 2, it suffices to show that |$u,a\Vdash {\sf R}_X\varphi$|⁠. Toward this proof, consider any world |$v\in W$| and any agent |$b\in \mathcal {A}$| such that |$(u,a)\sim _X(v,b)$|⁠. By item 7 of Definition 2, it suffices to prove that |$v,a\Vdash \varphi$|⁠.

Note that |$(w,a)\sim _X (v,b)$| by the assumptions |$(w,a)\sim _X (u,a)$| and |$(u,a)\sim _X(v,b)$|⁠. Thus, |$v,a\Vdash \varphi$| by the assumption |$w,a\Vdash {\sf R}_X\varphi$| of the claim and item 7 of Definition 2.

Claim 2.

If |$w,a\Vdash \, !X$|⁠, then |$w,a\Vdash {\sf D}_X!X$|⁠.

Proof of Claim.

Consider any world |$u\in W$| and any agent |$b\in \mathcal {A}$| such that

$$\begin{eqnarray} (w,a)\sim _X(u,b). \end{eqnarray}$$

(9)

By item 6 of Definition 2, it suffices to prove that |$u,b\Vdash \, !X$|⁠. Toward this proof, consider any world |$v\in W$| and any agent |$c\in \mathcal {A}$| such that |$(u,b)\sim _X(v,c)$|⁠. By item 2 of Definition 2, it suffices to show that |$b=c$|⁠.

Claim 3.

If |$w,a\Vdash \Box _\varnothing \varphi$|⁠, then |$w,a\Vdash {\sf R}_\varnothing \varphi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D}\rbrace$|⁠.

Proof of Claim.

Note that the statement |$(w,a)\sim _\varnothing (u,b)$| is true for any world |$u\in W$| and any agent |$b\in \mathcal {A}$|⁠. Thus, by item 5 (or item 6) of Definition 2 the assumption |$w,a\Vdash \Box _\varnothing \varphi$| implies that |$u,a\Vdash \varphi$| for any world |$u\in W$|⁠. Hence, |$w,a\Vdash {\sf R}_\varnothing \varphi$| by item 7 of Definition 2.

Claim 4.

If |$w,a\Vdash \, !X$| and |$w,a\Vdash {\sf O}_X\varphi$|⁠, then |$w,a\Vdash \Box _X\varphi$|⁠, where |$\Box \in \lbrace {\sf D},{\sf R}\rbrace$|⁠.

Proof of Claim.

First, assume that |$\Box ={\sf D}$|⁠. Consider any world |$u\in W$| and any agent |$b\in \mathcal {A}$| such that |$(w,a)\sim _X(u,b)$|⁠. By item 6 of Definition 2, it suffices to show that |$u,b\Vdash \varphi$|⁠.

Note that the assumption |$w,a\Vdash \, !X$| of the claim implies that |$a=b$| by the assumption |$(w,a)\sim _X(u,b)$| and item 2 of Definition 2. Thus, |$(w,a)\sim _X(u,a)$| again by the assumption |$(w,a)\sim _X(u,b)$|⁠. Hence, |$u,a\Vdash \varphi$| by the assumption |$w,a\Vdash {\sf O}_X\varphi$| of the claim and item 5 of Definition 2. Therefore, |$u,b\Vdash \varphi$| because |$a=b$|⁠.

The proof in the case |$\Box ={\sf R}$| is similar.

This concludes the proof of the theorem.

Completeness

As usual, the proof of completeness for our logical system consists of a canonical model construction and a proof of a “truth” lemma that connects the satisfaction relation of the canonical model with the provability in our logical system. Traditionally, the worlds of the canonical model are defined as the maximal consistent sets of formulae. This approach works for modal logic with a few exceptions. One such exception is the epistemic logic of distributed knowledge. To prove the completeness of this system, Fagin, Halpern, and Vardi use a tree construction in which nodes of the tree are labeled with maximal consistent sets [31]. Note that although our logical system does not have distributed knowledge explicitly, it has it implicitly. Indeed, for each agent a, one can consider a variable |$x_a$| that represents all information known to agent a. In such a setting, knowledge informed by a dataset |$\lbrace x_a\rbrace _{a\in G}$| is the distributed knowledge of group G.

The maximal consistent set construction also needs to be adjusted for the egocentric setting when the semantics is defined in terms of a world-agent pair positioned on the left side of the satisfaction relation |$\Vdash$|⁠. In such a setting, a maximal consistent set captures the properties of a world-agent pair rather than just of a possible world. As a result, it is not, generally speaking, possible to define possible worlds simply as maximal consistent sets of formulae. One of the approaches to deal with this is suggested by Naumov and Tao [20]. They use a “matrix” technique in which the canonical model is recursively constructed as an infinite matrix whose cells are maximal consistent sets of formulae. The rows and columns of such a matrix correspond to possible worlds and agents, respectively. The maximal consistent set |$S_{wa}$| in the w-th row and a-th column captures the properties that agent a has in world w.

In this article, we combine tree and matrix techniques to prove the completeness of our logical system. At the center of our construction is a tree structure on the cells of a matrix. Because the position of each cell in a matrix could be described by an element of the Cartesian product of the set of rows and the set of columns, we call such structure a Cartesian tree.

Cartesian trees

Figure 7 depicts an example of a Cartesian tree. The nodes of this tree are the cells in |$2\times 3$| matrix. As shown in the figure, we index rows and columns starting from 0. The nodes and the undirected edges between them form a tree (graph without cycles). As shown in the figure, edges in a Cartesian tree are labeled with datasets. The Cartesian tree depicted in Fig. 7 has a finite number of rows and columns. Our proof of completeness requires us to consider infinite matrices whose rows and columns can be indexed by the whole set of integer numbers (ordinal |$\omega$|⁠). To achieve this, we define cells of the matrix as elements of the Cartesian product |$\alpha\times \beta$|⁠, where |$\alpha\le \omega$| and |$\beta\le \omega$| are two ordinals. For the example shown in Fig. 7, we have |$\alpha=2=\lbrace 0,1\rbrace$| and |$\beta=3=\lbrace 0,1,2\rbrace$|⁠. The formal definition of a Cartesian tree is below.

Cartesian tree with six nodes and labels on edges.

Figure 7.

A Cartesian tree.

Open in new tab Download slide

Figure 8.

Cartesian tree |$(\alpha+1,\beta,E^{\prime },\ell ^{\prime })$|⁠.

Open in new tab Download slide

An addition of a new row and a new column to a Cartesian tree.

Figure 9.

Toward a proof of Lemma 11.

Open in new tab Download slide

Definition 5.

A Cartesian tree is a tuple |$(\alpha,\beta,E,\ell )$|⁠, where

|$\alpha,\beta\le \omega$| are two ordinals,
|$E\subseteq (\alpha\times \beta)^2$| is an symmetric adjacency relation on |$\alpha\times \beta$| that forms an (undirected) tree structure,
|$\ell$| is a labeling function that to each edge |$e\in E$| assigns a dataset |$\ell (e)\subseteq V$|⁠.

By |$\langle n_1,n_2\rangle$| we denote the edge in the Cartesian tree between nodes |$n_1,n_2\in \alpha\times \beta$|⁠. We use corner brackets to differentiate unordered pairs from ordered pairs that we continue to denote by parentheses. We say that an edge |$\langle n_1,n_2\rangle \in E$| is labeled with a variable |$x\in V$| if |$x\in \ell \langle n_1,n_2\rangle$|⁠. By a simple path, we mean a path in the tree without repeating nodes. We allow trivial paths that start and end at the same node. Recall that for any two nodes in a tree, there is a unique simple path between these two nodes.

Definition 6.

For any nodes |$n_1,n_2\in \alpha\times \beta$| and any |$X\subseteq V$|⁠, let |$n_1\stackrel{X}{\rm o\!\!-\!\!o }n_2$| if all edges along the simple path connecting nodes |$n_1$| and |$n_2$| are labeled with each data variable in set X.

As an example, |$(1,0)\stackrel{\lbrace x,y\rbrace }{\rm o\!\!-\!\!o }(0,1)$| for the Cartesian tree depicted in Fig. 7. Also, |$(1,0)\stackrel{\lbrace x\rbrace }{\rm o\!\!-\!\!o }(0,2)$| and |$(1,1)\stackrel{\lbrace x,y\rbrace }{\rm o\!\!-\!\!o }(1,1)$|⁠. In fact, |$n\stackrel{X}{\rm o\!\!-\!\!o }n$| is true for each dataset |$X\subseteq V$| and each node n in any Cartesian tree.

Complete loaded Cartesian trees

Cartesian trees provide a framework on which we construct the canonical model by “loading” a maximal consistent set of formulae at each node of the tree. We expect there to be some correlation between maximal consistent sets and the tree structure. The formal definition of a loaded Cartesian tree is below.

Definition 7.

A loaded Cartesian tree is a tuple |$(\alpha,\beta,E,\ell , S)$| such that

|$(\alpha,\beta,E,\ell )$| is a Certesian tree,
S is a node labeling function that maps each node |$(w,a)$| in set |$\alpha\times \beta$| into a maximal consistent set of formulae |$S(w,a)$|⁠, which we denote by |$S_{wa}$|⁠, such that
- if |$!X\in S_{wa}$| and |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(u,b)$|⁠, then |$a=b$|⁠,
- if |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(u,a)$|⁠, then |${\sf O}_X\varphi \in S_{wa}$| iff |${\sf O}_X\varphi \in S_{ua}$|⁠,
- if |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(u,b)$|⁠, then |${\sf D}_X\varphi \in S_{wa}$| iff |${\sf D}_X\varphi \in S_{ub}$|⁠,
- if |${\sf R}_X\varphi \in S_{wa}$| and |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(u,b)$|⁠, then |$\varphi \in S_{ua}$|⁠.

Informally, we say that one loaded Cartesian tree is an extension of another if the extended tree adds new nodes while preserving the edges, labels, and loads in the original tree. The formal definition is given below.

Definition 8.

A loaded Cartesian tree |$(\alpha^{\prime },\beta^{\prime },E^{\prime },\ell ^{\prime }, S^{\prime })$| is an extension of a loaded Cartesian tree |$(\alpha,\beta,E,\ell , S)$| if

|$\alpha\le \alpha^{\prime }$|⁠,
|$\beta\le \beta^{\prime }$|⁠,
|$E=E^{\prime }\cap (\alpha\times \beta)^2$|⁠,
|$\ell ^{\prime }(e)=\ell (e)$| for each edge |$e\in E$|⁠,
|$S^{\prime }_{wa}=S_{wa}$| for each node |$(w,a)\in \alpha\times \beta$|⁠.

In order to be able to convert a loaded Cartesian tree into a canonical model, the tree must be “complete.” Below, we give the formal definition of what “complete” means. In the next subsection, we prove that any finite loaded Cartesian tree can be extended to a (possibly infinite) complete loaded Cartesian tree. This result is stated at the end of the next subsection as Theorem 5. The non-trivial proof of this theorem constitutes the bulk of the proof of the completeness theorem.

Definition 9.

if |$!X\notin S_{ub}$|⁠, then there is a node |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$b\ne a$|⁠,
if |${\sf O}_X\varphi \notin S_{ub}$|⁠, then there is a node |$(w,b)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,b)$| and |$\varphi \notin S_{wb}$|⁠,
if |${\sf D}_X\varphi \notin S_{ub}$|⁠, then there is a node |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$\varphi \notin S_{wa}$|⁠,
if |${\sf R}_X\varphi \notin S_{ub}$|⁠, then there is a node |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$\varphi \notin S_{wb}$|⁠.

Extension theorem

A loaded Cartesian tree |$(\alpha,\beta,E,\ell , S)$| is finite if |$\alpha,\beta\lt \omega$|⁠. As mentioned above, in this subsection we show that each finite loaded Cartesian tree can be extended to a complete loaded Cartesian tree. To be complete, the tree must satisfy conditions 1–4 of Definition 9 for each dataset X, each formula |$\varphi$|⁠, each row w, and each column a. Instead of constructing the complete tree at once, we define it as a limit of an infinite chain of extensions. Each of these extensions makes one (or more) of conditions 1–4 satisfied for some specific values of X, |$\varphi$|⁠, w, and a. The existence of these single-step extensions is shown in the next two lemmas.

Lemma 10.

For any finite loaded Cartesian tree |$T=(\alpha,\beta,E,\ell , S)$|⁠, any node |$(u,b)\in \alpha\times \beta$|⁠, and any formula |${\sf O}_X\varphi \notin S_{ub}$|⁠, there is an extension |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,b)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha b}$|⁠.

Proof.

To define the loaded Cartesian tree |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$|⁠, we first construct Cartesian tree |$(\alpha+1,\beta,E^{\prime },\ell ^{\prime })$| and then specify the load |$S^{\prime }$|⁠. The tree |$(\alpha+1,\beta,E^{\prime },\ell ^{\prime })$| is obtained from tree |$(\alpha,\beta,E,\ell )$| by an addition of a new row |$\alpha$|⁠. Each node in this new row is connected by an edge to the cell |$(u,b)$|⁠. All of the newly added edges are labeled with the empty set except the edge between nodes |$(u,b)$| and |$(\alpha,b)$|⁠, which is labeled with set X, see Fig. 8. Formally, relation |$E^{\prime }$| on the set |$(\alpha+1)\times \beta$| is defined as:

$$\begin{eqnarray} E^{\prime }=E\cup \lbrace \langle (u,b),(\alpha,a)\rangle \,\,| \,\, a\lt \beta\rbrace. \end{eqnarray}$$

(10)

and the labeling function |$\ell ^{\prime }$| defined as:

$$\begin{eqnarray} \ell ^{\prime }\,(\it e)= \left\lbrace \begin{array}{@{}l@{\quad }l@{}}X, & \rm{if e=\langle (u,b),(\alpha,b)\rangle },\\ \varnothing,& \rm{if {\it e} =\langle (u,b),(\alpha,a)\rangle , where a\ne b},\\ \ell (e), & \text{otherwise}. \end{array}\right. \end{eqnarray}$$

(11)

Note that |$(\alpha+1,\beta,E^{\prime },\ell ^{\prime })$| is a Cartesian tree by Definition 5.

Next, we proceed to define function |$S^{\prime }$|⁠. First, let us consider the set of formulae

$$\begin{eqnarray} Q^-=\lbrace \lnot \varphi \rbrace \cup \lbrace \psi \,\,| \,\,{\sf O}_X\psi \in S_{ub}\rbrace. \end{eqnarray}$$

(12)

Claim 5.

|$Q^-$| is a consistent set of formulae.

Proof of Claim.

Suppose the opposite. Then, there are formulae

$$\begin{eqnarray} {\sf O}_X\psi _1,\dots ,{\sf O}_X\psi _n\in S_{ub} \end{eqnarray}$$

(13)

such that |$\psi _1,\dots ,\psi _n\vdash \varphi$|⁠. Hence, |${\sf O}_X\psi _1,\dots ,{\sf O}_X\psi _n\vdash {\sf O}_X\varphi$|⁠, by Lemma 8. Thus, |$S_{ub}\vdash {\sf O}_X\varphi$| by assumption (13). Then, |${\sf O}_X\varphi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set, which contradicts the assumption |${\sf O}_X\varphi \notin S_{ub}$| of the lemma.

By Lemma 9, set |$Q^-$| can be extended to a maximal consistent set Q.

To finish the construction of the loaded Cartesian tree, we need to define the labeling function |$S^{\prime }$| that maps pairs from Cartesian product |$(\alpha+1)\times \beta$| into maximal consistent sets. The values of this function |$S^{\prime }_{wa}$| are given by the matrix of size |$\alpha+1$| by |$\beta$| shown in below equation:

$$\begin{eqnarray} \left(S''_{wa}\right)= \left[\begin{array}{@{}cccccccc@{}} S_{00} & S_{01} & \dots & S_{0,b-1} & S_{0b} & S_{0,b+1} & \dots & S_{0,\beta-1}\\ S_{10} & S_{11} & \dots & S_{1,b-1} & S_{1b} & S_{1,b+1} & \dots & S_{1,\beta-1}\\ \vdots & \vdots & & \vdots& \vdots& \vdots&& \vdots\\ S_{u0} & S_{u1} & \dots & S_{u,b-1} & S_{ub} & S_{u,b+1} & \dots & S_{u,\beta-1}\\ \vdots & \vdots & & \vdots& \vdots& \vdots&& \vdots\\ S_{\alpha-1,0} & S_{\alpha-1,1} & \dots & S_{\alpha-1,b-1} & S_{\alpha-1,b} & S_{\alpha-1,b+1} & \dots & S_{\alpha-1,\beta-1}\\ S_{u0} & S_{u1} & \dots &S_{u,b-1} & Q &S_{u,b+1} & \dots & S_{u,\beta-1} \\ \end{array}\right]. \end{eqnarray}$$

Matrix |$(S''_{wa})$| toward the proof of Lemma 10.Although the visual definition of |$S^{\prime }_{wa}$| through a matrix is easier to understand intuitively, in proofs, it is more convenient to refer to an equivalent definition in a more traditional form:

$$\begin{eqnarray} S^{\prime }_{wa}= \left\lbrace \begin{array}{@{}l@{\quad }l@{}}S_{ua}, & \rm{if\,\it w={\alpha\,{\rm and}\,a\ne b}},\\ Q, & \rm{if\,\it w={\alpha\,{\rm and}\,a=b}},\\ S_{wa}, &\text{otherwise}. \end{array}\right. \end{eqnarray}$$

(14)

This concludes the definition of the tuple |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$|⁠.

Claim 6.

|$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in tree T iff |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in tree |$T^{\prime }$|⁠, for each dataset |$Y\subseteq V$|⁠, each |$w_1,w_2\lt \alpha$|⁠, and each |$a_1,a_2\lt \beta$|⁠.

To show that |$T^{\prime }$| is a loaded Cartesian tree, we need to verify conditions (a)–(d) of item 2 of Definition 7.

We verify these conditions in Claim 8, Claim 10, Claim 11, and Claim 13 below. First, let us prove an auxiliary statement used in the proof of Claim 8.

Claim 7.

|${\sf O}_Y\psi \in S_{ub}$| iff |${\sf O}_Y\psi \in Q$|⁠, for each dataset |$Y\subseteq X$| and each formula |$\psi \in \Phi$|⁠.

Proof of Claim.

|$(\Rightarrow ):$| By Lemma 6 and the Modus Ponens rule, the assumption |${\sf O}_Y\psi \in S_{ub}$| implies that |$S_{ub}\vdash {\sf O}_Y{\sf O}_Y\psi$|⁠. Hence, |$S_{ub}\vdash {\sf O}_X{\sf O}_Y\psi$| by the Monotonicity axiom, the Modus Ponens inference rule, and the assumption |$Y\subseteq X$|⁠. Thus, |${\sf O}_X{\sf O}_Y\psi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Then, |${\sf O}_Y\psi \in Q^-$| by equation (12). Therefore, |${\sf O}_Y\psi \in Q$| because |$Q^-\subseteq Q$|⁠.

|$(\Leftarrow ):$| Suppose that |${\sf O}_Y\psi \notin S_{ub}$|⁠. Thus, |$\lnot {\sf O}_Y\psi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Then, |$S_{ub}\vdash {\sf O}_Y\lnot {\sf O}_Y\psi$| by the Negative Introspection axiom and the Modus Ponens inference rule. Hence, |$S_{ub}\vdash {\sf O}_X\lnot {\sf O}_Y\psi$| by the Monotonicity axiom, the Modus Ponens inference rule, and the assumption |$Y\subseteq X$| of the claim. Thus, |${\sf O}_X\lnot {\sf O}_Y\psi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Then, |$\lnot {\sf O}_Y\psi \in Q^-$| by equation (12). Hence, |$\lnot {\sf O}_Y\psi \in Q$| because |$Q^-\subseteq Q$|⁠. Therefore, |${\sf O}_Y\psi \notin Q$| because set Q is consistent.

We are now ready to prove condition 2(b) of Definition 7.

Claim 8.

|${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2a}$|⁠, for each |$w_1,w_2\lt \alpha+1$|⁠, each |$a\lt \beta$|⁠, and each |$Y\subseteq V$| such that |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$|⁠.

Proof of Claim.

We consider the following three cases separately.

Case 1: |$w_1=w_2$|⁠. Then, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2a}$|⁠.

Case 2: |$w_1,w_2\lt \alpha$|⁠. Then, |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in T by Claim 6 and the assumption |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$| of the current claim. Hence, |${\sf O}_Y\psi \in S_{w_1a}$| iff |${\sf O}_Y\psi \in S_{w_2a}$| by condition 2(b) of Definition 7. Therefore, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2a}$| by equation (14).

Case 3: Exactly one of |$w_1,w_2$| is equal to |$\alpha$|⁠. Without loss of generality, suppose that

$$\begin{eqnarray} w_1\lt \alpha \,\, \,\, \,\, \text{ and } \,\, \,\, \,\, w_2=\alpha. \end{eqnarray}$$

(15)

Then, the assumption |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$| of the claim implies that |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(\alpha,a)$| in |$T^{\prime }$|⁠. In other words, in tree |$T^{\prime }$|⁠, the unique path from node |$(w_1,a)$| to node |$(\alpha,a)$| is labeled with each data variable from set Y. Note, see Fig. 8, that this path must go through the node |$(u,b)$| and contain the edge |$\langle (u,b),(\alpha,a)\rangle$|⁠. Thus,

$$\begin{eqnarray} Y\subseteq \ell ^{\prime }(\langle (u,b),(\alpha,a)\rangle ) \end{eqnarray}$$

(16)

and |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b)$| in tree |$T^{\prime }$|⁠. The last statement, by Claim 6, implies

$$\begin{eqnarray} (w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T. \end{eqnarray}$$

(17)

We further split this case into two subcases:

Case 3A: |$a=b$|⁠. Then, |$\ell ^{\prime }(\langle (u,b),(\alpha,a)\rangle )=X$| by equation (11). Hence, by statement (16),

$$\begin{eqnarray} Y\subseteq X. \end{eqnarray}$$

(18)

In addition, statement (17) and the assumption |$a=b$| of Case 3A imply that |${\sf O}_Y\psi \in S_{w_1a}$| iff |${\sf O}_Y\psi \in S_{ub}$| by item 2(b) of Definition 7. Hence, |${\sf O}_Y\psi \in S_{w_1a}$| iff |${\sf O}_Y\psi \in Q$| by Claim 7 and statement (18). Then, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{\alpha b}$| by equation (14). Therefore, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2a}$| because of the part |$w_2=\alpha$| of statement (15) and the assumption |$a=b$| of Case 3A.

Case 3B: |$a\ne b$|⁠. Then, |$\ell ^{\prime }(\langle (u,b),(\alpha,a)\rangle )=\varnothing$| by equation (11). Hence, by statement (16),

$$\begin{eqnarray} Y=\varnothing . \end{eqnarray}$$

(19)

By the assumption of the lemma |$u\lt \alpha$|⁠. Thus, the node |$(u,a)$| also belongs to the tree T. Hence, vacuously, all edges along the unique simple path between the nodes |$(w_1,a)$| and |$(u,a)$| are labeled with each element of the empty set. In other words, |$(w_1,a)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(u,a)$| in T. Thus, |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(u,a)$| in T by statement (19). Then, |${\sf O}_Y\psi \in S_{w_1a}$| iff |${\sf O}_Y\psi \in S_{ua}$| by item 2(b) of Definition 7. Hence, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{\alpha a}$| by equation (14), the part |$w_1\lt \alpha$| of statement (15), and the assumption |$a\ne b$| of Case 3B. Therefore, |${\sf O}_Y\psi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2 a}$| by the part |$w_2=\alpha$| of statement (15).

Claim 10 verifies condition 2(c) of Definition 7. The next claim is an auxiliary statement toward proving Claim 10.

Claim 9.

|${\sf D}_Y\psi \in S^{\prime }_{\alpha a}$| iff |${\sf D}_Y\psi \in S^{\prime }_{ub}$|⁠, for each formula |$\psi\in\Phi$|⁠, each ordinal |$a\lt \beta$|⁠, and each dataset |$Y\subseteq \ell ^{\prime }(\langle (\alpha,a),(u,b)\rangle )$|⁠.

Proof of Claim.

We consider the following two cases separately:

Case 1: |$a\ne b$|⁠. Then the assumption |$Y\subseteq \ell ^{\prime }(\langle (\alpha,a),(u,b)\rangle )$| of the claim implies, see Fig. 8,

$$\begin{eqnarray} Y=\varnothing . \end{eqnarray}$$

(20)

The assumption |$(u,b)\in \alpha\times \beta$| of the lemma implies that |$u\lt \alpha$|⁠. Thus, nodes |$(u,b)$| and |$(u,a)$| both belong to the tree T. Vacuously, all edges along the simple path connecting these nodes in tree T are labeled with each element of the empty set. Hence, |$(u,b)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(u,a)$| in tree T. Then, |${\sf D}_\varnothing \psi \in S_{ub}$| iff |${\sf D}_\varnothing \psi \in S_{ua}$| by item 2(c) of Definition 7. Thus, |${\sf D}_\varnothing \psi \in S^{\prime }_{ub}$| iff |${\sf D}_\varnothing \psi \in S^{\prime }_{\alpha a}$| by equation (14) and the assumption |$a\ne b$| of the claim. Therefore, |${\sf D}_Y\psi \in S^{\prime }_{ub}$| iff |${\sf D}_Y\psi \in S^{\prime }_{\alpha a}$| by equation (20).

Case 2: |$a= b$|⁠. Then, the assumption |$Y\subseteq \ell ^{\prime }(\langle (\alpha,a),(u,b)\rangle )$| of the claim imply that |$(\alpha,a)\stackrel{Y}{\rm o\!\!-\!\!o }(u,a)$| in |$T^{\prime }$|⁠. Thus, |${\sf O}_Y{\sf D}_Y\psi \in S^{\prime }_{\alpha a}$| iff |${\sf O}_Y{\sf D}_Y\psi \in S^{\prime }_{u a}$| by Claim 8. Hence, |${\sf D}_Y\psi \in S^{\prime }_{\alpha a}$| iff |${\sf D}_Y\psi \in S^{\prime }_{u a}$| by Lemma 7 because |$S^{\prime }_{\alpha a}$| and |$S^{\prime }_{u a}$| are maximal consistent sets. Therefore, |${\sf D}_Y\psi \in S^{\prime }_{\alpha a}$| iff |${\sf D}_Y\psi \in S^{\prime }_{u b}$| by the assumption |$a=b$| of the case.

Claim 10.

|${\sf D}_Y\psi \in S^{\prime }_{w_1a_1}$| iff |${\sf D}_Y\psi \in S^{\prime }_{w_2a_2}$| for each formula |$\psi \in \Phi$|⁠, each |$w_1,w_2\lt \alpha+1$|⁠, each |$a_1,a_2\lt \beta$|⁠, and each |$Y\subseteq V$| such that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠.

Proof of Claim.

We prove the claim by induction on the length of the simple path between the nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| in tree |$T^{\prime }$|⁠.

Base Case. |$(w_1,a_1)=(w_2,a_2)$|⁠. Then, |$w_1=w_2$| and |$a_1=a_2$|⁠. Therefore, |${\sf D}_Y\psi \in S^{\prime }_{w_1a_1}$| iff |${\sf D}_Y\psi \in S^{\prime }_{w_2a_2}$|⁠.

Induction Step. Consider the following two cases separately:

Case 1: |$w_1,w_2\lt \alpha$|⁠. Then, |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T by Claim 6 and the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the current claim. Thus, |${\sf D}_Y\psi \in S_{w_1a_1}$| iff |${\sf D}_Y\psi \in S_{w_2a_2}$| by item 2(c) of Definition 7. Therefore, |${\sf D}_Y\psi \in S^{\prime }_{w_1a_1}$| iff |${\sf D}_Y\psi \in S^{\prime }_{w_2a_2}$| by equation (14) and the assumption |$w_1,w_2\lt \alpha$| of the case.

Case 2: At least one of |$w_1$| and |$w_2$| is equal to |$\alpha$|⁠. Without loss of generality, let |$w_1=\alpha$|⁠. Consider the simple path between the nodes |$(w_1,a_1)=(\alpha,a_1)$| and |$(w_2,a_2)$| in tree |$T^{\prime }$|⁠. Because we consider the induction step, this path has more than one node. Observe that the node |$(\alpha,a_1)$| in tree |$T^{\prime }$| is only connected to node |$(u,b)$|⁠, see Fig. 8. Thus, the path has the form: |$(\alpha,a_1),(u,b),\dots ,(w_2,a_2)$|⁠. Then, by the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the claim,

$$\begin{eqnarray} Y\subseteq \ell ^{\prime }(\langle (\alpha,a_1),(u,b)\rangle ), \end{eqnarray}$$

(21)

$$\begin{eqnarray} &(u,b)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T^{\prime }. \end{eqnarray}$$

(22)

Thus, |${\sf D}_Y\psi \in S^{\prime }_{\alpha a_1}$| iff |${\sf D}_Y\psi \in S^{\prime }_{ub}$| by Claim 9 and statement (21). Additionally, by the induction hypothesis, statement (22) implies that |${\sf D}_Y\psi \in S^{\prime }_{ub}$| iff |${\sf D}_Y\psi \in S^{\prime }_{w_2a_2}$|⁠. Hence, |${\sf D}_Y\psi \in S^{\prime }_{w_1 a_1}$| iff |${\sf D}_Y\psi \in S^{\prime }_{w_2a_2}$| because |$w_1=\alpha$|⁠.

The next claim verifies condition 2(d) of Definition 7.

Claim 11.

If |${\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| and |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠, then |$\psi \in S^{\prime }_{w_2a_1}$|⁠, for each |$w_1,w_2\lt \alpha+1$|⁠, each |$a_1,a_2\lt \beta$|⁠, each dataset |$Y\subseteq V$|⁠, and each formula |$\psi \in \Phi$|⁠.

Proof of Claim.

One of the following four cases must take place, see Fig. 8:

Case 1: either |$Y=\varnothing$| or |$a_1=a_2$|⁠. In the former case, all edges along the simple path between nodes |$(w_1,a_1)$| and |$(w_2,a_1)$| are labeled with all data variables from set Y. Hence, |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_1)$| in |$T^{\prime }$|⁠. In the latter case, the same statement is true by the assumption of the claim that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠. Thus, |${\sf O}_Y\psi \in S^{\prime }_{w_1a_1}$| iff |${\sf O}_Y\psi \in S^{\prime }_{w_2a_1}$| by Claim 8. Note also that |${\sf O}_Y\psi \in S^{\prime }_{w_1a_1}$| by the assumption |${\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| of the claim, the Overt Knowledge axiom, the Modus Ponens inference rule, and the maximality of the set |$S^{\prime }_{w_1a_1}$|⁠. Hence, |${\sf O}_Y\psi \in S^{\prime }_{w_2a_1}$|⁠. Therefore, |$\psi \in S^{\prime }_{w_2a_1}$| by the Truth axiom, the Modus Ponens inference rule, and the maximality of the set |$S^{\prime }_{w_1a_1}$|⁠.

Case 2: |$w_1,w_2\lt \alpha$|⁠. Then, |${\sf R}_Y\psi \in S_{w_1a_1}$| by the assumption |${\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| of the claim and equation (14). Additionally, |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T by the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the current claim and Claim 6. Hence, |$\psi \in S_{w_2a_1}$| by item 2(d) of Definition 7. Therefore, |$\psi \in S^{\prime }_{w_2a_1}$| by equation (14).

Case 3: |$a_1\ne a_2$|⁠, one of the nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| belongs to tree T and the other is the node |$(\alpha,b)$|⁠. Then, the simple path between the nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| must go through the node |$(u,b)$|⁠, see Fig. 8. Furthermore, note that the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the claim implies that

$$\begin{eqnarray} (w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T^{\prime }, \end{eqnarray}$$

(23)

$$\begin{eqnarray} (u,b)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T^{\prime } . \end{eqnarray}$$

(24)

We further split this case into two subcases:

Case 3A: |$w_1\lt \alpha$| and |$(w_2,a_2)=(\alpha,b)$|⁠. Then, |${\sf R}_Y\psi \in S_{w_1a_1}$| by the assumption |${\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| of the claim and equation (14). Note also that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b)$| in T by statement (23) and Claim 6. Thus, |$\psi \in S_{ua_1}$| by item 2(d) of Definition 7. Then, |$\psi \in S^{\prime }_{\alpha a_1}$| by equation (14) and the assumption |$a_1\ne a_2$| of Case 3. Therefore, |$\psi \in S^{\prime }_{w_2 a_1}$| by the assumption |$(w_2,a_2)=(\alpha,b)$| of Case 3A.

Case 3B: |$w_2\lt \alpha$| and |$(w_1,a_1)=(\alpha,b)$|⁠. By the Introspection of De Re Knowledge axiom and the Modus Ponens inference rule, the assumption |${\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| of the claim implies |$S^{\prime }_{w_1a_1}\vdash {\sf O}_Y{\sf R}_Y\psi$|⁠. Then, |${\sf O}_Y{\sf R}_Y\psi \in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Thus, |${\sf O}_Y{\sf R}_Y\psi \in S^{\prime }_{ub}$| by statement (23) and Claim 8. Hence, |$S^{\prime }_{ub}\vdash {\sf R}_Y\psi$| by the Truth axiom and the Modus Ponens inference rule. Then, |${\sf R}_Y\psi \in S^{\prime }_{ub}$| because |$S^{\prime }_{ub}$| is a maximal consistent set. Thus, by equation (14) and the assumption |$(u,b)\in \alpha\times \beta$| of the lemma,

$$\begin{eqnarray} {\sf R}_Y\psi \in S_{ub}. \end{eqnarray}$$

(25)

At the same time, statement (24), by Claim 6, the assumption |$(u,b)\in \alpha\times \beta$| of the lemma, and the assumption |$w_2\lt \alpha$| of Case 3B, implies that |$(u,b)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T. Thus, |$\psi \in S_{w_2b}$| by item 2(d) of Definition 7 and statement (25). Hence, |$\psi \in S_{w_2a_1}$| because |$b=a_1$| by the assumption |$(w_1,a_1)=(\alpha,b)$| of Case 3B. Therefore, |$\psi \in S^{\prime }_{w_2a_1}$| by equation (14) and the assumption |$w_2\lt \alpha$| of Case 3B.

The condition 2(a) of Definition 7 is verified in Claim 13. The next claim is an auxiliary statement used in the proof of Claim 13. Note that we have chosen to delay the verification of condition 2(a) because the proof of Claim 12 is using Claim 10.

Claim 12.

|$!Y\in S^{\prime }_{w_1a_1}$| iff |$!Y\in S^{\prime }_{w_2a_2}$|⁠, for each |$Y\subseteq V$|⁠, each |$w_1,w_2\lt \alpha+1$|⁠, and each |$a_1,a_2\lt \beta$| such that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠.

Proof of Claim.

It suffices to show that if |$!Y\in S^{\prime }_{w_1a_1}$|⁠, then |$!Y\in S^{\prime }_{w_2a_2}$|⁠. Indeed, by the Introspection of Traceability axiom and the Modus Ponens inference rule, the assumption |$!Y\in S^{\prime }_{w_1a_1}$| implies that |$S^{\prime }_{w_1a_1} \vdash {\sf D}_Y(!Y)$|⁠. Hence, |${\sf D}_Y(!Y)\in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Thus, |${\sf D}_Y(!Y)\in S^{\prime }_{w_2a_2}$| by Claim 10 and the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the current claim. Hence, |$S^{\prime }_{w_2a_2}\vdash \, !\, Y$| by Lemma 5. Therefore, |$!Y\in S^{\prime }_{w_2a_2}$| because |$S^{\prime }_{w_2a_2}$| is a maximal consistent set.

Claim 13.

If |$!Y\in S^{\prime }_{w_1a_1}$| and |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠, then |$a_1=a_2$|⁠, for each dataset |$Y\subseteq V$|⁠, each |$w_1,w_2\lt \alpha+1$|⁠, and each |$a_1,a_2\lt \beta$|⁠.

Proof of Claim.

We prove the statement of the claim by induction on the length of the simple path between the nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| in tree |$T^{\prime }$|⁠.

Base Case: if the length of the path is zero, then |$(w_1,a_1)=(w_2,a_2)$|⁠. Therefore, |$a_1=a_2$|⁠.

Induction Step: we further split the induction step into the following three cases:

Case 1: |$w_1,w_2\lt \alpha$|⁠. Then, |$!Y\in S_{w_1a_1}$| by equation (14). Also, by Claim 6 and the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the current claim, |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T. Therefore, |$a_1=a_2$| by item 2(a) of Definition 7.

Case 2: at least one of |$w_1,w_2$| is equal to |$\alpha$|⁠. By Claim 12, without loss of generality, we can suppose that |$w_2=\alpha$|⁠. Note that |$(w_1,a_1)\ne (w_2,a_2)$| because we consider the induction step of the proof. Thus, the assumption |$w_2=\alpha$| implies that the simple path in tree |$T^{\prime }$| from the node |$(w_1,a_1)$| to the node |$(w_2,a_2)=(\alpha,a_2)$| must go through the node |$(u,b)$|⁠, see Fig. 8. Then, the simple path from the node |$(w_1,a_1)$| to the node |$(u,b)$| is shorter than the simple path from the node |$(w_1,a_1)$| to the node |$(w_2,a_2)$|⁠. In addition,

$$\begin{eqnarray} &(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T^{\prime }, \end{eqnarray}$$

(26)

$$\begin{eqnarray} &Y\subseteq \ell ^{\prime }(\langle (u,b),(\alpha,a_2)\rangle ) \end{eqnarray}$$

(27)

by the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the claim. We further split the proof into the following two subcases:

Case 2A: |$a_2=b$|⁠. Note that |$a_1=b$| by the induction hypothesis, the assumption |$!Y\in S^{\prime }_{w_1a_1}$| of the claim and statement (26). Therefore, |$a_1=a_2$| by the assumption |$a_2=b$| of Case 2A.

Case 2B: |$a_2\ne b$|⁠. Then, |$\ell ^{\prime }(\langle (u,b),(\alpha,a_2)\rangle )=\varnothing$|⁠, see Fig. 8. Thus, |$Y=\varnothing$| by statement (27). Then, vacuously, all edges along the simple path between the nodes |$(u,b)$| and |$(u,a_2)$| in tree T are labeled with all elements of the set Y. In other words,

$$\begin{eqnarray} (u,b)\stackrel{Y}{\rm o\!\!-\!\!o }(u,a_2) \,\, \,\, \,\,\text{in} \,\, \,\, \,\,T. \end{eqnarray}$$

(28)

At the same time, the assumption |$!Y\in S^{\prime }_{w_1a_1}$| of the claim implies that |$!Y\in S^{\prime }_{ub}$| by Claim 12 and statement (26). Thus, |$!Y\in S_{ub}$| by equation (14) and because |$u\lt \alpha$| due to the assumption |$(u,b)\in (\alpha,\beta)$| of the lemma. Therefore, |$a_2=b$| by item 2(a) of Definition 7 and statement (28), which contradicts the assumption of Case 2B.

This concludes the proof of the lemma.

Lemma 11.

For any finite loaded Cartesian tree |$T=(\alpha,\beta,E,\ell , S)$|⁠, any node |$(u,b)\in \alpha\times \beta$|⁠, and any formulae |${\sf D}_X\varphi \notin S_{ub}$| and |${\sf R}_X\psi \notin S_{ub}$| such that |$!X\notin S_{ub}$| there is an extension |$T^{\prime }=(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,\beta)$| in tree |$T^{\prime }$|⁠, |$\varphi \notin S^{\prime }_{\alpha\beta}$|⁠, and |$\psi \notin S^{\prime }_{\alpha b}$|⁠.

Proof.

Following the general structure of the proof of Lemma 10, we first define the Cartesian tree |$(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime })$|⁠. This tree is obtained by connecting all newly added nodes to the node |$(u,b)$| of the original tree |$(\alpha,\beta,E,\ell )$|⁠, see Fig. 9. Formally, relation |$E^{\prime }$| on the set |$(\alpha+1)\times (\beta+1)$| is defined as follows:

$$\begin{eqnarray} E^{\prime } &=& E\cup \lbrace ((u,b),(\alpha,\beta))\rbrace \cup \lbrace ((u,b),(\alpha,a)) \,\,| \,\, a\lt \beta\rbrace \\ &\cup& \lbrace ((u,b),(w,\beta)) \,\,| \,\, w\lt \alpha\rbrace . \end{eqnarray}$$

(29)

We label all newly added edges with the empty set except for the edge |$\langle (u,b),(\alpha,\beta)\rangle$|⁠, which is labeled with set X, see Fig. 9. Formally, we define edge labeling function |$\ell ^{\prime }$| as follows:

$$\begin{eqnarray} \ell ^{\prime }(\it e)= \left\lbrace \begin{array}{@{}l@{\quad }l@{}}X, & \rm{if\,\it e=((u,b),(\alpha,\beta))},\\ \varnothing , & \rm{if\,\it e=((u,b),(\alpha,a)), {\rm where}\, a\lt \beta},\\ \varnothing , & \rm{if\,\it e=((u,b),(w,\beta)), {\rm where}\,w\lt \alpha},\\ \ell (e), & \rm{otherwise}. \end{array}\right. \end{eqnarray}$$

(30)

The above equation concludes the definition of the Cartesian tree |$(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime })$|⁠. Toward the definition of the labeling function |$S^{\prime }$|⁠, let us first consider the following two sets of formulae:

$$\begin{eqnarray} H^-&=\lbrace \lnot \varphi \rbrace \cup \lbrace \chi \in \Phi \,\,| \,\,{\sf D}_X\chi \in S_{ub}\rbrace , \end{eqnarray}$$

(31)

$$\begin{eqnarray} G^-&=\lbrace \lnot \psi \rbrace \cup \lbrace \chi \in \Phi \,\,| \,\,{\sf R}_X\chi \in S_{ub}\rbrace . \end{eqnarray}$$

(32)

The proof of the next claim is the same as the proof of Claim 5 except that it uses modalities |${\sf D}$| and |${\sf R}$| instead of modality |${\sf O}$|⁠.

Claim 14.

|$H^-$| and |$G^-$| are consistent sets of formulae.

By Lemma 9, sets |$H^-$| and |$G^-$| can be extended to maximal consistent sets H and G, respectively. We are now ready to define the labeling function |$S^{\prime }$|⁠. The values of the labelling function |$S^{\prime }_{wa}$| are given by the |$(\alpha+1)\times (\beta+1)$| matrix shown in below equation:

$$\begin{eqnarray} (S''_{wa})= \left[\begin{array}{@{}ccccccccc@{}} S_{00} & S_{01} & \dots & S_{0,b-1} & S_{0b} & S_{0,b+1} & \dots & S_{0,\beta-1} & H\\ S_{10} & S_{11} & \dots & S_{1,b-1} & S_{1b} & S_{1,b+1} & \dots & S_{1,\beta-1} & H\\ \vdots&\vdots&&\vdots&\vdots&\vdots&&\vdots&\vdots\\ S_{u0} & S_{u1} & \dots & S_{u,b-1} & S_{ub} & S_{u,b+1} & \dots & S_{u,\beta-1} & H\\ \vdots&\vdots&&\vdots&\vdots&\vdots&&\vdots&\vdots\\ S_{\alpha-1,0} & S_{\alpha-1,1} & \dots & S_{\alpha-1,b-1} & S_{\alpha-1,b} & S_{\alpha-1,b+1} & \dots & S_{\alpha-1,\beta-1} & H\\ S_{u0} & S_{u1} & \dots & S_{u,b-1} & G & S_{u,b+1} &\dots & S_{u,\beta-1} & H\\ \end{array}\right]. \end{eqnarray}$$

Matrix |$(S''_{wa})$| towards the proof of Lemma 11.In other words, let

$$\begin{eqnarray} S^{\prime }_{wa}= \left\lbrace \begin{array}{@{}l@{\quad }l@{}}S_{ua}, & \rm{if\,\it w=\alpha, a\ne b, {\rm and}\,a\ne \beta},\\ G, & \rm{if\,\it w=\alpha\,{\rm and}\,a=b},\\ H, & \rm{if a=\beta},\\ S_{wa}, &\text{otherwise}. \end{array}\right. \end{eqnarray}$$

(33)

The tuple |$T^{\prime }=(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime }, S^{\prime })$| is now defined. Next, we show that |$T^{\prime }$| is a loaded Cartesian tree through verifying conditions 2(a)–(d) of Definition 7. We start with the following equivalent of Claim 6.

Claim 15.

Claim 17 verifies condition 2(b) of Definition 7. Claim 16 is an auxiliary statement used in the proof of Claim 17.

Claim 16.

|${\sf O}_\varnothing \chi \in S_{ub}$| iff |${\sf O}_\varnothing \chi \in G$|⁠, for each formula |$\chi \in \Phi$|⁠.

Proof of Claim.

|$(\Rightarrow ):$| Suppose that |${\sf O}_\varnothing \chi \in S_{ub}$|⁠. Then, by Lemma 6 and the Modus Ponens inference rule, |$S_{ub}\vdash {\sf O}_\varnothing {\sf O}_\varnothing \chi$|⁠. Hence, |$S_{ub}\vdash {\sf R}_\varnothing {\sf O}_\varnothing \chi$| by the Data-Free Knowledge axiom and the Modus Ponens inference rule. Then, |$S_{ub}\vdash {\sf R}_X{\sf O}_\varnothing \chi$| by the Monotonicity axiom and the Modus Ponens inference rule. Thus, |${\sf R}_X{\sf O}_\varnothing \chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set of formulae. Therefore, |${\sf O}_\varnothing \chi \in G^-\subseteq G$| by equation (32).

|$(\Leftarrow ):$| Suppose that |${\sf O}_\varnothing \chi \notin S_{ub}$|⁠. Then, |$\lnot {\sf O}_\varnothing \chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set of formulae. Thus, |$S_{ub}\vdash {\sf O}_\varnothing \lnot {\sf O}_\varnothing \chi$| by the Negative Introspection axiom and the Modus Ponens inference rule. Hence, |$S_{ub}\vdash {\sf R}_\varnothing \lnot {\sf O}_\varnothing \chi$| by the Data-Free Knowledge axiom and the Modus Ponens inference rule. Thus, |$S_{ub}\vdash {\sf R}_X\lnot {\sf O}_\varnothing \chi$| by the Monotonicity axiom and the Modus Ponens inference rule. Then, |${\sf R}_X\lnot {\sf O}_\varnothing \chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set of formulae. Thus, |$\lnot {\sf O}_\varnothing \chi \in G^-\subseteq G$| by equation (32). Therefore, |${\sf O}_\varnothing \chi \notin G$| because set G is consistent.

Claim 17.

|${\sf O}_Y\chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\chi \in S^{\prime }_{w_2a}$|⁠, for each formula |$\chi\in\Phi$|⁠, each |$w_1,w_2\lt \alpha+1$|⁠, each |$a\lt \beta+1$|⁠, and each dataset |$Y\subseteq V$| such that |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$|⁠.

Proof of Claim.

We consider the following four cases separately:

Case 1: |$w_1=w_2$|⁠. Then, |$S^{\prime }_{w_1a}=S^{\prime }_{w_2a}$|⁠. Therefore, |${\sf O}_Y\chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\chi \in S^{\prime }_{w_2a}$|⁠.

Case 2: |$a=\beta$|⁠. Then, |$S^{\prime }_{w_1a}=H=S^{\prime }_{w_2a}$| by equation (33). Therefore, |${\sf O}_Y\chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\chi \in S^{\prime }_{w_2a}$|⁠.

Case 3: |$a\lt \beta$| and |$w_1,w_2\lt \alpha$|⁠. Then, |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in T by Claim 15 and the assumption |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$| of the current claim. Hence, |${\sf O}_Y\chi \in S_{w_1a}$| iff |${\sf O}_Y\chi \in S_{w_2a}$| by item 2(b) of Definition 7. Therefore, |${\sf O}_Y\chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_Y\chi \in S^{\prime }_{w_2a}$| by equation (33) and the assumptions |$a\lt \beta$| and |$w_1,w_2\lt \alpha$| of the case.

Case 4: |$a\lt \beta$| and one of |$w_1,w_2$| is equal to |$\alpha$| while the other is less than |$\alpha$|⁠. Without loss of generality, let |$w_1\lt \alpha$| and |$w_2=\alpha$|⁠. Hence, the simple path between nodes |$(w_1,a)$| and |$(w_2,a)$| must contain the edge |$\langle (u,b),(w_2,a)\rangle$|⁠, see Fig. 9. In addition, |$Y\subseteq \ell ^{\prime }(\langle (u,b),(w_2,a)\rangle )$| by the assumption |$(w_1,a)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a)$| in |$T^{\prime }$| of the claim. Note that |$\ell ^{\prime }(\langle (u,b),(w_2,a)\rangle )=\varnothing$| by either equation (30) or Fig. 9 and the assumption of the case that |$a\lt \beta$|⁠. Thus, |$Y=\varnothing$|⁠. Hence, it suffices to show that |${\sf O}_\varnothing \chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_\varnothing \chi \in S^{\prime }_{w_2a}$|⁠.

Observe that nodes |$(w_1,a)$| and |$(u,a)$| belong to the tree T by the assumption of Case 4 and the assumption |$u\lt \alpha$| of the lemma. Then, all edges along the simple path in tree T between these nodes are vacuously labeled with all elements of the empty set. Hence, |$(w_1,a)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(u,a)$| in T. Thus, by item 2(b) of Definition 7,

$$\begin{eqnarray} {\sf O}_\varnothing \chi \in S_{w_1a} \,\, \,\,\text{iff} \,\, \,\,{\sf O}_\varnothing \chi \in S_{ua}. \end{eqnarray}$$

(34)

We further split this case into two subcases:

Case 4A: |$a=b$|⁠. Hence, |${\sf O}_\varnothing \chi \in S_{ua}$| iff |${\sf O}_\varnothing \chi \in G$| by Claim 16. Then, |${\sf O}_\varnothing \chi \in S_{w_1a}$| iff |${\sf O}_\varnothing \chi \in G$| by statement (34). Thus, |${\sf O}_\varnothing \chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_\varnothing \chi \in S^{\prime }_{w_2a}$| by equation (33), the assumptions |$a\lt \beta$|⁠, |$w_1\lt \alpha$|⁠, and |$w_2=\alpha$| of Case 4 and the assumption |$a=b$| of Case 4A.

Case 4B: |$a\ne b$|⁠. Then, |${\sf O}_\varnothing \chi \in S_{ua}$| iff |${\sf O}_\varnothing \chi \in S^{\prime }_{\alpha a}$| by equation (33), the assumption |$a\lt \beta$| of Case 4 and the assumption |$u\lt \alpha$| of the lemma. Thus, |${\sf O}_\varnothing \chi \in S_{w_1a}$| iff |${\sf O}_\varnothing \chi \in S^{\prime }_{\alpha a}$| by statement (34). Therefore, |${\sf O}_\varnothing \chi \in S^{\prime }_{w_1a}$| iff |${\sf O}_\varnothing \chi \in S^{\prime }_{w_2a}$| by equation (33), the assumptions |$a\lt \beta$|⁠, |$w_1\lt \alpha$|⁠, and |$w_2=\alpha$| of Case 4.

Claim 19 verifies condition 2(c) of Definition 7. Claim 18 is an auxiliary statement used in the proof of Claim 19.

Claim 18.

|${\sf D}_Y\chi \in S^{\prime }_{w a}$| iff |${\sf D}_Y\chi \in S^{\prime }_{ub}$|⁠, for each |$\chi \in \Phi$|⁠, each |$(w,a)\in (\alpha+1)\times (\beta+1)$| such that |$(w,a)\notin \alpha\times \beta$|⁠, and each dataset |$Y\subseteq \ell ^{\prime }(\langle (w,a),(u,b)\rangle )$|⁠.

Proof of Claim.

We consider the following two cases separately:

Case 1: |$w=\alpha$|⁠, |$a\lt \beta$|⁠. Then, |$\ell ^{\prime }(\langle (w,a),(u,b)\rangle )=\varnothing$| by equation (30); alternatively, see Fig. 9. Thus, |$Y=\varnothing$| by the assumption |$Y\subseteq \ell ^{\prime }(\langle (w,a),(u,b)\rangle )$| of the claim. Hence, it suffices to prove that |${\sf D}_\varnothing \chi \in S^{\prime }_{w a}$| iff |${\sf D}_\varnothing \chi \in S^{\prime }_{ub}$| We further split this case into two subcases.

Case 1A: |$a\ne b$|⁠. The assumption |$(u,b)\in \alpha\times \beta$| of the lemma implies that |$u\lt \alpha$| and |$b\lt \beta$|⁠. Hence, |$(u,a),(u,b)\in \alpha\times \beta$| by the assumption |$a\lt \beta$| of Case 1. Thus, nodes |$(u,a)$| and |$(u,b)$| belong to tree T. Note that all edges along the simple path between these two nodes are vacuously labeled with all elements of the empty set. Hence, |$(u,a)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(u,b)$| in tree T. Thus, |${\sf D}_\varnothing \chi \in S_{ua}$| iff |${\sf D}_\varnothing \chi \in S_{ub}$| by item 2(c) of Definition 7. Then, |${\sf D}_\varnothing \chi \in S^{\prime }_{w a}$| iff |${\sf D}_\varnothing \chi \in S_{ub}$| by equation (33) and the assumption |$w=\alpha$| of Case 1 and the assumption |$a\ne b$| of Case 1A. Hence, |${\sf D}_\varnothing \chi \in S^{\prime }_{w a}$| iff |${\sf D}_\varnothing \chi \in S^{\prime }_{ub}$| by equation (33) and the assumption |$(u,b)\in \alpha\times \beta$| of the lemma.

Case 1B: |$a=b$|⁠. Note that, vacuously, all edges along the path between the nodes |$(w,a)$| and |$(u,b)$| in tree |$T^{\prime }$| are labeled with all elements of the empty set. In other words, |$(w,a)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(u,b)$| in tree |$T^{\prime }$|⁠. Hence, |${\sf O}_\varnothing {\sf D}_\varnothing \chi \in S^{\prime }_{w a}$| iff |${\sf O}_\varnothing {\sf D}_\varnothing \chi \in S^{\prime }_{u b}$| by Claim 17 and the assumption |$a=b$| of Case 1B. Therefore, |${\sf D}_\varnothing \chi \in S^{\prime }_{w a}$| iff |${\sf D}_\varnothing \chi \in S^{\prime }_{u b}$| by Lemma 7 because |$S^{\prime }_{w a}$| and |$S^{\prime }_{u b}$| are maximal consistent sets of formulae.

Case 2: |$a=\beta$|⁠. Then, the assumption |$Y\subseteq \ell ^{\prime }(\langle (w,a),(u,b)\rangle )$| of the claim implies, see Fig. 9,

$$\begin{eqnarray} Y\subseteq X. \end{eqnarray}$$

(35)

In addition, by equation (33) and the assumption |$(u,b)\in \alpha\times \beta$|⁠, it suffices to prove that |${\sf D}_Y\chi \in H$| iff |${\sf D}_Y\chi \in S_{ub}$|⁠.

|$(\Rightarrow ):$| Suppose that |${\sf D}_Y\chi \notin S_{ub}$|⁠. Then, |$\lnot {\sf D}_Y\chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Hence, |$S_{ub}\vdash {\sf D}_Y\lnot {\sf D}_Y\chi$| by the Negative Introspection axiom and the Modus Ponens inference rule. Thus, |$S_{ub}\vdash {\sf D}_X\lnot {\sf D}_Y\chi$| by the Monotonicity axiom and statement (35). Hence, |${\sf D}_X\lnot {\sf D}_Y\chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Then, |$\lnot {\sf D}_Y\chi \in H^-\subseteq H$| by equation (31). Therefore, |${\sf D}_Y\chi \notin H$| because set H is consistent.

|$(\Leftarrow ):$| Let |${\sf D}_Y\chi \in S_{ub}$|⁠. Then, |$S_{ub}\vdash {\sf D}_Y{\sf D}_Y\chi$| by Lemma 6 and the Modus Ponens inference rule. Thus, |$S_{ub}\vdash {\sf D}_X{\sf D}_Y\chi$| by the Monotonicity axiom and statement (35). Hence, |${\sf D}_X{\sf D}_Y\chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set of formulae. Then, |${\sf D}_Y\chi \in H^-\subseteq H$| by equation (31).

The proof of the next claim is similar to the proof of Claim 10 except it uses Claim 15 and Claim 18 instead of Claim 6 and Claim 9.

Claim 19.

|${\sf D}_Y\chi \in S^{\prime }_{w_1a_1}$| iff |${\sf D}_Y\chi \in S^{\prime }_{w_2a_2}$|⁠, for each dataset |$Y\subseteq V$|⁠, each formula |$\chi \in \Phi$|⁠, each |$w_1,w_2\le \alpha+1$| and each |$a_1,a_2\le \beta+1$| such that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠.

Next, we verify condition 2(d) of Definition 7.

Claim 20.

If |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| and |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠, then |$\chi \in S^{\prime }_{w_2a_1}$| for each dataset |$Y\subseteq V$|⁠, each formula |$\chi \in \Phi$|⁠, each |$w_1,w_2\le \alpha+1$| and each |$a_1,a_2\le \beta+1$|⁠.

Proof of Claim.

The assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| implies that one of the following cases must take place, see Fig. 9.

Case 1: |$(w_1,a_1)=(w_2,a_2)$|⁠. The assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| implies that |$S^{\prime }_{w_1a_1}\vdash \chi$| by Lemma 5. Hence, |$\chi \in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Therefore, |$\chi \in S^{\prime }_{w_2a_1}$| by the assumption of the case.

Case 2: |$w_1,w_2\lt \alpha$| and |$a_1,a_2\lt \beta$|⁠. Thus, the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| implies that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T by Claim 15. In addition, the assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| implies that |${\sf R}_Y\chi \in S_{w_1a_1}$| by equation (33). Then, |$\chi \in S_{w_2a_1}$| by item 2(d) of Definition 7. Therefore, |$\chi \in S^{\prime }_{w_2a_1}$| by equation (33) and the assumptions |$w_2\lt \alpha$| and |$a_1\lt \beta$| of the case.

Case 3: |$Y=\varnothing$|⁠. Thus, by the assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| of the claim, |${\sf R}_\varnothing \chi \in S^{\prime }_{w_1a_1}$|⁠. Then, |$S^{\prime }_{w_1a_1}\vdash {\sf O}_\varnothing \chi$| by the Overt Knowledge axiom and the Modus Ponens inference rule. Hence, |${\sf O}_\varnothing \chi \in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set of formulae. Observe that, vacuously, all edges along the simple path between the nodes |$(w_1,a_1)$| and |$(w_2,a_1)$| in tree |$T^{\prime }$| are labeled with each data variable from the empty set. In other words, |$(w_1,a_1)\stackrel{\varnothing }{\rm o\!\!-\!\!o }(w_2,a_1)$| in |$T^{\prime }$|⁠. Thus, |${\sf O}_\varnothing \chi \in S^{\prime }_{w_2a_1}$| by Claim 17. Hence, |$S^{\prime }_{w_2a_1}\vdash \chi$| by the Truth axiom and the Modus Ponens inference rule. Therefore, |$\chi \in S^{\prime }_{w_2a_1}$| because |$S^{\prime }_{w_2a_1}$| is a maximal consistent set.

Case 4: |$a_1=\beta$|⁠. The assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| of the claim implies that |$S^{\prime }_{w_1a_1} \vdash \chi$| by Lemma 5. Thus, |$\chi \in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Then, |$\chi \in H$| by equation (33) and the assumption |$a_1=\beta$| of the case. Therefore, |$\chi \in S^{\prime }_{w_2a_1}$| again by equation (33) and the assumption |$a_1=\beta$| of the case.

Case 5: |$w_2=\alpha$|⁠, |$w_1\lt \alpha$|⁠, |$a_1\lt \beta$|⁠, and |$Y\subseteq X$|⁠. Then, any path from the node |$(w_1,a_1)$| to the node |$(w_2,a_2)$| must go through the node |$(u,b)$|⁠, see Fig. 9. Hence, the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the claim implies that

$$\begin{eqnarray} (w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b) \,\, \,\,\text{in} \,\, \,\,T^{\prime }. \end{eqnarray}$$

(36)

$$\begin{eqnarray} (w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b) \,\, \,\,\text{in} \,\, \,\,T. \end{eqnarray}$$

(37)

We further split this case into two subcases:

Case 5A: |$a_1= b$|⁠. The assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| of the claim implies |$S^{\prime }_{w_1a_1}\vdash {\sf O}_Y{\sf R}_Y\chi$| by the Introspection of De Re Knowledge axiom and the Modus Ponens rule. Hence, |${\sf O}_Y{\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Thus, |${\sf O}_Y{\sf R}_Y\chi \in S^{\prime }_{ub}$| by Claim 17, statement (36) and the assumption |$a_1= b$| of Case 5A. Then, |${\sf O}_Y{\sf R}_Y\chi \in S_{ub}$| by equation (33) and the assumption |$(u,b)\in \alpha\times \beta$| of the lemma. Hence, |$S_{ub}\vdash {\sf R}_Y\chi$| by the Truth axiom and the Modus Ponens inference rule. Thus, |$S_{ub}\vdash {\sf R}_X\chi$| by the Monotonicity axiom, the Modus Ponens inference rule, and the assumption |$X\subseteq Y$| of Case 5. Then, |${\sf R}_X\chi \in S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Hence, |$\chi \in G^-\subseteq G$| by equation (32). Thus, |$\chi \in S^{\prime }_{w_2a_1}$| by equation (33), the assumption |$w_2=\alpha$| of Case 5 and the assumption |$a_1=b$| of Case 5A.

Case 5B: |$a_1\ne b$|⁠. The assumption |${\sf R}_Y\chi \in S^{\prime }_{w_1a_1}$| of the claim implies |${\sf R}_Y\chi \in S_{w_1a_1}$| by equation (33) and the assumptions |$w_1\lt \alpha$| and |$a_1\lt \beta$| of Case 5. Hence, |$\chi \in S_{u,a_1}$| by item 2(d) of Definition 7 and statement (37). Therefore, |$\chi \in S_{w_2,a_1}$| by equation (33), the assumptions |$w_2=\alpha$| and |$a_1\lt \beta$| of Case 5 and the assumption |$a_1\ne b$| of Case 5B.

Finally, we verify condition 2(a) of Definition 7.

Claim 21.

If |$!Y\in S^{\prime }_{w_1a_1}$| and |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$|⁠, then |$a_1=a_2$| for each dataset |$Y\subseteq V$|⁠, each |$w_1,w_2\le \alpha+1$| and each |$a_1,a_2\le \beta+1$|⁠.

Proof of Claim.

At least one of the following three cases must take place, see Fig. 9:

Case 1: |$(w_1,a_1)=(w_2,a_2)$|⁠. Then, |$a_1=a_2$|⁠.

Case 2: |$w_1,w_2\lt \alpha$| and |$a_1,a_2\lt \beta$|⁠. Then, the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| implies that |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in T by Claim 15. Also, the assumption |$!Y\in S^{\prime }_{w_1a_1}$| implies |$!Y\in S_{w_1a_1}$| by equation (33). Therefore, |$a_1=a_2$| by item 2(a) of Definition 7.

Case 3: The simple path between the nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| goes through the node |$(u,b)$|⁠. Then,

$$\begin{eqnarray} &(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(u,b)\rm{ in T^{\prime }}, \end{eqnarray}$$

(38)

$$\begin{eqnarray} &Y\subseteq X. \end{eqnarray}$$

(39)

by the assumption |$(w_1,a_1)\stackrel{Y}{\rm o\!\!-\!\!o }(w_2,a_2)$| in |$T^{\prime }$| of the claim. At the same time, the assumption |$!Y\in S^{\prime }_{w_1a_1}$| of the claim implies that |$S^{\prime }_{w_1a_1}\vdash {\sf D}_Y!Y$| by the Introspection of the Traceability axiom and the Modus Ponens inference rule. Hence, |${\sf D}_Y!Y\in S^{\prime }_{w_1a_1}$| because |$S^{\prime }_{w_1a_1}$| is a maximal consistent set. Thus, |${\sf D}_Y!Y\in S^{\prime }_{ub}$| by statement (38) and Claim 19. Then, |$S^{\prime }_{ub}\vdash !Y$| by Lemma 5. Thus, |$S^{\prime }_{ub}\vdash !X$| by the Monotonicity axiom, statement (39) and the Modus Ponens inference rule. Then, |$!X\in S^{\prime }_{ub}$| because |$S^{\prime }_{ub}$| is a maximal consistent set. Therefore, |$!X\in S_{ub}$| by equation (33) and the assumption |$(u,b)\in \alpha\times \beta$| of the lemma, which contradicts the assumption |$!X\notin S_{ub}$| of the lemma.

This completes the proof of the lemma.

$$\begin{eqnarray} (A^1_1,\dots ,A^1_n)\sqsubseteq (A^2_1,\dots ,A^2_n)\sqsubseteq (A^3_1,\dots ,A^3_n)\sqsubseteq \dots \end{eqnarray}$$

Lemma 12.

For any infinite chain of loaded trees |$T_1\sqsubseteq T_2\sqsubseteq T_3\sqsubseteq \dots$|⁠, if loaded tree |$T_{i+1}$| is an extension of the loaded tree |$T_i$| for each |$i\ge 1$|⁠, then |$\bigcup _i T_i$| is a loaded tree that extends each of the loaded trees |$T_1,T_2,\dots$|⁠.

Intuitively, Definition 9 requires a complete loaded tree to have no “deficiencies”. It is convenient to talk about each such deficiency separately. To be able to do this, below we introduce the notion of |$(u,b,\psi )$|-complete loaded tree.

Definition 10.

if formula |$\psi$| has the form |$!X$| and |$!X\notin S_{ub}$|⁠, then is there is |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$b\ne a$|⁠,
if formula |$\psi$| has the form |${\sf O}_X\varphi$| and |${\sf O}_X\varphi \notin S_{ub}$|⁠, then there is |$(w,b)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,b)$| and |$\varphi \notin S_{wb}$|⁠,
if formula |$\psi$| has the form |${\sf D}_X\varphi$| and |${\sf D}_X\varphi \notin S_{ub}$|⁠, then there is |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$\varphi \notin S_{wa}$|⁠,
if formula |$\psi$| has the form |${\sf R}_X\varphi$| and |${\sf R}_X\varphi \notin S_{ub}$|⁠, then there is |$(w,a)\in \alpha\times \beta$| such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(w,a)$| and |$\varphi \notin S_{wb}$|⁠.

The next lemma follows from Definition 8 and Definition 10.

Lemma 13.

If a loaded Cartesian tree T is |$(u,b,\psi )$|-complete, then any extension of tree T is also |$(u,b,\psi )$|-complete.

The lemma below follows from Definition 9 and Definition 10.

Lemma 14.

Lemma 15.

Proof.

Note that |$S_{ub}\nvdash \bot$| because set |$S_{ub}$| is consistent. Then, |$S_{ub}\nvdash {\sf D}_X\bot$| and |$S_{ub}\nvdash {\sf R}_X\bot$| by Lemma 5 applied contrapositively. Thus,

$$\begin{eqnarray} {\sf D}_X\bot \notin S_{ub} \,\, \,\,\text{and} \,\, \,\,{\sf R}_X\bot \notin S_{ub}. \end{eqnarray}$$

(40)

To finish the proof of the lemma, we consider the cases corresponding to conditions 1-4 of Definition 10 separately.

Case 1: suppose that formula |$\psi$| has the form |$!X$| and |$!X\notin S_{ub}$|⁠. Hence, by Lemma 11 and statement (40), there is an extension |$T^{\prime }=(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,\beta)$| in tree |$T^{\prime }$|⁠. Thus, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Case 2: suppose formula |$\psi$| has the form |${\sf O}_X\varphi$| and |${\sf O}_X\varphi \notin S_{ub}$|⁠. Then, by Lemma 10 there is an extension |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,b)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha b}$|⁠. Thus, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Case 3: suppose formula |$\psi$| has the form |${\sf D}_X\varphi$| and |${\sf D}_X\varphi \notin S_{ub}$|⁠. We further slit this case into two subcases.

Case 3A: |$!X\in S_{ub}$|⁠. The assumption |${\sf D}_X\varphi \notin S_{ub}$| of Case 3 implies |$S_{ub}\nvdash {\sf D}_X\varphi$| because |$S_{ub}$| is a maximal consistent set. Thus, it follows that |$S_{ub}\nvdash {\sf O}_X\varphi$| by the Traceable Data axiom applied contrapositively and the assumption |$!X\in S_{ub}$| of Case 3A. Hence, |${\sf O}_X\varphi \notin S_{ub}$| because |$S_{ub}$| is a maximal consistent set. Then, by Lemma 10 there is an extension |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,b)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha b}$|⁠. Thus, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Case 3B: |$!X\notin S_{ub}$|⁠. Hence, by the assumption |${\sf D}_X\varphi \notin S_{ub}$| of Case 3, the part |${\sf R}_X\bot \notin S_{ub}$| of statement (40), and Lemma 11, there is an extension |$T^{\prime }=(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,\beta)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha\beta}$|⁠. Thus, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Case 4: suppose formula |$\psi$| has the form |${\sf R}_X\varphi$| and |${\sf R}_X\varphi \notin S_{ub}$|⁠. Similar to the previous case, we further slit this case into two subcases.

Case 4A: |$!X\in S_{ub}$|⁠. By applying the argument similar to the one in Case 3A, but using the form |$!X\rightarrow ({\sf O}_X\rightarrow {\sf R}_X)$| of the Traceability axiom instead of the form |$!X\rightarrow ({\sf O}_X\rightarrow {\sf D}_X)$|⁠, one can conclude that there is an extension |$T^{\prime }=(\alpha+1,\beta,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,b)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha b}$|⁠. Thus, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Case 4B: |$!X\notin S_{ub}$|⁠. Thus, by the assumption |${\sf R}_X\varphi \notin S_{ub}$| of Case 4, the part |${\sf D}_X\bot \notin S_{ub}$| of statement (40), and Lemma 11, there is an extension |$T^{\prime }=(\alpha+1,\beta+1,E^{\prime },\ell ^{\prime }, S^{\prime })$| of tree T such that |$(u,b)\stackrel{X}{\rm o\!\!-\!\!o }(\alpha,\beta)$| in tree |$T^{\prime }$| and |$\varphi \notin S^{\prime }_{\alpha b}$|⁠. Therefore, |$T^{\prime }$| is a |$(u,b,\psi )$|-complete extension of tree T by Definition 10.

Theorem 5 (extension).

Any finite loaded Cartesian tree can be extended to a complete loaded Cartesian tree.

Proof.

Let T be a finite loaded Cartesian tree. To prove the theorem, we will construct an infinite chain of finite loaded Cartesian trees |$T=T_1\sqsubseteq T_2\sqsubseteq T_3\sqsubseteq \dots$| such that |$T_{i+1}$| is an extension of tree |$T_i$| for each integer |$i\ge 1$|⁠. Toward the construction of this chain, consider an enumeration |$(u_1,b_1,\psi _1),(u_2,b_2,\psi _2),\dots$| of all triples |$(u,b,\psi )$| such that u and b are nonnegative integer numbers and |$\psi \in \Phi$| is a formula.

We define chain |$T=T_1\sqsubseteq T_2\sqsubseteq T_3\sqsubseteq \dots$| recursively. Suppose that tree |$T_k=(\alpha_k,\beta_k,E_k,\ell _k, S_k)$| is already defined. If |$T_k$| is |$(u,b,\psi )$|-complete for each |$u\lt \alpha_k$|⁠, |$b\lt \beta_k$|⁠, and |$\psi \in \Phi$|⁠, then let |$T_{k+1}=T_k$|⁠. Otherwise, let |$i_{min}$| be the smallest i such that |$u_i\lt \alpha_k$|⁠, |$b_i\lt \beta_k$|⁠, and |$T_k$| is not |$(u_i,b_i,\psi _i)$|-complete. By Lemma 15, tree |$T_k$| can be extended to a finite |$(u_i,b_i,\psi _i)$|-complete tree |$T_{k+1}$|⁠.

Let |$T^{\prime }=\cup _{k\ge 1} T_k=(\alpha^{\prime },\beta^{\prime },E^{\prime },\ell ^{\prime }, S^{\prime })$|⁠. Note that |$T^{\prime }$| is an extension of all trees |$T_k$| by Lemma 12. |$T^{\prime }$| is |$(u,b,\psi )$|-complete for each |$u\lt \alpha^{\prime }$| and |$b\lt \beta^{\prime }$| by the construction of the chain |$T=T_1\sqsubseteq T_2\sqsubseteq T_3\sqsubseteq \dots$| and Lemma 13. Therefore, |$T^{\prime }$| is complete by Lemma 14.

Canonical model

In this subsection, we define a canonical model based on a complete loaded Cartesian tree. Recall that we think about such a tree as a matrix. In our construction, the rows of the matrix become possible worlds of the model, and the columns of the matrix become the agents.

Definition 11.

For each complete loaded Cartesian tree |$T=(\alpha,\beta,E,\ell , S)$|⁠, let canonical model |$M(T)$| be defined as the tuple |$(\alpha,\beta,\lbrace \sim \rbrace _{x\in V},\pi )$|⁠, such that

|$(w_1,a_1)\sim _x(w_2,a_2)$| if |$(w_1,a_1)\stackrel{\lbrace x\rbrace }{\rm o\!\!-\!\!o }(w_2,a_2)$| for each |$x\in V$|⁠, each |$w_1,w_2\in \alpha$|⁠, and each |$a_1,a_2\in \beta$|⁠,
|$\pi (p)=\lbrace (w,a)\in \alpha\times \beta \,\,| \,\,p\in S_{wa}\rbrace$| for each p.

Lemma 16.

|$(w_1,a_1)\sim _X(w_2,a_2)$| iff |$(w_1,a_1)\stackrel{X}{\rm o\!\!-\!\!o }(w_2,a_2)$|⁠, for each |$w_1,w_2\in \alpha$|⁠, each |$a_1,a_2\in \beta$|⁠, and each dataset |$X\subseteq V$|⁠.

Proof.

By the definition, the statement |$(w_1,a_1)\sim _X(w_2,a_2)$| is equivalent to the statement that |$(w_1,a_1)\sim _x(w_2,a_2)$| for each data variable |$x\in X$|⁠. By item 1 of Definition 11, the last statement is equivalent to the statement that |$(w_1,a_1)\stackrel{\lbrace x\rbrace }{\rm o\!\!-\!\!o }(w_2,a_2)$| for each data variable |$x\in X$|⁠. By Definition 6, the previous statement is equivalent to the statement that for each |$x\in X$| there is a simple path between nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| such that all edges along this path are labeled with date variable x. Note that in any tree there is a unique simple path between any two nodes. Thus, the last statement is equivalent to the statement that all edges along this unique simple path between nodes |$(w_1,a_1)$| and |$(w_2,a_2)$| are labeled with each variable from set X. Finally, by Definition 6, the previous statement is equivalent to |$(w_1,a_1)\stackrel{X}{\rm o\!\!-\!\!o }(w_2,a_2)$|⁠.

Lemma 17.

|$w,a\Vdash \varphi$| iff |$\varphi \in S_{wa}$|⁠, for any formula |$\varphi \in \Phi$|⁠, any world |$w\in \alpha$|⁠, and any agent |$a\in \beta$| of any canonical model |$M(T)$| based on a complete loaded Cartesian tree |$T=(\alpha,\beta,E,\ell , S)$|⁠.

Proof.

We prove the statement of the lemma by induction on structural complexity of formula |$\varphi$|⁠. If formula |$\varphi$| is an atomic proposition, then the statement of the lemma follows from item 1 of Definition 2 and item 2 of Definition 11. The case when formula |$\varphi$| is a negation or a disjunction follows from items 3 and 4 of Definition 2 and the fact that |$S_{wa}$| is a maximal consistent set in the standard way.

Suppose that formula |$\varphi$| has the form |$!X$|⁠.

|$(\Rightarrow ):$| Assume that |$!X\notin S_{wa}$|⁠. Then, by item 1 of Definition 9, there is a node |$(w^{\prime },a^{\prime })$| such that |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a^{\prime })$| and |$a\ne a^{\prime }$|⁠. Hence, |$(w,a)\sim _X(w^{\prime },a^{\prime })$| by Lemma 16. Therefore, |$w,a\nVdash !X$| by item 2 of Definition 2.

|$(\Leftarrow ):$| Assume that |$w,a\nVdash \, !X$|⁠. Then, by item 2 of Definition 2, there exist a world |$w^{\prime }$| and an agent |$a^{\prime }$| such that |$(w,a)\sim _X (w^{\prime },a^{\prime })$| and |$a\ne a^{\prime }$|⁠. Hence, |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o } (w^{\prime },a^{\prime })$| by Lemma 16. Therefore, |$!X\notin S_{wa}$| by item 2(a) of Definition 7 applied contrapositively.

Suppose that formula |$\varphi$| has the form |${\sf O}_X\psi$|⁠.

|$(\Rightarrow ):$| Assume that |${\sf O}_X\psi \notin S_{wa}$|⁠. Then, by item 2 of Definition 9, there is a node |$(w^{\prime },a)$| such that |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a)$| and |$\psi \notin S_{w^{\prime }a}$|⁠. Hence, |$(w,a)\sim _X(w^{\prime },a)$| by Lemma 16 and |$w^{\prime },a\nVdash \psi$| by the induction hypothesis. Therefore, |$w,a\nVdash {\sf O}_X\psi$| by item 5 of Definition 2.

|$(\Leftarrow ):$| Suppose that |${\sf O}_X\psi \in S_{wa}$|⁠. Consider any world |$w^{\prime }$| such that |$(w,a)\sim _X(w^{\prime },a)$|⁠. By item 5 of Definition 2, it suffices to show that |$w^{\prime },a\Vdash \psi$|⁠.

The assumption |$(w,a)\sim _X(w^{\prime },a)$| implies |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a)$| by Lemma 16. Then, |${\sf O}_X\psi \in S_{w^{\prime }a}$| by item 2(b) of Definition 7 and the assumption |${\sf O}_X\psi \in S_{wa}$|⁠. Hence, |$S_{w^{\prime }a}\vdash \psi$| by the Truth axiom and the Modus Ponens inference rule. Thus, |$\psi \in S_{w^{\prime }a}$| because |$S_{w^{\prime }a}$| is a maximal consistent set. Therefore, |$w^{\prime },a\Vdash \psi$| by the induction hypothesis.

Suppose that formula |$\varphi$| has the form |${\sf D}_X\psi$|⁠.

|$(\Rightarrow ):$| Assume that |${\sf D}_X\psi \notin S_{wa}$|⁠. Then, by item 3 of Definition 9, there is a node |$(w^{\prime },a^{\prime })$| such that |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a^{\prime })$| and |$\psi \notin S_{w^{\prime }a^{\prime }}$|⁠. Hence, |$(w,a)\sim _X(w^{\prime },a^{\prime })$| by Lemma 16 and |$w^{\prime },a^{\prime }\nVdash \psi$| by the induction hypothesis. Therefore, |$w,a\nVdash {\sf D}_X\psi$| by item 6 of Definition 2.

|$(\Leftarrow ):$| Suppose that |${\sf D}_X\psi \in S_{wa}$|⁠. Consider any world |$w^{\prime }$| and any agent |$a^{\prime }$| such that |$(w,a)\sim _X(w^{\prime },a^{\prime })$|⁠. By item 6 of Definition 2, it suffices to show that |$w^{\prime },a^{\prime }\Vdash \psi$|⁠.

The assumption |$(w,a)\sim _X(w^{\prime },a^{\prime })$| implies |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a^{\prime })$| by Lemma 16. Then, |${\sf D}_X\psi \in S_{w^{\prime }a^{\prime }}$| by item 2(c) of Definition 7 and the assumption |${\sf D}_X\psi \in S_{wa}$|⁠. Then, |$S_{w^{\prime }a^{\prime }}\vdash \psi$| by Lemma 5. Thus, |$\psi \in S_{w^{\prime }a^{\prime }}$| because |$S_{w^{\prime }a^{\prime }}$| is a maximal consistent set. Therefore, |$w^{\prime },a^{\prime }\Vdash \psi$| by the induction hypothesis.

Suppose that formula |$\varphi$| has the form |${\sf R}_X\psi$|⁠.

|$(\Rightarrow ):$| Assume that |${\sf R}_X\psi \notin S_{wa}$|⁠. Then, by item 4 of Definition 9, there is a node |$(w^{\prime },a^{\prime })$| such that |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a^{\prime })$| and |$\psi \notin S_{w^{\prime }a}$|⁠. Hence, |$(w,a)\sim _X(w^{\prime },a^{\prime })$| by Lemma 16 and |$w^{\prime },a\nVdash \psi$| by the induction hypothesis. Therefore, |$w,a\nVdash {\sf R}_X\psi$| by item 7 of Definition 2.

|$(\Leftarrow ):$| Suppose that |${\sf R}_X\psi \in S_{wa}$|⁠. Consider any world |$w^{\prime }$| and any agent |$a^{\prime }$| such that |$(w,a)\sim _X(w^{\prime },a^{\prime })$|⁠. By item 7 of Definition 2, it suffices to show that |$w^{\prime },a\Vdash \psi$|⁠.

The assumption |$(w,a)\sim _X(w^{\prime },a^{\prime })$| implies |$(w,a)\stackrel{X}{\rm o\!\!-\!\!o }(w^{\prime },a^{\prime })$| by Lemma 16. Then, |$\psi \in S_{w^{\prime }a}$| by item 2(d) of Definition 7 and the assumption |${\sf R}_X\psi \in S_{wa}$|⁠. Therefore, |$w^{\prime },a\Vdash \psi$| by the induction hypothesis.

Theorem 6 (strong completeness).

For any set of formulae |$\Gamma \subseteq \Phi$| and any formula |$\varphi \in \Phi$|⁠, if |$\Gamma \nvdash \varphi$|⁠, then there is a world |$w\in W$| and an agent |$a\in \mathcal {A}$| of some model, such that |$w,a\Vdash \gamma$| for each formula |$\gamma \in \Gamma$| and |$w,a\nVdash \varphi$|⁠.

Proof.

The assumption |$\Gamma \nvdash \varphi$| implies that the set |$\Gamma \cup \lbrace \lnot \varphi \rbrace$| is consistent. Let |$\Delta$| be any maximal consistent extension of this set. Consider the loaded Cartesian tree |$(1,1,\varnothing ,\varnothing ,S)$|⁠, where |$S_{00}=\Delta$|⁠. In other words, the loaded cartesian tree consists of a single node (0,0) labeled with set |$\Delta$|⁠. Because the set of edges is empty, the tuple |$(1,1,\varnothing ,\varnothing )$| is a Cartesian tree by Definition 5. Thus, to prove that the tuple |$(1,1,\varnothing ,\varnothing ,S)$| is a loaded Cartesian tree, it suffices to show that conditions 2(a)-(d) of Definition 7 are satisfied. Conditions 2(a)-(c) are vacuously satisfied because the tree has only one node. To verify condition 2(d), it suffices to show that if |${\sf R}_X\psi \in S_{00}$|⁠, then |$\psi \in S_{00}$|⁠. Indeed, the assumption |${\sf R}_X\psi \in S_{00}$| implies |$S_{00}\vdash \psi$| by Lemma 5 and the Modus Ponens inference rule. Therefore, |$\psi \in S_{00}$| because |$S_{00}$| is a maximal consistent set.

By Theorem 5, tree |$(1,1,\varnothing ,\varnothing ,S)$| can be extended to a complete loaded Cartesian tree |$T^{\prime }=(\alpha,\beta,E,\ell , S^{\prime })$|⁠. Note that |$S^{\prime }_{00}=S_{00}=\Delta \supseteq \Gamma \cup \lbrace \lnot \varphi \rbrace$| by item 5 of Definition 8. Consider canonical model |$M(T^{\prime })$|⁠. By Lemma 17, we have |$0,0\Vdash \gamma$| for each |$\gamma \in \Gamma$| and |$0,0\Vdash \lnot \varphi$| because |$\Gamma \cup \lbrace \lnot \varphi \rbrace \subseteq S^{\prime }_{00}$|⁠. Note that the statement |$0,0\Vdash \lnot \varphi$| implies |$0,0\nVdash \varphi$| by item 3 of Definition 2.

Future work: k-anonymity

The term “k-anonymity” is usually used to refer to a setting where some information about an agent not only cannot be used to identify the agent precisely but it even cannot be used to identify a group of k agents to which the agent belongs. This concept has been extensively studied in the database literature [32,33]. It also was investigated by logicians in the context of knowledge [34] and actions [35]. In the setting of our article, there are at least two different notions that capture different aspects of k-anonymity.

First, we can generalize our traceability expression |$!X$| to k-tracability by modifying item 2 of Definition 2 as follows:

|$w,a\Vdash \, !^kX$|⁠, when there are agents |$a_1,\dots ,a_k\in \mathcal {A}$| such that for each world |$u\in W$| and each agent |$b\in \mathcal {A}$|⁠, if |$(w,a)\sim _X (u,b)$|⁠, then |$b\in \lbrace a_1,\dots ,a_k\rbrace$|.

Note that |$!^1 X$| is equivalent to the traceability expression |$!X$| we studied earlier in this article. The Monotonicity and the Introspection of Traceability axioms are also valid for k-tracability:

$$\begin{eqnarray} !^kX\rightarrow !^kY, \,\,\rm{if X\subseteq Y},\,\,\,\,\,\,\,\,\,\, !^kX\rightarrow {\sf D}_X!^kX. \end{eqnarray}$$

Before discussing the other notion related to k-anonymity, let us go back to our de re modality. Recall from statement (4) that in world |$w_1$|⁠, knowing |$age$|⁠, |$city$|⁠, and |$reply$| of agent a informs de re knowledge that agent a is pregnant. Let us now suppose that, in the same world |$w_1$|⁠, instead of the questionnaire filled in by agent a we are looking at the questionnaire filled in by her husband, agent b. Although these two questionnaires are filled identically, see Fig. 2, agent b’s questionnaire, of course, does not inform the de re knowledge that he is pregnant:

$$\begin{eqnarray} w_1,b\nVdash {\sf R}_{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

Nevertheless, his questionnaire does inform de re knowledge that his wife, agent a, is pregnant. In other words, although the data in his questionnaire does not violate his own privacy, it does violate the privacy of his wife. The language of our original logical system cannot express this. However, one can introduce a new modality |${\sf R}^\exists$| (“de re knowledge about somebody”) that does it. The semantics of this modality can be defined as follows:

|$w,a\Vdash {\sf R}^\exists _X\varphi$|⁠, if there is a agent |$a^{\prime }\in \mathcal {A}$| such that for each world |$u\in W$| and each agent |$b\in \mathcal {A}$|⁠, if |$(w,a)\sim _X (u,b)$|⁠, then |$u,a^{\prime }\Vdash \varphi$|.

In our example,

$$\begin{eqnarray} w_1,b\Vdash {\sf R}^\exists _{\text{age,city,reply}}\, \text{``is pregnant''}. \end{eqnarray}$$

Some properties of the modality |${\sf R}^\exists$| are expected:

$$\begin{eqnarray} {\sf R}_X\varphi \rightarrow {\sf R}^\exists _X\varphi , {\sf R}^\exists _X\varphi \rightarrow {\sf R}^\exists _Y\varphi \,\,\rm{if X\subseteq Y}. \end{eqnarray}$$

Among the unexpected ones is perhaps the introspection:

$$\begin{eqnarray} {\sf R}^\exists _X\varphi \rightarrow {\sf D}_X{\sf R}^\exists _X\varphi . \end{eqnarray}$$

Note that unlike the corresponding axiom for modality |${\sf R}$|⁠, the introspection property for modality |${\sf R}^\exists$| is stronger, because it is true for de dicto knowledge, not just the overt knowledge.

Finally, let us get back to k-anonymity. Imagine a different survey where an agent is asked if one of the agent’s children is getting married. Suppose that an agent a, who has k adult children, fills in the survey and answers this question positively. If the survey has enough information to trace the identity of the agent, then the dataset X in the survey informs the knowledge that one of the k children is getting married. We write this as |${\sf R}^k_X \text{``is getting married''}$|⁠. In general, modality |${\sf R}^k$| can be defined as follows:

|$w,a\Vdash {\sf R}^k_X\varphi$|⁠, if there are agents |$a_1,\dots ,a_k\in \mathcal {A}$| such that for each world |$u\in W$| and each agent |$b\in \mathcal {A}$|⁠, if |$(w,a)\sim _X (u,b)$|⁠, then |$u,a_i\Vdash \varphi$| for some |$i\le k$|.

Observe that the statement |$\lnot {\sf R}^k_X\varphi$| means that dataset X does not provide enough information to identify a group of k agents one of which has property |$\varphi$|⁠. Thus, just like the expression |$\lnot \, !^kX$| it captures an aspect of k-anonymity. Below are some of the properties of modality |${\sf R}^k$|⁠:

$$\begin{eqnarray} \begin{array}{cc}{\sf R}^1_X\varphi \leftrightarrow {\sf R}^\exists _X\varphi , & \qquad {\sf R}^k_X\varphi \rightarrow {\sf D}_X{\sf R}^k_X\varphi ,\\ {\sf R}^k_X\varphi \rightarrow {\sf R}^k_Y\varphi \,\,\rm{if X\subseteq Y}, & \qquad {\sf R}^k_X\varphi \rightarrow {\sf R}^\ell _Y\varphi \,\,\rm{if \ell \le k}.\end{array} \end{eqnarray}$$

A complete axiomatization of the interplay between the expression |$!^kX$| and the modality |${\sf R}^k_X$| remains an open question.

Conclusion

In this article, we observed that what one “knows” about an agent based on the agent’s data can be defined in three different ways. We called them de re, de dicto, and overt forms of knowledge. Out of these three forms, only de dicto knowledge preserves the privacy of the agent. Our main technical results are the mutual undefinability of these three forms of knowledge in an egocentric setting even when the traceability expression |$!$| is used and a sound and complete logical system describing the interplay between the three forms of knowledge and the traceability expression.

Finally, in the “Future work” section, we discuss how our approach can be generalized to capture two forms of k-anonymity.

Author contribution

Junli Jiang (Conceptualization-Equal, Formal analysis-Equal, Funding acquisition-Equal, Investigation-Equal, Methodology-Equal, Project administration-Equal, Resource, Writing - original draft-Equal, Writing - review & editing-Equal).

Pavel Naumov (Conceptualization-Equal, Formal analysis-Equal, Funding acquisition-Equal, Investigation-Equal, Methodology-Equal, Project administration-Equal, Resource, Writing - original draft-Equal, Writing - review & editing-Equal).

Conflict of interest

The authors wish to declare that they are not aware of any conflict of interest related to this work.

Funding

J.J. acknowledges the support of the Key project of the National Social Science Fund of China (No. 24AZX019) and the China Scholarship Council (No. 202306990071).

Appendix

To keep the article self-contained, in this appendix, we give the proofs of the lemmas not proven in the main part of the article. We place them here because they are either straightforward or well-known.

Lemma 18.

The inference rule |$\frac{\varphi }{\Box _{X}\varphi }$|⁠, where |$\Box \in \lbrace {\sf O},{\sf R}\rbrace$|⁠, is derivable.

Proof.

Suppose that |$\vdash \varphi$|⁠. Thus, |$\vdash {\sf D}_X\varphi$| by the Necessitation inference rule. Hence, |$\vdash {\sf O}_X\varphi$| by the Overt Knowledge axiom and the Modus Ponens inference rule. This proves that the rule |$\frac{\varphi }{{\sf O}_{X}\varphi }$| is derivable.

To prove derivability of the rule |$\frac{\varphi }{{\sf R}_{X}\varphi }$|⁠, assume again that |$\vdash \varphi$|⁠. Thus, |$\vdash {\sf D}_\varnothing \varphi$| by the Necessitation inference rule. Hence, |$\vdash {\sf R}_\varnothing \varphi$| by the Data-Free Knowledge axiom and the Modus Ponens inference rule. Therefore, |$\vdash {\sf R}_X\varphi$| by the Monotonicity axiom and the Modus Ponens inference rule.

Lemma 6.

|$\vdash \Box _X\varphi \rightarrow \Box _X\Box _X\varphi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D}\rbrace$|.

Proof.

Formula |$\Box _X\lnot \Box _X\varphi \rightarrow \lnot \Box _X\varphi$| is either an instance of the Truth axiom (if |$\Box$| is modality |${\sf O}$|⁠) or Lemma 5 (if |$\Box$| is modality |${\sf D}$|⁠). Thus, by the law of contraposition, |$\vdash \Box _X\varphi \rightarrow \lnot \Box _X\lnot \Box _X\varphi$|⁠. Hence, taking into account the following instance of the Negative Introspection axiom: |$\lnot \Box _X\lnot \Box _X\varphi \rightarrow \Box _X\lnot \Box _X\lnot \Box _X\varphi$|⁠, we have

$$\begin{eqnarray} \vdash \Box _X\varphi \rightarrow \Box _X\lnot \Box _X\lnot \Box _X\varphi . \end{eqnarray}$$

(41)

At the same time, |$\lnot \Box _X\varphi \rightarrow \Box _X\lnot \Box _X\varphi$| is an instance of the Negative Introspection axiom. Thus, |$\vdash \lnot \Box _X\lnot \Box _X\varphi \rightarrow \Box _X\varphi$| by the law of contrapositive in the propositional logic. Hence, |$\vdash \Box _X(\lnot \Box _X\lnot \Box _X\varphi \rightarrow \Box _X\varphi )$| by either the Necessitation inference rule (if |$\Box$| is modality |${\sf D}$|⁠) or Lemma 18 (if |$\Box$| is modality |${\sf O}$|⁠). Thus, by the Distributivity axiom and the Modus Ponens inference rule, |$\vdash \Box _X\lnot \Box _X\lnot \Box _X\varphi \rightarrow \Box _X\Box _X\varphi .$| The latter, together with statement (), implies the statement of the lemma by propositional reasoning.

Lemma 19 (deduction).

If |$\Gamma ,\varphi \vdash \psi$|⁠, then |$\Gamma \vdash \varphi \rightarrow \psi$|⁠.

Proof.

|$\vdash \psi _k$|⁠, or
|$\psi _k\in \Gamma$|⁠, or
|$\psi _k$| is equal to |$\varphi$|⁠, or
there are |$i,j\lt k$| such that formula |$\psi _j$| is equal to |$\psi _i\rightarrow \psi _k$|⁠.

It suffices to show that |$\Gamma \vdash \varphi \rightarrow \psi _k$| for each |$k\le n$|⁠. We prove this by induction on k by considering the four cases above separately.

Lemma 8.

If |$\varphi _1,\dots ,\varphi _n\vdash \psi$|⁠, then |$\Box _X\varphi _1,\dots ,\Box _X\varphi _n\vdash \Box _X\psi$|⁠, where |$\Box \in \lbrace {\sf O},{\sf D},{\sf R}\rbrace$|.

Proof.

The assumption |$\varphi _1,\dots ,\varphi _n\vdash \psi$|⁠, by Lemma 19 applied n times, implies that

$$\begin{eqnarray} \vdash \varphi _1\rightarrow (\varphi _2\rightarrow \dots (\varphi _n\rightarrow \psi )\dots ). \end{eqnarray}$$

Thus, by either the Necessitation inference rule (if |$\Box$| is the modality |${\sf O}$|⁠) or by Lemma 18 (if |$\Box$| is one of the modalities |${\sf D}$| and |${\sf R}$|⁠),

$$\begin{eqnarray} \vdash \Box _X(\varphi _1\rightarrow (\varphi _2\rightarrow \dots (\varphi _n\rightarrow \psi )\dots )). \end{eqnarray}$$

Hence, by the Distributivity axiom and the Modus Ponens rule,

$$\begin{eqnarray} \vdash \Box _X\varphi _1\rightarrow \Box _X(\varphi _2\rightarrow \dots (\varphi _n\rightarrow \psi )\dots ). \end{eqnarray}$$

Then, again by the Modus Ponens rule,

$$\begin{eqnarray} \Box _X\varphi _1\vdash \Box _X(\varphi _2\rightarrow \dots (\varphi _n\rightarrow \psi )\dots ). \end{eqnarray}$$

Therefore, |$\Box _X\varphi _1,\dots ,\Box _X\varphi _n\vdash \Box _X\psi$| by applying the previous steps |$(n-1)$| more times.

References

Quine

Quantifiers and propositional attitudes

J Philos

1956

;

177

–

Month:	Total Views:
January 2025	132
February 2025	139
March 2025	186
April 2025	222
May 2025	20

Article Contents

De re/de dicto distinction: a logicians’ perspective on data anonymity

Abstract

Introduction

Preliminaries

Egocentric logic

Data-informed knowledge

De re/de dicto data-informed knowledge

Overt data-informed knowledge and traceability

Syntax and semantics

Undefinability results

Undefinability of |${\sf R}$| through |$!$|⁠, |${\sf O}$|⁠, and |${\sf D}$|

Undefinability of |${\sf O}$| and |${\sf D}$|

Axioms

Soundness

Completeness

Cartesian trees

Complete loaded Cartesian trees

Extension theorem

Canonical model

Future work: k-anonymity

Conclusion

Author contribution

Conflict of interest

Funding

Appendix

References

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only