(This started as an answer on Math.Stackexchange. This version has been lightly edited and expanded. Also posted at my blog.)
Throughout this post, theory means first-order theory. In fact, we are concerned with theories that are recursively presented, though the abstract framework applies more generally. Thanks to Fredrik Engström Ellborg for suggesting in Google+ the reference Kaye-Wong, and to Ali Enayat for additional references and many useful conversations on this topic.
Informally, to say that a theory interprets a theory means that there is a procedure for associating structures in the language of to structures in the language of in such a way that if is a model of , then is a model of .
Let us be a bit more precise, and do this more syntactically to reduce the requirements of the metatheory: One can take “the theory interprets the theory ” to mean that there are
- A map that assigns formulas in the language of to the symbols of the language of , and
- A formula in the language of ,
with the following properties: We can extend to all -formulas recursively: , etc, and . It then holds that proves
- , and
- for each axiom of (including the axioms of first-order logic).
Here, are taken to be recursive, and so is .
If the above happens, then we can see as a strong witness to the fact that the consistency of implies the consistency of .
Two theories are mutually interpretable iff each one interprets the other. By the above, this is a strong version of the statement that they are equiconsistent.
Two theories are bi-interpretable iff they are mutually interpretable, and in fact, the interpretations from is and from in can be taken to be “inverses” of each other, in the sense that proves that and are equivalent for each in the language of , and similarly for , and . In a sense, two theories that are bi-interpretable are very much “the same”, only differing in their presentation.
A good example illustrating this notion occurs when formalizing the idea of finitary mathematics. We can show that the two standard formalizations, and , are bi-interpretable, where is the variant of resulting from replacing the axiom of infinity with its negation, and where foundation is stated as the axiom scheme of -induction.
If we do not take the precaution of stating foundation this way, then the resulting theory is mutually interpretable with , but it is not immediately clear whether they are bi-interpretable, see section 3 for this. (Similar issues, curiously, occur with NFU and appropriate fragments of set theory.) The problem is related to the existence of transitive closures, see section 4 for more details.
The standard interpretation of inside goes by defining iff, writing in base , the th number from the right is . That is, for some and some . From Gödel’s work on the incompleteness theorems, we know that this is definable inside , see here for some details and references. This interpretation goes back to Ackermann:
Wilhelm Ackermann. Die Widerspruchsfreiheit der allgemeinen Mengenlehre, Math. Ann. 114 (1937), 305–315.
In the other direction, the interpretation goes by realizing that in we can still define and ordinal arithmetic, and the result is a model of . But this is not the inverse. To obtain the inverse, we work in , and compose this with a definable bijection , namely
(Of course, stands here for “”, the universe, and stands for “”.) The paper Kaye-Wong explains all this and provides additional references. This inverse interpretation is due to Mycielski, who published it in Russian:
Jan Mycielski. The definition of arithmetic operations in the Ackermann model. Algebra i Logika Sem. 3 (5-6), (1964), 64–65. MR0177876 (31 #2134).
Mycielski’s result remained “hidden” until attention was drawn to it in the survey
Richard Kaye and Tin Lok Wong. On interpretations of arithmetic and set theory. Notre Dame Journal of Formal Logic 48 (4), (2007), 497–510. MR2357524 (2008i:03075).
(The pdf linked to above has some additional information than the version published at the Journal.)
On the difference between mutual interpretability and bi-interpretability, see this nice post by Ali Enayat to the FOM list. Ali shows there that and , which are obviously mutually interpretable, are not bi-interpretable. The proof goes by noticing that if and are bi-interpretable, then the class of groups arising as for coincides with the class of groups of the form for .
However, admits a model with an automorphism of order , and this is not possible for . (Additional details are given at the link.)
This naturally leads to the question of whether and (or, equivalently, ) are bi-interpretable, that is, whether a more clever interpretation than Ackermann’s may get rid of the issues that forced us to adopt -induction.
That this is not the case, meaning, the theories are mutually interpretable, but not bi-interpretable, was shown recently in
Ali Enayat, James H. Schmerl, and Albert Visser. -models of finite set theory. In Set theory, arithmetic, and foundations of mathematics: theorems, philosophies, Juliette Kennedy, and Roman Kossak, eds. Lect. Notes Log., vol. 36, Assoc. Symbol. Logic, La Jolla, CA, 2011, pp. 43–65. MR2882651 (2012m:03132).
In fact, in Theorem 5.1, they show that and are not even sententially equivalent. This is a weaker notion than bi-interpretability:
Recall that if we have an interpretation from in , then to each model of we associate a model of . That and are bi-interpretable gives us interpretations from in , and from in such that for any model of , the model is isomorphic to , and for any model of , the model is isomorphic to .
We say that and are sententially equivalent when the interpretations and can be chosen to satisfy the weaker demand than and are elementarily equivalent, and similarly for and .
Their proof that this property fails for and ultimately traces back to the fact that no arithmetically definable model of that is non-standard can be elementarily equivalent to the standard model. This is a result of Dana Scott. On the other hand, there are many arithmetically definable -models of . In section 4 we illustrate a very general method for producing such examples; further details can be seen in the Enayat-Schmerl-Visser paper.
This section was originally an answer at Math.Stackexchange.
Let be the theory resulting from replacing in the axiom of infinity by its negation, but leaving foundation as the axiom stating that any non-empty set has an element disjoint from . This theory is not strong enough to prove the existence of transitive closures. This is the reason why is usually formulated with foundation (regularity) replaced by its strengthening of -induction, namely the scheme that for every formula adds the axiom
Over the base theory consisting of (the empty set exists), extensionality, pairing, union, comprehension, and replacement, one can prove that for any , there is a transitive set containing iff there is a smallest such set, that is, there is a transitive set containing iff its transitive closure exists.
Over plus foundation, the scheme of -induction is equivalent to the statement that every set is contained in a transitive set.
Unfortunately, as I claimed above, cannot prove the existence of transitive closures. To see this, start with , the standard model of . Let
for , and for . Now define a new “membership” relation by
for . It turns out that
In fact, is not contained in any transitive set in this structure. (Note that is not the empty set of this model.)
All this and much more is discussed in Kaye-Wong. The result that is consistent with , and the proof just sketched, are due to Mancini. For additional details, Kaye-Wong refers to
Antonella Mancini and Domenico Zambella. A note on recursive models of set theories, Notre Dame Journal of Formal Logic, 42 (2), (2001), 109-115. MR1993394 (2005g:03061).
In fact, if is any bijection, then defining as above gives a model of , except possibly for foundation. Many examples can be produced starting from this idea.
Let me close by mentioning a very recent paper by Ali Enayat,
Ali Enayat. Some interpretability results concerning . Preprint.
Here, Ali extends the result mentioned above that and are not bi-interpretable. In fact, he shows that no two distinct extensions of are bi-interpretable. (And some more.)
This should be contrasted with the situation for mutually interpretability. The development of forcing and inner model theory in fact has provided us with a plethora of examples of mutually interpretable such extensions.
Hopefully someone in the know can add appropriate categories here. And I noticed I neglected to mention the key modern reference on interpretations, A. Visser, Categories of theories and interpretations, Logic in Tehran, Lecture Notes in Logic, vol. 26, Association for Symbolic Logic, La Jolla, CA, 2006, pp. 284–341.
Let me put interpretations in a broader context by quoting from page 29 of the Borovik Nesin book on Groups of finite Morley rank.
“The notion of interpretation in model theory corresponds to a number of familiar phenomena in algebra which are often considered distinct: coordinatization, structure theory, and constructions like direct product and homomorphic image. For example a Desarguesian projective plane is coordinatized by a division ring; Artinian semisimple rings are finite direct products of matrix rings over divisions rings; many theorems of finite group theory have as their conclusion that a certain abstract group belongs to a standard family of matrix groups over . All of these examples have a common feature: certain structures of one kind are somehow encoded in terms of structures of another kind. All of these examples have a further feature which plays no role in algebra but which is crucial for us: in each case the encoded structures can be recovered from the encoding structures definably.”
Thank you, John, for your comment and mentions (of Borovik-Nesin)…