Category Archives: Model Theory


(This started as an answer on Math.Stackexchange. This version has been lightly edited and expanded. Also posted at my blog.)

Throughout this post, theory means first-order theory. In fact, we are concerned with theories that are recursively presented, though the abstract framework applies more generally. Thanks to Fredrik Engström Ellborg for suggesting in Google+ the reference Kaye-Wong, and to Ali Enayat for additional references and many useful conversations on this topic.


Informally, to say that a theory T interprets a theory S means that there is a procedure for associating structures \mathcal N in the language of S to structures \mathcal M in the language of T in such a way that if \mathcal M is a model of T, then \mathcal N is a model of S.

Let us be a bit more precise, and do this more syntactically to reduce the requirements of the metatheory: One can take “the theory T interprets the theory S to mean that there are

  1. A map i that assigns formulas in the language of T to the symbols of the language \mathcal L of S, and
  2. A formula \mathrm{Dom}(x) in the language of T,

with the following properties: We can extend i to all \mathcal L-formulas recursively: i(\phi\land\psi)=i(\phi)\land i(\psi), etc, and i(\forall x\phi)=\forall x(\mathrm{Dom}(x)\to i(\phi)). It then holds that T proves

  1. \exists x\,\mathrm{Dom}(x), and
  2. i(\phi) for each axiom \phi of S (including the axioms of first-order logic).

Here, T,S are taken to be recursive, and so is i.

If the above happens, then we can see i as a strong witness to the fact that the consistency of T implies the consistency of S.

Two theories are mutually interpretable iff each one interprets the other. By the above, this is a strong version of the statement that they are equiconsistent.

Two theories are bi-interpretable iff they are mutually interpretable, and in fact, the interpretations i from T is S and j from S in T can be taken to be “inverses” of each other, in the sense that T proves that \phi and j(i(\phi)) are equivalent for each \phi in the language of T, and similarly for S, \psi and i(j(\psi)). In a sense, two theories that are bi-interpretable are very much “the same”, only differing in their presentation.


A good example illustrating this notion occurs when formalizing the idea of finitary mathematics. We can show that the two standard formalizations, \mathsf{PA} and \mathsf{ZFfin}, are bi-interpretable, where \mathsf{ZFfin} is the variant of \mathsf{ZF} resulting from replacing the axiom of infinity with its negation, and where foundation is stated as the axiom scheme of \epsilon-induction.

If we do not take the precaution of stating foundation this way, then the resulting theory \mathsf{ZFfin}' is mutually interpretable with \mathsf{PA}, but it is not immediately clear whether they are bi-interpretable, see section 3 for this. (Similar issues, curiously, occur with NFU and appropriate fragments of set theory.) The problem is related to the existence of transitive closures, see section 4  for more details.

The standard interpretation of \mathsf{ZFfin} inside \mathsf{PA} goes by defining n\mathrel E m iff, writing m in base 2, the nth number from the right is 1. That is, m=(2k+1)2^n+s for some k and some s<2^n. From Gödel’s work on the incompleteness theorems, we know that this is definable inside \mathsf{PA}, see here for some details and references. This interpretation goes back to Ackermann:

Wilhelm Ackermann. Die Widerspruchsfreiheit der allgemeinen Mengenlehre, Math. Ann. 114 (1937), 305–315.

In the other direction, the interpretation goes by realizing that in \mathsf{ZFfin} we can still define \omega and ordinal arithmetic, and the result is a model of \mathsf{PA}. But this is not the inverse. To obtain the inverse, we work in \mathsf{ZFfin}, and compose this with a definable bijection p:V\to\mathsf{ORD}, namely

p(x)=\sum \{2^{p(y)}\mid y\in x\}.

(Of course, V stands here for “V_\omega”, the universe, and \mathsf{ORD} stands for “\omega”.) The paper Kaye-Wong explains all this and provides additional references. This inverse interpretation is due to Mycielski, who published it in Russian:

Jan Mycielski. The definition of arithmetic operations in the Ackermann model. Algebra i Logika Sem. 3 (5-6), (1964), 64–65. MR0177876 (31 #2134).

Mycielski’s result remained “hidden” until attention was drawn to it in the survey

Richard Kaye and Tin Lok Wong. On interpretations of arithmetic and set theory. Notre Dame Journal of Formal Logic 48 (4), (2007), 497–510. MR2357524 (2008i:03075).

(The pdf linked to above has some additional information than the version published at the Journal.)


On the difference between mutual interpretability and bi-interpretability, see this nice post by Ali Enayat to the FOM list. Ali shows there that \mathsf{ZF} and \mathsf{ZFC}, which are obviously mutually interpretable, are not bi-interpretable. The proof goes by noticing that if T and S are bi-interpretable, then the class of groups arising as \mathrm{Aut}(\mathcal M) for \mathcal M\models T coincides with the class of groups of the form \mathrm{Aut}(\mathcal N) for \mathcal N\models S.

However, \mathsf{ZF} admits a model with an automorphism of order 2, and this is not possible for \mathsf{ZFC}. (Additional details are given at the link.)

This naturally leads to the question of whether \mathsf{ZFfin}' and \mathsf{PA} (or, equivalently, \mathsf{ZFfin}) are bi-interpretable, that is, whether a more clever interpretation than Ackermann’s may get rid of the issues that forced us to adopt \epsilon-induction.

That this is not the case, meaning, the theories are mutually interpretable, but not bi-interpretable, was shown recently in

Ali Enayat, James H. Schmerl, and Albert Visser. \omega-models of finite set theory. In Set theory, arithmetic, and foundations of mathematics: theorems, philosophies, Juliette Kennedy, and Roman Kossak, eds. Lect. Notes Log., vol. 36, Assoc. Symbol. Logic, La Jolla, CA, 2011, pp. 43–65. MR2882651 (2012m:03132).

In fact, in Theorem 5.1, they show that \mathsf{ZFfin}' and \mathsf{PA} are not even sententially equivalent. This is a weaker notion than bi-interpretability:

Recall that if we have an interpretation i from T in S, then to each model \mathcal M of S we associate a model i(\mathcal M) of T. That T and S are bi-interpretable gives us interpretations i from T in S, and j from S in T such that for any model \mathcal M of S, the model j(i(\mathcal M)) is isomorphic to \mathcal M, and for any model \mathcal N of T, the model i(j(\mathcal N)) is isomorphic to \mathcal N.

We say that T and S are sententially equivalent when the interpretations i and j can be chosen to satisfy the weaker demand than i(j(\mathcal N)) and \mathcal N are elementarily equivalent, and similarly for j(i(\mathcal M)) and \mathcal M.

Their proof that this property fails for \mathsf{PA} and \mathsf{ZFfin}' ultimately traces back to the fact that no arithmetically definable model of \mathsf{PA} that is non-standard can be elementarily equivalent to the standard model. This is a result of Dana Scott. On the other hand, there are many arithmetically definable \omega-models of \mathsf{ZFfin}'. In section 4 we illustrate a very general method for producing such examples; further details can be seen in the Enayat-Schmerl-Visser paper.


This section was originally an answer at Math.Stackexchange.

Let \mathsf{ZF}^{\lnot\infty} be the theory resulting from replacing in \mathsf{ZF} the axiom of infinity by its negation, but leaving foundation as the axiom stating that any non-empty set x has an element disjoint from x. This theory is not strong enough to prove the existence of transitive closures. This is the reason why \mathsf{ZFfin} is usually formulated with foundation (regularity) replaced by its strengthening of \epsilon-induction, namely the scheme that for every formula \phi(x,\vec y) adds the axiom

\forall \vec y\,(\forall x\,(\forall z\in x\,\phi(z,\vec y)\to\phi(x,\vec y)\to\forall x\,\phi(x,\vec y)).

Over the base theory \mathsf{BT} consisting of (the empty set exists), extensionality, pairing, union, comprehension, and replacement, one can prove that for any x, there is a transitive set containing x iff there is a smallest such set, that is, there is a transitive set containing x iff its transitive closure exists.

Over \mathsf{BT} plus foundation, the scheme of \epsilon-induction is equivalent to the statement \mathsf{TC} that every set is contained in a transitive set.

Unfortunately, as I claimed above, \mathsf{ZF}^{\lnot\infty} cannot prove the existence of transitive closures. To see this, start with (V_\omega,\in), the standard model of \mathsf{ZF}^{\lnot\infty}. Let

 \omega^\star=\{\{n+1\}\in V_\omega\mid n\in\omega\}.

Define F:V_\omega\to V_\omega by


for n\in\omega, and F(x)=x for x\notin\omega\cup\omega^\star. Now define a new “membership” relation by

x\in_F y\Longleftrightarrow x\in F(y)

for x,y\in V_\omega. It turns out that


In fact, \emptyset is not contained in any transitive set in this structure. (Note that \emptyset is not the empty set of this model.)

All this and much more is discussed in Kaye-Wong. The result that \mathsf{ZF}^{\lnot\infty} is consistent with \lnot\mathsf{TC}, and the proof just sketched, are due to Mancini. For additional details, Kaye-Wong refers to

Antonella Mancini and Domenico Zambella. A note on recursive models of set theories, Notre Dame Journal of Formal Logic, 42 (2), (2001), 109-115. MR1993394 (2005g:03061).

In fact, if F:V_\omega\to V_\omega is any bijection, then defining \in^F as above gives a model of \mathsf{ZF}^{\lnot\infty}, except possibly for foundation. Many examples can be produced starting from this idea.


Let me close by mentioning a very recent paper by Ali Enayat,

Ali Enayat. Some interpretability results concerning \mathsf{ZF}. Preprint.

Here, Ali extends the result mentioned above that \mathsf{ZF} and \mathsf{ZFC} are not bi-interpretable. In fact, he shows that no two distinct extensions of \mathsf{ZF} are bi-interpretable. (And some more.)

This should be contrasted with the situation for mutually interpretability. The development of forcing and inner model theory in fact has provided us with a plethora of examples of mutually interpretable such extensions.

Model theoretic stability and definability of types, after A. Grothendieck

It had happened more than once that combinatorial model theoretic dividing lines introduced by Shelah were invented independently in different fields of mathematics. This curious note by Itai Ben Yaacov gives another example of this phenomenon:

We point out how the “Fundamental Theorem of Stability Theory”, namely the equivalence between the “non order property” and definability of types, proved by Shelah in the 1970s, is in fact an immediate consequence of Grothendieck’s “Critères de compacité” from 1952. The familiar forms for the defining formulae then follow using Mazur’s Lemma regarding weak convergence in Banach spaces.

In a meeting in Kolkata in January 2013, the author asked the audience who had first defined the notion
of a stable formula and when, and to the expected answer replied that, no, it had been Grothendieck, in
the fifties.

MALOA “final” conference in Luminy

I had just attended the final MALOA conference “Logic and interactions“.  MALOA was a European network, which is now essentially finished (though the “final” meeting was not the last one, there will be a workshop in Manchester soon). The meeting took place at CIRM in Luminy, a wonderful place not only for mathematical reasons:

Luminy(you can see some more photos here).

As MALOA was about logic, rather than model theory, the topics of the talks were quite diverse, ranging from purely algebraic model theory (definable valuations and height bound in arithmetic nullstellensatz) to proof theory and even “logical description for behaviour analysis in aerospace systems”. Not sure how productive this diversity is, but at least it is entertaining.

Some talks that I found of particular interest are:

  • Talks by Deirdre Haskell  and Chris Laskowski on NIP, VC-density and connections to probability and combinatorics (in general one could safely add to this list some more subjects including set theory). These all are quite fascinating topics which deserve some postings in the future. There are already some examples of importing ideas from combinatorics (e.g. the beautiful (p,q)-theorem of Alon, Kleitman and Matousek) to prove model-theoretic results (e.g. UDTFS for NIP theories), but I believe that many more connections remain to be discovered.
  • Todor Tsankov spoke about generalizations of de Finetti’s theorem. Classical de Finetti’s theorem from probability theory says that a sequence of random variables is exchangeable if and only if it is independent and identically distributed over its tail sigma-algebra. Various multi-dimensional generalizations of this characterization form the so-called exchangeability theory. This theorem can viewed as providing a classification of all probability measures on 2^{\mathbb{N}} invariant under the action of S_{\infty}. Now, in a general situation, given a permutation group G acting on a countable set M, one can’t really hope to give any kind of “classification” of G-invariant measures on 2^{\mathbb{N}} as we are in the context of the general ergodic theory. However, it appears that if the group G is sufficiently large compared to the index set, one can arrive at stronger results. Todor’s approach is to consider oligomorphic groups, i.e. such that the action of G on M^n has only finitely many orbits for each n. These groups are familiar to model theorists as automorphism groups of \omega-categorical structures. Todor provides a classification in the case when the underlying structure has trivial algebraic closure, and gives some promising partial results in the general case. In fact, this subject appears to have a lot to do with model theory. I am involved in a project together with Itai Ben Yaacov, of an abstract model theoretic approach to de Finetti’s theory in terms of the forking calculus, canonical bases and Morley sequences in the context of an arbitrary stable first-order theory, in the sense of continuous logic (which specializes to the classical case considering the theory of [0, 1]-valued random variables equipped with the L^1 metric).

Also I gave what was probably my last talk as a “student”. I spoke about some new results with Pierre Simon and Anand Pillay concerning definable topological dynamics in NIP theories. The slides are available here. We show that notions like definable (extreme) amenability of a definable group, as well as various model theoretic components, are not affected by adding externally definable sets to the picture (that is, passing to a Shelah’s expansion of a model). These facts appear to have some applications to the questions of Newelski on describing G/G^{00} in terms of the so-called Ellis group.

“Model theory and applications to geometry”, Satellite workshop associated with LC 2013, Lisbon, 18th-19th of July 2013

The aim of the workshop is to give the opportunity to the people attending Logic Colloquium 2013 and to all the people interested, to hear more about the recent developments in model theory and its applications and connections to real geometry.

List of confirmed speakers:

Raf Cluckers (Université de Lille and KULeuven)
Georges Comte (Université de Savoie – Chambéry)
Antongiulio Fornasiero (Seconda Università di Napoli)
Isaac Goldbring (UIC)
François Loeser (Université Pierre et Marie Curie)
Chris Miller (Ohio State University)
Giovanni Morando (Università di Padova)
Jean-Philippe Rolin (Université de Bourgogne)

You can find information about travelling and accommodation on the webpage of the workshop:

REGISTRATION: if you are planning on attending this event, please contact

We hope to see you in Lisbon!

The Model Theory Group at CMAF
Universidade de Lisboa

Some counterexamples for forking, dividing, invariance

At the end of my paper with Itay Kaplan “Forking and dividing in NTP2 theories” we had asked several questions, admittedly without giving them much thought. Since 2008 when the paper went in circulation, some people had actually shown interest in those questions. By now two of them are known to have negative answers, one due to Gabriel Conant and one by myself, with very easy examples. I’d like to have them written down for a reference somewhere, so I’ve thought this might be an appropriate place.

good forking

Question 1. (rephrased as more elaborate latex is not available here): Is it true that in a simple theory, every type has a global Lascar-invariant extension.

I recall that a complete global type {p\left(x\right)\in S\left(\mathbb{M}\right)} is Lascar-invariant over a small set {A} if whenever {\phi\left(x,a\right)\in p} and {b} has the same Lascar strong type over {A} as {a}, then {\phi\left(x,b\right)\in p}. Having the same Lascar strong type means that {a} is equivalent to {b} with respect to every equivalence relation with boundedly many classes which is {\mbox{Aut}\left(\mathbb{M}/A\right)}-invariant.

This property is true in the random graph, for example – any type can be extended to a global one without adding any new edges. This is also true in any extensible NIP theory, say in any stalbe theory, any {o}-minimal theory (e.g. real closed fields) or any {C}-minimal theory (e.g. algebraically closed valued fields), as well as in any theory with definable Skolem functions (e.g. {p}-adics). A theory is extensible if every type does not fork over its domain. However, the crucial point is that this property need not be preserved in reducts of the theory, which immediately gives an easy simple counterexample.

Example. Let {T} be the reduct of the random graph given by the ternary relation {R(x,y,z)} which holds if and only if {x\neq y\neq z} and the number of edges between vertices in the set {\left\{ x,y,z\right\} } is odd.


  1. {T} is supersimple of {SU}-rank {1}. Thus, Lascar strong type is determined by the strong type.
  2. For any set {A}, {\mbox{acl}(A)=A}.
  3. All pairs of different elements have the same type over {\emptyset}.
  4. Let {M\models T} and {p\in S(M)}. Then {p} is not Lascar-invariant over {\emptyset}.

Proof: (1) is because {T} is definable on the set of singletons in the random graph, which is supersimple of {SU}-rank 1. Now it is a well-known fact that Lascar strong type is determined by the strong type in supersimple theories.

(2) is easy to see, and (3) is by back-and-forth.

(4) Assume {p} is Lascar-invariant over {\emptyset}, thus invariant over it by (1) and (2). Let {a\models p}, then by (3) either {\models R(a,b,c)} for all {b\neq c\in M} or {\models\neg R(a,b,c)} for all {b\ne c\in M}. In the first case, let {b\neq c\neq d\in M} satisfy {\neg R(b,c,d)}. Then it is easy to see that {\not\models R(a,b,c)\land R(a,c,d)\land R(a,b,d)} — a contradiction. In the other case, take {b,c,d} satisfying {R(b,c,d)} and check that {\not\models\neg R(a,b,c)\land\neg R(a,c,d)\land\neg R(a,b,d)} — a contradiction again. \Box

Thus, by (4) the unique type over the empty set has no global Lascar-invariant extension.

There are various modifications of the question which still make sense, and also one can ask if this property holds in particular algebraic structures of interest. I have some things to say about it, but not this time.

Question 2. “Can similar results be proved for NSOP theories?”

Here “similar results” refers to the main result of the paper, that is that in an NTP2 theory a formula divides over an extension base if and only if it forks over it. Now, Gabe shows in “Forking and dividing in Henson graphs” that it is not the case for the triangle-free random graph. From my own experience, triangle-free random graph seems to demonstrate the failure of all the phenomena which holds for NTP2 theories.

Example. Let {T} be the theory of the triangle-free random graph, and let {b_{0}\neq b_{1}\neq b_{2}\neq b_{3}}. Let {\phi\left(x,b_{0}b_{1}b_{2}b_{3}\right)=\bigvee_{i<j<4}\left(xRb_{i}\land xRb_{j}\right)}.


  1. {xRb_{i}\land xRb_{j}} divides over {\emptyset} for any {i<j}.
  2. {\phi\left(x,b_{0}b_{1}b_{2}b_{3}\right)} does not divide over {\emptyset}.
  3. {\emptyset} is an extension base.
  4. {T} is {\mbox{SOP}_{3}} but {\mbox{NSOP}_{4}}.
  5. However, forking and dividing are the same for complete types.

See Gabe’s article for details and for the general case of Henson graphs.

Still the following part of the question remains open:


  1. Is forking=dividing for complete types?
  2. Is forking equal to dividing for formulas in NTP1 over models?

On the SOPn hierarchy inside NTP2

In this post I would like to popularize a certain question around Shelah’s classification theory of unstable (and even non-simple) theories.

tree property

1. Tree properties of the first and second kind, {\mbox{NTP}_{1}} and {\mbox{NTP}_{2}}

As usual we are in a very big and saturated monster model {\mathbb{M}} of a complete first-order theory {T}.

Definition. A formula {\phi(x,y)} has {\mbox{TP}_{1}} if there is a tree {\left(a_{\eta}\right)_{\eta\in\omega^{<\omega}}} of tuples and {k\in\omega} such that:
{\left\{ \phi(x,a_{\eta|i})\right\} _{i\in\omega}} is consistent for any {\eta\in\omega^{\omega}}
{\left\{ \phi(x,a_{\eta_{i}})\right\} _{i<k}} is inconsistent for any mutually incomparable {\eta_{0},...,\eta_{k-1}\in\omega^{<\omega}}.
Otherwise we say that {\phi(x,y)} is {\mbox{NTP}_{1}}, and {T} is {\mbox{NTP}_{1}} if every formula is.

Definition. A formula {\phi(x,y)} has {\mbox{TP}_{2}} if there is an array {\left(a_{\alpha,i}\right)_{\alpha,i<\omega}} of tuples such that {\left\{ \phi(x,a_{\alpha,i})\right\} _{i<\omega}} is {2}-inconsistent for every {\alpha<\omega} and {\left\{ \phi(x,a_{\alpha,f(\alpha)})\right\} _{\alpha<\omega}} is consistent for any {f:\,\omega\rightarrow\omega}. Otherwise we say that {\phi(x,y)} is {\mbox{NTP}_{2}}, and {T} is {\mbox{NTP}_{2}} if every formula is.

It is an easy exercise to see that each of this properties implies the usual tree property, that is the failure of simplicity. An important insight of Shelah is that failure of simplicity always occurs in one of these two explicit ways.

Theorem. {T} is simple if and only if it is both {\mbox{NTP}_{1}} and {\mbox{NTP}_{2}}.

These tree properties were introduced and studied by Shelah in [1] and [2]. {\mbox{NTP}_{2}} re-appeared on the scene after it arose in the work on forking and dividing in NIP theories in [3] and was studied further in [4] and [5]. {\mbox{NTP}_{1}} re-appears in [6]. I will probably return to these in future postings.

2. On the {\mbox{NSOP}_{n}} hierarchy inside {\mbox{NTP}_{2}}

We recall the definition of {\mbox{SOP}_{n}} for {n\geq2} from [7,Definition 2.5], another hierarchy introduced by Shelah in order to study non-simple theories without the strict order property:

Definition. 1. Let {n\geq3}. A formula {\phi\left(x,y\right)} has {\mbox{SOP}_{n}} if there are tuples {\left(a_{i}\right)_{i\in\omega}} such that:

  • There is an infinite chain: {\models\phi\left(a_{i},a_{j}\right)} for all {i<j<\omega}
  • There are no cycles of length {n}: {\models\neg\exists x_{0}\ldots x_{n-1}\bigwedge_{j=i+1\left(\mod n\right)}\phi\left(x_{i},x_{j}\right)}.

2. {\phi\left(x,y\right)} has {\mbox{SOP}_{2}} if and only if it has {\mbox{TP}_{1}}.
3. {\mbox{SOP}\Rightarrow \mbox{SOP}_{n+1}\Rightarrow \mbox{SOP}_{n}\Rightarrow\ldots\Rightarrow\mbox{SOP}_{3}\Rightarrow\mbox{SOP}_{2}\Rightarrow \mbox{not simple}}.
4. By Shelah’s theorem we see that restricting to {\mbox{NTP}_{2}} theories, the last 2 items coincide.

The following are standard examples showing that the {\mbox{SOP}_{n}} hierarchy is strict for {n\geq3}:

Example. [7, Claim 2.8]
1. The usual example of a theory which is not-simple and {\mbox{NSOP}_{3}} is the model companion of the theory of a parametrized family of equivalence relations with infinitely many infinite classes (see [8]).
2. For {n\geq3}, let {T_{n}} be the model completion of the theory of directed graphs (no self-loops or multiple edges) with no directed cycles of length {\leq n}. Then it has {\mbox{SOP}_{n}} but not {\mbox{SOP}_{n+1}}.
3. For odd {n\geq3}, the model completion of the theory of graphs with no odd cycles of length {\leq n}, has {\mbox{SOP}_{n}} but not {\mbox{SOP}_{n+1}}.
4. Consider the model companion of a theory in the language {\left(<_{n,l}\right)_{l\leq n}} saying:

  • {x<_{n,m-1}y\rightarrow x<_{n,m}y},
  • {x<_{n,n}y},
  • {\neg\left(x<_{n,n-1}x\right)},
  • if {l+k+1=m\leq n} then {x<_{n,l}y\,\land\, y<_{n,k}z\,\rightarrow\, x<_{n,m}z}.

 It eliminates quantifiers.

However, all these examples have {\mbox{TP}_{2}}.
(1) It is immediate that the formula {E\left(x;y,i\right)} has {\mbox{TP}_{2}}.
(2) Let {\phi\left(x,y_{1}y_{2}\right)=xRy_{1}\land y_{2}Rx}. For {i\in\omega} we choose sequencese {\left(a_{i,j}b_{i,j}\right)_{j\in\omega}} such that {\models R\left(a_{i,j},b_{i,k}\right)} and {R\left(b_{i,j},a_{i,k}\right)} for all {j<k\in\omega}, and these are the only edges around — it is possible as no directed cycles are created. Now for any {i}, if there is {c\models\phi\left(x,a_{i,0}b_{i,0}\right)\land\phi\left(x,a_{i,1}b_{i,1}\right)}, then we would have a directed cycle {c,b_{i,0},a_{i,1}} of length {3} — a contradiction. On the other hand, given {i_{0}<\ldots<i_{n}} and {j_{0},\ldots,j_{n}} there has to be an element {a\models\bigwedge_{\alpha\leq n}\phi\left(x,a_{i_{\alpha},j_{\alpha}}b_{i_{\alpha},j_{\alpha}}\right)} as there are no directed cycles created. Thus {\phi\left(x,y_{1}y_{2}\right)} has {\mbox{TP}_{2}}.
(3) and (4) Similar.

This naturally leads to the following question (which I originally asked in [4]):

Question: Is the {\mbox{SOP}_{n}} hierarchy strict for {\mbox{NTP}_{2}} theories? Note that the strictness of the implication {\mbox{SOP}_{3}\Rightarrow \mbox{SOP}_{2}} is open even in general.

Even an example of a non-simple theory with {\mbox{NSOP}} and {\mbox{NTP}_{2}} is missing. In [2, §7, Exercise 7.12] Shelah suggests an example of such a theory as an exercise. However, I couldn’t make sense of the suggested example. Perhaps someone else would?


  1. Saharon Shelah, “Simple unstable theories”, Ann. Math. Logic,
  2. Saharon Shelah, “Classification theory and the number of nonisomorphic models”, North-Holland Publishing Co., 1990
  3. Artem Chernikov and Itay Kaplan, “Forking and dividing in NTP2 theories”, J. Symbolic Logic, 2012
  4. Artem Chernikov, “Theories without the tree property of the second kind”,
  5. Itai Ben Yaacov and Artem Chernikov, “An independence theorem for NTP2 theories”, arXiv:1207.0289
  6. Byunghan Kim and Hyeung-Joon Kim, “Notions around tree property 1”, Annals of Pure and Applied Logic, 2011
  7. Saharon Shelah, “Toward classifying unstable theories”, Ann. Pure Appl. Logic, 1996
  8. Saharon Shelah and Alex Usvyatsov, “More on SOP1 and SOP2”, arXiv:math/0404178

Large cardinals in AECs: back to the road.

(This was originally posted by Andrés Villaveces in avn va – עולם)

The connections between Model Theory and Large Cardinals have recently been given a very interesting boost (and twist) in the work of Will Boney, a graduate student at Carnegie Mellon University.

Boney builds his results on a line originally opened in the papers by Makkai and Shelah ([MaSh:285] Categoricity of theories in L_{\kappa\omega}, with \kappa a compact cardinal — Annals Pure and Applied Logic 47 (1990) 41-97) and Kolman and Shelah ([KoSh:362] Categoricity of Theories in L_{\kappa,\omega}, when \kappa is a measurable cardinal. Part 1 — Fundamenta Math 151 (1996) 209-240) and a follow-up by Shelah ([Sh:472] Categoricity of Theories in L_{\kappa^* \omega}, when \kappa^* is a measurable cardinal. Part II — Fundamenta Math 170 (2001) 165-196): use of strongly compact ultrafilters to get relatively strong “compactness-like” properties, uses of measurable embeddings to get reasonable independence notions.

But Boney seems to go much further: by taking seriously the consequences of the presence of the embeddings and ultrafilters, he provides

  • a Łoś Theorem for Abstract Elementary Classes – under strongly compact cardinals \kappa: closure under (\kappa-complete) ultrapowers of models in the AEC, connections between realization of types inside monster models of the class, in both the ultrapower and the “approximations”
  • a duality (under categoricity at some \kappa – no large card. hypotheses here) between tameness and type shortness. Tameness can be regarded as a strong coherence property over domains, for types, type shortness is the analog but switching the coherence from domains of the types to realizations of the type
  • under a proper class of strongly compact cardinals, nothing less than a version of the Shelah Conjecture (eventual Categoricity Transfer from a Successor, for AECs) – Boney essentially gets tameness and type shortness of such classes; the meat of the proof really is there
  • under smaller large cardinals (measurables, weakly compacts, etc.), he gets different, weaker results, by using ultrapower techniques allowing him to gain control of properties of the class – reflection properties become crucial for model theoretic properties other than categoricity (stability, amalgamation, uniqueness of limit models!)
  • finally, his results open up many questions that puzzle one: what is the actual strength of Shelah’s Conjecture? where can the counterexamples started by Hart and Shelah (and refined by Baldwin and Kolesnikov) be pushed?

Here is also a skeleton for a Seminar Lecture in our Logic Seminar in Bogotá (in Spanish).

Pila-Wilkie Theorem and Its Consequences

Recent years have witnessed some very exciting applications of model theory in various areas of mathematics; among which is the (nth) proof of Manin-Mumford Conjecture (I will refer to it as MMC) by Pila and Zannier ([7]). I will not get into why MMC is important as it is the subject of an article in a number theory blog. I will, however, get into details of how o-minimality helps proving MMC. In vague terms, MMC states that there aren’t too many torsion points on a subvariety of an abelian variety, unless that subvariety itself is close to being an abelian (sub)variety.

Recall that an abelian variety A is a complex torus \mathbb C^n/\Lambda that can be embedded into a projective space \mathbb P^N(\mathbb C) (hence it has the structure of a projective complex variety). With this definition, an abelian variety is obviously an algebraic group which, in turn, is abelian.

There are various ways to state MMC with various strengths. Here is the version I’d like to discuss as it appears in [7].

Theorem 1. Let A be an abelian variety and X a subvariety, both defined over a number field. Suppose that X does not contain a translate of an infinite abelian subvariety of A.  ThenX contains only finitely many torsions of A.

This has been proven first by Reynaud, and later came the proofs many others like Hindry, Buium and Hrushovski (this last one is a model theoretic proof as well). The proof I’ll discuss is due to Pila and Zannier and is quite different in vein, as uses o-minimality.

Without further ado, I’ll start explaining the main tool in that proof.

Pila-Wilkie Theorem

In a nutshell, Pila-Wilkie Theorem ([6]) gives an upper bound on the number of rational points of a set definable in an o-minimal expansion \mathcal R of the real field, in terms of their heights. Well, this is very vague; anything this vague has to be correct. So where is the hardness? Let’s make things more precise.

First of all, the height of a rational number \frac{a}{b} is the maximum of |a| or |b|, provided that \gcd(a,b)=1 and the height of a tuple of rational numbers is the maximum of the heights of its components. This is the most naive notion of height, but it works fine in the case in hand.

We wish to state that given a definable set Y\subseteq\mathbb R^m, the number of rational points in Y of height less than T is bounded by a linear function of  T. But of course this cannot be the case, taking Y to be the whole \mathbb R^m; besides such a result would finish a huge part of number theory.  For this reason we need to take the (semi)algebraic part   Y^{alg} of Y out; which is defined to be the union of all connected infinite semialgebraic sets contained in Y.

With all this notation in hand, here is the precise statement of Pila-Wilkie Theorem (I’ll refer to it as PW).

Theorem 2.  Let Y\subseteq\mathbb R^n be definable and \epsilon>0. There exists a constant c=c(Y,\epsilon)>0 such that for any T>0 we have 

|\{\alpha\in (Y\setminus Y^{alg})\cap\mathbb Q^n: H(\alpha)<T\}|\leq c T^{\epsilon}.

The proof of this result is quite involved, but is a nice combination of the earlier individual works of the authors.  Some years ago, Pila has proven a similar results in dimensions one and two ([1] (with Bombieri)  and [5]) and Wilkie has a result on the integer points of one dimensional sets definable in o-minimal expansions of the real field ([8]).

Proof of MMC

The main difficulty in getting MMC from PW is to determine the algebraic part of a definable set. In general, we don’t really have any way to do that, but in our situation we deal with special kind of definable sets, definable in certain o-minimal structures.

Remember that we want to understand the torsions in a subvariety X of an abelian variety A=\mathbb C^n/\Lambda.  In this case we have the universal covering map:

\pi: \mathbb C^n\to A.

Let \lambda_1,\dots,\lambda_{2n}\in\mathbb C^n be a basis of the lattice \Lambda. If we represent \mathbb C^n with respect this basis, we see that the torsion points of A correspond to the rational points of \mathbb C^n.

In particular, the torsion points of X correspond to the rational points of the set Y'=\pi^{-1}(X). The main virtue of this set Y' is that it is periodic with respect to \Lambda. Therefore we could restrict our attention to its rational points in [0,1]^n.  Now this new set Y:=Y'\cap [0,1]^n is definable in the expansion \mathbb R_{an} of the real field by restricted analytic functions. From now on we play with this set. Before forgetting X, note that the assumption that it doesn’t contain any translate of any infinite abelian subvariety of A translates to Y as it doesn’t contain any translate of any \mathbb C-linear subspace V of \mathbb C^n such that dim V= rank (V\cap \Lambda). Such translates are called torus cosets.

A large part of the paper ([7]) is devoted to the following.

Theorem 3.  Let Z\subseteq\mathbb C^n be an analytic set which is periodic with respect to a full-lattice. Suppose that Z does not contain any torus coset. Then Z does not contain any infinite connected semialgebraic set. This is to say that Z^{alg}=\emptyset.

Note that this result holds for any complex torus, rather than just the abelian varieties. In short, the proof of this is just local (and tedious) analysis.

Now using this we get that (Y')^{alg}=\emptyset and hence Y^{alg}=\emptyset. So applying Theorem 2, we get that for a given \epsilon>0, there is a constant c=c(Y,\epsilon)>0 such that there are at most cT^{\epsilon} many rational points in Y with common denominator T, hence at most that many torsions in X of exponent at most T. This only gives an upper bound and the following result by Masser ([3]) gives the lower bound, finishing the proof of MMC.

Theorem 4. Let A be an abelian variety defined over a number field K. Then there is d>0 and \rho depending only on the dimension of A such that for every torsion point P\in A of order T we have

[K(P):\mathbb Q]\geq d T^{\rho}.

Final remarks

Recently this strategy played the role of a template for the proofs of even more complicated results such as the proof of certain cases of Andre-Oort conjecture by  Habbegger and Pila ([2]). Other than finding the right setting, the main obstacles in such proofs are determining the algebraic part of the appropriate set and finding the substitute for Theorem 4.  Of course, finding the right kind of o-minimal structure is another important step; as done by Peterzil and Starchenko in ([4]).


  1. E. Bombieri, J. Pila. The number of integral points on arcs and ovals. Duke Math. J. 59 (1989), no. 2, 337–357.
  2. P. Habegger, J. Pila. Some Unlikely Intersections Beyond André-Oort. Compositio Math. 148 (2012).
  3. D. W. Masser. Small values of the quadratic part of the Néron-Tate height on an abelian variety. Compositio Math., 53(2):153–170, 1984.
  4. Y. Peterzil and S. Starchenko. Definablity of restricted theta functions and families of abelian varieties. arXiv:1103.3110.
  5. J. Pila. Rational points on a subanalytic surface.  Ann. Inst. Fourier (Grenoble) 55 (2005), no. 5, 1501–1516.
  6. J. Pila and A. Wilkie. The rational points of a definable set. Duke Math. J. 133 (2006), no. 3, 591–616.
  7. Jonathan Pila and Umberto Zannier. Rational points in periodic analytic sets and the Manin-Mumford conjecture. Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl., 19(2):149–162, 2008.
  8. A. J. Wilkie. Diophantine properties of sets definable in o-minimal structures. J. Symbolic Logic 69 (2004), no. 3, 851–861.

When are bi-embeddable structures isomorphic?

Let (\textbf{K}, \textbf{Mor}) be a class of algebraic structures \textbf{K} with a distinguished class of morphisms \textbf{Mor} between them. We say that the class has the Schröder-Bernstein (SB) property if any two elements of \mathbf{K} which are bi-embeddable with respect to \mathbf{Mor} are isomorphic.

Question 1: How can we tell when (\textbf{K}, \textbf{Mor}) has the SB property?

So, for instance, when \textbf{K} = \textbf{Sets} and \textbf{Mor} is all injective functions, then this class will have the SB property by the usual Cantor-Schröder-Bernstein theorem. The reader should try to think of examples which do not have the SB property, and check whether or not his or her favorite categories have SB.

The model-theoretic angle here is that we will concentrate on the case where \textbf{K} is all of the models of some first-order theory T and \textbf{Mor} is the class of all elementary embeddings between them (that is, the maps which preserve the truth of all first-order formulas). If this is true, then we will say that the theory T has the SB property.

But why is the Schröder-Bernstein property interesting? One could easily come up with a list of other strange, never-before-studied properties that some theories have and others do not, so what makes the Schröder-Bernstein property worth studying?

In this post I’ll try to explain a bit about why I am interested in this property as well as give some partial answers to Question 1 in model theory.

The naturality of considering the Schröder-Bernstein property

When many people first hear about it, the interest in studying the Schröder-Bernstein property seems evident. Others are skeptical. Partly for the skeptics, I’ll explain what made me first start thinking about it (within model theory).

A prime model of a theory T is a model M of T which is elementarily embeddable into every other model of T. Some theories have prime models, others do not. Naturally if one has a prime model of a theory T, one would like to know whether it is unique (up to isomorphism).

“Theorem 2” If T has a prime model, then it this prime model is unique.

Here is the “proof” of this statement that occurred to me as a grad student, and must have occurred to hundreds of other students before me: let M and N be two prime models of T. Then, by definition, M is elementarily embeddable into N, and also N is elementarily embeddable into M. So they must be isomorphic, right?

But this argument only works if we somehow know that T has the SB property, and (to me, as I write this) it seems no easier to prove that a given theory has the SB property than to show that it has a unique prime model. I now know about the famous, famously difficult result of Shelah that totally transcendental theories have unique prime models, as well as his examples showing that there exist theories with nonisomorphic prime models (see Shelah, “On uniqueness of prime models,” Journal of Symbolic Logic 44 (1979) no. 2, 215-220). For me, trying to repair this too-hasty argument for “Theorem 2” is what got me thinking about SB for theories.

Schröder-Bernstein and classifiability

Probably the very first time I thought about the Schröder-Bernstein property (in a totally different context than logic) was while reading Irving Kaplansky’s book Infinite Abelian Groups many years ago. In this book, Kaplansky at one point asks the following question (to paraphrase): say we have a system for classifying all objects of a certain type — say, abelian groups — up to isomorphism, using some sorts of “invariants.” How do we know if this system is really useful or enlightening? After all, we always have the “trivial classification system” of associating each object to its isomorphism class! This is silly, but the interesting question is: how do we know that our classification system is really better than the trivial classification system?

In a neat bit of parallel evolution (I doubt Kaplansky was influenced much by Shelah, or vice versa), Kaplansky basically answers this by saying that we should come up with test questions to which any good classification scheme should give a definite “Yes” or “No” answer. Crucially, these test questions should be phrased in terminology which is independent of our supposed classification system, so that we know that the system can answer “real questions” about the classified objects themselves as opposed to merely solving problems that arise from the internal complexity of our classifying scheme.

One of Kaplansky’s test questions was:

Question 3: If G and H are two abelian groups such that each one is isomorphic to a direct summand of the other, then are G and H isomorphic?

(Clearly this is a case of Schröder-Bernstein with \textbf{Mor} being all injections whose images are direct summands. If we instead considered all injective group maps, then Question 3 would have many simple counterexamples, as the reader should verify.)

Somewhat surprisingly, the answer to Question 3 is “No,” with a counterexample so complex that Kaplansky himself could not find it. But the details of this do not matter to us here. What’s important to note is that we can know that Question 3 is true for certain classes of abelian groups using classification schemes.

For instance, as Kaplansky explains in his book, the answer to Question 3 is “Yes” if G and H are both countable abelian torsion groups (that is, for each element g \in G, there is some n \in \mathbb{N} such that n g = 0, but there is not necessarily a uniform bound on n for all g in G). This is immediate using Ulm invariants. Another example, much simpler to explain in a blog post, is if G and H are finitely-generated abelian groups: then each one is a direct sum of indecomposable summands that are isomorphic to either \mathbb{Z} or \mathbb{Z} / p^k \mathbb{Z} for some prime power p^k, and we just have to count the number of summands of each type to prove the result.

Thinking about examples like the ones above leads to the following intuition, partly justifying the study of the SB property (and I won’t try to make this too precise here):

Idea 4: If the objects in \mathbf{K} can be classified by a bounded set of cardinal-number invariants which are preserved by embeddings in \mathbf{Mor}, then we should expect (\mathbf{K}, \mathbf{Mor}) to have the SB property.

These cardinal-number invariants could be things like the number of direct summands belonging to a certain isomorphism class, or they could be dimensions (as for a vector space).

Famously, Shelah’s primary test question (which he used to help justify the development of his “classification theory,” now usually called stability theory) was to compute all possible spectrum functions I(T, \kappa) for a complete theory T, where I(T, \kappa) is the number of isomorphism types of models of T with cardinality \kappa. But we could ask whether Shelahian stability theory is also useful for answering other test questions.

I’ll end this section with a couple of very naïve test questions of my own which I think that the classification theory for models should eventually be able to give definite answers to. But even though I have been thinking off and on about them for years now, I still cannot answer them except in certain special cases.

Question 5: Suppose T is a first-order theory without the Schröder-Bernstein property. Is there an infinite collection of models of T which are pairwise elementarily bi-embeddable but pairwise non-isomorphic?

Question 6: Suppose T is a complete first-order theory with the Schröder-Bernstein property. Is it true that any expansion of T by new constant symbols also has the Schröder-Bernstein property?

Schröder-Bernstein and stability theory

Now I’ll explain a few cases where I know that the Schröder-Bernstein property does not hold. Before giving precise statements, here is another vague generality to motivate the discussion:

Idea 7: If a class of structures is not easily classifiable, then often it is because it contains two objects A and B which “look very similar” and yet are not isomorphic. If A and B “look sufficiently similar,” then often they will be bi-embeddable, thus giving a counterexample to the Schröder-Bernstein property.

So we might expect that those theories which Shelah calls unclassifiable to not have the SB property. And it turns out that this is true! One result from my thesis was:

Theorem 8: Suppose that T is a countable first-order theory. If T is not classifiable (either unstable, or not superstable, or DOP, or else OTOP), then T does not have the SB property. If T is unstable, DOP, or OTOP, then T has an infinite collection of pairwise-bi-embeddable, pairwise nonisomorphic models.

Maybe I will try to explain what this “classifiability” is all about in a future post; it is a fascinating and complicated subject. But a simple example of an unclassifiable (unstable) theory is the theory of all dense linear orderings without endpoints, and it is a fun exercise to construct counterexamples by hand (HINT: by quantifier elimination, any injective order-preserving map will be elementary).

A simple example of a theory which is classifiable yet still does not have the SB property is the theory of all equivalence relations with infinitely many classes, each of which is infinite. For this example, the word “classifiable” naïvely makes sense because any model is “classified” simply by a list of cardinal numbers (that is, the cardinalities of each of the equivalence classes). Again, it is interesting to construct one’s own counterexamples to the SB property here, and to compare this with Idea 4 above (note I was very careful to add the word “bounded,” and in this example there is no fixed prior bound on the length of the list of cardinals classifying a model).

Further reading

I’ll wrap up this post now since it’s already longer than I planned, and I could try to develop various of the ideas here in more detail in future posts. But I should at least give some references to precise theorems and proofs for those who are interested.

Before my thesis (finished in 2007), I am not aware of anyone else attempting a systematic study of the SB property within model theory other than T. A. Nurmagambetov in a pair of articles from the 1980’s (“The mutual embeddability of models” (in Russian), from The Theory of
Algebraic Structures, Karagand. Gos. Univ., 1985, pp. 109–
115, and “Characterization of \omega-stable theories with a bounded number of dimensions,” Algebra i Logika 28 (1989), no. 5, 388–396.) In these articles, Nurmagambetov only considered totally transcendental theories, and he showed that in this case the SB property is equivalent to nonmultidimensionality.
An interesting variation on Idea 7 comes from Shelah’s paper “Existence of many L_{\infty,\lambda}-equivalent, non- isomorphic models of T of power \lambda” (Annals of Pure and Applied Logic 34 (1987), no. 3, 291-310). Note that L_{\infty,\lambda} is a very powerful non-first-order logic (allowing infinitely long conjuctions and disjunctions, among other things), so two models that are L_{\infty,\lambda}-equivalent really do look quite similar. Shelah establishes the property in the title for any theory which is not superstable or which has DOP or OTOP. However, this does not, as far as I know, have obvious consequences for bi-embeddability.

In joint work with Chris Laskowski, we treated special cases of the SB property for theories in the papers “The Schröder-Bernstein property for weakly minimal theories” and “The Schröder-Bernstein property for a-saturated models.” The first paper only considers weakly minimal theories, but we give a complete characterization of the SB property in this context in terms of weak orthogonality of types, and we show that given a failure of SB for one of these theories, one can construct and infinite family of pairwise bi-embeddable, pairwise non-isomorphic structures. The second paper linked above was written with a broader audience in mind (especially the first section) and it gives a complete answer to when a theory has an expansion by constants with the SB property as well as a partial answer to when the sufficiently saturated models have the SB property.

For me, the way in which one can manage to prove such results is much more interesting than merely listing the statements of the theorems themselves… but maybe more about this later!