What is an instance in information modeling?

Pieter Wisse

1. Introduction

This is partly a review of the article 'Emancipating Instances from the Tyranny of Classes in Information Modeling' by J. Parsons and Y. Wand (2000). I have concentrated on their concept of instance through which I believe the authors are addressing the right problems. However, though the modeling approach they have derived from it improves flexibility when compared to other approaches, their instance concept's potential for securing the right solutions remains limited. From the perspective of my own work on modeling theory, I'll then proceed to argue for emancipation from the tyranny of behaviorally absolute instances.

Parsons and Wand start by challenging the priority of the concept of class for information modeling. They label such priority "the assumption of inherent classification." Their alternative design gives priority to instances instead. It ends up separating instances from subsequent ways of (sub)grouping them. On account of this division between instances and classes they call it "a two-layered approach to information modeling." They claim their approach "is consistent with ontological and cognitive principles."
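A minimal sketch may make the two layers concrete. It is an illustration under stated assumptions, not Parsons and Wand's implementation: the names Instance, ClassDefinition and members, and the sample properties, are hypothetical and do not appear in their article.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Instance:
    """First layer: a thing with properties, recorded without any class."""
    identifier: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass(frozen=True)
class ClassDefinition:
    """Second layer: a grouping defined afterwards, in terms of properties."""
    name: str
    required_properties: FrozenSet[str]

    def admits(self, instance: Instance) -> bool:
        # Assumption: membership means possessing every required property.
        return self.required_properties.issubset(instance.properties)


def members(cls: ClassDefinition, instances: List[Instance]) -> List[str]:
    """Identifiers of the instances that the class definition admits."""
    return [i.identifier for i in instances if cls.admits(i)]


# Hypothetical sample data: instances exist before any class is defined.
instances = [
    Instance("i1", {"name": "Ann", "salary": 4000}),
    Instance("i2", {"name": "Bob", "student_number": "s-17"}),
    Instance("i3", {"name": "Cy", "salary": 1000, "student_number": "s-18"}),
]
employee = ClassDefinition("Employee", frozenset({"name", "salary"}))
student = ClassDefinition("Student", frozenset({"name", "student_number"}))

print(members(employee, instances))  # ['i1', 'i3']
print(members(student, instances))   # ['i2', 'i3']
```

In this sketch, instance i3 falls under both groupings without any change to the first layer. Note that the property names themselves still have to be fixed in advance for the grouping to work.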

I strongly agree with Parsons and Wand that the way the class concept is predominantly integrated in (approaches to) information modeling is problematic. I disagree, though, with their favored solution. What I especially seek to demonstrate is that they only refer to a particular set of "ontological and cognitive principles." As their principles of choice are lacking in requisite variety for information modeling, attention must first of all be directed to an improved ontology.

2. Balance between pre- and postcoordination

I propose that a key for analyzing the two-layered modeling approach, and for understanding why it cannot fully deliver on its laudable-enough promises, lies with Parsons and Wand's initial terminology. For my part, I believe "inherent classification" doesn't optimally capture the target of their criticism. Let me substitute a priori classification for it. The second layer in their modeling theory is then easily recognized as facilitating a posteriori classification.

Now the question can be put more succinctly as follows: Is it possible to shift from radical a priori to radical a posteriori classification? Parsons and Wand seem to say it is. It requires them to declare the a priori inventory of instances to lie outside classification. For "[t]he first layer represents existence of things with properties, independent of any classes to which the things may belong."

Yes, I acknowledge the attraction of this assumption. I believe it is mistaken nonetheless. For illustration, I don't have to point out the illusory nature of the Kantian Ding-an-sich. A more immediately practical background is provided by a discipline long familiar with classification, and with balancing a priori with a posteriori classification. In library science, the terminology of pre- and postcoordination is usually applied (Foskett, 1969). The instance traditionally at stake for a library is a document. Of course it should be possible to retrieve the document from the collection upon request. How such requests may be processed is prepared by so-called cataloguing. The organization, or structure, of a catalogue reflects a trade-off between request patterns, quantity of documents, qualitative variety, inventory resources, cataloguing technology, et cetera. It is important to realize that postcoordination by definition involves (the mechanism of) classification, too. In postcoordination, Foskett explains (p 86), "we can index documents by terms denoting all the individual concepts present, but use a physical form that permits us to coordinate at the time of searching." At the minimum, postcoordination requires a classification of "individual concepts." In fact, Parsons and Wand acknowledge the need for controlling 'properties' of 'things' but downplay its significance: "Agreement must be reached only on instances and properties." As library science shows, in modeling there's no escaping classification. Parsons and Wand overplay their hand when they insist that "[t]hings and their properties exist independently of any classification." Complete independence is impossible to achieve. However, library science can also teach the principles behind balancing a priori with a posteriori classification (also read: pre- and postcoordination, respectively).

An instance (a thing, a particular, an object) is modeled with properties (including its relationships with other instances). At the next level of decomposition it is of course possible to treat a specific property as an instance, and so on. It is when instances are required to be grouped that one or more relevant 'property classes' must be assumed for (post)coordination. What modelers should possess is an awareness of balancing factors. Absolutizing instances a priori is certainly not the way to overcome "the tyranny of classes."
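Postcoordination in Foskett's sense can be sketched as a simple term index that is only combined at the time of searching. The sketch below is an illustration, not a description of any actual cataloguing system; the function names build_index and coordinate, and the sample documents and terms, are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_index(documents: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Index every document under each individual term assigned to it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, terms in documents.items():
        for term in terms:
            index[term].add(doc_id)
    return dict(index)


def coordinate(index: Dict[str, Set[str]], *terms: str) -> Set[str]:
    """Combine terms at search time: documents that carry all given terms."""
    if not terms:
        return set()
    # A term outside the agreed vocabulary has no postings, so nothing matches.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical collection; the term vocabulary is fixed when indexing.
documents = {
    "doc1": {"classification", "libraries"},
    "doc2": {"classification", "databases"},
    "doc3": {"databases", "identity"},
}
index = build_index(documents)

print(coordinate(index, "classification", "databases"))  # {'doc2'}
print(coordinate(index, "classification", "ontology"))   # set()
```

The grouping of documents happens a posteriori, at query time. The vocabulary of terms, however, was settled a priori when the documents were indexed: a request for a term that was never assigned, such as "ontology" here, cannot be coordinated at all.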

A relativistic view of a priori and a posteriori classification in fact saves much of what Parsons and Wand claim. The problems they have identified, however, are not radically solved. Rather, opportunities for changing the relative contributions of pre- and postcoordination may be taken to define a solution space. So, for example, "the multiple classification problem" is never completely solved, for a minimum of property classification is always required. Without it, postcoordination cannot succeed, because the necessary and sufficient precoordinated basis for grouping instances is missing. Changing relative contributions can certainly mitigate the problem. Exactly the same may be said about "the view integration problem," "the schema evolution problem" and "the interoperability problem." And as I've tried to demonstrate in this section, those problems (see note 1), and their preferred solutions, are not at all novel to the application of digital information technology.

3. Ontological innovation for requisite variety in modeling

Moving along the spectrum from pre- to postcoordination equals giving instances priority at deeper levels of modeling. At the same time, this move may compromise the integrity of the instance assumed by Parsons and Wand. In order to substantiate my point, I'll first state my understanding of the concept of instance as applied by Parsons and Wand. They refer to M. Bunge and his ontological framework. From it, they derive as a first postulate that "[t]he world is made of things that possess properties," adding as a principle that "[n]o two things possess exactly the same set of properties."

Such atomism dates back at least to ancient Greece. A revival was inaugurated by logicians such as Frege, Russell and Wittgenstein (in his early work). It is the foundation of positivistic science. What it implies is that a thing, an object, or whatever, both exists independently and can be known as such. And it is the unique set of properties that, say, certifies the 'thing' in its absolute individuality. Now there is an essential assumption about the set of properties that usually remains implicit. (I admit having been tempted to label it inherent.) Keeping my notion informal, all properties appear aligned. They are consistent among themselves.

Calling upon yet another discipline, social psychology informs us that different behaviors of one and the same 'thing' are not necessarily consistent. On the face of 'things,' one behavior may even contradict another. The problem is resolved by introducing a differentiating variable, usually called situation.

The concept of situation is absent from Parsons and Wand's approach. But it is precisely from a situational perspective that the limits of their approach show, even when changing the claim from radical solution to problem mitigation. My point is that what they call "problems" may often not be problems at all. I mean to say that a thing behaving differently in different situations doesn't make that thing problematic. Behavioral differentiation might instead reflect the optimum in adaptation. An ontology, like Bunge's for example, should not foreclose situational differentiation but allow and even facilitate its modeling.

Again, struggling with the relationship between identity and difference is not a recent effort. Aristotle already inquired into the issue, as did Hegel. More recently, Derrida attempted to develop a play of philosophy with the concept of difference. Directed at information modeling, and applying terms such as role and context, G.M. Nijssen, J.F. Sowa and T. Halpin, for example, have suggested approaches (see note 2). They all fall short of expectations, however, because one particular assumption remains unchallenged. Parsons and Wand are no exception. Too hastily, they turn to outlining a solution. But it is the problem that first has to be recognized more clearly and fundamentally. Parsons and Wand overlook it, insisting as they do "that in the two-layered model, every instance is identified universally. This agrees with the notion of object identity." They are right on identity but, as modeling theorists habitually still seem to do, fail to account for equally universal difference.

In my design for a theory of information modeling (see note 3), I've acknowledged the lack of behavioral integrity of an instance as a whole. But, then, how to reconcile the two irrefutable facts of identity and difference? I have removed the status of logical atom from the instance (object, etc.). In a situational approach to information modeling there are still logical atoms, though. What counts as atomic is an instance's behavior as preconditioned (also read: precoordinated) through a situation. In addition, situation, object (also read: instance) and behavior have become relative. For what should be considered an object cannot be absolutely fixed. It is a matter of focus, or particular interest. It is not only "classes [that] serve a utilitarian role," as Parsons and Wand maintain. The choice of instance, i.e., situated behavior of an object, is just as essentially purposeful, utilitarian or pragmatic (see note 4).
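The idea of situated behavior as the atomic unit can be given a rough, deliberately simplified sketch. It is not the metapattern notation, and it fixes objects and situations as plain strings, which ignores their relativity as argued above. The class SituatedStore, its methods and the sample data are all hypothetical.

```python
from itertools import count
from typing import Dict, List, Tuple


class SituatedStore:
    """Keeps behavior per (object, situation) pair instead of per object."""

    def __init__(self) -> None:
        self._next_id = count(1)
        # (object, situation) -> identifier of that situated behavior
        self._ids: Dict[Tuple[str, str], int] = {}
        # identifier -> properties valid in that situation only
        self._behaviors: Dict[int, Dict[str, object]] = {}

    def situate(self, obj: str, situation: str, **properties: object) -> int:
        """Record properties of an object in a situation; return the identifier."""
        key = (obj, situation)
        if key not in self._ids:
            self._ids[key] = next(self._next_id)
            self._behaviors[self._ids[key]] = {}
        identifier = self._ids[key]
        self._behaviors[identifier].update(properties)
        return identifier

    def behavior(self, obj: str, situation: str) -> Dict[str, object]:
        """Properties of the object as preconditioned by the situation."""
        return dict(self._behaviors[self._ids[(obj, situation)]])

    def situations_of(self, obj: str) -> List[str]:
        """Identity across difference: all situations the same object appears in."""
        return sorted(s for (o, s) in self._ids if o == obj)


store = SituatedStore()
at_work = store.situate("person-1", "employment", role="manager")
at_club = store.situate("person-1", "sports club", role="trainee")

print(at_work != at_club)                        # True
print(store.behavior("person-1", "employment"))  # {'role': 'manager'}
print(store.situations_of("person-1"))           # ['employment', 'sports club']
```

The two values for role would contradict each other in a single, situation-free property set. Here each situated behavior gets its own identifier, while the shared object key still connects them.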

I won't pretend to adequately summarize here in a few sentences something as complex and literally fundamental as an ontology. Listed below as references are several recent publications in which I've developed a theory of information modeling grounded on both instance identity and situated behavioral diversity (see note 5). An ontology with requisite variety is the single most important tool information modelers need to address real problems and create real opportunities.

Acknowledgement

I am grateful that Professor Wand has kindly made his article available to me. I hope he and Professor Parsons appreciate any criticism of their work that they feel I may have expressed here. I offer it in the spirit of constructive theoretical development (and expect to be treated in kind by anyone taking issue with my views).

Notes

1. What I miss in Parsons and Wand's inventory of problems is the guarantee of an audit trail. From their simple idea of "reclassification" I rather have the impression they are unaware of issues of historical information integrity.
2. The sheer volume of relevant publications makes it increasingly impossible to be anywhere near exhaustive when referring to the literature. In naming these authors and their work I generally acknowledge other contributions.
3. See note 2 for a reason why I don't put an absolute claim on originality for my theory of information modeling. I've tried to conduct due diligence, meaning that I continue to orient myself toward work by others in information modeling. So far I haven't come across a synthesis such as I propose as metapattern and subjective situationism.
4. The proposal by Parsons and Wand "that implementations of the instance-based model should use only one global identifier for every instance" is therefore still too crude. Global identifiers should instead be allocated to each and every situated behavior of an object.
5. Additional information is available from http://www.informationdynamics.nl and http://www.informationdynamics.nl/pwisse.

References

Foskett, A.C. The Subject Approach to Information. Originally published in 1969. Fourth edition, Bingley/Linnet, 1982.
Halpin, T. Information Modeling and Relational Databases: From Conceptual Analysis to Logical Design. Morgan Kaufmann, 2001.
Nijssen, G.M. Universele Informatiekunde. PNA Publishing, 1993.
Parsons, J., and Y. Wand. Emancipating Instances from the Tyranny of Classes in Information Modeling. In: ACM Transactions on Database Systems, Vol. 25, No. 2 (pp 228-268), June 2000.
Sowa, J.F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks/Cole, 2000.
Wisse, P.E. Multicontextual paradigm for object orientation: a development of information modeling toward fifth behavioral form. In: P.E. Wisse, Informatiekundige ontwerpleer, Ten Hagen Stam, 1999. The full text is also accessible at http://www.informationdynamics.nl/knitbits/htm/multicontextual_paradigm.htm.
Wisse, P.E. Metapattern: context and time in information models. Addison-Wesley, 2001.
Wisse, P.E. Metapattern: information modeling as enneadic dynamics. In: PrimaVera working paper series, nr 2001-04, Universiteit van Amsterdam, 2001. The full text is accessible at ../pdf/pv-2001-04.pdf.
Wisse, P.E. Semiosis & Sign Exchange: design for a subjective situationism, including conceptual grounds for business information modeling. Information Dynamics, 2002.
Wisse, P.E. The ontological atom of behavior: toward a logic for information modeling beyond the classics. In: PrimaVera working paper series, nr 2002-05, Universiteit van Amsterdam, 2002. The full text is accessible at ../pdf/pv-2002-05.pdf.

Dr ir Pieter Wisse is president of Information Dynamics (Netherlands), an independent company involved in research & development. Contact at pieter@wisse.cc.