I want to suggest that our languages need to capture a much wider variety of data sources, and to suggest how this can be achieved by emphasizing type structure in the design of an "information interchange" language.

Jeff Ullman has rightly suggested that we should standardize on a "core" language that all I3 agents should speak. He proposes relational algebra, in the guise of non-recursive datalog, as the language and lines of text with tab-separated fields as the data format for relational data. This proposal is fine if the data to be exchanged is in (or can easily be converted to) relational format, but in many cases it is not.

There is a huge variety of data formats that have already been designed for data archive exchange. I would guess that more data -- especially scientific data -- is maintained in these formats than is held in conventional databases, and I think it would be appropriate to consider this large body of data in any discussion of information interchange.

At first glance these formats appear ad hoc, but closer inspection reveals that they are based on a remarkably small number of types. Typically they specify a number of base types, and some subset of the types one uses in programming languages and consequently in object-oriented databases: record and variant types, and the collection types -- sets, multisets, lists, and arrays. Since there is a great deal of commonality between the basic operations on these collection types, languages based on these operations have a simple syntax and a semantics that naturally extends that of the relational algebra.

There are a number of working implementations of such languages: O2SQL (the QL for O2), CPL at Penn, and Aqua at Brown. I should add, by the way, a plug for our own work on CPL.
It is currently being used in the Genome project for integrated querying, through generic interfaces, across relational DBMSs together with ASN.1, ACeDB, and other data sources for which relational interfaces have been tried and found inadequate. Thus, in our experience, a small set of primitives based on type structure and the use of patterns (in the spirit of, for example, ML) combine to provide concise and conceptually simple programs for heterogeneous data restructuring.

To summarize the advantages of this approach:

a. By adopting it one has _generic_ interfaces for a large number of data formats. One does not have to write conversion packages or "wrappers" for each new data source.

b. Restricted to sets of records, these languages express precisely the relational algebra. So this really is an extension of Jeff's proposal.

c. The syntax is a relational calculus look-alike. A SELECT ... FROM ... WHERE syntax is common.

d. Optimization of these languages is becoming well understood and extends the known techniques for relational query languages.

I should also mention the disadvantages:

-- These languages do not do everything (a consequence of point (b) above). More importantly, extending the functionality for arrays beyond that for lists is still a research problem.

-- There are several proposals for surface syntax (though a lot of commonality for abstract syntax). If you held a gun to my head I would say ODMG's OQL proposal minus the nesting constructs plus pattern matching, primitives for variant types, and some form of function definition. This brings it roughly in line with our CPL.

-- This proposal says nothing about the transfer of "knowledge" as opposed to data.

The bottom line is that by adopting this somewhat extended language one has a much more flexible medium in which the problem of querying and, more importantly, restructuring data is greatly simplified.
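
To make the type-structure argument a little more concrete, here is a rough sketch in Python. It is not CPL, O2SQL, or OQL syntax, and it is not how any of those systems is implemented. It only illustrates the ideas above: records, a variant type, a nested collection, a restructuring step driven by case analysis on the variant, and a comprehension that is ordinary select-project-join when restricted to sets of flat records. All names (`Entry`, `Citation`, `GeneName`, `MarkerId`, `flatten`, `join_titles`) and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple, Union


# Variant type: a locus is EITHER a gene name OR a numeric marker id.
@dataclass(frozen=True)
class GeneName:
    name: str


@dataclass(frozen=True)
class MarkerId:
    ident: int


Locus = Union[GeneName, MarkerId]


# Record type with a nested, list-like field, so it is not a flat relation.
@dataclass(frozen=True)
class Entry:
    accession: str
    locus: Locus
    keywords: Tuple[str, ...]


# Flat record, as a relational source might supply it.
@dataclass(frozen=True)
class Citation:
    accession: str
    title: str


def describe(locus: Locus) -> str:
    """Case analysis on the variant, standing in for ML-style patterns."""
    if isinstance(locus, GeneName):
        return "gene:" + locus.name
    return "marker:" + str(locus.ident)


def flatten(entries: Iterable[Entry]) -> Set[Tuple[str, str, str]]:
    """Restructure nested entries into flat (accession, locus, keyword) rows."""
    return {
        (e.accession, describe(e.locus), k)
        for e in entries
        for k in e.keywords
    }


def join_titles(
    entries: Iterable[Entry], citations: Iterable[Citation]
) -> Set[Tuple[str, str]]:
    """Select-project-join over two sets of records, as a comprehension."""
    return {
        (e.accession, c.title)
        for e in entries
        for c in citations
        if e.accession == c.accession
    }


# Hypothetical sample data, invented purely for illustration.
ENTRIES = frozenset({
    Entry("A1", GeneName("geneX"), ("tumor", "repair")),
    Entry("A2", MarkerId(17), ("map",)),
})
CITATIONS = frozenset({
    Citation("A1", "A hypothetical title"),
    Citation("A3", "Another hypothetical title"),
})
```

The same comprehension form serves both functions: `join_titles` stays within sets of flat records, while `flatten` reaches into a nested collection and a variant, which is the sense in which such a language extends the relational algebra rather than replacing it.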