As of January 2004 au.riversinfo.org is archived under Precision Info
Table of Contents: Environmental Mapping Systems  Locationally Linked Databases
This chapter establishes that it is not possible to write a general algorithm^{1} for analysing interactions between biological organisms and properties recorded at various locations described by a locationally linked database; the problem can not be solved, even approximately. In order to establish this fact this chapter demonstrates that any attempt to write such an algorithm meets with intractable problems. This result has substantial implications for analysing pattern in such spatial data. The arguments presented are comprehensive, exhaustive, and at times abstract. A summary of the core arguments of this chapter is given below.
Since it is not generally possible to use a stochastic process operating over the set of location elements to test a hypotheses regarding the effect of property p_{i} on an organism, one has to know how the particular organism will respond to coded properties prior to writing the algorithm for determining interactions between the organism and coded properties.
The distribution of organisms is frequently limited by conditions that are sub optimal rather than lethal. A transplant experiment may be necessary to account for historical factors responsible for determining the distribution of an organism. The Supplement to this chapter explains the meaning of some technical terms used in this chapter.
To formulate information in a scientific manner it is essential to have basic facts stated in numerical terms. However, it is not necessary to enumerate each unit in the universe in order to arrive at an acceptable estimate for the total. A carefully designed sample may provide the necessary information (Raj; 1968). It is possible to enumerate only those characteristics of entities for which there exists a correspondence between the characteristic and the concept of Cardinality. It is reasonable to write an algorithm for analysing interactions between entities distributed at various locations described by a locationally linked database and physical properties recorded within the database only when it is known exactly what can and what can not be counted, or represented by numbers, and how the said entities can be analysed. In many environmental management applications distributed entities of interest are biological organisms and the recorded properties describe physical characteristics present at the section of the environment described by each location element.
According to the Collins Dictionary of Mathematics (1989) the concept of space embraces a set of objects to which may be attached associated attributes, together with relation(s) defined on that set. A straight line defined between pairs of objects is one such relation.
A spatial examination of an entity must address itself to either analysing a variable that assumes values at different locations and/or analysing properties assigned to a particular spatial location. This entails a statement of location with respect to an imposed coordinate system, physical properties and topological characteristics at the stated location. Topological characteristics are significant because they address the spatial nature of geographic descriptions and specifically deal with issues associated with connectedness of points and areas.
The number of elements any finite adding machine can address is finite, therefore, the method of digital representation must be finite. A finite set of location elements, denoted by Å, can be constructed to represent all distinct location elements addressable to a locationally linked database. Without loss of generality the set Å, of location elements, can be said to have M^{2} members.
In order to represent the real world using the finite set of location elements it is necessary to propose a map, an ordered pair, made up of a transformation with a given codomain, that takes a location in the real world (W) and maps the location to a particular location element addressable to the database. To effect reasonable analysis such a map would have to be well defined.
World  f:W ® Å  Database 
Conclusions derived about the real world W, using the representation Å, can be extended to the real world only if the transformation f:W®Å is invertible, at least approximately. The analysis that follows is independent of whether or not such a map exists and therefore establishes that even if it is possible to construct an f:W®Å that is exactly invertible it is still not possible to write a general algorithm for analysing interactions between biological organisms and properties recorded at various locations described by a locationally linked database.
Scales of measurement based on elements of the Real Number Line are possible only because there exists an isomorphism between what can be done with measurable properties of objects and what can be done with numbers (Stevens; 1946). When measuring characteristics of objects, experimental operations are performed for classifying (determining equality), for rankordering, and for determining when differences and when ratios between the aspects of objects are equal. The empirical operations performed and the characteristics of the property being measured determine the type of measuring scale attained (Stevens; 1946).
Table 2.1 describes the group structure and permissible statistics for each of 4 scales of measurement that can be erected when using a representation based on elements of the Real Number Line. The mathematical group structure of a scale is the determined by the collection of algebraic functions which leave the scale form invariant. For a statistic to have any meaning when applying any particular scale the statistic must be invariant under all the transformations permissible for the scales listed mathematical group structure.
Scale  Basic Empirical operations  Mathematical Group Structure  Permissible Statistics (invariantive) 

Nominal  Determination of Equality  Permutation Group: x(i) = f(x)
f(x) means any onetoone substitution. 

Ordinal  Determination of greater or less  Isotonic group: x(i) = f(x)
f(x) means any monotonic increasing function. 

Interval  Determination of equality of intervals or differences  General Linear Group
x(i) = ax + b 

Ratio  Determination of equality of ratios  Similarity Group
x(i) = ax 

The permutation group includes the isotonic, general linear, and similarity groups as subgroups. Therefore, any statistic applicable when using the nominal scale is automatically applicable when using an ordinal, interval or ratio scale. Similarly, any statistic applicable when using an ordinal scale can be applied when using the interval or ratio scale and any statistic applicable when using the interval scale is applicable when using the ratio scale.
In order to take ratios of measurable characteristics of objects in any selfconsistent manner it is essential that the ratio scale be used. This scale requires an absolute zero. Measurement of temperature in degrees Celsius serves as a specific example. Given a temperature x ^{0}C it is not reasonable to consider a temperature 2*x ^{0}C; for if x be 10 ^{0}C then 2*x corresponds to 20 ^{0}C, but if x be 10 ^{0}C then 2*x ^{0}C corresponds to a temperature of 20 ^{0}C! The problem arises since measurements of temperature in degrees Celsius conform only to the interval scale.
P  The Set Of Measurable Properties For Each Location Element.
Every number is represented by a symbol and therefore to describe the distribution of phenomena that can be represented by numbers, at any location L_{i}, a member of Å, it is necessary and sufficient to ascertain the presence or absence of an associated symbol. Properties that can be represented by numerals, at any L_{i}, are determined to be either present or absent. Properties coded as present or absent at each location element must be drawn from a finite set of properties, this set can be denoted by the symbol P. Without loss of generality P can be said to have N^{3} members. Each member of the set P can be uniquely identified using 1 of the symbols p_{i}, where i takes all integral values between 1 and N.
Generally it will not be possible to impose an ordering on elements of the set P. This can be seen by considering an example for which the properties p_{1}, p_{2} and p_{3} represent a value for a measurable property of a plant, an elevation value and a particular class of land aspect respectively. It is possible to define a property of magnitude only on those sets endowed with an ordering. In the case where it is not possible to define a property of magnitude it follows immediately that it is not possible to define a metric.
W  The Set Of All Possible Distinct Property Codings That Can Be Used To Describe Location Elements.
The set of N distinct properties can be used to recursively generate a set of all possible distinct property codings that can be used to describe location elements. This set can be denoted by the symbol W. In order to generate the set W, a rule for generating W must be established. This can be achieved using the characteristic function g:P®W, where g maps property p_{i} to the ith member of an N dimensional tuple and assigns to the ith member of the tuple the value 0 to denote the presence of property p_{i} and 1 to denote the absence of property p_{i} . Each member of the set W is a word of length N. It is clear that W contains 2^{N }members. Each member of W can be uniquely represented by one of the first 2^{N} binary numbers and/or one of symbols w_{i,} where i takes all integral values between 1 and 2^{N}.
P: The Set of all distinct properties  g : P ® W  W: The set of all possible distinct property codings. 
The property coding for each member of Å is described uniquely by one of the members of W. The map y:Å®W, that assigns to each member of Å the member of W that describes the distribution of properties at the member of Å is an injective function from Å to W. In general the map y:Å®W will not be surjective and it is to be expected that many members of the set W will not have any members of the set Å mapped to them by y:Å®W.
Å : The set of all distinct location elements  y : Å ® W  W: The set of all possible distinct property codings. 
If ~ is an equivalence relation on any set Z then the family of all equivalence classes is a partition of Z. Additionally, if {S_{a}} is a partition of Z then there exists an equivalence relation on Z whose equivalence classes are the {S_{a}}(Rotman; 1965). Because the mapping y:Å®W is an injective function from Å to W, any partition of W can be used to partition the set of all location elements, Å. The properties p_{i}, for i=1 to N, can be used to construct a partition the set W of all possible words of length N.
When measuring with the nominal scale it is possible to identify either a most numerous class or a number of most numerous classes. The group structure permits that subject to certain conditions, a contingency correlation can be defined which allows hypotheses regarding the distribution of cases among the classes to be tested. In order to do this it is necessary to identify exactly which entities are to be considered as classes and exactly which entities are to be considered as cases.
W is the set of all possible distinct property codings that can be assigned to location elements, members of the set Å. W consists of words of length N letters. To investigate interactions between biological organisms and coded properties it is necessary to split the set of N letters into 2 sets of letters:
Without loss of generality, it can be claimed that X^{4} of the N recorded properties are measurable properties describing organism Z and that NX of the properties are measurable properties describing entities other than organism Z.
In order to prove that no general algorithm can be written for analysing interactions between the measurable properties of biological organisms and other coded properties it is necessary to demonstrate only that intractable problems arise when the simplest possible measuring scale is applied to the organism. The simplest measurement that can be made for a biological organism can be achieved by applying the nominal scale using 2 classes. Class 0 organism coded present and class 1 organism coded absent could be used. This approach has the added advantage of keeping the argument and notation simple.
b And a The Sets Of Classes And Cases Respectively.
Using such an approach, the following two sets can be generated:
1. The set a consisting of a 1 dimensional tuple in which a 0 is placed to denote the presence of organism Z and a 1 to denote the absence of organism Z at a specified location element. The set a has two members.
2. The set b consisting of all distinct property codings using the remaining N1 coded properties. The set b can be generated in an analogous way to the set W. The set b has 2^{N}/2 members.
Two additional mappings can now be proposed:
1. The map §: Å®a
2. The map ¨: Å® b .
Both of the mappings §: Å®a and ¨: Å® b are injective functions from Å to a and Å to b respectively. Therefore any partition of a or b can be used to partition Å
Å: The set of all distinct location elements  §: Å®a  a: The set of all possible cases. 
¨: Å® b  b: The set of all possible classes. 
The two members of a would be referred to as the cases: case 1, organism Z present and case 2 organism Z absent. The number of classes is the number of members of b constituting the range of ¨:Å®b; frequently the number of classes will be completely described by a proper subset of b. The most numerous class, or most numerous classes, are defined by the member or members of the set b that have at least as many members of Å mapped to them by the injective function ¨:Å® b as any other single member of b. The group structure permits that subject to certain conditions, a contingency correlation can be defined which allows hypotheses regarding the distribution of cases (members of the set a) among the classes (a subset of members of the set b) to be tested. For example, it may be possible to test the hypothesis that organism Z is distributed at random and therefore the distribution of members of the set a is independent of any, and all, the classes defined by elements of the set b. It may be possible to test such an hypothesis by comparing the assignment of members of Å to elements of a and the assignment of members of Å to members of b, with the expected assignment if elements of a were assigned using a random walk over the set Å.
Members Of The Sets W And b Need Conform Only To A Nominal Scale.
The ordinal scale is applied to place objects in a rank ordering scheme. Interval scales are defined by imposing the constraint of linearity on an ordinal scale. Ratio scales can be defined when it is possible to define the four relations of equality, rankorder, equality of intervals and equality of ratios (Stevens; 1946). It is known that members of the set P need not conform to an ordinal scale and therefore there is no requirement that members of either of the sets W or b conform to an ordinal scale. Any statistic requiring an ordinal, interval or ratio scale need not be applicable to elements of the sets W or b. Because elements of the set W or b need be measurable only with the nominal scale it will not generally be possible to define a property of magnitude on elements of the sets W or b.
Generally it will not be possible to take the average or standard deviation of elements of the sets W or b. In order to take a "representative sample" it is necessary to invoke some measure of central tendency. Importantly, the average and standard deviation measures of central tendency can not generally be applied to elements of Å. In the case where it is not possible to establish a measure of central tendency it is impossible to take a "representative sample".
In order to establish the effect (potential or actual) of properties p_{i} (for i=1 to N1), on a biological organism Z, at a location described by a member of Å, it is necessary to establish an operation for decomposing the member of b that describes the location element to its component letters. Essentially, in order to determine interactions (potential or actual) between the biological organism Z and particular properties it is necessary to factor elements of b in some reasonable manner.
Any attempt at such a decomposition must consider the issue of what the Property codes actually represent and how they are likely to impact on the biological organism Z. The properties describe physical characteristics of the environment in which organisms live. The concept of an organisms niche is well established and arises from the elementary observation that organisms are not distributed at random throughout the worlds ecosystems. Each of the properties mentioned can be thought of as a dosage. Weber's general quantitative assertion of psychophysics holds that there exists a constant C such that two stimuli S and S+DS will be physiologically indistinguishable unless the ratio DS/S>C. C is frequently approximately equal to 1/30 (Resnikoff; 1989). Organisms respond catastrophically to dosages associated with any selective physical phenomenon. This observation forms the basis of the LD50; or logdose 50 statistic. The presence or absence of some property codes (represented by letters) may be immediately fatal; regardless of how suitable the rest of the character string is. Due to the catastrophic response of organisms to some properties it may be impossible to decompose a word to determine the individual effects of coded properties on an organism.
Sinks Of Information And Stochastic Processes.
Once a fatal property, for a particular organism, is present at a given location, it is impossible to say anything about the potential or actual effect of other recorded properties on the organism. A fatal property acts as a sink of information.
It is apparent that even if a property p_{i} has no effect on organism Z, organism Z will not generally be distributed independently of location elements containing property p_{i}. Therefore, it is not generally possible to use a stochastic process operating over the set Å to test a hypotheses regarding the effect of property p_{i} on organism Z.
The absence of any adequate decomposition, for members of b, means that it may be impossible to construct a numeric representation to investigate the interaction between all properties pi (for i=1 to N) and the biological organism Z distributed at locations described by members of Å.
In order to construct a numeric representation directed at establishing potential or actual relationships between the organism Z at various elements of the set Å and particular properties p_{i} it is necessary to apply a filtration to elements of the set Å and exclude from consideration any elements of the set Å at which a fatal attribute, for the particular organism, is present (whether coded or not). If the location of all fatal properties was known then it would definitely be possible to apply such a filtration to elements of the set Å. However, in the case where the property p_{i} is fatal for a particular organism and the absence of the property p_{i} implies the absence of property pk at all location elements, it is impossible, from the set Å, to say anything about the interaction between Property pk and the particular organism. Given this problem, the fact that any attempt to establish an equivalence relation on members of the sets W or b defines a partition of these sets, and the observation that measures of central tendency need not exist for either b or W it may be impossible to take a statistically valid representative sample of location elements from the set Å. The distribution of fatal properties at both the sampled and unsampled sites must be accounted for.
There May Exist Distinct Partitions Of W And b That Can Be Used To Aggregate Members Of Å Identically.
It is observed that the mappings y:Å®W need not be bijective and therefore there may exist distinct partitions of W and b that can be used to aggregate members of Å identically. In the case where a subset of location elements mapped by y:Å®W to elements of W having Properties p1 AND p2 AND p3 coded as present is sought a partition of W can be defined by the equivalence relation that relates elements of W having identical coding for properties p1, p2, and p3. In this case the partition consists of a family of eight equivalence classes. These eight equivalence classes can be denoted by the symbols e1, e2, e3,...,e8. A distinct partition of W could be obtained by the family of 8 equivalence classes having identical property coding for p_{4}, p_{5} and p_{6}. These eight equivalence classes can be denoted by the symbols f1, f2, f3,...,f8. In such a case each member of Å can be assigned uniquely to both one e and one f class. Since y:Å®W is only an injective function there may exist a 1:1 correspondence between the members of Å assigned to each e class and the members of Å assigned to each f class.
The Property Specification Required To Select A Particular Subset Of Location Elements Need Not Be Unique.
An obvious corollary to the fact there may exist distinct partitions of W that can be used to aggregate members of Å identically is the fact that the property specification required to select a particular subset of location elements need not be unique. A statement to select a subset of location elements mapped by the injective mapping y:Å®W to members of W having up to N specified properties coded as present always selects a well defined subset of location elements. For example, if N=10 the statement to select all location elements mapped by y:Å®W to elements of W having properties p_{1}, p_{2} and p_{3} coded as present would always select a well defined subset of location elements. Such a subset of location elements can be denoted by the symbol j. It would not be at all difficult to construct a database in which the subset j was equivalent to the set of location elements mapped by y:Å®W to elements of W having property p_{i} (where i is any single specified integer between 1 and N) coded as present. In such circumstances there may be no reasonable way of assigning location elements not in the set j to classes established by a combination of the specified properties used to select j. If the property coding required to select a particular subset of location elements is not unique then it may be impossible to analyse the database to investigate the effects (potential or actual) of coded properties on an organism.
Analysing For The Effect Of 1 Property When Specifying A Union Of Elements Of W.
Boolean operators acting on members of W are not associative. For example, the statement to select all location elements mapped by y:Å®W to members of W having properties p_{1} AND p_{2} OR p_{3} coded as present is ambiguous and can be given an analytical meaning only by imposing an order of operations. The Axiom of Choice, which permits the construction of a set containing exactly one element of each of a given family of sets, requires that the sets used for the construction be disjoint. Sets defined by a union of elements of W are not disjoint and therefore in the case where a selected subset of location elements is determined by a union of elements of W it will not generally be possible to analyse for the effect of 1 specified property on an organism.
An algorithm sufficient to permit the construction of a numerical representation for analysing interactions between biological organisms and particular properties recorded at various locations described by a locationally linked database need not exist because:
Apart from the theoretical problems associated with attempting to apply mathematical analysis methods to such databases there are practical problems:
The success or failure of managing data of mega or gigabyte proportions depends on whether or not the accuracy and content of each data entry is adequate and appropriate to give consistent results for the types of question being asked. Attempts to estimate the change of an ecosystem over time can not be mounted in any reasonable manner until the relationship between the rate at which the information becomes obsolete and the scale at which the information describes the ecosystem is established.
The natural environment is susceptible to external influences and is "governed by rules" that are not rigorously understood. It is a characteristic of mathematical thought that it can have no success where generalisations can not be made (Peirce; 1895). Unfortunately, because the construction of a GIS database defines a partition, and data that has been aggregated to define a partition either can not be, or can not readily be generalised, prediction is practically impossible. In order to generalise the data it is usually necessary to establish cause and effect and this must be done statistically. In order to statistically interrogate the database for the effects of coded properties on organisms, a filtration (the existence of which can not be guaranteed) must be applied, and this inevitably means that much of the data must be rejected. Additionally, conclusions derived from filtered data may not be valid when extended over the entire database. The assumption that more data necessarily increases predictive ability does not hold under circumstances where sinks of information exist.
I substantially attempted to apply standard definitions; as published in the Collins Dictionary of Mathematics. To suit their purpose, the meaning of some terms defined by authors varies, therefore explanation of some terms serves to prevent confusion.
^{1} A general algorithm is an algorithm that must be able to answer any well posed question concerning interactions between all coded properties and all organisms distributed at locations described by a locationally linked database.
^{2} Where M is a positive integer.
^{3} Where N is a positive integer
^{4} Where X is a positive integer less than N.
A paper summarising aspects of the information contained withing this thesis was subjected to independent review by three impartial referees and published in the Australian Journal of Soil and Water Conservation (Volume 1 number 9 February 1996).
Meta Data Record: Environmental Mapping Systems  Locationally Linked Databases
Table of Contents: Environmental Mapping Systems  Locationally Linked Databases
Single Frame Page View 
Printer Friendly Page 
Email & Feedback 
au.riversinfo.org 
Riversinfo Australia (au.riversinfo.org): Publication Information:
Copyright ©  Disclaimer 