Factors Space, A New Frontier to Fuzzy Sets Theory

ABSTRACT Professor L. A. Zadeh is an especially inspiring leader in cybernetics and machine intelligence. As a tribute to his pioneering work in fuzzy sets theory, this paper follows his treatment of fuzzy membership functions and operations and introduces factors space theory, which further explains the relationships between them. Factors space theory calculates the membership degree of a human concept by first collecting sample interval statistics around the concept, and then using the notion of the 'falling shadow' to convert the covering probability of a random interval in the power set of the concept domain into a membership degree. This paper introduces the factorial probability and converts typical probability densities into typical falling membership curves. Based on this idea, a subjective scoring system for fuzzy evaluation and decision-making is suggested. On the question of how to select fuzzy operations, the paper shows that the source lies in the joint distribution of random sets, and a selection proposition is given. Factors space opens a new frontier for the development of fuzzy sets theory.


Introduction
In 1965, Professor L. A. Zadeh put forward the theory of fuzzy sets [1]. Classical set theory can only express precise concepts, which is a serious limitation in practical applications because it is not flexible enough to simulate how the human brain reasons and makes decisions. To change this situation, fuzzy concepts had to be represented mathematically by a new set theory. As a cybernetics expert, Zadeh was a trailblazer in developing machine intelligence from mathematics. Within two short years of the publication of his monumental paper, the field expanded rapidly: the number of articles on fuzzy sets increased exponentially each year and spread to all fields of computer applications. A concept is a unit of thinking, and fuzzy concepts are used constantly by human beings in daily life. Fuzzy sets theory makes the computer accessible to people by building a bridge between qualitative and quantitative descriptions. In other words, the birth of fuzzy set theory is an important milestone in the history of artificial intelligence. As students and friends of Prof. L. A. Zadeh, the authors are deeply indebted to him for his great contribution to this field.
The best way to pay tribute to him is to inherit and promote his theory. This paper gives a brief outline of research on factors space theory. The most important problem in the early stage of fuzzy sets was the comparison between fuzziness and randomness. The authors recognised that both kinds of uncertainty are caused by the lack of factors needed to describe them, which brings the important word 'factor' to people's attention. The key that opened the door to major advances in biology was the discovery of the gene, the root of biological qualities, which was originally called the Mendelian factor. Indeed, a factor is a generalisation of a gene: it is the root of the qualities of anything, and in particular it serves to compare the two kinds of uncertainty.
From the mathematical point of view, a factor f is essentially a mapping that sends an object u ∈ U to its state or attribute in X(f).
Definition 1.1: A factor space defined on the universe of discourse U is a family of sets ψ = ({X(f)}_{f∈F}; U) satisfying: […]
The definition has not changed much since the first paper [2] was published in 1982, the same year in which Formal Concept Analysis [3] and Rough Sets [4] also appeared. They all attempted to give a formal, mathematical description of knowledge and intelligence. P. Z. Wang, H. X. Li, Z. L. Liu and many colleagues have done a great deal of work on factor space theory and its applications.
We developed factorial analysis on the basic space of a probability field and defined the factorial probability in order to build a stronger foundation; this establishes the symmetry principle and the data parity principle. We then carried out factorial analysis on the domain U of fuzzy subsets. We found the duality between randomness and fuzziness and presented the falling shadow theory, which offers four contributions to fuzzy sets theory: (1) developing the theoretical basis of subjective measurement through the basic falling theorem; (2) developing typical models of membership curves; (3) establishing a scoring system of typical fuzzy numbers; (4) indicating how to select fuzzy operations. The paper is organised as follows: the factorial probability is introduced in Section 2, the falling shadow theory of fuzzy sets in Section 3, and the conclusion is given in Section 4.

Factorial Probability
The process of predicting the occurrence of an event is called a 'test'. A test is a factor f; its state space X(f) = {a_1, …, a_n} is called the testing space, and the a_i are called results. In each test, one and only one result occurs.
Randomness is the uncertainty in predicting which result will occur; it is caused by insufficient stipulation of the conditional or causal factors. Randomness comes from the breakdown of causality, while probability is a generalised law of causality. How should such a proposition be understood? We propose the following assumption.
The Symmetry Principle. A symmetric test factor guarantees that all results have the same degree of certainty:
P(a_1) = … = P(a_n) = 1/n. (2.1)

Definition 2.2:
Let S: U → X(S) = {a_1, …, a_n} be a symmetric factor, let A be the σ-field generated by X(S), and let P be the probability on A extended by the symmetry principle (2.1). We call P the factorial probability under the symmetric factor S.
Probability is the law of generalised causality: conditions cause results, and conditions determine the value of a probability. In many practical applications, however, the causal conditions are unknown or unavailable. When we say that a dam can withstand 'the flood of a century', do we know the conditions that ensure the probability of a larger flood is lower than 1%? The lack of a rigorous specification of the conditions behind a probability leads to hidden risks. The factorial probability has a rigorous specification of its condition, namely the symmetry of the test factor. The factorial probability is essentially a logical matter, reflecting the causal relationship between things. If we run experiments on event frequency, the frequency must stabilise around the factorial probability.
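The stabilisation of frequency around the factorial probability can be illustrated with a small simulation. The sketch below assumes a fair six-sided die as the symmetric test factor; the die, the seed and the helper name `empirical_frequencies` are illustrative choices and do not come from the original text.

```python
import random
from collections import Counter

def empirical_frequencies(states, trials, seed=0):
    """Relative frequency of each state over `trials` symmetric tests."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(states) for _ in range(trials))
    return {s: counts[s] / trials for s in states}

states = [1, 2, 3, 4, 5, 6]      # testing space X(S) of a fair die (assumed example)
factorial_p = 1 / len(states)    # symmetry principle (2.1): P(a_i) = 1/n
freqs = empirical_frequencies(states, trials=60_000)

# Largest gap between an observed frequency and the factorial probability.
max_gap = max(abs(f - factorial_p) for f in freqs.values())
print(f"factorial probability = {factorial_p:.4f}, largest deviation = {max_gap:.4f}")
```

With more trials the largest deviation shrinks, which is the sense in which the frequency settles around 1/n.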
Some people refer to subjective probability. This is not a derogatory term: it reflects the subjective initiative of the cognitive subject and is worth studying. Factorial probability is not subjective probability, but it can provide a logical basis for it.
Factorial probability is universally applicable. In data mining and information transformation, all the probabilities we encounter are factorial probabilities, because we adopt the following principle.
Data parity principle. Unless otherwise declared, all sample points in a data sample are equal and carry the same weight.
If S is a sampling factor with state space D ⊆ R and n is the sample size, then according to the data parity principle each sample point carries probability 1/n into D, and a sample distribution is formed in D. This distribution is usually not symmetric. A multifactorial sampling factor forms a joint sample distribution in the state space D ⊆ R^n, and the joint distribution determines the causal relationship among the group of factors.
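As a minimal sketch of the data parity principle, the following builds a sample distribution in which every observation carries weight 1/n. The data values and the function name `sample_distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def sample_distribution(samples):
    """Empirical distribution: every sample point carries weight 1/n."""
    n = len(samples)
    counts = Counter(samples)
    return {x: Fraction(c, n) for x, c in sorted(counts.items())}

samples = [2, 3, 3, 5, 5, 5, 8, 13]   # hypothetical observations in D ⊆ R
dist = sample_distribution(samples)
for x, p in dist.items():
    print(x, p)
```

The individual sample points are treated symmetrically, yet the resulting distribution over D is not symmetric: the value 5 receives 3/8 while 2 receives 1/8.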
By increasing the sample size and investigating the limiting properties of the sample distribution, we obtain the population distribution. Existing probability distributions can be re-verified in this way. In some areas, such as remote sensing data, there are many unknown distributions waiting to be discovered; new distribution types and expressions can be obtained by means of the Monte Carlo method.
Probability distributions of the same class are called a type. Each type has some parameters, of which two are the most important: the sample mean and the sample variance. These two parameters play two roles: (1) Generalisation of data processing: if we apply a linear transformation so that the two parameters equal 0 and 1, respectively, then all distributions of the same type become one. This is the generalisation principle of image data processing. (2) Taking these two as hidden parameters and solving or iterating for them in an optimisation format yields the key factors, which are the secret of success in classification and learning.
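The first role can be sketched as follows, assuming the two parameters are the sample mean and the (population-form) sample variance. Two hypothetical data sets that differ only by location and scale coincide after the linear transformation; the name `standardise` is illustrative.

```python
import statistics

def standardise(samples):
    """Linear map x -> (x - mean) / std, giving sample mean 0 and variance 1."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)   # population form, consistent with weights 1/n
    if std == 0:
        raise ValueError("all samples are equal; the scale is undefined")
    return [(x - mean) / std for x in samples]

a = [10.0, 12.0, 14.0, 16.0]           # hypothetical data
b = [3.0 * x + 7.0 for x in a]         # same type, different location and scale
za, zb = standardise(a), standardise(b)
print(za)
print(zb)
```

Both printed lists agree up to floating-point rounding, which is the sense in which distributions of the same type "become one".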

Basic Falling Theorem
In fuzzy sets theory, the domain U is an undefined term, but we take it to be a factors space. Fuzziness is the uncertainty of boundary demarcation caused by the lack of judging factors that describe a concept. The duality between fuzziness and randomness is found by comparing the two kinds of uncertainty. As shown in Figure 1, in the probability model 'the circle is fixed, the point is changing', whereas in the fuzzy model 'the point is fixed, the circle/interval is changing'. Mathematically, this is the relation between the universe U (the ground) and the power set of U, P(U) (the sky).
For example, in coin-tossing, if one knew the speed, the angle of contact, the mass of the coin, and so on, one could apply Newtonian mechanics to solve the equations of motion and determine accurately whether the coin will land heads or tails. If one does not bother to measure all these factors, however, one can attribute the outcome to probability. The outcome of the coin toss is a point in the fundamental space, as shown on the left of Figure 1. Thus the condition circle is fixed and the point (i.e. the outcome) is changing. On the other hand, for a typical concept such as 'young', different people have slightly different ideas. One may regard young as between 17 and 30 years, another as between 18 and 28 years. Each opinion can be regarded as a random interval. The aggregation of these samples gives rise to the general concept of 'young', which is a fuzzy concept with a certain membership function. So for a fuzzy concept, the point (i.e. 'young') is fixed but the circle (i.e. the interval) is changing.
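The aggregation step can be sketched as a covering frequency: the membership degree of an age is the share of surveyed intervals that contain it. The first two intervals below are the ones mentioned in the text; the other three, and the function name `falling_shadow`, are hypothetical.

```python
def falling_shadow(intervals, u):
    """Covering frequency: share of surveyed intervals [lo, hi] that contain u."""
    covered = sum(1 for lo, hi in intervals if lo <= u <= hi)
    return covered / len(intervals)

# Survey answers to "which ages count as 'young'?" (last three are made up).
survey = [(17, 30), (18, 28), (16, 25), (18, 35), (20, 30)]

for age in (15, 17, 22, 29, 33, 40):
    print(age, falling_shadow(survey, age))
# 15 -> 0.0, 17 -> 0.4, 22 -> 1.0, 29 -> 0.6, 33 -> 0.2, 40 -> 0.0
```

Here the age is the fixed point and the interval varies from respondent to respondent, which matches the 'point fixed, circle changing' picture.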
The fuzzy model on the ground can be transformed into a random model in the sky; this is the content of the 'falling shadow' theory. In the book [5], three kinds of mathematical structure, namely order, topology and measure, were lifted from U to the power set of U. We set up a variety of hyper-topologies and produced a variety of super-measurable structures, which form a variety of random sets in the sky. According to the different ways of falling, a variety of subjective (non-additive) measures can be formed. Among them there are four popular types: belief (BL), plausibility (PL), anti-belief (ABL) and anti-plausibility (APL). A fuzzy measure is a special APL. Their definitions are quite complex, and we can simply re-define them as follows. Suppose that H is a σ-field defined on P(U). For any subset A of U, set […] It is known that A^o and A_o are the filter and ideal of A, respectively.

Definition 3.1:
Let p be a probability defined on P(U), and set […] They are called the BL, PL, ABL and APL on U, respectively.
Basic falling theorem [5]. Given a measure µ (belonging to BL, PL, ABL or APL), there is one and only one probability p defined on H such that it falls down to µ according to (3.1).
This is a very important theorem. Without this existence and uniqueness result, fuzzy sets and the Dempster-Shafer theory of non-additive measures could not be applied in practice, since they would lack a solid foundation. The proof of the theorem is not easy; it requires the measure extension theorem. The usual starting point of measure extension is a semi-ring, but this proof must start from a π-system, which is much more primitive than a semi-ring, so the theorem is a difficult and important piece of mathematics.

The Types of Fuzzy Membership Curves
Probability theory has distribution types such as the Poisson distribution, the Normal distribution and the […]-distribution, and some distributions have tables for looking up the confidence limits used in statistical inference. These advances show the maturity of probability theory; fuzzy sets can have similar types and tables. Falling shadow theory can convert probability distributions into membership degrees. The only practical difficulty is that the complexity increases exponentially when we lift a domain to its power set. Fortunately, this problem can be overcome by restricting set-valued statistics to interval statistics. Let us look at an example. Consider a domain with eight elements, U = {a,b,c,d,e,f,g,h}; its power set P(U) contains 256 subsets. But if we restrict the power set to hyphens only, then […]
Including the empty hyphen, there are only 74 elements. Here I is the set of hyphens and P_I(U) = {A ∈ I | A ⊆ U}, which contains the members of I that are subsets of U. We call I the background set, and P_I(U) the background power.
Suppose that the total probability 1 is assigned to the following five hyphens: […] Then the falling shadow of the random hyphen on U is given by the following membership degrees: […]
Now consider the background power P_I(R), where the background set I is the set of all intervals; the complexity is then not so large. Under this simplification, the membership curve on the real domain depends only on the two random variables given by the endpoints of the interval. As shown in Figure 2, the left and right sides of the membership curve are the left and right distribution functions of the distribution densities of the two endpoints of the random interval.
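The falling shadow of a random hyphen on a finite domain can be sketched as follows. The five hyphens and their probabilities are hypothetical stand-ins, since the original assignment is not available here, and a hyphen is assumed to mean a non-empty run of consecutive elements of the ordered domain.

```python
from fractions import Fraction

U = "abcdefgh"   # ordered finite domain (the ordering is an assumption)

# Hypothetical probability assignment over five hyphens.
random_hyphen = {
    "cd": Fraction(1, 10),
    "cde": Fraction(3, 10),
    "bcde": Fraction(3, 10),
    "cdef": Fraction(2, 10),
    "bcdefg": Fraction(1, 10),
}

def is_hyphen(h, domain=U):
    """A hyphen is taken here to be a non-empty run of consecutive elements."""
    return h != "" and h in domain

def shadow(u, assignment):
    """Membership of u: total probability of the hyphens that cover u."""
    return sum(p for h, p in assignment.items() if u in h)

assert all(is_hyphen(h) for h in random_hyphen)
assert sum(random_hyphen.values()) == 1

membership = {u: shadow(u, random_hyphen) for u in U}
for u, m in membership.items():
    print(u, m)
```

Elements covered by every hyphen (here c and d) get membership 1, and the membership falls off towards the ends of the domain.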
If we divide the membership curve in half and consider only the tails of the membership curve, we can prove the following theorem.
The transformation theorem. The left (right) membership curve (falling shadow) is equal to the left (right) distribution function of the distribution density of the left (right) endpoint of the random interval.
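A numerical sketch of the theorem follows. It assumes that the 'right distribution function' means P(right endpoint ≥ x), and it uses exponentially distributed endpoint offsets purely as an illustration; the rate, the break points 20 and 30 and the function names are all assumptions and are not the paper's own formulae.

```python
import math
import random

def covering_frequency(intervals, x):
    """Falling shadow at x: share of random intervals [lo, hi] that contain x."""
    return sum(1 for lo, hi in intervals if lo <= x <= hi) / len(intervals)

def left_cdf(intervals, x):
    """Empirical distribution function of the left endpoint, P(lo <= x)."""
    return sum(1 for lo, _ in intervals if lo <= x) / len(intervals)

def right_survival(intervals, x):
    """Empirical P(hi >= x) for the right endpoint."""
    return sum(1 for _, hi in intervals if hi >= x) / len(intervals)

rng = random.Random(1)
rate = 0.5   # assumed decay rate of both endpoint densities
# Left endpoints lie at or below 20 and right endpoints at or above 30,
# so the two tails of the membership curve do not interact.
intervals = [(20 - rng.expovariate(rate), 30 + rng.expovariate(rate))
             for _ in range(20_000)]

left_points = [12.0, 16.0, 19.0]
right_points = [31.0, 34.0, 38.0]
left_check = [(covering_frequency(intervals, x), left_cdf(intervals, x))
              for x in left_points]
right_check = [(covering_frequency(intervals, x), right_survival(intervals, x))
               for x in right_points]
# Exact tail values implied by the assumed exponential endpoint densities.
theory_left = [math.exp(-rate * (20 - x)) for x in left_points]
theory_right = [math.exp(-rate * (x - 30)) for x in right_points]
print(left_check, theory_left)
print(right_check, theory_right)
```

On each tail the covering frequency coincides with the endpoint's distribution function, and with this particular choice of density both tails decay exponentially.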
According to this theorem, the tail type of the membership curve is determined by the probability density. There are three main types of membership curve:
(1) Negative power type. If the distribution density of the left (right) endpoint of the random interval is a power density curve, then the left (right) tail of the membership curve is […]
(2) Negative exponential type. If the distribution density of the left (right) endpoint of the random interval is a negative exponential density curve, then the left (right) tail of the membership curve is […]
(3) Negative logarithmic type. If the distribution density of the left (right) endpoint of the random interval is a reciprocal curve, then the left (right) tail of the membership curve is […]
The negative exponential type and the logarithmic type should become the standard types of membership curve.
Logarithmic regression and the logarithmic weight distribution are derived from these typical membership curves.
Similar to the confidence limits of probability statistics, we can design tables for looking up certainty limits in fuzzy statistics.
Thus, the transformation theorem enables us to obtain the membership degree of a fuzzy concept from random interval statistics, and the random intervals can be obtained through a survey. In essence, a bridge between probability and fuzziness is established.

Fuzzy Scoring Systems
One important task of fuzzy sets in artificial intelligence is grading or scoring. There are two types of data: physical measurement data and subjective scoring data. The latter are mainly graded by experts in the course of decision-making, and require operations on fuzzy numbers. Guo's [29] structural element theory has made an important contribution to the theory of fuzzy numbers. Following his theory, we take a standard triangular fuzzy number E as the structural unit, satisfying E(−1) = 0, E(0) = 1 and E(1) = 0. For any fuzzy number A, there is a unique monotone function h such that A(x) = E(h^{-1}(x)), and we write A = h(E). Any fuzzy number defined on R can be generated from the structural unit E through a monotone function. A fuzzy number is called a tine fuzzy number if it attains height 1 at a single point. All triangular fuzzy numbers are tine fuzzy numbers. We can concretely set […] When b < a, subtraction of fuzzy numbers can also be carried out within the system N(0, +∞): […] Scalar multiplication can also be carried out within the system: […]
The domain of the scoring system should be the interval (0,1], but the resolution on this domain is unsatisfactory. The falling shadow theory emphasises that negative exponential or negative logarithmic curves are the better types. We need to transfer the scoring domain from (0,1) to (0,+∞). Set y = −ln x, i.e. x = exp(−y), and set […]
Information synthesis for physical data is a weighted average. Given a group of numbers y_1, …, y_n and weights w = (w_1, …, w_n), we get the weighted average y = w_1 y_1 + … + w_n y_n. Information synthesis for subjective scores differs from that for physical measurements. According to Guo's fuzzy number system, we have the following principle.
Arm scoring principle. The fuzzy numbers should first be taken into the arm scoring system, and the weighted average should be formed on the arm; the result is then returned to the ground domain.
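The two ingredients of this section can be sketched in a few lines. The first part builds a fuzzy number from the structural element E through a monotone map h. The second part applies the arm scoring principle to crisp scores only, assuming that the arm domain is the image of (0,1] under y = −ln x; the full fuzzy-number arithmetic on the arm is not reproduced here. The map h, the scores, the weights and all function names are hypothetical.

```python
import math

def E(t):
    """Standard triangular structural element: E(-1) = 0, E(0) = 1, E(1) = 0."""
    return max(0.0, 1.0 - abs(t))

def fuzzy_number(h_inverse):
    """Fuzzy number A = h(E), with membership A(x) = E(h^{-1}(x))."""
    return lambda x: E(h_inverse(x))

# Hypothetical monotone map h(t) = 7 + 2t, read as "about 7, give or take 2".
about_seven = fuzzy_number(lambda x: (x - 7.0) / 2.0)

def to_arm(x):
    """Scoring domain (0, 1] -> arm domain via y = -ln x."""
    return -math.log(x)

def to_ground(y):
    """Inverse map x = exp(-y)."""
    return math.exp(-y)

def arm_weighted_score(scores, weights):
    """Average the scores on the arm with the given weights, then map back."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    y = sum(w * to_arm(x) for w, x in zip(weights, scores))
    return to_ground(y)

scores = [0.9, 0.6, 0.3]     # hypothetical crisp scores in (0, 1]
weights = [0.5, 0.3, 0.2]    # hypothetical weights
combined = arm_weighted_score(scores, weights)
print(about_seven(7.0), about_seven(8.0), about_seven(5.0))   # 1.0 0.5 0.0
print(combined)
```

Under this assumption, averaging on the arm and returning to the ground domain amounts to a weighted geometric mean of the original scores, so one very low score pulls the result down more strongly than it would in an ordinary weighted average.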

Fuzzy Operations' Selection
There are many kinds of operations between fuzzy sets, and how to select among them is a puzzling problem. Falling shadow theory tells us that the source lies in the joint distribution of the random sets in the sky.
Selection theorem on fuzzy operations [30]. Let ξ and η be two random sets whose falling shadows are the fuzzy sets A and B, respectively.

p(ξ(u) = a and η(u)
Many different formulae have been defined for the union and intersection of fuzzy subsets, but no one tells us how to select among them. The selection theorem gave, for the first time, a selection principle for three typical cases, showing that the selection depends on the relationship between the random sets. Thus, the falling shadow theory offers a theoretical basis for, and an explanation of, the mathematical meaning of fuzzy theory, which leads to a deeper understanding of fuzzy sets theory (Figure 3).
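The dependence of the intersection operation on the joint distribution can be illustrated by simulation. The sketch below fixes the marginal covering probabilities A(u) = a and B(u) = b and varies only how the two random sets are coupled. The three couplings shown (same source, independent sources, opposite sources) are an assumption about which 'three typical kinds' are meant; the limiting values min(a, b), ab and max(a + b − 1, 0) are standard probability facts for these couplings.

```python
import random

def joint_cover_probability(a, b, coupling, trials=100_000, seed=0):
    """Estimate P(xi covers u and eta covers u), given P(xi covers u) = a
    and P(eta covers u) = b, for a chosen coupling of the two random sets."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = rng.random()
        if coupling == "comonotone":          # both sets driven by the same source
            t = s
        elif coupling == "independent":       # sets driven by unrelated sources
            t = rng.random()
        elif coupling == "countermonotone":   # sets driven by opposite sources
            t = 1.0 - s
        else:
            raise ValueError(f"unknown coupling: {coupling}")
        hits += (s < a) and (t < b)
    return hits / trials

a, b = 0.7, 0.6   # hypothetical membership degrees A(u) and B(u)
estimates = {c: joint_cover_probability(a, b, c)
             for c in ("comonotone", "independent", "countermonotone")}
for name, value in estimates.items():
    print(name, round(value, 3))
```

The same pair of membership degrees therefore yields three different intersection values, which is why the choice of operation has to be traced back to the relationship between the random sets.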

Conclusion
Factor space theory has been used to analyse the domain of fuzziness, which can be transformed into randomness in the 'sky'. The falling shadow theory, which gives fuzzy sets and evidence theory a solid foundation in applications, offers the following: (1) by means of the transformation theorem between probability densities and membership curves, fuzzy sets obtain typical formulae; (2) by means of scoring systems of fuzzy numbers, fuzzy calculation can perform mechanical evaluation and decision-making; and (3) by means of the joint distribution of random sets in the sky, the selection of fuzzy operations becomes well-founded. As such, factors space opens a new frontier for fuzzy sets.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This study was partially supported by the grants [grant numbers 61350003, 70621001, 70531040, 90818025] from the Natural Science Foundation of China.

Notes on contributors
Peizhuang Wang, male, born in 1936, is a professor and doctoral supervisor. He has mainly studied fuzzy mathematics and its applications in artificial intelligence, has more than 200 publications, and originated the fuzzy falling shadow representation, truth value flow inference and factors space theory. He has won several national awards and an international award. His recent focus is on the application of factors space in artificial intelligence and data science.
Yanke Bao, male, born in 1962, is an associate professor and master's supervisor. He has mainly studied statistical machine learning and the theory and application of knowledge discovery based on factors space, has more than 20 publications, proposed the S&R algorithm, and found and proved the dual convolution theorem of factors space.