The paradigm of complex probability and Ludwig Boltzmann's entropy

ABSTRACT Andrey Nikolaevich Kolmogorov's classical system of probability axioms can be extended to encompass the imaginary set of numbers, and this by adding to his original five axioms three supplementary axioms. Hence, any experiment can thus be executed in what is now the complex probability set 𝒞, which is the sum of the real set ℛ with its corresponding real probability and the imaginary set ℳ with its corresponding imaginary probability. The objective here is to evaluate the complex probabilities by considering supplementary new imaginary dimensions to the event occurring in the 'real' laboratory. Whatever the probability distribution of the input random variable in ℛ is, the corresponding probability in the whole set 𝒞 is always one, so the outcome of the random experiment in 𝒞 can be predicted totally and perfectly. The result indicates that chance and luck in ℛ are replaced by total determinism in 𝒞. This is the consequence of the fact that the probability in 𝒞 is obtained by subtracting the chaotic factor from the degree of our knowledge of the stochastic system. This novel complex probability paradigm will be applied to Ludwig Boltzmann's classical concept of entropy in thermodynamics and statistical mechanics.

KEYWORDS Extended Kolmogorov's Axioms; complex set; probability norm; degree of our knowledge; chaotic factor; real entropy; imaginary entropy; negative entropy; complementary real entropy; complex entropy

Nomenclature
R = real set of events
M = imaginary set of events
C = complex set of events
i = the imaginary number, where i = √−1 or i² = −1
EKA = Extended Kolmogorov's Axioms
CPP = Complex Probability Paradigm
Prob = probability of any event
P_r = probability in the real set R
P_m = probability in the imaginary set M corresponding to the real probability in R
Pc = probability of an event in R with its associated event in M = probability in the complex set C
Z = complex probability number = sum of P_r and P_m = complex random vector
DOK = |Z|² = the degree of our knowledge of the random system; it is the square of the norm of Z
Chf = the chaotic factor of the random system
MChf = magnitude of the chaotic factor of the random system
Ω = number of the stochastic system microstates
Ω_max = number of the stochastic system microstates at equilibrium
k_B = Ludwig Boltzmann's constant
S = entropy of a stochastic system
S_R = S = entropy in the real probability set R
S̄_R = entropy in the complementary real probability set to R
NegS_R = the negative entropy in the real probability set R
S_M = entropy in M
S_C = entropy in C
λ = the complex entropy constant

CONTACT Abdo Abou Jaoude abdoaj@idm.net.lb

Introduction
Firstly, in this introductory section an overview of statistical mechanics and entropy will be given. Statistical mechanics is a branch of theoretical physics that uses probability theory to study the average behaviour of a mechanical system whose exact state is uncertain. Statistical mechanics is commonly used to explain the thermodynamic behaviour of large systems. The branch of statistical mechanics that treats and extends classical thermodynamics is known as statistical thermodynamics or equilibrium statistical mechanics. Microscopic mechanical laws do not contain concepts such as temperature, heat, or entropy; however, statistical mechanics shows how these concepts arise from the natural uncertainty about the state of a system when that system is prepared in practice. The benefit of using statistical mechanics is that it provides exact methods to connect thermodynamic quantities (such as heat capacity) to microscopic behaviour, whereas, in classical thermodynamics, the only available option would be to measure and tabulate such quantities for various materials. Statistical mechanics also makes it possible to extend the laws of thermodynamics to cases which are not considered in classical thermodynamics, such as microscopic systems and other mechanical systems with few degrees of freedom (Wikipedia, the free encyclopedia, Statistical Mechanics; Wikipedia, the free encyclopedia, Entropy; Wikipedia, the free encyclopedia, Entropy (statistical thermodynamics); Wikipedia, the free encyclopedia, Thermodynamics; Gibbs, 1902; Tolman, 1938; Balescu, 1975; Gibbs, 1993). In addition, statistical mechanics also finds use outside equilibrium. An important sub-branch known as 'non-equilibrium statistical mechanics' deals with the issue of microscopically modelling the speed of irreversible processes that are driven by imbalances. Examples of such processes include chemical reactions or flows of particles and heat.
Unlike with equilibrium, there is no exact formalism that applies to non-equilibrium statistical mechanics in general, and so this branch of statistical mechanics remains an active area of theoretical research (Aleiner & Blanter, 2002;Altshuler, Aronov, & Khmelnitsky, 1982;Baxter, 1982;Mahon, 2003;Maxwell, 1860;Reif, 1965).
Also, the term statistical mechanics is sometimes used to refer to only statistical thermodynamics. This article takes the broader view. By some definitions, statistical physics is an even broader term which statistically studies any type of physical system, but is often taken to be synonymous with statistical mechanics (Wikipedia, the free encyclopedia, Statistical Physics).
Secondly, the French mathematician Lazare Carnot proposed in his 1803 paper Fundamental Principles of Equilibrium and Movement that in any machine the accelerations and shocks of the moving parts represent losses of moment of activity. In other words, in any natural process there exists an inherent tendency towards the dissipation of useful energy. Building on this work, in 1824 Lazare's son Sadi Carnot published Reflections on the Motive Power of Fire, which posited that in all heat engines, whenever 'caloric' (what is now known as heat) falls through a temperature difference, work or motive power can be produced from the actions of its fall from a hot to a cold body. He made the analogy with how water falls in a water wheel. This was an early insight into the second law of thermodynamics. Carnot based his views of heat partially on the early eighteenth-century 'Newtonian hypothesis' of Sir Isaac Newton that both heat and light were types of indestructible forms of matter, which are attracted and repelled by other matter, and partially on the contemporary views of Sir Benjamin Thompson (Count Rumford), who showed (1789) that heat could be created by friction, as when cannon bores are machined. Carnot reasoned that if the body of the working substance, such as a body of steam, is returned to its original state at the end of a complete engine cycle, then 'no change occurs in the condition of the working body' (Gyenis, 2017; Ebeling & Sokolov, 2005; Gibbs, 1906; Mayants, 1984; Mcculloch, 1876; Clausius, 1850; Mcgovern, 2013, February 5; 6.5. Irreversibility, Entropy Changes, and Lost Work, 2016; Lower, 2016, May 21; Lavenda, 2010; Carnot & Fox, 1986).
Moreover, the first law of thermodynamics, deduced from the heat-friction experiments of James Joule in 1843, expresses the concept of energy, and its conservation in all processes; the first law, however, is unable to quantify the effects of friction and dissipation.
Furthermore, in the 1850s and 1860s, the German physicist Rudolf Clausius objected to the supposition that no change occurs in the working body, and gave this 'change' a mathematical interpretation by questioning the nature of the inherent loss of usable heat when work is done, e.g. heat produced by friction. Clausius described entropy as the transformation-content, i.e. dissipative energy use, of a thermodynamic system or working body of chemical species during a change of state. This was in contrast to earlier views, based on the theories of Sir Isaac Newton, that heat was an indestructible particle that had mass (Atkins & De Paula, 2006;Clausius, 1865, April 24;Clausius, 1867;Giles, 2016, January 22;Hawking, 2005;Kuhn, 1970;Poincaré, 1968;Truesdell, 1980).
Later, scientists such as Ludwig Boltzmann, Josiah Willard Gibbs, and James Clerk Maxwell gave entropy a statistical basis. In 1877 Boltzmann visualized a probabilistic way to measure the entropy of an ensemble of ideal gas particles, in which he defined entropy to be proportional to the logarithm of the number of microstates such a gas could occupy. Henceforth, the essential problem in statistical thermodynamics, as stated by Erwin Schrödinger, has been to determine the distribution of a given amount of energy E over N identical systems. Carathéodory linked entropy with a mathematical definition of irreversibility, in terms of trajectories and integrability (Barrow, 1992; Carathéodory, 1909; Greene, 2003; Penrose, 1999; Stewart, 1996; Stewart, 2012; Warusfel & Ducrocq, 2004, September; Bogdanov & Bogdanov, 2013).
Additionally, there are two related definitions of entropy: the thermodynamic definition and the statistical mechanics definition. Historically, the classical thermodynamics definition developed first. In the classical thermodynamics viewpoint, the system is composed of very large numbers of constituents (atoms, molecules), and the state of the system is described by the average thermodynamic properties of those constituents; the details of the system's constituents are not directly considered, but their behaviour is described by macroscopically averaged properties, e.g. temperature, pressure, entropy, heat capacity. The early classical definition of the properties of the system assumed equilibrium. The classical thermodynamic definition of entropy has more recently been extended into the area of non-equilibrium thermodynamics. Later, the thermodynamic properties, including entropy, were given an alternative definition in terms of the statistics of the motions of the microscopic constituents of a system, modelled at first classically, e.g. Newtonian particles constituting a gas, and later quantum-mechanically (photons, phonons, spins, etc.). The statistical mechanics description of the behaviour of a system is necessary because the definition of the properties of a system using classical thermodynamics becomes an increasingly unreliable method of predicting the final state of a system that is subject to some process (Aczel, 2000; Balibar, 2002; Bogdanov & Bogdanov, 2009; Bogdanov & Bogdanov, 2010; Bogdanov & Bogdanov, 2012; Davies, 1993; Hawking, 2002; Hawking, 2011; Hoffmann, 1975; Pickover, 2008; Reeves, 1988; Ronan, 1988).
Finally and to conclude, this research paper is organized as follows: After the introduction in section 1, the purpose and the advantages of the present work are presented in section 2. Afterward, in section 3, the complex probability paradigm with its original parameters and interpretation will be explained and illustrated. In section 4, I will extend the real Boltzmann's entropy to the imaginary and complex probability sets and hence link this concept to my new paradigm. Moreover, in section 5, the complex probability paradigm will be applied to the first and second derivatives of all entropies in the sets R, M, and C. Furthermore, the Taylor series of all the entropies will be computed and illustrated in section 6. Additionally, I will link entropy in statistical mechanics to entropy in information theory in section 7. Also, in section 8, a final analysis will be done. Finally, I conclude the work with a comprehensive summary in section 9, and then present the list of references cited in the current research work.

The purpose and the advantages of the present work
Firstly, in this section the purpose and the advantages of this research paper will be presented. All our work in classical probability theory is to compute probabilities. The original idea in this paper is to add new dimensions to our random experiment, which will make the work deterministic. In fact, probability theory is by nature a nondeterministic theory; that means that the outcome of the events is due to chance and luck. By adding new dimensions to the event in R, we make the work deterministic, and hence a random experiment will have a certain outcome in the complex set of probabilities C. It is of great importance that stochastic systems become totally predictable, since we will be totally knowledgeable and able to foretell the outcome of chaotic and random events that occur in nature, for example in statistical mechanics or in all stochastic processes. Therefore, the work that should be done is to add to the real set of probabilities R the contributions of M, the imaginary set of probabilities, which will make the event in C = R + M deterministic. If this is found to be fruitful, then a new theory in statistical sciences and prognostics is elaborated, and this to understand deterministically those phenomena that used to be random phenomena in R. This is what I called 'the complex probability paradigm', which was initiated and elaborated in my eleven previous papers (Abou Jaoude, 2013a; Abou Jaoude, 2013b; Abou Jaoude, 2014; Abou Jaoude, 2015a, April; Abou Jaoude, 2015b; Abou Jaoude, 2016a; Abou Jaoude, 2016b; Abou Jaoude, 2017a; Abou Jaoude, 2017b; Abou Jaoude, 2017c; Abou Jaoude, El-Tawil, & Kadry, 2010).
Consequently, the purpose and the advantages of the present work are to:
(1) Extend classical probability theory to the set of complex numbers, hence relate probability theory to the field of complex analysis. This task was initiated and elaborated in my eleven previous papers.
(2) Apply the new probability axioms and paradigm to Ludwig Boltzmann's entropy; hence, extend the classical concept of real entropy to the imaginary and complex sets.
(3) Prove that all random phenomena can be expressed deterministically in the complex set C.
(4) Quantify both the degree of our knowledge and the chaos of the stochastic system.
(5) Draw and represent graphically the functions and parameters of the novel paradigm associated with Boltzmann's real, imaginary, and complex entropies.
(6) Show that the classical concept of entropy is always equal to zero in the complex probability set; hence, no chaos, no ignorance, no unpredictability, no uncertainty, no randomness, no disorder, and no information gain or loss exist in C (complex set) = R (real set) + M (imaginary set).
(7) Link the concept of entropy taken from statistical mechanics and thermodynamics to the same concept taken from information theory.
(8) Pave the way to apply the original paradigm to other topics in statistical mechanics, in stochastic processes, and to the field of prognostics in engineering. These will be the subjects of my subsequent research papers.
Concerning some applications of the novel proposed prognostic paradigm and as a future work on the suggested theoretical analysis results, it can be applied potentially to any stochastic dynamic system studied by statistical mechanics, whether gaseous or liquid like in thermodynamics, and which is subject to chaos and stochastic effects. Hence, the novel paradigm has no limitations and can be related to the entropy of any random dynamic system.
It is important to mention here that one essential and very well-known probability distribution was considered in the current research work, namely the uniform probability distribution (P_r = 1/Ω at equilibrium), although the original CPP model can be applied to any random distribution considered in my previous publications (Abou Jaoude, 2013a; Abou Jaoude, 2013b; Abou Jaoude, 2014; Abou Jaoude, 2015a, April; Abou Jaoude, 2015b; Abou Jaoude, 2016a; Abou Jaoude, 2016b; Abou Jaoude, 2017a; Abou Jaoude, 2017b; Abou Jaoude, 2017c; Abou Jaoude et al., 2010). This would surely lead to similar results and conclusions and prove the success of my novel paradigm. Consequently, a way to develop and further improve the results is to consider and study the system entropy at a non-equilibrium state. This would be a progress in this new paradigm and original field.
To conclude, compared with existing literature, the main contribution of the current research paper is to apply the original complex probability paradigm to the classical concept of entropy defined by Ludwig Boltzmann in statistical mechanics and hence to extend this concept from the real set to the imaginary and complex sets. This is the benefit of the proposed paradigm illustrated clearly throughout the whole work and this is why it is important.
The following figure summarizes the objectives of the current research paper (Figure 1).

The extended set of probability axioms
In this section, the extended set of probability axioms of the complex probability paradigm will be presented.

The original Andrey Nikolaevich Kolmogorov set of axioms
The simplicity of Kolmogorov's system of axioms may be surprising. Let E be a collection of elements {E 1 , E 2 , . . . } called elementary events and let F be a set of subsets of E called random events. The five axioms for a finite set E are (Benton, 1966a;Benton, 1966b;Feller, 1968;Montgomery & Runger, 2003;Walpole, Myers, Myers, & Ye, 2002): Axiom 1: F is a field of sets.
Axiom 2: F contains the set E.
Axiom 3: A non-negative real number P rob (A), called the probability of A, is assigned to each set A in F.
Axiom 4: Prob(E) is equal to 1, i.e. Prob(E) = 1.
Axiom 5: If A and B have no elements in common, the number assigned to their union is: Prob(A ∪ B) = Prob(A) + Prob(B); hence, we say that A and B are disjoint; otherwise, we have: Prob(A ∪ B) = Prob(A) + Prob(B) − Prob(A ∩ B). And we say also that: Prob(A ∩ B) = Prob(A) × Prob(B/A), where Prob(B/A) is the conditional probability. Moreover, we can generalize and say that for N disjoint (mutually exclusive) events A_1, A_2, ..., A_j, ..., A_N (for 1 ≤ j ≤ N), we have the following additivity rule: Prob(A_1 ∪ A_2 ∪ ... ∪ A_N) = Σ_{j=1}^{N} Prob(A_j). And we say also that for N independent events A_1, A_2, ..., A_j, ..., A_N (for 1 ≤ j ≤ N), we have the following product rule: Prob(A_1 ∩ A_2 ∩ ... ∩ A_N) = Π_{j=1}^{N} Prob(A_j).

Adding the imaginary Part M
Now, we can add to this system of axioms an imaginary part such that: Axiom 6: Let P_m = i × (1 − P_r) be the probability of an associated event in M (the imaginary part) to the event A in R (the real part). It follows that P_r + P_m/i = 1, where i is the imaginary number with i = √−1.
Axiom 7: We construct the complex number or vector Z = P_r + P_m = P_r + i(1 − P_r), having a norm |Z| such that: |Z|² = P_r² + (P_m/i)² = P_r² + (1 − P_r)². Axiom 8: Let Pc denote the probability of an event in the complex probability universe C, where C = R + M. We say that Pc is the probability of an event A in R with its associated event in M such that: Pc² = (P_r + P_m/i)² = |Z|² − 2iP_r P_m, and Pc is always equal to 1.
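The three added axioms can be checked numerically. Below is a minimal sketch, assuming only the definitions just given (P_m = i(1 − P_r), Z = P_r + P_m, Pc² = |Z|² − 2iP_r P_m); the function name `cpp_parameters` is illustrative, not from the paper.

```python
def cpp_parameters(pr: float):
    """Return (DOK, Chf, Pc) for a real probability pr in [0, 1]."""
    pm = 1j * (1 - pr)            # Axiom 6: imaginary probability P_m = i(1 - P_r)
    z = pr + pm                   # Axiom 7: complex probability vector Z
    dok = abs(z) ** 2             # DOK = |Z|^2 = P_r^2 + (1 - P_r)^2
    chf = (2j * pr * pm).real     # Chf = 2 i P_r P_m = -2 P_r (1 - P_r), a real number
    pc = (dok - chf) ** 0.5       # Axiom 8: Pc^2 = |Z|^2 - 2 i P_r P_m
    return dok, chf, pc

# Whatever P_r is, Pc comes out equal to 1:
for pr in (0.0, 0.25, 0.5, 0.75, 1.0):
    dok, chf, pc = cpp_parameters(pr)
    assert abs(pc - 1.0) < 1e-12
```

For P_r = 0.5, for instance, this gives DOK = 0.5 and Chf = −0.5, the extreme case of total ignorance.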

The purpose of extending the axioms
It is apparent from the set of axioms that the addition of an imaginary part to the real event makes the probability of the event in C always equal to 1. In fact, if we begin to see the set of probabilities as divided into two parts, one is real and the other is imaginary, understanding will follow directly. The random event that occurs in the real probability set R (like tossing a coin and getting a head), has a corresponding probability P r . Now, let M be the set of imaginary probabilities and let |Z| 2 be the Degree of Our Knowledge (DOK for short) of this phenomenon. P r is always, and according to Kolmogorov's axioms, the probability of an event.
A total ignorance of the set M makes: P_r = 0.5, and |Z|² in this case is equal to: |Z|² = 0.5² + (1 − 0.5)² = 0.5. Conversely, a total knowledge of the set in R makes: Prob(event) = P_r = 1 and P_m = Prob(imaginary part) = 0. Here we have |Z|² = 1 − (2 × 1) × (1 − 1) = 1, because the phenomenon is totally known, that is, its laws and variables are completely determined; hence, our degree of knowledge of the system is 1 = 100%. Now, if we can tell for sure that an event will never occur, i.e. like 'getting nothing' (the empty set), P_r is accordingly = 0, that is, the event will never occur in R. P_m will be equal to: P_m = i(1 − 0) = i, so |Z|² = 0² + 1² = 1, because we can tell that the event of getting nothing surely will never occur; thus, the degree of our knowledge (DOK) of the system is 1 = 100%. We can infer that we have always: 0.5 ≤ DOK = |Z|² ≤ 1. And what is important is that in all cases we have: Pc² = |Z|² − 2iP_r P_m = 1. In fact, according to an experimenter in R, the game is a game of chance: the experimenter doesn't know the output of the event. He will assign to each outcome a probability P_r and he will say that the output is nondeterministic. But in the universe C = R + M, an observer will be able to predict the outcome of the game of chance, since he takes into consideration the contribution of M, so we write: Pc = P_r + P_m/i = P_r + (1 − P_r) = 1. Hence Pc is always equal to 1. In fact, the addition of the imaginary set to our random experiment resulted in the abolition of ignorance and indeterminism. Consequently, the study of this class of phenomena in C is of great usefulness, since we will be able to predict with certainty the outcome of experiments conducted. In fact, the study in R leads to unpredictability and uncertainty. So instead of placing ourselves in R, we place ourselves in C and then study the phenomena, because in C the contributions of M are taken into consideration and therefore a deterministic study of the phenomena becomes possible.
Conversely, by taking into consideration the contribution of the set M we place ourselves in C, and by ignoring M we restrict our study to nondeterministic phenomena in R (Bell, 1992; Boursin, 1986; Dacunha-Castelle, 1996; Srinivasan & Mehata, 1988; Stewart, 2002; Van Kampen, 2006). Moreover, it follows from the above definitions and axioms that (Abou Jaoude et al., 2010): 2iP_r P_m will be called the chaotic factor in our experiment and will be denoted accordingly by 'Chf'. We will see why we have called this term the chaotic factor; in fact: Chf = 2iP_r P_m = 2iP_r × i(1 − P_r) = −2P_r(1 − P_r). In case P_r = 1, that is the case of a certain event, the chaotic factor of the event is equal to 0. In case P_r = 0, that is the case of an impossible event, Chf = 0. Hence, in both of these cases there is no chaos, since the outcome is certain and known in advance. In case P_r = 0.5, Chf = −0.5 (Figures 2-4). We notice that: −0.5 ≤ Chf ≤ 0, ∀ P_r: 0 ≤ P_r ≤ 1.
What is interesting here is that we have thus quantified both the degree of our knowledge and the chaotic factor of any random event, and hence we write now: Pc² = |Z|² − 2iP_r P_m = DOK − Chf. Then we can conclude that: Pc² = degree of our knowledge of the system − chaotic factor = 1; therefore Pc = 1 permanently. This directly means that if we succeed in subtracting and eliminating the chaotic factor in any random experiment, then the output will always occur with a probability equal to 1 (Dalmedico-Dahan & Peiffer, 1986; Dalmedico-Dahan, Chabert, & Chemla, 1992; Ekeland, 1991; Gleick, 1997; Gullberg, 1997; Science Et Vie, 1999).
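As a quick numeric cross-check of the two bounds and of Pc² = DOK − Chf = 1, here is a small sketch scanning P_r over [0, 1] with the closed forms DOK = 1 − 2P_r(1 − P_r) and Chf = −2P_r(1 − P_r) (a sketch, not part of the paper):

```python
prs = [j / 1000 for j in range(1001)]           # grid of P_r values in [0, 1]
doks = [1 - 2 * pr * (1 - pr) for pr in prs]    # DOK = 1 - 2 P_r (1 - P_r)
chfs = [-2 * pr * (1 - pr) for pr in prs]       # Chf = -2 P_r (1 - P_r)

assert 0.5 <= min(doks) and max(doks) <= 1.0    # 0.5 <= DOK <= 1
assert -0.5 <= min(chfs) and max(chfs) <= 0.0   # -0.5 <= Chf <= 0
# DOK - Chf = 1 at every point of the grid, hence Pc = 1 permanently:
assert all(abs(d - c - 1) < 1e-12 for d, c in zip(doks, chfs))
```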
The graph below shows the linear relation between both DOK and Chf (Figure 5). Furthermore, we need in our current study the absolute value of the chaotic factor, which gives us the magnitude of the chaotic and random effects on the studied system, materialized by a probability density function, and which lead to an increasing system chaos in R. This new term will be denoted accordingly MChf, or magnitude of the chaotic factor (Abou Jaoude, 2013a; Abou Jaoude, 2013b; Abou Jaoude, 2014; Abou Jaoude, 2015a, April; Abou Jaoude, 2015b; Abou Jaoude, 2016a; Abou Jaoude, 2016b; Abou Jaoude, 2017a; Abou Jaoude, 2017b; Abou Jaoude, 2017c; Abou Jaoude et al., 2010). Hence, we can deduce the following: MChf = |Chf| = −Chf = 2P_r(1 − P_r) ≥ 0, and Pc² = DOK + MChf = 1. The graph below (Figure 6) shows the linear relation between both DOK and MChf. Moreover, further figures show the graphs of Chf, MChf, DOK, and Pc as functions of the real probability P_r for any probability distribution and for a beta probability distribution. It is important to mention here that we could have considered deliberately any probability distribution besides the beta random distribution, like the continuous Gaussian normal distribution or the exponential distribution or the discrete Poisson or binomial random distributions, etc. Although the graphs would have been different, whether in 2D or in 3D, the mathematical consequences and interpretations would have been similar for any possible and imaginable probability distribution. This hypothesis is verified in my eleven previous research papers by means of many examples encompassing both discrete and continuous probability distributions (Abou Jaoude, 2013a; Abou Jaoude, 2013b; Abou Jaoude, 2014; Abou Jaoude, 2015a, April; Abou Jaoude, 2015b; Abou Jaoude, 2016a; Abou Jaoude, 2016b; Abou Jaoude, 2017a; Abou Jaoude, 2017b; Abou Jaoude, 2017c; Abou Jaoude et al., 2010).
To summarize and to conclude, as the degree of our certain knowledge in the real universe R is unfortunately incomplete, the extension to the complex probability set C includes the contributions of both the real set of probabilities R and the imaginary set of probabilities M. Consequently, this results in a complete and perfect degree of knowledge in C = R + M (since Pc = 1). In fact, in order to have a certain prediction of any random event, it is necessary to work in the complex set C, in which the chaotic factor is quantified and subtracted from the computed degree of knowledge to lead to a probability in C equal to one.

The complex probability paradigm applied to Boltzmann's entropy

In this section, the complex probability paradigm will be linked to Boltzmann's entropy, and hence the classical concept of entropy originally defined in the real set will be extended to the imaginary and complex sets.

The real entropy S R in R
Entropy is a logarithmic measure of the number of states with significant probability of being occupied; so mathematically we write: S = −k_B Σ_j p_j Ln p_j, where k_B is the Boltzmann constant, equal to 1.38065 × 10⁻²³ J/K or 8.6173324 × 10⁻⁵ eV/K. The summation is over all the possible microstates of the system, and p_j is the probability that the system is in the j-th microstate. This definition assumes that the basis set of states has been picked so that there is no information on their relative phases. In the novel complex probability paradigm we can deduce the following consequences: In the set R, we denote the corresponding real entropy by S_R, and Ω is the number of microstates that corresponds to the macroscopic thermodynamic state. We have for an isolated system in equilibrium, where Ω = Ω_max, the real probability equal to: p_j = P_r = 1/Ω; hence: S_R = −k_B Σ_{j=1}^{Ω} (1/Ω) Ln(1/Ω) = k_B Ln Ω. Then S_R is a divergent non-decreasing series, which means that in R chaos and disorder are increasing with time (Figures 15 and 16).
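The reduction from −k_B Σ_j p_j Ln p_j to k_B Ln Ω for Ω equiprobable microstates can be sketched as follows (the helper name `entropy` and the chosen Ω are illustrative):

```python
import math

K_B = 1.38065e-23   # Boltzmann constant in J/K, as given in the text

def entropy(probabilities):
    """S = -k_B * sum_j p_j ln p_j over the microstate probabilities."""
    return -K_B * sum(p * math.log(p) for p in probabilities if p > 0)

# Uniform case p_j = 1/Omega, i.e. an isolated system at equilibrium:
omega = 10
s_r = entropy([1 / omega] * omega)
assert math.isclose(s_r, K_B * math.log(omega))   # S_R = k_B Ln(Omega)
```

The same helper evaluates S for any other distribution, which is how the non-uniform entropies of this section could be explored numerically.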

The real entropy S R as a function of all the CPP parameters
In the real probability set R we have: S_R = k_B Ln Ω = −k_B Ln p, with p_j = P_r = p = 1/Ω. Figure 16. The real entropy S_R in R as a function of the real probability P_r = 1/Ω.
And from CPP we have: Chf = −2p(1 − p) = −2p + 2p² ⇒ 2p² − 2p − Chf = 0, which is a second-degree equation in p. So the discriminant is: Δ = 4 + 8Chf. Since −0.5 ≤ Chf ≤ 0, then 0 ≤ Δ ≤ 4; therefore the two real roots are: p_2 = (2 + √Δ)/4 = (1 + √(1 + 2Chf))/2 and p_1 = (2 − √Δ)/4 = (1 − √(1 + 2Chf))/2 = 1 − p_2. Therefore S_R = −k_B Ln p can be expressed in terms of the CPP parameters through these roots. All this holds since Chf(Ω), MChf(Ω), and DOK(Ω) are not monotonous functions of the strictly increasing variable Ω ∈ [1, +∞), just as Chf(p), MChf(p), and DOK(p) are not monotonous functions of the strictly decreasing variable p ∈ (0, 1]. We can check in every case that Pc = 1 always, as computed from the CPP axioms.
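The inversion just described, recovering the real probability p from the chaotic factor, can be sketched numerically (the function name is illustrative):

```python
import math

def p_from_chf(chf: float):
    """Solve 2p^2 - 2p - Chf = 0 for -0.5 <= Chf <= 0; returns (p_1, p_2)."""
    delta = 4 + 8 * chf              # discriminant, 0 <= Delta <= 4
    root = math.sqrt(delta)
    p2 = (2 + root) / 4              # p_2 = (1 + sqrt(1 + 2 Chf)) / 2
    p1 = (2 - root) / 4              # p_1 = (1 - sqrt(1 + 2 Chf)) / 2 = 1 - p_2
    return p1, p2

# Chf = -2p(1 - p) at p = 0.25 (or 0.75) equals -0.375; both roots are recovered:
p1, p2 = p_from_chf(-0.375)
assert math.isclose(p1, 0.25) and math.isclose(p2, 0.75)
assert math.isclose(p1 + p2, 1.0)
```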

The real entropy S R as a function of Chf alone
We have: S_R = −k_B Ln[(1 ± √(1 + 2Chf))/2] (Figure 23).

The real entropy S R as a function of DOK alone
Since from the CPP axioms we have: Pc² = DOK − Chf = 1 ⇒ 1 + 2Chf = 2DOK − 1, therefore: S_R = −k_B Ln[(1 ± √(2DOK − 1))/2] (Figure 24).

The real entropy S R as a function of MChf alone:
Since from the CPP axioms we have MChf = −Chf, then: S_R = −k_B Ln[(1 ± √(1 − 2MChf))/2] (Figure 25).

The real entropy S R as a function of DOK and Chf alone:
Since from the CPP axioms we have Pc² = DOK − Chf = 1 ⇒ DOK = 1 + Chf, so 1 + 2Chf = DOK + Chf; then: S_R = −k_B Ln[(1 ± √(DOK + Chf))/2].

The real entropy S R as a function of DOK and MChf alone
Since from the CPP axioms we have Pc² = DOK + MChf = 1 ⇒ DOK = 1 − MChf and MChf = −Chf, so 1 + 2Chf = DOK − MChf; then: S_R = −k_B Ln[(1 ± √(DOK − MChf))/2].

The real entropy S R as a function of Chf and MChf alone
Since from the CPP axioms we have MChf = −Chf, so 1 + 2Chf = 1 + Chf − MChf; then: S_R = −k_B Ln[(1 ± √(1 + Chf − MChf))/2] (Figure 28).

The complementary real entropy S̄_R in R
In the complementary real probability set to R, we denote the corresponding real entropy by S̄_R. The meaning of S̄_R is the following: it is the real entropy in the real set R which is related to the complementary real probability p̄_j = P_m/i = 1 − P_r. We have for an isolated system in equilibrium the complementary real probability equal to: p̄_j = 1 − 1/Ω = (Ω − 1)/Ω; hence: S̄_R = −k_B Σ_{j=1}^{Ω} p̄_j Ln p̄_j = −k_B (Ω − 1) Ln[(Ω − 1)/Ω], and S̄_R is a convergent non-decreasing series. In fact, lim_{Ω→+∞} S̄_R = k_B, which means that in the complementary real probability set to R chaos is increasing with time, and its corresponding entropy is converging to the Boltzmann constant k_B = 8.6173324 × 10⁻⁵ eV/K.
Figure 29. The complementary entropy S̄_R to S_R in R as a function of the real probability P_r = 1/Ω.
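The claimed convergence of S̄_R towards k_B can be checked numerically. The sketch below works in units of k_B and assumes the equilibrium complementary probability p̄ = 1 − 1/Ω, which gives S̄_R(Ω) = −k_B(Ω − 1) Ln((Ω − 1)/Ω) (the reconstruction used in this sketch):

```python
import math

K_B = 1.0   # work in units of k_B

def s_bar(omega: int) -> float:
    """Complementary real entropy at equilibrium, in units of k_B."""
    return -K_B * (omega - 1) * math.log((omega - 1) / omega)

values = [s_bar(om) for om in (2, 10, 100, 10_000)]
assert all(a < b for a, b in zip(values, values[1:]))   # non-decreasing in Omega
assert abs(s_bar(10_000) - K_B) < 1e-3                  # converges to k_B
```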

The complementary real entropy S̄_R as a function of all the CPP parameters
In terms of p we have: S̄_R = −k_B ((1 − p)/p) Ln(1 − p), and the complementary roots satisfy p̄_2 = 1 − p_1 and p̄_1 = 1 − p_2. Consequently, by substituting the roots p_1 and p_2 obtained above, S̄_R can be expressed as a function of Chf, DOK, and MChf, just as S_R was.

The relation between S̄_R and S_R
Since S_R = k_B Ln Ω ⇒ Ω = exp(S_R/k_B), therefore: S̄_R = −k_B (exp(S_R/k_B) − 1) Ln(1 − exp(−S_R/k_B)).

The real negative entropy NegS R in R
We define NegS_R = −S_R = −k_B Ln Ω, with the real probability equal to: P_r = 1/Ω, for an isolated system in equilibrium.
We can deduce that: lim_{Ω→+∞} NegS_R = −∞; therefore NegS_R is a divergent non-increasing series (Figures 34 and 35). Note that S_R ≥ 0 and NegS_R ≤ 0, and if S_R is maximum then NegS_R = −S_R is minimum, and vice versa. So when S_R = 0 = minimum, for Ω = 1 and P_r = 1, then NegS_R = 0 = maximum. Also, when S_R → +∞ then NegS_R → −∞. Therefore, if S_R measures in R the amount of disorder, of uncertainty, of chaos, of ignorance, of unpredictability, and of information gain in a random system, then, since NegS_R = −S_R, that is the opposite of S_R, NegS_R measures in R the amount of order, of certainty, of predictability, and of information loss in a stochastic system.

The real negative entropy NegS_R as a function of all the CPP parameters
Since NegS_R = −S_R = k_B Ln p, all the expressions of S_R in terms of Chf, DOK, and MChf derived above consequently hold for NegS_R with the opposite sign; for example, NegS_R = k_B Ln[(1 ± √(1 + 2Chf))/2] (Figures 36 and 37).

The complex entropy S M in M
In the set M, we denote the corresponding entropy by S_M. We have for an isolated system in equilibrium the imaginary probability in M equal to: P_m = i(1 − P_r) = i(1 − 1/Ω); hence: S_M = −k_B Σ_{j=1}^{Ω} P_m Ln P_m = −k_B i (Ω − 1) [Ln(i) + Ln((Ω − 1)/Ω)]. Now using Leonhard Euler's formula: e^{iθ} = cos θ + i sin θ, then for θ = π/2 + 2kπ, where k ∈ Z (the set of all integers), we get: e^{i(π/2 + 2kπ)} = i. Therefore: Ln(i) = i(π/2 + 2kπ), since Ln(x^θ) = θ × Ln(x) and Ln(e) = 1.
Note that for k = 0 ⇒ −Ln(i^i) = π/2 = 1.570796327. Thus we conclude that: S_M = i S̄_R + k_B (Ω − 1)((π/2) + 2kπ), where k ∈ Z and Ω ∈ N.
Knowing that N is the set of all natural numbers, if λ = k_B((π/2) + 2kπ) = the complex entropy constant, hence: S_M = i S̄_R + (Ω − 1)λ. And Re(S_M) = (Ω − 1)λ. Then: Im(S_M) = S̄_R.
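Using Python's complex arithmetic, the decomposition S_M = i S̄_R + (Ω − 1)λ can be verified directly for the principal branch k = 0 (a sketch in units of k_B, assuming the equilibrium forms reconstructed in this section):

```python
import cmath, math

K_B = 1.0
omega = 50
pm = 1j * (1 - 1 / omega)                 # imaginary probability P_m = i(1 - 1/Omega)

# Direct sum over Omega equal microstates: S_M = -k_B * Omega * P_m * Ln(P_m)
s_m = -K_B * omega * pm * cmath.log(pm)   # cmath.log uses the k = 0 branch of Ln

s_bar = -K_B * (omega - 1) * math.log((omega - 1) / omega)   # complementary entropy
lam = K_B * math.pi / 2                   # complex entropy constant for k = 0
assert cmath.isclose(s_m, 1j * s_bar + (omega - 1) * lam)    # S_M = i*S_bar + (Omega-1)*lambda
```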

The complex entropy S M in M as a function of all the CPP parameters
We have: S_M = i S̄_R + ((1/p) − 1)λ, with p = P_r = 1/Ω, so that S_M follows as a function of Chf, DOK, and MChf by substituting the roots p_1 and p_2 obtained above.

The relations between S_M, S̄_R, and S_R
Since S_R = k_B Ln Ω ⇒ Ω = exp(S_R/k_B), and moreover we have S_M = i S̄_R + (Ω − 1)λ, therefore: S_M = i S̄_R + (exp(S_R/k_B) − 1)λ (Figures 41 and 42).

The entropy S C in the set C
In the set C, we denote the corresponding entropy by S_C. We have for an isolated system in equilibrium the probability in C equal to: Pc = P_r + P_m/i = 1; hence: S_C = −k_B Σ_{j=1}^{Ω} Pc Ln Pc = 0 ⇒ dS_C = 0, ∀Ω ∈ [1, +∞), Ω ∈ N, and ∀P_r ∈ (0, 1], P_r ∈ R. Therefore, S_C is a constant series that is permanently equal to 0. That means that in the probability set C = R + M we have complete order, no chaos, no ignorance, no uncertainty, no disorder, no randomness, and no unpredictability, since all measurements are completely and perfectly deterministic (Pc = 1 and S_C = 0) (Figures 43-45).
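A one-line check of this vanishing entropy, in units of k_B (a sketch under the definitions above):

```python
import math

K_B = 1.0
for pr in (0.1, 0.5, 1.0):             # any real probability P_r
    pc = pr + (1 - pr)                 # Pc = P_r + P_m/i = 1 in C = R + M
    s_c = -K_B * pc * math.log(pc)     # each microstate contributes -k_B * Pc * Ln(Pc)
    assert s_c == 0.0                  # S_C = 0 whatever P_r and Omega are
```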
We can deduce from all the above that the entropy in the complex probability set C is identically null.

The complex probability paradigm applied to the entropies' derivatives

In this section, the complex probability paradigm will be linked to the first and second derivatives of Boltzmann's entropy, and consequently the classical concepts of entropy derivatives originally defined in the real set will be extended to the imaginary and complex sets.

The first derivative of S R in R
We have: dS_R/dΩ = k_B/Ω. Since Ω ≥ 1, then 0 < dS_R/dΩ ≤ k_B. The maximum of (dS_R/dΩ) occurs when Ω is minimum, that means when Ω = 1; in this case (dS_R/dΩ) = k_B/1 = k_B = 8.6173324 × 10⁻⁵ eV/K. The minimum of (dS_R/dΩ) occurs when Ω is maximum, that means when Ω → +∞; in this case (dS_R/dΩ) → k_B/(+∞) = 0⁺. Since lim_{Ω→+∞} S_R = lim_{Ω→+∞} k_B Ln Ω = +∞ and lim_{Ω→+∞} (S_R/Ω) = lim_{Ω→+∞} (k_B Ln Ω/Ω) = 0⁺ by L'Hôpital's rule, therefore S_R = 0 is an asymptotic direction.
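Both claims, dS_R/dΩ = k_B/Ω and the asymptotic behaviour S_R/Ω → 0⁺, can be confirmed numerically with a central finite difference (units of k_B; the step size is an arbitrary choice):

```python
import math

K_B = 1.0
s = lambda om: K_B * math.log(om)      # S_R(Omega) = k_B Ln(Omega)

h = 1e-4
for om in (1.0, 10.0, 1000.0):
    numeric = (s(om + h) - s(om - h)) / (2 * h)            # central-difference derivative
    assert math.isclose(numeric, K_B / om, rel_tol=1e-6)   # dS_R/dOmega = k_B/Omega

assert s(1.0) == 0.0                   # minimum of S_R, reached at Omega = 1
assert s(1e12) / 1e12 < 1e-10          # S_R/Omega -> 0+ (L'Hopital)
```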

The first derivative of the complementary real entropy S̄_R in R
In fact, using calculus we get: dS̄_R/dΩ = −k_B [Ln((Ω − 1)/Ω) + 1/Ω] ≥ 0, and lim_{Ω→+∞} S̄_R = k_B = 8.6173324 × 10⁻⁵ eV/K, which means that S̄_R = k_B is a horizontal asymptote (Figure 48). Consequently, S̄_R is non-decreasing and bounded above by k_B. And we have also, in terms of p: dS̄_R/dp = k_B [Ln(1 − p)/p² + 1/p] ≤ 0 (Figure 49).

The first derivative of the real negative entropy NegS R in R
We have NegS_R = −S_R = −k_B Ln Ω, hence dNegS_R/dΩ = −k_B/Ω ≤ 0; NegS_R is therefore a non-increasing function of Ω.

The first derivative of the complex entropy S M in M
Since S_M = iS̄_R + k_B(Ω − 1)((π/2) + 2kπ), we have dS_M/dΩ = i(dS̄_R/dΩ) + k_B((π/2) + 2kπ). We have from the previous calculations dS̄_R/dΩ ≥ 0 and lim Ω→+∞ S̄_R = k_B; that means that S_M is a convergent non-decreasing complex series that is equal to S̄_R in the real planes k_B(Ω − 1)((π/2) + 2kπ), depending on the values of 1 ≤ Ω < ∞ with Ω ∈ N and of k ∈ Z (Figure 52). Furthermore, where p = P_r = 1/Ω, we have S_M = iS̄_R + k_B((1/p) − 1)((π/2) + 2kπ), hence dS_M/dp = i(dS̄_R/dp) − (k_B/p²)((π/2) + 2kπ). Then: dS_M/dp = i × Im(dS_M/dp) + Re(dS_M/dp) (Figure 53).

The first derivative of the entropy S C in the set C
Since S_C = 0, then dS_C/dΩ = 0, ∀Ω ∈ [1, +∞), and dS_C/dP_r = 0, ∀P_r : 0 < P_r ≤ 1 (67) ⇒ dS_C = 0, ∀Ω ∈ [1, +∞) and ∀P_r ∈ (0, 1]; therefore S_C is a constant series always equal to 0 and is a horizontal line (Figures 54 and 55).

The second derivative of S R in R
Since d²S_R/dΩ² = −k_B/Ω² and Ω ≥ 1 ⇒ d²S_R/dΩ² ≤ 0, that means that S_R(Ω) is a curve concave downward. We have: if Ω = 1 ⇒ d²S_R/dΩ² = −k_B = Minimum of d²S_R/dΩ²; if Ω → +∞ ⇒ d²S_R/dΩ² → 0⁻ = Maximum of d²S_R/dΩ² (Figure 56).

Furthermore, since d²S_R/dp² = k_B/p² and p > 0 ⇒ d²S_R/dp² > 0, that means that S_R(p) is a curve concave upward. We have: if p = 1 ⇒ d²S_R/dp² = k_B/p² = k_B = Minimum of d²S_R/dp²; if p → 0⁺ ⇒ d²S_R/dp² = k_B/p² → +∞ = Maximum of d²S_R/dp² (Figure 57).
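Both concavity statements can be verified with a central finite-difference estimate of the second derivative:

```python
import math

K_B = 8.6173324e-5  # Boltzmann's constant in eV/K

def d2(f, x: float, h: float = 1e-4) -> float:
    # central finite-difference estimate of the second derivative of f at x
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def s_r_of_omega(om: float) -> float:
    return K_B * math.log(om)    # S_R(Omega): concave downward

def s_r_of_p(p: float) -> float:
    return -K_B * math.log(p)    # S_R(p) with p = 1/Omega: concave upward

assert d2(s_r_of_omega, 5.0) < 0                           # -k_B/Omega^2 <= 0
assert math.isclose(d2(s_r_of_p, 1.0), K_B, rel_tol=1e-4)  # minimum k_B at p = 1
```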

The second derivative of the complementary real entropy S̄_R in R
Since d²S̄_R/dΩ² = −k_B/(Ω²(Ω − 1)) ≤ 0 for Ω > 1, that means that S̄_R(Ω) is a curve concave downward.

Moreover, since S̄_R is also a function of p = P_r = 1/Ω and d²S̄_R/dp² ≤ 0, that means that S̄_R(p) is a curve concave downward. We have: if p → 1⁻ ⇒ d²S̄_R/dp² → −∞ (by L'Hôpital's rule) = Minimum of d²S̄_R/dp² (Figure 59: the graph of the second derivative of S̄_R(P_r) in R).

The second derivative of the real negative entropy NegS R in R
Since d²NegS_R/dΩ² = k_B/Ω² and Ω ≥ 1 ⇒ d²NegS_R/dΩ² ≥ 0, that means that NegS_R(Ω) is a curve concave upward (Figure 60). Where p = P_r = 1/Ω (73), we have d²NegS_R/dp² = −k_B/p². As 0 < p ≤ 1 ⇒ d²NegS_R/dp² ≤ 0, that means that NegS_R(p) is a curve concave downward. We have: if p = 1 ⇒ d²NegS_R/dp² = −k_B = Maximum of d²NegS_R/dp²; if p → 0⁺ ⇒ d²NegS_R/dp² → −∞ = Minimum of d²NegS_R/dp² (Figure 61).

The second derivative of the complex entropy S_M in M
Hence d²S_M/dp² = i(d²S̄_R/dp²) + (2k_B/p³)((π/2) + 2kπ). That means that S_M(p) is a complex curve concave downward that is equal to S̄_R(p) in the real surfaces k_B((1/p) − 1)((π/2) + 2kπ), depending on the values of 0 < p ≤ 1 and of k ∈ Z.

The second derivative of the entropy S C in the set C
Since S_C = 0, then d²S_C/dΩ² = 0 and d²S_C/dP_r² = 0. That means that S_C is a constant series that is equal to 0 and is a horizontal line with zero concavity (Figures 64 and 65).

The Taylor's series of all the entropies in R, M , and C
In this section, the complex probability paradigm will be linked to the Taylor series of Boltzmann's entropy, and hence the usual concept of the entropy Taylor series originally defined in the real set R will be extended to the imaginary and complex sets M and C.
The following figures recapitulate all the above calculations and figures (Figures 78 and 79).

In the following sections, we will illustrate and understand the tight link between Boltzmann's statistical mechanics entropy and Shannon's information theory entropy, using mathematical and physical examples and equations taken from different scientific fields, such as quantum mechanics and astrophysics.
There are close parallels between the mathematical expressions for the thermodynamic entropy, usually denoted by S, of a physical system in statistical mechanics, and the information-theoretic entropy, usually denoted by H, of Claude Shannon.

Equivalence of form of the defining expressions
The defining expression for entropy in the theory of statistical mechanics established by Ludwig Boltzmann and Josiah Willard Gibbs in the 1870s is of the form:

S = −k_B Σ_j p_j Ln(p_j)

where p_j is the probability of the microstate j taken from an equilibrium ensemble. The defining expression for entropy in the theory of information established by Claude Elwood Shannon in 1948 is of the form:

H = −Σ_j p_j log_b(p_j)

where p_j is the probability of the message m_j taken from the message space M, and b is the base of the logarithm used. Common values of b are 2, Leonhard Euler's number e ≅ 2.718281828..., and 10; the unit of entropy is the shannon (or bit) for b = 2, the nat for b = e, and the hartley for b = 10. Mathematically, H may also be seen as an average information, taken over the message space, because when a certain message occurs with probability p_j, the information quantity −log(p_j) will be obtained.
If all the microstates are equiprobable (a microcanonical ensemble), the statistical thermodynamic entropy reduces to the form given by Boltzmann:

S = k_B Ln(Ω)

where Ω is the number of microstates corresponding to the macroscopic thermodynamic state. Therefore S depends on temperature. If all the messages are equiprobable, the information entropy reduces to the Hartley entropy:

H = log_b |M|

where |M| is the cardinality of the message space M.
The logarithm in the thermodynamic definition is the natural logarithm. It can be shown that the Gibbs entropy formula, with the natural logarithm, reproduces all of the properties of the macroscopic classical thermodynamics of Rudolf Clausius.
The logarithm can also be taken to the natural base in the case of information entropy. This is equivalent to choosing to measure information in nats instead of the usual bits (or more formally, shannons). In practice, information entropy is almost always calculated using base 2 logarithms, but this distinction amounts to nothing other than a change in units. One nat is about 1/Ln(2) ∼ = 1.442695041 bits.
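A short Python sketch makes this base/unit bookkeeping concrete, including the reduction to the Hartley entropy in the equiprobable case:

```python
import math

def shannon_entropy(probs, base: float = 2.0) -> float:
    """H = -sum(p_j * log_b(p_j)): shannons/bits for b = 2,
    nats for b = e, hartleys for b = 10."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

probs = [0.5, 0.25, 0.25]
h_bits = shannon_entropy(probs, 2)        # 1.5 bits
h_nats = shannon_entropy(probs, math.e)
# Changing the base only changes the unit: H_bits = H_nats / Ln(2)
assert math.isclose(h_bits, h_nats / math.log(2))
# The equiprobable case reduces to the Hartley entropy log_b(|M|):
assert math.isclose(shannon_entropy([0.25] * 4, 2), math.log2(4))
```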
For a simple compressible system that can only perform volume work, the first law of thermodynamics becomes:

dE = −pdV + TdS
But one can equally well write this equation in terms of what physicists and chemists sometimes call the 'reduced' or dimensionless entropy, σ = S/k_B, so that:

dE = −pdV + k_B T dσ

Just as S is conjugate to T, so σ is conjugate to k_B T (the energy that is characteristic of T on a molecular scale).

Theoretical relationship
Despite the foregoing, there is a difference between the two quantities. The information entropy H can be calculated for any probability distribution (if the 'message' is taken to be that the event j which had probability p j occurred, out of the space of the events possible), while the thermodynamic entropy S refers to thermodynamic probabilities p j specifically. The difference is more theoretical than actual, however, because any probability distribution can be approximated arbitrarily closely by some thermodynamic system. Moreover, a direct connection can be made between the two. If the probabilities in question are the thermodynamic probabilities p j : the (reduced) Gibbs entropy σ can then be seen as simply the amount of Shannon information needed to define the detailed microscopic state of the system, given its macroscopic description. Or, in the words of Gilbert Newton Lewis writing about chemical entropy in 1930, 'Gain in entropy always means loss of information, and nothing more.' To be more concrete, in the discrete case using base two logarithms, the reduced Gibbs entropy is equal to the minimum number of yes-no questions needed to be answered in order to fully specify the microstate, given that we know the macrostate.
Furthermore, the prescription to find the equilibrium distributions of statistical mechanics -such as the Boltzmann distribution -by maximizing the Gibbs entropy subject to appropriate constraints (the Gibbs algorithm) can be seen as something not unique to thermodynamics, but as a principle of general relevance in statistical inference, if it is desired to find a maximally uninformative probability distribution, subject to certain constraints on its averages.
The Shannon entropy in information theory is sometimes expressed in units of bits per symbol. The physical entropy may be on a 'per quantity' basis (h), which is called 'intensive' entropy, instead of the usual total entropy, which is called 'extensive' entropy. The 'shannons' of a message (H) are its total 'extensive' information entropy, equal to h times the number of symbols in the message.
A direct and physically real relationship between h and S can be found by assigning a symbol to each microstate that occurs per mole, kilogram, volume, or particle of a homogeneous substance, then calculating the h of these symbols. By theory or by observation, the symbols (microstates) will occur with different probabilities, and this will determine h. If there are N moles, kilograms, volumes, or particles of the unit substance, the relationship between h (in bits per unit substance) and the physical extensive entropy in nats is:

S = k_B Ln(2) Nh

where Ln(2) is the conversion factor from the base 2 of Shannon entropy to the natural base e of physical entropy, and Nh is the amount of information in bits needed to describe the state of a physical system with entropy S. Landauer's principle demonstrates the reality of this by stating that the minimum energy E required (and therefore heat Q generated) by an ideally efficient memory change or logic operation irreversibly erasing or merging Nh bits of information will be S times the temperature:

E = Q = T k_B Ln(2) Nh

where h is in informational bits and E and Q are in physical Joules. This has been experimentally confirmed.
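For instance, Landauer's bound E = Q = T k_B Ln(2) Nh can be evaluated directly (here with k_B in SI units, J/K):

```python
import math

K_B_J = 1.380649e-23  # Boltzmann's constant in J/K (SI)

def landauer_energy(bits: float, temperature: float) -> float:
    """Minimum energy E = Q = T * k_B * Ln(2) * (number of bits), in Joules."""
    return temperature * K_B_J * math.log(2) * bits

# Erasing one bit at room temperature (300 K) costs at least ~2.87e-21 J:
e_min = landauer_energy(1, 300.0)
assert 2.8e-21 < e_min < 2.9e-21
```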
Temperature is a measure of the average kinetic energy per particle in an ideal gas (Kelvins = 2/3 × Joules/k_B), so the J/K units of k_B are fundamentally unitless (Joules/Joules); k_B is the conversion factor from energy in 3/2 × Kelvins to Joules for an ideal gas. If kinetic energy measurements per particle of an ideal gas were expressed as Joules instead of Kelvins, k_B in the above equations would be replaced by 3/2. This shows that S is a true statistical measure of microstates that does not have a fundamental physical unit other than the units of information, in this case 'nats', which is just a statement of which logarithm base was chosen by convention.

Leó Szilárd's engine
A physical thought experiment demonstrating how just the possession of information might in principle have thermodynamic consequences was established in 1929 by Leó Szilárd, in a refinement of the famous Maxwell's demon scenario. In this experiment, information-to-energy conversion is performed on a Brownian particle by means of feedback control, that is, by synchronizing the work given to the particle with the information obtained on its position. Computing energy balances for different feedback protocols has confirmed that the Jarzynski equality requires a generalization that accounts for the amount of information involved in the feedback.

Rolf Landauer's principle
In fact one can generalize: any information that has a physical representation must somehow be embedded in the statistical mechanical degrees of freedom of a physical system.
Thus, Rolf Landauer argued in 1961, if one were to imagine starting with those degrees of freedom in a thermalized state, there would be a real reduction in thermodynamic entropy if they were then reset to a known state. This can only be achieved under information-preserving microscopically deterministic dynamics if the uncertainty is somehow dumped somewhere else, i.e. if the entropy of the environment (or the non-information-bearing degrees of freedom) is increased by at least an equivalent amount, as required by the Second Law, by gaining an appropriate quantity of heat: specifically, k_B T × Ln(2) of heat for every bit of randomness erased. On the other hand, Landauer argued, there is no thermodynamic objection to a logically reversible operation potentially being achieved in a physically reversible way in the system. It is only logically irreversible operations, for example the erasing of a bit to a known state or the merging of two computation paths, which must be accompanied by a corresponding entropy increase. When information is physical, all processing of its representations, i.e. generation, encoding, transmission, decoding, and interpretation, are natural processes where entropy increases by consumption of free energy.
Applied to the Maxwell's demon/Szilard engine scenario, this suggests that it might be possible to 'read' the state of the particle into a computing apparatus with no entropy cost; but only if the apparatus has already been SET into a known state, rather than being in a thermalized state of uncertainty. To SET (or RESET) the apparatus into this state will cost all the entropy that can be saved by knowing the state of Szilard's particle.

Negentropy
Shannon entropy has been related by physicist Léon Brillouin to a concept sometimes called negentropy. In 1953, Brillouin derived a general equation stating that the changing of an information bit value requires at least k_B T × Ln(2) of energy. This is the same energy as the work Leó Szilárd's engine produces in the idealistic case, which in turn equals the quantity found by Landauer. In his book, he further explored this problem, concluding that any cause of a bit value change (measurement, decision about a yes/no question, erasure, display, etc.) will require the same amount, k_B T × Ln(2), of energy. Consequently, acquiring information about a system's microstates is associated with an entropy production, while erasure yields entropy production only when the bit value is changing. Setting up a bit of information in a sub-system originally in thermal equilibrium results in a local entropy reduction. However, there is no violation of the second law of thermodynamics, according to Brillouin, since a reduction in any local system's thermodynamic entropy results in an increase in thermodynamic entropy elsewhere. In this way, Brillouin clarified the meaning of negentropy, which was considered controversial because its earlier understanding could yield a Carnot efficiency higher than one. Additionally, the relationship between energy and information formulated by Brillouin has been proposed as a connection between the number of bits that the brain processes and the energy it consumes.
In 2009, Mahulikar & Herwig redefined thermodynamic negentropy as the specific entropy deficit of the dynamically ordered sub-system relative to its surroundings. This definition enabled the formulation of the Negentropy Principle, which is mathematically shown to follow from the 2nd Law of Thermodynamics, during order existence.

Black holes
Stephen Hawking often speaks of the thermodynamic entropy of black holes in terms of their information content. Do black holes destroy information? It appears that there are deep relations between the entropy of a black hole and information loss.

Quantum theory
Hirschman showed (Hirschman uncertainty) that Heisenberg's uncertainty principle can be expressed as a particular lower bound on the sum of the classical distribution entropies of the quantum observable probability distributions of a quantum mechanical state, the square of the wave-function, in coordinate, and also momentum space, when expressed in Planck units. The resulting inequalities provide a tighter bound on the uncertainty relations of Heisenberg.
It is not meaningful to assign a 'joint entropy', because positions and momenta are quantum conjugate variables and are therefore not jointly observable; mathematically, they cannot be treated as a joint distribution. Note that Hirschman's entropy is not equivalent to the Von Neumann entropy, −Tr(ρ Ln ρ) = −⟨Ln ρ⟩; it is said to account for the full information content of a mixture of quantum states.
Dissatisfaction with the Von Neumann entropy from quantum information points of view has been expressed by Stotland, Pomeransky, Bachmat and Cohen, who have introduced a yet different definition of entropy that reflects the inherent uncertainty of quantum mechanical states. This definition allows distinction between the minimum uncertainty entropy of pure states, and the excess statistical entropy of mixtures.

The fluctuation theorem
The fluctuation theorem provides a mathematical justification of the second law of thermodynamics under these principles, and precisely defines the limitations of the applicability of that law for systems away from thermodynamic equilibrium.

Final analysis
In this section, we carry out a final analysis of the whole research paper by drawing the conclusions that follow from the previous mathematical proofs and equations.
In the complex set C we have the entropy always equal to 0, so no loss and no gain but complete conservation of information. The Lavoisier principle in chemistry and science affirms that mass and energy are conserved. The Law of Conservation of Mass (or Matter) in a chemical reaction can be stated thus: in a chemical reaction, matter is neither created nor destroyed. It was discovered by Antoine Laurent Lavoisier (1743-94) about 1785. Therefore, this conservation principle applies also to information theory (Abou Jaoude, 2017b) and statistical mechanics.
Moreover, in M we have infinite different surfaces (depending on the values of , of k, and hence of ) and thus infinite similar curves for entropy embedded in these complex surfaces. In R, we have disorder, uncertainty, and unpredictability. In C we have order, certainty, and predictability since Pc = 1 permanently and entropy = 0 constantly. Additionally, in R we have chaos and imperfect and incomplete knowledge or partial ignorance. In C , which is a higher dimensional universe, we have chaos always equal to 0 and DOK = 1 continuously, thus complete and perfect and total knowledge of any stochastic system.
Furthermore, the extension of all random and nondeterministic phenomena in R to the set C leads to certain knowledge and sure events, since DOK = 1 and Pc = 1. Consequently, no randomness exists in C and all phenomena are deterministic in this complex set. Therefore, in C the prognostic is assured and definite. Table 1 summarizes the complex probability paradigm prognostic functions DOK, Chf, MChf, P_m/i, Z, and Pc for any number of microstates Ω (↑ = increases and ↓ = decreases). Table 2 summarizes the complex probability paradigm prognostic entropies S_R, S̄_R, NegS_R, S_M, and S_C in the probability sets R, M, and C for any number of microstates Ω.
Accordingly, at each instant in the novel prognostic model, the random entropy and the stochastic system microstate are certainly predicted in the complex set C, with Pc² = DOK − Chf = DOK + MChf maintained equal to one through a continuous compensation between DOK and Chf. This compensation holds from the initial instant, when Ω = 1 (one microstate), until the final instant, when the system is in equilibrium (Ω is maximum). We can understand also that DOK is the measure of our certain knowledge (100% probability) about the expected event; it does not include any uncertain knowledge (with a probability less than 100%). In computing Pc², we have eliminated and subtracted all the random factors and chaos (Chf) from our random experiment; hence no chaos exists in C, it exists (if it does) only in R. This yields a 100% deterministic experiment and outcome in C, since the probability Pc is continuously equal to 1. This is one of the advantages of extending R to M and hence of working in C = R + M. Hence, in the novel prognostic model, our knowledge of all the parameters and indicators (S, P_r, DOK, Chf, MChf, Pc, Z, etc.) is always perfect, constantly complete, and totally predictable, since Pc = 1 permanently, independently of any probability profile or random factors.
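The continuous compensation Pc² = DOK − Chf = 1 can be checked for any P_r; the sketch below assumes the paradigm's standard definitions Z = P_r + i(1 − P_r), DOK = |Z|², and Chf = −2P_r(1 − P_r):

```python
import math

def cpp_parameters(p_r: float):
    """CPP prognostic indicators for a real probability P_r, assuming
    P_m = i*(1 - P_r), so Z = P_r + i*(1 - P_r) as in the paradigm."""
    dok = p_r**2 + (1 - p_r)**2   # DOK = |Z|^2, degree of our knowledge
    chf = -2 * p_r * (1 - p_r)    # chaotic factor
    mchf = abs(chf)               # magnitude of the chaotic factor
    pc2 = dok - chf               # Pc^2 = DOK - Chf = DOK + MChf
    return dok, chf, mchf, pc2

# The compensation DOK - Chf = 1 holds for every P_r, e.g. P_r = 1/Omega:
for omega in (1, 2, 10, 1000):
    dok, chf, mchf, pc2 = cpp_parameters(1.0 / omega)
    assert math.isclose(pc2, 1.0)

# At Omega = 2 (P_r = 0.5), DOK is minimal (0.5) and MChf maximal (0.5):
assert cpp_parameters(0.5) == (0.5, -0.5, 0.5, 1.0)
```

The bounds 0.5 < DOK < 1, −0.5 < Chf < 0, and 0 < MChf < 0.5 stated in the conclusion follow directly from these expressions.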

Conclusion and perspectives
In the current paper we applied and linked the theory of Extended Kolmogorov Axioms to Ludwig Boltzmann's statistical mechanics theory and Claude Shannon's information theory. Hence, a tight bond between the new paradigm and entropy was established. Thus, the theory of 'Complex Probability' was developed beyond the scope of my previous eleven papers on this topic.
Moreover, as was proved and illustrated in the new model, when Ω = 1 (one microstate) and when Ω is maximum (the stochastic system is in equilibrium), the degree of our knowledge (DOK) is one and the chaotic factor (Chf and MChf) is 0, since the state of the random system is totally known. During the process of the system evolution (Ω > 1) we have: 0.5 < DOK < 1, −0.5 < Chf < 0, and 0 < MChf < 0.5. Notice that throughout this whole process we always have Pc² = DOK − Chf = DOK + MChf = 1 = Pc; that means that the phenomenon which seems to be random and stochastic in R is now deterministic and certain in C = R + M, after adding to R the contributions of M and hence after subtracting the chaotic factor from the degree of our knowledge. Furthermore, the probabilities of the system microstate corresponding to each value of Ω have been determined in the probability sets R, M, and C by P_r, P_m, and Pc respectively. Therefore, at each value of Ω, the Boltzmann theory parameters S, P_r, DOK, Chf, MChf, Pc, Z, etc. are surely predicted in the complex set C, with Pc maintained equal to 1 and S_C kept equal to 0 permanently. Furthermore, using all the graphs and simulations illustrated throughout the whole paper, we can visualize and quantify both the system chaos (Chf and MChf) and the certain knowledge (DOK and Pc) of the statistical mechanics model. This is certainly very interesting and fruitful and shows once again the benefits of extending Kolmogorov's axioms, and thus the originality and usefulness of this new field in applied mathematics and prognostic that can verily be called: 'The Complex Probability Paradigm.'
It is important to mention here that one essential and very well-known probability distribution was considered in the current research work, namely the uniform probability distribution (P_r = 1/Ω at equilibrium), although the original CPP model can be applied to any random distribution considered in my previous publications (Abou Jaoude, 2013a; Abou Jaoude, 2013b; Abou Jaoude, 2014; Abou Jaoude, 2015a, April; Abou Jaoude, 2015b; Abou Jaoude, 2016a; Abou Jaoude, 2016b; Abou Jaoude, 2017a; Abou Jaoude, 2017b; Abou Jaoude, 2017c; Abou . This will surely lead to similar results and conclusions and proves the success of my novel paradigm. Consequently, a way to develop and further improve the results is to consider and study the system entropy in the non-equilibrium state. This would be a progress on this new paradigm and original field.
It is also important to state that it is possible to compare the current results with the existing ones from both theoretical analysis and simulation studies.
Additionally, the dissemination of the results could be further explained and compared with some existing results and this will be the subject of a future research work.
As for prospective and future work and challenges, it is planned to further develop the novel proposed prognostic paradigm and to apply it to a wide set of stochastic and random systems, such as the analytic prognostic of vehicle suspension systems and of petrochemical pipelines (in their three modes: unburied, buried, and offshore) under the linear and nonlinear damage accumulation cases. Additionally, CPP will also be applied to prognostic using the first-order reliability method (FORM) in engineering, as well as to the random walk, which has huge applications in economics, physics, chemistry, and pure and applied mathematics.

Disclosure statement
No potential conflict of interest was reported by the author(s).