An introduction to non-smooth convex analysis via multiplicative derivative

In this study, the *-directional derivative and the *-subgradient are defined using the multiplicative derivative, making a new contribution to non-Newtonian calculus for use in non-smooth analysis. In analogy with the directional derivative and the subgradient used in non-smooth optimization theory, basic definitions and preliminary facts related to optimization theory are stated and proved, and the *-subgradient concept is illustrated with examples, such as absolute value and exponential functions. In addition, necessary and sufficient optimality conditions are obtained for convex problems.

The concept of non-Newtonian calculus is based on the definition of the multiplicative derivative. The multiplicative derivative measures the factor by which the value of a function changes as its variable changes, in contrast to the conventional derivative, which measures the rate at which the value changes. For several real-life models, such as population growth, growth and decay, and economic models in which the dependent variables increase or decrease exponentially, it is more important to know the factor by which the variables change than the rate at which they change. Some real-world motivation can be found in [35][36][37]. In addition, these types of models are often encountered in non-smooth optimization problems.
To motivate the use of the multiplicative derivative, we now present three applications. Two of them use discrete variables and the third uses a continuous variable. The first example involves the process of cell division, by which the number of cells in the body increases exponentially: a cell splits into two cells, which in turn split to form four cells, then eight, and so on. Since the division process causes the number of cells to grow exponentially, the multiplicative derivative is more appropriate than the conventional one for the continuous form of this model.
The second application is from computational theory. Some computer algorithms require an exponentially increasing amount of resources (e.g., time, computer memory or number of function evaluations) even though the problem size increases only linearly. For instance, an algorithm may take 20, 40, 80, ..., 2^n · 10 seconds for problems of size n = 1, 2, 3, ..., respectively, making it difficult to solve the problem for more than 20 variables.
To give another example for the continuous case, assume that you deposit y_0 dollars in a bank account and it grows to y_1 dollars after one year. The annual growth factor is y_1/y_0, but what is the monthly growth factor? If the monthly factor is c, then after 12 months the total amount will be y_1 = y_0 c^{12}; in other words, the monthly factor is c = (y_1/y_0)^{1/12}. If we instead assume that the balance is updated daily or hourly, we obtain c = (y_1/y_0)^{1/365} or c = (y_1/y_0)^{1/8760}, respectively. By expressing the balance over time as a function f, we obtain the formula c = (f(x+h)/f(x))^{1/h} for appropriate values of h. In other words, if we are working in terms of months, then c = (f(12)/f(0))^{1/12} = (y_1/y_0)^{1/12}; if we are working in terms of days, then c = (f(365)/f(0))^{1/365} = (y_1/y_0)^{1/365}. Finally, if the balance is updated continuously, then the growth factor is the limit of c = (f(x+h)/f(x))^{1/h} as h tends to zero.
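A minimal numerical sketch of this computation is given below (the balance values are hypothetical and chosen only for illustration):

```python
# Per-period growth factor c = (y1/y0)**(1/periods) for finer and finer updates.
y0, y1 = 1000.0, 1080.0             # hypothetical deposit and balance after one year

for periods in (12, 365, 8760):     # monthly, daily, hourly updates
    c = (y1 / y0) ** (1.0 / periods)
    print(f"{periods:5d} updates/year -> factor per period: {c:.8f}")

# With the balance modeled as f(t) = y0 * (y1/y0)**t  (t in years), the factor
# (f(t+h)/f(t))**(1/h) does not depend on h and equals the annual factor y1/y0.
def f(t):
    return y0 * (y1 / y0) ** t

h = 1e-6
print("continuous factor:", (f(0.5 + h) / f(0.5)) ** (1.0 / h))  # ~ y1/y0
```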
Motivated by applications such as these, Michael Grossman and Robert Katz [38] defined the *-derivative and used it to construct a non-Newtonian calculus in 1972. Later, another brief paper [39] was published in 1999. Since then, several applications of the *-derivative have been presented [40][41][42][43][44].
Our main aim in this paper is to generalize the multiplicative derivative to (not necessarily differentiable) convex functions and to state the properties of this generalization. This generalization contributes to the development of non-Newtonian calculus and can be used in non-smooth optimization theory.
The paper is organized as follows. In Section 2, the multiplicative derivative and the *-gradient are recalled. The *-directional derivative and the *-subdifferential are then defined in Section 3, and their properties are stated, proved and discussed. Using these new concepts, an optimality condition for convex optimization problems is given in Section 4. Section 5 concludes the paper.

Preliminaries
In this section, we provide some basic information about the multiplicative derivative, i.e. non-Newtonian calculus, which can be found in [38,39,41].

Definition 2.1:
Assume that the function f : IR → IR is positive valued. If the limit
\[
f^*(x) := \lim_{h \to 0} \left( \frac{f(x+h)}{f(x)} \right)^{1/h}
\]
exists, then f is said to be *-differentiable at the point x. The value of this limit is known as the *-derivative (or multiplicative derivative) of the function f at the point x and is denoted by f^*(x) or d^*f(x)/dx.
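As a quick worked illustration of this definition, take f(x) = e^{x^2}. Then
\[
f^*(x) = \lim_{h \to 0} \left( \frac{e^{(x+h)^2}}{e^{x^2}} \right)^{1/h} = \lim_{h \to 0} e^{\frac{(x+h)^2 - x^2}{h}} = \lim_{h \to 0} e^{2x + h} = e^{2x},
\]
so near x the value of f changes by the factor e^{2x} per unit change in x.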

Remark 2.1:
Since f is a positive-valued function, it can easily be seen from Definition 2.1 that f * (x) ≥ 0.
By using Definition 2.1, the *-derivative can be extended to functions of several variables. Let us consider a function f : IR^n → IR of n variables. The partial *-derivative of f with respect to x_i for i ∈ {1, 2, ..., n} can then be defined by fixing all the other variables x_k, k ∈ {1, 2, ..., i−1, i+1, ..., n}, and is denoted by
\[
\frac{\partial^* f(x)}{\partial x_i} = f^*_{x_i}(x) := \lim_{h \to 0} \left( \frac{f(x_1, \dots, x_i + h, \dots, x_n)}{f(x_1, \dots, x_i, \dots, x_n)} \right)^{1/h}.
\]
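For instance, for f(x_1, x_2) = e^{x_1 x_2} one fixes x_2 and obtains
\[
\frac{\partial^* f(x)}{\partial x_1} = \lim_{h \to 0} \left( \frac{e^{(x_1 + h) x_2}}{e^{x_1 x_2}} \right)^{1/h} = \lim_{h \to 0} e^{\frac{h x_2}{h}} = e^{x_2},
\]
and similarly ∂^*f(x)/∂x_2 = e^{x_1}.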

Definition 2.2:
Assume that the function f : IR^n → IR is positive valued. If all the partial *-derivatives of f exist, then the *-gradient of f is the vector-valued function defined and denoted as
\[
\nabla^* f(x) := \left( f^*_{x_1}(x), f^*_{x_2}(x), \dots, f^*_{x_n}(x) \right).
\]
The relationship between the *-derivative and the classical derivative was given in [39] as
\[
f^*(x) = e^{(\ln f)'(x)} = e^{f'(x)/f(x)}. \tag{1}
\]
In [41], the multiplicative chain rule for functions of two variables is given without proof. This rule can be generalized to functions of n variables, as shown in the following theorem.

Theorem 2.1: Let f : IR^n → IR be a positive valued function with all partial *-derivatives, and let the functions x_i : IR → IR, i ∈ {1, 2, ..., n}, be differentiable. Then the composite function t ↦ f(x_1(t), ..., x_n(t)) is *-differentiable and
\[
\left( f(x_1(t), \dots, x_n(t)) \right)^* = \prod_{i=1}^{n} f^*_{x_i}(x(t))^{\,x_i'(t)}.
\]
Proof: By using the classical chain rule, we find
\[
\frac{d}{dt} \ln f(x(t)) = \sum_{i=1}^{n} \frac{\partial \ln f(x(t))}{\partial x_i} \, x_i'(t),
\]
and therefore, by relation (1),
\[
\left( f(x(t)) \right)^* = e^{\frac{d}{dt} \ln f(x(t))} = \prod_{i=1}^{n} \left( e^{\partial \ln f(x(t))/\partial x_i} \right)^{x_i'(t)} = \prod_{i=1}^{n} f^*_{x_i}(x(t))^{\,x_i'(t)}.
\]
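As a quick check of this formula, take f(x_1, x_2) = e^{x_1 x_2} with x_1(t) = t and x_2(t) = t^2, so that f(x(t)) = e^{t^3} and hence (f(x(t)))^* = e^{3t^2} by (1). The product formula gives the same result:
\[
f^*_{x_1}(x(t))^{x_1'(t)} \cdot f^*_{x_2}(x(t))^{x_2'(t)} = \left( e^{x_2(t)} \right)^{1} \cdot \left( e^{x_1(t)} \right)^{2t} = e^{t^2} \cdot e^{2t^2} = e^{3t^2}.
\]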

*-Directional derivative and *-subdifferential
The partial *-derivative shows by what factor the value of a function changes along one of its coordinates while all the other coordinates are kept constant. Since in optimization theory this change needs to be measured along an arbitrary direction v, the following definition can be given.

Definition 3.1:
Let f : IR^n → IR be a positive valued function, and let v ∈ IR^n be a non-zero vector. The function f is said to be *-directionally differentiable at a point x in the direction v if the finite limit
\[
f^*(x; v) := \lim_{h \downarrow 0} \left( \frac{f(x + hv)}{f(x)} \right)^{1/h}
\]
exists. The following theorem gives the relation between the *-directional derivative and the (classical) directional derivative f'(x; v).

Theorem 3.1: Let f : IR^n → IR be a positive valued function. If f is directionally differentiable at a point x in a direction v, then f is *-directionally differentiable at x in the direction v, and
\[
f^*(x; v) = e^{f'(x; v)/f(x)}.
\]

Proof: By the definition of the *-directional derivative,
\[
f^*(x; v) = \lim_{h \downarrow 0} \left( \frac{f(x+hv)}{f(x)} \right)^{1/h} = \lim_{h \downarrow 0} e^{\frac{\ln f(x+hv) - \ln f(x)}{h}} = e^{(\ln f)'(x; v)} = e^{f'(x; v)/f(x)}.
\]
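For instance (anticipating Example 3.2 below), the convex function f(x) = e^{|x|} on IR is not differentiable at x = 0, but the directional derivative f'(0; v) = |v| exists for every v, so Theorem 3.1 gives
\[
f^*(0; v) = e^{f'(0; v)/f(0)} = e^{|v|}.
\]
Thus the *-directional derivative is available at points where the *-derivative itself does not exist.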
Let us define, for a vector w ∈ IR^n with positive components and v ∈ IR^n, the real number
\[
w^{v} := \prod_{i=1}^{n} w_i^{\,v_i}. \tag{2}
\]
By using this notation, we can give the connection between the *-directional derivative f^*(x; v) and the *-gradient ∇^*f(x) in the following theorem.

Theorem 3.2: Let f : IR^n → IR be a positive valued differentiable function. Then
\[
f^*(x; v) = \nabla^* f(x)^{\,v}.
\]
Proof: Similarly to the proof of Theorem 2.1, the proof of this theorem can be obtained from Theorem 3.1.
For convex functions, the *-directional derivative has the following property.

Theorem 3.3: Let f : IR^n → IR be a positive valued convex function. Then, for any x ∈ IR^n and any non-zero v ∈ IR^n, the *-directional derivative f^*(x; v) exists and
\[
f^*(x; v) = \inf_{h > 0} \left( \frac{f(x+hv)}{f(x)} \right)^{1/h}.
\]

Proof: For an arbitrary vector v ∈ IR^n, let us define the function ψ : (0, ∞) → IR as
\[
\psi(h) := \frac{1}{h} \ln \frac{f(x+hv)}{f(x)}.
\]
Let ε > 0; without loss of generality, we can choose ε < 1. Then, by using the convexity of the function f, it can be seen that ψ(h_1) ≤ ψ(h_2) whenever 0 < h_1 ≤ h_2 < ε. This means that the function ψ decreases as h ↓ 0. On the other hand, since we study the case h ↓ 0, we can assume 0 < h < ε. Since f is convex, it admits a subgradient w ∈ ∂f(x), and
\[
\psi(h) = \frac{1}{h} \ln \frac{f(x+hv)}{f(x)} \ge \frac{1}{h} \ln \left( 1 + \frac{h \, w^T v}{f(x)} \right).
\]
Since the above inequality is satisfied for all h ∈ (0, ε), the function ψ is bounded below. Therefore, since ψ is decreasing as h ↓ 0 and bounded below for h ∈ (0, ε) ⊂ (0, 1), the limit lim_{h↓0} ψ(h) exists and
\[
\lim_{h \downarrow 0} \psi(h) = \inf_{0 < h < \varepsilon} \psi(h). \tag{3}
\]
From the definition of the *-directional derivative, one has
\[
f^*(x; v) = \lim_{h \downarrow 0} e^{\psi(h)}.
\]
From (3), we obtain f^*(x; v) = \inf_{h>0} e^{\psi(h)}. Hence
\[
f^*(x; v) = \inf_{h > 0} \left( \frac{f(x+hv)}{f(x)} \right)^{1/h}.
\]

Remark 3.1:
As is seen from the proof of Theorem 3.3, for any ε > 0 one has
\[
f^*(x; v) = \inf_{0 < h < \varepsilon} \left( \frac{f(x+hv)}{f(x)} \right)^{1/h}.
\]
In the following theorems, we show that the *-directional derivative satisfies some useful properties needed for calculation, most of which are obtained from Theorem 3.3. When using Theorem 3.3, it is assumed that h ∈ (0, 1), in accordance with this remark.
Theorem 3.4: Let f : IR^n → IR be a positive valued convex function and λ > 0. Then
\[
f^*(x; \lambda v) = \left( f^*(x; v) \right)^{\lambda}.
\]

Proof: Let k := λh. Since 0 < λ and h ↓ 0, then k ↓ 0. Thus
\[
f^*(x; \lambda v) = \lim_{h \downarrow 0} \left( \frac{f(x + h \lambda v)}{f(x)} \right)^{1/h} = \lim_{k \downarrow 0} \left( \left( \frac{f(x + k v)}{f(x)} \right)^{1/k} \right)^{\lambda} = \left( f^*(x; v) \right)^{\lambda}.
\]

Theorem 3.5: Let f : IR^n → IR be a positive valued convex function. Then, for any v, w ∈ IR^n,
(a) f^*(x; v) ≤ e^{(f(x+v) − f(x))/f(x)};
(b) f^*(x; v + w) ≤ f^*(x; v) · f^*(x; w).

Proof: (a) By using Theorem 3.3, we obtain
\[
f^*(x; v) \le \left( \frac{f(x + hv)}{f(x)} \right)^{1/h} \quad \text{for all } h \in (0, 1).
\]
Since f is convex and h < 1, we get f(x + hv) = f((1−h)x + h(x+v)) ≤ (1−h) f(x) + h f(x+v), and hence, using ln(1 + u) ≤ u,
\[
f^*(x; v) \le \left( 1 + h \, \frac{f(x+v) - f(x)}{f(x)} \right)^{1/h} \le e^{(f(x+v) - f(x))/f(x)}.
\]
(b) Since a convex function is directionally differentiable and v ↦ f'(x; v) is subadditive (see [1]), Theorem 3.1 yields
\[
f^*(x; v + w) = e^{f'(x; v+w)/f(x)} \le e^{(f'(x; v) + f'(x; w))/f(x)} = f^*(x; v) \, f^*(x; w).
\]

Since a convex function f : IR^n → IR is locally Lipschitz continuous at any point x ∈ IR^n (for a proof, see [1]), the following property of the *-directional derivative can be presented.

Theorem 3.6: Let f : IR^n → IR be a positive valued convex function with a Lipschitz constant K at x ∈ IR^n. Then the function v ↦ f^*(x; v) satisfies the following inequality:
\[
f^*(x; v) \le e^{K \|v\| / f(x)}.
\]

Proof: From Theorem 3.3, one can see that, for every h ∈ (0, 1),
\[
f^*(x; v) \le \left( \frac{f(x + hv)}{f(x)} \right)^{1/h} \le \left( 1 + \frac{K h \|v\|}{f(x)} \right)^{1/h} \le e^{K \|v\| / f(x)}.
\]

In the literature, the subgradient and the subdifferential of a convex function are defined starting from the fact that f(y) ≥ f(x) + ∇f(x)^T (y − x) under the differentiability assumption; Clarke [2] defined the subgradient and the subdifferential by using this inequality. Now, we are going to define the *-subgradient and the *-subdifferential, but before that we need to mention a similar inequality which holds for the *-gradient.

Theorem 3.7: Let f : IR^n → IR be a positive valued convex function. If f has all its partial *-derivatives at x, then the following inequality is satisfied for all y ∈ IR^n:
\[
f(y) \ge f(x) + \ln \left( \nabla^* f(x)^{\, f(x)(y-x)} \right). \tag{4}
\]
Proof: It is clear that if y = x, then inequality (4) holds. Thus, assume that y ≠ x and set v := y − x. By Theorem 3.3, we can write
\[
f^*(x; v) \le \left( \frac{f(x + h v)}{f(x)} \right)^{1/h}, \quad h \in (0, 1).
\]
Then, using Theorem 3.2, it can be written as f^*(x; v) = ∇^*f(x)^{v}. Since f is convex and h < 1, we have
\[
f(x + hv) = f\big( (1-h)x + h(x+v) \big) \le (1-h) f(x) + h f(x + v) = f(x) + h \big( f(y) - f(x) \big),
\]
and therefore, using ln(1 + u) ≤ u,
\[
\ln \nabla^* f(x)^{\,v} = \ln f^*(x; v) \le \frac{1}{h} \ln \left( 1 + h \, \frac{f(y) - f(x)}{f(x)} \right) \le \frac{f(y) - f(x)}{f(x)}.
\]
Multiplying by f(x) and rearranging yields
\[
f(y) \ge f(x) + f(x) \ln \nabla^* f(x)^{\,v} = f(x) + \ln \left( \nabla^* f(x)^{\, f(x)(y - x)} \right),
\]
which is inequality (4).
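To see inequality (4) in a concrete instance, let n = 1 and f(x) = e^{x^2}, so that ∇^*f(x) = f^*(x) = e^{2x} by (1). Then
\[
\ln \left( \nabla^* f(x)^{\, f(x)(y-x)} \right) = f(x) \, (y - x) \ln e^{2x} = 2x \, e^{x^2} (y - x),
\]
and (4) becomes e^{y^2} ≥ e^{x^2} + 2x e^{x^2}(y − x), which is precisely the classical gradient inequality for this convex function; for instance, at x = 1 and y = 0 it reads 1 ≥ e − 2e = −e.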

Remark 3.2:
The term ln ∇^*f(x)^{f(x)(y−x)} on the right-hand side of inequality (4) in Theorem 3.7 is not a componentwise power of the vector ∇^*f(x). The reason is that (y − x) ∈ IR^n while f(x) ∈ IR, so the exponent f(x)(y − x) is a vector, and ∇^*f(x)^{f(x)(y−x)} is the real number defined by notation (2).

Definition 3.2:
The *-subdifferential of a positive valued convex function f : IR^n → IR at the point x ∈ IR^n is the set ∂^*f(x) of vectors v ∈ IR^n_+ such that
\[
f(y) \ge f(x) + \ln \left( v^{\, f(x)(y-x)} \right) \quad \text{for all } y \in IR^n, \tag{5}
\]
where each vector v ∈ ∂^*f(x) is called a *-subgradient of f at x.
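For a concrete feel of this definition, a small numerical sketch can test candidate *-subgradients of f(x) = e^{|x|} at x = 0 in the one-dimensional case (the grid and tolerance below are arbitrary choices):

```python
import math

def f(x):
    return math.exp(abs(x))

def is_star_subgradient(v, x=0.0, lo=-3.0, hi=3.0, steps=600):
    """Check inequality (5) for n = 1 on a finite grid of points y:
    f(y) >= f(x) + f(x) * (y - x) * ln(v)."""
    fx = f(x)
    for i in range(steps + 1):
        y = lo + (hi - lo) * i / steps
        if f(y) < fx + fx * (y - x) * math.log(v) - 1e-12:
            return False
    return True

for v in (0.2, 1 / math.e, 1.0, math.e, 3.0):
    print(f"v = {v:.4f}: *-subgradient of exp|x| at 0? {is_star_subgradient(v)}")
```

Only the candidates with ln v ∈ [−1, 1] pass the test, in line with Example 3.2 below.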

Now, some examples can be given to illustrate the *-subdifferential concept.
Example 3.1: Thus the function f is not *-differentiable at x = 0. On the other hand, since the function f is convex, the *-subdifferential of f at the point x = 0 exists. Now, let us find the *-subdifferential of the function f at a point x where f is *-differentiable. Using Theorem 3.2, we can write the *-subdifferential of f at a point x > 0. Then, let us calculate ln v for all y < 0.
By solving the first inequality for two cases, namely y > x and y < x, we find v = e^{1/x+1}. Then, substituting v = e^{1/x+1} into the second inequality, we obtain −y ≥ y, which always holds for y < 0.
Hence the result ∂^*f(x) = {∇^*f(x)} obtained in Example 3.1 can be generalized to functions that are *-differentiable at the point x. Consequently, the following theorem can be given.

Theorem 3.8: If a positive valued convex function f is *-differentiable at the point x, then the *-subdifferential of f at x is the singleton ∂^*f(x) = {∇^*f(x)}.
Before the proof of this theorem, we need to give the following lemma.

Lemma 3.1: Let f : IR^n → IR be a positive valued convex function, and define the set
\[
S := \left\{ v \in IR^n_+ : v^{\,d} \le f^*(x; d) \ \text{for all } d \in IR^n \right\}.
\]
Then ∂^*f(x) = S.

Proof: Let v ∈ ∂^*f(x) and d ∈ IR^n. Applying inequality (5) with y = x + hd for h > 0 gives f(x + hd) ≥ f(x) + h f(x) ln(v^d), so that
\[
\left( \frac{f(x + hd)}{f(x)} \right)^{1/h} \ge \left( 1 + h \ln v^{\,d} \right)^{1/h} \longrightarrow v^{\,d} \quad (h \downarrow 0),
\]
and hence v^d ≤ f^*(x; d); thus ∂^*f(x) ⊆ S. Conversely, let v ∈ S. From Theorem 3.3, one can obtain that
\[
f^*(x; d) \le \left( \frac{f(x + hd)}{f(x)} \right)^{1/h}.
\]
Therefore we get v^d ≤ (f(x+hd)/f(x))^{1/h}, where h ≤ 1. If we choose d := y − x and h = 1, it can be clearly seen that v^{y−x} ≤ f(y)/f(x), and since ln t ≤ t − 1,
\[
f(x) \ln v^{\,y-x} \le f(x) \ln \frac{f(y)}{f(x)} \le f(y) - f(x),
\]
that is, v satisfies inequality (5). Thus v ∈ ∂^*f(x), and S ⊆ ∂^*f(x). Now, we can give the proof of Theorem 3.8.

Proof of Theorem 3.8: Since f is *-differentiable at x, Theorem 3.2 gives f^*(x; d) = ∇^*f(x)^d for every d ∈ IR^n. By Lemma 3.1, v ∈ ∂^*f(x) if and only if v^d ≤ ∇^*f(x)^d for all d, that is, (ln v − ln ∇^*f(x))^T d ≤ 0 for all d ∈ IR^n, where ln is applied componentwise. Taking d and −d shows that ln v = ln ∇^*f(x), and hence v = ∇^*f(x). Since ∇^*f(x) ∈ ∂^*f(x) by Theorem 3.7, we conclude ∂^*f(x) = {∇^*f(x)}.

Example 3.2:
In this example, we consider the function f(x) = e^{|x|} and try to find the *-subdifferential of this function. Since the function f is *-differentiable at every x ≠ 0, Theorem 3.8 applies there; the interesting point is x = 0. As seen in this example, it is not an easy task to compute a *-subgradient at a non-differentiable point. This is an expected difficulty caused by the nature of non-smoothness, and the ordinary subgradient already exhibits the same difficulty. Alternatively, the *-subgradient can be used to determine whether a given point is an extremum point or not, which is discussed in Section 4. Although the subgradient can also be used to determine extremum points, the *-subgradient is particularly useful for exponential functions, whose rate of growth increases rapidly. On the other hand, the use of the *-subgradient in optimization algorithms for some classes of optimization problems may give more efficient results. At this point, it is useful to give the following theorem establishing the relation between the subgradient and the *-subgradient.

Theorem 3.9: Let f : IR^n → IR be a positive valued convex function. Then
\[
\partial^* f(x) = \left\{ \left( e^{w_1/f(x)}, e^{w_2/f(x)}, \dots, e^{w_n/f(x)} \right) : w = (w_1, \dots, w_n) \in \partial f(x) \right\}.
\]
Proof: First of all, let us recall the definition of the subgradient for a convex function f : IR^n → IR:
\[
\partial f(x) = \left\{ w \in IR^n : f(y) \ge f(x) + w^T (y - x) \ \text{for all } y \in IR^n \right\}.
\]
Assume w ∈ ∂f(x). We should show that the vector v := (e^{w_1/f(x)}, ..., e^{w_n/f(x)}) belongs to ∂^*f(x). Indeed, for all y = (y_1, y_2, ..., y_n) ∈ IR^n, we have
\[
\ln \left( v^{\, f(x)(y-x)} \right) = f(x) \sum_{i=1}^{n} \frac{w_i}{f(x)} (y_i - x_i) = w^T (y - x) \le f(y) - f(x),
\]
so v satisfies inequality (5).
The other direction of the theorem can be shown similarly.
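Using this theorem, the computation of Example 3.2 can be completed; the following short calculation illustrates it. For f(x) = e^{|x|} we have f(0) = 1 and ∂f(0) = [−1, 1], so
\[
\partial^* f(0) = \left\{ e^{w} : w \in [-1, 1] \right\} = [\, e^{-1}, e \,],
\]
while at any x ≠ 0 Theorem 3.8 gives ∂^*f(x) = {∇^*f(x)} = {e^{\operatorname{sign}(x)}}.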
As a result of this theorem, one can say that ∂^*f(x) is a non-empty compact set, since the subdifferential ∂f(x) of a convex function f is a non-empty compact set [1]. Non-emptiness is obvious. For the compactness of the set ∂^*f(x), recall that the continuous image of a compact set is compact; here ∂^*f(x) is the image of the compact set ∂f(x) under the continuous map w ↦ (e^{w_1/f(x)}, ..., e^{w_n/f(x)}). This fact can be given as the following corollary.

Corollary 3.1:
The set ∂ * f (x) is a non-empty compact set in IR n .

Optimality condition for convex problem
Let us give an optimality condition via the *-subgradient for the following non-smooth unconstrained optimization problem:
\[
\min_{x \in \mathbb{R}^n} f(x), \tag{6}
\]
where f : IR^n → IR is a positive valued convex function. This problem is also known as a convex problem.

Definition 4.1:
A point x^* ∈ IR^n is a global optimum of the problem (6) if
\[
f(x^*) \le f(x) \quad \text{for all } x \in \mathbb{R}^n
\]
is satisfied.
In general, the following definition helps us to find points satisfying the necessary optimality condition. In the convex case, it tells us whether a point is a global optimum or not.

Definition 4.2: A point x^* ∈ IR^n is called a *-stationary point of f if
\[
\mathbf{1} := (1, 1, \dots, 1) \in \partial^* f(x^*).
\]
The following theorem presents the relation between a global optimum and a *-stationary point for the unconstrained non-smooth convex problem (6).

Theorem 4.1: A point x^* ∈ IR^n is a global optimum of the problem (6) if and only if x^* is a *-stationary point of f.

Proof: First, assume that x^* is a *-stationary point of f. Then 1 ∈ ∂^*f(x^*), and inequality (5) with v = 1 gives, for all y ∈ IR^n,
\[
f(y) \ge f(x^*) + \ln \left( \mathbf{1}^{\, f(x^*)(y - x^*)} \right) = f(x^*) + \ln 1 = f(x^*).
\]
Consequently, x * is a global solution of the non-smooth optimization problem (6).
Conversely, assume that x^* is a global solution of the non-smooth optimization problem (6). Then f(x^* + hd) ≥ f(x^*) for all d ∈ IR^n and all h > 0, which implies
\[
f^*(x^*; d) = \lim_{h \downarrow 0} \left( \frac{f(x^* + hd)}{f(x^*)} \right)^{1/h} \ge 1 = \mathbf{1}^{\,d} \quad \text{for all } d \in \mathbb{R}^n.
\]
By Lemma 3.1, this means 1 ∈ ∂^*f(x^*); that is, x^* is a *-stationary point of f.
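As an illustration of this theorem, consider again the function f(x) = e^{|x|} of Example 3.2, for which ∂^*f(0) = [e^{−1}, e]. Since 1 ∈ [e^{−1}, e] = ∂^*f(0), the point x^* = 0 is a *-stationary point, and Theorem 4.1 confirms that it is the global optimum; indeed, f(x) = e^{|x|} ≥ 1 = f(0) for all x ∈ IR.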

Conclusion
In this study, a new concept known as the *-subgradient has been introduced with the help of the multiplicative derivative for use in non-smooth optimization theory. To achieve this, the *-directional derivative has been defined for positive-valued functions, and its properties have been proven. Since optimization theory deals with finding minimum function values, the objective function should be bounded below; otherwise, the function has no global optimum. In light of this fact, one can make the objective function positive by adding a sufficiently large number M so that f(x) + M > 0. Then, an inequality involving the *-gradient has been stated and proved. Using this inequality, we were able to define the *-subgradient in a way similar to how the subgradient is defined in the literature. In addition, the relationship between the *-directional derivative and the *-subgradient has been presented. Finally, an optimality condition for non-smooth unconstrained convex optimization problems has been proven.
In summary, the *-subgradient can be used to develop new methods in optimization theory and can be generalized to locally Lipschitz continuous functions in an analogous way. It may also offer advantages for certain types of optimization problems: a method based on the *-subgradient works with multiplicative factors rather than differences, and therefore operates on large values instead of small ones. Thus, it is a good idea to develop new methods along these lines for functions whose values are small.