Transfer function fitting using a continuous Ant Colony Optimization ( ACO ) algorithm

An original approach is proposed in order to achieve the fitting of ultra-wideband complex frequency functions, such as the complex impedances, by using the so-called ACO (Ant Colony Optimization) methods. First, we present the optimization principle of ACO, which originally was dedicated to the combinatorial problems. Further on, the extension to the continuous and mixed problems is explained in more details. The interest in this approach is proved by its ability to define practical constraints and objectives, such as minimizing the number of filters used in the model with respect to a fixed relative error. Finally, the establishment of the model for the first and second order filter types illustrates the power of the method and its interest for the time-domain electromagnetic computation.


Introduction
In the optimization area, some classical gradient descentbased methods have been developed since several decades.These approaches, i.e. the gradient and derived (conjugate gradient) method, the simplex, and so on [1] can be qualified to be deterministic.They are based on the definition of a path, starting from a unique initial point towards the optimum to be found.As these methods start from only one point, the best efficiency is reached when this point is situated not far from the searched one.Moreover, since a multivariable function can present many local minima, only the solution located near the starting point will be attained.More recently, the heuristic approaches based on the numerical miming of biological phenomena have successfully been applied in optimization processes.In this case, the principle is very different; this process starts from a family of points randomly chosen in a previously defined research area.Clearly, it seems evident that the probability to attain the global optimum is increased.The most famous among all these is the genetic algorithm method, which considers the evolution of a population through some generations.This approach is based on the work of Holland and Goldberg [2], [3] for the binary coding, as well as on the approach developed by De Jong [4] using a more general coding.In all the cases, this kind of approach starts from a population of individuals, which represent the parameters to be optimized in the form of a string.These strings are assembled on the so-called chromosomes that constitute the characteristic of each individual.Then the method is based on three main phases: the initialization, the selection and the reproduction.Along an iteration process on different generations, the variables, modeled as chromosomes, are more and more in accuracy with an objective fixed beforehand.
Like the Genetic Algorithm approach, some other methods, which are so-called metaheuristics, have been developed to solve the complex problem that corresponds to the class of the NP complete problem.Mathematically, it means that the computational time needed to solve such problems, versus the number of parameter, increases more quickly than a polynomial function over time.Among these approaches, the most famous are the Simulated Annealing [5] and the Particle Swarm Optimization (PSO) |6].More recently the ant colony method initiated by Dorigo [7], [8], [9], [10] has been the subject of studies in different domains.The first algorithms were developed to solve a combinatorial problem, which corresponds to finding an optimum path between discrete points; this problem is clearly directly connected to the graph theory.As the original approach deals with discrete parameters, some recent modifications of the algorithm have been made to extend continuous variables to the mixed ones.One of the most contributions in this domain is the works of Socha [11]- [14].
In this paper, the ant colony approach is detailed for both discrete and continuous optimization.It is shown how to perform the fitting of complex frequency functions such as transfer function, input impedances and to automatically minimize the number of filters.Moreover, on the contrary to the direct approaches such as Vector Fitting [15], the objective is not to fit as accurately as possible the original curve, since it can result from measurements which can be noisy.
The resolution of the electromagnetic wave propagation in dispersive media using a time domain method is a frequency-dependant problem that can be solved efficiently if the complex frequency function is expressed under the first and second order filters expansion.Hence, the fitting of the curve in a set of filters results in some auxiliary ordinary differential equations that is solved by numerical temporal method.In such a case, we understand the importance of limiting the number of filters in order to reduce the time computation.Thus one of the objectives is to admit a fixed deviation of the fitted curve around the original curve and to include this constraint in the optimization process.
In the following section, the ant colony method is presented, first for discrete optimization, followed by the continuous problem.Afterwards, some results show how it is possible to optimize the fitting by fixing an error level while maintaining the number of filters as low as possible.Finally, the example of a conducting medium representation is established over a very large bandwidth.

Discrete problem
In order to quickly present the principle of the classical ant optimization approach, it is more practical to illustrate it on a simple example.The most classical problem to be solved by such an approach is the Travelling Salesman Problem (TSP) that is shortly described below.Considering N towns to be visited, the TSP consists in visiting all towns one and only one time and to minimize the path for coming back to the starting point.We can easily show that such a process is similar to finding the spanning tree of a graph constituted by N nodes and junctions between these nodes [9].This is a NP complete problem as the number of paths that has to be explored strongly increases as the power of the number of towns.
The basic principle of the discrete ant colonies method is explained by the following steps.First we have to choose a starting point among the different towns to be visited.We also choose an initial number of ants that will be launched on the grid of the graph composed of all possible connections between towns.Then, in the first step, all the ants move in the grid randomly to a second town.After that, they move to the next one avoiding to come back to an already visited town.Finally they come back home making one loop after travelling all towns.Similarly to real ants, artificial ants will proceed to pheromone deposit, this pheromone will be useful to other ants in the iterative process because, the next ants will follow the way containing more pheromone.The pheromone quantity is proportional to the inverse of the path length.The shortest path is then preferred by the following ants.But on the contrary of real ants, the pheromone deposit quantity is evaluated at the end of the trajectory.After that, when each ant has completed its solution, a pheromone deposit process along the edges of the grid is made.Let L k be the length of the path after one complete tour for the ant number k. (i,j) denotes the edge connecting the i and j towns, the deposit for the k th ant on the (i,j) edge after one tour is given by , withT k being the trajectory of the k th ant at the time t.Q is a constant parameter to be defined.Moreover some pheromone evaporatesat each iteration, and then an evaporation coefficient  is defined.As a result at the first iteration, a quantity of pheromone on each branch along the path is given by After, the ants begin a second iteration choosing different paths with a probability which is in direct connection with the quantity of pheromone along the different branches.The probability of choosing a path is given by the formula Where  is the amount of pheromone deposit at the transition level between states x and y and  is the desirability of state transition which behaves as 1/d, d being the distance of a considered edge.
Finally, we come back to the first step, taking into account the amount of pheromone along the branches of the graph in order to determine the probability of each ant to choose this path.Using such an algorithm, it becomes clear that the more ants take a path, the pheromone deposit will be more significant.It is also clear that the shortest the path is, the more pheromone deposit appears.As a result, such an algorithm always converges towards the shortest path.

Continuous problem
In the case of continuous problems, the previous algorithm cannot be applied.To overcome this drawback, some authors have developed particular algorithms.More generally, it should be interesting to deal with continuous and discrete parameters in the same algorithm.As a consequence, a mixed method has been developed, firstly presented by Socha [11] that keeps the same concepts as the discrete ACO method.
For the case of combinatorial optimization, the choices of the different paths of the graph are function of a discrete law of probability.In order to extend the process to continuous variables, Socha keeps the usual algorithm, but changes the discrete probability law into a continuous Probability Density Function (PDF) (figure 1) [11].That is the principle of the ACOR (Ant Colony Optimization for Reals) method.
As it is not possible to obtain all the values of the parameters in the continuous case, a table called solution archive is defined ( ) ( ) . . .( ) . . .( ) . . . . . .
The dimensions of the table are K by N where K is the ant number and N the number of parameters to be optimized.In the solution archive table, the s i k value represents the actual solution for the i th parameter of the k th ant.Two extra vectors are also defined to help finding the best ants: f(s k ) represents the value of the objective function for the k th ant and  k is a weight that corresponds to the "quality" of the solution.To define the  vector, the ants are classified into the table from the best one (lower objective function value in the case of the searching for a minimum) to the worst one.
As in the combinatorial ant algorithm, the convergence after several iterations can be attained due to the pheromone deposit as it is shown below.The iterative algorithm is then made following the sequences hereafter.
The first step consists in the initialization of the solution archive table.This is made by randomly choosing the s values for each parameter according to a selected distribution.In the initialization process, a uniform distribution of the mean values and a Gaussian distribution around these values is usually chosen.This Gaussian distribution is denoted g in the following.The initial chosen values, being uniformly distributed, they constitute the mean values for the next process in the iteration step.The probability for each variable will follow a multi-kernel law such as 1 ( ) .( , , ) In this formula, the means value and the standard deviation are chosen in order to cover the entire research domain according to the formula (2 1) 2 Where a i and b i are respectively the lower and upper bound of the i th parameter.
Then, the ACOR algorithm begins; it consists in ordering all the ants in the solution archive table according to their objective function value.The best ant will be the first, so we have 1 2 ...
The  values correspond to the probability of selecting one ant.In the present case, a uniform or a Gaussian law can be chosen.For the particular presented case we consider a Gaussian distribution whose standard deviation is a parameter of the algorithm and the mean is equal to 1, which corresponds to the first ant in the table.The Gaussian distribution gives a weight according to the formula 2 2 2 ( 1) Then the probability of selecting an ant in the following is given by The process continues by selecting randomly m new ants for each parameter; the probability follows a multi-kernel law such as 1 ( ) .( , , ) The objective function is computed next.After that, the (k+ m) ants are reordered from the best to the worst one and the m last is taken out of the archive table.At this point, we can notice that this last process is similar to pheromone evaporation, it corresponds to cancelling some information that are not in agreement with the objective to be reached.It is also important to point out that the K ant values will be assimilated to the K mean values in the multi-kernel probability law, the standard deviation will also decreases when the iteration number increases following where is a parameter of the algorithm, that can be in relation with the current iteration number i_iter (as 1/i_iter) so that the  value decreases as i_iter increases.

Model and objective function
In some EMC applications it is useful to have a mathematical representation function of frequency of impedances, material parameters or transfer functions, particularly in order to include the frequency dependent models for time domain numerical computations, such as the FDTD method.For example, a first or second order model of filter can be used to represent the Debye or Lorentz behavior of material, the summation of filters can represent a macromodel of the input impedance of a passive component.It is well known that a decomposition of a complex frequency-dependent parameter of Maxwell's equations, such as () or () in a collection of first order complex filters, leads to a direct solving of a partial derivative system of first order without the need to calculate a convolution product.As a consequence, if we have some numerical or experimental issues of the frequencydependent parameter, we can use some fitting methods to extract an adequate model.In order to build the fitting of complex curves, the following process is often applied.First, a number of filters is selected, and the quality of the solution is determined by comparing the error that results from the difference between the two complex curves, i.e. initial data and reconstructed data.If this error is too high, the number of filters is increased.In a second part, the quality of the result relies on the best fitting of the initial data.This generally imposes an important number of poles in the fitted solution and, sometimes, the generation of non-physical filters to compensate some variations that are not in agreement with the initial data.
The strategy defined above is generally efficient but the result is an overestimation of the number of poles, which can penalize the solving system that includes such a model.When the initial data is obtained, for example, from measurements, the fitting process does not match perfectly.Hence a more efficient strategy has to be chosen.To present our approach, we choose a model that is expressed as a summation of filters that can be resonant or (RC) law-pass ones.Figure 2 shows a typical example of resonant filter that is similar to the one used in the vector fitting approach.As, in this case, this impedance is composed of a series of parallel circuits, we can easily give the analytical expression of the admittance of each component.The expression (12) gives a resonant filter as defined in the Vector Fitting approach.In the case of first order filters the expression ( 13) is generally used to represent relaxation phenomena 1 1 ( ) ) Then the total impedance is the summation in series of such parallel circuits, and then is expressed as In order to fit curves, we start from a complex dataset, called Z ref ().Then we impose two main constraints.The first one, which is our main objective, consists in making a fitting using a minimum number of filters of each kind (resonant or law pass) The second constraint consists in fixing the accuracy of the fitting by controlling the global error on the whole bandwidth.This is made by imposing an accuracy of the mean square value on the whole bandwidth.Since different objective functions can be chosen two of them will be tested in the following; it is shown that the behavior is different according to this choice.
In the first approach, the objective is to find the minimum number of filters with the constraint that the error is lower than the fixed level on the whole bandwidth.
Mathematically, this can be written as max minimize ( ) 0.01 -err 0 0 with N max the maximum number of filters, n the current number of filter and p the expected value of the relative error, in percent over the bandwidth.The error term called err is defined as with Z( i ) the evaluated value of the impedance at a given frequency, Z ref ( i ) the exact value to be fitted at the same frequency,  a fixed level of error for the relative error.The summation operation is made on the values whose relative error is greater than the fixed threshold.In this first approach, as pointed out in the above formulas, the objective function value has to be lower than a precise value on the global bandwidth.Consequently, we expect the global behavior of the curve to be lower than the given level of error.
The second way that is investigated has a different objective.
In this case, we aim to have an error around a fixed value.In this case, the constraint is quite different 1 ( ) 0.01 The minimization procedure is quite the same as the previous one, the notable difference is the err term (eq-18) that must be as lower as possible, that is to say, the relative error has to be close to the fixed value.In fact, the relative error will oscillate around the fixed value:

Application to resonant input impedance fitting
In order to illustrate the process, we choose to fit a simple input impedance of a simple loaded transmission line.For this application, we have chosen a line of length 50 cm, whose characteristic impedance is 50 .A lossless line has been chosen, the input impedance of the transmission line loaded by 200  real impedance is considered over [10 kHz -1.1 GHz] bandwidth.
The input impedance contains five resonant filters over the considered frequency bandwidth.Now we apply the first optimization process defined above for different error level suited on the whole bandwidth.For this application, the model is established with second order filters only.Then, two error levels are chosen.First, a 10% error in the reconstruction is fixed.Figure 3 shows a good fitting of the curve, the error is equal to or lower than -20 dB (equivalent to p10%) on the whole bandwidth, as expected.In the present case, five filters are founded to give such a result.Then, still using the first objective function, a 1% error has been fixed.In this case a good result is also founded.As expected, the number of filters is greater than in the previous cases, because 9 filters are needed to have a proper reconstruction with relative error lower than -40 dB over the frequency bandwidth (figure 4).
The second criterion is now tested with the same relative error levels.Figures 5 and 6 show the results.The errors oscillate respectively around -20dB and -40 dB as expected.The numbers of filters necessary to obtain these results are 7 and 10, respectively.
As a conclusion, we can say that the first solution is more efficient in terms of computational resources when applied in time domain electromagnetic codes such as FDTD.Moreover, it is important to notice that using the classical Vector Fitting would not allow to control the error around the expected function.In fact, some peak of error would appear due to the lack of filters.

Application to lossy media fitting on a very large bandwidth
In the electromagnetic compatibility, antenna and radar domains we usually have to consider lossy media such as soils and non-perfectly conductive material sheets.Here the problem posed is to take into consideration such a material in time domain codes with time-dependent characteristics.In such a case, it is necessary to compute some extra equations in every discretization cell.As a result we understand the importance of decreasing the number of equations (i.e. the number of filters) to solve in each computational cell.The method is applied for the first constraint set ( 15), ( 16) and on a very large frequency bandwidth.We start from the surface impedance of a conductive thin material described by the following analytical expression coth( ) Where  is the material conductivity, d is the thickness of the material, k m is the complex wavenumber, being given by (1 ) / 2 m k j    .
In order to fit the surface impedance we will consider the model proposed by Jansson et al [16], who suppose the surface impedance can be decomposed as It is easy to show that each filter in the summation is nothing else than a (RL) filter.Then, it can be written as The first part is the low frequency resistor The fitting of the surface impedance has been made over [10 MHz -10 GHz] bandwidth for a 10% expected accuracy.Figure 7 shows that the red curve is situated between the two black ones that give the maximum deviation value +/-10%.The frequency model response oscillates around the reference one given by (20).To obtain such a result, only four filters have been used.
We have seen that the method gives the expected results, but an important error on the result (10%) is observed in the present case.With this result, we can ask what is the consequence of such a difference on the solutions provided by the electromagnetic simulation ?This problem is illustrated next for a cavity application.A rectangular cavity with lossy walls is studied.The conductivity of the walls is 10 6 S/m and the thickness is 1 mm.The cavity is fed by a monopole and the problem is solved using a FDTD approach.The dimensions of the cavity and the different geometrical characteristics are given in the figure 8.The field in the cavity is observed for two situations: a very accurate fitting case where the number of filters is chosen to have a relative error lower than 1% and the fitting with four filters and an error of 10% determined above.
Figure 9 shows a very good agreement between the two results.As a consequence it appears that if an error of about 10% can be admitted; there is no significant propagation error between the input values (the surface impedance) and the output (the electromagnetic field).As a very small error is observed, therefore, the number of filters to be considered has decreased, consequently the computational time and memory will also decrease.

Conclusions
In this paper we showed the ability of the continuous ant colony optimization approach for the fitting error control of the complex curves such as localized impedances and frequency-dependent material parameters.Such an approach allows us to handle various objective functions.Particularly, the possibility of introducing the number of filters was showed, used in order to construct the solution, as the objective function to minimize.Such resulting models can be used in numerical time domain method such as the FDTD.The decrease in the number of filters provides a limitation of computational time, particularly for solving structure with frequency-dependant media.This point is more pertinent since a mitigated accuracy of fitting does not really impact the accuracy of solutions in many applications.

Figure 1 .
Figure 1.From the discrete probability law to the continuous one

Figure 3 .
Figure 3. Impedance seen from the input of the line for a -20 dB error constraint.

Figure 4 .
Figure 4. Impedance seen from the input of the line for a -40 dB error constraint.

Figure 5 .
Figure 5. Impedance seen from the input of the line for a -20 dB error constraint.

Figure 6 .
Figure 6.Impedance seen from the input of the line for a -20 dB error constraint.

Figure 7 .
Figure 7. Fitting of the surface impedance and approximation error.

Figure 9 .
Figure 9. Field in the enclosure for a perfect and a 4 filters fitting of the surface impedance.