A new density for transition properties within the similarity transformed equation of motion approach

ABSTRACT Similarity transformed equation of motion coupled cluster theory offers an efficient way of computing excited state energies by decoupling the space of singles from higher excitations. However, when computing properties with this method, one is left with a choice between an expensive method involving a transformation into the space of the singles and the doubles, or methods that approximate the full density. In this paper, we present a rigorous expectation value formulation of the density to compute transition properties and discuss its relation to other existing techniques. We confirm that the configuration interaction singles approximation we used in earlier studies oscillator strength values is a reliable one, but also that the current formulation provides a cost efficient improvement on it. GRAPHICAL ABSTRACT


Introduction
The coupled cluster (CC) approach is one of the most accurate and systematically improvable method of quantum chemistry [1,2]. The last 70 years has seen tremendous progress in the development of black-box single reference coupled cluster methods, now available in a CONTACT Róbert Izsák robert.izsak@kofo.mpg.de Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1., 45470 Mülheim an der Ruhr, Germany Supplemental data for this article can be accessed here. https://doi.org/10.1080/00268976.2020.1818858 wide variety of commercial as well as free of charge quantum chemistry software packages for calculating energy values and properties. Extensions capable of handling multi-reference systems [3][4][5][6] and excited states [2] have also been achieved.
The single reference CC method is generally extended to excited states using the equation of motion (EOM) approach [7,8], typically truncated at the level of single and double excitations (CCSD) which scales as O(N 6 ) with the number of basis functions N and has an O(N 4 ) scaling storage requirement. Thus, the development of lower scaling modifications to EOM-CCSD is a very active field of research [9]. The strategies available to reduce the computational cost of coupled cluster based excited state methods can be broadly classified into four categories. The first category consists of perturbative approximations of the ground and excited state coupled cluster wave-function parameters, such as CC2 [10] and EOM-CCSD(2) [11][12][13]. These methods have a close relationship to the so-called polarisation propagator methods [14,15]. The second category of approximations involves approximating the two electron repulsion integrals in EOM-CCSD calculations using density fitting [16,17] and/or semi-numerical approximations [18,19]. The third category, which has became very popular in recent times, is the truncation of the wave-function using local and natural orbitals [20][21][22][23][24][25][26][27][28].
Nooijen and Bartlett have proposed a fourth approach [29] in which a second similarity transformation is performed to decouple the single excitations from the space of higher order excitations. Consequently, in the similarity transformed EOM-CCSD (STEOM-CCSD) method one can obtain the excitation energy just by diagonalising the singles block of the Hamiltonian which leads to significant reduction of the computational cost without much loss of accuracy. The STEOM-CCSD method is closely related to the so-called Fock-space coupled cluster (FSCC) method [30][31][32][33][34]. The renewed interest in the STEOM-CCSD method over the last few years has produced an efficient new implementation [21], an automatic active space selection scheme [35,36], an implementation including explicit correlation [37] and spin orbit coupling effects [38], an extension to open-shell systems [39] and semi-conductor solids [40].
However, the calculation of excitation energies is just one part of the theoretical treatment of excited state phenomena, and the simulation of experimental spectra also require the calculation of transition properties. Since coupled cluster is a non-variational method, the Hellmann-Feynmann theorem does not apply. Therefore, it is not straightforward to calculate properties within the framework of coupled cluster methods. Nevertheless, the method of Lagrange multipliers [41,42] can be used to introduce a new set of parameters and to define an energy functional which is stationary with respect to the external perturbation. Gauss and co-workers have played a fundamental role in developing the theory of molecular properties [43], methods applicable to open shell systems and analytic derivatives within the framework of the coupled cluster method for ground [44][45][46] and excited states [47,48], to name only a few areas. The first implementation of the EOM-CCSD properties by Stanton and Bartlett used an expectation value formalism [8]. Later exact analytic first derivatives for EOM-CCSD were implemented by Gauss and Stanton using the socalled Z-vector technique [49]. Koch and co-workers [50] have implemented the transition moments and excited state properties for coupled cluster based excited state methods within the linear response approach. Analytic second derivatives for the EOM-CCSD method were reported by Krylov and co-workers [51], while Pal and co-workers [52] have reported the implementation of transition moments for the Fock space multireference coupled cluster (FSMRCC) method. The transition moment for STEOM-CC method was originally implemented by Nooijen and Bartlett using an EOM-CCSD like approximation [29]. Analytic gradients for various flavours of STEOM-CCSD have also been implemented by Nooijen and co-workers [53,54]. Landau has developed [55] a general theoretical framework for the calculation of response properties within STEOM-CC but no numerical results were reported.
For the practical applications of STEOM-CCSD to large molecules one needs to develop a scheme for calculating the transition moment which requires very little effort beyond the energy calculation and is compatible with the both canonical and localised orbital based implementations. Up to this point, we have been using a simple configuration interaction singles (CIS) formula for this purpose [56]. The aim of this paper is to develop an efficient scheme for the calculation of transition properties with the framework of STEOM-CCSD in a more rigorous way. It should be noted that the formulation we propose here is aimed mainly at transition properties, and to the extent it is developed here, it yields no improvement over the CIS formula for excited state rather than transition properties, such as the excited state dipole moment. The formulation of Nooijen and Bartlett [29] and Landau [55] offers a better description of these and may serve as the basis of more efficient implementations in the future.

Coupled cluster for excited states
The CC wave-function ansatz takes the form of an exponential operator, eT, acting on a reference function, which in the single reference CC case is typically the ground state closed shell Hartree-Fock determinant |0 . In practice,T and the corresponding energy are obtained by satisfying the projection conditions in which ab... ij... is a member of the excitation manifold andH = e −TĤ eT (2) is the similarity transformed coupled cluster (CC) Hamiltonian. The operatorT contains the ground state CC amplitudes, which in the CC singles and doubles (CCSD) level can be written aŝ in which the operators E d l are generators of the unitary group. The labels i, j, k . . . denote occupied, a, b, c . . . virtual orbitals. Once the ground state solution is known, the equation of motion (EOM) approach makes it possible to calculate excited states as an eigenproblem ofH.
Similarity transformed equation of motion (STEOM) theory is a modification of the EOM approach to compute excited state energies and properties. It relies on the following doubly similarity transformed Hamiltonian, where the curly braces {} indicate normal ordering with respect to the reference Hartree-Fock (HF) determinant. The labels m and e are reserved for active occupied and active virtual orbitals, respectively, while p, q, . . . may denote any orbital. The operatorŜ consists of two partŝ S =Ŝ + +Ŝ − , which can be written aŝ where the normal ordered expressions have been converted into equivalent expressions without normal ordering. Here, we have neglected the singles contributions, since they do not change the spectrum ofĜ. The STEOM Hamiltonian is diagonalised in the space of single excitations to yield the right and left hand side eigenvectors,R where the notationL 1 ,R 1 are reserved for the singles part of the operator, without the constant term. Note that unlikeR, the left vectors also have a significant doubles component, but we leave that discussion for a later section. Consider now the action of the transformation operator on the reference and singly excited determinants, in which |D and |T denote doubly and triply excited determinants.

The STEOM density in the space of single excitations
The use of the normal ordered operator {eŜ} leads to the complication that {eŜ} −1 = {e −Ŝ }. To overcome this problem, Nooijen and Bartlett [29] write the connected part of the STEOM Hamiltonian in the form Apart from the constant energy term, the operatorĜ can be assumed to be equivalent withĜ C . Furthermore, sinceĜ is to be represented in the space of the reference and single excitations, the second term can be neglected.
Since we are interested in computing the one electron reduced density matrix (1-RDM), γ p q , it also sufficient to consider terms that are linear inŜ. Thus, defining the 1-RDM can be written as since all other contributions are zero. Note, in particular, that contributions from {Ŝ 2 } vanish for the 1-RDM, as the quadratic terms only contribute to higher order reduced density matrices.
We may now systematically evaluate the necessary expressions, starting with Then, in which It is worth noting that this is equivalent to the configuration interaction singles (CIS) result for the same expression, in which case r 0 would be also zero. For the left hand side, in which Here, and in the following the tilde on an amplitude denotest The resulting expression is equivalent to the ground state CC 1-RDM in the space of the singles. Finally, where δ LR survives ifL andR correspond to the same state. We may write and Collecting all terms together, we have We will refer to this result as the full STEOM density within the space of single excitations. The CIS approximation to the full density can be obtained from this by assuming that the t and s amplitudes are zero.
It remains to discuss how the quantities that enter Equation (23) are obtained. Once the STEOM iterations converge, theT, theŜ ± andR 1 for the excited states amplitudes become available. The r 0 contribution to excited states can also be determined simply as [29] For the ground state, r 0 = 1 andR 1 = 0. The excited statê L 1 can be obtained as a general inverse oneR 1 is known, while l 0 = 0 for excited states. For the ground state left vector, l 0 = 1, while remaining parts can be related to the ground state lambda equations or be approximated by thê T amplitudes. Thus, our formulation also yields results for the ground state density, although the ground state CC density is preferred for this purpose. We will discuss this latter issue further in the next section.

Contributions to the STEOM density from double excitations
So far, only the singles contributions to the density were considered. While this is a good approximation for the right hand state vector Equation (7), the same compactness is not achieved for the left hand side, which has significant components in the space of doubles. For the discussion in this section, the left hand side operator in Equation (25) will be rewritten to include the doubles, Nooijen and Bartlett [29] proceed to derive a perturbative formula forL 2 , and then evaluate the 1-RDM with this extended left hand state. They even go one step further and recover the EOM eigenvectorsL andR by expanding the exponent inR and In the letter case,L now contains the doubles. While the issue of the inverse normal ordered operator has to be addressed, the details need not concern us here. It is sufficient to summarise the results that emerge from this transformation, The termL 1 is equal to the sum ofL 1 and terms arising from contractions betweenL 2 andŜ, which we will discuss in more detail below. There is also a doubles component of the right hand vector, Note that these contributions appear automatically in Equation (23). Following the same procedure a triples correction also arises in the EOM picture, but we will not investigate this in any further detail. With these definitions, Nooijen and Bartlett are able to use the usual EOM expectation value formula for property calculations, To evaluate the doubles correction, we must next consider the contributions from theL 2 andR 1 . Since our goal is to find a good approximation as close to CIS as possible, we will introduce further simplifications. For the ground state left hand side, we will assume [57,58] that while for the excited states, the singles are given bȳ after neglecting the terms fromL 2 andŜ amplitudes. The doubles are obtained from the analogue of Equation (29), These might be regarded as first order approximations, and leave the results of the previous section for the singles unchanged. To capture the most important doubles effects from the left hand side, let us consider its interaction with the singles of the right hand side, Here, we have truncated the commutator expansion ofĒ p q in Equation (30), introducing what might be called a CI approximation. Using these notations, we may rewrite 1-RDM in Equation (23) as follows We will refer to this variant as the full STEOM density with the L2 correction. The fact thatT 2 appears in Equation (23), while the corresponding (ground state)L 2 does not, introduces a source of imbalance that is compensated by the presence ofL 2 in Equation (35). We will investigate the effect of this contribution in the discussion below. While the formulation above should account for the most important contributions to the transition dipole moment, it is possible to consider further simple improvements using the quantities above. One is to update theL 1 amplitudes with the terms neglected in Equation (32) Furthermore, we may consider the interaction of the doubles in the EOM picture, and truncate the commutator expansion again, which yields a further contribution to the density We will collectively refer to these improvements as the STEOM-DD density (DD for doubles-doubles). Especially the last terms considered are quite expensive to store and compute. However, we expect that their contribution is small and they will be used primarily to check the quality of the L2 density.

The STEOM transition dipole moment
The calculation of oscillator strength requires the transition dipole T k , where ω k is the excitation energy of state k. If D p q = p| r|q is the dipole integral, and In other words, the transition dipole can be obtained using the density in Equation (23) or Equation (35) and substituting the left and right hand solutions corresponding to the ground state (L 0 ,R 0 ) and the kth excited state (L k ,R k ). Since the states are different, the terms with δ LR do not contribute. With the definition of the oscillator strength, we now have all the quantities necessary to benchmark the various methods in the next section. Notes: N R is the number of roots requested, and N G is the number of CIS guess states from which the active space of orbitals is determined. The superscript a refers to a calculation with N R = 15, N G = 60.

Computational details
To determine the accuracy of the STEOM density describe above and to compare it to the accuracy of the CIS formula we have been using so far, we have investigated a total of 92 singlet states from Thiel's test set [59]. To facilitate comparison with previous studies using this test set [60,61], we used TZVP basis set [62] and the reference values for the excitation energies and oscillator strengths were calculated at the CC3 level of theory [63]. The SCFCONV12 settings were used to minimise the effect of numerical noise on the molecular orbitals and other quantities entering the STEOM calculation. All STEOM-CCSD calculations were performed using a development version of the ORCA quantum chemistry package [64]. STEOM calculations require an appropriate choice of active orbital spaces. An automatic active space selection procedure using configuration interaction singles (CIS) natural orbitals [35,36]. By default, the number of STEOM roots (N R ) is the same as the number of CIS states (N G ), but in this paper we used N R = 30, N G = 60, unless otherwise specified. During the optimisation IP or EA states with a single excitation character less than 80% are also discarded.

Results and discussion
First, we examine the effect of various parameters on the STEOM calculation. For this purpose, we selected a number of states from Thiel's test set. The polyenes have a bright first excited state with a large oscillator strength and we have also included a few of the more difficult cases from the test set, see the data in Table 1. Once the number of STEOM roots N R is chosen, the number of CIS roots N G must be greater or equal to that number. Since in our earlier study [35] it turned out that the excitation energy is not very sensitive to the ratio of these two, by default we set N G = N R . This conclusion is also valid when comparing the first two columns of data in Table 1, except for the 1 B 2u state of s-tetrazine. However, this state is not fully converged with respect to the active space parameters, and in this case the larger N G value is to be preferred. Furthermore, the oscillator strength values are more sensitive to changing N G (up to 20%), even in cases when the energy doesn't change much. Thus, we will use the setting N R = 30, N G = 60 in the following. The oscillator strength values may also depend on N R directly, since the left states are obtained by the general inverse method [58]. However, the data in the third column (N R = 5, N G = 60) indicates that this effect amounts to a few percent only. Finally, the fourth column contains data on the effect of the L2 correction, which may be as large as 30%. Thus, all further oscillator values using the full density will contain this correction.
The main results of our paper are summarised in Tables 2 and 3. Table 2 contains 74 singlet states of small molecules from the Thiel set, mostly of a π → π * or an n → π * character, and also a few σ → π * states. Although Rydberg states are not correctly described without diffuse functions, we have also kept a total of three of these (n → R) for numerical comparison. As in the earlier work of Sous et al. [60], only states with a singles component larger than 87% at the CC3 level were evaluated for the present comparison, since the methods involved are appropriate for only for such states. In Table 3, the selection criteria was relaxed to include all states for which CC3 oscillator strength values were available [61], but even in these cases, the singles component is larger than 81% in all cases. The latter table contains results for the subset of nucleobases, consisting of π → π * and n → π * excitations. While previous studies report the excitation energies [59][60][61] and oscillator strength values [61] for CC3 and other methods, the STEOM-CCSD values are reported here for the first time for molecules within the Thiel set.
For the total set of 92 states, the CC3 and STEOM excitation energy values are in good agreement, the mean error and standard deviation of the STEOM results being −0.02 ± 0.10 eV, with a maximum absolute error of 0.32 eV. It should be remarked that two of the Rydberg states produce an error larger than 0.15 eV, although Table 2. Excitation energy (ω in eV) and oscillator strength (f in a.u.) values calculated at the CC3, CIS and STEOM levels of theory for the small molecules in Thiel's test set.   these are by no means the only large errors in the test set, the largest one corresponding to an n → π * excitation of thymine. To characterise the oscillator strengths computed at the various levels, they are plotted against each other in Figure 1 and the various linear regression results are provided in Table 4. Plotting the STEOM-CIS results against the CC3 ones, it is immediately clear that this approximation overestimates the CC3 densities by about 59% on average. However, the linear fit is quite good even in this case (R 2 = 0.965), and since the intercept parameter b is very small, this amounts effectively to a scaling of the CC3 values. Since in applications it is often the relative intensities that matter, we conclude that the CIS approximation to the STEOM density that we have used so far can be expected to yield reliable results for relative intensities. Turning to the full STEOM density, the extent of the overestimation is reduced to about 14%, and the quality of the fit is somewhat better (R 2 = 0.990). This is in excellent agreement with CC3 results, especially considering the much reduced cost of an STEOM calculation when compared to CC3. Plotting the two STEOM densities against each other, it turns out that the full STEOM  density yields an oscillator strength that is on average 70% of the CIS approximation and the correlation between the two is also very good (R 2 = 0.980). Nevertheless, the full density with the L2 correction is a significant improvement over the simple CIS formula and is set as the new default for STEOM calculations in ORCA. Finally, the last columns of Tables 2 and 3 also contain data on the more expensive STEOM-DD correction. Computing the linear regression parameters for STEOM-DD shows a slightly better agreement with CC3 values, the oscillator strength values being about 8% overestimated with R 2 = 0.990. The STEOM-DD values for the ground state dipole moments are tabulated in the supplementary material, and also yield good agreement with reference values. The improvement over the full STEOM density with the L2 correction is however negligible for transition dipoles, while the additional costs are considerable. Thus, especially for large molecules, we recommend skipping the DD correction since the increase in cost can be especially large. An intermediate solution is to update the L1 amplitudes only according to Equation (36). While this still requires the explicit construction of the L2 amplitudes, the contractions involved are cheaper, and according to the oscillator strength values tabulated in the supplementary material, essentially all the improvements of STEOM-DD can be recovered by correcting L1 only. Thus, this correction may be recommended, if affordable.
To illustrate the performance of the CIS approximation and the full density, two theoretical spectra are shown in Figure 2. The spectrum of formamide is an example for which using the full density has a relatively small effect, the CIS and full density spectra are close to each other and to the CC3 curve also. On the other hand, the case of s-tetrazine demonstrates the impact of the full Table 4. Linear regression results (y = ax + b) corresponding to Figure 1, Table 2 and Table 3. density well. In this case, the full density and CC3 curves are close to one another while the CIS curve significantly overestimates the intensity.

Conclusions
In this study, we have derived and implemented a STEOM density that improves on the simplest CIS approximation for computing transition properties, without demanding the same cost as a density calculation within the space of the singles and the doubles. We concluded that although the CIS formula is a good approximation in may practical applications, the full STEOM density with the L2 correction offers an improvement without much additional cost. The oscillator strength values obtained from this density correlate well with CC3 results, and thus the STEOM method not only yields reliable results for the excitation energy, but it can be used to reproduce and interpret experimental spectra.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the Max-Planck-Gesellschaft.