RNA polymerase II is recruited to DNA double-strand breaks for dilncRNA transcription in Drosophila

ABSTRACT DNA double-strand breaks are among the most toxic lesions that can occur in a genome and their faithful repair is thus of great importance. Recent findings have uncovered local transcription that initiates at the break and forms a non-coding transcript, called damage-induced long non-coding RNA (dilncRNA), which helps to coordinate the DNA transactions necessary for repair. We provide nascent RNA sequencing-based evidence that RNA polymerase II transcribes the dilncRNA in Drosophila and that this is more efficient for DNA breaks in an intron-containing gene, consistent with the higher damage-induced siRNA levels downstream of an intron. The spliceosome thus stimulates recruitment of RNA polymerase II to the break, rather than merely promoting the annealing of sense and antisense RNA to form the siRNA precursor. In contrast, RNA polymerase III nascent RNA libraries did not contain reads corresponding to the cleaved loci and selective inhibition of RNA polymerase III did not reduce the yield of damage-induced siRNAs. Finally, the damage-induced siRNA density was unchanged downstream of a T8 sequence, which terminates RNA polymerase III transcription. We thus found no evidence for a participation of RNA polymerase III in dilncRNA transcription in cultured Drosophila cells.


Introduction
The siRNA silencing system in Drosophila helps to fend off viral infections [1], but also contributes to the control of transposon mobilization in somatic cells [2]. In both cases, the trigger for siRNA generation is double-stranded RNA (dsRNA). During viral infection, this likely stems from replication intermediates, while for genome surveillance convergent transcription must occur. For multi-copy sequences, this convergent transcription can also be envisaged to occur in trans, i.e. at different instances of the same sequence. A particular form of dsRNA generation has been identified in Drosophila at transcribed DNA double-strand breaks [3]. The genetic requirements indicate an involvement of the spliceosome and this appears to be true for the surveillance of high-copy sequences as well [4]. Intriguingly, stalled spliceosomes can recruit RNA-dependent RNA polymerase (RdRP) to transposon mRNAs in the pathogenic yeast Cryptococcus neoformans [5]. For organisms that lack an RdRP gene, however, induction of convergent transcription must happen at the DNA.
DNA double-strand breaks (DSB) are highly toxic genome lesions that need to be faithfully repaired. A series of molecular interactions is initiated once a DSB has been detected and signalling events recruit repair factors, modify local chromatin structure and mitigate access between transcription and DNA repair proteins [6]. Many studies have concluded that a relatively large region around the DSB is transcriptionally silenced in a reversible manner, presumably to avoid conflicts between transcription and repair [7]. In recent years, however, antisense transcription that initiates at the DNA break has been observed [8][9][10]. In the context of DNA repair, this transcription seems to fine-tune the dose of single-strand binding proteins such as RPA that initially associate with the 3ʹ->5ʹ resected break [8]. Furthermore, damage-induced small RNAs derived from these antisense transcripts have been observed in Neurospora, Arabidopsis and human as well as Drosophila cell lines [3,11,12,13]. This has provided convenient sequencing-based evidence of DNA break-induced antisense transcription.
While there is thus little doubt that a non-coding transcript initiates at the break (referred to as damage-induced long non-coding RNA or dilncRNA), we still do not have a comprehensive understanding of its biogenesis, in particular regarding whether differences exist between transcribed (i.e. within transcriptionally active genes) and non-transcribed breaks. In vivo, DNA breaks occur in a chromatin-context and the mechanisms of dilncRNA generation may differ depending on the local chromatin state, which determines the accessibility for RNA polymerases. For example, plants have even devoted the function of two polymerase II related, multi-subunit polymerases, RNA polymerase IV and IVb/V, to pervasive genome surveillance [14][15][16][17]. Their non-coding transcripts can activate a number of cellular responses to cope with transposon invasion, viral infection and also DNA breaks [13].
RNA polymerase I transcribes the rDNA and is largely confined to the nucleolus [18], whereas RNA polymerase III generates a series of non-coding transcripts. This polymerase also functions in certain cases to detect aberrant DNA: It transcribes AT-rich linear DNA that may be cytoplasmic [19,20] or nuclear in the case of Herpesviruses [21][22][23]. The resulting pol-III transcripts then activate the cellular interferon response via RIG-I, an RNA helicase recognizing 5ʹtriphosphate-containing RNA in a double-stranded configuration [24]. Furthermore, RNA polymerase III can transcribe transposon-derived Alu elements and can direct new integration sites for the Ty1 transposon in budding yeast [25]. The transcriptional landscape of both, RNA pol-II and pol-III is thus complex and dynamic.
The notion that RNA polymerase II can initiate at a DNA break to generate a dilncRNA is supported by studies using RNA polymerase II specific inhibitors [26], chromatin immunoprecipitation [27], detection of pre-initiation complexes at DNA breaks [28], single-molecule studies [10] and by the detection of dilncRNAs associated with RNA polymerase phosphorylated at tyrosine-1 within the CTD repeats in a metagene-analysis [29]. Recruitment of RNA polymerase II to the DNA end can involve the Mre11-Rad50-Nbs1 complex (MRN-complex) and transcription initiation at DNA breaks has indeed been reconstituted in vitro with linear DNA, purified RNA polymerase II and the MRN-complex [30]. The RNA polymerase II model has been challenged, however, by observations that claim selective recruitment of RNA polymerase III (also with the help of the MRN complex) to double-strand breaks in cultured human cells [31].
In Drosophila, the dilncRNA originating from a DNA break is converted into damage-induced siRNAs if the break occurs in actively transcribed genes [3]. The convergent transcripts form dsRNA, which is processed by the canonical RNAi machinery into Ago2-loaded siRNAs capable of silencing cognate transcripts [3,32]. While their contribution to repair seems limited [32,33], the siRNAs are much more stable than the original dilncRNA and thus can serve as a convenient proxy of dilncRNA transcription [3,34]. Results from a genome-wide screen in Drosophila cells suggest that spliceosomes assembled on the normal transcript can stimulate the generation of corresponding damage-induced siRNAs. This was corroborated by the observation that DNA breaks upstream of a gene's first intron or anywhere within intronless genes produce few siRNAs upon damage [4].
An important question is thus whether the spliceosome acts upstream or downstream of the dilncRNA induction. In a downstream involvement, the spliceosome would serve as an RNA chaperone and promote the annealing of the coding (sense) and non-coding (antisense) transcripts, thereby boosting siRNA generation. An upstream action implies that the spliceosome stimulates the generation of dilncRNAs, i.e. recruitment of the polymerase to the break, and thereby increases the amount of dsRNA generated. We could now distinguish the two mechanisms by examining nascent transcription at DNA breaks and observed that a DSB downstream of introns leads to higher levels of antisense transcription, arguing that the spliceosome stimulates dilncRNA production. Furthermore, we propose that in Drosophila cells it is RNA polymerase II that transcribes the dilncRNA.

Results and discussion
The aim of our study was to measure the rate of antisense transcription at a transcribed DNA break for an intron-containing and an intronless gene. Furthermore, we wanted to determine which RNA polymerase is recruited for this purpose in Drosophila. Incorporation of labelled nucleotide analogues such as 4SU (4SU-Seq) or biotinylated NTPs (PRO-seq) allows nascent transcriptomes to be measured with high sensitivity but cannot distinguish between RNA polymerases. While specific inhibitor treatments are available, they have the caveat that inhibition of RNA polymerase II will also abrogate transcription of the normal mRNA transcripts, which recruit the spliceosome and may thus participate in induction of antisense transcription at intron-containing genes. Yet, this is precisely what we wanted to test.
We therefore established a nascent RNA sequencing strategy based on polymerase-specific immunoprecipitation (nascent elongating transcript sequencing or NET-seq [35,36]). In short, we lysed cultured Drosophila S2-cells harbouring epitope-tags on RNA polymerase II or III (introduced via genome editing) and washed out cytoplasmic and soluble nuclear components. Then, a brief digestion with benzonase liberated chromatin-associated material ('input' in our figures), from which we could subsequently immunopurify tagged polymerases ('IP' in our figures). The short RNA stump protected by the polymerase during the benzonase treatment (23-26 nt) can directly enter our established small RNA sequencing library pipeline because benzonase products carry a 5ʹ-monophosphate (see Fig. S1 for an outline of our cell fractionation and NET-seq procedure). To verify our protocol, we sequenced both the input material for the IP (roughly speaking chromatin-associated RNA) and the polymerase-associated transcripts after immunoprecipitation.

Validation of the NET-seq procedure
We first examined the highly transcribed, protein-coding actin gene Act5C. The profile of matching reads from the input material is dominated by the exonic portions of the gene, consistent with the notion that splicing can occur cotranscriptionally before release from the chromatin. Nonetheless, a certain level of intronic reads is already visible and demonstrates that the material also contains nascent transcripts. The nascent, RNA polymerase II associated reads sequenced after specific immunoprecipitation (IP) show a much higher proportion of these intronic reads (Fig. 1(a), top panel and Fig. S2B). In comparison, the RNA polymerase III IP only showed non-specific background (distribution essentially unchanged; Fig. 1(a), middle panel). Many genes show RNA polymerase II pausing shortly after transcript initiation. In Drosophila, this phenomenon was first comprehensively described in ChIP-Seq and PRO-seq experiments [37,38]. Accordingly, promoter-proximal pausing is evident in the PRO-seq trace for Act5C as well as in our nuclear RNA sample (input) and particularly in the RNA polymerase II associated, nascent transcripts. When comparing our NET-seq results for this highly abundant mRNA with published results of a nascent RNA labelling approach (PRO-seq), it appears that our libraries still contain a moderate overrepresentation of exonic reads [39], presumably reflecting a higher background level in our NET-seq approach. Interestingly, a distinct NET-seq implementation for the analysis of early embryonic transcription, published while our manuscript was in revision, specifically selected for longer polymerase-associated transcripts (>60 nt). This allowed the detection of splicing intermediates, such as exon 3ʹ-ends and intron lariats, but prevented the analysis of promoter-proximal stalling. It thus seems that the two approaches can capture complementary information [40].
Figure 1. Characterization of the NET-seq approach in Drosophila S2-cells. a) NET-seq reads for RNA polymerase II (top) and RNA polymerase III (middle) were mapped to the protein-coding gene actin5C; 'input' refers to a chromatin-associated RNA fraction isolated prior to the polymerase-specific immunoprecipitation (IP). Reads from a published PRO-seq experiment are shown at the bottom. b) NET-seq reads for RNA polymerase II (top) and RNA polymerase III (middle) were mapped to the non-coding 7SK RNA gene, a known RNA polymerase III target; reads from a published PRO-seq experiment are shown at the bottom. c) NET-seq reads for RNA polymerase II (top) and RNA polymerase III (middle) were mapped to the bantam locus, an RNA polymerase II transcribed non-coding RNA; the mature bantam miRNA accumulates to high levels in the cytoplasm and is also an abundant contamination in our nuclear RNA preparations. Reads from a published PRO-seq experiment are shown at the bottom.
For a global perspective, we also mapped reads onto precompiled transcript classes (Flybase genome release 6.19) and determined the recovery (ratio of IP versus input after normalization to total genome matching reads in each library) for RNA polymerase II and III. The CDS collection corresponds to the protein coding part of the transcriptome (start to stop) and the recovery was clearly greater in the pol-II IP than in the pol-III IP (Fig. S2 A, pol-II IP n = 6, pol-III IP n = 4). The intronic part of the transcriptome also showed a preferential recovery with pol-II, but a certain number of introns also trended towards a high recovery in both the pol-II and the pol-III IP (Fig. S2 B). Manual inspection of an arbitrary subset usually indicated the presence of non-coding RNAs such as snRNAs or snoRNAs in these introns.
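The recovery described above (IP versus input after normalization to total genome-matching reads) can be illustrated with a minimal Python sketch. The function names, the pseudocount and the example counts are assumptions made for illustration; they are not taken from the analysis scripts used in this study.

```python
def reads_per_million(count, total_genome_matching):
    """Normalize a read count to reads per million genome-matching reads."""
    if total_genome_matching <= 0:
        raise ValueError("library must contain genome-matching reads")
    return count * 1_000_000 / total_genome_matching


def recovery(ip_count, ip_total, input_count, input_total, pseudocount=1.0):
    """Ratio of normalized IP signal to normalized input signal for one transcript.

    The pseudocount is an assumption (not from the text); it avoids a
    division by zero for transcripts without any input reads.
    """
    ip_ppm = reads_per_million(ip_count + pseudocount, ip_total)
    input_ppm = reads_per_million(input_count + pseudocount, input_total)
    return ip_ppm / input_ppm


# Hypothetical counts, for illustration only
example = recovery(ip_count=199, ip_total=2_000_000,
                   input_count=99, input_total=4_000_000)
```

A value above 1 indicates that a transcript is enriched in the IP relative to the chromatin-associated input; comparing this value between the pol-II and pol-III IP gives the kind of per-transcript comparison described in the text.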
To verify successful IP for RNA polymerase III, we analysed the read distribution along the non-coding 7SK RNA locus (Fig. 1(b)). While RNA polymerase II associated nascent transcripts did not show a particular enrichment of signal along the locus (top panel), the corresponding reads were enriched after IP of RNA polymerase III (middle panel). Note that the 7SK RNA is recruited to chromatin while inhibiting pTEF-b from phosphorylating the RNA polymerase II CTD for release from promoter-proximal stalling. However, this does not appear to contribute substantially to the RNA polymerase II RNA reads when analysed by mapping to the 7SK gene. As expected, the PRO-seq procedure also captured transcription of the RNA polymerase III transcribed 7SK locus (bottom panel). When we mapped the reads onto the Flybase collection of tRNA sequences, we found a preferential recovery for at least a subset of the tRNAs in the RNA polymerase III IP (Fig. S2 C). This is also visible when we mapped the reads onto the Flybase collection of 'all transcripts', which despite its name only comprises the protein-coding transcripts and a subset of lncRNAs. Essentially all of these are transcribed by RNA polymerase II but the Ntl locus is a notable exception (Fig. S2 D). This transcript appears pol-III transcribed according to our analysis and overlaps with an intron-containing Tyr-GTA tRNA gene; direct visualization of the mapping traces revealed that the read-counts mapped to the Ntl locus almost exclusively localize to the tRNA portion (Fig. S2 E).
Our NET-seq libraries are contaminated by abundant cytoplasmic non-coding RNAs. This is illustrated with the help of the bantam locus (Fig. 1(c)). The 23 nt small RNA is one of the most abundant miRNAs in S2-cells and it is nucleolytically processed from a much larger primary transcript by Drosha and Dicer-1. The mature miRNA is cytoplasmic, yet our nuclear RNA fraction still contained a substantial amount of bantam reads (top and middle panel, input). While the IP procedure decreased this contamination, it did not remove the bantam reads completely (top and middle panel, IP). However, in the case of RNA polymerase II the nascent RNA reads indicate that larger precursor ncRNAs are transcribed (top panel, IP). This is consistent with the PRO-seq reads from the locus (bottom panel). The three example loci for Fig. 1 were chosen because the published PRO-seq reads can be represented at roughly comparable ppm-scales, hence their transcriptional output should be, as a first approximation, of comparable magnitude. Our own NET-seq data for Act5C and 7SK can indeed also be displayed with comparable scales, but the bantam locus required different scaling due to the cytoplasmic contamination. We also observed a substantial amount of mature ribosomal RNA reads in our libraries both before and after IP (23%-72% of total genome-matching reads, with no obvious enrichment of unprocessed precursor transcripts). For these RNAs, no interpretation of our sequencing data should be attempted. This also limits conclusions about highly abundant RNAs transcribed by RNA polymerase III such as 5S rRNA. For most other transcripts, we conclude that our nascent RNA sequencing data successfully captures polymerase-specific profiles. Since our question focuses on the induced antisense transcription at DNA breaks, an RNA species that is neither cytoplasmic nor highly abundant, we conclude that the NET-seq libraries are suitable for our analysis.

A DSB downstream of introns shows higher dilncRNA transcription activity
We generated sequencing libraries after employing our established cas9/CRISPR system to cleave in the intron-containing gene CG15098 and, separately, in the intronless gene Tctp [4]. As before, the DNA breaks had been induced by transfection of a corresponding sgRNA expression cassette into cells that stably express the Cas9 protein. The majority of the cells were harvested and processed for NET-seq libraries 2 or 3 days after transfection. The remaining cells were processed for a T7 endonuclease assay, demonstrating that the targeted loci were indeed cleaved with comparable efficiency (see also Fig. S1). In our experiments, libraries from the Tctp-cut provide the 'uncut' control for the CG15098 locus and vice-versa. This comparison ensures that any effects not specific to the cut locus or due to Cas9 activation per se will be accounted for.
We mapped the NET-seq libraries onto the respective loci and calculated the number of sense and antisense-matching reads. Fig. 2 shows traces for one NET-seq replicate mapped to CG15098 (left side) and Tctp (right side). For CG15098, IP of RNA polymerase II associated, nascent transcripts led to an enrichment of antisense reads relative to input (Fig. 2(a)). In contrast, the antisense reads did not increase for the cut Tctp locus, consistent with the low amounts of siRNAs generated upon cleavage of this locus [4]. There was no indication for a prominent signal in the RNA polymerase III NET-seq libraries of either locus ( Fig. 2(b)).
To obtain a quantitative view of the replicate data, we normalized the number of antisense reads to the total transcriptional activity of the locus in each library [i.e. antisense / (sense + antisense)] (Fig. 2(c)). There was a significant increase of antisense reads for cut vs. uncut CG15098 (p = 0.012, t-test unpaired, unequal variance, n = 3) while no significant differences were observed for the neighbouring CG15099 (p = 0.640, n = 3) or Act5C, which resides on a different chromosome (p = 0.644, n = 3). We also normalized the antisense reads to the total number of genome-matching reads in each library (Fig. S3). In each of the three replicate experiments, the amount of CG15098 antisense-matching nascent, RNA polymerase II associated reads was higher in the cut state than in the uncut state (p = 0.034, paired t-test, n = 3). This was not the case for the CG15099 gene (p = 0.273, n = 3) or the Act5C gene (p = 0.675, n = 3); there were too few Tctp antisense-matching reads for an analogous comparison. Finally, our input material also showed a consistently higher amount of antisense-matching reads for CG15098 in the cut state in each replicate (p = 0.072, paired t-test, n = 3). In agreement with the visual inspection (Fig. 2(b)), the read quantification did not provide any indication that RNA polymerase III is contributing to antisense transcription (Fig. S3, bottom row).
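As an illustration of the normalization antisense / (sense + antisense) and of an unpaired t-test with unequal variance (Welch's test), the following Python sketch uses only the standard library. All read counts are hypothetical, the function names are assumptions, and the sketch returns the test statistic and degrees of freedom rather than a p-value.

```python
from math import sqrt
from statistics import mean, variance


def antisense_fraction(sense, antisense):
    """antisense / (sense + antisense) for one locus in one library."""
    total = sense + antisense
    if total == 0:
        raise ValueError("no reads mapped to this locus")
    return antisense / total


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical (sense, antisense) read counts for three replicates each
cut = [antisense_fraction(s, a) for s, a in [(900, 100), (850, 150), (880, 120)]]
uncut = [antisense_fraction(s, a) for s, a in [(970, 30), (960, 40), (975, 25)]]
t_stat, dof = welch_t(cut, uncut)
```

Converting the statistic into a p-value requires the t-distribution; if SciPy is available, `scipy.stats.ttest_ind(cut, uncut, equal_var=False)` performs the same test and reports the p-value directly.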
We conclude that induction of a DNA double-strand break in the intron-containing CG15098 gene stimulates antisense transcription by RNA polymerase II. For the intronless Tctp gene, we detected none or only few antisense reads and statistical analysis is not appropriate. Our observations are thus consistent with the notion that a lower antisense transcription activity for the intronless gene (this study) correlates with fewer DNA-damage induced siRNAs [4]. It therefore appears that the role of the spliceosome is to stimulate dilncRNA transcription, rather than to promote annealing of the sense and antisense RNA strands. It remains nonetheless possible that the spliceosome plays additional roles downstream of antisense transcript initiation. Since the overwhelming majority of fly genes contains at least one intron, many spontaneously occurring DSBs can be affected by the spliceosome-dependent process(es).
Figure 2 (caption, partial). c) Quantitative analysis of the antisense reads relative to all reads mapped to the respective locus revealed a significant increase for CG15098 in the cleaved state (left, t-test unequal variance, n = 3). A cartoon shows the genes in the vicinity of CG15098, the closest neighbour in the same orientation is CG15099. Note that this gene is convergent with CG15083 and thus intrinsically has a higher proportion of antisense transcripts that map to the overlapping region.
Our findings also have important mechanistic implications since it could be the very same polymerase that synthesizes both sense and antisense transcript. In this most rudimentary form of 'recruitment', stalling of the splicing reaction could e.g. contribute to post-transcriptional modifications on RNA polymerase II that promote direct re-initiation upon a run-off at the break, essentially a 'U-turn' movement. However, it is currently unclear whether a run-off will occur at a DSB in vivo or whether the polymerase stalls when it encounters the break. As long as the transcript is not cleaved and removed, this creates an R-loop behind the polymerase with concomitant exposure of the non-template strand. This stretch of single-stranded DNA could also serve as a landing site for another RNA polymerase complex and transcription thus initiates in the antisense orientation [41]. In this case, the role of the stalled spliceosome could be to prevent transcript termination and release, thus extending the lifetime of the R-loop that may contribute to DNA damage signalling. Alternatively or in addition to this modulation of the DNA/RNA duplex structures, signalling events that include or emanate from spliceosome components [42] could foster polymerase recruitment to the nearby single-stranded DNA.
In the absence of DNA damage, the spliceosome protects the genome from R-loop mediated instability in transcribed regions [43] and introns may limit the physical and temporal extent of promoter-associated R-loops [44]. The interplay between spliceosome, R-loop and transcription may thus be modulated by the damage-specific context, for example via post-transcriptional or local epigenetic modifications.

No evidence for participation of RNA polymerase III in the biogenesis of damage-induced siRNAs
The recent description of RNA polymerase III recruitment to DNA breaks in human cell lines [31] clearly differs from our observation of a predominant, if not exclusive, role of RNA polymerase II in dilncRNA generation (Fig. 2). It is certainly conceivable that mechanistic differences exist between humans and flies (as is the case for the subsequent processing into siRNAs, see [33]), but we wanted to confirm our observation with independent approaches. We thus turned to our established dual-luciferase reporter system, which relies on the silencing activity of damage-induced siRNAs generated from a co-transfected, linearized plasmid (Fig. 3(a), right side). With this assay, we had previously screened and detected a role for the MRN-complex in promoting siRNA generation, presumably by preparing the DNA end for RNA polymerases that initiate transcription at the break [4]. The inhibitor Mirin can block the access of Mre11 to dsDNA ends and thus all nucleolytic activities, while its derivative PFM-01 selectively blocks DNA access to the endonuclease active site [45]. Addition of Mirin (25 µM final concentration) clearly reduced the amount of damage-induced siRNAs generated (p = 0.05, t-test, unequal variance, n = 3), while PFM-01 (25 µM) had essentially no effect (Fig. 3(a)). This supports the notion that the initial unwinding of the double-stranded DNA by Mre11 can contribute to dilncRNA generation, rather than endonucleolytic cleavage and resection that exposes single-stranded DNA with a 3ʹ-end [30].
Importantly, addition of the selective RNA polymerase III inhibitor ML-60218 at a concentration of 10 µM (the highest concentration that still produced acceptable levels of luciferase readings, see Fig. S4) did not lead to a de-repression of Renilla luciferase (Fig. 3(a)). This is consistent with our genome-wide RNAi screen where no RNA polymerase III subunit scored as a confirmed hit [4] and with undetectable dilncRNA transcription in our RNA pol-III NET-seq libraries.
Figure 3 (caption, partial). Inhibition of the MRN-complex with the inhibitor Mirin, but not PFM-01, reduced the amount of damage-induced siRNAs. Inhibition of RNA polymerase III with ML-60218, however, did not lead to any change of siRNA yield compared with the solvent control (DMSO). Three biological replicates of the assay were performed. b) A stretch of 8 adenosines in the second intron of CG15098 will lead to a corresponding sequence of 8 thymines in the dilncRNA transcript. This is preceded by a potential secondary structure element (shown on the right in 5ʹ->3ʹ direction of an antisense transcript) and should lead to termination of RNA polymerase III transcription. Hence, a lower density of damage-induced siRNAs should be observed beyond this point if RNA polymerase III transcribes the dilncRNA. This was, however, not the case (sequencing data previously published in [4]).
We had previously determined that the damage-induced siRNA response starts in close proximity to the break and extends all the way until the transcription start site [3,4,34]. The corresponding dilncRNA transcripts thus arise over a stretch of more than 1 kb (e.g. 4.5 kb in the case of CG18273, see supplementary Figures in [4]). This would be unusually long for an RNA polymerase III transcript and random pol-III termination sequences might occur along the way. Indeed, inspection of the CG15098 locus revealed a serendipitous stretch of eight adenosines in the second intron. For an RNA polymerase acting in antisense orientation, this corresponds to a T8 sequence preceded by a potential secondary structure element (see Fig. 3(b)), which should terminate most RNA polymerase III transcription complexes [46]. We confirmed that this sequence does indeed terminate pol-III transcription in our S2-cells with a plasmid-based assay (Fig. S5). However, the siRNA read density we had observed in our previous deep-sequencing data was similar before and after this pol-III termination site (Fig. 3(b)). We do note that there is a paucity of siRNA reads in a ~20 nt window surrounding the A8/T8 sequence; most likely this is for technical reasons given the short, homopolymeric sequence stretch (e.g. Illumina sequencing or PCR polymerase drop-off). Taken together, it is unlikely that RNA polymerase III functionally contributes to dilncRNA transcription in Drosophila. However, our observations cannot exclude that RNA polymerase III is recruited to sites of DNA damage without subsequently engaging in processive transcription of the dilncRNA.
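The reasoning behind the T8 argument can be sketched in Python: a run of eight adenosines on the sense strand appears as eight thymines when read in antisense orientation, and the read density before and after that site can then be compared. The sequence, the coordinates and all function names below are hypothetical and serve only to illustrate the logic; they do not reproduce the CG15098 sequence or the analysis applied to the published data.

```python
import re

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def homopolymer_runs(seq, base="T", min_len=8):
    """Return (start, end) of runs of `base` with at least `min_len` bases.

    Coordinates are 0-based with an exclusive end.
    """
    pattern = f"{base}{{{min_len},}}"
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq.upper())]


def density_ratio(read_starts, site, window):
    """Read count in `window` nt after the site divided by the count before it.

    Both windows have the same length, so this equals the ratio of densities.
    """
    before = sum(site - window <= p < site for p in read_starts)
    after = sum(site <= p < site + window for p in read_starts)
    if before == 0:
        raise ValueError("no reads upstream of the site")
    return after / before


# Hypothetical sense-strand fragment containing eight adenosines
sense = "GCTAGCAAAAAAAAGGATCC"
antisense = reverse_complement(sense)
runs = homopolymer_runs(antisense)
```

If a polymerase terminated efficiently at the site, the ratio returned by `density_ratio` for reads in antisense orientation would be expected to fall well below 1; a ratio close to 1 corresponds to the unchanged siRNA density described above.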
Previously, several publications have provided independent evidence of RNA polymerase II as an enzyme capable of transcribing the dilncRNA. This includes biochemical reconstitutions [30], in vitro analysis with inhibitors [26], ChIP with qPCR [26], detection of pre-initiation complexes [28] and metagene analysis after ChIP-Seq [29]. A single-molecule study is also suggestive of RNA polymerase II according to the reported speed [10], but the MS2 stem-loop employed as a reporter can in principle also be transcribed by RNA polymerase III [47]. While not all of the published experiments can exclude a concomitant function of more than one RNA polymerase, i.e. RNA polymerase II (or IV in plants) and RNA polymerase III, in dilncRNA generation, the recent description of RNA polymerase III as the exclusive source of dilncRNA in cultured human cell lines is surprising [31]. We now provide a direct observation of polymerase-associated, nascent dilncRNA transcripts only in RNA polymerase II NET-seq (summarized as a model in Fig. 4). Certainly, differences between organisms may exist: If the primary purpose is to generate a transcript, then the polymerase type could easily be swapped during the course of evolution. In plants, for example, genetic analysis has pinpointed a function of the plant-specific RNA polymerase IV in dilncRNA transcription [13]. The situation is further complicated by the discovery that repair of transcribed genes by homologous recombination is also fostered upon the establishment of mixed DNA/RNA displacement loops involving the normal transcript that runs sense towards the break [48]. A parallel comparison of the diverse experimental systems might help to distinguish between technical and true biological differences; the latter will prove invaluable to further our understanding of the molecular mechanisms that lead to dilncRNA transcription.
Figure 4 (caption, partial). Our new results demonstrate that the role of the spliceosome is to recruit RNA polymerase II for antisense transcription (red arrow). This is a key progress because we can now rule out a mere RNA-chaperone-like activity of the spliceosome. Rather, local transcription and splicing are important mediators of dilncRNA biogenesis. As previously described for mammalian cells, transcript initiation at the break is aided by the Mre11-Rad50-Nbs1 complex in Drosophila as well. Future experiments can address whether the spliceosome-mediated recruitment requires MRN, or whether the two pathways are independent possibilities for recruiting RNA polymerase II to the DNA break. While our NET-seq approach, inhibitor treatments and sequence analysis of the CG15098 model locus did not provide any evidence for RNA polymerase III mediated dilncRNA transcription, we cannot rule out the possibility that RNA polymerase III is recruited to the break without engaging in processive transcription.

NET-seq procedure
Cell culture
Drosophila S2-cells with stable expression of cas9 protein (clone 5-3) were cultured and transfected as previously described [49]. We further modified this cell line by introducing a twin V5-tag at the C-terminus of the largest subunit of RNA polymerase II (PolR2A, CG1554) and III (PolR3A, CG17209), followed by clonal selection as described [50]. For the NET-seq experiments, we transfected a 30 ml culture of cells expressing tagged RNA polymerase with guideRNA vectors targeting CG15098 or Tctp. The sgRNA expression cassettes were first generated by PCR, then blunt-end cloned into pJet1.2 to yield pRB59 (CG15098) and pRB60 (Tctp). The target sites were 5ʹ-TCCAGTGTAGCTTCCCGTT-3ʹ for CG15098 and 5ʹ-ATATCTAATTTCTTTTTAC-3ʹ for Tctp as described [4].

Cell lysis
48 or 36 hours after transfection, the cells were harvested (density 4-5 × 10^6 cells/ml), resuspended in 500 µl of lysis buffer (10 mM HEPES/KOH pH 7.5, 1.5 mM MgCl2, 1 mM DTT, 10 mM EDTA, 10% glycerol and 1% Tergitol-type NP40 (Sigma NP40S) supplemented with proteinase inhibitors (Roche complete without EDTA)) and incubated for 10 minutes on ice. Then nuclei were pelleted by centrifugation at 5000 × g for 5 minutes and the supernatant (mostly cytosol) was discarded. The pellet was resuspended in lysis buffer without EDTA but containing 1 M urea, incubated for 5 minutes on ice and again pelleted at 5000 × g for 5 minutes. The urea washing step was carried out twice in total, then the nuclei were resuspended in 110 µl of lysis buffer without EDTA and without urea. To digest the chromatin, 250 U of benzonase (Merck Millipore E1014, 90% purity grade) were added and the resuspended nuclei were incubated at 37°C for 3 minutes in a heating block. The digestion was stopped by adding EDTA and NaCl to a concentration of 10 mM and 500 mM, respectively. The insoluble fraction was pelleted by centrifugation at 16000 × g for 5 minutes and the supernatant was used as input material for the immunoprecipitation.

Immunoprecipitation
20 µl of magnetic beads (Dynabeads protein G, Invitrogen 10004D) were washed 3 times with 200 µl of IP buffer (25 mM HEPES/KOH pH 7.5, 150 mM NaCl, 12.5 mM MgCl2, 1 mM DTT, 1% Tergitol-type NP40, 0.1% Empigen (Sigma 30,326) supplemented with Roche complete proteinase inhibitors without EDTA), then 1 µl of V5 antibody was coupled by rotation at 4°C overnight. On the following day, the beads were washed 3x with 300 µl of IP buffer, then the input material was added and incubated with agitation for 60 minutes at 4°C. After separation of the unbound supernatant, the beads were washed 5x with 200 µl of IP buffer.
The immunopurified RNA polymerase complexes were then digested with proteinase K to liberate the associated nucleic acids and RNA was prepared by TRIZOL extraction and precipitation.

Library generation and data analysis
RNA fragments with a size of 20-28 nt were PAGE-purified to select for the fragments that were protected from benzonase digestion by the polymerase. Since benzonase products harbour 5ʹ-phosphorylated ends, the RNA fragments were processed for library generation as described [51] without further treatment. The libraries were sequenced in-house on an Illumina HiSeq1500 instrument and the reads were processed with custom PERL and BASH scripts for mapping with Bowtie [52] to the indicated references. During mapping, no mismatches were tolerated and each hit was reported only once. If multiple perfectly matching sequences exist in the reference, the Bowtie algorithm will assign the read randomly. After mapping, the results were further processed with BEDtools [53] and custom R scripts or the IGV genome browser [54] for data visualization.
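The downstream counting logic can be sketched in a few lines of Python. This is not the PERL, BASH or R code used for the study; it is a minimal illustration that assumes reads are available as plain sequence strings and that each mapped read is represented by its strand symbol ('+' or '-', for example taken from the strand column of a BED file). The 20-28 nt range mirrors the gel-purified size window and is applied here in silico purely as an example.

```python
def size_select(reads, min_len=20, max_len=28):
    """Keep reads whose length lies within the selected size window."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def count_strands(alignment_strands, gene_strand):
    """Count reads in sense and antisense orientation relative to a gene.

    `alignment_strands` holds one '+' or '-' per mapped read;
    `gene_strand` is the annotated orientation of the gene.
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    counts = {"sense": 0, "antisense": 0}
    for strand in alignment_strands:
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand symbol: {strand!r}")
        key = "sense" if strand == gene_strand else "antisense"
        counts[key] += 1
    return counts


# Hypothetical example: three reads mapped to a gene on the minus strand
example_counts = count_strands(["+", "-", "-"], gene_strand="-")
```

The resulting sense and antisense counts are the inputs for a ratio such as antisense / (sense + antisense) per locus and library.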

Luciferase assay
The luciferase assay for the detection of DNA-break induced siRNAs has been previously described [4]. Briefly, 25 ng of pRB2 (firefly-luciferase, circular), 10 ng of pRB1 (Renilla luciferase, circular) and 40 ng of pRB4 (truncated Renilla luciferase, linearized with EcoRI) were transfected per well of a 96-well plate using Fugene-HD (Promega). Inhibitors were added 2 hr prior to transfection in a volume of 1 µl DMSO (volume identical for all compounds and controls). The luciferase assay was performed 96 hrs after transfection using the Dual-Glo Luciferase assay system (Promega E2920) in a Tecan M-1000 plate reader. Data analysis was carried out using Microsoft Excel.
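The data analysis itself was carried out in Microsoft Excel and its details are not given here. As a hedged illustration of how dual-luciferase readings of this kind are commonly evaluated, the Python sketch below computes the Renilla/firefly ratio per well and expresses treated wells relative to a solvent control. The normalization scheme, the function names and all readings are assumptions for illustration only.

```python
from statistics import mean


def renilla_firefly_ratio(renilla, firefly):
    """Renilla signal normalized to the firefly transfection control."""
    if firefly <= 0:
        raise ValueError("firefly reading must be positive")
    return renilla / firefly


def relative_to_control(treated_wells, control_wells):
    """Mean Renilla/firefly ratio of treated wells relative to control wells.

    Each well is a (renilla, firefly) tuple of raw luminescence readings.
    """
    treated = mean(renilla_firefly_ratio(r, f) for r, f in treated_wells)
    control = mean(renilla_firefly_ratio(r, f) for r, f in control_wells)
    return treated / control


# Hypothetical luminescence readings (arbitrary units)
fold_change = relative_to_control(
    treated_wells=[(200, 100), (400, 100)],
    control_wells=[(100, 100), (200, 100)],
)
```

Under this scheme, a value above 1 would indicate de-repression of Renilla luciferase, i.e. reduced silencing by damage-induced siRNAs, whereas a value near 1 indicates no change compared with the solvent control.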