Determination by stylometry of the probable author of the Ferrante corpus: Domenico Starnone

Books signed under the pseudonym Elena Ferrante are discussed in the media (New York Times, Corriere della Sera, Il Sole 24 Ore, L'Express, and again the New York Times).

Nobody has been able to meet the author of these books. The recent success of the English translations has increased interest in this corpus. Several names have been suggested as the possible writer of these books. Analyses, mainly based on linguistic statistics, have been conducted to determine the author of the Ferrante corpus.

This OrphAnalytics press release presents the stylometry of the Ferrante corpus, obtained by comparing the use of elementary units of text. The observations put the results of the linguistic analyses into perspective. These stylistic statistics confirm the assumptions supported by the linguistic analyses regarding the identity of the author of the Ferrante corpus.

The journalist Paolo Di Stefano of the Corriere della Sera summed up several suggestions for possible authors publishing under the pseudonym Elena Ferrante. The candidate who is repeatedly proposed is Domenico Starnone. He was first proposed by Luigi Galella.

This hypothesis was tested with a method that uses data compression algorithms to measure the amount of complex information in a text. The approach was developed by researchers from the University La Sapienza of Rome: the physics professor Vittorio Loreto, together with Andrea Baronchelli. Using software developed with colleagues in mathematics, they were able to measure the reliability of Galella's assumption: according to them, Domenico Starnone is the author of the Ferrante corpus.
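To give an idea of how compression can be used to compare texts, here is a minimal sketch of one common technique, the normalized compression distance. It is an assumption chosen for illustration and is not claimed to be the method or the software of the Rome researchers; the function names and the use of zlib are hypothetical choices.

```python
import zlib

def compressed_size(text):
    """Number of bytes of `text` after zlib compression at the highest level."""
    return len(zlib.compress(text.encode("utf-8"), 9))

def ncd(x, y):
    """Normalized compression distance between two texts.

    Small values mean that knowing one text helps to compress the other,
    i.e. that the two texts share many patterns.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

With real books, ncd would be computed between an anonymous text and texts of each candidate author, and the candidate with the smallest distance would be considered the closest.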

The results of a linguistic research group also indicate that Starnone could carry the pen name Elena Ferrante. 

Our stylometric analysis makes it possible to test this proposal, which Starnone has publicly denied. Since the only available texts by Anita Raja are translations, and we therefore have no personal texts from her, we cannot decide whether or not she is the author of the Ferrante corpus. We will however do so for another writer already suggested as a possible author of this corpus: Erri De Luca.

Book selection for the stylometric analysis

To characterize the style of the books signed by Elena Ferrante, eight books available as e-books were chosen: the first three novels (L'amore molesto, E/O, 1992; I giorni dell'abbandono, E/O, 2002; La figlia oscura, E/O, 2006) and the four volumes of the tetralogy (L’amica geniale, Roma, E/O, 2011; Storia del nuovo cognome, volume secondo, E/O, 2012; Storia di chi fugge e di chi resta, volume terzo, E/O, 2013; Storia della bambina perduta, volume quarto, E/O, 2014). One essay was also selected: La frantumaglia, Roma, E/O, 2003. A children's tale was left out: La spiaggia di notte, E/O, 2007. These books were compared with texts by other authors.

Domenico Starnone regularly appears as one of the possible authors of the Ferrante corpus. His five most recent books, all available as e-books, were selected: Spavento, Einaudi, 2009; Fare scene. Una storia di cinema, Minimum Fax, 2010; Autobiografia erotica di Aristide Gambia, Einaudi, 2011; Condom, Einaudi, 2013; Lacci, Einaudi, 2014. Apart from his second novel, the other novels (13/19) are not available as e-books.

Domenico Starnone also wrote fifteen essays, seven between 1981 and 1994 and eight between 2002 and 2006, and three tales between 2008 and 2013. His second essay period covers the publication in 2003 of La frantumaglia, the essay in which Ferrante explains her approach. The children's story La spiaggia di notte was published in 2007, one year before the first of Starnone's tales. It therefore appears that the publications of the essay and of the tale signed Ferrante correspond to the periods in which Starnone published similar works.

The wife of Domenico Starnone, Anita Raja, is regularly cited as the author who writes under the pseudonym Ferrante. She worked as a translator for the publishing house that publishes the works of Ferrante: until 2011, she translated the writer Christa Wolf. She has never published a novel under her own name.

Seven books by Erri De Luca available as e-books were added for comparison with the books of Ferrante and Starnone: Il contrario di uno, Feltrinelli, 2003; Sulla traccia di Nives, Feltrinelli, 2005; Il cielo in una stalla, Feltrinelli, 2008; Il più e il meno, Feltrinelli, 2015; La parola contraria, Feltrinelli, 2015; La faccia delle nuvole, Feltrinelli, 2016; La Natura esposta, Feltrinelli, 2016.

The works of other authors suggested as writers of the corpus will be compared later on.

The algorithmic approach to textual analysis

Derived from research on genomic sequences, an algorithmic approach to text analysis has been developed by OrphAnalytics. It measures the use of elementary text patterns (characters or syllables, tempo or rhythm) within and between words, and between sentences. A catalog of pattern use is established for each piece of text; comparing these catalogs then measures how well the style is conserved along a document.
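The idea of a catalog of pattern use per piece of text can be illustrated with a minimal sketch. It does not describe the PATOA software, whose internals are not given here: it assumes that the patterns are overlapping character trigrams and that two catalogs are compared by Euclidean distance. The names ngram_catalog and catalog_distance are hypothetical.

```python
import math
from collections import Counter

def ngram_catalog(text, n=3):
    """Relative frequency of each overlapping character n-gram in `text`."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}

def catalog_distance(cat_a, cat_b):
    """Euclidean distance between two catalogs over the union of their patterns.

    A pattern missing from one catalog counts as frequency 0 there.
    """
    grams = set(cat_a) | set(cat_b)
    return math.sqrt(sum((cat_a.get(g, 0.0) - cat_b.get(g, 0.0)) ** 2 for g in grams))
```

Two pieces of text that use the same patterns in the same proportions get a distance of zero; the distance grows as their pattern use diverges.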

Since this approach does not require any knowledge of a language, it has been used successfully to compare books written in different languages, e.g. the Swedish crime novels of the Millennium series. The PATOA software (Program of Textual Analysis by OrphAnalytics) distinguishes the dot cloud of the first three Millennium volumes, written by Stieg Larsson and published after his death, from the dot cloud formed by the fourth volume, written by David Lagercrantz more than ten years after the death of Stieg Larsson, and by two other novels of Lagercrantz that are completely independent of the series. See the Millennium Figure published in the Tages-Anzeiger article.

The Millennium analysis shows that the signal measured by the software is independent of the topics of a novel and of the associated vocabulary: the software captures the author's personal style through the features that mark an idiolect, i.e. the specific way in which an individual speaks or writes, such as turns of phrase.

Analyses

Our approach illustrates the similarity of writing styles by the proximity of points in graphic representations of our stylometric analyses. In Figure 1, the Ferrante corpus shows two different styles: a cloud of red symbols for the first three books, and a cloud of blue symbols for the tetralogy of L’amica geniale.


Figure 1: Stylometric analysis of seven books of the Ferrante corpus in Italian: the two dimensions of maximum variance in the multivariate analysis compare the uses of character patterns. Each symbol represents the whole or a fragment of a text of approximately 100,000 characters. The following books signed Elena Ferrante are analyzed: the first three novels in red, i.e. L'amore molesto, triangles, I giorni dell'abbandono, diamonds, La figlia oscura, circles, and the four volumes of the tetralogy in blue, i.e. L'amica geniale, triangles, Storia del nuovo cognome, diamonds, Storia di chi fugge e di chi resta, circles, Storia della bambina perduta, squares. The closer the points, the more similar the style of their texts.
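As an illustration of how a figure of this kind could be produced, the following sketch cuts texts into fragments of about 100,000 characters, describes each fragment by its character n-gram frequencies, and projects the fragments onto the two directions of maximum variance. The choice of principal component analysis (computed here by power iteration) as the multivariate method, the n-gram length and all function names are assumptions made for the example, not a description of PATOA.

```python
import math
import random
from collections import Counter

def fragments(text, size=100_000):
    """Cut a text into consecutive fragments of about `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def profiles(pieces, n=3):
    """One row of character n-gram frequencies per piece, over a shared vocabulary."""
    catalogs = [Counter(p[i:i + n] for i in range(len(p) - n + 1)) for p in pieces]
    vocabulary = sorted(set().union(*catalogs))
    rows = []
    for catalog in catalogs:
        total = sum(catalog.values()) or 1  # avoid dividing by zero on tiny pieces
        rows.append([catalog[gram] / total for gram in vocabulary])
    return rows

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_axis(rows, iterations=200, seed=0):
    """Direction of maximum variance of centered rows, found by power iteration."""
    rng = random.Random(seed)
    axis = [rng.uniform(-1.0, 1.0) for _ in rows[0]]
    for _ in range(iterations):
        scores = [dot(row, axis) for row in rows]
        candidate = [sum(s * row[j] for s, row in zip(scores, rows))
                     for j in range(len(axis))]
        norm = math.sqrt(dot(candidate, candidate))
        if norm == 0.0:  # no variance left in the data
            return [0.0] * len(axis)
        axis = [c / norm for c in candidate]
    return axis

def project_2d(rows):
    """Coordinates of each row on the two directions of maximum variance."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    axis1 = leading_axis(centered)
    first = [dot(row, axis1) for row in centered]
    # Remove the first direction from the data, then look for the next one.
    deflated = [[x - s * a for x, a in zip(row, axis1)]
                for row, s in zip(centered, first)]
    axis2 = leading_axis(deflated)
    second = [dot(row, axis2) for row in deflated]
    return list(zip(first, second))
```

Calling project_2d(profiles(pieces)) on the fragments of several books gives one point per fragment; plotting these points with one color per group of books yields a dot cloud comparable in spirit to the figure.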

This distinction between the two styles is maintained in the analysis of texts translated into English (Figure 2).


Figure 2: Stylometric analysis of the seven books of the Ferrante corpus in English: the two dimensions of maximum variance in the multivariate analysis compare the uses of character patterns. Each symbol represents the whole or a fragment of a text of approximately 100,000 characters. The following translated books signed Elena Ferrante are analyzed: the first three novels in red, i.e. Troubling Love, 2006, triangles, The Days of Abandonment, 2005, diamonds, The Lost Daughter, 2008, circles, and the four volumes of the tetralogy in blue, i.e. My Brilliant Friend, 2012, triangles, The Story of a New Name, 2013, diamonds, Those Who Leave and Those Who Stay, 2014, circles, The Story of the Lost Child, 2015, squares.

This result can be explained in two ways: either each style is representative of a different author, or a single author is able to write in two different styles and to stick to each of them, a sign of professional maturity. To choose between these two hypotheses, the analyses must include the texts of one or more other authors, e.g. Starnone and De Luca. The analysis presented in Figure 3 adds to the texts of Figure 1 the essay La frantumaglia signed by Ferrante (in pink), five volumes of Starnone (in orange) and seven volumes of De Luca (in grey). The distance between the two styles of Ferrante is smaller than the distances among the works of Starnone or among the texts of De Luca, which is a clear indication that the same author signs both styles of the Ferrante novels.

Figure 3: Stylometric analysis of the e-books in Italian of Ferrante, Starnone and De Luca: the two dimensions of maximum variance in the multivariate analysis compare the uses of character patterns. Each symbol represents the whole or a fragment of a text of approximately 100,000 characters. To the books of Figure 1 are added the essay signed Elena Ferrante, La frantumaglia, pink pentagon, the works of Domenico Starnone in orange, Condom, triangles, Spavento, diamonds, Fare scene. Una storia di cinema, circles, Autobiografia erotica di Aristide Gambia, squares, Lacci, inverted triangles, and the books of Erri De Luca in grey, Il contrario di uno, triangles, Sulla traccia di Nives, diamonds, Il cielo in una stalla, circles, Il più e il meno, squares, La parola contraria, inverted triangles, La faccia delle nuvole, pentagons, La Natura esposta, hexagons.
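The reasoning applied to Figure 3, comparing the gap between two clouds of points with the spread inside the cloud of a single known author, can be sketched as follows. This is an illustration under stated assumptions, not the OrphAnalytics procedure: clouds are lists of 2D points, the gap is measured between centroids, the spread is the largest distance between two points of a cloud, and every function name is hypothetical.

```python
import math
from itertools import combinations

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(cloud):
    """Mean point of a cloud."""
    return tuple(sum(coords) / len(cloud) for coords in zip(*cloud))

def between_clouds(cloud_a, cloud_b):
    """Distance between the centroids of two clouds."""
    return distance(centroid(cloud_a), centroid(cloud_b))

def within_cloud(cloud):
    """Largest distance between two points of one cloud (needs at least 2 points)."""
    return max(distance(p, q) for p, q in combinations(cloud, 2))

def compatible_with_one_author(style_a, style_b, reference_clouds):
    """True if the gap between the two styles is smaller than the spread
    observed inside each reference author's own cloud."""
    gap = between_clouds(style_a, style_b)
    return all(gap < within_cloud(cloud) for cloud in reference_clouds)
```

Here style_a and style_b would hold the points of the two Ferrante styles, and reference_clouds the points of authors whose identity is known.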

Figure 3 shows that De Luca’s style is distant from that of Ferrante. It is therefore unlikely that De Luca writes under the pseudonym Elena Ferrante. Starnone's style, however, is very close to that of Ferrante: each style of Ferrante lies next to the similar style of a book by Starnone. At the edges of the orange cloud of Starnone we find the red and blue clouds of the two styles of Ferrante's novels, and the pink cloud of Ferrante's essay, which talks about her life and her profession as a writer.

In conclusion, Starnone's style is stylometrically very close to that of Ferrante, which makes it possible to consider seriously that Domenico Starnone could be the author writing under the pseudonym Elena Ferrante. This result confirms those obtained by text compression comparisons and by linguistic analyses. To verify our observation, an analysis of the entire corpus of Starnone must be performed, which requires the digitization of his whole work.

The results could also be confirmed by comparing the books of other writers suggested as authors of the Ferrante corpus. We are ready to test the hypotheses proposed by the public. We also welcome feedback from readers of the corpus on our observation of two styles in Ferrante’s novels (red and blue clouds in Figures 1-3).

This analysis shows Italian speakers what the OrphAnalytics approach can do in literary research. The approach has already been used successfully to produce an expert report authenticating contracts, at the request of an international arbitration court. Current applications include the authentication of academic documents written in different languages, to ensure the absence of ghostwriting. Finally, OrphAnalytics launches textually transparent, the first project for delivering stylometric analyses to the public. By publishing the stylometry of their work, writers publicly demonstrate their commitment to integrity: autonomous writing without the help of a third party.

For OrphAnalytics SA, on October 11, 2016, 

Guy Genilloud                                                      Claude-Alain Roten
Project manager                                                  CEO


In the press