Technology is revolutionizing philology. “I Have Done What I Could, Fortune What He Wanted” is a comedy preserved in the National Library of Spain, printed in Seville between 1632 and 1634 and attributed to a certain Miguel Bermúdez. However, thanks to the ETSO project (Estilometría aplicada al Teatro del Siglo de Oro, or Stylometry Applied to Golden Age Theater), its authorship has just been attributed to Lope de Vega. Digital Humanities is the name given to the discipline that tries to use technology to address the classic problems posed by the humanities. The digitization of catalogs, programs that perform semantic searches or trace relationships between words, photographs that reveal blots and cross-outs, and the chemical analysis of inks open a new and broad way to rewrite (or to annotate and add to) the history of literature and cultural heritage.
Germán Vega, a professor at the University of Valladolid, has been studying Lope de Vega, along with the social and cultural context of his time, for more than forty-four years. He says that four hundred years ago Spanish society was hooked on the corrales de comedias, the courtyard playhouses of the period: it was “like a collective passion; the theater was the main leisure activity. There was an overproduction of plays because there was overconsumption. It was the great entertainment from the end of the 16th century and throughout the 17th and 18th centuries.”
The theater became the most widely read literary genre. That is why great authors such as Calderón, Tirso or Lope de Vega churned out plays: “We have 500 works attributed to Lope; Shakespeare has about 30. In the New Art of Making Comedies, which Lope published in 1609, he explains the keys to success: putting the themes of love and honor on stage; including the archetypes of the gallant and the lady, or the comic character, the one in charge of making the jokes, who is almost always the gallant’s servant; and the most applauded ending, the arranging of marriages.” Germán Vega asserts that Lope de Vega is “unmatched in handling action and plot, and terrific with poetry, because you have to remember that the theater was in verse.”
Each author has a style: some words that he uses more than others, certain rhythms and intensities, certain themes or feelings with which his characters play in a characteristic way. It should be noted that in the Golden Age plays had no copyright (and therefore no royalties) and were constantly bought and sold. The author sold the play to the director of the company that was going to stage it (that was the only time he was paid), and sometimes the author was named, but sometimes not. A play could also be signed by a copyist or by someone from the company.
Until now, the chronology of the works (most are undated), as well as their authorship and the details of their backstory, were investigated manually, for example by studying the meter of the stanzas or by looking at intertextuality and at expressions in other works by the presumed author. “Many printed plays have no colophon, so the printer is unknown, but by studying the typography of all of them you can tell which workshop a print came from. It is meticulous and laborious work.” Vega says that when he saw the printed edition of “I Have Done What I Could, Fortune What He Wanted” and compared it with others, he knew that it came from Francisco de Lira’s workshop in Seville and was able to date it. “I also found the copy in the National Library, since I have hundreds of works cataloged in my paper files at hand,” says the professor.
José Luis Bueren is technical director of the National Library of Spain: “We still have a lot to do, but we already have more than two hundred thousand works from the collection digitized and uploaded to the Biblioteca Digital Hispánica website. Humanities researchers emphasize that not having to go to the Library has made their work easier, saving considerable time and effort.”
Bueren says that having these texts available in digital format has made it easier to create natural language processing programs and photographs that reveal erasures in manuscripts, as well as critical digital editions in which the researchers’ contributions can be viewed alongside the original text. Germán Vega says that, now that the texts are digitized, finding parallel expressions is simply a matter of using contextual search engines, a task that years ago took months. There are also programs that compute statistics on vocabulary use, with which you can calculate averages per author and compare them, calculations that are impossible for a person to do by hand.
Álvaro Cuéllar was a student of Germán Vega and is now a PhD philologist at the University of Kentucky and the recipient of a La Caixa scholarship. Professor and former student have developed the ETSO project (Stylometry Applied to Golden Age Theater), which is revolutionizing the attribution and authorship of plays. Stylometry was used, for example, to uncover the authorship of a book written by J.K. Rowling, the author of Harry Potter, who, tired of the attention, published The Cuckoo’s Calling under a pseudonym; researchers quickly found that it was hers by applying computational techniques.
Cuéllar tells us: “Thanks to Transkribus, a tool from the University of Innsbruck, I have been able to develop a process that allows us to automatically transcribe both printed texts and old manuscripts. The machine matches the text that you transcribe with the image of the handwriting, and in the end it learns to do it on its own. This has allowed us to digitize very old documents, in scripts that are not modern and for which OCR (Optical Character Recognition) does not work.” Álvaro Cuéllar and Germán Vega have worked for years and coordinated more than 100 collaborators to collect works so that the ETSO project has as many texts as possible. To date they have gathered more than 2,700 works by 350 different authors, about 35 million words, with which to make comparisons and run statistical tests.
Their method is to apply stylometry to find the works whose style is closest to that of the work being studied. Cuéllar explains that when we write we unconsciously use words at different frequencies, and that each author uses them in different proportions. “Stylometry is capable of comparing word frequencies and assigning a numerical distance between the style of one author and another. It is with these distances that we can infer authorial relationships.”
ETSO measures how each author uses the 500 most frequent words and establishes relationships on that basis: “We are able to calculate the numerical distance that separates the frequency of use of words, what we can call the style, between one author and another, and, as happened with ‘I Have Done What I Could, Fortune What He Wanted’, we can infer that a work is by Lope. But we can investigate another and find that it comes close to Calderón, or Tirso de Molina, or Ruiz de Alarcón. Each new work that we analyze can surprise us and lead us to an unexpected attribution.”
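The idea of a numerical distance between styles can be illustrated with a minimal sketch. This is not ETSO's actual code: the article does not say which distance measure the project uses, so the Manhattan distance here is an assumption, and the function names (`tokenize`, `rank_by_style`, and so on) and the toy sentences are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def most_frequent_words(texts, n=500):
    """Return the n most frequent words across all the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(text, vocabulary):
    """Share of the text's tokens taken up by each vocabulary word."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    return [counts[word] / total for word in vocabulary]


def manhattan_distance(a, b):
    """Sum of absolute differences between two frequency profiles (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_style(target, corpus, n=500):
    """Rank the works in `corpus` (title -> text) from closest to farthest in style."""
    vocabulary = most_frequent_words(list(corpus.values()) + [target], n)
    target_profile = relative_frequencies(target, vocabulary)
    distances = {
        title: manhattan_distance(target_profile, relative_frequencies(text, vocabulary))
        for title, text in corpus.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Toy texts invented for this example; real comparisons need whole plays.
    corpus = {
        "play_a": "el amor y el honor y el amor",
        "play_b": "la dama no quiere la boda no no",
    }
    for title, distance in rank_by_style("el honor y el amor y el amor", corpus, n=5):
        print(title, round(distance, 3))
```

A smaller distance means a more similar word-frequency profile. With thousands of plays of known authorship in the corpus, the authors of the closest works become candidates for an anonymous or doubtful play, which is the kind of inference Cuéllar describes.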
Sònia Boadas is a professor at the Autonomous University of Barcelona and a member of the Prolope research group. She is a specialist in Lope de Vega’s autograph manuscripts (those written in his own hand), of which, she says, some 45 comedies are preserved, distributed among libraries beyond Spanish territory. “The human eye is capable of seeing a certain spectrum of light; spectral photography allows us to see beyond it, into the infrared or ultraviolet. This technique has allowed me to see erasures and modifications that Lope made,” says Boadas, who knows his handwriting inside out.
In addition to the author’s hand, you can see the modifications made by the director of the company and even, on top of them, the handwriting and cross-outs of the censor. “So we can investigate which hand the modifications came from. When the erasures are in an ink that is not the author’s own, it is easier to draw conclusions. Another technique that I have used is X-ray fluorescence, with which the chemical composition of the ink is analyzed, so we can deduce how many inks, hands and changes have passed through the texts,” says Boadas, who values interdisciplinary collaboration as a way to keep tracing the history of the autograph manuscripts.
Cuéllar concludes: “Digital Humanities allow us to read literature as we have never done before, simultaneously analyzing hundreds or thousands of texts, comparing authors, genres or centuries, and reaching conclusions that escape the limited human brain. In the coming months we will report dozens of authorial findings that may make us rethink what we thought we knew about the Golden Age.”