NSU scientists show that an author's style can be identified from the same amount of text in completely different languages

Translation. Region: Russian Federation

Source: Novosibirsk State University

An important disclaimer is at the bottom of this article.

Writers, journalists, and everyone else who produces text have an authorial style. Philologists have known this for decades, but have studied it mainly at a qualitative level. In recent years, Boris Yakovlevich Ryabko, Doctor of Engineering and a professor at the Faculty of Information Technology (FIT) of Novosibirsk State University, has worked with colleagues and students to develop a quantitative method for determining authorial style. The method uses the tools of mathematical statistics, which makes it possible to assess the reliability of its findings.

In 2025, Boris Ryabko and his co-authors published a paper (Ryabko B., Savina N., Lulu Y. G., Han Y. The Amount of Data Required to Recognize a Writer's Style Is Consistent Across Different Languages of the World // Entropy. 2025. Vol. 27, Iss. 10, Art. 1039. ISSN 1099-4300) in which they used this method to show that the minimum amount of text needed to determine an author's style is approximately the same for Russian, English, Chinese, and Amharic, a language spoken in Ethiopia.

"These languages belong to very distant language groups, and even comparing text length across them is not straightforward. For example, Russian letters are not comparable to Chinese characters, since a single character can correspond to a whole Russian word, and sometimes even a sentence. Note that in this study, text size was measured in kilobytes for all languages, that is, in the same units," commented Boris Ryabko.

It is worth noting that two of the article's co-authors, Yeshewas Getachew Lulu (Ethiopia) and Han Yunfei (China), are graduate students at the NSU Faculty of Information Technology (FIT) working under the supervision of Professor B. Ya. Ryabko. The paper was published in October in a journal ranked in the first quartile (Q1) of scientific journals by citation frequency under the international classification and, judging by the number of views, is attracting considerable interest.

B. Ya. Ryabko and his colleagues previously used the method described in the article to determine the authorship of literary works whose authors are unknown or disputed, as in the case of Shakespeare. The method may also find practical application in assessing the quality of translations, including machine translations, and the qualifications of translators, as well as in detecting unauthorized borrowing and other forms of plagiarism.

"The quality of a translation can significantly influence how the translated work is perceived. The proposed approach has been applied to the analysis of literary translations. Under this approach, the better the translation, the more of the author's style it preserves, and this 'degree of preservation' can be quantified. Another important new area of application is assessing the quality of machine translations produced by various programs. Such an assessment has not yet been carried out, although these translators play a significant role in modern society. Another, more prosaic, area of application is identifying parts of a text written by different authors, including fragments written by artificial intelligence. This task is especially relevant for universities, and perhaps even schools, where the fight against plagiarism in student papers is quite intense. The described method can be applied to this problem as well," explained Boris Ryabko.

Please note: This information is raw content obtained directly from the source. It represents an accurate account of the source's assertions and does not necessarily reflect the position of MIL-OSI or its clients.