
Modeling and Analysis of Information Systems

Vol 32, No 1 (2025)

Computing Methodologies and Applications

6-15 134
Abstract

The problem of stability of the equilibrium state in a laser system with fast oscillating coefficients is considered. A system averaged over fast oscillations and with a distributed delay is constructed. Critical cases in the problem of the stability of the equilibrium state are singled out. It is shown that the threshold value of the feedback coefficient at which the equilibrium state becomes unstable increases due to rapid oscillations compared to the corresponding value in the absence of modulation. In critical cases, normal forms are constructed – equations for the slowly varying amplitude of the periodic solutions. The conditions for the existence, stability and instability of cycles are revealed.

Discrete Mathematics in Relation to Computer Science

16-31 129
Abstract

The article considers the Wiener index for weakly connected directed graphs. For such graphs, the distance $d(u,v)$ between vertices $u$ and $v$ is not always defined, so the Wiener index needs a correction to be meaningful. The convention that sets $d(u,v)=0$ when there is no path between the vertices is well studied. We consider the convention where $d(u,v)$ equals the number of vertices in the graph when there is no path from $u$ to $v$. The article presents graphs with $n$ vertices for which the Wiener index under this convention reaches its minimal and maximal values. We also present experimental results showing how the Wiener index (under both distance conventions) changes when arcs are added to weakly connected directed graphs with fixed and random structures.
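The two distance conventions described in this abstract can be compared with a minimal Python sketch. It is an illustration only, not the authors' code: the function names, the adjacency-dictionary representation, and the choice to sum over all ordered pairs of distinct vertices are assumptions made here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Directed shortest-path lengths (in arcs) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj, unreachable="zero"):
    """Sum of d(u, v) over ordered pairs of distinct vertices (assumed definition).

    unreachable="zero": d(u, v) = 0 when no path exists.
    unreachable="n":    d(u, v) = number of vertices when no path exists.
    """
    n = len(adj)
    penalty = 0 if unreachable == "zero" else n
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        for v in adj:
            if v != u:
                total += dist.get(v, penalty)
    return total


# Hypothetical examples: a directed path and a directed cycle on 3 vertices.
directed_path = {0: [1], 1: [2], 2: []}
directed_cycle = {0: [1], 1: [2], 2: [0]}

print(wiener_index(directed_path, "zero"))  # 4
print(wiener_index(directed_path, "n"))     # 13
print(wiener_index(directed_cycle, "n"))    # 9
```

On the directed path, three ordered pairs have no connecting path, so the second convention adds $3 \cdot 3 = 9$ to the index. On the strongly connected cycle, both conventions give the same value.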

32-41 109
Abstract

A subset $V' \subset V(G)$ is an $\varepsilon$-dominating set of the graph $G$ if for every vertex $v \in V \backslash V'$ there is a vertex $u \in V'$ such that the length of the shortest chain connecting them satisfies $d(v,u)\leqslant \varepsilon$. Let $\delta_{\varepsilon}(G)$ denote the number of vertices in a smallest $\varepsilon$-dominating set, and let $r(G)$ and $d(G)$ denote the radius and diameter of $G$. Then $\delta_{\varepsilon}(G) = 1$ for $r(G)\leqslant \varepsilon \leqslant d(G)$, while $\delta_{\varepsilon}(G) > 1$ for $\varepsilon < r(G)$, and computing $\delta_{1}(G)=\delta(G)$ is an NP-complete problem. The paper considers the class of trees $t_{d}^{\rho}$ of diameter $d$ in which every internal vertex has degree $\rho$. Constructive descriptions of the trees $t \in t_{d}^{\rho}$ are given. Procedures for calculating the values $\delta_{\varepsilon}(t)$ in the range $1\leqslant \varepsilon < r (t)$ are developed. Asymptotic estimates are established, as $d \to \infty$, for $\delta_{\varepsilon}(t)$ and for its share of the total number of vertices of $t \in t_{d}^{\rho}$. Computational examples are given.
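The definition of domination within distance $\varepsilon$ can be checked directly on small graphs. The Python sketch below is illustrative and is not one of the paper's procedures: the function names are hypothetical, the graph is assumed to be undirected, connected and non-empty, and the minimum is found by exhaustive search, which is feasible only for very small graphs.

```python
from collections import deque
from itertools import combinations


def is_eps_dominating(adj, candidate, eps):
    """True if every vertex lies within distance eps of some vertex in candidate."""
    # Multi-source BFS from the candidate set, truncated at depth eps.
    dist = {v: 0 for v in candidate}
    queue = deque(candidate)
    while queue:
        u = queue.popleft()
        if dist[u] == eps:
            continue  # do not expand beyond the allowed radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(dist) == len(adj)


def eps_domination_number(adj, eps):
    """Size of a smallest eps-dominating set, by brute force over subsets."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_eps_dominating(adj, subset, eps):
                return k
    return len(vertices)


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4 (radius 2, diameter 4).
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

print(is_eps_dominating(path5, [2], 1))   # False
print(is_eps_dominating(path5, [2], 2))   # True
print(eps_domination_number(path5, 1))    # 2
print(eps_domination_number(path5, 2))    # 1
```

For this path, a single central vertex suffices once $\varepsilon$ reaches the radius, in line with $\delta_{\varepsilon}(G) = 1$ for $\varepsilon \geqslant r(G)$.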

Artificial Intelligence

42-65 196
Abstract

The development of automatic assessment systems is a relevant task that aims to simplify a teacher's routine work and speed up feedback for students. This survey covers research on the automatic assessment of student answers against a teacher's reference answer. The authors analyze text models used for automatic short answer grading (ASAG) and automated essay scoring (AES). Several approaches to determining text similarity are also taken into account, since this is a closely related task whose methods can be useful for analyzing student answers. Text models can be divided into several broad categories. The first comprises linguistic models based on various stylometric features, from simple ones such as bag of words and n-grams to complex syntactic and semantic features. The second comprises neural network models based on various embeddings; within it, large language models stand out as universal, popular and high-quality modeling methods. The third category includes combined models that unite linguistic features and neural network embeddings. A comparison of modern studies by model, method and quality metric showed that the trends in the subject area coincide with those in computational linguistics in general. Many authors choose large language models to solve their problems, but standard features remain in demand. No universal approach can be singled out; each subtask requires its own choice of method and parameter tuning. Combined and ensemble approaches achieve higher quality than other methods. The vast majority of studies examine texts in English, but successful results for national languages are also reported. It can be concluded that developing and adapting methods for assessing students' answers in national languages is a relevant and promising task.

66-79 235
Abstract

The authors propose a methodology for extracting domain-specific entities from student report documents in Russian using pre-trained transformer-based language models. Extracting domain-specific entities from such documents is a relevant task, since the obtained data can be used for purposes ranging from the formation of project teams to the personalization of learning pathways. Automating the document processing workflow also reduces the labor costs of manual processing. Expert-annotated student report documents were used as training data. These documents were created by students in information technology programs between 2019 and 2022 for project-based and practical disciplines and for theses. The domain-specific entity extraction task is approached as two subtasks: named entity recognition (NER) and annotated text generation. A comparative analysis was conducted among encoder-only models for NER (ruBERT, ruRoBERTa) and encoder-decoder models (ruT5, mBART) and decoder-only models (ruGPT, T-lite) for text generation. The effectiveness of the models was evaluated using the F1-score, along with an analysis of common errors. The highest F1-score on the test set was achieved by mBART (93.55%). This model also showed the lowest error rate in domain-specific entity identification during text generation and annotation. The NER models were less prone to errors but tended to extract domain-specific entities in a fragmented manner. The results indicate that the examined models are applicable to the stated tasks, provided the specific requirements of the problem are taken into account.

80-94 201
Abstract
The exponential growth in scientific publications has heightened the need for robust tools to organize and retrieve research effectively. The Universal Decimal Classification (UDC) serves as a valuable framework for categorizing articles by subject area. However, manual assignment of UDC codes is often prone to inaccuracies or oversimplification, limiting its utility. In this study, we present a novel approach for the automated assignment of UDC codes to scientific articles using BERT-based models. Our methodology was trained and evaluated on a dataset comprising over 19,000 articles in mathematics and related disciplines. To address the hierarchical structure of UDC, we developed two specialized evaluation metrics: hierarchical classification accuracy and hierarchical recommendation accuracy. We also explored multiple strategies for flattening hierarchical labels. Our results demonstrated a hierarchical recommendation accuracy of 0.8220. Furthermore, blind expert evaluation revealed that discrepancies between reference and predicted labels often stem from errors in the original UDC code assignments by article authors. Our approach demonstrates strong potential for automating the classification of scientific articles and can be extended to other hierarchical classification systems.
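The abstract does not define its hierarchical metrics or its label-flattening strategies, so the Python sketch below shows only one plausible reading and should not be taken as the authors' method. It assumes that each digit of a UDC code is one level of the hierarchy, flattens labels by truncating codes to a fixed depth, and scores a prediction by the length of the prefix it shares with the reference. All function names are hypothetical.

```python
def udc_levels(code):
    """Split a UDC code into hierarchy levels (assumption: one digit per level)."""
    return [ch for ch in code if ch.isdigit()]


def truncate_code(code, depth):
    """Flatten a hierarchical label by keeping only its first `depth` levels."""
    return "".join(udc_levels(code)[:depth])


def prefix_agreement(reference, predicted):
    """Fraction of hierarchy levels shared from the root (illustrative score only)."""
    ref, pred = udc_levels(reference), udc_levels(predicted)
    shared = 0
    for a, b in zip(ref, pred):
        if a != b:
            break
        shared += 1
    return shared / max(len(ref), len(pred))


print(truncate_code("519.17", 3))             # 519
print(prefix_agreement("519.17", "519.17"))   # 1.0
print(prefix_agreement("519.17", "519.2"))    # 0.6
print(prefix_agreement("519.17", "004.8"))    # 0.0
```

A score of this kind gives partial credit when a predicted code falls in the right branch of the classification but misses the exact subdivision, which is the situation that flat accuracy cannot distinguish from a completely wrong label.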


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-1015 (Print)
ISSN 2313-5417 (Online)