
Modeling and Analysis of Information Systems

Vol 29, No 3 (2022)

Algorithms

154-165
Abstract
In map production it is necessary to preserve the spatial relationships between map objects. Generalization is the simplification performed on geographical data when decreasing its representation scale. It is common practice to simplify each type of spatial object independently (administrative boundaries first, then the road network, the hydrographic network, etc.). During this process spatial conflicts inevitably arise that require manual correction. The automation of generalization still remains an open issue for data producers and users, and many researchers are working to achieve a higher level of automation. In order to detect spatial conflicts, a refined description of spatial relationships is needed. The paper analyzes models for describing the topological relationships of spatial objects: the nine-intersection model, the topological chain model, and the E-WID model. Each of the considered models takes some relations between objects into account but does not allow them to be transferred exactly. As a result, the task of developing a model of relations that preserves topology is relevant. We propose an improved nine-intersection model that takes into account the topological conflict arising when a point object is located next to a simplified line. Line simplification is one of the most requested operations in map production and generalization. When a mesh covers the map, a cell can contain point, line-segment, and polygonal topological objects, which, if the cell is small enough, are polyline objects. Thus, the simplification of topological objects within a cell reduces to the simplification of linear objects (polylines). The developed algorithm is planned to be used to solve the problem of consistent generalization of spatial data. The ideas outlined in this article will form the basis of a new spatial data index that preserves topological relationships.
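
To make the conflict described above concrete, here is a minimal sketch (not the authors' improved model): it simplifies a polyline and flags the topological conflict that arises when a nearby point object ends up on the other side of the simplified line. The geometry, the tolerance, and the helper side_of are illustrative assumptions; the DE-9IM strings returned by relate correspond to the nine-intersection model mentioned in the abstract. Requires the shapely package.

```python
# Illustrative sketch only: detect a point-vs-simplified-line topological conflict.
from shapely.geometry import LineString, Point

def side_of(line: LineString, pt: Point, eps: float = 1e-3) -> int:
    """Return -1, 0 or +1: which side of the locally nearest part of the line the point is on."""
    d = line.project(pt)                                   # arc length of the nearest point
    a = line.interpolate(max(d - eps, 0.0))
    b = line.interpolate(min(d + eps, line.length))
    cross = (b.x - a.x) * (pt.y - a.y) - (b.y - a.y) * (pt.x - a.x)
    return (cross > 0) - (cross < 0)

original = LineString([(0, 0), (1, 0.4), (2, 0), (3, 0.4), (4, 0)])
simplified = original.simplify(0.5)                        # simplification with tolerance 0.5
poi = Point(1.0, 0.2)                                      # a point object close to the line

# Nine-intersection (DE-9IM) relation strings before and after simplification
print(original.relate(poi), simplified.relate(poi))

# Topological conflict: the point changes sides relative to the simplified line
if side_of(original, poi) != side_of(simplified, poi):
    print("conflict: point object moved to the other side of the simplified line")
```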
166-180
Abstract
In this article, we consider the NP-hard problem of two-step colouring of a graph. It is required to colour the graph in a given number of colours so that no two vertices at a distance of 1 or 2 from each other receive the same colour. An optimum two-step colouring is one that uses the minimum possible number of colours. The two-step colouring problem is studied in application to grid graphs. We consider four types of grids: triangular, square, hexagonal, and octagonal. We show that the optimum two-step colouring of hexagonal and octagonal grid graphs requires 4 colours in the general case and formulate polynomial algorithms for such a colouring. A square grid graph with maximum vertex degree equal to 3 requires 4 or 5 colours for a two-step colouring; in the paper, we propose a backtracking algorithm for this case. We also present an algorithm, which works in time linear in the number of vertices, for the two-step colouring of a triangular grid graph in 7 colours and show that this colouring is always correct. If the maximum vertex degree equals 6, the solution is optimum.
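
As a rough illustration of the problem statement (not the paper's grid-specific algorithms), the sketch below performs a generic greedy two-step colouring of a graph given as an adjacency list; vertices at distance 1 or 2 receive different colours, but the number of colours used is not optimal in general. The 3x3 square-grid example is an assumption for demonstration.

```python
# Illustrative sketch only: greedy two-step (distance-2) colouring of a graph.
def two_step_greedy_colouring(adj: dict) -> dict:
    colour = {}
    for v in adj:                                  # a fixed vertex order; heuristics can improve it
        banned = set()
        for u in adj[v]:                           # distance-1 neighbours
            banned.add(colour.get(u))
            for w in adj[u]:                       # distance-2 neighbours
                if w != v:
                    banned.add(colour.get(w))
        c = 0
        while c in banned:                         # smallest colour unused at distance <= 2
            c += 1
        colour[v] = c
    return colour

# A 3x3 fragment of a square grid graph
grid = {(i, j): [(i + di, j + dj)
                 for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= i + di < 3 and 0 <= j + dj < 3]
        for i in range(3) for j in range(3)}

colours = two_step_greedy_colouring(grid)
print(max(colours.values()) + 1, "colours used")   # greedy needs at most D*D + 1 for max degree D
```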

Computer System Organization

182-198
Abstract
Linear codes are widely used to protect against errors in data transmission and storage systems, to ensure the robustness of various cryptographic algorithms and protocols, and to protect hidden information from errors in a stegocontainer. One of the classes of codes that finds application in a number of the listed areas is the class of linear self-complementary codes over the binary field. Such codes contain the all-ones vector, and their weight enumerator is a symmetric polynomial. In applied problems, self-complementary [n, k]-codes of a given length n and dimension k are often required to have the maximum possible code distance d(k, n). For n < 13, the values of d(k, n) are already known. In this paper, for self-complementary codes of length n = 13, 14, 15, the problem is to find lower bounds on d(k, n), as well as the values of d(k, n) themselves. Developing an efficient method for obtaining a lower bound close to d(k, n) is an important task, since finding the values of d(k, n) in the general case is difficult. The paper proposes four methods for finding lower bounds: based on cyclic codes, on residual codes, on the (u | u+v)-construction, and on the tensor product of codes. By jointly using these methods for the considered lengths, it was possible to efficiently obtain lower bounds that either coincide with the found values of d(k, n) or differ from them by one. The paper also proposes a sequence of checks which in some cases helps to prove the non-existence of a self-complementary [n, k]-code with code distance d. In the final part of the work, a design for hiding information that is resistant to interference in the stegocontainer is proposed on the basis of self-complementary codes. The presented calculations show that the new design is more efficient than known designs.
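
For readers unfamiliar with the notions named above, here is a minimal brute-force sketch (suitable only for very small codes, and not the methods used to obtain the bounds in the paper): it checks self-complementarity, computes the code distance by enumerating all codewords, and builds the generator matrix of the (u | u+v)-construction. The example codes are assumptions for illustration.

```python
# Illustrative sketch only: brute-force tools for small binary linear codes.
from itertools import product

def codewords(G):
    """All codewords generated by the rows of the binary generator matrix G."""
    n = len(G[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(G)):
        w = [0] * n
        for c, row in zip(coeffs, G):
            if c:
                w = [a ^ b for a, b in zip(w, row)]
        words.add(tuple(w))
    return words

def min_distance(G):
    return min(sum(w) for w in codewords(G) if any(w))

def is_self_complementary(G):
    return tuple([1] * len(G[0])) in codewords(G)

def u_u_plus_v(G1, G2):
    """Generator matrix of the (u | u+v)-construction for codes with generators G1, G2."""
    n = len(G1[0])
    return [row + row for row in G1] + [[0] * n + row for row in G2]

G1 = [[1, 1, 1]]                   # repetition [3, 1, 3] code (contains the all-ones vector)
G2 = [[1, 1, 0], [0, 1, 1]]        # even-weight [3, 2, 2] code
G = u_u_plus_v(G1, G2)             # a self-complementary [6, 3] code, d = min(2*3, 2) = 2
print(is_self_complementary(G), min_distance(G))   # True 2
```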

Theory of Data

266-279
Abstract
The research is devoted to the classification of news articles about P. G. Demidov Yaroslavl State University (YarSU) into 4 categories: “society”, “education”, “science and technologies”, and “not relevant”. The proposed approaches are based on the BERT neural network and machine learning methods: SVM, Logistic Regression, K-Neighbors, and Random Forest, in combination with different embedding types: Word2Vec, FastText, TF-IDF, and GPT-3. Text preprocessing approaches are also considered to achieve higher classification quality. The experiments showed that the SVM classifier with TF-IDF embeddings, trained on full article texts with titles, achieved the best result: its micro-F-measure and macro-F-measure are 0.8214 and 0.8308, respectively. The BERT neural network trained on fragments of paragraphs mentioning YarSU, from which the first 128 words and the last 384 words were taken, showed comparable results: micro-F-measure 0.8304 and macro-F-measure 0.8181. Thus, using paragraphs that mention the target organisation is enough to classify texts by categories efficiently.
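
As a rough illustration of the best-performing configuration described in the abstract (TF-IDF features with an SVM classifier), the sketch below builds such a pipeline with scikit-learn; the toy corpus, labels, and vectorizer settings are placeholders, not the authors' data or exact setup.

```python
# Illustrative sketch only: TF-IDF + linear SVM with micro/macro F-measure evaluation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score

texts = [
    "University hosts a public lecture for city residents",    # society
    "A new master's programme opens at the faculty",            # education
    "Researchers publish results on graph algorithms",          # science and technologies
    "Weather forecast for the weekend",                         # not relevant
] * 10
labels = ["society", "education", "science and technologies", "not relevant"] * 10

model = make_pipeline(TfidfVectorizer(lowercase=True, ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)

pred = model.predict(texts)   # evaluated on the training texts only to show the metric calls
print("micro-F:", f1_score(labels, pred, average="micro"))
print("macro-F:", f1_score(labels, pred, average="macro"))
```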

Theory of Computing

228-245
Abstract
When data-driven algorithms, especially those based on deep neural networks (DNNs), replace classical ones, their superior performance often comes with difficulty in their analysis. To compensate for this drawback, formal verification techniques, which can provide reliable guarantees on program behavior, have been developed for DNNs. These techniques, however, usually consider DNNs alone, excluding the real-world environments in which they operate, and the applicability of techniques that do account for such environments is often limited. In this work, we consider the problem of formally verifying a neural controller for the routing problem in a conveyor network. Unlike in known problem statements, our DNNs are executed in a distributed context, and the performance of the routing algorithm, which we measure as the mean delivery time, depends on multiple executions of these DNNs. Under several assumptions, we reduce the problem to a number of DNN output reachability problems, which can be solved with existing tools. Our experiments indicate that sound-and-complete formal verification in such cases is feasible, although it is notably slower than the gradient-based search for adversarial examples. The paper is structured as follows. Section 1 introduces basic concepts. Section 2 introduces the routing problem and DQN-Routing, the DNN-based algorithm that solves it. Section 3 presents the contribution of this paper: a novel sound and complete approach to formally checking an upper bound on the mean delivery time of DNN-based routing. This approach is experimentally evaluated in Section 4. The paper concludes with a discussion of the results and an outline of possible future work.
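
To show what a single DNN output reachability query looks like (the building block the abstract reduces the problem to), here is a minimal sketch of sound interval bound propagation for a tiny ReLU network; the complete verifiers referred to in the paper tighten such bounds, and the network weights and input box below are random placeholders.

```python
# Illustrative sketch only: sound output bounds for a ReLU network over an input box.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an interval box through x -> W @ x + b."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

def output_bounds(layers, lo, hi):
    for W, b in layers[:-1]:
        lo, hi = interval_affine(lo, hi, W, b)
        lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)    # ReLU is monotone
    return interval_affine(lo, hi, *layers[-1])

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((8, 4)), rng.standard_normal(8)),   # 4 -> 8
          (rng.standard_normal((1, 8)), rng.standard_normal(1))]   # 8 -> 1
lo, hi = output_bounds(layers, np.full(4, -0.1), np.full(4, 0.1))
print("network output certainly lies in", lo, hi)   # e.g. compare hi with a required upper bound
```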
246-264
Abstract
The paper analyzes the possibilities of transforming C programming language constructs into objects of the EO programming language. The key challenge of the method is transpilation from a system programming language into a language with a higher level of abstraction, which does not allow direct manipulation of computer memory. Almost all application and domain-oriented programming languages prohibit such direct access to memory. Operations that need to be supported in this case include dereferencing pointers, overlaying data of different types in the same memory area, and interpreting the same data located in the same memory address space in different ways. A decision was made to create additional EO objects that directly simulate interaction with computer memory as in the C language. These objects encapsulate unreliable data operations that use pointers. An abstract memory object was proposed to simulate the ability of the C language to interact with computer memory. The memory object is essentially an array of bytes; it is possible to write into memory and read from memory at a given index, and the number of bytes read or written depends on which object is being used. The transformation of various C language constructs into EO code is considered at the level of the compilation unit. To study the variants and analyze the results, a transpiler was developed that performs the necessary transformations. It is implemented on the basis of Clang, which builds an abstract syntax tree; this tree is processed using the LibTooling and LibASTMatchers libraries. As a result of compiling a C program, code in the EO language is generated. The considered approach turns out to be appropriate for solving different problems, one of which is static code analysis. Such solutions make it possible to isolate low-level code fragments into separate program objects, focusing on their study and possible transformation into more reliable code.
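
The abstract memory object can be pictured with a short sketch; the paper's implementation consists of EO objects, so the Python class below is only an assumed analogy showing a flat byte array with typed reads and writes at a given index, including reinterpretation of the same bytes as a different type.

```python
# Illustrative sketch only (Python analogy of the abstract memory object described above).
import struct

class Memory:
    """A flat array of bytes with typed reads/writes at a given index."""
    def __init__(self, size: int):
        self.data = bytearray(size)

    def write(self, addr: int, fmt: str, value) -> None:
        struct.pack_into(fmt, self.data, addr, value)     # e.g. fmt "<i" is a 32-bit int

    def read(self, addr: int, fmt: str):
        return struct.unpack_from(fmt, self.data, addr)[0]

mem = Memory(16)
mem.write(0, "<i", -1)        # roughly: *(int *)p = -1;
print(mem.read(0, "<I"))      # the same 4 bytes reinterpreted as unsigned: 4294967295
print(mem.read(0, "<f"))      # ... or as a 32-bit float (type punning), here NaN
```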

Discrete Mathematics in Relation to Computer Science

200-209
Abstract
One of the main methods of computational topology and topological data analysis is persistent homology, which combines geometric and topological information about an object using persistence diagrams and barcodes. The persistent homology method provides a balance between reducing the data dimension and characterizing the internal structure of an object. Combining machine learning and persistent homology is hampered by the topological representation of the data, the distance metrics, and the representation of data objects. The paper considers mathematical models and functions for representing persistence landscape objects based on the persistent homology method. Persistence landscape functions allow persistence diagrams to be mapped into a Hilbert space. The representation of topological functions in various machine learning models is considered, and an example of finding the distance between images based on persistence landscape functions is given. Based on the algebra of polynomials in the barcode space, which are used as coordinates, distances in the barcode space are determined by comparing intervals from one barcode to another and calculating penalties; for these purposes, tropical functions that take into account the basic structure of the barcode space are used. Methods for constructing rational tropical functions are considered, and an example of finding the distance between images based on tropical functions is given. To increase the variety of parameters (machine learning features), filtrations are built by scanning the object by rows from left to right and by columns from bottom to top; this adds spatial information to the topological information. The method of constructing persistence landscapes is compatible with the approach of constructing tropical rational functions when computing persistent homology.
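
As a small worked illustration of the mapping into a Hilbert space mentioned above (not the authors' implementation), the sketch below evaluates persistence landscape functions lambda_k(t) on a grid from a persistence diagram given as (birth, death) pairs and computes a discretized L2 distance between two landscape layers; the toy diagram and grid are assumptions.

```python
# Illustrative sketch only: persistence landscape layers from a persistence diagram.
import numpy as np

def landscape(diagram, ts, k_max=3):
    """Return an array of shape (k_max, len(ts)) with the first k_max landscape layers."""
    diagram = np.asarray(diagram, dtype=float)                    # rows are (birth, death)
    tents = np.maximum(0.0,
                       np.minimum(ts[None, :] - diagram[:, :1],
                                  diagram[:, 1:] - ts[None, :]))  # one "tent" per pair
    tents = -np.sort(-tents, axis=0)                              # k-th largest value at each t
    layers = np.zeros((k_max, len(ts)))
    k = min(k_max, tents.shape[0])
    layers[:k] = tents[:k]
    return layers

diagram = [(0.0, 1.0), (0.2, 0.9), (0.5, 0.6)]                    # toy persistence diagram
ts = np.linspace(0.0, 1.0, 101)
lam = landscape(diagram, ts)

# Discretized L2 distance between the first two landscape layers (a Hilbert-space metric)
step = ts[1] - ts[0]
print(np.sqrt(np.sum((lam[0] - lam[1]) ** 2) * step))
```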
210-227
Abstract
The process of testing dependencies and inference rules can be used in two ways. First, testing allows verifying hypotheses about unknown inference rules. The main goal in this case is to search for a relation, a counterexample, that satisfies the initial dependencies and contradicts the consequence. A found counterexample refutes the hypothesis; the absence of a counterexample allows searching for a generalization of the rule and the conditions under which it holds (is logically implied). Testing cannot be used as a proof of the validity of inference rules, since generalization requires the search for universal inference conditions for each rule, which cannot be programmed, because even the form of these conditions is unknown. Second, when designing a particular database, it may be necessary to test the validity of a rule for which there is no theoretical justification. Such a situation can arise in the presence of anomalies in the superkey. The solution to this problem is based on using join dependency inference rules; for these dependencies, a complete system of rules (axioms) has not yet been found. This paper 1) discusses a technique for testing inference rules using join dependencies as an example, 2) proposes a scheme of the testing algorithm, 3) considers some hypotheses for which there are neither counterexamples nor inference rules, and 4) gives an example of using testing when searching for a correct decomposition of a superkey.
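
For context, a join dependency can be tested on a concrete relation by projecting the relation onto the dependency's components and joining the projections back: the relation satisfies the dependency exactly when the join adds no new tuples. The sketch below (a naive check, not the paper's testing algorithm, with an assumed toy relation) illustrates how such a counterexample is recognised.

```python
# Illustrative sketch only: does a relation satisfy the join dependency join[R1, ..., Rk]?
from itertools import product

def project(rel, attrs):
    return {tuple(t[a] for a in attrs) for t in rel}

def satisfies_jd(rel, components):
    all_attrs = sorted({a for comp in components for a in comp})
    projections = [project(rel, comp) for comp in components]
    joined = set()
    for combo in product(*projections):                    # candidate tuples of the natural join
        t, ok = {}, True
        for comp, values in zip(components, combo):
            for a, v in zip(comp, values):
                if t.setdefault(a, v) != v:                 # join condition violated
                    ok = False
                    break
            if not ok:
                break
        if ok:
            joined.add(tuple(t[a] for a in all_attrs))
    original = {tuple(t[a] for a in all_attrs) for t in rel}
    return joined == original                               # the dependency holds iff the join is lossless

r = [{"A": 1, "B": 1, "C": 1}, {"A": 1, "B": 2, "C": 2}, {"A": 1, "B": 1, "C": 2}]
print(satisfies_jd(r, [("A", "B"), ("A", "C")]))            # False: r is a counterexample to this JD
```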


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-1015 (Print)
ISSN 2313-5417 (Online)