De-duplication on the Backup System with Information Storage in a Database

Sergey M. Taranin

doi:10.18255/1818-1015-2017-2-215-226

Abstract

Preventing data loss on digital media relies on backup. A backup can be made manually by copying data to external media, or automatically on a schedule using dedicated software. In remote backup systems, data are saved over the network to a remote repository. Such systems are multi-user and process large amounts of data, so the shared storage may contain files with identical fragments. Repeated data are eliminated by de-duplication, a method of data compression in which copies are searched for across the entire dataset rather than within a single file. The main advantage of this technology is a significant saving of disk space. However, eliminating repeated data can significantly reduce the speed of saving and restoring information. This article addresses the implementation of such a mechanism in a backup system that stores information in a relational database. It considers an example implementation of a system that works in two modes: with and without de-duplication. The article presents a class diagram for the client part of the application and describes the backend database tables and the relationships between them. The author proposes an algorithm for saving data with de-duplication and reports comparative tests of the speed of the saving and restoring algorithms with relational database management systems from different vendors.
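
The idea of de-duplicated storage in a relational database can be illustrated with a minimal sketch. It is not the algorithm proposed in the article: the fixed block size, the SHA-256 block identifier, the use of SQLite, and the table and function names (blocks, file_blocks, open_store, save_file, restore_file) are all assumptions made for illustration only.

```python
import hashlib
import sqlite3

BLOCK_SIZE = 4096  # assumed fixed block size; the article may split data differently


def open_store(path=":memory:"):
    """Create the two hypothetical tables: unique blocks and per-file block lists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        " hash TEXT PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_blocks ("
        " file_name TEXT NOT NULL,"
        " seq INTEGER NOT NULL,"
        " hash TEXT NOT NULL REFERENCES blocks(hash),"
        " PRIMARY KEY (file_name, seq))"
    )
    return conn


def save_file(conn, file_name, content):
    """Split content into blocks and store each distinct block only once."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM file_blocks WHERE file_name = ?", (file_name,))
        for seq, offset in enumerate(range(0, len(content), BLOCK_SIZE)):
            block = content[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # A block already present anywhere in the dataset is not stored again.
            conn.execute(
                "INSERT OR IGNORE INTO blocks (hash, data) VALUES (?, ?)",
                (digest, block),
            )
            conn.execute(
                "INSERT INTO file_blocks (file_name, seq, hash) VALUES (?, ?, ?)",
                (file_name, seq, digest),
            )


def restore_file(conn, file_name):
    """Reassemble a file from its blocks in their original order."""
    rows = conn.execute(
        "SELECT b.data FROM file_blocks fb"
        " JOIN blocks b ON b.hash = fb.hash"
        " WHERE fb.file_name = ? ORDER BY fb.seq",
        (file_name,),
    )
    return b"".join(row[0] for row in rows)
```

In this sketch, two files that share a block occupy the space of that block only once, at the cost of a hash computation and an extra lookup per block on saving and a join on restoring. This is the kind of trade-off between disk space and speed that the abstract describes.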

Keywords

file, data, backup, de-duplication, database

About the Author

Sergey M. Taranin

P.G. Demidov Yaroslavl State University
Russian Federation

PhD

14 Sovetskaya str., Yaroslavl 150003, Russia

For citations:

Taranin S.M. De-duplication on the Backup System with Information Storage in a Database. Modeling and Analysis of Information Systems. 2017;24(2):215-226. (In Russ.) https://doi.org/10.18255/1818-1015-2017-2-215-226

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1818-1015 (Print)
ISSN 2313-5417 (Online)
