Database System-lecture1
Edited by 颍川散人 | Data: 2022.2.25 |
---|
Query databases, design databases, build applications with them
名校数据库:
MIT 6.830/6.814
CMU CS15-445
UCB CS186
Stanford CS145
UIUC CS411
范围:
- Chapter 1~7(原理)
- Chapter 12~19(设计)
Introduction
what is database
database system(又称为管理系统DBMS)包含:
- 一堆数据(data)
- 一堆访问数据的程序(program to access)
- 一个支持高效便捷使用的环境(macOS,Linux,...)
database system管理的数据:
- highly valuable(价值高)
- relatively large(数据量大)
- Accessed by multiple users and applications, often at the same time(多人同时访问)
database management system: mySQL, SQlite, ...
概念:database, database system, DBMS
Database applications:
- interprise(企业) : sales accounting HR
- Manufacturing(生产)
- banking and finance(银行和金融)
- University
- Airlines
- Telecommunications (通讯)
- Web-bases service(朋友圈,在线销售,广告)
- Document databases
- Navigation service
早期数据库建立在文件系统上(computer-based file + C program to access), 数据冗余和不一致,access数据难度和数据隔离性完整性不佳,更新的原子性(减和加一定全部完成,当成一个操作,没全部完成就要重新来,不存在中间状态),并发访问(concurrent access),安全问题)
university database example:data and application
View of data(数据视图)重要的是给用户一个数据的抽象视图
- Data models(数据模型)
- Data
- Data relationships
- Data semantics
- Data constraints
- 四类模型:Relational,Entity-Relationship data model,Object-based data models ,Semi-structured data model
- Data abstraction(数据抽象)
relational model(关系模型)All the data is stored in various tables
view of data : 物理层面--逻辑层面(有点像struct)--view level(不同的人看到不同的信息for security )
Instances (实例) and schemas(模式)(就像变量和类型)(logical schema,physical schema)
Data definition language(数据定义语言DDL)
Data dictionary contains metadata(data of data)
data manipulation language(数据操纵语言DML)(查询语言):
- Procedural DML(过程性:具体告诉机器那些数据需要和如何获取)
- Declarative DML(描述性:描绘了一个事情)
SQL Query Language(SQL 查询语言)(不是图灵完备语言)(embedded 嵌入式的,或接口调用)
be written in a host language(C/C++)
Database design(逻辑设计(决定database schema,物理设计)
database engine
一个database system可以被分为以下功能模块:
- The storage manager(存储管理器)
- Interaction with the OS file manager
- Efficient storing, retrieving and updating of data
- 实现一些数据结构:data files, data dictionary(store metadata), indices(索引)
- The query processor component
- DDL interpreter
- DML compiler
- Query evaluation engine
- The transaction management component
- Manage 原子性, 突然断电, 一致性,并发性,etc
Database architecture:
- Centralized databases(集中在中心)
- Client-server(客户服务器)
- Parallel databases(并发数据库)
- Distributed databases(分布式)
Database Applications
- 两层结构application---- network---- database system
- 三层结构application client---- network---- application server---- database system(做到轻量级用户端)
Database Users
- DBA(超级用户)
- naive user
- application programmer
- sophisticated user
history of database system
- 1950s~1960s 穿孔纸
- 1960s~1970s hard disk,关系数据模型,Network and hierarchical data models
- 1980s SQL,并发和面向对象database system
- 1990s 大容量存储介质,Large decision support and data-mining applications
- 2000s 大数据存储系统,大数据分析
- 2010s SQL reloaded
main DBMS products:
- Commercial : MS SQL Server
- Open source: mySQL
HW: Ex 1.7, 1.8, 1.9, 1.15
1.7 List four significant differences between a file-processing system and a DBMS
the significant differences between a file-processing system and a DBMS:
- the file-processing system may lead to data redundancy and inconsistency(数据冗余和不一致), but a DBMS not
- the difficult in accessing data have difference
- the data isolation have difference, it is impossible for a file to open different areas to different users
- Atomicity problems(原子问题), such as reduce and add data must be considered as one operate. This is DBMS specific
- Concurrent-access(并发访问)and security problems.
1.8 Explain the concept of physical data independence and its importance in database systems.
Physical data independence is defined as the ability to make changes to the lowest level of DBMS, but without affecting the logical and the view level.Physical data independence helps to separate conceptual levels from the internal or physical level.Allows us to provide a logical description of the database without the need to specify the physical structures.
1.9 List five responsibilities of a database-management system. For each responsi- bility, explain the problems that would arise if the responsibility were not dis- charged.
Five responsbilities:
Interaction with the File Manager(和文件管理器的交互)
if there is no interaction with the file manager, then nothing stored in the files can be retrieved.
Integrity Enforcement
if not, consistency constraints may not be satisfied(一致性约束可能无法满足)
Security Enforcement(安全)
if not, users can freely access content they do not have permission to access
Backup and Recovery(备份和回溯)
if not, data may be permanently lost and cannot be traced back
Concurrency Control(并发访问)
if not, although each operation is fully executed, the consistency constraint may still not be satisfied
1.15 Describe at least three tables that might be used to store information in a social- networking system such as Facebook.
- one user table contains the id, name, hometown, or any other thing of users.
- one group table contains the id, name, etc of groups
- the friend table contains the friends of each user