csrins

Student. Teacher. Softsmith.

Archive for September, 2006

Advanced Database Systems Discussion Group Now Online

Posted by csrins on September 26, 2006

An online discussion group to facilitate exchange of technical queries and solutions has been set up for the students enrolled in this course at R.D.National College.

Bonafide students of the course are advised to join up. This group will be one of the major channels of communication with the course faculty, and within the student community.

How do you join?

If you are not already subscribed, send me an email at the contact address in the following format:

Subject: MSCCS1 [name]: Join Advanced Database Systems

Body:

[Your Complete Name]

It is highly desirable, though not mandatory that your email address be something that’s easy enough to identify you; for example firstname.lastname@gmail.com, but if you wish to conduct business using something like piggywiggy@emailprovider.com that’s entirely up to you! As mature post-graduate students, you are expected to follow proper netiquette in your dealings in the discussion group. Verbal abuse and off-topic discussions will not be tolerated. Be polite, conscientious, and dilligent, and make the most of the tools at your disposal.Creative Commons LicenseThis work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

Posted in Advanced Database Systems, Education, M.Sc. Computer Science | Leave a Comment »

Syllabus: Advanced Database Systems [M.Sc. I] @ Mumbai Uni

Posted by csrins on September 25, 2006

Paper IV

Section-I

Objectives: To study the further database techniques beyond which covered in the second year, and thus to acquaint the students with some relatively advanced issues. At the end of the course students should be able to: gain an awareness of the basic issues in objected oriented data models, learn about the Web-DBMS integration technology and XML for Internet database applications, familiarize with the data-warehousing and data-mining techniques and other advanced topics, apply the knowledge acquired to solve simple problems

1. The Extended Entity Relationship Model and Object Model: The ER model revisited, Motivation for complex data types, User defined abstract data types and structured types, Subclasses, Super classes, Inheritance, Specialization and Generalization, Constraints and characteristics of specialization and Generalization, Relationship types of degree higher than two.

2. Object-Oriented Databases: Overview of Object-Oriented concepts, Object identity, Object structure, and type constructors, Encapsulation of operations, Methods, and Persistence, Type hierarchies and Inheritance, Type extents and queries, Complex objects; Database schema design for OODBMS; OQL, Persistent programming languages; OODBMS architecture and storage issues; Transactions and Concurrency control, Example of ODBMS

3. Object Relational and Extended Relational Databases: Database design for an ORDBMS – Nested relations and collections; Storage and access methods, Query processing and Optimization; An overview of SQL3, Implementation issues for extended type; Systems comparison of RDBMS, OODBMS, ORDBMS

4. Parallel and Distributed Databases and Client-Server Architecture: Architectures for parallel databases, Parallel query evaluation; Parallelizing individual operations, Sorting, Joins; Distributed database concepts, Data fragmentation, Replication, and allocation techniques for distributed database design; Query processing in distributed databases; Concurrency control and Recovery in distributed databases. An overview of Client-Server architecture

5. Databases on the Web and Semi Structured Data: Web interfaces to the Web, Overview of XML; Structure of XML data, Document schema, Querying XML data; Storage of XML data, XML applications; The semi structured data model, Implementation issues, Indexes for text data

6. Enhanced Data Models for Advanced Applications: Active database concepts. Temporal database concepts.; Spatial databases, Concepts and architecture; Deductive databases and Query processing; Mobile databases, Geographic information systems.

Text Books:

1. Elmasri and Navathe, Fundamentals of Database Systems [4e], Pearson Education
2. Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems [3e], McGraw-Hill

References:

1. Korth, Silberchatz, Sudarshan , Database System Concepts, McGraw-Hill.
2. Peter Rob and Coronel, Database Systems, Design, Implementation and Management, Thomson Learning.
3. C.J.Date, Longman, Introduction To Database Systems, Pearson Education

Practicals

Topic: Distributed databases
Software used: Oracle 9.2

1. Create a global conceptual schema Emp(Eno;Ename;Address;Email;Salary) and insert 10 records. Divide Emp into vertical fragments Emp1(Eno;Ename;Address)and Emp2(Eno;Email;Salary)on two different nodes. Fire the following queries:

(i) Find the salary of an employee where employee number is known.
(ii) Find the Email where the employee name is known.
(iii) Find the employee name and Email where employee number is known.
(iv) Find the employee name whose salary is > 2000.

2. Create a global conceptual schema Emp(Eno;Ename;Address;Email;Salary) and insert 10 records.Divide Emp into horizontal fragments using the condition that Emp1 contains the tuples with salary = 10,000 and Emp2 with 10,000< salary = 20,000 on two different nodes. Fire the following queries:

(i) Find the salary of all employees.
(ii) Find the Email of all employees where salary = 15,000
(iii) Find the employee name and Email where employee number is known.
(iv) Find the employee name and address where employee number is known.

3. Create a global conceptual schema Emp(Eno;Ename;Address;Email;Salary) and insert 10 records. Store the replication of Emp into two different nodes and fire the following queries:

(i) Find the salary of all employees.
(ii) Find the Email of all employees where salary = 15,000
(iii) Find the employee name and Email where employee number is known.
(iv) Find the employee name and address where employee number is known.

Topic: Object Oriented Databases
Software used: Oracle 9.2

4. Using Object Oriented databases create the following types:

a) AddrType1 (Pincode: number, Street :char, City : char,
a. state :char)
b) (ii)BranchType (address: AddrType1, phone1: integer,phone2: integer )
c) AuthorType (name:char,,addr AddrType1)
d) PublisherType (name: char, addr: AddrType1, branches: BranchTableType
e) AuthorListType as varray, which is a reference to AuthorType

Next create the following tables:

f) BranchTableType of BranchType
g) authors of AuthorType
h) books(title: varchar, year : date,
published_by ref PublisherType,authors AuthorListType)
i) Publishers of PublisherType

Insert 10 records into the above tables and fire the following queries:

a) List all of the authors that have the same pin code as their publisher:
b) List all books that have 2 or more authors:
c) List the name of the publisher that has the most branches
d) Name of authors who have not published a book
e) List all authors who have published more than one book:
f) Name of authors who have published books with at least two different publishers
g) List all books (title) where the same author appears more than once on the list of authors (assuming that an integrity constraint requiring that the name of an author is unique in a list of authors has not been specified).

Topic: Multimedia Database
Software used: Oracle 9.2, J2SDK 1.4.2, Java Media Framework 2

5. [A] Create a table Emp with the attributes Eno as employee number, Ename as employee name, Eaddress as employee address and photo as an employee picture. Also create a table Company with attributes Eno,designation,age. Fire the following queries

a) Find name and designation of all the employees
b) Find name and age of all the employees
c) Find name and photo of a particular employee

[B] Create a table Singer with the attributes sno as singer number, Sname as singer name, Saddress as singer address and audio as an audio clip. Also create a table Company with attributes Sno,age. Fire the following queries

a) Find name and age of all the singer
b) Find name and audio clip of a particular singer

[C] Create a table Singer with the attributes sno as singer number, Sname as singer name, Saddress as singer address and video as an audio clip. Also create a table Company with attributes Sno,age. Fire the following queries

a) Find name and age of all the singer
b) Find name and video clip of a particular singer

Topic: Temporal Databases
Software used: Oracle 9.2

6.[A] Create a table tblEmp_Appnt, which stores the account number, name, and valid time say, recruitment date and retirement date. Insert 10 records and fire the following queries

a) Find all the employees who join the company on 2/3/2001
b) Find all the employees who will retired on 2/3/2001

[B] Create a table tbl_shares, which stores the, name of company, number of shares, and price per share at transaction time. Insert 10 records and fire the following queries

a) Find all the names of a company whose share price is more than Rs. 100 at 11:45 A.M.
b) Find the name of company which has highest share price at 5.00 P.M.

[C] Create a table tblEmp_Appnt, which stores the account number, name, and valid time say, recruitment date and retirement date. Create a trigger for valid time to check that no two records of same employee have common employment period and does not allow the user to update the records. Trigger should also fill up the empty retirement date.

Topic: Active Databases
Software used: Oracle 9.2

7. Create a table emp (eno, ename, hrs, pno, super_no) and project (pname, pno, thrs, head_no) where thrs is the total hours and is the derived attribute. Its value is the sum of hrs of all employees working on that project. eno and pno are primary keys, head_no is foreign key to emp relation. Insert 10 tuples and write triggers to do the following:

a) Creating a trigger to insert a new employee tuple and display the new total hours from project table.
b) Creating a trigger to change the hrs of existing employee and display the new total hours from project table.
c) Creating a trigger to change the project of an employee and display the new total hours from project table.
d) Creating a trigger to deleting the project of an employee.

Topic: XML databases
Software used: Oracle 9.2

8. Create a table employee having dept_id as number datatype and employee_spec as XML datatype (XMLType).The employee_spec is a schema with attributes emp id, name, email, acc_no, managerEmail, dateOf Joning. Insert 10 tuples into employee table. Fire the following queries on XML database.

a) Retrieve the names of employee.
b) Retrieve the acc_no of employees.
c) Retrieve the names,acc_no, email of employees.
d) Update the 3rd record from the table and display the name of an employee.
e) Delete 4 th record from the table.

Topic: Spatial databases
Software used: Oracle 9.2

9. Create a spatial database table that stores the number, name and location, which consists of four different areas say abc, pqr, mno and xyz. Fire the following queries

a) Find the topological intersection of two geometries.
b) Find whether two geometric figures are equivalent to each other.
c) Find the areas of all different locations.
d) Find the area of only one location.
e) Find the distance between two geometries.

Posted in Advanced Database Systems, Education, M.Sc. Computer Science | Leave a Comment »

Data Warehousing and Mining – Unit 10 – Review Questions

Posted by csrins on September 22, 2006

Data Warehousing and Mining – Unit 10 – Web Mining

  1. Explain in brief the classification of web data.
  2. Write a short note on web mining taxonomy.
  3. What is targeting? Explain with examples.
  4. Explain the different activities of text mining.
  5. What is a crawler? Distinguish between periodic and incremental crawlers.
  6. Write a short note on focused crawling.
  7. Write a short note on unfocused crawling.
  8. Explain in brief the Context Focused Crawler approach. How does this differ from traditional crawlers?
  9. What is semantic indexing? Explain in the context of harvest caching.
  10. Write a short note on WebML.
  11. Explain personalization in the context of web content mining. What are the three types of web page personalization?
  12. What is page rank? Explain its role in web structure mining.
  13. Present and explain in brief the HITS algorithm.
  14. What is web usage mining? Explain the three activities in web usage mining.
  15. Explain the usefulness of a session  in preprocessing of web usage logs.
  16. What is a trie? Explain its use in web usage mining.
  17. Write a short note on pattren discovery on clickstream data.
  18. Present and explain the maximal frequent forward sequences algorithm.
  19. Present and explain the Online Adaptive Traversal Patterns (OAT) algorithm.
  20. Write a short note on pattern analysis.
  21. Construct a trie for the string (A,B,A,C).
  22. Construct a suffix tree for the string (A,B,A,C).
  23. Given the following sessions, {(A,B,A,C),(C,B,D,F),(A,B,A)}, indicate the sequential patterns, forward sequences, and maximal frequent sequences assuming a minimum support of 30%. Assume each session occurs from a different user.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

Posted in Data Mining, Data Warehousing and Mining, Education, M.Sc. Computer Science | Leave a Comment »

Data Warehousing and Mining – Unit 9 – Review Questions

Posted by csrins on September 8, 2006

Data Warehousing and Mining – Unit 9 – Knowledge Discovery in Databases

  1. What is data mining? What is the difference between a normal query environment and data mining?
  2. Discuss the problems and hurdles usually faced by data mining projects.
  3. Define learning. Explain in brief the salient features of learning.
  4. What is a self-learning computer system? Discuss.
  5. Explain the empirical cycle of scientific research.
  6. Explain the concepts of theory formation and theory falsification.
  7. Write a short note on concept learning.
  8. Explain the following terms:
    1. Classification accuracy
    2. Transparency
    3. Statistical significance
    4. Information content
    5. Complexity of the search space
  9. What is a data warehouse? Elaborate the four rules governing the structure of a data warehouse.
  10. Explain the concept of meta-data.
  11. How are decision support systems designed? Explain in brief.
  12. Write a short note on architecture and infrastructure for a data warehouse.
  13. Explain the six stages of the Knowledge Discovery process.
  14. Write a short note on data cleaning.
  15. Explain the coding step in KDD.
  16. Discuss various visualization techniques in KDD.
  17. Write a short note on the application of Kohonen self-organizing maps.
  18. Discuss the role of Genetic Algorithms in visualization.
  19. What is knowledge? How is it different from information? What are the four different types of knowledge?
  20. What are the rules for setting up a KDD environment?
  21. Discuss the application of data mining in customer profiling.
  22. How can KDD be used to discover foreign key relationships?

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

Posted in Data Mining, Data Warehousing and Mining, Education, M.Sc. Computer Science | Leave a Comment »

Data Warehousing and Mining – Unit 8 – Review Questions Part 3

Posted by csrins on September 7, 2006

Data Warehousing and Mining – Unit 8 – Core Topics [Part 3]

  1. Define the following terms:
    1. association rule
    2. support for an association rule
    3. confidence or strength for an association rule
  2. Explain the relationship between an antecedent and a consequent in an association rule.
  3. Define the association rule problem. How can the efficiency of association rule algorithms be measured?
  4. What is an itemset? Define a large (frequent) itemset.
  5. Present and explain an algorithm for generating association rules satisfying given support and confidence values.
  6. Explain the large itemset property. What is downward closure?
  7. Present and explain the apriori-gen algorithm.
  8. Present and explain the apriori algorithm. What is the major weakness of the apriori approach?
  9. Present and explain the sampling algorithm for generating large itemsets.
  10. How does partitioning improve the performance of finding large itemsets?
  11. Present and explain the partition algorithm for finding large itemsets.
  12. Present and explain the count distribution algorithm (CDA) for finding large itemsets.
  13. Present and explain the data distribution algorithm (DDA) for finding large itemsets.
  14. Explain the classification for association rule algorithms.
  15. Write a short note on incremental updating approaches for finding large itemsets.
  16. Write a short note on multiple-level association rules.
  17. What is a quantitative association rule? Present and explain an algorithm for determining large itemsets using quantitative association rules.
  18. Write a short note on the rare item problem.
  19. Explain the different measures for the quality of association rules.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

Posted in Data Mining, Data Warehousing and Mining, Education, M.Sc. Computer Science | Leave a Comment »