What is the difference between Authentication and Authorization?
Ans. Authentication verifies who you are.
Authorization verifies what you are authorized to do.
What are the essential characteristics of Data warehouse?
Ans. Important characteristics of a data warehouse are as follows:
1. Subject-Oriented : Instead of being a dump of the enterprise-wide information(mostly thought so), it is a collection of subject wise categorized data. The information from the data warehouse is organized and presented subject area wise: e.g. Sales, Product, Customer etc can be different subject areas of a manufacturing firm’s data warehouse.
2. Integrated: It can be called integrated because of two reasons:
Firstly, because it integrates the data from different subject areas. It is a one-stop-shop for all the enterprise-wide information needs catering to the different departments of an organization.
Secondly, It can integrate data from various data sources. E.g. It can integrate data from the ERP system of the organization with the Billing system etc. It can also deal with data from different sources in a heterogeneous environment where data is hosted on different types of systems.
3. Non-Volatile: It contains stable information that doesn’t change each time an operational process is executed. Usually DELETE and UPDATE operations are not performed on the data residing in the data warehouse to preserve historical track of data. In simpler words data once inserted will not be deleted or modified hence non-volatile.
4. Time-Variant: For An operational system, the stored data contains the current value. The data warehouse, however, contains the history of data values. In simpler words, it shows its evolution over time.
What is ROLAP, MOLAP, HOLAP?
Ans. OLAP(Online Analytical Processing) basically has two storage models namely
1. ROLAP(Relational OLAP)
2. MOLAP(Multidimensional OLAP)
Apart from these two there is another model called HOLAP (Hybrid OLAP) which is a combination of MOLAP and ROLAP.
This is the traditional mode in OLAP analysis. In MOLAP data is stored in form of multidimensional cubes and not in relational databases unlike ROLAP where the data is stored in relations (tables).
1. Fast Retrieval of data: Data retrieval in the case of MOLAP is faster as compared to ROLAP. The reason is the pre-generated calculations at the time of cube creation.
2. Complex calculations: Complex calculations are not only possible but also relatively faster in MOLAP because of pre-generated calculations.
1. Can handle limited amount of data: The reason is that the calculations have been pre-built when the cube was created, hence the cube cannot be derived from a large volume of data.
2. More Investment: The cube technology is proprietary hence going with MOLAP model requires more investment as compared to ROLAP.
The underlying data in this model is stored in relational databases.
This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP’s slicing and dicing functionality.
1. Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database.
2. Can leverage functionalities inherent in the relational database: Often, relational database already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.
1. Performance can be slow: As it is relying on SQL queries for reporting the performance can be slow in case of huge data being targeted.
2. Limited by SQL functionalities: Since, SQL has its own limitations in handling data, these
incapabilities come inherently to the ROLAP model and thus it cannot perform all the complex operations like a cube.
HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can “drill through” from the cube into the underlying relational data.
What is a Cube?
Ans. A cube is a set of data that is usually constructed from a subset of a data warehouse and is organized and summarized into a multidimensional structure defined by a set of dimensions and measures.
Every cube has a schema, which is the set of joined tables in the data warehouse from which the cube draws its source data. The central table in the schema is the fact table, the source of the cube’s measures. The other tables are dimension tables, the sources of the cube’s dimensions. For example:
Pre-calculated summary data called aggregations provides the mechanism for rapid and uniform response times to queries.
End users use client applications to connect to an Analysis server and query the cubes on the server. In most client applications, end users issue a query on a cube by manipulating the user interface controls, which determine the contents of the query. This spares end users from writing language-based queries.
Events are associated with cube cells and cube edges stand for analysis dimensions.
Each cube cell is given a value for each measure.
Above figure shows an intuitive representation of a cube.
Here the fact is a sale in a store chain.
Its analysis dimensions are store, product and date.
An event stands for a specific item sold in a specific store on a specific date, and it is described by two measures: the quantity sold and the receipts.
This figure highlights that the cube is sparse—this means that many events did not actually take place. Of course, you cannot sell every item every day in every store.
What is a Hypercube?
Ans. A cube with more than 3 dimensions is called a hypercube.