Friday, December 28, 2012

Oracle Application Express


What is Oracle Application Express?


Oracle Application Express (Oracle APEX) is a declarative, rapid web application development tool for the Oracle database. It is a fully supported, no cost option available with all editions of the Oracle database. Using only a web browser, you can develop and deploy professional applications that are both fast and secure.
Whether you are an experienced SQL and PL/SQL developer or a power user accustomed to writing reports, wizards allow you to quickly build web applications on top of your Oracle database objects. Enhancing and maintaining these applications is done through a declarative framework, all of which increases your productivity.
Oracle Application Express is database-centric and suited to building a vast array of applications. You can start with webifying a spreadsheet to facilitate collaboration or dive right into extremely complex applications with numerous external interfaces such as the Oracle Store. Because Oracle APEX resides within the Oracle Database and can easily integrate with authentication schemes (such as Oracle Access Manager, SSO, LDAP, etc.) you can build secure applications that can scale to meet your largest user communities.

SCD Type2 Through Informatica with date range


https://docs.google.com/open?id=0ByVBmePwMuhZd01iT3VNd3ZzUWM
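The linked document walks through the Informatica mappings; as a rough illustration of the underlying logic only, here is a minimal Python sketch of an SCD Type 2 merge using effective-date ranges (the table layout and column names are hypothetical, not taken from the document):

```python
from datetime import date

# Hypothetical SCD Type 2 dimension: each row carries an effective
# date range and a current-record flag.
HIGH_DATE = date(9999, 12, 31)

def scd2_merge(dim_rows, incoming, load_date):
    """Expire the current row for a changed customer and insert a new version."""
    for row in dim_rows:
        if (row["customer_id"] == incoming["customer_id"]
                and row["current_flag"] == "Y"
                and row["address"] != incoming["address"]):
            # Close out the old version as of the load date.
            row["effective_to"] = load_date
            row["current_flag"] = "N"
            # Insert the new version with an open-ended date range.
            dim_rows.append({
                "customer_id": incoming["customer_id"],
                "address": incoming["address"],
                "effective_from": load_date,
                "effective_to": HIGH_DATE,
                "current_flag": "Y",
            })
            break
    return dim_rows

dim = [{"customer_id": 1, "address": "Old St",
        "effective_from": date(2012, 1, 1),
        "effective_to": HIGH_DATE, "current_flag": "Y"}]
dim = scd2_merge(dim, {"customer_id": 1, "address": "New Ave"}, date(2012, 6, 1))
```

After the merge the dimension holds two versions of the customer: the expired row with a closed date range and the new current row.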

DAC Tasks & Explanation

Step 1: Seed Data Configuration
Description: a) Task Logical Folders, b) Task Physical Folders, c) Task Phase (optional).
Notes: One time only. To be done when a new Informatica folder is created.

Step 2: Source Folder Relationship
Description: In the Design view, under the Source System Folders tab, link the Task Logical Folders and Task Physical Folders.
Notes: One time only. To be done when a new Informatica folder is created.

Step 3: Import Table(s)
Description: In the Design view, under the Tables tab, import the new fact or dimension table by right-clicking in the workspace. This shows a list of tables; select the table you want to import.
Notes: Every time you add a new dimension (extending an existing or custom star) or fact table (a new star schema).

Step 4: Import Columns for the Custom Table(s)
Description: Right-click the custom table imported in Step 3 and select Import > Database Columns. This imports all of the columns for that table. If it is a fact table, make sure to enter the Foreign Key Table and Foreign Key under the Columns subtab. (This is a very important step.)
Notes: Every time you add a new dimension or fact table.

Step 5: Create Task
Description: In the Design view, under the Tasks tab, click New to create a new task. Enter all of the required details in the Edit subtab. Make sure to enter the correct workflow names for full and incremental loads, select the correct logical folder name from the drop-down, and select the correct source and target connections for the task. Default: Source = DBConnection_OLTP, Target = DBConnection_OLAP.
Notes: Every time you add a new dimension or fact table.

Step 6: Synchronize the Task
Description: Right-click the task created in Step 5 and select Synchronize Task. In the pop-up window, choose the Selected Task Only option. This synchronizes the DAC task with your Informatica task and fills in the source and target tables for the task.
Notes: Every time you add a new dimension or fact table.

Step 7: Update the Source and Target Table Options
Description: After Step 6, click the Sources subtab and enter the required details. Make sure to enter the source connection value, which is always DBConnection_OLTP. Repeat for the target table: select the Truncate for Full Load or Truncate Always option, and make sure to enter the target connection value, which is always DBConnection_OLAP.
Notes: Every time you add a new dimension or fact table.

Step 8: Update the Newly Imported Tables
Description: In the Design view, under the Tables tab, identify all of the tables, including sources and targets, that were imported in Step 6 and set the correct Table Type values.
Notes: Every time you add a new dimension or fact table.

Step 9: Import Indexes
Description: In the Design view, under the Tables tab, identify the tables from Step 6, right-click, and import all of the indexes. Prerequisite: the indexes must already exist on the tables at the database level.
Notes: Every time you add a new dimension or fact table.

Step 10: Set Index Properties
Description: Once the indexes have been imported in Step 9, click the Indexes tab in the Design view, identify the indexes you just imported, and set the correct values. Make sure to mark each as an ETL or a query index. Also set the Drop and Recreate Bitmap Indexes Always option so that the indexes are always dropped and recreated for a full load.
Notes: Every time you add a new dimension or fact table.

Step 11: Create Subject Area
Description: In the Design view, under the Subject Areas tab, click New. Enter a name for your subject area and click Save. Click the Tables subtab and add the fact table(s) for the star or subject area you are building. Click Save.
Notes: Every time you add a new dimension or fact table.

Step 12: Assemble the Subject Area
Description: Once the subject area is built in Step 11, click the Assemble button to create a list of all tasks for that subject area. When prompted for selected or all records, make sure to choose Selected Record Only (if you are assembling more than one subject area, choose the other option). For this step to work correctly, Steps 4, 5, and 6 must be completed accurately.
Notes: Every time you add a new dimension or fact table, and whenever you add, delete, or modify any tasks or related objects. Advisable to perform this step after changes to any of the DAC objects.

Step 13: Create New Execution Plan
Description: In the Execute view, click New. Enter a name for the execution plan and click Save.
Notes: When you create a new execution plan.

Step 14: Add the Subject Area
Description: Click the Subject Areas subtab in the Execute view and add the new subject area configured in Steps 11 and 12. Click Save.
Notes: When you create a new execution plan or modify an existing one to add or remove subject areas.

Step 15: Generate Parameters
Description: Click the Parameters subtab in the Execute view and click Generate. Select 1 in the pop-up window. Once the parameters are generated, select the appropriate values for the data sources and validate the folder assignment, which happens by default. Click Save.
Notes: When you create a new execution plan or modify an existing one to add or remove subject areas.

Step 16: Build the Execution Plan
Description: In the Execute view, under Execution Plans, select the execution plan created and configured in Steps 13 through 15 and click Build. When prompted for selected or all records, make sure to choose Selected Record Only. Complete the series of prompts and wait for the execution plan to be built. Your execution plan is ready to run!
Notes: Whenever you add, delete, or modify any tasks or related objects. Advisable to perform this step after changes to any of the DAC objects.

Basic UNIX/Linux commands





Custom commands available on departmental Linux computers but not on 
Linux systems elsewhere have descriptions marked by an *asterisk. 
Notice that for some standard Unix/Linux commands, friendlier versions 
have been implemented; on departmental Linux computers the names of such 
custom commands end with a period.
------------------------------------------------------------------------------
INTERACTIVE FEATURES
TAB                         Command completion !!! USEFUL !!!
UPARROW                     Command history !!! USEFUL !!!

CTRL-C                      Interrupt/kill current process.

CTRL-D                      (at the beginning of the input line) end input.
CTRL-D CTRL-D               (in the middle of the input line) end input.
------------------------------------------------------------------------------
WILDCARDS AND DIRECTORIES USED IN COMMANDS
*                           Replaces any string of characters in a file name
                            except the initial dot.
?                           Replaces any single character in a file name
                            except the initial dot.
~                           The home directory of the current user.
~abcde001                   The home directory of the user abcde001.
..                          The parent directory.
.                           The present directory.
/                           The root directory

Example:
ls ~/files/csc221/*.txt
------------------------------------------------------------------------------
REDIRECTIONS AND PIPES

COMMAND < FILE              Take input from FILE instead of from the keyboard.
COMMAND > FILE              Put output of COMMAND to FILE
COMMAND >> FILE             Append the output of COMMAND to FILE
COMMAND 2> FILE             Put error messages of COMMAND to FILE.
COMMAND 2>> FILE            Append error messages of COMMAND to FILE.
COMMAND > FILE1 2> FILE2    Put output and error messages in separate files.
COMMAND >& FILE             Put output and error messages in the same FILE.
COMMAND >>& FILE            Append output and error messages of COMMAND to FILE.

COMMAND1 | COMMAND2         Output of COMMAND1 becomes input for COMMAND2.
COMMAND | more              See the output of COMMAND page by page.
COMMAND | sort | more       See the output lines of COMMAND sorted and page by page.
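The same pipeline idea can be driven from a program. As a small illustration (not part of the cheat sheet itself), here is a Python sketch that feeds text through the standard `sort` utility, the way the shell does for `COMMAND | sort` (assumes `sort` is on the PATH):

```python
import subprocess

# Feed unsorted lines into `sort` and read back its output,
# mimicking what the shell does for COMMAND | sort.
text = "banana\napple\ncherry\n"
result = subprocess.run(["sort"], input=text, capture_output=True, text=True)
sorted_lines = result.stdout.splitlines()
```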
------------------------------------------------------------------------------
ONLINE HELP

h                           *Custom help.
COMMAND --help | more       Basic help on a Unix COMMAND (for most commands).
COMMAND -h | more           Basic help on a Unix COMMAND (for some commands).
whatis COMMAND              One-line information on COMMAND.
man COMMAND                 Display the UNIX manual page on COMMAND.
info COMMAND                Info help on COMMAND.
xman                        Browser for Unix manual pages (under X-windows).
apropos KEYWORD | more      Find man pages relevant to KEYWORD.
help COMMAND                Help on a bash built-in COMMAND.
perldoc                     Perl documentation.
------------------------------------------------------------------------------
FILES AND DIRECTORIES

ls                          List contents of current directory.
ls -l                       List contents of current directory in a long form.
ls -a                       Same as ls but .* files are displayed as well.
ls -al                      Combination of ls -a and ls -l 
ls DIRECTORY                List contents of DIRECTORY (specified by a path).
ls SUBDIRECTORY             List contents of SUBDIRECTORY.
ls FILE(S)                  Check whether FILE exists (or what FILES exist).

pwd                         Display absolute path to present working directory.

mkdir DIRECTORY             Create DIRECTORY (i.e. a folder)

cd                          Change to your home directory.
cd ..                       Change to the parent directory.
cd SUBDIRECTORY             Change to SUBDIRECTORY.
cd DIRECTORY                Change to DIRECTORY (specified by a path).
cd -                        Change to the directory you were in previously.
cd. ARGUMENTS               *Same as cd followed by ls

cp FILE NEWFILE             Copy FILE to NEWFILE.
cp -r DIR NEWDIR            Copy DIR and all its contents to NEWDIR.
cp. ARGUMENTS               *Same as cp -r but preserving file attributes.

mv FILE NAME                Rename FILE to new NAME.
mv DIR NAME                 Rename directory DIR to new NAME.
mv FILE DIR                 Move FILE into existing directory DIR.
swap FILE1 FILE2            *Swap contents of FILE1 and FILE2.
ln -s FILE LINK             Create symbolic LINK (i.e. shortcut) to existing FILE.

quota                       Displays your disk quota.
quota.                      *Displays your disk quota and current disk usage.

rm FILE(S)                  Remove FILE(S).
rmdir DIRECTORY             Remove empty DIRECTORY.
rm -r DIRECTORY             Remove DIRECTORY and its entire contents.
rm -rf DIRECTORY            Same as rm -r but without asking for confirmations.
clean                       *Remove non-essential files, interactively 
clean -f                    *Remove non-essential files, without interaction.
junk FILE                   *Move FILE to ~/junk instead of removing it. 
find. FILE(S)               *Search current dir and its subdirs for FILE(S).
touch FILE                  Update modification date/time of FILE.
file FILE                   Find out the type of FILE.
gzip                        Compress or expand files.
zip                         Compress or expand files.
compress                    Compress or expand files.
tar                         Archive a directory into a file, or expand such a file.
targz DIRECTORY             *Pack DIRECTORY into archive file *.tgz 
untargz ARCHIVE.tgz         *Unpack *.tgz archive into a directory.

------------------------------------------------------------------------------
TEXT FILES

more FILE                   Display contents of FILE, page by page.
less FILE                   Display contents of FILE, page by page.
cat FILE                    Display a file. (For very short files.)
head FILE                   Display first lines of FILE.
tail FILE                   Display last lines of FILE.

pico FILE                   Edit FILE using a user-friendly editor. 
nano FILE                   Edit FILE using a user-friendly editor. 
kwrite FILE                 Edit FILE using a user-friendly editor under X windows.
gedit FILE                  Edit FILE using a user-friendly editor under X windows.
kate FILE                   Edit FILE using a user-friendly editor under X windows.
emacs FILE                  Edit FILE using a powerful editor.
vim FILE                    Edit FILE using a powerful editor with cryptic syntax.
aspell -c FILE              Check spelling in text-file FILE.
ispell FILE                 *Check spelling in text-file FILE.

cat FILE1 FILE2 > NEW       Append FILE1 and FILE2 creating new file NEW.
cat FILE1 >> FILE2          Append FILE1 at the end of FILE2.

sort FILE > NEWFILE         Sort lines of FILE alphabetically and put them in NEWFILE.

grep STRING FILE(S)         Display lines of FILE(S) which contain STRING.
grep. STRING FILE(S)        *Similar to that above, but better. 
wc FILE(S)                  Count characters, words and lines in FILE(S).
diff FILE1 FILE2 | more     Show differences between two versions of a file.

filter FILE NEWFILE         *Filter out strange characters from FILE.

COMMAND | cut -b 1-9,15     Remove sections from each line.
COMMAND | uniq              Omit repeated lines.
------------------------------------------------------------------------------
PRINTING

lpr FILE                    In Hawk153B or Redcay 141A, print FILE from a workstation. 
lprint1 FILE                *Print text-file on local printer; see help printing
lprint2 FILE                *Print text-file on local printer; see help printing

------------------------------------------------------------------------------
PROGRAMMING LANGUAGES
python                      Listener of Python 2.x programming language.
python3                     *Listener of Python 3.x programming language.
idle                        IDE for Python 2.x.
idle3                       *IDE for Python 3.x.
cc -g -Wall -o FILE FILE.c  Compile C source FILE.c into executable FILE.
gcc -g -Wall -o FILE FILE.c Compile C source FILE.c into executable FILE.
c++ -g -Wall -o FIL FIL.cxx Compile C++ source FIL.cxx into executable FIL.
g++ -g -Wall -o FIL FIL.cxx Compile C++ source FIL.cxx into executable FIL.
gdb EXECUTABLE              Start debugging a C/C++ program.
make FILE                   Compile and link C/C++ files specified in makefile
m                           *Same as make but directs messages to a log file.
c-work                      *Repeatedly edit-compile-run a C program.
javac CLASSNAME.java        Compile a Java program.
java CLASSNAME              Run a Java program.
javadoc CLASSNAME.java      Create an html documentation file for CLASSNAME.
appletviewer CLASSNAME      Run an applet.

scheme                      *Listener of Scheme programming language.
lisp                        *Listener of LISP programming language.
prolog                      *Listener of Prolog programming language.
------------------------------------------------------------------------------
INTERNET
lynx                        Web browser (for text-based terminals).
firefox                     Web browser.
konqueror                   Web browser.
BROWSER                     Browse the Internet (with one of the browsers above).
BROWSER FILE.html           Display a local html file.
BROWSER FILE.pdf            Display a local pdf file.

mutt                        Text-based e-mail manager.
pine                        Text-based e-mail manager (on some systems).

ssh HOST                    Open interactive session on HOST using secure shell.
sftp HOST                   Open sftp (secure file transfer) connection to HOST.
rsync ARGUMENTS             Synchronize directories on local and remote host.
------------------------------------------------------------------------------
UNDER X-WINDOWS
libreoffice                 LibreOffice productivity suite
soffice                     OpenOffice productivity suite
 
acroread FILE.pdf           Display pdf FILE.pdf
epdfviewer FILE.pdf         Display pdf FILE.pdf
okular FILE                 Display FILE (pdf, postscript, ...)

xterm                       A shell window
konsole                     A better shell window
xcalc                       A calculator
xclock                      A clock
xeyes                       They watch you work and report to the Boss :-)
------------------------------------------------------------------------------
RESET

xfwm4                       Reset the windows manager on Lab and MiniLab computers.
Ctrl-Alt-Backspace          Restart the X-server. (You may need to do that twice.)
clear                       Clear shell window.
xrefresh                    Refresh X-windows.
reset                       *Reset session.
setup-account               *Set up or reset your account (Dr. Plaza's customizations)
------------------------------------------------------------------------------
MISCELLANEOUS

exit                        Exit from any shell.
logout                      Exit from the login shell and terminate session.

svn                         Version control system.    

date                        Display date and time.
------------------------------------------------------------------------------
COURSEWORK IN DR. PLAZA'S COURSES
Commands and directory names related to csc219 have 219 as a suffix. 
By changing the suffix you will obtain commands for other courses.
ls $csc319                  *List files related to csc319. 
cd $csc319                  *change into instructor's public directory for csc319. 
cp $csc319/FILE .           *Copy FILE related to csc319 to the current directory
cp -r $csc319/SUBDIR .      *Copy SUBDIR of csc319 to the current directory.
submit                      *Submit a directory with files for an assignment. 
grades                      *See your grades. Used in some courses only.
------------------------------------------------------------------------------
PROCESS CONTROL
Notes: A process is a run of a program; 
       One program can be used to create many concurrent processes.
       A job may consist of several processes with pipes and redirections.
       Processes are managed by the kernel. 
       Jobs are managed by the shell.
 
CTRL-Z                      Suspend current foreground process.
fg                          Bring job suspended by CTRL-Z to the foreground.
bg JOB                      Restart suspended JOB in the background.
ps                          List processes.
ps.                         *List processes.
jobs                        List current jobs (A job may involve many processes).
kill PROCESS                Kill PROCESS (however some processes may resist).
ctrl-C                      Kill the foreground process (but it may resist).
kill -9 PROCESS             Kill PROCESS (no process can resist.)
kill. PROCESS               *Kill PROCESS; same as kill -9.
COMMAND &                   Run COMMAND in the background. 
------------------------------------------------------------------------------
ENVIRONMENT VARIABLES IN BASH
env | sort | more           List all the environment variables with values.
echo $VARIABLE              List the value of VARIABLE.
unset VARIABLE              Remove VARIABLE.
export VARIABLE=VALUE       Create environment variable VARIABLE and set to VALUE.
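From a program's point of view, exported variables simply appear in the process environment. As a quick Python sketch (MY_VAR is a made-up name):

```python
import os

# Exported variables are inherited by child processes; within
# Python they are visible through os.environ.
os.environ["MY_VAR"] = "hello"       # like: export MY_VAR=hello
value = os.environ.get("MY_VAR")     # like: echo $MY_VAR
os.environ.pop("MY_VAR", None)       # like: unset MY_VAR
missing = os.environ.get("MY_VAR")   # None once the variable is removed
```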
------------------------------------------------------------------------------


Dimensional Modeling


Dimensional modeling (DM) is the name of a set of techniques and concepts used in data warehouse design. Dimensional modeling always uses the concepts of facts (measures), and dimensions (context).
Facts : Facts are typically (but not always) numeric values that can be aggregated, and dimensions are groups of hierarchies and descriptors that define the facts.
Types of Facts :
There are three types of facts:
  • Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table.
  • Semi-Additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
  • Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
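As a small illustration with made-up data: a sales amount behaves additively across every dimension, while an account balance is semi-additive, since it can be summed across accounts on one day but not across days for one account:

```python
# Made-up fact rows. Sales are additive across every dimension.
sales = [
    {"day": 1, "store": "A", "amount": 100},
    {"day": 1, "store": "B", "amount": 150},
    {"day": 2, "store": "A", "amount": 120},
]
total_sales = sum(r["amount"] for r in sales)  # valid across all dimensions

# Balances are semi-additive: summable across accounts, not across days.
balances = [
    {"day": 1, "account": "X", "balance": 500},
    {"day": 2, "account": "X", "balance": 520},
    {"day": 2, "account": "Y", "balance": 300},
]
# Valid: sum balances across accounts for a single day.
eod_day2 = sum(r["balance"] for r in balances if r["day"] == 2)
# Not meaningful: summing one account's balance across days
# double-counts money; use an average or ending balance instead.
avg_balance_x = (500 + 520) / 2
```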
Types of Fact Tables
There are two types of fact tables:
  • Cumulative: This type of fact table describes what has happened over a period of time. For example, this fact table may describe the total sales by product by store by day. The facts for this type of fact tables are mostly additive facts. The first example presented here is a cumulative fact table.
  • Snapshot: This type of fact table describes the state of things in a particular instance of time, and usually includes more semi-additive and non-additive facts. The second example presented here is a snapshot fact table.
Dimension : A dimension is a data element that categorizes each item in a data set into non-overlapping regions. A data warehouse dimension provides the means to "slice and dice" data in a data warehouse. Dimensions provide structured labeling information to otherwise unordered numeric measures.

Types of Dimension :

Conformed dimension :

In data warehousing, a conformed dimension is a dimension that has the same meaning to every fact with which it relates. Conformed dimensions allow facts and measures to be categorized and described in the same way across multiple facts and/or data marts, ensuring consistent reporting across the enterprise.
A conformed dimension can exist as a single dimension table that relates to multiple fact tables within the same data warehouse, or as identical dimension tables in separate data marts.
Junk dimension :
A junk dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. The junk dimension should contain a single row representing the blanks, with a surrogate key that is used in the fact table for every row returned with a blank comment field.
The designer is faced with the challenge of where to put attributes that do not belong in the other dimensions. One option is to create a new dimension for each remaining attribute, but that could require a vast number of new dimensions and result in a fact table with a very large number of foreign keys; a junk dimension combines these leftover attributes into a single table instead.
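One common way to build a junk dimension is to enumerate every combination of the leftover low-cardinality flags and assign each combination a surrogate key; the fact row then stores a single foreign key instead of one column per flag. A minimal sketch (the flag names here are hypothetical):

```python
from itertools import product

# Hypothetical leftover flags that fit in no existing dimension.
payment_types = ["cash", "credit"]
promo_flags = ["Y", "N"]
gift_wrap_flags = ["Y", "N"]

# Enumerate every combination and assign a surrogate key.
junk_dim = {}
for key, combo in enumerate(product(payment_types, promo_flags, gift_wrap_flags),
                            start=1):
    junk_dim[combo] = key

def lookup_junk_key(payment, promo, gift_wrap):
    """Resolve a fact row's flag values to the junk-dimension surrogate key."""
    return junk_dim[(payment, promo, gift_wrap)]

# The fact row carries one foreign key instead of three flag columns.
fact_row = {"sales_amount": 42.0,
            "junk_key": lookup_junk_key("credit", "N", "Y")}
```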
Degenerate dimension :
A dimension key, such as a transaction number, invoice number, ticket number, or bill-of-lading number, that has no attributes and hence does not join to an actual dimension table. Degenerate dimensions are very common when the grain of a fact table represents a single transaction item or line item because the degenerate dimension represents the unique identifier of the parent. Degenerate dimensions often play an integral role in the fact table's primary key.

Dimensional modeling structure:

The dimensional model is built on a star-like schema, with dimensions surrounding the fact table. To build the schema, the following design model is used:
  1. Choose the business process
  2. Declare the Grain
  3. Identify the dimensions
  4. Identify the Fact

Benefits of dimensional modeling :

Benefits of the dimensional modeling are following:
  1. Understandability
  2. Query performance
  3. Extensibility

Data Warehouse Schemas


A schema is a collection of database objects, including tables, views, indexes, and synonyms. You can arrange schema objects in the schema models designed for data warehousing in a variety of ways.
Star Schemas :
The star schema (also called a star-join schema, data cube, or multi-dimensional schema) is the simplest style of data warehouse schema. The star schema consists of one or more fact tables referencing any number of dimension tables.
[Figure dwhsg007.gif: star schema diagram]
The facts that the data warehouse helps analyze are classified along different dimensions:
  • The fact table holds the main data. It includes a large amount of aggregated data, such as price and units sold. There may be multiple fact tables in a star schema.
  • Dimension tables, which are usually smaller than fact tables, include the attributes that describe the facts. Often this is a separate table for each dimension. Dimension tables can be joined to the fact table(s) as needed.
Dimension tables have a simple primary key, while fact tables have a set of foreign keys which make up a compound primary key consisting of a combination of relevant dimension keys.
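To make the structure concrete, here is a tiny in-memory sketch of a star join: a fact table holding only foreign keys and measures, resolved against two dimension tables (the table names and data are invented):

```python
# Invented dimension tables keyed by simple surrogate keys.
product_dim = {1: {"name": "Widget", "category": "Tools"},
               2: {"name": "Gadget", "category": "Toys"}}
store_dim = {10: {"city": "Austin"}, 20: {"city": "Boston"}}

# Fact rows hold only foreign keys plus the measures.
sales_fact = [
    {"product_key": 1, "store_key": 10, "units": 3, "price": 9.99},
    {"product_key": 2, "store_key": 10, "units": 1, "price": 19.99},
    {"product_key": 1, "store_key": 20, "units": 2, "price": 9.99},
]

# A typical star query: total units by product category.
units_by_category = {}
for row in sales_fact:
    category = product_dim[row["product_key"]]["category"]
    units_by_category[category] = units_by_category.get(category, 0) + row["units"]
```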
Advantages :
  • Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
  • Provide highly optimized performance for typical star queries.
  • Are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data warehouse schema contain dimension tables.
Snowflake Schemas : The snowflake schema is represented by centralized fact tables which are connected to multiple dimensions. In the snowflake schema, dimensions are normalized into multiple related tables, whereas the star schema's dimensions are denormalized, with each dimension represented by a single table.
          Snowflake schemas are often better with more sophisticated query tools that isolate users from the raw table structures, and for environments having numerous queries with complex criteria.

[Figure dwhsg008.gif: snowflake schema diagram]

Advantages :
  • Some OLAP multidimensional database modeling tools that use dimensional data marts as data sources are optimized for snowflake schemas.
  • A snowflake schema can sometimes reflect the way in which users think about data. Users may prefer to generate queries using a star schema in some cases, although this may or may not be reflected in the underlying organization of the database.
  • A multidimensional view is sometimes added to an existing transactional database to aid reporting. In this case, the tables which describe the dimensions will already exist and will typically be normalized. A snowflake schema will therefore be easier to implement.
  • If a dimension is very sparse (i.e. most of the possible values for the dimension have no data) and/or a dimension has a very long list of attributes which may be used in a query, the dimension table may occupy a significant proportion of the database and snowflaking may be appropriate.
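As a small sketch with invented names, snowflaking normalizes a dimension: instead of repeating the category name on every product row, the product table holds a key into a separate category table, adding one join hop to each query:

```python
# Snowflaked (normalized) dimension: category attributes live in
# their own table, referenced by key from the product table.
category_dim = {100: {"category_name": "Tools"},
                200: {"category_name": "Toys"}}
product_dim = {1: {"name": "Widget", "category_key": 100},
               2: {"name": "Gadget", "category_key": 200}}

def category_of(product_key):
    """Resolve a product to its category via the extra join hop."""
    cat_key = product_dim[product_key]["category_key"]
    return category_dim[cat_key]["category_name"]
```

In a star schema the `category_name` would instead be stored directly on each product row, trading storage and update anomalies for simpler, faster joins.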

Data Warehouse Architecture


The architecture described here is taken from www.databaseanswers.com. We recommend visiting that site to get a good understanding of data modeling.
Now we are going to define each term in the architecture to facilitate a better understanding of the subject.
1. Operational Data Store : is a database designed to integrate data from multiple sources for additional operations on the data. The data is then passed back to operational systems for further operations and to the data warehouse for reporting.
2. ERP : Enterprise resource planning integrates internal and external management information across an entire organization, embracing finance/accounting, manufacturing, sales and service, etc.
             Its purpose is to facilitate the flow of information between all business functions inside the boundaries of the organization and to manage the connections to outside stakeholders.
3. CRM : Customer relationship management is a widely implemented strategy for managing a company's interactions with customers, clients, and sales prospects. It involves using technology to organize, automate, and synchronize business processes, principally sales activities, but also those for marketing, customer service, and technical support.
        Customer relationship management describes a company-wide business strategy including customer-interface departments as well as other departments.
4. Flat Files in Data Warehousing : Flat files do not maintain referential integrity the way an RDBMS does, and their fields are usually separated by delimiters such as commas or pipes.
           From Informatica 8.6 onward, unstructured data sources such as MS Word, e-mail, and PDF files can be used as sources.
5. ETL (Extract, Transform, and Load) :
is a process in database usage, and especially in data warehousing, that involves extracting data from outside sources, transforming it to fit operational needs, and loading it into the target database or data warehouse.
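As a toy illustration of the three stages (with invented data and cleansing rules), an ETL pass might look like:

```python
# Toy ETL pass: extract raw records, transform them to fit the
# warehouse conventions, and load them into a target structure.
raw_source = ["  alice ,200 ", "bob,150", "carol,abc"]  # invented feed

def extract(lines):
    """Split each raw comma-delimited line into fields."""
    return [line.split(",") for line in lines]

def transform(records):
    """Normalize names and enforce the numeric-amount rule."""
    clean = []
    for name, amount in records:
        name = name.strip().title()
        try:
            clean.append({"customer": name, "amount": int(amount.strip())})
        except ValueError:
            pass  # reject rows that fail the type rule
    return clean

def load(rows, target):
    """Append the cleansed rows to the target table."""
    target.extend(rows)
    return target

warehouse = load(transform(extract(raw_source)), [])
```

The malformed third row is rejected during the transform stage, so only the two clean rows reach the target.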
6. Data Marts: A data mart (DM) is the access layer of the data warehouse (DW) environment that is used to get data out to the users. The DM is a subset of the DW, usually oriented to a specific business line or team.
           For the definition of the data warehouse, please refer to the Introduction to Data Warehousing section.
7. OLAP : OLAP (Online Analytical Processing) is a methodology to provide end users with access to large amounts of data in an intuitive and rapid manner to assist with deductions based on investigative reasoning.
OLAP systems need to:
  1. Support the complex analysis requirements of decision-makers,
  2. Analyze the data from a number of different perspectives (business dimensions), and
  3. Support complex analyses against large input (atomic-level) data sets.
8. OLTP : Online transaction processing, or OLTP, refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing.
9. Data Mining: Is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Data mining is seen as an increasingly important tool by modern business to transform data into business intelligence, giving an informational advantage.

Introduction to Data warehousing


A data warehouse (DW) is a database used for reporting. The data is offloaded from the operational systems for reporting. The data may pass through an operational data store for additional operations before it is used in the DW for reporting.
A data warehouse maintains its functions in three layers: staging, integration, and access. Staging is used to store raw data for use by developers (analysis and support). The integration layer is used to integrate data and to provide a level of abstraction from users. The access layer is for getting data out to users.
1. Ralph Kimball's paradigm: Data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.
Definition as Per Ralph Kimball : A data warehouse is a copy of transaction data specifically structured for query and analysis.
His approach to data warehouse design is bottom-up: data marts are first created to provide reporting and analytical capabilities for specific business processes.
2. Bill Inmon's paradigm: The data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from the data warehouse. In the data warehouse, information is stored in 3rd normal form.
Definition as Per Bill Inmon:
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process.
Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse, there will be only a single way of identifying a product.
Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transactions system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a customer, where a data warehouse can hold all addresses associated with a customer.
Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered.
His approach to data warehouse design is top-down: the data warehouse is designed using a normalized enterprise data model, and "atomic" data, that is, data at the lowest level of detail, is stored in the data warehouse.
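The time-variant and non-volatile properties described above are commonly implemented by versioning dimension rows instead of overwriting them (the SCD Type 2 technique mentioned earlier in this blog). The sketch below illustrates the idea with the customer-address example; the dates, column names, and data are invented.

```python
# Time-variant, non-volatile sketch: instead of overwriting a customer's
# address (as a transaction system would), the warehouse closes the old row
# and inserts a new version (SCD Type 2). Data and names are invented.

customer_dim = [
    {"cust_id": 1, "address": "Old Street",
     "eff_from": "2010-01-01", "eff_to": "9999-12-31", "current": True},
]

def update_address(dim, cust_id, new_address, change_date):
    # Close the current version of the row; never delete or overwrite it
    for row in dim:
        if row["cust_id"] == cust_id and row["current"]:
            row["eff_to"] = change_date
            row["current"] = False
    # Insert the new version as the current row
    dim.append({"cust_id": cust_id, "address": new_address,
                "eff_from": change_date, "eff_to": "9999-12-31", "current": True})

update_address(customer_dim, 1, "New Avenue", "2012-06-01")
print(len(customer_dim))  # → 2: the full address history is preserved
```
This is exactly the contrast drawn above: the transaction system holds only the most recent address, while the warehouse holds every address a customer has ever had, each tagged with its effective date range.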

Thursday, December 27, 2012

Security in OBIEE


Types of Security in OBIEE with example

Why do we need security in OBIEE?

OBIEE is a reporting tool in which multiple users belonging to multiple groups create reports and dashboards. Reports created by a particular group of users should be visible only to that group, or some specific data should be visible only to a specific set of people. To achieve this, we need some form of security so that reports belonging to one group of users are protected from the users of other groups.

Users and Groups in OBIEE

End users who make use of OBIEE for reporting need to be defined somewhere. These users can be defined in the OBIEE RPD, in external database tables, in LDAP servers, or in Active Directory, together with their respective passwords.
Users belonging to the same business unit can be clubbed together into groups. It is not always necessary to create users in the RPD, but it is necessary to create the groups in the RPD. In fact, creating many users in the RPD can be a cumbersome job and will also increase the size of the RPD. So, as a best practice, create the users and groups in the database (or add them in AD/LDAP) and associate them with the RPD groups by creating groups in the RPD with the same names as in the database.

Authentication & Authorization

Authentication means validating the user while logging in to the OBIEE application. When a user logs in to the OBIEE application, a request is sent to the BI Server asking whether this user is valid or not. Only after the BI Server validates the user is the user able to log in to the application.

Authorization determines which objects a user is allowed to view. For example, User A might be authorized to view only a particular set of reports and dashboards, based on the security applied.
Now we can understand these terms in detail.

Types of Security in OBIEE

Security in Oracle BI can be classified broadly into the following three types.
1. Object Level security (Authorization)
2. Data Level security (Authorization)
3. User Level Security (Authentication)


User Level Security in OBIEE
User Authentication in OBIEE
The goal of the authentication configuration is to get a confirmation of the identity of a user based on the credentials provided.
In OBIEE, the credentials provided are held in these two variables:
  • USER 
  • PASSWORD
The authentication process in OBIEE is managed by the BI Server.
OBIEE supports four types of authentication:
  1. LDAP Authentication: Users are authenticated based on credentials stored in an LDAP server. This is generally considered the best method of authentication in OBIEE, and it supports a company's Single Sign-On (SSO) strategy as well.
  2. External Table Authentication: You can maintain lists of users and their passwords in an external database table and use this table for authentication purposes.
  3. Database Authentication: The Oracle BI Server can authenticate users based on database logins. If a user has read permission on a specific database, Oracle BI Presentation Services authenticates that user.
  4. Oracle BI Server User Authentication : You can maintain lists of users and their passwords in the Oracle BI repository using the Administration Tool. The Oracle BI Server will attempt to authenticate users against this list when they log on.
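To make option 2 concrete, the sketch below illustrates the idea behind external table authentication: credentials live in an ordinary database table and login is a lookup against it. This is not OBIEE code (in OBIEE the lookup is wired up through an initialization block); the table name, columns, and data are invented, and a real deployment should store hashed passwords rather than plain text.

```python
# Illustration of the "external table" idea: credentials in a database table,
# login as a simple lookup. Table/column names are invented; real systems
# should store password hashes, never plain text.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wc_users (username TEXT PRIMARY KEY,"
             " password TEXT, group_name TEXT)")
conn.execute("INSERT INTO wc_users VALUES ('scott', 'tiger', 'AsiaGroup')")

def authenticate(conn, user, password):
    # Return the user's group when the credentials match, else None
    row = conn.execute(
        "SELECT group_name FROM wc_users WHERE username = ? AND password = ?",
        (user, password)).fetchone()
    return row[0] if row else None

print(authenticate(conn, "scott", "tiger"))  # → AsiaGroup
print(authenticate(conn, "scott", "wrong"))  # → None
```
Note how the same lookup that validates the user can also return the user's group, which is how external table authentication ties back into the group-based authorization described below.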

Object Level Security in OBIEE
As the name states, Object Level security refers to restricting access to OBIEE objects between different users and groups. The access to following objects can be restricted using object level security: Presentation tables, Presentation table columns, Subject Areas, Reports, Dashboards, and Project Specific shared folders.
Object-level security controls the visibility to business logical objects based on a user's role.

You can set up Object-Level Security for :

Repository level: In the Presentation layer of the Administration Tool, we can set repository-level security by granting or denying permission to users/groups to see a particular table or column.

Web level: This provides security for objects stored in the Presentation Catalog, such as dashboards, dashboard pages, folders, and reports. You can only view the objects for which you are authorized. For example, a mid-level manager may not be granted access to a dashboard containing summary information for an entire department.

Data Level Security in OBIEE
Data-level security is about securing the data itself. Users belonging to a particular group should see a certain set of data, whereas users outside that group should not see it. Example: users belonging to the Asia group should see only the data for the Asia region, whereas users belonging to the US group should see only the data for the US region.

Data-level security controls the visibility of data (content rendered in subject areas, dashboards, Oracle BI Answers, and so on) based on the user's association to data in the transactional system.

This controls the type and amount of data that you can see in a report. When multiple users run the same report, the results that are returned to each depend on their access rights and roles in the organization. For example, a sales vice president sees results for all regions, while a sales representative for a particular region sees only data for that region.

Example
Here we will look at creating and using a session variable and how to implement row-level security. This is mainly used to restrict data based on user rights. Row-level security is useful in situations such as:
1. Allowing user to see data that she has access to.
2. Showing data based on current date.
3. A sales manager can be shown data in his region only. A CEO can be shown data for all regions.

In this post we look at showing units ordered in the current month. We use a security filter to filter data for the current month.
Steps:
1. The first step is to create the session variable for the current month. To do so
a. In the Administration Tool window, click on Action -> New -> Session -> Variable. Give CURRENT_MONTH as the name of the variable. Click on 'New' next to the initialization block.
b. Give CURRENT_MONTH_INIT as the name of the initialization block. Click on Edit Data Source.
c. A new window opens. Select the connection pool by using the browse button.
d. Use Database as the data source type.
e. Type the following query in the default initialization string: select month(curdate());
f. Click Ok to close the dialog.
g. In the session variable initialization block, click on Edit Data Target.
h. Select the CURRENT_MONTH variable. Click on OK.
i. Click on OK to create the session variable.
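One caution on step 1e: the initialization string runs against the source database, so its syntax is dialect-specific. month(curdate()) is MySQL/ODBC style; on Oracle the equivalent would be something like SELECT TO_CHAR(SYSDATE, 'MM') FROM dual. The sketch below (using SQLite, purely for illustration) shows the kind of value such an init block hands to the CURRENT_MONTH variable.

```python
# What the init block computes: a single-row query whose result becomes the
# session variable's value. SQLite's strftime('%m','now') stands in for
# month(curdate()); this is an illustration, not OBIEE code.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
current_month = int(conn.execute("SELECT strftime('%m', 'now')").fetchone()[0])

# The value matches the current (UTC) calendar month
print(current_month == datetime.now(timezone.utc).month)  # → True
```
Whatever the dialect, the query must return exactly one row whose column maps onto the session variable in step 1h.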

2. The next step is to use this session variable to filter the result for this month.
a. In the Administration Tool, click on Manage -> Security.
b. Create a new user called MonthlyUser.
c. Create a new group called MonthlyUserGroup. Assign MonthlyUser to this group.
d. Open the MonthlyUserGroup dialog and click on Permissions.
e. Click the Filters tab. Click on 'Add'.
f. In the Name column of the filter, select the name of the table that you want to apply the filter on.
g. Click on the ellipsis in the Business Model Filter column and apply the filter.
h. The group is now created.
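The filter expression in step 2g typically compares a month column against the session variable created in step 1, using OBIEE's VALUEOF syntax. The table and column names below are hypothetical; substitute the ones from your own business model.

```
-- Hypothetical business model filter (names invented for illustration):
-- restrict the table to the month held in the CURRENT_MONTH session variable
"Store"."Sales"."Month Number" = VALUEOF(NQ_SESSION.CURRENT_MONTH)
```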

3. Log in to BI Answers as the MonthlyUser user. Select the columns from the store database and view the results. You will notice that the results show data for the current month only. If you log in as a user from the Administrators group, data for all months will be visible.