Stage tables
For many applications with a database back end, stage tables are the first point of entry (or the first line of defense, depending on your perspective) for data entering a database. In their simplest form, stage tables are images of the contents of a data file. In terms of purpose, they serve as intake, upload, or collector mechanisms, and a very common method of populating them is SQL*Loader.

Stage table loading can also be used to perform data hygiene. Create a unique index on a particular column, and the upload process takes care of removing duplicates. Other bad or discarded line items can be collected and reported back to the data provider. Depending on your needs, you can also use stage tables to begin or perform transformation processes.

From a use perspective, you will probably want to reuse stage tables. The application or the database (it doesn't matter which one, just as long as it gets done) should perform a clean-up or preparation step prior to uploading data, whether by using REPLACE in SQL*Loader or by a prepared statement in Java that issues TRUNCATE TABLE. From a design perspective, does the table need to have logging enabled? If so, why? Are you ever going to roll back or recover anything in a stage table? Probably not, so keep those DDL features and options to a minimum. Now that the data is inside the database, where does it go next?
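Before answering that, here is a minimal sketch of the points above. All of the names are hypothetical, but it shows a stage table with logging disabled, a unique index so the load process rejects duplicates, and the clean-up step for reuse:

-- Keep DDL features to a minimum: no logging, no triggers, nothing
-- beyond the unique index used for de-duplication.
CREATE TABLE stage_orders (
  order_no   VARCHAR2(20),
  line_item  VARCHAR2(100),
  qty        NUMBER
) NOLOGGING;

-- Duplicate rows violate this index during a conventional-path load
-- and land in the bad file for reporting back to the data provider.
CREATE UNIQUE INDEX stage_orders_ux ON stage_orders (order_no, line_item);

-- Reuse: the clean-up step before the next upload (the REPLACE keyword
-- in a SQL*Loader control file accomplishes the same thing).
TRUNCATE TABLE stage_orders;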
Operational tables
One way to define tables of this type is to consider what they contain. Live data, such as customer or account information, is stored in operational tables. These tables are the guts of the database, so to speak; in one form or another, everything in an application revolves around their contents. Operational tables should come first with respect to security, backup, and recovery, and should therefore employ whatever means or features are needed to meet those requirements. This includes logging, row movement enabled for flashback, explicit permissions, auditing, fine-grained access control, Virtual Private Database, and whatever else your situation or application may require.

These tables are further characterized by how they are designed. Normalization is the process that should immediately come to mind, and along with it comes the consideration of related objects such as indexes, views, sequences, and triggers, to name a few. These are the tables I referred to as "normal" tables in the introduction, and normal seems to be a good fit for two reasons: first, these tables are normally seen throughout a schema; second, they typically need to be normalized.
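As a sketch of those protective features in DDL form (the table and role names are hypothetical, and your own requirements dictate the actual mix):

-- Operational data deserves full protection: keep logging on.
CREATE TABLE customers (
  customer_id NUMBER PRIMARY KEY,
  cust_name   VARCHAR2(100),
  status      VARCHAR2(10)
) LOGGING;

-- Required before FLASHBACK TABLE can be used against this table.
ALTER TABLE customers ENABLE ROW MOVEMENT;

-- Track who did what, and grant explicit, minimal permissions.
AUDIT INSERT, UPDATE, DELETE ON customers;
GRANT SELECT ON customers TO reporting_role;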
Another type of operational table is one related to how an application or schema operates. An example would be storing information about disposition tables (see below). Another example is familiar to those who design security (roles and permissions) into a schema rather than having Oracle manage this function.
Functional tables
Lookup and intersection tables fall into this category, as their main purpose is to perform a function (i.e., resolve a many-to-many relationship). Another type of functional table is a shopping cart/order entry table (containing an order number and one or more line items). These tables are relatively easy to identify, and their purpose is discussed in virtually all database design books. Functional tables bear mentioning because of their close association with operational tables; however, not all tables are created equal.

Identify the crucial functional tables and place them in the same category as operational tables with respect to importance. How can you distinguish between them? One way is to consider how the contents change; that is, is the data static or not? Static tables (e.g., job code versus job description) can be repopulated from a script, while dynamic tables require recovery. Do you populate these tables directly, or is it done indirectly (via a trigger)?
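As an illustration (all names are hypothetical, and the orders and products tables are assumed to exist), here is a classic intersection table resolving a many-to-many relationship, alongside a static lookup table that could be repopulated from a script:

-- Functional: resolves the many-to-many between orders and products.
CREATE TABLE order_items (
  order_no   NUMBER REFERENCES orders (order_no),
  product_id NUMBER REFERENCES products (product_id),
  qty        NUMBER,
  CONSTRAINT order_items_pk PRIMARY KEY (order_no, product_id)
);

-- Static lookup: recoverable from a script, so recovery needs are minimal.
CREATE TABLE job_codes (
  job_code VARCHAR2(10) PRIMARY KEY,
  job_desc VARCHAR2(100)
);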
Disposition tables
Tables of this type are frequently created by performing a join between a stage table and a combination of one or more operational and functional tables. Disposition tables are commonly used for reporting or for staging data to be spooled out into disposition files. You may find many tables of this type in a schema, all with the same definition. What differentiates them is the date they were created or the number of records (a weekly report or newsletter recipient list consisting of fewer rows than an end-of-month report or recipient list).

A disposition table can also be similar to a stage table in that its contents are routinely flushed and repopulated for a pending disposition or report process. An alternative means of keeping disposition data is to borrow from the data warehouse side of things and use a flight history type of table. You will not need to retain 100% of the disposition table's contents, but you will need to capture what is essential: who sent what, and when, and can details be reconstructed if necessary? If details are lost due to data changes elsewhere in the database, then the record-keeping requirements become more complicated. It then becomes a trade-off between keeping scores of disposition tables and maintaining a huge flight history type of table.
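A sketch of the pattern (all names are hypothetical): each run produces a dated table with the same definition, built by joining the stage table to operational and functional tables:

-- This week's recipient list, ready to be spooled to a disposition file.
CREATE TABLE disp_mailing_20070107 NOLOGGING AS
SELECT a.account_id,
       a.email,
       m.mail_code_desc,
       s.batch_id
FROM   stage_upload s
JOIN   accounts     a ON a.account_id = s.account_id
JOIN   mail_codes   m ON m.mail_code  = s.mail_code;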
Archive tables
A close cousin on the production side to a warehouse's flight history table is the archive table. By its nature, an archive table's content is permanent, so one way to delineate the life span of what is in a disposition table is to consider the degree of permanency of the data. Forever (or at least whatever the service level agreement's retention policy is) equals archive; anything more short term is subject to being disposed of in a disposition table.

Other tables
Audit tables are discussed in detail throughout much of Oracle's documentation, and generally in most textbooks. The focus is not the data itself, but who did what to the data and when. One type of table being seen more in applications is the metadata table (data about data). Not sure what an example of this type of table looks like? Several tables in Oracle's data dictionary are excellent examples: the DBA/ALL/USER_TABLES family contains data about a table's data (number of rows, last analyzed, etc.).

Figure 2-1 HR Schema
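To see the metadata idea in action, you can query the dictionary for data about one of your own tables:

-- Data about a table's data: row count and when it was last analyzed.
SELECT table_name, num_rows, last_analyzed
FROM   user_tables
WHERE  table_name = 'EMP';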
In Closing
Production database tables, for the most part, seem to lack the naming clarity seen in data warehouses. Fact and dimension tables have clearly defined names because of their purpose, but what type of table is your customer account table? Whether or not you agree that the type name should be operational, what does matter is that everyone involved in the management of this type of table has the same understanding of its purpose. If someone asks you to name the table types in Oracle (and this applies to other database systems), ask that person whether he or she means physical or logical, how they are designed (DDL) or how they are used, production or warehouse, and so on. You can help expand that person's frame of reference by going beyond what is in the Database Concepts documentation.

Oracle Tables and Statistics
The guys who wrote Oracle are pretty smart. One of the things they built into the database is a program called the optimizer. The optimizer's job is to take a SQL statement, figure out how to get the data being asked for, and get it in the quickest way possible.
When a SQL statement is executed, the database must convert the query into an execution plan and choose the best way to retrieve the data. For Oracle, each SQL query has many possible execution plans, including which index to use to retrieve table rows, the order in which to join multiple tables together, and which internal join methods to use (Oracle has nested loop joins, hash joins, star joins, and sort merge joins). These execution plans are computed by the Oracle cost-based SQL optimizer, commonly known as the CBO.
The choice of execution plans made by the Oracle SQL optimizer is only as good as the Oracle statistics. To always choose the best execution plan for a SQL query, Oracle relies on information about the tables and indexes in the query.
Once the optimizer has done its job, it provides an
execution plan to Oracle. An execution plan is like a set of instructions that
tells Oracle how to go and get the data.
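Here is the kind of plan being described, generated with EXPLAIN PLAN and dbms_xplan against a hypothetical query on the EMP table (the layout and numbers are illustrative and vary by Oracle version):

EXPLAIN PLAN FOR
  SELECT * FROM emp ORDER BY ename;

SELECT * FROM TABLE(dbms_xplan.display);

--------------------------------------------------------------
| Id | Operation          | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------
|  0 | SELECT STATEMENT   |      |  100 |  8700 |     4  (25)|
|  1 |  SORT ORDER BY     |      |  100 |  8700 |     4  (25)|
|  2 |   TABLE ACCESS FULL| EMP  |  100 |  8700 |     3   (0)|
--------------------------------------------------------------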
This is a pretty simple plan. In it, the optimizer tells Oracle to first get all the rows of the EMP table, and then sort those rows. (Reading an execution plan is somewhat of an art, so trust us on this one: the full scan of the EMP table comes first.)
Did you notice that the plan has a column called ROWS? This is the number of rows that the query will process. How did Oracle know that it was going to process 100 rows? This Oracle thing is smart, isn't it?
Well, Oracle isn't quite that smart. In this case, Oracle knew (or rather, made a good guess) that we would process 100 rows because we generated statistics on the EMP table after we created it. The optimizer uses these statistics to generate execution plans.
The optimizer program uses statistics on tables and on
the indexes surrounding those tables, so it’s important to have statistics on
both. In the next section, we will show you how to generate statistics on tables
and indexes in your database. Starting with the introduction of the dbms_stats
package, Oracle provides a simple way for the Oracle professional to collect
statistics for the CBO.
The old-fashioned analyze table and dbms_utility methods
for generating CBO statistics are obsolete and somewhat dangerous to SQL
performance because they don't always capture high-quality information about
tables and indexes. The CBO uses object statistics to choose the best execution
plan for all SQL statements.
The dbms_stats utility does a far better job in
estimating statistics, especially for large partitioned tables, and the better
stats result in faster SQL execution plans. Here is a sample execution of
dbms_stats with the options clause.
exec dbms_stats.gather_schema_stats( -
   ownname          => 'SCOTT', -
   options          => 'GATHER AUTO', -
   estimate_percent => dbms_stats.auto_sample_size, -
   method_opt       => 'for all columns size repeat', -
   degree           => 15 -
)
How to Generate Oracle Statistics
Oracle provides a stored procedure (or program) that you can run to generate the statistics it needs. Oracle requires statistics on both tables and any associated indexes (we will talk about indexes next), and most of the time you will generate both with just one command.
To generate statistics we use the dbms_stats stored package. There are two procedures within the dbms_stats package that you will mostly be interested in: dbms_stats.gather_schema_stats and dbms_stats.gather_table_stats. Also, in Oracle Database 10g you will want to gather system statistics and fixed view statistics. Let's look at each of these operations in a bit more detail next.
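For the 10g-specific gathering just mentioned, the calls themselves are short (a sketch; see the dbms_stats documentation for the workload-based variants of system statistics):

-- System statistics: CPU and I/O characteristics used by the CBO.
EXEC dbms_stats.gather_system_stats;

-- Fixed object (X$ / V$) statistics, new in Oracle Database 10g.
EXEC dbms_stats.gather_fixed_objects_stats;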
There is also an analyze command that you can use to generate statistics. It has been deprecated for this purpose in Oracle Database 10g (which means it's really not supported anymore), so we don't cover it in this book.
Using dbms_stats.gather_schema_stats
The dbms_stats.gather_schema_stats procedure allows you to gather statistics for all objects in a given schema. This is the easiest way to generate statistics for a large number of objects. Here is an example of using the dbms_stats.gather_schema_stats procedure to gather statistics on the SCOTT schema of a database:
EXEC dbms_stats.gather_schema_stats('SCOTT', cascade=>TRUE);
This command will generate statistics on all tables in the SCOTT schema. Since we included the cascade parameter, the indexes will also have statistics generated on them. This is important: you need statistics on indexes as well as on tables in Oracle!
Of course, this is just the basic way to run this command. Several options are available, but for now, as a new DBA, this will do. In fact, Oracle 10g automatically collects database statistics every night out of the box. Later you will want to investigate some of the Oracle Database 10g statistics gathering options, such as histograms and granularity.
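As a taste of what those options look like, here is a sketch of requesting a histogram while gathering table statistics (the column name is illustrative, and 254 is simply the maximum number of histogram buckets):

BEGIN
  dbms_stats.gather_table_stats(
    ownname    => 'SCOTT',
    tabname    => 'EMP',
    method_opt => 'FOR COLUMNS SIZE 254 job'  -- histogram on a skewed column
  );
END;
/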
If you create a new table, it may not be practical or desirable to regenerate statistics for the entire schema, particularly if the schema is quite large and the database is very busy. Instead, you can use the dbms_stats.gather_table_stats command to generate statistics for a single table, and optionally for related table indexes. Here is an example:
EXEC dbms_stats.gather_table_stats('SCOTT', 'EMP', cascade=>TRUE);
In this case we are generating statistics for the EMP table in the SCOTT schema. Again we use the cascade parameter to ensure all of the indexes get analyzed.