Tag Archive


Aashish Aashish Dutta Koirala Aashish Koirala Add new tag basantapur bhaktapur Buddha children d2 Daman deerwalk Dhulikhel football Godavari Hike Hiking hiking in nepal indra jatra Kathmandu Lamatar Life Life in Nepal Life In Nepal Photograph LIN love nagarkot Nepal Nepali oracle pashupatinath patan Peace People photography Pokhara Ravi Sharma Sankhu Shutterbug Sundarijal Telkot temple Tihar US village Women

The top ten most important features for getting the best Oracle database performance using VMWARE ESX Server and other technologies

The top ten most important features for getting the best Oracle database performance using VMWARE ESX Server and other technologies…
There are a few common myths about virtualizing databases:

* Databases have a high overhead when virtualized: Virtualized Databases can perform at or near the speed of physical systems, in terms of latency and throughput. The virtualization overhead for typical real-world databases is minimal – for VMware ESX Server, we measured CPU overhead to be less than 10%.

* Databases have too much I/O to be virtualized: Databases typically have a large number of small random I/Os, and it is in theory possible to hit a scaling ceiling in the hypervisor layer. VMware ESX’s thin hypervisor layer can drive over 63,000 database I/Os per second, which is equivalent to more than 600 disk spindles of I/O throughput. This is sufficient I/O scaling for even the largest databases on x86 systems.

* Virtualization should only be used for smaller, non-critical applications: The ESX hypervisor is very robust: many customers are seeing over two years of uptime from ESX based systems. In addition, the ESX hypervisor remains stable, even if resources are overcomitted.

There isn’t one quick hit to make databases work well for a wide range of real-world applications – good performance is something that is earned from the long term discipline of focusing the lessons learned from many customer-oriented real-world database workloads, and applying those lessons across the architecture of the hypervisor.
The Nature of Databases
Databases have some unique properties, such as a-large memory footprint. At the outset this can make them slightly more complex to virtualize well. However this has proven to be an opportunity, since we can optimize specifically for these defining properties.

* Large Memory: Databases use large amounts of memory to cache their storage. A large cache is one of the most important performance criteria for databases, since it can often reduce physical I/O by 10-100 fold.
* High Performance Block I/O: Databases read and write their data in fixed, block sized chunks. The I/Os are typically small, and operate at a very high rate on a small number of files or devices.
* Throughput Oriented: Databases often have a large number of concurrent users, giving them natural parallelism and makes them ideally suited to take advantage of systems with multiple logical or physical processors.
* Near Native Performance: Oracle databases run at performance similar to that of a physical system
* Extreme Database I/O Scalability: VMware ESX Server’s thin hypervisor layer can drive over 63,000 database I/Os per second (fifty times the requirement of a typical database)
* Multi-core Scaling: Scale up using SMP virtual machines and multiple database instances
* Large Memory : Scalable memory – 64GB per database, 256GB per host
We are getting ready (or at least entertaining the idea of) to switch to Oracle 11g on RHEL. We understand that this is a very successful configuration in the IT world currently. We are wanting to run this on ESX and will thus be adding additional licensed copies to our environment should this be the final direction.

1. Is this a good fit for VMWare (i.e. are you folks doing it currently?)
2. We will be running EPM (PeopleSoft) on this and we understand that it is fairly intensive model building and data warehousing.
3. We have been told that it will require major cpu and ram but, like everything else, this is generally not the case once you put it on VMWare as you end up seeing how little of it gets used.

We will also be looking at professional help for this implementation should we go down this path so any suggestions you have along these lines would be appreciated too.
Some Sources:

http://vmware.com/partners/virtualize_oracle_landscape.html

http://communities.vmware.com/docs/DOC-2150

http://blogs.vmware.com/performance/2007/11/ten-reasons-why.html?cid=153655165

An Introduction to Real-Time Data Integration oracle server service oriented applications SOA download free

An Introduction to Real-Time Data Integration oracle server service oriented applications SOA download free
Being Java-based, these applications run in any Java environment, including Microsoft Windows, Macintosh OS X, and Linux.
In Oracle Data Integrator, a physical database, a service, or an event-based datasource is known as a data server. Using the Topology Manager, you create three new data servers:

1. An Oracle Database data server, set up with the SYSTEM users’ credentials, that maps to the ORDERS and ORDERS_WORKAREA schemas on the database. The ORDERS schema contains the orders data you want to extract, whereas the ORDERS_WORKAREA schema is one you have specially set up, as an empty schema, to hold the working tables Oracle Data Integrator creates. Use the Oracle JDBC driver to make this connection.
2. A File data server that maps to a comma-separated file containing details on employees. Use the Sunopsis File JDBC Driver to make this connection.
3. A Microsoft SQL Server data server that maps to a database called ORDERS_DATA_MART. Use the Sun JDBC-ODBC Bridge JDBC Driver to create this connection, or use the Microsoft JDBC drivers, which you can download from the Microsoft Web site.
Make sure that if the underlying source tables do not have primary keys defined, you define them, by using the Designer application, and have Oracle Data Integrator enforce them “virtually,” because many of Oracle Data Integrator’s mapping features rely on constraints’ being defined

Now that the data stores are defined, you can start setting up the changed-data-capture process that obtains your source data.

Before you do this, though, you import into your project the knowledge module that provides the changed-data-capture functionality. To do this, you click the Projects tab in the Designer application, right-click the project, and choose Import->Import Knowledge Modules. From the list, select the following knowledge modules, which provide changed-data-capture functionality and will be used in other parts of the project.

* CKM SQL
* IKM SQL Incremental Update
* JKM Oracle 10g Consistent (LOGMINER)
* LKM File to SQL
* LKM SQL to SQL

Now that the required knowledge modules are available, you edit the Oracle module created previously and select the Journalizing tab. Because you want to capture changes to the ORDERS and CUSTOMER tables in a consistent fashion, you select the Consistent option and the JKM Oracle 10g Consistent (LOGMINER) knowledge module. This knowledge module, shown in the figure below, will capture new and changed data, using the LogMiner feature of Oracle Database 10g, and will asynchronously propagate changes across a queue using Oracle Streams.
* Asynchronous Mode: Yes
* Auto-Configuration: Yes
* Journal Table Options: default

Click Apply to save the changes, and then click OK to complete the configuration. You now need to add tables to the changed-data-capture set.

To do this, you locate the Oracle data server in the Designer list of models, right-click the CUSTOMERS and ORDERS tables in turn, and choose Changed Data Capture ->Add to CDC. Then edit the model again the Journalized Tables tab, and use the up and down arrow keys to place the ORDERS table above the CUSTOMERS table.

ou can quickly check which rows are in the table journals by right-clicking the relevant data store, choosing Changed Data Capture and then Journal Data…, or you can execute the interface by opening it again in the editor and clicking Execute at the bottom right corner of the screen. Note that if you have chosen Asynchronous mode for your JKM, there may be a delay of between a second and a few minutes before your journalized data is ready whilst the data is being transferred asynchronously between the source and target databases. If you require your journalized data to be available immediately, choose Synchronous mode instead and your data will be captured and transferred using internal triggers.

Because you have already loaded the initial set of data into your target data mart, using the first interface you created, you now create a Oracle Data Integrator package to carry out the following steps:

1. Check the ORDERS and CUSTOMER journalized data to see if new or changed data records have been added. Once a predefined number of journal records are detected, run the rest of the package or jump to the last step without loading any data.
2. If journalized data is detected, extend the journal window.
3. Execute the interface to read from the journalized data, join it to the file, and load the target data store.
4. Purge the journal window.
5. Start this package again.

Creating this package and then deploying it as an Oracle Data Integrator scenario effectively creates a real-time, continuously running ETL process. Using Oracle Data Integrator’s event detection feature, it will start itself, once a set number of changed data records is detected or after a set number of milliseconds has elapsed. By setting appropriate thresholds for the amount of journalized data and the timeout, you can create a real-time integration process with minimal latency.

To create this package, you navigate to the Projects tab in the Designer application, locate the folder containing the interfaces you defined earlier, find the Packages entry, right-click it, and select Insert Package. You give the package a name and then navigate to the Diagram tab in the package details dialog box.
Because the final OdiStartScen step refers to scenarios, which are productionized versions of packages, you locate the package you are working on in the Project tab of the Designer application, right-click it, and select Generate Scenario. Once the scenario is created, you edit the properties of the OdiStartScen step to reference the scenario name you just generated. By adding this final step to the package, you will ensure that it runs continuously, propagating new and changed data from the Oracle source tables across to the target database in real time.
Summary

Oracle Data Integrator, a new addition to the Oracle Fusion Middleware family of products, gives you the ability to perform data, event, and service-oriented integration across a wide number of platforms. It complements Oracle Warehouse Builder and provides a graphical interface for Oracle Database-specific features such as bulk data loading and Oracle Change Data Capture. This article has examined how Oracle Data Integrator can be used to create real-time data integration processes across disparate platforms, and the declarative approach to the integration process allows you to focus on the business rules rather than the details of implementation.
Mark Rittman [http://www.rittmanmead.com/blog] is an Oracle ACE Director and cofounder of Rittman Mead Consulting, a specialist Oracle Partner based in the U.K. focused on Oracle business intelligence and data warehousing. He is a regular contributor to OTN and the OTN Forums and is one of the authors of the Oracle Press book Oracle Business Intelligence Suite Developers’ Guide, forthcoming in 2008.
Oracle Data Integrator Product Architecture

Oracle Data Integrator is organized around a modular repository that is accessed by Java graphical modules and scheduling agents. The graphical modules are used to design and build the integration process, with agents being used to schedule and coordinate the integration task. When Oracle Data Integrator projects are moved into production, data stewards can use the Web-based Metadata Navigator application to report on metadata in the repository. Out-of-the-box Knowledge Modules extract and load data across heterogeneous platforms, using platform-specific code and utilities.
In these days of complex, “hot-pluggable” systems and service-oriented architecture (SOA), bringing data together and making sense of it becomes increasingly difficult. Although your primary applications database might run on Oracle Database, you may well have other, smaller systems running on databases and platforms supplied by other vendors. Your applications themselves may intercommunicate by using technologies such as Web services, and your applications and data may be hosted remotely as well as managed by you in your corporate data center.
Four graphical modules are used to create and manage Oracle Data Integrator projects:

* Designer is used to define data stores (tables, files, Web services, and so on), interfaces (data mappings), and packages (sets of integration steps, including interfaces).
* Topology Manager is used to create and manage connections to datasources and agents and is usually restricted so that only administrators have access.
* Operator is used to view and manage production integration jobs.
* Security Manager manages users and their repository privileges.
n general, a data integration task consists of two key areas:

* The business rules about what bit of data is transformed and combined with other bits
* The technical specifics of how the data is actually extracted, loaded, and so on

This split in focus means that often the best people to define the business rules are an organization’s technical business or data experts, whereas the technical specifics are often better left to technical staff such as developers and DBAs. With most data integration tools, it is often difficult to split responsibilities in this way, because their data mapping features mix up business rules and technical implementation details in the same data mapping. Oracle Data Integrator takes a different approach, though, and, like SQL, uses a declarative approach to building data mappings, which are referred to within the tool as “interfaces.”

When creating a new interface, the developer or technical business user first defines which data is integrated and which business rules should be used. In this step, tables are joined, filters are applied, and SQL expressions are used to transform data. The particular dialect of SQL that is used is determined by the database platform on which the code is executed.

Then, in a separate step, technical staff can choose the most efficient way to extract, combine, and then integrate this data, using database-specific tools and design techniques such as incremental loads, bulk-loading utilities, slowly changing dimensions, and changed-data capture.

Extensible Knowledge Modules

As Oracle Data Integrator loads and transforms data from many different database platforms and uses message-based technologies such as Web services while being able to respond to events, the technology used to access and load these different datasources needs to be flexible, extensible, and yet efficient. Oracle Data Integrator solves this problem through the use of knowledge modules.
Another useful technique for minimizing data load times is to load only data that is new or has changed. If you are lucky, the designers of your applications have helpfully provided indicators and dates to identify data that is new or changed, but in most cases, this information is not available and it is up to you to identify the data you are interested in.

Because this is a fairly common requirement, Oracle Data Integrator provides journalizing knowledge modules that monitor source databases and copy new and changed records into a journal, which can then be read from instead of the original source table. Where database vendors such as Oracle provide native support for changed-data capture, these features are used; otherwise, the journalize knowledge module uses techniques such as triggers to capture data manipulation language (DML) activity and make the changes available. Later in this article, you will see how support for the Oracle Change Data Capture feature is provided by Oracle Data Integrator and how it can be used to incrementally load, in real time, a database on a different database platform.
Oracle Data Integrator in Relation to Oracle Warehouse Builder

At this point, regular users of Oracle Warehouse Builder are probably wondering how Oracle Data Integrator relates to it and how it fits into the rest of the Oracle data warehousing technology stack. The answer is that Oracle Data Integrator is a tool that’s complementary to Oracle Warehouse Builder and can be particularly useful when the work involved in creating the staging and integration layers in your Oracle data warehouse is nontrivial or involves SOA or non-Oracle database sources.

For those who are building an Oracle data warehouse, Oracle Warehouse Builder has a strong set of Oracle-specific data warehousing features such as support for modeling of relational and multidimensional data structures, integration with Oracle Business Intelligence Discoverer, support for loading slowly changing dimensions, and a data profiler for understanding the structure and semantics of your data.
Oracle Data Integrator in Use: Cross-Platform Real-Time Data Integration

In this scenario, you have been tasked with taking some orders and customer data from an Oracle database, combining it with some employee data held in a file, and then loading the integrated data into a Microsoft SQL Server 2000 database. Because orders need to be analyzed as they arrive, you want to pass these through to the target database in as close to real time as possible and extract only the new and changed data to keep the workload as small as possible. You have read about Oracle Data Integrator on the Oracle Technology Network and want to use this new tool to extract and load your data.