Websphere Commerce Data Load using CSV \Websphere Commerce Data Load using XML

Websphere Commerce Data Load using CSV \Websphere Commerce Data Load using XML


Data can be loaded into WCS tables using data load utility. Data load utility works in the steps as mentioned below.

1.Data Reader
The data reader reads the input file, and transforms the input into a data object. Data objects are typically map objects. Eg: CSV data reader \XML Data Reader

2.Business Object Builder
The business object builder instantiates the business object and populates it from the data in the data object. Eg:The table object builder

3. Business Object Mediator
The business object mediator converts the business object into physical object. The table object mediator is always used with the table object builder. Eg: table-based mediator

4. Data Writer
The data writer writes the physical object into the database. This tutorial uses the default data writer to write the generated physical objects into the database.

There are three configurations files and a input source file required to complete the data loader setup. WCS supports only CSV File Reader for other input source you have to write your own Reader Class. This means that if you are using OOB CSVReader then your input file must be a CSV file. The three configuration files are listed below.


1.wc-dataload-env.xml
2.wc-dataload-loader.xml
3.wc-dataload.xml

wc-dataload-env.xml holds the environment settings such as database name,encrypted password,database type,user name,schema name.

<?xml version="1.0" encoding="UTF-8" ?>
<_config:DataLoadEnvConfiguration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../../../xml/config/xsd/wc-dataload-env.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<_config:BusinessContext storeIdentifier="Madisons" catalogIdentifier="Madisons" languageId="-1" currency="USD" />
<_config:Database type="derby" name="..\..\db\mall" />
<_config:DataWriter className="com.ibm.commerce.foundation.dataload.datawriter.JDBCDataWriter" />
</_config:DataLoadEnvConfiguration>

wc-dataload-loader.xml holds the mapping between your input csv file and the WCS tables.

<_config:DataMapping>
<_config:mapping xpath="CatalogGroupIdentifier/UniqueID" value="GroupUniqueId" />
<_config:mapping xpath="CatalogGroupIdentifier/ExternalIdentifier/GroupIdentifier" value="GroupIdentifier" />
<_config:mapping xpath="ParentCatalogGroupIdentifier/UniqueID" value="ParentGroupUniqueId" />
<_config:mapping xpath="CatalogGroupIdentifier/ExternalIdentifier/StoreIdentifier/ExternalIdentifier/NameIdentifier" value="StoreIdentifier" valueFrom="InputData" />
<_config:mapping xpath="ParentCatalogGroupIdentifier/ExternalIdentifier/GroupIdentifier" value="ParentGroupIdentifier" />
<_config:mapping xpath="ParentCatalogGroupIdentifier/ExternalIdentifier/StoreIdentifier/ExternalIdentifier/NameIdentifier" value="ParentStoreIdentifier" valueFrom="InputData" />
<_config:mapping xpath="topCatalogGroup" value="TopGroup" />
<_config:mapping xpath="displaySequence" value="Sequence" />
<_config:mapping xpath="Description[0]/Name" value="Name" />
<_config:mapping xpath="Description[0]/ShortDescription" value="ShortDescription" />
<_config:mapping xpath="Description[0]/LongDescription" value="LongDescription" />
<_config:mapping xpath="Description[0]/Thumbnail" value="Thumbnail" />
<_config:mapping xpath="Description[0]/FullImage" value="FullImage" />
<_config:mapping xpath="Description[0]/Attributes/published" value="Published" />
<_config:mapping xpath="Description[0]/Keyword" value="Keyword" />
<_config:mapping xpath="Description[0]/Attributes/note" value="Note" />
<_config:mapping xpath="Attributes/field1" value="Field1"/>
<_config:mapping xpath="Attributes/field2" value="Field2"/>
<_config:mapping xpath="" value="Delete" valueFrom="InputData" deleteValue="1"/>
</_config:DataMapping>

wc-dataload.xml is the main xml that is passed as parameter to the dataload utility, this file contains information about the environment file, loader file and input CSV file for the loader.

Dataload file would like the one shown below.

<?xml version="1.0" encoding="UTF-8" ?>
<_config:DataLoadConfiguration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../../../xml/config/xsd/wc-dataload.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<_config:DataLoadEnvironment configFile="wc-dataload-env.xml" /> 
<_config:LoadOrder commitCount="100" batchSize="1" dataLoadMode="Replace">
<_config:LoadItem name="CatalogGroup" businessObjectConfigFile="wc-loader-catalog-group.xml">
<_config:DataSourceLocation location="CatalogGroups.csv" />
</_config:LoadItem>
<_config:LoadItem name="CatalogEntry" businessObjectConfigFile="wc-loader-catalog-entry.xml">
<_config:DataSourceLocation location="CatalogEntries.csv" />
</_config:LoadItem>
</_config:LoadOrder>
</_config:DataLoadConfiguration>

Running the Dataload Utility

To run  the dataload utility, just go to command prompt and navigate to WCS installation folder and then open bin folder, by default it would be the one shown below.

C:\IBM\WCDE_INSTALL70\bin
Type dataload.bat <path of dataload file>
for e.g dataload C:/IBM/WCDE_INSTALL70/samples/DataLoad/Catalog/wc-dataload.xml

Custom Data Load -Load data into custom tables
Use the table object builder and table object mediator to load data directly into a table with the Data Load utility when a component-based business object mediator does not exist. You can use the TableObjectMediator and TableObjectBuilder to load your custom data directly into your target database tables. This mediator and builder can ensure that the data is loaded into the correct table and columns. You do not have to create a custom business object mediator and builder or extend an existing mediator or builder.

You can use the table-based mediator to load data with the Data Load utility when the following conditions exist:
You want to load data into a table when a component-based object mediator does not exist for the object that you are loading.
You are loading data into a table that is updated and you do not want to customize or extend the existing component mediator.
You are loading custom UserData into an extended table.You can add your UserData into a default provided sample CSV or XML file. You can then load this file by using the TableObjectMediator and TableObjectBuilder with the Data Load utility.

Table-based builder

The TableObjectBuilder business object builder works with the TableObjectMediator business object mediator. The Data Load utility creates a list of extended table data objects, which the table object builder populates with the column values that are included in the input data. The builder class also populates the data objects with any fixed values that are defined for the objects in the column definitions that are specified in the business object configuration file. The Data Load utility passes the populated list of extended table data object to the TableObjectMediator.

Table-based mediator

TableObjectMediator is used to convert the data objects into physical objects. To use a table-based data load, you must know the physical tables that you want to load data into. You also must know the foreign key relationships, how the keys generate, and how the generated keys are resolved based on the unique index. The column values that the Data Load utility can retrieve from the business contexts must also be known. This mediator accepts the populated table data objects from the builderclass. Any partially populated table data object is populated with more values with this mediator to ensure that the data object can be resolved. The mediator populates the data objects with values from the business context. The data object IDs are resolved through the business context service or through the ID resolver. After the data object is populated, the data object is sent to the data writer to populate the data object into the appropriate database table.

I will try to explain the data load on custom tables using Table-based builder and table object mediator by loading data into a custom table called EMPLOYEE.

Custom Data Load –Loading data into custom table EMPLOYEE

// Create Table  EMPLOYEE

CREATE TABLE  EMPLOYEE (
EMPLOYEE_ID BIGINT NOT NULL,
NAME VARCHAR(254),
TYPE  VARCHAR(254)
);

ALTER TABLE EMPLOYEE
ADD PRIMARY KEY (EMPLOYEE_ID);
 

//keep the table name in lower case letters
//Keys table entry will help to create primary key generation during Data load.

INSERT INTO KEYS (KEYS_ID, TABLENAME, COLUMNNAME, COUNTER, PREFETCHSIZE, LOWERBOUND, UPPERBOUND, OPTCOUNTER)
VALUES ((SELECT MAX(KEYS_ID) + 1 FROM KEYS), 'employee', 'EMPLOYEE_ID', 0000, 1, 1000, 999999, 1);

Method 1: Loading data using CSV  Input

1. Define the CSV file (Employees.csv) with NAME, TYPE as header.

name,type
employeex,admin
employeeY,tester

Define the wc-dataload xml file as below.

This  file  defines the   environment file (wc-dataload-env.xml),Loader file (wc-loader-employee.csv) and csvfile (Employees.csv) as below.


<_config:DataLoadConfiguration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../xsd/wc-dataload.xsd"
 xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<_config:DataLoadEnvironment configFile="wc-dataload-env.xml" />
<_config:LoadOrder commitCount="100" batchSize="1" dataLoadMode="Replace" >
<_config:LoadItem name="employee" businessObjectConfigFile="wc-loader-employee.xml" >
<!--Indicates the location of input csv files, can be overrided at runtime.-->
<_config:DataSourceLocation location="employees.csv" />
</_config:LoadItem>
</_config:LoadOrder>
</_config:DataLoadConfiguration>


Define the wc-dataload-env.xml as below

This  file  defines the storeIdentifier, catalogIdentifier and DB details.


<?xml version="1.0" encoding="UTF-8"?>
<_config:DataLoadEnvConfiguration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../..
/xml/config/xsd/wc-dataload-env.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<!-- Madisons Store in Toolkit -->
<_config:BusinessContext storeIdentifier="Aurora" catalogIdentifier="Aurora"
languageId="-1" currency="USD">
</_config:BusinessContext>
<!-- database setting for derby in Toolkit -->
<_config:Database type="derby" name="..\db\mall" schema="APP"/>
<!-- database setting for Oracle -->              
<!--
<_config:Database name="<database name>" user="<user>" password="<password>"
port="1521" schema="<schema name>" server="<server>" type="Oracle"
dbDriverType="thin" />
 -->       
<!-- database setting for AIX/DB2 server -->
<!--
<_config:Database type="db2" name="<database name>" user="<user>" password="  
<password>" server="<server>" port="<port>" schema="<schema>" />
--> 
<_config:IDResolver className="com.ibm.commerce.foundation.dataload.idresolve.IDResolverImpl" cacheSize="10000000"/>
<_config:DataWriter className="com.ibm.commerce.foundation.dataload.datawriter.JDBCDataWriter" />
</_config:DataLoadEnvConfiguration>

Define the wc-loader-employee.xml as below


<_config:DataloadBusinessObjectConfiguration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config 
xsd/wc-dataload-businessobject.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<_config:DataLoader className="com.ibm.commerce.foundation.dataload.BusinessObjectLoader">

<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.CSVReader" firstLineIsHeader="true" useHeaderAsColumnName="true" >
<_config:property name="keyColumns" value="Identifier" />
</_config:DataReader>

<_config:BusinessObjectBuilder className="com.ibm.commerce.foundation.dataload.businessobjectbuilder.TableObjectBuilder" >  
<_config:Table name="EMPLOYEE">
<!-- System generated primary key-->
<_config:Column name="EMPLOYEE_ID" value="employee_id" valueFrom="IDResolve">
<_config:IDResolve tableName="EMPLOYEE" generateNewKey="true" />
</_config:Column>
<_config:Column name="NAME" value="name">
</_config:Column>
<_config:Column name="TYPE" value="type">
</_config:Column>
</_config:Table>  
<_config:BusinessObjectMediator className="com.ibm.commerce.foundation.dataload.businessobjectmediator.TableObjectMediator">
</_config:BusinessObjectMediator>
</_config:BusinessObjectBuilder>
</_config:DataLoader>
</_config:DataloadBusinessObjectConfiguration>

This file defines the TableObjectBuilder(with mapping  data ) and  TableObjectMediator.The reader used is CSVReader.

The Data Load utility creates a list of extended table data objects, which the table object builder populates with the column values that are included in the input data. The builder class also populates the data objects with any fixed values that are defined for the objects in the column definitions that are specified in the business object configuration file. The Data Load utility passes the populated list of extended table data object to the TableObjectMediator. TableObjectMediator is used to convert the data objects into physical objects.

Running the data load utitliy

Copy the files (employees.csv, wc-dataload.xml, wc-dataload-env, wc-loader-employee.xml
) in directory C:\IBM\WCDE_ENT70\samples\AuroraDataLoad\custom\CSV

Switch to C:\IBM\WCDE_ENT70\bin

Run the below command

Dataload C:\IBM\WCDE_ENT70\samples\AuroraDataLoad\custom\CSV\ wc-dataload.xml


Method 2: Loading data using  XML input

Define the employees.xml as below


<?xml version="1.0" encoding="UTF-8"?>
<employees>
<EMPLOYEE employee_id="" name="test1" type="admin"/>
<EMPLOYEE employee_id="" name="test2" type="csr"/>
</employees>


Define the wc-dataload xml file as below.


<?xml version="1.0" encoding="UTF-8"?>
<_config:DataLoadConfiguration
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../..
 /../xml/config/xsd/wc-dataload.xsd"
 xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
 <_config:DataLoadEnvironment configFile="wc-dataload-env.xml" />
 <_config:LoadOrder commitCount="100" batchSize="1" dataLoadMode="Replace">
 <_config:property name="firstTwoLinesAreHeader" value="true" />
 <_config:LoadItem name="employees" businessObjectConfigFile="wc-loader-  
  employee.xml" >
 <_config:DataSourceLocation location="employees.xml" oldLocation="" />
 </_config:LoadItem>
 </_config:LoadOrder>
 </_config:DataLoadConfiguration>

Define the wc-dataload-env.xml as below

<?xml version="1.0" encoding="UTF-8"?>
<_config:DataLoadEnvConfiguration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../..
/xml/config/xsd/wc-dataload-env.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<!-- Madisons Store in Toolkit -->
<_config:BusinessContext storeIdentifier="Aurora" catalogIdentifier="Aurora"
languageId="-1" currency="USD">
</_config:BusinessContext>
<!-- database setting for derby in Toolkit -->
<_config:Database type="derby" name="..\db\mall" schema="APP"/>
<!-- database setting for Oracle -->              
<!--
<_config:Database name="<database name>" user="<user>" password="<password>"
port="1521" schema="<schema name>" server="<server>" type="Oracle"
dbDriverType="thin" />
-->        
<!-- database setting for AIX/DB2 server -->
<!--
<_config:Database type="db2" name="<database name>" user="<user>" password="
<password>" server="<server>" port="<port>" schema="<schema>" />
--> 
<_config:IDResolver className="com.ibm.commerce.foundation.dataload.idresolve.IDResolverImpl" cacheSize="10000000"/>
<_config:DataWriter 
className="com.ibm.commerce.foundation.dataload.datawriter.JDBCDataWriter" />
</_config:DataLoadEnvConfiguration>

Define the wc-loader-employee.xml as below


<_config:DataloadBusinessObjectConfiguration
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config
xsd/wc-dataload-businessobject.xsd"
xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">
<_config:DataLoader className="com.ibm.commerce.foundation.dataload.BusinessObjectLoader">

<_config:DataReader className="com.ibm.commerce.foundation.dataload.datareader.XmlReader" >
</_config:DataReader>

<_config:BusinessObjectBuilder className="com.ibm.commerce.foundation.dataload.businessobjectbuilder.TableObjectBuilder" >  
<_config:Table name="EMPLOYEE">
<!-- System generated primary key-->
<_config:Column name="EMPLOYEE_ID" value="employee_id" valueFrom="IDResolve">
<_config:IDResolve tableName="EMPLOYEE" generateNewKey="true" />
</_config:Column>
<_config:Column name="NAME" value="name">
</_config:Column>
<_config:Column name="TYPE" value="type">
</_config:Column>
</_config:Table>
<_config:BusinessObjectMediator className="com.ibm.commerce.foundation.dataload.businessobjectmediator.TableObjectMediator">
</_config:BusinessObjectMediator>
</_config:BusinessObjectBuilder>
</_config:DataLoader>
</_config:DataloadBusinessObjectConfiguration>

Running the data load utitliy

Copy the files (employees.csv, wc-dataload.xml, wc-dataload-env, wc-loader-employee.xml
) in directory C:\IBM\WCDE_ENT70\samples\AuroraDataLoad\custom\XML

Switch to C:\IBM\WCDE_ENT70\bin

Run the below command

Dataload C:\IBM\WCDE_ENT70\samples\AuroraDataLoad\custom\XML\ wc-dataload.xml

Query the table to see the results 
  

Comments

  1. Author can represent complex structure of load in understandable manner.

    ReplyDelete

Post a Comment

Popular posts from this blog

Websphere Commerce Catalog Subsystem

Calculation Framework in Websphere Commerce.