IBM WebSphere Data Model

Data is probably the most underrated and underestimated part of an e-commerce project. That is a shame, because good data makes your life much easier and, more importantly, bad data can turn into a major blocker. Development, testing, and acceptance all depend on it.


WebSphere data model

The data model of WebSphere Commerce is quite clear and straightforward. Here and there you have to make some choices, but once you get the hang of it, its rigidity will help you.

The first choice you have to make is whether to use XML or CSV files. CSV files are easy to generate, but the average layman finds them hard to read (and they are not exactly pleasing to the eye). XML is, in general, the better choice: it works better across the different layers and is easier to read. This comes in handy when you are troubleshooting data problems in an early stage of the project. Often an MQ layer is used to handle the loading of files into WebSphere. This is useful because it is designed to deliver the messages in a fixed chronological order.
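To make the readability difference concrete, here is a minimal sketch that writes the same record once as CSV and once as XML using only the Python standard library. The element name `CatalogEntry` and the field names are assumptions chosen for illustration; they are not the actual WebSphere Commerce load format.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample record; the field names are illustrative only.
record = {"partNumber": "P1", "type": "product", "name": "Red shirt"}


def to_csv(rec):
    """Serialise one record as a header line plus one data line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_xml(rec):
    """Serialise one record as an element with one child per field."""
    root = ET.Element("CatalogEntry")  # hypothetical element name
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


print(to_csv(record))
print(to_xml(record))
```

In the CSV output the meaning of a value depends on its column position, while in the XML output every value carries its own label, which is what helps when you are hunting for a data problem.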


Sequence

The WebSphere data model works in a strict sequence. This is logical: how can you load a new product into a category when the category is not there in the first place? Understanding this sequence is the foundation for working with the WebSphere data model.


We start with the attribute dictionary. This file creates all the attributes and contains the values for the attribute parameters. It also defines whether an attribute works with assigned values or allowed values; I will elaborate on this below. After the attributes, the attribute descriptions need to be loaded, which give the attributes a name for every locale.

Then, if your data model uses allowed values, you can load them at this point. Allowed values are the predefined values for an attribute. With assigned values, an attribute has no predefined values and every value is accepted (as long as it matches the data type). The advantage of allowed values is that data quality is safeguarded. They are also better for general performance: allowed values are shared between product-attribute combinations, so each value is created only once. With assigned values, a value is created every time a product has a value for an attribute. The general rule is: when the values of an attribute are unique per product, for example a barcode, use assigned values. Use allowed values when the values are generic and shared between products, for example colours.
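The difference between the two approaches can be sketched in a few lines of Python. This is an illustration of the idea only, not how WebSphere Commerce stores values internally; the sample rows, the `ALLOWED_VALUES` mapping, and both function names are made up for this example.

```python
# (part number, attribute, value) rows; the sample data is made up.
ROWS = [
    ("P1", "colour", "red"),
    ("P2", "colour", "red"),
    ("P3", "colour", "blue"),
    ("P1", "barcode", "8710000000011"),
    ("P2", "barcode", "8710000000028"),
    ("P3", "barcode", "8710000000035"),
]

# Predefined values per attribute. Attributes missing from this mapping
# are treated as assigned-values attributes in this sketch.
ALLOWED_VALUES = {"colour": {"red", "blue", "green"}}


def is_accepted(attribute, value):
    """Allowed values reject anything not predefined; assigned values accept all."""
    allowed = ALLOWED_VALUES.get(attribute)
    return allowed is None or value in allowed


def value_records(rows, attribute, shared):
    """Count the value records needed for one attribute.

    shared=True models allowed values (one record per distinct value),
    shared=False models assigned values (one record per product).
    """
    values = [v for _, a, v in rows if a == attribute]
    return len(set(values)) if shared else len(values)


print(is_accepted("colour", "purple"))                # rejected: not predefined
print(value_records(ROWS, "colour", shared=True))     # shared colour values
print(value_records(ROWS, "colour", shared=False))    # one value per product
```

For colour, three products share two values, so sharing pays off. For barcode, every value is unique, so sharing gains nothing and assigned values are the natural fit.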

The attribute dictionary is now in place, so it is time to load the catalogs. First comes the master catalog, which holds all products, together with the master catalog description. Then the sales catalog can be loaded. This catalog is a subset of the master catalog: it contains the products that are used for a specific storefront. Finally, the sales catalog description is loaded, which contains the names for the locales.

Now that the catalogs are loaded, the products can be loaded too. The catalog entry file contains the part number and the type of entry (item or product). Loading the entries automatically establishes the relation between the master catalog and the product. The catalog entry description provides the name of the entry for every locale, plus other entry-related data such as the image URL path, keywords, and metadata.

Finally, two relation files have to be loaded. The catalog entry catalog relation file links the entries to the correct categories, and the catalog entry attribute relation file gives the entries their attribute values from the previously loaded allowed values.
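The sequence described above can be written down as an ordered list, which is handy when files arrive in arbitrary order and need to be queued correctly. This is a minimal sketch: the labels in `LOAD_ORDER` are illustrative names for the files discussed in this article, not official WebSphere Commerce file or object names, and `sort_for_load` is a hypothetical helper.

```python
# Load order as described in this article. The labels are illustrative,
# not official WebSphere Commerce names.
LOAD_ORDER = [
    "attribute_dictionary",
    "attribute_description",
    "attribute_allowed_value",
    "master_catalog",
    "master_catalog_description",
    "sales_catalog",
    "sales_catalog_description",
    "catalog_entry",
    "catalog_entry_description",
    "catalog_entry_catalog_relation",
    "catalog_entry_attribute_relation",
]


def sort_for_load(file_types):
    """Return the given file types in the order they should be loaded."""
    unknown = [t for t in file_types if t not in LOAD_ORDER]
    if unknown:
        raise ValueError(f"unknown file types: {unknown}")
    return sorted(file_types, key=LOAD_ORDER.index)


batch = ["catalog_entry", "attribute_dictionary", "master_catalog"]
ordered = sort_for_load(batch)
print(ordered)
```

Rejecting unknown file types up front is a deliberate choice here: a file that cannot be placed in the sequence is better stopped early than loaded at the wrong moment.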


It’s in, what now?

At this point you’re thinking: “Great! The base of my data model is loaded!” Very true, and you can already see the data on the storefront. But what about processing updates and deletes? Updates are quite straightforward: as long as the identifier of the data element stays the same, the update will be processed without a problem.
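The idea that an update is matched on its identifier can be illustrated with a small sketch. It assumes the part number is the identifier and uses a plain dictionary as the store; it only illustrates the principle and says nothing about how WebSphere Commerce implements it. The `upsert` helper and the sample data are made up.

```python
# Catalog entries keyed by their identifier (here: the part number).
# The sample data and field names are made up for illustration.
catalog = {"P1": {"name": "Red shirt", "type": "product"}}


def upsert(store, identifier, fields):
    """Update the record if the identifier exists, otherwise create it."""
    existing = store.get(identifier)
    if existing is None:
        store[identifier] = dict(fields)
        return "created"
    existing.update(fields)
    return "updated"


first = upsert(catalog, "P1", {"name": "Crimson shirt"})
second = upsert(catalog, "P2", {"name": "Blue shirt", "type": "product"})
print(first, second)
```

If the source system changes the identifier of an existing product, the same call would create a second record instead of updating the first, which is why stable identifiers matter.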

Deletes are a bit more complicated. Let’s say you deleted a product from your source system (PIM, MDM). The correct thing to do is to send a delete in the catalog entry file. At the same time, the catalog entry attribute relation file will contain deletes for all the relations between that entry and its attribute allowed values.

So, following the specified load sequence, the entry is deleted first and the relations after that. But once the entry is deleted, its relations are automatically gone as well, which makes sense because the entry no longer exists. Because you still send the deletes in the catalog entry attribute relation file, the data load will fail: you asked the system to delete a relation that no longer exists!

In my opinion there are two solutions to this problem:

  • First load all updates and new data, then run the deletes bottom-up.
  • Only process deletes in the relation files. The downside is that unused data will remain in your database.
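Both options can be sketched as a small planning step that runs before the actual load. This is a minimal sketch under assumptions: each record is a dictionary with hypothetical `file`, `action`, and `id` keys, and the labels in `LOAD_ORDER` are illustrative names for the files discussed in this article.

```python
# Illustrative labels for the entry-related files, in load order.
LOAD_ORDER = [
    "catalog_entry",
    "catalog_entry_description",
    "catalog_entry_catalog_relation",
    "catalog_entry_attribute_relation",
]
RANK = {name: position for position, name in enumerate(LOAD_ORDER)}
RELATION_FILES = {"catalog_entry_catalog_relation", "catalog_entry_attribute_relation"}


def plan_bottom_up(records):
    """Option 1: updates and new data in load order, then deletes in reverse order."""
    upserts = [r for r in records if r["action"] != "delete"]
    deletes = [r for r in records if r["action"] == "delete"]
    upserts.sort(key=lambda r: RANK[r["file"]])
    deletes.sort(key=lambda r: RANK[r["file"]], reverse=True)
    return upserts + deletes


def plan_relation_deletes_only(records):
    """Option 2: keep updates, but drop deletes that are not in a relation file."""
    kept = [
        r for r in records
        if r["action"] != "delete" or r["file"] in RELATION_FILES
    ]
    return sorted(kept, key=lambda r: RANK[r["file"]])


incoming = [
    {"file": "catalog_entry", "action": "delete", "id": "P1"},
    {"file": "catalog_entry_attribute_relation", "action": "delete", "id": "P1-colour"},
    {"file": "catalog_entry", "action": "upsert", "id": "P2"},
]
for step in plan_bottom_up(incoming):
    print(step["action"], step["file"], step["id"])
```

With the first option the relation for P1 is removed before the entry itself, so the load never asks for a relation that is already gone. With the second option the entry P1 is never deleted, which is exactly the unused data mentioned above.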


Know the relations

The data model of WebSphere is quite rigid and, like everything else, it has positive and negative aspects. The good thing is that it is quite straightforward: if you follow the rules and your data is good, you will be fine. If not, it could become a blocker in your project. The trick is to understand how it works and to know where the relations are.

By Remko Strengholt